Search results for: arabic text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1660

Search results for: arabic text

1450 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 131
1449 Symmetric Key Encryption Algorithm Using Indian Traditional Musical Scale for Information Security

Authors: Aishwarya Talapuru, Sri Silpa Padmanabhuni, B. Jyoshna

Abstract:

Cryptography helps in preventing threats to information security by providing various algorithms. This study introduces a new symmetric key encryption algorithm for information security which is linked with the "raagas" which means Indian traditional scale and pattern of music notes. This algorithm takes the plain text as input and starts its encryption process. The algorithm then randomly selects a raaga from the list of raagas that is assumed to be present with both sender and the receiver. The plain text is associated with the thus selected raaga and an intermediate cipher-text is formed as the algorithm converts the plain text characters into other characters, depending upon the rules of the algorithm. This intermediate code or cipher text is arranged in various patterns in three different rounds of encryption performed. The total number of rounds in the algorithm is equal to the multiples of 3. To be more specific, the outcome or output of the sequence of first three rounds is again passed as the input to this sequence of rounds recursively, till the total number of rounds of encryption is performed. The raaga selected by the algorithm and the number of rounds performed will be specified at an arbitrary location in the key, in addition to important information regarding the rounds of encryption, embedded in the key which is known by the sender and interpreted only by the receiver, thereby making the algorithm hack proof. The key can be constructed of any number of bits without any restriction to the size. A software application is also developed to demonstrate this process of encryption, which dynamically takes the plain text as input and readily generates the cipher text as output. Therefore, this algorithm stands as one of the strongest tools for information security.

Keywords: cipher text, cryptography, plaintext, raaga

Procedia PDF Downloads 268
1448 On the Comprehension of English Compound Nouns by Arabic-Speaking EFL Learners

Authors: Abdel Rahman Altakhaineh, Mohamma Alaghawat, Hiba Alhendi

Abstract:

This paper reports an investigation of the comprehension of English compound nouns by sixty Arabic-speaking English Foreign Language (EFL) learners majoring in English at the University of Jordan, Amman. The investigation focused on the problems that these learners may encounter in understanding certain types of compounds and their ability to use their L1 compound noun knowledge to produce the meaning of L2 compound nouns. Participants whose English proficiency level was advanced underwent a test to identify the meaning ofan underlined compound without using a dictionary. Theresponses to the three different types of compounds were analyzed usingTwo-Way repeated measures ANOVA, and the results showed that there were different endocentric and exocentric compound responses within subordinative compounds, with a statistically significant difference between the two in favor of endocentric compounds. We argue that the endocentric, especially subordinative endocentric compounds,weremore easily understood due to its representative nature, i.e., because the head represents the meaning of the whole compound. The study concludes with pedagogical implications for teaching compound nouns.

Keywords: morphology, compounding, SLA, arabic-speaking EFL learners

Procedia PDF Downloads 89
1447 Practical Ways to Acquire the Arabic Language through Electronic Means

Authors: Hondozi Jahja

Abstract:

There is an obvious need to learn Arabic language and teach it to other speakers through the new curricula. The idea is to bridge the gap between theory and practice. To that end, we have sought to offer some means of help to master the Arabic language, in addition to our efforts to apply these means, enriching the culture of the student and develop his vocabulary. There is no doubt that taking care of the practical aspect of the grammar was our constant goal, and this particular aspect is what builds the student’s positive values and refine his taste and develop his language. In addressing these issues, we have adopted a school-based approach based primarily on the active and positive participation of the student. The theoretical linguistic issues - in our opinion - are not a primary goal, but the goal is to be used them by students through speaking and applying them. Among the objectives of this research is to establish the basic language skills of the students using new means that help the student to acquire these skills and apply them in various subjects of interest in his progress and development. Unfortunately, some of our students consider the grammar as ‘difficult’, ‘complex’ and ‘heavy’ in itself. This is one of the obstacles that stand in the way of their desired results. As a consequence, they end up talking – mumbling - about the difficulties they face in applying those rules. Therefore, some of our students finish their university studies and are unable to express what they feel using language correctly. For this purpose, we have sought in this research to follow a new integrated approach, which is to study the grammar of the language through modern means of the consolidation of the principle of functional language, and that the rule implies to control tongues and linguistic expressions properly. This research is a result of a practical experience as a teacher of Arabic language for non-native speakers at the ‘Hassan Pristina’ University, located in Pristina, the capital of Kosovo and at the Qatar Training Center since its establishment in 2012.

Keywords: arabic, applied methods, acquire, learning

Procedia PDF Downloads 139
1446 On Copular Constructions in Yemeni Arabic and the Cartography of Subjects

Authors: Ameen Alahdal

Abstract:

This paper investigates copular constructions in Raimi Yemeni Arabic (RYA). The aim of the paper is actually twofold. First it explores the types of copular constructions in Raimi Yemeni Arabic, a variety of Arabic that has not attracted a lot of attention. In this connection, the paper shows that RYA manifests ‘bare’, verbal and pronominal/PRON copular constructions, just like other varieties of Arabic and indeed other Semitic languages like Hebrew. The sentences below from RYA represent the three constructions, respectively. (1) a. nada Hilwah Nada pretty.3sf ‘Nada is pretty’ b. kan al-banat hina was the-girls here ‘The girls were here c. ali hu-l mudiir Ali he-the manager ‘Ali is the manager’ Interestingly, in addition to these common types of copular constructions, RYA seems to exhibit dual copula sentences, a construction that features both a pronominal copula and a verbal copula. Such a construction is attested neither in Standard Arabic nor in other modern varieties of Arabic such as Lebanese, Moroccan, Egyptian, Jordanian. Remarkably, dual copular sentences do not appear even in other dialects of Yemeni Arabic such as Sanaani, Adeni and Tehami. (2) is an example. (2) maha kan-ih mudarrisah maha was-she teacher.3sf ‘Maha was a teacehr’ Second, the paper considers the cartography of subject positions in copular constructions proposed by Shlonsky and Rizzi (2018). Different copular constructions seem to involve different subject positions (which might eventually correlate with different interpretations – not our concern in this paper). Here, it is argued that in a bare copular sentence, as in (1a), RYA might exploit two criterial subject positions (in Rizzi’s sense), in addition to the canonical Spec,TP position. Under mainstream minimalist assumption, a copular sentence is analyzed as a PredP. Thus, in addition to the PredP-related thematic subject position, a criterial subject position is posited outside of PredP. (3) below represents the cartography of subject positions in a bare copular construction. (3) [……..DP subj PredP DP Pred DP/AP/PP ] In PRON sentences, as exemplified in (1c), another two subject positions are postulated high in the clause, particularly above PolP. (4) illustrates the hierarchy of the subject positions in a PRON copular construction. The subject resides in Spec,SUBJ2P. (4) …DP SUBJ2 …DP SUBJ1 … Pol … DP subj PredP Another related phenomenon in RYA which sets it apart from other languages like Hebrew is that of negative bare copular construction. This construction involves a PRON, which is not found in its affirmative counterpart. PRON, however, is hosted neither by SUBJ20 nor by SUBJ10. Rather, PRON occurs below Neg0 (Pol0 in the hierarchy). This situation raises interesting issues for the hierarchy of subjects in copular constructions as well as to the syntax of the left periphery in general. With regard to what causes the subject to move, there are different potential triggers. For instance, movement of the subject at the base, i.e., out of PredP is triggered by a labeling failure. Other movements of the subject can be driven by a formal feature like EPP, or a criterial feature like [subj].

Keywords: Yemeni Arabic, copular constructions, cartography of subjects, labeling, criterial positions

Procedia PDF Downloads 85
1445 A Study of Various Ontology Learning Systems from Text and a Look into Future

Authors: Fatima Al-Aswadi, Chan Yong

Abstract:

With the large volume of unstructured data that increases day by day on the web, the motivation of representing the knowledge in this data in the machine processable form is increased. Ontology is one of the major cornerstones of representing the information in a more meaningful way on the semantic Web. The goal of Ontology learning from text is to elicit and represent domain knowledge in the machine readable form. This paper aims to give a follow-up review on the ontology learning systems from text and some of their defects. Furthermore, it discusses how far the ontology learning process will enhance in the future.

Keywords: concept discovery, deep learning, ontology learning, semantic relation, semantic web

Procedia PDF Downloads 492
1444 Principle Components Updates via Matrix Perturbations

Authors: Aiman Elragig, Hanan Dreiwi, Dung Ly, Idriss Elmabrook

Abstract:

This paper highlights a new approach to look at online principle components analysis (OPCA). Given a data matrix X R,^m x n we characterise the online updates of its covariance as a matrix perturbation problem. Up to the principle components, it turns out that online updates of the batch PCA can be captured by symmetric matrix perturbation of the batch covariance matrix. We have shown that as n→ n0 >> 1, the batch covariance and its update become almost similar. Finally, utilize our new setup of online updates to find a bound on the angle distance of the principle components of X and its update.

Keywords: online data updates, covariance matrix, online principle component analysis, matrix perturbation

Procedia PDF Downloads 179
1443 Teaching Pragmatic Coherence in Literary Text: Analysis of Chimamanda Adichie’s Americanah

Authors: Joy Aworo-Okoroh

Abstract:

Literary texts are mirrors of a real-life situation. Thus, authors choose the linguistic items that would best encode their intended meanings and messages. However, words mean more than they seem. The meaning of words is not static rather, it is dynamic as they constantly enter into relationships within a context. Literary texts can only be meaningful if all pragmatic cues are identified and interpreted. Drawing upon Teun Van Djik's theory of local pragmatic coherence, it is established that words enter into relations in a text and these relations account for sequential speech acts in the texts. Comprehension of the text is dependent on the interpretation of these relations.To show the relevance of pragmatic coherence in literary text analysis, ten conversations were selected in Americanah in order to give a clear idea of the pragmatic relations used. The conversations were analysed, identifying the speech act and epistemic relations inherent in them. A subtle analysis of the structure of the conversations was also carried out. It was discovered that justification is the most commonly used relation and the meaning of the text is dependent on the interpretation of these instances' pragmatic coherence. The study concludes that to effectively teach literature in English, pragmatic coherence should be incorporated as words mean more than they say.

Keywords: pragmatic coherence, epistemic coherence, speech act, Americanah

Procedia PDF Downloads 118
1442 Preparation of Polyethylene/Cashewnut Flour/ Gum Arabic Polymer Blends Through Melt-blending and Determination of Their Biodegradation by Composting Method for Possible Reduction of Polyethylene-based Wastes from the Environment

Authors: Abubakar Umar Birnin-yauri

Abstract:

Plastic wastes arising from Polyethylene (PE)-based materials are increasingly becoming environmental problem, this is owed to the fact that these PE waste materials will only decompose over hundreds, or even thousands of years, during which they cause serious environmental problems. In this research, Polymer blends prepared from PE, Cashewnut flour (CNF) and Gum Arabic (GA) were studied in order to assay their biodegradation potentials via composting method. Different sample formulations were made i.e., X1= (70% PE, 25% CNF and 5% GA, X2= (70% PE, 20% CNF and 10% GA), X3= (70% PE, 15% CNF and 15% GA), X4 = (70% PE, 10% CNF and 20% GA) and X5 = (70% PE, 5% CNF and 25% GA) respectively. The results obtained showed that X1 recorded weight loss of 9.89% of its original weight after the first 20 days and 37.45% after 100 day, and X2 lost 12.67 % after the first 20 days and 42.56% after 100day, sample X5 experienced the greatest weight lost in the two methods adopted which are 52.9% and 57.89%. Instrumental analysis such as Fourier Transform Infrared Spectroscopy, Thermogravimetric analysis and Scanning electron microscopy were performed on the polymer blends before and after biodegradation. The study revealed that the biodegradation of the polymer blends is influenced by the contents of both the CNF and GA added into the blends.

Keywords: polyethylene, cashewnut, gum Arabic, biodegradation, blend, environment

Procedia PDF Downloads 54
1441 Off-Line Text-Independent Arabic Writer Identification Using Optimum Codebooks

Authors: Ahmed Abdullah Ahmed

Abstract:

The task of recognizing the writer of a handwritten text has been an attractive research problem in the document analysis and recognition community with applications in handwriting forensics, paleography, document examination and handwriting recognition. This research presents an automatic method for writer recognition from digitized images of unconstrained writings. Although a great effort has been made by previous studies to come out with various methods, their performances, especially in terms of accuracy, are fallen short, and room for improvements is still wide open. The proposed technique employs optimal codebook based writer characterization where each writing sample is represented by a set of features computed from two codebooks, beginning and ending. Unlike most of the classical codebook based approaches which segment the writing into graphemes, this study is based on fragmenting a particular area of writing which are beginning and ending strokes. The proposed method starting with contour detection to extract significant information from the handwriting and the curve fragmentation is then employed to categorize the handwriting into Beginning and Ending zones into small fragments. The similar fragments of beginning strokes are grouped together to create Beginning cluster, and similarly, the ending strokes are grouped to create the ending cluster. These two clusters lead to the development of two codebooks (beginning and ending) by choosing the center of every similar fragments group. Writings under study are then represented by computing the probability of occurrence of codebook patterns. The probability distribution is used to characterize each writer. Two writings are then compared by computing distances between their respective probability distribution. The evaluations carried out on ICFHR standard dataset of 206 writers using Beginning and Ending codebooks separately. Finally, the Ending codebook achieved the highest identification rate of 98.23%, which is the best result so far on ICFHR dataset.

Keywords: off-line text-independent writer identification, feature extraction, codebook, fragments

Procedia PDF Downloads 495
1440 The Impact of Text Modifications on Ethiopian Students’ Reading Comprehension and Motivation

Authors: Asefa Kenefergib, Dawit Amogne, Yinager Teklesellassie

Abstract:

A study investigated the effects of text modifications on reading comprehension and motivation among Ethiopian secondary school students. A total of 120 students participated, initially taking a reading comprehension pretest and completing a reading motivation questionnaire. Afterward, they were divided into three groups: control, simplified, and elaborated. Each group then took part in a reading comprehension posttest and another reading motivation questionnaire following an eight-week instructional intervention. Despite initial differences, both the simplified and elaborated text groups showed comparable levels of reading motivation and comprehension. The data were analyzed using SPSS version 25, with a one-way ANOVA used to assess the effectiveness of the modified texts in enhancing reading comprehension. The results indicated that the experimental groups performed significantly better on the posttest compared to the control group, suggesting that text modifications can positively influence students' comprehension skills. Furthermore, the impact of text modifications on student reading motivation was assessed using a one-way ANOVA. The findings revealed that both the elaborated and simplified text groups scored higher than the control group in various dimensions of reading motivation, including reading efficacy, curiosity, challenge, compliance, and reading work avoidance. However, the control and simplified groups had nearly similar mean scores in the dimension of reading competition. These results clearly demonstrate that modifying texts can enhance EFL learners' reading motivation and comprehension.

Keywords: simplification, elaboration, reading motivation, reading comprehension

Procedia PDF Downloads 10
1439 The Pragmatics of the Evil Eye: Compliment Response Strategies in Egyptian Colloquial Arabic

Authors: HebatAllah Mohamed

Abstract:

The present study aims at identifying compliment response strategies used by Egyptian students when responding to a problematic and cultural-specific type of compliments: those allegedly provoking the evil eye. Discourse Completion Tasks (DCTs) and interviews were used to collect the data. both The participants were 21 female and 16 male Egyptian graduate and undergraduate students at the American university in Cairo. The results revealed a number of both common and different main and sub-categories of responses utilized by participants of both genders. Pedagogical implications are discussed.

Keywords: Arabic pragmatics, compliment responses, evil eye pragmatics, pragmatics in Egypt

Procedia PDF Downloads 470
1438 Broadening the Roles of Masjid: Reviving Prophetic Holistic Model in Fostering Islamic Education and Arabic Language in South-Western Nigeria

Authors: Ahmad Tijani Surajudeen, Muhammad Zahiri Awang Mat, Aliy Abdulwahid Adebisi

Abstract:

With arrival of Islam in the South-Western Nigeria in the late fifteenth and early sixteenth centuries, various masājid established in different parts of the area played vital roles towards the betterment and unity of the Muslims. However, despite the fact that the masājid in the South-Western part of Nigeria contributed immensely to the spiritual and educational enhancement of the Muslims, it has not fully captured the holistic educational roles as a unique model used by the Prophet (S.A.W). Therefore, the primary objective of this paper is to investigate and broaden the roles of masjid towards its compartmentalized and holistic contributions among the Muslims in the south-western Nigeria. The findings from the paper have identified five holistic roles of masjid, namely, spiritual, intellectual, physical, social and emotional contributions which have been exemplified in the prophetic model of masjid. The paper has argued that the five factors must be unreservedly unified towards the betterment of the Muslims and enhancement of Islamic education and Arabic Language in the South-Western Nigeria. However, the challenges of masjid management in the South-Western Nigeria are the main hindrance in achieving the holistic roles of masjid. It is thereby suggested that, the management of masjid should take the identified prophetic model of masjid into account in order to positively improve the affairs of Muslims as well as promoting the teaching and learning of Islamic education and Arabic language among the Muslims in the South-Western Nigeria.

Keywords: worship, Islamic education, Arabic language, prophetic holistic model

Procedia PDF Downloads 310
1437 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 502
1436 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyze and visualize data stream in real-time basis is important in making a prompt decision by the decision maker. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytic and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, ranges of tools which implement some of these functionalities are available. In this paper, we present chronological evolution evaluation of technologies for supporting of real-time analytic and visualization of the data stream. Based on the past research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of identified tools.

Keywords: information visualization, visual analytics, text mining, visual text analytics tools, big data visualization

Procedia PDF Downloads 384
1435 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Bankole Felix, Tomio Takara

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Particularly deriving phonological features which are not shown in orthography is challenging. In the Amharic language, geminates and epenthetic vowels are very crucial for proper pronunciation, but neither is shown in orthography. In this paper, to proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions and prepared a duration modeling method. Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by sentence listening test, and we achieved an average Mean Opinion Score (MOS) 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowel, we are able to synthesize good quality speech. Our system is mainly suitable to be customized for other Ethiopian languages with limited resources.

Keywords: amharic, gemination, Speech synthesis, morphology, epenthesis

Procedia PDF Downloads 67
1434 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less accurately with daily mood evaluation score given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences by treating emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.

Keywords: analysis, data, diary, emotions, mood, sentiment

Procedia PDF Downloads 247
1433 3D Text Toys: Creative Approach to Experiential and Immersive Learning for World Literacy

Authors: Azyz Sharafy

Abstract:

3D Text Toys is an innovative and creative approach that utilizes 3D text objects to enhance creativity, literacy, and basic learning in an enjoyable and gamified manner. By using 3D Text Toys, children can develop their creativity, visually learn words and texts, and apply their artistic talents within their creative abilities. This process incorporates haptic engagement with 2D and 3D texts, word building, and mechanical construction of everyday objects, thereby facilitating better word and text retention. The concept involves constructing visual objects made entirely out of 3D text/words, where each component of the object represents a word or text element. For instance, a bird can be recreated using words or text shaped like its wings, beak, legs, head, and body, resulting in a 3D representation of the bird purely composed of text. This can serve as an art piece or a learning tool in the form of a 3D text toy. These 3D text objects or toys can be crafted using natural materials such as leaves, twigs, strings, or ropes, or they can be made from various physical materials using traditional crafting tools. Digital versions of these objects can be created using 2D or 3D software on devices like phones, laptops, iPads, or computers. To transform digital designs into physical objects, computerized machines such as CNC routers, laser cutters, and 3D printers can be utilized. Once the parts are printed or cut out, students can assemble the 3D texts by gluing them together, resulting in natural or everyday 3D text objects. These objects can be painted to create artistic pieces or text toys, and the addition of wheels can transform them into moving toys. One of the significant advantages of this visual and creative object-based learning process is that students not only learn words but also derive enjoyment from the process of creating, painting, and playing with these objects. The ownership and creation process further enhances comprehension and word retention. Moreover, for individuals with learning disabilities such as dyslexia, ADD (Attention Deficit Disorder), or other learning difficulties, the visual and haptic approach of 3D Text Toys can serve as an additional creative and personalized learning aid. The application of 3D Text Toys extends to both the English language and any other global written language. The adaptation and creative application may vary depending on the country, space, and native written language. Furthermore, the implementation of this visual and haptic learning tool can be tailored to teach foreign languages based on age level and comprehension requirements. In summary, this creative, haptic, and visual approach has the potential to serve as a global literacy tool.

Keywords: 3D text toys, creative, artistic, visual learning for world literacy

Procedia PDF Downloads 44
1432 A Case Study Comparing the Effect of Computer Assisted Task-Based Language Teaching and Computer-Assisted Form Focused Language Instruction on Language Production of Students Learning Arabic as a Foreign Language

Authors: Hanan K. Hassanein

Abstract:

Task-based language teaching (TBLT) and focus on form instruction (FFI) methods were proven to improve quality and quantity of immediate language production. However, studies that compare between the effectiveness of the language production when using TBLT versus FFI are very little with results that are not consistent. Moreover, teaching Arabic using TBLT is a new field with few research that has investigated its application inside classrooms. Furthermore, to the best knowledge of the researcher, there are no prior studies that compared teaching Arabic as a foreign language in a classroom setting using computer-assisted task-based language teaching (CATBLT) with computer-assisted form focused language instruction (CAFFI). Accordingly, the focus of this presentation is to display CATBLT and CAFFI tools when teaching Arabic as a foreign language as well as demonstrate an experimental study that aims to identify whether or not CATBLT is a more effective instruction method. The effectiveness will be determined through comparing CATBLT and CAFFI in terms of accuracy, lexical complexity, and fluency of language produced by students. The participants of the study are 20 students enrolled in two intermediate-level Arabic as a foreign language classes. The experiment will take place over the course of 7 days. Based on a study conducted by Abdurrahman Arslanyilmaz for teaching Turkish as a second language, an in-house computer assisted tool for the TBLT and another one for FFI will be designed for the experiment. The experimental group will be instructed using the in-house CATBLT tool and the control group will be taught through the in-house CAFFI tool. The data that will be analyzed are the dialogues produced by students in both the experimental and control groups when completing a task or communicating in conversational activities. The dialogues of both groups will be analyzed to understand the effect of the type of instruction (CATBLT or CAFFI) on accuracy, lexical complexity, and fluency. Thus, the study aims to demonstrate whether or not there is an instruction method that positively affects the language produced by students learning Arabic as a foreign language more than the other.

Keywords: computer assisted language teaching, foreign language teaching, form-focused instruction, task based language teaching

Procedia PDF Downloads 234
1431 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using wellknown geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

Procedia PDF Downloads 393
1430 Pragmatic Survey of Precedence as Linguistic 'Déjà Vu' in Political Text and Talk

Authors: Zarine Avetisyan

Abstract:

Both in language and literature there exists the theory of recurrence of text and talk chunks which brings us to the notion of precedence. It must be stated that precedence as a pragma-linguistic phenomenon is yet underknown and it is the main objective of the present research to revisit and reveal it thoroughly. In line with the main research objective, analysis of political text and talk provides abundant relevant data for the illustration of the phenomenon of precedence. The analysis focuses on certain pragmatic universals (e.g. intention) and categories (e.g. speech techniques) which lead to the disclosure of the present object of study.

Keywords: intention, precedence, political discourse, pragmatic universals

Procedia PDF Downloads 405
1429 Validation of the Arabic Version of the Positive and Negative Syndrome Scale (PANSS)

Authors: Arij Yehya, Suhaila Ghuloum, Abdlmoneim Abdulhakam, Azza Al-Mujalli, Mark Opler, Samer Hammoudeh, Yahya Hani, Sundus Mari, Reem Elsherbiny, Ziyad Mahfoud, Hassen Al-Amin

Abstract:

Introduction: The Positive and Negative Syndrome Scale (PANSS) is a valid instrument developed by Kay and colleagues6 to assess symptoms of patients with schizophrenia. It consists of 30 items that factor the symptoms into three subscales: positive, negative and general psychopathology. This scale has been translated and validated in several languages. Objective: This study aims to determine the validity and psychometric properties of the Arabic version of the PANSS. Methods: A standardized translation and cultural adaptation method was adopted. Patients diagnosed with schizophrenia (n=98), according to psychiatrist’s diagnosis based on DSM-IV criteria, were recruited from the Psychiatry Department at Rumailah Hospital, Qatar. A first rater confirmed the diagnosis using the Arabic version of Mini International Neuropsychiatric Interview (MINI 6). A second and independent rater-administered the Arabic version of PANSS. Also, a control group (n=101), with no history of psychiatric disorder was recruited from the family and friends of the patients and from primary health care centers in Qatar. Results: There were more males than females in our sample of patients with schizophrenia (68.9% and 31.6%, respectively). On the other hand, in the control group the number of females outweighed that of males (58.4% and 41.6% respectively). The scale had a good internal consistency with Cronbach’s alpha 0.91. There was a significant difference between the scores on the three subscales of the PANSS. Patients with schizophrenia scored significantly higher (p<.0001) than the control subjects on subscales for positive symptoms 20.01(SD=7.21) and 7.30(SD=1.38), negative symptoms 18.89(SD=8.88) and 7.37(SD=2.38) and general psychopathology 34.41 (SD=11.56) and 16.93 (SD=3.93), respectively. Factor analysis and ROC curve were carried out to further test the psychometrics of the scale. Conclusions: The Arabic version of PANSS is a reliable and valid tool to assess both positive and negative symptoms of patients with schizophrenia in a balanced manner. In addition to providing the Arab population with a standardized tool to monitor symptoms of schizophrenia, this version provides a gateway to compare the prevalence of positive and negative symptoms in the Arab world which can be compared to others done elsewhere.

Keywords: Arabic version, assessment, diagnosis, schizophrenia, validation

Procedia PDF Downloads 617
1428 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Felix Bankole, Tomio Takara, Girma Mamo

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Particularly deriving phonological features which are not shown in orthography is challenging. In the Amharic language, geminates and epenthetic vowels are very crucial for proper pronunciation but neither is shown in orthography. In this paper, we proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions, and prepared a duration modeling method. Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by sentence listening test and we achieved an average Mean Opinion Score (MOS) 3.4 (68%) which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowel, we are able to synthesize good quality speech. Our system is mainly suitable to be customized for other Ethiopian languages with limited resources.

Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis

Procedia PDF Downloads 66
1427 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: hidden markov model, natural language processing, POS tagging, viterbi algorithm

Procedia PDF Downloads 314
1426 Arabic Light Word Analyser: Roles with Deep Learning Approach

Authors: Mohammed Abu Shquier

Abstract:

This paper introduces a word segmentation method using the novel BP-LSTM-CRF architecture for processing semantic output training. The objective of web morphological analysis tools is to link a formal morpho-syntactic description to a lemma, along with morpho-syntactic information, a vocalized form, a vocalized analysis with morpho-syntactic information, and a list of paradigms. A key objective is to continuously enhance the proposed system through an inductive learning approach that considers semantic influences. The system is currently under construction and development based on data-driven learning. To evaluate the tool, an experiment on homograph analysis was conducted. The tool also encompasses the assumption of deep binary segmentation hypotheses, the arbitrary choice of trigram or n-gram continuation probabilities, language limitations, and morphology for both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), which provide justification for updating this system. Most Arabic word analysis systems are based on the phonotactic morpho-syntactic analysis of a word transmitted using lexical rules, which are mainly used in MENA language technology tools, without taking into account contextual or semantic morphological implications. Therefore, it is necessary to have an automatic analysis tool taking into account the word sense and not only the morpho-syntactic category. Moreover, they are also based on statistical/stochastic models. These stochastic models, such as HMMs, have shown their effectiveness in different NLP applications: part-of-speech tagging, machine translation, speech recognition, etc. As an extension, we focus on language modeling using Recurrent Neural Network (RNN); given that morphological analysis coverage was very low in dialectal Arabic, it is significantly important to investigate deeply how the dialect data influence the accuracy of these approaches by developing dialectal morphological processing tools to show that dialectal variability can support to improve analysis.

Keywords: NLP, DL, ML, analyser, MSA, RNN, CNN

Procedia PDF Downloads 22
1425 Logic and Arabic Grammar Debates at Medieval Ages: A Quest for Muslim Contributions to Philosophical Development

Authors: Umar Sheikh Tahir

Abstract:

This paper focuses on the historiography of the relationship between Logic and Arabic grammar in the Muslim Medieval Ages (a period between 750 and 1100/ 150 and 500 Ah). This sensation appears in the famous debate among many others between grammarians represented by abū Sa'id al-Sairafī and logicians represented by abū Bishr Mattā on Logic and its validity. This incident took place in Baghdad around 932 AD. However, this study singlehandedly samples these debates as the base for the contributions of Islamic philosophers to philosophy of language as well as Epistemology. The question that shapes this research is: What is the intellectual development for Muslim thinkers to philosophy of language in regards to this debate? The current research addresses the Arabic grammar and logical debates by conducting historiography to emphasize on Islamic philosophers’ concerns about this issue. Consequently, this debate generates philosophical phenomena and resolutions in deep-thinking. In addition, these dialogues create a language impression for Philosophy in Islamic world from the period under study. Thereupon, Islamic philosophers’ discourse on this phenomenon serves as contribution to the Philosophy of Language.

Keywords: debates, epistemology, grammar and grammarians, Islamic philosophy, philosophy language, logic

Procedia PDF Downloads 204
1424 A Sociolinguistic Study of the Outcomes of Arabic-French Contact in the Algerian Dialect Tlemcen Speech Community as a Case Study

Authors: R. Rahmoun-Mrabet

Abstract:

It is acknowledged that our style of speaking changes according to a wide range of variables such as gender, setting, the age of both the addresser and the addressee, the conversation topic, and the aim of the interaction. These differences in style are noticeable in monolingual and multilingual speech communities. Yet, they are more observable in speech communities where two or more codes coexist. The linguistic situation in Algeria reflects a state of bilingualism because of the coexistence of Arabic and French. Nevertheless, like all Arab countries, it is characterized by diglossia i.e. the concomitance of Modern Standard Arabic (MSA) and Algerian Arabic (AA), the former standing for the ‘high variety’ and the latter for the ‘low variety’. The two varieties are derived from the same source but are used to fulfil distinct functions that is, MSA is used in the domains of religion, literature, education and formal settings. AA, on the other hand, is used in informal settings, in everyday speech. French has strongly affected the Algerian language and culture because of the historical background of Algeria, thus, what can easily be noticed in Algeria is that everyday speech is characterized by code-switching from dialectal Arabic and French or by the use of borrowings. Tamazight is also very present in many regions of Algeria and is the mother tongue of many Algerians. Yet, it is not used in the west of Algeria, where the study has been conducted. The present work, which was directed in the speech community of Tlemcen-Algeria, aims at depicting some of the outcomes of the contact of Arabic with French such as code-switching, borrowing and interference. The question that has been asked is whether Algerians are aware of their use of borrowings or not. Three steps are followed in this research; the first one is to depict the sociolinguistic situation in Algeria and to describe the linguistic characteristics of the dialect of Tlemcen, which are specific to this city. The second one is concerned with data collection. Data have been collected from 57 informants who were given questionnaires and who have then been classified according to their age, gender and level of education. Information has also been collected through observation, and note taking. The third step is devoted to analysis. The results obtained reveal that most Algerians are aware of their use of borrowings. The present work clarifies how words are borrowed from French, and then adapted to Arabic. It also illustrates the way in which singular words inflect into plural. The results expose the main characteristics of borrowing as opposed to code-switching. The study also clarifies how interference occurs at the level of nouns, verbs and adjectives.

Keywords: bilingualism, borrowing, code-switching, interference, language contact

Procedia PDF Downloads 259
1423 Towards a Deconstructive Text: Beyond Language and the Politics of Absences in Samuel Beckett’s Waiting for Godot

Authors: Afia Shahid

Abstract:

The writing of Samuel Beckett is associated with meaning in the meaninglessness and the production of what he calls ‘literature of unword’. The casual escape from the world of words in the form of silences and pauses, in his play Waiting for Godot, urges to ask question of their existence and ultimately leads to investigate the theory behind their use in the play. This paper proposes that these absences (silence and pause) in Beckett’s play force to think ‘beyond’ language. This paper asks how silence and pause in Beckett’s text speak for the emergence of poststructuralist text. It aims to identify the significant features of the philosophy of deconstruction in the play of Beckett to demystify the hostile complicity between literature and philosophy. With the interpretive paradigm of poststructuralism this research focuses on the text as a research data. It attempts to delineate the relationship between poststructuralist theoretical concerns and text of Beckett. Keeping in view the theoretical concerns of Poststructuralist theorist Jacques Derrida, the main concern of the discussion is directed towards the notion of ‘beyond’ language into the absences that are aimed at silencing the existing discourse with the ‘radical irony’ of this anti-formal art that contains its own denial and thus represents the idea of ceaseless questioning and radical contradiction in art and any text. This article asks how text of Beckett vibrates with loud silence and has disrupted language to demonstrate the emptiness of words and thus exploring the limitless void of absences. Beckett’s text resonates with silence and pause that is neither negation nor affirmation rather a poststructuralist’s suspension of reality that is ever changing with the undecidablity of all meanings. Within the theoretical notion of Derrida’s Différance this study interprets silence and pause in Beckett’s art. The silence and pause behave like Derrida’s Différance and have questioned their own existence in the text to deconstruct any definiteness and finality of reality to extend an undecidable threshold of poststructuralists that aims to evade the ‘labyrinth of language’.

Keywords: Différance, language, pause, poststructuralism, silence, text

Procedia PDF Downloads 194
1422 The Platform for Digitization of Georgian Documents

Authors: Erekle Magradze, Davit Soselia, Levan Shughliashvili, Irakli Koberidze, Shota Tsiskaridze, Victor Kakhniashvili, Tamar Chaghiashvili

Abstract:

Since the beginning of active publishing activity in Georgia, voluminous printed material has been accumulated, the digitization of which is an important task. Digitized materials will be available to the audience, and it will be possible to find text in them and conduct various factual research. Digitizing scanned documents means scanning documents, extracting text from the scanned documents, and processing the text into a corresponding language model to detect inaccuracies and grammatical errors. Implementing these stages requires a unified, scalable, and automated platform, where the digital service developed for each stage will perform the task assigned to it; at the same time, it will be possible to develop these services dynamically so that there is no interruption in the work of the platform.

Keywords: NLP, OCR, BERT, Kubernetes, transformers

Procedia PDF Downloads 127
1421 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or part of it. SSL models are divided into two different categories, including pre-text task-based models and contrastive learning ones. Pre-text tasks are some auxiliary tasks learning pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL using pre-text task solving is defining an appropriate pre-text task for each image dataset with a variety of image modalities. Therefore, it is required to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic designing of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on In-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description for optimizing the pre-text task design and its hyper-parameters using Meta-learning models. The representations learned from the pre-text tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities in addition to its superior performance for previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 58