Search results for: Arabic text classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3586

Search results for: Arabic text classification

3466 Study and Acquisition of the Duality of the Arabic Language

Authors: Oleg Redkin, Olga Bernikova

Abstract:

It is commonly accepted that every language is both pure linguistic phenomenon as well as socially significant communicative system, which exists on the basis of certain society - its collective 'native speaker'. Therefore the language evolution and features besides its own linguistic rules and regulations are also defined by the influence of a number of extra-linguistic factors. The above mentioned statement may be illustrated by the example of the Arabic language which may be characterized by the following peculiarities: - the inner logic of the Arabic language - the 'algebraicity' of its morphological paradigms and grammar rules; - association of the Arabic language with the sacred texts of Islam, its close ties with the pre-Islamic and Islamic cultural heritage - the pre-Islamic poetry and Islamic literature and science; - territorial distribution, which in recent years went far beyond the boundaries of its traditional realm due to the development of new technologies and the spread of mass media, and what is more important, migration processes; - association of the Arabic language with the so called 'Renaissance of Islam'. These peculiarities should be remembered while considering the status of the Modern Standard Arabic (MSA) language or the Classical Arabic (CA) language as well as the Modern Arabic (MA) dialects in synchrony or from the diachronic point of view. Continuity of any system in diachrony on the one hand depends on the level of its ability to adapt itself to changing environment and by its internal ties on the other. Structural durability of language is characterized by its inner logic, hierarchy of paradigms and its grammar rules, as well as continuity of their implementation in acts of everyday communication. Since the Arabic language is both linguistic and social phenomenon the process of the Arabic language acquisition and study should not be focused only on the knowledge about linguistic features or development of communicative skills alone, but must be supplied with the information related to culture, history and religion of peoples of certain region that will expand and enrich competences of the target audience.

Keywords: Arabic, culture, Islam, language

Procedia PDF Downloads 249
3465 Misconception on Multilingualism in Glorious Quran

Authors: Muhammed Unais

Abstract:

The holy Quran is a pure Arabic book completely ensured the absence of non Arabic term. If it was revealed in a multilingual way including various foreign languages besides the Arabic, it can be easily misunderstood that the Arabs became helpless to compile such a work positively responding to the challenge of Allah due to their lack of knowledge in other languages in which the Quran is compiled. As based on the presence of some non Arabic terms in Quran like Istabrq, Saradiq, Rabbaniyyoon, etc. some oriental scholars argued that the holy Quran is not a book revealed in Arabic. We can see some Muslim scholars who either support or deny the presence of foreign terms in Quran but all of them agree that the roots of these words suspected as non Arabic are from foreign languages and are assimilated to the Arabic and using as same in that foreign language. After this linguistic assimilation was occurred and the assimilated non Arabic words became familiar among the Arabs, the Quran revealed as using these words in such a way stating that all words it contains are Arabic either pure or assimilated. Hence the two of opinions around the authenticity and reliability of etymology of these words are right. Those who argue the presence of foreign words he is right by the way of the roots of that words are from foreign and those who argue its absence he is right for that are assimilated and changed as the pure Arabic. The possibility of multilingualism in a monolingual book is logically negative but its significance is being changed according to time and place. The problem of multilingualism in Quran is the misconception raised by some oriental scholars that the Arabs became helpless to compile a book equal to Quran not because of their weakness in Arabic but because the Quran is revealed in languages they are ignorant on them. Really, the Quran was revealed in pure Arabic, the most literate language of the Arabs, and the whole words and its meaning were familiar among them. If one become positively aware of the linguistic and cultural assimilation ever found in whole civilizations and cultural sets he will have not any question in this respect. In this paper the researcher intends to shed light on the possibility of multilingualism in a monolingual book and debates among scholars in this issue, foreign terms in Quran and the logical justifications along with the exclusive features of Quran.

Keywords: Quran, foreign Terms, multilingualism, language

Procedia PDF Downloads 357
3464 Emotional Analysis for Text Search Queries on Internet

Authors: Gemma García López

Abstract:

The goal of this study is to analyze if search queries carried out in search engines such as Google, can offer emotional information about the user that performs them. Knowing the emotional state in which the Internet user is located can be a key to achieve the maximum personalization of content and the detection of worrying behaviors. For this, two studies were carried out using tools with advanced natural language processing techniques. The first study determines if a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.

Keywords: emotion classification, text search queries, emotional analysis, sentiment analysis in text, natural language processing

Procedia PDF Downloads 116
3463 Conceptual Metaphors of Responsibility in Arabic to English Translation of Political Speeches: A Corpus-Based Study

Authors: Amr Anany

Abstract:

This study offers a corpus-based analysis of the conceptual metaphors of RESPONSIBILITY inherent in the Arabic political speeches of King Abdulla II and their English translations rendered by the translators of the Royal Hashemite Court ("RHC translators"). In view of the Conceptual Metaphor Theory (CMT), the current study aims to uncover the extent to which the dominant ideology in the source Arabic speeches of King Abdulla II is conveyed into the target English translation. The study explores a bilingual corpus, including eleven authentic Arabic speeches delivered by King Abdulla II and their English translations. The study finds that both Arabic and English share several metaphorical expressions of RESPONSIBILITY that are based on bodily experience such as RESPONSIBILITY IS UP, RESPONSIBILITY IS AN OBJECT, and RESPONSIBILITY IS AN HONOR. Apparently, the study concludes that RHC translators succeed to convey the dominant ideology from the source Arabic speeches to the English ones using specific translation strategies.

Keywords: cognitive linguistics, CDA, conceptual metaphor theory, ideology, responsibility

Procedia PDF Downloads 37
3462 Structure Analysis of Text-Image Connection in Jalayrid Period Illustrated Manuscripts

Authors: Mahsa Khani Oushani

Abstract:

Text and image are two important elements in the field of Iranian art, the text component and the image component have always been manifested together. The image narrates the text and the text is the factor in the formation of the image and they are closely related to each other. The connection between text and image is an interactive and two-way connection in the tradition of Iranian manuscript arrangement. The interaction between the narrative description and the image scene is the result of a direct and close connection between the text and the image, which in addition to the decorative aspect, also has a descriptive aspect. In this article the connection between the text element and the image element and its adaptation to the theory of Roland Barthes, the structuralism theorist, in this regard will be discussed. This study tends to investigate the question of how the connection between text and image in illustrated manuscripts of the Jalayrid period is defined according to Barthes’ theory. And what kind of proportion has the artist created in the composition between text and image. Based on the results of reviewing the data of this study, it can be inferred that in the Jalayrid period, the image has a reference connection and although it is of major importance on the page, it also maintains a close connection with the text and is placed in a special proportion. It is not necessarily balanced and symmetrical and sometimes uses imbalance for composition. This research has been done by descriptive-analytical method, which has been done by library collection method.

Keywords: structure, text, image, Jalayrid, painter

Procedia PDF Downloads 193
3461 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction will lead us to a more understandable and precise language. In most of the works in Persian, these techniques are addressed individually. Despite that, we believe that for text refinement in Persian, all of these tasks are necessary. In this work, we proposed a ViraPart framework that uses embedded ParsBERT in its core for text clarifications. First, used the BERT variant for Persian followed by a classifier layer for classification procedures. Next, we combined models outputs to output cleartext. In the end, the proposed model for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction performs the averaged F1 macro scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 171
3460 Exploring Utility and Intrinsic Value among UAE Arabic Teachers in Integrating M-Learning

Authors: Dina Tareq Ismail, Alexandria A. Proff

Abstract:

The United Arab Emirates (UAE) is a nation seeking to advance in all fields, particularly education. One area of focus for UAE 2021 agenda is to restructure UAE schools and universities by equipping them with highly developed technology. The agenda also advises educational institutions to prepare students with applicable and transferrable Information and Communication Technology (ICT) skills. Despite the emphasis on ICT and computer literacy skills, there exists limited empirical data on the use of M-Learning in the literature. This qualitative study explores the motivation of higher primary Arabic teachers in private schools toward implementing and integrating M-Learning apps in their classrooms. This research employs a phenomenological approach through the use of semistructured interviews with nine purposefully selected Arabic teachers. The data were analyzed using a content analysis via multiple stages of coding: open, axial, and thematic. Findings reveal three primary themes: (1) Arabic teachers with high levels of procedural knowledge in ICT are more motivated to implement M-Learning; (2) Arabic teachers' perceptions of self-efficacy influence their motivation toward implementation of M-Learning; (3) Arabic teachers implement M-Learning when they possess high utility and/or intrinsic value in these applications. These findings indicate a strong need for further training, equipping, and creating buy-in among Arabic teachers to enhance their ICT skills in implementing M-Learning. Further, given the limited availability of M-Learning apps designed for use in the Arabic language on the market, it is imperative that developers consider designing M-Learning tools that Arabic teachers, and Arabic-speaking students, can use and access more readily. This study contributes to closing the knowledge gap on teacher-motivation for implementing M-Learning in their classrooms in the UAE.

Keywords: ICT skills, m-learning, self-efficacy, teacher-motivation

Procedia PDF Downloads 82
3459 Sensitive Analysis of the ZF Model for ABC Multi Criteria Inventory Classification

Authors: Makram Ben Jeddou

Abstract:

The ABC classification is widely used by managers for inventory control. The classical ABC classification is based on the Pareto principle and according to the criterion of the annual use value only. Single criterion classification is often insufficient for a closely inventory control. Multi-criteria inventory classification models have been proposed by researchers in order to take into account other important criteria. From these models, we will consider the ZF model in order to make a sensitive analysis on the composite score calculated for each item. In fact, this score based on a normalized average between a good and a bad optimized index can affect the ABC items classification. We will then focus on the weights assigned to each index and propose a classification compromise.

Keywords: ABC classification, multi criteria inventory classification models, ZF-model

Procedia PDF Downloads 478
3458 A Preliminary Study for Building an Arabic Corpus of Pair Questions-Texts from the Web: Aqa-Webcorp

Authors: Wided Bakari, Patrce Bellot, Mahmoud Neji

Abstract:

With the development of electronic media and the heterogeneity of Arabic data on the Web, the idea of building a clean corpus for certain applications of natural language processing, including machine translation, information retrieval, question answer, become more and more pressing. In this manuscript, we seek to create and develop our own corpus of pair’s questions-texts. This constitution then will provide a better base for our experimentation step. Thus, we try to model this constitution by a method for Arabic insofar as it recovers texts from the web that could prove to be answers to our factual questions. To do this, we had to develop a java script that can extract from a given query a list of html pages. Then clean these pages to the extent of having a database of texts and a corpus of pair’s question-texts. In addition, we give preliminary results of our proposal method. Some investigations for the construction of Arabic corpus are also presented in this document.

Keywords: Arabic, web, corpus, search engine, URL, question, corpus building, script, Google, html, txt

Procedia PDF Downloads 295
3457 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 277
3456 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question Answering (QA) system is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task generates a list of candidate answers to the user's question, in which only one answer is correct. Answer selection is one of the main components of the QA, which is concerned with selecting the best answer choice from the candidate answers suggested by the system. However, the selection process can be very challenging especially in Arabic due to its particularities. To address this challenge, an approach is proposed to answer questions with multiple answer choices for Arabic QA systems based on Textual Entailment (TE) recognition. The developed approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize the entailment between the generated hypotheses (H) and the text (T). A set of experiments has been conducted for performance evaluation and the overall performance of the proposed method reached an accuracy of 67.5% with C@1 score of 80.46%. The obtained results are promising and demonstrate that the proposed method is effective for TE recognition task.

Keywords: information retrieval, machine learning, natural language processing, question answering, textual entailment

Procedia PDF Downloads 120
3455 English Loanwords in the Egyptian Variety of Arabic: Morphological and Phonological Changes

Authors: Mohamed Yacoub

Abstract:

This paper investigates the English loanwords in the Egyptian variety of Arabic and reaches three findings. Data, in the first finding, were collected from Egyptian movies and soap operas; over two hundred words have been borrowed from English, code-switching was not included. These words then have been put into eleven different categories according to their use and part of speech. Finding two addresses the morphological and phonological change that occurred to these words. Regarding the phonological change, eight categories were found in both consonant and vowel variation, five for consonants and three for vowels. Examples were given for each. Regarding the morphological change, five categories were found including the masculine, feminine, dual, broken, and non-pluralize-able nouns. The last finding is the answers to a four-question survey that addresses forty eight native speakers of Egyptian Arabic and found that most participants did not recognize English borrowed words and thought they were originally Arabic and could not give Arabic equivalents for the loanwords that they could recognize.

Keywords: sociolinguistics, loanwords, borrowing, morphology, phonology, variation, Egyptian dialect

Procedia PDF Downloads 357
3454 1/Sigma Term Weighting Scheme for Sentiment Analysis

Authors: Hanan Alshaher, Jinsheng Xu

Abstract:

Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.

Keywords: 1/sigma, natural language processing, sentiment analysis, term weighting scheme, text classification

Procedia PDF Downloads 180
3453 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years but most of the work is focused on data in the English language. A comprehensive research and analysis are essential which considers multiple languages, machine translation techniques, and different classifiers. This paper presents, a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one using classification of text without language translation and second using the translation of testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and accuracy of various classifiers for multilingual sentiment analysis is also discussed in this study.

Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis

Procedia PDF Downloads 680
3452 Voice Commands Recognition of Mentor Robot in Noisy Environment Using HTK

Authors: Khenfer-Koummich Fatma, Hendel Fatiha, Mesbahi Larbi

Abstract:

this paper presents an approach based on Hidden Markov Models (HMM: Hidden Markov Model) using HTK tools. The goal is to create a man-machine interface with a voice recognition system that allows the operator to tele-operate a mentor robot to execute specific tasks as rotate, raise, close, etc. This system should take into account different levels of environmental noise. This approach has been applied to isolated words representing the robot commands spoken in two languages: French and Arabic. The recognition rate obtained is the same in both speeches, Arabic and French in the neutral words. However, there is a slight difference in favor of the Arabic speech when Gaussian white noise is added with a Signal to Noise Ratio (SNR) equal to 30 db, the Arabic speech recognition rate is 69% and 80% for French speech recognition rate. This can be explained by the ability of phonetic context of each speech when the noise is added.

Keywords: voice command, HMM, TIMIT, noise, HTK, Arabic, speech recognition

Procedia PDF Downloads 352
3451 Encryption and Decryption of Nucleic Acid Using Deoxyribonucleic Acid Algorithm

Authors: Iftikhar A. Tayubi, Aabdulrahman Alsubhi, Abdullah Althrwi

Abstract:

The deoxyribonucleic acid text provides a single source of high-quality Cryptography about Deoxyribonucleic acid sequence for structural biologists. We will provide an intuitive, well-organized and user-friendly web interface that allows users to encrypt and decrypt Deoxy Ribonucleic Acid sequence text. It includes complex, securing by using Algorithm to encrypt and decrypt Deoxy Ribonucleic Acid sequence. The utility of this Deoxy Ribonucleic Acid Sequence Text is that, it can provide a user-friendly interface for users to Encrypt and Decrypt store the information about Deoxy Ribonucleic Acid sequence. These interfaces created in this project will satisfy the demands of the scientific community by providing fully encrypt of Deoxy Ribonucleic Acid sequence during this website. We have adopted a methodology by using C# and Active Server Page.NET for programming which is smart and secure. Deoxy Ribonucleic Acid sequence text is a wonderful piece of equipment for encrypting large quantities of data, efficiently. The users can thus navigate from one encoding and store orange text, depending on the field for user’s interest. Algorithm classification allows a user to Protect the deoxy ribonucleic acid sequence from change, whether an alteration or error occurred during the Deoxy Ribonucleic Acid sequence data transfer. It will check the integrity of the Deoxy Ribonucleic Acid sequence data during the access.

Keywords: algorithm, ASP.NET, DNA, encrypt, decrypt

Procedia PDF Downloads 203
3450 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: big data, social networks, sentiment analysis, twitter

Procedia PDF Downloads 538
3449 Mask-Prompt-Rerank: An Unsupervised Method for Text Sentiment Transfer

Authors: Yufen Qin

Abstract:

Text sentiment transfer is an important branch of text style transfer. The goal is to generate text with another sentiment attribute based on a text with a specific sentiment attribute while maintaining the content and semantic information unrelated to sentiment unchanged in the process. There are currently two main challenges in this field: no parallel corpus and text attribute entanglement. In response to the above problems, this paper proposed a novel solution: Mask-Prompt-Rerank. Use the method of masking the sentiment words and then using prompt regeneration to transfer the sentence sentiment. Experiments on two sentiment benchmark datasets and one formality transfer benchmark dataset show that this approach makes the performance of small pre-trained language models comparable to that of the most advanced large models, while consuming two orders of magnitude less computing and memory.

Keywords: language model, natural language processing, prompt, text sentiment transfer

Procedia PDF Downloads 48
3448 A Neural Approach for the Offline Recognition of the Arabic Handwritten Words of the Algerian Departments

Authors: Salim Ouchtati, Jean Sequeira, Mouldi Bedda

Abstract:

In this work we present an off line system for the recognition of the Arabic handwritten words of the Algerian departments. The study is based mainly on the evaluation of neural network performances, trained with the gradient back propagation algorithm. The used parameters to form the input vector of the neural network are extracted on the binary images of the handwritten word by several methods: the parameters of distribution, the moments centered of the different projections and the Barr features. It should be noted that these methods are applied on segments gotten after the division of the binary image of the word in six segments. The classification is achieved by a multi layers perceptron. Detailed experiments are carried and satisfactory recognition results are reported.

Keywords: handwritten word recognition, neural networks, image processing, pattern recognition, features extraction

Procedia PDF Downloads 483
3447 Exploratory Analysis of A Review of Nonexistence Polarity in Native Speech

Authors: Deawan Rakin Ahamed Remal, Sinthia Chowdhury, Sharun Akter Khushbu, Sheak Rashed Haider Noori

Abstract:

Native Speech to text synthesis has its own leverage for the purpose of mankind. The extensive nature of art to speaking different accents is common but the purpose of communication between two different accent types of people is quite difficult. This problem will be motivated by the extraction of the wrong perception of language meaning. Thus, many existing automatic speech recognition has been placed to detect text. Overall study of this paper mentions a review of NSTTR (Native Speech Text to Text Recognition) synthesis compared with Text to Text recognition. Review has exposed many text to text recognition systems that are at a very early stage to comply with the system by native speech recognition. Many discussions started about the progression of chatbots, linguistic theory another is rule based approach. In the Recent years Deep learning is an overwhelming chapter for text to text learning to detect language nature. To the best of our knowledge, In the sub continent a huge number of people speak in Bangla language but they have different accents in different regions therefore study has been elaborate contradictory discussion achievement of existing works and findings of future needs in Bangla language acoustic accent.

Keywords: TTR, NSTTR, text to text recognition, deep learning, natural language processing

Procedia PDF Downloads 101
3446 Towards an Analysis of Rhetoric of Digital Arabic Discourse

Authors: Gameel Abdelmageed

Abstract:

Arabs have a rhetorical heritage which has greatly contributed to the monitoring and analyzing of the rhetoric of the Holy Quran, Hadith, and Arabic texts on poetry and oratory. But Arab scholars - as far as the researcher knows – have not contributed to monitoring and analyzing the rhetoric of digital Arabic discourse although it has prominence, particularly in social media and has strong effectiveness in the political and social life of Arab society. This discourse has made its impact by using very new rhetorical techniques in language, voice, image, painting and video clips which are known as “Multimedia” and belong to “Digital Rhetoric”. This study suggests that it is time to draw the attention of Arab scholars and invite them to monitor and analyze the rhetoric of digital Arabic discourse.

Keywords: digital discourse, digital rhetoric, Facebook, social media

Procedia PDF Downloads 346
3445 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance is to detect or predict disease outbreaks through the analysis of medical sources of data. Using social media data like tweets to do syndromic surveillance becomes more and more popular with the aid of open platform to collect data and the advantage of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework is presented with machine learning kernel using tweets data analytics. Influenza and the three cities Abu Dhabi, Al Ain and Dubai of United Arabic Emirates are used as the test disease and trial areas. Hospital cases data provided by the Health Authority of Abu Dhabi (HAAD) are used for the correlation purpose. In our model, Latent Dirichlet allocation (LDA) engine is adapted to do supervised learning classification and N-Fold cross validation confusion matrix are given as the simulation results with overall system recall 85.595% performance achieved.

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 87
3444 The Conceptual Relationships in N+N Compounds in Arabic Compared to English

Authors: Abdel Rahman Altakhaineh

Abstract:

This paper has analysed the conceptual relations between the elements of NN compounds in Arabic and compared them to those found in English based on the framework of Conceptual Semantics and a modified version of Parallel Architecture referred to as Relational Morphology. The analysis revealed that the repertoire of possible semantic relations between the two nouns in Arabic NN compounds reproduces that in English NN compounds and that, therefore, the main difference is in headedness (right-headed in English, left-headed in Arabic). Adopting RM allows productive and idiosyncratic elements to interweave with each other naturally. Semantically transparent compounds can be stored in memory or produced and understood online, while compounds with different degrees of semantic idiosyncrasy are stored in memory. Furthermore, the predictable parts of idiosyncratic compounds are captured by general schemas. In compounds, such schemas pick out the range of possible semantic relations between the two nouns. Finally, conducting a cross-linguistic study of the systematic patterns of possible conceptual relationships between compound elements is an area worthy of further exploration. In addition, comparing and contrasting compounding in Arabic and Hebrew, especially as they are both Semitic languages, is another area that needs to be investigated thoroughly. It will help morphologists understand the extent to which Jackendoff’s repertoire of semantic relations in compounds is universal. That is, if a language as distant from English as Arabic displays a similar range of cases, this is evidence for a (relatively) universal set of relations from which individual languages may pick and choose.

Keywords: conceptual semantics, morphology, compounds, arabic, english

Procedia PDF Downloads 72
3443 A Model for Teaching Arabic Grammar in Light of the Common European Framework of Reference for Languages

Authors: Erfan Abdeldaim Mohamed Ahmed Abdalla

Abstract:

The complexity of Arabic grammar poses challenges for learners, particularly in relation to its arrangement, classification, abundance, and bifurcation. The challenge at hand is a result of the contextual factors that gave rise to the grammatical rules in question, as well as the pedagogical approach employed at the time, which was tailored to the needs of learners during that particular historical period. Consequently, modern-day students encounter this same obstacle. This requires a thorough examination of the arrangement and categorization of Arabic grammatical rules based on particular criteria, as well as an assessment of their objectives. Additionally, it is necessary to identify the prevalent and renowned grammatical rules, as well as those that are infrequently encountered, obscure and disregarded. This paper presents a compilation of grammatical rules that require arrangement and categorization in accordance with the standards outlined in the Common European Framework of Reference for Languages (CEFR). In addition to facilitating comprehension of the curriculum, accommodating learners' requirements, and establishing the fundamental competencies for achieving proficiency in Arabic, it is imperative to ascertain the conventions that language learners necessitate in alignment with explicitly delineated benchmarks such as the CEFR criteria. The aim of this study is to reduce the quantity of grammatical rules that are typically presented to non-native Arabic speakers in Arabic textbooks. This reduction is expected to enhance the motivation of learners to continue their Arabic language acquisition and to approach the level of proficiency of native speakers. The primary obstacle faced by learners is the intricate nature of Arabic grammar, which poses a significant challenge in the realm of study. The proliferation and complexity of regulations evident in Arabic language textbooks designed for individuals who are not native speakers is noteworthy. The inadequate organisation and delivery of the material create the impression that the grammar is being imparted to a student with the intention of memorising "Alfiyyat-Ibn-Malik." Consequently, the sequence of grammatical rules instruction was altered, with rules originally intended for later instruction being presented first and those intended for earlier instruction being presented subsequently. Students often focus on learning grammatical rules that are not necessarily required while neglecting the rules that are commonly used in everyday speech and writing. Non-Arab students are taught Arabic grammar chapters that are infrequently utilised in Arabic literature and may be a topic of debate among grammarians. The aforementioned findings are derived from the statistical analysis and investigations conducted by the researcher, which will be disclosed in due course of the research. To instruct non-Arabic speakers on grammatical rules, it is imperative to discern the most prevalent grammatical frameworks in grammar manuals and linguistic literature (study sample). The present proposal suggests the allocation of grammatical structures across linguistic levels, taking into account the guidelines of the CEFR, as well as the grammatical structures that are necessary for non-Arabic-speaking learners to generate a modern, cohesive, and comprehensible language.

Keywords: grammar, Arabic, functional, framework, problems, standards, statistical, popularity, analysis

Procedia PDF Downloads 59
3442 Anatomical Survey for Text Pattern Detection

Authors: S. Tehsin, S. Kausar

Abstract:

The ultimate aim of machine intelligence is to explore and materialize the human capabilities, one of which is the ability to detect various text objects within one or more images displayed on any canvas including prints, videos or electronic displays. Multimedia data has increased rapidly in past years. Textual information present in multimedia contains important information about the image/video content. However, it needs to technologically testify the commonly used human intelligence of detecting and differentiating the text within an image, for computers. Hence in this paper feature set based on anatomical study of human text detection system is proposed. Subsequent examination bears testimony to the fact that the features extracted proved instrumental to text detection.

Keywords: biologically inspired vision, content based retrieval, document analysis, text extraction

Procedia PDF Downloads 418
3441 The Formation of the Diminutive in Colloquial Jordanian Arabic

Authors: Yousef Barahmeh

Abstract:

This paper is a linguistic and pragmatic analysis of the use of the diminutive in Colloquial Jordanian Arabic (CJA). It demonstrates a peculiar form of the diminutive in CJA inflected by means of feminine plural ends with -aat suffix. The analysis shows that the pragmatic function(s) of the diminutive in CJA refers primarily to ‘littleness’ while the morphological inflection conveys the message of ‘the plethora’. Examples of this linguistic phenomenon are intelligible and often include a large number of words that are culture-specific to the rural dialect in the north of Jordan. In both cases, the diminutive in CJA is an adaptive strategy relative to its pragmatic and social contexts.

Keywords: Colloquial Jordanian Arabic, diminutive, morphology, pragmatics

Procedia PDF Downloads 236
3440 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

By the evolvement in technology, the way of expressing opinions switched the direction to the digital world. The domain of politics as one of the hottest topics of opinion mining research merged together with the behavior analysis for affiliation determination in text which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 are constituted by Linguistic Inquiry and Word Count (LIWC) features are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers when used with SVM and IGR feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “function” as an aggregate feature of the linguistic category, is obtained as the most differentiating feature among the 68 features with 81% accuracy by itself in classifying articles either as Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 353
3439 Current Trends in the Arabic Linguistics Development: Between National Tradition and Global Tendencies

Authors: Olga Bernikova, Oleg Redkin

Abstract:

Globalization is a process of worldwide economic, political and cultural integration. Obviously, this phenomenon has both positive and negative issues. This article analyzes the impact of the modern process of globalization on the national traditions of language teaching and research. In this context, the problem of the ratio of local to global can be viewed from several sides. Firstly, since English is the language of over 80 percent of scientific and technical research worldwide, what should be the language of science in certain region? Secondly, language 'globality' is not always associated with English, because intercultural communications may have their regional peculiarities. For example, in the Arab world, Modern Standard Arabic can also be regarded as 'global' phenomenon, since the mother-tongue languages of the population are local Arabic dialects. In addition, the correlation 'local' versus 'global' is manifested not only in the linguistic sphere but also in the methodology used in language acquisition and research. Thus, the major principles of the Arabic philological tradition, which goes back to the 7th century, are still spread in the modern Arab world. At the same time, the terminology and methods of language research that are peculiar to this tradition are quite far from the issues of general linguistics that underlies the description of all the languages of the world. The present research relies on a comparative analysis of sources in Arabic linguistics, including original works in Arabic dating back to the 12th-13th centuries. As a case study, interaction of local and global is also considered on the example of the Arabic teaching and research in Russia. Speaking about the correlation between local and global it is possible to forecast development of two parallel tendencies: the spread of the phenomena of globalization on one hand, and local implementation of a language policy aimed at preserving native languages, including Arabic, on the other.

Keywords: Arabic, global, language, local, tradition

Procedia PDF Downloads 228
3438 The Syntactic Features of Islamic Legal Texts and Their Implications for Translation

Authors: Rafat Y. Alwazna

Abstract:

Certain religious texts are deemed part of legal texts that are characterised by high sensitivity and sacredness. Amongst such religious texts are Islamic legal texts that are replete with Islamic legal terms that designate particular legal concepts peculiar to Islamic legal system and legal culture. However, from the syntactic perspective, Islamic legal texts prove lengthy, condensed and convoluted, with little use of punctuation system, but with an extensive use of subordinations and co-ordinations, which separate the main verb from the subject, and which, of course, carry a heavy load of legal detail. The present paper seeks to examine the syntactic features of Islamic legal texts through analysing a short text of Islamic jurisprudence in an attempt at exploring the syntactic features that characterise this type of legal text. A translation of this text into legal English is then exercised to find the translation implications that have emerged as a result of the English translation. Based on these implications, the paper compares and contrasts the syntactic features of Islamic legal texts to those of legal English texts. Finally, the present paper argues that there are a number of syntactic features of Islamic legal texts, such as nominalisation, passivisation, little use of punctuation system, the use of the Arabic cohesive device, etc., which are also possessed by English legal texts except for the last feature and with some variations. The paper also claims that when rendering an Islamic legal text into legal English, certain implications emerge, such as the necessity of a sentence break, the omission of the cohesive device concerned and the increase in the use of nominalisation, passivisation, passive participles, and so on.

Keywords: English legal texts, Islamic legal texts, nominalisation, participles, passivisation, syntactic features, translation implications

Procedia PDF Downloads 186
3437 Classification of Attacks Over Cloud Environment

Authors: Karim Abouelmehdi, Loubna Dali, Elmoutaoukkil Abdelmajid, Hoda Elsayed, Eladnani Fatiha, Benihssane Abderahim

Abstract:

The security of cloud services is the concern of cloud service providers. In this paper, we will mention different classifications of cloud attacks referred by specialized organizations. Each agency has its classification of well-defined properties. The purpose is to present a high-level classification of current research in cloud computing security. This classification is organized around attack strategies and corresponding defenses.

Keywords: cloud computing, classification, risk, security

Procedia PDF Downloads 506