Search results for: arabic text
1441 The Sufi Madad in Arabic Literature and Translation
Authors: Riham Debian
Abstract:
This paper deals with the translational mystic in Arabic aesthetics and their linguistic and narrative revelation and mediation across textual spaces. The paper particularly engages with the nature of the Egyptian Sufi Madad, its relation to spaces/places, its intergenerational and intertextual manifestations, and its intersection with questions of identity—the historical spaces and geographical places one inhabits and embodies. Opening a repertoire between contextualized stylistics and poetics semiology (Boise-Bier2011; Jackobson 1960), the paper reads in al-Ghitany’s Kitab al-Tagiliat (The Book of Revelation1983), Bassiouny’s Sabil Al-Ghareq (2018) and its translation (Fountain of the Drowning2022). The paper examines the stylistic and poetical encoding and recoding of the Sufi Madads from Ghitany to Bassiouny and their entanglement in the question of Egyptian identity-politics through the embodiment of historical places and geographical spaces. The paper argues for the intergenerational intertextuality of Arabic aesthetics that stylistically and poetically enacts the mysticism of Sufi Madad through historical and geographical semioticization of the Egyptian character continuity across time and space. Both Ghitany and Bassiouny engage with the historical novel as a form of delivery of their Egyptian mystical relation with time and place. Both novelist-historians are involved with the question of place and the life-worlds that spaces generate across time and gender.Keywords: intertextuality, interdiscusivity, madad, egyptian identity
Procedia PDF Downloads 961440 Literary Theatre and Embodied Theatre: A Practice-Based Research in Exploring the Authorship of a Performance
Authors: Rahul Bishnoi
Abstract:
Theatre, as Ann Ubersfld calls it, is a paradox. At once, it is both a literary work and a physical representation. Theatre as a text is eternal, reproducible, and identical while as a performance, theatre is momentary and never identical to the previous performances. In this dual existence of theatre, who is the author? Is the author the playwright who writes the dramatic text, or the director who orchestrates the performance, or the actor who embodies the text? From the poststructuralist lens of Barthes, the author is dead. Barthes’ argument of discrete temporality, i.e. the author is the before, and the text is the after, does not hold true for theatre. A published literary work is written, edited, printed, distributed and then gets consumed by the reader. On the other hand, theatrical production is immediate; an actor performs and the audience witnesses it instantaneously. Time, so to speak, does not separate the author, the text, and the reader anymore. The question of authorship gets further complicated in Augusto Boal’s “Theatre of the Oppressed” movement where the audience is a direct participant like the actors in the performance. In this research, through an experimental performance, the duality of theatre is explored with the authorship discourse. And the conventional definition of authorship is subjected to additional complexity by erasing the distinction between an actor and the audience. The design/methodology of the experimental performance is as follows: The audience will be asked to produce a text under an anonymous virtual alias. The text, as it is being produced, will be read and performed by the actor. The audience who are also collectively “authoring” the text, will watch this performance and write further until everyone has contributed with one input each. The cycle of writing, reading, performing, witnessing, and writing will continue until the end. The intention is to create a dynamic system of writing/reading with the embodiment of the text through the actor. The actor is giving up the power to the audience to write the spoken word, stage instruction and direction while still keeping the agency of interpreting that input and performing in the chosen manner. This rapid conversation between the actor and the audience also creates a conversion of authorship. The main conclusion of this study is a perspective on the nature of dynamic authorship of theatre containing a critical enquiry of the collaboratively produced text, an individually performed act, and a collectively witnessed event. Using practice as a methodology, this paper contests the poststructuralist notion of the author as merely a ‘scriptor’ and breaks it further by involving the audience in the authorship as well.Keywords: practice based research, performance studies, post-humanism, Avant-garde art, theatre
Procedia PDF Downloads 1111439 Teaching Synonyms for Non-Arabic Speakers
Authors: Loay Badran
Abstract:
This article on synonymy came into existence to meet the academic needs of students who specialize in this field. The article has two parts: the first part discusses the forms that authors of textbooks and dictionaries assumed when explaining a word as well as explaining the precision or lack of it thereof in delivering an understandable and clear meaning of using such forms. Meanwhile, the second part of this research article focuses on the application of synonymy and at taking into consideration the point of view of others who dismissed synonymy in its minute details, especially Alaskari in his book “Linguistic Differences” “Al Forouq Alloqhawiyyah”. The author determined that collecting the most commonly-used synonymous notions scattered in Alaskari’s book and compiling them in tables would be of great importance in easing lessons according to the Arabic Alphabet System meanwhile citing all that pertains to the corresponding scattered pages in “Linguistic Differences”.Keywords: synonymy, semantics, camel, teaching, non-native
Procedia PDF Downloads 641438 Text Mining Techniques for Prioritizing Pathogenic Mutations in Protein Families Known to Misfold or Aggregate
Authors: Khaleel Saleh Al-Rababah
Abstract:
Amyloid fibril forming regions, which are known as protein aggregates, in sequences of some protein families are associated with a number of diseases known as amyloidosis. Mutations play a role in forming fibrils by accelerating the fibril formation process. In this paper we want to extract diseases that caused by those mutations as a result of the impact of the mutations on structural and functional properties of the aggregated protein. We propose a text mining system, to automatically extract mutations, diseases and relations between mutations and diseases. We presented an algorithm based on finite state to cluster mutations found in the same sentence as a sentence could contain different mutation cause different diseases. Also, we presented a co reference algorithm that enables cross-link sentences.Keywords: amyloid, amyloidosis, co reference, protein, text mining
Procedia PDF Downloads 5261437 The Application of Lesson Study Model in Writing Review Text in Junior High School
Authors: Sulastriningsih Djumingin
Abstract:
This study has some objectives. It aims at describing the ability of the second-grade students to write review text without applying the Lesson Study model at SMPN 18 Makassar. Second, it seeks to describe the ability of the second-grade students to write review text by applying the Lesson Study model at SMPN 18 Makassar. Third, it aims at testing the effectiveness of the Lesson Study model in writing review text at SMPN 18 Makassar. This research was true experimental design with posttest Only group design involving two groups consisting of one class of the control group and one class of the experimental group. The research populations were all the second-grade students at SMPN 18 Makassar amounted to 250 students consisting of 8 classes. The sampling technique was purposive sampling technique. The control class was VIII2 consisting of 30 students, while the experimental class was VIII8 consisting of 30 students. The research instruments were in the form of observation and tests. The collected data were analyzed using descriptive statistical techniques and inferential statistical techniques with t-test types processed using SPSS 21 for windows. The results shows that: (1) of 30 students in control class, there are only 14 (47%) students who get the score more than 7.5, categorized as inadequate; (2) in the experimental class, there are 26 (87%) students who obtain the score of 7.5, categorized as adequate; (3) the Lesson Study models is effective to be applied in writing review text. Based on the comparison of the ability of the control class and experimental class, it indicates that the value of t-count is greater than the value of t-table (2.411> 1.667). It means that the alternative hypothesis (H1) proposed by the researcher is accepted.Keywords: application, lesson study, review text, writing
Procedia PDF Downloads 2021436 Developed Text-Independent Speaker Verification System
Authors: Mohammed Arif, Abdessalam Kifouche
Abstract:
Speech is a very convenient way of communication between people and machines. It conveys information about the identity of the talker. Since speaker recognition technology is increasingly securing our everyday lives, the objective of this paper is to develop two automatic text-independent speaker verification systems (TI SV) using low-level spectral features and machine learning methods. (i) The first system is based on a support vector machine (SVM), which was widely used in voice signal processing with the aim of speaker recognition involving verifying the identity of the speaker based on its voice characteristics, and (ii) the second is based on Gaussian Mixture Model (GMM) and Universal Background Model (UBM) to combine different functions from different resources to implement the SVM based.Keywords: speaker verification, text-independent, support vector machine, Gaussian mixture model, cepstral analysis
Procedia PDF Downloads 581435 Text Mining of Veterinary Forums for Epidemiological Surveillance Supplementation
Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves
Abstract:
Web scraping and text mining are popular computer science methods deployed by public health researchers to augment traditional epidemiological surveillance. However, within veterinary disease surveillance, such techniques are still in the early stages of development and have not yet been fully utilised. This study presents an exploration into the utility of incorporating internet-based data to better understand the smallholder farming communities within Scotland by using online text extraction and the subsequent mining of this data. Web scraping of the livestock fora was conducted in conjunction with text mining of the data in search of common themes, words, and topics found within the text. Results from bi-grams and topic modelling uncover four main topics of interest within the data pertaining to aspects of livestock husbandry: feeding, breeding, slaughter, and disposal. These topics were found amongst both the poultry and pig sub-forums. Topic modeling appears to be a useful method of unsupervised classification regarding this form of data, as it has produced clusters that relate to biosecurity and animal welfare. Internet data can be a very effective tool in aiding traditional veterinary surveillance methods, but the requirement for human validation of said data is crucial. This opens avenues of research via the incorporation of other dynamic social media data, namely Twitter and Facebook/Meta, in addition to time series analysis to highlight temporal patterns.Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, smallholding, social media, web scraping, sentiment analysis, geolocation, text mining, NLP
Procedia PDF Downloads 1011434 Evolutionary Methods in Cryptography
Authors: Wafa Slaibi Alsharafat
Abstract:
Genetic algorithms (GA) are random algorithms as random numbers that are generated during the operation of the algorithm determine what happens. This means that if GA is applied twice to optimize exactly the same problem it might produces two different answers. In this project, we propose an evolutionary algorithm and Genetic Algorithm (GA) to be implemented in symmetric encryption and decryption. Here, user's message and user secret information (key) which represent plain text to be transferred into cipher text.Keywords: GA, encryption, decryption, crossover
Procedia PDF Downloads 4461433 Linguistic Misinterpretation and the Dialogue of Civilizations
Authors: Oleg Redkin, Olga Bernikova
Abstract:
Globalization and migrations have made cross-cultural contacts more frequent and intensive. Sometimes, these contacts may lead to misunderstanding between partners of communication and misinterpretations of the verbal messages that some researchers tend to consider as the 'clash of civilizations'. In most cases, reasons for that may be found in cultural and linguistic differences and hence misinterpretations of intentions and behavior. The current research examines factors of verbal and non-verbal communication that should be taken into consideration in verbal and non-verbal contacts. Language is one of the most important manifestations of the cultural code, and it is often considered as one of the special features of a civilization. The Arabic language, in particular, is commonly associated with Islam and the language and the Arab-Muslim civilization. It is one of the most important markers of self-identification for more than 200 million of native speakers. Arabic is the language of the Quran and hence the symbol of religious affiliation for more than one billion Muslims around the globe. Adequate interpretation of Arabic texts requires profound knowledge of its grammar, semantics of its vocabulary. Communicating sides who belong to different cultural groups are guided by different models of behavior and hierarchy of values, besides that the vocabulary each of them uses in the dialogue may convey different semantic realities and vary in connotations. In this context direct, literal translation in most cases cannot adequately convey the original meaning of the original message. Besides that peculiarities and diversities of the extralinguistic information, such as the body language, communicative etiquette, cultural background and religious affiliations may make the dialogue even more difficult. It is very likely that the so called 'clash of civilizations' in most cases is due to misinterpretation of counterpart's means of discourse such as language, cultural codes, and models of behavior rather than lies in basic contradictions between partners of communication. In the process of communication, one has to rely on universal values rather than focus on cultural or religious peculiarities, to take into account current linguistic and extralinguistic context.Keywords: Arabic, civilization, discourse, language, linguistic
Procedia PDF Downloads 2211432 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction
Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar
Abstract:
Information representation as tables is compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to appropriate the corresponding cell in dataframe, which can be stored as comma-separated values, database, excel, and multiple other usable formats.Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation
Procedia PDF Downloads 1451431 Syndromic Surveillance Framework Using Tweets Data Analytics
Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden
Abstract:
Syndromic surveillance is to detect or predict disease outbreaks through the analysis of medical sources of data. Using social media data like tweets to do syndromic surveillance becomes more and more popular with the aid of open platform to collect data and the advantage of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework is presented with machine learning kernel using tweets data analytics. Influenza and the three cities Abu Dhabi, Al Ain and Dubai of United Arabic Emirates are used as the test disease and trial areas. Hospital cases data provided by the Health Authority of Abu Dhabi (HAAD) are used for the correlation purpose. In our model, Latent Dirichlet allocation (LDA) engine is adapted to do supervised learning classification and N-Fold cross validation confusion matrix are given as the simulation results with overall system recall 85.595% performance achieved.Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza
Procedia PDF Downloads 1161430 On the Weightlessness of Vowel Lengthening: Insights from Arabic Dialect of Yemen and Contribution to Psychoneurolinguistics
Authors: Sadeq Al Yaari, Muhammad Alkhunayn, Montaha Al Yaari, Ayman Al Yaari, Aayah Al Yaari, Adham Al Yaari, Sajedah Al Yaari, Fatehi Eissa
Abstract:
Introduction: It is well established that lengthening (longer duration) is considered one of the correlates of lexical and phrasal prominence. However, it is unexplored whether the scope of vowel lengthening in the Arabic dialect of Yemen (ADY) is differently affected by educated and/or uneducated speakers from different dialectal backgrounds. Specifically, the research aims to examine whether or not linguistic background acquired through different educational channels makes a difference in the speech of the speaker and how that is reflected in related psychoneurolinguistic impairments. Methods: For the above mentioned purpose, we conducted an articulatory experiment wherein a set of words from ADY were examined in the dialectal speech of thousand and seven hundred Yemeni educated and uneducated speakers aged 19-61 years growing up in five regions of the country: Northern, southern, eastern, western and central and were, accordingly, assigned into five dialectal groups. A seven-minute video clip was shown to the participants, who have been asked to spontaneously describe the scene they had just watched before the researchers linguistically and statistically analyzed recordings to weigh vowel lengthening in the speech of the participants. Results: The results show that vowels (monophthongs and diphthongs) are lengthened by all participants. Unexpectedly, educated and uneducated speakers from northern and central dialects lengthen vowels. Compared with uneducated speakers from the same dialect, educated speakers lengthen fewer vowels in their dialectal speech. Conclusions: These findings support the notion that extensive exposure to dialects on account of standard language can cause changes to the patterns of dialects themselves, and this can be seen in the speech of educated and uneducated speakers of these dialects. Further research is needed to clarify the phonemic distinctive features and frequency of lengthening in other open class systems (i.e., nouns, adjectives, and adverbs). Phonetic and phonological report measures are needed as well as validation of existing measures for assessing phonemic vowel length in the Arabic population in general and Arabic individuals with voice, speech, and language impairments in particular.Keywords: vowel lengthening, Arabic dialect of Yemen, phonetics, phonology, impairment, distinctive features
Procedia PDF Downloads 431429 Case Study about Women Driving in Saudi Arabia Announced in 2018: Netnographic and Data Mining Study
Authors: Majdah Alnefaie
Abstract:
The ‘netnographic study’ and data mining have been used to monitor the public interaction on Social Media Sites (SMSs) to understand what the motivational factors influence the Saudi intentions regarding allowing women driving in Saudi Arabia in 2018. The netnographic study monitored the publics’ textual and visual communications in Twitter, Snapchat, and YouTube. SMSs users’ communications method is also known as electronic word of mouth (eWOM). Netnography methodology is still in its initial stages as it depends on manual extraction, reading and classification of SMSs users text. On the other hand, data mining is come from the computer and physical sciences background, therefore it is much harder to extract meaning from unstructured qualitative data. In addition, the new development in data mining software does not support the Arabic text, especially local slang in Saudi Arabia. Therefore, collaborations between social and computer scientists such as ‘netnographic study’ and data mining will enhance the efficiency of this study methodology leading to comprehensive research outcome. The eWOM communications between individuals on SMSs can promote a sense that sharing their preferences and experiences regarding politics and social government regulations is a part of their daily life, highlighting the importance of using SMSs as assistance in promoting participation in political and social. Therefore, public interactions on SMSs are important tools to comprehend people’s intentions regarding the new government regulations in the country. This study aims to answer this question, "What factors influence the Saudi Arabians' intentions of Saudi female's car-driving in 2018". The study utilized qualitative method known as netnographic study. The study used R studio to collect and analyses 27000 Saudi users’ comments from 25th May until 25th June 2018. The study has developed data collection model that support importing and analysing the Arabic text in the local slang. The data collection model in this study has been clustered based on different type of social networks, gender and the study main factors. The social network analysis was employed to collect comments from SMSs owned by governments’ originations, celebrities, vloggers, social activist and news SMSs accounts. The comments were collected from both males and females SMSs users. The sentiment analysis shows that the total number of positive comments Saudi females car driving was higher than negative comments. The data have provided the most important factors influenced the Saudi Arabians’ intention of Saudi females car driving including, culture and environment, freedom of choice, equal opportunities, security and safety. The most interesting finding indicted that women driving would play a role in increasing the individual freedom of choice. Saudi female will be able to drive cars to fulfill her daily life and family needs without being stressed due to the lack of transportation. The study outcome will help Saudi government to improve woman quality of life by increasing the ability to find more jobs and studies, increasing income through decreasing the spending on transport means such as taxi and having more freedom of choice in woman daily life needs. The study enhances the importance of using use marketing research to measure the public opinions on the new government regulations in the country. The study has explained the limitations and suggestions for future research.Keywords: netnographic study, data mining, social media, Saudi Arabia, female driving
Procedia PDF Downloads 1551428 Grammatical and Lexical Cohesion in the Japan’s Prime Minister Shinzo Abe’s Speech Text ‘Nihon wa Modottekimashita’
Authors: Nadya Inda Syartanti
Abstract:
This research aims to identify, classify, and analyze descriptively the aspects of grammatical and lexical cohesion in the speech text of Japan’s Prime Minister Shinzo Abe entitled Nihon wa Modotte kimashita delivered in Washington DC, the United States on February 23, 2013, as a research data source. The method used is qualitative research, which uses descriptions through words that are applied by analyzing aspects of grammatical and lexical cohesion proposed by Halliday and Hasan (1976). The aspects of grammatical cohesion consist of references (personal, demonstrative, interrogative pronouns), substitution, ellipsis, and conjunction. In contrast, lexical cohesion consists of reiteration (repetition, synonym, antonym, hyponym, meronym) and collocation. Data classification is based on the 6 aspects of the cohesion. Through some aspects of cohesion, this research tries to find out the frequency of using grammatical and lexical cohesion in Shinzo Abe's speech text entitled Nihon wa Modotte kimashita. The results of this research are expected to help overcome the difficulty of understanding speech texts in Japanese. Therefore, this research can be a reference for learners, researchers, and anyone who is interested in the field of discourse analysis.Keywords: cohesion, grammatical cohesion, lexical cohesion, speech text, Shinzo Abe
Procedia PDF Downloads 1621427 Spontaneous Message Detection of Annoying Situation in Community Networks Using Mining Algorithm
Authors: P. Senthil Kumari
Abstract:
Main concerns in data mining investigation are social controls of data mining for handling ambiguity, noise, or incompleteness on text data. We describe an innovative approach for unplanned text data detection of community networks achieved by classification mechanism. In a tangible domain claim with humble secrecy backgrounds provided by community network for evading annoying content is presented on consumer message partition. To avoid this, mining methodology provides the capability to unswervingly switch the messages and similarly recover the superiority of ordering. Here we designated learning-centered mining approaches with pre-processing technique to complete this effort. Our involvement of work compact with rule-based personalization for automatic text categorization which was appropriate in many dissimilar frameworks and offers tolerance value for permits the background of comments conferring to a variety of conditions associated with the policy or rule arrangements processed by learning algorithm. Remarkably, we find that the choice of classifier has predicted the class labels for control of the inadequate documents on community network with great value of effect.Keywords: text mining, data classification, community network, learning algorithm
Procedia PDF Downloads 5091426 Wasting Human and Computer Resources
Authors: Mária Csernoch, Piroska Biró
Abstract:
The legends about “user-friendly” and “easy-to-use” birotical tools (computer-related office tools) have been spreading and misleading end-users. This approach has led us to the extremely high number of incorrect documents, causing serious financial losses in the creating, modifying, and retrieving processes. Our research proved that there are at least two sources of this underachievement: (1) The lack of the definition of the correctly edited, formatted documents. Consequently, end-users do not know whether their methods and results are correct or not. They are not aware of their ignorance. They are so ignorant that their ignorance does not allow them to realize their lack of knowledge. (2) The end-users’ problem-solving methods. We have found that in non-traditional programming environments end-users apply, almost exclusively, surface approach metacognitive methods to carry out their computer related activities, which are proved less effective than deep approach methods. Based on these findings we have developed deep approach methods which are based on and adapted from traditional programming languages. In this study, we focus on the most popular type of birotical documents, the text-based documents. We have provided the definition of the correctly edited text, and based on this definition, adapted the debugging method known in programming. According to the method, before the realization of text editing, a thorough debugging of already existing texts and the categorization of errors are carried out. With this method in advance to real text editing users learn the requirements of text-based documents and also of the correctly formatted text. The method has been proved much more effective than the previously applied surface approach methods. The advantages of the method are that the real text handling requires much less human and computer sources than clicking aimlessly in the GUI (Graphical User Interface), and the data retrieval is much more effective than from error-prone documents.Keywords: deep approach metacognitive methods, error-prone birotical documents, financial losses, human and computer resources
Procedia PDF Downloads 3821425 Short Text Classification for Saudi Tweets
Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq
Abstract:
Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter
Procedia PDF Downloads 1551424 Walking in the Steps of Poets: Evoking Past Poets in Sufi Poetry
Authors: Bilal Orfali
Abstract:
It is common practice in modern times to read mystical poetry and apply it to our mundane lives and loves. Sufis in the early period did the opposite. Their mystical hymns often spun out of the courtly poetic ghazal, panegyric, and wine songs. This paper highlights the relation of the Arabic courtly poetic canon to early Sufism. Sufi akhbār and poetry evoke past poets and their poetic heritage. They tend to quote or refer to eminent poets whose poetry must have been widely circulated and memorized. However, Sufism places this readily recognizable poetry in a new context that deliberately changes the past. It is a process of a metaphorization in which the reality of the pre-Islamic, Umayyad, and Abbasid models now acts as a device or metaphor for the Sufi poetics.Keywords: Sufism, Arabic poetry, literature, Islamic literature, Abbasid
Procedia PDF Downloads 3131423 Unlocking the Potential of Short Texts with Semantic Enrichment, Disambiguation Techniques, and Context Fusion
Authors: Mouheb Mehdoui, Amel Fraisse, Mounir Zrigui
Abstract:
This paper explores the potential of short texts through semantic enrichment and disambiguation techniques. By employing context fusion, we aim to enhance the comprehension and utility of concise textual information. The methodologies utilized are grounded in recent advancements in natural language processing, which allow for a deeper understanding of semantics within limited text formats. Specifically, topic classification is employed to understand the context of the sentence and assess the relevance of added expressions. Additionally, word sense disambiguation is used to clarify unclear words, replacing them with more precise terms. The implications of this research extend to various applications, including information retrieval and knowledge representation. Ultimately, this work highlights the importance of refining short text processing techniques to unlock their full potential in real-world applications.Keywords: information traffic, text summarization, word-sense disambiguation, semantic enrichment, ambiguity resolution, short text enhancement, information retrieval, contextual understanding, natural language processing, ambiguity
Procedia PDF Downloads 141422 Moral Wrongdoers: Evaluating the Value of Moral Actions Performed by War Criminals
Authors: Jean-Francois Caron
Abstract:
This text explores the value of moral acts performed by war criminals, and the extent to which they should alleviate the punishment these individuals ought to receive for violating the rules of war. Without neglecting the necessity of retribution in war crimes cases, it argues from an ethical perspective that we should not rule out the possibility of considering lesser punishments for war criminals who decide to perform a moral act, as it might produce significant positive moral outcomes. This text also analyzes how such a norm could be justified from a moral perspective.Keywords: war criminals, pardon, amnesty, retribution
Procedia PDF Downloads 2831421 A Deep Learning Approach to Subsection Identification in Electronic Health Records
Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan
Abstract:
Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification
Procedia PDF Downloads 2201420 Identification of Text Domains and Register Variation through the Analysis of Lexical Distribution in a Bangla Mass Media Text Corpus
Authors: Mahul Bhattacharyya, Niladri Sekhar Dash
Abstract:
The present research paper is an experimental attempt to investigate the nature of variation in the register in three major text domains, namely, social, cultural, and political texts collected from the corpus of Bangla printed mass media texts. This present study uses a corpus of a moderate amount of Bangla mass media text that contains nearly one million words collected from different media sources like newspapers, magazines, advertisements, periodicals, etc. The analysis of corpus data reveals that each text has certain lexical properties that not only control their identity but also mark their uniqueness across the domains. At first, the subject domains of the texts are classified into two parameters namely, ‘Genre' and 'Text Type'. Next, some empirical investigations are made to understand how the domains vary from each other in terms of lexical properties like both function and content words. Here the method of comparative-cum-contrastive matching of lexical load across domains is invoked through word frequency count to track how domain-specific words and terms may be marked as decisive indicators in the act of specifying the textual contexts and subject domains. The study shows that the common lexical stock that percolates across all text domains are quite dicey in nature as their lexicological identity does not have any bearing in the act of specifying subject domains. Therefore, it becomes necessary for language users to anchor upon certain domain-specific lexical items to recognize a text that belongs to a specific text domain. The eventual findings of this study confirm that texts belonging to different subject domains in Bangla news text corpus clearly differ on the parameters of lexical load, lexical choice, lexical clustering, lexical collocation. In fact, based on these parameters, along with some statistical calculations, it is possible to classify mass media texts into different types to mark their relation with regard to the domains they should actually belong. The advantage of this analysis lies in the proper identification of the linguistic factors which will give language users a better insight into the method they employ in text comprehension, as well as construct a systemic frame for designing text identification strategy for language learners. The availability of huge amount of Bangla media text data is useful for achieving accurate conclusions with a certain amount of reliability and authenticity. This kind of corpus-based analysis is quite relevant for a resource-poor language like Bangla, as no attempt has ever been made to understand how the structure and texture of Bangla mass media texts vary due to certain linguistic and extra-linguistic constraints that are actively operational to specific text domains. Since mass media language is assumed to be the most 'recent representation' of the actual use of the language, this study is expected to show how the Bangla news texts reflect the thoughts of the society and how they leave a strong impact on the thought process of the speech community.Keywords: Bangla, corpus, discourse, domains, lexical choice, mass media, register, variation
Procedia PDF Downloads 1741419 Resource Framework Descriptors for Interestingness in Data
Authors: C. B. Abhilash, Kavi Mahesh
Abstract:
Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.Keywords: RDF, interestingness, knowledge base, semantic data
Procedia PDF Downloads 1641418 The Audio-Visual and Syntactic Priming Effect on Specific Language Impairment and Gender in Modern Standard Arabic
Authors: Mohammad Al-Dawoody
Abstract:
This study aims at exploring if priming is affected by gender in Modern Standard Arabic and if it is restricted solely to subjects with no specific language impairment (SLI). The sample in this study consists of 74 subjects, between the ages of 11;1 and 11;10, distributed into (a) 2 SLI experimental groups of 38 subjects divided into two gender groups of 18 females and 20 males and (b) 2 non-SLI control groups of 36 subjects divided into two gender groups of 17 females and 19 males. Employing a mixed research design, the researcher conducted this study within the framework of the relevance theory (RT) whose main assumption is that human beings are endowed with a biological ability to magnify the relevance of the incoming stimuli. Each of the four groups was given two different priming stimuli: audio-visual priming (T1) and syntactic priming (T2). The results showed that the priming effect was sheer distinct among SLI participants especially when retrieving typical responses (TR) in T1 and T2 with slight superiority of males over females. The results also revealed that non-SLI females showed stronger original response (OR) priming in T1 than males and that non-SLI males in T2 excelled in OR priming than females. Furthermore, the results suggested that the audio-visual priming has a stronger effect on SLI females than non-SLI females and that syntactic priming seems to have the same effect on the two groups (non-SLI and SLI females). The conclusion is that the priming effect varies according to gender and is not confined merely to non-SLI subjects.Keywords: specific language impairment, relevance theory, audio-visual priming, syntactic priming, modern standard Arabic
Procedia PDF Downloads 1771417 Object Recognition Approach Based on Generalized Hough Transform and Color Distribution Serving in Generating Arabic Sentences
Authors: Nada Farhani, Naim Terbeh, Mounir Zrigui
Abstract:
The recognition of the objects contained in images has always presented a challenge in the field of research because of several difficulties that the researcher can envisage because of the variability of shape, position, contrast of objects, etc. In this paper, we will be interested in the recognition of objects. The classical Hough Transform (HT) presented a tool for detecting straight line segments in images. The technique of HT has been generalized (GHT) for the detection of arbitrary forms. With GHT, the forms sought are not necessarily defined analytically but rather by a particular silhouette. For more precision, we proposed to combine the results from the GHT with the results from a calculation of similarity between the histograms and the spatiograms of the images. The main purpose of our work is to use the concepts from recognition to generate sentences in Arabic that summarize the content of the image.Keywords: recognition of shape, generalized hough transformation, histogram, spatiogram, learning
Procedia PDF Downloads 1581416 Avoidance and Selectivity in the Acquisition of Arabic as a Second/Foreign Language
Authors: Abeer Heider
Abstract:
This paper explores and classifies the different kinds of avoidances that students commonly make in the acquisition of Arabic as a second/foreign language, and suggests specific strategies to help students lessen their avoidance trends in hopes of streamlining the learning process. Students most commonly use avoidance strategies in grammar, and word choice. These different types of strategies have different implications and naturally require different approaches. Thus the question remains as to the most effective way to help students improve their Arabic, and how teachers can efficiently utilize these techniques. It is hoped that this research will contribute to understand the role of avoidance in the field of the second language acquisition in general, and as a type of input. Yet some researchers also note that similarity between L1 and L2 may be problematic as well since the learner may doubt that such similarity indeed exists and consequently avoid the identical constructions or elements (Jordens, 1977; Kellermann, 1977, 1978, 1986). In an effort to resolve this issue, a case study is being conducted. The present case study attempts to provide a broader analysis of what is acquired than is usually the case, analyzing the learners ‘accomplishments in terms of three –part framework of the components of communicative competence suggested by Michele Canale: grammatical competence, sociolinguistic competence and discourse competence. The subjects of this study are 15 students’ 22th year who came to study Arabic at Qatar University of Cairo. The 15 students are in the advanced level. They were complete intermediate level in Arabic when they arrive in Qatar for the first time. The study used discourse analytic method to examine how the first language affects students’ production and output in the second language, and how and when students use avoidance methods in their learning. The study will be conducted through Fall 2015 through analyzing audio recordings that are recorded throughout the entire semester. The recordings will be around 30 clips. The students are using supplementary listening and speaking materials. The group will be tested at the end of the term to assess any measurable difference between the techniques. Questionnaires will be administered to teachers and students before and after the semester to assess any change in attitude toward avoidance and selectivity methods. Responses to these questionnaires are analyzed and discussed to assess the relative merits of the aforementioned strategies to avoidance and selectivity to further support on. Implications and recommendations for teacher training are proposed.Keywords: the second language acquisition, learning languages, selectivity, avoidance
Procedia PDF Downloads 2771415 A Neural Approach for the Offline Recognition of the Arabic Handwritten Words of the Algerian Departments
Authors: Salim Ouchtati, Jean Sequeira, Mouldi Bedda
Abstract:
In this work we present an off line system for the recognition of the Arabic handwritten words of the Algerian departments. The study is based mainly on the evaluation of neural network performances, trained with the gradient back propagation algorithm. The used parameters to form the input vector of the neural network are extracted on the binary images of the handwritten word by several methods: the parameters of distribution, the moments centered of the different projections and the Barr features. It should be noted that these methods are applied on segments gotten after the division of the binary image of the word in six segments. The classification is achieved by a multi layers perceptron. Detailed experiments are carried and satisfactory recognition results are reported.Keywords: handwritten word recognition, neural networks, image processing, pattern recognition, features extraction
Procedia PDF Downloads 5141414 The Woman in Arabic Popular Proverbs, Stereotypical Roles and Actual Pain: The Woman in the Institution of Marriage as a Sample
Authors: Hanan Bishara
Abstract:
This study deals with the subject of Popular Arabic Proverbs and the stereotypical roles and images that they create about the woman in general and Arab woman in particular. Popular proverbs in general are considered to be essence of experiences of society and the extract of its collective thought establish wisdom in a distinguished concise tight mold or style that affects the majority of people and keep them alive by virtue of constant use and oral currency through which they are transmitted from one generation to another. Proverbs deal with different aspects and types of people, different social relations, including the society's attitude about the woman. Proverbs about women in the human heritage in general and the Arab heritage in particular are considered of a special characteristics and remarkable in their being dynamic ones that move in all directions of life. Most of them carry the essence of the social issues and are distributed in such a way that they have become part of the private life of the general public. This distribution covers all periods and fields of the woman's life, the social, the economic and psychological ones. The woman occupies a major space in the Popular Proverbs because she is the center of social life inside and outside the house. The woman's statuses and images in the provers are numerous and she is often described in parallel images but each one differs from the other. These images intertwine due to their varieties and multiplicity and ultimately, they constitute a general stereotypical image of the woman, which degrades her status as a woman, a mother and a wife. The study shows how Popular Proverbs in Arabic reflect the Arab woman's position and status in her society.Keywords: Arab, proverb, popular, society, woman
Procedia PDF Downloads 2021413 A Framework of Product Information Service System Using Mobile Image Retrieval and Text Mining Techniques
Authors: Mei-Yi Wu, Shang-Ming Huang
Abstract:
The online shoppers nowadays often search the product information on the Internet using some keywords of products. To use this kind of information searching model, shoppers should have a preliminary understanding about their interesting products and choose the correct keywords. However, if the products are first contact (for example, the worn clothes or backpack of passengers which you do not have any idea about the brands), these products cannot be retrieved due to insufficient information. In this paper, we discuss and study the applications in E-commerce using image retrieval and text mining techniques. We design a reasonable E-commerce application system containing three layers in the architecture to provide users product information. The system can automatically search and retrieval similar images and corresponding web pages on Internet according to the target pictures which taken by users. Then text mining techniques are applied to extract important keywords from these retrieval web pages and search the prices on different online shopping stores with these keywords using a web crawler. Finally, the users can obtain the product information including photos and prices of their favorite products. The experiments shows the efficiency of proposed system.Keywords: mobile image retrieval, text mining, product information service system, online marketing
Procedia PDF Downloads 3601412 Concordance of Maghrebian Place Names in Hungarian School Atlases
Authors: Malak Alasli
Abstract:
Hungarians come to use geographic names that are foreign to their environment and language in diverse settings, hence the aim of trying to adapt them to their own linguistic context. The Maghreb region (Morocco, Algeria, and Tunisia) uses both Arabic and French in presenting the place names. Consequently, the lexicographical treatment of the toponym will, therefore, consist of both the presentation of the toponymic term and the pronunciation of the entries. The motivation behind this approach is the need for a better identification of the place in question by avoiding ambiguities, and for more respect to the heritage by conforming to the right use of toponyms both in written as well as in oral practice. The goal is to provide Hungarians with a set of data by attempting a system of transliteration from French/Arabic to Hungarian, where the place names of the Maghreb are transliterated for more efficient usage. To examine the importance of toponyms’ pronunciation, the latter were collected from several 20th and 21st Hungarian school atlases. Most people meet, for the first time, foreign place names in school, hence the choice of solely extracting place names from school atlases as sample data. Interviews targeted university students, where they were asked to pronounce the place names collected. Results revealed the intricacy behind the pronunciation. Two main conclusions emerged; Hungarian students encountered challenges reading the toponyms, and Arabic speakers could not identify the names either, which causes a cut in communication. Ergo, the importance of elaborating on the pronunciation of toponyms. Concordance is where you find variants of a name. Therefore, a chart was put forward including all the name variants obtained from various references with their Arabic transcription indicating any changes that may have occurred, and the origin of the denomination (Roman, Berber, etc.). A case will also be added for comments and observations. This work embraces a dual purpose. It will provide information to Hungarians on the official names of foreign places in case of occurring changes; for instance, 'El-goléa, Algeria' (used in a latest edition of a school atlas) has now the official name of 'El Ménia'. It will also serve as a reference for knowing the correct and precise forms of place names’ pronunciation.Keywords: concordance, onomastics, settlement names, school atlases
Procedia PDF Downloads 110