Search results for: text retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1550

Search results for: text retrieval

1400 Spacial Poetic Text throughout Samih al-Qasim's Poetry

Authors: Saleem Abu Jaber, Khaled Igbaria

Abstract:

For readers, space/place is one of the most significant references to reveal deep significances and indications in modern Arabic poetic texts. Generally, when poets evoke places and/or spaces, they do not mean to refer readers to detailed geographic or physical spaces, but to the symbolic significances and dimensions that those spaces have and through which poets encourage spacial awareness in their readers. Recently, as a result, there has been a great deal of interest in research addressing spacial poetic texts and dimensions in modern Arabic poetry in general and in Palestinian poetry in particular. Samih al-Qasim is one of the most recent prominent Palestinian revolutionary poets. Al-Qasim has published six series of poems that are well known in the Arab world. Although several researchers have studied al-Qasim's poetry, to our knowledge, yet no one has studied the aspect of spacial poetic text in his poetry. Therefore, this paper seeks to fill a gap in the scholarship that has not been addressed up to now. This article aims, not only to demonstrate the presence of spacial poetic text and dimensions throughout al-Qasim's poetry, but also to investigate the purpose for which the poet uses spacial poetic text. Our theory is that the poet, consciously and significantly, uses spacial poetic texts to magnify the Palestinian identity of the Palestinian readers.  Methodologically, we applied a descriptive analytic method, referencing al-Qasim's poetry, addressing spacial poetic texts practically but not theoretically or statistically.

Keywords: spatial poetic text, Samih al-Qasim, space and identity, Palestinian poetry

Procedia PDF Downloads 294
1399 Study of Evaluation Model Based on Information System Success Model and Flow Theory Using Web-scale Discovery System

Authors: June-Jei Kuo, Yi-Chuan Hsieh

Abstract:

Because of the rapid growth of information technology, more and more libraries introduce the new information retrieval systems to enhance the users’ experience, improve the retrieval efficiency, and increase the applicability of the library resources. Nevertheless, few of them are discussed the usability from the users’ aspect. The aims of this study are to understand that the scenario of the information retrieval system utilization, and to know why users are willing to continuously use the web-scale discovery system to improve the web-scale discovery system and promote their use of university libraries. Besides of questionnaires, observations and interviews, this study employs both Information System Success Model introduced by DeLone and McLean in 2003 and the flow theory to evaluate the system quality, information quality, service quality, use, user satisfaction, flow, and continuing to use web-scale discovery system of students from National Chung Hsing University. Then, the results are analyzed through descriptive statistics and structural equation modeling using AMOS. The results reveal that in web-scale discovery system, the user’s evaluation of system quality, information quality, and service quality is positively related to the use and satisfaction; however, the service quality only affects user satisfaction. User satisfaction and the flow show a significant impact on continuing to use. Moreover, user satisfaction has a significant impact on user flow. According to the results of this study, to maintain the stability of the information retrieval system, to improve the information content quality, and to enhance the relationship between subject librarians and students are recommended for the academic libraries. Meanwhile, to improve the system user interface, to minimize layer from system-level, to strengthen the data accuracy and relevance, to modify the sorting criteria of the data, and to support the auto-correct function are required for system provider. Finally, to establish better communication with librariana commended for all users.

Keywords: web-scale discovery system, discovery system, information system success model, flow theory, academic library

Procedia PDF Downloads 76
1398 Procedures and Strategies in Translation: Two Marathi Translations of Train to Pakistan by Khushwant Singh

Authors: Manoj Gujar

Abstract:

The present paper is an attempt to interpret two Marathi translations of Khushwant Singh’s (1915-2014) novel Train to Pakistan (1956). The 20th century was branded as an era of Liberalization, Privatization and Globalization. Different countries and cultures have enunciated interaction with one another in an unprecedented manner. The world is becoming multilingual and multicultural. The democratic countries such as the U.S.A., the U.K., and India have become pivotal centers of interlingual and cross-cultural exchange. People belonging to different nationalities showed keen interest in knowing the characteristic features of different languages and of their cultures. Here, ‘Translation’ plays an important role in such multilingual and multicultural contexts. Translation is not only translation of a language but a translation of a culture. However, in the act of translation a translator makes use of such procedures as borrowing, definition, literal translation, substitution, lexical creation, omission, addition as well as their various combinations. To him, a text produced in one linguistic and cultural context can reach other linguistic and cultural contexts through these processes of translation. A worthy work of art appeals many readers. India, being a multilingual country we find that there goes multiple translations of the same text in different Indian languages. But sometimes, if can be found that a same text appeals to different ages and the same text gets translated into the same language by the two or more authors. In this reference, the present paper is an attempt to study how different translations of the same text differ in terms of procedures and strategies during the process of the translation of culture. The source text is Khushwant Singh’s historical novel Train to Pakistan (1956). The novel was widely appreciated and so translated into different regional languages in India. The novel has two Marathi translations: Agniratha (1972) by Hidayatkhan and Train to Pakistan (1980) by Anil Kinikar. This paper is an attempt to evaluate the strategies and procedures in translation to analyze these two Marathi translations. Hidayat Khan made a lot of omissions of the significant details and distorted the original text to a large extent, whereas, Anil Kinikar has done justice to the Source Text by rendering it in Marathi as faithfully as possible.

Keywords: culture, multilingual, procedures and strategies, translation

Procedia PDF Downloads 357
1397 Unraveling the Threads of Madness: Henry Russell’s 'The Maniac' as an Advocate for Deinstitutionalization in the Nineteenth Century

Authors: T. J. Laws-Nicola

Abstract:

Henry Russell was best known as a composer of more than 300 songs. Many of his compositions were popular for both their sentimental texts, as in ‘The Old Armchair,’ and those of a more political nature, such as ‘Woodsman, Spare That Tree!’ Indeed, Russell had written such songs of advocacy as those associated with abolitionism (‘The Slave Ship’) and environmentalism (‘Woodsman, Spare that Tree!’). ‘The Maniac’ is his only composition addressing the issue of institutionalization. The text is borrowed and adapted from the monodrama The Captive by M.G. ‘Monk’ Lewis. Through an analysis of form, harmony, melody, text, and thematic development and interactions between text and music we can approach a clearer understanding of ‘The Maniac’ and how the text and music interact. Select periodicals, such as The London Times, provide contemporary critical review for ‘The Maniac.’ Additional nineteenth century songs whose texts focus on madness and/or institutionalization will assist in building a stylistic and cultural context for ‘The Maniac.’ Through comparative analyses of ‘The Maniac’ with a body of songs that focus on similar topics, we can approach a clear understanding of the song as a vehicle for deinstitutionalization.

Keywords: 19th century song, institutionalization, M. G. Lewis, Henry Russell

Procedia PDF Downloads 508
1396 Literary Theatre and Embodied Theatre: A Practice-Based Research in Exploring the Authorship of a Performance

Authors: Rahul Bishnoi

Abstract:

Theatre, as Ann Ubersfld calls it, is a paradox. At once, it is both a literary work and a physical representation. Theatre as a text is eternal, reproducible, and identical while as a performance, theatre is momentary and never identical to the previous performances. In this dual existence of theatre, who is the author? Is the author the playwright who writes the dramatic text, or the director who orchestrates the performance, or the actor who embodies the text? From the poststructuralist lens of Barthes, the author is dead. Barthes’ argument of discrete temporality, i.e. the author is the before, and the text is the after, does not hold true for theatre. A published literary work is written, edited, printed, distributed and then gets consumed by the reader. On the other hand, theatrical production is immediate; an actor performs and the audience witnesses it instantaneously. Time, so to speak, does not separate the author, the text, and the reader anymore. The question of authorship gets further complicated in Augusto Boal’s “Theatre of the Oppressed” movement where the audience is a direct participant like the actors in the performance. In this research, through an experimental performance, the duality of theatre is explored with the authorship discourse. And the conventional definition of authorship is subjected to additional complexity by erasing the distinction between an actor and the audience. The design/methodology of the experimental performance is as follows: The audience will be asked to produce a text under an anonymous virtual alias. The text, as it is being produced, will be read and performed by the actor. The audience who are also collectively “authoring” the text, will watch this performance and write further until everyone has contributed with one input each. The cycle of writing, reading, performing, witnessing, and writing will continue until the end. The intention is to create a dynamic system of writing/reading with the embodiment of the text through the actor. The actor is giving up the power to the audience to write the spoken word, stage instruction and direction while still keeping the agency of interpreting that input and performing in the chosen manner. This rapid conversation between the actor and the audience also creates a conversion of authorship. The main conclusion of this study is a perspective on the nature of dynamic authorship of theatre containing a critical enquiry of the collaboratively produced text, an individually performed act, and a collectively witnessed event. Using practice as a methodology, this paper contests the poststructuralist notion of the author as merely a ‘scriptor’ and breaks it further by involving the audience in the authorship as well.

Keywords: practice based research, performance studies, post-humanism, Avant-garde art, theatre

Procedia PDF Downloads 82
1395 Text Mining Techniques for Prioritizing Pathogenic Mutations in Protein Families Known to Misfold or Aggregate

Authors: Khaleel Saleh Al-Rababah

Abstract:

Amyloid fibril forming regions, which are known as protein aggregates, in sequences of some protein families are associated with a number of diseases known as amyloidosis. Mutations play a role in forming fibrils by accelerating the fibril formation process. In this paper we want to extract diseases that caused by those mutations as a result of the impact of the mutations on structural and functional properties of the aggregated protein. We propose a text mining system, to automatically extract mutations, diseases and relations between mutations and diseases. We presented an algorithm based on finite state to cluster mutations found in the same sentence as a sentence could contain different mutation cause different diseases. Also, we presented a co reference algorithm that enables cross-link sentences.

Keywords: amyloid, amyloidosis, co reference, protein, text mining

Procedia PDF Downloads 504
1394 The Application of Lesson Study Model in Writing Review Text in Junior High School

Authors: Sulastriningsih Djumingin

Abstract:

This study has some objectives. It aims at describing the ability of the second-grade students to write review text without applying the Lesson Study model at SMPN 18 Makassar. Second, it seeks to describe the ability of the second-grade students to write review text by applying the Lesson Study model at SMPN 18 Makassar. Third, it aims at testing the effectiveness of the Lesson Study model in writing review text at SMPN 18 Makassar. This research was true experimental design with posttest Only group design involving two groups consisting of one class of the control group and one class of the experimental group. The research populations were all the second-grade students at SMPN 18 Makassar amounted to 250 students consisting of 8 classes. The sampling technique was purposive sampling technique. The control class was VIII2 consisting of 30 students, while the experimental class was VIII8 consisting of 30 students. The research instruments were in the form of observation and tests. The collected data were analyzed using descriptive statistical techniques and inferential statistical techniques with t-test types processed using SPSS 21 for windows. The results shows that: (1) of 30 students in control class, there are only 14 (47%) students who get the score more than 7.5, categorized as inadequate; (2) in the experimental class, there are 26 (87%) students who obtain the score of 7.5, categorized as adequate; (3) the Lesson Study models is effective to be applied in writing review text. Based on the comparison of the ability of the control class and experimental class, it indicates that the value of t-count is greater than the value of t-table (2.411> 1.667). It means that the alternative hypothesis (H1) proposed by the researcher is accepted.

Keywords: application, lesson study, review text, writing

Procedia PDF Downloads 183
1393 Developed Text-Independent Speaker Verification System

Authors: Mohammed Arif, Abdessalam Kifouche

Abstract:

Speech is a very convenient way of communication between people and machines. It conveys information about the identity of the talker. Since speaker recognition technology is increasingly securing our everyday lives, the objective of this paper is to develop two automatic text-independent speaker verification systems (TI SV) using low-level spectral features and machine learning methods. (i) The first system is based on a support vector machine (SVM), which was widely used in voice signal processing with the aim of speaker recognition involving verifying the identity of the speaker based on its voice characteristics, and (ii) the second is based on Gaussian Mixture Model (GMM) and Universal Background Model (UBM) to combine different functions from different resources to implement the SVM based.

Keywords: speaker verification, text-independent, support vector machine, Gaussian mixture model, cepstral analysis

Procedia PDF Downloads 28
1392 Text Mining of Veterinary Forums for Epidemiological Surveillance Supplementation

Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves

Abstract:

Web scraping and text mining are popular computer science methods deployed by public health researchers to augment traditional epidemiological surveillance. However, within veterinary disease surveillance, such techniques are still in the early stages of development and have not yet been fully utilised. This study presents an exploration into the utility of incorporating internet-based data to better understand the smallholder farming communities within Scotland by using online text extraction and the subsequent mining of this data. Web scraping of the livestock fora was conducted in conjunction with text mining of the data in search of common themes, words, and topics found within the text. Results from bi-grams and topic modelling uncover four main topics of interest within the data pertaining to aspects of livestock husbandry: feeding, breeding, slaughter, and disposal. These topics were found amongst both the poultry and pig sub-forums. Topic modeling appears to be a useful method of unsupervised classification regarding this form of data, as it has produced clusters that relate to biosecurity and animal welfare. Internet data can be a very effective tool in aiding traditional veterinary surveillance methods, but the requirement for human validation of said data is crucial. This opens avenues of research via the incorporation of other dynamic social media data, namely Twitter and Facebook/Meta, in addition to time series analysis to highlight temporal patterns.

Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, smallholding, social media, web scraping, sentiment analysis, geolocation, text mining, NLP

Procedia PDF Downloads 72
1391 Evolutionary Methods in Cryptography

Authors: Wafa Slaibi Alsharafat

Abstract:

Genetic algorithms (GA) are random algorithms as random numbers that are generated during the operation of the algorithm determine what happens. This means that if GA is applied twice to optimize exactly the same problem it might produces two different answers. In this project, we propose an evolutionary algorithm and Genetic Algorithm (GA) to be implemented in symmetric encryption and decryption. Here, user's message and user secret information (key) which represent plain text to be transferred into cipher text.

Keywords: GA, encryption, decryption, crossover

Procedia PDF Downloads 419
1390 3D Object Retrieval Based on Similarity Calculation in 3D Computer Aided Design Systems

Authors: Ahmed Fradi

Abstract:

Nowadays, recent technological advances in the acquisition, modeling, and processing of three-dimensional (3D) objects data lead to the creation of models stored in huge databases, which are used in various domains such as computer vision, augmented reality, game industry, medicine, CAD (Computer-aided design), 3D printing etc. On the other hand, the industry is currently benefiting from powerful modeling tools enabling designers to easily and quickly produce 3D models. The great ease of acquisition and modeling of 3D objects make possible to create large 3D models databases, then, it becomes difficult to navigate them. Therefore, the indexing of 3D objects appears as a necessary and promising solution to manage this type of data, to extract model information, retrieve an existing model or calculate similarity between 3D objects. The objective of the proposed research is to develop a framework allowing easy and fast access to 3D objects in a CAD models database with specific indexing algorithm to find objects similar to a reference model. Our main objectives are to study existing methods of similarity calculation of 3D objects (essentially shape-based methods) by specifying the characteristics of each method as well as the difference between them, and then we will propose a new approach for indexing and comparing 3D models, which is suitable for our case study and which is based on some previously studied methods. Our proposed approach is finally illustrated by an implementation, and evaluated in a professional context.

Keywords: CAD, 3D object retrieval, shape based retrieval, similarity calculation

Procedia PDF Downloads 244
1389 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction

Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Abstract:

Information representation as tables is compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to appropriate the corresponding cell in dataframe, which can be stored as comma-separated values, database, excel, and multiple other usable formats.

Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation

Procedia PDF Downloads 124
1388 Grammatical and Lexical Cohesion in the Japan’s Prime Minister Shinzo Abe’s Speech Text ‘Nihon wa Modottekimashita’

Authors: Nadya Inda Syartanti

Abstract:

This research aims to identify, classify, and analyze descriptively the aspects of grammatical and lexical cohesion in the speech text of Japan’s Prime Minister Shinzo Abe entitled Nihon wa Modotte kimashita delivered in Washington DC, the United States on February 23, 2013, as a research data source. The method used is qualitative research, which uses descriptions through words that are applied by analyzing aspects of grammatical and lexical cohesion proposed by Halliday and Hasan (1976). The aspects of grammatical cohesion consist of references (personal, demonstrative, interrogative pronouns), substitution, ellipsis, and conjunction. In contrast, lexical cohesion consists of reiteration (repetition, synonym, antonym, hyponym, meronym) and collocation. Data classification is based on the 6 aspects of the cohesion. Through some aspects of cohesion, this research tries to find out the frequency of using grammatical and lexical cohesion in Shinzo Abe's speech text entitled Nihon wa Modotte kimashita. The results of this research are expected to help overcome the difficulty of understanding speech texts in Japanese. Therefore, this research can be a reference for learners, researchers, and anyone who is interested in the field of discourse analysis.

Keywords: cohesion, grammatical cohesion, lexical cohesion, speech text, Shinzo Abe

Procedia PDF Downloads 137
1387 Spontaneous Message Detection of Annoying Situation in Community Networks Using Mining Algorithm

Authors: P. Senthil Kumari

Abstract:

Main concerns in data mining investigation are social controls of data mining for handling ambiguity, noise, or incompleteness on text data. We describe an innovative approach for unplanned text data detection of community networks achieved by classification mechanism. In a tangible domain claim with humble secrecy backgrounds provided by community network for evading annoying content is presented on consumer message partition. To avoid this, mining methodology provides the capability to unswervingly switch the messages and similarly recover the superiority of ordering. Here we designated learning-centered mining approaches with pre-processing technique to complete this effort. Our involvement of work compact with rule-based personalization for automatic text categorization which was appropriate in many dissimilar frameworks and offers tolerance value for permits the background of comments conferring to a variety of conditions associated with the policy or rule arrangements processed by learning algorithm. Remarkably, we find that the choice of classifier has predicted the class labels for control of the inadequate documents on community network with great value of effect.

Keywords: text mining, data classification, community network, learning algorithm

Procedia PDF Downloads 480
1386 Automatic Extraction of Water Bodies Using Whole-R Method

Authors: Nikhat Nawaz, S. Srinivasulu, P. Kesava Rao

Abstract:

Feature extraction plays an important role in many remote sensing applications. Automatic extraction of water bodies is of great significance in many remote sensing applications like change detection, image retrieval etc. This paper presents a procedure for automatic extraction of water information from remote sensing images. The algorithm uses the relative location of R-colour component of the chromaticity diagram. This method is then integrated with the effectiveness of the spatial scale transformation of whole method. The whole method is based on water index fitted from spectral library. Experimental results demonstrate the improved accuracy and effectiveness of the integrated method for automatic extraction of water bodies.

Keywords: feature extraction, remote sensing, image retrieval, chromaticity, water index, spectral library, integrated method

Procedia PDF Downloads 360
1385 Short Text Classification for Saudi Tweets

Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq

Abstract:

Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.

Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter

Procedia PDF Downloads 131
1384 Development of Innovative Islamic Web Applications

Authors: Farrukh Shahzad

Abstract:

The rich Islamic resources related to religious text, Islamic sciences, and history are widely available in print and in electronic format online. However, most of these works are only available in Arabic language. In this research, an attempt is made to utilize these resources to create interactive web applications in Arabic, English and other languages. The system utilizes the Pattern Recognition, Knowledge Management, Data Mining, Information Retrieval and Management, Indexing, storage and data-analysis techniques to parse, store, convert and manage the information from authentic Arabic resources. These interactive web Apps provide smart multi-lingual search, tree based search, on-demand information matching and linking. In this paper, we provide details of application architecture, design, implementation and technologies employed. We also presented the summary of web applications already developed. We have also included some screen shots from the corresponding web sites. These web applications provide an Innovative On-line Learning Systems (eLearning and computer based education).

Keywords: Islamic resources, Muslim scholars, hadith, narrators, history, fiqh

Procedia PDF Downloads 262
1383 Moral Wrongdoers: Evaluating the Value of Moral Actions Performed by War Criminals

Authors: Jean-Francois Caron

Abstract:

This text explores the value of moral acts performed by war criminals, and the extent to which they should alleviate the punishment these individuals ought to receive for violating the rules of war. Without neglecting the necessity of retribution in war crimes cases, it argues from an ethical perspective that we should not rule out the possibility of considering lesser punishments for war criminals who decide to perform a moral act, as it might produce significant positive moral outcomes. This text also analyzes how such a norm could be justified from a moral perspective.

Keywords: war criminals, pardon, amnesty, retribution

Procedia PDF Downloads 259
1382 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 192
1381 Identification of Text Domains and Register Variation through the Analysis of Lexical Distribution in a Bangla Mass Media Text Corpus

Authors: Mahul Bhattacharyya, Niladri Sekhar Dash

Abstract:

The present research paper is an experimental attempt to investigate the nature of variation in the register in three major text domains, namely, social, cultural, and political texts collected from the corpus of Bangla printed mass media texts. This present study uses a corpus of a moderate amount of Bangla mass media text that contains nearly one million words collected from different media sources like newspapers, magazines, advertisements, periodicals, etc. The analysis of corpus data reveals that each text has certain lexical properties that not only control their identity but also mark their uniqueness across the domains. At first, the subject domains of the texts are classified into two parameters namely, ‘Genre' and 'Text Type'. Next, some empirical investigations are made to understand how the domains vary from each other in terms of lexical properties like both function and content words. Here the method of comparative-cum-contrastive matching of lexical load across domains is invoked through word frequency count to track how domain-specific words and terms may be marked as decisive indicators in the act of specifying the textual contexts and subject domains. The study shows that the common lexical stock that percolates across all text domains are quite dicey in nature as their lexicological identity does not have any bearing in the act of specifying subject domains. Therefore, it becomes necessary for language users to anchor upon certain domain-specific lexical items to recognize a text that belongs to a specific text domain. The eventual findings of this study confirm that texts belonging to different subject domains in Bangla news text corpus clearly differ on the parameters of lexical load, lexical choice, lexical clustering, lexical collocation. In fact, based on these parameters, along with some statistical calculations, it is possible to classify mass media texts into different types to mark their relation with regard to the domains they should actually belong. The advantage of this analysis lies in the proper identification of the linguistic factors which will give language users a better insight into the method they employ in text comprehension, as well as construct a systemic frame for designing text identification strategy for language learners. The availability of huge amount of Bangla media text data is useful for achieving accurate conclusions with a certain amount of reliability and authenticity. This kind of corpus-based analysis is quite relevant for a resource-poor language like Bangla, as no attempt has ever been made to understand how the structure and texture of Bangla mass media texts vary due to certain linguistic and extra-linguistic constraints that are actively operational to specific text domains. Since mass media language is assumed to be the most 'recent representation' of the actual use of the language, this study is expected to show how the Bangla news texts reflect the thoughts of the society and how they leave a strong impact on the thought process of the speech community.

Keywords: Bangla, corpus, discourse, domains, lexical choice, mass media, register, variation

Procedia PDF Downloads 158
1380 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh

Abstract:

Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 132
1379 Smartphones: Tools for Enhancing Teaching in Nigeria’s Higher Institutions

Authors: Ma'amun Muhammed

Abstract:

The ability of smartphones in enhancing communication, providing access to business and serving as a pool for information retrieval has a far reaching and potentially beneficial impacts on enhancing teaching in higher institutions in the developing countries like Nigeria. Nigeria as one of the fast growing economies in Africa, whose citizens patronize smartphones can utilize this opportunity by inculcating the culture of using smartphones not only for communication, business transaction, banking etc. but also for enhancing teaching in the higher institutions. Smartphones have become part and parcel of our lives, particularly among young people. The primary objective of this paper is to ascertain the use of smartphones in enhancing teaching in Nigeria’s higher institutions, to achieve this, content analysis was used thoroughly. This paper examines the opportunities offered by smartphones to the students of higher institutions of learning, the challenges being faced by lecturers of these institutions in classrooms. Lastly, it offers solution on how some of these critical challenges will be overcame, so as to utilize the technology of these devices.

Keywords: communication, information retrieval, mobile phone, smartphones teaching

Procedia PDF Downloads 395
1378 The Application of a Hybrid Neural Network for Recognition of a Handwritten Kazakh Text

Authors: Almagul Assainova , Dariya Abykenova, Liudmila Goncharenko, Sergey Sybachin, Saule Rakhimova, Abay Aman

Abstract:

The recognition of a handwritten Kazakh text is a relevant objective today for the digitization of materials. The study presents a model of a hybrid neural network for handwriting recognition, which includes a convolutional neural network and a multi-layer perceptron. Each network includes 1024 input neurons and 42 output neurons. The model is implemented in the program, written in the Python programming language using the EMNIST database, NumPy, Keras, and Tensorflow modules. The neural network training of such specific letters of the Kazakh alphabet as ә, ғ, қ, ң, ө, ұ, ү, h, і was conducted. The neural network model and the program created on its basis can be used in electronic document management systems to digitize the Kazakh text.

Keywords: handwriting recognition system, image recognition, Kazakh font, machine learning, neural networks

Procedia PDF Downloads 237
1377 The Effects of Three Pre-Reading Activities (Text Summary, Vocabulary Definition, and Pre-Passage Questions) on the Reading Comprehension of Iranian EFL Learners

Authors: Leila Anjomshoa, Firooz Sadighi

Abstract:

This study investigated the effects of three types of pre-reading activities (vocabulary definitions, text summary and pre-passage questions) on EFL learners’ English reading comprehension. On the basis of the results of a placement test administered to two hundred and thirty English students at Kerman Azad University, 200 subjects (one hundred intermediate and one hundred advanced) were selected.Four texts, two of them at intermediate level and two of them at advanced level were chosen. The data gathered was subjected to the statistical procedures of ANOVA. A close examination of the results through Tukey’s HSD showed the fact that the experimental groups performed better than the control group, highlighting the effect of the treatment on them. Also, the experimental group C (text summary), performed remarkably better than the other three groups (both experimental & control). Group B subjects, vocabulary definitions, performed better than groups A and D. The pre-passage questions group’s (D) performance showed higher scores than the control condition.

Keywords: pre-reading activities, text summary, vocabulary definition, and pre-passage questions, reading comprehension

Procedia PDF Downloads 326
1376 Retrieving Similar Segmented Objects Using Motion Descriptors

Authors: Konstantinos C. Kartsakalis, Angeliki Skoura, Vasileios Megalooikonomou

Abstract:

The fuzzy composition of objects depicted in images acquired through MR imaging or the use of bio-scanners has often been a point of controversy for field experts attempting to effectively delineate between the visualized objects. Modern approaches in medical image segmentation tend to consider fuzziness as a characteristic and inherent feature of the depicted object, instead of an undesirable trait. In this paper, a novel technique for efficient image retrieval in the context of images in which segmented objects are either crisp or fuzzily bounded is presented. Moreover, the proposed method is applied in the case of multiple, even conflicting, segmentations from field experts. Experimental results demonstrate the efficiency of the suggested method in retrieving similar objects from the aforementioned categories while taking into account the fuzzy nature of the depicted data.

Keywords: fuzzy object, fuzzy image segmentation, motion descriptors, MRI imaging, object-based image retrieval

Procedia PDF Downloads 355
1375 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: information retrieval, unified medical language system, syntax based analysis, natural language processing, medical informatics

Procedia PDF Downloads 111
1374 Entropy in a Field of Emergence in an Aspect of Linguo-Culture

Authors: Nurvadi Albekov

Abstract:

Communicative situation is a basis, which designates potential models of ‘constructed forms’, a motivated basis of a text, for a text can be assumed as a product of the communicative situation. It is within the field of emergence the models of text, that can be potentially prognosticated in a certain communicative situation, are designated. Every text can be assumed as conceptual system structured on the base of certain communicative situation. However in the process of ‘structuring’ of a certain model of ‘conceptual system’ consciousness of a recipient is able act only within the border of the field of emergence for going out of this border indicates misunderstanding of the communicative situation. On the base of communicative situation we can witness the increment of meaning where the synergizing of the informative model of communication, formed by using of the invariant units of a language system, is a result of verbalization of the communicative situation. The potential of the models of a text, prognosticated within the field of emergence, also depends on the communicative situation. The conception ‘the field of emergence’ is interpreted as a unit of the language system, having poly-directed universal structure, implying the presence of the core, the center and the periphery, including different levels of means of a functioning system of language, both in terms of linguistic resources, and in terms of extra linguistic factors interaction of which results increment of a text. The conception ‘field of emergence’ is considered as the most promising in the analysis of texts: oral, written, printed and electronic. As a unit of the language system field of emergence has several properties that predict its use during the study of a text in different levels. This work is an attempt analysis of entropy in a text in the aspect of lingua-cultural code, prognosticated within the model of the field of emergence. The article describes the problem of entropy in the field of emergence, caused by influence of the extra-linguistic factors. The increasing of entropy is caused not only by the fact of intrusion of the language resources but by influence of the alien culture in a whole, and by appearance of non-typical for this very culture symbols in the field of emergence. The borrowing of alien lingua-cultural symbols into the lingua-culture of the author is a reason of increasing the entropy when constructing a text both in meaning and in structuring level. It is nothing but artificial formatting of lexical units that violate stylistic unity of a phrase. It is marked that one of the important characteristics descending the entropy in the field of emergence is a typical similarity of lexical and semantic resources of the different lingua-cultures in aspects of extra linguistic factors.

Keywords: communicative situation, field of emergence, lingua-culture, entropy

Procedia PDF Downloads 341
1373 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band

Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant K. Srivastava

Abstract:

An approach was evaluated for the retrieval of soil moisture of bare soil surface using bistatic scatterometer data in the angular range of 200 to 700 at VV- and HH- polarization. The microwave data was acquired by specially designed X-band (10 GHz) bistatic scatterometer. The linear regression analysis was done between scattering coefficients and soil moisture content to select the suitable incidence angle for retrieval of soil moisture content. The 250 incidence angle was found more suitable. The support vector regression analysis was used to approximate the function described by the input-output relationship between the scattering coefficient and corresponding measured values of the soil moisture content. The performance of support vector regression algorithm was evaluated by comparing the observed and the estimated soil moisture content by statistical performance indices %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 2.9451, 1.0986, and 0.9214, respectively at HH-polarization. At VV- polarization, the values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 3.6186, 0.9373, and 0.9428, respectively.

Keywords: bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE

Procedia PDF Downloads 399
1372 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

The quality of a product to be usable has become the basic requirement in consumer’s perspective while failing the requirement ends up the customer from not using the product. Identifying usability issues from analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids in the process of product design, yet the lack of studies and researches regarding analysis methodologies in qualitative text data of usability field inhibits the potential of these data for more useful applications. While the possibility of analyzing qualitative text data found with the rapid development of data analysis studies such as natural language processing field in understanding human language in computer, and machine learning field in providing predictive model and clustering tool. Therefore, this research aims to study the application capability of text processing algorithm in analysis of qualitative text data collected from usability activities. This research utilized datasets collected from LG neckband headset usability experiment in which the datasets consist of headset survey text data, subject’s data and product physical data. In the analysis procedure, which integrated with the text-processing algorithm, the process includes training of comments onto vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the cluster of comment vectors where centroid comments of a cluster emphasized more on button positions, while centroid comments of the other cluster emphasized more on button interface issues. When volume and music control buttons are designed separately, the participant experienced less confusion, and thus, the comments mentioned only about the buttons' positions. While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 265
1371 The Difference of Learning Outcomes in Reading Comprehension between Text and Film as The Media in Indonesian Language for Foreign Speaker in Intermediate Level

Authors: Siti Ayu Ningsih

Abstract:

This study aims to find the differences outcomes in learning reading comprehension with text and film as media on Indonesian Language for foreign speaker (BIPA) learning at intermediate level. By using quantitative and qualitative research methods, the respondent of this study is a single respondent from D'Royal Morocco Integrative Islamic School in grade nine from secondary level. Quantitative method used to calculate the learning outcomes that have been given the appropriate action cycle, whereas qualitative method used to translate the findings derived from quantitative methods to be described. The technique used in this study is the observation techniques and testing work. Based on the research, it is known that the use of the text media is more effective than the film for intermediate level of Indonesian Language for foreign speaker learner. This is because, when using film the learner does not have enough time to take note the difficult vocabulary and don't have enough time to look for the meaning of the vocabulary from the dictionary. While the use of media texts shows the better effectiveness because it does not require additional time to take note the difficult words. For the words that are difficult or strange, the learner can immediately find its meaning from the dictionary. The presence of the text is also very helpful for Indonesian Language for foreign speaker learner to find the answers according to the questions more easily. By matching the vocabulary of the question into the text references.

Keywords: Indonesian language for foreign speaker, learning outcome, media, reading comprehension

Procedia PDF Downloads 180