Search results for: text information retrieval
11339 Identifying Concerned Citizen Communication Style During the State Parliamentary Elections in Bavaria
Authors: Volker Mittendorf, Andre Schmale
Abstract:
In this case study, we want to explore the Twitter-use of candidates during the state parliamentary elections-year 2018 in Bavaria, Germany. This paper focusses on the seven parties that probably entered the parliament. Against this background, the paper classifies the use of language as populism which itself is considered as a political communication style. First, we determine the election campaigns which started in the years 2017 on Twitter, after that we categorize the posting times of the different direct candidates in order to derive ideal types from our empirical data. Second, we have done the exploration based on the dictionary of concerned citizens which contains German political language of the right and the far right. According to that, we are analyzing the corpus with methods of text mining and social network analysis, and afterwards we display the results in a network of words of concerned citizen communication style (CCCS).Keywords: populism, communication style, election, text mining, social media
Procedia PDF Downloads 14911338 Convolutional Neural Networks-Optimized Text Recognition with Binary Embeddings for Arabic Expiry Date Recognition
Authors: Mohamed Lotfy, Ghada Soliman
Abstract:
Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot matrix format. The proposed model is based on Convolutional Neural Networks (CNN) that take the dot matrix as input and generate embeddings that are rounded to generate binary representations of the digits. The binary embeddings are then used to perform Optical Character Recognition (OCR) on the digit images. To overcome the challenge of the limited availability of dotted Arabic expiration date images, we developed a True Type Font (TTF) for generating synthetic images of Arabic dot-matrix characters. The model was trained on a synthetic dataset of 3287 images and 658 synthetic images for testing, representing realistic expiration dates from 2019 to 2027 in the format of yyyy/mm/dd. Our model achieved an accuracy of 98.94% on the expiry date recognition with Arabic dot matrix format using fewer parameters and less computational resources than traditional CNN-based models. By investigating and presenting our findings comprehensively, we aim to contribute substantially to the field of OCR and pave the way for advancements in Arabic dot-matrix character recognition. Our proposed approach is not limited to Arabic dot matrix digit recognition but can also be extended to text recognition tasks, such as text classification and sentiment analysis.Keywords: computer vision, pattern recognition, optical character recognition, deep learning
Procedia PDF Downloads 9311337 The Use of Information and Communication Technologies in Electoral Procedures: Comments on Electronic Voting Security
Authors: Magdalena Musiał-Karg
Abstract:
The expansion of telecommunication and progress of electronic media constitute important elements of our times. The recent worldwide convergence of information and communication technologies (ICT) and dynamic development of the mass media is leading to noticeable changes in the functioning of contemporary states and societies. Currently, modern technologies play more and more important roles and filter down to almost every field of contemporary human life. It results in the growth of online interactions that can be observed by the inconceivable increase in the number of people with home PCs and Internet access. The proof of it is undoubtedly the emergence and use of concepts such as e-society, e-banking, e-services, e-government, e-government, e-participation and e-democracy. The newly coined word e-democracy evidences that modern technologies have also been widely used in politics. Without any doubt in most countries all actors of political market (politicians, political parties, servants in political/public sector, media) use modern forms of communication with the society. Most of these modern technologies progress the processes of getting and sending information to the citizens, communication with the electorate, and also – which seems to be the biggest advantage – electoral procedures. Thanks to implementation of ICT the interaction between politicians and electorate are improved. The main goal of this text is to analyze electronic voting (e-voting) as one of the important forms of electronic democracy in terms of security aspects. The author of this paper aimed at answering the questions of security of electronic voting as an additional form of participation in elections and referenda.Keywords: electronic democracy, electronic voting, security of e-voting, information and communication technology (ICT)
Procedia PDF Downloads 24011336 L1 Poetry and Moral Tales as a Factor Affecting L2 Acquisition in EFL Settings
Authors: Arif Ahmed Mohammed Al-Ahdal
Abstract:
Poetry, tales, and fables have always been a part of the L1 repertoire and one that takes the learners to another amazing and fascinating world of imagination. The storytelling class and the genre of poems are activities greatly enjoyed by all age groups. The very significant idea behind their inclusion in the language curriculum is to sensitize young minds to a wide range of human emotions that are believed to greatly contribute to building their social resilience, emotional stability, empathy towards fellow creatures, and literacy. Quite certainly, the learning objective at this stage is not language acquisition (though it happens as an automatic process) but getting the young learners to be acquainted with an entire spectrum of what may be called the ‘noble’ abilities of the human race. They enrich their very existence, inspiring them to unearth ‘selves’ that help them as adults and enable them to co-exist fruitfully and symbiotically with their fellow human beings. By extension, ‘higher’ training in these literature genres shows the universality of human emotions, sufferings, aspirations, and hopes. The current study is anchored on the Reader-Response-Theory in literature learning, which suggests that the reader reconstructs work and re-enacts the author's creative role. Reiteratingly, literary works provide clues or verbal symbols in a linguistic system, widely accepted by everyone who shares the language, but everyone reads their own life experiences and situations into them. The significance of words depends on the reader, even if they have a typical relationship. In every reading, there is an interaction between the reader and the text. The process of reading is an experience in which the reader tries to comprehend the literary work, which surpasses its full potential since it provides emotional and intellectual reactions that are not anticipated from the document but cannot be affirmed just by the reader as a part of the text. The idea is that the text forms the basis of a unifying experience. A reinterpretation of the literary text may transform it into a guiding principle to respond to actual experiences and personal memories. The impulses delivered to the reader vary according to poetry or texts; nevertheless, the readers differ considerably even with the same material. Previous studies confirm that poetry is a useful tool for learning a language. This present paper works on these hypotheses and proposes to study the impetus given to L2 learning as a factor of exposure to poetry and meaningful stories in L1. The driving force behind the choice of this topic is the first-hand experience that the researcher had while teaching a literary text to a group of BA students who, as a reaction to the text, initially burst into tears and ultimately turned the class into an interactive session. The study also intends to compare the performance of male and female students post intervention using pre and post-tests, apart from undertaking a detailed inquiry via interviews with college learners of English to understand how L1 literature plays a great role in the acquisition of L2.Keywords: SLA, literary text, poetry, tales, affective factors
Procedia PDF Downloads 7711335 Enhance Construction Visual As-Built Schedule Management Using BIM Technology
Authors: Shu-Hui Jan, Hui-Ping Tserng, Shih-Ping Ho
Abstract:
Construction project control attempts to obtain real-time as-built schedule information and to eliminate project delays by effectively enhancing dynamic schedule control and management. Suitable platforms for enhancing an as-built schedule visually during the construction phase are necessary and important for general contractors. As the application of building information modeling (BIM) becomes more common, schedule management integrated with the BIM approach becomes essential to enhance visual construction management implementation for the general contractor during the construction phase. To enhance visualization of the updated as-built schedule for the general contractor, this study presents a novel system called the Construction BIM-assisted Schedule Management (ConBIM-SM) system for general contractors in
Keywords: building information modeling (BIM), construction schedule management, as-built schedule management, BIM schedule updating mechanism
Procedia PDF Downloads 37511334 Quick Response Codes in Physio: A Simple Click to Long-Term Oxygen Therapy Education
Authors: K. W. Lee, C. M. Choi, H. C. Tsang, W. K. Fong, Y. K. Cheng, L. Y. Chan, C. K. Yuen, P. W. Lau, Y. L. To, K. C. Chow
Abstract:
QR (Quick Response) Code is a matrix barcode. It enables users to open websites, photos and other information with mobile devices by just snapping the code. In usual Long Term Oxygen Therapy arrangement, piles of LTOT related information like leaflets from different oxygen service providers are given to patients to choose an appropriate plan according to their needs. If these printed materials are transformed into electronic format (QR Code), it would be more environmentally-friendly. More importantly, electronic materials including LTOT equipment operation and dyspnoea relieving techniques also empower patients in long-term disease management. The objective to this study is to investigate the effect of QR code in patient education on new LTOT users. This study was carried out in medical wards of North District Hospital. Adult patients and relatives who followed commands, were able to use smartphones with internet services and required LTOT arrangement on hospital discharge were recruited. In LTOT arrangement, apart from the usual LTOT education booklets which included patients’ personal information (e.g. oxygen titration and six-minute walk test results etc.), extra leaflets consisted of 1. QR codes of LTOT plans from different oxygen service providers, 2. Education materials of dyspnoea management and 3. Instructions on LTOT equipment operation were given. Upon completion of LTOT arrangement, a questionnaire about the use of QR code on patient education was filled in by patients or relatives. A total of 10 new LTOT users were recruited from November 2017 to January 2018. Initially, 70% of them did not know anything about the QR code, but all of them understood its operation after a simple demonstration. 70% of them agreed that it was convenient to use (20% strongly agree, 40% agree, 10% somewhat agree). 80% of them agreed that QR code could facilitate the retrieval of more LTOT related information (10% strongly agree, 70% agree) while 90% agreed that we should continue delivering QR code leaflets to new LTOT users in the future (30% strongly agree, 40% agree, 20% somewhat agree). It is proven that QR code is a convenient and environmentally-friendly tool to deliver information. It is also relatively easy to be introduced to new users. It has received welcoming feedbacks from current users.Keywords: long-term oxygen therapy, physiotherapy, patient education, QR code
Procedia PDF Downloads 14811333 Chemical Reaction Algorithm for Expectation Maximization Clustering
Authors: Li Ni, Pen ManMan, Li KenLi
Abstract:
Clustering is an intensive research for some years because of its multifaceted applications, such as biology, information retrieval, medicine, business and so on. The expectation maximization (EM) is a kind of algorithm framework in clustering methods, one of the ten algorithms of machine learning. Traditionally, optimization of objective function has been the standard approach in EM. Hence, research has investigated the utility of evolutionary computing and related techniques in the regard. Chemical Reaction Optimization (CRO) is a recently established method. So the property embedded in CRO is used to solve optimization problems. This paper presents an algorithm framework (EM-CRO) with modified CRO operators based on EM cluster problems. The hybrid algorithm is mainly to solve the problem of initial value sensitivity of the objective function optimization clustering algorithm. Our experiments mainly take the EM classic algorithm:k-means and fuzzy k-means as an example, through the CRO algorithm to optimize its initial value, get K-means-CRO and FKM-CRO algorithm. The experimental results of them show that there is improved efficiency for solving objective function optimization clustering problems.Keywords: chemical reaction optimization, expection maimization, initia, objective function clustering
Procedia PDF Downloads 71311332 Improved Pitch Detection Using Fourier Approximation Method
Authors: Balachandra Kumaraswamy, P. G. Poonacha
Abstract:
Automatic Music Information Retrieval has been one of the challenging topics of research for a few decades now with several interesting approaches reported in the literature. In this paper we have developed a pitch extraction method based on a finite Fourier series approximation to the given window of samples. We then estimate pitch as the fundamental period of the finite Fourier series approximation to the given window of samples. This method uses analysis of the strength of harmonics present in the signal to reduce octave as well as harmonic errors. The performance of our method is compared with three best known methods for pitch extraction, namely, Yin, Windowed Special Normalization of the Auto-Correlation Function and Harmonic Product Spectrum methods of pitch extraction. Our study with artificially created signals as well as music files show that Fourier Approximation method gives much better estimate of pitch with less octave and harmonic errors.Keywords: pitch, fourier series, yin, normalization of the auto- correlation function, harmonic product, mean square error
Procedia PDF Downloads 41211331 Robust Medical Image Watermarking Using Frequency Domain and Least Significant Bits Algorithms
Authors: Volkan Kaya, Ersin Elbasi
Abstract:
Watermarking and stenography are getting importance recently because of copyright protection and authentication. In watermarking we embed stamp, logo, noise or image to multimedia elements such as image, video, audio, animation and text. There are several works have been done in watermarking for different purposes. In this research work, we used watermarking techniques to embed patient information into the medical magnetic resonance (MR) images. There are two methods have been used; frequency domain (Digital Wavelet Transform-DWT, Digital Cosine Transform-DCT, and Digital Fourier Transform-DFT) and spatial domain (Least Significant Bits-LSB) domain. Experimental results show that embedding in frequency domains resist against one type of attacks, and embedding in spatial domain is resist against another group of attacks. Peak Signal Noise Ratio (PSNR) and Similarity Ratio (SR) values are two measurement values for testing. These two values give very promising result for information hiding in medical MR images.Keywords: watermarking, medical image, frequency domain, least significant bits, security
Procedia PDF Downloads 28711330 Multi-Dimensional Experience of Processing Textual and Visual Information: Case Study of Allocations to Places in the Mind’s Eye Based on Individual’s Semantic Knowledge Base
Authors: Joanna Wielochowska, Aneta Wielochowska
Abstract:
Whilst the relationship between scientific areas such as cognitive psychology, neurobiology and philosophy of mind has been emphasized in recent decades of scientific research, concepts and discoveries made in both fields overlap and complement each other in their quest for answers to similar questions. The object of the following case study is to describe, analyze and illustrate the nature and characteristics of a certain cognitive experience which appears to display features of synaesthesia, or rather high-level synaesthesia (ideasthesia). The following research has been conducted on the subject of two authors, monozygotic twins (both polysynaesthetes) experiencing involuntary associations of identical nature. Authors made attempts to identify which cognitive and conceptual dependencies may guide this experience. Operating on self-introduced nomenclature, the described phenomenon- multi-dimensional processing of textual and visual information- aims to define a relationship that involuntarily and immediately couples the content introduced by means of text or image a sensation of appearing in a certain place in the mind’s eye. More precisely: (I) defining a concept introduced by means of textual content during activity of reading or writing, or (II) defining a concept introduced by means of visual content during activity of looking at image(s) with simultaneous sensation of being allocated to a given place in the mind’s eye. A place can be then defined as a cognitive representation of a certain concept. During the activity of processing information, a person has an immediate and involuntary feel of appearing in a certain place themselves, just like a character of a story, ‘observing’ a venue or a scenery from one or more perspectives and angles. That forms a unique and unified experience, constituting a background mental landscape of text or image being looked at. We came to a conclusion that semantic allocations to a given place could be divided and classified into the categories and subcategories and are naturally linked with an individual’s semantic knowledge-base. A place can be defined as a representation one’s unique idea of a given concept that has been established in their semantic knowledge base. A multi-level structure of selectivity of places in the mind’s eye, as a reaction to a given information (one stimuli), draws comparisons to structures and patterns found in botany. Double-flowered varieties of flowers and a whorl system (arrangement) which is characteristic to components of some flower species were given as an illustrative example. A composition of petals that fan out from one single point and wrap around a stem inspired an idea that, just like in nature, in philosophy of mind there are patterns driven by the logic specific to a given phenomenon. The study intertwines terms perceived through the philosophical lens, such as definition of meaning, subjectivity of meaning, mental atmosphere of places, and others. Analysis of this rare experience aims to contribute to constantly developing theoretical framework of the philosophy of mind and influence the way human semantic knowledge base and processing given content in terms of distinguishing between information and meaning is researched.Keywords: information and meaning, information processing, mental atmosphere of places, patterns in nature, philosophy of mind, selectivity, semantic knowledge base, senses, synaesthesia
Procedia PDF Downloads 12411329 A Phenomenological Approach to Computational Modeling of Analogy
Authors: José Eduardo García-Mendiola
Abstract:
In this work, a phenomenological approach to computational modeling of analogy processing is carried out. The paper goes through the consideration of the structure of the analogy, based on the possibility of sustaining the genesis of its elements regarding Husserl's genetic theory of association. Among particular processes which take place in order to get analogical inferences, there is one which arises crucial for enabling efficient base cases retrieval through long-term memory, namely analogical transference grounded on familiarity. In general, it has been argued that analogical reasoning is a way by which a conscious agent tries to determine or define a certain scope of objects and relationships between them using previous knowledge of other familiar domain of objects and relations. However, looking for a complete description of analogy process, a deeper consideration of phenomenological nature is required in so far, its simulation by computational programs is aimed. Also, one would get an idea of how complex it would be to have a fully computational account of the analogy elements. In fact, familiarity is not a result of a mere chain of repetitions of objects or events but generated insofar as the object/attribute or event in question is integrable inside a certain context that is taking shape as functionalities and functional approaches or perspectives of the object are being defined. Its familiarity is generated not by the identification of its parts or objective determinations as if they were isolated from those functionalities and approaches. Rather, at the core of such a familiarity between entities of different kinds lays the way they are functionally encoded. So, and hoping to make deeper inroads towards these topics, this essay allows us to consider that cognitive-computational perspectives can visualize, from the phenomenological projection of the analogy process reviewing achievements already obtained as well as exploration of new theoretical-experimental configurations towards implementation of analogy models in specific as well as in general purpose machines.Keywords: analogy, association, encoding, retrieval
Procedia PDF Downloads 12111328 Ancient Port Towns of Western Coastal Plain in Kerala, India: From Manuscripts to Material Remains
Authors: Saravanan R.
Abstract:
The landscape of Kerala was paved way for the growth of maritime contacts with foreigners. Pepper was the important exported item from here because this region only having pepper production on the West Coast of India. The paper is attempting to analysis the available references of ancient port town in Kerala. It is merely preliminary investigation about Early Historic urban centres with the available literary evidences and excavations reports that would help us to understand the ancient port town in Kerala coast. There were number of ancient port towns mentioned in classical Greek and Sangam literatures. For instance, Naura, Tyndis, Nelcynda, Bacare and Muziris were the major sites of Kerala which represented only in the text but not able to locate these sites on the ground so far. There are lot of studies on site based as well as state based regarding the various aspects of ancient port towns. But, it is mainly focussed on factual narration and theoretical interpretation.Keywords: urban centre, amphora, Muziris, port town, Sangam text and trade
Procedia PDF Downloads 7111327 Perspectives of Computational Modeling in Sanskrit Lexicons
Authors: Baldev Ram Khandoliyan, Ram Kishor
Abstract:
India has a classical tradition of Sanskrit Lexicons. Research work has been done on the study of Indian lexicography. India has seen amazing strides in Information and Communication Technology (ICT) applications for Indian languages in general and for Sanskrit in particular. Since Machine Translation from Sanskrit to other Indian languages is often the desired goal, traditional Sanskrit lexicography has attracted a lot of attention from the ICT and Computational Linguistics community. From Nighaŋţu and Nirukta to Amarakośa and Medinīkośa, Sanskrit owns a rich history of lexicography. As these kośas do not follow the same typology or standard in the selection and arrangement of the words and the information related to them, several types of Kośa-styles have emerged in this tradition. The model of a grammar given by Aṣṭādhyāyī is well appreciated by Indian and western linguists and grammarians. But the different models provided by lexicographic tradition also have importance. The general usefulness of Sanskrit traditional Kośas is well discussed by some scholars. That is most of the matter made available in the text. Some also have discussed the good arrangement of lexica. This paper aims to discuss some more use of the different models of Sanskrit lexicography especially focusing on its computational modeling and its use in different computational operations.Keywords: computational lexicography, Sanskrit Lexicons, nighanṭu, kośa, Amarkosa
Procedia PDF Downloads 16411326 Mondoc: Informal Lightweight Ontology for Faceted Semantic Classification of Hypernymy
Authors: M. Regina Carreira-Lopez
Abstract:
Lightweight ontologies seek to concrete union relationships between a parent node, and a secondary node, also called "child node". This logic relation (L) can be formally defined as a triple ontological relation (LO) equivalent to LO in ⟨LN, LE, LC⟩, and where LN represents a finite set of nodes (N); LE is a set of entities (E), each of which represents a relationship between nodes to form a rooted tree of ⟨LN, LE⟩; and LC is a finite set of concepts (C), encoded in a formal language (FL). Mondoc enables more refined searches on semantic and classified facets for retrieving specialized knowledge about Atlantic migrations, from the Declaration of Independence of the United States of America (1776) and to the end of the Spanish Civil War (1939). The model looks forward to increasing documentary relevance by applying an inverse frequency of co-ocurrent hypernymy phenomena for a concrete dataset of textual corpora, with RMySQL package. Mondoc profiles archival utilities implementing SQL programming code, and allows data export to XML schemas, for achieving semantic and faceted analysis of speech by analyzing keywords in context (KWIC). The methodology applies random and unrestricted sampling techniques with RMySQL to verify the resonance phenomena of inverse documentary relevance between the number of co-occurrences of the same term (t) in more than two documents of a set of texts (D). Secondly, the research also evidences co-associations between (t) and their corresponding synonyms and antonyms (synsets) are also inverse. The results from grouping facets or polysemic words with synsets in more than two textual corpora within their syntagmatic context (nouns, verbs, adjectives, etc.) state how to proceed with semantic indexing of hypernymy phenomena for subject-heading lists and for authority lists for documentary and archival purposes. Mondoc contributes to the development of web directories and seems to achieve a proper and more selective search of e-documents (classification ontology). It can also foster on-line catalogs production for semantic authorities, or concepts, through XML schemas, because its applications could be used for implementing data models, by a prior adaptation of the based-ontology to structured meta-languages, such as OWL, RDF (descriptive ontology). Mondoc serves to the classification of concepts and applies a semantic indexing approach of facets. It enables information retrieval, as well as quantitative and qualitative data interpretation. The model reproduces a triple tuple ⟨LN, LE, LT, LCF L, BKF⟩ where LN is a set of entities that connect with other nodes to concrete a rooted tree in ⟨LN, LE⟩. LT specifies a set of terms, and LCF acts as a finite set of concepts, encoded in a formal language, L. Mondoc only resolves partial problems of linguistic ambiguity (in case of synonymy and antonymy), but neither the pragmatic dimension of natural language nor the cognitive perspective is addressed. To achieve this goal, forthcoming programming developments should target at oriented meta-languages with structured documents in XML.Keywords: hypernymy, information retrieval, lightweight ontology, resonance
Procedia PDF Downloads 12511325 Adaptation of Hough Transform Algorithm for Text Document Skew Angle Detection
Authors: Kayode A. Olaniyi, Olabanji F. Omotoye, Adeola A. Ogunleye
Abstract:
The skew detection and correction form an important part of digital document analysis. This is because uncompensated skew can deteriorate document features and can complicate further document image processing steps. Efficient text document analysis and digitization can rarely be achieved when a document is skewed even at a small angle. Once the documents have been digitized through the scanning system and binarization also achieved, document skew correction is required before further image analysis. Research efforts have been put in this area with algorithms developed to eliminate document skew. Skew angle correction algorithms can be compared based on performance criteria. Most important performance criteria are accuracy of skew angle detection, range of skew angle for detection, speed of processing the image, computational complexity and consequently memory space used. The standard Hough Transform has successfully been implemented for text documentation skew angle estimation application. However, the standard Hough Transform algorithm level of accuracy depends largely on how much fine the step size for the angle used. This consequently consumes more time and memory space for increase accuracy and, especially where number of pixels is considerable large. Whenever the Hough transform is used, there is always a tradeoff between accuracy and speed. So a more efficient solution is needed that optimizes space as well as time. In this paper, an improved Hough transform (HT) technique that optimizes space as well as time to robustly detect document skew is presented. The modified algorithm of Hough Transform presents solution to the contradiction between the memory space, running time and accuracy. Our algorithm starts with the first step of angle estimation accurate up to zero decimal place using the standard Hough Transform algorithm achieving minimal running time and space but lacks relative accuracy. Then to increase accuracy, suppose estimated angle found using the basic Hough algorithm is x degree, we then run again basic algorithm from range between ±x degrees with accuracy of one decimal place. Same process is iterated till level of desired accuracy is achieved. The procedure of our skew estimation and correction algorithm of text images is implemented using MATLAB. The memory space estimation and process time are also tabulated with skew angle assumption of within 00 and 450. The simulation results which is demonstrated in Matlab show the high performance of our algorithms with less computational time and memory space used in detecting document skew for a variety of documents with different levels of complexity.Keywords: hough-transform, skew-detection, skew-angle, skew-correction, text-document
Procedia PDF Downloads 15811324 Direct Blind Separation Methods for Convolutive Images Mixtures
Authors: Ahmed Hammed, Wady Naanaa
Abstract:
In this paper, we propose a general approach to deal with the problem of a convolutive mixture of images. We use a direct blind source separation method by adding only one non-statistical justified constraint describing the relationships between different mixing matrix at the aim to make its resolution easy. This method can be applied, provided that this constraint is known, to degraded document affected by the overlapping of text-patterns and images. This is due to chemical and physical reactions of the materials (paper, inks,...) occurring during the documents aging, and other unpredictable causes such as humidity, microorganism infestation, human handling, etc. We will demonstrate that this problem corresponds to a convolutive mixture of images. Subsequently, we will show how the validation of our method through numerical examples. We can so obtain clear images from unreadable ones which can be caused by pages superposition, a phenomenon similar to that we find every often in archival documents.Keywords: blind source separation, convoluted mixture, degraded documents, text-patterns overlapping
Procedia PDF Downloads 32211323 Building Knowledge Society: The Imperative Role of Library and Information Centres (LICs) in Developing Countries
Authors: Desmond Chinedu Oparaku, Oyemike Victor Benson, Ifeyinwa A. Ariole
Abstract:
A critical examination of the emerging knowledge society reveals that library and information centres have a significant role to play in the building of knowledge society. The major highlights of this paper include: the conceptual analysis of knowledge society, overview of library and information centres in developing countries, role of libraries and information centre in building up of knowledge society, library and information professionals as factor in building knowledge, challenges faced by Library and Information Centres (LICs) in building knowledge society, strategies for building knowledge society. The position of this paper is that in spite of the influx of varied information and communication technologies in the information industry which is the driving force of knowledge society, there is a dire need for Libraries and Information Centres (LIC) to contribute positively to the migration and transition processes from the information society to knowledge-based society.Keywords: information and communication technology (ICT), information centres, information industry, information society
Procedia PDF Downloads 37911322 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features
Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim
Abstract:
The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.Keywords: data mining, Korean linguistic feature, literary fiction, relationship extraction
Procedia PDF Downloads 38011321 A Comparative Study about the Use of SMS in Formal Writing of the Students in Universities
Authors: Sajjad Hussain
Abstract:
Technology has revolutionized the way of communication around the globe. Its use and users are multiplying with every passing minute. The current study reveals the effect of SMS on the formal writing of the students. Students are the regular users of this service and have become addict to short language. This short language is understandable to a particular community and not to the whole as it does not adhere to the Standard English writing practices. Data has been collected from quiz, assignments text and through questionaries’ which supports this postulate that students are frequently practicing it in their formal writing. Certain corrosive measures needs to be taken to address the issue. Second language learners have been found it practicing to greater extent.Keywords: information technology, SMS, messaging, communication, social media, internet, language
Procedia PDF Downloads 53311320 Advantages of Multispectral Imaging for Accurate Gas Temperature Profile Retrieval from Fire Combustion Reactions
Authors: Jean-Philippe Gagnon, Benjamin Saute, Stéphane Boubanga-Tombet
Abstract:
Infrared thermal imaging is used for a wide range of applications, especially in the combustion domain. However, it is well known that most combustion gases such as carbon dioxide (CO₂), water vapor (H₂O), and carbon monoxide (CO) selectively absorb/emit infrared radiation at discrete energies, i.e., over a very narrow spectral range. Therefore, temperature profiles of most combustion processes derived from conventional broadband imaging are inaccurate without prior knowledge or assumptions about the spectral emissivity properties of the combustion gases. Using spectral filters allows estimating these critical emissivity parameters in addition to providing selectivity regarding the chemical nature of the combustion gases. However, due to the turbulent nature of most flames, it is crucial that such information be obtained without sacrificing temporal resolution. For this reason, Telops has developed a time-resolved multispectral imaging system which combines a high-performance broadband camera synchronized with a rotating spectral filter wheel. In order to illustrate the benefits of using this system to characterize combustion experiments, measurements were carried out using a Telops MS-IR MW on a very simple combustion system: a wood fire. The temperature profiles calculated using the spectral information from the different channels were compared with corresponding temperature profiles obtained with conventional broadband imaging. The results illustrate the benefits of the Telops MS-IR cameras for the characterization of laminar and turbulent combustion systems at a high temporal resolution.Keywords: infrared, multispectral, fire, broadband, gas temperature, IR camera
Procedia PDF Downloads 14311319 Information and Cooperativity in Fiction: The Pragmatics of David Baboulene’s “Knowledge Gaps”
Authors: Cara DiGirolamo
Abstract:
In his 2017 Ph.D. thesis, script doctor David Baboulene presented a theory of fiction in which differences in the knowledge states between participants in a literary experience, including reader, author, and characters, create many story elements, among them suspense, expectations, subtext, theme, metaphor, and allegory. This theory can be adjusted and modeled by incorporating a formal pragmatic approach that understands narrative as a speech act with a conversational function. This approach requires both the Speaker and the Listener to be understood as participants in the discourse. It also uses theories of cooperativity and the QUD to identify the existence of implicit questions. This approach predicts that what an effective literary narrative must do: provide a conversational context early in the story so the reader can engage with the text as a conversational participant. In addition, this model incorporates schema theory. Schema theory is a cognitive model for learning and processing information about the world and transforming it into functional knowledge. Using this approach can extend the QUD model. Instead of describing conversation as a form of information gathering restricted to question-answer sets, the QUD can include knowledge modeling and understanding as a possible outcome of a conversation. With this model, Baboulene’s “Knowledge Gaps” can provide real insight into storytelling as a conversational move, and extend the QUD to be able to simply and effectively apply to a more diverse set of conversational interactions and also to narrative texts.Keywords: literature, speech acts, QUD, literary theory
Procedia PDF Downloads 211318 Topic-to-Essay Generation with Event Element Constraints
Authors: Yufen Qin
Abstract:
Topic-to-Essay generation is a challenging task in Natural language processing, which aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach for a topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in terms of subjective evaluation on a real dataset, indicating its capability to generate higher-quality topic-related text.Keywords: event element, language model, natural language processing, topic-to-essay generation.
Procedia PDF Downloads 23611317 Translation Quality Assessment: Proposing a Linguistic-Based Model for Translation Criticism with Considering Ideology and Power Relations
Authors: Mehrnoosh Pirhayati
Abstract:
In this study, the researcher tried to propose a model of Translation Criticism (TC) regarding the phenomenon of Translation Quality Assessment (TQA). With changing the general view on re/writing as an illegal act, the researcher defined a scale for the act of translation and determined the redline of translation with other products. This research attempts to show TC as a related phenomenon to TQA. This study shows that TQA with using the rules and factors of TC as depicted in both product-oriented analysis and process-oriented analysis, determines the orientation or the level of the quality of translation. This study also depicts that TC, regarding TQA’s perspective, reveals the aim of the translation of original text and the root of ideological manipulation and re/writing. On the other hand, this study stresses the existence of a direct relationship between the linguistic materials and semiotic codes of a text or book. This study can be fruitful for translators, scholars, translation criticizers, and translation quality assessors, and also it is applicable in the area of pedagogy.Keywords: a model of translation criticism, a model of translation quality assessment, critical discourse analysis (CDA), re/writing, translation criticism (TC), translation quality assessment (TQA)
Procedia PDF Downloads 32011316 Intentionality and Context in the Paradox of Reward and Punishment in the Meccan Surahs
Authors: Asmaa Fathy Mohamed Desoky
Abstract:
The subject of this research is the inference of intentionality and context from the verses of the Meccan surahs, which include the paradox of reward and punishment, applied to the duality of disbelief and faith; The Holy Quran is the most important sacred linguistic reference in the Arabic language because it is rich in all the rules of the language in addition to the linguistic miracle. the Quranic text is a first-class intentional text, sent down to convey something to the recipient (Muhammad first and then communicates it to Muslims) and influence and convince him, which opens the door to many Ijtihad; a desire to reach the will of Allah and his intention from his words Almighty. Intentionality as a term is one of the most important deliberative terms, but it will be modified to suit the Quranic discourse, especially since intentionality is related to intention-as it turned out earlier - that is, it turns the reader or recipient into a predictor of the unseen, and this does not correspond to the Quranic discourse. Hence, in this research, a set of dualities will be identified that will be studied in order to clarify the meaning of them according to the opinions of previous interpreters in accordance with the sanctity of the Quranic discourse, which is intentionally related to the dualities of reward and punishment, such as: the duality of disbelief and faith, noting that it is a duality that combines opposites and Paradox on one level, because it may be an external paradox between action and reaction, and may be an internal paradox in matters related to faith, and may be a situational paradox in a specific event or a certain fact. It should be noted that the intention of the Qur'anic text is fully realized in form and content, in whole and in part, and this research includes a presentation of some applied models of the issues of intention and context that appear in the verses of the paradox of reward and punishment in the Meccan surahs in Quraan.Keywords: intentionality, context, the paradox, reward, punishment, Meccan surahs
Procedia PDF Downloads 7911315 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods
Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin
Abstract:
In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The author tries to give an assessment of the well-known method of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition: template, structured, and feature-based, are considered through the algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.Keywords: text detection, template method, recognition algorithm, structured method, feature method
Procedia PDF Downloads 18611314 A Method for Clinical Concept Extraction from Medical Text
Authors: Moshe Wasserblat, Jonathan Mamou, Oren Pereg
Abstract:
Natural Language Processing (NLP) has made a major leap in the last few years, in practical integration into medical solutions; for example, extracting clinical concepts from medical texts such as medical condition, medication, treatment, and symptoms. However, training and deploying those models in real environments still demands a large amount of annotated data and NLP/Machine Learning (ML) expertise, which makes this process costly and time-consuming. We present a practical and efficient method for clinical concept extraction that does not require costly labeled data nor ML expertise. The method includes three steps: Step 1- the user injects a large in-domain text corpus (e.g., PubMed). Then, the system builds a contextual model containing vector representations of concepts in the corpus, in an unsupervised manner (e.g., Phrase2Vec). Step 2- the user provides a seed set of terms representing a specific medical concept (e.g., for the concept of the symptoms, the user may provide: ‘dry mouth,’ ‘itchy skin,’ and ‘blurred vision’). Then, the system matches the seed set against the contextual model and extracts the most semantically similar terms (e.g., additional symptoms). The result is a complete set of terms related to the medical concept. Step 3 –in production, there is a need to extract medical concepts from the unseen medical text. The system extracts key-phrases from the new text, then matches them against the complete set of terms from step 2, and the most semantically similar will be annotated with the same medical concept category. As an example, the seed symptom concepts would result in the following annotation: “The patient complaints on fatigue [symptom], dry skin [symptom], and Weight loss [symptom], which can be an early sign for Diabetes.” Our evaluations show promising results for extracting concepts from medical corpora. The method allows medical analysts to easily and efficiently build taxonomies (in step 2) representing their domain-specific concepts, and automatically annotate a large number of texts (in step 3) for classification/summarization of medical reports.Keywords: clinical concepts, concept expansion, medical records annotation, medical records summarization
Procedia PDF Downloads 13511313 A Forward-Looking View of the Intellectual Capital Accounting Information System
Authors: Rbiha Salsabil Ketitni
Abstract:
The entire company is a series of information among themselves so that each information serves several events and activities, and the latter is nothing but a large set of data or huge data. The enormity of information leads to the possibility of losing it sometimes, and this possibility must be avoided in the institution, especially the information that has a significant impact on it. In most cases, to avoid the loss of this information and to be relatively correct, information systems are used. At present, it is impossible to have a company that does not have information systems, as the latter works to organize the information as well as to preserve it and even saves time for its owner and this is the result of the speed of its mission. This study aims to provide an idea of an accounting information system that opens a forward-looking study for its manufacture and development by researchers, scientists, and professionals. This is the result of most individuals seeing a great contradiction between the work of an information system for moral capital and does not provide real values when measured, and its disclosure in financial reports is not distinguished by transparency.Keywords: accounting, intellectual capital, intellectual capital accounting, information system
Procedia PDF Downloads 8411312 Use and Relationship of Shell Nouns as Cohesive Devices in the Quality of Second Language Writing
Authors: Kristine D. de Leon, Junifer A. Abatayo, Jose Cristina M. Pariña
Abstract:
The current study is a comparative analysis of the use of shell nouns as a cohesive device (CD) in an English for Second Language (ESL) setting in order to identify their use and relationship in the quality of second language (L2) writing. As these nouns were established to anticipate the meaning within, across or outside the text, their use has fascinated writing researchers. The corpus of the study included published articles from reputable journals and graduate students’ papers in order to analyze the frequency of shell nouns using “highly prevalent” nouns in the academic community, to identify the different lexicogrammatical patterns where these nouns occur and to the functions connected with these patterns. The result of the study implies that published authors used more shell nouns in their paper than graduate students. However, the functions of the different lexicogrammatical patterns for the frequently occurring shell nouns are somewhat similar. These results could help students in enhancing the cohesion of their text and in comprehending it.Keywords: anaphoric, cataphoric, lexico-grammatical, shell nouns
Procedia PDF Downloads 18511311 Using the Smith-Waterman Algorithm to Extract Features in the Classification of Obesity Status
Authors: Rosa Figueroa, Christopher Flores
Abstract:
Text categorization is the problem of assigning a new document to a set of predetermined categories, on the basis of a training set of free-text data that contains documents whose category membership is known. To train a classification model, it is necessary to extract characteristics in the form of tokens that facilitate the learning and classification process. In text categorization, the feature extraction process involves the use of word sequences also known as N-grams. In general, it is expected that documents belonging to the same category share similar features. The Smith-Waterman (SW) algorithm is a dynamic programming algorithm that performs a local sequence alignment in order to determine similar regions between two strings or protein sequences. This work explores the use of SW algorithm as an alternative to feature extraction in text categorization. The dataset used for this purpose, contains 2,610 annotated documents with the classes Obese/Non-Obese. This dataset was represented in a matrix form using the Bag of Word approach. The score selected to represent the occurrence of the tokens in each document was the term frequency-inverse document frequency (TF-IDF). In order to extract features for classification, four experiments were conducted: the first experiment used SW to extract features, the second one used unigrams (single word), the third one used bigrams (two word sequence) and the last experiment used a combination of unigrams and bigrams to extract features for classification. To test the effectiveness of the extracted feature set for the four experiments, a Support Vector Machine (SVM) classifier was tuned using 20% of the dataset. The remaining 80% of the dataset together with 5-Fold Cross Validation were used to evaluate and compare the performance of the four experiments of feature extraction. Results from the tuning process suggest that SW performs better than the N-gram based feature extraction. These results were confirmed by using the remaining 80% of the dataset, where SW performed the best (accuracy = 97.10%, weighted average F-measure = 97.07%). The second best was obtained by the combination of unigrams-bigrams (accuracy = 96.04, weighted average F-measure = 95.97) closely followed by the bigrams (accuracy = 94.56%, weighted average F-measure = 94.46%) and finally unigrams (accuracy = 92.96%, weighted average F-measure = 92.90%).Keywords: comorbidities, machine learning, obesity, Smith-Waterman algorithm
Procedia PDF Downloads 29711310 Unsupervised Reciter Recognition Using Gaussian Mixture Models
Authors: Ahmad Alwosheel, Ahmed Alqaraawi
Abstract:
This work proposes an unsupervised text-independent probabilistic approach to recognize Quran reciter voice. It is an accurate approach that works on real time applications. This approach does not require a prior information about reciter models. It has two phases, where in the training phase the reciters' acoustical features are modeled using Gaussian Mixture Models, while in the testing phase, unlabeled reciter's acoustical features are examined among GMM models. Using this approach, a high accuracy results are achieved with efficient computation time process.Keywords: Quran, speaker recognition, reciter recognition, Gaussian Mixture Model
Procedia PDF Downloads 380