Search results for: text representation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2370

2280 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition of cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical, works without explicit segmentation, and uses embedded training to build and refine the character models. The features, extracted after baseline estimation, are statistical and geometric, so that they capture both the peculiarities of the text and the pixel distribution characteristics of the word image. These features are modelled using hidden Markov models and trained by embedded training. Experiments on images from the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

Procedia PDF Downloads 323
2279 Me and My Selfie: Identity Building Through Self Representation in Social Media

Authors: Revytia Tanera

Abstract:

This research is a pilot study examining the rise of the selfie trend in relation to individual self-representation and identity building in social media. Symbolic interactionism theory is used as the concept of the desired self-image, and Cooley’s looking-glass self concept is used to analyze the mechanical reflection of ourselves: how people perform their “digital self” in social media. In-depth interviews were conducted with a non-random sample of people who own a smartphone with a front camera and are active in social media. The research asks whether the selfie trend has any influence on each individual's identity building. From the analysis of the interview results, it can be concluded that people take selfies in order to express themselves and to boost their confidence. The study suggests a follow-up, more in-depth analysis of identity and self-representation across various age groups.

Keywords: self representation, selfie, social media, symbolic interaction, looking glass-self

Procedia PDF Downloads 264
2278 An Assessment of Female Representation in Philippine Cinema in Comparison to American Cinema (1975 to 2020)

Authors: Amanda Julia Binay, Patricia Elise Suarez

Abstract:

Female representation in media is an important subject in the discussion of gender equality, especially in impactful and influential media like film. As the Filipino film industry continues to grow and evolve, the need for analysis on Filipino female representation on screen is imperative. Additionally, there has been limited research made on female representation in the Philippine film scene. Thus, the paper aims to analyze the presence and evolution of female representation in Philippine cinema and compare the findings with that of American films to see how Filipino filmmakers hold their own against the standards of international movements that call for more and better female representation, especially in Hollywood. The participants selected were Filipino and American films released within the years 1975 to 2020 in five (5) year intervals. Twenty (20) critically acclaimed and highest-grossing Filipino films and twenty (20) critically acclaimed and highest-grossing Hollywood films were then subject to the Bechdel and Peirce tests to obtain statistical measures of their female representation. The findings of the study reveal that the presence of female representation in Philippine film history has been consistent and has continued to grow and evolve throughout the years, with strong female leads with vibrant characteristics and diverse stories. However, analysis of female representation regarding American films has shown an extreme lack thereof with more misogynistic, sexist, and limiting ideals. Thus, the study concludes that the state of female representation in Philippine cinema and film industry holds its own when compared to American cinema and film industry and even outperforms it in many aspects of female representation, such as consistent inclusion and depiction of multi-dimensional female leads and female relationships. 
Hence, the study implies that women’s consistent presence in Philippine cinema mirrors Filipino women’s prominent role in Philippine society and that American cinema must continue to make efforts to change their portrayals of female characters, leads, and relationships to make them more grounded in reality.

Keywords: female representation, gender studies, feminism, philippine cinema, American cinema, bechdel test, peirce test, comparative analysis

Procedia PDF Downloads 308
2277 Poetics of the Connecting ha’: A Textual Study in the Poetry of Al-Husari Al-Qayrawani

Authors: Mahmoud al-Ashiriy

Abstract:

This paper begins from the idea that the real history of literature is the history of its style. Rhyme, as is known, is not merely the last letter, which has received a great deal of analysis and investigation, but a collection of other values in addition to its different markings. This paper explores the work of the connecting ha’ and its effectiveness in shaping the poetic text: it establishes vocal rhythms and also indicates references through the pronoun, both vertically, through the sequence of verses in the poem, and horizontally, through the sentences surrounding a single verse. If the scientific formation of prosody stopped at possibilities and prohibitions, literary criticism and poetry studies should explore what lies beyond the rule, in the aesthetic horizon of poetic effectiveness, which varies from one text to another, one poet to another, one literary period to another, and one poetic taste to another. The paper then explores this poetic essence in the texts of the famous Andalusian poet Al-Husari Al-Qayrawani through his well-known Daliyya (a poem whose verses end with the letter D) and the role of the connecting ha’ in fulfilling its text and accomplishing its poetics. From there it moves to the diwan (the large collection of poems) as a higher text that surpasses the individual poem, and to the role of this phenomenon in accomplishing the poetics of Al-Husari Al-Qayrawani, who is one of the pillars of Arabic poetics in Andalusia.

Keywords: Al-Husari Al-Qayrawani, poetics, rhyme, stylistics, science of the text

Procedia PDF Downloads 533
2276 A Clustering Algorithm for Massive Texts

Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen

Abstract:

Internet users face a massive amount of textual data every day. Organizing texts into categories can help users extract useful information from large-scale text collections. Clustering is one of the most promising tools for categorizing texts because it is unsupervised. Unfortunately, most traditional clustering algorithms lose quality on large-scale text collections, mainly because of the high-dimensional vectors generated from texts. To cluster large-scale text collections effectively and efficiently, this paper proposes a vector reconstruction based clustering algorithm. Only the features that can represent the cluster are preserved in the cluster’s representative vector. The algorithm alternately repeats two sub-processes until it converges. The first is the partial tuning sub-process, where each feature’s weight is fine-tuned iteratively. To accelerate clustering, an intersection based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The second is the overall tuning sub-process, where the features are reallocated among different clusters and the features that are useless for representing a cluster are removed from its representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.

Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process

Procedia PDF Downloads 404
2275 Representation of Reality in Nigerian Poetry

Authors: Zainab Abdulkarim

Abstract:

Literature is the study of life, a source of knowledge. It involves the truth about many things in life. Creative artists, and poets especially, represent the voices of the people. These artists have been critics of all those involved in the development of their nation. This paper examines how Nigerian poets go further, not just by writing but by showing the different ways in which the country has been convoluted. It intends to show the power and ability that literature has in representation: the power to represent the important values of life. There is no doubt that literature asserts truth. Through the various poems examined in this paper, Nigerian poets are shown to portray the realities of the nation.

Keywords: literature, poets, reality, representation

Procedia PDF Downloads 280
2274 Image Transform Based on Integral Equation-Wavelet Approach

Authors: Yuan Yan Tang, Lina Yang, Hong Li

Abstract:

The harmonic model is a very important approximation for the image transform. The harmonic model converts an image into an arbitrary shape; however, this model cannot be described by any fixed mathematical function. Instead, it is represented by a partial differential equation (PDE) with boundary conditions. Developing an efficient method to solve such a PDE is therefore extremely significant for the image transform. In this paper, a novel Integral Equation-Wavelet based method is presented, which consists of three steps: (1) The partial differential equation is converted into a boundary integral equation and representation by an indirect method. (2) The boundary integral equation and representation are changed to a plane integral equation and representation by the boundary measure formula. (3) The plane integral equation and representation are then solved by a method we call wavelet collocation. Our approach has two main advantages: the shape of an image is arbitrary, and the program code is independent of the boundary. The performance of our method is evaluated by numerical experiments.

Keywords: harmonic model, partial differential equation (PDE), integral equation, integral representation, boundary measure formula, wavelet collocation

Procedia PDF Downloads 521
2273 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique into a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition, applying the technique to a data set consisting of the text of the Bible, split into verses.
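
The weighting idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `weighted_cooccurrences`, the default weight 1/d, and the simplification of scoring each pair of occurrences once (rather than once per window position) are all assumptions made here for illustration.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Accumulate distance-weighted co-occurrence scores for pairs of entities.

    `weight` maps a token distance d >= 1 to a score; 1/d is an assumed
    choice, picked only so that closer pairs contribute more.
    """
    graph = defaultdict(float)
    # Keep only the positions of tokens that are named items of interest.
    positions = [(i, t) for i, t in enumerate(tokens) if t in entities]
    for a in range(len(positions)):
        i, u = positions[a]
        for b in range(a + 1, len(positions)):
            j, v = positions[b]
            d = j - i
            if d > window:
                break  # positions are sorted, so later ones are farther away
            if u == v:
                continue  # an entity is not linked to itself
            edge = tuple(sorted((u, v)))  # undirected edge key
            graph[edge] += weight(d)
    return dict(graph)


tokens = "adam met eve and later adam left".split()
edges = weighted_cooccurrences(tokens, {"adam", "eve"}, window=5)
```

Here the two "adam"/"eve" pairs lie 2 and 3 tokens apart, so the single edge accumulates 1/2 + 1/3. Setting `weight=lambda d: 1.0` recovers a plain, unweighted sliding-window count.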

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 115
2272 Representation of the Kurdish Opposition: From Periphery to Center

Authors: Songul Miftakhov

Abstract:

This study explores the political representation and engagement of the Eastern and Southeastern Anatolia regions, known to have a dense Kurdish population and referred to below as the Eastern region, in the Turkish parliament between 1946 and 1980. Traditional local notables held most of the privileges of representation, given their connections with political parties. They integrated into right-wing parties for political and economic reasons. At the same time, they kept control over the channels of local political involvement. As a result, political representation and presence were monopolized at the central, local and civil society levels. One part of the Kurdish intellectuals was marginalized from the parliament after addressing issues in Eastern Anatolia and trying to develop solutions apart from the mainstream. Some of them took part in the Kurdish oppositional left wing in the 1960s and shook the power of the settled notables in the 1970s, in local administrations or as independent members of the parliament.

Keywords: Kurdish representation, parliament, local nobles, Eastern and Southeastern Anatolia

Procedia PDF Downloads 130
2271 Symmetric Key Encryption Algorithm Using Indian Traditional Musical Scale for Information Security

Authors: Aishwarya Talapuru, Sri Silpa Padmanabhuni, B. Jyoshna

Abstract:

Cryptography helps in preventing threats to information security by providing various algorithms. This study introduces a new symmetric key encryption algorithm for information security which is linked with the "raagas" which means Indian traditional scale and pattern of music notes. This algorithm takes the plain text as input and starts its encryption process. The algorithm then randomly selects a raaga from the list of raagas that is assumed to be present with both sender and the receiver. The plain text is associated with the thus selected raaga and an intermediate cipher-text is formed as the algorithm converts the plain text characters into other characters, depending upon the rules of the algorithm. This intermediate code or cipher text is arranged in various patterns in three different rounds of encryption performed. The total number of rounds in the algorithm is equal to the multiples of 3. To be more specific, the outcome or output of the sequence of first three rounds is again passed as the input to this sequence of rounds recursively, till the total number of rounds of encryption is performed. The raaga selected by the algorithm and the number of rounds performed will be specified at an arbitrary location in the key, in addition to important information regarding the rounds of encryption, embedded in the key which is known by the sender and interpreted only by the receiver, thereby making the algorithm hack proof. The key can be constructed of any number of bits without any restriction to the size. A software application is also developed to demonstrate this process of encryption, which dynamically takes the plain text as input and readily generates the cipher text as output. Therefore, this algorithm stands as one of the strongest tools for information security.

Keywords: cipher text, cryptography, plaintext, raaga

Procedia PDF Downloads 261
2270 Residential Architecture and Its Representation in Movies: Bangkok's Spatial Research in the Study of Thai Cinematography

Authors: Janis Matvejs

Abstract:

Visual representation of a city creates unique perspectives that make it possible to interpret the urban environment and to understand a space that is culturally created and territorially organized. Residential complexes are an essential part of cities, and cinema is a specific form of representation of these areas. Very little research has explored how these areas are depicted in Thai movies. The aim of this research is to interpret the discourse of residential areas of Bangkok throughout the 20th and 21st centuries and to examine essential changes in the residential structure. Specific cinematic formal techniques in relation to the urban image were used. The movie review results were compared with changes in Bangkok’s residential development. The movie analysis showed that residential areas are frequently used in Thai cinematography and make up an integral part of the urban visual perception.

Keywords: Bangkok, cinema, residential area, representation, visual perception

Procedia PDF Downloads 165
2269 The Effects of Watching Text-Relevant Video Segments with/without Subtitles on Vocabulary Development of Arabic as a Foreign Language Learners

Authors: Amirreza Karami, Hawraa Nafea Hameed Alzouwain, Freddie A. Bowles

Abstract:

This study investigates the effects of watching text-relevant video segments with/without subtitles on vocabulary development of Arabic as a Foreign Language (AFL) learners. The participants of the study were assigned to two groups: one control group and one experimental group. The control group received no video-based instruction while the experimental group watched a text-relevant video segment in three stages: pre, while, and post-instruction. The preliminary results of the pre-test and post-test show that watching text-relevant video segments through following a pre-while-post procedure can help the vocabulary development of AFL learners more than non-video-based instruction.

Keywords: text-relevant video segments, vocabulary development, Arabic as a Foreign Language, AFL, pre-while-post instruction

Procedia PDF Downloads 134
2268 The Representation of Female Characters by Women Directors in Surveillance Spaces in Turkish Cinema

Authors: Berceste Gülçin Özdemir

Abstract:

The representation of women characters in cinema has been discussed for centuries. In cinema where dominant narrative codes prevail and scopophilic views exist over women characters, passive stereotypes of women are observed in the representation of women characters. In films shot from a woman’s point of view in Turkish Cinema, and even in films outside the mainstream in which the stories of women characters are told, the fact that women characters are discussed on the basis of feminist film theories triggers the question: ‘Are feminist films produced in Turkish Cinema?’ The spaces used in the representation of women characters are observed to convert characters into passive subjects on the basis of the space factor in the narrative. The representation of women characters in the possible surveillance spaces integrates the characters and compresses them in these spaces. In this study, narrative analysis was used to investigate the representation of women characters in surveillance spaces. For the study framework, case-study films are selected first; at the second level, the representations of women characters in surveillance spaces are discussed through narrative analysis using feminist film theories. Two questions are discussed with feminist film theories: ‘Why do especially women directors represent their female characters to viewers by representing them in surveillance spaces?’ and ‘Can this type of presentation contribute to the feminist film practice and become important with regard to feminist film theories?’ The representation of women characters in a passive and observed way in the surveillance spaces of the narrative also calls into question the discourses of films outside the mainstream. As films that produce alternative discourses and reveal different cinematic languages, those outside the mainstream are expected to bring other points of view to the representation of women characters in spaces as well. 
These questionings are selected as the baseline and Turkish films such as Watch Tower and Mustang, directed by women, were examined. This examination paves the way for discussions regarding the women characters in surveillance spaces. Outcomes can be argued from the viewpoint of representation in the genre by feminist film theories. In the context of feminist film theories and feminist film practice, alternatives should be found that can corporally reveal the existence of women in both the representation of women characters in spaces and in the usage of the space factor.

Keywords: feminist film theory, representation, space, women directors

Procedia PDF Downloads 259
2267 A Study of Various Ontology Learning Systems from Text and a Look into Future

Authors: Fatima Al-Aswadi, Chan Yong

Abstract:

As the volume of unstructured data on the web grows day by day, so does the motivation to represent the knowledge in this data in a machine-processable form. Ontology is one of the major cornerstones of representing information in a more meaningful way on the semantic Web. The goal of ontology learning from text is to elicit and represent domain knowledge in a machine-readable form. This paper aims to give a follow-up review of ontology learning systems from text and some of their defects. Furthermore, it discusses how far the ontology learning process may be enhanced in the future.

Keywords: concept discovery, deep learning, ontology learning, semantic relation, semantic web

Procedia PDF Downloads 478
2266 Principle Components Updates via Matrix Perturbations

Authors: Aiman Elragig, Hanan Dreiwi, Dung Ly, Idriss Elmabrook

Abstract:

This paper highlights a new way of looking at online principal component analysis (OPCA). Given a data matrix $X \in \mathbb{R}^{m \times n}$, we characterise the online updates of its covariance as a matrix perturbation problem. Up to the principal components, it turns out that online updates of the batch PCA can be captured by a symmetric matrix perturbation of the batch covariance matrix. We show that as $n \to n_0 \gg 1$, the batch covariance and its update become almost similar. Finally, we use our new setup of online updates to find a bound on the angle distance between the principal components of X and those of its update.
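
The perturbation view can be made concrete with a short derivation. The sketch below assumes, purely for illustration, that the $n$ columns $x_1, \dots, x_n$ of $X$ are zero-mean samples and that a single new sample $x_{n+1}$ arrives; the abstract does not state these conventions, and the symbols $C_n$ and $E_n$ are introduced here.

```latex
\begin{align*}
C_n &= \frac{1}{n}\sum_{i=1}^{n} x_i x_i^{\top} = \frac{1}{n} X X^{\top}, \\
C_{n+1} &= \frac{1}{n+1}\Bigl(n\,C_n + x_{n+1}x_{n+1}^{\top}\Bigr) = C_n + E_n, \\
E_n &= \frac{1}{n+1}\Bigl(x_{n+1}x_{n+1}^{\top} - C_n\Bigr).
\end{align*}
```

Under these assumptions $E_n$ is symmetric and carries a factor $1/(n+1)$, so for bounded samples it shrinks as $n$ grows, which is consistent with the statement that the batch covariance and its update become almost similar for large $n$.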

Keywords: online data updates, covariance matrix, online principle component analysis, matrix perturbation

Procedia PDF Downloads 167
2265 Teaching Pragmatic Coherence in Literary Text: Analysis of Chimamanda Adichie’s Americanah

Authors: Joy Aworo-Okoroh

Abstract:

Literary texts are mirrors of a real-life situation. Thus, authors choose the linguistic items that would best encode their intended meanings and messages. However, words mean more than they seem. The meaning of words is not static; rather, it is dynamic, as words constantly enter into relationships within a context. Literary texts can only be meaningful if all pragmatic cues are identified and interpreted. Drawing upon Teun van Dijk's theory of local pragmatic coherence, it is established that words enter into relations in a text and that these relations account for sequential speech acts in the texts. Comprehension of the text is dependent on the interpretation of these relations. To show the relevance of pragmatic coherence in literary text analysis, ten conversations were selected from Americanah in order to give a clear idea of the pragmatic relations used. The conversations were analysed, identifying the speech act and epistemic relations inherent in them. A subtle analysis of the structure of the conversations was also carried out. It was discovered that justification is the most commonly used relation and that the meaning of the text is dependent on the interpretation of these instances' pragmatic coherence. The study concludes that to effectively teach literature in English, pragmatic coherence should be incorporated, as words mean more than they say.

Keywords: pragmatic coherence, epistemic coherence, speech act, Americanah

Procedia PDF Downloads 107
2264 First Time Voters Representation of Leadership as Exemplified by 2016 Presidentiables

Authors: Fevy Kae Mateo, Kimberly Javier, Alyzza Marie Palles

Abstract:

Leadership is a process of relationship involving interaction with other people. Leaders emphasise authority, which executes and implements regulations, maintains the rules and leads to a better future. First-time voters are very significant because they are stakeholders in the type of leader to be deployed. They also have the capacity to engage the government and can be agents of change. The objective of the study is to identify the strengths and weaknesses of a leader. Moreover, the study identifies the qualities of a leader. Finally, the study determines first-time voters’ representation of a leader. Focus group discussions were carried out with two groups of first-time voters aged 18 to 21. Verbatim transcripts of the discussions were analyzed using Thematic Analysis. Overall results showed superordinate themes for the weaknesses of a leader: lack of transparency in the government, poor communication strategy, valuing experience over potential, and other contributory factors; for the strengths of a leader: analytical skill, emotional intelligence in political work, analytical ability, and economic status in political participation; and finally, for the representation of a leader: positive representation of a leader and negative representation of a leader.

Keywords: first time voters, focus group discussion, leadership, qualitative research design

Procedia PDF Downloads 227
2263 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.
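
The three cases listed above can be sketched in Python. The abstract does not give the actual formula, so everything below is an assumption chosen only to show the case analysis: the function name `three_case_similarity`, the Gaussian-style reward when a feature appears in both documents, the fixed `penalty` when it appears in only one, and the choice to ignore features absent from both.

```python
import math


def three_case_similarity(doc_a, doc_b, vocabulary, penalty=1.0, sigma=1.0):
    """Score two documents (feature -> weight dicts) feature by feature.

    Case 1: feature in both documents -> reward in (0.5, 1], higher when
            the two weights are close (assumed Gaussian-style form).
    Case 2: feature in exactly one document -> fixed negative penalty.
    Case 3: feature in neither document -> contributes nothing.
    """
    score = 0.0
    counted = 0
    for feature in vocabulary:
        a = doc_a.get(feature, 0.0)
        b = doc_b.get(feature, 0.0)
        if a > 0 and b > 0:
            score += 0.5 * (1.0 + math.exp(-((a - b) / sigma) ** 2))
            counted += 1
        elif a > 0 or b > 0:
            score -= penalty
            counted += 1
        # else: absent from both documents, so it is skipped entirely
    return score / counted if counted else 0.0


vocab = ["loan", "rate", "scan", "mri"]
same = three_case_similarity({"loan": 2, "rate": 1}, {"loan": 2, "rate": 1}, vocab)
disjoint = three_case_similarity({"loan": 1}, {"scan": 1}, vocab)
```

With these assumed settings, identical documents score 1.0 and documents with no shared features score -1.0; features missing from both documents (such as "mri" here) never affect the result.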

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 480
2262 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyzing and visualizing data streams in real time are important for enabling decision makers to make prompt decisions. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytics and data visualization. This need has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, a range of tools that implement some of these functionalities is available. In this paper, we present a chronological evaluation of the evolution of technologies that support real-time analytics and visualization of data streams. Based on research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified from the Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges of each identified tool.

Keywords: information visualization, visual analytics, text mining, visual text analytics tools, big data visualization

Procedia PDF Downloads 373
2261 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Bankole Felix, Tomio Takara

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Deriving phonological features that are not shown in orthography is particularly challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in orthography. In this paper, we propose and integrate a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminate and epenthetic vowel positions, and we prepare a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good quality speech. Our system is mainly suitable to be customized for other Ethiopian languages with limited resources.

Keywords: amharic, gemination, Speech synthesis, morphology, epenthesis

Procedia PDF Downloads 55
2260 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less accurately with daily mood evaluation score given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences by treating emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.
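
The closing suggestion, reporting two separate affect scores instead of one signed score, can be illustrated with a minimal lexicon-based sketch in Python. The word lists and the function name `affect_scores` are hypothetical and are not taken from the software evaluated in the study.

```python
import re

# Tiny illustrative lexicons; a real system would use a much larger resource.
POSITIVE = {"happy", "calm", "hopeful", "proud"}
NEGATIVE = {"sad", "anxious", "angry", "tired"}


def affect_scores(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive_score, negative_score) as fractions of all words.

    The two scores are kept separate so that a mixed-emotion sentence is
    not collapsed into a single value near zero.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    pos = sum(word in positive for word in words)
    neg = sum(word in negative for word in words)
    return pos / total, neg / total


scores = affect_scores("I felt happy but also anxious and sad")
```

For this eight-word entry the sketch reports one positive and two negative words, so both affects stay visible; a single signed score would hide the positive component.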

Keywords: analysis, data, diary, emotions, mood, sentiment

Procedia PDF Downloads 238
2259 Motion Effects of Arabic Typography on Screen-Based Media

Authors: Ibrahim Hassan

Abstract:

Motion typography is one of the most important types of display-based visual communication. Through digital display media, we can control the properties of the text (size, direction, thickness, color, etc.). The use of motion typography in visual communication has given it several forms. We need to settle the terminology and clarify the differences between these forms, because the term motion typography, considered a general term, is not enough to separate the different communicative functions of moving text. In this paper, we discuss the different effects of motion typography on Arabic writing and how harmony between the movement and the letterform can be achieved, and, through our experiments, we present a new type of text movement.

Keywords: Arabic typography, motion typography, kinetic typography, fluid typography, temporal typography

Procedia PDF Downloads 121
2258 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize text extracted from grocery product images captured by mobile phone cameras. Recognition of such text is challenging because text in grocery product images varies in size, orientation, style, and illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations cannot be appropriately defined using well-known geometric transformations such as translation, rotation, affine transformation, and shearing, we use all the black pixels of a character as our feature vector. Classification is performed with a minimum distance classifier using the maximum likelihood criterion, which delivers a very promising Character Recognition Rate (CRR) of 89%. We achieve a considerably higher Word Recognition Rate (WRR) of 99% when using lower-level linguistic knowledge about product words during the recognition process.
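
As a rough illustration of minimum distance classification over whole-character pixel vectors, the Python sketch below assigns a binarized character to the nearest stored template. It is not the authors' implementation: the 3x3 templates, the Hamming distance, and the names hamming and classify are assumptions made for the example. Under the simplifying assumption that each pixel is flipped independently with the same probability below 0.5 and that all classes are equally likely, picking the nearest template coincides with the maximum likelihood choice.

```python
# Hypothetical 3x3 binary templates (1 = black pixel), flattened row by row.
TEMPLATES = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
    "T": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
}


def hamming(a, b):
    """Count the pixel positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(pixels, templates=TEMPLATES):
    """Return the label of the template closest to the input pixels."""
    return min(templates, key=lambda label: hamming(pixels, templates[label]))


# A 'T' with one missing pixel is still closest to the 'T' template.
noisy_t = (1, 1, 1,
           0, 1, 0,
           0, 0, 0)
print(classify(noisy_t))
```

A real system would normalize scale and rotation first, as the abstract describes, and would use far larger pixel grids.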

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

Procedia PDF Downloads 375
2257 Pragmatic Survey of Precedence as Linguistic 'Déjà Vu' in Political Text and Talk

Authors: Zarine Avetisyan

Abstract:

In both language and literature, there is a theory of the recurrence of chunks of text and talk, which leads to the notion of precedence. Precedence as a pragma-linguistic phenomenon is still little studied, and the main objective of the present research is to revisit it and reveal it thoroughly. In line with this objective, the analysis of political text and talk provides abundant relevant data to illustrate the phenomenon of precedence. The analysis focuses on certain pragmatic universals (e.g. intention) and categories (e.g. speech techniques) which lead to the disclosure of the present object of study.

Keywords: intention, precedence, political discourse, pragmatic universals

Procedia PDF Downloads 396
2256 The Winning Possibility of Female Candidate in Korea

Authors: Minjeoung Kim

Abstract:

Until the 19th Assembly, the majority of Korean female members of parliament (MPs) were elected through proportional representation, but in the 20th general election the number of women MPs elected in districts was slightly higher than the number elected through proportional representation. The chance of women candidates winning is not as low as commonly assumed. This study therefore aims to reveal which factors, other than political party, influence the election of women candidates, since the effect of political party is already well known. Gangnam Eul was selected because a female candidate was elected there despite the low percentage of votes won by her political party. According to the survey, the female candidate was elected thanks to her policies and election pledges. Thus, women candidates can be elected not only when their party nominates them in a safe constituency but also, with good policies and election pledges, in an unsafe constituency. The education level, age, and profession of voters also influenced support for the female candidate.

Keywords: women candidates, 20th general election, winning in the district representation, policies and election pledges

Procedia PDF Downloads 224
2255 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Felix Bankole, Tomio Takara, Girma Mamo

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of the correct pronunciation from the grapheme form of a text is a central problem. Deriving phonological features that are not shown in the orthography is particularly challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in the orthography. In this paper, we proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions, and prepared a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated with a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good quality speech. Our system is suitable for customization to other Ethiopian languages with limited resources.

Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis

Procedia PDF Downloads 50
2254 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech (POS) tagging has always been a challenging task in Natural Language Processing. This article presents POS tagging for Nepali text using the Hidden Markov Model (HMM) and the Viterbi algorithm. Training and testing data sets are randomly separated from an annotated Nepali text corpus, and both methods are applied to the data sets. The Viterbi algorithm is found to be computationally faster and more accurate than HMM. An accuracy of 95.43% is achieved using the Viterbi algorithm. An error analysis of where the mismatches occurred is discussed in detail.
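
For readers unfamiliar with Viterbi decoding, the Python sketch below finds the most probable tag sequence under a toy hidden Markov model. It is not the tagger described in the article: the two tags, the English words, all probabilities, and the function name viterbi are hypothetical and serve only to show the dynamic programming step and the backtracking step.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for a non-empty word list."""
    # Each column maps a tag to (probability of the best path ending in
    # that tag, previous tag on that path).
    columns = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None)
                for t in tags}]
    for word in words[1:]:
        prev = columns[-1]
        column = {}
        for t in tags:
            # Best predecessor tag for reaching tag t at this position.
            prob, back = max((prev[s][0] * trans_p[s][t], s) for s in tags)
            column[t] = (prob * emit_p[t].get(word, 0.0), back)
        columns.append(column)

    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical toy model; a real tagger estimates these from a corpus.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.75, "VERB": 0.25}
TRANS = {"NOUN": {"NOUN": 0.25, "VERB": 0.75},
         "VERB": {"NOUN": 0.75, "VERB": 0.25}}
EMIT = {"NOUN": {"dogs": 0.5, "bark": 0.25, "cats": 0.25},
        "VERB": {"bark": 0.75, "dogs": 0.25}}

print(viterbi(["dogs", "bark"], TAGS, START, TRANS, EMIT))
```

On longer sentences, implementations usually work with log probabilities to avoid numerical underflow; plain products are used here to keep the sketch short.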

Keywords: hidden markov model, natural language processing, POS tagging, viterbi algorithm

Procedia PDF Downloads 302
2253 Deep Learning Based-Object-classes Semantic Classification of Arabic Texts

Authors: Imen Elleuch, Wael Ouarda, Gargouri Bilel

Abstract:

In this paper, we propose a deep learning based approach to classifying text in order to enrich an Arabic ontology based on the object classes of Gaston Gross. These object classes are defined by taking into account the syntactic and semantic features of the language being treated. Our proposed approach is thus a hybrid one: on the one hand, it is based on object classes, which represent a knowledge-based approach to text classification; on the other hand, it uses a deep learning approach that relies on word embeddings to classify text. We applied the approach to a corpus constructed from an Arabic dictionary. The resulting semantic classification of text will enrich the Arabic object-class ontology: new classes can be added to the ontology, or the features that characterize each object class can be expanded. The results are compared with similar work that addresses the same subject with a classical linguistic approach to the semantic classification of text. This comparison highlights our hybrid approach, which could be improved by broadening the dataset used in the deep learning process.

Keywords: deep-learning approach, object-classes, semantic classification, Arabic

Procedia PDF Downloads 41
2252 Knowledge Representation and Inconsistency Reasoning of Class Diagram Maintenance in Big Data

Authors: Chi-Lun Liu

Abstract:

Requirements modeling and analysis are important for the successful maintenance of information systems. Unified Modeling Language (UML) class diagrams are a useful standard for modeling information systems. To the best of our knowledge, no systems development methodology has yet been described by the organism metaphor, whose core concept is adaptation. The use of knowledge representation and reasoning approaches and ontologies to adopt new requirements has emerged in recent years. This paper proposes an organic methodology based on constructivism theory. The methodology is a knowledge representation and reasoning approach to analyzing new requirements in class diagram maintenance. Its process and rules automatically analyze inconsistencies in the class diagram. In the big data era, developing an automatic tool based on the proposed methodology to analyze large amounts of class diagram data is an important topic for future research.

Keywords: knowledge representation, reasoning, ontology, class diagram, software engineering

Procedia PDF Downloads 210
2251 Towards a Deconstructive Text: Beyond Language and the Politics of Absences in Samuel Beckett’s Waiting for Godot

Authors: Afia Shahid

Abstract:

The writing of Samuel Beckett is associated with meaning in meaninglessness and with the production of what he calls ‘literature of unword’. The casual escape from the world of words in the form of silences and pauses in his play Waiting for Godot urges us to question their existence and ultimately leads us to investigate the theory behind their use in the play. This paper proposes that these absences (silence and pause) in Beckett’s play force us to think ‘beyond’ language. It asks how silence and pause in Beckett’s text speak for the emergence of the poststructuralist text. It aims to identify the significant features of the philosophy of deconstruction in Beckett’s play in order to demystify the hostile complicity between literature and philosophy. Working within the interpretive paradigm of poststructuralism, this research treats the text as its research data and attempts to delineate the relationship between poststructuralist theoretical concerns and Beckett’s text. Keeping in view the theoretical concerns of the poststructuralist theorist Jacques Derrida, the discussion is directed towards the notion of going ‘beyond’ language into the absences that are aimed at silencing the existing discourse with the ‘radical irony’ of this anti-formal art, which contains its own denial and thus represents the idea of ceaseless questioning and radical contradiction in art and in any text. This article asks how Beckett’s text vibrates with loud silence and disrupts language to demonstrate the emptiness of words, thus exploring the limitless void of absences. Beckett’s text resonates with silence and pause that are neither negation nor affirmation but rather a poststructuralist suspension of a reality that is ever changing with the undecidability of all meanings. This study interprets silence and pause in Beckett’s art within the theoretical notion of Derrida’s Différance. The silence and pause behave like Derrida’s Différance: they question their own existence in the text in order to deconstruct any definiteness and finality of reality and to extend an undecidable poststructuralist threshold that aims to evade the ‘labyrinth of language’.

Keywords: Différance, language, pause, poststructuralism, silence, text

Procedia PDF Downloads 177