Search results for: Bangla-UNL Dictionary
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 88

58 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinions, it is interesting to extract and interpret this information for different domains, e.g., product and service benchmarking, politics, and recommendation systems. This is why opinion detection is one of the most important research tasks. It consists of differentiating between opinion data and factual data. The difficulty of this task is to determine an approach that returns opinionated documents. Generally, two approaches are used for opinion detection: lexical-based approaches and machine-learning-based approaches. In lexical-based approaches, a dictionary of sentiment words is used, in which words are associated with weights. The opinion score of a document is derived from the occurrence of words from this dictionary. In machine learning approaches, a classifier is usually trained using a set of annotated documents containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. The majority of these works rely on document text to determine the opinion score but do not take into account whether these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we develop a new way to consider the opinion score. We introduce the notion of a trust score. We determine not only which documents are opinionated but also whether these opinions are really trustworthy information in relation to topics. For that, we use the SentiWordNet lexicon to calculate opinion and trust scores, and we compute different features about users (number of their comments, number of their useful comments, average useful review). After that, we combine the opinion score and the trust score to obtain a final score. We applied our method to detect trusted opinions in the TRIPADVISOR collection. Our experimental results show that the combination of opinion score and trust score improves opinion detection.
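The combination of a lexicon-based opinion score with a user-based trust score can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon stands in for SentiWordNet, the trust feature is the share of useful comments, and the linear mix with weight `alpha` is hypothetical, since the abstract does not give the actual formulas.

```python
# Toy lexicon: word -> sentiment weight. These entries are made up for
# illustration; a real system would look the weights up in SentiWordNet.
LEXICON = {"great": 0.8, "clean": 0.5, "dirty": -0.6, "terrible": -0.9}


def opinion_score(text):
    """Mean absolute sentiment weight over the whitespace tokens of a review."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(abs(LEXICON.get(tok, 0.0)) for tok in tokens) / len(tokens)


def trust_score(n_comments, n_useful):
    """Share of a user's comments marked useful (an assumed trust feature)."""
    return n_useful / n_comments if n_comments else 0.0


def final_score(text, n_comments, n_useful, alpha=0.5):
    """Linear mix of both scores; alpha is a hypothetical tuning weight."""
    return alpha * opinion_score(text) + (1 - alpha) * trust_score(n_comments, n_useful)
```

Tokenization here is a plain whitespace split, so punctuation attached to a word would prevent a lexicon match; a real pipeline would normalize the text first.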

Keywords: Tripadvisor, opinion detection, SentiWordNet, trust score

Procedia PDF Downloads 156
57 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan, V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms, found by matching pursuit. This process identifies segments of audio data that are locally coherent with the seed atoms. The envelope samples are formed by taking the Kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) versus data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare the output of the SAE with the envelope samples that produced it. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the highest denoising basis vectors.
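The matching pursuit step mentioned in the abstract can be illustrated with a minimal greedy decomposition. The sketch below assumes a small list of unit-norm atoms given as plain Python lists; it does not build Gabor or gammatone atoms, and the function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition of `signal` over unit-norm `atoms`.

    Returns (coefficients, residual), where coefficients[k] is the total
    weight assigned to atoms[k].
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Pick the atom most correlated with the current residual.
        products = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(products[k]))
        coeffs[best] += products[best]
        # Remove that atom's contribution from the residual.
        residual = [r - products[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual
```

For example, with atoms `unit([1, 1, 0, 0])`, `unit([0, 0, 1, 1])` and `unit([1, -1, 0, 0])`, the signal `[3, 3, -1, -1]` is fully explained by the first two atoms after two iterations.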

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 217
56 The Linguistic Fingerprint in Western and Arab Judicial Applications

Authors: Asem Bani Amer

Abstract:

This study addresses the linguistic fingerprint in judicial applications, a recent and developing legal technique. It can be used to identify criminals by their way of speaking and their characteristic linguistic expressions. The study first explains the expression "linguistic fingerprint," its concept, and its extended domain, then presents some of the linguistic fingerprint tools used in Western judicial applications, and finally outlines a technical conception of a linguistic fingerprint for the Arabic language, which needs such judicial applications in this field, through dictionaries, language rhythm, and language structure.

Keywords: linguistic fingerprint, judicial, application, dictionary, picture, rhythm, structure

Procedia PDF Downloads 44
55 A Newspapers Expectations Indicator from Web Scraping

Authors: Pilar Rey del Castillo

Abstract:

This document describes the building of an average indicator of the general sentiments about the future expressed in newspapers in Spain. The raw data are collected by scraping the Digital Periodical and Newspaper Library website. Basic natural language processing tools are then applied to the collected information to evaluate the sentiment strength of each word in the texts using a polarized dictionary. The last step consists of summarizing these sentiments to produce daily indices. The results are a first insight into the applicability of these techniques to produce periodic sentiment indicators.
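The last two steps (word-level scoring with a polarized dictionary, then summarizing into daily indices) might look like the sketch below. The dictionary entries, the use of a simple mean, and the `(date, text)` input format are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical polarized dictionary: word -> polarity in [-1, 1].
POLARITY = {"growth": 1.0, "recovery": 0.5, "crisis": -1.0, "decline": -0.5}


def text_sentiment(text):
    """Mean polarity of the dictionary words found in a text (0.0 if none)."""
    hits = [POLARITY[w] for w in text.lower().split() if w in POLARITY]
    return sum(hits) / len(hits) if hits else 0.0


def daily_index(articles):
    """Average text sentiment per day; `articles` is a list of (date, text)."""
    by_day = defaultdict(list)
    for day, text in articles:
        by_day[day].append(text_sentiment(text))
    return {day: sum(vals) / len(vals) for day, vals in by_day.items()}
```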

Keywords: natural language processing, periodic indicator, sentiment analysis, web scraping

Procedia PDF Downloads 94
54 Morphological Rules of Bangla Repetition Words for UNL Based Machine Translation

Authors: Nawab Yousuf Ali, S. Golam, A. Ameer, Ashok Toru Roy

Abstract:

This paper develops new morphological rules suitable for Bangla repetition words to be incorporated into an interlingua representation called Universal Networking Language (UNL). The proposed rules are used to combine verb roots and their inflexions to produce words, which are then combined with other words of a similar type to generate repetition words. This paper outlines the format of morphological rules for different types of repetition words that come from verb roots, based on the framework of UNL provided by the UNL centre of the Universal Networking Digital Language (UNDL) foundation.

Keywords: Universal Networking Language (UNL), universal word (UW), head word (HW), Bangla-UNL Dictionary, morphological rule, enconverter (EnCo)

Procedia PDF Downloads 274
53 Identifying Concerned Citizen Communication Style During the State Parliamentary Elections in Bavaria

Authors: Volker Mittendorf, Andre Schmale

Abstract:

In this case study, we explore the Twitter use of candidates during the 2018 state parliamentary election year in Bavaria, Germany. This paper focuses on the seven parties that were likely to enter the parliament. Against this background, the paper classifies the use of language as populism, which is itself considered a political communication style. First, we identify the election campaigns that started on Twitter in 2017; after that, we categorize the posting times of the different direct candidates in order to derive ideal types from our empirical data. Second, we base the exploration on the dictionary of concerned citizens, which contains German political language of the right and the far right. Accordingly, we analyze the corpus with methods of text mining and social network analysis, and afterwards we display the results in a network of words of concerned citizen communication style (CCCS).

Keywords: populism, communication style, election, text mining, social media

Procedia PDF Downloads 112
52 Calm, Confusing and Chaotic: Investigating Humanness through Sentiment Analysis of Abstract Artworks

Authors: Enya Autumn Trenholm-Jensen, Hjalte Hviid Mikkelsen

Abstract:

This study was done in the pursuit of nuancing the discussion surrounding what it means to be human in a time of unparalleled technological development. Subjectivity was deemed to be an accessible example of humanity to study, and art was a fitting medium through which to probe subjectivity. Upon careful theoretical consideration, abstract art was found to fit the parameters of the study with the added bonus of being, as of yet, uninterpretable from an AI perspective. It was hypothesised that dissimilar appraisals of the art stimuli would be found through sentiment and terminology. Opinion data was collected through survey responses and analysed using Valence Aware Dictionary for sEntiment Reasoning (VADER) sentiment analysis. The results reflected the enigmatic nature of subjectivity through erratic ratings of the art stimuli. However, significant themes were found in the terminology used in the responses. The implications of the findings are discussed in relation to the uniqueness, or lack thereof, of human subjectivity, and directions for future research are provided.

Keywords: abstract art, artificial intelligence, cognition, sentiment, subjectivity

Procedia PDF Downloads 84
51 Stock Price Prediction with 'Earnings' Conference Call Sentiment

Authors: Sungzoon Cho, Hye Jin Lee, Sungwhan Jeon, Dongyoung Min, Sungwon Lyu

Abstract:

Major public corporations worldwide use conference calls to report their quarterly earnings. These 'earnings' conference calls allow for questions from stock analysts. We investigated whether it is possible to identify sentiment from the call script and use it to predict stock price movement. We analyzed call scripts from six companies, two each from Korea, China and Indonesia, during the six years 2011Q1 to 2017Q2. A random forest with frequency-based sentiment scores using the Loughran-McDonald dictionary did better than a control model with only financial indicators. When stock prices went up 20 days after the earnings release, our model predicted correctly 77% of the time. When the model predicted 'up,' actual stock prices went up 65% of the time. This preliminary result encourages us to investigate advanced sentiment scoring methodologies such as topic modeling, auto-encoder, and word2vec variants.
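The two percentages reported above answer different questions: the first is conditioned on the price actually rising (recall), the second on the model predicting a rise (precision). A short sketch makes the distinction concrete; the counts below are hypothetical and are not taken from the study.

```python
def recall(tp, fn):
    """Share of actual 'up' cases that the model predicted as 'up'."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Share of predicted 'up' cases in which the price actually went up."""
    return tp / (tp + fp)


# Hypothetical counts, not taken from the study:
# tp = predicted up and went up, fn = went up but predicted otherwise,
# fp = predicted up but did not go up.
tp, fn, fp = 30, 10, 20
print(f"recall={recall(tp, fn):.2f} precision={precision(tp, fp):.2f}")
```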

Keywords: earnings call script, random forest, sentiment analysis, stock price prediction

Procedia PDF Downloads 257
50 Terrorism in German and Italian Press Headlines: A Cognitive Linguistic Analysis of Conceptual Metaphors

Authors: Silvia Sommella

Abstract:

Islamic terrorism has gained a lot of media attention in recent years, partly because of the striking increase in terror attacks since 2014. The main aim of this paper is to illustrate the phenomenon of Islamic terrorism by applying frame semantics and metaphor analysis to German and Italian press headlines of the two online weekly publications Der Spiegel and L’Espresso between 2014 and 2019. This study focuses on how media discourse, through the use of conceptual metaphors, evokes in people a particular reception of the phenomenon of Islamic terrorism and leads them to accept governmental strategies and policies, perceiving terrorists as evildoers, as the members of an uncivilised group ‘other’ opposed to the civilised group ‘we’. The press headlines are analyzed on the basis of cognitive linguistics, namely Lakoff and Johnson’s conceptualization of metaphor, to distinguish between abstract conceptual metaphors and specific metaphorical expressions. The study focuses on the contexts, frames, and metaphors. The method adopted in this study is Konerding’s frame semantics (1993). In a pilot lexicological study, Konerding carried out a hyperonym reduction of substantives on the basis of dictionaries, in particular the Duden Deutsches Universalwörterbuch (Duden Universal German Dictionary), working exclusively with nouns because hyperonyms usually occur in dictionary meaning explanations as the main elements of nominal phrases. The result of Konerding’s hyperonym type reduction is a small set of German nouns that correspond to the highest hyperonyms, the so-called categories or matrix frames: ‘object’, ‘organism’, ‘person/actant’, ‘event’, ‘action/interaction/communication’, ‘institution/social group’, ‘surroundings’, ‘part/piece’, ‘totality/whole’, ‘state/property’.
The second step of Konerding’s pilot study consists in determining the potential reference points of each category so that conventionally expectable routinized predications arise as predicators. Konerding found out which predicators the ascertained noun types can be linked to. For the purpose of this study, metaphorical expressions will be listed and categorized into conceptual metaphors and under the matrix frames that correspond to the particular conceptual metaphor. All of the corpus analyses are carried out using the AntConc corpus software. The research will verify some previously analyzed metaphors such as TERRORISM AS WAR, A CRIME, A NATURAL EVENT, A DISEASE and will identify new conceptualizations and metaphors about Islamic terrorism, especially in the Italian language, like TERRORISM AS A GAME, WARES, A DRAMATIC PLAY. Through the identification of particular frames and their construction, the research seeks to understand the public reception and the way the discourse about Islamic terrorism is handled in the above-mentioned online weekly publications, through a contrastive analysis of German and Italian.

Keywords: cognitive linguistics, frame semantics, Islamic terrorism, media

Procedia PDF Downloads 137
49 Selfie: Redefining Culture of Narcissism

Authors: Junali Deka

Abstract:

“Pictures speak more than a thousand words”. It is the power of the image, which can have multiple meanings depending on how it is read by viewers. This research article is the outcome of an extensive study of the phenomenon of ‘selfie culture’ and the dire need for a self-constructed virtual identity among youths. In recent times, there has been a revolutionary change in the concept of photography in terms of both techniques and applications. The popularity of ‘self-portraits’ mainly depends on the temporal space and time created on social networking sites like Facebook and Instagram. With reference to Stuart Hall’s encoding and decoding process, the article studies the behavior of users who post photographs online. Photographic messages (Roland Barthes) are interpreted differently by different viewers. The notions of ‘self’ and ‘self-love’, the practice of looking (Marita Sturken) and ways of seeing (John Berger) have acquired a new definition and dimension altogether. After Oscars night, show host Ellen DeGeneres’s selfie created the most buzz and hype on social media. The term was judged the word of 2013 and has earned its place in the dictionary: in November 2013, the word "selfie" was announced as the "word of the year" by the Oxford English Dictionary. By the end of 2012, Time magazine considered selfie one of the "top 10 buzzwords" of that year; although selfies had existed long before, it was in 2012 that the term "really hit the big time". The term has an Australian origin. The present study was carried out to understand the concept of the ‘selfie-bug’ and the phenomenon it has created among youth (especially students) at large in developing a pseudo-image of their own. The topic is relevant and provides a platform to discuss the cultural, psychological and sociological implications of the selfie in the age of digital technology.
At the first level, a content analysis of primary and secondary sources, including newspaper articles and online resources, was carried out, followed by a small online survey conducted with the help of a questionnaire to find out students’ views on the selfie and its social and psychological effects. The newspaper reports and online resources confirmed that the selfie is a new trend in digital media and that it has redefined the notion of beauty and self-love. Facebook and Instagram are the major platforms used for self-expression and the creation of a virtual identity. The findings clearly reflected more active participation by female students than by male students. The study of the photographs of a few selected respondents revealed differences in attitude and image building between male and female users. The study underlines some basic questions about the desire to reconstruct identity among the young generation, such as whether they are becoming culturally narcissistic, which factors are responsible for cultural, social and moral changes in society, and what psychological and technological effects the smartphone causes, culminating in the big question of whether the selfie is a social signifier of identity construction.

Keywords: Culture, Narcissist, Photographs, Selfie

Procedia PDF Downloads 366
48 Specialized Building Terminology of the 19th Century

Authors: Klara Kroftova, Martin Ebel

Abstract:

Human history is characterized by continuous evolution. As mankind developed, so did crafts, doctrine, and, of course, language. Each field of human activity, science, art or architecture has its own vocabulary, terms with specific, well-defined meanings. These are words or phrases that may have a general meaning in one context but which, when used in specific contexts, take on a specialized meaning. The development of terminology in this area is, therefore, closely related to the development of architecture. People discovered new building materials, building constructions, decorations, furnishings, etc., and with each new piece of knowledge came a new name. Architecture and construction were specific to individual nations, but throughout human history, they were also adopted in various ways from other nations. Thus, the terminology of the Czech language was both established natively and adopted from foreign languages. In this paper, we focus on the linguistic analysis of terms that we most often encounter in the study of 19th-century architecture in the Austro-Hungarian Monarchy. The article is supplemented by a small picture dictionary.

Keywords: tenement houses, 19th century, terminology, Austro-Hungarian monarchy

Procedia PDF Downloads 93
47 Priority of Goal Over Source in Persian Directional Motion Verbs

Authors: Tahereh Samenian

Abstract:

There is ample evidence that source and goal are disproportionately expressed in languages, and goal usually plays a more prominent role than source. The results show that the mismatch between goal and source is not entirely rooted in non-linguistic behaviors, i.e., linguistic descriptions also show a bias toward the goal over the source in events; non-verbal memory for events, on the other hand, indicates that the goal bias holds only for events in which the motion is purposeful and the actor is animate. In the present study, an attempt is made to examine the principle of priority of the goal over the source by focusing on Persian directional motion verbs. For this purpose, 117 Persian directional motion verbs were selected from the dictionary, data for them were collected from the Bijankhan corpus, the goal and source components were identified in sentences, and the prominence of these components was shown in the form of diagrams. The data show that Persian directional motion verbs also exhibit the goal-over-source bias in motion events.

Keywords: motion-directional verbs, priority of goal over source principle, cognitive factors, linguistic factors

Procedia PDF Downloads 53
46 Approaching the Words Denoting Cognitive Activity in Vietnamese Language in Comparison with English Language

Authors: Thi Phuong Ly Tran

Abstract:

Being basic and unique to human beings, cognitive activity possesses spiritual characteristics and is conveyed through language. Words that represent rational cognition or processes related to rationality, such as know, think, understand, doubt, be afraid, remember, forget, think (that), realize (that), find (that), etc., can reflect the process by which human beings have expressed cognitive activities in diversified and delicate ways through language. In this research article, using the descriptive and comparative methods, we apply the theoretical system of linguistic characteristics of cognitive verbs to the Vietnamese language in comparison with the English language. The findings of this article will help highlight characteristics of the Vietnamese language and identify the similarities and differences in the linguistic processes of Vietnamese and English speakers, as well as supply more knowledge for social needs such as foreign language learning, dictionary editing, and language teaching in schools.

Keywords: cognitive activity, cognitive perspective, Vietnamese language, English language

Procedia PDF Downloads 170
45 Proactive WPA/WPA2 Security Using DD-WRT Firmware

Authors: Mustafa Kamoona, Mohamed El-Sharkawy

Abstract:

Although the latest Wireless Local Area Network technology, the Wi-Fi 802.11i standard, addresses many of the security weaknesses of the antecedent Wired Equivalent Privacy (WEP) protocol, there are still scenarios where network security is vulnerable. The first security model that 802.11i offers is the Personal model, which is very cheap and simple to install and maintain, yet it uses a Pre-Shared Key (PSK) and thus has a low to medium security level. The second model that 802.11i provides is the Enterprise model, which is highly secure but much more expensive and difficult to install and maintain, and it requires an authentication server that handles the authentication and key management for the wireless network. A central issue with the Personal model is that the PSK needs to be shared with all the devices that are connected to the specific Wi-Fi network. This pre-shared key, unless changed regularly, can be cracked using offline dictionary attacks within a matter of hours. The key is burdensome to change manually on all the connected devices unless there is some kind of algorithm that coordinates the PSK update. The key idea of this paper is to propose a new algorithm that proactively and effectively coordinates pre-shared key generation, management, and distribution in the cheap WPA/WPA2 Personal security model using only a DD-WRT router.
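One way to picture coordinated key rotation is to derive each period's passphrase from a shared master secret, so that the router and the devices can compute the same key independently. The sketch below is purely an assumption for illustration: the abstract does not describe the proposed algorithm, and the function names, the HMAC-SHA256 derivation, and the fixed rotation period are all hypothetical. The only standard fact relied on is that WPA/WPA2 passphrases are 8 to 63 printable ASCII characters.

```python
import hashlib
import hmac
import time


def psk_for_epoch(master_secret, epoch, length=32):
    """Derive a passphrase for one rotation period from a master secret.

    The result is a hex string; WPA/WPA2 passphrases must be 8 to 63
    printable ASCII characters, so `length` is restricted to that range.
    """
    if not 8 <= length <= 63:
        raise ValueError("WPA/WPA2 passphrases are 8 to 63 characters long")
    digest = hmac.new(master_secret, str(epoch).encode(), hashlib.sha256).hexdigest()
    return digest[:length]


def current_epoch(period_seconds, now=None):
    """Index of the current rotation period (hypothetical scheme)."""
    now = time.time() if now is None else now
    return int(now // period_seconds)
```

Under this assumed scheme, a device that knows the master secret would call `psk_for_epoch(secret, current_epoch(86400))` to obtain the key for the current day; distributing and protecting the master secret is outside the scope of the sketch.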

Keywords: Wi-Fi, WPS, TLS, DD-WRT

Procedia PDF Downloads 196
44 Adaption Model for Building Agile Pronunciation Dictionaries Using Phonemic Distance Measurements

Authors: Akella Amarendra Babu, Rama Devi Yellasiri, Natukula Sainath

Abstract:

Whereas human beings can easily learn and adopt pronunciation variations, machines need training before being put into use. Also, humans keep a minimal vocabulary, with pronunciation variations stored at the front end of their memory for ready reference, while machines keep the entire pronunciation dictionary for ready reference. Supervised methods are used for preparing pronunciation dictionaries; they require large amounts of manual effort, cost, and time and are not suitable for real-time use. This paper presents an unsupervised adaptation model for building agile and dynamic pronunciation dictionaries online. These methods mimic the human approach to learning new pronunciations in real time. A new algorithm for measuring sound distances, called Dynamic Phone Warping, is presented and tested. The performance of the system is measured using an adaptation model, and the precision metric is found to be better than 86 percent.
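A dynamic-programming distance between two phoneme sequences can be sketched as a weighted edit distance. This is a generic alignment and not the authors' Dynamic Phone Warping algorithm, whose details the abstract does not give; the unit insertion and deletion costs and the default substitution cost are assumptions.

```python
def phone_distance(a, b, sub_cost=None):
    """Weighted edit distance between two phoneme sequences.

    `sub_cost(p, q)` gives the cost of substituting phoneme p with q;
    insertions and deletions cost 1.0. All costs here are assumptions.
    """
    if sub_cost is None:
        sub_cost = lambda p, q: 0.0 if p == q else 1.0
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = float(i)
    for j in range(1, cols):
        d[0][j] = float(j)
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # delete a[i-1]
                d[i][j - 1] + 1.0,  # insert b[j-1]
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute
            )
    return d[-1][-1]
```

A phonetically informed `sub_cost` (for example, a lower cost for swapping two similar vowels) would let close pronunciation variants score as nearer than unrelated words.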

Keywords: pronunciation variations, dynamic programming, machine learning, natural language processing

Procedia PDF Downloads 135
43 On the Comprehension of English Compound Nouns by Arabic-Speaking EFL Learners

Authors: Abdel Rahman Altakhaineh, Mohamma Alaghawat, Hiba Alhendi

Abstract:

This paper reports an investigation of the comprehension of English compound nouns by sixty Arabic-speaking English as a Foreign Language (EFL) learners majoring in English at the University of Jordan, Amman. The investigation focused on the problems that these learners may encounter in understanding certain types of compounds and their ability to use their L1 compound noun knowledge to produce the meaning of L2 compound nouns. Participants whose English proficiency level was advanced took a test to identify the meaning of an underlined compound without using a dictionary. The responses to the three different types of compounds were analyzed using two-way repeated measures ANOVA, and the results showed that there were different endocentric and exocentric compound responses within subordinative compounds, with a statistically significant difference between the two in favor of endocentric compounds. We argue that endocentric compounds, especially subordinative endocentric compounds, were more easily understood due to their representative nature, i.e., because the head represents the meaning of the whole compound. The study concludes with pedagogical implications for teaching compound nouns.

Keywords: morphology, compounding, SLA, Arabic-speaking EFL learners

Procedia PDF Downloads 69
42 Meaningful Habit for EFL Learners

Authors: Ana Maghfiroh

Abstract:

Learning a foreign language requires considerable effort from learners to make their language ability grow day by day. They also need support from those around them, including teachers and friends, as well as activities that encourage them to speak the language. When those activities develop into a habit that is practiced regularly, they help improve the students’ language competence. This qualitative research aimed to find out and describe some activities implemented in Pesantren Al Mawaddah, Ponorogo, in order to teach the students a foreign language. To collect the data, the researcher used interviews, a questionnaire, and documentation. The study found that Pesantren Al Mawaddah had successfully built the habit of speaking the target language among the students. For more than 15 hours a day, students were required to speak a foreign language, Arabic or English, in turn. This aimed to habituate the students to keep in touch with the target language. The habit was developed through daily language activities, such as dawn vocabulary giving, dictionary handling, daily language use, speech training and intensive language courses, daily language input, and night vocabulary memorizing. That habit then developed the students’ awareness of the language learned and promoted their language mastery.

Keywords: habit, communicative competence, daily language activities, Pesantren

Procedia PDF Downloads 495
41 Sparsity-Based Unsupervised Unmixing of Hyperspectral Imaging Data Using Basis Pursuit

Authors: Ahmed Elrewainy

Abstract:

Mixing in hyperspectral imaging occurs due to the low spatial resolution of the cameras used. The pure materials present in the scene, called “endmembers”, share the spectral pixels in different amounts called “abundances”. Unmixing the data cube is an important task for identifying the endmembers present in the cube when analyzing these images. Unsupervised unmixing is done with no information about the given data cube. Sparsity is one of the recent approaches used in source recovery or unmixing techniques. The l1-norm optimization problem “basis pursuit” can be used as a sparsity-based approach to solve this unmixing problem, where the endmembers are assumed to be sparse in an appropriate domain known as a dictionary. This optimization problem is solved using the proximal method “iterative thresholding”. The l1-norm basis pursuit optimization problem was used as a sparsity-based unmixing technique to unmix real and synthetic hyperspectral data cubes.
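Iterative thresholding for an l1-regularized problem can be sketched in a few lines. Note the assumption: the code solves the penalized form 0.5*||y - A x||^2 + lam*||x||_1 (often called basis pursuit denoising) rather than the equality-constrained basis pursuit problem, and the dictionary `A`, the step size, and the penalty `lam` are hypothetical inputs rather than values from the paper.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (proximal map of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, y, lam, step, n_iter):
    """Iterative soft thresholding for 0.5*||y - A x||^2 + lam*||x||_1.

    A is a list of rows; `step` should not exceed 1 / (largest eigenvalue
    of A^T A) for the iteration to converge.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = y - A x
        r = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Gradient step x + step * A^T r, followed by shrinkage
        x = [
            soft_threshold(x[j] + step * sum(A[i][j] * r[i] for i in range(m)), step * lam)
            for j in range(n)
        ]
    return x
```

With an identity dictionary the iteration reduces to soft-thresholding the observation itself, which is a convenient sanity check: small entries are set exactly to zero and large ones are shrunk by `lam`.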

Keywords: basis pursuit, blind source separation, hyperspectral imaging, spectral unmixing, wavelets

Procedia PDF Downloads 167
40 The Effect of Using LDOCE on Iranian EFL Learners’ Pronunciation Accuracy

Authors: Mohammad Hadi Mahmoodi, Elahe Saedpanah

Abstract:

Since pronunciation is among those factors that can have strong effects on EFL learners’ successful communication, instructional programs aimed at accurate pronunciation seem to be a necessity in any L2 teaching context. The widespread use of smart mobile phones brings with it various educational applications, which can assist foreign language learners in learning and speaking a language other than their L1. In line with this supportive innovation, the present study investigated the role of LDOCE (Longman Dictionary of Contemporary English), a mobile application, in improving Iranian EFL learners’ pronunciation accuracy. To this aim, 40 EFL learners studying English at the intermediate level participated in the current study. This was an experimental study with 20 students each in an experimental group and a control group. The data were collected through the administration of a pronunciation pretest before the instruction and a post-test after the treatment. In addition, the assessment was based on the pupils’ recorded voices while reading the selected words. The results of the independent samples t-test indicated that using LDOCE significantly affected Iranian EFL learners' pronunciation accuracy, with those in the experimental group outperforming their control group counterparts.

Keywords: LDOCE, EFL learners, pronunciation accuracy, CALL, MALL

Procedia PDF Downloads 511
39 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error

Authors: Qianhua He, Weili Zhou, Aiwu Chen

Abstract:

A sparse representation speech denoising method based on adapted stopping residue error is presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively set according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experimental results show that the proposed method outperforms the conventional methods in terms of subjective and objective measures.
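The role of a stopping residue error in OMP can be illustrated with a small sketch: atoms are added one at a time, the weights are refit by least squares, and the loop ends once the residual energy drops to the threshold. The threshold here is a fixed argument named `stop_error`; the paper's adaptive estimate from the cross-correlation is not reproduced, and the sketch assumes the selected atoms are linearly independent.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(G, b):
    """Solve the small linear system G z = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def omp(signal, atoms, stop_error, max_atoms):
    """Orthogonal matching pursuit that stops once the residual energy
    falls to `stop_error` or below, or the atom budget is used up."""
    residual, chosen, weights = list(signal), [], []
    limit = min(max_atoms, len(atoms))
    while dot(residual, residual) > stop_error and len(chosen) < limit:
        best = max(
            (k for k in range(len(atoms)) if k not in chosen),
            key=lambda k: abs(dot(residual, atoms[k])),
        )
        chosen.append(best)
        # Least-squares refit on all chosen atoms (the "orthogonal" step).
        G = [[dot(atoms[i], atoms[j]) for j in chosen] for i in chosen]
        b = [dot(atoms[i], signal) for i in chosen]
        weights = solve(G, b)
        approx = [
            sum(w * atoms[i][t] for w, i in zip(weights, chosen))
            for t in range(len(signal))
        ]
        residual = [s - a for s, a in zip(signal, approx)]
    return dict(zip(chosen, weights)), residual
```

A larger `stop_error` ends the loop earlier and keeps fewer atoms, which is how a noise-dependent threshold can prevent the representation from fitting the noise.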

Keywords: speech denoising, sparse representation, k-singular value decomposition, orthogonal matching pursuit

Procedia PDF Downloads 464
38 A Comparative Study of Central Board of Secondary Education (CBSE) and Board of Secondary Education Madhya Pradesh BHOPAL (BSEMPB) Hindi Text Books of Class-VI

Authors: Shri Krishna Mishra, Badri Yadav

Abstract:

Proficient persons should be involved in formulating the structure of the textbooks so that the topics selected in the Hindi textbooks for Class VII contribute to the linguistic and literary development of the child and the language of the textbook matches the comprehension level of the student. The topics of the textbooks should provide good illustrations and suitable exercises. Topics catering to a variety of tastes can be included in the textbook to satisfy inquisitive children. There could be abstracts or hints at the beginning of each lesson. Meanings of difficult words must be given at the end of each topic for the convenience of parents and children, as most of them find it difficult and time-consuming to use a Hindi dictionary. Exercises should be relevant and cover the whole topic, and the difficulty level should match the maturity level of the students in respect of the CBSE Board. The stitching and binding of CBSE-prescribed books may be improved to increase durability.

Keywords: comparative study of CBSE and BSEMPB, Central Secondary Education, Board of Secondary Education, BHOPAL

Procedia PDF Downloads 367
37 Deep Learning Based-Object-classes Semantic Classification of Arabic Texts

Authors: Imen Elleuch, Wael Ouarda, Gargouri Bilel

Abstract:

In this paper, we propose a deep learning based approach to classify text in order to enrich an Arabic ontology based on the object classes of Gaston Gross. These object classes are defined by taking into account the syntactic and semantic features of the treated language. Our proposed approach is therefore a hybrid one: on the one hand, it is based on the object classes, which represent a knowledge-based approach to text classification; on the other hand, it uses a deep learning approach that relies on word embeddings to classify text. We applied the proposed approach to a corpus constructed from an Arabic dictionary. The resulting semantic classification of text will enrich the Arabic object classes ontology: new classes can be added to the ontology, or the features that characterize each object class can be expanded. The results are compared to a similar work that treats the same subject with a classical linguistic approach to the semantic classification of text. This comparison highlights our proposed hybrid approach, which can be improved by broadening the dataset used in the deep learning process.

Keywords: deep-learning approach, object-classes, semantic classification, Arabic

Procedia PDF Downloads 31
36 Particular Features of the First Romanian Multilingual Dictionaries

Authors: Mihaela Mocanu

Abstract:

Romanian multilingual dictionaries, also called polyglot, plurilingual or polylingual dictionaries, have seen a slow yet constant development from the end of the 17th century, when the first such work is attested, to the present, when we witness a considerable increase in the number of polyglot dictionaries, especially terminological ones. This paper analyzes the context in which the first Romanian multilingual dictionaries were issued, as well as the organizational and structural particularities of the first lexicographic works of this type. The irretrievable loss of some of these works, as well as the partial conservation of others, renders the attempt to retrace the beginnings of Romanian lexicography extremely difficult. The research methodology follows a descriptive and analytical approach based on two types of sources, subject to contrastive analysis: the notes made by the initiators of lexicographic projects and the testimonies of their contemporaries, on the one hand, and the specialized studies regarding the history of old Romanian lexicography, on the other. The analysis of the contents indicated that these dictionaries lacked a scientific apparatus in the true sense of the phrase and failed to follow unitary organizational criteria, being limited, most of the time, to mere inventories of words in which the Romanian term was assigned its correspondent in other languages. Motivated by practical reasons, the first multilingual dictionaries were aimed at clerics, their purpose being to ensure the translators’ fidelity to the original religious texts, regarded as sacred.

Keywords: Romanian lexicography, multilingual dictionary, terminology, language

Procedia PDF Downloads 253
35 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction, so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, which enables the greedy matching pursuit heuristic of selecting the highest inner products to isolate high-energy speech components in high-noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at 0 dB SNR, 90% accuracy at 5 dB SNR and 98% accuracy at 20 dB SNR, using 30 dB SNR as a reference for voice activity.
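The greedy step described above, repeatedly picking the atom with the highest inner product against the residual, can be sketched as follows. This is an illustrative sketch only, not the authors' detector: the atom parameters, the helper names `gabor_atom`, `matching_pursuit` and `captured_energy_ratio`, and the idea of summarising a frame by the fraction of energy captured are all assumptions, and the abstract does not state the actual decision rule.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gabor_atom(n, center, freq, width):
    """Unit-norm Gaussian-windowed cosine of length n (freq in cycles/sample)."""
    raw = [
        math.exp(-0.5 * ((t - center) / width) ** 2)
        * math.cos(2.0 * math.pi * freq * (t - center))
        for t in range(n)
    ]
    norm = math.sqrt(dot(raw, raw))
    return [x / norm for x in raw]


def matching_pursuit(signal, atoms, n_iter):
    """Plain matching pursuit over unit-norm atoms.

    Returns the list of (atom index, coefficient) picks and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        coeffs = [dot(residual, a) for a in atoms]
        # Greedy choice: the atom with the largest absolute inner product.
        best = max(range(len(atoms)), key=lambda i: abs(coeffs[i]))
        picks.append((best, coeffs[best]))
        residual = [r - coeffs[best] * x for r, x in zip(residual, atoms[best])]
    return picks, residual


def captured_energy_ratio(signal, residual):
    """Fraction of the signal energy explained by the selected atoms."""
    total = dot(signal, signal)
    return 0.0 if total == 0 else 1.0 - dot(residual, residual) / total


# Hypothetical dictionary: three atoms with logarithmically spaced frequencies.
ATOMS = [gabor_atom(64, 32, f, 8.0) for f in (0.05, 0.1, 0.2)]
```

Because only a few atoms are kept per frame, the decomposition also acts as a compressed description of the frame, which is the sense in which the abstract speaks of high data compression rates.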

Keywords: atomic decomposition, gabor, gammatone, matching pursuit, voice activity detection

Procedia PDF Downloads 254
34 Possibilities and Challenges of Using Machine Translation in Foreign Language Education

Authors: Miho Yamashita

Abstract:

In recent years, there have been attempts to introduce Machine Translation (MT) into foreign language teaching, especially in writing instruction. This is because the performance of neural machine translation has improved dramatically since 2016, and some university instructors have started to introduce MT translations to their students as a "good model" to learn from. However, MT is still not perfect, and there are many incorrect translations. In order to translate the intended text into a foreign language, it is necessary to edit the original manuscript written in the native language (pre-edit) and revise the translated foreign language text (post-edit). The latter is considered especially difficult for users without a high level of foreign language proficiency. Therefore, the author allowed her students to use MT in her writing class at a private university in Japan and investigated 1) how groups of students with different English proficiency levels revised MT translations when translating Japanese manuscripts into English and 2) whether the post-edit process differed when the students revised alone or in pairs. The results showed that, for 1), certain grammatical errors were left uncorrected in post-editing regardless of proficiency level, indicating the need for teacher intervention, and, for 2), pairs made more appropriate corrections and were also observed to use a dictionary frequently. In this presentation, the author will discuss how MT writing instruction can be integrated effectively with the aim of achieving multimodal foreign language education.

Keywords: machine translation, writing instruction, pre-edit, post-edit

Procedia PDF Downloads 26
33 Antimicrobial and Antioxidant Activities of Actinobacteria Isolated from the Pollen of Pinus sylvestris Grown on the Lake Baikal Shore

Authors: Denis V. Axenov-Gribanov, Irina V. Voytsekhovskaya, Evgenii S. Protasov, Maxim A. Timofeyev

Abstract:

Isolated ecosystems existing under specific environmental conditions have been shown to be promising sources of new strains of actinobacteria. The taiga forest of Baikal Siberia has not been well studied, and its actinobacterial population remains uncharacterized. The proximity between the huge water mass of Lake Baikal and high mountain ranges influences the structure and diversity of the plant world in Siberia. Here, we report the isolation of eighteen actinobacterial strains from male cones of Pinus sylvestris trees growing on the shore of the ancient Lake Baikal in Siberia. The actinobacterial strains were isolated on solid nutrient MS media and Czapek agar supplemented with cycloheximide and phosphomycin. Identification of actinobacteria was carried out by 16S rRNA gene sequencing and further analysis of the evolutionary history. Four different liquid and solid media (NL19, DNPM, SG and ISP) were tested for metabolite production. The metabolite extracts produced by the isolated strains were tested for antibacterial and antifungal activities. The antiradical activity of the crude extracts was also assessed. Strain Streptomyces sp. IB 2014 I 74-3, which was active against Gram-negative bacteria, was selected for dereplication analysis using high-yield liquid chromatography with mass spectrometry. Mass detection was performed in both positive and negative modes, with the detection range set to 160–2500 m/z. Data were collected and analyzed using Bruker Compass Data Analysis software, version 4.1. Dereplication was performed using the Dictionary of Natural Products (DNP) database version 6.1 with the following search parameters: accurate molecular mass, absorption spectra and source of compound isolation. Thus, in addition to more common representative strains of Streptomyces, several species belonging to the genera Rhodococcus, Amycolatopsis, and Micromonospora were isolated.
Several of the selected strains were deposited in the Russian Collection of Agricultural Microorganisms (RCAM), St. Petersburg, Russia. All isolated strains exhibited antibacterial and antifungal activities. We identified several strains that inhibited the growth of the pathogen Candida albicans but did not hinder the growth of Saccharomyces cerevisiae. Several isolates were active against Gram-positive and Gram-negative bacteria. Moreover, extracts of several strains demonstrated high antioxidant activity. The high proportion of biologically active strains producing antibacterial and specific antifungal compounds may reflect their role in protecting pollen against phytopathogens. Dereplication of the secondary metabolites of strain Streptomyces sp. IB 2014 I 74-3 detected a total of 59 major compounds in the culture liquid extract of the strain cultivated in ISP medium. Eight compounds were preliminarily identified based on characteristics described in the Dictionary of Natural Products database, using the search parameters described above. Streptomyces sp. IB 2014 I 74-3 was found to produce saframycin A, Y3 and S; 2-amino-3-oxo-3H-phenoxazine-1,8-dicarboxylic acid; galtamycinone; platencin A4-13R and A4-4S; ganefromycin d1; the antibiotic SS 8201B; and streptothricin D, 40-decarbamoyl, 60-carbamoyl. Moreover, forty-nine of the 59 compounds detected in the extract examined in the present study did not yield any positive hits in the DNP database and could not be identified based on the available mass-spectrometry data. Thus, these compounds might represent new findings.

Keywords: actinobacteria, Baikal Lake, biodiversity, male cones, Pinus sylvestris

Procedia PDF Downloads 192
32 Information Disclosure And Financial Sentiment Index Using a Machine Learning Approach

Authors: Alev Atak

Abstract:

In this paper, we aim to create a financial sentiment index by investigating companies’ voluntary information disclosures. We retrieve structured content from BIST 100 companies’ financial reports for the period 1998-2018 and extract relevant financial information for sentiment analysis through Natural Language Processing. We measure strategy-related disclosures and their cross-sectional variation and classify report content into generic sections using synonym lists divided into four main categories according to their liquidity risk profile, risk positions, intra-annual information, and exposure to risk. We use Word Error Rate and Cosine Similarity for comparing and measuring text similarity and derivation in sets of texts. In addition to performing text extraction, we will provide a range of text analysis options, such as readability metrics, word counts using pre-determined lists (e.g., forward-looking, uncertainty, tone, etc.), and comparison with a reference corpus (at the word, part-of-speech and semantic levels). We thereby create an adequate analytical tool and a financial dictionary to show the importance of granular financial disclosure for investors to correctly identify risk-taking behavior and hence make the aggregated effects traceable.
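The two text comparison measures named above can be sketched in a few lines. This is a generic illustration, not the authors' pipeline: the whitespace tokenisation, lower-casing, raw term-count vectors and the function names `cosine_similarity` and `word_error_rate` are assumptions chosen for brevity.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between bag-of-words count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    shared = a.keys() & b.keys()
    numerator = sum(a[w] * b[w] for w in shared)
    denominator = (
        math.sqrt(sum(c * c for c in a.values()))
        * math.sqrt(sum(c * c for c in b.values()))
    )
    return numerator / denominator if denominator else 0.0


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Dynamic-programming row of edit distances against the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1] / len(ref)
```

Cosine similarity ignores word order and is bounded by 1 for identical word distributions, whereas word error rate is order-sensitive and can exceed 1 when the second text is much longer than the reference, so the two measures capture different kinds of change between report versions.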

Keywords: financial sentiment, machine learning, information disclosure, risk

Procedia PDF Downloads 63
31 Using Audit Tools to Maintain Data Quality for ACC/NCDR PCI Registry Abstraction

Authors: Vikrum Malhotra, Manpreet Kaur, Ayesha Ghotto

Abstract:

Background: Cardiac registries such as the ACC Percutaneous Coronary Intervention (PCI) Registry require high-quality data to be abstracted, including data elements such as nuclear cardiology, diagnostic coronary angiography, and PCI. Introduction: The audit tool is used by data abstractors to perform data audits and to assess the accuracy and inter-rater reliability (IRR) of the abstraction performed by the abstractors for a health system. This audit tool solution has been developed across 13 registries, including ACC/NCDR registries, PCI, STS, and Get with the Guidelines. Methodology: The data audit tool was used to audit internal registry abstraction for all data elements, including stress test performed, type of stress test, data of stress test, results of stress test, risk/extent of ischemia, diagnostic catheterization detail, and PCI data elements for ACC/NCDR PCI registries. It is being used internally across 20 hospital systems to provide abstraction and audit services for them. Results: The data audit tool showed data accuracy and IRR scores greater than 95% for the PCI registry across 50 PCI registry cases in 2021. Conclusion: The tool is being used internally for surgical societies and across hospital systems. The audit tool enables the abstractor to be assessed by an external abstractor and includes all of the data dictionary fields for each registry.

Keywords: abstraction, cardiac registry, cardiovascular registry, registry, data

Procedia PDF Downloads 67
30 An Investigation into Problems Confronting Pre-Service Teachers of French in South-West Nigeria

Authors: Modupe Beatrice Adeyinka

Abstract:

French, a foreign language in Nigeria, has been declared the second official language and a compulsory subject at the primary school level; hence, colleges of education across the nation are saddled with the responsibility of training teachers for the subject. However, it has been observed that this policy has not been fully implemented, for French teachers in training face many challenges, of which translation is chief. In a bid to investigate the major cause of the perceived translation problem, this study examined the French translation problems of pre-service teachers in selected colleges of education in southwest Nigeria. The study adopted a descriptive survey research design. The simple random sampling technique was used to select four colleges of education in the southwest, and 25 French students were randomly selected from each school, for a total of 100. The pre-service teachers’ French translation problems questionnaire (PTFTPQ) was used as the instrument; four research questions were answered and three null hypotheses were tested. Among other findings, the study revealed that students do have problems with false friends, mainly with their interpretation when attempting French-English translation and vice versa; the majority of the students use a French dictionary as a way out and found it very useful for their understanding of false friends. Teachers were, therefore, urged to attend in-service training where they would be exposed to new and emerging strategies, approaches and methodologies of French language teaching that will help students overcome the challenge of translation in learning French.

Keywords: false friends, French language, pre-service teachers, source language, target language, translation

Procedia PDF Downloads 109
29 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of entity relation discovery in structured data, a well-covered topic in the literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary or a specific collection of named items. In many cases, machine learning and/or text mining techniques are used for this goal. These approaches might be infeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, called a cooccurrence graph. Indeed, each cooccurrence indicates some degree of semantic correlation between the words, because related words are more likely to appear close to each other than at opposite ends of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper, we generalise this technique, arriving at a Weighted-Distance Sliding Window, in which each cooccurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition by applying the technique to a data set consisting of the text of the Bible, split into verses.
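The weighted-distance idea can be sketched as follows. The abstract does not specify the weighting function, so the inverse-distance weight `1 / d` used here is an assumption, as are the function name `weighted_cooccurrences`, the `max_distance` parameter standing in for the window size, and the optional `entities` filter.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, max_distance, entities=None):
    """Build a weighted cooccurrence graph as {(word_a, word_b): weight}.

    Two tokens at distance d <= max_distance contribute 1 / d to their edge
    (an assumed weighting; closer pairs count more). If `entities` is given,
    only tokens in that set are considered.
    """
    weights = defaultdict(float)
    for i, left in enumerate(tokens):
        if entities is not None and left not in entities:
            continue
        for d in range(1, max_distance + 1):
            j = i + d
            if j >= len(tokens):
                break
            right = tokens[j]
            if right == left:
                continue  # ignore a word cooccurring with itself
            if entities is not None and right not in entities:
                continue
            # Undirected edge: store each pair under a sorted key.
            weights[tuple(sorted((left, right)))] += 1.0 / d
    return dict(weights)
```

Setting every weight to 1 instead of `1 / d` recovers the plain sliding-window count that the abstract says earlier work used. To respect verse boundaries, the function would be called once per verse and the resulting dictionaries summed.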

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 106