Search results for: CCL Corpus
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 82

Search results for: CCL Corpus

52 Re-telling Goa's History: The Margin Narrative

Authors: Anna Beatriz Paula

Abstract:

This paper presents the first reflexions about Margaret Mascarenhas-s novel, “Skin", based on post-colonial critic perception of History and its agents. By doing so, this study will put light on a literary corpus of Indian Literatures: the Goan Literature whose cultural basis creates an unique historiographic metafiction conducted by different characters that one by one plays the narrator role.

Keywords: Goa, History, Literature, Metafiction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2130
51 A Corpus-Based Approach to Understanding Market Access in Fisheries and Aquaculture: A Systematic Literature Review

Authors: Cheryl Marie Cordeiro

Abstract:

Although fisheries and aquaculture studies might seem marginal to international business (IB) studies in general, fisheries and aquaculture IB (FAIB) management is currently facing increasing pressure to meet global demand and consumption for fish in the next coming decades. In part address to this challenge, the purpose of this systematic review of literature (SLR) study is to investigate the use of the term ‘market access’ in its context of use in the generic literature and business sector discourse, in comparison to the more specific literature and discourse in fisheries, aquaculture and seafood. This SLR aims to uncover the knowledge/interest gaps between the academic subject discourses and business sector practices. Corpus driven in methodology and using a triangulation method of three different text analysis software including AntConc, VOSviewer and Web of Science (WoS) analytics, the SLR results indicate a gap in conceptual knowledge and business practices in how ‘market access’ is conceived and used in the context of the pharmaceutical healthcare industry and FAIB research and practice. While it is acknowledged that the product orientation of different business sectors might differ, this SLR study works with the assumption that both business sectors are global in orientation. These business sectors are complex in their operations from product to market. This SLR suggests a conceptual model in understanding the challenges, the potential barriers as well as avenues for solutions to developing market access for FAIB.

Keywords: Market access, fisheries and aquaculture, international business, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 945
50 Named Entity Recognition using Support Vector Machine: A Language Independent Approach

Authors: Asif Ekbal, Sivaji Bandyopadhyay

Abstract:

Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is now-a-days considered to be fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports about the development of a NER system for Bengali and Hindi using Support Vector Machine (SVM). Though this state of the art machine learning technique has been widely applied to NER in several well-studied languages, the use of this technique to Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with the variety of features that are helpful in predicting the four different named (NE) classes, such as Person name, Location name, Organization name and Miscellaneous name. We have used the annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes 1, defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL) 2. In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web-archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm in order to generate the lexical context patterns from a part of the unlabeled Bengali news corpus. Lexical patterns have been used as the features of SVM in order to improve the system performance. The NER system has been tested with the gold standard test sets of 35K, and 60K tokens for Bengali, and Hindi, respectively. Evaluation results have demonstrated the recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. Results show the improvement in the f-score by 5.13% with the use of context patterns. Statistical analysis, ANOVA is also performed to compare the performance of the proposed NER system with that of the existing HMM based system for both the languages.

Keywords: Named Entity (NE), Named Entity Recognition (NER), Support Vector Machine (SVM), Bengali, Hindi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3402
49 Analysis of Steles with Libyan Inscriptions of Grande Kabylia, Algeria

Authors: Samia Ait Ali Yahia

Abstract:

Several steles with Libyan inscriptions were discovered in Grande Kabylia (Algeria), but very few researchers were interested in these inscriptions. Our work is to list, if possible all these steles in order to do a descriptive study of the corpus. The steles analysis will be focused on the iconographic and epigraphic level and on the different forms of Libyan characters in order to highlight the alphabet used by the Grande Kabylia.

Keywords: Epigraphy, stele, Libyan inscription, Grande Kabylia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1089
48 Effective Features for Disambiguation of Turkish Verbs

Authors: Zeynep Orhan, Zeynep Altan

Abstract:

This paper summarizes the results of some experiments for finding the effective features for disambiguation of Turkish verbs. Word sense disambiguation is a current area of investigation in which verbs have the dominant role. Generally verbs have more senses than the other types of words in the average and detecting these features for verbs may lead to some improvements for other word types. In this paper we have considered only the syntactical features that can be obtained from the corpus and tested by using some famous machine learning algorithms.

Keywords: Word sense disambiguation, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1746
47 Achieving Maximum Performance through the Practice of Entrepreneurial Ethics: Evidence from SMEs in Nigeria

Authors: S. B. Tende, H. L. Abubakar

Abstract:

It is acknowledged that small and medium enterprises (SMEs) may encounter different ethical issues and pressures that could affect the way in which they strategize or make decisions concerning the outcome of their business. Therefore, this research aimed at assessing entrepreneurial ethics in the business of SMEs in Nigeria. Secondary data were adopted as source of corpus for the analysis. The findings conclude that a sound entrepreneurial ethics system has a significant effect on the level of performance of SMEs in Nigeria. The Nigerian Government needs to provide both guiding and physical structures; as well as learning systems that could inculcate these entrepreneurial ethics.

Keywords: Entrepreneurial ethics, culture, performance, SME.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 964
46 Software Architectural Design Ontology

Authors: Muhammad Irfan Marwat, Sadaqat Jan, Syed Zafar Ali Shah

Abstract:

Software Architecture plays a key role in software development but absence of formal description of Software Architecture causes different impede in software development. To cope with these difficulties, ontology has been used as artifact. This paper proposes ontology for Software Architectural design based on IEEE model for architecture description and Kruchten 4+1 model for viewpoints classification. For categorization of style and views, ISO/IEC 42010 has been used. Corpus method has been used to evaluate ontology. The main aim of the proposed ontology is to classify and locate Software Architectural design information.

Keywords: Software Architecture Ontology, Semantic based Software Architecture, Software Architecture, Ontology, Software Engineering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4187
45 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: Hidden Markov model, Viterbi algorithm, POS tagging, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1707
44 A Similarity Measure for Clustering and its Applications

Authors: Guadalupe J. Torres, Ram B. Basnet, Andrew H. Sung, Srinivas Mukkamala, Bernardete M. Ribeiro

Abstract:

This paper introduces a measure of similarity between two clusterings of the same dataset produced by two different algorithms, or even the same algorithm (K-means, for instance, with different initializations usually produce different results in clustering the same dataset). We then apply the measure to calculate the similarity between pairs of clusterings, with special interest directed at comparing the similarity between various machine clusterings and human clustering of datasets. The similarity measure thus can be used to identify the best (in terms of most similar to human) clustering algorithm for a specific problem at hand. Experimental results pertaining to the text categorization problem of a Portuguese corpus (wherein a translation-into-English approach is used) are presented, as well as results on the well-known benchmark IRIS dataset. The significance and other potential applications of the proposed measure are discussed.

Keywords: Clustering Algorithms, Clustering Applications, Similarity Measures, Text Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1570
43 A Content Vector Model for Text Classification

Authors: Eric Jiang

Abstract:

As a popular rank-reduced vector space approach, Latent Semantic Indexing (LSI) has been used in information retrieval and other applications. In this paper, an LSI-based content vector model for text classification is presented, which constructs multiple augmented category LSI spaces and classifies text by their content. The model integrates the class discriminative information from the training data and is equipped with several pertinent feature selection and text classification algorithms. The proposed classifier has been applied to email classification and its experiments on a benchmark spam testing corpus (PU1) have shown that the approach represents a competitive alternative to other email classifiers based on the well-known SVM and naïve Bayes algorithms.

Keywords: Feature Selection, Latent Semantic Indexing, Text Classification, Vector Space Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1884
42 A System of Automatic Speech Recognition based on the Technique of Temporal Retiming

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the procedure of a system of automatic speech recognition based on techniques of the dynamic programming. The technique of temporal retiming is a technique used to synchronize between two forms to compare. We will see how this technique is adapted to the field of the automatic speech recognition. We will expose, in a first place, the theory of the function of retiming which is used to compare and to adjust an unknown form with a whole of forms of reference constituting the vocabulary of the application. Then we will give, in the second place, the various algorithms necessary to their implementation on machine. The algorithms which we will present were tested on part of the corpus of words in Arab language Arabdic-10 [4] and gave whole satisfaction. These algorithms are effective insofar as we apply them to the small ones or average vocabularies.

Keywords: Continuous speech recognition, temporal retiming, phonetic decoding, algorithms, vocal signal, dynamic programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1346
41 Questions in the School

Authors: Jana M. Havigerová, Jiří Haviger

Abstract:

Paper deals with the topic of questions as important components of information behavior in the school. By analyzing the Corpus Schola2010, the state of contemporary education in terms of questioning is proven unsatisfactory: 80% of the questions are asked by teachers; most of teacher-s questions are asked at the beginning of the first grade, than their number decreases and is settling down on 80±10 questions per lesson. The average number of questions within one lesson per one pupil is generally less than one whole question. The highest values are achieved in the first, sixth, eighth and tenth grade,, i.e. in the transition years in which pupils are moving into higher levels of education and every following year it declines. We can state Czech school do not support questioning and question skill of their pupils, thereby typical Czech schools are neglecting the development of thinking, reasoning and cooperation of their pupils.

Keywords: information behavior, questions, primary and secondary education, Czech Republic

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1463
40 Resolving Dependency Ambiguity of Subordinate Clauses using Support Vector Machines

Authors: Sang-Soo Kim, Seong-Bae Park, Sang-Jo Lee

Abstract:

In this paper, we propose a method of resolving dependency ambiguities of Korean subordinate clauses based on Support Vector Machines (SVMs). Dependency analysis of clauses is well known to be one of the most difficult tasks in parsing sentences, especially in Korean. In order to solve this problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation among verb phrase, verb and endings in the clauses. As a result, this problem is represented as a binary classification task. In order to apply SVMs to this problem, we selected two kinds of features: static and dynamic features. The experimental results on STEP2000 corpus show that our system achieves the accuracy of 73.5%.

Keywords: Dependency analysis, subordinate clauses, binaryclassification, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1596
39 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections is immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1996
38 A New Time-Frequency Speech Analysis Approach Based On Adaptive Fourier Decomposition

Authors: Liming Zhang

Abstract:

In this paper, a new adaptive Fourier decomposition (AFD) based time-frequency speech analysis approach is proposed. Given the fact that the fundamental frequency of speech signals often undergo fluctuation, the classical short-time Fourier transform (STFT) based spectrogram analysis suffers from the difficulty of window size selection. AFD is a newly developed signal decomposition theory. It is designed to deal with time-varying non-stationary signals. Its outstanding characteristic is to provide instantaneous frequency for each decomposed component, so the time-frequency analysis becomes easier. Experiments are conducted based on the sample sentence in TIMIT Acoustic-Phonetic Continuous Speech Corpus. The results show that the AFD based time-frequency distribution outperforms the STFT based one.

Keywords: Adaptive fourier decomposition, instantaneous frequency, speech analysis, time-frequency distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
37 Automatic Text Summarization

Authors: Mohamed Abdel Fattah, Fuji Ren

Abstract:

This work proposes an approach to address automatic text summarization. This approach is a trainable summarizer, which takes into account several features, including sentence position, positive keyword, negative keyword, sentence centrality, sentence resemblance to the title, sentence inclusion of name entity, sentence inclusion of numerical data, sentence relative length, Bushy path of the sentence and aggregated similarity for each sentence to generate summaries. First we investigate the effect of each sentence feature on the summarization task. Then we use all features score function to train genetic algorithm (GA) and mathematical regression (MR) models to obtain a suitable combination of feature weights. The proposed approach performance is measured at several compression rates on a data corpus composed of 100 English religious articles. The results of the proposed approach are promising.

Keywords: Automatic Summarization, Genetic Algorithm, Mathematical Regression, Text Features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2334
36 Extraction of Temporal Relation by the Creation of Historical Natural Disaster Archive

Authors: Suguru Yoshioka, Seiichi Tani, Seinosuke Toda

Abstract:

In historical science and social science, the influence of natural disaster upon society is a matter of great interest. In recent years, some archives are made through many hands for natural disasters, however it is inefficiency and waste. So, we suppose a computer system to create a historical natural disaster archive. As the target of this analysis, we consider newspaper articles. The news articles are considered to be typical examples that prescribe the temporal relations of affairs for natural disaster. In order to do this analysis, we identify the occurrences in newspaper articles by some index entries, considering the affairs which are specific to natural disasters, and show the temporal relation between natural disasters. We designed and implemented the automatic system of “extraction of the occurrences of natural disaster" and “temporal relation table for natural disaster."

Keywords: Database, digital library, corpus, historical natural disaster, temporal relation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1403
35 Foreign Elements in the Methodologies of Usul Fiqh: Analyzing the Orientalist Thought

Authors: Ariyanti Mustapha

Abstract:

The development of Islamic jurisprudence since the first century of the hijra has fascinated many orientalists to explore the historiography of Islamic legislation. The practice of usul fiqh began during the lifetime of the Prophet Muhammad and was continued by the companions as the legal reasoning due to the absence of the legal injunction in the Qur’an and Sunnah. The orientalists propagated that the Roman and Jewish legislation were transplanted into Islamic jurisprudence and it was the primary reason for its progression. We used qualitative and comparative methods to analyze the orientalists’ views. Results showed that many erroneous facts were propagated by Goldziher and Schacht by claiming the parallels between the principles, methodologies, and fundamental concepts in Islamic jurisprudence and Roman Provincial law. The orientalists claimed that Islamic jurisprudence was derived from the corpus of Jewish Mishnah and Ha-kol. These judgments are used by the orientalists to prove the inferiority of Islamic jurisprudence. Nevertheless, many evidences have proven that Islamic legislation is capable of developing independently without any foreign transplant.

Keywords: Foreign transplant, ijtihad, orientalist, usul fiqh.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 388
34 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less accurately with daily mood evaluation score given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences by treating emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.

Keywords: Analysis, data, diary, emotions, mood, sentiment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125
33 Algorithm for Information Retrieval Optimization

Authors: Kehinde K. Agbele, Kehinde Daniel Aruleba, Eniafe F. Ayetiran

Abstract:

When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimization of IRS to individual information needs in order of relevance. The study addressed development of algorithms that optimize the ranking of documents retrieved from IRS. This study discusses and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated databases environment. Conversely, as the volume of information available online and in designated databases is growing continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (

Keywords: Internet ranking,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1475
32 Modelling Medieval Vaults: Digital Simulation of the North Transept Vault of St Mary, Nantwich, England

Authors: N. Webb, A. Buchanan

Abstract:

Digital and virtual heritage is often associated with the recreation of lost artefacts and architecture; however, we can also investigate works that were not completed, using digital tools and techniques. Here we explore physical evidence of a fourteenth-century Gothic vault located in the north transept of St Mary’s church in Nantwich, Cheshire, using existing springer stones that are built into the walls as a starting point. Digital surveying tools are used to document the architecture, followed by an analysis process to hypothesise and simulate possible design solutions, had the vault been completed. A number of options, both two-dimensionally and three-dimensionally, are discussed based on comparison with examples of other contemporary vaults, thus adding another specimen to the corpus of vault designs. Dissemination methods such as digital models and 3D prints are also explored as possible resources for demonstrating what the finished vault might have looked like for heritage interpretation and other purposes.

Keywords: Digital simulation, heritage interpretation, medieval vaults, virtual heritage, 3D scanning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1178
31 Hybrid Machine Learning Approach for Text Categorization

Authors: Nerijus Remeikis, Ignas Skucas, Vida Melninkaite

Abstract:

Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. Performance of neural networks learning is known to be sensitive to the initial weights and architecture. This paper discusses the use multilayer neural network initialization with decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree from root node until a final leave is used for initialization of multilayer neural network. The experimental evaluation demonstrates this approach provides better classification accuracy with Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with multilayer neural network initialized with traditional random method and decision tree classifiers.

Keywords: Text categorization, decision trees, neural networks, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805
30 Environmental Interference Cancellation of Speech with the Radial Basis Function Networks: An Experimental Comparison

Authors: Nima Hatami

Abstract:

In this paper, we use Radial Basis Function Networks (RBFN) for solving the problem of environmental interference cancellation of speech signal. We show that the Second Order Thin- Plate Spline (SOTPS) kernel cancels the interferences effectively. For make comparison, we test our experiments on two conventional most used RBFN kernels: the Gaussian and First order TPS (FOTPS) basis functions. The speech signals used here were taken from the OGI Multi-Language Telephone Speech Corpus database and were corrupted with six type of environmental noise from NOISEX-92 database. Experimental results show that the SOTPS kernel can considerably outperform the Gaussian and FOTPS functions on speech interference cancellation problem.

Keywords: Environmental interference, interference cancellation of speech, Radial Basis Function networks, Gaussian and TPS kernels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1562
29 High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Authors: Kei Fujii, Jun Okawa, Kaori Suigetsu

Abstract:

Concatenative speech synthesis is a method that can make speech sound which has naturalness and high-individuality of a speaker by introducing a large speech corpus. Based on this method, in this paper, we propose a voice conversion method whose conversion speech has high-individuality and naturalness. The authors also have two subjective evaluation experiments for evaluating individuality and sound quality of conversion speech. From the results, following three facts have be confirmed: (a) the proposal method can convert the individuality of speakers well, (b) employing the framework of unit selection (especially join cost) of concatenative speech synthesis into conventional voice conversion improves the sound quality of conversion speech, and (c) the proposal method is robust against the difference of genders between a source speaker and a target speaker.

Keywords: concatenative speech synthesis, join cost, speaker individuality, unit selection, voice conversion

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1938
28 The Predictability and Abstractness of Language: A Study in Understanding and Usage of the English Language through Probabilistic Modeling and Frequency

Authors: Revanth Sai Kosaraju, Michael Ramscar, Melody Dye

Abstract:

Accounts of language acquisition differ significantly in their treatment of the role of prediction in language learning. In particular, nativist accounts posit that probabilistic learning about words and word sequences has little to do with how children come to use language. The accuracy of this claim was examined by testing whether distributional probabilities and frequency contributed to how well 3-4 year olds repeat simple word chunks. Corresponding chunks were the same length, expressed similar content, and were all grammatically acceptable, yet the results of the study showed marked differences in performance when overall distributional frequency varied. It was found that a distributional model of language predicted the empirical findings better than a number of other models, replicating earlier findings and showing that children attend to distributional probabilities in an adult corpus. This suggested that language is more prediction-and-error based, rather than on abstract rules which nativist camps suggest.

Keywords: Abstractness, child psychology, language acquisition, prediction and error.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2095
27 Use and Relationship of Shell Nouns as Cohesive Devices in the Quality of Second Language Writing

Authors: Kristine D. de Leon, Junifer A. Abatayo, Jose Cristina M. Pariña

Abstract:

The current study is a comparative analysis of the use of shell nouns as a cohesive device (CD) in an English for Second Language (ESL) setting in order to identify their use and relationship in the quality of second language (L2) writing. As these nouns were established to anticipate the meaning within, across or outside the text, their use has fascinated writing researchers. The corpus of the study included published articles from reputable journals and graduate students’ papers in order to analyze the frequency of shell nouns using “highly prevalent” nouns in the academic community, to identify the different lexicogrammatical patterns where these nouns occur and to the functions connected with these patterns. The result of the study implies that published authors used more shell nouns in their paper than graduate students. However, the functions of the different lexicogrammatical patterns for the frequently occurring shell nouns are somewhat similar. These results could help students in enhancing the cohesion of their text and in comprehending it.

Keywords: Anaphoric-cataphoric, cohesive device, lexicogrammatical, shell nouns.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 652
26 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125
25 Measuring Text-Based Semantics Relatedness Using WordNet

Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed

Abstract:

Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.

Keywords: GraphViz representation, semantic relatedness, similarity measurement, WordNet similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 836
24 Embodied Cognition and Its Implications in Education: An Overview of Recent Literature

Authors: Panagiotis Kosmas, Panayiotis Zaphiris

Abstract:

Embodied Cognition (EC) as a learning paradigm is based on the idea of an inseparable link between body, mind, and environment. In recent years, the advent of theoretical learning approaches around EC theory has resulted in a number of empirical studies exploring the implementation of the theory in education. This systematic literature overview identifies the mainstream of EC research and emphasizes on the implementation of the theory across learning environments. Based on a corpus of 43 manuscripts, published between 2013 and 2017, it sets out to describe the range of topics covered under the umbrella of EC and provides a holistic view of the field. The aim of the present review is to investigate the main issues in EC research related to the various learning contexts. Particularly, the study addresses the research methods and technologies that are utilized, and it also explores the integration of body into the learning context. An important finding from the overview is the potential of the theory in different educational environments and disciplines. However, there is a lack of an explicit pedagogical framework from an educational perspective for a successful implementation in various learning contexts.

Keywords: Embodied cognition, embodied learning, education, technology, schools.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1726
23 Concept Indexing using Ontology and Supervised Machine Learning

Authors: Rossitza M. Setchi, Qiao Tang

Abstract:

Nowadays, ontologies are the only widely accepted paradigm for the management of sharable and reusable knowledge in a way that allows its automatic interpretation. They are collaboratively created across the Web and used to index, search and annotate documents. The vast majority of the ontology based approaches, however, focus on indexing texts at document level. Recently, with the advances in ontological engineering, it became clear that information indexing can largely benefit from the use of general purpose ontologies which aid the indexing of documents at word level. This paper presents a concept indexing algorithm, which adds ontology information to words and phrases and allows full text to be searched, browsed and analyzed at different levels of abstraction. This algorithm uses a general purpose ontology, OntoRo, and an ontologically tagged corpus, OntoCorp, both developed for the purpose of this research. OntoRo and OntoCorp are used in a two-stage supervised machine learning process aimed at generating ontology tagging rules. The first experimental tests show a tagging accuracy of 78.91% which is encouraging in terms of the further improvement of the algorithm.

Keywords: Concepts, indexing, machine learning, ontology, tagging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1677