Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1411

Search results for: topic naming

1411 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the volume of document data has grown with the spread of the Internet, and many methods have been studied for extracting topics from large document collections. We previously proposed Independent Topic Analysis (ITA) to extract mutually independent topics from large document data such as newspaper data. ITA extracts independent topics from document data by using Independent Component Analysis. A topic extracted by ITA is represented by a set of words. However, such a set of words is quite different from the topics a user imagines. For example, the top five words with high independence for one topic are as follows: Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. Topic 1 can be considered to represent the topic "SPORTS", but this topic name has to be attached by the user, because ITA cannot name topics. Therefore, in this research, we propose a method that uses a web search engine to obtain easily understood topic names for the word sets produced by ITA. Specifically, we submit a topic's set of words as a search query and take the title of the web page returned in the search results as the topic name. We also apply the proposed method to several data sets and verify its effectiveness.
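The naming step described above can be sketched as follows. This is only an illustration, not the authors' implementation: the `search` callable stands in for whatever web search engine is used, and taking the first returned title is an assumption, since the abstract does not say which result is chosen.

```python
from typing import Callable, Optional, Sequence


def name_topic(
    words: Sequence[str],
    search: Callable[[str], Sequence[str]],
) -> Optional[str]:
    """Return a candidate topic name for a set of topic words.

    `search` is a hypothetical, caller-supplied function that takes a
    query string and returns page titles in ranked order.
    """
    query = " ".join(words)   # submit the topic words as one query
    titles = search(query)    # ranked page titles from the search engine
    # Assumption: the top-ranked title is used as the topic name.
    return titles[0] if titles else None


# Usage with a stand-in search function (no network access needed).
def fake_search(query: str) -> Sequence[str]:
    return ["Basketball Scores and Highlights", "Game Recaps"]


topic1 = ["scor", "game", "lead", "quarter", "rebound"]
candidate = name_topic(topic1, fake_search)
```

Passing the search engine in as a function keeps the sketch independent of any particular search API.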

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 94

1410 Neural Correlates of Arabic Digits Naming

Authors: Fernando Ojedo, Alejandro Alvarez, Pedro Macizo

Abstract:

In the present study, we explored electrophysiological correlates of Arabic digits naming to determine semantic processing of numbers. Participants named Arabic digits grouped by category or intermixed with exemplars of other semantic categories while the N400 event-related potential was examined. Around 350-450 ms after the presentation of Arabic digits, brain waves were more positive in anterior regions and more negative in posterior regions when stimuli were grouped by category relative to the mixed condition. Contrary to what was found in other studies, electrophysiological results suggested that the production of numerals involved semantic mediation.

Keywords: Arabic digit naming, event-related potentials, semantic processing, number production

Procedia PDF Downloads 546

1409 A Genetic Algorithm Based Ensemble Method with Pairwise Consensus Score on Malware Cacophonous Labels

Authors: Shih-Yu Wang, Shun-Wen Hsiao

Abstract:

In the field of cybersecurity, many vendors provide classification results for malware samples, naming each sample with a label that carries important information, also called an AV label. Many researchers rely on AV labels for their research. Unfortunately, AV labels are cluttered: they have no fixed format or fixed naming rules, because each naming result reflects the viewpoint of an individual classifier. One way to address the problem is to take a majority vote. However, voting can sometimes introduce bias. Thus, we create a novel ensemble approach that does not rely on the cacophonous naming results but instead depends on group identification to aggregate everyone's opinion. To achieve this, we develop a scoring system called the Pairwise Consensus Score (PCS) to calculate the similarity of results. The overall method architecture combines a Genetic Algorithm and PCS to find the maximum consensus in the group. Experimental results revealed that our method outperformed majority voting by 10% in terms of the score.
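For reference, the majority-vote baseline mentioned above might look like the sketch below. It assumes each vendor's label has already been reduced to a family name, and the tie-handling rule is an illustrative choice. The Pairwise Consensus Score and the genetic algorithm are not shown, because the abstract does not define them.

```python
from collections import Counter
from typing import Iterable, Optional


def majority_label(labels: Iterable[str]) -> Optional[str]:
    """Return the most common family label, or None on a tie or no votes.

    Assumption: labels are pre-normalized family names, one per vendor.
    Comparison is case-insensitive; empty labels are ignored.
    """
    counts = Counter(label.lower() for label in labels if label)
    ranked = counts.most_common()
    if not ranked:
        return None
    # A tie for first place yields no majority (illustrative choice).
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Three hypothetical vendors label the same sample.
winner = majority_label(["Zbot", "zbot", "Emotet"])
```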

Keywords: genetic algorithm, ensemble learning, malware family, malware labeling, AV labels

Procedia PDF Downloads 53

1408 Clustering Ethno-Informatics of Naming Village in Java Island Using Data Mining

Authors: Atje Setiawan Abdullah, Budi Nurani Ruchjana, I. Gede Nyoman Mindra Jaya, Eddy Hermawan

Abstract:

Ethnoscience views culture from a scientific perspective, which may help in understanding how people develop various forms of knowledge and belief, initially focusing on ecology and the history of existing contributions. One of the areas studied in ethnoscience is ethno-informatics, the application of informatics to culture. In this study, the informatics discipline used is data mining, a process that automatically extracts knowledge from large databases in order to find interesting patterns. The cultural application is a database of village names on the island of Java, obtained from the Indonesian Geospatial Information Agency (BIG), 2014. The purpose of this study is threefold: first, to classify village names on the island of Java based on the structure of the naming word, including its prefix, the syllables it contains, and the complete word; second, to classify the meaning of village names into specific categories, as well as their role in the behavioral characteristics of the community; and third, to visualize village names on a location map in order to see the similarity of village naming in each province. In this research we have developed two theorems: an area theorem, resulting from studies that collected the intersections of village names in each province on the island of Java, and a theorem on the composition of intersection (wedge) sets of the provinces in Java, used to view the peculiarities of a study location. The methodology of this study is based on Knowledge Discovery in Databases (KDD) in data mining; the process includes preprocessing, data mining, and postprocessing. The results showed that the community of Java prioritizes merit in living life, always works hard to achieve a more prosperous life, and values love as well as the sustainability of water and the environment.
Village names in adjacent provinces have a high degree of similarity and influence each other. Cultural similarity is high among the provinces of Central Java, East Java, and West Java-Banten, whereas it is low for Jakarta-Yogyakarta. This research yielded the cultural character of communities as expressed in the meaning of village names on the island of Java; this character is expected to serve as a guide for people's daily behavior on the island of Java.

Keywords: ethnoscience, ethno-informatics, data mining, clustering, Java island culture

Procedia PDF Downloads 251

1407 Evaluation of Persian Medical Terms Compatibility with International Naming Criteria Based on the Applied Translation Procedures

Authors: Ali Akbar Zeinali

Abstract:

The lack of appropriate equivalents for terms or technical words is the result of ineffective translation guidelines adopted in translation processes. The increasing number of foreign words and specific terms incorporated into the native language is due to the ongoing development of technology and science. Many problems appear in medical translation when Persian translators try to employ non-Persian or imported words in medical texts: owing to the lack of standardization, multiple equivalents may be created for one particular word based on the individual preferences of authors and translators in the target language. The study discusses the findings in terms of compatibility with the international naming criteria, considering the translation procedures. About 67% of the 339 equivalents in this study were grouped as incompatible words, while about 33% were compatible terms. The similarities and differences were investigated and discussed according to the compatibility status of the equivalents with Sager’s criteria. The equivalents were classified into several groups through bi-dimensional descriptions, namely the different features of translation procedures related to the international naming criteria. In reviewing the frequency distribution of compatibilities, the equivalents were divided into two categories, compatible and incompatible, indicating the effectiveness of the applied translation procedures.

Keywords: linguistics, medical translation, naming, terminology

Procedia PDF Downloads 96

1406 Discovering Word-Class Deficits in Persons with Aphasia

Authors: Yashaswini Channabasavegowda, Hema Nagaraj

Abstract:

Aim: The current study aims at discovering word-class deficits concerning the noun-verb ratio in confrontation naming, picture description, and picture-word matching tasks. A total of ten persons with aphasia (PWA) and ten age-matched neurotypical individuals (NTI) were recruited for the study. The research includes both behavioural and objective measures to assess word-class deficits in PWA. Objective: The main objective of the research is to identify word-class deficits seen in persons with aphasia using various speech-eliciting tasks. Method: The study was conducted in the participants' L1, which was Kannada. The Action Naming Test and the Boston Naming Test, adapted to Kannada, were administered to the participants, and a picture description task was also carried out. The picture-word matching task was carried out using E-Prime software (version 2) to measure accuracy and reaction time in the identification of verbs and nouns. The stimuli were presented in auditory and visual modes. Data were analysed to identify errors in the naming of nouns versus verbs on the Boston Naming Test and the Action Naming Test, as well as the usage of nouns and verbs in the picture description task. Reaction time and accuracy for picture-word matching were extracted from the software. Results: PWA showed a significant difference in sentence structure compared to age-matched NTI. PWA also showed impairment in syntactic measures in the picture description task, with fewer grammatically correct sentences and fewer correct uses of verbs and nouns, and they produced a greater proportion of nouns than verbs. PWA had poorer accuracy and lesser reaction time in the picture-word matching task compared to NTI, and accuracy was higher for nouns than for verbs in PWA. The deficits were observed irrespective of the cause of the aphasia.

Keywords: nouns, verbs, aphasia, naming, description

Procedia PDF Downloads 74

1405 Lexical-Semantic Processing by Chinese as a Second Language Learners

Authors: Yi-Hsiu Lai

Abstract:

The present study aimed to elucidate lexical-semantic processing in learners of Chinese as a second language (CSL). Twenty L1 speakers of Chinese and twenty CSL learners in Taiwan participated in a picture naming task and a category fluency task. Based on their Chinese proficiency levels, the CSL learners were further divided into two sub-groups: ten learners at the elementary level and ten learners at the intermediate level. The instruments for the naming task were sixty black-and-white pictures: thirty-five object pictures and twenty-five action pictures. Object pictures were divided into two categories: living objects and non-living objects. Action pictures were composed of two categories: action verbs and process verbs. As in the naming task, the category fluency task consisted of two semantic categories: objects (i.e., living and non-living objects) and actions (i.e., action and process verbs). Participants were asked to report as many items within a category as possible in one minute. Oral productions were tape-recorded and transcribed for further analysis. Both error types and error frequency were calculated. Statistical analysis was then conducted to examine the types and frequency of errors made by CSL learners. Additionally, category effects, pictorial effects, and L2 proficiency were discussed. The findings of the present study helped characterize the lexical-semantic process of Chinese naming in CSL learners of different proficiency levels and contribute to future Chinese vocabulary teaching and learning.

Keywords: lexical-semantic processing, Mandarin Chinese, naming, category effects

Procedia PDF Downloads 427

1404 Gender Differences in the Descriptions of Shape

Authors: Shu-Feng Chang

Abstract:

In recent years, gender issues have been discussed in many fields. Gender gives rise to differences not only in the physical domain but also in the mental domain. Gender differences also appear in daily life, especially in spoken communication. This has been demonstrated for descriptions of color. However, there has been less research on descriptions of shape. The purpose of this study was to determine whether descriptions of shape differ by gender and, if so, whether the difference resembles or departs from the conclusions reached for color. Data were collected on shape descriptions from 15 female and 15 male participants, who described five pictures. The results showed that descriptions of shape did differ by gender. The findings for shape descriptions were almost the same as those for color naming with respect to the gender factor.

Keywords: gender, naming, shape, sociolinguistics

Procedia PDF Downloads 520

1403 Phonological Processing and Its Role in Pseudo-Word Decoding in Children Learning to Read Kannada Language between 5.6 to 8.6 Years

Authors: Vangmayee. V. Subban, Somashekara H. S, Shwetha Prabhu, Jayashree S. Bhat

Abstract:

Introduction and Need: Phonological processing is critical in learning to read alphabetical and non-alphabetical languages. However, its role in learning to read Kannada, an alphasyllabary, is equivocal. The literature has focused on the developmental role of phonological awareness in reading. To the best of the authors' knowledge, the roles of phonological memory and phonological naming have not been addressed in the alphasyllabary Kannada language. Therefore, there is a need to evaluate comprehensively the role of phonological processing skills in Kannada in word decoding during the early years of schooling. Aim and Objectives: The present study aimed to explore phonological processing abilities and their role in learning to decode pseudowords in children learning to read Kannada during the initial years of formal schooling, between 5.6 and 8.6 years. Method: In this cross-sectional study, 60 typically developing Kannada-speaking children, 20 each from Grade I, Grade II, and Grade III, in the age ranges of 5.6 to 6.6 years, 6.7 to 7.6 years, and 7.7 to 8.6 years respectively, were selected from Kannada-medium schools. Phonological processing abilities were assessed using an assessment tool specifically developed to address the objectives of the present research. The assessment tool was content-validated by subject experts and had good inter- and intra-subject reliability. Phonological awareness was assessed at the syllable level using syllable segmentation, blending, and syllable stripping at initial, medial, and final positions. Phonological memory was assessed using a pseudoword repetition task, and phonological naming was assessed using rapid automatized naming of objects. Both phonological awareness and phonological memory measures were scored for accuracy of response, whereas Rapid Automatized Naming (RAN) was scored for total naming speed.
Results: Comparison of mean scores using one-way ANOVA revealed a significant difference (p ≤ 0.05) between the groups on all measures of phonological awareness, pseudoword repetition, rapid automatized naming, and pseudoword reading. Subsequent post-hoc grade-wise comparison using the Bonferroni test revealed significant differences (p ≤ 0.05) between each of the grades for all tasks, except for syllable blending, syllable stripping, and pseudoword repetition between Grade II and Grade III (p ≥ 0.05). Pearson correlations revealed a highly significant positive correlation (p=0.000) between all the variables except phonological naming, which had significant negative correlations. However, the correlation coefficient was higher for phonological awareness measures than for the others. Hence, phonological awareness was chosen as the first independent variable to enter the hierarchical regression equation, followed by rapid automatized naming and, finally, pseudoword repetition. The regression analysis revealed syllable awareness as the single most significant predictor of pseudoword reading, explaining a unique variance of 74%, and there was no significant change in R² when RAN and pseudoword repetition were subsequently added to the regression equation. Conclusion: The present study concluded that syllable awareness matures completely by Grade II, whereas phonological memory and phonological naming continue to develop beyond Grade III. Among phonological processing skills, phonological awareness, especially syllable awareness, is more crucial for word decoding than phonological memory and naming during the initial years of schooling.

Keywords: phonological awareness, phonological memory, phonological naming, phonological processing, pseudo-word decoding

Procedia PDF Downloads 141

1402 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework that helps users search for and retrieve the portions of a lecture video that interest them. This is achieved by temporally segmenting and indexing the lecture video using topic keywords. For this purpose, we use text transcribed from the video together with documents relevant to the video topic extracted from the web. The keywords for indexing are found by applying non-negative matrix factorization (NMF) topic modeling to the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end times of the portions of the video covering a particular topic. This time information is stored in the index table along with the topic keyword, which is used to retrieve the specific portions of the video for the query provided by the user.
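The index table described above can be sketched as a mapping from each topic keyword to the time intervals in which it occurs. This is a simplified illustration under stated assumptions: the transcript is given as time-stamped segments sorted by start time, the keywords are supplied directly rather than derived by NMF, and the names `build_index` and `segments` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[float, float, str]   # (start_sec, end_sec, transcript text)
Interval = Tuple[float, float]       # (start_sec, end_sec)


def build_index(
    segments: Sequence[Segment], keywords: Sequence[str]
) -> Dict[str, List[Interval]]:
    """Map each keyword to the merged time intervals where it is spoken.

    Assumes `segments` is sorted by start time. Touching or overlapping
    matches are merged into a single interval.
    """
    index: Dict[str, List[Interval]] = {}
    for keyword in keywords:
        spans: List[Interval] = []
        for start, end, text in segments:
            if keyword.lower() not in text.lower().split():
                continue
            if spans and start <= spans[-1][1]:
                # Extend the previous interval instead of adding a new one.
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((start, end))
        index[keyword] = spans
    return index


segments = [
    (0, 10, "intro to matrix factorization"),
    (10, 20, "matrix rank and norms"),
    (20, 30, "gradient descent basics"),
    (30, 40, "matrix updates"),
]
index = build_index(segments, ["matrix", "gradient"])
```

A query for a keyword then reduces to a dictionary lookup that returns the start and end times to play back.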

Keywords: video indexing and retrieval, lecture videos, content based video search, multimodal indexing

Procedia PDF Downloads 204

1401 Off-Topic Text Detection System Using a Hybrid Model

Authors: Usama Shahid

Abstract:

Be it written documents, news columns, or students' essays, verifying content can be a time-consuming task. Apart from spelling and grammar mistakes, the proofreader is also expected to verify whether the content included in the essay or document is relevant. Irrelevant content in a document or essay is referred to as off-topic text, and in this paper we address the problem of detecting off-topic text in a document using machine learning techniques. Our study aims to identify off-topic content in a document using an echo state network model, and we also compare the results with other models. A previous study used Convolutional Neural Networks and TF-IDF to detect off-topic text. We rearrange the existing datasets and apply new classifiers along with new word embeddings to existing and new datasets in order to compare the results with the existing CNN model.

Keywords: off topic, text detection, echo state network, machine learning

Procedia PDF Downloads 51

1400 Topic-to-Essay Generation with Event Element Constraints

Authors: Yufen Qin

Abstract:

Topic-to-essay generation is a challenging task in natural language processing that aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach to topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results on a real dataset demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in subjective evaluation, indicating its capability to generate higher-quality topic-related text.

Keywords: event element, language model, natural language processing, topic-to-essay generation

Procedia PDF Downloads 191

1399 Trend Detection Using Community Rank and Hawkes Process

Authors: Shashank Bhatnagar, W. Wilfred Godfrey

Abstract:

In this paper, we develop an approach to finding trending topics that considers not only the user-topic interaction but also the community to which the user belongs. This method extends the previous user-topic interaction approach to a user-community-topic interaction, with a speed-up in the range of [1.1-3]. We assume that trend detection in a social network depends on two things. The first is the broadcast of messages in the social network, governed by a self-exciting point process called the Hawkes process; the second is Community Rank. The influencer node links to others in the community, and the community rank is determined by the influencer's PageRank and the number of users linked to that community. The community rank determines the influence of one community over another. Hence, the Hawkes process with a user-community-topic kernel determines the trending topic disseminated through the social network.
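To illustrate what "self-exciting" means, the sketch below evaluates the conditional intensity of a univariate Hawkes process. It assumes the commonly used exponential kernel, lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)). The abstract does not specify the kernel, so this is not the user-community-topic kernel of the paper, and the parameter names are illustrative.

```python
import math
from typing import Sequence


def hawkes_intensity(
    t: float, events: Sequence[float], mu: float, alpha: float, beta: float
) -> float:
    """Conditional intensity of a Hawkes process with an exponential kernel.

    mu    : baseline rate of events
    alpha : jump in intensity caused by each past event
    beta  : exponential decay rate of that excitation
    Only events strictly before time t contribute.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in events if t_i < t
    )
    return mu + excitation


# Each earlier message raises the rate of further messages for a while.
rate = hawkes_intensity(2.0, [1.0, 3.0], mu=0.5, alpha=1.0, beta=1.0)
```

With no past events the intensity equals the baseline `mu`; each event adds a bump that decays at rate `beta`.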

Keywords: community detection, community rank, Hawkes process, influencer node, pagerank, trend detection

Procedia PDF Downloads 350

1398 Investigating Naming and Connected Speech Impairments in Moroccan AD Patients

Authors: Mounia El Jaouhari, Mira Goral, Samir Diouny

Abstract:

Introduction: Previous research has indicated that language impairments are a recognized feature of many neurodegenerative disorders, including non-language-led dementia subtypes such as Alzheimer's disease (AD). In this preliminary study, the main aim is to quantify the semantic content of naming and connected speech samples of Moroccan patients diagnosed with AD using two tasks taken from the culturally adapted and validated Moroccan version of the Boston Diagnostic Aphasia Examination. Methods: Five individuals with AD and five neurologically healthy individuals matched for age, gender, and education will participate in the study. Participants with AD will be diagnosed on the basis of the Moroccan version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-4) screening test, the Moroccan version of the Mini Mental State Examination (MMSE) test scores, and neuroimaging analyses. The participants will engage in two tasks taken from the MDAE-SF: 1) picture description and 2) naming. Expected findings: Consistent with previous studies conducted on English-speaking AD patients, we expect to find significant word production and retrieval impairments in AD patients on all measures. Moreover, we expect to find category fluency impairments that further support semantic breakdown accounts. In sum, the findings of the current study will not only shed more light on the locus of the word retrieval impairments noted in AD but also reflect the nature of Arabic morphology. In addition, the error patterns are expected to be similar to those found in previous AD studies in other languages.

Keywords: Alzheimer's disease, anomia, connected speech, semantic impairments, Moroccan Arabic

Procedia PDF Downloads 116

1397 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms, where users can share their opinions on different subjects. As of 2010, the Twitter platform generated more than 12 terabytes of data daily, or about 4.3 petabytes in a single year. For this reason, Twitter is a great source for mining big data. Many industries, such as telecommunication companies, can leverage the availability of Twitter data to better understand their markets and make appropriate business decisions. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The results obtained are benchmarked against another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics from a Twitter dataset containing user tweets about South African telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results, with topic coherence higher by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 119

1396 Street Naming and Property Addressing Systems for New Development in Ghana: A Case Study of Nkawkaw in the Kwahu West Municipality

Authors: Jonathan Nii Laryea Ashong, Samuel Opare

Abstract:

The current sustainable cities debate focuses on the formidable problems of Ghana’s largest urban and rural agglomerations, yet the majority of urban dwellers continue to reside in far smaller urban settlements. It is estimated that by the year 2030, almost all of Ghana’s population growth will be concentrated in urban areas, including Nkawkaw in the Kwahu West Municipality of Ghana. Nkawkaw is situated on the road and former railway between Accra and Kumasi and lies about halfway between these cities. It is also connected by road to Koforidua and Konongo. According to the 2013 census, Nkawkaw has a settlement population of 61,785. Many international agencies, government bodies, and private architects are being asked to adequately recognize the naming of streets and the property addressing system among the 170 districts across Ghana. The naming of streets and numbering of properties is intended to assist Metropolitan, Municipal and District Assemblies in managing the processes for establishing a coherent national address system. Street addressing in Nkawkaw in the Kwahu West Municipality makes it possible to identify the location of a parcel of land, public place, or dwelling on the ground based on a system of names and numbers, yet agreement on how to progress towards it remains elusive. Therefore, reliable and effective development control for proper street naming and property addressing systems is required. The Intelligent Addressing (IA) technology from the UK is being used to name streets and properties in Ghana. Intelligent addressing employs a Unique Property Reference Number and a Unique Street Reference Number, which would transform the ability of national security and other service providers to respond rapidly to distress calls.
Where a name change is warranted following the review of existing street names, the Physical Planning Departments (PPDs) shall, in consultation with the relevant traditional authorities and community leadership (or relevant major stakeholders), select a street name in accordance with the provisions of the policy and the processes outlined for street name changes for new development. In the case of existing streets with no names, the respective PPDs shall, in consultation with the relevant traditional authorities and community leadership (or relevant major stakeholders), select a street name in accordance with the requirements set out in the municipality. Naming of access ways proposed for new developments shall be done at the time of developing sector layouts (subdivision maps) for the designated areas. In the case of private gated developments, the developer shall submit the names of the access ways as part of the plan and other documentation forwarded to the Municipal District Assembly for approval. The names shall be reviewed first by the PPD to avoid duplication and to ensure conformity to the required standards before submission to the Assembly’s Statutory Planning Committee for approval. The Kwahu West Municipality is supposed to be self-sustaining, providing basic services to inhabitants as a result of the proper planning layouts, street naming, and property addressing system that prevail in the area. The implications of these future projections are discussed.

Keywords: Nkawkaw, Kwahu west municipality, street naming, property, addressing system

Procedia PDF Downloads 467

1395 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with the automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects and arranged them into groups according to their culture. We arranged each group into pairs, and each pair conversed about different topics. A state-of-the-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% accuracy from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subjects’ personality traits) are a major source of variation. When removing these traits from the culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from the culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 509

1394 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform where millions of users share their attitudes, views, and opinions daily. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in Twitter data is an effective and computationally efficient way to analyze a large set of tweets and find a set of topics. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extracting and analyzing useful information from unstructured text with two approaches, LDA topic modelling and sentiment analysis, by examining English plain-text Twitter data. These two methods allow people to mine data more effectively and efficiently. The LDA topic model and sentiment analysis can also be applied to provide insights in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 149

1393 Semantic Processing in Chinese: Category Effects, Task Effects and Age Effects

Authors: Yi-Hsiu Lai

Abstract:

The present study aimed to elucidate the nature of semantic processing in Chinese. Language and cognition in relation to aging are examined through picture naming and category fluency tasks. Twenty Chinese-speaking adults (ranging from 25 to 45 years old) and twenty Chinese-speaking seniors (ranging from 65 to 75 years old) in Taiwan participated in this study. Each of them individually completed two tasks: a picture naming task and a category fluency task. The instruments for the naming task were sixty black-and-white pictures: thirty-five object pictures and twenty-five action pictures. The category fluency task also consisted of two semantic categories: objects (or nouns) and actions (or verbs). Participants were asked to report as many items within a category as possible in one minute. The scores for action fluency and object fluency were the sums of correct responses in these two categories. Category effects (actions vs. objects) and age effects were examined in these tasks. Objects were further divided into two major types: living objects and non-living objects. Actions were also categorized into two major types: action verbs and process verbs. Reaction time to each picture/question was additionally calculated and analyzed. Results of the category fluency task indicated that the information content of Chinese seniors had comparatively deteriorated, resulting in a smaller number of semantic-lexical items. A significant group difference was also found in reaction time. The category effect was significant for both Chinese adults and seniors in the semantic fluency task. The findings of the present study helped characterize the nature of semantic processing in Chinese-speaking adults and seniors and contributed to the issue of language and aging.

Keywords: semantic processing, aging, Chinese, category effects

Procedia PDF Downloads 331

1392 The name of Thai Muslim students: The Reflection of value and Identity of Thai Muslim

Authors: Apichaya Kaewuthai

Abstract:

This study examines the meanings of Muslim names in order to analyse the underlying values and identity of first-year to fourth-year Muslim students at Prince of Songkla University, Hatyai Campus. Questionnaires were the main tool for collecting names from 80 Muslim students across the four years of study. The meanings of the collected names were then analysed and summarized on the basis of related documents to uncover the underlying values. The study reveals that male names derive from the name of the prophet, Nabi Muhammad, and from merit, dignity, origins, leadership and faith in Islam. Female names, on the other hand, relate to virtue and beauty, cleanliness and peace, hope and flowers, in keeping with their characteristics. One reason behind these naming principles is the regulation of the Ministry of Culture, which states that a name should represent one’s nature and character. The given names reflect Muslim values and identity, which can be classified into three categories: 1) values related to belief in Islam, 2) values related to relationships among family and relatives, and 3) values related to the relationship with nature and the environment. All of the above vividly reflect Muslim values and identity. The names of Muslim students allow the researcher to perceive the perspectives, beliefs and values involved in name-giving among Thai Muslims. They also reveal social conditions and culture, and the study can serve as a basis for studying the meaning of names in other races.

Keywords: the naming, Thai Muslim, culture, economic

Procedia PDF Downloads 283

1391 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study

Authors: Jianhua Wang

Abstract:

To obtain key information quickly from long texts on the small screens of mobile devices, sight-interpreters need an optimized summarization model for fast information retrieval. Four summarization models based on previous studies were examined: title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS) and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts and key words+topic sentences (KWTS) for political texts.

Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters

Procedia PDF Downloads 280

1390 Linguistic Devices Reflecting Violence in Border–Provinces of Southern Thailand on the Front Page of Local and National Newspapers

Authors: Chanokporn Angsuviriya

Abstract:

The objective of the study is to analyse linguistic devices reflecting the violence in the southern border provinces, namely Pattani, Yala, Narathiwat and Songkla, on 1,344 front pages of three local newspapers (ChaoTai, Focus PhakTai and Samila Time) and two national newspapers (ThaiRath and Matichon), between 2004 and 2005 and between 2011 and 2012. The study shows that there are two important linguistic devices: 1) lexical choices, consisting of verbs describing violence, quantitative words, and words naming those who committed violent acts, and 2) metaphors, consisting of “a violent problem is heat”, “a victim is a leaf”, and “a terrorist is a dog”. Comparing the two types of newspapers, national newspapers choose more violent words than local newspapers do. Moreover, they create more negative images of the south of Thailand by using stative verbs. In addition, in terms of metaphors, “a terrorist is a fox” is found only in national newspapers. As for naming terrorists, the noun phrase “southern insurgents”, used collectively by national newspapers, has a strongly negative meaning. Moreover, “southern insurgents” is the term known to Thais across the whole country, while the unmodified “insurgents” is used only by local newspapers.

Keywords: linguistic devices, local newspapers, national newspapers, violence

Procedia PDF Downloads 207

1389 Visualization and Performance Measure to Determine Number of Topics in Twitter Data Clustering Using Hybrid Topic Modeling

Authors: Moulana Mohammed

Abstract:

Topic models have been widely used to build clusters of documents for more than a decade, yet choosing the optimal number of topics remains a problem. The main difficulty is the lack of a stable metric for the quality of the topics obtained when constructing topic models. From an analysis of previous work, the authors found that most models used to determine the number of topics are non-parametric and judge topic quality with perplexity and coherence measures, and concluded that these are not applicable to this problem. In this paper, we use a parametric method, an extension of the traditional topic model with visual access tendency, to visualize the number of topics (clusters) as a complement to clustering and to choose the optimal number of topics based on cluster validity indices. The developed hybrid topic models are demonstrated on Twitter datasets covering various topics, both in obtaining the optimal number of topics and in measuring the quality of clusters. The experimental results show that the Visual Non-negative Matrix Factorization (VNMF) topic model performs well in determining the optimal number of topics with interactive visualization and in measuring cluster quality with validity indices.
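As a rough illustration of choosing the number of clusters with a validity index, the sketch below scores two candidate labellings with the silhouette coefficient, one commonly used index. It is not the VNMF model described in the abstract: the function name `silhouette`, the toy `points` and the candidate labellings are all hypothetical, and the abstract does not say which validity indices the authors used.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient of a labelling (needs at least 2 clusters)."""
    n = len(points)
    scores = []
    for i in range(n):
        # Accumulate distances from point i to the members of each cluster.
        sums, counts = {}, {}
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            sums[labels[j]] = sums.get(labels[j], 0.0) + d
            counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own not in counts:
            scores.append(0.0)  # singleton cluster: score 0 by convention
            continue
        a = sums[own] / counts[own]                              # cohesion
        b = min(sums[c] / counts[c] for c in sums if c != own)   # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / n

# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Hypothetical candidate labellings for k = 2 and k = 3 clusters.
candidates = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 1, 2, 2, 2]}

scores = {k: silhouette(points, lab) for k, lab in candidates.items()}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

In practice the candidate labellings would come from running the topic model once per candidate number of topics; the index then gives a single comparable score per run.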

Keywords: interactive visualization, visual non-negative matrix factorization model, optimal number of topics, cluster validity indices, Twitter data clustering

Procedia PDF Downloads 106

1388 Examining the Effects of Increasing Lexical Retrieval Attempts in Tablet-Based Naming Therapy for Aphasia

Authors: Jeanne Gallee, Sofia Vallila-Rohter

Abstract:

Technology-based applications are increasingly being utilized in aphasia rehabilitation as a means of increasing intensity of treatment and improving accessibility to treatment. These interactive therapies, often available on tablets, lead individuals to complete language and cognitive rehabilitation tasks that draw upon skills such as the ability to name items, recognize semantic features, count syllables, rhyme, and categorize objects. Tasks involve visual and auditory stimulus cues and provide feedback about the accuracy of a person’s response. Research has begun to examine the efficacy of tablet-based therapies for aphasia, yet much remains unknown about how individuals interact with these therapy applications. Thus, the current study aims to examine the efficacy of a tablet-based therapy program for anomia, further examining how strategy training might influence the way that individuals with aphasia engage with and benefit from therapy. Individuals with aphasia are enrolled in one of two treatment paradigms: traditional therapy or strategy therapy. For ten weeks, all participants receive 2 hours of weekly in-house therapy using Constant Therapy, a tablet-based therapy application. Participants are provided with iPads and are additionally encouraged to work on therapy tasks for one hour a day at home (home logins). For those enrolled in traditional therapy, in-house sessions involve completing therapy tasks while a clinician researcher is present. For those enrolled in the strategy training group, in-house sessions focus on limiting cue use in order to maximize lexical retrieval attempts and naming opportunities. The strategy paradigm is based on the principle that retrieval attempts may foster long-term naming gains. Data have been collected from 7 participants with aphasia (3 in the traditional therapy group, 4 in the strategy training group). We examine cue use, latency of responses and accuracy through the course of therapy, comparing results across group and setting (in-house sessions vs. home logins).

Keywords: aphasia, speech-language pathology, traumatic brain injury, language

Procedia PDF Downloads 170

1387 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google BERT) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 88

1386 Examining the Function of Containers and Determining Lexical Indices for the Shapes of Pottery and the Poems Written on Them from the End of the 3rd Century to the End of the 8th Century

Authors: Mohadese Sookhtesaraii, Abed Taghavi, Kosar Sookhtesaraii

Abstract:

Pottery has always drawn human attention for its practical functions. With the passage of time, human development, and the advance of writing, inscriptions began to appear on pottery vessels. Beyond their use, important issues concerning these vessels are their names and the relationship between name and function. The names differ according to the vessels' appearance and manner of use. Accordingly, the names of these vessels are classified by the dictionary meanings of the words. Poetic works contain many names of these vessels, which indicates their importance and use. The vessels studied here were selected because their names occur more frequently in poetry and prose. For a more precise analysis of pottery forms, the study emphasizes dictionary meanings and the names found in the works of poets and writers. Moreover, the vessels bear poetry more often than prose, so their aesthetic aspect can be studied separately from their meaning. Vessel names such as Chamaneh and Satgini are clearly defined in dictionaries as drinking vessels, while the Khonb was used for storage. Use can therefore serve as a basis for classification. Size and capacity also lead to differences in naming: the Khom and the Khonb, for example, are the same in form but differ in capacity and size. The meanings of the inscriptions on these vessels were also studied: in addition to prayer phrases, they express love, invite the reader to drink and enjoy, or reflect on the brevity of human life.

Keywords: pialeh, sajegni, khomre, pottery

Procedia PDF Downloads 26

1385 Analysis of Trends in Environmental Health Research Using Topic Modeling

Authors: Hayoung Cho, Gabi Cho

Abstract:

In response to the continuing increase in demand for a safe living environment, the Korean government has established and implemented various environmental health policies and given high priority to related R&D. However, the level of related technologies such as environmental risk assessment is still relatively low, and detailed investment strategies are needed in the field of environmental health research. As scientific research papers can offer valuable insight into the development of a field, this study analyzed global research trends in environmental health over the past 10 years (2005~2015). Research topics were extracted from the abstracts of the collected SCI papers using topic modeling to study changes in research trends and discover emerging technologies. Topic modeling can improve on the traditional bibliometric approach and provide a more comprehensive review of global research development. The results of this study are expected to provide insights for effective policy making and the direction of R&D investment.

Keywords: environmental health, paper analysis, research trends, topic modeling

Procedia PDF Downloads 260

1384 Effects of Topic Familiarity on Linguistic Aspects in EFL Learners’ Writing Performance

Authors: Jeong-Won Lee, Kyeong-Ok Yoon

Abstract:

The current study aimed to investigate the effects of topic familiarity and language proficiency on linguistic aspects (lexical complexity, syntactic complexity, accuracy, and fluency) in EFL learners’ argumentative essays. For the study, 64 college students were asked to write an argumentative essay on each of two topics (Driving and Smoking) chosen for their differing familiarity. The students were divided into two language proficiency groups (high-level and intermediate) according to their English writing proficiency. The findings of the study are as follows: 1) the participants exhibited lower levels of lexical and syntactic complexity as well as accuracy when performing writing tasks with unfamiliar topics; and 2) the high-level participants used a wider range of vocabulary and longer, more complex structures, and produced more accurate and lengthier texts than their intermediate peers. Discussion and pedagogical implications for writing instruction in EFL contexts are addressed.

Keywords: topic familiarity, complexity, accuracy, fluency

Procedia PDF Downloads 18

1383 Artificial Intelligence Assisted Sentiment Analysis of Hotel Reviews Using Topic Modeling

Authors: Sushma Ghogale

Abstract:

With the surge in user-generated content, feedback and reviews on the internet, it has become both possible and important to know consumers' opinions about products and services. This data is important for both potential customers and businesses providing the services. Data from social media is attracting significant attention and has become the most prominent channel for expressing unregulated opinion. Prospective customers look for reviews from experienced customers before deciding to buy a product or service. Several websites provide a platform for users to post their feedback for the provider and potential customers. However, the biggest challenge in analyzing such data is extracting latent features and providing term-level analysis of the data. This paper proposes an approach that uses topic modeling to classify the reviews into topics and sentiment analysis to mine the opinions. The approach can analyse and classify latent topics mentioned by reviewers on business sites, review sites or social media, using topic modeling to identify the importance of each topic. Sentiment analysis then assesses the satisfaction level for each topic. The approach classifies hotel reviews using multiple machine learning techniques and compares different classifiers in mining the opinions of user reviews through sentiment analysis. The experiment concludes that the Multinomial Naïve Bayes classifier produces higher accuracy than the other classifiers.
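For readers unfamiliar with the classifier named above, the following is a minimal sketch of multinomial Naïve Bayes with Laplace smoothing on word counts. It is an illustration only, not the paper's pipeline: the function names, the whitespace tokenizer, the smoothing value `alpha=1.0` and the four toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Collect per-class document and word counts from labelled texts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()   # naive whitespace tokenizer
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_docs": class_docs, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(docs)}

def predict(model, text):
    """Return the class with the highest log posterior for `text`."""
    best_label, best_score = None, -math.inf
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    for label, n in model["class_docs"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n / model["n_docs"])   # log prior
        for tok in text.lower().split():
            if tok not in model["vocab"]:
                continue                        # skip words unseen in training
            # Laplace-smoothed log likelihood of the token given the class.
            score += math.log((counts[tok] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy reviews, invented for illustration.
reviews = ["clean room friendly staff", "great location clean room",
           "dirty room rude staff", "noisy room poor service"]
sentiments = ["positive", "positive", "negative", "negative"]

model = train_multinomial_nb(reviews, sentiments)
print(predict(model, "friendly staff great location"))
print(predict(model, "dirty room poor service"))
```

The same scoring could be applied per topic once reviews have been assigned to topics, which is one way to read the two-stage design the abstract describes.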

Keywords: latent Dirichlet allocation, topic modeling, text classification, sentiment analysis

Procedia PDF Downloads 72

1382 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we performed topic analysis on unstructured text data distributed through news articles. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates changes in the social issues that are identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents covering an entire period at once and then analyzing the occurrence of each topic by period. However, this traditional approach has the limitation that it cannot discover the dynamic mutation process of complex social issues. The purpose of this study is to overcome the limitations of the existing issue tracking method. We first derive the core issues of each period and then discover the dynamic mutation process of various issues. We further analyze the mutation process from the perspective of issue categories in order to identify patterns of issue flow, including the frequency and reliability of each pattern. In other words, this study allows us to understand the components of complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing the mutation history and related category information of the phenomena.
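One simple way to picture linking per-period issues into a transition history is sketched below. It assumes the core issues of each period have already been extracted as keyword lists, and links each issue to its most similar successor by Jaccard overlap. The similarity measure, the `threshold` value, the function names and the sample issues are assumptions for illustration; the abstract does not specify how the authors link issues.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def link_issues(periods, threshold=0.3):
    """Link each issue to its most similar issue in the following period.

    `periods` is a list of dicts mapping an issue name to its keywords.
    Returns (period_index, issue, next_issue, similarity) tuples.
    """
    links = []
    for t in range(len(periods) - 1):
        for name, words in periods[t].items():
            best = max(periods[t + 1].items(),
                       key=lambda item: jaccard(words, item[1]),
                       default=None)
            if best is None:
                continue                # next period has no issues
            sim = jaccard(words, best[1])
            if sim >= threshold:        # keep only sufficiently similar pairs
                links.append((t, name, best[0], sim))
    return links

# Hypothetical issues for two consecutive periods.
periods = [
    {"flood": ["rain", "flood", "damage", "rescue"],
     "election": ["vote", "candidate", "poll"]},
    {"recovery": ["flood", "damage", "insurance", "aid"],
     "campaign": ["candidate", "poll", "debate", "vote"]},
]

links = link_issues(periods)
for t, src, dst, sim in links:
    print(f"period {t}: {src} -> {dst} (similarity {sim:.2f})")
```

Counting how often a given category-to-category link recurs across many periods would give a frequency for each flow pattern, in the spirit of the pattern analysis mentioned above.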

Keywords: data mining, issue tracking, text mining, topic analysis, topic detection, trend detection

Procedia PDF Downloads 372