Search results for: text preprocessing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1462

712 Feature Selection Approach for the Classification of Hydraulic Leakages in Hydraulic Final Inspection using Machine Learning

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Manufacturing companies are facing global competition and enormous cost pressure. The use of machine learning applications can help reduce production costs and create added value. Predictive quality secures product quality through data-supported predictions, using machine learning models as a basis for decisions on test results. Furthermore, machine learning methods are able to process large amounts of data, deal with unfavourable row-column ratios, detect dependencies between the covariates and the given target, and assess the multidimensional influence of all input variables on the target. Real production data are often subject to highly fluctuating boundary conditions and unbalanced data sets. Changes in production data manifest themselves in trends, systematic shifts, and seasonal effects. Thus, machine learning applications require intensive pre-processing and feature selection. Data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets. In the real data set of Bosch hydraulic valves used here, time periods with comparable production conditions can be identified by applying the concept drift method. Furthermore, a classification model is developed to evaluate the feature importance in different subsets within the identified time periods. By selecting comparable and stable features, the number of features used can be significantly reduced without a strong decrease in predictive power. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. In this research, the AdaBoost classifier is used to predict the leakage of hydraulic valves based on geometric gauge blocks from machining, mating data from the assembly, and hydraulic measurement data from end-of-line testing.
In addition, the most suitable methods are selected and accurate quality predictions are achieved.
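
The idea of keeping only features that remain important across comparable time periods can be sketched as a simple stability filter. This is a minimal illustration, not the authors' procedure: the feature names, the importance scores, and the `min_importance` threshold are all hypothetical, and the scores are assumed to come from a classifier trained separately on each period.

```python
def stable_features(importances_by_period, min_importance=0.05):
    """Return the features whose importance is >= min_importance in every period.

    importances_by_period maps a period name to a dict {feature: importance}.
    The threshold rule is an assumption made for illustration.
    """
    periods = list(importances_by_period.values())
    # Only features that were scored in every period can be compared.
    shared = set(periods[0]).intersection(*periods[1:])
    return sorted(
        feature
        for feature in shared
        if all(period[feature] >= min_importance for period in periods)
    )


# Hypothetical importance scores for three time periods (not data from the paper).
importances = {
    "period_1": {"bore_diameter": 0.40, "mating_force": 0.35, "oil_temp": 0.25},
    "period_2": {"bore_diameter": 0.45, "mating_force": 0.50, "oil_temp": 0.05},
    "period_3": {"bore_diameter": 0.38, "mating_force": 0.60, "oil_temp": 0.02},
}

selected = stable_features(importances, min_importance=0.10)
print(selected)
```

In this toy input, `oil_temp` is dropped because its importance collapses in the later periods, while the other two features stay above the threshold throughout.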

Keywords: classification, machine learning, predictive quality, feature selection

Procedia PDF Downloads 162
711 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method for applying Independent Topic Analysis (ITA) to a growing collection of documents. The volume of document data has been increasing since the spread of the Internet. ITA was proposed as one method for analyzing document data: it extracts independent topics from the documents using Independent Component Analysis (ICA), a technique from signal processing. However, ITA is difficult to apply to a growing collection because it must use all of the document data, so its temporal and spatial costs are very high. Therefore, we present Incremental ITA, which extracts independent topics from a growing collection of documents. Incremental ITA updates the independent topics when documents are added, starting from the topics extracted from the preceding data. We also show the results of applying Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 309
710 Your Second Step on Research Method: Applied Linguistic Perspective

Authors: Sadeq Al Yaari, Ayman Al Yaari, Adham Al Yaari, Montaha Al Yaari, Aayah Al Yaari, Sajedah Al Yaari

Abstract:

Aims: To summarize and critically review the included articles in order to investigate their research ethics. The study also tests the hypothesis, identifying causal relationships, associations between variables, and differences between or among groups of participants. Design: This is a quasi-experimental study in which scientific models were included. It starts from the ideas before the researchers draw up the questions, formulate the hypothesis and seek solutions. The hypothesis was brief and to the point. A data collection form was constructed. The researchers made use of speculative, presumptive, stipulated and conclusive propositions. Data are statistically analyzed and visualized and are treated objectively in light of the characteristics of good research. Outcomes: Results and discussion are relevant to the statement of the problem and research objectives. Principles of ethical research were met, with the researchers ensuring high ethical standards. Types of variables are scientifically analyzed.

Keywords: research, method, analysis, speech, text

Procedia PDF Downloads 46
709 Use of Interpretable Evolved Search Query Classifiers for Sinhala Documents

Authors: Prasanna Haddela

Abstract:

Document analysis is a well matured yet still active research field, partly as a result of the intricate nature of building computational tools but also due to the inherent problems arising from the variety and complexity of human languages. Breaking down language barriers is vital in enabling access to a number of recent technologies. This paper investigates the application of document classification methods to new Sinhalese datasets. This language is geographically isolated and rich with many of its own unique features. We will examine the interpretability of the classification models with a particular focus on the use of evolved Lucene search queries generated using a Genetic Algorithm (GA) as a method of document classification. We will compare the accuracy and interpretability of these search queries with other popular classifiers. The results are promising and are roughly in line with previous work on English language datasets.
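
A search query evolved by a genetic algorithm can act as a readable classifier: a document is labelled positive if it matches the query. The sketch below illustrates that idea under strong simplifying assumptions. It uses plain Python set matching in place of Lucene, treats a query as an OR over terms, scores it by F1, and runs on a tiny invented English corpus; none of these choices is taken from the paper.

```python
import random


def f1_score(query, docs, labels):
    """F1 of the rule 'a document is positive if it contains any query term'."""
    tp = fp = fn = 0
    for words, is_positive in zip(docs, labels):
        hit = bool(query & words)
        if hit and is_positive:
            tp += 1
        elif hit and not is_positive:
            fp += 1
        elif not hit and is_positive:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def evolve_query(docs, labels, generations=30, pop_size=20, seed=0):
    """Evolve a set of OR-ed query terms with a small elitist GA."""
    rng = random.Random(seed)
    vocab = sorted(set().union(*docs))

    def fitness(query):
        return f1_score(query, docs, labels)

    # Initial population: random subsets of the vocabulary.
    population = [
        frozenset(t for t in vocab if rng.random() < 0.3) for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Uniform crossover: each term is inherited from one parent or the other.
            child = {t for t in vocab if t in (a if rng.random() < 0.5 else b)}
            # Mutation: flip one randomly chosen term in or out of the query.
            child ^= {rng.choice(vocab)}
            children.append(frozenset(child))
        population = survivors + children
    best = max(population, key=fitness)
    return best, fitness(best)


# Invented toy corpus: documents as sets of words, True = target class.
docs = [
    {"cricket", "match", "score"},
    {"election", "vote"},
    {"cricket", "team"},
    {"vote", "parliament"},
]
labels = [True, False, True, False]

best_query, best_f1 = evolve_query(docs, labels)
print(sorted(best_query), best_f1)
```

The evolved query is itself the explanation of the classifier's decisions, which is the interpretability property the abstract emphasises. A real system would replace the set intersection with an actual search engine query.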

Keywords: evolved search queries, Sinhala document classification, Lucene Sinhala analyzer, interpretable text classification, genetic algorithm

Procedia PDF Downloads 114
708 Visual Construction of Youth in Czechoslovak Press Photographs: 1959-1989

Authors: Jana Teplá

Abstract:

This text focuses on the visual construction of youth in press photographs in socialist Czechoslovakia. It deals with photographs in a magazine for young readers, Mladý svět, published by the Socialist Union of Youth of Czechoslovakia. The aim of this study was to develop a methodological tool for uncovering the values and the ideological messages in the strategies used in the visual construction of reality in the socialist press. Two methods of visual analysis were applied to the photographs, a quantitative content analysis and a social semiotic analysis. The social semiotic analysis focused on images representing youth in their free time. The study shows that the meaning of a socialist press photograph is a result of a struggle for ideological power between formal and informal ideologies. This struggle takes place within the process of production of the photograph and also within the process of interpretation of the photograph.

Keywords: ideology, press photography, socialist regime, social semiotics, youth

Procedia PDF Downloads 280
707 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, research areas such as virtual assistants and dialogue systems have received widespread attention from industry and academia. Response generation plays a crucial role in dialogue systems, so to push forward the research on this topic, this paper surveys various methods for response generation. We sort these methods into three categories. The first includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base methods, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based on knowledge and deep learning. We compare their advantages and disadvantages and point out in which ways these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: deep learning, generative, knowledge, response generation, retrieval

Procedia PDF Downloads 134
706 Cinema Reception in a Digital World: A Study of Cinema Audiences in India

Authors: Sanjay Ranade

Abstract:

Traditional film theory assumes the cinema audience in a darkened room where cinema is projected on to a white screen, and the audience suspends their sense of reality. Shifts in audiences due to changes in cultural tastes or trends have been studied for decades. In the past two decades, however, the audience, especially the youth, has shifted to digital media for the consumption of cinema. As a result, not only are audiences watching cinema on different devices, they are also consuming cinema in places and ways never imagined before. Public transport, often crowded to the brim and full of ambient content, and a variety of workplaces have become sites for cinema viewing. Cinema is watched piecemeal and at different times of the day. Audiences use devices such as mobile phones and tablets to watch cinema. The cinema viewing experience is getting redesigned by the user. The emerging design allows the spectator to not only consume images and narratives but also produce, reproduce, and manipulate existing images and narratives, thereby participating in the process and influencing it. Spectatorship studies stress the importance of subjectivity when dealing with the structure of the film text and the cultural and psychological implications in the engagement between the spectator and the film text. Indian cinema has been booming and contributing to global movie production significantly. In 2005 film production was 1000 films a year and doubled to 2000 by 2016. Digital technology helped push this growth in 2012. Film studies in India have had a decided Euro-American bias. The studies have chiefly analysed the content for ideological leanings or myth or as reflections of society, societal changes, or articulation of identity or presented retrospectives of directors, actors, music directors, etc. The one factor relegated to the background has been the spectator. When spectators have been addressed, they are treated as a collective of class or gender.
India has a performative tradition going back several centuries. How Indians receive cinema is an important aspect to study with respect to film studies. This exploratory and descriptive study looked at 162 young media students studying cinema at the undergraduate and postgraduate levels. The students, speaking as many as 20 languages amongst them, were drawn from across the country’s media schools. The study looked at nine film societies registered with the Federation of Film Societies of India (FFSI). A structured questionnaire was prepared and distributed online to the students through media teachers. The film societies were approached through the regional office of the FFSI in Mumbai. Lastly, group discussions were held in Mumbai with students and teachers of media. A group consisted of between five and twelve student participants, along with one or two teachers. All the respondents looked at themselves as spectators and shared their experiences as spectators of cinema, providing a very rich insight into Indian conditions of viewing cinema and challenges for cinema ahead.

Keywords: audience, digital, film studies, reception, reception spectatorship

Procedia PDF Downloads 131
705 Utilizing Quantum Chemistry for Nanotechnology: Electron and Spin Movement in Molecular Devices

Authors: Mahsa Fathollahzadeh

Abstract:

The quick advancement of nanotechnology necessitates the creation of innovative theoretical approaches to elucidate complex experimental findings and forecast novel capabilities of nanodevices. Therefore, over the past ten years, a difficult task in quantum chemistry has been comprehending electron and spin transport in molecular devices. This thorough evaluation presents a comprehensive overview of current research and its status in the field of molecular electronics, emphasizing the theoretical applications to various device types and including a brief introduction to theoretical methods and their practical implementation plan. The subject matter includes a variety of molecular mechanisms like molecular cables, diodes, transistors, electrical and visual switches, nano detectors, magnetic valve gadgets, inverse electrical resistance gadgets, and electron tunneling exploration. The text discusses both the constraints of the method presented and the potential strategies to address them, with a total of 183 references.

Keywords: chemistry, nanotechnology, quantum, molecule, spin

Procedia PDF Downloads 50
704 The Study of Information Uses Behaviour of Tourists in Songkhla Province, Thailand

Authors: Patraporn Kaewkhanitarak, Suchada Srichuar, Narawat Kanjanapan

Abstract:

This is a survey study. Its purpose is to examine the information-use behavior of tourists in Songkhla Province and the problems they face. The instrument was a structured questionnaire standardized on a 5-level rating scale. The 400 participants were selected by convenience sampling, with the sample size determined by the Taro Yamane method (allowable error 5%). Data were collected over six months, from January to June 2014. The results show that the type of information tourists most often use to plan their trips is the internet (x̅ = 3.81), and the most popular text concerns restaurants (x̅ = 3.77). The tourists found that booking or buying services over the internet offered more affordable prices and let them select an appropriate plan themselves. The most convenient source of information that tourists often use is the internet and websites (x̅ = 3.69). Nevertheless, they explained that tourist information sources in Songkhla Province are lacking, and that there are too few tourist organizations providing tourism-related information and services.
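
The Taro Yamane sample size formula is n = N / (1 + N·e²), where N is the population size and e the allowable error. The short sketch below evaluates it for a few population sizes. These populations are assumptions, since the abstract does not state the number of tourists. As N grows, n approaches 1/e² = 400 for e = 0.05, which is consistent with the 400 participants reported.

```python
import math


def yamane_sample_size(population, margin_of_error=0.05):
    """Taro Yamane sample size: n = N / (1 + N * e^2), rounded up."""
    if population <= 0:
        raise ValueError("population must be positive")
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# Hypothetical population sizes, chosen only to show how n saturates.
for n_population in (1_000, 10_000, 1_000_000):
    print(n_population, yamane_sample_size(n_population))
```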

Keywords: information, behavior, tourists, Thailand

Procedia PDF Downloads 253
703 Knowledge Transfer and the Translation of Technical Texts

Authors: Ahmed Alaoui

Abstract:

This paper contributes to the ongoing debate as to the relevance of translation studies to professional practitioners. It exposes the various misconceptions permeating the links between theory and practice in the translation landscape in the Arab World. It is a thesis of this paper that specialization in translation should be redefined, taking account of the fact that specialized knowledge alone is neither crucial nor sufficient in technical translation. It should be tested against the readability of the translated text, the appropriateness of its style and the usability of its content by end-users to carry out their intended tasks. The paper also proposes a preliminary model to establish a working link between theory and practice from the perspective of professional trainers and practitioners, calling for the latter to participate in the production of knowledge in a systematic fashion. While this proposal is driven by a rather intuitive conviction, a research line is needed to specify the methodological moves to establish the mediation strategies that would relate the components in the model of knowledge transfer proposed in this paper.

Keywords: knowledge transfer, misconceptions, specialized texts, translation theory, translation practice

Procedia PDF Downloads 395
702 Comparative Methods for Speech Enhancement and the Effects on Text-Independent Speaker Identification Performance

Authors: R. Ajgou, S. Sbaa, S. Ghendir, A. Chemsa, A. Taleb-Ahmed

Abstract:

The aim of a speech enhancement algorithm is to improve speech quality. In this paper, we review some speech enhancement methods and evaluate their performance based on Perceptual Evaluation of Speech Quality scores (PESQ, ITU-T P.862). All methods were evaluated in the presence of different kinds of noise using the TIMIT database and the NOIZEUS noisy speech corpus. The noise was taken from the AURORA database and includes suburban train noise, babble, car, exhibition hall, restaurant, street, airport and train station noise. Simulation results showed improved speech enhancement performance for the tracking of non-stationary noise approach in comparison with various other methods in terms of the PESQ measure. Moreover, we have evaluated the effects of the speech enhancement technique on a speaker identification system based on an autoregressive (AR) model and Mel-frequency cepstral coefficients (MFCC).

Keywords: speech enhancement, PESQ, speaker recognition, MFCC

Procedia PDF Downloads 424
701 Topic Sentiments toward the COVID-19 Vaccine on Twitter

Authors: Melissa Vang, Raheyma Khan, Haihua Chen

Abstract:

The coronavirus disease 2019 (COVID-19) pandemic has changed the lives of people all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study aims to present a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To achieve this purpose, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020, and then used the BERT model for sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can be helpful to information professionals to inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.

Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling

Procedia PDF Downloads 152
700 Multimodal Biometric Cryptography Based Authentication in Cloud Environment to Enhance Information Security

Authors: D. Pugazhenthi, B. Sree Vidya

Abstract:

Cloud computing is one of the emerging technologies that enables end users to use cloud services on a ‘pay per usage’ basis. This technology is growing at a fast pace, and so are its security threats. One of the various services provided by the cloud is storage. In this service, security is a vital factor both for authenticating legitimate users and for protecting information. This paper presents efficient ways of authenticating users as well as securing information on the cloud. The initial phase proposed in this paper deals with an authentication technique using a multi-factor and multi-dimensional authentication system with multi-level security. Unique identification and low intrusiveness make user-behaviour-based biometrics more reliable than conventional password authentication. With biometric systems, the accounts are accessed only by a legitimate user and not by an impostor. The biometric templates employed here do not include a single trait but multiple traits, viz., iris and fingerprints. The coordinating stage of the authentication system relies on an Ensemble Support Vector Machine (SVM), optimized by assembling the weights of the base SVMs for the SVM ensemble after each individual SVM of the ensemble is trained by the Artificial Fish Swarm Algorithm (AFSA). This helps in generating a user-specific secure cryptographic key from the multimodal biometric template by a fusion process. The data security problem is averted, and an enhanced security architecture is proposed using an encryption and decryption system with double key cryptography based on a Fuzzy Neural Network (FNN) for data storage and retrieval in cloud computing. The proposed scheme aims to protect the records from hackers by preventing cipher text from being broken back to the original text. This improves authentication performance: the proposed double cryptographic key scheme is capable of providing better user authentication and better security, distinguishing between genuine and fake users.
Thus, there are three important modules in this proposed work: 1) feature extraction, 2) multimodal biometric template generation and 3) cryptographic key generation. The feature and texture properties are first extracted from the respective fingerprint and iris images. Finally, with the help of a fuzzy neural network and a symmetric cryptography algorithm, a double key encryption technique has been developed. As the proposed approach is based on neural networks, it has the advantage that the data cannot be decrypted by a hacker even if they have already been stolen. The results show that the authentication process is optimal and the stored information is secured.

Keywords: artificial fish swarm algorithm (AFSA), biometric authentication, decryption, encryption, fingerprint, fusion, fuzzy neural network (FNN), iris, multi-modal, support vector machine classification

Procedia PDF Downloads 260
699 DURAFILE: A Collaborative Tool for Preserving Digital Media Files

Authors: Santiago Macho, Miquel Montaner, Raivo Ruusalepp, Ferran Candela, Xavier Tarres, Rando Rostok

Abstract:

During our lives, we generate a lot of personal information such as photos, music, text documents and videos that link us with our past. This data that used to be tangible is now digital information stored in our computers, which implies a software dependence to make them accessible in the future. Technology, however, constantly evolves and goes through regular shifts, quickly rendering various file formats obsolete. The need for accessing data in the future affects not only personal users but also organizations. In a digital environment, a reliable preservation plan and the ability to adapt to fast changing technology are essential for maintaining data collections in the long term. We present in this paper the European FP7 project called DURAFILE that provides the technology to preserve media files for personal users and organizations while maintaining their quality.

Keywords: artificial intelligence, digital preservation, social search, digital preservation plans

Procedia PDF Downloads 445
698 Evolving Knowledge Extraction from Online Resources

Authors: Zhibo Xiao, Tharini Nayanika de Silva, Kezhi Mao

Abstract:

In this paper, we present an evolving knowledge extraction system named AKEOS (Automatic Knowledge Extraction from Online Sources). AKEOS consists of two modules: a one-time learning module and an evolving learning module. The one-time learning module takes a user's input query and automatically harvests knowledge from unstructured online resources in an unsupervised way. The output of the one-time learning is a structured vector representing the harvested knowledge. The evolving learning module automatically schedules and performs repeated one-time learning to extract the newest information and track the development of an event. In addition, the evolving learning module summarizes the knowledge learned at different time points to produce a final knowledge vector about the event. With evolving learning, we are able to visualize the key information of the event, discover the trends, and track the development of the event.

Keywords: evolving learning, knowledge extraction, knowledge graph, text mining

Procedia PDF Downloads 458
697 The Relationship between Confidence, Accuracy, and Decision Making in a Mobile Review Program

Authors: Carla Van De Sande, Jana Vandenberg

Abstract:

Just like physical skills, cognitive skills grow rusty over time unless they are regularly used and practiced, so academic breaks can have negative consequences on student learning and success. The Keeping in School Shape (KiSS) program is an engaging, accessible, and cost-effective intervention that harnesses the benefits of retrieval practice by using technology to help students maintain proficiency over breaks from school by delivering a daily review problem via text message or email. A growth mindset is promoted through feedback messages encouraging students to try again if they get a problem wrong and to take on a challenging problem if they get a problem correct. This paper reports on the relationship between confidence, accuracy, and decision-making during the implementation of the KiSS Program at a large university during winter break for students enrolled in an engineering introductory Calculus course sequence.

Keywords: growth mindset, learning loss, on-the-go learning, retrieval practice

Procedia PDF Downloads 206
696 Malaysian ESL Writing Process: A Comparison with England’s

Authors: Henry Nicholas Lee, George Thomas, Juliana Johari, Carmilla Freddie, Caroline Val Madin

Abstract:

Research in comparative and international education often provides value-laden views of an education system within and in between other countries. These views are frequently used by policy makers or educators to explore similarities and differences for, among others, benchmarking purposes. In this study, a comparison is made between Malaysia and England, focusing on the writing process children went through to create a text, using a multimodal theoretical framework to analyse this comparison. The main purpose is political in nature, as it serves as an answer to Malaysia’s call for benchmarking of best practices for language learning. Furthermore, the focus on writing in this study adds to the empirical findings about early writers’ writing development and writing improvement, especially for children at the ages of 5-9. In research, comparative studies in English as a Second Language (ESL) writing pedagogy have not been done, particularly in Malaysia since the introduction of the Standard-based English Language Curriculum (KSSR) as a draft in 2011, its full implementation in 2017, and the 2018 review aligning KSSR with the CEFR. In theory, a multimodal theoretical framework allows a logical comparison between first language and ESL, which would provide useful insights to illuminate the writing process between Malaysia and England. The comparisons are not representative because of the different school systems in both countries. So far, the literature informs us that the curriculum for language learning places great emphasis on children’s linguistic abilities, which include their proficiency and mastery of the language, its conventions, and technicalities. However, recent empirical findings suggest that literacy, in its concepts and characters, needs to change. In view of this suggestion, the comparison will look at how the process of writing is implemented through the five modes of communication: linguistic, visual, aural, spatial, and gestural.
This project draws on data from Malaysia and England, involving 10 teachers, 26 classroom observations, 20 lesson plans, 20 interviews, and 20 brief conversations with teachers. The research focused upon 20 primary children of different genders aged 5-9, and in addition to primary data descriptions, 40 children’s works, 40 brief classroom conversations, 30 classroom photographs, and 30 school compound photographs were collected to investigate teachers’ and children’s use of modes and semiotic resources to design a text. The data were analysed by means of within-case analysis, cross-case analysis, and constant comparative analysis, with an initial stage of data categorisation, followed by general and specific coding, which clustered the data into thematic groups. The study highlights the importance of teachers’ and children’s engagement and interaction with various modes of communication, an adaptation from the English approaches to teaching writing within the KSSR framework and providing ‘voice’ to ESL writers to ensure that both have access to the knowledge and skills required to make decisions in developing multimodal texts and artefacts.

Keywords: comparative education, early writers, KSSR, multimodal theoretical framework, writing development

Procedia PDF Downloads 71
695 Opinion Mining and Sentiment Analysis on DEFT

Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala

Abstract:

Current research on sentiment analysis focuses on social networks. The DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially Twitter. It aims to compare the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and has proposed opinion mining tasks in several editions. The purpose of this article is to scrutinize and analyze the work on opinion mining and sentiment analysis in the Twitter social network carried out within DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Keywords: opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet

Procedia PDF Downloads 141
694 Arabic Light Stemmer for Better Search Accuracy

Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy

Abstract:

Arabic is one of the most ancient and critical languages in the world. It has more than 250 million native speakers, and more than twenty countries have Arabic as one of their official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social networks and the technology sector, which has led to the need for tools and libraries that properly handle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications, especially in the information extraction and text mining fields. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and make it available to the open source community. The presented implementation enhances the Arabic light stemmer by utilizing and improving an algorithm that provides an extension with a new set of rules and patterns, accompanied by an adjusted procedure. This study demonstrates a significant enhancement in search accuracy, with an average 10% improvement in comparison with previous works.
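
Light stemming, in general, strips a small set of frequent prefixes and suffixes without attempting to recover the root. The sketch below shows that basic idea only. The affix lists and the minimum remaining length are illustrative assumptions, and this is not the enhanced rule set or procedure proposed by the authors, which the abstract does not detail.

```python
# Illustrative affix lists (assumptions, not the authors' rules).
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي")


def light_stem(word, min_len=2):
    """Strip at most one prefix and one suffix, longest match first.

    An affix is removed only if at least min_len characters remain.
    """
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[: -len(suffix)]
            break
    return word


# "the book", "the teachers", "library"
for token in ("الكتاب", "المعلمون", "مكتبة"):
    print(token, "->", light_stem(token))
```

In a search setting, applying the same function to both indexed terms and query terms lets inflected forms of a word match each other.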

Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer

Procedia PDF Downloads 311
693 Hermeneutical Understanding of 2 Cor. 7:1 in the Light of Igbo Cultural Concept of Purification

Authors: H. E. Amolo

Abstract:

The concepts of pollution or contamination and purification or ritual cleansing are very important concepts among traditional Africans. This is because, in relation to human behaviors and attitudes, they constitute on the one hand what could be referred to as moral demands and, on the other, what results from defaulting on such demands. ‘The many taboos which a man has to observe are not to be regarded as things mechanical which do not touch the heart, but that the avoidance is a sacred law respected by the community. In breaking it, you offend the divine power’. Research has shown that Africans tenaciously hold the belief that moral values are based upon the recognition of the divine will and that sin in the community must be expelled if perfect peace is to be enjoyed. Sadly enough, these moral values are gradually eroding in contemporary times. Thus, this study proposal calls for a survey of the passage from an African cultural context: how it can enhance the understanding of the text, as well as how it can complement its scholarly interpretation, with the view of institutionalizing the concept of holiness as a means of bringing the people closer to God, and also instilling ethical purity and righteousness.

Keywords: cultural practices, Igbo ideology, purification, rituals

Procedia PDF Downloads 309
692 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively address memorization overfitting in the meta-learning MAML algorithm.
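
The idea of mapping one input to several labels across tasks can be sketched as follows. This is a minimal illustration that assumes a cyclic label shift per task; the abstract does not specify how labels are reassigned or how key data are extracted, so the function name `make_mutex_tasks` and the shifting scheme are hypothetical.

```python
def make_mutex_tasks(samples, num_classes, num_tasks):
    """Build tasks in which the same input carries a different label per task.

    samples: list of (feature, label) pairs with labels in range(num_classes).
    Task t relabels every sample as (label + t) % num_classes, so no single
    input-to-label mapping is consistent across tasks. The cyclic shift is an
    assumption chosen for illustration, not the paper's procedure.
    """
    if num_tasks > num_classes:
        raise ValueError("cyclic shifts give at most num_classes distinct tasks")
    return [
        [(x, (y + shift) % num_classes) for x, y in samples]
        for shift in range(num_tasks)
    ]


if __name__ == "__main__":
    data = [("a", 0), ("b", 1), ("c", 2)]
    for task in make_mutex_tasks(data, num_classes=3, num_tasks=3):
        print(task)
```

Because every task assigns a different label to the same input, a model cannot fit all tasks by memorizing one fixed mapping and has to rely on the support set of each task.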

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 94
691 Project and Experiment-Based Fluid Dynamics Education

Authors: Etsuo Morishita

Abstract:

This paper presents the project- and experiment-based fluid dynamics education at Meisei University, a private institution in Tokyo, Japan. We pay attention not only to the basic engineering courses but also to the practical aspect of engineering experience. To this end, we offer courses called Projects I to VI. Projects I and II are designed for the first year, III and IV for the second year, and V and VI for the third year. Each supervisor is responsible for two of these projects every year. When students take Projects V and VI in the third year, we automatically assume that they will join the lab of the project for their graduation thesis. We describe our experience with Project I in the summer term of 2016. In this project, we introduce a traction flight vehicle called Cat Flyer, a kind of kite towed by, for example, a car. This is very similar to parasailing, but flight is possible even on roads. Experiments in mechanical engineering education are also very important, and we explain our course on the centrifugal pump, Venturi, and orifice. Although these are described in detail in fluid dynamics textbooks, it is still crucial for students to carry out practical experiments.

Keywords: aerodynamics, experiment, fluid dynamics, project

Procedia PDF Downloads 260
690 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari

Abstract:

Considerable effort has been made to measure the semantic similarity between text corpora in documents, and techniques have evolved to measure the similarity of two documents. One such state-of-the-art technique in the field of Natural Language Processing (NLP) is the word-to-vector model, which converts words into word embeddings and measures the similarity between the vectors. We found this to be quite useful for the task of resume ranking. This research paper therefore implements the word2vec model along with other Natural Language Processing techniques to rank resumes against a particular job description, so as to automate the hiring process. The paper proposes the system and presents the findings made while building it.
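
Ranking by embedding similarity can be sketched as below: each document is represented by the mean of its word vectors, and resumes are sorted by cosine similarity to the job description. The tiny two-dimensional embedding table is invented for illustration, and averaging word vectors is an assumption; the abstract does not say how the custom word2vec vectors are combined or which rule-based steps are applied.

```python
import math


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens found in `embeddings` (None if none)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_resumes(job_tokens, resumes, embeddings):
    """Return (resume_name, score) pairs sorted from most to least similar."""
    job_vec = mean_vector(job_tokens, embeddings)
    scored = []
    for name, tokens in resumes.items():
        vec = mean_vector(tokens, embeddings)
        ok = job_vec is not None and vec is not None
        scored.append((name, cosine(job_vec, vec) if ok else 0.0))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy embeddings; a trained word2vec model would supply these.
    toy = {"python": [1.0, 0.0], "ml": [0.9, 0.1], "cooking": [0.0, 1.0]}
    resumes = {"resume_a": ["python"], "resume_b": ["cooking"]}
    print(rank_resumes(["python", "ml"], resumes, toy))
```

In practice the embedding table would come from a word2vec model trained on a resume and job-description corpus, with tokenization and chunking applied before the lookup.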

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 160
689 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for data collection and by the advantages of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework with a machine learning kernel based on tweet data analytics is presented. Influenza and three cities of the United Arab Emirates (Abu Dhabi, Al Ain, and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
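
For readers unfamiliar with how recall is read off a confusion matrix, a minimal sketch follows. The counts in the example are invented for illustration and are unrelated to the reported 85.595%; the abstract does not state whether the overall recall is macro-averaged, so the averaging used here is an assumption.

```python
def recall_per_class(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j.

    Recall for class i is the diagonal count divided by the row total.
    """
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


def macro_recall(confusion):
    """Unweighted mean of the per-class recalls (an assumed averaging scheme)."""
    recalls = recall_per_class(confusion)
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical two-class matrix, e.g. influenza-related vs. unrelated tweets.
    matrix = [[8, 2],
              [1, 9]]
    print(recall_per_class(matrix), macro_recall(matrix))
```

With N-fold cross-validation, one such matrix is produced per fold, and the per-fold recalls are then averaged or the matrices summed before computing recall.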

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 116
688 The Attitudes of Pre-Service Teachers towards Analytical Thinking Skill Development Based on Miller’s Model

Authors: Thassanant Unnanantn, Suttipong Boonphadung

Abstract:

This research study aimed to survey and analyze the attitudes of pre-service teachers towards analytical thinking development based on Miller’s Model. The informants were 22 third-year student teachers majoring in Thai. The instruction was conducted in the course English for Academic Purposes in Thai Language 2. The research instrument was an open-ended questionnaire with two dimensions of questions: academic and satisfaction. The investigation revealed positive attitudes. In the academic dimension, the largest group, 12 informants (54.54%), reflected that the method of teaching analytical thinking and language simultaneously was new knowledge for them, and a similar percentage said the same of text cohesion in writing. In the satisfaction dimension, the highest frequency count was 17 informants (77.27%), and this majority favored the openness or friendliness of the teacher.

Keywords: analytical thinking development, Miller’s Model, attitudes, pre-service teachers

Procedia PDF Downloads 310
687 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively address memorization overfitting in the MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification.

Procedia PDF Downloads 144
686 A Blockchain-Based Privacy-Preserving Physical Delivery System

Authors: Shahin Zanbaghi, Saeed Samet

Abstract:

The internet has transformed the way we shop. Previously, most of our purchases involved shopping trips to a nearby store. Now, it’s as easy as clicking a mouse. But with great convenience comes great responsibility: we have to be constantly vigilant about our personal information. In this work, we propose to encrypt the information printed on physical packages, which includes personal information in plain text, using a symmetric encryption algorithm; we then store that encrypted information in a blockchain network rather than in the centralized databases of companies or corporations. We present, implement, and assess a blockchain-based system using Ethereum smart contracts. We present detailed algorithms that explain our smart contract, together with the security, cost, and performance analysis of the proposed method. Our work indicates that the proposed solution is economically attainable and provides data integrity, security, transparency, and data traceability.

Keywords: blockchain, Ethereum, smart contract, commit-reveal scheme
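
The commit-reveal scheme named in the keywords can be illustrated off-chain with a hash commitment. This is a generic sketch that assumes SHA-256 and a random nonce; the abstract does not describe the contract's actual commitment format, and the function names `commit` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(value: bytes) -> tuple[bytes, bytes]:
    """Return (digest, nonce). Only the digest is published in the commit phase.

    The random nonce keeps low-entropy values from being guessed by hashing
    candidate values. SHA-256 is an assumption made for this sketch.
    """
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + value).digest()
    return digest, nonce


def verify(digest: bytes, nonce: bytes, value: bytes) -> bool:
    """Reveal phase: recompute the hash and compare it to the commitment."""
    expected = hashlib.sha256(nonce + value).digest()
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    digest, nonce = commit(b"encrypted delivery record")
    print(verify(digest, nonce, b"encrypted delivery record"))  # matches
    print(verify(digest, nonce, b"tampered record"))            # does not match
```

On a blockchain, the digest would be stored by the contract during the commit phase, and the value and nonce would be submitted later so that anyone can check them against the stored commitment.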

Procedia PDF Downloads 150
685 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that consider multiple languages, machine translation techniques, and different classifiers are essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two groups: the first classifies text without language translation, and the second translates the testing data into a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.

Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis

Procedia PDF Downloads 715
684 Cultural References in Jean-François Menard's French Translation of Harry Potter a L'ecole Des Sorciers: An Analysis of the Translated Catchphrases and Spells and Cultural Elements

Authors: Brynn Patrice Fader

Abstract:

The objective of this research project is to assess the ways in which Jean-François Menard's French translation, Harry Potter a l'ecole des sorciers, renders the cultural references from the original text, J.K. Rowling's Harry Potter and the Philosopher's Stone. The analysis focuses on the reasons for and the ways in which Menard translates the spells and catchphrases throughout the novel, and on the effects that these choices have on the reader. While at times Menard resorts to omission, manipulation, or borrowing, he also contrasts these techniques by transferring the cultural references using the direct translational approach. It appears that the translator resorts to techniques other than direct translation when it is necessary to ensure that the target audience will understand the events and conversations taking place.

Keywords: cultural elements, direct translation, manipulation, omission

Procedia PDF Downloads 321
683 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative, or neutral. Sentiment analysis of documents holds great importance in today's world, when vast amounts of information are stored in databases and on the World Wide Web. An efficient algorithm to elicit such information would be beneficial for social, economic, and medical purposes. In this project, we have developed an algorithm to classify a document as positive or negative. Using our algorithm, we obtained a feature set from the data and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is made by many procedures such as Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use the empirical distribution for estimation, but developed a method based on the degree of close clustering of the data points. We applied our algorithm to a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 404