Search results for: Text Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2254

1054 Use of Interpretable Evolved Search Query Classifiers for Sinhala Documents

Authors: Prasanna Haddela

Abstract:

Document analysis is a well matured yet still active research field, partly as a result of the intricate nature of building computational tools but also due to the inherent problems arising from the variety and complexity of human languages. Breaking down language barriers is vital in enabling access to a number of recent technologies. This paper investigates the application of document classification methods to new Sinhalese datasets. This language is geographically isolated and rich with many of its own unique features. We will examine the interpretability of the classification models with a particular focus on the use of evolved Lucene search queries generated using a Genetic Algorithm (GA) as a method of document classification. We will compare the accuracy and interpretability of these search queries with other popular classifiers. The results are promising and are roughly in line with previous work on English language datasets.
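To illustrate why a search query can serve as an interpretable classifier, the following is a minimal sketch of scoring one candidate query the way a GA fitness function might. It does not use Lucene. The `Query` structure, the toy English documents, and the F1-based fitness are all assumptions made for illustration, not the setup used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical query: matches if any `should` term occurs and no `must_not` term occurs."""
    should: frozenset
    must_not: frozenset = frozenset()

    def matches(self, doc: str) -> bool:
        tokens = set(doc.lower().split())
        return bool(tokens & self.should) and not (tokens & self.must_not)


def f1_fitness(query, docs, labels):
    """F1 score of the query used as a binary classifier (True = document is in the category)."""
    tp = fp = fn = 0
    for doc, label in zip(docs, labels):
        hit = query.matches(doc)
        if hit and label:
            tp += 1
        elif hit and not label:
            fp += 1
        elif label:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up documents; the category of interest is "sports".
docs = ["cricket match score", "election vote result",
        "cricket board election", "football match"]
labels = [True, False, False, True]

broad = Query(frozenset({"cricket", "match"}))
refined = Query(frozenset({"cricket", "match"}), frozenset({"election"}))

print(f1_fitness(broad, docs, labels), f1_fitness(refined, docs, labels))
```

A GA would mutate and recombine the term sets and keep the queries with higher fitness. The resulting classifier stays readable because it is just a list of included and excluded terms.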

Keywords: evolved search queries, Sinhala document classification, Lucene Sinhala analyzer, interpretable text classification, genetic algorithm

Procedia PDF Downloads 107
1053 Visual Construction of Youth in Czechoslovak Press Photographs: 1959-1989

Authors: Jana Teplá

Abstract:

This text focuses on the visual construction of youth in press photographs in socialist Czechoslovakia. It deals with photographs in a magazine for young readers, Mladý svět, published by the Socialist Union of Youth of Czechoslovakia. The aim of this study was to develop a methodological tool for uncovering the values and the ideological messages in the strategies used in the visual construction of reality in the socialist press. Two methods of visual analysis were applied to the photographs, a quantitative content analysis and a social semiotic analysis. The social semiotic analysis focused on images representing youth in their free time. The study shows that the meaning of a socialist press photograph is a result of a struggle for ideological power between formal and informal ideologies. This struggle takes place within the process of production of the photograph and also within the process of interpretation of the photograph.

Keywords: ideology, press photography, socialist regime, social semiotics, youth

Procedia PDF Downloads 276
1052 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, research on virtual assistants and dialogue systems has received widespread attention from industry and academia. Response generation plays a crucial role in dialogue systems, so to push forward the research on this topic, this paper surveys various methods for response generation. We sort these methods into three categories. The first includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base methods, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based on knowledge and deep learning. We compare their advantages and disadvantages and point out in which ways these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: deep learning, generative, knowledge, response generation, retrieval

Procedia PDF Downloads 130
1051 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available at all times by handling intruder attacks. An Intrusion Detection System (IDS) is one of the available mechanisms used to detect and classify abnormal actions. Therefore, the IDS must always be up to date with the latest intruder attack signatures to preserve the confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue, as is learning new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases, KDD) dataset is useful for testing and evaluating different machine learning techniques. It mainly focuses on the KDD preprocessing step in order to prepare a sound and fair experimental data set. The J48, MLP, and Bayes Network classifiers were chosen for this study. The results show that the J48 classifier achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.
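One common preprocessing step for this kind of study is collapsing the individual attack names into the four categories named above. The sketch below uses the widely cited mapping for the KDD Cup 99 training labels. The abstract does not describe its own preprocessing, so the function name and the handling of unknown labels are assumptions.

```python
# Attack-name to category mapping commonly used for the KDD Cup 99 training labels.
ATTACK_CATEGORY = {
    # Denial of service
    "back": "DOS", "land": "DOS", "neptune": "DOS",
    "pod": "DOS", "smurf": "DOS", "teardrop": "DOS",
    # Probing
    "ipsweep": "PROBE", "nmap": "PROBE", "portsweep": "PROBE", "satan": "PROBE",
    # Remote to local
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
    # User to root
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
}


def to_category(label: str) -> str:
    """Map a raw label such as 'smurf.' to its attack category (hypothetical helper)."""
    name = label.strip().rstrip(".").lower()  # raw labels end with a trailing dot
    if name == "normal":
        return "NORMAL"
    return ATTACK_CATEGORY.get(name, "UNKNOWN")


print([to_category(x) for x in ["smurf.", "nmap.", "rootkit.", "normal."]])
```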

Keywords: IDS, DDoS, MLP, KDD

Procedia PDF Downloads 232
1050 Cinema Reception in a Digital World: A Study of Cinema Audiences in India

Authors: Sanjay Ranade

Abstract:

Traditional film theory assumes the cinema audience in a darkened room where cinema is projected onto a white screen, and the audience suspends their sense of reality. Shifts in audiences due to changes in cultural tastes or trends have been studied for decades. In the past two decades, however, the audience, especially the youth, has shifted to digital media for the consumption of cinema. As a result, not only are audiences watching cinema on different devices, they are also consuming cinema in places and ways never imagined before. Public transport, often crowded to the brim and full of ambient content, and a variety of workplaces have become sites for cinema viewing. Cinema is watched piecemeal and at different times of the day. Audiences use devices such as mobile phones and tablets to watch cinema. The cinema viewing experience is being redesigned by the user. The emerging design allows the spectator not only to consume images and narratives but also to produce, reproduce, and manipulate existing images and narratives, thereby participating in the process and influencing it. Spectatorship studies stress the importance of subjectivity when dealing with the structure of the film text and the cultural and psychological implications of the engagement between the spectator and the film text. Indian cinema has been booming and contributing significantly to global movie production. In 2005 film production was 1000 films a year and doubled to 2000 by 2016. Digital technology helped push this growth in 2012. Film studies in India have had a decided Euro-American bias. The studies have chiefly analysed the content for ideological leanings or myth, or as reflections of society, societal changes, or articulation of identity, or have presented retrospectives of directors, actors, music directors, etc. The one factor relegated to the background has been the spectator. When spectators have been addressed, they have been treated as a collective of class or gender.
India has a performative tradition going back several centuries. How Indians receive cinema is an important aspect to study with respect to film studies. This exploratory and descriptive study looked at 162 young media students studying cinema at the undergraduate and postgraduate levels. The students, speaking as many as 20 languages amongst them, were drawn from across the country’s media schools. The study also looked at nine film societies registered with the Federation of Film Societies of India (FFSI). A structured questionnaire was prepared and distributed online to the students through media teachers. The film societies were approached through the regional office of the FFSI in Mumbai. Lastly, group discussions were held in Mumbai with students and teachers of media. Each group consisted of between five and twelve student participants, along with one or two teachers. All the respondents looked at themselves as spectators and shared their experiences as spectators of cinema, providing a very rich insight into Indian conditions of viewing cinema and the challenges ahead for cinema.

Keywords: audience, digital, film studies, reception, reception spectatorship

Procedia PDF Downloads 126
1049 Utilizing Quantum Chemistry for Nanotechnology: Electron and Spin Movement in Molecular Devices

Authors: Mahsa Fathollahzadeh

Abstract:

The quick advancement of nanotechnology necessitates the creation of innovative theoretical approaches to elucidate complex experimental findings and forecast novel capabilities of nanodevices. Therefore, over the past ten years, a difficult task in quantum chemistry has been comprehending electron and spin transport in molecular devices. This thorough evaluation presents a comprehensive overview of current research and its status in the field of molecular electronics, emphasizing the theoretical applications to various device types and including a brief introduction to theoretical methods and their practical implementation plan. The subject matter includes a variety of molecular mechanisms like molecular cables, diodes, transistors, electrical and visual switches, nano detectors, magnetic valve gadgets, inverse electrical resistance gadgets, and electron tunneling exploration. The text discusses both the constraints of the method presented and the potential strategies to address them, with a total of 183 references.

Keywords: chemistry, nanotechnology, quantum, molecule, spin

Procedia PDF Downloads 42
1048 The Study of Information Uses Behaviour of Tourists in Songkhla Province, Thailand

Authors: Patraporn Kaewkhanitarak, Suchada Srichuar, Narawat Kanjanapan

Abstract:

This is a survey study. Its purpose is to study the information use behavior and problems of tourists in Songkhla Province. The instrument used was a structured questionnaire with a standardized 5-level rating scale. The 400 participants were selected by convenience sampling, with the sample size determined by the Taro Yamane method (allowable error 5%). Data were collected over 6 months, from January to June 2014. The study found that the type of information source tourists most often use to plan their trips is the internet (x̅ = 3.81) and that the most popular content is restaurants (x̅ = 3.77). The tourists found that booking or buying services on the internet offered more affordable prices and let them select an appropriate plan by themselves. The most convenient source of information that tourists often use is the internet and websites (x̅ = 3.69). Nevertheless, they explained that tourist information sources in Songkhla Province are lacking and that there are too few tourist organizations providing information and services related to tourism.
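For reference, the Taro Yamane formula gives the sample size as n = N / (1 + N·e²), where N is the population size and e the allowable error. The sketch below shows how a 5% error leads to a sample of about 400 for a large population. The population figures are assumptions, since the abstract does not report N.

```python
import math


def yamane_sample_size(population: int, error: float = 0.05) -> int:
    """Taro Yamane sample size: n = N / (1 + N * e^2), rounded up."""
    return math.ceil(population / (1 + population * error ** 2))


# Assumed population sizes, for illustration only.
print(yamane_sample_size(1_000_000))  # approaches 1 / e^2 = 400 for large N
print(yamane_sample_size(10_000))
```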

Keywords: information, behavior, tourists, Thailand

Procedia PDF Downloads 245
1047 Knowledge Transfer and the Translation of Technical Texts

Authors: Ahmed Alaoui

Abstract:

This paper contributes to the ongoing debate as to the relevance of translation studies to professional practitioners. It exposes the various misconceptions permeating the links between theory and practice in the translation landscape in the Arab World. It is a thesis of this paper that specialization in translation should be redefined, taking account of the fact that specialized knowledge alone is neither crucial nor sufficient in technical translation. It should be tested against the readability of the translated text, the appropriateness of its style, and the usability of its content by end-users to carry out their intended tasks. The paper also proposes a preliminary model to establish a working link between theory and practice from the perspective of professional trainers and practitioners, calling for the latter to participate in the production of knowledge in a systematic fashion. While this proposal is driven by a rather intuitive conviction, a research line is needed to specify the methodological moves to establish the mediation strategies that would relate the components in the model of knowledge transfer proposed in this paper.

Keywords: knowledge transfer, misconceptions, specialized texts, translation theory, translation practice

Procedia PDF Downloads 387
1046 From Two-Way to Multi-Way: A Comparative Study for Map-Reduce Join Algorithms

Authors: Marwa Hussien Mohamed, Mohamed Helmy Khafagy

Abstract:

Map-Reduce is a programming model widely used to extract valuable information from enormous volumes of data. Map-reduce is designed to support heterogeneous datasets. Apache Hadoop map-reduce is used extensively to uncover hidden patterns in applications such as data mining and SQL processing. The most important operation for data analysis is the join operation, but the map-reduce framework does not directly support join algorithms. This paper explains and compares two-way and multi-way map-reduce join algorithms. We also implement MR join algorithms and show the performance of each phase in them. Our experimental results show that, among two-way join algorithms, map-side join and map-merge join take the longest time because of the preprocessing step that sorts the data, and that, among multi-way join algorithms, reduce-side cascade join takes the longest time.
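As an illustration of how a join can be expressed when the framework offers only map and reduce, here is a minimal in-memory sketch of a two-way reduce-side join. It simulates the map, shuffle, and reduce phases in plain Python rather than on Hadoop. The function names and the toy tables are assumptions, not the paper's implementation.

```python
from collections import defaultdict


def map_phase(table_name, rows, key_index):
    """Emit (join_key, (table_tag, row)) so the reducer can tell the sources apart."""
    for row in rows:
        yield row[key_index], (table_name, row)


def shuffle(pairs):
    """Group all emitted values by join key, as the framework's shuffle step would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, left_name, right_name):
    """For each key, pair every left row with every right row (inner join)."""
    for key in sorted(groups):
        left = [row for tag, row in groups[key] if tag == left_name]
        right = [row for tag, row in groups[key] if tag == right_name]
        for l_row in left:
            for r_row in right:
                yield key, l_row, r_row


def reduce_side_join(left_rows, right_rows, left_key=0, right_key=0):
    pairs = list(map_phase("L", left_rows, left_key))
    pairs += list(map_phase("R", right_rows, right_key))
    return list(reduce_phase(shuffle(pairs), "L", "R"))


# Made-up tables: (user_id, name) and (user_id, item).
users = [(1, "ana"), (2, "bo")]
orders = [(1, "book"), (1, "pen"), (3, "cup")]
joined = reduce_side_join(users, orders)
print(joined)
```

A multi-way cascade join would repeat this step, feeding the output of one join into the next as the left table.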

Keywords: Hadoop, MapReduce, multi-way join, two-way join, Ubuntu

Procedia PDF Downloads 480
1045 Comparative Methods for Speech Enhancement and the Effects on Text-Independent Speaker Identification Performance

Authors: R. Ajgou, S. Sbaa, S. Ghendir, A. Chemsa, A. Taleb-Ahmed

Abstract:

Speech enhancement algorithms aim to improve speech quality. In this paper, we review some speech enhancement methods and evaluate their performance based on Perceptual Evaluation of Speech Quality (PESQ, ITU-T P.862) scores. All methods were evaluated in the presence of different kinds of noise using the TIMIT database and the NOIZEUS noisy speech corpus. The noise was taken from the AURORA database and includes suburban train noise, babble, car, exhibition hall, restaurant, street, airport, and train station noise. Simulation results showed improved speech enhancement performance for the tracking of non-stationary noise approach in comparison with the other methods in terms of the PESQ measure. Moreover, we evaluated the effects of the speech enhancement technique on a speaker identification system based on an autoregressive (AR) model and Mel-frequency cepstral coefficients (MFCC).

Keywords: speech enhancement, pesq, speaker recognition, MFCC

Procedia PDF Downloads 420
1044 Multimodal Biometric Cryptography Based Authentication in Cloud Environment to Enhance Information Security

Authors: D. Pugazhenthi, B. Sree Vidya

Abstract:

Cloud computing is one of the emerging technologies that enables end users to use the services of the cloud on a ‘pay per usage’ basis. This technology is growing at a fast pace, and so are its security threats. One of the various services provided by the cloud is storage. In this service, security is a vital factor both for authenticating legitimate users and for protecting information. This paper presents efficient ways of authenticating users as well as securing information on the cloud. The initial phase proposed in this paper deals with an authentication technique using a multi-factor and multi-dimensional authentication system with multi-level security. Unique identification and low intrusiveness make user-behaviour-based biometrics more reliable than conventional password authentication. With biometric systems, accounts are accessed only by a legitimate user and not by anyone else. The biometric templates employed here include not a single trait but multiple traits, viz., iris and fingerprints. The coordinating stage of the authentication system relies on an ensemble Support Vector Machine (SVM): each SVM of the ensemble is trained individually, and the weights of the base SVMs are then optimized by the Artificial Fish Swarm Algorithm (AFSA). This helps in generating a user-specific secure cryptographic key from the multimodal biometric template through a fusion process. To address the data security problem, an enhanced security architecture is proposed using an encryption and decryption system with double key cryptography based on a Fuzzy Neural Network (FNN) for data storage and retrieval in cloud computing. The proposed scheme aims to protect the records from hackers by preventing cipher text from being broken back into the original text. This improves authentication performance: the proposed double cryptographic key scheme is capable of providing better user authentication and better security, distinguishing between genuine and fake users.
Thus, there are three important modules in the proposed work: 1) feature extraction, 2) multimodal biometric template generation, and 3) cryptographic key generation. First, the feature and texture properties are extracted from the respective fingerprint and iris images. Finally, the double key encryption technique is developed with the help of the fuzzy neural network and a symmetric cryptography algorithm. As the proposed approach is based on neural networks, it has the advantage that the data cannot be decrypted by a hacker even if they have already been stolen. The results indicate that the authentication process is optimal and that the stored information is secured.

Keywords: artificial fish swarm algorithm (AFSA), biometric authentication, decryption, encryption, fingerprint, fusion, fuzzy neural network (FNN), iris, multi-modal, support vector machine classification

Procedia PDF Downloads 258
1043 DURAFILE: A Collaborative Tool for Preserving Digital Media Files

Authors: Santiago Macho, Miquel Montaner, Raivo Ruusalepp, Ferran Candela, Xavier Tarres, Rando Rostok

Abstract:

During our lives, we generate a lot of personal information such as photos, music, text documents and videos that link us with our past. This data that used to be tangible is now digital information stored in our computers, which implies a software dependence to make them accessible in the future. Technology, however, constantly evolves and goes through regular shifts, quickly rendering various file formats obsolete. The need for accessing data in the future affects not only personal users but also organizations. In a digital environment, a reliable preservation plan and the ability to adapt to fast changing technology are essential for maintaining data collections in the long term. We present in this paper the European FP7 project called DURAFILE that provides the technology to preserve media files for personal users and organizations while maintaining their quality.

Keywords: artificial intelligence, digital preservation, social search, digital preservation plans

Procedia PDF Downloads 441
1042 Applying Arima Data Mining Techniques to ERP to Generate Sales Demand Forecasting: A Case Study

Authors: Ghaleb Y. Abbasi, Israa Abu Rumman

Abstract:

This paper models sales history from 2012 to 2015, aggregated in monthly bins, for five products of a medical supply company in Jordan. Consistent patterns extracted from the sales demand history in the Enterprise Resource Planning (ERP) system were used to generate sales demand forecasts with a time series analysis technique called Auto Regressive Integrated Moving Average (ARIMA). ARIMA was used to model and estimate realistic sales demand patterns and to select the best model for each of the five products. The analysis indicated that the current replenishment system overstocks inventory.
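To make the ARIMA idea concrete, the sketch below implements only the simplest non-trivial case, ARIMA(1,1,0) without a constant: difference the series once, fit a single autoregressive coefficient by least squares, and forecast one step ahead. The keywords indicate the study itself used R. This pure-Python version, the function names, and the sales figures are assumptions for illustration and do not reproduce the paper's models.

```python
def difference(series):
    """First-order differencing (the 'I' in ARIMA with d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares AR(1) coefficient with no intercept: d[t] ~ phi * d[t-1]."""
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den  # assumes the differenced series is not all zeros


def forecast_next(series):
    """One-step-ahead forecast: last value plus the predicted next difference."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


# Made-up monthly sales whose month-to-month growth halves each month.
monthly_sales = [0, 8, 12, 14, 15]
print(fit_ar1(difference(monthly_sales)), forecast_next(monthly_sales))
```

In practice the orders (p, d, q) are chosen per product by comparing candidate models, which is the model selection step the abstract refers to.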

Keywords: ARIMA models, sales demand forecasting, time series, R code

Procedia PDF Downloads 381
1041 The Relationship between Confidence, Accuracy, and Decision Making in a Mobile Review Program

Authors: Carla Van De Sande, Jana Vandenberg

Abstract:

Just like physical skills, cognitive skills grow rusty over time unless they are regularly used and practiced, so academic breaks can have negative consequences on student learning and success. The Keeping in School Shape (KiSS) program is an engaging, accessible, and cost-effective intervention that harnesses the benefits of retrieval practice by using technology to help students maintain proficiency over breaks from school by delivering a daily review problem via text message or email. A growth mindset is promoted through feedback messages encouraging students to try again if they get a problem wrong and to take on a challenging problem if they get a problem correct. This paper reports on the relationship between confidence, accuracy, and decision-making during the implementation of the KiSS Program at a large university during winter break for students enrolled in an engineering introductory Calculus course sequence.

Keywords: growth mindset, learning loss, on-the-go learning, retrieval practice

Procedia PDF Downloads 203
1040 Cloning and Characterization of Uridine-5’-Diphosphate -Glucose Pyrophosphorylases from Lactobacillus Kefiranofaciens and Rhodococcus Wratislaviensis

Authors: Mesfin Angaw Tesfay

Abstract:

Uridine-5’-diphosphate (UDP)-glucose is one of the most versatile building blocks within the metabolism of prokaryotes and eukaryotes, serving as an activated sugar donor during the glycosylation of natural products. It is formed by the enzyme UDP-glucose pyrophosphorylase (UGPase) using uridine-5′-triphosphate (UTP) and α-D-glucose 1-phosphate as substrates. Herein, two UGPase genes from Lactobacillus kefiranofaciens ZW3 (LkUGPase) and Rhodococcus wratislaviensis IFP 2016 (RwUGPase) were identified through genome mining approaches. LkUGPase and RwUGPase have 299 and 306 amino acids, respectively. Both UGPases have the conserved UTP binding site (G-X-G-T-R-X-L-P) and the glucose-1-phosphate binding site (V-E-K-P). LkUGPase and RwUGPase were cloned in E. coli, and SDS-PAGE analysis showed the expression of both enzymes, each forming a protein band of about 36 kDa after induction. LkUGPase and RwUGPase have activities of 1549.95 and 671.53 U/mg, respectively. Their kinetic properties are currently under investigation.
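The two conserved sites quoted above can be written as simple sequence patterns, with X standing for any residue. The sketch below scans a protein sequence for both motifs with regular expressions. The toy sequence is invented for the example and is not LkUGPase or RwUGPase, and the helper name is hypothetical.

```python
import re

# G-X-G-T-R-X-L-P, where X is any amino acid, and V-E-K-P.
UTP_SITE = re.compile(r"G.GTR.LP")
GLC1P_SITE = re.compile(r"VEKP")


def find_binding_sites(sequence: str) -> dict:
    """Return 0-based start positions of each motif in a one-letter protein sequence."""
    seq = sequence.upper()
    return {
        "utp": [m.start() for m in UTP_SITE.finditer(seq)],
        "glucose_1_phosphate": [m.start() for m in GLC1P_SITE.finditer(seq)],
    }


toy_sequence = "MKAGAGTRFLPATKAVEKPLL"  # invented sequence, for illustration only
print(find_binding_sites(toy_sequence))
```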

Keywords: UGPase, LkUGPase, RwUGPase, UDP-glucose, Glycosylation

Procedia PDF Downloads 10
1039 Malaysian ESL Writing Process: A Comparison with England’s

Authors: Henry Nicholas Lee, George Thomas, Juliana Johari, Carmilla Freddie, Caroline Val Madin

Abstract:

Research in comparative and international education often provides value-laden views of an education system within and between countries. These views are frequently used by policy makers and educators to explore similarities and differences for, among other things, benchmarking purposes. In this study, a comparison is made between Malaysia and England, focusing on the writing process that children go through to create a text and using a multimodal theoretical framework to analyse the comparison. The main purpose is political in nature, as the study serves as an answer to Malaysia’s call for benchmarking of best practices for language learning. Furthermore, the focus on writing adds to the empirical findings about early writers’ writing development and improvement, especially for children aged 5-9. In research, comparative studies of English as a Second Language (ESL) writing pedagogy have not been carried out, particularly in Malaysia since the introduction of the Standard-Based English Language Curriculum (KSSR) as a draft in 2011, its full implementation in 2017, and its review in 2018 to align with the CEFR. In theory, a multimodal theoretical framework allows a logical comparison between first language and ESL writing, which would provide useful insights into the writing process in Malaysia and England. The comparisons are not representative because of the different school systems in the two countries. So far, the literature informs us that the curriculum for language learning places heavy emphasis on children’s linguistic abilities, which include their proficiency in and mastery of the language, its conventions, and technicalities. However, recent empirical findings suggest that the concepts and characteristics of literacy need to change. In view of this suggestion, the comparison looks at how the writing process is implemented through the five modes of communication: linguistic, visual, aural, spatial, and gestural.
This project draws on data from Malaysia and England, involving 10 teachers, 26 classroom observations, 20 lesson plans, 20 interviews, and 20 brief conversations with teachers. The research focused on 20 primary children of different genders aged 5-9. In addition to primary data descriptions, 40 children’s works, 40 brief classroom conversations, 30 classroom photographs, and 30 school compound photographs were collected to investigate teachers’ and children’s use of modes and semiotic resources to design a text. The data were analysed by means of within-case analysis, cross-case analysis, and constant comparative analysis, with an initial stage of data categorisation, followed by general and specific coding, which clustered the data into thematic groups. The study highlights the importance of teachers’ and children’s engagement and interaction with various modes of communication, of adapting the English approaches to teaching writing within the KSSR framework, and of providing a ‘voice’ to ESL writers, to ensure that both have access to the knowledge and skills required to make decisions in developing multimodal texts and artefacts.

Keywords: comparative education, early writers, KSSR, multimodal theoretical framework, writing development

Procedia PDF Downloads 65
1038 Hermeneutical Understanding of 2 Cor. 7:1 in the Light of Igbo Cultural Concept of Purification

Authors: H. E. Amolo

Abstract:

The concepts of pollution or contamination and purification or ritual cleansing are very important among traditional Africans. This is because, in relation to human behaviors and attitudes, they constitute on the one hand what could be referred to as moral demands and, on the other, what results from defaulting on such demands. ‘The many taboos which a man has to observe are not to be regarded as things mechanical which do not touch the heart, but that the avoidance is a sacred law respected by the community. In breaking it, you offend the divine power’. Research has shown that Africans tenaciously hold the belief that moral values are based upon the recognition of the divine will and that sin in the community must be expelled if perfect peace is to be enjoyed. Sadly, these moral values are gradually eroding in contemporary times. Thus, this study calls for a survey of the passage from an African cultural context: how that context can enhance the understanding of the text and complement its scholarly interpretation, with a view to institutionalizing the concept of holiness as a means of bringing people closer to God and instilling ethical purity and righteousness.

Keywords: cultural practices, Igbo ideology, purification, rituals

Procedia PDF Downloads 302
1037 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively mitigate memorization overfitting in the meta-learning MAML algorithm.
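The core idea, that the same input should carry different labels in different tasks so a model cannot simply memorize an input-to-label mapping, can be sketched as follows. This is a rough illustration using cyclic label shifts. The abstract does not describe the paper's actual augmentation or its key data extraction rule, so `make_mutex_tasks`, the `key_indices` parameter, and the toy examples are all hypothetical.

```python
def make_mutex_tasks(examples, num_labels, key_indices=None):
    """Build one task per label shift so each input appears with several labels.

    examples:     list of (text, label) pairs with labels in range(num_labels)
    key_indices:  hypothetical stand-in for key data extraction; only these
                  examples are used, which keeps the number of generated items small
    """
    if key_indices is None:
        key_indices = range(len(examples))
    key_data = [examples[i] for i in key_indices]
    tasks = []
    for shift in range(num_labels):
        # Shift every label by the same offset; shift 0 reproduces the original task.
        tasks.append([(text, (label + shift) % num_labels) for text, label in key_data])
    return tasks


# Made-up sentiment examples (1 = positive, 0 = negative).
examples = [("good film", 1), ("dull plot", 0), ("great cast", 1)]
tasks = make_mutex_tasks(examples, num_labels=2, key_indices=[0, 1])
print(tasks)
```

Within each task the labels remain consistent, so the support set still tells the learner which labeling is in force. Across tasks the labelings conflict, which is what makes the tasks mutually exclusive.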

Keywords: data augmentation, mutex task generation, meta-learning, text classification

Procedia PDF Downloads 89
1036 An Improvement of Multi-Label Image Classification Method Based on Histogram of Oriented Gradient

Authors: Ziad Abdallah, Mohamad Oueidat, Ali El-Zaart

Abstract:

Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The high demand for image annotation and archiving on the web has led researchers to develop many algorithms for this application domain. The existing techniques for IMC have two drawbacks: they take into account neither the description of the elementary characteristics of the image nor the correlation between labels. In this paper, we present an algorithm (MIML-HOGLPP) that handles both limitations simultaneously. The algorithm uses the histogram of gradients as a feature descriptor and applies the Label Priority Power-set as a multi-label transformation to solve the problem of label correlation. The experiments show that MIML-HOGLPP performs better on some of the evaluation metrics than the two existing techniques.
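A power-set transformation turns a multi-label problem into a single-label one by treating each distinct combination of labels as its own class, which preserves label correlations. The sketch below shows the plain label power-set transformation only. It is not the Label Priority Power-set variant used in the paper, whose details the abstract does not give, and the function names and toy labels are assumptions.

```python
def label_powerset_fit(label_sets):
    """Assign one class id to each distinct label combination, in order of first appearance."""
    classes = {}
    encoded = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in classes:
            classes[key] = len(classes)
        encoded.append(classes[key])
    return encoded, classes


def label_powerset_decode(class_id, classes):
    """Recover the original label set from a predicted class id."""
    for key, cid in classes.items():
        if cid == class_id:
            return set(key)
    raise KeyError(class_id)


# Made-up image annotations.
image_labels = [{"sky", "sea"}, {"sky"}, {"sea", "sky"}, {"tree"}]
encoded, classes = label_powerset_fit(image_labels)
print(encoded)
```

Any single-label classifier can then be trained on the encoded classes, with feature vectors such as histograms of gradients as input.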

Keywords: data mining, information retrieval system, multi-label, problem transformation, histogram of gradients

Procedia PDF Downloads 371
1035 Project and Experiment-Based Fluid Dynamics Education

Authors: Etsuo Morishita

Abstract:

This paper presents project- and experiment-based fluid dynamics education at Meisei University, a private institution in Tokyo, Japan. We pay attention not only to the basic engineering courses but also to the practical aspects of engineering experience, so we offer courses called Projects I to VI. Projects I and II are designed for the first year, III and IV for the second year, and V and VI for the third year. Each supervisor is responsible for two of these projects every year. When students take Projects V and VI in the third year, we assume that they will join the lab running the project for their graduation thesis. We describe our experience with Project I in the summer term of 2016. In this project, we introduced a traction flight vehicle called Cat Flyer, a kind of kite towed by, for example, a car. It is very similar to parasailing, but flight is possible even on roads. Experiments are also very important in mechanical engineering education, and we explain our course on the centrifugal pump, Venturi, and orifice. Although these are described in detail in fluid dynamics textbooks, it is still crucial for students to carry out practical experiments.

Keywords: aerodynamics, experiment, fluid dynamics, project

Procedia PDF Downloads 255
1034 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari

Abstract:

Much effort has been made to measure the semantic similarity between text corpora in documents, and techniques have evolved to measure the similarity of two documents. One such state-of-the-art technique in the field of Natural Language Processing (NLP) is word-to-vector models, which convert words into word embeddings and measure the similarity between the vectors. We found this to be quite useful for the task of resume ranking. This paper therefore implements the word2vec model along with other Natural Language Processing techniques to rank resumes against a particular job description, so as to automate part of the hiring process. The paper presents the proposed system and the findings made while building it.
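The ranking step can be sketched as follows: represent each document by the average of its word vectors and sort resumes by cosine similarity to the job description. The two-dimensional vectors below are made up by hand so the example stays self-contained. A real system would load vectors from a trained word2vec model, and the names `TOY_VECTORS`, `embed`, and `rank_resumes` are hypothetical.

```python
import math

# Hand-made 2-D "embeddings", for illustration only.
TOY_VECTORS = {
    "python": (1.0, 0.0), "data": (0.8, 0.2),
    "sales": (0.0, 1.0), "marketing": (0.1, 0.9),
}


def embed(text):
    """Average the vectors of the known words in the text."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_resumes(job_description, resumes):
    """Return resume names sorted from most to least similar to the job description."""
    job_vec = embed(job_description)
    return sorted(resumes, key=lambda name: cosine(job_vec, embed(resumes[name])),
                  reverse=True)


resumes = {"alice": "python data", "bob": "sales marketing", "carol": "data marketing"}
ranking = rank_resumes("python data", resumes)
print(ranking)
```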

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 153
1033 The Attitudes of Pre-Service Teachers towards Analytical Thinking Skill Development Based on Miller’s Model

Authors: Thassanant Unnanantn, Suttipong Boonphadung

Abstract:

This research study aimed to survey and analyze the attitudes of pre-service teachers towards analytical thinking development based on Miller’s Model. The informants of this study were 22 third-year teacher students majoring in Thai. The course in which the instruction was conducted was English for Academic Purposes in Thai Language 2. The research instrument was an open-ended questionnaire with two dimensions of questions: academic and satisfaction. The investigation revealed positive attitudes. In the academic dimension, the largest group, 12 informants (54.54%), reported that the method of teaching analytical thinking and language simultaneously was new knowledge to them, and a similar percentage said the same of text cohesion in writing. In the satisfaction dimension, the highest frequency count came from 17 informants (77.27%), who favored the openness or friendliness of the teacher.

Keywords: analytical thinking development, Miller’s Model, attitudes, pre-service teachers

Procedia PDF Downloads 305
1032 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth in computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively address memorization overfitting in the MAML algorithm.
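The abstract does not spell out how a feature is mapped to multiple labels. One possible reading, sketched below purely as an assumption, is that each generated task applies its own random permutation of the class labels, so the same input carries different labels in different tasks. The function name `make_mutex_task` is hypothetical, and the key data extraction step described in the abstract is not modeled.

```python
import random


def make_mutex_task(examples, rng):
    """Relabel (feature, label) pairs with a random permutation of the labels.

    Assumption: a "mutex task" is obtained by shuffling the class-to-label
    mapping; the paper's actual procedure may differ.
    """
    classes = sorted({label for _, label in examples})
    shuffled = classes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(classes, shuffled))
    return [(feature, mapping[label]) for feature, label in examples]


examples = [("x1", "a"), ("x2", "b"), ("x3", "a")]
task = make_mutex_task(examples, random.Random(0))
```

Because the mapping is a permutation, examples that shared a class before relabeling still share one afterwards, while the concrete label attached to a given feature can change from task to task.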

Keywords: mutex task generation, data augmentation, meta-learning, text classification

Procedia PDF Downloads 135
1031 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) so that the classifier performs better than one without feature selection. Although there are many well-known feature selection algorithms, and different classifiers built on different selection results may perform differently, very few studies have examined the effect of different feature selection algorithms on the classification performance of different classifiers over different types of datasets. In this paper, two widely used algorithms, the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. Three well-known classifiers are then constructed: the CART decision tree (DT), the multi-layer perceptron (MLP) neural network, and the support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small- and large-scale datasets.
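For readers unfamiliar with information gain, the following minimal sketch computes it for a single discrete feature as the reduction in label entropy after splitting on that feature. It is a textbook formulation, not the authors' experimental code, and the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(
        (len(group) / total) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - conditional


labels = ["spam", "spam", "ham", "ham"]
gain_informative = information_gain([0, 0, 1, 1], labels)
gain_uninformative = information_gain([0, 1, 0, 1], labels)
```

A feature that separates the classes perfectly recovers the full label entropy, while a feature independent of the labels yields no gain. Ranking features by this score and keeping the top ones is the usual filter-style use of IG.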

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 664
1030 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis

Authors: Hyun-Woo Cho

Abstract:

Unexpected events may occur with serious impacts on industrial processes. This work uses a data representation technique to model and analyze process data patterns for the purpose of diagnosis. The use of a triangular representation of process data is evaluated using a simulated process. Furthermore, the effect of using different pre-treatment techniques based on linear or nonlinear reduced spaces is compared. This work extracts the fault pattern in the reduced space, not in the original data space. The results show that the diagnosis method based on the nonlinear technique produces more reliable results and outperforms the linear method.

Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques

Procedia PDF Downloads 384
1029 A Blockchain-Based Privacy-Preserving Physical Delivery System

Authors: Shahin Zanbaghi, Saeed Samet

Abstract:

The internet has transformed the way we shop. Previously, most of our purchases came in the form of shopping trips to a nearby store. Now, it’s as easy as clicking a mouse. But with great convenience comes great responsibility. We have to be constantly vigilant about our personal information. In this work, our proposed approach is to encrypt the information printed on physical packages, which includes personal information in plain text, using a symmetric encryption algorithm; we then store that encrypted information in a blockchain network rather than in the centralized databases of companies or corporations. We present, implement, and assess a blockchain-based system using Ethereum smart contracts, with detailed algorithms that explain our smart contract. We also present a security, cost, and performance analysis of the proposed method. Our work indicates that the proposed solution is economically attainable and provides data integrity, security, transparency, and data traceability.

Keywords: blockchain, Ethereum, smart contract, commit-reveal scheme

Procedia PDF Downloads 147
1028 Chelator-assisted Phytoextraction of Nickel from Nickeliferous Lateritic Soil by Phyllanthus sp. nov.

Authors: Grecco M. Ante, Princess Rochelle O. Gan

Abstract:

Plants that can absorb more than 10,000 µg Ni/g dry mass in their stems and leaves are termed ‘hypernickelophores’. Chelators are chemicals that make the metals in the soil more soluble, making them potential enhancers of phytoextraction. This study aims to observe the effect of different concentrations of the chelating agent ethylene diamine tetraacetate (EDTA) on the metal uptake (or rate of phytoextraction) of nickel by Phyllanthus sp. nov. The plant is found to be a hypernickelophore under normal conditions. The addition of EDTA increased the metal uptake of the plant. Increasing the amount of the chelating agent decreases the phytoextraction of the plant but moves the onset of its peak nickel content in its tissue to an earlier time. The chelator-assisted phytoextraction of nickel by Phyllanthus sp. nov. is shown to be an efficient auxiliary mining operation for nickel laterite mines.

Keywords: phytomining, Phyllanthus sp. nov., EDTA, nickel, laterite

Procedia PDF Downloads 459
1027 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection systems. A profitable stock-selection system should consider both the stock’s investment value and the market timing. In this paper, we present a hybrid system that takes both into account for stock selection. This system uses a case-based reasoning (CBR) model to perform the stock classification and a decision-tree model to help with market timing and stock selection. The experiments show that this hybrid system performs better than other techniques with regard to classification accuracy, average return, and the Sharpe ratio.
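The Sharpe ratio used as an evaluation measure above is commonly computed as the mean excess return divided by the standard deviation of the excess returns. The sketch below follows that common definition with a constant per-period risk-free rate and the sample standard deviation; the abstract does not state which variant the authors used, so both choices are assumptions.

```python
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return divided by the sample standard deviation.

    Assumes per-period returns and a constant per-period risk-free rate.
    Needs at least two returns, since statistics.stdev requires two points.
    """
    excess = [r - risk_free_rate for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / spread


ratio = sharpe_ratio([0.01, 0.03, 0.02])
```

A higher value means more return per unit of volatility, which is why it is reported alongside classification accuracy and average return.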

Keywords: case-based reasoning, decision tree, stock selection, machine learning

Procedia PDF Downloads 416
1026 Cultural References in Jean-François Menard's French Translation of Harry Potter a L'ecole Des Sorciers: An Analysis of the Translated Catchphrases and Spells and Cultural Elements

Authors: Brynn Patrice Fader

Abstract:

The objective of this research project is to assess the ways in which Jean-François Menard's French translation, Harry Potter a l'ecole des sorciers, renders the cultural references of the original text, J.K. Rowling's Harry Potter and the Philosopher's Stone. The analysis focuses on the reasons for and the ways in which Menard translates the spells and catchphrases throughout the novel, and on the effects that these choices have on the reader. While at times Menard resorts to omission, manipulation, or borrowing, he also contrasts these techniques by transferring the cultural references using the direct translational approach. It appears that the translator resorts to techniques other than direct translation when it is necessary to ensure that the target audience will understand the events and conversations taking place.

Keywords: cultural elements, direct translation, manipulation, omission

Procedia PDF Downloads 309
1025 Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems

Authors: Bruno Trstenjak, Dzenana Donko

Abstract:

Data mining and the classification of objects are data analysis processes that rely on various machine learning techniques and are used today in many fields of research. This paper presents a concept of a hybrid classification model improved with expert knowledge. The hybrid model integrates several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) and the expert’s knowledge into one algorithm. The knowledge of experts is used to determine the importance of features. The paper presents the model algorithm and the results of a case study in which the emphasis was put on achieving the maximum classification accuracy without reducing the number of features.

Keywords: case based reasoning, classification, expert's knowledge, hybrid model

Procedia PDF Downloads 365