Search results for: text mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2198

668 Improving Second Language Speaking Skills via Video Exchange

Authors: Nami Takase

Abstract:

Computer-mediated communication allows people to connect and interact with each other as if they were sharing the same space. The current study examined the effects of using video letters (VLs) on the development of the second language speaking skills of Common European Framework of Reference for Languages (CEFR) A1 and CEFR B2 level learners of English as a foreign language. Two groups were formed to measure the impact of VLs. The experimental and control groups were given the same topic, and both groups worked with a native English-speaking university student from the United States of America. Students in the experimental group exchanged VLs, and students in the control group used video conferencing. Pre- and post-tests were conducted to examine the effects of each practice mode. The transcribed speech-text data showed that the VL group had improved speech accuracy scores, while the video conferencing group had increased sentence complexity scores. The use of VLs may be more effective for beginner-level learners because they are able to notice their own errors and replay the videos to better understand the native speaker’s speech at their own pace. Both the VL and video conferencing groups provided positive feedback regarding their interactions with native speakers. The results show how different types of computer-mediated communication impact different areas of language learning and speaking practice, and how each type of online communication tool is suited to different teaching objectives.

Keywords: computer-assisted-language-learning, computer-mediated-communication, english as a foreign language, speaking

Procedia PDF Downloads 75
667 Multi-Criteria Inventory Classification Process Based on Logical Analysis of Data

Authors: Diana López-Soto, Soumaya Yacout, Francisco Ángel-Bello

Abstract:

Although inventories are often regarded as stocks of money sitting on shelves, they are needed in order to secure constant and continuous production. Therefore, companies need to have control over the amount of inventory in order to find the balance between excess and shortage of inventory. The classification of items according to certain criteria, such as price, usage rate and lead time before arrival, allows any company to concentrate its investment in inventory according to a certain ranking or priority of items. This makes the decision-making process for inventory management easier and more justifiable. The purpose of this paper is to present a new approach for the classification of new items based on already existing criteria. This approach is called Logical Analysis of Data (LAD). It is used in this paper to assist the process of ABC item classification based on multiple criteria. LAD is a data mining technique based on Boolean theory that is used for pattern recognition. This technique has been tested in medicine, industry, credit risk analysis, and engineering with remarkable results. An application to ABC inventory classification is presented for the first time, and the results are compared with those obtained when using the well-known AHP technique and the ANN technique. The results show that LAD presented very good classification accuracy.
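
As a point of reference for the classification task the paper addresses, the sketch below shows a conventional weighted-score multi-criteria ABC classification. It is not the LAD method itself; the criteria names, weights, and the 20/30/50 class cut-offs are illustrative assumptions.

```python
# Minimal sketch of a conventional weighted-score multi-criteria ABC
# classification, shown only as a point of reference for the task the
# paper addresses; it is NOT the LAD method. Criteria names, weights,
# and the 20/30/50 class cut-offs are illustrative assumptions.
import pandas as pd

items = pd.DataFrame({
    "item":       ["P1", "P2", "P3", "P4", "P5"],
    "unit_price": [120.0, 15.0, 60.0, 5.0, 300.0],
    "usage_rate": [400, 2500, 800, 5000, 50],     # units per year
    "lead_time":  [30, 7, 14, 3, 60],             # days before arrival
})

criteria = ["unit_price", "usage_rate", "lead_time"]
weights = {"unit_price": 0.5, "usage_rate": 0.3, "lead_time": 0.2}

# Normalise each criterion to [0, 1] and combine into a single score.
normed = (items[criteria] - items[criteria].min()) / (
    items[criteria].max() - items[criteria].min())
items["score"] = sum(weights[c] * normed[c] for c in criteria)

# Rank items and assign ABC classes: top 20% -> A, next 30% -> B, rest -> C.
items = items.sort_values("score", ascending=False).reset_index(drop=True)
cum_share = (items.index + 1) / len(items)
items["class"] = pd.cut(cum_share, [0, 0.2, 0.5, 1.0], labels=["A", "B", "C"])
print(items[["item", "score", "class"]])
```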

Keywords: ABC multi-criteria inventory classification, inventory management, multi-class LAD model, multi-criteria classification

Procedia PDF Downloads 847
666 MIMIC: A Multi Input Micro-Influencers Classifier

Authors: Simone Leonardi, Luca Ardito

Abstract:

Micro-influencers are effective elements in the marketing strategies of companies and institutions because of their capability to create a hyper-engaged audience around a specific topic of interest. In recent years, many scientific approaches and commercial tools have handled the task of detecting this type of social media user. These strategies adopt solutions ranging from rule-based machine learning models to deep neural networks and graph analysis on text, images, and account information. This work compares the existing solutions and proposes an ensemble method to generalize them across different input data and social media platforms. The deployed solution combines deep learning models on unstructured data with statistical machine learning models on structured data. We retrieve both social media account information and multimedia posts on Twitter and Instagram. These data are mapped into feature vectors for an eXtreme Gradient Boosting (XGBoost) classifier. Sixty different topics have been analyzed to build a rule-based gold-standard dataset and to compare the performances of our approach against baseline classifiers. We prove the effectiveness of our work by comparing the accuracy, precision, recall, and F1 score of our model with different configurations and architectures. We obtained an accuracy of 0.91 with our best-performing model.
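
A hedged sketch of the fusion step described above: embeddings from a deep model on unstructured posts are concatenated with structured account statistics into feature vectors for an XGBoost classifier. The shapes, feature names, and hyper-parameters are assumptions; the retrieval and deep-learning components of the actual system are not reproduced.

```python
# Hedged sketch of the late-fusion step described in the abstract:
# embeddings from a deep model on unstructured posts are concatenated
# with structured account statistics and fed to an XGBoost classifier.
# Shapes, feature names, and hyper-parameters are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n_accounts = 1000

post_embeddings = rng.normal(size=(n_accounts, 128))   # stand-in for text/image embeddings
account_stats = rng.normal(size=(n_accounts, 6))        # followers, engagement rate, etc.
y = rng.integers(0, 2, size=n_accounts)                 # 1 = micro-influencer on the topic

X = np.hstack([post_embeddings, account_stats])          # fused feature vector
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                    eval_metric="logloss")
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred), "f1:", f1_score(y_test, pred))
```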

Keywords: deep learning, gradient boosting, image processing, micro-influencers, NLP, social media

Procedia PDF Downloads 146
665 Unpacking Chilean Preservice Teachers’ Beliefs on Practicum Experiences through Digital Stories

Authors: Claudio Díaz, Mabel Ortiz

Abstract:

An EFL teacher education programme in Chile takes five years to train a future teacher of English. Preservice teachers are prepared to learn an advanced level of English and to teach the language from 5th to 12th grade in the Chilean educational system. In the context of their first EFL Methodology course in year four, preservice teachers have to create a five-minute digital story that starts from a critical incident they have experienced as teachers-to-be during their observations or interventions in the schools. A critical incident can be defined as a happening, a specific incident or event, either observed by them or involving them. The happening sparks their thinking and may make them subsequently think differently about the particular event. When they create their digital stories, preservice teachers put technology, teaching practice and theory together to narrate a story that is complemented by still images, moving images, text, sound effects and music. The story should be told as a personal narrative, which explains the critical incident. This presentation will focus on the creation process of 50 Chilean preservice teachers’ digital stories, highlighting the critical incidents from which they started their stories. It will also unpack preservice teachers’ beliefs and reflections when approaching their teaching practices in schools. These beliefs will be coded and categorized through content analysis to evidence preservice teachers’ most deeply rooted conceptions about English teaching and learning in Chilean schools. The findings seem to indicate that preservice teachers’ beliefs are strongly mediated by contextual and affective factors.

Keywords: beliefs, digital stories, preservice teachers, practicum

Procedia PDF Downloads 412
664 Information Technology Approaches to Literature Text Analysis

Authors: Ayse Tarhan, Mustafa Ilkan, Mohammad Karimzadeh

Abstract:

Science was considered part of philosophy in ancient Greece. By the nineteenth century, it was understood that philosophy was very inclusive and that social and human sciences such as literature, history, and psychology should be separated and perceived as autonomous branches of science. The computer was also first seen as a tool of mathematical science. Over time, computer science has grown by encompassing every area in which technology exists, and its growth compelled the division of computer science into different disciplines, just as philosophy had been divided into different branches of science. Now there is almost no branch of science in which computers are not used. One of the newer autonomous disciplines of computer science is digital humanities, and one of the areas of digital humanities is literature. The material of literature is words, and thanks to software tools created using computer programming languages, analyses that would take a literature researcher months to complete can be carried out quickly and objectively. In this article, three different tools that literary researchers can use in their work will be introduced. These tools were created with the computer programming languages Python and R and brought to the world of literature. The purpose of introducing the aforementioned tools is to set an example for the development of special tools or programs for Ottoman language and literature in the future and to support such initiatives. The first example to be introduced is the Stylometry tool, developed with the R language. The second is The Metrical Tool, which is used to measure data in poems and was developed with Python. The last literature analysis tool in this article is Voyant Tools, which is a multifunctional and easy-to-use tool.
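
To give a taste of the principle behind such tools, here is a minimal Python sketch of a stylometric comparison: each text is represented by the relative frequencies of the most common words and the resulting vectors are compared. This illustrates the idea only; it is not the R Stylometry tool, The Metrical Tool, or Voyant Tools, and the texts are placeholders.

```python
# Minimal Python sketch of the idea behind stylometric comparison:
# represent each text by the relative frequencies of its most common
# words and compare the resulting vectors. This is only an illustration
# of the principle, not the R Stylometry tool described in the article.
from collections import Counter
import math
import re

def word_frequencies(text, vocabulary):
    words = re.findall(r"[a-zçğıöşü']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in vocabulary]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

text_a = "..."  # placeholder: first author's text
text_b = "..."  # placeholder: second author's text

# Build a shared vocabulary from the most frequent words across both texts.
all_words = re.findall(r"[a-zçğıöşü']+", (text_a + " " + text_b).lower())
vocabulary = [w for w, _ in Counter(all_words).most_common(100)]

similarity = cosine_similarity(word_frequencies(text_a, vocabulary),
                               word_frequencies(text_b, vocabulary))
print(f"stylometric similarity: {similarity:.3f}")
```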

Keywords: DH, literature, information technologies, stylometry, the metrical tool, voyant tools

Procedia PDF Downloads 126
663 Ideology Shift in Political Translation

Authors: Jingsong Ma

Abstract:

In political translation, ideology plays an important role in conveying implications accurately. Ideological collisions can occur in political translation when there exist differences between the political environments embedded in translingual political texts in the source and target languages. Reaching an accurate translation requires the translator to understand the ideologies implied in (and often transcending) the texts. This paper explores the conditions, procedure, and purpose of processing ideological collision and the resolution of such issues in political translation. These points will be elucidated by case studies of translating English and Chinese political texts. First, there are specific political terminologies in certain political environments. These terminological peculiarities in one language are often determined by ideological elements rather than by syntactical and semantical understanding. The translation of these ideology-loaded terminologies is a process and operation consisting of understanding the ideological context, including cultural, historical, and political situations. This will be explained with characteristic Chinese political terminologies and their renderings in English. Second, when the ideology in the source language fails to match the ideology in the target language, the decisions to highlight or disregard these conflicts are shaped by power relations, political engagement, social context, etc. It is thus necessary to go beyond linguistic analysis of the context by deciphering ideology in political documents to provide a faithful or equivalent rendering of certain messages. Finally, one of the practical issues concerns equivalence in political translation, achieved by redefining the notion of faithfulness and the retention of ideological messages of the source language in translations of political texts. To avoid distortion, the translator should be freed from the grip of the literal meaning, diving instead into the functional meanings of the text.

Keywords: translation, ideology, politics, society

Procedia PDF Downloads 89
662 South Africa’s Post-Apartheid Film Narratives of HIV/AIDS: A Case of ‘Yesterday’

Authors: Moyahabo Molefe

Abstract:

The persistence of HIV/AIDS infection rates in SA has not only been a subject of academic debate but also a mediated narrative that has dominated SA’s post-apartheid film space over the last two decades. SA’s colonial geo-spatial architecture still influences migrant labour patterns, which the Oscar-nominated (2003) SA film ‘Yesterday’ reflected upon and which continue to account for the spread of HIV/AIDS in SA society. Accordingly, men who had left their homes in the rural areas to work in the mines in the cities become infected with HIV/AIDS, only to return home and infect their wives or partners in the rural areas. This paper analyses, through Social Semiotic theory, how SA’s geo-spatial arrangement has ruptured family structures, with both men and women taking new residences in the urban areas where they work, away from their homes. By using Social Semiotic theory, this paper seeks to understand how images and discourses have been deployed in the film ‘Yesterday’ to demonstrate how HIV/AIDS is embedded in the socio-cultural, economic and political architecture of SA society. The study uses a qualitative approach and content/text/visual semiotic analysis to decipher meanings from the array of imagery and discourses/dialogues used to mythologise the relationship between the spread of HIV/AIDS and SA migrant labour patterns. The findings of the study are significant for proposing a conceptual framework that can be used to mitigate the spread of HIV/AIDS among the SA populace, against the backdrop of changing migrant labour patterns and other related factors.

Keywords: colonialism, decoloniality, HIV/AIDS, labour migration patterns, social semiotics

Procedia PDF Downloads 42
661 A Qualitative Study of Health-Related Beliefs and Practices among Vegetarians

Authors: Lorena Antonovici, Maria Nicoleta Turliuc

Abstract:

The process of becoming a vegetarian involves changes in several life aspects, including health. Despite its relevance, however, little research has been carried out to analyze vegetarians' self-perceived health, and even less empirical attention has been paid to the Romanian population. This study aimed to assess health-related beliefs and practices among vegetarian adults in a Romanian sample. We undertook 20 semi-structured interviews (10 males, 10 females) based on a snowball sample with a mean age of 31 years. The interview guide was divided into three sections: causes of adopting the diet, general aspects (beliefs, practices, tensions, and conflicts) and consequences of adopting the diet (significant changes, positive aspects and difficulties, physical and mental health). Additional anamnestic data were reported by means of a questionnaire. Data analyses were performed using Tropes text analysis software (v. 8.2) and SPSS software (v. 24.0). Findings showed that most of the participants considered a vegetarian diet a natural and healthy choice, as opposed to meat-eating, which they regarded as unhealthy and whose consumption they felt should be moderated among omnivores. A higher proportion of participants (65%) had an average body mass index (BMI), and several women even reported having had certain ailments that no longer occurred after following a vegetarian diet. Moreover, participants admitted having better moods and mental health status, given their self-contentment with the dietary choice. Relatives were perceived as more skeptical about their practices than others were, a view held especially by women. This study provides valuable insight into health-related beliefs and practices and how a vegetarian diet might interact with them.

Keywords: beliefs, health, practices, vegetarians

Procedia PDF Downloads 98
660 Python Implementation for S1000D Applicability Depended Processing Model - SALERNO

Authors: Theresia El Khoury, Georges Badr, Amir Hajjam El Hassani, Stéphane N’Guyen Van Ky

Abstract:

The widespread adoption of machine learning and artificial intelligence across different domains can be attributed to the digitization of data over several decades, resulting in vast amounts of data, types, and structures. Thus, data processing and preparation turn out to be a crucial stage. However, applying these techniques to S1000D standard-based data poses a challenge due to its complexity and the need to preserve logical information. This paper describes SALERNO, an S1000d AppLicability dEpended pRocessiNg mOdel. This Python-based model analyzes and converts the XML S1000D-based files into an easier data format that can be used in machine learning techniques while preserving the different logic and relationships in the files. The model parses the files in a given folder, filters them, and extracts the required information to be saved in appropriate data frames and Excel sheets. Its main idea is to group the extracted information by applicability. In addition, it extracts the full text by replacing internal and external references while maintaining the relationships between files, as well as the necessary requirements. The resulting files can then be saved in databases and used in different models. Documents in both English and French were tested, and special characters were decoded. Updates to the technical manuals were taken into consideration as well. The model was tested on different versions of the S1000D, and the results demonstrated its ability to effectively handle the applicability, requirements, references, and relationships across all files and at different levels.
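
A hedged sketch of one SALERNO-style step: walking the XML files in a folder, grouping extracted text by an applicability annotation, and saving the result to data frames and Excel sheets. The element and attribute names ("para", "applicRef") and the folder name are simplified placeholders, not the real S1000D schema, and the model's handling of references, requirements, and versions is not reproduced.

```python
# Hedged sketch of one SALERNO-style step: walk XML files in a folder,
# collect text grouped by an applicability annotation, and store the
# result in a pandas DataFrame. Element/attribute names ("para",
# "applicRef") are simplified placeholders, not the real S1000D schema.
import pathlib
import xml.etree.ElementTree as ET
import pandas as pd

def extract_by_applicability(folder):
    rows = []
    for path in pathlib.Path(folder).glob("*.xml"):
        root = ET.parse(path).getroot()
        for para in root.iter("para"):                      # placeholder element name
            applic = para.get("applicRef", "ALL")           # placeholder attribute name
            text = "".join(para.itertext()).strip()
            if text:
                rows.append({"file": path.name, "applicability": applic, "text": text})
    return pd.DataFrame(rows)

if __name__ == "__main__":
    frame = extract_by_applicability("data_modules")         # hypothetical folder
    # Group extracted paragraphs by applicability, as the model does.
    for applic, group in frame.groupby("applicability"):
        group.to_excel(f"extracted_{applic}.xlsx", index=False)
```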

Keywords: aeronautics, big data, data processing, machine learning, S1000D

Procedia PDF Downloads 96
659 A Corpus-Based Approach to Understanding Market Access in Fisheries and Aquaculture: A Systematic Literature Review

Authors: Cheryl Marie Cordeiro

Abstract:

Although fisheries and aquaculture studies might seem marginal to international business (IB) studies in general, fisheries and aquaculture IB (FAIB) management is currently facing increasing pressure to meet global demand and consumption for fish in the coming decades. In part to address this challenge, the purpose of this systematic literature review (SLR) study is to investigate the use of the term ‘market access’ in its context of use in the generic literature and business sector discourse, in comparison to the more specific literature and discourse of fisheries, aquaculture and seafood. This SLR aims to uncover the knowledge/interest gaps between the academic subject discourses and business sector practices. Corpus-driven in methodology and using a triangulation of three different text analysis tools, namely AntConc, VOSviewer and Web of Science (WoS) analytics, the SLR results indicate a gap in conceptual knowledge and business practices in how ‘market access’ is conceived and used in the pharmaceutical healthcare industry compared with FAIB research and practice. While it is acknowledged that the product orientation of different business sectors might differ, this SLR study works with the assumption that both business sectors are global in orientation. These business sectors are complex in their operations, from product to market. This SLR suggests a conceptual model for understanding the challenges, the potential barriers, as well as avenues for solutions, in developing market access for FAIB.
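
In the spirit of the corpus-driven methodology, a minimal sketch is shown below: counting how often 'market access' occurs and which words most often appear near it (a crude collocation profile) in two sub-corpora. The file names are placeholders, and this is not a substitute for AntConc, VOSviewer, or WoS analytics.

```python
# Minimal corpus-driven sketch in the spirit of the SLR methodology:
# count occurrences of "market access" and its nearby words (a crude
# collocation profile) in two sub-corpora. File names are placeholders,
# and this is not a substitute for AntConc, VOSviewer, or WoS analytics.
from collections import Counter
import re

def collocates(text, window=5, top=10):
    tokens = re.findall(r"[a-z]+", text.lower())
    hits, neighbours = 0, Counter()
    for i in range(len(tokens) - 1):
        if tokens[i] == "market" and tokens[i + 1] == "access":
            hits += 1
            left = tokens[max(0, i - window):i]
            right = tokens[i + 2:i + 2 + window]
            neighbours.update(left + right)
    return hits, neighbours.most_common(top)

for name in ["generic_business_corpus.txt", "faib_corpus.txt"]:   # placeholder files
    with open(name, encoding="utf-8") as handle:
        hits, profile = collocates(handle.read())
    print(name, "occurrences:", hits, "top collocates:", profile)
```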

Keywords: market access, fisheries and aquaculture, international business, systematic literature review

Procedia PDF Downloads 125
658 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) data form one of the important big genomic, non-coding datasets representing the genome sequences. In this paper, a hybrid method for the classification of miRNA data is proposed. Due to the variety of cancers and the high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features relative to the number of samples is high, and the data suffer from being imbalanced. A feature selection method is used to select the features with greater ability to distinguish classes and to eliminate obscure features. Afterward, a Convolutional Neural Network (CNN) classifier for the classification of cancer types is utilized, which employs a Genetic Algorithm to find optimized hyper-parameters for the CNN. In order to make the classification process with the CNN faster, a Graphics Processing Unit (GPU) is recommended for performing the calculations in parallel. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.
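
The sketch below illustrates the two-stage idea in hedged form: univariate feature selection to keep the most discriminative miRNA features, followed by a small 1-D convolutional network. The data are synthetic stand-ins, the genetic-algorithm hyper-parameter search is omitted, and the network architecture is an assumption rather than the authors' exact model.

```python
# Hedged sketch of the two-stage pipeline described in the abstract:
# (1) univariate feature selection on miRNA expression features,
# (2) a small 1-D CNN classifier. Data are synthetic stand-ins, the
# genetic-algorithm hyper-parameter search is omitted, and the network
# architecture is an assumption rather than the authors' exact model.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
import tensorflow as tf

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 800, 1046, 29
X = rng.normal(size=(n_samples, n_features)).astype("float32")
y = rng.integers(0, n_classes, size=n_samples)

# Stage 1: keep the 300 features that best separate the tumour classes.
selector = SelectKBest(score_func=f_classif, k=300)
X_sel = selector.fit_transform(X, y)[..., np.newaxis]      # add channel axis for Conv1D

# Stage 2: a compact 1-D CNN (GPU-accelerated automatically if available).
model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(32, kernel_size=7, activation="relu",
                           input_shape=(300, 1)),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_sel, y, epochs=5, batch_size=32, validation_split=0.2)
```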

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 90
657 Comparing the Sequence and Effectiveness of Teaching the Four Basic Operations in Mathematics in Primary Schools

Authors: Abubakar Sadiq Mensah, Hassan Usman

Abstract:

The study compared the effectiveness of the Addition, Multiplication, Subtraction and Division (AMSD) and the Addition, Subtraction, Multiplication and Division (ASMD) sequences of teaching these four basic operations in mathematics to primary one pupils in Katsina Local Government, Katsina State. The study determined the sequence that was more effective and the one mostly adopted by teachers of the operations. One hundred (100) teachers and sixty (60) primary one pupils were used for the study. The pupils were divided into two equal groups. The researcher taught these operations to each group separately for four (4) weeks. Group one was taught using the AMSD sequence, while group two was taught using the ASMD sequence. In order to generate the needed data for the study, questionnaires and tests were administered to the samples. Data collected were analyzed and major findings were arrived at: (i) two primary mathematics textbooks were used in all the primary schools in the area; (ii) each of the textbooks contained the ASMD sequence; (iii) 73% of the teachers sampled adopted the ASMD sequence of teaching these operations; and (iv) group one of the pupils (taught using the AMSD sequence) performed significantly better than their counterparts in group two (taught using the ASMD sequence). On the basis of this, the researcher concluded that the AMSD sequence was more effective in teaching the operations than the ASMD sequence. Consequently, the researcher recommended that primary school teachers, authors of primary mathematics textbooks, and curriculum planners should adopt the AMSD sequence of teaching these operations.

Keywords: mathematics, high school, four basic operations, effectiveness of teaching

Procedia PDF Downloads 230
656 CompPSA: A Component-Based Pairwise RNA Secondary Structure Alignment Algorithm

Authors: Ghada Badr, Arwa Alturki

Abstract:

The biological function of an RNA molecule depends on its structure. The objective of alignment is finding the homology between two or more RNA secondary structures. Knowing the common functionalities between two RNA structures allows a better understanding and a discovery of other relationships between them. Besides, identifying non-coding RNAs (those not translated into a protein) is a popular application in which RNA structural alignment is the first step. A few methods for RNA structure-to-structure alignment have been developed. Most of these methods perform partial structure-to-structure, sequence-to-structure, or structure-to-sequence alignment. Less attention is given in the literature to the use of efficient RNA structure representations, and structure-to-structure alignment methods are lacking. In this paper, we introduce an O(N²) Component-based Pairwise RNA Structure Alignment (CompPSA) algorithm, where structures are given in a component-based representation and where N is the maximum number of components in the two structures. The proposed algorithm compares two RNA secondary structures based on their weighted component features rather than on their base-pair details. Extensive experiments are conducted illustrating the efficiency of the CompPSA algorithm when compared to other approaches and on different real and simulated datasets. The CompPSA algorithm shows an accurate similarity measure between components. The algorithm gives the user the flexibility to align the two RNA structures based on their weighted features (position, full length, and/or stem length). Moreover, the algorithm proves scalability and efficiency in time and memory performance.
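
A hedged sketch of the core idea: each structure is a list of components described by weighted features (position, full length, stem length), pairs of components are scored by a weighted feature similarity, and an O(N²) dynamic-programming alignment is run over components rather than base pairs. The weights, gap penalty, and scoring below are illustrative assumptions, not the published CompPSA scoring.

```python
# Hedged sketch of the component-based idea: structures are lists of
# components described by (position, full_length, stem_length); pairs of
# components are scored by a weighted feature similarity and aligned with
# an O(N^2) Needleman-Wunsch-style DP over components, not base pairs.
# Weights, gap penalty, and scoring are illustrative assumptions, not the
# published CompPSA scoring.
WEIGHTS = {"position": 0.4, "full_length": 0.4, "stem_length": 0.2}
GAP = -0.5

def component_similarity(a, b):
    """Weighted similarity in [0, 1] between two component feature dicts."""
    score = 0.0
    for key, w in WEIGHTS.items():
        denom = max(a[key], b[key], 1)
        score += w * (1.0 - abs(a[key] - b[key]) / denom)
    return score

def align(struct_a, struct_b):
    """Global alignment score between two component lists (O(N^2))."""
    n, m = len(struct_a), len(struct_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + GAP
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + component_similarity(struct_a[i - 1], struct_b[j - 1])
            dp[i][j] = max(match, dp[i - 1][j] + GAP, dp[i][j - 1] + GAP)
    return dp[n][m]

# Toy structures: each component (e.g. a hairpin) summarised by its features.
s1 = [{"position": 3, "full_length": 20, "stem_length": 6},
      {"position": 30, "full_length": 14, "stem_length": 4}]
s2 = [{"position": 5, "full_length": 18, "stem_length": 6},
      {"position": 33, "full_length": 12, "stem_length": 5}]
print("alignment score:", round(align(s1, s2), 3))
```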

Keywords: alignment, RNA secondary structure, pairwise, component-based, data mining

Procedia PDF Downloads 434
655 Application of Latent Class Analysis and Self-Organizing Maps for the Prediction of Treatment Outcomes for Chronic Fatigue Syndrome

Authors: Ben Clapperton, Daniel Stahl, Kimberley Goldsmith, Trudie Chalder

Abstract:

Chronic fatigue syndrome (CFS) is a condition characterised by chronic disabling fatigue and other symptoms that currently cannot be explained by any underlying medical condition. Although clinical trials support the effectiveness of cognitive behaviour therapy (CBT), the success rate for individual patients is modest. Patients vary in their response, and little is known about which factors predict or moderate treatment outcomes. The aim of the project is to develop a prediction model from baseline characteristics of patients, such as demographic, clinical and psychological variables, which may predict the likely treatment outcome, provide guidance for clinical decision making and help clinicians to recommend the best treatment. The project is aimed at identifying subgroups of patients with similar baseline characteristics that are predictive of treatment effects, using modern cluster analyses and data mining machine learning algorithms. The characteristics of these groups will then be used to inform the types of individuals who benefit from a specific treatment. In addition, the results will provide a better understanding of the patients for whom the treatment works. The suitability of different clustering methods for identifying subgroups of CFS patients and their responses to different treatments is compared.

Keywords: chronic fatigue syndrome, latent class analysis, prediction modelling, self-organizing maps

Procedia PDF Downloads 199
654 A Critical Geography of Reforestation Program in Ghana

Authors: John Narh

Abstract:

There is a high rate of deforestation in Ghana due to agricultural expansion, illegal mining and illegal logging. While attempting to address these illegalities, Ghana has also initiated a reforestation program known as the Modified Taungya System (MTS). Within the MTS framework, farmers are allocated degraded forestland and provided with tree seedlings to practice agroforestry until the trees form a canopy. Yet the political, ecological and economic models that inform the selection of tree species, the motivations of participating farmers, as well as the factors that account for differential access to the land and for the performance of farmers engaged in the program, remain underexplored. Using a sequential explanatory mixed methods approach in five forest-fringe communities in the Eastern Region of Ghana, the study reveals that economic factors and Ghana’s commitment to international conventions on the environment underpin the selection of tree species for the MTS program. Social networks play a critical role in gaining access to the program, and access to remittances enhances poor farmers’ chances in it. Farmers are more motivated by the access to degraded forestland to cultivate food crops than by having a share in the trees that they plant. As such, in communities where participating farmers are not informed about their benefit in the trees that they plant, the program is largely unsuccessful.

Keywords: translocality, deforestation, forest management, social network

Procedia PDF Downloads 66
653 Leaching Properties of Phosphate Rocks in the Nile River

Authors: Abdelkader T. Ahmed

Abstract:

Phosphate rocks (PR) are natural sedimentary rocks. These rocks contain several chemical compositions of heavy metals and radioactive elements. Mining and transporting these rocks beside or through natural water streams may lead to water contamination. When PR is in contact with water in the field, as a consequence of precipitation events, changes in the water table or sinking in water streams, elements such as salts and heavy metals may be released into the water. In this work, the leaching properties of PR in Nile River water were investigated through experimental lab work. The study focused on evaluating the potential environmental impacts of some constituents of PR, including phosphorus, cadmium, curium and lead, on the water quality of the Nile by applying tank leaching tests. In these tests, the potential impact of changing conditions, such as the phosphate content in PR, the liquid-to-solid ratio (L/S) and the pH value, on the long-term release of heavy metals and salts was studied. Experimental results showed that cadmium and lead were released in very low concentrations, but curium and phosphorus were released in high concentrations. Results also showed that the release rate from PR for all constituents was low, even over long periods.

Keywords: leaching tests, Nile river, phosphate rocks, water quality

Procedia PDF Downloads 301
652 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation into data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, and user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.
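
A hedged sketch of how the listed steps could be chained as a simple pipeline of callable stages is shown below. The stage names mirror the abstract, while the bodies (regex "entity" tagging, naive table splitting, a plain-text file as input) are toy stand-ins for the computer-vision and NLP models it mentions.

```python
# Hedged sketch of how the listed steps could be chained into a pipeline.
# Stage names mirror the abstract; the bodies are toy stand-ins for the
# computer-vision and NLP models it mentions (e.g. regex "entity" tagging).
import re
from dataclasses import dataclass, field

@dataclass
class Document:
    raw_text: str
    tables: list = field(default_factory=list)
    entities: list = field(default_factory=list)
    records: list = field(default_factory=list)

def ingest(path):
    with open(path, encoding="utf-8") as handle:        # stands in for PDF/HTML ingestion
        return Document(raw_text=handle.read())

def parse_tables(doc):
    # Toy stand-in for table detection and structure recognition.
    doc.tables = [line.split("|") for line in doc.raw_text.splitlines() if "|" in line]
    return doc

def detect_entities(doc):
    # Toy stand-in for entity detection/disambiguation: tag currency amounts.
    doc.entities = re.findall(r"\$\s?\d[\d,\.]*", doc.raw_text)
    return doc

def extract_records(doc, schema=("amount",)):
    # Toy stand-in for schema-based record extraction.
    doc.records = [dict(zip(schema, (e,))) for e in doc.entities]
    return doc

def run_pipeline(path):
    doc = ingest(path)
    for stage in (parse_tables, detect_entities, extract_records):
        doc = stage(doc)
    return doc.records

print(run_pipeline("filing.txt"))   # hypothetical input file
```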

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 88
651 Balance Transfer of Heavy Metals in Marine Environments Subject to Natural and Anthropogenic Inputs: A Case Study on the Mejerda River Delta

Authors: Mohamed Amine Helali, Walid Oueslati, Ayed Added

Abstract:

Sedimentation rates and total fluxes of heavy metals (Fe, Mn, Pb, Zn and Cu) were measured at three different depths (10 m, 20 m and 40 m) during March and August 2012, offshore of the Mejerda River outlet (Gulf of Tunis, Tunisia). The sedimentation rates are estimated from the fluxes of suspended particulate matter at 7.32, 5.45 and 4.39 mm y⁻¹ at 10 m, 20 m and 40 m depth, respectively. Heavy metal sequestration in sediments was determined by chemical speciation and the total metal contents in each core collected from the 10 m, 20 m and 40 m depths. The heavy metal intake to the sediment was also measured from the suspended particulate matter, while the fluxes from the sediment to the water column were determined using the benthic chamber technique and from the diffusive fluxes in the pore water. Results showed that iron is the only metal for which the transfer balance between intake/uptake (45 to 117 / 1.8 to 5.8 g m² y⁻¹) and sequestration (277 to 378 g m² y⁻¹) was negative, in contrast to lead, whose intake fluxes (360 to 480 mg m² y⁻¹) exceed its sequestration fluxes (50 to 92 mg m² y⁻¹). The transfer balance is neutral for Mn, Zn and Cu. These results clearly indicate that the contributions of the Mejerda have varied over time, probably due to the migration of the river mouth, to changes in the mining activity in the Mejerda catchment and to the recent human activities affecting the delta area.

Keywords: delta, fluxes, heavy metals, sediments, sedimentation rates

Procedia PDF Downloads 183
650 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark

Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos

Abstract:

This paper presents an investigation of the performance impacts of varying five factors (input data size, node number, cores, memory, and disks) when applying a distributed implementation of Naïve Bayes for text classification of a large corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing beforehand the effects of each factor becomes especially critical as hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: The solid statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results, such as the small influence of cores, the neutrality of memory and disks with respect to total execution time, and the non-significant impact of input data scale on costs, although it notably impacts execution time.
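
For orientation, here is a hedged sketch of the distributed classification workload itself (the subject of the performance study, not the DoE analysis): a pyspark.ml pipeline that tokenises text, hashes term frequencies, applies IDF weighting, and fits Naïve Bayes. The input path and column names are placeholders.

```python
# Hedged sketch of the distributed Naive Bayes text-classification task on
# Spark (the subject of the performance study, not the DoE analysis itself).
# The input path and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, HashingTF, IDF
from pyspark.ml.classification import NaiveBayes
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

spark = SparkSession.builder.appName("language-classification").getOrCreate()

# Expected schema: a text column and a numeric label column (language id).
corpus = spark.read.parquet("hdfs:///corpus/labeled_documents")   # placeholder path
train, test = corpus.randomSplit([0.8, 0.2], seed=42)

pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="tokens"),
    HashingTF(inputCol="tokens", outputCol="tf", numFeatures=1 << 18),
    IDF(inputCol="tf", outputCol="features"),
    NaiveBayes(featuresCol="features", labelCol="label"),
])

model = pipeline.fit(train)
accuracy = MulticlassClassificationEvaluator(
    labelCol="label", metricName="accuracy").evaluate(model.transform(test))
print("held-out accuracy:", accuracy)
```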

Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark

Procedia PDF Downloads 86
649 Data-Driven Decision Making: A Reference Model for Organizational, Educational and Competency-Based Learning Systems

Authors: Emanuel Koseos

Abstract:

Data-Driven Decision Making (DDDM) refers to making decisions that are based on historical data in order to inform practice, develop strategies and implement policies that benefit organizational settings. In educational technology, DDDM facilitates the implementation of differential educational learning approaches, such as Educational Data Mining (EDM) and Competency-Based Education (CBE), which commonly target university classrooms. There is a current need for DDDM models applied to middle and secondary schools, arising from a concern for assessing the needs, progress and performance of students and educators with respect to regional standards, policies and the evolution of curricula. To address these concerns, we propose a DDDM reference model developed using educational key process initiatives as inputs to a machine learning framework implemented with statistical software (SAS, R) to provide a best-practices, complexity-free and automated approach for educators at the regional level. We assessed the efficiency of the model over a six-year period using data from 45 schools and grades K-12 in the Langley, BC, Canada regional school district. We concluded that the model has wider appeal, for example to business learning systems.

Keywords: competency-based learning, data-driven decision making, machine learning, secondary schools

Procedia PDF Downloads 149
648 Human Digital Twin for Personal Conversation Automation Using Supervised Machine Learning Approaches

Authors: Aya Salama

Abstract:

Digital Twin is an emerging research topic that has attracted researchers in the last decade. It is used in many fields, such as smart manufacturing and smart healthcare, because it saves time and money. It is usually related to other technologies such as Data Mining, Artificial Intelligence, and Machine Learning. However, the Human Digital Twin (HDT), in particular, is still a novel idea that needs to prove its feasibility. HDT expands the idea of the Digital Twin to human beings, who are living beings and differ from inanimate physical entities. The goal of this research was to create a human digital twin responsible for automating real-time human replies by simulating human behavior. For this reason, clustering, supervised classification, topic extraction, and sentiment analysis were studied in this paper. The feasibility of the HDT for generating personal replies on social messaging applications was demonstrated in this work. The overall accuracy of the proposed approach was 63%, which is a very promising result that can open the way for researchers to expand the idea of HDT. This was achieved by using Random Forest for clustering the question database and matching new questions. K-nearest neighbor was also applied for sentiment analysis.

Keywords: human digital twin, sentiment analysis, topic extraction, supervised machine learning, unsupervised machine learning, classification, clustering

Procedia PDF Downloads 65
647 Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method

Authors: Dalin Si, Azizan Aziz, Bertrand Lasternas

Abstract:

To encourage building owners to purchase electricity on the wholesale market and reduce building peak demand, this study aims to develop models that predict day-ahead hourly electricity consumption and demand using an artificial neural network (ANN) and a support vector machine (SVM). All prediction models are built in Python with the tools Scikit-learn and PyBrain. The input data for both consumption and demand prediction are the time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU) supply air temperature and solar radiation. Solar radiation, which is unavailable a day ahead, is predicted first, and this estimate is then used as an input to predict consumption and demand. Models to predict consumption and demand are trained in both SVM and ANN and depend on cooling or heating season and on weekdays or weekends. The results show that the ANN is the better option for both consumption and demand prediction. It can achieve a 15.50% to 20.03% coefficient of variation of the root mean square error (CVRMSE) for consumption prediction and a 22.89% to 32.42% CVRMSE for demand prediction, respectively. To conclude, the presented models have the potential to help building owners purchase electricity on the wholesale market, but they are not robust when used in demand response control.
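
A hedged sketch of one consumption model along these lines: a scikit-learn MLP regressor trained on the listed inputs and evaluated with CVRMSE (RMSE divided by the mean of the measured values). The data are synthetic placeholders, not the building data, and the authors' PyBrain network and tuning are not reproduced.

```python
# Hedged sketch of one consumption model: a scikit-learn MLP regressor on
# the listed inputs, evaluated with CVRMSE (RMSE divided by the mean of the
# measured values). Data are synthetic placeholders, not the building data,
# and the authors' PyBrain network and tuning are not reproduced.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n = 2000
# Columns: hour of day, outdoor dry-bulb temp, relative humidity,
# AHU supply air temp, solar radiation.
X = np.column_stack([
    rng.integers(0, 24, n),
    rng.normal(20, 8, n),
    rng.uniform(20, 90, n),
    rng.normal(14, 2, n),
    rng.uniform(0, 900, n),
])
y = 50 + 3.0 * X[:, 1] + 0.02 * X[:, 4] + rng.normal(0, 5, n)   # synthetic hourly kWh

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                                   random_state=1))
model.fit(X_train, y_train)

pred = model.predict(X_test)
rmse = np.sqrt(np.mean((y_test - pred) ** 2))
cvrmse = 100.0 * rmse / np.mean(y_test)
print(f"CVRMSE: {cvrmse:.2f}%")
```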

Keywords: building energy prediction, data mining, demand response, electricity market

Procedia PDF Downloads 291
646 Geomechanical Technologies for Assessing Three-Dimensional Stability of Underground Excavations Utilizing Remote-Sensing, Finite Element Analysis, and Scientific Visualization

Authors: Kwang Chun, John Kemeny

Abstract:

Light detection and ranging (LiDAR) has become a prevalent remote-sensing technology in the geological fields due to its high precision and ease of use. One of its major applications is to use the detailed geometrical information of underground structures as a basis for generating a three-dimensional numerical model that can be used in a geotechnical stability analysis such as FEM or DEM. To date, however, straightforward techniques for reconstructing the numerical model from the scanned data of underground structures have not been well established or tested. In this paper, we propose a comprehensive approach integrating all the various processes, from LiDAR scanning to finite element numerical analysis. The study focuses on converting LiDAR 3D point clouds of geologic structures containing complex surface geometries into a finite element model. This methodology has been applied to Kartchner Caverns in Arizona, where detailed underground and surface point clouds can be used for the analysis of underground stability. Numerical simulations were performed using the finite element code Abaqus and presented with the 3D scientific visualization solution ParaView. The results are useful for studying the stability of all types of underground excavations, including underground mining and tunneling.

Keywords: finite element analysis, LiDAR, remote-sensing, scientific visualization, underground stability

Procedia PDF Downloads 141
645 The South Looking East: The New Geopolitics of Latin America

Authors: Heike Pintor Pirzkall

Abstract:

The positive economic evolution of many countries on the Latin American continent, mainly in South America, has changed the geopolitical position of the region in the world. It is no longer the Hinterland or backyard of the United States; it has now become the Heartland for Europe and Asia. This position has favored the interest of countries like China and India, which are combining trade agreements with special assistance and aid agreements in many fields, such as agriculture, alternative energy resources, defense and mining. As many countries in the region are no longer low-income countries, a more equal relationship in development aid has been created, where the donor and the recipient have become partners and where new actors intervene in a triangular relationship that promotes new, alternative aid structures. Triangular co-operation brings together the best of different actors: providers of development co-operation, partners in South-South co-operation and international organizations. The objective is to share knowledge and implement projects that support the common goal of reducing poverty and promoting development. The intention of this paper is to explain the reasons for Latin America’s “virage” to the east and to give examples of projects and agreements between Latin American countries, China and India, which will help to understand the intensification of south-east relations in recent years.

Keywords: development cooperation, China, Latin America, triangular cooperation, natural resources, partnership

Procedia PDF Downloads 358
644 The Role of Ideophones: Phonological and Morphological Characteristics in Literature

Authors: Cristina Bahón Arnaiz

Abstract:

Many Asian languages, such as Korean and Japanese, are well known for their wide use of sound symbolic words, or ideophones. This is a very particular characteristic which enriches their lexicons hugely. Ideophones are a class of sound symbolic words that utilize sound symbolism to express aspects, states, emotions, or conditions that can be experienced through the senses, such as shape, color, smell, action or movement. Ideophones have very particular characteristics in terms of sound symbolism and morphology, which distinguish them from other words. The phonological characteristics of ideophones are vowel ablaut, or vowel gradation, and consonant mutation. In the case of Korean, there are light vowels and dark vowels. Depending on the type of vowel that is used, the meaning will slightly change. Consonant mutation, also known as consonant ablaut, contributes to the level of intensity, emphasis, and volume of an expression. In addition to these phonological characteristics, there is one main morphological singularity, reduplication, which carries the meaning of continuity, repetition, intensity, emphasis, and plurality. All these characteristics play an important role in both linguistics and literature, as they enhance the meaning of what is being expressed with incredible semantic detail, expressiveness, and rhythm. The following study will analyze the ideophones used in a single paragraph of a Korean novel, which add incredible yet subtle detail to the meaning of the words and advance the expressiveness and rhythm of the text. After presenting the phonological and morphological characteristics of Korean ideophones, the results from analyzing one paragraph of the novel will evidence the important role that ideophones play in literature.

Keywords: ideophones, mimetic words, phonomimes, phenomimes, psychomimes, sound symbolism

Procedia PDF Downloads 124
643 Instructional Immediacy Practices in Asynchronous Learning Environment: Tutors' Perspectives

Authors: Samar Alharbi, Yota Dimitriadi

Abstract:

With the exponential growth of information and communication technologies in higher education, new online teaching strategies have become increasingly important for student engagement and learning. In particular, some institutions depend solely on asynchronous e-learning to provide courses for their students. The major challenge facing these institutions is how to improve the quality of teaching and learning through their asynchronous tools. One of the most important methods that can help e-learners enhance their social learning and social presence in an asynchronous learning setting is immediacy. This study explores tutors’ perceptions of their instructional immediacy practices as part of their communication actions in online learning environments. A mixed-methods design was used under the umbrella of a pragmatic philosophical assumption. The participants included tutors at an educational institution within a Saudi university. The participants were selected through a purposive sampling approach, and an institution that offered fully online courses to students was chosen. The findings from the quantitative data show the importance of teachers’ immediacy practices in an online text-based learning environment. The qualitative data contained three main themes: the tutors’ encouragement of student interaction, their promotion of class participation, and their addressing of the needs of the students. The findings from these mixed methods can provide teachers with insights into instructional designs and strategies that they can adopt in order to use e-immediacy in effective ways, thus improving their students’ online learning experiences.

Keywords: asynchronous e-learning, higher education, immediacy, tutor

Procedia PDF Downloads 179
642 The Mineralogy of Shales from the Pilbara and How Chemical Weathering Affects the Intact Strength

Authors: Arturo Maldonado

Abstract:

In the iron ore mining industry, the intact strength of rock units is defined using the uniaxial compressive strength (UCS). This parameter is very important for the classification of shale materials, allowing the split between rock and cohesive soils based on the magnitude of UCS. For this research, it is assumed that a UCS less than or equal to 1 MPa is representative of soils. Several researchers have anticipated that the magnitude of UCS reduces as weathering progresses; in addition, since UCS is a directional property, its magnitude depends upon the rock fabric orientation. Thus, the paper presents how the UCS of shales is affected by both weathering grade and bedding orientation. The mineralogy of the shales has been defined using hyperspectral and chemical assays to determine the mineral constituents of shale and other non-shale materials. Geological classification tools have been used to define distinct lithological types, and in this manner the author uses mineralogical datasets to recognize and isolate shales from other rock types and to develop ternary plots for fresh and weathered shales. The mineralogical classification of shales has reduced the contamination by other lithology types and facilitated the study of the physical factors affecting the intact strength of shales, such as anisotropic strength due to bedding orientation. The analysis of the mineralogical characteristics of shales is perhaps the most important contribution of this paper to other researchers who may wish to explore similar methods.

Keywords: rock mechanics, mineralogy, shales, weathering, anisotropy

Procedia PDF Downloads 19
641 Identifying Enablers and Barriers of Healthcare Knowledge Transfer: A Systematic Review

Authors: Yousuf Nasser Al Khamisi

Abstract:

Purpose: This paper presents a Knowledge Transfer (KT) framework for healthcare sectors by applying a systematic literature review process to the healthcare organizations domain to identify enablers and barriers of KT in healthcare. Methods: The paper conducted a systematic literature search of peer-reviewed papers that described key elements of KT using four databases (Medline, Cinahl, Scopus, and Proquest) over a 10-year period (1/1/2008–16/10/2017). The results of the literature review were used to build a conceptual framework of KT in healthcare organizations. The author used a systematic review of the literature, as described by Barbara Kitchenham in Procedures for Performing Systematic Reviews. Findings: The paper highlighted the impacts of using the Knowledge Management (KM) concept in a healthcare organization for controlling infectious diseases in hospitals, improving family medicine performance and enhancing quality improvement practices. Moreover, it found that good coding performance is analytically linked with a knowledge sharing network structure rich in brokerage and hierarchy rather than in density. The unavailability or disregard of the latest evidence on more cost-effective or more efficient delivery approaches increases healthcare costs and may lead to unintended results. Originality: The search procedure produced 12,093 results, of which 3,523 were general articles about KM and KT. The titles and abstracts of these articles were screened to separate what is related from what is not. Ninety-four articles were identified by the researchers for full-text assessment. The total number of eligible articles after removing unrelated articles was 22.

Keywords: healthcare organisation, knowledge management, knowledge transfer, KT framework

Procedia PDF Downloads 115
640 Reduction in Hot Metal Silicon through Statistical Analysis at G-Blast Furnace, Tata Steel Jamshedpur

Authors: Shoumodip Roy, Ankit Singhania, Santanu Mallick, Abhiram Jha, M. K. Agarwal, R. V. Ramna, Uttam Singh

Abstract:

The quality of hot metal at any blast furnace is judged by the silicon content in it. Lower hot metal silicon not only enhances process efficiency at steel melting shops but also reduces hot metal costs. The hot metal produced at G Blast Furnace, Tata Steel Jamshedpur, has a significantly higher Si content than that of benchmark blast furnaces. The higher hot metal Si content is mainly due to raw material quality inferior to that used in benchmark blast furnaces. With minimal control over raw material quality, the only option left to control hot metal Si is to optimize the furnace parameters. Therefore, in order to identify the levers to reduce hot metal Si, data mining was carried out and multiple regression models were developed. The statistical analysis revealed that slag B3 {(CaO+MgO)/SiO2}, slag alumina and hot metal temperature are the key controllable parameters affecting hot metal silicon. Contour plots were used to determine the optimum ranges of the levels identified through statistical analysis. A trial plan was formulated to operate the relevant parameters at G Blast Furnace within the identified ranges to reduce hot metal silicon. This paper details the process followed and the subsequent reduction in hot metal silicon by 15% at G Blast Furnace.
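
A hedged sketch of the kind of multiple regression described: hot metal Si regressed on slag B3, slag alumina, and hot metal temperature with ordinary least squares. The data frame below is a synthetic placeholder, and the coefficients and effect directions are assumptions, not G Blast Furnace operating data.

```python
# Hedged sketch of the kind of multiple regression described: hot metal Si
# regressed on slag B3, slag alumina, and hot metal temperature with OLS.
# The data are synthetic placeholders, not G Blast Furnace operating data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
data = pd.DataFrame({
    "slag_b3": rng.normal(1.05, 0.05, n),        # (CaO + MgO) / SiO2
    "slag_al2o3": rng.normal(20.0, 1.5, n),       # %
    "hm_temp": rng.normal(1495, 12, n),           # deg C
})
# Synthetic response with an assumed direction of effects.
data["hm_si"] = (2.5 - 0.8 * data["slag_b3"] + 0.01 * data["slag_al2o3"]
                 + 0.0006 * (data["hm_temp"] - 1450) + rng.normal(0, 0.05, n))

X = sm.add_constant(data[["slag_b3", "slag_al2o3", "hm_temp"]])
model = sm.OLS(data["hm_si"], X).fit()
print(model.summary())

# The fitted model can then be explored (e.g. contour plots of predicted Si
# over slag B3 and hot metal temperature) to pick an operating window.
```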

Keywords: blast furnace, optimization, silicon, statistical tools

Procedia PDF Downloads 199
639 Optimization and Automation of Functional Testing with White-Box Testing Method

Authors: Reyhaneh Soltanshah, Hamid R. Zarandi

Abstract:

In order to be more efficient in industries related to computer systems, software testing is necessary despite the time and money it consumes. In embedded system software testing, complete knowledge of the embedded system architecture is necessary to avoid significant costs and damages. Software tests increase the price of the final product. The aim of this article is to provide a method to reduce the time and cost of tests based on program structure. First, a complete review of eleven white-box test methods based on the 2015 and 2021 versions of ISO/IEC/IEEE 29119 has been done. The proposed algorithm is designed using these two versions of the 29119 standard, and some white-box testing methods that are expensive or have little coverage have been removed. On each of the functions, white-box test methods were applied according to the 29119 standard, and then the proposed algorithm was implemented on the functions. To speed up the implementation of the proposed method, the Unity framework has been used with some changes. The Unity framework can be used in embedded software testing because it is open source and able to implement white-box test methods. The test items obtained from these two approaches were evaluated using a mathematical ratio, which for various software reduced the test cost by between 50% and 80% and reached the desired result with the minimum number of test items.

Keywords: embedded software, reduce costs, software testing, white-box testing

Procedia PDF Downloads 11