Search results for: multiple data stores

7427 Natural Language News Generation from Big Data

Authors: Bastian Haarmann, Lukas Sikorski

Abstract:

In this paper, we introduce an NLG application for the automatic creation of ready-to-publish texts from big data. The resulting fully automatically generated news stories closely resemble the style in which a human writer would draw up such a story. Topics include soccer games, stock exchange market reports, and weather forecasts. Each generated text is unique. Ready-to-publish stories written by a computer application can help humans to quickly grasp the outcomes of big data analyses, save journalists time-consuming pre-formulations, and cater to rather small audiences by offering stories that would otherwise not exist.

Keywords: Big data, natural language generation, publishing, robotic journalism.

Downloads 1661

7426 Production of Apricot Vinegar Using an Isolated Acetobacter Strain from Iranian Apricot

Authors: Keivan Beheshti Maal, Rasoul Shafiei, Noushin Kabiri

Abstract:

Vinegar, or sour wine, is a product of alcoholic and subsequent acetous fermentation of sugary precursors derived from several fruits or starchy substrates. This delicious food additive and supplement contains not less than 4 grams of acetic acid per 100 cubic centimeters at 20°C. Among the large number of bacteria that are able to produce acetic acid, only a few genera are used in the vinegar industry, the most significant of which are Acetobacter and Gluconobacter. In this research, we isolated and identified an Acetobacter strain from Iranian apricot, a delicious summer fruit that is highly susceptible to decay, which we gathered from fruit stores in Isfahan, Iran. The main culture media we used were Carr, GYC, Frateur and an industrial medium for vinegar production. We isolated this strain using a novel miniature fermentor that we built at Pars Yeema Biotechnologists Co., Isfahan Science and Technology Town (ISTT), Isfahan, Iran. Microscopic examination of the isolated strain showed gram-negative rods to coccobacilli. The strain was catalase-positive and oxidase-negative and could ferment ethanol to acetic acid. It also showed acceptable growth in 5%, 7% and 9% ethanol concentrations at 30°C in modified Carr media after 24, 48 and 96 hours of incubation, respectively. Given its tolerance of high ethanol concentrations after four days of incubation and its high acetic acid production (8.53% after 144 hours), this strain could be considered a suitable industrial strain for the production of a new type of vinegar, apricot vinegar, with a new and delicious taste. In conclusion, this is the first report of the isolation and identification of an Acetobacter strain from Iranian apricot with very good tolerance of high ethanol concentrations as well as high acetic acid productivity within an industrially acceptable incubation period. This strain could be used in the vinegar industry to convert spoiled apricots into a beneficial product, and the characteristics mentioned make it an amenable strain for food and agricultural biotechnology.

Keywords: Acetic Acid Bacteria, Acetobacter, Fermentation, Food and Agricultural Biotechnology, Iranian Apricot, Vinegar.

Downloads 3945

7425 Yield Prediction Using Support Vectors Based Under-Sampling in Semiconductor Process

Authors: Sae-Rom Pak, Seung Hwan Park, Jeong Ho Cho, Daewoong An, Cheong-Sool Park, Jun Seok Kim, Jun-Geol Baek

Abstract:

It is important to predict yield in the semiconductor test process in order to increase yield. In this study, yield prediction means effectively identifying defective dies, wafers or lots. The semiconductor test process consists of several test steps, and each test includes various test items. In other words, test data are large and complex. They are also disproportionately distributed, as the number of data points belonging to the FAIL class is extremely low. For yield prediction, general data mining techniques are limited without data preprocessing because of these inherent properties of test data. Therefore, this study proposes an under-sampling method using a support vector machine (SVM) to eliminate the imbalance. To evaluate performance, a random under-sampling method is compared with the proposed method using actual semiconductor test data. The results show that the sampling method using SVM is effective in generating a robust model for yield prediction.
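
As a rough illustration of the random under-sampling baseline mentioned in this abstract (not the proposed SVM-based method, whose details the abstract does not give), the sketch below keeps every minority-class sample and draws an equal-sized random subset of the majority class. The function name, the PASS/FAIL label strings and the toy data are assumptions made for the example.

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted indices that keep all minority samples plus an
    equal-sized random subset of the majority class."""
    rng = random.Random(seed)
    fail_idx = [i for i, y in enumerate(labels) if y == "FAIL"]
    pass_idx = [i for i, y in enumerate(labels) if y == "PASS"]
    # The shorter index list is the minority class, the longer the majority.
    minority, majority = sorted((fail_idx, pass_idx), key=len)
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)


# Hypothetical toy data: 95 passing dies and 5 failing dies.
labels = ["PASS"] * 95 + ["FAIL"] * 5
balanced = random_undersample(labels)
balanced_labels = [labels[i] for i in balanced]
```

After the call, `balanced_labels` holds the same number of PASS and FAIL entries, which is the class balance a classifier would then be trained on.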

Keywords: Yield Prediction, Semiconductor Test Process, Support Vector Machine, Under Sampling

Downloads 2376

7424 A New Model for Discovering XML Association Rules from XML Documents

Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani

Abstract:

The inherent flexibility of XML in both structure and semantics makes mining XML data a complex task with more challenges than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of data in the final rules. The frequent subtrees, based on the user-provided support, are split into complementary subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.

Keywords: XML, Data Mining, Association Rule Mining.

Downloads 1619

7423 Modelling Silica Optical Fibre Reliability: A Software Application

Authors: I. Severin, M. Caramihai, R. El Abdi, M. Poulain, A. Avadanii

Abstract:

In order to assess optical fibre reliability under different environmental and stress conditions, series of tests are performed that simulate overlapping, controlled variation of chemical and mechanical factors. Each series of tests may be compared using statistical processing, i.e. Weibull plots. Because of the large amount of data to process, a software application proved useful for interpreting selected series of experiments as a function of the factors considered. The current paper presents a software application used in the storage, modelling and interpretation of experimental data gathered from optical fibre testing. It deals strictly with the software part of the project (the modelling, storage and processing of user-supplied data).
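
For readers unfamiliar with Weibull plots, the sketch below shows one common way to build them: failure stresses are sorted, each is assigned a cumulative failure probability F, and the points (ln stress, ln(-ln(1 - F))) are fitted with a straight line whose slope estimates the Weibull modulus of a two-parameter Weibull distribution. The plotting-position estimator (rank - 0.5) / n, the function names and the sample stresses are assumptions for illustration and are not taken from the paper's software.

```python
import math


def weibull_plot_points(failure_stresses):
    """Map failure stresses to Weibull-plot coordinates."""
    ordered = sorted(failure_stresses)
    n = len(ordered)
    points = []
    for rank, stress in enumerate(ordered, start=1):
        f = (rank - 0.5) / n  # assumed plotting-position estimator
        points.append((math.log(stress), math.log(-math.log(1.0 - f))))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical failure stresses (GPa) for one test series.
stresses = [4.8, 5.1, 5.3, 5.4, 5.6, 5.7, 5.9]
points = weibull_plot_points(stresses)
modulus, intercept = fit_line(points)
```

Comparing the fitted slope and intercept across test series is one simple way to compare the series numerically.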

Keywords: Optical fibres, computer aided analysis, data models, data processing, graphical user interfaces.

Downloads 1804

7422 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represent another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Downloads 819

7421 Integrated Approaches to Enhance Aggregate Production Planning with Inventory Uncertainty Based On Improved Harmony Search Algorithm

Authors: P. Luangpaiboon, P. Aungkulanon

Abstract:

This work presents a multiple objective linear programming (MOLP) model based on the desirability function approach for solving the aggregate production planning (APP) decision problem, building upon Masud and Hwang's model. The proposed model minimises total production costs, carrying or backordering costs and rates of change in labor levels. An industrial case demonstrates the feasibility of applying the proposed model to APP problems with three scenarios of inventory levels. The proposed model yields an efficient compromise solution and the overall levels of decision maker (DM) satisfaction with the multiple combined response levels. There has been a trend to solve complex planning problems using various metaheuristics. Therefore, in this paper, the multi-objective APP problem is solved by hybrid metaheuristics that add the hunting search (HuSIHSA) and firefly (FAIHSA) mechanisms to the improved harmony search algorithm. The results obtained are then compared. It is observed that the FAIHSA can be used as a successful alternative solution mechanism for solving APP problems over the three scenarios. Furthermore, the FAIHSA provides a systematic framework for facilitating the decision-making process, enabling a decision maker to interactively modify the desirability function approach and related model parameters until a good optimal solution is obtained with proper selection of control parameters.

Keywords: Aggregate Production Planning, Desirability Function Approach, Improved Harmony Search Algorithm, Hunting Search Algorithm and Firefly Algorithm.

Downloads 1906

7420 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.

Downloads 1904

7419 An Implementation of Data Reusable MPEG Video Coding Scheme

Authors: Vasily G. Moshnyaga

Abstract:

This paper presents an optimized MPEG2 video codec implementation, which drastically reduces the number of computations and memory accesses required for video compression. Unlike traditional schemes, we reuse data stored in frame memory to omit unnecessary coding operations and memory reads/writes for unchanged macroblocks. Due to dynamic memory sharing among reference frames, data-driven macroblock characterization and selective macroblock processing, we perform less than 15% of the total operations required by a conventional coder while maintaining high picture quality.
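
The idea of skipping work for unchanged macroblocks can be sketched as follows. The abstract does not say how macroblocks are characterized, so this example assumes a simple sum of absolute differences (SAD) against the previous frame with a threshold; the function name, the threshold and the toy frames are hypothetical.

```python
MB = 16  # MPEG-2 macroblock size in luma samples


def changed_macroblocks(prev, curr, threshold=0):
    """Return (row, col) indices of macroblocks whose SAD against the
    previous frame exceeds the threshold; the rest could be skipped."""
    height, width = len(curr), len(curr[0])
    changed = []
    for top in range(0, height, MB):
        for left in range(0, width, MB):
            sad = sum(
                abs(curr[y][x] - prev[y][x])
                for y in range(top, min(top + MB, height))
                for x in range(left, min(left + MB, width))
            )
            if sad > threshold:
                changed.append((top // MB, left // MB))
    return changed


# Hypothetical 32x32 luma frames that differ in a single sample.
prev = [[0] * 32 for _ in range(32)]
curr = [row[:] for row in prev]
curr[20][5] = 255
to_encode = changed_macroblocks(prev, curr)
```

Here only one of the four macroblocks differs from the stored frame, so only that block would need to be coded again.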

Keywords: Data reuse, adaptive processing, video coding, MPEG

Downloads 1252

7418 A Challenge to Acquire Serious Victims’ Locations during Acute Period of Giant Disasters

Authors: Keiko Shimazu, Yasuhiro Maida, Tetsuya Sugata, Daisuke Tamakoshi, Kenji Makabe, Haruki Suzuki

Abstract:

In this paper, we report how to acquire serious victims’ locations in the acute stage of large-scale disasters, using an Emergency Information Network System that we designed. The background of our concept is the Great East Japan Earthquake, which occurred on March 11th, 2011. Through many experiences of national crises caused by earthquakes and tsunamis, we have established advanced communication systems and advanced disaster medical response systems. However, Japan was devastated by huge tsunamis that swept a vast area of Tohoku, causing a complete breakdown of all infrastructure, including telecommunications. Therefore, we recognized the need for interdisciplinary collaboration between experts in disaster medicine, regional administrative sociology, satellite communication technology and systems engineering. Communication of emergency information was limited, causing a serious delay in the initial rescue and medical operations. For emergency rescue and medical operations, the most important thing is to identify the number of casualties, their locations and status, and to dispatch doctors and rescue workers from multiple organizations. In the case of the Tohoku earthquake, no dispatching mechanism or decision support system existed to allocate the appropriate number of doctors and locate disaster victims. Even though the doctors and rescue workers from multiple government organizations have their own dedicated communication systems, the systems are not interoperable.

Keywords: Crisis management, disaster mitigation, messing, MGRS, Satellite communication system.

Downloads 813

7417 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique

Authors: Hyun-Woo Cho

Abstract:

Early diagnostic decision making in industrial processes is essential to producing high-quality final products. It provides early warning of a special event in a process and helps identify its assignable cause. This work presents hybrid diagnostic schemes for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. Nonlinear kernel-based dimension reduction is performed to obtain nonlinear classification decision boundaries for fault classes. In order to enhance diagnosis performance for batch processes, the data are filtered to remove irrelevant information from the process data. To assess the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. The performance of the presented diagnosis schemes is demonstrated using batch process data.

Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach

Downloads 1304

7416 Increasing Replica Consistency Performances with Load Balancing Strategy in Data Grid Systems

Authors: Sarra Senhadji, Amar Kateb, Hafida Belbachir

Abstract:

Data replication in data grid systems is one of the important solutions for improving availability, scalability, and fault tolerance. However, this technique also raises issues such as maintaining replica consistency. Moreover, as grid environments are very dynamic, some nodes can become more heavily loaded than others and eventually turn into a bottleneck. The main idea of our work is to propose a complementary solution that combines replica consistency maintenance with a dynamic load balancing strategy to improve access performance in a simulated grid environment.

Keywords: Consistency, replication, data grid, load balancing.

Downloads 2311

7415 Nonparametric Control Chart Using Density Weighted Support Vector Data Description

Authors: Myungraee Cha, Jun Seok Kim, Seung Hwan Park, Jun-Geol Baek

Abstract:

In manufacturing industries, advances in measurement have increased the number of monitored variables, and the importance of multivariate control has consequently come to the fore. Statistical process control (SPC) is one of the most widely used approaches to multivariate control charting. Nevertheless, the application of SPC to processes is restricted because it assumes that the data follow a specific distribution. Unfortunately, process data are composed of a mixture of several processes and are hard to describe with a single distribution. Nonparametric control charts have therefore emerged as an alternative to conventional SPC, their strength being the absence of parameter estimation. The SVDD-based control chart is one of the nonparametric control charts and has the advantage of a flexible control boundary. However, the basic concept of SVDD overlooks an important data characteristic, the density distribution. Therefore, we propose DW-SVDD (Density Weighted SVDD) to address this weakness of conventional SVDD. DW-SVDD takes the density of the data into account by introducing the notion of a density weight. We build a control chart on the newly proposed SVDD, and a simulation study on data from various distributions is conducted to demonstrate the improvement in performance.

Keywords: Density estimation, Multivariate control chart, One-class classification, Support vector data description (SVDD)

Downloads 2105

7414 Identifying a Drug Addict Person Using Artificial Neural Networks

Authors: Mustafa Al Sukar, Azzam Sleit, Abdullatif Abu-Dalhoum, Bassam Al-Kasasbeh

Abstract:

Use and abuse of drugs by teens is very common and can have dangerous consequences. Drugs contribute to physical and sexual aggression such as assault or rape. Some teenagers regularly use drugs to compensate for depression, anxiety or a lack of positive social skills. Teenagers' resort to smoking should not be minimized, because it can be a "gateway drug" to other drugs (marijuana, cocaine, hallucinogens, inhalants, and heroin). The combination of teenagers' curiosity, risk-taking behavior, and social pressure makes it very difficult to say no. This leads most teenagers to the question: "Will it hurt to try once?" Nowadays, technological advances are changing our lives very rapidly and adding many technologies that help us to track the risk of drug abuse, such as smartphones, Wireless Sensor Networks (WSNs), and the Internet of Things (IoT). These technologies may help in the early discovery of drug abuse in order to prevent the influence of drugs on the abuser from worsening. In this paper, we have developed a Decision Support System (DSS) for detecting drug abuse using an Artificial Neural Network (ANN); we used a Multilayer Perceptron (MLP) feed-forward neural network in developing the system. The input layer includes 50 variables, while the output layer contains one neuron that indicates whether the person is a drug addict. An iterative process is used to determine the number of hidden layers and the number of neurons in each one. We ran multiple experimental models with the Log-Sigmoid transfer function. In particular, 10-fold cross-validation schemes are used to assess the generalization of the proposed system. The experimental results show 98.42% classification accuracy for correct diagnosis in our system. The data were taken from 184 cases in Jordan according to a set of questions compiled from specialists, and were obtained through the families of drug abusers.

Keywords: Artificial Neural Network, Decision Support System, drug abuse, drug addiction, Multilayer Perceptron.

Downloads 1657

7413 Slugging Frequency Correlation for Inclined Gas-liquid Flow

Authors: V. Hernandez-Perez, M. Abdulkadir, B. J. Azzopardi

Abstract:

In this work, new experimental data for slugging frequency in inclined gas-liquid flow are reported, and a new correlation is proposed. Scale experiments were carried out using a mixture of air and water in a 6 m long pipe. Two different pipe diameters were used, namely, 38 and 67 mm. The data were taken with capacitance type sensors at a data acquisition frequency of 200 Hz over an interval of 60 seconds. For the range of flow conditions studied, the liquid superficial velocity is observed to influence the frequency strongly. A comparison of the present data with correlations available in the literature reveals a lack of agreement. A new correlation for slug frequency has been proposed for the inclined flow, which represents the main contribution of this work.

Keywords: slug frequency, inclined flow

Downloads 3143

7412 FCA-based Conceptual Knowledge Discovery in Folksonomy

Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang

Abstract:

Tagging data, consisting of users, tags and resources, constitute a folksonomy, a user-driven, bottom-up approach to organizing and classifying information on the Web. Tagging data stored in a folksonomy include a lot of very useful information and knowledge. However, finding an appropriate approach for analyzing tagging data and discovering hidden knowledge from them remains one of the main problems in folksonomy mining research. In this paper, we propose a folksonomy data mining approach based on FCA for easily discovering hidden knowledge from a folksonomy. We also demonstrate, through our experiment, how the proposed approach can be applied in a collaborative tagging system. The proposed approach can be applied to interesting areas such as social network analysis and semantic web mining.

Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.

Downloads 2013

7411 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. Based on the results of the experiment, we conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: Biometric voice prints, fundamental frequency, phonogram, speech signal, temporal characteristics.

Downloads 537

7410 Assessing the Impact of Quinoa Cultivation Adopted to Produce a Secure Food Crop and Poverty Reduction by Farmers in Rural Pakistan

Authors: Ejaz Ashraf, Raheel Babar, Muhammad Yaseen, Hafiz Khurram Shurjeel, Nosheen Fatima

Abstract:

The main purpose of this study was to assess farmers' adoption level for quinoa cultivation after they had been taught through the training and visit extension approach. In the 21st century, population structure, climate, food requirements and people's eating habits are changing rapidly. In this scenario, farmers must play a key role in sustainable crop development and production by adopting new crops that may also help to overcome food insecurity and reduce poverty in rural areas. Quinoa cultivation in Pakistan is at an early stage, and there is a need to raise awareness among farmers to grow quinoa. In mid-2015, a training and visit extension approach was used to raise awareness and convince farmers to grow quinoa in the area. During the training and visit extension program, 80 farmers were randomly selected for training in quinoa cultivation. Later on, these farmers trained 60 more farmers living in their neighborhood. After six months, a survey was conducted with all 140 farmers to assess the impact of the training and visit program on the respondents' adoption level for the quinoa crop. The survey instrument was developed with the help of a literature review and experts on the crop. The validity and reliability of the instrument were checked before full data collection. The data were analyzed using SPSS. Multiple regression analysis was used to interpret the survey results, which indicated that factors such as information/training and changes in agronomic and plant protection practices play a key role in the adoption of quinoa cultivation by respondents. In addition, the model explains more than 50% of the variation in the adoption level of respondents. It is concluded that farmers need timely information for improved knowledge of agronomic and plant protection practices in order to adopt cultivation of the quinoa crop in the area.

Keywords: Farmers, quinoa, adoption, contact, training and visit.

Downloads 899

7409 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

Authors: Ruchika Malhotra, Megha Khanna

Abstract:

The development of change prediction models can help software practitioners in planning testing and inspection resources at early phases of software development. However, a major challenge faced during the training process of any classification model is the imbalanced nature of the software quality data. Data with very few minority outcome categories lead to an inefficient learning process, and a classification model developed from the imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling the imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics while dealing with the imbalanced data. In order to empirically validate different alternatives, the study uses change data from three application packages of an open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling methods and robust performance measures.

Keywords: Change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics.

Downloads 1505

7408 Neuropalliative Care in Patients with Progressive Neurological Disease in Czech Republic: Study Protocol

Authors: R. Bužgová, R. Kozáková, M. Škutová, M. Bar, P. Ressner, P. Bártová

Abstract:

Introduction: Currently, there has been an increasing concern about the provision of palliative care in non-oncological patients in both professional literature and clinical practice. However, there is not much scientific information on how to provide neurological and palliative care together. The main objective of the project is to create and to verify a concept of neuro-palliative and rehabilitative care for patients with selected neurological diseases in an advanced stage of the disease and also to evaluate bio-psychosocial and spiritual needs of these patients and their caregivers related to the quality of life using created standardized tools. Methodology: Triangulation of research methods (qualitative and quantitative) will be used. A concept of care and assessment tools will be developed by analyzing interviews and focus groups. Qualitative data will be analyzed using grounded theory. The concept of care will be tested in the context of the intervention study. Using quantitative analysis, we will assess the effect of an intervention provided on the saturation of needs, quality of life, and quality of care. A research sample will be made up of the patients with selected neurological diseases (Parkinson’s syndrome, motor neuron disease, multiple sclerosis, Huntington’s disease), together with patients’ family members. Based on the results, educational materials and a certified course for health care professionals will be created. Findings: Based on qualitative data analysis, we will propose the concept of integrated care model combining neurological, rehabilitative and specialist palliative care for patients with selected neurological diseases in different settings of care and services. Patients’ needs related to quality of life will be described by newly created and validated measuring tools before the start of intervention (application of neuro-palliative and palliative approach) and then in the time interval.
Conclusion: Based on the results, educational materials and a certified course for doctors and health care professionals will be created.

Keywords: Multidisciplinary approach, neuropalliative care, research, quality of life.

Downloads 890

7407 Optimizing Resource Allocation and Indoor Location Using Bluetooth Low Energy

Authors: Néstor Álvarez-Díaz, Pino Caballero-Gil, Héctor Reboso-Morales, Francisco Martín-Fernández

Abstract:

The "Internet of Things" (IoT) trend has developed in recent years, leading to the emergence of innovative communication methods among multiple devices. The appearance of Bluetooth Low Energy (BLE) has given IoT a push in relation to smartphones. At present, a set of new applications related to topics such as entertainment and advertising has begun to be developed, but not much has been done so far to take advantage of the potential that these technologies offer in many business areas and in everyday tasks. In the present work, the application of BLE technology and smartphones is proposed for business areas related to the optimization of resource allocation in huge facilities such as airports. An indoor location system has been developed through triangulation methods using BLE beacons. The described system can be used to locate all employees inside the building in such a way that any task can be automatically assigned to a group of employees. It should be noted that this system can be used not only to link needs with employees according to distances; it also takes into account other factors such as occupation level or category. In addition, it has been endowed with a security system to manage sensitive business and personnel data. The efficiency of communications is another essential characteristic that has been taken into account in this work.
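
The abstract does not detail its triangulation method, so the following is only a sketch of one common approach to beacon-based positioning: two-dimensional trilateration from three beacons with known coordinates and estimated distances (in practice such distances are often derived from signal strength, which is not modeled here). The beacon positions, the function name and the test point are hypothetical.

```python
import math


def trilaterate(beacons, distances):
    """Estimate (x, y) from three beacon positions and distances to them."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two leaves
    # two linear equations in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons must not be collinear")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical beacon layout (metres) and a known position for checking.
beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos = (3.0, 4.0)
distances = [math.hypot(true_pos[0] - bx, true_pos[1] - by) for bx, by in beacons]
estimate = trilaterate(beacons, distances)
```

With exact distances the estimate reproduces the true position; with noisy distances, a least-squares fit over more than three beacons is a common refinement.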

Keywords: Bluetooth Low Energy, indoor location, resource assignment, smartphones.

Downloads 1648

7406 An Exploratory Study of Reliability of Ranking vs. Rating in Peer Assessment

Authors: Yang Song, Yifan Guo, Edward F. Gehringer

Abstract:

Fifty years of research has found great potential for peer assessment as a pedagogical approach. With peer assessment, not only do students receive more copious assessments; they also learn to become assessors. In recent decades, more educational peer assessments have been facilitated by online systems. Those online systems are designed differently to suit different class settings and student groups, but they basically fall into two categories: rating-based and ranking-based. The rating-based systems ask assessors to rate the artifacts one by one following some review rubrics. The ranking-based systems allow assessors to review a set of artifacts and give a rank for each of them. Though there are different systems and a large number of users of each category, there is no comprehensive comparison on which design leads to higher reliability. In this paper, we designed algorithms to evaluate assessors' reliabilities based on their rating/ranking against the global ranks of the artifacts they have reviewed. These algorithms are suitable for data from both rating-based and ranking-based peer assessment systems. The experiments were done based on more than 15,000 peer assessments from multiple peer assessment systems. We found that the assessors in ranking-based peer assessments are at least 10% more reliable than the assessors in rating-based peer assessments. Further analysis also demonstrated that the assessors in ranking-based assessments tend to assess the more differentiable artifacts correctly, but there is no such pattern for rating-based assessors.

Keywords: Peer assessment, peer rating, peer ranking, reliability.

Downloads 1094

7405 Human Fall Detection by FMCW Radar Based on Time-Varying Range-Doppler Features

Authors: Xiang Yu, Chuntao Feng, Lu Yang, Meiyang Song, Wenhao Zhou

Abstract:

Existing two-dimensional micro-Doppler feature extraction ignores the correlation between spatial and temporal features. The time dimension is therefore introduced into the range-Doppler map, and a frequency-modulated continuous-wave (FMCW) radar human fall detection algorithm based on time-varying range-Doppler features is proposed. First, a sequence of range-Doppler maps is generated from the echo signals of the continuous motion of the human body collected by the radar. Then the three-dimensional data cube composed of multiple frames of range-Doppler maps is input into a three-dimensional Convolutional Neural Network (3D CNN). The spatial and temporal features of the time-varying range-Doppler maps are extracted simultaneously by the convolution and pooling layers. Finally, the extracted spatial and temporal features are input into the fully connected layer for classification. The experimental results show that the proposed fall detection algorithm has a detection accuracy of 95.66%.
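
To make the data-cube step concrete, here is a minimal sketch of how one frame of FMCW beat-signal samples can be turned into a range-Doppler map and how several frames stack into the cube a 3D CNN would consume. It uses a plain 2-D discrete Fourier transform on a synthetic single-target frame. The function names and the synthetic signal are assumptions, a real pipeline would use an FFT with windowing, and the network itself is not shown.

```python
import cmath

def range_doppler_map(frame):
    """Magnitude of the 2-D DFT of one radar frame.

    frame[m][n] is complex sample n (fast time) of chirp m (slow time).
    Returns rd[d][r]: d indexes Doppler bins, r indexes range bins.
    """
    M, N = len(frame), len(frame[0])
    rd = [[0.0] * N for _ in range(M)]
    for d in range(M):
        for r in range(N):
            acc = 0j
            for m in range(M):
                for n in range(N):
                    phase = -2j * cmath.pi * (d * m / M + r * n / N)
                    acc += frame[m][n] * cmath.exp(phase)
            rd[d][r] = abs(acc)
    return rd

def synthetic_frame(M, N, range_bin, doppler_bin):
    """Ideal noise-free frame for a single point target (illustration only)."""
    return [[cmath.exp(2j * cmath.pi * (doppler_bin * m / M + range_bin * n / N))
             for n in range(N)] for m in range(M)]

def data_cube(frames):
    """Stack per-frame maps into cube[t][d][r] (time, Doppler, range)."""
    return [range_doppler_map(f) for f in frames]

def peak_bin(rd):
    """Return (doppler_bin, range_bin) of the strongest cell."""
    cells = ((d, r) for d in range(len(rd)) for r in range(len(rd[0])))
    return max(cells, key=lambda dr: rd[dr[0]][dr[1]])
```

Stacking frames in which the peak moves from one range bin to the next gives the time-varying structure that the convolution along the time axis is meant to capture.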

Keywords: FMCW radar, fall detection, 3D CNN, time-varying range-Doppler features.

Downloads 482

7404 Plant Varieties Selection System

Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh

Abstract:

Nowadays, meteorological and environmental data are widely used in applications such as plant variety selection systems. Selecting the plant variety for a planted area is of utmost importance for all crops, including sugarcane, which has many varieties. A variety planted without proper selection may not be adapted to the climate or soil conditions of the planted area. Poor growth, bloom drop, poor fruit, and low prices can result from varieties that were not recommended for the planted area. This paper presents a plant variety selection system for planted areas in Thailand that applies decision tree techniques to meteorological and environmental data. Developed as an environmental data analysis tool, the software makes analysis of the results easier and faster. Our software is a front end of WEKA that provides fundamental data mining functions such as classification, clustering, and analysis functions. It also supports pre-processing, analysis, and decision tree output, with export of the results. Finally, our software can export the results to the Google Maps API in order to display them and plot plant icons effectively.

Keywords: Plant varieties selection system, decision tree, expert recommendation.

Downloads 1778

7403 Jitter Transfer in High Speed Data Links

Authors: Tsunwai Gary Yip

Abstract:

Phase locked loops for data links operating at 10 Gb/s or faster are low phase noise devices designed to operate with a low jitter reference clock. Characterization of their jitter transfer function is difficult because the intrinsic noise of the device is comparable to the random noise level in the reference clock signal. A linear model is proposed to account for the intrinsic noise of a PLL. The intrinsic noise data of a PLL for 10 Gb/s links is presented. The jitter transfer function of a PLL in a test chip for 12.8 Gb/s data links was determined in experiments using the 400 MHz reference clock as the source of simultaneous excitations over a wide range of frequency. The result shows that the PLL jitter transfer function can be approximated by a second order linear model.
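
The abstract states only that the jitter transfer function can be approximated by a second order linear model. The sketch below evaluates one common textbook form of such a model, H(s) = (2ζω_n s + ω_n²) / (s² + 2ζω_n s + ω_n²). Both the choice of this particular form and the parameter names (natural frequency `fn_hz`, damping factor `zeta`) are assumptions, and no values from the paper are used.

```python
import math

def jitter_transfer_db(f_hz, fn_hz, zeta):
    """Magnitude in dB of an assumed second-order PLL jitter transfer function.

    f_hz:  jitter frequency at which to evaluate the response
    fn_hz: natural frequency of the loop (hypothetical parameter)
    zeta:  damping factor (hypothetical parameter)
    """
    s = 1j * 2 * math.pi * f_hz
    wn = 2 * math.pi * fn_hz
    h = (2 * zeta * wn * s + wn**2) / (s**2 + 2 * zeta * wn * s + wn**2)
    return 20 * math.log10(abs(h))
```

With this form, low-frequency reference jitter passes through at about 0 dB, there is some peaking near the natural frequency, and jitter well above the loop bandwidth is attenuated.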

Keywords: Intrinsic phase noise, jitter in data link, PLL jitter transfer function, high speed clocking in electronic circuit

Downloads 1927

7402 Protein Secondary Structure Prediction Using Parallelized Rule Induction from Coverings

Authors: Leong Lee, Cyriac Kandoth, Jennifer L. Leopold, Ronald L. Frank

Abstract:

Protein 3D structure prediction has always been an important research area in bioinformatics. In particular, the prediction of secondary structure has been a well-studied research topic. Despite the recent breakthrough of combining multiple sequence alignment information and artificial intelligence algorithms to predict protein secondary structure, the Q3 accuracy of various computational prediction algorithms has rarely exceeded 75%. In a previous paper [1], this research team presented a rule-based method called RT-RICO (Relaxed Threshold Rule Induction from Coverings) to predict protein secondary structure. The average Q3 accuracy on the sample datasets using RT-RICO was 80.3%, an improvement over comparable computational methods. Although this demonstrated that RT-RICO might be a promising approach for predicting secondary structure, the algorithm's computational complexity and program running time limited its use. Herein a parallelized implementation of a slightly modified RT-RICO approach is presented. This new version of the algorithm facilitated the testing of a much larger dataset of 396 protein domains [2]. Parallelized RT-RICO achieved a Q3 score of 74.6%, which is higher than the consensus prediction accuracy of 72.9% that was achieved for the same test dataset by a combination of four secondary structure prediction methods [2].
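
For readers unfamiliar with the metric, Q3 is the percentage of residues whose three-state secondary structure label (helix, strand, or coil) is predicted correctly. A minimal sketch follows; the function name and the single-letter H/E/C encoding in the example are assumptions made for illustration.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label matches.

    predicted, observed: equal-length strings over the labels
    H (helix), E (strand), C (coil).
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

For example, a prediction that gets 7 of 8 residues right scores 87.5.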

Keywords: data mining, protein secondary structure prediction, parallelization.

Downloads 1576

7401 Investigating the Areas of Self-Reflection in Malaysian Students’ Personal Blogs: A Case Study

Authors: Chen May Oh, Nadzrah Abu Bakar

Abstract:

This case study investigates the areas of self-reflection through the written content of four university students’ blogs. The study was undertaken to explore the categories of self-reflection in relation to the use of blogs. Data collection methods included downloading students’ blog entries and recording individual interviews to further support the data. Data was analyzed using computer-assisted qualitative data analysis software, NVivo, to categorize and code the data. The categories of self-reflection revealed in the findings showed that university students used blogs to reflect on (1) life in varsity, (2) emotions and feelings, (3) various relationships, (4) personal growth, (5) spirituality, (6) health conditions, (7) busyness with daily chores, (8) gifts for people and themselves and (9) personal interests. Overall, all four of the students had positive experiences and felt satisfied using blogs for self-reflection.

Keywords: Blogging, personal growth, self-reflection, university students.

Downloads 1196

7400 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: A. Appe, B. Poluparthi, L. Kasivajjula, U. Mv, S. Bagadi, P. Modi, A. Singh, H. Gunupudi, S. Troiano, J. Paul, J. Stovall, J. Yamamoto

Abstract:

The necessity of data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework that helps identify factors impacting the market share of a healthcare provider facility or hospital (hereafter termed a facility) is of key importance. This pilot study aims to develop a data-driven machine learning regression framework that aids strategists in formulating key decisions to improve the facility’s market share, which in turn helps improve the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and data spanning 60 key facilities in Washington State and about 3 years of history are considered. In the current analysis, market share is defined as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The current study proposes a two-pronged approach of competitor identification and regression to evaluate and predict market share, respectively. A model-agnostic technique, SHAP (SHapley Additive exPlanations), is leveraged to quantify the relative importance of features impacting the market share. Typical techniques in the literature for quantifying the degree of competitiveness among facilities use an empirical method to calculate a competitive factor to interpret the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature-level word vectors, and evaluates the key connected components at the facility level. This technique is robust since it is data-driven, which minimizes the bias from empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities, along with a placeholder to factor in various business rules (e.g., quantifying the patient exchanges, provider references, and sister facilities). Multiple groups of competitors among facilities are identified.
Leveraging the identified competitors, a Random Forest regression model was developed and fine-tuned to predict the market share. To identify key drivers of market share at an overall level, the permutation feature importance of the attributes was calculated. For relative quantification of features at a facility level, SHAP, a model-agnostic explainer, was incorporated. This helped to identify and rank the attributes at each facility that impact the market share. This approach proposes an amalgamation of two popular and efficient modeling practices, viz., machine learning with graphs and tree-based regression techniques, to reduce the bias. With these, we helped to drive strategic business decisions.
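
The market share definition above is simple enough to write down directly. The sketch below assumes that the facility's own encounters are included in the group total, which the abstract does not state explicitly. The function name and the input layout are hypothetical.

```python
def market_share(encounters, facility, competitors):
    """Facility encounters divided by total encounters in its competitor group.

    encounters:  dict mapping facility id -> encounter count
    competitors: ids of the facility's identified competitors
    Assumption: the facility itself is counted in the group total.
    """
    group = set(competitors) | {facility}
    total = sum(encounters[f] for f in group)
    return encounters[facility] / total if total else 0.0
```

Because the denominator depends on the competitor group, the same facility can have a different market share under a different competitor identification, which is why that step precedes the regression.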

Keywords: Competition, DAGs, hospital, healthcare, machine learning, market share, random forest, SHAP.

Downloads 227

7399 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, research on knowledge sources, decision support systems, data mining, and knowledge discovery in databases has grown in importance, and each of these aspects is considered to affect the others. In this article, we merge an information source and a knowledge source to propose a knowledge-based system for management, based on storing and restoring knowledge, to manage information and improve decision making and resources. We use data mining and the Apriori algorithm in the knowledge discovery procedure. One of the problems of the Apriori algorithm is that the user must specify the minimum support threshold. Imagine that a user wants to apply the Apriori algorithm to a database with millions of transactions. The user certainly does not have the necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve the Apriori algorithm. To achieve this goal, we use fuzzy logic to put the data into different clusters before applying the Apriori algorithm to the existing data in the database, and we also try to suggest the most suitable threshold to the user automatically.
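
To show why the threshold matters, here is a minimal sketch of the standard Apriori procedure with a user-supplied minimum support. It covers only the baseline algorithm; the fuzzy clustering and automatic threshold suggestion proposed in the article are not described in the abstract and are not reproduced. The function name and the toy transactions in the usage note are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    transactions: iterable of item collections
    min_support:  fraction of transactions, between 0 and 1
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item that occurs anywhere.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into size-k candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(sub) in level
                          for sub in combinations(c, k - 1))}
    return frequent
```

On four toy baskets (bread and milk; bread and butter; bread, milk and butter; milk alone), a threshold of 0.5 keeps five itemsets while 0.75 keeps only the two single items bread and milk, which illustrates how sensitive the result is to a value the user has to guess.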

Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.

Downloads 2114

7398 A New Algorithm for Cluster Initialization

Authors: Moth'd Belal. Al-Daoud

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the k-means algorithm. Solutions obtained from this technique are dependent on the initialization of cluster centers. In this article we propose a new algorithm to initialize the clusters. The proposed algorithm is based on finding a set of medians extracted from a dimension with maximum variance. The algorithm has been applied to different data sets and good results are obtained.
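
The abstract gives only the outline of the initialization, so the following is one plausible reading rather than the author's exact procedure: pick the dimension with the largest variance, sort the points along it, split them into k consecutive groups, and use the median point of each group as an initial center. The function name and the grouping details are assumptions.

```python
from statistics import pvariance

def median_init(points, k):
    """Choose k initial k-means centers from medians along the widest dimension.

    points: list of equal-length numeric tuples
    k:      number of clusters, with 1 <= k <= len(points)
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    dims = len(points[0])
    # Dimension whose values vary the most across the data set.
    d = max(range(dims), key=lambda j: pvariance([p[j] for p in points]))
    ordered = sorted(points, key=lambda p: p[d])
    n = len(ordered)
    centers = []
    for g in range(k):
        group = ordered[g * n // k:(g + 1) * n // k]
        centers.append(group[len(group) // 2])  # median point of the group
    return centers
```

The returned points would then be passed to an ordinary k-means loop in place of random starting centers.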

Keywords: clustering, k-means, data mining.

Downloads 2089