Search results for: Survey data visualization.

7022 Risk Classification of SMEs by Early Warning Model Based on Data Mining

Authors: Nermin Ozgulbas, Ali Serhan Koyuncugil

Abstract:

One of the biggest problems of SMEs is their tendencies to financial distress because of insufficient finance background. In this study, an Early Warning System (EWS) model based on data mining for financial risk detection is presented. CHAID algorithm has been used for development of the EWS. Developed EWS can be served like a tailor made financial advisor in decision making process of the firms with its automated nature to the ones who have inadequate financial background. Besides, an application of the model implemented which covered 7,853 SMEs based on Turkish Central Bank (TCB) 2007 data. By using EWS model, 31 risk profiles, 15 risk indicators, 2 early warning signals, and 4 financial road maps has been determined for financial risk mitigation.

Keywords: Early Warning Systems, Data Mining, Financial Risk, SMEs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3359

7021 Behavior of Engineering Students in Kuwait University

Authors: M. A. Al-Ajmi, R. S. Al-Kandari

Abstract:

This initial study is concerned with the behavior of engineering students in Kuwait University which became a concern due to the global issues of education in all levels. A survey has been conducted to identify academic and societal issues affecting the engineering student performance. The study is drawing major conclusions with regard to private tutoring and the online availability of textbooks’ solution manuals.

Keywords: Solution manual, engineering, textbook, ethics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735

7020 Long-Range Dependence of Financial Time Series Data

Authors: Chatchai Pesee

Abstract:

This paper examines long-range dependence or longmemory of financial time series on the exchange rate data by the fractional Brownian motion (fBm). The principle of spectral density function in Section 2 is used to find the range of Hurst parameter (H) of the fBm. If 0< H <1/2, then it has a short-range dependence (SRD). It simulates long-memory or long-range dependence (LRD) if 1/2< H <1. The curve of exchange rate data is fBm because of the specific appearance of the Hurst parameter (H). Furthermore, some of the definitions of the fBm, long-range dependence and selfsimilarity are reviewed in Section II as well. Our results indicate that there exists a long-memory or a long-range dependence (LRD) for the exchange rate data in section III. Long-range dependence of the exchange rate data and estimation of the Hurst parameter (H) are discussed in Section IV, while a conclusion is discussed in Section V.

Keywords: Fractional Brownian motion, long-rangedependence, memory, short-range dependence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1861

7019 Directional Drilling Optimization by Non-Rotating Stabilizer

Authors: Eisa Noveiri, Adel Taheri Nia

Abstract:

The Non-Rotating Adjustable Stabilizer / Directional Solution (NAS/DS) is the imitation of a mechanical process or an object by a directional drilling operation that causes a respond mathematically and graphically to data and decision to choose the best conditions compared to the previous mode. The NAS/DS Auto Guide rotary steerable tool is undergoing final field trials. The point-the-bit tool can use any bit, work at any rotating speed, work with any MWD/LWD system, and there is no pressure drop through the tool. It is a fully closed-loop system that automatically maintains a specified curvature rate. The Non–Rotating Adjustable stabilizer (NAS) can be controls curvature rate by exactly positioning and run with the optimum bit, use the most effective weight (WOB) and rotary speed (RPM) and apply all of the available hydraulic energy to the bit. The directional simulator allowed to specify the size of the curvature rate performance errors of the NAS tool and the magnitude of the random errors in the survey measurements called the Directional Solution (DS). The combination of these technologies (NAS/DS) will provide smoother bore holes, reduced drilling time, reduced drilling cost and incredible targeting precision. This simulator controls curvature rate by precisely adjusting the radial extension of stabilizer blades on a near bit Non-Rotating Stabilizer and control process corrects for the secondary effects caused by formation characteristics, bit and tool wear, and manufacturing tolerances.

Keywords: non-rotating, Adjustable stabilizer, simulator, Directional Drilling, optimization, Oil Well Drilling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3251

7018 Building Facade Study in Lahijan City, Iran: The Impact of Facade's Visual Elements on Historical Image

Authors: N. Utaberta, A. Jalali, S. Johar, M. Surat, A. I. Che-Ani

Abstract:

Buildings are considered as significant part in the cities, which plays main role in organization and arrangement of city appearance, which is affects image of that building facades, as an connective between inner and outer space, have a main role in city image and they are classified as rich image and poor image by people evaluation which related to visual architectural and urban elements in building facades. the buildings in Karimi street , in Lahijan city where, lies in north of Iran, contain the variety of building's facade types which, have made a city image in Historical part of Lahijan city, while reflected the Iranian cities identity. The study attempt to identify the architectural and urban elements that impression the image of building facades in historical area, based on public evaluation. Quantitative method were used and the data was collected through questionnaire survey, the result presented architectural style, color, shape, and design evaluated by people as most important factor which should be understate in future development. in fact, the rich architectural style with strong design make strong city image as weak design make poor city image.

Keywords: Building's facade, historical area.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3877

7017 Meta Random Forests

Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti

Abstract:

Leo Breimans Random Forests (RF) is a recent development in tree based classifiers and quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved results of classifications on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques to the random forests. We experiment the working of the ensembles of random forests on the standard data sets available in UCI data sets. We compare the original random forest algorithm with their ensemble counterparts and discuss the results.

Keywords: Random Forests [RF], ensembles, UCI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2671

7016 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of time series data with normal distribution from the inflow into wastewater treatment plant data, composed of several groups differing by mean value. Two simple algorithms, K-mean and EM, were chosen as a clustering method. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was performed for each subgroups. The final model was a sum of the subgroups models. The quality of the obtained model was compared with the regression model made using the same explanatory variables, but with no clustering of data. Results were compared using determination coefficient (R2), measure of prediction accuracy- mean absolute percentage error (MAPE) and comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.

Keywords: Clustering, Data analysis, Data mining, Predictive models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1927

7015 Studies on Determination of the Optimum Distance Between the Tmotes for Optimum Data Transfer in a Network with WLL Capability

Authors: N C Santhosh Kumar, N K Kishore

Abstract:

Using mini modules of Tmotes, it is possible to automate a small personal area network. This idea can be extended to large networks too by implementing multi-hop routing. Linking the various Tmotes using Programming languages like Nesc, Java and having transmitter and receiver sections, a network can be monitored. It is foreseen that, depending on the application, a long range at a low data transfer rate or average throughput may be an acceptable trade-off. To reduce the overall costs involved, an optimum number of Tmotes to be used under various conditions (Indoor/Outdoor) is to be deduced. By analyzing the data rates or throughputs at various locations of Tmotes, it is possible to deduce an optimal number of Tmotes for a specific network. This paper deals with the determination of optimum distances to reduce the cost and increase the reliability of the entire sensor network with Wireless Local Loop (WLL) capability.

Keywords: Average throughput, data rate, multi-hop routing, optimum data transfer, throughput, Tmotes, wireless local loop.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1340

7014 Unified Structured Process for Health Analytics

Authors: Supunmali Ahangama, Danny Chiang Choon Poo

Abstract:

Health analytics (HA) is used in healthcare systems for effective decision making, management and planning of healthcare and related activities. However, user resistances, unique position of medical data content and structure (including heterogeneous and unstructured data) and impromptu HA projects have held up the progress in HA applications. Notably, the accuracy of outcomes depends on the skills and the domain knowledge of the data analyst working on the healthcare data. Success of HA depends on having a sound process model, effective project management and availability of supporting tools. Thus, to overcome these challenges through an effective process model, we propose a HA process model with features from rational unified process (RUP) model and agile methodology.

Keywords: Agile methodology, health analytics, unified process model, UML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2307

7013 The Classification Model for Hard Disk Drive Functional Tests under Sparse Data Conditions

Authors: S. Pattanapairoj, D. Chetchotsak

Abstract:

This paper proposed classification models that would be used as a proxy for hard disk drive (HDD) functional test equitant which required approximately more than two weeks to perform the HDD status classification in either “Pass" or “Fail". These models were constructed by using committee network which consisted of a number of single neural networks. This paper also included the method to solve the problem of sparseness data in failed part, which was called “enforce learning method". Our results reveal that the constructed classification models with the proposed method could perform well in the sparse data conditions and thus the models, which used a few seconds for HDD classification, could be used to substitute the HDD functional tests.

Keywords: Sparse data, Classifications, Committee network

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1713

7012 Analyzing the Participation of Young People in Politics: An Exploratory Study Applied on Motivation in Croatia

Authors: Valentina Piric, Maja Martinovic, Zoran Barac

Abstract:

The application of marketing to the domain of politics has become relevant in recent times. With this article the authors wanted to explore the issue of the current political engagement among young people in Croatia. The question is what makes young people (age 18-30) politically active in young democracies such as that of the Republic of Croatia. Therefore, the objective of this study was to discover the real or hidden motivations behind the decision to actively participate in politics among young members of the two largest political parties in the country – the Croatian Democratic Union and the Social Democratic Party of Croatia. The study expected to find that the motivation for political engagement of young people is often connected with a possible achievement of individual goals and egoistic needs such as: self-acceptance, social success, financial success, prestige, reputation, status, recognition from the others etc. It was also expected that, due to the poor economic and social situation in the country, young people feel an increasing disconnection from politics. Additionally, the authors expected to find that there is a huge potential to engage young people in the political life of the country through a proper and more interactive use of marketing communication campaigns and social media platforms, with an emphasis on highly ethical motives of political activity and their benefits to society. All respondents included in the quantitative survey (sample size [N=100]) are active in one of the two largest political parties in Croatia. The sampling and distribution of the survey occurred in the field in September 2016. The results of the survey demonstrate that in Croatia, the way young people feel about politics and act accordingly, are in fact similar to what the theory describes. The research findings reveal that young people are politically active; however, the challenge is to find a way to motivate even more young people in Croatia to actively participate in the political and democratic processes in the country and to encourage them to see additional benefits out of this practice, not only related to their individual motives, but related more to the well-being of Croatia as a country and of every member of society. The research also discovered a huge potential for political marketing communication possibilities, especially related to interactive social media. It is possible that the social media channels have a stronger influence on the decision-making process among young people when compared to groups of reference. The level of interest in politics among young Croatians varies; some of them are almost indifferent, whilst others express a serious interest in different ways to actively contribute to the political life of the country, defining a participation in the political life of their country almost as their moral obligation. However, additional observations and further research need to be conducted to get a clearer and more precise picture about the interest in politics among young people in Croatia and their social potential.

Keywords: Croatia, marketing communication, motivation, politics, young people.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1443

7011 Accurate Position Electromagnetic Sensor Using Data Acquisition System

Authors: Z. Ezzouine, A. Nakheli

Abstract:

This paper presents a high position electromagnetic sensor system (HPESS) that is applicable for moving object detection. The authors have developed a high-performance position sensor prototype dedicated to students’ laboratory. The challenge was to obtain a highly accurate and real-time sensor that is able to calculate position, length or displacement. An electromagnetic solution based on a two coil induction principal was adopted. The HPESS converts mechanical motion to electric energy with direct contact. The output signal can then be fed to an electronic circuit. The voltage output change from the sensor is captured by data acquisition system using LabVIEW software. The displacement of the moving object is determined. The measured data are transmitted to a PC in real-time via a DAQ (NI USB -6281). This paper also describes the data acquisition analysis and the conditioning card developed specially for sensor signal monitoring. The data is then recorded and viewed using a user interface written using National Instrument LabVIEW software. On-line displays of time and voltage of the sensor signal provide a user-friendly data acquisition interface. The sensor provides an uncomplicated, accurate, reliable, inexpensive transducer for highly sophisticated control systems.

Keywords: Electromagnetic sensor, data acquisition, accurately, position measurement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 938

7010 Calculus Logarithmic Function for Image Encryption

Authors: Adil AL-Rammahi

Abstract:

When we prefer to make the data secure from various attacks and fore integrity of data, we must encrypt the data before it is transmitted or stored. This paper introduces a new effective and lossless image encryption algorithm using a natural logarithmic function. The new algorithm encrypts an image through a three stage process. In the first stage, a reference natural logarithmic function is generated as the foundation for the encryption image. The image numeral matrix is then analyzed to five integer numbers, and then the numbers’ positions are transformed to matrices. The advantages of this method is useful for efficiently encrypting a variety of digital images, such as binary images, gray images, and RGB images without any quality loss. The principles of the presented scheme could be applied to provide complexity and then security for a variety of data systems such as image and others.

Keywords: Linear Systems, Image Encryption, Calculus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2369

7009 Intelligent BRT in Tehran

Authors: P. Parvizi, S. Mohammadi

Abstract:

an intelligent BRT system is necessary when communities looking for new ways to use high capacity rapid transit at a reduced cost.This paper will describe the intelligent control system that works with Datacenter. With the help of GPS system, the data center can monitor the situation of each bus and bus station. Through RFID technology, bus station and traffic light can transfer data with bus and by Wimax communication technology all of parts can talk together; data center learns all information about the location of bus, the arrival of bus in each station and the number of passengers in station and bus.Finally, the paper presents the case study of those theories in Tehran BRT.

Keywords: TehranBRT, RFID, Intelligent Transportation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2425

7008 Spread Spectrum Image Watermarking for Secured Multimedia Data Communication

Authors: Tirtha S. Das, Ayan K. Sau, Subir K. Sarkar

Abstract:

Digital watermarking is a way to provide the facility of secure multimedia data communication besides its copyright protection approach. The Spread Spectrum modulation principle is widely used in digital watermarking to satisfy the robustness of multimedia signals against various signal-processing operations. Several SS watermarking algorithms have been proposed for multimedia signals but very few works have discussed on the issues responsible for secure data communication and its robustness improvement. The current paper has critically analyzed few such factors namely properties of spreading codes, proper signal decomposition suitable for data embedding, security provided by the key, successive bit cancellation method applied at decoder which have greater impact on the detection reliability, secure communication of significant signal under camouflage of insignificant signals etc. Based on the analysis, robust SS watermarking scheme for secure data communication is proposed in wavelet domain and improvement in secure communication and robustness performance is reported through experimental results. The reported result also shows improvement in visual and statistical invisibility of the hidden data.

Keywords: Spread spectrum modulation, spreading code, signaldecomposition, security, successive bit cancellation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2761

7007 Comparison of Hough Transform and Mean Shift Algorithm for Estimation of the Orientation Angle of Industrial Data Matrix Codes

Authors: Ion-Cosmin Dita, Vasile Gui, Franz Quint, Marius Otesteanu

Abstract:

In automatic manufacturing and assembling of mechanical, electrical and electronic parts one needs to reliably identify the position of components and to extract the information of these components. Data Matrix Codes (DMC) are established by these days in many areas of industrial manufacturing thanks to their concentration of information on small spaces. In today’s usually order-related industry, where increased tracing requirements prevail, they offer further advantages over other identification systems. This underlines in an impressive way the necessity of a robust code reading system for detecting DMC on the components in factories. This paper compares two methods for estimating the angle of orientation of Data Matrix Codes: one method based on the Hough Transform and the other based on the Mean Shift Algorithm. We concentrate on Data Matrix Codes in industrial environment, punched, milled, lasered or etched on different materials in arbitrary orientation.

Keywords: Industrial data matrix code, Hough transform, mean shift.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1320

7006 Implementation of Security Algorithms for u-Health Monitoring System

Authors: Jiho Park, Yong-Gyu Lee, Gilwon Yoon

Abstract:

Data security in u-Health system can be an important issue because wireless network is vulnerable to hacking. However, it is not easy to implement a proper security algorithm in an embedded u-health monitoring because of hardware constraints such as low performance, power consumption and limited memory size and etc. To secure data that contain personal and biosignal information, we implemented several security algorithms such as Blowfish, data encryption standard (DES), advanced encryption standard (AES) and Rivest Cipher 4 (RC4) for our u-Health monitoring system and the results were successful. Under the same experimental conditions, we compared these algorithms. RC4 had the fastest execution time. Memory usage was the most efficient for DES. However, considering performance and safety capability, however, we concluded that AES was the most appropriate algorithm for a personal u-Health monitoring system.

Keywords: biosignal, data encryption, security measures, u-health

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2105

7005 A Symbol by Symbol Clustering Based Blind Equalizer

Authors: Kristina Georgoulakis

Abstract:

A new blind symbol by symbol equalizer is proposed. The operation of the proposed equalizer is based on the geometric properties of the two dimensional data constellation. An unsupervised clustering technique is used to locate the clusters formed by the received data. The symmetric properties of the clusters labels are subsequently utilized in order to label the clusters. Following this step, the received data are compared to clusters and decisions are made on a symbol by symbol basis, by assigning to each data the label of the nearest cluster. The operation of the equalizer is investigated both in linear and nonlinear channels. The performance of the proposed equalizer is compared to the performance of a CMAbased blind equalizer.

Keywords: Blind equalization, channel equalization, cluster based equalisers

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1407

7004 Zero Inflated Models for Overdispersed Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

The zero inflated models are usually used in modeling count data with excess zeros where the existence of the excess zeros could be structural zeros or zeros which occur by chance. These type of data are commonly found in various disciplines such as finance, insurance, biomedical, econometrical, ecology, and health sciences which involve sex and health dental epidemiology. The most popular zero inflated models used by many researchers are zero inflated Poisson and zero inflated negative binomial models. In addition, zero inflated generalized Poisson and zero inflated double Poisson models are also discussed and found in some literature. Recently zero inflated inverse trinomial model and zero inflated strict arcsine models are advocated and proven to serve as alternative models in modeling overdispersed count data caused by excessive zeros and unobserved heterogeneity. The purpose of this paper is to review some related literature and provide a variety of examples from different disciplines in the application of zero inflated models. Different model selection methods used in model comparison are discussed.

Keywords: Overdispersed count data, model selection methods, likelihood ratio, AIC, BIC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4496

7003 Formalizing a Procedure for Generating Uncertain Resource Availability Assumptions Based On Real Time Logistic Data Capturing with Auto-ID Systems for Reactive Scheduling

Authors: Lars Laußat, Manfred Helmus, Kamil Szczesny, Markus König

Abstract:

As one result of the project “Reactive Construction Project Scheduling using Real Time Construction Logistic Data and Simulation”, a procedure for using data about uncertain resource availability assumptions in reactive scheduling processes has been developed. Prediction data about resource availability is generated in a formalized way using real-time monitoring data e.g. from auto-ID systems on the construction site and in the supply chains. The paper focusses on the formalization of the procedure for monitoring construction logistic processes, for the detection of disturbance and for generating of new and uncertain scheduling assumptions for the reactive resource constrained simulation procedure that is and will be further described in other papers.

Keywords: Auto-ID, Construction Logistic, Fuzzy, Monitoring, RFID, Scheduling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1755

7002 Nuclear Data Evaluation for 217Po

Authors: Sherif S. Nafee, Amir K. Al-Ramady, Salem S. Shaheen

Abstract:

Evaluated nuclear decay data for the 217Po nuclide is presented in the present work. These data include recommended values for the half-life T1/2, α-, β-- and γ-ray emission energies and probabilities. Decay data from 221Rn α and 217Bi β—decays are presented. Q(α) has been updated based on the recent published work of the Atomic Mass Evaluation AME2012. In addition, the logft values were calculated using the Logft program from the ENSDF evaluation package. Moreover, the total internal conversion electrons and the K-shell to L-shell and L-shell to M-shell and to N-shell conversion electrons ratios K/L, L/M and L/N have been calculated using Bricc program. Meanwhile, recommendation values or the multi-polarities have been assigned based on recently measurement yield a better intensity balance at the 254 keV and 264 keV gamma transitions.

Keywords: Atomic Mass Evaluation, Nuclear Data Evaluation, Total Electron Conversion Electrons.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2238

7001 Detection of Keypoint in Press-Fit Curve Based on Convolutional Neural Network

Authors: Shoujia Fang, Guoqing Ding, Xin Chen

Abstract:

The quality of press-fit assembly is closely related to reliability and safety of product. The paper proposed a keypoint detection method based on convolutional neural network to improve the accuracy of keypoint detection in press-fit curve. It would provide an auxiliary basis for judging quality of press-fit assembly. The press-fit curve is a curve of press-fit force and displacement. Both force data and distance data are time-series data. Therefore, one-dimensional convolutional neural network is used to process the press-fit curve. After the obtained press-fit data is filtered, the multi-layer one-dimensional convolutional neural network is used to perform the automatic learning of press-fit curve features, and then sent to the multi-layer perceptron to finally output keypoint of the curve. We used the data of press-fit assembly equipment in the actual production process to train CNN model, and we used different data from the same equipment to evaluate the performance of detection. Compared with the existing research result, the performance of detection was significantly improved. This method can provide a reliable basis for the judgment of press-fit quality.

Keywords: Keypoint detection, curve feature, convolutional neural network, press-fit assembly.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 904

7000 Proposal to Increase the Efficiency, Reliability and Safety of the Centre of Data Collection Management and Their Evaluation Using Cluster Solutions

Authors: Martin Juhas, Bohuslava Juhasova, Igor Halenar, Andrej Elias

Abstract:

This article deals with the possibility of increasing efficiency, reliability and safety of the system for teledosimetric data collection management and their evaluation as a part of complex study for activity “Research of data collection, their measurement and evaluation with mobile and autonomous units” within project “Research of monitoring and evaluation of non-standard conditions in the area of nuclear power plants”. Possible weaknesses in existing system are identified. A study of available cluster solutions with possibility of their deploying to analysed system is presented

Keywords: Teledosimetric data, efficiency, reliability, safety, cluster solution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538

6999 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: Classification algorithms; data mining; tourism; knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2516

6998 Clustering Categorical Data Using Hierarchies (CLUCDUH)

Authors: Gökhan Silahtaroğlu

Abstract:

Clustering large populations is an important problem when the data contain noise and different shapes. A good clustering algorithm or approach should be efficient enough to detect clusters sensitively. Besides space complexity, time complexity also gains importance as the size grows. Using hierarchies we developed a new algorithm to split attributes according to the values they have and choosing the dimension for splitting so as to divide the database roughly into equal parts as much as possible. At each node we calculate some certain descriptive statistical features of the data which reside and by pruning we generate the natural clusters with a complexity of O(n).

Keywords: Clustering, tree, split, pruning, entropy, gini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1530

6997 Analysis of Users’ Behavior on Book Loan Log Based On Association Rule Mining

Authors: Kanyarat Bussaban, Kunyanuth Kularbphettong

Abstract:

This research aims to create a model for analysis of student behavior using Library resources based on data mining technique in case of Suan Sunandha Rajabhat University. The model was created under association rules, Apriori algorithm. The results were found 14 rules and the rules were tested with testing data set and it showed that the ability of classify data was 79.24percent and the MSE was 22.91. The results showed that the user’s behavior model by using association rule technique can use to manage the library resources.

Keywords: Behavior, data mining technique, Apriori algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2286

6996 Data Integrity: Challenges in Health Information Systems in South Africa

Authors: T. Thulare, M. Herselman, A. Botha

Abstract:

Poor system use, including inappropriate design of health information systems, causes difficulties in communication with patients and increased time spent by healthcare professionals in recording the necessary health information for medical records. System features like pop-up reminders, complex menus, and poor user interfaces can make medical records far more time consuming than paper cards as well as affect decision-making processes. Although errors associated with health information and their real and likely effect on the quality of care and patient safety have been documented for many years, more research is needed to measure the occurrence of these errors and determine the causes to implement solutions. Therefore, the purpose of this paper is to identify data integrity challenges in hospital information systems through a scoping review and based on the results provide recommendations on how to manage these. Only 34 papers were found to be most suitable out of 297 publications initially identified in the field. The results indicated that human and computerized systems are the most common challenges associated with data integrity and factors such as policy, environment, health workforce, and lack of awareness attribute to these challenges but if measures are taken the data integrity challenges can be managed.

Keywords: Data integrity, data integrity challenges, hospital information systems, South Africa.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1308

6995 Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work, we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: Transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 97

6994 A Causal Model for Environmental Design of Residential Community for Elderly Well-Being in Thailand

Authors: Porntip Ruengtam

Abstract:

This article is an extension of previous research presenting the relevant factors related to environmental perceptions, residential community, and the design of a healing environment, which have effects on the well-being and requirements of Thai elderly. Research methodology began with observations and interviews in three case studies in terms of the management processes and environment design of similar existing projects in Thailand. The interview results were taken to summarize with related theories and literature. A questionnaire survey was designed for data collection to confirm the factors of requirements in a residential community intended for the Thai elderly. A structural equation model (SEM) was formulated to explain the cause-effect factors for the requirements of a residential community for Thai elderly. The research revealed that the requirements of a residential community for Thai elderly were classified into three groups when utilizing a technique for exploratory factor analysis. The factors were comprised of (1) requirements for general facilities and activities, (2) requirements for facilities related to health and security, and (3) requirements for facilities related to physical exercise in the residential community. The results from the SEM showed the background of elderly people had a direct effect on their requirements for a residential community from various aspects. The results should lead to the formulation of policies for design and management of residential communities for the elderly in order to enhance quality of life as well as both the physical and mental health of the Thai elderly.

Keywords: Elderly, environmental design, residential community, structural equation modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 881

6993 On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Authors: Xiaohua Liu, Juan F. Beltran, Nishant Mohanchandra, Godfried T. Toussaint

Abstract:

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.

Keywords: Machine learning, data mining, support vector machines, proximity graphs, relative-neighborhood graphs, k-nearestneighbor graphs, random sampling, training data condensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1898