Search results for: genomic data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13530

12900 Secure Data Aggregation Using Clusters in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

Wireless sensor networks can be applied to both harsh and military environments. A primary goal in the design of wireless sensor networks is lifetime maximization, constrained by the energy capacity of batteries. One well-known method to reduce energy consumption in such networks is data aggregation. Providing efficient data aggregation while preserving data privacy is a challenging problem in wireless sensor network research. In this paper, we present a privacy-preserving data aggregation scheme for additive aggregation functions. The Cluster-based Private Data Aggregation (CPDA) scheme leverages a clustering protocol and the algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The goal of our work is to bridge the gap between collaborative data collection by wireless sensor networks and data privacy. We present simulation results of our schemes and compare their performance to a typical data aggregation scheme, TAG, in which no data privacy protection is provided. Results show the efficacy and efficiency of our schemes.
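
The idea of using the algebraic properties of polynomials for additive aggregation can be illustrated with a minimal sketch. It is not the authors' protocol as published: the function names, the random coefficient range, and the use of Lagrange interpolation at zero are assumptions made for illustration, and the encryption of messages between nodes is omitted entirely. Each node hides its private reading as the constant term of a random polynomial, nodes exchange evaluations at public seeds, and the cluster sum is recovered without any single reading being revealed.

```python
from fractions import Fraction
import random


def share_values(private_value, seeds, rng):
    """Evaluate a random polynomial whose constant term is the private value.

    The polynomial has degree len(seeds) - 1; one evaluation is produced per seed.
    """
    coeffs = [rng.randint(1, 1000) for _ in range(len(seeds) - 1)]
    return [
        private_value + sum(c * s ** (i + 1) for i, c in enumerate(coeffs))
        for s in seeds
    ]


def cluster_sum(private_values, seeds, rng=None):
    """Recover the sum of all private values in one cluster (hypothetical helper).

    seeds must be distinct, non-zero integers known to every node in the cluster.
    """
    rng = rng or random.Random(0)
    n = len(private_values)
    # shares[i][j] is the value node i computes for node j.
    shares = [share_values(v, seeds, rng) for v in private_values]
    # Node j adds up everything it received, including its own share.
    assembled = [sum(shares[i][j] for i in range(n)) for j in range(n)]
    # The assembled values lie on a polynomial whose constant term is the sum
    # of the private values, so interpolating at 0 yields that sum.
    total = Fraction(0)
    for j in range(n):
        weight = Fraction(1)
        for m in range(n):
            if m != j:
                weight *= Fraction(-seeds[m], seeds[j] - seeds[m])
        total += assembled[j] * weight
    return total
```

For example, `cluster_sum([3, 5, 7], [1, 2, 3])` recovers the cluster total 15 even though each node only ever sees masked values from its neighbours.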

Keywords: Aggregation, Clustering, Query Processing.

Downloads 1734
12899 Tourism Satellite Account: Approach and Information System Development

Authors: Pappas Theodoros, Michael Diakomichalis

Abstract:

Measuring the economic impact of tourism in a benchmark economy is a global concern, with previous measurements being partial and not fully integrated. Tourism is a phenomenon that requires the individual consumption of visitors, which should be observed and measured to reveal the overall contribution of tourism to an economy. The Tourism Satellite Account (TSA) is a critical tool for assessing the annual growth of tourism, providing reliable measurements. This article presents a TSA information system that encompasses all TSA functions, including input, storage, management, and analysis of data, as well as additional future functions, and that enhances the efficiency of tourism data management and the utility of TSA collection. The methodology and results presented offer new insights for the development and implementation of TSA.

Keywords: Tourism Satellite Account, information system, data-based tourist account.

Downloads 59
12898 Understanding Cruise Passengers’ On-board Experience throughout the Customer Decision Journey

Authors: Sabina Akter, Osiris Valdez Banda, Pentti Kujala, Jani Romanoff

Abstract:

This paper examines the relationship between on-board environmental factors and overall customer satisfaction in the context of the cruise on-board experience. The on-board environmental factors considered are ambient, layout/design, social, product/service and on-board enjoyment factors. The study presents a data-driven framework and model for the on-board cruise experience. The data are collected from 893 respondents through a self-administered online questionnaire about their cruise experience. This study reveals the cruise passengers’ on-board experience through the customer decision journey based on the publicly available data. Pearson correlation and regression analysis have been applied, and the results show a positive and significant relationship between the environmental factors and on-board experience. These data help understand the cruise passengers’ on-board experience, which will be used for the ultimate decision-making process in cruise ship design.
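
The two analyses named in this abstract, Pearson correlation and regression, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the regression shown is a simple one-predictor least-squares fit, and no data from the study are used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x
```

In a survey setting, `xs` might hold ratings of one environmental factor and `ys` the overall satisfaction ratings; a coefficient close to +1 indicates a strong positive linear relationship.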

Keywords: Cruise behavior, on-board environmental factors, on-board experience, user or customer satisfaction.

Downloads 873
12897 Linking OpenCourseWares and Open Education Resources: Creating an Effective Search and Recommendation System

Authors: Brett E. Shelton, Joel Duffin, Yuxuan Wang, Justin Ball

Abstract:

With a growing number of digital libraries and other open education repositories being made available throughout the world, effective search and retrieval tools that surpass the effectiveness of traditional, all-inclusive search engines are necessary to access the desired materials. This paper discusses the design and use of Folksemantic, a platform that integrates OpenCourseWare search, Open Educational Resource recommendations, and social network functionality into a single open source project. The paper describes how the system was originally envisioned, its goals for users, and data that provide insight into how it is actually being used. Data sources include website click-through data, query logs, web server log files and user account data. Based on a descriptive analysis of its current use, modifications to the platform's design are recommended to better address the goals of the system, along with recommendations for additional phases of research.

Keywords: Digital libraries, open education, recommendation system, social networks

Downloads 2200
12896 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets in themselves and can now be accepted as a primary intellectual output of research. The quality and usage of datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection was produced are described. The collection has been cleansed and preprocessed, some features have been selected, and a preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing the preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Keywords: Benchmark collection, program educational objectives, student outcomes, ABET, Accreditation, machine learning, supervised multiclass classification, text mining.

Downloads 837
12895 Distinguishing Innocent Murmurs from Murmurs caused by Aortic Stenosis by Recurrence Quantification Analysis

Authors: Christer Ahlstrom, Katja Höglund, Peter Hult, Jens Häggström, Clarence Kvart, Per Ask

Abstract:

It is sometimes difficult to differentiate between innocent murmurs and pathological murmurs during auscultation. In these difficult cases, an intelligent stethoscope with decision support abilities would be of great value. In this study, using a dog model, phonocardiographic recordings were obtained from 27 boxer dogs with various degrees of aortic stenosis (AS) severity. As a reference for severity assessment, continuous wave Doppler was used. The data were analyzed with recurrence quantification analysis (RQA) with the aim of finding features able to distinguish innocent murmurs from murmurs caused by AS. Four out of eight investigated RQA features showed significant differences between innocent murmurs and pathological murmurs. Using a plain linear discriminant analysis classifier, the best pair of features (recurrence rate and entropy) resulted in a sensitivity of 90% and a specificity of 88%. In conclusion, RQA provides valid features which can be used for differentiation between innocent murmurs and murmurs caused by AS.
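
One of the two best features reported here, the recurrence rate, is straightforward to illustrate. The sketch below is a simplification and not the study's pipeline: it works directly on a one-dimensional signal without time-delay embedding, the threshold `radius` is a hypothetical parameter, and the main diagonal is counted (some RQA definitions exclude it).

```python
def recurrence_matrix(signal, radius):
    """Binary matrix with a 1 wherever two samples lie within `radius` of each other."""
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= radius else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(matrix):
    """Fraction of recurrent points in the recurrence matrix (diagonal included)."""
    n = len(matrix)
    return sum(map(sum, matrix)) / (n * n)
```

A strictly alternating signal such as `[0, 1, 0, 1]` with a small radius recurs only at samples of the same parity, so half of the matrix entries are recurrent.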

Keywords: Bioacoustics, murmur, phonocardiographic signal, recurrence quantification analysis.

Downloads 2005
12894 Integration of Image and Patient Data, Software and International Coding Systems for Use in a Mammography Research Project

Authors: V. Balanica, W. I. D. Rae, M. Caramihai, S. Acho, C. P. Herbst

Abstract:

Mammographic image and data analysis to facilitate modelling or computer-aided diagnostic (CAD) software development is best done using a common database that can handle various mammographic image file formats and relate these to other patient information. This would optimize the use of the data, as both primary reporting and enhanced information extraction of research data could be performed from a single dataset. One desired improvement is the integration of DICOM file header information into the database, as an efficient and reliable source of supplementary patient information intrinsically available in the images. The purpose of this paper was to design a suitable database to link and integrate different types of image files and gather common information that can be further used for research purposes. An interface was developed for accessing, adding, updating, modifying and extracting data from the common database, enhancing the future possible application of the data in CAD processing. Future technical developments envisaged include the creation of an advanced search function to select image files based on descriptor combinations. Results can be further used for specific CAD processing and other research. A user-friendly configuration utility for importing the required fields from the DICOM files must also be designed.

Keywords: Database Integration, Mammogram Classification, Tumour Classification, Computer Aided Diagnosis.

Downloads 1945
12893 A Web and Cloud-Based Measurement System Analysis Tool for the Automotive Industry

Authors: C. A. Barros, Ana P. Barroso

Abstract:

Any industrial company needs to determine the amount of variation that exists within its measurement process and guarantee the reliability of its data by studying the performance of its measurement system in terms of linearity, bias, repeatability, reproducibility and stability. This issue is critical for automotive industry suppliers, who are required to be certified to the IATF 16949:2016 standard (which replaces ISO/TS 16949) of the International Automotive Task Force, which defines the requirements of a quality management system for companies in the automotive industry. Measurement System Analysis (MSA) is one of the mandatory tools. Frequently, the measurement system in companies is not connected to the equipment and does not incorporate the methods proposed by the Automotive Industry Action Group (AIAG). To address these constraints, an R&D project is in progress, whose objective is to develop a web and cloud-based MSA tool. This MSA tool incorporates Industry 4.0 concepts, such as Internet of Things (IoT) protocols to assure the connection with the measuring equipment, cloud computing, artificial intelligence, statistical tools, and advanced mathematical algorithms. This paper presents the preliminary findings of the project. The web and cloud-based MSA tool is innovative because it implements all statistical tests proposed in the MSA-4 reference manual from AIAG as well as other emerging methods and techniques. As it is integrated with the measuring devices, it reduces the manual input of data and therefore the errors. The tool ensures traceability of all performed tests and can be used in quality laboratories and on production lines. In addition, it monitors MSAs over time, allowing both the analysis of deviations from the variation of the measurements performed and the management of measurement equipment and calibrations. To develop the MSA tool, a ten-step approach was implemented. 
First, a benchmarking analysis of the current competitors and commercial solutions linked to MSA was performed with respect to the Industry 4.0 paradigm. Next, the size of the target market for the MSA tool was analysed. Afterwards, data flow and traceability requirements were analysed in order to implement an IoT data network that interconnects with the equipment, preferably wirelessly. The MSA web solution was designed under UI/UX principles, and an API in the Python language was developed to run the algorithms and the statistical analysis. Continuous validation of the tool by companies is being performed to assure real-time management of the ‘big data’. The main results of this R&D project are: a web and cloud-based MSA tool; a Python API; algorithms new to the market; and a UI/UX style guide for the tool. The proposed MSA tool adds value to the state of the art as it ensures an effective response to the new challenges of measurement systems, which are increasingly critical in production processes. Although the automotive industry has triggered the development of this innovative MSA tool, other industries would also benefit from it. Currently, companies from the molds and plastics, chemical, and food industries are already validating it.

Keywords: Automotive industry, Industry 4.0, internet of things, IATF 16949:2016, measurement system analysis.

Downloads 993
12892 A New Protocol for Concealed Data Aggregation in Wireless Sensor Networks

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consist of many sensor nodes that are placed in unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent the forwarding of forged data and the modification of aggregated data, and that has low delay and low communication, computing and storage overhead, is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided into virtual cells, and the nodes within each cell produce a shared key to send and receive concealed data with each other. Because data aggregation is performed locally in each cell and a secure authentication mechanism is implemented, data aggregation delay is very low and malicious nodes cannot produce false data in the network. To evaluate the performance of our proposed protocol, we present computational models that show its performance and low overhead.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Downloads 1735
12891 The Development of Taiwanese Electronic Medical Record Systems Evaluation Instrument

Authors: Y. Y. Su, K. T. Win, H. C. Chiu

Abstract:

This study used Item Analysis, Exploratory Factor Analysis (EFA) and Reliability Analysis (Cronbach's α value) to examine the questions selected by the Delphi method based on the “Socio-technical system (STS)” and a user-centered perspective. A structured questionnaire with seventy-four questions, which could be categorized into nine dimensions (healthcare environment, organization behaviour, system quality, medical data quality, service quality, safety quality, user usage, user satisfaction, and organization net benefits), was provided to evaluate EMR in the Taiwanese healthcare environment.
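
The reliability statistic mentioned here, Cronbach's α, has a standard closed form: α = k/(k−1) · (1 − Σ item variances / variance of total scores), where k is the number of items. A minimal sketch follows; the function name and the row-per-respondent input layout are assumptions for illustration, and no data from the study are used.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a questionnaire.

    responses: one row per respondent, each row a list of item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item gives the same score for each respondent, the items are perfectly consistent and α equals 1; values fall as the items become less related to one another.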

Keywords: Instrument development, Reliability test, Validity test, Electronic Medical Record Evaluation.

Downloads 1495
12890 Behavioral Response of Bee Farmers to Climate Change in South East, Nigeria

Authors: Jude A. Mbanasor, Chigozirim N. Onwusiribe

Abstract:

The enigma of climate change is no longer an illusion but a reality. In recent years, Nigeria's climate has changed, and the changes are shown by the changing patterns of rainfall and sunshine, increasing levels of carbon and nitrous emissions, as well as deforestation. This study analyzed the behavioural response of beekeepers to variations in the climate and the adaptation techniques developed in response to the climate variation. Beekeeping is a viable economic activity for the alleviation of poverty, as its products include honey, wax, pollen, propolis, royal jelly, venom, queens, bees and their larvae, all of which are marketable. The study adopted the multistage sampling technique to select 120 beekeepers from the five states of Southeast Nigeria. Well-structured questionnaires and focus group discussions were used to collect the required data. Statistical tools such as principal component analysis, data envelopment models, graphs, and charts were used for the data analysis. Changing patterns of rainfall and sunshine, together with the increasing rate of deforestation, had a negative effect on the habitat of the bees. The beekeepers have adopted the Kenya Top bar and Langstroth hives, and they establish the hives on fallow farmland close to cultivated communal farms with more flowering crops.

Keywords: Climate, smart, smallholder, farmer, socioeconomic, response.

Downloads 607
12889 Correlation Analysis to Quantify Learning Outcomes for Different Teaching Pedagogies

Authors: Kanika Sood, Sijie Shang

Abstract:

A fundamental goal of education is to prepare students to become part of the global workforce by making beneficial contributions to society. In this paper, we analyze student performance for multiple courses that involve different teaching pedagogies: a cooperative learning technique and an inquiry-based learning strategy. Student performance includes student engagement, grades, and attendance records. We perform this study in the Computer Science department for online and in-person courses for 450 students. We perform correlation analysis to study the relationship between student scores and other parameters such as gender and mode of learning. We use natural language processing and machine learning to analyze student feedback data and performance data. We assess the learning outcomes of two teaching pedagogies for undergraduate and graduate courses to showcase the impact of pedagogical adoption and learning outcome as determinants of academic achievement. Early findings suggest that when using the specified pedagogies, students become experts on their topics and show enhanced engagement with peers.

Keywords: Bag-of-words, cooperative learning, education, inquiry-based learning, in-person learning, Natural Language Processing, online learning, sentiment analysis, teaching pedagogy.

Downloads 81
12888 Biplot Analysis for Evaluation of Tolerance in Some Bean (Phaseolus vulgaris L.) Genotypes to Bean Common Mosaic Virus (BCMV)

Authors: S. Ghasemi, M. M. Kamelmanesh, A. Namayandeh, R. Biabanikhankahdani

Abstract:

The common bean is the most important grain legume for direct human consumption in the world, and BCMV is one of the world's most serious bean diseases, which can reduce the yield and quality of the harvested product. To determine the best tolerance index to BCMV and identify tolerant genotypes, 2 experiments were conducted in field conditions. Twenty-five common bean genotypes were sown in 2 separate RCB designs with 3 replications under contamination and non-contamination conditions. On the basis of the correlations among indices, GMP, MP and HARM were determined to be the most suitable tolerance indices. The results of principal component analysis indicated that the first 2 components together explained 98.52% of the variation in the data. The first and second components were named potential yield and stress susceptibility, respectively. Based on the results of the BCMV tolerance index assessment and biplot analysis, WA8563-4, WA8563-2 and Cardinal were the genotypes that exhibited potential seed yield under contamination and non-contamination conditions.
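
The abstract does not define the three indices it selects, so the sketch below assumes the definitions commonly used in the stress-tolerance literature: mean productivity MP = (Yp + Ys)/2, geometric mean productivity GMP = √(Yp · Ys), and harmonic mean HARM = 2 · Yp · Ys / (Yp + Ys), where Yp is the yield without stress and Ys the yield under stress. The function and argument names are hypothetical.

```python
import math


def tolerance_indices(yield_potential, yield_stress):
    """Compute MP, GMP and HARM for one genotype.

    yield_potential: yield under non-contamination conditions (Yp).
    yield_stress: yield under contamination conditions (Ys).
    """
    mp = (yield_potential + yield_stress) / 2
    gmp = math.sqrt(yield_potential * yield_stress)
    harm = 2 * yield_potential * yield_stress / (yield_potential + yield_stress)
    return {"MP": mp, "GMP": gmp, "HARM": harm}
```

All three indices reward genotypes that yield well in both environments; GMP and HARM penalize a large gap between Yp and Ys more strongly than MP does.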

Keywords: Phaseolus vulgaris, BCMV, principal component analysis, bi-plot analysis, tolerance.

Downloads 1358
12887 Enhancing Predictive Accuracy in Pharmaceutical Sales Through an Ensemble Kernel Gaussian Process Regression Approach

Authors: Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf

Abstract:

This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matérn, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matérn, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0, and significantly lower values in MSE, MAE, and RMSE. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
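
A weighted-sum ensemble kernel of the kind described can be sketched as follows. Several parts are assumptions rather than the authors' implementation: the abstract's "Revised Matérn" kernel is not specified, so a standard Matérn kernel with ν = 3/2 stands in for it; the length scale and the rational quadratic shape parameter `alpha` are set to 1 for illustration; and only the kernel itself is shown, not the Gaussian process fit or the Bayesian optimization of the weights. The weights are the ones reported in the abstract.

```python
import math

# Kernel weights as reported in the abstract.
WEIGHTS = {"squared_exponential": 0.76, "matern32": 0.21, "rational_quadratic": 0.13}


def squared_exponential(r, length_scale=1.0):
    """Squared exponential (RBF) kernel as a function of distance r."""
    return math.exp(-0.5 * (r / length_scale) ** 2)


def matern32(r, length_scale=1.0):
    """Standard Matern kernel with nu = 3/2 (stand-in for the 'Revised Matern')."""
    a = math.sqrt(3) * abs(r) / length_scale
    return (1 + a) * math.exp(-a)


def rational_quadratic(r, length_scale=1.0, alpha=1.0):
    """Rational quadratic kernel with shape parameter alpha."""
    return (1 + r ** 2 / (2 * alpha * length_scale ** 2)) ** (-alpha)


def ensemble_kernel(x1, x2):
    """Weighted sum of the three kernels for scalar inputs x1 and x2."""
    r = abs(x1 - x2)
    return (
        WEIGHTS["squared_exponential"] * squared_exponential(r)
        + WEIGHTS["matern32"] * matern32(r)
        + WEIGHTS["rational_quadratic"] * rational_quadratic(r)
    )
```

Because a non-negative weighted sum of valid covariance functions is itself a valid covariance function, the combined kernel can be used wherever a single kernel would be.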

Keywords: Gaussian Process Regression, Ensemble Kernels, Bayesian Optimization, Pharmaceutical Sales Analysis, Time Series Forecasting, Data Analysis.

Downloads 111
12886 The New Method of Concealed Data Aggregation in Wireless Sensor: A Case Study

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consist of many sensor nodes that are placed in unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent the forwarding of forged data and the modification of aggregated data, and that has low delay and low communication, computing and storage overhead, is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided into virtual cells, and the nodes within each cell produce a shared key to send and receive concealed data with each other. Because data aggregation is performed locally in each cell and a secure authentication mechanism is implemented, data aggregation delay is very low and malicious nodes cannot produce false data in the network. To evaluate the performance of our proposed protocol, we present computational models that show its performance and low overhead.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Downloads 1768
12885 A Survey on Performance Tools for OpenMP

Authors: Mubarak S. Mohsen, Rosni Abdullah, Yong M. Teo

Abstract:

Advances in processor architecture, such as multi-core, increase the size and complexity of parallel computer systems. With multi-core architecture there are different parallel languages that can be used to run parallel programs. One of these languages is OpenMP, which is embedded in C/C++ or Fortran. Because of this new architecture and its complexity, it is very important to evaluate the performance of OpenMP constructs, kernels, and application programs on multi-core systems. Performance measurement is the activity of collecting information about the execution characteristics of a program. Performance tools consist of at least three interfacing software layers: instrumentation, measurement, and analysis. The instrumentation layer defines the measured performance events. The measurement layer determines which performance events are actually captured and how they are measured by the tool. The analysis layer processes the performance data and summarizes it into a form that can be displayed in performance tools. In this paper, a number of OpenMP performance tools are surveyed, explaining how each is used to collect, analyse, and display the collected data.

Keywords: Parallel performance tools, OpenMP, multi-core.

Downloads 1922
12884 Demographic Factors Influencing Employees’ Salary Expectations and Labor Turnover

Authors: M. Osipova

Abstract:

Thanks to the development of information technologies, every sphere of the economy is becoming more and more data-centralized as people generate huge datasets containing information on every aspect of their lives. Applying research on such data to human resources management makes it possible to obtain scarce statistics on the state of the labor market, including salary expectations and the typical career behavior of potential employees, and this information can become a reliable basis for management decisions. This article presents the results of career behavior research based on freely accessible resume data. The information used for the study is much broader than that usually used in human resources surveys. That is why there is enough data for statistically significant results even in subgroup analysis.

Keywords: Human resources management, labor market, salary expectations, statistics, turnover.

Downloads 1846
12883 Patents as Indicators of Innovative Environment

Authors: S. Karklina, I. Erins

Abstract:

The main problem is that innovation performance in Latvia is very low. Since Latvia is a Member State of the European Union, it must also meet the set targets and improve its innovation results. Universities are among the main contributors to a country's innovative capacity. Universities, industry and government need to cooperate to get the best results. Intellectual property is one of the indicators used to determine the level of innovation in a country or organization, and patents are one of the characteristics of intellectual property. The objective of the article is to determine the indicators characterizing the innovative environment in Latvia and the influence of the development of universities on them. The methods used in the article to achieve these objectives are quantitative and qualitative analysis of the literature, statistical data analysis and graphical analysis.

Keywords: HEI, innovations, Latvia, patents.

Downloads 1882
12882 ANN Models for Microstrip Line Synthesis and Analysis

Authors: Dr. K. Sri Rama Krishna, J. Lakshmi Narayana, Dr. L. Pratap Reddy

Abstract:

Microstrip lines, widely used for good reason, are broadband in frequency and provide circuits that are compact and light in weight. They are generally economical to produce since they are readily adaptable to hybrid and monolithic integrated circuit (IC) fabrication technologies at RF and microwave frequencies. Although the existing EM simulation models used for the synthesis and analysis of microstrip lines are reasonably accurate, they are computationally intensive and time-consuming. Neural networks have recently gained attention as fast and flexible vehicles for microwave modeling, simulation and optimization. After learning and abstracting from microwave data, through a process called training, neural network models are used during microwave design to provide instant answers to the task learned. This paper presents simple and accurate ANN models for the synthesis and analysis of microstrip lines, which more accurately compute the physical dimensions and the characteristic parameters, respectively, for the required design specifications.

Keywords: Neural Models, Algorithms, Microstrip Lines, Analysis, Synthesis

Downloads 2150
12881 Daily and Seasonal Changes of Air Pollution in Kuwait

Authors: H. Ettouney, A. AL-Haddad, S. Saqer

Abstract:

This paper focuses on the assessment of air pollution in Umm-Alhyman, Kuwait, which is located south of oil refineries, a power station, an oil field, and highways. The measurements were made over a period of four days in March and July in 2001, 2004, and 2008. The measured pollutants included methanated and nonmethanated hydrocarbons (MHC, NMHC), CO, CO2, SO2, NOX, O3, and PM10. Meteorological parameters were also measured, including temperature, wind speed and direction, and solar radiation. Over the study period, data analysis showed an increase in measured SO2, NOX and CO by factors of 1.2, 5.5 and 2, respectively. This is explained in terms of increases in industrial activities, motor vehicle density, and power generation. Predictions of the measured data were made with the ISC-AERMOD software package using the ISCST3 model option. Finally, the measured data were compared against international standards.

Keywords: Air pollution, Emission inventory, ISCST3 model, Modeling

Downloads 2421
12880 Survival Model for Partly Interval-Censored Data with Application to Anti D in Rhesus D Negative Studies

Authors: F. A. M. Elfaki, Amar Abobakar, M. Azram, M. Usman

Abstract:

This paper discusses regression analysis of partly interval-censored failure time data, which occur in many fields including demographical, epidemiological, financial, medical and sociological studies. For this problem, we focus on the situation where the survival time of interest can be described by the additive hazards model in the presence of partly interval-censored data. A major advantage of the approach is its simplicity, and it can be easily implemented using R software. Simulation studies are conducted which indicate that the approach performs well in practical situations and is comparable to existing methods. The methodology is applied to a set of partly interval-censored failure time data arising from anti D in Rhesus D negative studies.
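
For readers unfamiliar with the model named above, the additive hazards model is usually written in the following form. The notation is an assumption for illustration, since the abstract gives no formulas: $\lambda_0(t)$ is an unspecified baseline hazard, $Z$ a covariate vector, and $\beta$ the regression coefficients. With partly interval-censored data, some failure times $T_i$ are observed exactly while others are only known to lie in an interval $(L_i, R_i]$.

```latex
\lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z
```

In contrast to a proportional hazards model, where covariates multiply the baseline hazard, here each covariate shifts the hazard by an additive amount, so $\beta$ is read as a risk difference rather than a risk ratio.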

Keywords: Anti D in Rhesus D negative, Cox’s model, EM algorithm.

Downloads 1693
12879 An Efficient Data Mining Approach on Compressed Transactions

Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung

Abstract:

In an era of knowledge explosion, the volume of data grows rapidly day by day. Since data storage is a limited resource, how to reduce the data space in the process becomes a challenging issue. Data compression provides a good solution which can lower the required space. Data mining has found many useful applications in recent years because it can help users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, they both lack the ability to decompress the data to their original state and to improve the data mining performance. In this research, a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) was proposed to solve these problems. M2TQT uses the relationship of transactions to merge related transactions and builds a quantification table to prune the candidate itemsets that cannot become frequent, in order to improve the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.

Keywords: Association rule, data mining, merged transaction, quantification table.

Downloads 1960
12878 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of current analysis algorithms. Developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has therefore attracted significant interest. In this paper, a modified k-means-based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. It uses the City Block distance and a new stop criterion to guarantee convergence. Experiments conducted on a real data set show its high performance compared with the original k-means version.
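As a rough illustration of k-means with the City Block (Manhattan) distance, a minimal sketch follows. It is not the authors' algorithm: the stop rule used here (stop when assignments no longer change), the mean-based centroid update, and the names `manhattan` and `kmeans_manhattan` are assumptions, since the abstract does not describe the new stop criterion.

```python
import random

def manhattan(a, b):
    """City Block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def kmeans_manhattan(points, k, max_iter=100, seed=0):
    """Cluster tuples of numbers; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialisation: k random points
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the Manhattan distance.
        new_labels = [
            min(range(k), key=lambda j: manhattan(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assumed stop rule: assignments are stable
            break
        labels = new_labels
        # Update step: coordinate-wise mean of each non-empty cluster.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical two-cluster data for illustration only.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans_manhattan(data, k=2)
```

Note that the coordinate-wise median, not the mean, minimises the total Manhattan distance within a cluster; the mean is kept here only to stay close to the original k-means update.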

Keywords: Pattern recognition, partitional clustering, K-means clustering, Manhattan distance, terrorism data analysis.

Downloads 1359
12877 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García

Abstract:

In this paper, we present the use of discriminant analysis to select the evolutionary algorithms that best solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm among the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classifier was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. The discriminant analysis achieved a classification rate of 66.7%.

Keywords: Intelligent transportation systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning.

Downloads 1547
12876 Hybrid Structure Learning Approach for Assessing the Phosphate Laundries Impact

Authors: Emna Benmohamed, Hela Ltifi, Mounir Ben Ayed

Abstract:

The Bayesian Network (BN) is one of the most efficient classification methods. It is widely used in several fields (e.g., medical diagnostics, risk analysis, bioinformatics research). The BN is defined as a probabilistic graphical model that represents a formalism for reasoning under uncertainty. This classification method performs well in extracting new knowledge from data. The construction of the model consists of two phases: structure learning and parameter learning. For structure learning, the K2 algorithm is one of the representative data-driven algorithms and is based on a score-and-search approach. In addition, integrating expert knowledge into the structure learning process allows the highest accuracy to be obtained. In this paper, we propose a hybrid approach that combines an improvement of the K2 algorithm, called K2 algorithm for Parents and Children search (K2PC), with the expert-driven method for learning the structure of a BN. The evaluation of the experimental results on well-known benchmarks shows that our K2PC algorithm performs better in terms of correct structure detection. The real application of our model shows its efficiency in analyzing the impact of phosphate laundry effluents on the watershed in the Gafsa area (southwestern Tunisia).

Keywords: Classification, Bayesian network, structure learning, K2 algorithm, expert knowledge, surface water analysis.

Downloads 512
12875 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and can be used for data filtering in artificial intelligence. The broad applicability of classification to all kinds of data has led to its use in nearly every field of modern life. Classification helps us group different items according to the features judged interesting and useful. In this paper, we compare two classification methods, Naïve Bayes and ADTree, used to detect spam e-mail. This choice is motivated by the fact that the Naive Bayes algorithm is based on probability calculus while the ADTree algorithm is based on decision trees. The parameter settings of these classifiers are chosen to maximize the true positive rate and minimize the false positive rate. The experimental results present classification accuracy and cost analysis with a view to choosing the optimal classifier for spam detection. We also point out the number of attributes needed to obtain a tradeoff between the number of attributes and the classification accuracy.
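The two rates used to tune the classifiers can be computed from predicted and true labels as sketched below. This is a generic illustration, not code from the paper; the function name `tpr_fpr`, the label strings, and the sample data are hypothetical.

```python
def tpr_fpr(y_true, y_pred, positive="spam"):
    """Return (true positive rate, false positive rate) for one class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # spam correctly flagged
            else:
                fn += 1  # spam missed
        else:
            if pred == positive:
                fp += 1  # legitimate mail wrongly flagged
            else:
                tn += 1  # legitimate mail correctly passed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical labels for illustration only.
truth = ["spam", "spam", "ham", "ham", "ham", "spam"]
preds = ["spam", "ham", "ham", "spam", "ham", "spam"]
tpr, fpr = tpr_fpr(truth, preds)
```

In spam filtering a false positive (a legitimate message flagged as spam) is usually the costlier error, which is why the two rates are tracked separately rather than folded into a single accuracy figure.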

Keywords: Classification, data mining, spam filtering, naive Bayes, decision tree.

Downloads 1499
12874 Broadband PowerLine Communications: Performance Analysis

Authors: Justinian Anatory, Nelson Theethayi, M. M. Kissaka, N. H. Mvungi

Abstract:

The power line channel is proposed as an alternative for broadband data transmission, especially in developing countries like Tanzania [1]. However, the channel is affected by stochastic attenuation and deep notches, which can limit channel capacity and the achievable data rate. Various studies have characterized the channel without stating exactly the maximum performance and the limits on data transfer rate, possibly because of the complexity of the channel models used. In this paper, the channel performance of medium voltage, low voltage and indoor power line channels is presented. The investigations consider orthogonal frequency division multiplexing (OFDM) with phase shift keying (PSK) as the carrier modulation scheme for indoor, medium and low voltage channels with typically ten branches; Golay coding is also applied for the medium voltage channel. In the channels' frequency responses, deep notches are observed at various frequencies, which can reduce the achievable data rate. Nevertheless, data rates of up to 240 Mbps are realized at a signal-to-noise ratio of about 50 dB for indoor and low voltage channels, whereas a typical medium voltage link with ten branches is affected by strong multipath, and coding is required for feasible broadband data transfer.

Keywords: Powerline Communications, branched network, channel model, modulation, channel performance, OFDM.

Downloads 1833
12873 Performance Evaluation of Al Jame’ Roundabout Using SIDRA

Authors: D. Muley, H. S. Al-Mandhari

Abstract:

This paper evaluates the performance of a multi-lane, four-legged modern roundabout operating in Muscat using the SIDRA model. The performance measures include Degree of Saturation (DOS), average delay, and queue lengths. Geometric and traffic data were used for model preparation. Gap acceptance parameters, namely critical gap and follow-up headway, were used to calibrate the SIDRA model. The results of the analysis showed that the roundabout is currently experiencing delays of up to 610 seconds per vehicle with a DOS of 1.67 during the peak hour. Further, a sensitivity analysis of general and roundabout parameters was performed; among lane width, cruise speed, inscribed diameter, entry radius and entry angle, the inscribed diameter was the most crucial factor affecting delay and DOS. Upgrading the roundabout to a fully signalized junction was found to be a suitable solution that will serve future years, operating at LOS C in the design year with a DOS of 0.9 and an average control delay of 51.9 seconds per vehicle.

Keywords: Performance analysis, roundabout, sensitivity analysis, SIDRA.

Downloads 3304
12872 Large-Deflection Analysis of Automotive Vehicle's Door Wiring Harness System Using Finite Element Method

Authors: Byeong-Sam Kim, Kangsu Lee, Kyoungwoo Park, Samir Ben Chaabane

Abstract:

A vehicle's door wiring harness arrangement structure is provided. In a vehicle's door, the wiring harness (W/H) system is arranged more toward the passenger compartment than the hinge and the weatherstrip. This article gives some insight into the dimensioning process, with a special focus on large-deflection analysis of the wiring harness (W/H) in vehicle door structures for the durability problem. A finite element analysis of the door wiring harness (W/H) is used to assess residual stresses and dimensional stability under flexible bending. Durability test data for slim test specimens were compared with the numerically predicted fatigue life for verification. The final lifing of the component combines the effects of these microstructural features with the complex stress state arising from the combined service loading and residual stresses.

Keywords: Large deflection, wiring harness system, finite element analysis, vehicle's door.

Downloads 3315
12871 Case Study Approach Using Scenario Analysis to Analyze Unabsorbed Head Office Overheads

Authors: K. C. Iyer, T. Gupta, Y. M. Bindal

Abstract:

Head office overhead (HOOH) is an indirect cost and is recovered through individual project billings by the contractor. Delay in a project impacts the absorption of the HOOH cost allocated to that particular project and thus diminishes the expected profit of the contractor. This unabsorbed HOOH cost is later claimed by contractors as damages. The subjective nature of the available formulae for computing unabsorbed HOOH creates difficulty for contractors and owners and thus leads to disputes. The paper attempts to bring together the rationale of various HOOH formulae by gathering a contractor's HOOH cost data on all of its projects, using a case study approach, and comparing variations in HOOH values using scenario analysis. The case study approach uses project data collected from four construction projects of a contractor in India to calculate unabsorbed HOOH costs from various available formulae. Scenario analysis provides further variations in HOOH values after considering two independent situations, namely scope changes and new projects during the delay period. Interestingly, one of the findings of this study reveals that, in spite of HOOH being absorbed by additional works available during the period of delay, a few formulae show an increase in the value of unabsorbed HOOH, neglecting any absorption by the increase in scope. This indicates that these formulae are inappropriate for use when the scope of work changes. The results of this study can help both parties decide on an appropriate formula more objectively, considering the events on a project causing the delay and the contractor's position in respect of obtaining new projects.
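To make the kind of calculation concrete, the sketch below implements the Eichleay formula, one widely cited way of estimating unabsorbed head office overhead. The abstract does not name the formulae it compares, so choosing Eichleay is an assumption, and the function name and all figures are hypothetical.

```python
def eichleay_unabsorbed_hooh(contract_billings, total_billings,
                             total_hooh, performance_days, delay_days):
    """Estimate unabsorbed head office overhead with the Eichleay formula.

    contract_billings: billings on the delayed contract
    total_billings:    contractor's total billings over the contract period
    total_hooh:        total head office overhead over the contract period
    performance_days:  actual days of contract performance, including delay
    delay_days:        days of compensable delay
    """
    # Step 1: share of overhead allocable to the delayed contract.
    allocable = contract_billings * total_hooh / total_billings
    # Step 2: daily overhead rate for that contract.
    daily_rate = allocable / performance_days
    # Step 3: overhead attributed to the delay period.
    return daily_rate * delay_days

# Hypothetical figures for illustration only.
claim = eichleay_unabsorbed_hooh(
    contract_billings=2_000_000,
    total_billings=10_000_000,
    total_hooh=1_000_000,
    performance_days=400,
    delay_days=50,
)
```

The formula takes no input for additional work performed during the delay, which is the kind of limitation the abstract points to when the scope of work changes.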

Keywords: Absorbed and unabsorbed overheads, head office overheads, scenario analysis, scope variation

Downloads 825