Search results for: test data compression (TDC)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9641

Search results for: test data compression (TDC)

7181 Economized Sensor Data Processing with Vehicle Platooning

Authors: Henry Hexmoor, Kailash Yelasani

Abstract:

We present vehicular platooning as a special case of crowd-sensing framework where sharing sensory information among a crowd is used for their collective benefit. After offering an abstract policy that governs processes involving a vehicular platoon, we review several common scenarios and components surrounding vehicular platooning. We then present a simulated prototype that illustrates efficiency of road usage and vehicle travel time derived from platooning. We have argued that one of the paramount benefits of platooning that is overlooked elsewhere, is the substantial computational savings (i.e., economizing benefits) in acquisition and processing of sensory data among vehicles sharing the road. The most capable vehicle can share data gathered from its sensors with nearby vehicles grouped into a platoon.

Keywords: Cloud network, collaboration, Internet of Things, social network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 688
7180 Machine Learning Development Audit Framework: Assessment and Inspection of Risk and Quality of Data, Model and Development Process

Authors: Jan Stodt, Christoph Reich

Abstract:

The usage of machine learning models for prediction is growing rapidly and proof that the intended requirements are met is essential. Audits are a proven method to determine whether requirements or guidelines are met. However, machine learning models have intrinsic characteristics, such as the quality of training data, that make it difficult to demonstrate the required behavior and make audits more challenging. This paper describes an ML audit framework that evaluates and reviews the risks of machine learning applications, the quality of the training data, and the machine learning model. We evaluate and demonstrate the functionality of the proposed framework by auditing an steel plate fault prediction model.

Keywords: Audit, machine learning, assessment, metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 941
7179 Information Seeking through Assimilation Process in Thai Organization

Authors: Pornprom Chomngam

Abstract:

The purpose of this study is to examine employee assessments of the usefulness/value of different types of information available to those employees during the process of organizational assimilation. Participants in the study were 247 “new" employees at Bangkok Bank. Bangkok Bank considers employees whose length of stay with the bank has been less than 18 months as new employees. Questionnaires were administered to all of the Bank-s new employees to obtain the data for this study. Repeated measures analysis was used to analyze the data. The data were summed and coded by using Statistical Package for Social Science. Newcomers indicate that social information is the most useful information, followed by job (technical, referent, and appraisal information), political, normative, and organizational information. Essentially, social, job, and political information are evaluated by newcomers as highly useful, while normative and organizational information are rated as moderately useful.

Keywords: Information seeking, organization assimilation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1622
7178 Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi

Abstract:

In this paper we used data mining techniques to identify outlier patients who are using large amount of drugs over a long period of time. Any healthcare or health insurance system should deal with the quantities of drugs utilized by chronic diseases patients. In Kingdom of Bahrain, about 20% of health budget is spent on medications. For the managers of healthcare systems, there is no enough information about the ways of drug utilization by chronic diseases patients, is there any misuse or is there outliers patients. In this work, which has been done in cooperation with information department in the Bahrain Defence Force hospital; we select the data for Cardiac patients in the period starting from 1/1/2008 to December 31/12/2008 to be the data for the model in this paper. We used three techniques for finding the drug utilization for cardiac patients. First we applied a clustering technique, followed by measuring of clustering validity, and finally we applied a decision tree as classification algorithm. The clustering results is divided into three clusters according to the drug utilization, for 1603 patients, who received 15,806 prescriptions during this period can be partitioned into three groups, where 23 patients (2.59%) who received 1316 prescriptions (8.32%) are classified to be outliers. The classification algorithm shows that the use of average drug utilization and the age, and the gender of the patient can be considered to be the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization..

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877
7177 Retail Strategy to Reduce Waste Keeping High Profit Utilizing Taylor's Law in Point-of-Sales Data

Authors: Gen Sakoda, Hideki Takayasu, Misako Takayasu

Abstract:

Waste reduction is a fundamental problem for sustainability. Methods for waste reduction with point-of-sales (POS) data are proposed, utilizing the knowledge of a recent econophysics study on a statistical property of POS data. Concretely, the non-stationary time series analysis method based on the Particle Filter is developed, which considers abnormal fluctuation scaling known as Taylor's law. This method is extended for handling incomplete sales data because of stock-outs by introducing maximum likelihood estimation for censored data. The way for optimal stock determination with pricing the cost of waste reduction is also proposed. This study focuses on the examination of the methods for large sales numbers where Taylor's law is obvious. Numerical analysis using aggregated POS data shows the effectiveness of the methods to reduce food waste maintaining a high profit for large sales numbers. Moreover, the way of pricing the cost of waste reduction reveals that a small profit loss realizes substantial waste reduction, especially in the case that the proportionality constant  of Taylor’s law is small. Specifically, around 1% profit loss realizes half disposal at =0.12, which is the actual  value of processed food items used in this research. The methods provide practical and effective solutions for waste reduction keeping a high profit, especially with large sales numbers.

Keywords: Food waste reduction, particle filter, point of sales, sustainable development goals, Taylor's Law, time series analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 835
7176 Experimental teaching, Perceived usefulness, Ease of use, Learning Interest and Science Achievement of Taiwan 8th Graders in TIMSS 2007 Database

Authors: Pei Wen Liao, Tsung Hau Jen

Abstract:

the data of Taiwanese 8th grader in the 4th cycle of Trends in International Mathematics and Science Study (TIMSS) are analyzed to examine the influence of the science teachers- preference in experimental teaching on the relationships between the affective variables ( the perceived usefulness of science, ease of using science and science learning interest) and the academic achievement in science. After dealing with the missing data, 3711 students and 145 science teacher-s data were analyzed through a Hierarchical Linear Modeling technique. The major objective of this study was to determine the role of the experimental teaching moderates the relationship between perceived usefulness and achievement.

Keywords: TIMSS database, Science achievement, Experimental teaching, Perceived Usefulness, Perceived Ease of Use

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1625
7175 Effect of Two Different Biochars on Germination and Seedlings Growth of Salad, Cress and Barley

Authors: L. Bouqbis, H.W. Koyro, M. C. Harrouni, S. Daoud, L. F. Z. Ainlhout, C. I. Kammann

Abstract:

The application of biochar to soils is becoming more and more common. Its application which is generally reported to improve the physical, chemical, and biological properties of soils, has an indirect effect on soil health and increased crop yields. However, many of the previous results are highly variable and dependent mainly on the initial soil properties, biochar characteristics, and production conditions. In this study, two biochars which are biochar II (BC II) derived from a blend of paper sludge and wheat husks and biochar 005 (BC 005) derived from sewage sludge with a KCl additive, are used, and the physical and chemical properties of BC II are characterized. To determine the potential impact of salt stress and toxic and volatile substances, the second part of this study focused on the effect biochars have on germination of salad (Lactuca sativa L.), barley (Hordeum vulgare), and cress (Lepidium sativum) respectively. Our results indicate that Biochar II showed some unique properties compared to the soil, such as high EC, high content of K, Na, Mg, and low content of heavy metals. Concerning salad and barley germination test, no negative effect of BC II and BC 005 was observed. However, a negative effect of BC 005 at 8% level was revealed. The test of the effect of volatile substances on germination of cress revealed a positive effect of BC II, while a negative effect was observed for BC 005. Moreover, the water holding capacities of biochar-sand mixtures increased with increasing biochar application. Collectively, BC II could be safely used for agriculture and could provide the potential for a better plant growth.

Keywords: Biochar, phytotoxic tests, seedlings growth, water holding capacity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1056
7174 A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)

Authors: Rekha Kandwal, Kamal K.Bharadwaj

Abstract:

Knowledge is indispensable but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining activity is the generation of large number of potential rules as a result of mining process. In fact sometimes result size is comparable to the original data. Traditional data mining pruning activities such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, thereby making knowledge voluminous with each change. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs), as an extension of production rules, that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence, are tight or there is simply no information available as to whether it holds or not. Thus the 'If P Then D' part of the CPR expresses important information while the Unless C part acts only as a switch changes the polarity of D to ~D. In this paper a scheme based on Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from the discovered flat PRs. The discovery of CPRs from flat rules would result in considerable reduction of the already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.

Keywords: Censored production rules, cumulative learning, data mining, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1462
7173 Service Quality and Consumer Behavior on Metered Taxi Services

Authors: Nattapong Techarattanased

Abstract:

The purposes of this research are to make comparisons in respect of the behaviors on the use of the services of metered taxi classified by the demographic factor and to study the influence of the recognition on service quality having the effect on usage behaviors of metered taxi services of consumers in Bangkok Metropolitan Areas. The samples used in this research were 400 metered taxi service users in Bangkok Metropolitan Areas and questionnaire was used as the tool for collecting the data. Analysis statistics are mean and multiple regression analysis. Results of the research revealed that the consumers recognize the overall quality of services in each aspect include tangible aspects of the service, responses to customers, assurance on the confidence, understanding and knowing of customers which is rated at the moderate level except the aspect of the assurance on the confidence and trustworthiness which are rated at a high level. For the result of hypothetical test, it is found that the quality in providing the services on the aspect of the assurance given to the customers has the effect on the usage behaviors of metered taxi services and the aspect of the frequency on the use of the services per month which in this connection. Such variable can forecast at one point nine percent (1.9%). In addition, quality in providing the services and the aspect of the responses to customers have the effect on the behaviors on the use of metered taxi services on the aspect of the expenses on the use of services per month which in this connection, such variable can forecast at two point one percent (2.1%).

Keywords: Consumer behavior, metered taxi, satisfaction, service quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3526
7172 Vitamin D Deficiency and Insufficiency in Postmenopausal Women with Obesity

Authors: Vladyslav Povoroznyuk, Anna Musiienko, Nataliia Dzerovych, Roksolana Povoroznyuk, Oksana Ivanyk

Abstract:

Deficiency and insufficiency of Vitamin D is a pandemic of the 21st century. Obesity patients have a lower level of vitamin D, but the literature data are contradictory. The purpose of this study is to investigate deficiency and insufficiency vitamin D in postmenopausal women with obesity. We examined 1007 women aged 50-89 years. Mean age was 65.74±8.61 years; mean height was 1.61±0.07 m; mean weight was 70.65±13.50 kg; mean body mass index was 27.27±4.86 kg/m2, and mean 25(OH) D levels in serum was 26.00±12.00 nmol/l. The women were divided into the following six groups depending on body mass index: I group – 338 women with normal body weight, II group – 16 women with insufficient body weight, III group – 382 women with excessive body weight, IV group – 199 women with obesity of class I, V group – 60 women with obesity of class II, and VI group – 12 women with obesity of class III. Level of 25(OH)D in serum was measured by means of an electrochemiluminescent method - Elecsys 2010 analyzer (Roche Diagnostics, Germany) and cobas test-systems. 34.4% of the examined women have deficiency of vitamin D and 31.4% insufficiency. Women with obesity of class I (23.60±10.24 ng/ml) and obese of class II (22.38±10.34 ng/ml) had significantly lower levels of 25 (OH) D compared to women with normal body weight (28.24±12.99 ng/ml), p=0.00003. In women with obesity, BMI significantly influences vitamin D level, and this influence does not depend on the season.

Keywords: Obesity, body mass index, vitamin D deficiency/insufficiency, postmenopausal women, age.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1041
7171 Sand Production Modelled with Darcy Fluid Flow Using Discrete Element Method

Authors: M. N. Nwodo, Y. P. Cheng, N. H. Minh

Abstract:

In the process of recovering oil in weak sandstone formations, the strength of sandstones around the wellbore is weakened due to the increase of effective stress/load from the completion activities around the cavity. The weakened and de-bonded sandstone may be eroded away by the produced fluid, which is termed sand production. It is one of the major trending subjects in the petroleum industry because of its significant negative impacts, as well as some observed positive impacts. For efficient sand management therefore, there has been need for a reliable study tool to understand the mechanism of sanding. One method of studying sand production is the use of the widely recognized Discrete Element Method (DEM), Particle Flow Code (PFC3D) which represents sands as granular individual elements bonded together at contact points. However, there is limited knowledge of the particle-scale behavior of the weak sandstone, and the parameters that affect sanding. This paper aims to investigate the reliability of using PFC3D and a simple Darcy flow in understanding the sand production behavior of a weak sandstone. An isotropic tri-axial test on a weak oil sandstone sample was first simulated at a confining stress of 1MPa to calibrate and validate the parallel bond models of PFC3D using a 10m height and 10m diameter solid cylindrical model. The effect of the confining stress on the number of bonds failure was studied using this cylindrical model. With the calibrated data and sample material properties obtained from the tri-axial test, simulations without and with fluid flow were carried out to check on the effect of Darcy flow on bonds failure using the same model geometry. The fluid flow network comprised of every four particles connected with tetrahedral flow pipes with a central pore or flow domain. Parametric studies included the effects of confining stress, and fluid pressure; as well as validating flow rate – permeability relationship to verify Darcy’s fluid flow law. The effect of model size scaling on sanding was also investigated using 4m height, 2m diameter model. The parallel bond model successfully calibrated the sample’s strength of 4.4MPa, showing a sharp peak strength before strain-softening, similar to the behavior of real cemented sandstones. There seems to be an exponential increasing relationship for the bigger model, but a curvilinear shape for the smaller model. The presence of the Darcy flow induced tensile forces and increased the number of broken bonds. For the parametric studies, flow rate has a linear relationship with permeability at constant pressure head. The higher the fluid flow pressure, the higher the number of broken bonds/sanding. The DEM PFC3D is a promising tool to studying the micromechanical behavior of cemented sandstones.

Keywords: Discrete Element Method, fluid flow, parametric study, sand production/bonds failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1766
7170 Pattern Classification of Back-Propagation Algorithm Using Exclusive Connecting Network

Authors: Insung Jung, Gi-Nam Wang

Abstract:

The objective of this paper is to a design of pattern classification model based on the back-propagation (BP) algorithm for decision support system. Standard BP model has done full connection of each node in the layers from input to output layers. Therefore, it takes a lot of computing time and iteration computing for good performance and less accepted error rate when we are doing some pattern generation or training the network. However, this model is using exclusive connection in between hidden layer nodes and output nodes. The advantage of this model is less number of iteration and better performance compare with standard back-propagation model. We simulated some cases of classification data and different setting of network factors (e.g. hidden layer number and nodes, number of classification and iteration). During our simulation, we found that most of simulations cases were satisfied by BP based using exclusive connection network model compared to standard BP. We expect that this algorithm can be available to identification of user face, analysis of data, mapping data in between environment data and information.

Keywords: Neural network, Back-propagation, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633
7169 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively few numbers of attacks make the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for both implementing the IDS and methods in correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating their algorithms.

Keywords: IDS, intrusion detection system, imbalanced datasets, sampling algorithms, big data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1062
7168 Disidentification of Historical City Centers: A Comparative Study of the Old and New Settlements of Mardin, Turkey

Authors: Fatma Kürüm Varolgüneş, Fatih Canan

Abstract:

Mardin is one of the unique cities in Turkey with its rich cultural and historical heritage. Mardin’s traditional dwellings have been affected both by natural data such as climate and topography and by cultural data like lifestyle and belief. However, in the new settlements, housing is formed with modern approaches and unsuitable forms clashing with Mardin’s culture and environment. While the city is expanding, traditional textures are ignored. Thus, traditional settlements are losing their identity and are vanishing because of the rapid change and transformation. The main aim of this paper is to determine the physical and social data needed to define the characteristic features of Mardin’s old and new settlements. In this context, based on social and cultural data, old and new settlement formations of Mardin have been investigated from various aspects. During this research, the following methods have been utilized: observations, interviews, public surveys, literature review, as well as site examination via maps, photographs and questionnaire methodology. In conclusion, this paper focuses on how changes in the physical forms of cities affect the typology and the identity of cities, as in the case of Mardin.

Keywords: Urban and local identity, historical city center, traditional settlements, Mardin, Turkey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014
7167 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection

Authors: Florin Gorunescu

Abstract:

Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.

Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1837
7166 From Electroencephalogram to Epileptic Seizures Detection by Using Artificial Neural Networks

Authors: Gaetano Zazzaro, Angelo Martone, Roberto V. Montaquila, Luigi Pavone

Abstract:

Seizure is the main factor that affects the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of epileptogenic zone, is commonly made by using continuous Electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is made manually by epileptologists and this process is usually very long and error prone. The aim of this paper is to describe an automated method able to detect seizures in EEG signals, using knowledge discovery in database process and data mining methods and algorithms, which can support physicians during the seizure detection process. Our detection method is based on Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and by using a software application, called Training Builder that has been developed for the massive extraction of features from EEG signals. This tool is able to cover all the data preparation steps ranging from signal processing to data analysis techniques, including the sliding window paradigm, the dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performances, reaching an accuracy of over 99% during tests on data of a single patient retrieved from a publicly available EEG dataset.

Keywords: Artificial Neural Network, Data Mining, Electroencephalogram, Epilepsy, Feature Extraction, Seizure Detection, Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1280
7165 Construction 4.0: The Future of the Construction Industry in South Africa

Authors: Temidayo. O. Osunsanmi, Clinton Aigbavboa, Ayodeji Oke

Abstract:

The construction industry is a renowned latecomer to the efficiency offered by the adoption of information technology. Whereas, the banking, manufacturing, retailing industries have keyed into the future by using digitization and information technology as a new approach for ensuring competitive gain and efficiency. The construction industry has yet to fully realize similar benefits because the adoption of ICT is still at the infancy stage with a major concentration on the use of software. Thus, this study evaluates the awareness and readiness of construction professionals towards embracing a full digitalization of the construction industry using construction 4.0. The term ‘construction 4.0’ was coined from the industry 4.0 concept which is regarded as the fourth industrial revolution that originated from Germany. A questionnaire was utilized for sourcing data distributed to practicing construction professionals through a convenience sampling method. Using SPSS v24, the hypotheses posed were tested with the Mann Whitney test. The result revealed that there are no differences between the consulting and contracting organizations on the readiness for adopting construction 4.0 concepts in the construction industry. Using factor analysis, the study discovers that adopting construction 4.0 will improve the performance of the construction industry regarding cost and time savings and also create sustainable buildings. In conclusion, the study determined that construction professionals have a low awareness towards construction 4.0 concepts. The study recommends an increase in awareness of construction 4.0 concepts through seminars, workshops and training, while construction professionals should take hold of the benefits of adopting construction 4.0 concepts. The study contributes to the roadmap for the implementation of construction industry 4.0 concepts in the South African construction industry.

Keywords: Building information technology, Construction 4.0, Industry 4.0, Smart Site.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5726
7164 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1176
7163 Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models

Authors: Yoonsuh Jung

Abstract:

As DNA microarray data contain relatively small sample size compared to the number of genes, high dimensional models are often employed. In high dimensional models, the selection of tuning parameter (or, penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for the tuning parameter selection, which selects a parameter value with the smallest cross-validated score. However, selecting a single value as an ‘optimal’ value for the parameter can be very unstable due to the sampling variation since the sample sizes of microarray data are often small. Our approach is to choose multiple candidates of tuning parameter first, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates rarely increase the computational cost, while it can considerably improve the traditional cross-validation. We show that the selected value from the suggested methods often lead to stable parameter selection as well as improved detection of significant genetic variables compared to the tradition cross-validation via real data and simulated data sets.

Keywords: Cross Validation, Parameter Averaging, Parameter Selection, Regularization Parameter Search.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550
7162 Simulation of Static Frequency Converter for Synchronous Machine Operation and Investigation of Shaft Voltage

Authors: Arun Kumar Datta, M. A. Ansari, N. R. Mondal, B. V. Raghavaiah, Manisha Dubey, Shailendra Jain

Abstract:

This study is carried out to understand the effects of Static frequency converter (SFC) on large machine. SFC has a feature of four quadrant operations. By virtue of this it can be implemented to run a synchronous machine either as a motor or alternator. This dual mode operation helps a single machine to start & run as a motor and then it can be converted as an alternator whenever required. One such dual purpose machine is taken here for study. This machine is installed at a laboratory carrying out short circuit test on high power electrical equipment. SFC connected with this machine is broadly described in this paper. The same SFC has been modeled with the MATLAB/Simulink software. The data applied on this virtual model are the actual parameters from SFC and synchronous machine. After running the model, simulated machine voltage and current waveforms are validated with the real measurements. Processing of these waveforms is done through Fast Fourier Transformation (FFT) which reveals that the waveforms are not sinusoidal rather they contain number of harmonics. These harmonics are the major cause of generating shaft voltage. It is known that bearings of electrical machine are vulnerable to current flow through it due to shaft voltage. A general discussion on causes of shaft voltage in perspective with this machine is presented in this paper.

Keywords: Alternators, AC-DC power conversion, capacitive coupling, electric discharge machining, frequency converter, Fourier transforms, inductive coupling, simulation, Shaft voltage, synchronous machines, static excitation, thyristor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6008
7161 Using ALOHA Code to Evaluate CO2 Concentration for Maanshan Nuclear Power Plant

Authors: W. S. Hsu, S. W. Chen, Y. T. Ku, Y. Chiang, J. R. Wang , J. H. Yang, C. Shih

Abstract:

ALOHA code was used to calculate the concentration under the CO2 storage burst condition for Maanshan nuclear power plant (NPP) in this study. Five main data are input into ALOHA code including location, building, chemical, atmospheric, and source data. The data from Final Safety Analysis Report (FSAR) and some reports were used in this study. The ALOHA results are compared with the failure criteria of R.G. 1.78 to confirm the habitability of control room. The result of comparison presents that the ALOHA result is below the R.G. 1.78 criteria. This implies that the habitability of control room can be maintained in this case. The sensitivity study for atmospheric parameters was performed in this study. The results show that the wind speed has the larger effect in the concentration calculation.

Keywords: PWR, ALOHA, habitability, Maanshan.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 708
7160 Review and Comparison of Associative Classification Data Mining Approaches

Authors: Suzan Wedyan

Abstract:

Associative classification (AC) is a data mining approach that combines association rule and classification to build classification models (classifiers). AC has attracted a significant attention from several researchers mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed such as Classification based Association (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predicted Association Rule (CPAR). This paper surveys major AC algorithms and compares the steps and methods performed in each algorithm including: rule learning, rule sorting, rule pruning, classifier building, and class prediction.

Keywords: Associative Classification, Classification, Data Mining, Learning, Rule Ranking, Rule Pruning, Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6599
7159 Land Use/Land Cover Mapping Using Landsat 8 and Sentinel-2 in a Mediterranean Landscape

Authors: M. Vogiatzis, K. Perakis

Abstract:

Spatial-explicit and up-to-date land use/land cover information is fundamental for spatial planning, land management, sustainable development, and sound decision-making. In the last decade, many satellite-derived land cover products at different spatial, spectral, and temporal resolutions have been developed, such as the European Copernicus Land Cover product. However, more efficient and detailed information for land use/land cover is required at the regional or local scale. A typical Mediterranean basin with a complex landscape comprised of various forest types, crops, artificial surfaces, and wetlands was selected to test and develop our approach. In this study, we investigate the improvement of Copernicus Land Cover product (CLC2018) using Landsat 8 and Sentinel-2 pixel-based classification based on all available existing geospatial data (Forest Maps, LPIS, Natura2000 habitats, cadastral parcels, etc.). We examined and compared the performance of the Random Forest classifier for land use/land cover mapping. In total, 10 land use/land cover categories were recognized in Landsat 8 and 11 in Sentinel-2A. A comparison of the overall classification accuracies for 2018 shows that Landsat 8 classification accuracy was slightly higher than Sentinel-2A (82,99% vs. 80,30%). We concluded that the main land use/land cover types of CLC2018, even within a heterogeneous area, can be successfully mapped and updated according to CLC nomenclature. Future research should be oriented toward integrating spatiotemporal information from seasonal bands and spectral indexes in the classification process.

Keywords: land use/land cover, random forest, Landsat-8 OLI, Sentinel-2A MSI, Corine land cover

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 281
7158 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li

Abstract:

As a structure for processing string problem, suffix array is certainly widely-known and extensively-studied. But if the string access pattern follows the “90/10" rule, suffix array can not take advantage of the fact that we often find something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10" rule, it requires many structures and an excessive amount of pointer manipulations for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay tree and suffix arrays into a new approach which is suitable to implementation on both conventional and clustered computers.

Keywords: suffix arrays, splay tree, string search, distributedalgorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
7157 Dimension Free Rigid Point Set Registration in Linear Time

Authors: Jianqin Qu

Abstract:

This paper proposes a rigid point set matching algorithm in arbitrary dimensions based on the idea of symmetric covariant function. A group of functions of the points in the set are formulated using rigid invariants. Each of these functions computes a pair of correspondence from the given point set. Then the computed correspondences are used to recover the unknown rigid transform parameters. Each computed point can be geometrically interpreted as the weighted mean center of the point set. The algorithm is compact, fast, and dimension free without any optimization process. It either computes the desired transform for noiseless data in linear time, or fails quickly in exceptional cases. Experimental results for synthetic data and 2D/3D real data are provided, which demonstrate potential applications of the algorithm to a wide range of problems.

Keywords: Covariant point, point matching, dimension free, rigid registration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 663
7156 Functional and Efficient Query Interpreters: Principle, Application and Performances’ Comparison

Authors: Laurent Thiry, Michel Hassenforder

Abstract:

This paper presents a general approach to implement efficient queries’ interpreters in a functional programming language. Indeed, most of the standard tools actually available use an imperative and/or object-oriented language for the implementation (e.g. Java for Jena-Fuseki) but other paradigms are possible with, maybe, better performances. To proceed, the paper first explains how to model data structures and queries in a functional point of view. Then, it proposes a general methodology to get performances (i.e. number of computation steps to answer a query) then it explains how to integrate some optimization techniques (short-cut fusion and, more important, data transformations). It then compares the functional server proposed to a standard tool (Fuseki) demonstrating that the first one can be twice to ten times faster to answer queries.

Keywords: Data transformation, functional programming, information server, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 724
7155 Analysis of Surface Hardness, Surface Roughness, and Near Surface Microstructure of AISI 4140 Steel Worked with Turn-Assisted Deep Cold Rolling Process

Authors: P. R. Prabhu, S. M. Kulkarni, S. S. Sharma, K. Jagannath, Achutha Kini U.

Abstract:

In the present study, response surface methodology has been used to optimize turn-assisted deep cold rolling process of AISI 4140 steel. A regression model is developed to predict surface hardness and surface roughness using response surface methodology and central composite design. In the development of predictive model, deep cold rolling force, ball diameter, initial roughness of the workpiece, and number of tool passes are considered as model variables. The rolling force and the ball diameter are the significant factors on the surface hardness and ball diameter and numbers of tool passes are found to be significant for surface roughness. The predicted surface hardness and surface roughness values and the subsequent verification experiments under the optimal operating conditions confirmed the validity of the predicted model. The absolute average error between the experimental and predicted values at the optimal combination of parameter settings for surface hardness and surface roughness is calculated as 0.16% and 1.58% respectively. Using the optimal processing parameters, the surface hardness is improved from 225 to 306 HV, which resulted in an increase in the near surface hardness by about 36% and the surface roughness is improved from 4.84µm to 0.252 µm, which resulted in decrease in the surface roughness by about 95%. The depth of compression is found to be more than 300µm from the microstructure analysis and this is in correlation with the results obtained from the microhardness measurements. Taylor hobson talysurf tester, micro vickers hardness tester, optical microscopy and X-ray diffractometer are used to characterize the modified surface layer. 

Keywords: Surface hardness, response surface methodology, microstructure, central composite design, deep cold rolling, surface roughness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777
7154 A Secure Proxy Signature Scheme with Fault Tolerance Based on RSA System

Authors: H. El-Kamchouchi, Heba Gaber, Fatma Ahmed, Dalia H. El-Kamchouchi

Abstract:

Due to the rapid growth in modern communication systems, fault tolerance and data security are two important issues in a secure transaction. During the transmission of data between the sender and receiver, errors may occur frequently. Therefore, the sender must re-transmit the data to the receiver in order to correct these errors, which makes the system very feeble. To improve the scalability of the scheme, we present a secure proxy signature scheme with fault tolerance over an efficient and secure authenticated key agreement protocol based on RSA system. Authenticated key agreement protocols have an important role in building a secure communications network between the two parties.

Keywords: Proxy signature, fault tolerance, RSA, key agreement protocol.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1453
7153 Performance of the Aptima® HIV-1 Quant Dx Assay on the Panther System

Authors: Siobhan O’Shea, Sangeetha Vijaysri Nair, Hee Cheol Kim, Charles Thomas Nugent, Cheuk Yan William Tong, Sam Douthwaite, Andrew Worlock

Abstract:

The Aptima® HIV-1 Quant Dx Assay is a fully automated assay on the Panther system. It is based on Transcription- Mediated Amplification and real time detection technologies. This assay is intended for monitoring HIV-1 viral load in plasma specimens and for the detection of HIV-1 in plasma and serum specimens. Nine-hundred and seventy nine specimens selected at random from routine testing at St Thomas’ Hospital, London were anonymised and used to compare the performance of the Aptima HIV-1 Quant Dx assay and Roche COBAS® AmpliPrep/COBAS® TaqMan® HIV-1 Test, v2.0. Two-hundred and thirty four specimens gave quantitative HIV-1 viral load results in both assays. The quantitative results reported by the Aptima Assay were comparable to those reported by the Roche COBAS AmpliPrep/COBAS TaqMan HIV-1 Test, v2.0 with a linear regression slope of 1.04 and an intercept on -0.097. The Aptima assay detected HIV-1 in more samples than the COBAS assay. This was not due to lack of specificity of the Aptima assay because this assay gave 99.83% specificity on testing plasma specimens from 600 HIV-1 negative individuals. To understand the reason for this higher detection rate a side-by-side comparison of low level panels made from the HIV-1 3rd international standard (NIBSC10/152) and clinical samples of various subtypes were tested in both assays. The Aptima assay was more sensitive than the COBAS assay. The good sensitivity, specificity and agreement with other commercial assays make the HIV-1 Quant Dx Assay appropriate for both viral load monitoring and detection of HIV-1 infections.

Keywords: HIV viral load, Aptima, Roche, Panther system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3193
7152 Rapid Determination of Biochemical Oxygen Demand

Authors: Mayur Milan Kale, Indu Mehrotra

Abstract:

Biochemical Oxygen Demand (BOD) is a measure of the oxygen used in bacteria mediated oxidation of organic substances in water and wastewater. Theoretically an infinite time is required for complete biochemical oxidation of organic matter, but the measurement is made over 5-days at 20 0C or 3-days at 27 0C test period with or without dilution. Researchers have worked to further reduce the time of measurement. The objective of this paper is to review advancement made in BOD measurement primarily to minimize the time and negate the measurement difficulties. Survey of literature review in four such techniques namely BOD-BARTTM, Biosensors, Ferricyanidemediated approach, luminous bacterial immobilized chip method. Basic principle, method of determination, data validation and their advantage and disadvantages have been incorporated of each of the methods. In the BOD-BARTTM method the time lag is calculated for the system to change from oxidative to reductive state. BIOSENSORS are the biological sensing element with a transducer which produces a signal proportional to the analyte concentration. Microbial species has its metabolic deficiencies. Co-immobilization of bacteria using sol-gel biosensor increases the range of substrate. In ferricyanidemediated approach, ferricyanide has been used as e-acceptor instead of oxygen. In Luminous bacterial cells-immobilized chip method, bacterial bioluminescence which is caused by lux genes was observed. Physiological responses is measured and correlated to BOD due to reduction or emission. There is a scope to further probe into the rapid estimation of BOD.

Keywords: BOD, Four methods, Rapid estimation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3618