Search results for: Data availability
7532 Using Data Clustering in Oral Medicine
Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson
Abstract:
The vast amount of information hidden in huge databases has created tremendous interests in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and classification of similar patient examinations. Commonly used data clustering algorithms have been reviewed and as a result several interesting results have been gathered.Keywords: Oral Medicine, Cluto, Data Clustering, Data Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19777531 The Application of Data Mining Technology in Building Energy Consumption Data Analysis
Authors: Liang Zhao, Jili Zhang, Chongquan Zhong
Abstract:
Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.
Keywords: Data mining, data analysis, prediction, optimization, building operational performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37097530 Query Algebra for Semistuctured Data
Authors: Ei Ei Myat, Ni Lar Thein
Abstract:
With the tremendous growth of World Wide Web (WWW) data, there is an emerging need for effective information retrieval at the document level. Several query languages such as XML-QL, XPath, XQL, Quilt and XQuery are proposed in recent years to provide faster way of querying XML data, but they still lack of generality and efficiency. Our approach towards evolving a framework for querying semistructured documents is based on formal query algebra. Two elements are introduced in the proposed framework: first, a generic and flexible data model for logical representation of semistructured data and second, a set of operators for the manipulation of objects defined in the data model. In additional to accommodating several peculiarities of semistructured data, our model offers novel features such as bidirectional paths for navigational querying and partitions for data transformation that are not available in other proposals.Keywords: Algebra, Semistructured data, Query Algebra.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13767529 Simulation Data Summarization Based on Spatial Histograms
Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura
Abstract:
In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.Keywords: Simulation data, data summarization, spatial histograms, exploration and visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7537528 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis
Authors: Reza Nadimi, Fariborz Jolai
Abstract:
This article combines two techniques: data envelopment analysis (DEA) and Factor analysis (FA) to data reduction in decision making units (DMU). Data envelopment analysis (DEA), a popular linear programming technique is useful to rate comparatively operational efficiency of decision making units (DMU) based on their deterministic (not necessarily stochastic) input–output data and factor analysis techniques, have been proposed as data reduction and classification technique, which can be applied in data envelopment analysis (DEA) technique for reduction input – output data. Numerical results reveal that the new approach shows a good consistency in ranking with DEA.Keywords: Effectiveness, Decision Making, Data EnvelopmentAnalysis, Factor Analysis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24257527 Behavior of Engineering Students in Kuwait University
Authors: M. A. Al-Ajmi, R. S. Al-Kandari
Abstract:
This initial study is concerned with the behavior of engineering students in Kuwait University which became a concern due to the global issues of education in all levels. A survey has been conducted to identify academic and societal issues affecting the engineering student performance. The study is drawing major conclusions with regard to private tutoring and the online availability of textbooks’ solution manuals.
Keywords: Solution manual, engineering, textbook, ethics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17517526 Selenium Content in Agricultural Soils and Wheat from the Balkan Peninsula
Authors: S. Krustev, V. Angelova, P. Zaprjanova
Abstract:
Selenium (Se) is an essential micro-nutrient for human and animals but it is highly toxic. Its organic compounds play an important role in biochemistry and nutrition of the cells. Concentration levels of this element in the different regions of the world vary considerably. This study aimed to compare the availability and levels of the Se in some rural areas of the Balkan Peninsula and relationship with the concentrations of other trace elements. For this purpose soil samples and wheat grains from different regions of Bulgaria, Serbia, Nord Macedonia, Romania, and Greece situated far from large industrial centers have been analyzed. The main methods for their determination were the atomic spectral techniques – atomic absorption and plasma atomic emission. As a result of this study, data on microelements levels from the main grain-producing regions of the Balkan Peninsula were determined and systematized. The presented results confirm the low levels of Se in this region: 0.222– 0.962 mg.kg-1 in soils and 0.001 - 0.005 mg.kg-1 in wheat grains and require measures to offset the effect of this deficiency.
Keywords: Agricultural soils, Balkan Peninsula, rural areas, selenium.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6557525 Performance Analysis of Expert Systems Incorporating Neural Network for Fault Detection of an Electric Motor
Authors: M. Khatami Rad, N. Jamali, M. Torabizadeh, A. Noshadi
Abstract:
In this paper, an artificial neural network simulator is employed to carry out diagnosis and prognosis on electric motor as rotating machinery based on predictive maintenance. Vibration data of the primary failed motor including unbalance, misalignment and bearing fault were collected for training the neural network. Neural network training was performed for a variety of inputs and the motor condition was used as the expert training information. The main purpose of applying the neural network as an expert system was to detect the type of failure and applying preventive maintenance. The advantage of this study is for machinery Industries by providing appropriate maintenance that has an essential activity to keep the production process going at all processes in the machinery industry. Proper maintenance is pivotal in order to prevent the possible failures in operating system and increase the availability and effectiveness of a system by analyzing vibration monitoring and developing expert system.Keywords: Condition based monitoring, expert system, neural network, fault detection, vibration monitoring.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19907524 An Integrated Logistics Model of Spare Parts Maintenance Planning within the Aviation Industry
Authors: Roy Fritzsche, Rainer Lasch
Abstract:
Avoidable unscheduled maintenance events and unnecessary spare parts deliveries are mostly caused by an incorrect choice of the underlying maintenance strategy. For a faster and more efficient supply of spare parts for aircrafts of an airline we examine options for improving the underlying logistics network integrated in an existing aviation industry network. This paper presents a dynamic prediction model as decision support for maintenance method selection considering requirements of an entire flight network. The objective is to guarantee a high supply of spare parts by an optimal interaction of various network levels and thus to reduce unscheduled maintenance events and minimize total costs. By using a prognostics-based preventive maintenance strategy unscheduled component failures are avoided for an increase in availability and reliability of the entire system. The model is intended for use in an aviation company that utilizes a structured planning process based on collected failures data of components.Keywords: Aviation industry, Prognosis, Reliability, Preventive maintenance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 45387523 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data
Authors: Cristina G. Dascâlu, Corina Dima Cozma, Elena Carmen Cotrutz
Abstract:
The medical data statistical analysis often requires the using of some special techniques, because of the particularities of these data. The principal components analysis and the data clustering are two statistical methods for data mining very useful in the medical field, the first one as a method to decrease the number of studied parameters, and the second one as a method to analyze the connections between diagnosis and the data about the patient-s condition. In this paper we investigate the implications obtained from a specific data analysis technique: the data clustering preceded by a selection of the most relevant parameters, made using the principal components analysis. Our assumption was that, using the principal components analysis before data clustering - in order to select and to classify only the most relevant parameters – the accuracy of clustering is improved, but the practical results showed the opposite fact: the clustering accuracy decreases, with a percentage approximately equal with the percentage of information loss reported by the principal components analysis.Keywords: Data clustering, medical data, principal components analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15027522 Exploring Utility and Intrinsic Value among UAE Arabic Teachers in Integrating M-Learning
Authors: Dina Tareq Ismail, Alexandria A. Proff
Abstract:
The United Arab Emirates (UAE) is a nation seeking to advance in all fields, particularly education. One area of focus for UAE 2021 agenda is to restructure UAE schools and universities by equipping them with highly developed technology. The agenda also advises educational institutions to prepare students with applicable and transferrable Information and Communication Technology (ICT) skills. Despite the emphasis on ICT and computer literacy skills, there exists limited empirical data on the use of M-Learning in the literature. This qualitative study explores the motivation of higher primary Arabic teachers in private schools toward implementing and integrating M-Learning apps in their classrooms. This research employs a phenomenological approach through the use of semistructured interviews with nine purposefully selected Arabic teachers. The data were analyzed using a content analysis via multiple stages of coding: open, axial, and thematic. Findings reveal three primary themes: (1) Arabic teachers with high levels of procedural knowledge in ICT are more motivated to implement M-Learning; (2) Arabic teachers' perceptions of self-efficacy influence their motivation toward implementation of M-Learning; (3) Arabic teachers implement M-Learning when they possess high utility and/or intrinsic value in these applications. These findings indicate a strong need for further training, equipping, and creating buy-in among Arabic teachers to enhance their ICT skills in implementing M-Learning. Further, given the limited availability of M-Learning apps designed for use in the Arabic language on the market, it is imperative that developers consider designing M-Learning tools that Arabic teachers, and Arabic-speaking students, can use and access more readily. This study contributes to closing the knowledge gap on teacher-motivation for implementing M-Learning in their classrooms in the UAE.Keywords: ICT Skills, M-Learning, self-efficacy, teachermotivation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4837521 Acetalization of Carbonyl Compounds by Using Al2 (HPO4)3 under Green Condition Mg HPO4
Authors: Fariba Jafari, Samaneh Heydarian
Abstract:
Al2(HPO4)3 was easily prepared and used as a solid acid in acetalization of carbonyl compounds at room temperature and under solvent-free conditions. The protection was done in short reaction times and in good to high isolated yields. The cheapness and availability of this reagent with easy procedure and work-up make this method attractive for the organic synthesis.Keywords: Acetalization, acid catalysis, carbonylcompounds, green condition, protection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13427520 Project Complexity Indices based on Topology Features
Authors: Amer A. Boushaala
Abstract:
The heuristic decision rules used for project scheduling will vary depending upon the project-s size, complexity, duration, personnel, and owner requirements. The concept of project complexity has received little detailed attention. The need to differentiate between easy and hard problem instances and the interest in isolating the fundamental factors that determine the computing effort required by these procedures inspired a number of researchers to develop various complexity measures. In this study, the most common measures of project complexity are presented. A new measure of project complexity is developed. The main privilege of the proposed measure is that, it considers size, shape and logic characteristics, time characteristics, resource demands and availability characteristics as well as number of critical activities and critical paths. The degree of sensitivity of the proposed measure for complexity of project networks has been tested and evaluated against the other measures of complexity of the considered fifty project networks under consideration in the current study. The developed measure showed more sensitivity to the changes in the network data and gives accurate quantified results when comparing the complexities of networks.Keywords: Activity networks, Complexity index, Networkcomplexity measure, Network topology, Project Network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16817519 MATLAB/SIMULINK Based Model of Single- Machine Infinite-Bus with TCSC for Stability Studies and Tuning Employing GA
Authors: Sidhartha Panda, Narayana Prasad Padhy
Abstract:
With constraints on data availability and for study of power system stability it is adequate to model the synchronous generator with field circuit and one equivalent damper on q-axis known as the model 1.1. This paper presents a systematic procedure for modelling and simulation of a single-machine infinite-bus power system installed with a thyristor controlled series compensator (TCSC) where the synchronous generator is represented by model 1.1, so that impact of TCSC on power system stability can be more reasonably evaluated. The model of the example power system is developed using MATLAB/SIMULINK which can be can be used for teaching the power system stability phenomena, and also for research works especially to develop generator controllers using advanced technologies. Further, the parameters of the TCSC controller are optimized using genetic algorithm. The non-linear simulation results are presented to validate the effectiveness of the proposed approach.
Keywords: Genetic algorithm, MATLAB/SIMULINK, modelling and simulation, power system stability, single-machineinfinite-bus power system, thyristor controlled series compensator.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 165137518 Optimized Data Fusion in an Intelligent Integrated GPS/INS System Using Genetic Algorithm
Authors: Ali Asadian, Behzad Moshiri, Ali Khaki Sedigh, Caro Lucas
Abstract:
Most integrated inertial navigation systems (INS) and global positioning systems (GPS) have been implemented using the Kalman filtering technique with its drawbacks related to the need for predefined INS error model and observability of at least four satellites. Most recently, a method using a hybrid-adaptive network based fuzzy inference system (ANFIS) has been proposed which is trained during the availability of GPS signal to map the error between the GPS and the INS. Then it will be used to predict the error of the INS position components during GPS signal blockage. This paper introduces a genetic optimization algorithm that is used to update the ANFIS parameters with respect to the INS/GPS error function used as the objective function to be minimized. The results demonstrate the advantages of the genetically optimized ANFIS for INS/GPS integration in comparison with conventional ANFIS specially in the cases of satellites- outages. Coping with this problem plays an important role in assessment of the fusion approach in land navigation.Keywords: Adaptive Network based Fuzzy Inference System (ANFIS), Genetic optimization, Global Positioning System (GPS), Inertial Navigation System (INS).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19097517 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area
Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim
Abstract:
In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.Keywords: Data Estimation, link data, machine learning, road network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15047516 CNet Module Design of IMCS
Authors: Youkyung Park, SeungYup Kang, SungHo Kim, SimKyun Yook
Abstract:
IMCS is Integrated Monitoring and Control System for thermal power plant. This system consists of mainly two parts; controllers and OIS (Operator Interface System). These two parts are connected by Ethernet-based communication. The controller side of communication is managed by CNet module and OIS side is managed by data server of OIS. CNet module sends the data of controller to data server and receives commend data from data server. To minimizes or balance the load of data server, this module buffers data created by controller at every cycle and send buffered data to data server on request of data server. For multiple data server, this module manages the connection line with each data server and response for each request from multiple data server. CNet module is included in each controller of redundant system. When controller fail-over happens on redundant system, this module can provide data of controller to data sever without loss. This paper presents three main features – separation of get task, usage of ring buffer and monitoring communication status –of CNet module to carry out these functions.Keywords: Ethernet communication, DCS, power plant, ring buffer, data integrity
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15637515 Space-Time Variation in Rainfall and Runoff: Upper Betwa Catchment
Authors: Ritu Ahlawat
Abstract:
Among all geo-hydrological relationships, rainfallrunoff relationship is of utmost importance in any hydrological investigation and water resource planning. Spatial variation, lag time involved in obtaining areal estimates for the basin as a whole can affect the parameterization in design stage as well as in planning stage. In conventional hydrological processing of data, spatial aspect is either ignored or interpolated at sub-basin level. Temporal variation when analysed for different stages can provide clues for its spatial effectiveness. The interplay of space-time variation at pixel level can provide better understanding of basin parameters. Sustenance of design structures for different return periods and their spatial auto-correlations should be studied at different geographical scales for better management and planning of water resources. In order to understand the relative effect of spatio-temporal variation in hydrological data network, a detailed geo-hydrological analysis of Betwa river catchment falling in Lower Yamuna Basin is presented in this paper. Moreover, the exact estimates about the availability of water in the Betwa river catchment, especially in the wake of recent Betwa-Ken linkage project, need thorough scientific investigation for better planning. Therefore, an attempt in this direction is made here to analyse the existing hydrological and meteorological data with the help of SPSS, GIS and MS-EXCEL software. A comparison of spatial and temporal correlations at subcatchment level in case of upper Betwa reaches has been made to demonstrate the representativeness of rain gauges. First, flows at different locations are used to derive correlation and regression coefficients. Then, long-term normal water yield estimates based on pixel-wise regression coefficients of rainfall-runoff relationship have been mapped. The areal values obtained from these maps can definitely improve upon estimates based on point-based extrapolations or areal interpolations.Keywords: Catchment's runoff estimates, influence area regional regression coefficients, runoff yield series,
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21007514 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights
Authors: Tomy Prihananto, Damar Apri Sudarmadi
Abstract:
Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management and data management in Indonesia. This paper is a descriptive research with qualitative approach and collecting data from literature study. Results of this paper are comprehensive arrangement of data that have been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and protection of personal data are mutually reinforcing arrangements in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.
Keywords: Indonesia, protection, personal data, privacy, human rights, encryption.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9877513 Comparative Evaluation of Adaptive and Conventional Distance Relay for Parallel Transmission Line with Mutual Coupling
Authors: S.G. Srivani, Chandrasekhar Reddy Atla, K.P.Vittal
Abstract:
This paper presents the development of adaptive distance relay for protection of parallel transmission line with mutual coupling. The proposed adaptive relay, automatically adjusts its operation based on the acquisition of the data from distance relay of adjacent line and status of adjacent line from line circuit breaker IED (Intelligent Electronic Device). The zero sequence current of the adjacent parallel transmission line is used to compute zero sequence current ratio and the mutual coupling effect is fully compensated. The relay adapts to changing circumstances, like failure in communication from other relays and non - availability of adjacent transmission line. The performance of the proposed adaptive relay is tested using steady state and dynamic test procedures. The fault transients are obtained by simulating a realistic parallel transmission line system with mutual coupling effect in PSCAD. The evaluation test results show the efficacy of adaptive distance relay over the conventional distance relay.Keywords: Adaptive relaying, distance measurement, mutualcoupling, quadrilateral trip characteristic, zones of protection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31457512 Big Brain: A Single Database System for a Federated Data Warehouse Architecture
Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf
Abstract:
Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.Keywords: Data integration, data warehousing, federated architecture, online analytical processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7107511 ATM Service Analysis Using Predictive Data Mining
Authors: S. Madhavi, S. Abirami, C. Bharathi, B. Ekambaram, T. Krishna Sankar, A. Nattudurai, N. Vijayarangan
Abstract:
The high utilization rate of Automated Teller Machine (ATM) has inevitably caused the phenomena of waiting for a long time in the queue. This in turn has increased the out of stock situations. The ATM utilization helps to determine the usage level and states the necessity of the ATM based on the utilization of the ATM system. The time in which the ATM used more frequently (peak time) and based on the predicted solution the necessary actions are taken by the bank management. The analysis can be done by using the concept of Data Mining and the major part are analyzed based on the predictive data mining. The results are predicted from the historical data (past data) and track the relevant solution which is required. Weka tool is used for the analysis of data based on predictive data mining.
Keywords: ATM, Bank Management, Data Mining, Historical data, Predictive Data Mining, Weka tool.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 56137510 The Data Mining usage in Production System Management
Authors: Pavel Vazan, Pavol Tanuska, Michal Kebisek
Abstract:
The paper gives the pilot results of the project that is oriented on the use of data mining techniques and knowledge discoveries from production systems through them. They have been used in the management of these systems. The simulation models of manufacturing systems have been developed to obtain the necessary data about production. The authors have developed the way of storing data obtained from the simulation models in the data warehouse. Data mining model has been created by using specific methods and selected techniques for defined problems of production system management. The new knowledge has been applied to production management system. Gained knowledge has been tested on simulation models of the production system. An important benefit of the project has been proposal of the new methodology. This methodology is focused on data mining from the databases that store operational data about the production process.Keywords: data mining, data warehousing, management of production system, simulation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34767509 A Review: Comparative Study of Diverse Collection of Data Mining Tools
Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila
Abstract:
There have been a lot of efforts and researches undertaken in developing efficient tools for performing several tasks in data mining. Due to the massive amount of information embedded in huge data warehouses maintained in several domains, the extraction of meaningful pattern is no longer feasible. This issue turns to be more obligatory for developing several tools in data mining. Furthermore the major aspire of data mining software is to build a resourceful predictive or descriptive model for handling large amount of information more efficiently and user friendly. Data mining mainly contracts with excessive collection of data that inflicts huge rigorous computational constraints. These out coming challenges lead to the emergence of powerful data mining technologies. In this survey a diverse collection of data mining tools are exemplified and also contrasted with the salient features and performance behavior of each tool.
Keywords: Business Analytics, Data Mining, Data Analysis, Machine Learning, Text Mining, Predictive Analytics, Visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33647508 Landscape Data Transformation: Categorical Descriptions to Numerical Descriptors
Authors: Dennis A. Apuan
Abstract:
Categorical data based on description of the agricultural landscape imposed some mathematical and analytical limitations. This problem however can be overcome by data transformation through coding scheme and the use of non-parametric multivariate approach. The present study describes data transformation from qualitative to numerical descriptors. In a collection of 103 random soil samples over a 60 hectare field, categorical data were obtained from the following variables: levels of nitrogen, phosphorus, potassium, pH, hue, chroma, value and data on topography, vegetation type, and the presence of rocks. Categorical data were coded, and Spearman-s rho correlation was then calculated using PAST software ver. 1.78 in which Principal Component Analysis was based. Results revealed successful data transformation, generating 1030 quantitative descriptors. Visualization based on the new set of descriptors showed clear differences among sites, and amount of variation was successfully measured. Possible applications of data transformation are discussed.Keywords: data transformation, numerical descriptors, principalcomponent analysis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15057507 Copper Price Prediction Model for Various Economic Situations
Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin
Abstract:
Copper is an essential raw material used in the construction industry. During 2021 and the first half of 2022, the global market suffered from a significant fluctuation in copper raw material prices due to the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war which exposed its consumers to an unexpected financial risk. Thereto, this paper aims to develop two hybrid price prediction models using artificial neural network and long short-term memory (ANN-LSTM), by Python, that can forecast the average monthly copper prices, traded in the London Metal Exchange; the first model is a multivariate model that forecasts the copper price of the next 1-month and the second is a univariate model that predicts the copper prices of the upcoming three months. Historical data of average monthly London Metal Exchange copper prices are collected from January 2009 till July 2022 and potential external factors are identified and employed in the multivariate model. These factors lie under three main categories: energy prices, and economic indicators of the three major exporting countries of copper depending on the data availability. Before developing the LSTM models, the collected external parameters are analyzed with respect to the copper prices using correlation, and multicollinearity tests in R software; then, the parameters are further screened to select the parameters that influence the copper prices. Then, the two LSTM models are developed, and the dataset is divided into training, validation, and testing sets. The results show that the performance of the 3-month prediction model is better than the 1-month prediction model; but still, both models can act as predicting tools for diverse economic situations.
Keywords: Copper prices, prediction model, neural network, time series forecasting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877506 A Survey of Semantic Integration Approaches in Bioinformatics
Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir
Abstract:
Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18197505 The Mentoring in Professional Development of University Teachers
Authors: Nagore Guerra Bilbao, Clemente Lobato Fraile
Abstract:
Mentoring is provided by professionals with a higher level of experience and competence as part of the professional development of a university faculty. This paper explores the characteristics of the mentoring provided by those teachers participating in the development of an active methodology program run at the University of the Basque Country: to examine and to analyze mentors’ performance with the aim of providing empirical evidence regarding its value as a lifelong learning strategy for teaching staff. A total of 183 teachers were trained during the first three programs. The analysis method uses a coding technique and is based on flexible, systematic guidelines for gathering and analyzing qualitative data. The results have confirmed the conception of mentoring as a methodological innovation in higher education. In short, university teachers in general assessed the mentoring they received positively, considering it to be a valid, useful strategy in their professional development. They highlighted the methodological expertise of their mentor and underscored how they monitored the learning process of the active method and provided guidance and advice when necessary. Finally, they also drew attention to traits such as availability, personal commitment and flexibility in. However, a minority critique is pointed to some aspects of the performance of some mentors.
Keywords: Higher education, Mentoring, Professional development, University teachers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9517504 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering
Authors: Yunus Doğan, Ahmet Durap
Abstract:
Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.
Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12447503 Dimensional Modeling of HIV Data Using Open Source
Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer
Abstract:
Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1959