Search results for: Data Assimilation
6686 Image Distortion Correction Method of 2-MHz Side Scan Sonar for Underwater Structure Inspection
Authors: Youngseok Kim, Chul Park, Jonghwa Yi, Sangsik Choi
Abstract:
The 2-MHz Side Scan SONAR (SSS) attached to the boat for inspection of underwater structures is affected by shaking. It is difficult to determine the exact scale of damage of structure. In this study, a motion sensor is attached to the inside of the 2-MHz SSS to get roll, pitch, and yaw direction data, and developed the image stabilization tool to correct the sonar image. We checked that reliable data can be obtained with an average error rate of 1.99% between the measured value and the actual distance through experiment. It is possible to get the accurate sonar data to inspect damage in underwater structure.
Keywords: Image stabilization, motion sensor, safety inspection, sonar image, underwater structure.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10646685 Web Application to Profiling Scientific Institutions through Citation Mining
Authors: Hector D. Cortes, Jesus A. del Rio, Esther O. Garcia, Miguel Robles
Abstract:
Recently the use of data mining to scientific bibliographic data bases has been implemented to analyze the pathways of the knowledge or the core scientific relevances of a laureated novel or a country. This specific case of data mining has been named citation mining, and it is the integration of citation bibliometrics and text mining. In this paper we present an improved WEB implementation of statistical physics algorithms to perform the text mining component of citation mining. In particular we use an entropic like distance between the compression of text as an indicator of the similarity between them. Finally, we have included the recently proposed index h to characterize the scientific production. We have used this web implementation to identify users, applications and impact of the Mexican scientific institutions located in the State of Morelos.
Keywords: Citation Mining, Text Mining, Science Impact
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17596684 Questions Categorization in E-Learning Environment Using Data Mining Technique
Authors: Vilas P. Mahatme, K. K. Bhoyar
Abstract:
Nowadays, education cannot be imagined without digital technologies. It broadens the horizons of teaching learning processes. Several universities are offering online courses. For evaluation purpose, e-examination systems are being widely adopted in academic environments. Multiple-choice tests are extremely popular. Moving away from traditional examinations to e-examination, Moodle as Learning Management Systems (LMS) is being used. Moodle logs every click that students make for attempting and navigational purposes in e-examination. Data mining has been applied in various domains including retail sales, bioinformatics. In recent years, there has been increasing interest in the use of data mining in e-learning environment. It has been applied to discover, extract, and evaluate parameters related to student’s learning performance. The combination of data mining and e-learning is still in its babyhood. Log data generated by the students during online examination can be used to discover knowledge with the help of data mining techniques. In web based applications, number of right and wrong answers of the test result is not sufficient to assess and evaluate the student’s performance. So, assessment techniques must be intelligent enough. If student cannot answer the question asked by the instructor then some easier question can be asked. Otherwise, more difficult question can be post on similar topic. To do so, it is necessary to identify difficulty level of the questions. Proposed work concentrate on the same issue. Data mining techniques in specific clustering is used in this work. This method decide difficulty levels of the question and categories them as tough, easy or moderate and later this will be served to the desire students based on their performance. Proposed experiment categories the question set and also group the students based on their performance in examination. This will help the instructor to guide the students more specifically. In short mined knowledge helps to support, guide, facilitate and enhance learning as a whole.Keywords: Data mining, e-examination, e-learning, moodle.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20796683 Blockchain’s Feasibility in Military Data Networks
Authors: Brenden M. Shutt, Lubjana Beshaj, Paul L. Goethals, Ambrose Kam
Abstract:
Communication security is of particular interest to military data networks. A relatively novel approach to network security is blockchain, a cryptographically secured distribution ledger with a decentralized consensus mechanism for data transaction processing. Recent advances in blockchain technology have proposed new techniques for both data validation and trust management, as well as different frameworks for managing dataflow. The purpose of this work is to test the feasibility of different blockchain architectures as applied to military command and control networks. Various architectures are tested through discrete-event simulation and the feasibility is determined based upon a blockchain design’s ability to maintain long-term stable performance at industry standards of throughput, network latency, and security. This work proposes a consortium blockchain architecture with a computationally inexpensive consensus mechanism, one that leverages a Proof-of-Identity (PoI) concept and a reputation management mechanism.Keywords: Blockchain, command & control network, discrete-event simulation, reputation management.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8516682 What the Future Holds for Social Media Data Analysis
Authors: P. Wlodarczak, J. Soar, M. Ally
Abstract:
The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provide access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is a growing interest in the analysis of SM data from organisations for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often indicators of historic SM data are represented as time series and correlated with a variety of real world phenomena like the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.
Keywords: Social Media, text mining, knowledge discovery, predictive analysis, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38526681 Speedup of Data Vortex Network Architecture
Authors: Qimin Yang
Abstract:
In this paper, 3X3 routing nodes are proposed to provide speedup and parallel processing capability in Data Vortex network architectures. The new design not only significantly improves network throughput and latency, but also eliminates the need for distributive traffic control mechanism originally embedded among nodes and the need for nodal buffering. The cost effectiveness is studied by a comparison study with the previously proposed 2- input buffered networks, and considerable performance enhancement can be achieved with similar or lower cost of hardware. Unlike previous implementation, the network leaves small probability of contention, therefore, the packet drop rate must be kept low for such implementation to be feasible and attractive, and it can be achieved with proper choice of operation conditions.Keywords: Data Vortex, Packet Switch, Interconnection network, deflection, Network-on-chip
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15716680 An Energy Efficient Digital Baseband for Batteryless Remote Control
Authors: Wei-Da Toh, Yuan Gao, Minkyu Je
Abstract:
In this paper, an energy efficient digital baseband circuit for piezoelectric (PE) harvester powered batteryless remote control system is presented. Pulse mode PE harvester, which provides short duration of energy, is adopted to replace conventional chemical battery in wireless remote controller. The transmitter digital baseband repeats the control command transmission once the digital circuit is initiated by the power-on-reset. A power efficient data frame format is proposed to maximize the transmission repetition time. By using the proposed frame format and receiver clock and data recovery method, the receiver baseband is able to decode the command even when the received data has 20% error. The proposed transmitter and receiver baseband are implemented using FPGA and simulation results are presented.
Keywords: Clock and Data Recovery (CDR), Correlator, Digital Baseband, Gold Code, Power-On-Reset.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20266679 Multipath Routing Protocol Using Basic Reconstruction Routing (BRR) Algorithm in Wireless Sensor Network
Authors: K. Rajasekaran, Kannan Balasubramanian
Abstract:
A sensory network consists of multiple detection locations called sensor nodes, each of which is tiny, featherweight and portable. A single path routing protocols in wireless sensor network can lead to holes in the network, since only the nodes present in the single path is used for the data transmission. Apart from the advantages like reduced computation, complexity and resource utilization, there are some drawbacks like throughput, increased traffic load and delay in data delivery. Therefore, multipath routing protocols are preferred for WSN. Distributing the traffic among multiple paths increases the network lifetime. We propose a scheme, for the data to be transmitted through a dominant path to save energy. In order to obtain a high delivery ratio, a basic route reconstruction protocol is utilized to reconstruct the path whenever a failure is detected. A basic reconstruction routing (BRR) algorithm is proposed, in which a node can leap over path failure by using the already existing routing information from its neighbourhood while the composed data is transmitted from the source to the sink. In order to save the energy and attain high data delivery ratio, data is transmitted along a multiple path, which is achieved by BRR algorithm whenever a failure is detected. Further, the analysis of how the proposed protocol overcomes the drawback of the existing protocols is presented. The performance of our protocol is compared to AOMDV and energy efficient node-disjoint multipath routing protocol (EENDMRP). The system is implemented using NS-2.34. The simulation results show that the proposed protocol has high delivery ratio with low energy consumption.Keywords: Multipath routing, WSN, energy efficient routing, alternate route, assured data delivery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17256678 A Simple Deterministic Model for the Spread of Leptospirosis in Thailand
Authors: W. Triampo, D. Baowan, I.M. Tang, N. Nuttavut, J. Wong-Ekkabut, G. Doungchawee
Abstract:
In this work, we consider a deterministic model for the transmission of leptospirosis which is currently spreading in the Thai population. The SIR model which incorporates the features of this disease is applied to the epidemiological data in Thailand. It is seen that the numerical solutions of the SIR equations are in good agreement with real empirical data. Further improvements are discussed.Keywords: Leptospirosis, SIR Model, Deterministic model, Thailand.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19906677 The Relationship between Class Attendance and Performance of Industrial Engineering Students Enrolled for a Statistics Subject at the University of Technology
Authors: Tshaudi Motsima
Abstract:
Class attendance is key at all levels of education. At tertiary level many students develop a tendency of not attending all classes without being aware of the repercussions of not attending all classes. It is important for all students to attend all classes as they can receive first-hand information and they can benefit more. The student who attends classes is likely to perform better academically than the student who does not. The aim of this paper is to assess the relationship between class attendance and academic performance of industrial engineering students. The data for this study were collected through the attendance register of students and the other data were accessed from the Integrated Tertiary Software and the Higher Education Data Analyzer Portal. Data analysis was conducted on a sample of 93 students. The results revealed that students with medium predicate scores (OR = 3.8; p = 0.027) and students with low predicate scores (OR = 21.4, p < 0.001) were significantly likely to attend less than 80% of the classes as compared to students with high predicate scores. Students with examination performance of less than 50% were likely to attend less than 80% of classes than students with examination performance of 50% and above, but the differences were not statistically significant (OR = 1.3; p = 0.750).
Keywords: Class attendance, examination performance, final outcome, logistic regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4686676 Reversible Medical Image Watermarking For Tamper Detection And Recovery With Run Length Encoding Compression
Authors: Siau-Chuin Liew, Siau-Way Liew, Jasni Mohd Zain
Abstract:
Digital watermarking in medical images can ensure the authenticity and integrity of the image. This design paper reviews some existing watermarking schemes and proposes a reversible tamper detection and recovery watermarking scheme. Watermark data from ROI (Region Of Interest) are stored in RONI (Region Of Non Interest). The embedded watermark allows tampering detection and tampered image recovery. The watermark is also reversible and data compression technique was used to allow higher embedding capacity.Keywords: data compression, medical image, reversible, tamperdetection and recovery, watermark.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20836675 Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis
Authors: Mouataz Zreika, Maria Estela Varua
Abstract:
Stock investment decisions are often made based on current events of the global economy and the analysis of historical data. Conversely, visual representation could assist investors’ gain deeper understanding and better insight on stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines the Clustering algorithm and Force-directed algorithm to overcome the scalability problem when visualizing large data. This method exemplifies the potential relationships between each stock, as well as determining the degree of strength and connectivity, which will provide investors another understanding of the stock relationship for reference. Information derived from visualization will also help them make an informed decision. The results of the experiments show that the proposed method is able to produced visualized data aesthetically by providing clearer views for connectivity and edge weights.Keywords: Clustering, force-directed, graph drawing, stock investment analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15966674 Implementing an Intuitive Reasoner with a Large Weather Database
Authors: Yung-Chien Sun, O. Grant Clark
Abstract:
In this paper, the implementation of a rule-based intuitive reasoner is presented. The implementation included two parts: the rule induction module and the intuitive reasoner. A large weather database was acquired as the data source. Twelve weather variables from those data were chosen as the “target variables" whose values were predicted by the intuitive reasoner. A “complex" situation was simulated by making only subsets of the data available to the rule induction module. As a result, the rules induced were based on incomplete information with variable levels of certainty. The certainty level was modeled by a metric called "Strength of Belief", which was assigned to each rule or datum as ancillary information about the confidence in its accuracy. Two techniques were employed to induce rules from the data subsets: decision tree and multi-polynomial regression, respectively for the discrete and the continuous type of target variables. The intuitive reasoner was tested for its ability to use the induced rules to predict the classes of the discrete target variables and the values of the continuous target variables. The intuitive reasoner implemented two types of reasoning: fast and broad where, by analogy to human thought, the former corresponds to fast decision making and the latter to deeper contemplation. . For reference, a weather data analysis approach which had been applied on similar tasks was adopted to analyze the complete database and create predictive models for the same 12 target variables. The values predicted by the intuitive reasoner and the reference approach were compared with actual data. The intuitive reasoner reached near-100% accuracy for two continuous target variables. For the discrete target variables, the intuitive reasoner predicted at least 70% as accurately as the reference reasoner. Since the intuitive reasoner operated on rules derived from only about 10% of the total data, it demonstrated the potential advantages in dealing with sparse data sets as compared with conventional methods.Keywords: Artificial intelligence, intuition, knowledge acquisition, limited certainty.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13866673 Multiphase Coexistence for Aqueous System with Hydrophilic Agent
Authors: G. B. Hong, H. W. Chen
Abstract:
Liquid-Liquid Equilibrium (LLE) data are measured for the ternary mixtures of water + 1-butanol + butyl acetate and quaternary mixtures of water + 1-butanol + butyl acetate + glycerol at atmospheric pressure at 313.15 K. In addition, isothermal vapor–liquid–liquid equilibrium (VLLE) data are determined experimentally at 333.15 K. The region of heterogeneity is found to increase as the hydrophilic agent (glycerol) is introduced into the aqueous mixtures. The experimental data are correlated with the NRTL model. The predicted results from the solution model with the model parameters determined from the constituent binaries are also compared with the experimental values.Keywords: LLE, VLLE, hydrophilic agent, NRTL.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12866672 Mining Educational Data to Support Students’ Major Selection
Authors: Kunyanuth Kularbphettong, Cholticha Tongsiri
Abstract:
This paper aims to create the model for student in choosing an emphasized track of student majoring in computer science at Suan Sunandha Rajabhat University. The objective of this research is to develop the suggested system using data mining technique to analyze knowledge and conduct decision rules. Such relationships can be used to demonstrate the reasonableness of student choosing a track as well as to support his/her decision and the system is verified by experts in the field. The sampling is from student of computer science based on the system and the questionnaire to see the satisfaction. The system result is found to be satisfactory by both experts and student as well.
Keywords: Data mining technique, the decision support system, knowledge and decision rules.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32876671 Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs
Authors: Pilar Rey-del-Castillo, Jesús Cardeñosa
Abstract:
There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson-s fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.
Keywords: Classifier, imputation techniques, fuzzy systems, fuzzy min-max neural networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17866670 A Hidden Markov Model for Modeling Pavement Deterioration under Incomplete Monitoring Data
Authors: Nam Lethanh, Bryan T. Adey
Abstract:
In this paper, the potential use of an exponential hidden Markov model to model a hidden pavement deterioration process, i.e. one that is not directly measurable, is investigated. It is assumed that the evolution of the physical condition, which is the hidden process, and the evolution of the values of pavement distress indicators, can be adequately described using discrete condition states and modeled as a Markov processes. It is also assumed that condition data can be collected by visual inspections over time and represented continuously using an exponential distribution. The advantage of using such a model in decision making process is illustrated through an empirical study using real world data.Keywords: Deterioration modeling, Exponential distribution, Hidden Markov model, Pavement management
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23116669 Automated Knowledge Engineering
Authors: Sandeep Chandana, Rene V. Mayorga, Christine W. Chan
Abstract:
This article outlines conceptualization and implementation of an intelligent system capable of extracting knowledge from databases. Use of hybridized features of both the Rough and Fuzzy Set theory render the developed system flexibility in dealing with discreet as well as continuous datasets. A raw data set provided to the system, is initially transformed in a computer legible format followed by pruning of the data set. The refined data set is then processed through various Rough Set operators which enable discovery of parameter relationships and interdependencies. The discovered knowledge is automatically transformed into a rule base expressed in Fuzzy terms. Two exemplary cancer repository datasets (for Breast and Lung Cancer) have been used to test and implement the proposed framework.Keywords: Knowledge Extraction, Fuzzy Sets, Rough Sets, Neuro–Fuzzy Systems, Databases
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17926668 Using Data Mining Techniques for Estimating Minimum, Maximum and Average Daily Temperature Values
Authors: S. Kotsiantis, A. Kostoulas, S. Lykoudis, A. Argiriou, K. Menagias
Abstract:
Estimates of temperature values at a specific time of day, from daytime and daily profiles, are needed for a number of environmental, ecological, agricultural and technical applications, ranging from natural hazards assessments, crop growth forecasting to design of solar energy systems. The scope of this research is to investigate the efficiency of data mining techniques in estimating minimum, maximum and mean temperature values. For this reason, a number of experiments have been conducted with well-known regression algorithms using temperature data from the city of Patras in Greece. The performance of these algorithms has been evaluated using standard statistical indicators, such as Correlation Coefficient, Root Mean Squared Error, etc.
Keywords: regression algorithms, supervised machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34216667 A Real-Time Signal Processing Technique for MIDI Generation
Authors: Farshad Arvin, Shyamala Doraisamy
Abstract:
This paper presents a new hardware interface using a microcontroller which processes audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from music signals is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller as the main processor is deployed to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of data generated show the feasibility of using microcontrollers for real-time MIDI generation hardware interface.Keywords: Signal processing, MIDI, Microcontroller, EIA-232.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21296666 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark
Authors: B. Elshafei, X. Mao
Abstract:
The demand for renewable energy is significantly increasing, major investments are being supplied to the wind power generation industry as a leading source of clean energy. The wind energy sector is entirely dependable and driven by the prediction of wind speed, which by the nature of wind is very stochastic and widely random. This s0tudy employs deep multi-fidelity Gaussian process regression, used to predict wind speeds for medium term time horizons. Data of the RUNE experiment in the west coast of Denmark were provided by the Technical University of Denmark, which represent the wind speed across the study area from the period between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using empirical wavelet transform (EWT) and engaging the vector components of wind speed to increase the number of input data layers for data fusion using deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE) and the results demonstrated a significant increase in the accuracy of predictions which demonstrated that using vector components of the wind speed as additional predictors exhibits more accurate predictions than strategies that ignore them, reflecting the importance of the inclusion of all sub data and pre-processing signals for wind speed forecasting models.
Keywords: Data fusion, Gaussian process regression, signal denoise, temporal extrapolation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5036665 An Energy Aware Data Aggregation in Wireless Sensor Network Using Connected Dominant Set
Authors: M. Santhalakshmi, P Suganthi
Abstract:
Wireless Sensor Networks (WSNs) have many advantages. Their deployment is easier and faster than wired sensor networks or other wireless networks, as they do not need fixed infrastructure. Nodes are partitioned into many small groups named clusters to aggregate data through network organization. WSN clustering guarantees performance achievement of sensor nodes. Sensor nodes energy consumption is reduced by eliminating redundant energy use and balancing energy sensor nodes use over a network. The aim of such clustering protocols is to prolong network life. Low Energy Adaptive Clustering Hierarchy (LEACH) is a popular protocol in WSN. LEACH is a clustering protocol in which the random rotations of local cluster heads are utilized in order to distribute energy load among all sensor nodes in the network. This paper proposes Connected Dominant Set (CDS) based cluster formation. CDS aggregates data in a promising approach for reducing routing overhead since messages are transmitted only within virtual backbone by means of CDS and also data aggregating lowers the ratio of responding hosts to the hosts existing in virtual backbones. CDS tries to increase networks lifetime considering such parameters as sensors lifetime, remaining and consumption energies in order to have an almost optimal data aggregation within networks. Experimental results proved CDS outperformed LEACH regarding number of cluster formations, average packet loss rate, average end to end delay, life computation, and remaining energy computation.Keywords: Wireless sensor network, connected dominant set, clustering, data aggregation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11326664 Deadline Missing Prediction for Mobile Robots through the Use of Historical Data
Authors: Edwaldo R. B. Monteiro, Patricia D. M. Plentz, Edson R. De Pieri
Abstract:
Mobile robotics is gaining an increasingly important role in modern society. Several potentially dangerous or laborious tasks for human are assigned to mobile robots, which are increasingly capable. Many of these tasks need to be performed within a specified period, i.e, meet a deadline. Missing the deadline can result in financial and/or material losses. Mechanisms for predicting the missing of deadlines are fundamental because corrective actions can be taken to avoid or minimize the losses resulting from missing the deadline. In this work we propose a simple but reliable deadline missing prediction mechanism for mobile robots through the use of historical data and we use the Pioneer 3-DX robot for experiments and simulations, one of the most popular robots in academia.
Keywords: Deadline missing, historical data, mobile robots, prediction mechanism.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18116663 Ensemble Approach for Predicting Student's Academic Performance
Authors: L. A. Muhammad, M. S. Argungu
Abstract:
Educational data mining (EDM) has recorded substantial considerations. Techniques of data mining in one way or the other have been proposed to dig out out-of-sight knowledge in educational data. The result of the study got assists academic institutions in further enhancing their process of learning and methods of passing knowledge to students. Consequently, the performance of students boasts and the educational products are by no doubt enhanced. This study adopted a student performance prediction model premised on techniques of data mining with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student's predictive model is assessed by a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduce Error Pruning Tree (REP). Consequently, ensemble methods of Bagging, Boosting, and Random Forest (RF) are applied to improve the performance of these single classifiers. The study reveals that the result shows a robust affinity between learners' behaviors and their academic attainment. Result from the study shows that the REP Tree and its ensemble record the highest accuracy of 83.33% using SEF. Hence, in terms of the Receiver Operating Curve (ROC), boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.
Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7756662 A Survey on Facial Feature Points Detection Techniques and Approaches
Authors: Rachid Ahdid, Khaddouj Taifi, Said Safi, Bouzid Manaut
Abstract:
Automatic detection of facial feature points plays an important role in applications such as facial feature tracking, human-machine interaction and face recognition. The majority of facial feature points detection methods using two-dimensional or three-dimensional data are covered in existing survey papers. In this article chosen approaches to the facial features detection have been gathered and described. This overview focuses on the class of researches exploiting facial feature points detection to represent facial surface for two-dimensional or three-dimensional face. In the conclusion, we discusses advantages and disadvantages of the presented algorithms.Keywords: Facial feature points, face recognition, facial feature tracking, two-dimensional data, three-dimensional data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16836661 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering
Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman
Abstract:
Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 55396660 Evaluation of Model Evaluation Criterion for Software Development Effort Estimation
Authors: S. K. Pillai, M. K. Jeyakumar
Abstract:
Estimation of model parameters is necessary to predict the behavior of a system. Model parameters are estimated using optimization criteria. Most algorithms use historical data to estimate model parameters. The known target values (actual) and the output produced by the model are compared. The differences between the two form the basis to estimate the parameters. In order to compare different models developed using the same data different criteria are used. The data obtained for short scale projects are used here. We consider software effort estimation problem using radial basis function network. The accuracy comparison is made using various existing criteria for one and two predictors. Then, we propose a new criterion based on linear least squares for evaluation and compared the results of one and two predictors. We have considered another data set and evaluated prediction accuracy using the new criterion. The new criterion is easy to comprehend compared to single statistic. Although software effort estimation is considered, this method is applicable for any modeling and prediction.
Keywords: Software effort estimation, accuracy, Radial Basis Function, linear least squares.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20456659 Solar Seawater Desalination Still with Seawater Preheater Using Efficient Heat Transfer Oil: Numerical Investigation and Data Verification
Authors: Ahmed N. Shmroukh, Gamal Tag Abdel-Jaber, Rashed D. Aldughpassi
Abstract:
The feasibility of improving the performance of the proposed solar still unit which operated in very hot climate is investigated numerically and verified with experimental data. This solar desalination unit with proposed auxiliary device as seawater preheating system using petrol based textherm oil was used to produce pure fresh water from seawater. The effective evaporation area of basin is about 1 m2. The unit was tested in two main operation modes which are normal and with seawater preheating system. The results showed that, there is good agreement between the theoretical data and the experimental data; this means that the numerical model can be accurately dependable for predicting the proposed solar still performance and design parameters. The results also showed that the fresh water productivity of the solar still in the modified preheating case which is higher than normal case, leads to an increase in productivity of 42%.Keywords: Improving productivity, seawater desalination, solar stills, theoretical model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7776658 The Necessity to Standardize Procedures of Providing Engineering Geological Data for Designing Road and Railway Tunneling Projects
Authors: Atefeh Saljooghi Khoshkar, Jafar Hassanpour
Abstract:
One of the main problems of design stage relating to many tunneling projects is the lack of an appropriate standard for the provision of engineering geological data in a predefined format. In particular, this is more reflected in highway and railroad tunnels projects in which there is a number of tunnels and different professional teams involved. In this regard, a comprehensive software needs to be designed using the accepted methods in order to help engineering geologists to prepare standard reports, which contain sufficient input data for the design stage. Regarding this necessity, an applied software has been designed using macro capabilities and Visual Basic programming language (VBA) through Microsoft Excel. In this software, all of the engineering geological input data, which are required for designing different parts of tunnels such as discontinuities properties, rock mass strength parameters, rock mass classification systems, boreability classification, the penetration rate and so forth can be calculated and reported in a standard format.
Keywords: Engineering geology, rock mass classification, rock mechanic, tunnel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1396657 The Quality Assessment of Seismic Reflection Survey Data Using Statistical Analysis: A Case Study of Fort Abbas Area, Cholistan Desert, Pakistan
Authors: U. Waqas, M. F. Ahmed, A. Mehmood, M. A. Rashid
Abstract:
In geophysical exploration surveys, the quality of acquired data holds significant importance before executing the data processing and interpretation phases. In this study, 2D seismic reflection survey data of Fort Abbas area, Cholistan Desert, Pakistan was taken as test case in order to assess its quality on statistical bases by using normalized root mean square error (NRMSE), Cronbach’s alpha test (α) and null hypothesis tests (t-test and F-test). The analysis challenged the quality of the acquired data and highlighted the significant errors in the acquired database. It is proven that the study area is plain, tectonically least affected and rich in oil and gas reserves. However, subsurface 3D modeling and contouring by using acquired database revealed high degrees of structural complexities and intense folding. The NRMSE had highest percentage of residuals between the estimated and predicted cases. The outcomes of hypothesis testing also proved the biasness and erraticness of the acquired database. Low estimated value of alpha (α) in Cronbach’s alpha test confirmed poor reliability of acquired database. A very low quality of acquired database needs excessive static correction or in some cases, reacquisition of data is also suggested which is most of the time not feasible on economic grounds. The outcomes of this study could be used to assess the quality of large databases and to further utilize as a guideline to establish database quality assessment models to make much more informed decisions in hydrocarbon exploration field.
Keywords: Data quality, null hypothesis, seismic lines, seismic reflection survey.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 616