Search results for: violation data discovery
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25809

24009 The Relationship between Class Attendance and Performance of Industrial Engineering Students Enrolled for a Statistics Subject at the University of Technology

Authors: Tshaudi Motsima

Abstract:

Class attendance is key at all levels of education. At tertiary level, many students develop a tendency to skip classes without being aware of the repercussions. It is important for all students to attend all classes, as they receive first-hand information and benefit more. A student who attends classes is likely to perform better academically than one who does not. The aim of this paper is to assess the relationship between class attendance and academic performance of industrial engineering students. The attendance data were collected from the students' attendance register, and the other data were accessed from the Integrated Tertiary Software and the Higher Education Data Analyzer Portal. Data analysis was conducted on a sample of 93 students. The results revealed that students with medium predicate scores (OR = 3.8; p = 0.027) and students with low predicate scores (OR = 21.4; p < 0.001) were significantly more likely to attend less than 80% of the classes than students with high predicate scores. Students with examination performance of less than 50% were more likely to attend less than 80% of classes than students with examination performance of 50% and above, but the difference was not statistically significant (OR = 1.3; p = 0.750).
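The odds ratios reported above come from a logistic regression. As a rough illustration only, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table, which is what a logistic regression with a single binary predictor would return. The counts are hypothetical and are not taken from the paper; the function name is an assumption.

```python
import math


def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio and Wald 95% CI for a 2x2 table.

    a: group 1, attended < 80% of classes
    b: group 1, attended >= 80% of classes
    c: reference group, attended < 80% of classes
    d: reference group, attended >= 80% of classes
    All four counts must be non-zero.
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Hypothetical counts, for illustration only.
result = odds_ratio(12, 8, 10, 25)
print("OR = %.2f, 95%% CI = (%.2f, %.2f)" % result)
```

An interval that contains 1 corresponds to a difference that is not statistically significant at the 5% level, as with the OR = 1.3 result above.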

Keywords: class attendance, examination performance, final outcome, logistic regression

Procedia PDF Downloads 134
24008 Multimodal Optimization of Density-Based Clustering Using Collective Animal Behavior Algorithm

Authors: Kristian Bautista, Ruben A. Idoy

Abstract:

A bio-inspired metaheuristic algorithm based on the theory of collective animal behavior (CAB) was integrated with density-based clustering modeled as a multimodal optimization problem. The algorithm was tested on synthetic, Iris, Glass, Pima and Thyroid data sets in order to measure its effectiveness relative to the CDE-based clustering algorithm. Upon preliminary testing, it was found that one of the parameter settings used was ineffective in performing clustering when applied to the algorithm, prompting the researcher to investigate. It was revealed that fine-tuning distance δ3, which determines the extent to which a given data point will be clustered, helped improve the quality of the cluster output. Even though the modification of distance δ3 significantly improved the solution quality and cluster output of the algorithm, results suggest that there is no difference between the population means of the solutions obtained using the original and modified parameter settings for all data sets. This implies that using either the original or the modified parameter setting will not have any effect on obtaining the best global and local animal positions. Results also suggest that the CDE-based clustering algorithm is better than the CAB-density clustering algorithm for all data sets. Nevertheless, the CAB-density clustering algorithm is still a good clustering algorithm because it correctly identified the number of classes of some data sets more frequently in a thirty-trial run with a much smaller standard deviation, which indicates potential for clustering high-dimensional data sets. Thus, the researcher recommends further investigation of the post-processing stage of the algorithm.

Keywords: clustering, metaheuristics, collective animal behavior algorithm, density-based clustering, multimodal optimization

Procedia PDF Downloads 234
24007 Multiphase Coexistence for Aqueous System with Hydrophilic Agent

Authors: G. B. Hong

Abstract:

Liquid-Liquid Equilibrium (LLE) data are measured for the ternary mixtures of water + 1-butanol + butyl acetate and quaternary mixtures of water + 1-butanol + butyl acetate + glycerol at atmospheric pressure at 313.15 K. In addition, isothermal Vapor–Liquid–Liquid Equilibrium (VLLE) data are determined experimentally at 333.15 K. The region of heterogeneity is found to increase as the hydrophilic agent (glycerol) is introduced into the aqueous mixtures. The experimental data are correlated with the NRTL model. The predicted results from the solution model with the model parameters determined from the constituent binaries are also compared with the experimental values.

Keywords: LLE, VLLE, hydrophilic agent, NRTL

Procedia PDF Downloads 244
24006 ISMARA: Completely Automated Inference of Gene Regulatory Networks from High-Throughput Data

Authors: Piotr J. Balwierz, Mikhail Pachkov, Phil Arnold, Andreas J. Gruber, Mihaela Zavolan, Erik van Nimwegen

Abstract:

Understanding the key players and interactions in the regulatory networks that control gene expression and chromatin state across different cell types and tissues in metazoans remains one of the central challenges in systems biology. Our laboratory has pioneered a number of methods for automatically inferring core gene regulatory networks directly from high-throughput data by modeling gene expression (RNA-seq) and chromatin state (ChIP-seq) measurements in terms of genome-wide computational predictions of regulatory sites for hundreds of transcription factors and micro-RNAs. These methods have now been completely automated in an integrated webserver called ISMARA that allows researchers to analyze their own data by simply uploading RNA-seq or ChIP-seq data sets and provides results in an integrated web interface as well as in downloadable flat form. For any data set, ISMARA infers the key regulators in the system, their activities across the input samples, the genes and pathways they target, and the core interactions between the regulators. We believe that by empowering experimental researchers to apply cutting-edge computational systems biology tools to their data in a completely automated manner, ISMARA can play an important role in developing our understanding of regulatory networks across metazoans.

Keywords: gene expression analysis, high-throughput sequencing analysis, transcription factor activity, transcription regulation

Procedia PDF Downloads 67
24005 The Power of the Proper Orthogonal Decomposition Method

Authors: Charles Lee

Abstract:

The Proper Orthogonal Decomposition (POD) technique has been used as a model reduction tool for many applications in engineering and science. In principle, one begins with an ensemble of data, called snapshots, collected from an experiment or laboratory results. The beauty of the POD technique is that when applied, the entire data set can be represented by the smallest number of orthogonal basis elements. It is this capability that allows us to reduce the complexity and dimensions of many physical applications. Mathematical formulations and numerical schemes for the POD method will be discussed along with applications in NASA’s Deep Space Large Antenna Arrays, Satellite Image Reconstruction, Cancer Detection with DNA Microarray Data, Maximizing Stock Return, and Medical Imaging.
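To make the snapshot idea concrete, here is a minimal sketch of the method of snapshots using only the Python standard library. It finds the dominant POD mode by power iteration on the snapshot correlation matrix. It is not the paper's numerical scheme: the function name, the starting vector, the absence of mean subtraction, and the toy data are all assumptions made for illustration.

```python
import math


def leading_pod_mode(snapshots, iters=200):
    """Return (mode, energy_fraction) for the dominant POD mode.

    snapshots: list of m snapshots, each a list of n floats.
    Assumes the starting vector is not orthogonal to the dominant
    eigenvector and that the snapshots are not all zero.
    """
    m = len(snapshots)
    n = len(snapshots[0])
    # Snapshot correlation matrix C[i][j] = <u_i, u_j>.
    C = [[sum(a * b for a, b in zip(si, sj)) for sj in snapshots]
         for si in snapshots]
    # Power iteration for the dominant eigenvector of C.
    v = [1.0 / (i + 1) for i in range(m)]
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the dominant eigenvalue (mode energy).
    lam = sum(v[i] * sum(C[i][j] * v[j] for j in range(m)) for i in range(m))
    # Lift the eigenvector back to physical space and normalize.
    mode = [sum(v[i] * snapshots[i][k] for i in range(m)) / math.sqrt(lam)
            for k in range(n)]
    total_energy = sum(C[i][i] for i in range(m))
    return mode, lam / total_energy


# Toy data: one dominant shape plus a small perturbation (hypothetical).
shape = [1.0, 2.0, 3.0, 2.0, 1.0]
pert = [1.0, -1.0, 1.0, -1.0, 1.0]
amps = [1.0, 0.5, 2.0, 1.5]
snapshots = [[a * s + 0.01 * ((-1) ** i) * p for s, p in zip(shape, pert)]
             for i, a in enumerate(amps)]
mode, energy = leading_pod_mode(snapshots)
print("energy captured by first mode: %.6f" % energy)
```

When a single mode captures nearly all of the energy, as in this toy set, every snapshot can be approximated by one coefficient times that mode, which is the dimension reduction the abstract describes.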

Keywords: reduced-order methods, principal component analysis, cancer detection, image reconstruction, stock portfolios

Procedia PDF Downloads 86
24004 A Reflection of the Contemporary Life of Urban People Through Mixed Media Art

Authors: Van Huong Mai, Kanokwan Nithiratphat, Adool Booncham

Abstract:

The Movement of Contemporary Life had two purposes: to study the movement and development of modern life, and to create visual artworks, namely paintings that take the form of apartment buildings and are made with mixed media (digital printing and acrylic painting on canvas). The works convey the rapid pace of modern life, leading to diverse movements in the viewer’s feelings. The creative process drew on field data, documentary data, and influences from other creative work. The data were analyzed in terms of theme, form, technique, and process to suit the concept and special character of the pieces.

Keywords: movement, contemporary life, visual art, acrylic painting, digital art, urban space

Procedia PDF Downloads 99
24003 Mining Educational Data to Support Students’ Major Selection

Authors: Kunyanuth Kularbphettong, Cholticha Tongsiri

Abstract:

This paper aims to create a model that helps students majoring in computer science at Suan Sunandha Rajabhat University choose an emphasis track. The objective of this research is to develop a recommendation system that uses data mining techniques to analyze knowledge and derive decision rules. Such relationships can be used to demonstrate the reasonableness of a student's choice of track as well as to support his/her decision, and the system is verified by experts in the field. The sample consists of computer science students who used the system and answered a satisfaction questionnaire. Both the experts and the students found the system's results satisfactory.

Keywords: data mining technique, the decision support system, knowledge and decision rules, education

Procedia PDF Downloads 425
24002 SPBAC: A Semantic Policy-Based Access Control for Database Query

Authors: Aaron Zhang, Alimire Kahaer, Gerald Weber, Nalin Arachchilage

Abstract:

Access control is an essential safeguard for the security of enterprise data, which controls users’ access to information resources and ensures the confidentiality and integrity of information resources [1]. Research shows that the more common types of access control now have shortcomings [2]. In this direction, to improve the existing access control, we have studied the current technologies in the field of data security, deeply investigated the previous data access control policies and their problems, identified the existing deficiencies, and proposed a new extension structure of SPBAC. SPBAC extension proposed in this paper aims to combine Policy-Based Access Control (PBAC) with semantics to provide logically connected, real-time data access functionality by establishing associations between enterprise data through semantics. Our design combines policies with linked data through semantics to create a "Semantic link" so that access control is no longer per-database and determines that users in each role should be granted access based on the instance policy, and improves the SPBAC implementation by constructing policies and defined attributes through the XACML specification, which is designed to extend on the original XACML model. While providing relevant design solutions, this paper hopes to continue to study the feasibility and subsequent implementation of related work at a later stage.

Keywords: access control, semantic policy-based access control, semantic link, access control model, instance policy, XACML

Procedia PDF Downloads 95
24001 A Regression Analysis Study of the Applicability of Side Scan Sonar based Safety Inspection of Underwater Structures

Authors: Chul Park, Youngseok Kim, Sangsik Choi

Abstract:

This study developed an electric jig for underwater structure inspection in order to solve the problems of applying side scan sonar to underwater inspection, and analyzed correlations in empirical data in order to enhance sonar data resolution. The electric jig was developed so that tow-type sonar could be applied to underwater structure inspection, since it was difficult to inspect a cross-section with tow-type equipment alone. With the electric jig, it was possible to shorten inspection time by over 20% compared to conventional tow-type side scan sonar, and to inspect a proper cross-section through accurate angle control. The indoor test conducted to enhance sonar data resolution showed that water depth, distance from the underwater structure, and filming angle influenced resolution and data quality. Based on the data accumulated through field experience, multiple regression analysis was conducted on the correlations between the three variables. As a result, a relational equation for sonar operation according to water depth was derived.

Keywords: underwater structure, SONAR, safety inspection, resolution

Procedia PDF Downloads 265
24000 Enhanced Imperialist Competitive Algorithm for the Cell Formation Problem Using Sequence Data

Authors: S. H. Borghei, E. Teymourian, M. Mobin, G. M. Komaki, S. Sheikh

Abstract:

The imperialist competitive algorithm (ICA) is a recent meta-heuristic method, inspired by social evolution, for solving NP-hard problems. The ICA is a population-based algorithm that has achieved strong performance in comparison to other meta-heuristics. This study develops an enhanced ICA approach to solve the cell formation problem (CFP) using sequence data. In addition to the conventional ICA, an enhanced version of ICA, namely EICA, applies local search techniques to add more intensification aptitude and to embed the features of exploration and intensification more successfully. Suitable performance measures are used to compare the proposed algorithms with some other powerful solution approaches in the literature. To check the proficiency of the algorithms, forty test problems are presented. Five benchmark problems have sequence data, and the others are based on 0-1 matrices modified into sequence-based problems. Computational results demonstrate the efficiency of the EICA in solving the CFP.

Keywords: cell formation problem, group technology, imperialist competitive algorithm, sequence data

Procedia PDF Downloads 455
23999 Establishment of Bit Selective Mode Storage Covert Channel in VANETs

Authors: Amarpreet Singh, Kimi Manchanda

Abstract:

To provide security in the VANET (Vehicular Ad hoc Network) scenario, a covert storage channel is implemented through the data transmitted between the sender and the receiver. Covert channels are logical links that are used for communication and for hiding secure data from intruders. This paper addresses the establishment of bit selective mode covert storage channels in VANETs. In this scenario, data is transmitted in two modes, i.e., the normal mode and the covert mode. During communication between vehicles, the bits can be controlled through the optional bits of the IPv6 header format. The implementation is carried out with the help of a network simulator.

Keywords: covert mode, normal mode, VANET, OBU, on-board unit

Procedia PDF Downloads 368
23998 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark

Authors: B. Elshafei, X. Mao

Abstract:

The demand for renewable energy is increasing significantly, and major investments are being made in the wind power generation industry as a leading source of clean energy. The wind energy sector is entirely dependent on and driven by the prediction of wind speed, which by the nature of wind is highly stochastic. This study employs deep multi-fidelity Gaussian process regression to predict wind speeds for medium-term time horizons. Data from the RUNE experiment on the west coast of Denmark were provided by the Technical University of Denmark and represent the wind speed across the study area for the period between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using the empirical wavelet transform (EWT) and of using the vector components of wind speed to increase the number of input data layers for data fusion using deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE). The results demonstrated a significant increase in prediction accuracy: using the vector components of the wind speed as additional predictors yields more accurate predictions than strategies that ignore them, reflecting the importance of including all sub-data and pre-processing signals in wind speed forecasting models.

Keywords: data fusion, Gaussian process regression, signal denoise, temporal extrapolation

Procedia PDF Downloads 137
23997 Deadline Missing Prediction for Mobile Robots through the Use of Historical Data

Authors: Edwaldo R. B. Monteiro, Patricia D. M. Plentz, Edson R. De Pieri

Abstract:

Mobile robotics is gaining an increasingly important role in modern society. Several tasks that are potentially dangerous or laborious for humans are assigned to mobile robots, which are increasingly capable. Many of these tasks need to be performed within a specified period, i.e., meet a deadline. Missing the deadline can result in financial and/or material losses. Mechanisms for predicting missed deadlines are fundamental because corrective actions can be taken to avoid or minimize the resulting losses. In this work, we propose a simple but reliable deadline missing prediction mechanism for mobile robots based on historical data. For experiments and simulations, we use the Pioneer 3-DX, one of the most popular robots in academia.

Keywords: deadline missing, historical data, mobile robots, prediction mechanism

Procedia PDF Downloads 401
23996 The Intention to Use Telecare in People of Fall Experience: Application of Fuzzy Neural Network

Authors: Jui-Chen Huang, Shou-Hsiung Cheng

Abstract:

This study examined the willingness to use telecare among people in Taiwan who had experienced a fall in the last three months. This study adopted convenience sampling and a structured questionnaire to collect data. It was based on the definition and the constructs related to the Health Belief Model (HBM). The HBM comprises seven constructs: perceived benefits (PB), perceived disease threat (PDT), perceived barriers of taking action (PBTA), external cues to action (ECUE), internal cues to action (ICUE), attitude toward using (ATT), and behavioral intention to use (BI). This study adopted a Fuzzy Neural Network (FNN) to put forward an effective method. It shows the dependence of ATT on PB, PDT, PBTA, ECUE, and ICUE. For this relationship, the training and testing RMSE (root mean square error) are 0.028 and 0.166 in the FNN, respectively, and 0.828 and 0.578 in the regression model. As to the dependence of ATT on BI, the training and testing RMSE are 0.050 and 0.109 in the FNN, respectively, and 0.529 and 0.571 in the regression model. The results show that the FNN method performs better than the regression analysis and is an effective and viable approach.

Keywords: fall, fuzzy neural network, health belief model, telecare, willingness

Procedia PDF Downloads 202
23995 Effect of Viscous Dissipation on 3-D MHD Casson Flow in Presence of Chemical Reaction: A Numerical Study

Authors: Bandari Shanker, Alfunsa Prathiba

Abstract:

The influence of viscous dissipation on MHD Casson 3-D fluid flow in two perpendicular directions past a linearly stretching sheet in the presence of a chemical reaction is explored in this work. For exceptional circumstances, self-similar solutions are obtained and compared to the given data. As the Eckert number increases, the temperature boundary layer increases. Further, the current findings are observed to be in good agreement with the existing data. In both directions, non-dimensional velocities and stress distributions are obtained. The relevant data are graphed and explained quantitatively in relation to changes in the Casson fluid parameter as well as other fluid flow parameters.

Keywords: viscous dissipation, 3-D Casson flow, chemical reaction, Eckert number

Procedia PDF Downloads 193
23994 Improving Fine Motor Skills in the Hands of Children with ASD with Applying the Fine Motor Activities in Montessori Method of Education

Authors: Yeganeh Faraji, Ned Faraji

Abstract:

The aim of the present study is to investigate the effects of training on improving fine hand skills in children with autism spectrum disorder through a case study. The sample group was selected by the available sampling method and included four participants. The research used a single-subject, semi-experimental AB design. The data were gathered by natural observation, recorded on data record sheets, and then presented in diagrams. The sample group was evaluated in pre-test and post-test phases with an assessment that the researcher created based on the Lincoln-Oseretsky motor development scale. In order to promote fine finger movement, the Montessori method was applied. The collected data, analyzed through data presentation and diagrams, showed that the method had no significant effect on improving fine finger movement. Therefore, based on the current findings, it is suggested that future researchers apply various teaching methods and different tests for improving fine hand skills, or increase the period of training.

Keywords: autism spectrum disorder, Montessori method, fine motor skills, Lincoln-Oseretsky assessment

Procedia PDF Downloads 96
23993 Application of Public Access Two-Dimensional Hydrodynamic and Distributed Hydrological Models for Flood Forecasting in Ungauged Basins

Authors: Ahmad Shayeq Azizi, Yuji Toda

Abstract:

In Afghanistan, floods are the most frequent and recurrent of all natural disasters. At the same time, the lack of monitoring data is a severe problem, which increases the difficulty of flood forecasting and of planning appropriate flood countermeasures. This study simulates flood inundation in the Harirud River Basin by applying a distributed hydrological model, the Integrated Flood Analysis System (IFAS), and a 2D hydrodynamic model, the International River Interface Cooperative (iRIC), based on satellite rainfall combined with historical peak discharge and globally accessible data. The simulation results can predict the inundation area, depth and velocity, and hardware countermeasures, such as the impact of levee installation, can be discussed using the present method. The methodology proposed in this study is suitable for areas where hydrological and geographical data, including river survey data, are poorly observed.

Keywords: distributed hydrological model, flood inundation, hydrodynamic model, ungauged basins

Procedia PDF Downloads 167
23992 FlexPoints: Efficient Algorithm for Detection of Electrocardiogram Characteristic Points

Authors: Daniel Bulanda, Janusz A. Starzyk, Adrian Horzyk

Abstract:

The electrocardiogram (ECG) is one of the most commonly used medical tests, essential for correct diagnosis and treatment of the patient. While ECG devices generate a huge amount of data, only a small part of it carries valuable medical information. To deal with this problem, many compression algorithms and filters have been developed over the past years. However, the rapid development of new machine learning techniques poses new challenges. To address this class of problems, we created the FlexPoints algorithm, which searches for characteristic points on the ECG signal and ignores all other points that do not carry relevant medical information. The experiments showed that the presented algorithm can significantly reduce the number of data points that represent an ECG signal without losing valuable medical information. These sparse but essential characteristic points (flex points) can be a suitable input for some modern machine learning models, which work much better with flex points as input than with raw data or data compressed by many popular algorithms.

Keywords: characteristic points, electrocardiogram, ECG, machine learning, signal compression

Procedia PDF Downloads 164
23991 Detailed Analysis of Multi-Mode Optical Fiber Infrastructures for Data Centers

Authors: Matej Komanec, Jan Bohata, Stanislav Zvanovec, Tomas Nemecek, Jan Broucek, Josef Beran

Abstract:

With the exponential growth of social networks, video streaming and increasing demands on data rates, the number of newly built data centers rises proportionately. The data centers, however, have to adjust to the rapidly increasing amount of data that has to be processed. For this purpose, multi-mode (MM) fiber based infrastructures are often employed. This stems from the fact that connections in data centers are typically realized over short distances, and the application of MM fibers and components considerably reduces costs. On the other hand, the usage of MM components brings specific requirements for installation service conditions. Moreover, it has to be taken into account that MM fiber components have a higher production tolerance for parameters like core and cladding diameters, eccentricity, etc. Due to the high demands on the reliability of data center components, a properly excited optical field inside the MM fiber core is one of the key parameters in designing such an MM optical system architecture. An appropriately excited mode field of the MM fiber provides an optimal power budget in connections, decreases insertion losses (IL) and achieves the effective modal bandwidth (EMB). The main parameter, in this case, is the encircled flux (EF), which should be properly defined for variable optical sources and the consequent different mode-field distributions. In this paper, we present a detailed investigation and measurements of the mode-field distribution for short MM links intended in particular for data centers, with the emphasis on reliability and safety. These measurements are essential for large MM network design. Various scenarios, containing different fibers and connectors, were tested in terms of IL and mode-field distribution to reveal potential challenges. Furthermore, we focused on the estimation of particular defects and errors that can realistically occur, such as eccentricity, connector shifting or dust. These were simulated and measured, and their influence on EF statistics and on the functionality of data center infrastructure was evaluated. The experimental tests were performed at two wavelengths commonly used in MM networks, 850 nm and 1310 nm, to verify EF statistics. Finally, we provide recommendations for data center systems and networks using OM3 and OM4 MM fiber connections.
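Encircled flux is conventionally the fraction of near-field optical power contained within a given radius of the core center, i.e. EF(r) is the integral of I(ρ)·ρ from 0 to r divided by the same integral over the full measured radius. The sketch below evaluates that ratio with the trapezoidal rule for a radially symmetric intensity profile. The Gaussian-like profile, the 25 µm radius, and all names are assumptions for illustration and do not reproduce the paper's measurements.

```python
import math


def encircled_flux(radii, intensity):
    """Cumulative encircled flux at each sampled radius.

    radii: increasing radial positions starting at 0 (e.g. in micrometers).
    intensity: near-field intensity I(r) sampled at those radii,
               assumed radially symmetric.
    """
    # Power in a thin ring is proportional to I(r) * r * dr.
    integrand = [r * i for r, i in zip(radii, intensity)]
    cumulative = [0.0]
    for k in range(1, len(radii)):
        dr = radii[k] - radii[k - 1]
        cumulative.append(
            cumulative[-1] + 0.5 * (integrand[k] + integrand[k - 1]) * dr
        )
    total = cumulative[-1]
    return [c / total for c in cumulative]


# Hypothetical profile sampled every 0.5 um out to a 25 um radius.
radii_um = [0.5 * k for k in range(51)]
profile = [math.exp(-(r / 12.0) ** 2) for r in radii_um]
ef = encircled_flux(radii_um, profile)
for r_target in (4.5, 19.0):
    idx = radii_um.index(r_target)
    print("EF at %.1f um: %.3f" % (r_target, ef[idx]))
```

Comparing EF curves computed this way for different sources or for connectors with deliberate offsets is one simple way to see how a launch condition shifts power toward or away from the core center.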

Keywords: optical fiber, multi-mode, data centers, encircled flux

Procedia PDF Downloads 377
23990 Relationship between Driving under the Influence and Traffic Safety

Authors: Eun Hak Lee, Young-Hyun Seo, Hosuk Shin, Seung-Young Kho

Abstract:

Among the causes of traffic crashes, driving under the influence (DUI) of alcohol is the most dangerous behavior in Seoul, South Korea. In 2016 alone, 40 deaths occurred in 2,857 cases of DUI. Since DUI is one of the major factors increasing the severity of crashes, intensive management of DUI is required to reduce traffic crash deaths and crash damage. This study aims to investigate the relationship between DUI and traffic safety in order to establish countermeasures for traffic safety improvement. The analysis was conducted on habitual drivers who drove under the influence. Information on habitual drivers is matched to crash data and fine data. Descriptive statistics are given for the data used in this study, which consist of driver license acquisition, traffic fine, and crash data provided by the Korean National Police Agency. The drivers under the influence are classified by statistically significant criteria, such as driver’s age, license type, driving experience, and crash reasons. Based on the results of the analysis, we propose some countermeasures to enhance traffic safety.

Keywords: driving under influence, traffic safety, traffic crash, traffic fine

Procedia PDF Downloads 223
23989 Simplified Measurement of Occupational Energy Expenditure

Authors: J. Wicks

Abstract:

Aim: To develop a simple methodology that allows heart rate (HR) data collected from inexpensive wearable devices to be expressed in a suitable format (METs) to quantitate occupational (and recreational) activity. Introduction: Assessment of occupational activity is commonly done by using questionnaires in combination with prescribed MET levels for a vast range of previously measured activities. However, for any individual, the intensity of performing a specific activity can vary significantly. Ideally, objective measurement of individual activity is preferred. Though there is a wide range of HR recording devices, there is a distinct lack of methodology for processing the collected data to quantitate energy expenditure (EE). The HR index equation expresses METs in relation to relative HR, i.e., the ratio of activity HR to resting HR. The use of this equation provides a simple utility for objective measurement of EE. Methods: During a typical occupational work period of approximately 8 hours, HR data was recorded using a Polar RS 400 wrist monitor. Recorded data was downloaded to a Windows PC, and non-HR data was stripped from the ASCII file using ‘Notepad’. The HR data was exported to a spreadsheet program and sorted by HR range into a histogram format. Three HRs were determined, namely a resting HR (the HR delimiting the lowest 30 minutes of recorded data), a mean HR and a peak HR (the HR delimiting the highest 30 minutes of recorded data). HR indices were calculated (mean index equals mean HR/rest HR and peak index equals peak HR/rest HR), with mean and peak indices being converted to METs using the HR index equation. Conclusion: Inexpensive HR recording devices can be used to make reasonable estimates of occupational (or recreational) EE suitable for large-scale demographic screening by applying the HR index equation. The intrinsic value of the HR index equation is that it is independent of factors that influence absolute HR, namely fitness, smoking and beta-blockade.
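The processing steps described in the Methods can be sketched in a few lines. The abstract does not state the HR index equation itself; the code assumes the commonly cited form METs = 6 × HR index − 5, and it assumes one HR sample per minute so that "the lowest 30 minutes" maps to the 30 lowest samples. The function names and the toy recording are hypothetical.

```python
def hr_summary(hr_per_minute, window=30):
    """Resting, mean and peak HR from one-sample-per-minute data.

    Resting HR is taken as the value delimiting the `window` lowest
    samples, peak HR as the value delimiting the `window` highest
    samples (an interpretation of the abstract's wording).
    """
    ordered = sorted(hr_per_minute)
    rest_hr = ordered[window - 1]
    peak_hr = ordered[-window]
    mean_hr = sum(ordered) / len(ordered)
    return rest_hr, mean_hr, peak_hr


def mets_from_hr_index(activity_hr, rest_hr):
    """Assumed HR index equation: METs = 6 * (activity HR / rest HR) - 5."""
    return 6.0 * (activity_hr / rest_hr) - 5.0


# Hypothetical 8-hour recording (480 one-minute samples).
recording = [60] * 30 + [75] * 420 + [120] * 30
rest_hr, mean_hr, peak_hr = hr_summary(recording)
mean_mets = mets_from_hr_index(mean_hr, rest_hr)
peak_mets = mets_from_hr_index(peak_hr, rest_hr)
print("rest HR %d, mean HR %.1f, peak HR %d" % (rest_hr, mean_hr, peak_hr))
print("mean METs %.2f, peak METs %.2f" % (mean_mets, peak_mets))
```

Because the equation works on the ratio of activity HR to resting HR, an HR index of 1 maps to 1 MET under this assumed form, which is consistent with the abstract's point that the estimate does not depend on absolute HR.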

Keywords: energy expenditure, heart rate histograms, heart rate index, occupational activity

Procedia PDF Downloads 296
23988 Empirical Study of Running Correlations in Exam Marks: Same Statistical Pattern as Chance

Authors: Weisi Guo

Abstract:

It is well established that there may be running correlations in sequential exam marks due to students sitting in the order of course registration patterns. As such, a random and non-sequential sampling of exam marks is a standard recommended practice. Here, the paper examines a large number of exam data stretching several years across different modules to see the degree to which it is true. Using the real mark distribution as a generative process, it was found that random simulated data had no more sequential randomness than the real data. That is to say, the running correlations that one often observes are statistically identical to chance. Digging deeper, it was found that some high running correlations have students that indeed share a common course history and make similar mistakes. However, at the statistical scale of a module question, the combined effect is statistically similar to the random shuffling of papers. As such, there may not be the need to take random samples for marks, but it still remains good practice to mark papers in a random sequence to reduce the repetitive marking bias and errors.
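One way to test whether an observed running correlation exceeds chance is a shuffle test on the lag-1 correlation of marks in their sitting order. The sketch below is a simplification of the paper's approach: it shuffles the observed marks rather than sampling from a fitted mark distribution, and the statistic, function names, and example sequence are assumptions for illustration.

```python
import random


def lag1_correlation(marks):
    """Lag-1 autocorrelation of a sequence of marks (needs non-constant data)."""
    n = len(marks)
    mean = sum(marks) / n
    num = sum((marks[i] - mean) * (marks[i + 1] - mean) for i in range(n - 1))
    den = sum((m - mean) ** 2 for m in marks)
    return num / den


def shuffle_p_value(marks, trials=2000, seed=0):
    """Share of random orderings whose |lag-1 correlation| is at least
    as large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = abs(lag1_correlation(marks))
    pool = list(marks)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if abs(lag1_correlation(pool)) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


# Hypothetical extreme case: papers stacked in strictly increasing mark order.
sorted_marks = list(range(40, 90))
p_sorted = shuffle_p_value(sorted_marks)
print("p-value for a fully ordered stack: %.4f" % p_sorted)
```

A large p-value for a real stack of papers would support the abstract's conclusion that the running correlations markers notice are indistinguishable from those produced by random ordering.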

Keywords: data analysis, empirical study, exams, marking

Procedia PDF Downloads 183
23987 Factors Influencing Soil Organic Carbon Storage Estimation in Agricultural Soils: A Machine Learning Approach Using Remote Sensing Data Integration

Authors: O. Sunantha, S. Zhenfeng, S. Phattraporn, A. Zeeshan

Abstract:

The decline of soil organic carbon (SOC) in global agriculture is a critical issue requiring rapid and accurate estimation for informed policymaking. While it is recognized that SOC predictors vary significantly when derived from remote sensing data and environmental variables, identifying the specific parameters most suitable for accurately estimating SOC in diverse agricultural areas remains a challenge. This study utilizes remote sensing data to precisely estimate SOC and identify influential factors in diverse agricultural areas, such as paddy, corn, sugarcane, cassava, and perennial crops. Extreme gradient boosting (XGBoost), random forest (RF), and support vector regression (SVR) models are employed to analyze these factors' impact on SOC estimation. The results show key factors influencing SOC estimation include slope, vegetation indices (EVI), spectral reflectance indices (red index, red edge2), temperature, land use, and surface soil moisture, as indicated by their averaged importance scores across XGBoost, RF, and SVR models. Therefore, using different machine learning algorithms for SOC estimation reveals varying influential factors from remote sensing data and environmental variables. This approach emphasizes feature selection, as different machine learning algorithms identify various key factors from remote sensing data and environmental variables for accurate SOC estimation.

Keywords: factors influencing SOC estimation, remote sensing data, environmental variables, machine learning

Procedia PDF Downloads 39
23986 Visualization-Based Feature Extraction for Classification in Real-Time Interaction

Authors: Ágoston Nagy

Abstract:

This paper introduces a method of using unsupervised machine learning to visualize the feature space of a dataset in 2D, in order to find the most characteristic segments in the set. After dimension reduction, users can select clusters by manual drawing. Selected clusters are recorded into a data model that is used for later predictions based on real-time data. Predictions are made with supervised learning, using the Gesture Recognition Toolkit. The paper introduces two example applications: a semantic audio organizer for analyzing incoming sounds, and a gesture database organizer where gestural data (recorded by a Leap Motion) is visualized for further manipulation.

Keywords: gesture recognition, machine learning, real-time interaction, visualization

Procedia PDF Downloads 354
23985 Design and Development of Bar Graph Data Visualization in 2D and 3D Space Using Front-End Technologies

Authors: Sourabh Yaduvanshi, Varsha Namdeo, Namrata Yaduvanshi

Abstract:

This study delves into the design and development intricacies of crafting detailed 2D bar charts via d3.js, recognizing its limitations in generating 3D visuals within the Document Object Model (DOM). The study combines three.js with d3.js, facilitating a smooth evolution from 2D to immersive 3D representations. This fusion epitomizes the synergy between front-end technologies, expanding horizons in data visualization. Beyond technical expertise, it symbolizes a creative convergence, pushing boundaries in visual representation. The abstract illuminates methodologies, unraveling the intricate integration of this fusion and guiding enthusiasts. It narrates a compelling story of transcending 2D constraints, propelling data visualization into captivating three-dimensional realms, and igniting creativity in front-end visualization endeavors.

Keywords: design, development, front-end technologies, visualization

Procedia PDF Downloads 40
23984 Prediction of All-Beta Protein Secondary Structure Using Garnier-Osguthorpe-Robson Method

Authors: K. Tejasri, K. Suvarna Vani, S. Prathyusha, S. Ramya

Abstract:

Proteins are chains of amino acids joined by peptide bonds. Many different chain conformations are possible due to the multiple combinations of amino acids and rotation at numerous positions along the chain. Protein structure prediction is one of the crucial goals pursued in bioinformatics and theoretical chemistry. Among the four structural levels of proteins, we focus mainly on secondary structure. Secondary structure generally comprises alpha-helices and beta-sheets. Multi-class classification of imbalanced data is a real challenge and has to be addressed for the beta-strands. In an imbalanced data distribution, some classes have very few training samples compared with other classes. The secondary structure data is extracted from the protein primary sequence, and the beta-strands are predicted using suitable machine learning algorithms.

Keywords: proteins, secondary structure elements, beta-sheets, beta-strands, alpha-helices, machine learning algorithms

Procedia PDF Downloads 95
23983 Identify Users Behavior from Mobile Web Access Logs Using Automated Log Analyzer

Authors: Bharat P. Modi, Jayesh M. Patel

Abstract:

Mobile Internet is acting as a major source of data. As the number of web pages continues to grow, the Mobile web provides data miners with just the right ingredients for extracting information. In order to cater to this growing need, a special term called Mobile Web mining was coined. Mobile Web mining makes use of data mining techniques and deciphers potentially useful information from web data. Web Usage mining deals with understanding the behavior of users by making use of Mobile Web Access Logs that are generated on the server while the user is accessing the website. A Web access log comprises various entries such as the name of the user, the user's IP address, the number of bytes transferred, the time-stamp, etc. A variety of Log Analyzer tools exist, which help in analyzing things like users' navigational patterns, the part of the website users are most interested in, etc. The present paper makes use of one such log analyzer tool, called Mobile Web Log Expert, for ascertaining the behavior of users who access an astrology website. It also provides a comparative study of a few available log analyzer tools.
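
The log entries listed in this abstract (user name, IP address, bytes transferred, time-stamp) can be pulled out of a server log with a few lines of code. The sketch below is illustrative only: it assumes the widely used Common Log Format, which the abstract does not specify, and it is not the Mobile Web Log Expert tool. The names `LOG_PATTERN`, `parse_line` and `bytes_per_user` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return {
        "ip": m["ip"],
        "user": m["user"],
        "time": datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "request": m["request"],
        "status": int(m["status"]),
        # A "-" in the bytes column means no body was sent.
        "bytes": 0 if m["bytes"] == "-" else int(m["bytes"]),
    }

def bytes_per_user(lines):
    """Total bytes transferred per user name, skipping unparseable lines."""
    totals = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            totals[entry["user"]] += entry["bytes"]
    return totals
```

Aggregations such as `bytes_per_user` are the kind of summary a log analyzer builds on; navigational patterns would additionally need the requests of each user ordered by time-stamp.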

Keywords: mobile web access logs, web usage mining, web server, log analyzer

Procedia PDF Downloads 363
23982 Modeling Food Popularity Dependencies Using Social Media Data

Authors: Devashish Khulbe, Manu Pathak

Abstract:

The rise in popularity of major social media platforms has enabled people to share photos and textual information about their daily life. One of the popular topics about which information is shared is food. Since a lot of media about food is attributed to particular locations and restaurants, information like the spatio-temporal popularity of various cuisines can be analyzed. Tracking the popularity of food types and retail locations across space and time can also be useful for business owners and restaurant investors. In this work, we present an approach using off-the-shelf machine learning techniques to identify trends and popularity of cuisine types in an area using geo-tagged data from social media, Google Images and Yelp. After adjusting for time, we use Kernel Density Estimation to find hot spots across the location and model the dependencies among the popularity of food cuisines using Bayesian Networks. We consider the Manhattan borough of New York City as the location for our analyses, but the approach can be used for any area with social media data and information about retail businesses.
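
As a rough illustration of the hot-spot step, the sketch below estimates a two-dimensional density from geo-tagged points with a Gaussian kernel and returns the grid location where it is highest. The abstract does not state the kernel, bandwidth or grid used, so the isotropic Gaussian kernel, the fixed `bandwidth` parameter and the function names `gaussian_kde_2d` and `hottest_cell` are all assumptions made for this example.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Return a function density(x, y) built from an isotropic Gaussian kernel."""
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density

def hottest_cell(points, bandwidth, grid_size=20):
    """Evaluate the density on a regular grid over the bounding box of the
    points and return the (x, y) grid location with the highest density."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    density = gaussian_kde_2d(points, bandwidth)
    best = None
    for i in range(grid_size + 1):
        for j in range(grid_size + 1):
            x = x_min + (x_max - x_min) * i / grid_size
            y = y_min + (y_max - y_min) * j / grid_size
            d = density(x, y)
            if best is None or d > best[0]:
                best = (d, x, y)
    return best[1], best[2]
```

Running this once per cuisine type would give one hot spot per cuisine. A real analysis would more likely rely on a library implementation with data-driven bandwidth selection and properly projected coordinates.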

Keywords: Web Mining, Geographic Information Systems, Business popularity, Spatial Data Analyses

Procedia PDF Downloads 118
23981 Hierarchical Piecewise Linear Representation of Time Series Data

Authors: Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and the general shape of the time series. The representation permits coarse-to-fine processing at different levels of detail, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.
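
The recursive partitioning described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It makes three assumptions the abstract does not confirm: the error is the maximum vertical distance between the curve and the straight line joining the segment end points, the "best point" is the point of maximum error, and the initial perceptually important break points are supplied by the caller. All function names are hypothetical.

```python
def max_deviation(series, start, end):
    """Return (index, error) of the interior point farthest (vertically)
    from the straight line joining series[start] and series[end]."""
    y0, y1 = series[start], series[end]
    worst_idx, worst_err = start, 0.0
    for i in range(start + 1, end):
        interp = y0 + (y1 - y0) * (i - start) / (end - start)
        err = abs(series[i] - interp)
        if err > worst_err:
            worst_idx, worst_err = i, err
    return worst_idx, worst_err

def split_segment(series, start, end, threshold):
    """Recursively split [start, end] at the worst-fitting point until the
    error falls below the threshold; return a tree of nested segments."""
    idx, err = max_deviation(series, start, end)
    node = {"start": start, "end": end, "error": err, "children": []}
    # Split only at a true interior point, which guarantees termination.
    if err >= threshold and start < idx < end:
        node["children"] = [
            split_segment(series, start, idx, threshold),
            split_segment(series, idx, end, threshold),
        ]
    return node

def hpla(series, break_points, threshold):
    """Build one segment tree per pair of adjacent initial break points."""
    return [
        split_segment(series, a, b, threshold)
        for a, b in zip(break_points, break_points[1:])
    ]

def leaf_breakpoints(node):
    """Break points of the finest level of one tree, in order."""
    if not node["children"]:
        return [node["start"], node["end"]]
    left, right = node["children"]
    return leaf_breakpoints(left)[:-1] + leaf_breakpoints(right)
```

Keeping the whole tree rather than only the leaves is what would allow processing at different levels of detail: cutting the tree at a shallow depth gives a coarse approximation, while the leaves give the finest one.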

Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation

Procedia PDF Downloads 276
23980 Satellite Statistical Data Approach for Upwelling Identification and Prediction in South of East Java and Bali Sea

Authors: Hary Aprianto Wijaya Siahaan, Bayu Edo Pratama

Abstract:

Sea fisheries have the potential to become one of the nation's assets, contributing greatly to Indonesia's economy. This fishery potential is inseparable from the availability of chlorophyll in the territorial waters of Indonesia. The research was conducted using three methods, namely statistical, comparative and analytical. The data used include MODIS sea temperature imagery from the Aqua satellite with a resolution of 4 km for 2002-2015, MODIS chlorophyll-a imagery from the Aqua satellite with a resolution of 4 km for 2002-2015, and ASCAT imagery from the MetOp and NOAA satellites with a resolution of 27 km for 2002-2015. The results of the data processing show that upwelling in the sea south of East Java begins in June, identified by a below-normal sea surface temperature anomaly, air masses moving from east to west, and high chlorophyll-a concentrations. In July, the upwelling region expands further towards the west, and it reaches its peak in August. Chlorophyll-a concentration prediction using multiple linear regression equations gives excellent results for 2002 to 2015, with a correlation of 0.8 and an RMSE of 0.3. The chlorophyll-a concentration prediction for 2016 also gives good results: the correlation declines to 0.6, but the RMSE improves to 0.2.

Keywords: satellite, sea surface temperature, upwelling, wind stress

Procedia PDF Downloads 158