Search results for: heterogeneous massive data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25998

24198 Improve Student Performance Prediction Using Majority Vote Ensemble Model for Higher Education

Authors: Wade Ghribi, Abdelmoty M. Ahmed, Ahmed Said Badawy, Belgacem Bouallegue

Abstract:

In higher education institutions, the most pressing priority is to improve student performance and retention. Educational Data Mining techniques use large volumes of student data to uncover hidden information in students' learning behavior, particularly early signs that a student is at risk. However, data containing noise, outliers, and irrelevant information may lead to incorrect conclusions. This paper aims to develop a reliable student performance prediction model for higher education institutions by identifying the features of student data that can improve prediction results, comparing ensemble learning techniques on the preprocessed data to identify the most appropriate one, and optimizing the hyperparameters. Data were gathered from two systems, a student information system and an e-learning system, for undergraduate students in the College of Computer Science of a Saudi Arabian state university. Records of 4413 students were used. The process includes data collection, data integration, data preprocessing (such as cleaning, normalization, and transformation), feature selection, pattern extraction, and, finally, model optimization and assessment. The ensemble learning approaches considered are Random Forest, Bagging, Stacking, Majority Vote, and two Boosting techniques, AdaBoost and XGBoost; the single supervised learners are Decision Tree, Support Vector Machine, and Artificial Neural Network. Hyperparameters of the ensemble learners are fine-tuned to improve performance. The findings suggest that combining features of students' behavior from the e-learning and student information systems using Majority Vote produced better outcomes than the other ensemble techniques.
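As a rough illustration of the majority-vote idea named in the abstract, the sketch below combines the labels of several base classifiers and returns the most frequent one. The three rule-based classifiers, their thresholds, and the field names (`gpa`, `weekly_logins`, `attendance`) are hypothetical stand-ins, not the models or features used in the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among the base-classifier predictions.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label, then take the majority."""
    return majority_vote([clf(sample) for clf in classifiers])


# Hypothetical base classifiers (assumed thresholds, for illustration only).
def by_grade(s):
    return "at-risk" if s["gpa"] < 2.0 else "ok"


def by_logins(s):
    return "at-risk" if s["weekly_logins"] < 3 else "ok"


def by_attendance(s):
    return "at-risk" if s["attendance"] < 0.6 else "ok"


student = {"gpa": 2.4, "weekly_logins": 1, "attendance": 0.5}
label = ensemble_predict([by_grade, by_logins, by_attendance], student)
print(label)  # two of the three classifiers vote "at-risk"
```

In practice the base classifiers would be trained models rather than fixed rules; the voting step itself stays the same.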

Keywords: educational data mining, student performance prediction, e-learning, classification, ensemble learning, higher education

Procedia PDF Downloads 108
24197 Foundation of the Information Model for Connected-Cars

Authors: Hae-Won Seo, Yong-Gu Lee

Abstract:

Recent progress in next-generation automobile technology is geared towards incorporating information technology into cars. Collectively called smart cars, these vehicles bring intelligence that provides comfort, convenience and safety. One branch of smart cars is the connected-car system. The key concept in connected cars is the sharing of driving information among cars in a decentralized manner, enabling collective intelligence. This paper proposes a foundation for the information model needed to define the driving information for smart cars. Road conditions are modeled through a unique data structure that unambiguously represents the time-variant traffic in the streets. Additionally, the modeled data structure is exemplified in a navigational scenario and its usage is described using UML. Optimal driving route searching under dynamically changing road conditions is also discussed using the proposed data structure.

Keywords: connected-car, data modeling, route planning, navigation system

Procedia PDF Downloads 374
24196 Synthesis of Nanoparticles and Thin Film of Cu₂ZnSnS₄ by Hydrothermal Method and Its Application as Congo Red Photocatalyst

Authors: Paula Salazar, Rodrigo Henríquez, Pablo Zerega

Abstract:

The textile, food and pharmaceutical industries are expanding daily worldwide, and they rank among the most polluting industries because their wastewater is discharged into watercourses with high concentrations of dyes and traces of drugs. Many of these compounds are stable to light and biodegradation and are considered emerging organic contaminants. Advanced oxidation processes (AOPs) have emerged as an effective alternative for the removal and elimination of this type of contaminant. Heterogeneous photocatalysis has been extensively studied as it is an efficient, low-cost and durable method. As the main photocatalyst, TiO₂ has been used for the degradation of a large number of dyes and drugs. The disadvantage of TiO₂ is that it absorbs only in the UV region of the solar spectrum. On the other hand, quaternary chalcogenides based on Cu₂ZnSnX₄ (X = S, Se) are a possible alternative due to their narrow bandgap (ca. 0.8 to 1.5 eV depending on the phase considered), low cost, the abundance of their constituent elements in the earth's crust and their low toxicity. The objective of this research was to synthesize Cu₂ZnSnS₄ (CZTS) through a low-cost hydrothermal method and evaluate it as a potential photocatalyst in the photodegradation of Congo Red. The synthesis of the nanoparticles in suspension and of the film on fluorine-doped tin oxide coated glass (FTO) was carried out using a mixture of 2 mmol CuCl₂, 1 mmol ZnCl₂, 1 mmol SnCl₂ and 4 mmol CH₄N₂S in a Teflon reactor at 180 °C for 72 h. Characterization was performed through scanning electron microscopy (SEM), X-ray diffraction (XRD) and UV-Vis spectroscopy. Photodegradation was monitored with a UV-Vis spectrophotometer. The results show that 55% photodegradation of the dye can be obtained after 4 h of exposure to polychromatic light. It should be noted that the Congo Red dye is being studied for the first time.

Keywords: CZTS, hydrothermal, photocatalysis, dye

Procedia PDF Downloads 122
24195 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors contain important geographical information that can be used for remote sensing applications such as regional planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often treated as a classification problem, so this task can be performed using machine-learning techniques. Although many machine-learning algorithms exist, the classification here is done using supervised classifiers such as Support Vector Machines (SVM), as the area of interest is known. We propose a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset has been created for training and testing purposes; we generated the attributes from pixel intensity values and mean reflectance values. We demonstrate the benefits of knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 340
24194 Automated Multisensory Data Collection System for Continuous Monitoring of Refrigerating Appliances Recycling Plants

Authors: Georgii Emelianov, Mikhail Polikarpov, Fabian Hübner, Jochen Deuse, Jochen Schiemann

Abstract:

Recycling refrigerating appliances plays a major role in protecting the Earth's atmosphere from ozone depletion and emissions of greenhouse gases. The performance of refrigerator recycling plants in terms of material retention is subject to strict environmental certifications and is reviewed periodically through specialized audits. The continuous collection of refrigerator data required for the input-output analysis is still mostly manual, error-prone, and not digitalized. In this paper, we propose an automated data collection system for recycling plants that deduces the expected material contents of individual end-of-life refrigerating appliances. The system utilizes laser scanner measurements and optical data to extract attributes of individual refrigerators by applying transfer learning with pre-trained vision models and optical character recognition. Based on the recognized features, the system automatically provides material categories and target values of contained material masses, especially foaming and cooling agents. The presented data collection system paves the way for continuous performance monitoring and efficient control of refrigerator recycling plants.

Keywords: automation, data collection, performance monitoring, recycling, refrigerators

Procedia PDF Downloads 164
24193 Sales Patterns Clustering Analysis on Seasonal Product Sales Data

Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho

Abstract:

As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.

Keywords: clustering, distribution, sales pattern, seasonal product

Procedia PDF Downloads 597
24192 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using the bootstrap method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. The area under the ROC curve for the model derived from the random-sample data set was similar when fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p = 0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to it.
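The bootstrap comparison described above can be sketched generically: resample the data with replacement, recompute the statistic on each resample, and take the spread of those estimates as a standard error. The sketch below uses the sample mean as a simple stand-in for a regression coefficient (the study itself fits conditional logistic regression models); the data values, the default of 100 resamples, and the function name `bootstrap_se` are illustrative assumptions.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_resamples=100, seed=0):
    """Estimate the standard error of `statistic` by bootstrap resampling."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        resample = [rng.choice(data) for _ in data]
        estimates.append(statistic(resample))
    # The spread of the resampled estimates approximates the standard error.
    return statistics.stdev(estimates)


# Made-up observations, for illustration only.
sample = [2.1, 3.4, 1.9, 4.0, 2.8, 3.1, 2.5, 3.7]
se = bootstrap_se(sample, statistics.mean)
print(f"bootstrap standard error of the mean: {se:.3f}")
```

A wide spread of such standard errors across resamples, as reported for the snowball sample, signals an unstable model.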

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 493
24191 Multifunctional β-Cyclodextrin-EDTA-Chitosan Polymer Adsorbent Synthesis for Simultaneous Removal of Heavy Metals and Organic Dyes from Wastewater

Authors: Monu Verma, Hyunook Kim

Abstract:

Heavy metals and organic dyes are the major sources of water pollution. Herein, a trifunctional β−cyclodextrin−ethylenediaminetetraacetic acid−chitosan (β−CD−EDTA−CS) polymer was synthesized using an easy and simple chemical route by the reaction of activated β−CD with CS through EDTA as a cross-linker (amidation reaction) for the removal of inorganic and organic pollutants from aqueous solution under different parameters such as pH, time effect, initial concentration, reusability, etc. The synthesized adsorbent was characterized using powder X-ray diffraction, Fourier transform infrared spectroscopy, field scanning electron microscopy, energy dispersive spectroscopy, Brunauer-Emmett-Teller (BET), thermogravimetric analyzer techniques to investigate their structural, functional, morphological, elemental compositions, surface area, and thermal properties, respectively. Two types of heavy metals, i.e., mercury (Hg²⁺) and cadmium (Cd²⁺), and three organic dyes, i.e., methylene blue (MB), crystal violet (CV), and safranin O (SO), were chosen as inorganic and organic pollutants, respectively, to study the adsorption capacity of β-CD-EDTA-CS in aqueous solution. The β-CD-EDTA-CS shows a monolayer adsorption capacity of 346.30 ± 14.0 and 202.90 ± 13.90 mg g−¹ for Hg²⁺ and Cd²⁺, respectively, and a heterogeneous adsorption capacity of 107.20 ± 5.70, 77.40 ± 5.30 and 55.30 ± 3.60 mg g−¹ for MB, CV and SO, respectively. Kinetics results followed pseudo-second order (PSO) kinetics behavior for both metal ions and dyes, and higher rate constants values (0.00161–0.00368 g mg−¹ min−¹) for dyes confirmed the cavitation of organic dyes (physisorption). In addition, we have also demonstrated the performance of β-CD-EDTA-CS for the four heavy metals, Hg²⁺, Cd²⁺, Ni²⁺, and Cu²⁺, and three dyes MB, CV, and SO in secondary treated wastewater. The findings of this study indicate that β-CD-EDTA-CS is simple and easy to synthesize and can be used in wastewater treatment.
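For readers unfamiliar with the pseudo-second-order (PSO) model mentioned above, its integrated form gives the uptake at time t as q_t = k₂·q_e²·t / (1 + k₂·q_e·t). The sketch below evaluates that expression. Pairing q_e = 107.2 mg g⁻¹ with k₂ = 0.00161 g mg⁻¹ min⁻¹ is an assumption made only for illustration; the abstract does not state which rate constant belongs to which dye.

```python
def pso_uptake(t, q_e, k2):
    """Adsorbed amount q_t (mg/g) at time t (min) under pseudo-second-order kinetics.

    q_e: equilibrium capacity (mg/g); k2: PSO rate constant (g mg^-1 min^-1).
    """
    return (k2 * q_e ** 2 * t) / (1.0 + k2 * q_e * t)


# Illustrative pairing of values quoted in the abstract (an assumption).
q_e = 107.2
k2 = 0.00161

# Under this model, half of the equilibrium capacity is reached at t = 1 / (k2 * q_e).
t_half = 1.0 / (k2 * q_e)

for t in (0, 5, 30, 120):
    print(f"t = {t:>3} min  ->  q_t = {pso_uptake(t, q_e, k2):.1f} mg/g")
```

The uptake rises monotonically from zero and approaches q_e at long contact times.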

Keywords: adsorption isotherms, adsorption mechanism, amino-β-cyclodextrin, heavy metal ions, organic dyes

Procedia PDF Downloads 107
24190 Evaluating the Effectiveness of Science Teacher Training Programme in National Colleges of Education: a Preliminary Study, Perceptions of Prospective Teachers

Authors: A. S. V Polgampala, F. Huang

Abstract:

This is an overview of what an evaluation entails and of the issues to be aware of when class observation is carried out. The study examined the effects of evaluating the teaching practice of a 7-day ‘block teaching’ session in a pre-service science teacher training program at a reputed National College of Education in Sri Lanka. Effects were assessed in three areas: evaluation of the training process, evaluation of the training impact, and evaluation of the training procedure. Data for this study were collected by class observation of 18 teachers from 9th February to the 16th of 2017. The participants, prospective science teachers, were evaluated using a format newly introduced by the NIE. The collected data were analyzed qualitatively using the Miles and Huberman procedure for analyzing qualitative data: data reduction, data display and conclusion drawing/verification. It was observed that the trainees showed confidence in teaching those competencies and skills. Teacher educators’ dissatisfaction has had a great impact on the evaluation process.

Keywords: evaluation, perceptions & perspectives, pre-service, science teaching

Procedia PDF Downloads 315
24189 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm

Authors: Sukhleen Kaur

Abstract:

In security groundwork, the Intrusion Detection System (IDS) has become an important component and has received increasing attention in recent years. An IDS is one of the effective ways to detect different kinds of attacks and malicious code in a network and helps us secure the network. Data mining techniques can be applied to an IDS to analyse large amounts of data and give better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far, studies have concentrated on finding attacks, whereas this paper detects malicious files. Some intruders do not attack directly; instead, they hide harmful code inside files, or corrupt those files, to attack the system. These files are detected according to defined parameters, which yield two lists of files: normal files and harmful files. After that, data mining is performed. In this paper, a hybrid classifier based on the Naive Bayes and Ripper classification methods has been used. The results show how a file uploaded to the database is tested against the parameters and characterised as either a normal or a harmful file, after which the mining is performed. Moreover, when a user tries to mine a harmful file, an exception is generated stating that mining cannot be performed on corrupted or harmful files.

Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper

Procedia PDF Downloads 414
24188 Generalized Approach to Linear Data Transformation

Authors: Abhijith Asok

Abstract:

This paper presents a generalized approach for the simple linear data transformation, Y = bX, through an integration of multidimensional coordinate geometry, vector space theory and polygonal geometry. The scaling is performed by adding a ‘Dummy Dimension’ to the n-dimensional data, which helps plot two-dimensional component-wise straight lines on pairs of dimensions. The end result is a set of scaled extensions of observations in any of the 2ⁿ spatial divisions, where n is the total number of applicable dimensions/dataset variables, created by shifting the n-dimensional plane along the ‘Dummy Axis’. The derived scaling factor was found to depend on the coordinates of the common point of origin for the diverging straight lines and on the plane of extension, chosen on and perpendicular to the ‘Dummy Axis’, respectively. This result gives a geometrical interpretation of a linear data transformation and hence opportunities for a more informed choice of the factor ‘b’, based on a better choice of these coordinate values. The paper goes on to identify the effect of this transformation on certain popular distance metrics; for many of them, the distance metric retained the same scaling factor as the features.
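The closing claim, that many distance metrics inherit the scaling factor of the features, can be checked numerically for two common metrics. The sketch below is not the paper's geometric construction; it only verifies that scaling every feature by b multiplies the Euclidean and Manhattan distances by |b|. The points and the value of b are arbitrary illustrative choices.

```python
import math


def scale(point, b):
    """Apply the linear transformation Y = bX to every coordinate."""
    return [b * x for x in point]


def euclidean(p, q):
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(p, q)))


def manhattan(p, q):
    return sum(abs(a - c) for a, c in zip(p, q))


# Arbitrary example points and scaling factor.
p, q, b = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0], 2.5

for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
    before = dist(p, q)
    after = dist(scale(p, b), scale(q, b))
    # The ratio after / before equals |b| for both metrics.
    print(f"{name}: {before} -> {after} (ratio {after / before})")
```

Metrics that normalise by vector length, such as cosine distance, are unaffected by a positive b, which is presumably why the abstract says "many" rather than "all".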

Keywords: data transformation, dummy dimension, linear transformation, scaling

Procedia PDF Downloads 297
24187 Blockchain Platform Configuration for MyData Operator in Digital and Connected Health

Authors: Minna Pikkarainen, Yueqiang Xu

Abstract:

The integration of digital technology with existing healthcare processes has been painfully slow, and a huge gap exists between strictly regulated official medical care and the quickly moving field of health and wellness technology. We claim that the promises of preventive healthcare can only be fulfilled when this gap is closed and health care and self-care become a seamless continuum, “correct information, in the correct hands, at the correct time allowing individuals and professionals to make better decisions”, which we call the connected health approach. Currently, issues related to security, privacy, consumer consent and data sharing are hindering the implementation of this new paradigm of healthcare. This could be solved by following the MyData principles, which state that individuals should have the right and practical means to manage their data and privacy. MyData infrastructure enables decentralized management of personal data, improves interoperability, makes it easier for companies to comply with tightening data protection regulations, and allows individuals to change service providers without proprietary data lock-ins. This paper tackles today’s unprecedented challenge of enabling and stimulating multiple healthcare data providers and stakeholders to participate more actively in the digital health ecosystem. First, the paper systematically proposes the MyData approach for the healthcare and preventive health data ecosystem. In this research, the work is targeted at health and wellness ecosystems. Each ecosystem consists of key actors: 1) the individual (citizen or professional controlling/using the services), i.e. the data subject, 2) services providing personal data (e.g. startups providing data collection apps or data collection devices), 3) health and wellness services utilizing the aforementioned data and 4) services authorizing access to this data under the individual’s explicit consent.
Second, the research extends the existing four archetypes of orchestrator-driven healthcare data business models for the healthcare industry and proposes the fifth type of healthcare data model, the MyData Blockchain Platform. This new architecture is developed by the Action Design Research approach, which is a prominent research methodology in the information system domain. The key novelty of the paper is to expand the health data value chain architecture and design from centralization and pseudo-decentralization to full decentralization, enabled by blockchain, thus the MyData blockchain platform. The study not only broadens the healthcare informatics literature but also contributes to the theoretical development of digital healthcare and blockchain research domains with a systemic approach.

Keywords: blockchain, health data, platform, action design

Procedia PDF Downloads 100
24186 Ecolodging as an Answer for Sustainable Development and Successful Resource Management: The Case of North West Coast in Alexandria

Authors: I. Elrouby

Abstract:

The continued growth of tourism in the future relies on maintaining a clean environment by achieving sustainable development. The erosion and degradation of beaches, the deterioration of coastal water quality, visual pollution of coastlines by massive developments, all this has contributed heavily to the loss of the natural attractiveness for tourism. In light of this, promoting the concept of sustainable coastal development is becoming a central goal for governments and private sector. An ecolodge is a small hotel or guesthouse that incorporates local architectural, cultural and natural characteristics, promotes environmental conservation through minimizing the use of waste and energy and produces social and economic benefits for local communities. Egypt has some scattered attempts in some areas like Sinai in the field of ecolodging. This research tends to investigate the potentials of the North West Coast (NWC) in Alexandria as a new candidate for ecolodging investments. The area is full of primitive natural and man-made resources. These, if used in an environmental-friendly way could achieve cost reductions as a result of successful resource management for investors on the one hand, and coastal preservation on the other hand. In-depth interviews will be conducted with stakeholders in the tourism sector to examine their opinion about the potentials of the research area for ecolodging developments. The candidates will be also asked to rate the importance of the availability of certain environmental aspects in such establishments such as the uses of resources that originate from local communities, uses of natural power sources, uses of an environmental-friendly sewage disposal, forbidding the use of materials of endangered species and enhancing cultural heritage conservation. The results show that the area is full of potentials that could be effectively used for ecolodging investments. 
These potentials, if used efficiently, could attract ecotourism as a supplementary type of tourism that could be promoted in Alexandria alongside cultural, recreational and religious tourism.

Keywords: Alexandria, ecolodging, ecotourism, sustainability

Procedia PDF Downloads 200
24185 Using Learning Apps in the Classroom

Authors: Janet C. Read

Abstract:

UCLan set up a collaboration with Lingokids to assess the Lingokids learning app's impact on learning outcomes in UK classrooms for children aged 3 to 5 years. Data gathered during the controlled study with 69 children include attitudinal data, engagement, and learning scores. The data show that children's enjoyment while learning was higher among those using the game-based app than among those using traditional methods. It is worth pointing out that, among older children, engagement when using the learning app was significantly higher than with traditional methods. According to the existing literature, there is a direct correlation between engagement, motivation, and learning. Therefore, this study provides relevant data points to conclude that the Lingokids learning app serves its purpose of encouraging learning through playful and interactive content. That said, we believe that learning outcomes should be assessed with a wider range of methods in further studies. Likewise, it would be beneficial to assess the usability and playability of the app in order to evaluate it from other angles.

Keywords: learning app, learning outcomes, rapid test activity, Smileyometer, early childhood education, innovative pedagogy

Procedia PDF Downloads 71
24184 Road Safety in the Great Britain: An Exploratory Data Analysis

Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari

Abstract:

Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse Great Britain's road safety situation and to show response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. In this paper, we carry out an exploratory data analysis using STATS19 data. For the past 30 years, the UK has had a good record in reducing fatalities. The UK ranked third based on the number of road deaths per million inhabitants. Around 165,000 accidents were reported in Great Britain in 2009, and the number decreased every year until 2019, when it was under 120,000. The government continues to reduce road deaths by empowering responsible road users and by identifying and acting on the factors that make the roads less safe.

Keywords: road safety, data analysis, openstreetmap, feature expanding

Procedia PDF Downloads 140
24183 Intrusion Detection System Using Linear Discriminant Analysis

Authors: Zyad Elkhadir, Khalid Chougdali, Mohammed Benattou

Abstract:

Most existing intrusion detection systems work on quantitative network traffic data with many irrelevant and redundant features, which makes the detection process more time-consuming and less accurate. Several feature extraction methods, such as linear discriminant analysis (LDA), have been proposed. However, LDA suffers from the small sample size (SSS) problem, which occurs when the number of training samples is small compared with the sample dimension. Hence, classical LDA cannot be applied directly to high-dimensional data such as network traffic data. In this paper, we propose two solutions to the SSS problem for LDA and apply them to a network IDS. The first reduces the dimension of the original data using principal component analysis (PCA) and then applies LDA. The second uses the pseudo-inverse to avoid the singularity of the within-class scatter matrix caused by the SSS problem. After that, the KNN algorithm is used for classification. We chose two well-known datasets, KDDcup99 and NSL-KDD, to test the proposed approaches. Results showed that the classification accuracy of the PCA+LDA method clearly outperforms that of the pseudo-inverse LDA method when the training data are large.
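For context, the singularity mentioned above can be stated in the standard textbook formulation of LDA. The notation below (N samples, C classes, dimension d, class means μ_c, global mean μ) is the usual convention and is assumed here; it is not taken from the paper itself.

```latex
% Within-class and between-class scatter matrices
S_w = \sum_{c=1}^{C} \sum_{x \in \mathcal{X}_c} (x - \mu_c)(x - \mu_c)^{\top},
\qquad
S_b = \sum_{c=1}^{C} n_c \, (\mu_c - \mu)(\mu_c - \mu)^{\top}

% Fisher criterion; the columns of W are the leading eigenvectors of S_w^{-1} S_b
W^{*} = \arg\max_{W} \frac{\lvert W^{\top} S_b W \rvert}{\lvert W^{\top} S_w W \rvert}

% Small sample size problem: rank(S_w) <= N - C, so S_w is singular when N - C < d.
% Pseudo-inverse variant: replace S_w^{-1} by the Moore-Penrose pseudo-inverse S_w^{+}
S_w^{+} S_b \, w = \lambda \, w
```

The PCA+LDA alternative sidesteps the same singularity by first projecting the data onto a subspace small enough for S_w to be invertible.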

Keywords: LDA, Pseudoinverse, PCA, IDS, NSL-KDD, KDDcup99

Procedia PDF Downloads 227
24182 A 3D Cell-Based Biosensor for Real-Time and Non-Invasive Monitoring of 3D Cell Viability and Drug Screening

Authors: Yuxiang Pan, Yong Qiu, Chenlei Gu, Ping Wang

Abstract:

In the past decade, three-dimensional (3D) tumor cell models have attracted increasing interest in the field of drug screening because they simulate heterogeneous tumor behavior in vivo more accurately. Drug sensitivity testing based on 3D tumor cell models can provide more reliable prediction of in vivo efficacy. The gold-standard fluorescence staining cannot readily achieve real-time, label-free monitoring of the viability of 3D tumor cell models. In this study, a micro-groove impedance sensor (MGIS) was specially developed for dynamic and non-invasive monitoring of 3D cell viability. 3D tumor cells were trapped in the micro-grooves with opposite gold electrodes for in-situ impedance measurement. A change in the number of live cells causes an inversely proportional change in the impedance magnitude of the entire cell/Matrigel construct, which reflects the proliferation and apoptosis of the 3D cells. It was confirmed that the 3D cell viability detected by the MGIS platform is highly consistent with standard live/dead staining. Furthermore, the accuracy of the MGIS platform was demonstrated quantitatively using a 3D lung cancer model and sophisticated drug sensitivity testing. In addition, the parameters of the micro-groove impedance chip processing and measurement experiments were optimized in detail. The results demonstrated that the MGIS and 3D cell-based biosensor would be a promising platform to improve the efficiency and accuracy of cell-based anti-cancer drug screening in vitro.

Keywords: micro-groove impedance sensor, 3D cell-based biosensors, 3D cell viability, micro-electromechanical systems

Procedia PDF Downloads 128
24181 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — in the Case of Critical Dataset Size —

Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno

Abstract:

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induce if-then rules from a decision table, which is regarded as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducing true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity for rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical dataset size derived in the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data.

Keywords: rule induction, decision table, missing data, noise

Procedia PDF Downloads 396
24180 Joubert Syndrome in Children as Multicentric Screening in Ten Different Places in World

Authors: Bajraktarevic Adnan, Djukic Branka, Sporisevic Lutvo, Krdzalic Zecevic Belma, Uzicanin Sajra, Hadzimuratovic Admir, Hadzimuratovic Hadzipasic Emina, Abduzaimovic Alisa, Kustric Amer, Suljevic Ismet, Serafi Ismail, Tahmiscija Indira, Khatib Hakam, Semic Jusufagic Aida, Haas Helmut, Vladicic Aleksandra, Aplenc Richard, Kadic Deovic Aida

Abstract:

Introduction: Joubert syndrome has an autosomal recessive pattern of inheritance. It is a disorder of brain function caused by underdevelopment of the cerebellar vermis. Associated conditions involving the eye and the kidney are well described. Aims: Research helps us better understand Joubert syndrome and can lead to advances in diagnosis and treatment. Methods: Cases showing the molar tooth sign and the characteristics of Joubert syndrome were described in ten different places in the world. Carrier testing and diagnosis are available if a causative gene mutation has been identified in an affected family member. Results: The authors describe eleven cases of Joubert syndrome over twenty years. It is a clinically and genetically heterogeneous group of disorders characterized by hypoplasia of the cerebellar vermis with the characteristic neuroradiologic molar tooth sign, and accompanying neurologic symptoms, including dysregulation of breathing pattern and developmental delay. The diagnosis of Joubert syndrome with renal anomalies was confirmed in twin sisters. Ocular symptoms were present in seven of the eleven cases (63.64%). The eleven cases comprised five boys (45.45%) and six girls (54.55%). Conclusions: Joubert syndrome is inherited as an autosomal recessive genetic disorder with several features of the disease.

Keywords: Joubert syndrome, cerebellooculorenal syndrome, autosomal recessive genetic disorder (ARGD), children

Procedia PDF Downloads 278
24179 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, and user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.
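The staged structure described above (ingestion, text analysis, schema-based record extraction) can be sketched as a chain of small functions. This is only an illustration: a regular expression stands in for the machine learning entity detector, and the `Record` schema, the function names and the sample sentence are all hypothetical rather than taken from the paper.

```python
import re
from dataclasses import dataclass

@dataclass
class Record:
    """Hypothetical target schema for one extracted fact."""
    company: str
    revenue_musd: float

def ingest(raw: bytes) -> str:
    # Ingestion: turn the raw document bytes into text.
    return raw.decode("utf-8")

# Stand-in for a learned entity detector (an assumption, not the authors' model).
_PATTERN = re.compile(
    r"(?P<company>[A-Z][A-Za-z]+ (?:Inc|Corp|Ltd))\.? "
    r"reported revenue of \$(?P<amount>[\d.]+) million"
)

def detect_entities(text: str) -> list:
    # Text analysis: find company and amount mentions.
    return [m.groupdict() for m in _PATTERN.finditer(text)]

def extract_records(entities: list) -> list:
    # Schema-based record extraction: map detected entities onto the schema.
    return [Record(e["company"], float(e["amount"])) for e in entities]

raw_doc = b"Acme Corp reported revenue of $12.5 million. Globex Inc. reported revenue of $3 million."
records = extract_records(detect_entities(ingest(raw_doc)))
```

Keeping each stage behind its own function makes it possible to replace one step (for example, the detector) with a learned model without touching the others.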

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 113
24178 Model Predictive Controller for Pasteurization Process

Authors: Tesfaye Alamirew Dessie

Abstract:

Our study focuses on developing a Model Predictive Controller (MPC) and evaluating it against a traditional PID controller for a pasteurization process. The dynamics of the pasteurization process were identified from experimental data using system identification. The quality of several model architectures was evaluated using best fit on validation data, residual analysis, and stability analysis. The auto-regressive with exogenous input (ARX322) model of the pasteurization process fit the validation data by roughly 80.37 percent. The ARX322 model structure was used to design the MPC and PID controllers. After comparing controller performance based on settling time, overshoot percentage, and stability analysis, it was found that the MPC outperforms the PID controller on those criteria.
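A least-squares ARX fit and a fit-percentage score can be sketched as follows. The sketch assumes that "ARX322" denotes orders na = 3, nb = 2 and input delay nk = 2, and that the fit percentage is the normalized root-mean-square measure 100 * (1 - ||y - y_hat|| / ||y - mean(y)||) on one-step-ahead predictions; neither assumption is confirmed by the abstract. The simulated plant coefficients are invented for the demonstration.

```python
import numpy as np

def fit_arx(y, u, na=3, nb=2, nk=2):
    """Least-squares fit of
    y[t] + a1*y[t-1] + ... + a_na*y[t-na] = b1*u[t-nk] + ... + b_nb*u[t-nk-nb+1].
    """
    start = max(na, nk + nb - 1)  # first index with all regressors available
    phi = np.array([
        [-y[t - i] for i in range(1, na + 1)] + [u[t - nk - j] for j in range(nb)]
        for t in range(start, len(y))
    ])
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:na], theta[na:], phi @ theta, start

def fit_percent(y, y_hat):
    # 100% means a perfect match; 0% means no better than predicting the mean.
    return 100.0 * (1.0 - np.linalg.norm(y - y_hat) / np.linalg.norm(y - np.mean(y)))

# Hypothetical noise-free plant used only to exercise the code.
rng = np.random.default_rng(0)
a_true, b_true = np.array([-0.6, 0.2, -0.1]), np.array([1.0, 0.5])
u = rng.standard_normal(300)
y = np.zeros(300)
for t in range(3, 300):
    y[t] = -a_true @ y[t - 3:t][::-1] + b_true @ u[t - 3:t - 1][::-1]

a_hat, b_hat, y_hat, start = fit_arx(y, u)
fit = fit_percent(y[start:], y_hat)
```

On real process data the fit would be computed on a held-out validation segment, and measurement noise would bring it well below 100 percent.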

Keywords: MPC, PID, ARX, pasteurization

Procedia PDF Downloads 163
24177 The Executive Functioning Profile of Children and Adolescents with a Diagnosis of OCD: A Systematic Review and Meta-Analysis

Authors: Parker Townes, Aisouda Savadlou, Shoshana Weiss, Marina Jarenova, Suzzane Ferris, Dan Devoe, Russel Schachar, Scott Patten, Tomas Lange, Marlena Colasanto, Holly McGinn, Paul Arnold

Abstract:

Some research suggests obsessive-compulsive disorder (OCD) is associated with impaired executive functioning: higher-level mental processes involved in carrying out tasks and solving problems. Relevant literature was identified systematically through online databases. Meta-analyses were conducted for task performance metrics reported by at least two articles. Results were synthesized by the executive functioning domain measured through each performance metric. Heterogeneous literature was identified, typically involving few studies using consistent measures. From 29 included studies, analyses were conducted on 33 performance metrics from 12 tasks. Results suggest moderate associations of working memory (two out of five tasks presented significant findings), planning (one out of two tasks presented significant findings), and visuospatial abilities (one out of two tasks presented significant findings) with OCD in youth. There was inadequate literature or contradictory findings for other executive functioning domains. These findings suggest working memory, planning, and visuospatial abilities are impaired in pediatric OCD, with mixed results. More work is needed to identify the effect of age and sex on these results. Acknowledgment: This work was supported by the Alberta Innovates Translational Health Chair in Child and Youth Mental Health. The funders had no role in the design, conducting, writing, or decision to submit this article for publication.

Keywords: obsessive-compulsive disorder, neurocognition, executive functioning, adolescents, children

Procedia PDF Downloads 99
24176 A Remotely Piloted Aerial Application System to Control Rangeland Grasshoppers

Authors: Daniel Martin, Roberto Rodriguez, Derek Woller, Chris Reuter, Lonnie Black, Mohamed Latheef

Abstract:

Grasshoppers, which comprise heterogeneous assemblages of Acrididae (Order: Orthoptera) species, periodically reach outbreak levels through their gregarious behavior and voracious feeding habits, devouring stems and leaves of food crops and rangeland pasture. Cattle consume about 1.5-2.5% of their body weight in forage per day, so pound for pound, a grasshopper will eat 12-20 times as much plant material as a steer and cause serious economic damage to the cattle industry, especially during a drought when forage is already scarce. Grasshoppers annually consume more than 20% of rangeland forages in the western United States at an estimated loss of $1.25 billion per year in forage. A remotely piloted aerial application system with both a spreader and spray application system was used to apply granular insect bait and a liquid formulation of Carbaryl for control of grasshopper infestations on rangeland in New Mexico, United States. Pattern testing and calibration of both the granular and liquid application systems were conducted to determine proper application rate setup and distribution pattern. From these tests, an effective swath was calculated. Results showed that 14 days after application, granular baits were only effective on those grasshopper species that accepted the baits. The liquid formulation at 16 ounces per acre was highly successful at controlling all grasshopper species. Results of this study indicated that a remotely piloted aerial application system can be used to effectively deliver grasshopper control products in both granular and liquid form. However, the spray application treatment proved to be most effective and efficient for all grasshopper species present.

Keywords: Carbaryl, Grasshopper, Insecticidal Efficacy, Remotely Piloted Aerial Application System

Procedia PDF Downloads 219
24175 Point Estimation for the Type II Generalized Logistic Distribution Based on Progressively Censored Data

Authors: Rana Rimawi, Ayman Baklizi

Abstract:

Skewed distributions are important models that are frequently used in applications. Generalized distributions form a class of skewed distributions and have gained widespread use in applications because of their flexibility in data analysis. More specifically, the Generalized Logistic Distribution with its different types has received considerable attention recently. In this study, based on progressively type-II censored data, we will consider point estimation for the Type II Generalized Logistic Distribution (Type II GLD). We will develop several estimators for its unknown parameters, including maximum likelihood estimators (MLE), Bayes estimators and best linear unbiased estimators (BLUE). The estimators will be compared using simulation based on the criteria of bias and mean square error (MSE). An illustrative example of a real data set will be given.
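As a hedged illustration of maximum likelihood under progressive type-II censoring, the sketch below assumes the standard one-parameter Type II GLD with shape alpha only (location 0, scale 1), whose density is f(x) = alpha * exp(-alpha*x) / (1 + exp(-x))^(alpha+1) and whose survival function is S(x) = (1 + exp(x))^(-alpha). With R_i units withdrawn at the i-th observed failure x_i, the log-likelihood is the sum of log f(x_i) + R_i * log S(x_i), and setting its derivative to zero gives alpha_hat = m / sum((1 + R_i) * log(1 + exp(x_i))). The paper may treat more parameters; the data values here are invented.

```python
import math

def log_likelihood(alpha, xs, removals):
    """Progressive type-II censored log-likelihood (up to a constant)."""
    total = 0.0
    for x, r in zip(xs, removals):
        softplus = math.log1p(math.exp(x))  # log(1 + e^x)
        log_pdf = math.log(alpha) + x - (alpha + 1.0) * softplus
        log_surv = -alpha * softplus
        total += log_pdf + r * log_surv  # r withdrawn units survive past x
    return total

def mle_shape(xs, removals):
    """Closed-form maximizer of log_likelihood in alpha."""
    denom = sum((1 + r) * math.log1p(math.exp(x)) for x, r in zip(xs, removals))
    return len(xs) / denom

# Hypothetical censored sample: observed failures and units removed at each one.
xs = [-0.5, 0.1, 0.8, 1.5]
removals = [1, 0, 2, 0]
alpha_hat = mle_shape(xs, removals)
```

When location and scale are also unknown, no closed form is available and the likelihood would have to be maximized numerically.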

Keywords: point estimation, type II generalized logistic distribution, progressive censoring, maximum likelihood estimation

Procedia PDF Downloads 198
24174 Omni: Data Science Platform for Evaluate Performance of a LoRaWAN Network

Authors: Emanuele A. Solagna, Ricardo S. Tozetto, Roberto dos S. Rabello

Abstract:

Nowadays, physical processes are becoming digitized through the evolution of communication, sensing and storage technologies, which promotes the development of smart cities. The evolution of this technology has generated multiple challenges related to the generation of big data and the active participation of electronic devices in society. Devices can send information that is captured and processed over large areas, but there is no guarantee that all of the collected data will be effectively stored and correctly persisted, because, depending on the technology used, some parameters have a large influence on the full delivery of information. This article characterizes a project, currently under development, for a platform that uses data science to evaluate the performance and effectiveness of an industrial network implementing LoRaWAN technology, considering the configuration of its main parameters and relating these parameters to information loss.

Keywords: Internet of Things, LoRa, LoRaWAN, smart cities

Procedia PDF Downloads 148
24173 Cybervetting and Online Privacy in Job Recruitment – Perspectives on the Current and Future Legislative Framework Within the EU

Authors: Nicole Christiansen, Hanne Marie Motzfeldt

Abstract:

In recent years, more and more HR professionals have been using cyber-vetting in job recruitment in an effort to find the perfect match for the company. These practices are growing rapidly, accessing a vast amount of data from social networks, some of which is privileged and protected information. Thus, there is a risk that the right to privacy is becoming a duty to manage your private data. This paper investigates to what degree a job applicant's fundamental rights are adequately protected in current and future legislation in the EU. This paper argues that current data protection regulations and forthcoming regulations on the use of AI ensure sufficient protection. However, even though the regulation on paper protects employees within the EU, the recruitment sector may not pay sufficient attention to the regulation, as it does not specifically target this area. Therefore, the lack of specific labor and employment regulation is a concern that the social partners should attend to.

Keywords: AI, cyber vetting, data protection, job recruitment, online privacy

Procedia PDF Downloads 86
24172 Sequential Pattern Mining from Data of Medical Record with Sequential Pattern Discovery Using Equivalent Classes (SPADE) Algorithm (A Case Study : Bolo Primary Health Care, Bima)

Authors: Rezky Rifaini, Raden Bagus Fajriya Hakim

Abstract:

This research was conducted at the Bolo Primary Health Care (PHC) in Bima Regency. The purpose of the research is to find the association patterns formed in the medical record database of patients of the Bolo PHC. The data used are secondary data from the PHC medical records database. Sequential pattern mining is the method used for the analysis. Transaction data were generated from Patient_ID, Check_Date and diagnosis. Sequential Pattern Discovery Using Equivalent Classes (SPADE) is one of the algorithms for sequential pattern mining; it finds frequent sequences in transaction data using a vertical database and a sequence join process. The results of the SPADE algorithm are frequent sequences, which are then used to form rules. This technique is used to find association patterns between item combinations. Sequential association rule analysis with the SPADE algorithm, with a minimum support of 0.03 and a minimum confidence of 0.75, yielded 3 sequential association patterns based on the sequence of Patient_ID, Check_Date and diagnosis data in the Bolo PHC.
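The vertical database and sequence join that SPADE relies on can be sketched in a few lines. This is a simplified illustration limited to two-item sequences, not the full algorithm with equivalence classes. The function names and the toy visits are hypothetical, and integer visit indices stand in for Check_Date values.

```python
from collections import defaultdict

def vertical_idlists(transactions):
    """Map each diagnosis to its id-list of (patient_id, visit_index) pairs."""
    idlists = defaultdict(set)
    for sid, eid, item in transactions:
        idlists[item].add((sid, eid))
    return idlists

def temporal_join(idlist_a, idlist_b):
    """Id-list of the sequence 'a then b': occurrences of b after an earlier a."""
    first_a = {}
    for sid, eid in idlist_a:
        first_a[sid] = min(eid, first_a.get(sid, eid))
    return {(sid, eid) for sid, eid in idlist_b
            if sid in first_a and eid > first_a[sid]}

def support(idlist, n_sequences):
    # Fraction of patients whose record contains the pattern at least once.
    return len({sid for sid, _ in idlist}) / n_sequences

# Hypothetical visits: (Patient_ID, visit order, diagnosis).
visits = [
    ("P1", 1, "flu"), ("P1", 2, "cough"),
    ("P2", 1, "flu"), ("P2", 2, "cough"),
    ("P3", 1, "flu"), ("P3", 2, "fever"),
    ("P4", 1, "cough"),
]
idlists = vertical_idlists(visits)
flu_then_cough = temporal_join(idlists["flu"], idlists["cough"])
sup_rule = support(flu_then_cough, 4)
confidence = sup_rule / support(idlists["flu"], 4)
```

A rule such as "flu, then cough" would be kept only if its support and confidence reach the chosen minimum thresholds.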

Keywords: diagnosis, primary health care, medical record, data mining, sequential pattern mining, SPADE algorithm

Procedia PDF Downloads 401
24171 Estimation of Reservoirs Fracture Network Properties Using an Artificial Intelligence Technique

Authors: Reda Abdel Azim, Tariq Shehab

Abstract:

The main objective of this study is to develop a subsurface fracture map of naturally fractured reservoirs by overcoming the limitations associated with different data sources in characterising fracture properties. Some of these limitations are overcome by employing a nested neuro-stochastic technique to establish inter-relationships between different data, such as conventional well logs, borehole images (FMI), core description, and seismic attributes, and then characterise fracture properties in terms of fracture density and fractal dimension for each data source. Fracture density is an important property of a fracture network system, as it is a measure of the cumulative area of all the fractures in a unit volume of the system, and fractal dimension is used to characterize self-similar objects such as fractures. At the wellbore locations, fracture density and fractal dimension can only be estimated for limited sections where FMI data are available. Therefore, an artificial intelligence technique is applied to approximate these quantities at locations along the wellbore where hard data are not available. It should be noted that artificial intelligence techniques have proven their effectiveness in this domain of applications.
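To make the notion of fractal dimension concrete, the sketch below uses box counting, one common estimator; it is not the neuro-stochastic technique of the study. The dimension is estimated as the slope of log N(eps) against log(1/eps), where N(eps) is the number of boxes of edge eps that contain at least one point. The function name and the straight-line test trace are hypothetical.

```python
import numpy as np

def box_counting_dimension(points, scales):
    """points: (n, 2) array inside the unit square; scales: box edge lengths."""
    counts = []
    for eps in scales:
        # Assign each point to a box index and count the distinct occupied boxes.
        boxes = np.unique(np.floor(points / eps).astype(int), axis=0)
        counts.append(len(boxes))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(scales)), np.log(counts), 1)
    return slope

# Hypothetical trace: a single straight fracture sampled densely along a diagonal.
t = np.linspace(0.0, 1.0, 4096, endpoint=False)
line = np.column_stack([t, t])
scales = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32]
dim_line = box_counting_dimension(line, scales)
```

A single straight trace gives a dimension close to 1, while a trace map that fills the plane approaches 2; natural fracture networks typically fall in between.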

Keywords: naturally fractured reservoirs, artificial intelligence, fracture intensity, fractal dimension

Procedia PDF Downloads 255
24170 Gamma Irradiated Sodium Alginate and Phosphorus Fertilizer Enhances Seed Trigonelline Content, Biochemical Parameters and Yield Attributes of Fenugreek (Trigonella foenum-graecum L.)

Authors: Tariq Ahmad Dar, Moinuddin, M. Masroor A. Khan

Abstract:

There is a considerable need to enhance the content and yield of active constituents of medicinal plants, in view of their massive demand worldwide. Different strategies have been employed to enhance the active constituents of medicinal plants, and the use of phytohormones has proved effective in this regard. Gamma-irradiated sodium alginate (ISA) is known to elicit an array of plant defense responses and biological activities in plants. Considering the medicinal importance of fenugreek (Trigonella foenum-graecum L.), a pot experiment was conducted to explore the effect of ISA and phosphorus on its growth, yield and quality. ISA spray treatments (0, 40, 80 and 120 mg L−1) were applied alone and in combination with 40 kg P ha−1 (P40). Crop performance was assessed in terms of plant growth characteristics, physiological attributes, seed yield and the content of seed trigonelline. Of the ten treatments, P40 + 80 mg L−1 of ISA proved the best. The results showed that foliar spray of ISA alone or in combination with P40 augmented the plant vegetative growth, enzymatic activities, trigonelline content, trigonelline yield and economic yield of fenugreek. Application of 80 mg L−1 of ISA with P40 gave the best results for almost all the parameters studied compared to the control or to 80 mg L−1 of ISA applied alone. This treatment increased the total content of chlorophyll, carotenoids, leaf -N, -P and -K and trigonelline compared to the control by 24.85 and 27.40%, 15 and 23.52%, 18.70 and 16.84%, 15.88 and 18.92%, 12 and 14.44%, at 60 and 90 DAS respectively. The combined application of 80 mg L−1 of ISA along with P40 resulted in the maximum increase in seed yield, trigonelline content and trigonelline yield by 146, 34 and 232.41%, respectively, over the control.
Gel permeation chromatography revealed the formation of low molecular weight fractions in ISA samples, including oligomers with molecular weights below 20,000, which might be responsible for plant growth promotion in this study. Trigonelline content was determined by reverse phase high performance liquid chromatography (HPLC) with a C-18 column.

Keywords: gamma-irradiated sodium alginate, phosphorus, gel permeation chromatography, HPLC, trigonelline content, yield

Procedia PDF Downloads 321
24169 Governance, Risk Management, and Compliance Factors Influencing the Adoption of Cloud Computing in Australia

Authors: Tim Nedyalkov

Abstract:

A business decision to move to the cloud brings fundamental changes in how an organization develops and delivers its Information Technology solutions. The accelerated pace of digital transformation across businesses and government agencies increases the reliance on cloud-based services. Collecting, managing, and retaining large amounts of data in cloud environments makes information security and data privacy protection essential. It becomes even more important to understand what key factors drive successful cloud adoption following the commencement of the Privacy Amendment (Notifiable Data Breaches) Act 2017 (NDB) in Australia, as the regulatory changes impact many organizations and industries. This quantitative correlational research investigated the governance, risk management, and compliance factors contributing to cloud security success. These factors influence the adoption of cloud computing within an organizational context after the commencement of the NDB scheme. The results and findings demonstrated that corporate information security policies, data storage location, management understanding of data governance responsibilities, and regular compliance assessments are the factors influencing cloud computing adoption. The research has implications for organizations, future researchers, practitioners, policymakers, and cloud computing providers in meeting the rapidly changing regulatory and compliance requirements.

Keywords: cloud compliance, cloud security, data governance, privacy protection

Procedia PDF Downloads 116