Search results for: data infrastructure
24569 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features
Authors: Bushra Zafar, Usman Qamar
Abstract:
Large sample sizes and high dimensionality reduce the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for collecting knowledge from a variety of databases; they provide supervised learning in the form of classification to design models that describe vital data classes, where the structure of the classifier is based on the class attribute. Classification efficiency and accuracy are often influenced to a great extent by noisy and undesirable features in real application data sets. The inherent nature of a data set greatly masks its quality analysis and leaves us with quite few practical approaches to use. To our knowledge, we present for the first time a new approach for investigating the structure and quality of datasets by providing a targeted analysis of the localization of noisy and irrelevant features. Machine learning relies on feature selection as a pre-processing step, which allows us to select a small subset of features from the full set by reducing the space according to a certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching for a small set of important features that may result in good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is used, with an external classifier for discriminative feature selection. A feature is selected based on its number of occurrences in the chosen chromosomes. A sample dataset has been used to demonstrate the proposed idea effectively. The proposed method achieved an average accuracy of about 95% across different datasets. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
Keywords: data mining, genetic algorithm, KNN algorithms, wrapper based feature selection
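The occurrence-based selection step mentioned in this abstract can be sketched roughly as follows. This is only an illustration: the abstract does not give the chromosome encoding or the threshold, so the bitmask representation, the `min_count` parameter, and the feature names are all assumptions, and the genetic algorithm and external classifier that would produce the chosen chromosomes are not shown.

```python
from collections import Counter


def select_by_occurrence(chromosomes, feature_names, min_count):
    """Keep features that appear in at least `min_count` chosen chromosomes.

    Each chromosome is assumed to be a 0/1 mask over `feature_names`
    (a hypothetical encoding; the abstract does not specify one).
    """
    counts = Counter()
    for chromosome in chromosomes:
        for name, bit in zip(feature_names, chromosome):
            if bit:
                counts[name] += 1
    # Preserve the original feature order in the result.
    return [name for name in feature_names if counts[name] >= min_count]


# Hypothetical feature names and chromosomes, for illustration only.
features = ["age", "bp", "glucose", "bmi"]
chosen = [
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
selected = select_by_occurrence(chosen, features, min_count=3)
print(selected)
```

Here "age" occurs in three chromosomes and "glucose" in four, so only those two pass a threshold of three.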
Procedia PDF Downloads 320
24568 Improve Student Performance Prediction Using Majority Vote Ensemble Model for Higher Education
Authors: Wade Ghribi, Abdelmoty M. Ahmed, Ahmed Said Badawy, Belgacem Bouallegue
Abstract:
In higher education institutions, the most pressing priority is to improve student performance and retention. Large volumes of student data are used in Educational Data Mining techniques to find new hidden information in students' learning behavior, particularly to uncover the early symptoms of at-risk pupils. On the other hand, data with noise, outliers, and irrelevant information may lead to incorrect conclusions. By identifying features of students' data that have the potential to improve performance prediction results, comparing and identifying the most appropriate ensemble learning technique after preprocessing the data, and optimizing the hyperparameters, this paper aims to develop a reliable student performance prediction model for Higher Education Institutions. Data was gathered from two different systems: a student information system and an e-learning system for undergraduate students in the College of Computer Science of a Saudi Arabian State University. The cases of 4413 students were used in this article. The process includes data collection, data integration, data preprocessing (such as cleaning, normalization, and transformation), feature selection, pattern extraction, and, finally, model optimization and assessment. Random Forest, Bagging, Stacking, Majority Vote, and two types of Boosting techniques, AdaBoost and XGBoost, are the ensemble learning approaches, whereas Decision Tree, Support Vector Machine, and Artificial Neural Network are the supervised learning techniques. Hyperparameters for ensemble learning systems will be fine-tuned to provide enhanced performance and optimal output. The findings imply that combining features of students' behavior from e-learning and students' information systems using Majority Vote produced better outcomes than the other ensemble techniques.
Keywords: educational data mining, student performance prediction, e-learning, classification, ensemble learning, higher education
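As a rough illustration of the Majority Vote idea named in this abstract, the sketch below combines the label predictions of several base classifiers by hard voting. The per-model predictions are made-up placeholders (no real student data or trained models are involved), and the tie-breaking rule is an assumption rather than something the abstract specifies.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the base classifiers' votes.

    Ties go to the label seen first, because Counter.most_common keeps
    first-encountered order among equal counts (an assumed tie rule).
    """
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(per_model_predictions):
    """Combine per-model prediction lists sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of three base classifiers for four students.
tree_preds = ["pass", "fail", "pass", "fail"]
svm_preds = ["pass", "pass", "pass", "fail"]
ann_preds = ["fail", "fail", "pass", "pass"]

combined = ensemble_predict([tree_preds, svm_preds, ann_preds])
print(combined)
```

Each student's final label is whichever outcome at least two of the three models agree on.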
Procedia PDF Downloads 112
24567 Foundation of the Information Model for Connected-Cars
Authors: Hae-Won Seo, Yong-Gu Lee
Abstract:
Recent progress in the next generation of automobile technology is geared towards incorporating information technology into cars. Collectively called smart cars, these vehicles bring intelligence that provides comfort, convenience and safety. A branch of smart cars is the connected-car system. The key concept in connected cars is the sharing of driving information among cars in a decentralized manner, enabling collective intelligence. This paper proposes a foundation of the information model that is necessary to define the driving information for smart cars. Road conditions are modeled through a unique data structure that unambiguously represents the time-variant traffic in the streets. Additionally, the modeled data structure is exemplified in a navigational scenario and usage using UML. Optimal driving route searching under dynamically changing road conditions is also discussed using the proposed data structure.
Keywords: connected-car, data modeling, route planning, navigation system
Procedia PDF Downloads 378
24566 Natural Hazards and Their Costs in Albanian Part of Ohrid Graben
Authors: Mentor Sulollari
Abstract:
According to the 2015 studies of the United Nations University Institute for Environment and Human Security (UNU-EHS), Albania is listed as the number one country in Europe for the likelihood of being struck by natural catastrophes. This is conditioned by unstudied human activity, which has seriously damaged the environment. The Albanian part of the Ohrid graben, which lies in the southeast of Albania, is endangered by landslides and floods as a result of uncontrolled urban development, a low level of investment in infrastructure, rugged terrain in its western part, and a capricious climate caused by global warming. To deal with natural disasters, which cause casualties and material damage, it is important to study them in order to anticipate and reduce damage in the future. Part of this study is the construction of a natural hazards map, which shows where the hazards are distributed and which areas are vulnerable. This article also deals with the socio-economic and environmental costs of those events and the measures to be taken to reduce them.
Keywords: flooding, landslides, natural catastrophes mapping, Pogradec, lake Ohrid, Albanian part of Ohrid graben
Procedia PDF Downloads 301
24565 The Fiscal and Macroeconomic Impacts of Reforming Energy Subsidy Policy in Malaysia
Authors: Nora Yusma Bte Mohamed Yusoff, Hussain Ali Bekhet
Abstract:
A gradual subsidy rationalization and reform plan has been set out by the Malaysian government to achieve the high-income nation target. This paper attempts to analyze the impacts of energy subsidy reform policy on the fiscal deficit and macroeconomic variables in Malaysia. The Computable General Equilibrium (CGE) Model is employed. Three simulations based on different groups of scenarios have been developed. Importantly, the overall results indicate that removal of the fuel subsidy has significantly improved real GDP and reduced the government fiscal deficit. On the other hand, the removal of the fuel subsidy has increased most of the local commodity prices, especially those of energy commodities. The findings of the study could provide some imperative inputs for policy makers, especially in identifying the right policy mechanism. In particular, such a mechanism would ensure that the savings from subsidy removal could be transferred back into the domestic economy in the form of infrastructure development, compensation, and increases in other sectors' output contributions towards sustainable economic growth.
Keywords: CGE, deficit, energy, reform, subsidy
Procedia PDF Downloads 269
24564 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data
Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad
Abstract:
Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors consist of important geographical information that can be used for remote sensing applications such as region planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered a classification problem, so this task can be performed using machine-learning techniques. Although many machine-learning algorithms exist, the classification is done using supervised classifiers such as Support Vector Machines (SVM), as the area of interest is known. We propose a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications precisely according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset has been created for training and testing purposes; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrate the benefits of using knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.
Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction
Procedia PDF Downloads 342
24563 Automated Multisensory Data Collection System for Continuous Monitoring of Refrigerating Appliances Recycling Plants
Authors: Georgii Emelianov, Mikhail Polikarpov, Fabian Hübner, Jochen Deuse, Jochen Schiemann
Abstract:
Recycling refrigerating appliances plays a major role in protecting the Earth's atmosphere from ozone depletion and emissions of greenhouse gases. The performance of refrigerator recycling plants in terms of material retention is the subject of strict environmental certifications and is reviewed periodically through specialized audits. The continuous collection of refrigerator data required for the input-output analysis is still mostly manual, error-prone, and not digitalized. In this paper, we propose an automated data collection system for recycling plants in order to deduce expected material contents in individual end-of-life refrigerating appliances. The system utilizes laser scanner measurements and optical data to extract attributes of individual refrigerators by applying transfer learning with pre-trained vision models and optical character recognition. Based on the recognized features, the system automatically provides material categories and target values of contained material masses, especially foaming and cooling agents. The presented data collection system paves the way for continuous performance monitoring and efficient control of refrigerator recycling plants.
Keywords: automation, data collection, performance monitoring, recycling, refrigerators
Procedia PDF Downloads 169
24562 Sales Patterns Clustering Analysis on Seasonal Product Sales Data
Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho
Abstract:
As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.
Keywords: clustering, distribution, sales pattern, seasonal product
Procedia PDF Downloads 606
24561 Competitor Analysis to Quantify the Benefits and for Different Use of Transport Infrastructure
Authors: Dimitrios J. Dimitriou, Maria F. Sartzetaki
Abstract:
Different transportation modes have key operational advantages and disadvantages, providing a variety of transport options to users and passengers. This paper reviews key variables for the competition between air transport and other transport modes. Its aim is to present results in terms of perceived cost for the users, for destinations where all transport modes are highly competitive. The competitor analysis variables include the cost and time outputs for each transport option, highlighting the level of competitiveness on high-demand Origin-Destination (OD) corridors. The case study presents the output of such an analysis for the OD corridor in Greece that connects the capital city (Athens) with the second largest city (Thessaloniki), considering the different transport modes (air, train, road). The intent is to present an easy-to-handle tool for planners, managers and decision makers concerned with pricing policy effectiveness and demand attractiveness, one that is appropriate for use in other similar cases.
Keywords: competitor analysis, transport economics, transport generalized cost, quantitative modelling
Procedia PDF Downloads 251
24560 Probability Sampling in Matched Case-Control Study in Drug Abuse
Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell
Abstract:
Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using the bootstrapping method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. The areas under the ROC curves for the model derived from the random-sample data set were similar when it was fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.
Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling
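The bootstrap precision check described in the Method and Results can be illustrated with a much simpler stand-in. The sketch below resamples a small data set 100 times with replacement and reports the spread of a statistic across resamples. Note the substitution: the study bootstraps the beta coefficients of a conditional logistic regression, whereas this example uses a plain sample mean, and the data values, function name, and seed are hypothetical.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic, n_resamples=100, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement.

    Returns the standard deviation of the bootstrap estimates together
    with the estimates themselves.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_resamples):
        resample = [rng.choice(data) for _ in data]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates), estimates


# Hypothetical risk-factor scores; not taken from the study.
scores = [2.1, 3.4, 1.9, 4.2, 2.8, 3.1, 2.5, 3.9, 2.2, 3.6]
se, estimates = bootstrap_standard_error(scores, statistics.mean)
print(f"bootstrap SE of the mean over {len(estimates)} resamples: {se:.3f}")
```

A wide spread in the bootstrap estimates, as reported for the snowball sample, signals an imprecise model; a narrow spread signals a stable one.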
Procedia PDF Downloads 496
24559 Energy-efficient Buildings In Construction Industry Using Fly Ash-based Geopolymer Technology
Authors: Maryam Kiani
Abstract:
The aim of this study was to investigate the influence of a nanoparticle additive on the properties of fly ash-based geopolymer. The geopolymer samples were prepared using fly ash as the primary source material, along with an alkali activator solution and different concentrations of carbon black additive. The effects of the nanoparticles on the flexural strength, water absorption, and micro-structural properties of the cured samples were examined. The results revealed that the inclusion of the nanoparticle additive significantly enhanced the mechanical and electrical properties of the geopolymer binder. Micro-structural analysis using scanning electron microscopy (SEM) revealed a more compact and homogeneous structure in the geopolymer samples with nanoparticles. The dispersion of nanoparticles within the geopolymer matrix was observed, suggesting improved inter-particle bonding and increased density. Overall, this study demonstrates the positive impact of the nanoparticle additive on the qualities of fly ash-based geopolymer, emphasizing its potential as an effective enhancer for geopolymer binder applications in the development of construction and infrastructure for energy buildings.
Keywords: fly-ash, geopolymer, energy buildings, nanotechnology
Procedia PDF Downloads 98
24558 Broadening the Public Sphere: Examining the Role of Community Radio in Fostering Participatory Democracy in Selected Communities in Ondo State, Nigeria
Authors: John Ibanga
Abstract:
Since May 1999, when Nigeria returned to uninterrupted democratic rule, there have been various attempts by successive governments at committing themselves to democratic ideals. Such efforts include a revision of communication policies after repeated calls by civil society organisations, development partners, researchers, and academics to allow not only the commencement of campus radio broadcasting but also the takeoff of community radio broadcasting. Thus, in 2015, operating licenses were granted to several communities spread across the six geopolitical zones in the country for the establishment of community radio stations, culminating in the establishment of the first community radio in Nigeria on July 17, 2015. Since citizens’ involvement in policy matters and governance is one of the tenets of participatory democracy, it becomes imperative to investigate how the emerging community radio sector in Nigeria is facilitating participatory democracy among Nigerians, even in the face of attempts by the present government to silence all dissenting voices. This study, therefore, examines how residents in Ondo State, Southwest Nigeria, are utilising programmes on the Ejule Nen and Kakaaki community radio stations to deepen participatory democracy. Many of the existing studies on the role of community radio in participatory democracy and citizens' engagement efforts miss out on Nigeria because of the delayed implementation of community radio policy there, even though Nigeria is Africa’s most populous nation as well as a major player in the affairs of the African continent. The participatory communication and communication infrastructure theories were used as the framework, and data were collected from in-depth interviews with staff of the community radio stations and community leaders, focus group discussions with the community residents, and qualitative content analysis of programmes on the stations.
The residents used the community radio stations as platforms for demanding accountability from government, mobilising resources for the execution of a number of community projects, promoting credible electoral practices, and influencing the implementation of free education policy in their communities. Hence the community radio stations became the reliable and authoritative voices of residents for participating in the public sphere and, generally, the democratic process.
Keywords: community, community radio, democracy, participatory democracy
Procedia PDF Downloads 130
24557 Appropriate Legal System for Protection of Plant Innovations in Afghanistan
Authors: Mohammad Reza Fooladi
Abstract:
Because of the importance and effect of plant innovations on the economy, industry, and especially agriculture, they have been at the core of legislators' attention at the national level and have been a topic of international documents related to intellectual innovations in recent decades. For the protection of plant innovations, two legal systems have been considered: a particular system based on the International Convention for the Protection of New Varieties of Plants, and the patent system. The ease of access to protection and the level of protection differ between these systems. Our attempt in this paper, in addition to describing and analyzing the characteristics of each system, is to suggest the system most compatible with the industry and agriculture of Afghanistan. Due to the lack of sufficient industrial infrastructure and academic research, the particular system based on the International Convention for the Protection of New Varieties of Plants is suggested. At the same time, appropriate industrial and legal infrastructures, as well as laboratories and research centers, should be provided so that plant innovations can also be supported under the patent system.
Keywords: new varieties of plant, patent, agriculture, Afghanistan
Procedia PDF Downloads 336
24556 Evaluating the Effectiveness of Science Teacher Training Programme in National Colleges of Education: a Preliminary Study, Perceptions of Prospective Teachers
Authors: A. S. V Polgampala, F. Huang
Abstract:
This is an overview of what is entailed in an evaluation and of the issues to be aware of when class observation is being done. This study examined the effects of evaluating the teaching practice of a 7-day ‘block teaching’ session in a pre-service science teacher training program at a reputed National College of Education in Sri Lanka. Effects were assessed in three areas: evaluation of the training process, evaluation of the training impact, and evaluation of the training procedure. Data for this study were collected by class observation of 18 teachers from 9 to 16 February 2017. The participants of the study, prospective science teachers, were evaluated based on a format newly introduced by the NIE. The data collected were analyzed qualitatively using the Miles and Huberman procedure for analyzing qualitative data: data reduction, data display and conclusion drawing/verification. It was observed that the trainees showed confidence in teaching those competencies and skills. Teacher educators’ dissatisfaction has had a great impact on the evaluation process.
Keywords: evaluation, perceptions & perspectives, pre-service, science teaching
Procedia PDF Downloads 316
24555 Multiband Microstrip Slotted Patch Antenna for mmWave 5G Femtocell Applications
Authors: Bhargavi G., Arathi R. Shankar
Abstract:
Bringing the transmitter and receiver closer to each other creates the twin benefits of higher-quality links and more spatial reuse. In a network with nomadic users, this inevitably involves deploying more infrastructure, typically in the form of microcells, hot spots, distributed antennas, or relays. A less expensive alternative is the recent concept of femtocells, also known as home base stations, which are data access points installed by home users to get better indoor voice and data coverage. Femtocells have the potential to offer high-quality network access to indoor users at low cost, while simultaneously reducing the load. Present femtocells that operate in 4G can also be extended to the 5G sub-6 GHz band. Designing the femtocell in the mmWave band of 5G may have many advantages in terms of bandwidth availability and coverage. Multiband microstrip patch antennas can be considered low-cost and prominent antennas for designing femtocells because a single antenna supports multiple frequencies.
Keywords: 5G, mmWave, antennas, wireless communications, femtocell
Procedia PDF Downloads 77
24554 Experiences of Social Participation among Community Elderly with Mild Cognitive Impairment: A Qualitative Research
Abstract:
Mild cognitive impairment (MCI) is a clinical stage that occurs between normal aging and dementia. Although MCI increases the risk of developing dementia, individuals with MCI may maintain stable cognitive function and even recover to a typical cognitive state. An intervention to prevent or delay the progression to dementia in individuals with MCI may involve promoting social engagement. Social participation is engagement in socially relevant exchanges and meaningful activities. Older adults with MCI may encounter restricted cognitive abilities, mood changes, and behavioral difficulties during social participation, which influence their willingness to engage. Therefore, this study aims to employ qualitative research methods to gain an in-depth comprehension of the authentic social participation experiences of older adults with mild cognitive impairment, which will establish a foundation for designing appropriate intervention programs. A phenomenological study was conducted. The study participants were selected using the purposive sampling method in combination with the maximum differentiation sampling strategy. Face-to-face semistructured interviews were conducted with 12 elderly individuals with mild cognitive impairment in a community in Zhengzhou City from May to July 2023. The Colaizzi 7-step method was used to analyze the data and extract the themes. The real experience of social participation in older adults with mild cognitive impairment can be summarized into 3 themes: (1) a single social relationship but a strong desire to participate, (2) a dual experience of social participation with both positive and negative aspects, and (3) multiple barriers to social participation, including impaired memory capacity, heavy family responsibilities and lack of infrastructure. The study found that elderly individuals with mild cognitive impairment and a single social relationship display an increased desire to engage in society.
To improve social participation levels and reduce cognitive function decline, healthcare providers should work with relevant government agencies and the community to create a comprehensive social participation system. It is important for healthcare providers to note the social participation status of the elderly with mild cognitive impairment.
Keywords: mild cognitive impairment, the elderly, social participation, qualitative research
Procedia PDF Downloads 99
24553 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm
Authors: Sukhleen Kaur
Abstract:
The Intrusion Detection System (IDS) has become an important component of security infrastructure and has received increasing attention in recent years. An IDS is an effective way to detect different kinds of attacks and malicious code in a network and helps us to secure the network. Data mining techniques can be applied to an IDS to analyse large amounts of data and give better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far, studies have been carried out on finding attacks, but this paper detects malicious files. Some intruders do not attack directly; instead, they hide harmful code inside files or corrupt those files and thereby attack the system. These files are detected according to some defined parameters, which form two lists of files: normal files and harmful files. After that, data mining is performed. In this paper, a hybrid classifier has been built from the Naive Bayes and Ripper classification methods. The results show how an uploaded file in the database is tested against the parameters and then characterised as either a normal or a harmful file, after which the mining is performed. Moreover, when a user tries to mine a harmful file, the system generates an exception stating that mining cannot be performed on corrupted or harmful files.
Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper
Procedia PDF Downloads 416
24552 Cost-Effective Hybrid Cloud Framework for Higher Educational Institutes
Authors: Shah Muhammad Butt, Ahmed Masaud Ansair
Abstract:
The present financial crisis in Higher Educational Institutes (HEIs) is causing many problems, such as considerable budget cuts, which make it difficult to meet the ever-growing IT-based research and learning needs. Institutions are rapidly planning and promoting cloud-based approaches for their academic and research needs. A cost-effective hybrid cloud framework for HEIs will provide educational services for campus or intercampus communication. The hybrid cloud framework comprises private and public cloud approaches. This paper proposes a framework based on the Open Source Cloud (OpenNebula for virtualization, Eucalyptus for infrastructure, and Aneka for the programming development environment) combined with CSP services that are delivered to the end user via the internet from public clouds such as Google, Microsoft, Zoho, and Salesforce.
Keywords: educational services, hybrid campus cloud, open source, higher educational institutes
Procedia PDF Downloads 488
24551 Generalized Approach to Linear Data Transformation
Authors: Abhijith Asok
Abstract:
This paper presents a generalized approach for the simple linear data transformation, Y=bX, through an integration of multidimensional coordinate geometry, vector space theory and polygonal geometry. The scaling is performed by adding an additional ‘Dummy Dimension’ to the n-dimensional data, which helps plot two-dimensional component-wise straight lines on pairs of dimensions. The end result is a set of scaled extensions of observations in any of the 2^n spatial divisions, where n is the total number of applicable dimensions/dataset variables, created by shifting the n-dimensional plane along the ‘Dummy Axis’. The derived scaling factor was found to depend on the coordinates of the common point of origin for the diverging straight lines and on the plane of extension, chosen on and perpendicular to the ‘Dummy Axis’, respectively. This result indicates the geometrical interpretation of a linear data transformation and hence offers opportunities for a more informed choice of the factor ‘b’, based on a better choice of these coordinate values. The paper goes on to identify the effect of this transformation on certain popular distance metrics; for many of them, the distance metric retained the same scaling factor as that of the features.
Keywords: data transformation, dummy dimension, linear transformation, scaling
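The closing observation about distance metrics can be checked numerically. The sketch below is not the paper's geometric construction; it only applies Y=bX to two hypothetical points and compares distances before and after. Euclidean and Manhattan distances both scale by |b|, which is consistent with the statement that many metrics retain the features' scaling factor.

```python
import math


def scale(point, b):
    """Apply the linear transformation Y = bX to one observation."""
    return [b * x for x in point]


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - c) for a, c in zip(p, q))


# Hypothetical observations and scaling factor.
p = [1.0, 2.0, 3.0]
q = [4.0, 6.0, 3.0]
b = 2.5

euclid_ratio = math.dist(scale(p, b), scale(q, b)) / math.dist(p, q)
manhattan_ratio = manhattan(scale(p, b), scale(q, b)) / manhattan(p, q)
print(euclid_ratio, manhattan_ratio)
```

Both ratios come out equal to b for these points; with a negative b they would equal its absolute value.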
Procedia PDF Downloads 303
24550 A Comparative Assessment of Information Value, Fuzzy Expert System Models for Landslide Susceptibility Mapping of Dharamshala and Surrounding, Himachal Pradesh, India
Authors: Kumari Sweta, Ajanta Goswami, Abhilasha Dixit
Abstract:
A landslide is a geomorphic process that plays an essential role in the evolution of the hill-slope and in long-term landscape evolution. But its abrupt nature and the associated catastrophic forces can have undesirable socio-economic impacts, such as substantial economic losses, fatalities, and ecosystem, geomorphologic and infrastructure disturbances. The estimated fatality rate is approximately 1 person/100 sq. km, and the average economic loss is more than 550 crores/year in the Himalayan belt due to landslides. This study presents a comparative performance of a statistical bivariate method and a machine learning technique for landslide susceptibility mapping in and around Dharamshala, Himachal Pradesh. The final landslide susceptibility maps (LSMs) with better accuracy could be used for land-use planning to prevent future losses. Dharamshala, a part of the North-western Himalaya, is one of the fastest-growing tourism hubs, with a total population of 30,764 according to the 2011 census, and is among the hundred Indian cities to be developed as smart cities under the PM’s Smart Cities Mission. A total of 209 landslide locations were identified using high-resolution linear imaging self-scanning (LISS IV) data. The thematic maps of parameters influencing landslide occurrence were generated using remote sensing and other ancillary data in the GIS environment. The landslide causative parameters used in the study are slope angle, slope aspect, elevation, curvature, topographic wetness index, relative relief, distance from lineaments, land use land cover, and geology. LSMs were prepared using the information value (Info Val) and Fuzzy Expert System (FES) models. Info Val is a statistical bivariate method in which information values are calculated as the ratio of the landslide pixels per factor class (Si/Ni) to the total landslide pixels per parameter (S/N).
Using this information values all parameters were reclassified and then summed in GIS to obtain the landslide susceptibility index (LSI) map. The FES method is a machine learning technique based on ‘mean and neighbour’ strategy for the construction of fuzzifier (input) and defuzzifier (output) membership function (MF) structure, and the FR method is used for formulating if-then rules. Two types of membership structures were utilized for membership function Bell-Gaussian (BG) and Trapezoidal-Triangular (TT). LSI for BG and TT were obtained applying membership function and if-then rules in MATLAB. The final LSMs were spatially and statistically validated. The validation results showed that in terms of accuracy, Info Val (83.4%) is better than BG (83.0%) and TT (82.6%), whereas, in terms of spatial distribution, BG is best. Hence, considering both statistical and spatial accuracy, BG is the most accurate one.Keywords: bivariate statistical techniques, BG and TT membership structure, fuzzy expert system, information value method, machine learning technique
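As a rough illustration of the information value calculation described in this abstract, the Python sketch below computes the ratio (Si/Ni)/(S/N) for each class of a single causative factor. The pixel counts are made-up numbers and the function name is hypothetical. The natural logarithm is also returned because many formulations of the method take the log of this ratio; the abstract itself only states the ratio.

```python
import math

def information_values(landslide_pixels, class_pixels):
    """Return a list of (ratio, log_ratio) pairs, one per factor class.

    landslide_pixels[i] is Si, the landslide pixels falling in class i.
    class_pixels[i] is Ni, all pixels falling in class i.
    """
    S = sum(landslide_pixels)   # total landslide pixels for this parameter
    N = sum(class_pixels)       # total pixels for this parameter
    overall_density = S / N
    results = []
    for Si, Ni in zip(landslide_pixels, class_pixels):
        ratio = (Si / Ni) / overall_density
        # Assumption: log-transform as in common formulations of the method.
        log_ratio = math.log(ratio) if ratio > 0 else float("-inf")
        results.append((ratio, log_ratio))
    return results

# Hypothetical counts for three slope-angle classes (not from the study).
slope_Si = [10, 40, 50]
slope_Ni = [5000, 3000, 2000]
iv = information_values(slope_Si, slope_Ni)
for class_index, (ratio, log_ratio) in enumerate(iv):
    print(class_index, round(ratio, 3), round(log_ratio, 3))
```

A ratio above 1 (positive log value) marks a class where landslides are denser than average. Summing the per-class values over all parameters for each pixel would give a susceptibility index in the spirit of the LSI mentioned above.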
Procedia PDF Downloads 132
24549 Using Learning Apps in the Classroom
Authors: Janet C. Read
Abstract:
UClan set up a collaboration with Lingokids to assess the impact of the Lingokids learning app on learning outcomes in UK classrooms for children aged 3 to 5 years. Data gathered during the controlled study with 69 children include attitudinal data, engagement, and learning scores. The data show that children's enjoyment while learning was higher among those using the game-based app than among those using traditional methods. Notably, among older children, engagement when using the learning app was significantly higher than with traditional methods. According to existing literature, there is a direct correlation between engagement, motivation, and learning. Therefore, this study provides relevant data points to conclude that the Lingokids learning app serves its purpose of encouraging learning through playful and interactive content. That being said, we believe that learning outcomes should be assessed with a wider range of methods in further studies. Likewise, it would be beneficial to assess the usability and playability of the app in order to evaluate it from other angles.
Keywords: learning app, learning outcomes, rapid test activity, Smileyometer, early childhood education, innovative pedagogy
Procedia PDF Downloads 75
24548 Road Safety in the Great Britain: An Exploratory Data Analysis
Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari
Abstract:
Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse Great Britain's road safety situation and to show response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. In this paper, we carry out an exploratory data analysis using STATS19 data. For the past 30 years, the UK has had a good record in reducing fatalities. The UK ranked third based on the number of road deaths per million inhabitants. Around 165,000 accidents were reported in Great Britain in 2009, and the number decreased every year until 2019, when it was under 120,000. The government continues to scale back road deaths by empowering responsible road users and by identifying and prosecuting the parameters that make the roads less safe.
Keywords: road safety, data analysis, openstreetmap, feature expanding.
Procedia PDF Downloads 144
24547 Intrusion Detection System Using Linear Discriminant Analysis
Authors: Zyad Elkhadir, Khalid Chougdali, Mohammed Benattou
Abstract:
Most existing intrusion detection systems (IDS) work on quantitative network traffic data with many irrelevant and redundant features, which makes the detection process more time-consuming and inaccurate. Several feature extraction methods, such as linear discriminant analysis (LDA), have been proposed. However, LDA suffers from the small sample size (SSS) problem, which occurs when the number of training samples is small compared with the dimension of the samples. Hence, classical LDA cannot be applied directly to high-dimensional data such as network traffic data. In this paper, we propose two solutions to the SSS problem for LDA and apply them to a network IDS. The first method reduces the dimension of the original data using principal component analysis (PCA) and then applies LDA. In the second solution, we propose to use the pseudo-inverse to avoid the singularity of the within-class scatter matrix caused by the SSS problem. After that, the KNN algorithm is used for classification. We chose two well-known datasets, KDDcup99 and NSLKDD, for testing the proposed approaches. Results showed that the classification accuracy of the PCA+LDA method clearly outperforms that of the pseudo-inverse LDA method when large training data are available.
Keywords: LDA, Pseudoinverse, PCA, IDS, NSL-KDD, KDDcup99
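The two workarounds for the SSS problem described in this abstract can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic (two Gaussian classes in 50 dimensions with fewer training samples than dimensions), all function names are hypothetical, and the KNN classifier is a plain majority vote.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        gap = (class_mean - overall_mean).reshape(-1, 1)
        Sb += len(Xc) * (gap @ gap.T)
    return Sw, Sb

def lda_directions(X, y, n_components, use_pinv):
    """Leading eigenvectors of inv(Sw) @ Sb, or pinv(Sw) @ Sb if Sw is singular."""
    Sw, Sb = scatter_matrices(X, y)
    Sw_inv = np.linalg.pinv(Sw, rcond=1e-10) if use_pinv else np.linalg.inv(Sw)
    eigvals, eigvecs = np.linalg.eig(Sw_inv @ Sb)
    top = np.argsort(-eigvals.real)[:n_components]
    return eigvecs[:, top].real

def pca_directions(X, n_components):
    """Leading principal directions from the SVD of the centered data."""
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:n_components].T

def knn_predict(train_Z, train_y, test_Z, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    preds = []
    for z in test_Z:
        dist = np.linalg.norm(train_Z - z, axis=1)
        nearest = train_y[np.argsort(dist)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)

# Synthetic stand-in for traffic features: 20 training samples, 50 dimensions.
rng = np.random.default_rng(0)

def make_data(n_per_class, dim=50):
    X0 = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    X1 = rng.normal(5.0, 1.0, size=(n_per_class, dim))
    return np.vstack([X0, X1]), np.array([0] * n_per_class + [1] * n_per_class)

X_train, y_train = make_data(10)
X_test, y_test = make_data(10)

# Approach 1: PCA first, so that Sw is invertible in the reduced space.
P = pca_directions(X_train, 5)
W1 = lda_directions(X_train @ P, y_train, 1, use_pinv=False)
pred_pca_lda = knn_predict(X_train @ P @ W1, y_train, X_test @ P @ W1)

# Approach 2: keep the original space and replace inv(Sw) with its pseudo-inverse.
W2 = lda_directions(X_train, y_train, 1, use_pinv=True)
pred_pinv_lda = knn_predict(X_train @ W2, y_train, X_test @ W2)

acc_pca_lda = (pred_pca_lda == y_test).mean()
acc_pinv_lda = (pred_pinv_lda == y_test).mean()
print(acc_pca_lda, acc_pinv_lda)
```

The explicit `rcond` is a design choice for this sketch: it keeps numerically zero singular values of the singular scatter matrix from being inverted. The synthetic classes are well separated, so the accuracies printed here say nothing about the relative performance reported for KDDcup99 and NSL-KDD.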
Procedia PDF Downloads 233
24546 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — in the Case of Critical Dataset Size —
Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno
Abstract:
STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induce if-then rules from a decision table, which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducing true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data.
Keywords: rule induction, decision table, missing data, noise
Procedia PDF Downloads 398
24545 Discovering Causal Structure from Observations: The Relationships between Technophile Attitude, Users Value and Use Intention of Mobility Management Travel App
Authors: Aliasghar Mehdizadeh Dastjerdi, Francisco Camara Pereira
Abstract:
The increasing complexity and demand of transport services strain transportation systems, especially in urban areas with limited possibilities for building new infrastructure. The solution to this challenge requires changes in travel behavior. One of the proposed means to induce such change is multimodal travel apps. This paper describes a study of the intention to use a real-time multimodal travel app aimed at motivating travel behavior change in the Greater Copenhagen Region (Denmark) toward sustainable transport options. The proposed app is a multi-faceted smartphone app including both travel information and persuasive strategies such as health and environmental feedback, tailoring travel options, self-monitoring, tunneling users toward green behavior, social networking, nudging and gamification elements. The potential of mobility management travel apps to stimulate sustainable mobility rests not only on the original and proper employment of behavior change strategies, but also on explicitly anchoring them in established theoretical constructs from behavioral theories. The theoretical foundation is important because it positively and significantly influences the effectiveness of the system. However, there is a gap in current knowledge regarding the study of mobility management travel apps with support in behavioral theories, which should be explored further. This study addresses this gap through an examination based on social cognitive theory. However, compared to conventional methods in technology adoption research, this study adopts a reverse approach in which the associations between theoretical constructs are explored by the Max-Min Hill-Climbing (MMHC) algorithm, a hybrid causal discovery method. A technology-use preference survey was designed to collect data. The survey elicited different groups of variables, including (1) three groups of users' motives for using the app: gain motives (e.g., saving travel time and cost), hedonic motives (e.g., enjoyment) and normative motives (e.g., less travel-related CO2 production), (2) technology-related self-concepts (i.e., technophile attitude) and (3) use intention of the travel app. The questionnaire items were then used in causal relationship discovery to learn the causal structure of the data. Discovering causal relationships from observational data is a critical challenge with applications in different research fields. The estimated causal structure shows that the two constructs of gain motives and technophilia have a causal effect on adoption intention. Likewise, there is a causal relationship from technophilia to both gain and hedonic motives. In line with the findings of prior studies, this highlights the importance of the functional value of the travel app and of technology self-concept as two important variables for adoption intention. Furthermore, the results indicate the effect of technophile attitude on developing gain and hedonic motives. The causal structure shows hierarchical associations between the three groups of users' motives. These can be explained by the "frustration-regression" principle of Alderfer's ERG (Existence, Relatedness and Growth) theory of needs, meaning that when a higher-level need remains unfulfilled, a person may regress to lower-level needs that appear easier to satisfy. To conclude, this study shows the capability of causal discovery methods to learn the causal structure of a theoretical model and, accordingly, to interpret established associations.
Keywords: travel app, behavior change, persuasive technology, travel information, causality
Procedia PDF Downloads 148
24544 Stuck Down in the Mess of Aisles: Need of a Practical Consumer Welfare Policy Framework in Sri Lanka with Special Reference to Japan
Authors: E. N. R. de Silva
Abstract:
The main purpose of this research is to set a policy framework for establishing a legal, institutional and social infrastructure that enhances the welfare, health, safety and economic interests of consumers in Sri Lanka. It will help to develop an approach to continuously and successfully advocate for a consumer protection legal reform agenda, and it is also significant because it gives directions for creating national consumer protection associations in Sri Lanka. The methodology adopted for this research is purely qualitative and is divided into a general part and a specific part. The general part looks at the existing laws and regulations and how effective they are in protecting consumers. The specific part analyzes the consumer protection framework in Japan, in particular the consumer protection enhanced by public organizations. This research offers a model with methods and legal instruments that enable advocacy groups to enhance consumer welfare, and it also brings out reforms to be made in the national legal framework on consumer welfare.
Keywords: consumer protection association, consumer protection law, consumer welfare, legal framework
Procedia PDF Downloads 375
24543 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services
Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme
Abstract:
Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, and user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.
Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing
Procedia PDF Downloads 118
24542 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform
Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu
Abstract:
Given a fixed fund, choosing between fewer hosts of higher capability and more hosts of lower capability is a trade-off that must be made in practice when building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing consists of SQL queries with aggregate, join, and space-time condition selections executed on massive data from more than 10 million housing units. In HBDP, an empirical formula, shaped via a regression approach, was introduced to predict the performance of candidate host clusters for the intended typical big data computing. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem, HDFS+Hive+Spark. A metric was proposed to measure the performance of Hadoop clusters in HBDP; the measured performance was compared with its predicted counterpart when executing three kinds of typical SQL query tasks. Tests were conducted with respect to the factors of CPU benchmark, memory size, virtual host division, and the number of physical hosts in the cluster. The research has been applied to practical cluster procurement for housing big data computing.
Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance predicting formula, typical SQL query tasks
Procedia PDF Downloads 232
24541 Model Predictive Controller for Pasteurization Process
Authors: Tesfaye Alamirew Dessie
Abstract:
Our study focuses on developing a Model Predictive Controller (MPC) and evaluating it against a traditional PID controller for a pasteurization process. The dynamics of the pasteurization process were estimated by system identification from experimental data. The quality of several model architectures was evaluated using best fit with data validation, residual analysis, and stability analysis. The auto-regressive with exogenous input (ARX322) model of the pasteurization process fit the validation data by roughly 80.37 percent. The ARX322 model structure was used to create the MPC and PID control techniques. After comparing controller performance based on settling time, overshoot percentage, and stability analysis, it was found that the MPC controller outperforms PID on those criteria.
Keywords: MPC, PID, ARX, pasteurization
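To make the identification step more concrete, here is a minimal least-squares sketch of fitting an ARX model in Python. It assumes that "ARX322" denotes orders na = 3, nb = 2 and input delay nk = 2, which is a common naming convention but is not confirmed by the abstract. The simulated system, its coefficients, and all function names are illustrative only and are not the pasteurization data.

```python
import numpy as np

def build_regressors(u, y, na=3, nb=2, nk=2):
    """Stack the rows [-y[t-1..t-na], u[t-nk..t-nk-nb+1]] and the targets y[t]."""
    start = max(na, nk + nb - 1)
    rows, targets = [], []
    for t in range(start, len(y)):
        past_y = [-y[t - i] for i in range(1, na + 1)]
        past_u = [u[t - nk - j] for j in range(nb)]
        rows.append(past_y + past_u)
        targets.append(y[t])
    return np.array(rows), np.array(targets)

def fit_arx(u, y, na=3, nb=2, nk=2):
    """Least-squares estimate of the ARX coefficients (a_1..a_na, b_1..b_nb)."""
    Phi, target = build_regressors(u, y, na, nb, nk)
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return theta[:na], theta[na:]

def fit_percent(y_true, y_pred):
    """Normalized fit: 100 * (1 - ||y - y_hat|| / ||y - mean(y)||)."""
    num = np.linalg.norm(y_true - y_pred)
    den = np.linalg.norm(y_true - y_true.mean())
    return 100.0 * (1.0 - num / den)

# Simulate a stable, noise-free ARX(3, 2, 2) system with made-up coefficients:
# y[t] + a1*y[t-1] + a2*y[t-2] + a3*y[t-3] = b1*u[t-2] + b2*u[t-3]
rng = np.random.default_rng(1)
a_true = np.array([-1.2, 0.45, -0.05])
b_true = np.array([0.8, 0.3])
u = rng.normal(size=300)
y = np.zeros(300)
for t in range(3, 300):
    y[t] = -a_true @ y[t - 3:t][::-1] + b_true @ np.array([u[t - 2], u[t - 3]])

a_hat, b_hat = fit_arx(u, y)
Phi, target = build_regressors(u, y)
one_step_fit = fit_percent(target, Phi @ np.concatenate([a_hat, b_hat]))
print(a_hat, b_hat, one_step_fit)
```

Because the simulated data contain no noise, the estimates match the chosen coefficients almost exactly and the one-step-ahead fit is close to 100 percent. With measured plant data, the fit on a separate validation set would be lower, as in the roughly 80 percent figure quoted above.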
Procedia PDF Downloads 167
24540 Point Estimation for the Type II Generalized Logistic Distribution Based on Progressively Censored Data
Authors: Rana Rimawi, Ayman Baklizi
Abstract:
Skewed distributions are important models that are frequently used in applications. Generalized distributions form a class of skewed distributions and have gained widespread use in applications because of their flexibility in data analysis. More specifically, the Generalized Logistic Distribution with its different types has received considerable attention recently. In this study, based on progressively type-II censored data, we consider point estimation for the type II Generalized Logistic Distribution (Type II GLD). We develop several estimators for its unknown parameters, including maximum likelihood estimators (MLE), Bayes estimators and best linear unbiased estimators (BLUE). The estimators are compared by simulation based on the criteria of bias and mean squared error (MSE). An illustrative example using a real data set is given.
Keywords: point estimation, type II generalized logistic distribution, progressive censoring, maximum likelihood estimation
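For orientation, the maximum likelihood step can be worked out in closed form in the simplest setting. The derivation below assumes the standard one-parameter form of the Type II GLD with shape parameter \alpha > 0 and no location or scale parameters. The abstract does not state the parameterization used by the authors, so this is an illustrative special case rather than their estimator. Here x_1 < ... < x_m are the observed failure times and R_i is the number of surviving units withdrawn at the i-th failure.

```latex
% Assumed one-parameter Type II GLD (shape \alpha > 0)
F(x;\alpha) = 1 - \frac{e^{-\alpha x}}{(1+e^{-x})^{\alpha}} = 1 - (1+e^{x})^{-\alpha},
\qquad
f(x;\alpha) = \frac{\alpha\, e^{-\alpha x}}{(1+e^{-x})^{\alpha+1}} = \frac{\alpha\, e^{x}}{(1+e^{x})^{\alpha+1}}.

% Likelihood under progressive type-II censoring with scheme (R_1, \dots, R_m)
L(\alpha) \propto \prod_{i=1}^{m} f(x_i;\alpha)\,\bigl[1 - F(x_i;\alpha)\bigr]^{R_i}

% Log-likelihood, up to an additive constant
\ell(\alpha) = m \ln\alpha + \sum_{i=1}^{m} x_i
  - (\alpha+1) \sum_{i=1}^{m} \ln\!\bigl(1+e^{x_i}\bigr)
  - \alpha \sum_{i=1}^{m} R_i \ln\!\bigl(1+e^{x_i}\bigr)

% Setting the score to zero
\frac{d\ell}{d\alpha} = \frac{m}{\alpha} - \sum_{i=1}^{m} (1+R_i)\ln\!\bigl(1+e^{x_i}\bigr) = 0
\quad\Longrightarrow\quad
\hat{\alpha} = \frac{m}{\sum_{i=1}^{m} (1+R_i)\ln\!\bigl(1+e^{x_i}\bigr)}
```

With all R_i = 0 and m = n this reduces to the complete-sample MLE. Once location and scale parameters are added, the likelihood equations generally have no closed-form solution and must be solved numerically.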
Procedia PDF Downloads 202