Search results for: ignorable missing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24818

Search results for: ignorable missing data

24488 Types of Limit Application Problems in Engineering Students: Case Studies

Authors: Veronica Diaz Quezada

Abstract:

The society of the 21st century requires training of engineers capable of solving routine and non-routine problems in applications of the limit of real functions, as part of the course Calculus I. For this purpose, research was conducted with a methodological design that combines quantitative and qualitative procedures and that aims, to identify and to characterize the types of problems according to their nature and context, through the application of a mathematics test; to know— through a questionnaire— the opinion of difficulties in their solution, previous and missing knowledge of some students of three engineering careers of a state university in Chile. This research is completed with three case studies. The results favor the performance of students in solving problems of a fantasist and realistic context, but these do not guarantee mathematical skills which are necessary to solve non-routine problems of limit applications. In conclusion, through this research, it became clear that the students of the three engineerings do not have all the necessary skills to solve problems of application of the limit of a function of the real variable.

Keywords: case studies, engineering program, limits, problem solving

Procedia PDF Downloads 121
24487 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 207
24486 Support for Reporting Guidelines in Surgical Journals Needs Improvement: A Systematic Review

Authors: Riaz A. Agha, Ishani Barai, Shivanchan Rajmohan, Seon Lee, Mohammed O. Anwar, Alex J. Fowler, Dennis P. Orgill, Douglas G. Altman

Abstract:

Introduction: Medical knowledge is growing fast. Evidence-based medicine works best if the evidence is reported well. Past studies have shown reporting quality to be lacking in the field of surgery. Reporting guidelines are an important tool for authors to optimize the reporting of their research. The objective of this study was to analyse the frequency and strength of recommendation for such reporting guidelines within surgical journals. Methods: A systematic review of the 198 journals within the Journal Citation Report 2014 (surgery category) published by Thomson Reuters was undertaken. The online guide for authors for each journal was screened by two independent groups and results were compared. Data regarding the presence and strength of recommendation to use reporting guidelines was extracted. Results: 193 journals were included (as five appeared twice having changed their name). These had a median impact factor of 1.526 (range 0.047 to 8.327), with a median of 145 articles published per journal (range 29-659), with 34,036 articles published in total over the two-year window 2012-2013. The majority (62%) of surgical journals made no mention of reporting guidelines within their guidelines for authors. Of the journals (38%) that did mention them, only 14% (10/73) required the use of all relevant reporting guidelines. The most frequently mentioned reporting guideline was CONSORT (46 journals). Conclusion: The mention of reporting guidelines within the guide for authors of surgical journals needs improvement. Authors, reviewers and editors should work to ensure that research is reported in line with the relevant reporting guidelines. Journals should consider hard-wiring adherence to them. This will allow peer-reviewers to focus on what is present, not what is missing, raising the level of scholarly discourse between authors and the scientific community and reducing frustration amongst readers.

Keywords: CONSORT, guide for authors, PRISMA, reporting guidelines, journal impact factor, citation analysis

Procedia PDF Downloads 460
24485 Industrial and Environmental Safety in the Integrated Security Policy of the Industry: A Corporation and an Enterprise

Authors: Vladimir A. Grachev

Abstract:

Today, in the context of rapidly developing technosphere and hourly emerging new technologies, the industrial and environmental safety issue is ever more pressing. The article is devoted to the relationship of social, environmental, and industrial policies with industrial safety, occupational health and safety, environmental safety, and environmental protection. The author assesses the up-to-day situation through system analysis and on the basis of the existing practices. A complex system of the policies implementation without "gaps" and missing links ensures preservation of human lives, health and a favorable living environment. The author demonstrates that absence of an "environmental safety" high-priority link can lead to a significant loss of human lives and health and the global changes in the environment. The role of implementing the environmental policy of enterprises and organizations, and of economic sectors in the implementation of national environmental policy is shown. It was established that the system for implementing environmental policy should be based on a system analysis.

Keywords: environmental protection, environmental safety, industrial safety, occupational health and safety

Procedia PDF Downloads 203
24484 Integrating Dependent Material Planning Cycle into Building Information Management: A Building Information Management-Based Material Management Automation Framework

Authors: Faris Elghaish, Sepehr Abrishami, Mark Gaterell, Richard Wise

Abstract:

The collaboration and integration between all building information management (BIM) processes and tasks are necessary to ensure that all project objectives can be delivered. The literature review has been used to explore the state of the art BIM technologies to manage construction materials as well as the challenges which have faced the construction process using traditional methods. Thus, this paper aims to articulate a framework to integrate traditional material planning methods such as ABC analysis theory (Pareto principle) to analyse and categorise the project materials, as well as using independent material planning methods such as Economic Order Quantity (EOQ) and Fixed Order Point (FOP) into the BIM 4D, and 5D capabilities in order to articulate a dependent material planning cycle into BIM, which relies on the constructability method. Moreover, we build a model to connect between the material planning outputs and the BIM 4D and 5D data to ensure that all project information will be accurately presented throughout integrated and complementary BIM reporting formats. Furthermore, this paper will present a method to integrate between the risk management output and the material management process to ensure that all critical materials are monitored and managed under the all project stages. The paper includes browsers which are proposed to be embedded in any 4D BIM platform in order to predict the EOQ as well as FOP and alarm the user during the construction stage. This enables the planner to check the status of the materials on the site as well as to get alarm when the new order will be requested. Therefore, this will lead to manage all the project information in a single context and avoid missing any information at early design stage. Subsequently, the planner will be capable of building a more reliable 4D schedule by allocating the categorised material with the required EOQ to check the optimum locations for inventory and the temporary construction facilitates.

Keywords: building information management, BIM, economic order quantity, EOQ, fixed order point, FOP, BIM 4D, BIM 5D

Procedia PDF Downloads 166
24483 A Survey on Students' Intentions to Dropout and Dropout Causes in Higher Education of Mongolia

Authors: D. Naranchimeg, G. Ulziisaikhan

Abstract:

Student dropout problem has not been recently investigated within the Mongolian higher education. A student dropping out is a personal decision, but it may cause unemployment and other social problems including low quality of life because students who are not completed a degree cannot find better-paid jobs. The research aims to determine percentage of at-risk students, and understand reasons for dropouts and to find a way to predict. The study based on the students of the Mongolian National University of Education including its Arkhangai branch school, National University of Mongolia, Mongolian University of Life Sciences, Mongolian University of Science and Technology, Mongolian National University of Medical Science, Ikh Zasag International University, and Dornod University. We conducted the paper survey by method of random sampling and have surveyed about 100 students per university. The margin of error - 4 %, confidence level -90%, and sample size was 846, but we excluded 56 students from this study. Causes for exclusion were missing data on the questionnaire. The survey has totally 17 questions, 4 of which was demographic questions. The survey shows that 1.4% of the students always thought to dropout whereas 61.8% of them thought sometimes. Also, results of the research suggest that students’ dropouts from university do not have relationships with their sex, marital and social status, and peer and faculty climate, whereas it slightly depends on their chosen specialization. Finally, the paper presents the reasons for dropping out provided by the students. The main two reasons for dropouts are personal reasons related with choosing wrong study program, not liking the course they had chosen (50.38%), and financial difficulties (42.66%). These findings reveal the importance of early prevention of dropout where possible, combined with increased attention to high school students in choosing right for them study program, and targeted financial support for those who are at risk.

Keywords: at risk students, dropout, faculty climate, Mongolian universities, peer climate

Procedia PDF Downloads 391
24482 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 546
24481 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyze data which are used to predict helpful information. It is the field of research which solve various type of problem. In data mining, classification is an important technique to classify different kind of data. Diabetes is most common disease. This paper implements different classification technique using Waikato Environment for Knowledge Analysis (WEKA) on diabetes dataset and find which algorithm is suitable for working. The best classification algorithm based on diabetic data is Naïve Bayes. The accuracy of Naïve Bayes is 76.31% and take 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 136
24480 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 75
24479 Web-Based Decision Support Systems and Intelligent Decision-Making: A Systematic Analysis

Authors: Serhat Tüzün, Tufan Demirel

Abstract:

Decision Support Systems (DSS) have been investigated by researchers and technologists for more than 35 years. This paper analyses the developments in the architecture and software of these systems, provides a systematic analysis for different Web-based DSS approaches and Intelligent Decision-making Technologies (IDT), with the suggestion for future studies. Decision Support Systems literature begins with building model-oriented DSS in the late 1960s, theory developments in the 1970s, and the implementation of financial planning systems and Group DSS in the early and mid-80s. Then it documents the origins of Executive Information Systems, online analytic processing (OLAP) and Business Intelligence. The implementation of Web-based DSS occurred in the mid-1990s. With the beginning of the new millennia, intelligence is the main focus on DSS studies. Web-based technologies are having a major impact on design, development and implementation processes for all types of DSS. Web technologies are being utilized for the development of DSS tools by leading developers of decision support technologies. Major companies are encouraging its customers to port their DSS applications, such as data mining, customer relationship management (CRM) and OLAP systems, to a web-based environment. Similarly, real-time data fed from manufacturing plants are now helping floor managers make decisions regarding production adjustment to ensure that high-quality products are produced and delivered. Web-based DSS are being employed by organizations as decision aids for employees as well as customers. A common usage of Web-based DSS has been to assist customers configure product and service according to their needs. These systems allow individual customers to design their own products by choosing from a menu of attributes, components, prices and delivery options. The Intelligent Decision-making Technologies (IDT) domain is a fast growing area of research that integrates various aspects of computer science and information systems. This includes intelligent systems, intelligent technology, intelligent agents, artificial intelligence, fuzzy logic, neural networks, machine learning, knowledge discovery, computational intelligence, data science, big data analytics, inference engines, recommender systems or engines, and a variety of related disciplines. Innovative applications that emerge using IDT often have a significant impact on decision-making processes in government, industry, business, and academia in general. This is particularly pronounced in finance, accounting, healthcare, computer networks, real-time safety monitoring and crisis response systems. Similarly, IDT is commonly used in military decision-making systems, security, marketing, stock market prediction, and robotics. Even though lots of research studies have been conducted on Decision Support Systems, a systematic analysis on the subject is still missing. Because of this necessity, this paper has been prepared to search recent articles about the DSS. The literature has been deeply reviewed and by classifying previous studies according to their preferences, taxonomy for DSS has been prepared. With the aid of the taxonomic review and the recent developments over the subject, this study aims to analyze the future trends in decision support systems.

Keywords: decision support systems, intelligent decision-making, systematic analysis, taxonomic review

Procedia PDF Downloads 271
24478 A Deep Learning Approach to Real Time and Robust Vehicular Traffic Prediction

Authors: Bikis Muhammed, Sehra Sedigh Sarvestani, Ali R. Hurson, Lasanthi Gamage

Abstract:

Vehicular traffic events have overly complex spatial correlations and temporal interdependencies and are also influenced by environmental events such as weather conditions. To capture these spatial and temporal interdependencies and make more realistic vehicular traffic predictions, graph neural networks (GNN) based traffic prediction models have been extensively utilized due to their capability of capturing non-Euclidean spatial correlation very effectively. However, most of the already existing GNN-based traffic prediction models have some limitations during learning complex and dynamic spatial and temporal patterns due to the following missing factors. First, most GNN-based traffic prediction models have used static distance or sometimes haversine distance mechanisms between spatially separated traffic observations to estimate spatial correlation. Secondly, most GNN-based traffic prediction models have not incorporated environmental events that have a major impact on the normal traffic states. Finally, most of the GNN-based models did not use an attention mechanism to focus on only important traffic observations. The objective of this paper is to study and make real-time vehicular traffic predictions while incorporating the effect of weather conditions. To fill the previously mentioned gaps, our prediction model uses a real-time driving distance between sensors to build a distance matrix or spatial adjacency matrix and capture spatial correlation. In addition, our prediction model considers the effect of six types of weather conditions and has an attention mechanism in both spatial and temporal data aggregation. Our prediction model efficiently captures the spatial and temporal correlation between traffic events, and it relies on the graph attention network (GAT) and Bidirectional bidirectional long short-term memory (Bi-LSTM) plus attention layers and is called GAT-BILSTMA.

Keywords: deep learning, real time prediction, GAT, Bi-LSTM, attention

Procedia PDF Downloads 63
24477 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma leads to physical and financial complications. This study aimed to use data mining techniques and creating a neural network intelligent system for diagnosis of asthma. Methods: The study population is the patients who had visited one of the Lung Clinics in Tehran. Data were analyzed using the SPSS statistical tool and the chi-square Pearson's coefficient was the basis of decision making for data ranking. The considered neural network is trained using back propagation learning technique. Results: According to the analysis performed by means of SPSS to select the top factors, 13 effective factors were selected, in different performances, data was mixed in various forms, so the different models were made for training the data and testing networks and in all different modes, the network was able to predict correctly 100% of all cases. Conclusion: Using data mining methods before the design structure of system, aimed to reduce the data dimension and the optimum choice of the data, will lead to a more accurate system. Therefore, considering the data mining approaches due to the nature of medical data is necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 264
24476 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With increased Internet Communication Technology(ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has been in tandem with an increase in data misuse and data breach. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in the United States courts for the failure of proof of direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result into harms which run counter the functions of privacy in our lives. The promotion of liberty, selfhood, autonomy, promotion of human social relations and the furtherance of the existence of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 187
24475 Resistances among Sexual Offenders on Specific Stage of Change

Authors: Chang Li Yu

Abstract:

Resistances commonly happened during sexual offenders treatment program (SOTP), and removing resistances was one of the treatment goals on it. Studies concerning treatment effectiveness relied on pre- and post-treatment evaluations, however, no significant difference on resistance revealed after treatment, and the above consequences generally contributed to the low motivation for change instead. Therefore, the aim of this study was to investigate the resistance across each stage of change among sexual offenders (SO). The present study recruited prisoned SO in Taiwan, excluding those with literacy difficulties; finally, 272 participants were included. Of all participants completed revised version of URICA (University of Rhode Island Change Assessment) and resistance scale specifically for SO. The former included four stages of change: pre-contemplation (PC), contemplation (C), action (A), and maintain (M); the later composed eight types of resistance: system blaming, victims blaming, problems with treatment alliance, social justification, hopelessness, isolation, psychological reactance, and passive reactance. Both of the instruments were with well reliability and validity. Descriptive statistics and ANOVA were performed. All of 272 participants, age under 25 were 18(6.6%), 25-39 were 133(48.9%), 40-54 were 102(37.5%), and age over 55 were 19(7.0%); college level and above were 53(19.5%), high school level were 110(40.4%), and under high school level were 109(40.1%); first offended were 117(43.0%), and recidivist were 23(8.5%). Further deleting data with missing values and invalid questionnaires, SO with stage of change on PC were 43(18.9%), C were 109(47.8%), A were 70(30.7%), and on M were 6(2.6%). One-way ANOVA showed significant differences on every kind of resistances, excepting isolation and passive reactance. Post-hoc analysis showed that SO with different stages had their main resistance. There are two contributions to the present study. First, this study provided a clinical and theoretical measurement of evaluation that was never used in the past. Second, this study used an evidence-based methodology to prove a clinical perspective differed from the past, suggesting that resistances to treatment on SO appear the whole therapeutic process, when SO progress into the next stage of change, clinicians have to deal with their main resistance for working through the therapy.

Keywords: resistance, sexual offenders treatment program (SOTP), motivation for change, prisoned sexual offender

Procedia PDF Downloads 238
24474 A New Approach to Interval Matrices and Applications

Authors: Obaid Algahtani

Abstract:

An interval may be defined as a convex combination as follows: I=[a,b]={x_α=(1-α)a+αb: α∈[0,1]}. Consequently, we may adopt interval operations by applying the scalar operation point-wise to the corresponding interval points: I ∙J={x_α∙y_α ∶ αϵ[0,1],x_α ϵI ,y_α ϵJ}, With the usual restriction 0∉J if ∙ = ÷. These operations are associative: I+( J+K)=(I+J)+ K, I*( J*K)=( I*J )* K. These two properties, which are missing in the usual interval operations, will enable the extension of the usual linear system concepts to the interval setting in a seamless manner. The arithmetic introduced here avoids such vague terms as ”interval extension”, ”inclusion function”, determinants which we encounter in the engineering literature that deal with interval linear systems. On the other hand, these definitions were motivated by our attempt to arrive at a definition of interval random variables and investigate the corresponding statistical properties. We feel that they are the natural ones to handle interval systems. We will enable the extension of many results from usual state space models to interval state space models. The interval state space model we will consider here is one of the form X_((t+1) )=AX_t+ W_t, Y_t=HX_t+ V_t, t≥0, where A∈ 〖IR〗^(k×k), H ∈ 〖IR〗^(p×k) are interval matrices and 〖W 〗_t ∈ 〖IR〗^k,V_t ∈〖IR〗^p are zero – mean Gaussian white-noise interval processes. This feeling is reassured by the numerical results we obtained in a simulation examples.

Keywords: interval analysis, interval matrices, state space model, Kalman Filter

Procedia PDF Downloads 416
24473 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 36
24472 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 56
24471 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 406
24470 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 245
24469 A Review on Intelligent Systems for Geoscience

Authors: R Palson Kennedy, P.Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of geosciences present unique difficulties for the study of intelligent systems. Geosciences data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second section contains key themes and sharing experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 126
24468 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 125
24467 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze big data increasing exponentially with traditional technologies. Hadoop is a new technology to make that possible. R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop that integrates R and Hadoop environment, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed our RHadoop system was much faster as the number of data nodes increases. We also compared the performance of our RHadoop with lm function and big lm packages available on big memory. The results showed that our RHadoop was faster than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases.

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 427
24466 Cessna Citation X Business Aircraft Stability Analysis Using Linear Fractional Representation LFRs Model

Authors: Yamina Boughari, Ruxandra Mihaela Botez, Florian Theel, Georges Ghazi

Abstract:

Clearance of flight control laws of a civil aircraft is a long and expensive process in the Aerospace industry. Thousands of flight combinations in terms of speeds, altitudes, gross weights, centers of gravity and angles of attack have to be investigated, and proved to be safe. Nonetheless, in this method, a worst flight condition can be easily missed, and its missing would lead to a critical situation. Definitively, it would be impossible to analyze a model because of the infinite number of cases contained within its flight envelope, that might require more time, and therefore more design cost. Therefore, in industry, the technique of the flight envelope mesh is commonly used. For each point of the flight envelope, the simulation of the associated model ensures the satisfaction or not of specifications. In order to perform fast, comprehensive and effective analysis, other varying parameters models were developed by incorporating variations, or uncertainties in the nominal models, known as Linear Fractional Representation LFR models; these LFR models were able to describe the aircraft dynamics by taking into account uncertainties over the flight envelope. In this paper, the LFRs models are developed using the speeds and altitudes as varying parameters; The LFR models were built using several flying conditions expressed in terms of speeds and altitudes. The use of such a method has gained a great interest by the aeronautical companies that have seen a promising future in the modeling, and particularly in the design and certification of control laws. In this research paper, we will focus on the Cessna Citation X open loop stability analysis. The data are provided by a Research Aircraft Flight Simulator of Level D, that corresponds to the highest level flight dynamics certification; this simulator was developed by CAE Inc. and its development was based on the requirements of research at the LARCASE laboratory. The acquisition of these data was used to develop a linear model of the airplane in its longitudinal and lateral motions, and was further used to create the LFR’s models for 12 XCG /weights conditions, and thus the whole flight envelope using a friendly Graphical User Interface developed during this study. Then, the LFR’s models are analyzed using Interval Analysis method based upon Lyapunov function, and also the ‘stability and robustness analysis’ toolbox. The results were presented under the form of graphs, thus they have offered good readability, and were easily exploitable. The weakness of this method stays in a relatively long calculation, equal to about four hours for the entire flight envelope.

Keywords: flight control clearance, LFR, stability analysis, robustness analysis

Procedia PDF Downloads 345
24465 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method, that only extracts part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 85
24464 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor network. One of the important tasks of data aggregation is positioning of the aggregator points. There are a lot of works done on data aggregation. But, efficient positioning of the aggregators points is not focused so much. In this paper, authors are focusing on the positioning or the placement of the aggregation points in wireless sensor network. Authors proposed an algorithm to select the aggregators positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 148
24463 Battery Replacement Strategy for Electric AGVs in an Automated Container Terminal

Authors: Jiheon Park, Taekwang Kim, Kwang Ryel Ryu

Abstract:

Electric automated guided vehicles (AGVs) are becoming popular in many automated container terminals nowadays because they are pollution-free and environmentally friendly vehicles for transporting the containers within the terminal. Since efficient operation of AGVs is critical for the productivity of the container terminal, the replacement of batteries of the AGVs must be conducted in a strategic way to minimize undesirable transportation interruptions. While a too frequent replacement may lead to a loss of terminal productivity by delaying container deliveries, missing the right timing of battery replacement can result in a dead AGV that causes a severer productivity loss due to the extra efforts required to finish post treatment. In this paper, we propose a strategy for battery replacement based on a scoring function of multiple criteria taking into account the current battery level, the distances to different battery stations, and the progress of the terminal job operations. The strategy is optimized using a genetic algorithm with the objectives of minimizing the total time spent for battery replacement as well as maximizing the terminal productivity.

Keywords: AGV operation, automated container terminal, battery replacement, electric AGV, strategy optimization

Procedia PDF Downloads 383
24462 Holy Quran’s Hermeneutics from Self-Referentiality to the Quran by Quran’s Interpretation

Authors: Mohammad Ba’azm

Abstract:

The self-referentiality method as the missing ring of the Qur’an by Qur’an’s interpretation has a precise application at the level of the Quranic vocabulary, but after entering the domain of the verses, chapters and the whole Qur’an, it reveals its defect. Self-referentiality cannot show the clear concept of the Quranic scriptures, unlike the Qur’an by Qur’an’s interpretation method that guides us to the comprehension and exact hermeneutics. The Qur’an by Qur’an’s interpretation is a solid way of comprehension of the verses of the Qur'an and does not use external resources to provide implications and meanings with different theoretical and practical supports. In this method, theoretical supports are based on the basics and modalities that support and validate the legitimacy and validity of the interpretive method discussed, and the practical supports also relate to the practitioners of the religious elite. The combination of these two methods illustrates the exact understanding of the Qur'an at the level of Quranic verses, chapters, and the whole Qur’an. This study by examining the word 'book' in the Qur'an shows the difference between the two methods, and the necessity of attachment of these, in order to attain a desirable level for comprehensions meaning of the Qur'an. In this article, we have proven that by aspects of the meaning of the Quranic words, we cannot say any word has an exact meaning.

Keywords: Qur’an’s hermeneutic, self-referentiality, The Qur’an by Qur’an’s Interpretation, polysemy

Procedia PDF Downloads 175
24461 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects for implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments on spatial econometrics to model count data, in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework, necessary for structural consistent spatial econometric count models, incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, to the constrains and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions on the processing of count data, in a spatial hierarchical Bayesian econometric context.

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 582
24460 Combining an Optimized Closed Principal Curve-Based Method and Evolutionary Neural Network for Ultrasound Prostate Segmentation

Authors: Tao Peng, Jing Zhao, Yanqing Xu, Jing Cai

Abstract:

Due to missing/ambiguous boundaries between the prostate and neighboring structures, the presence of shadow artifacts, as well as the large variability in prostate shapes, ultrasound prostate segmentation is challenging. To handle these issues, this paper develops a hybrid method for ultrasound prostate segmentation by combining an optimized closed principal curve-based method and the evolutionary neural network; the former can fit curves with great curvature and generate a contour composed of line segments connected by sorted vertices, and the latter is used to express an appropriate map function (represented by parameters of evolutionary neural network) for generating the smooth prostate contour to match the ground truth contour. Both qualitative and quantitative experimental results showed that our proposed method obtains accurate and robust performances.

Keywords: ultrasound prostate segmentation, optimized closed polygonal segment method, evolutionary neural network, smooth mathematical model, principal curve

Procedia PDF Downloads 186
24459 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the secret of the continual progression data that new data management solutions have been emerged: The NoSQL databases. They crossed several areas like personalization, profile management, big data in real-time, content management, catalog, view of customers, mobile applications, internet of things, digital communication and fraud detection. Nowadays, these database management systems are increasing. These systems store data very well and with the trend of big data, a new challenge’s store demands new structures and methods for managing enterprise data. The new intelligent machine in the e-learning sector, thrives on more data, so smart machines can learn more and faster. The robotics are our use case to focus on our test. The implementation of NoSQL for Robotics wrestle all the data they acquire into usable form because with the ordinary type of robotics; we are facing very big limits to manage and find the exact information in real-time. Our original proposed approach was demonstrated by experimental studies and running example used as a use case.

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 341