Search results for: data science techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29764

29584 Collaborative Data Refinement for Enhanced Ionic Conductivity Prediction in Garnet-Type Materials

Authors: Zakaria Kharbouch, Mustapha Bouchaara, F. Elkouihen, A. Habbal, A. Ratnani, A. Faik

Abstract:

Solid-state lithium-ion batteries have garnered increasing interest in modern energy research due to their potential for safer, more efficient, and sustainable energy storage systems. Among the critical components of these batteries, the electrolyte plays a pivotal role, with LLZO garnet-based electrolytes showing significant promise. Garnet materials offer intrinsic advantages such as high Li-ion conductivity, wide electrochemical stability, and excellent compatibility with lithium metal anodes. However, optimizing ionic conductivity in garnet structures poses a complex challenge, primarily due to the multitude of potential dopants that can be incorporated into the LLZO crystal lattice. The complexity of material design, influenced by numerous dopant options, requires a systematic method to find the most effective combinations. This study highlights the utility of machine learning (ML) techniques in the materials discovery process to navigate the complex range of factors in garnet-based electrolytes. Collaborators from the materials science and ML fields worked with a comprehensive dataset previously employed in a similar study and collected from various literature sources. This dataset served as the foundation for an extensive data refinement phase, where meticulous error identification, correction, outlier removal, and garnet-specific feature engineering were conducted. This rigorous process substantially improved the dataset's quality, ensuring it accurately captured the underlying physical and chemical principles governing garnet ionic conductivity. The data refinement effort resulted in a significant improvement in the predictive performance of the machine learning model. Originally starting at an accuracy of 0.32, the model underwent substantial refinement, ultimately achieving an accuracy of 0.88. 
This enhancement highlights the effectiveness of the interdisciplinary approach and underscores the substantial potential of machine learning techniques in materials science research.

Keywords: lithium batteries, all-solid-state batteries, machine learning, solid state electrolytes
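
The abstract does not say which outlier-removal rule was used during data refinement. As a minimal sketch only, assuming Tukey's interquartile-range rule and a made-up list of log-conductivity values (the function name and the data are hypothetical, not taken from the study), the step could look like this:

```python
from statistics import quantiles

def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule, an assumption here)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical log10 ionic conductivities; -9.5 stands in for a transcription error
log_sigma = [-3.2, -3.5, -3.4, -3.1, -3.6, -3.3, -9.5]
cleaned = remove_outliers_iqr(log_sigma)
```

In practice the threshold `k` and the choice of rule would depend on domain knowledge about plausible conductivity ranges.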

29583 Inadequate Intake of Energy and Nutrients: A Comparative Cross-Sectional Study Between Sport and Non-sport Science University Students of Southern Ethiopia

Authors: Beruk Berhanu Desalegn, Kebede Awgechew, Addisalem Mesfin

Abstract:

Introduction: This study aimed to investigate and compare the energy and selected nutrient intakes of sport science and non-sport science university students in Southern Ethiopia. Method: Multiple-day dietary data were collected from 166 university students (76 sport science and 90 non-sport science). Average daily energy and nutrient intakes, and the prevalence of inadequate intakes, were calculated using NutriSurvey (NS). Results: There were significant differences (p < 0.05) in the median intakes of energy, total carbohydrate, and vitamin B1 between female students in the sport science and non-sport science groups, whereas among male students only the median intake of iron differed significantly (p < 0.05) between the two groups. The prevalence of inadequate vitamin B1 intake was significantly (p < 0.05) higher among both male and female non-sport science students than among their sport science counterparts. In contrast, the prevalence of inadequate iron intake was significantly (p < 0.05) higher among male sport science students than among their non-sport science counterparts. Similarly, the prevalence of inadequate energy intake was significantly (p < 0.05) higher among female sport science students than among female non-sport science students. The prevalence of inadequate intakes of dietary energy and of the majority of nutrients (protein, fat, vitamin A, B1, B2, and magnesium) was high (>50%) among the selected university students. Conclusion: The energy and the majority of nutrient intakes of students in the selected universities of Southern Ethiopia were sub-optimal. Therefore, activities to improve the dietary intake of university students should include revising weekly meal plans in light of their average recommended nutrient intake (RNI).

Keywords: dietary intake, sport science, University students, Ethiopia

29582 Data Centers’ Temperature Profile Simulation Optimized by Finite Elements and Discretization Methods

Authors: José Alberto García Fernández, Zhimin Du, Xinqiao Jin

Abstract:

The data center industry faces the challenge of increasing processing speed and data capacity while keeping its devices at a suitable working temperature without penalizing that capacity. Consequently, the cooling systems of these facilities use a large amount of energy to dissipate the heat generated inside the servers, and developing new cooling techniques or improving existing ones would be a significant advance for the industry. Installing a matrix of temperature sensors distributed throughout each server would provide the data needed to obtain an instantaneous temperature profile inside it. However, the number of temperature probes required to obtain temperature profiles with sufficient accuracy is very high, and the installation is expensive. Therefore, less intrusive techniques are employed in which each point that characterizes the server temperature profile is obtained by solving differential equations through simulation, which simplifies data collection but increases the time needed to obtain results. To reduce calculation times, complicated and slow computational fluid dynamics (CFD) simulations are replaced by simpler and faster finite element method (FEM) simulations, which solve the Burgers' equations using backward, forward, and central discretization techniques after simplifying the energy and enthalpy conservation differential equations. The discretization methods used for the first- and second-order derivatives of the resulting Burgers' equation are the key to obtaining results with greater or lesser accuracy, regardless of the characteristic truncation error.

Keywords: Burgers' equations, CFD simulation, data center, discretization methods, FEM simulation, temperature profile
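
To make the backward, forward, and central discretizations mentioned in the abstract concrete, the sketch below writes them as finite-difference stencils and uses them in one explicit time step of the viscous Burgers' equation u_t + u u_x = nu u_xx. This is an illustration under assumptions, not the authors' solver: the grid, the upwind choice for the convective term, and all function names are hypothetical.

```python
import math

def forward_diff(u, i, dx):
    return (u[i + 1] - u[i]) / dx              # first-order accurate

def backward_diff(u, i, dx):
    return (u[i] - u[i - 1]) / dx              # first-order accurate

def central_diff(u, i, dx):
    return (u[i + 1] - u[i - 1]) / (2 * dx)    # second-order accurate

def central_second_diff(u, i, dx):
    return (u[i + 1] - 2 * u[i] + u[i - 1]) / dx ** 2

def burgers_step(u, dx, dt, nu):
    """One explicit step of u_t + u*u_x = nu*u_xx; boundary values are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        convection = u[i] * backward_diff(u, i, dx)    # upwind choice, valid for u > 0
        diffusion = nu * central_second_diff(u, i, dx)
        new[i] = u[i] + dt * (diffusion - convection)
    return new

# Compare the three first-derivative stencils on sin(x), whose derivative is cos(x)
dx = 0.1
xs = [k * dx for k in range(11)]
u = [math.sin(x) for x in xs]
i = 5
errors = {
    "forward": abs(forward_diff(u, i, dx) - math.cos(xs[i])),
    "backward": abs(backward_diff(u, i, dx) - math.cos(xs[i])),
    "central": abs(central_diff(u, i, dx) - math.cos(xs[i])),
}
```

On this smooth test function the central stencil has the smallest error, which reflects its second-order truncation error compared with the first-order forward and backward stencils.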

29581 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text classification is the task of assigning a given text to the appropriate category from a given set of categories. Using a proper set of pre-processing, feature selection, and classification techniques is vital for this purpose. In this paper, we used different ensemble techniques, together with varying feature selection parameters, to observe the change in the overall accuracy of the result and in some class-level measures, including the precision for each individual text category. After subjecting our data to pre-processing and feature selection, we first tested different individual classifiers and then combined the classifiers into ensembles to increase their accuracy. We also studied the impact of reducing the number of classification categories on overall accuracy. Text classification is widely used in sentiment analysis on social media sites such as Twitter to gauge people's opinions about a cause, and it is also used to analyze customers' reviews of products or services. Opinion mining is a vital task in data mining, and text categorization is the backbone of opinion mining.

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost
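
The idea of combining individual classifiers into an ensemble can be sketched with plain majority voting. This is a simplification: the paper's keywords name bagging and AdaBoost, which additionally resample or reweight the training data. The three toy classifiers and the category labels below are hypothetical.

```python
from collections import Counter

def majority_vote(classifiers, text):
    """Return the label predicted by most classifiers."""
    votes = [clf(text) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical, deliberately weak classifiers
def by_keyword(text):
    return "sports" if "match" in text.lower() else "politics"

def by_length(text):
    return "sports" if len(text.split()) < 6 else "politics"

def by_digits(text):
    return "sports" if any(ch.isdigit() for ch in text) else "politics"

ensemble = [by_keyword, by_length, by_digits]
label = majority_vote(ensemble, "The match ended 2-1")
```

Each weak classifier can be wrong on its own, but the ensemble only errs when a majority of them err on the same input.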

29580 Mathematics as the Foundation for the STEM Disciplines: Different Pedagogical Strategies Addressed

Authors: Marion G. Ben-Jacob, David Wang

Abstract:

There is a mathematics requirement for entry-level college and university students, especially those who plan to study STEM (Science, Technology, Engineering and Mathematics). Most of them take College Algebra, and to continue their studies, they need to succeed in this course. Different pedagogical strategies are employed to promote the success of our students. There is, of course, the traditional method of teaching: lecture, examples, and problems for students to solve. The Emporium Model, another pedagogical approach, replaces traditional lectures with a learning resource center model featuring interactive software and on-demand personalized assistance. This presentation will compare these two methods of pedagogy and report the results of a study of this comparison. Math is the foundation for science, technology, and engineering. It is generally used in STEM to find patterns in data. These patterns can be used to test relationships, draw general conclusions about data, and model the real world. In STEM, solutions to problems are analyzed, reasoned, and interpreted using math abilities in an assortment of real-world scenarios. This presentation will examine specific examples of how math is used in the different STEM disciplines. Math becomes practical in science when it is used to model natural and artificial experiments to identify a problem and develop a solution for it. As we analyze data, we are using math to find the statistical correlation between a cause and its effect. Scientists who use math include data scientists, biologists, and geologists. Without math, most technology would not be possible. Math is the basis of binary, and without programming, you just have the hardware. Addition, subtraction, multiplication, and division are also used in almost every program written. Mathematical algorithms are inherent in software as well. Mechanical engineers analyze scientific data to design robots by applying math and using software. Electrical engineers use math to help design and test electrical equipment. They also use math when creating computer simulations and designing new products. Chemical engineers often use mathematics in the lab. Advanced computer software is used to aid in their research and production processes to model theoretical synthesis techniques and properties of chemical compounds. Mathematics mastery is crucial for success in the STEM disciplines. Pedagogical research on formative strategies and the necessary topics to be covered is essential.

Keywords: emporium model, mathematics, pedagogy, STEM

29579 Using Photogrammetric Techniques to Map the Mars Surface

Authors: Ahmed Elaksher, Islam Omar

Abstract:

For many years, the surface of Mars has been a mystery for scientists. Lately, with the help of geospatial data and photogrammetric procedures, researchers have been able to gain some insights about this planet. Two of the most important data sources for exploring Mars are the High Resolution Imaging Science Experiment (HiRISE) and the Mars Orbiter Laser Altimeter (MOLA). HiRISE is one of six science instruments carried by the Mars Reconnaissance Orbiter, launched August 12, 2005, and managed by NASA. The MOLA sensor is a laser altimeter carried by the Mars Global Surveyor (MGS), launched on November 7, 1996. In this project, we used MOLA-based DEMs to orthorectify HiRISE optical images in order to generate a more accurate and trustworthy surface of Mars. The MOLA data was interpolated using the kriging interpolation technique. Corresponding tie points were digitized from both datasets. These points were employed in co-registering both datasets using GIS analysis tools. In this project, we employed three different 3D to 2D transformation models: the parallel projection (3D affine) transformation model, the extended parallel projection transformation model, and the Direct Linear Transformation (DLT) model. A set of tie points was digitized from both datasets. These points were split into two sets: Ground Control Points (GCPs), used to estimate the transformation parameters using least squares adjustment techniques, and check points (ChkPs), used to evaluate the computed transformation parameters. Results were evaluated using the RMSEs between the precise horizontal coordinates of the digitized check points and those estimated through the transformation models using the computed transformation parameters. For each set of GCPs, three different configurations of GCPs and check points were tested, and average RMSEs are reported. It was found that for the 2D transformation models, average RMSEs were in the range of five meters. Increasing the number of GCPs from six to ten points improved the accuracy of the results by about two and a half meters. Further increasing the number of GCPs did not improve the results significantly. Using the 3D to 2D transformation parameters provided an accuracy of two to three meters. The best results were obtained using the DLT transformation model. However, increasing the number of GCPs did not have a substantial effect. The results support the use of the DLT model, as it provides the accuracy required by the ASPRS large-scale mapping standards. However, a well-distributed set of GCPs is key to achieving such accuracy. The model is simple to apply and does not need substantial computation.

Keywords: mars, photogrammetry, MOLA, HiRISE
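
The check-point evaluation described in the abstract reduces to a root-mean-square error over horizontal coordinate differences. A minimal sketch follows; the coordinates are invented for illustration and the function name is hypothetical.

```python
import math

def horizontal_rmse(reference, estimated):
    """RMSE of the planimetric (x, y) distance between paired points."""
    squared = [(xr - xe) ** 2 + (yr - ye) ** 2
               for (xr, yr), (xe, ye) in zip(reference, estimated)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical check points in meters: digitized reference vs. model estimate
check_ref = [(100.0, 200.0), (150.0, 260.0), (300.0, 120.0)]
check_est = [(103.0, 204.0), (150.0, 260.0), (297.0, 116.0)]
rmse = horizontal_rmse(check_ref, check_est)
```

Because the check points are not used to estimate the transformation parameters, this figure gives an independent measure of the transformation's accuracy.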

29578 Automatic Diagnosis of Electrical Equipment Using Infrared Thermography

Authors: Y. Laib Dit Leksir, S. Bouhouche

Abstract:

The analysis and processing of databases resulting from infrared thermal measurements made on electrical installations require the development of new tools in order to obtain correct information that complements visual inspections. Consequently, methods based on the capture of infrared digital images show great potential and are increasingly employed in various fields. There is, however, an enormous need for effective techniques to analyse these databases in order to extract relevant information about the state of the equipment. Our goal is to introduce recent modeling techniques based on new image and signal processing methods in order to develop mathematical models in this field. The aim of this work is to capture the anomalies present in electrical equipment during an inspection of several machines using a FLIR A40 camera. We then use binarisation techniques to select the region of interest and compare the results of these methods on the thermal images obtained in order to choose the best one.

Keywords: infrared thermography, defect detection, troubleshooting, electrical equipment
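
The abstract does not name the binarisation methods that were compared. As one common candidate, the sketch below implements Otsu's threshold on a flat list of 8-bit intensities and uses it to separate a hot region from the background. The pixel values are invented, and the choice of Otsu's method is an assumption.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance (pixels > t are foreground)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with intensity <= t
    sum0 = 0    # intensity sum of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no background pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # no foreground pixels left
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def binarise(pixels, t):
    return [1 if p > t else 0 for p in pixels]

# Hypothetical thermal intensities: three hot pixels on a cool background
image = [20, 22, 25, 21, 200, 210, 205, 23]
t = otsu_threshold(image)
mask = binarise(image, t)
```

The resulting mask marks the candidate region of interest; a fixed threshold could be compared against it in the same way.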

29577 Evaluation of Introductory Programming Course for Non-Computer Science Majored Students

Authors: H. Varol

Abstract:

Although students’ interest in pursuing Computer Science and related degrees is lower than in the previous decade, computer fundamentals courses, specifically introductory programming courses, are listed as either core or elective courses for a number of non-computer science majors. Universities accommodate these non-computer science majors either by creating separate sections of a class for them or by offering mixed-body classes in which computer science and non-computer science students take the courses together. In this work, we describe how we handle the introductory programming course at Sam Houston State University and report our observations on students’ success during the coursework. Moreover, we provide suggestions and methodologies, based on students’ majors and skills, to overcome the deficiencies of mixed-body classes.

Keywords: computer science, non-computer science major, programming, programming education

29576 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies

Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk

Abstract:

The application of AI-powered algorithms in healthcare continues to flourish. In particular, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information), is paramount to delivering good patient outcomes efficiently. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and privacy has become even more important. Arguably, the increased adoption of healthcare AI has focused significant attention on the security risks to healthcare data and on the measures that protect its security and privacy, leading to increased scrutiny and enforcement. Since these challenges arise from the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, this project proposes AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project presents the best-in-class techniques used to preserve the data privacy of AI-powered healthcare applications. Popular privacy-protecting methods such as federated learning, cryptographic techniques, differential privacy, and hybrid methods are discussed together with potential cyber threats, data security concerns, and prospects. The project also discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to ensure that owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures and identifies potential correction/mitigation measures.

Keywords: data privacy, artificial intelligence (AI), healthcare AI, data sharing, healthcare organizations (HCOs)
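
Of the privacy-protecting methods listed in the abstract, differential privacy is the easiest to show in a few lines. The sketch below applies the Laplace mechanism to a counting query, whose sensitivity is 1. The records, the predicate, and the value of epsilon are hypothetical, and a production system would need a vetted library rather than this floating-point illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale)
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    """Counting query (sensitivity 1) released with Laplace noise of scale 1/epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

# Hypothetical patient ages; query: how many patients are 65 or older?
rng = random.Random(42)
ages = [34, 71, 68, 25, 80]
noisy = private_count(ages, lambda a: a >= 65, epsilon=0.5, rng=rng)
```

A smaller epsilon adds more noise and gives stronger privacy, at the cost of a less accurate released count.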

29575 Application of XRF and Other Principal Component Analysis for Counterfeited Gold Coin Characterization in Forensic Science

Authors: Somayeh Khanjani, Hamideh Abolghasemi, Hadi Shirzad, Samaneh Nabavi

Abstract:

The world market currently offers a wide range of gemological objects that are incorrectly declared or treated, or that are made of completely different materials intended to imitate precious objects more or less successfully. Counterfeiting of precious commodities is a problem faced by governments in most countries. Police have seized many counterfeit coins that looked like real coins and whose feel and weight were very similar to those of real coins. Most people were fooled and believed that the counterfeit coins were real. These counterfeit coins may have been made by large criminal organizations. To elucidate the manufacturing process, both a quantitative analysis of the coins and a comparison of their morphological characteristics were necessary. Several modern techniques have been applied to prevent the counterfeiting of coins. The objective of this study was to demonstrate the potential of the X-ray fluorescence (XRF) technique and other analytical techniques, for example SEM/EDX/WDX, FT-IR/ATR, and Raman spectroscopy. Using four elements (Cu, Ag, Au, and Zn) and obtaining XRF measurements for several samples, the samples could be discriminated. The XRF technique and SEM/EDX/WDX are used to study chemical composition. XRF analyzers provide a fast, accurate, nondestructive method to test the purity and chemistry of all precious metals. XRF is a very promising technique for the rapid and nondestructive identification of counterfeit coins in forensic science.

Keywords: counterfeit coins, X-ray fluorescence, forensic, FT-IR

29574 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining-based empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data from normal operation are relatively easy to gather, but during unusual events or faults it is generally difficult to collect process information, and noisy industrial process data can be almost impossible to analyze. In such cases, noise filtering techniques can be used to enhance process monitoring performance in real time. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated on discrete batch process data. The results showed that monitoring performance improved significantly in terms of the monitoring success rate for the given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

29573 Open Science Philosophy, Research and Innovation

Authors: C. Ardil

Abstract:

Open Science translates the understanding and application of various theories and practices in open science philosophy, systems, paradigms and epistemology. Open Science originates with the premise that universal scientific knowledge is a product of a collective scholarly and social collaboration involving all stakeholders and knowledge belongs to the global society. Scientific outputs generated by public research are a public good that should be available to all at no cost and without barriers or restrictions. Open Science has the potential to increase the quality, impact and benefits of science and to accelerate advancement of knowledge by making it more reliable, more efficient and accurate, better understandable by society and responsive to societal challenges, and has the potential to enable growth and innovation through reuse of scientific results by all stakeholders at all levels of society, and ultimately contribute to growth and competitiveness of global society. Open Science is a global movement to improve accessibility to and reusability of research practices and outputs. In its broadest definition, it encompasses open access to publications, open research data and methods, open source, open educational resources, open evaluation, and citizen science. The implementation of open science provides an excellent opportunity to renegotiate the social roles and responsibilities of publicly funded research and to rethink the science system as a whole. Open Science is the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods. 
Open Science represents a novel systematic approach to the scientific process, shifting from the standard practices of publishing research results in scientific publications towards sharing and using all available knowledge at an earlier stage in the research process, based on cooperative work and diffusing scholarly knowledge with no barriers and restrictions. Open Science refers to efforts to make the primary outputs of publicly funded research results (publications and the research data) publicly accessible in digital format with no limitations. Open Science is about extending the principles of openness to the whole research cycle, fostering, sharing and collaboration as early as possible, thus entailing a systemic change to the way science and research is done. Open Science is the ongoing transition in how open research is carried out, disseminated, deployed, and transformed to make scholarly research more open, global, collaborative, creative and closer to society. Open Science involves various movements aiming to remove the barriers for sharing any kind of output, resources, methods or tools, at any stage of the research process. Open Science embraces open access to publications, research data, source software, collaboration, peer review, notebooks, educational resources, monographs, citizen science, or research crowdfunding. The recognition and adoption of open science practices, including open science policies that increase open access to scientific literature and encourage data and code sharing, is increasing in the open science philosophy. Revolutionary open science policies are motivated by ethical, moral or utilitarian arguments, such as the right to access digital research literature for open source research or science data accumulation, research indicators, transparency in the field of academic practice, and reproducibility. Open science philosophy is adopted primarily to demonstrate the benefits of open science practices. 
Researchers use open science applications to their own advantage in order to get more offers, increase citations, and attract media attention, potential collaborators, career opportunities, donations, and funding opportunities. In open science philosophy, open data findings are evidence that open science practices provide significant benefits to researchers in the creation, collaboration, communication, and evaluation of scientific research compared with more traditional closed science practices. Open science takes into account concerns such as the rigor of peer review, common research realities such as financing and career development, and the sacrifice of author rights. Therefore, researchers are advised to implement open science research within the framework of existing academic evaluation and incentives. As a result, open science research issues are addressed in the areas of publishing, financing, collaboration, resource management and sharing, career development, discussion of open science questions, and conclusions.

Keywords: Open Science, Open Science Philosophy, Open Science Research, Open Science Data

29572 Management Accounting Techniques of Companies Listed on the Stock Exchange in Thailand

Authors: Prateep Wajeetongratana

Abstract:

The objectives of the research were to examine how management accounting techniques were perceived and used by companies listed on the stock exchange and to investigate similarities and differences in management accounting practices between companies listed on the stock exchange and Thai SMEs. Descriptive and inferential statistics were employed. The findings showed that almost all of the companies used traditional management accounting techniques more than advanced management accounting techniques. Four management accounting techniques having no significant association with business characteristics were standard costing, job order costing, process costing. The barriers that Thai SMEs encountered were the lack of a proper accounting system and the accountants' insufficient knowledge of management accounting. The comparison revealed that both companies listed on the stock exchange and Thai SMEs used traditional management accounting techniques more than advanced techniques.

Keywords: companies listed on the stock exchange, financial budget, management accounting, operating budget

29571 Satisfaction in Supreme Financial Disbursement in the Faculty of Science and Technology, Suan Sunandha Rajabhat University

Authors: Adisai Thovicha, Jiranan Pattaphong

Abstract:

The objective of this research is to study satisfaction with the disbursement process of the Faculty of Science and Technology, Suan Sunandha Rajabhat University. The sample consisted of 98 participants who are faculty members and staff of the Faculty of Science and Technology. The sample was drawn using a systematic random sampling technique. A questionnaire was used to collect data. The analysis involved frequency, percentage, mean, and standard deviation. It was found that: (1) most of the 98 faculty members and staff are female, aged between 31 and 40 years, and have been working at the university for 1-5 years; (2) the level of satisfaction with the disbursement process of the Faculty of Science and Technology, Suan Sunandha Rajabhat University is high. When each aspect is considered, the satisfaction level of faculty members and staff of the Faculty of Science and Technology is high with respect to service-providing staff, process, and facilitation.

Keywords: satisfaction of disbursement, petition financing, faculty members, staff

29570 Reducing Power Consumption in Network on Chip Using Scramble Techniques

Authors: Vinayaga Jagadessh Raja, R. Ganesan, S. Ramesh Kumar

Abstract:

An ever more significant fraction of the overall power dissipation of a network-on-chip (NoC) based system-on-chip (SoC) is due to the interconnection scheme. As device dimensions shrink, the power contribution of NoC links starts to compete with that of NoC routers. In this paper, we propose the use of clock gating in data encoding techniques as a viable way to reduce both the power dissipation and the time consumption of NoC links. The proposed scramble scheme exploits the wormhole switching technique: flits are scrambled by the network interface (NI) before they are injected into the network and are decoded by the target NI. This makes the scheme transparent to the underlying network, since the encoder and decoder logic is integrated in the NI and no modification of the router design is required. We evaluate the proposed scramble scheme on a set of representative data streams (both synthetic and extracted from real applications), showing that it is possible to reduce the power contribution of both the self-switching activity and the coupling switching activity in inter-router links.

Keywords: Xilinx 12.1, power consumption, Encoder, NOC
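
The abstract does not give the details of the scramble scheme, so the sketch below uses classic bus-invert coding as a stand-in to show how encoding flits at the network interface can reduce self-switching activity on a link. It is not the authors' scheme, and the flit values and 8-bit link width are hypothetical.

```python
def transitions(a, b):
    """Number of wires that toggle when the link goes from word a to word b."""
    return bin(a ^ b).count("1")

def bus_invert_encode(words, width):
    """Send the complement (plus an invert flag) when that toggles fewer wires."""
    mask = (1 << width) - 1
    prev, encoded = 0, []
    for w in words:
        if transitions(prev, w) > width // 2:
            w_out, inv = (~w) & mask, 1
        else:
            w_out, inv = w, 0
        encoded.append((w_out, inv))
        prev = w_out
    return encoded

def bus_invert_decode(encoded, width):
    mask = (1 << width) - 1
    return [(~w) & mask if inv else w for w, inv in encoded]

def total_transitions(words):
    """Self-switching activity of a word sequence on a link that starts at 0."""
    return sum(transitions(a, b) for a, b in zip([0] + words, words))

flits = [0b00000000, 0b11111110, 0b00000001, 0b11110000]
enc = bus_invert_encode(flits, 8)
raw_activity = total_transitions(flits)
coded_activity = total_transitions([w for w, _ in enc])
```

As in the scheme described above, the encoder and decoder sit at the two ends of the path, so the routers in between forward the coded flits unchanged. The extra invert wire adds a few toggles of its own, which a fuller comparison would include.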

29569 Mobile Learning: Toward Better Understanding of Compression Techniques

Authors: Farouk Lawan Gambo

Abstract:

Data compression shrinks files into fewer bits than their original representation. This is especially advantageous on the internet, because the smaller a file is, the faster it can be transferred. However, most concepts in data compression are abstract in nature, which makes them difficult for some students (engineers in particular) to digest. To determine the best approach to learning data compression techniques, this paper first studies the learning preferences of engineering students, who tend to have strong active, sensing, visual, and sequential learning preferences. The paper also studies the advantages that mobile learning offers: learning at the point of interest, efficiency, connection, and many more. A survey was carried out with a reasonable number of students, selected through random sampling, to see whether taking the learning preferences and the advantages of mobile learning into account gives a promising improvement over the traditional way of learning. Evidence from data analysis, carried out in MS Excel to keep the findings free of errors, shows a significant difference in the students after they used learning content provided on a smartphone. The findings, presented in bar charts and pie charts, indicate that mobile learning is a promising mode of learning.

Keywords: data analysis, compression techniques, learning content, traditional learning approach
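
As an example of the kind of concept such learning content might cover, run-length encoding is among the simplest compression techniques: it replaces each run of repeated symbols with the symbol and a count. The abstract does not say which techniques were taught, so this choice is an assumption.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, run length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

encoded = rle_encode("aaabccdddd")
restored = rle_decode(encoded)
```

The scheme only saves space when the input contains long runs, which is a useful first lesson in why compression depends on redundancy in the data.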

29568 Omni: Data Science Platform for Evaluate Performance of a LoRaWAN Network

Authors: Emanuele A. Solagna, Ricardo S. Tozetto, Roberto dos S. Rabello

Abstract:

Physical processes are increasingly being digitized thanks to the evolution of communication, sensing, and storage technologies, which promote the development of smart cities. The evolution of this technology has generated multiple challenges related to the generation of big data and the active participation of electronic devices in society. Devices can send information that is captured and processed over large areas, but there is no guarantee that all of the data obtained will be effectively stored and correctly persisted, because, depending on the technology used, some parameters have a huge influence on the full delivery of information. This article aims to characterize a project, currently under development, for a data science based platform that will evaluate the performance and effectiveness of an industrial network implementing LoRaWAN technology, considering the configuration of its main parameters and relating these parameters to information loss.

Keywords: Internet of Things, LoRa, LoRaWAN, smart cities

Procedia PDF Downloads 116
29567 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated with statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. To that end, we used two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 513
29566 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to predict helpful information. It is a field of research that solves various types of problems. In data mining, classification is an important technique for classifying different kinds of data. Diabetes is a very common disease. This paper implements different classification techniques using the Waikato Environment for Knowledge Analysis (WEKA) on a diabetes dataset and finds which algorithm is most suitable. The best classification algorithm on the diabetes data is Naïve Bayes, with an accuracy of 76.31%; it takes 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 121
29565 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Five implementations of three data mining classification techniques were tested for extracting important insights from tourism data. The aim was to find the best-performing algorithm among those compared for tourism knowledge discovery. The knowledge discovery from data process was used as the process model, and 10-fold cross-validation was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built under different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before information-gain-based attribute selection and J48 (C4.5) (75%) after selection of the attributes most relevant to the class (target) attribute. In terms of model-building time, attribute selection improved the efficiency of all algorithms; the Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented. They reveal intricate, non-trivial insight that simple statistical analysis would not otherwise discover, although the accuracy of the classification algorithms was mediocre.
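The 10-fold cross-validation mentioned in the abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' setup; the function name `k_fold_indices`, the seed, and the sample count are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: every sample lands in exactly one test fold,
    and fold sizes differ by at most one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives k near-equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Example: 25 samples split into 10 folds.
splits = list(k_fold_indices(n_samples=25, k=10))
```

Each model is trained on `train_idx` and scored on `test_idx`; the reported accuracy is then typically the mean over the ten folds.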

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 268
29564 A Review of Methods for Handling Missing Data in the Form of Dropouts in Longitudinal Clinical Trials

Authors: A. Satty, H. Mwambi

Abstract:

Much research based on clinical trial data is characterized by the unavoidable problem of dropout, which shows up as missing or erroneous values. This paper reviews some of the techniques used to address dropout in longitudinal clinical trials. The fundamental concepts of the patterns and mechanisms of dropout are discussed. This study presents five general techniques for handling dropout: (1) deletion methods; (2) imputation-based methods; (3) data augmentation methods; (4) likelihood-based methods; and (5) MNAR-based methods. Under each technique, several methods commonly used to deal with dropout are presented, including a review of the existing literature in which we examine the effectiveness of these methods in the analysis of incomplete data. Two application examples are presented to study the potential strengths and weaknesses of some of the methods under certain dropout mechanisms, as well as to assess the sensitivity of the modelling assumptions.
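To make the first two families above concrete, here is a small sketch of one deletion method (complete-case analysis) and one imputation method (last observation carried forward, LOCF). The abstract does not say which specific methods the paper covers; these two are chosen here only as familiar examples, and the `trial` records are invented.

```python
def complete_cases(subjects):
    """Deletion method: keep only subjects with no missing visit."""
    return {sid: visits for sid, visits in subjects.items() if None not in visits}


def locf(visits):
    """Imputation method: carry the last observed value forward.

    Gaps before the first observation stay None.
    """
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Invented toy data: one outcome per visit, None after a subject drops out.
trial = {
    "s1": [5.0, 4.5, 4.0],
    "s2": [6.0, 5.5, None],
    "s3": [7.0, None, None],
}
kept = complete_cases(trial)
imputed = {sid: locf(visits) for sid, visits in trial.items()}
```

Complete-case analysis discards two of the three toy subjects, while LOCF keeps them by assuming the outcome stays constant after dropout; both choices can bias results, depending on the dropout mechanism.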

Keywords: incomplete longitudinal clinical trials, missing at random (MAR), imputation, weighting methods, sensitivity analysis

Procedia PDF Downloads 386
29563 Application of Association Rule Using Apriori Algorithm for Analysis of Industrial Accidents in 2013-2014 in Indonesia

Authors: Triano Nurhikmat

Abstract:

Along with the progress of science and technology, industrial development in Indonesia has taken place very rapidly. This has accelerated the industrialization of Indonesian society, with the establishment of diverse companies and workplaces. Industrial development is tied to the activity of workers, and these work activities do not rule out the possibility of an accident affecting either the workers or a construction project. The causes of industrial accidents were electrical damage, work procedures, and technical errors. Association rule mining is one of the main techniques in data mining and is the most common form used for finding patterns in a data collection. This research seeks to determine the association relationships among the occurrences of industrial accidents. Using association rule analysis, a pattern was obtained at the second iteration (the large 2-itemset) relating the accident factors to West Jakarta: industrial accidents in West Jakarta caused by electrical damage, with support = 0.2 and confidence = 1, and the reverse pattern with support = 0.2 and confidence = 0.75.
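The support and confidence values quoted above are the two standard measures for an association rule A → B: support is the share of records containing both A and B, and confidence is that share divided by the share containing A. A minimal sketch follows; the records and item labels are invented for illustration and are not the paper's data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Invented accident records; labels are hypothetical.
records = [
    {"electrical", "site_x"},
    {"electrical", "site_x", "procedure"},
    {"site_x", "procedure"},
    {"procedure"},
    {"site_x"},
]
s = support(records, {"electrical", "site_x"})
c_forward = confidence(records, {"electrical"}, {"site_x"})
c_reverse = confidence(records, {"site_x"}, {"electrical"})
```

A rule and its reverse always share the same support but generally differ in confidence, which is why the abstract reports one support value alongside two confidence values.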

Keywords: association rule, data mining, industrial accidents, rules

Procedia PDF Downloads 256
29562 Advantages and Disadvantages of Socioscientific Issue Based Instruction in Science Classrooms: Pre-Service Science Teachers' Views

Authors: Aysegul Evren Yapicioglu

Abstract:

The social roles and responsibilities expected of citizens are increasing due to changing global living conditions. Science education is expected to prepare conscious and sensitive students, because today’s students are the adults of the future. A precondition of this task is teacher education. In the past decade, socioscientific issues have been one of the most important research fields. This study deals with the advantages and disadvantages of socioscientific issue based instruction in science classrooms according to pre-service science teachers’ views. A case study approach, one of the qualitative research designs, was used to explore their views. Fourteen pre-service science teachers participated in the instruction process. Dolphinariums, the Kyoto Protocol, genetically modified organisms, the benefits and harms of recyclable black bags, genetic tests, alternative energy sources, and organ donation are examples of the socioscientific issues that were taught through activities in a special teaching course. Diaries and a focus group interview were used as data collection tools. According to the pre-service science teachers, the advantages of socioscientific issue based instruction in science classrooms fall into six sub-categories: multi-skilling, social awareness, development of thinking, meaningful learning, character and professional development, and contribution to scientific literacy. The disadvantages are challenges for teachers and students and limitations of the teaching and learning process. Finally, this study helps science teachers and researchers overcome the disadvantages and benefit from the advantages of socioscientific issue based instruction in science classrooms.

Keywords: science education, socioscientific issues, socioscientific issue based instruction, pre-service science teacher

Procedia PDF Downloads 154
29561 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges

Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars

Abstract:

In the medical field, applications related to human experiments are frequently limited to small sample sizes, which makes the training of machine learning models quite sensitive and therefore neither very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample rarely exceeds 20 subjects or a small number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is a critical step in a data science analysis process. One naive approach commonly applied by data scientists is to transform the entire database before the resampling phase. However, this can cause the model’s performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms and that it should be applied after the resampling phase to avoid data leakage and improve results.
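The leakage described above can be illustrated with a standardization step. In this sketch, which uses invented numbers and hypothetical helper names rather than the authors' pipeline, the leaky variant estimates the mean and standard deviation on all samples before splitting, while the leak-free variant fits them on the training part only and reuses them on the held-out part.

```python
import statistics


def fit_scaler(values):
    """Return (mean, population std) estimated from the given values only."""
    return statistics.fmean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


# Invented feature values, already split into train and held-out test parts.
train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

# Leaky: the scaler sees the test samples before evaluation.
leaky_mean, leaky_std = fit_scaler(train + test)

# Leak-free: split first, fit on train only, then apply to both parts.
clean_mean, clean_std = fit_scaler(train)
train_scaled = transform(train, clean_mean, clean_std)
test_scaled = transform(test, clean_mean, clean_std)
```

With cross-validation, the same fit-on-train, apply-to-test step has to be repeated inside every fold, so that no fold's test samples influence the statistics used to scale them.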

Keywords: data leakage, data science, machine learning, SSVEP, BCI, overfitting

Procedia PDF Downloads 122
29560 Public Relations for the Faculty of Management Science in Suan Sunandha Rajabhat University

Authors: Narong Anurak

Abstract:

The objectives of this research were to investigate the knowledge and understanding of public relations principles among public relations officials of the Office of the Faculty of Management Science at Suan Sunandha Rajabhat University and to determine an approach to public relations for the Office of the Faculty of Management Science. A questionnaire was used to collect data. The statistics used included frequency, percentage, mean, standard deviation, and regression analysis. The results showed that the public relations officials misunderstood public relations principles. A lack of perception of the media of the target groups, both in-house and outside, caused misunderstanding of the roles, mission, and responsibilities. It would be beneficial for the public relations division and other divisions of the Office of the Faculty of Management Science to be trained and to gain more knowledge and skills in public relations in order to support the organization's public relations work.

Keywords: faculty of management science, preparation in media, public relations, Suan Sunandha Rajabhat University

Procedia PDF Downloads 352
29559 Training a Neural Network Using Input Dropout with Aggressive Reweighting (IDAR) on Datasets with Many Useless Features

Authors: Stylianos Kampakis

Abstract:

This paper presents a new algorithm for neural networks called “Input Dropout with Aggressive Re-weighting” (IDAR), aimed specifically at datasets with many useless features. IDAR combines two techniques (dropout of input neurons and aggressive re-weighting) in order to eliminate the influence of noisy features. The technique can be seen as a generalization of dropout. The algorithm is tested on two different benchmark data sets: a noisy version of the iris dataset and the MADELON data set. Its performance is compared against three other popular techniques for dealing with useless features: L2 regularization, LASSO, and random forests. The results demonstrate that IDAR can be an effective technique for handling data sets with many useless features.
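As background for the first of the two ingredients named above, the sketch below shows ordinary (inverted) dropout applied to the input features of one sample. It is not the IDAR algorithm: the abstract does not describe how the aggressive re-weighting works, so that step is not attempted here, and the function name, drop probability, and sample values are assumptions.

```python
import random


def input_dropout(features, drop_prob, rng):
    """Zero each input feature with probability drop_prob.

    Surviving features are divided by the keep probability so the
    expected value of each input is unchanged (inverted dropout).
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [x / keep_prob if rng.random() < keep_prob else 0.0 for x in features]


rng = random.Random(0)
sample = [0.5, 1.2, -0.3, 2.0]  # invented feature vector
masked = input_dropout(sample, drop_prob=0.5, rng=rng)
```

A fresh mask is drawn for every training sample, and dropout is switched off at prediction time.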

Keywords: neural networks, feature selection, regularization, aggressive reweighting

Procedia PDF Downloads 426
29558 The Meta–Evaluation of Master Degree Theses in Science Program of Evaluation Methodology, Srinakharinwirot University

Authors: Panwasn Mahalawalert

Abstract:

The objective of this study was to conduct a meta-evaluation of Master Degree theses in the Science Program of Evaluation Methodology at Srinakharinwirot University published during 2008-2011. This was a summative meta-evaluation covering all Master Degree theses in the program. Data were collected using a thesis characteristics recording form and a meta-evaluation checklist. The collected data were analyzed in two parts: 1) quantitative data were analyzed with descriptive statistics, presented as frequencies, percentages, means, and standard deviations, and 2) qualitative data were analyzed by content analysis. Regarding thesis characteristics, the results revealed that most of the theses were published in 2011. The largest group of thesis researchers were female and came from government offices. All theses used the Decision-Oriented Evaluation Model, and the objective of all theses was to evaluate a project or curriculum. The most frequently used sampling technique was multistage random sampling, and the most frequently used data-gathering tool was the questionnaire. All of the theses were analyzed with descriptive statistics. The meta-evaluation results revealed that most of the theses were rated fair on the Utility Standards and Feasibility Standards and good on the Propriety Standards and Accuracy Standards.

Keywords: meta-evaluation, evaluation, master degree theses, Srinakharinwirot University

Procedia PDF Downloads 500
29557 Quality of Age Reporting from Tanzania 2012 Census Results: An Assessment Using Whipple’s Index, Myer’s Blended Index, and Age-Sex Accuracy Index

Authors: A. Sathiya Susuman, Hamisi F. Hamisi

Abstract:

Background: Many socio-economic and demographic data are attributed by age and sex. However, a variety of irregularities and misstatements are noted in age-related data, and fewer in sex data because sex rests on biological differences. Noting the misstatement and misreporting of age data despite its importance in demographic and epidemiological studies, this study aims to assess the quality of the 2012 Tanzania Population and Housing Census (PHC) results. Methods: Data for the analysis were downloaded from the Tanzania National Bureau of Statistics. Age heaping and digit preference were measured using summary indices, viz., Whipple’s index, Myers’ blended index, and the Age-Sex Accuracy index. Results: The recorded Whipple’s index for both sexes was 154.43; males had the lowest index, about 152.65, while females had the highest, about 156.07. For Myers’ blended index, the preferences were for digits ‘0’ and ‘5’, while digits ‘1’ and ‘3’ were avoided, for both sexes. Finally, the Age-Sex Accuracy index stood at 59.8, where the sex ratio score was 5.82 and the age ratio scores were 20.89 and 21.4 for males and females, respectively. Conclusion: The evaluation of the 2012 PHC data using these demographic techniques has shown the data to be inaccurate as a result of systematic heaping and digit preference or avoidance. Thus, innovative methods of data collection, along with measuring and minimizing errors using statistical techniques, should be used to ensure the accuracy of age data.
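Two of the indices reported above can be reproduced with a few lines of arithmetic. The sketch assumes the standard definitions: Whipple's index compares the population reported at ages ending in 0 or 5 within the 23 to 62 range against one fifth of the total in that range, and the UN age-sex accuracy (joint) score is three times the sex ratio score plus the male and female age ratio scores. The function names and the uniform test population are illustrative assumptions; only the three scores passed to the second function come from the abstract.

```python
def whipple_index(pop_by_age):
    """Whipple's index from a mapping {single year of age: count}.

    100 means no preference for ages ending in 0 or 5; 500 means every
    reported age in the 23-62 range ends in 0 or 5.
    """
    total = sum(pop_by_age.get(age, 0) for age in range(23, 63))
    heaped = sum(pop_by_age.get(age, 0) for age in range(25, 61, 5))
    return 100.0 * heaped / (total / 5.0)


def age_sex_accuracy_index(sex_ratio_score, male_age_score, female_age_score):
    """UN joint score: 3 x sex ratio score + both age ratio scores."""
    return 3.0 * sex_ratio_score + male_age_score + female_age_score


# Invented population with no heaping: the same count at every age.
uniform = {age: 1000 for age in range(0, 100)}
no_heaping = whipple_index(uniform)

# Scores quoted in the abstract; the result rounds to the reported 59.8.
joint = age_sex_accuracy_index(5.82, 20.89, 21.4)
```

Under the usual UN reading, a Whipple's index above roughly 125 indicates rough data and a joint score above 40 indicates highly inaccurate data, which is consistent with the abstract's conclusion.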

Keywords: age heaping, digit preference/avoidance, summary indices, Whipple’s index, Myer’s index, age-sex accuracy index

Procedia PDF Downloads 443
29556 Rigorous Literature Review: Open Science Policy

Authors: E. T. Svahn

Abstract:

This article documents how open science policy has been perceived in the scientific literature globally throughout history. It also presents the policy needs that persist in enabling safe and effective dissemination of scientific knowledge. This information may be of interest to open science and science policy makers globally, especially in view of the recent adoption of supranational open science policies such as Plan S. The open science policy landscape is in pressing need of assessment regarding its impact on the research community and society at large, as no previous literature review has been conducted on the topic. This study is a rigorous literature review, based on the constructivist grounded theory method, of the full body of scientific publications on open science policy. The articles were selected in 2019 and 2020 from major global knowledge databases. Through the analysis of these articles, two key themes emerged that are seen to shape the relationship between science and society. The first is policy that enables open science in a safe and effective way, and the second is the effect that science policy may have on the research community and the wider society. These findings accentuate that open science policies can have a major impact not only on the research process and the availability of knowledge but also on society itself. As an outcome of this study, a theoretical framework is constructed, and the need for further, higher-level study of open science policy itself becomes apparent.

Keywords: constructivist grounded theory, open science policy, rigorous literature review, science policy

Procedia PDF Downloads 118
29555 Anomaly Detection of Log Analysis using Data Visualization Techniques for Digital Forensics Audit and Investigation

Authors: Mohamed Fadzlee Sulaiman, Zainurrasyid Abdullah, Mohd Zabri Adil Talib, Aswami Fadillah Mohd Ariffin

Abstract:

In common digital forensics cases, an investigation may rely on analysis of the specific and relevant exhibits involved. Usually, the investigating officer defines the goals and objectives and advises the digital forensic analyst on what is to be achieved in reconstructing the trail of evidence while maintaining the specific scope of the investigation. As technology grows, people are starting to realize the importance of cyber security to their organizations, and this new perspective creates awareness that digital forensics auditing must be put in place in order to measure possible threats or attacks against their cyber-infrastructure. Instead of investigating on an incident basis, auditing may broaden the scope of investigation to the level of anomaly detection in the daily operation of an organization’s cyber space. When handling a huge amount of data such as log files, performing a digital forensics audit for a large organization has proven to be an onerous task for the analyst, both in analyzing the huge files and in translating the findings in a way that stakeholders can clearly understand. Data visualization can be emphasized in conducting digital forensic audits and investigations to address both needs. This study will identify the important factors that should be considered when applying data visualization techniques to detect anomalies in a way that meets the objectives of digital forensic audit and investigation.

Keywords: digital forensic, data visualization, anomaly detection, log analysis, forensic audit, visualization techniques

Procedia PDF Downloads 258