Search results for: data mining techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28978

Search results for: data mining techniques

28588 Exploring Social Impact of Emerging Technologies from Futuristic Data

Authors: Heeyeul Kwon, Yongtae Park

Abstract:

Despite the highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to the market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely incorporated in such a circumstance. However, literatures face a major limitation in terms of sole reliance on participatory workshop activities. They unfortunately missed out the availability of a massive untapped data source of futuristic information flooding through the Internet. This research thus seeks to gain insights into utilization of futuristic data, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.

Keywords: emerging technologies, futuristic data, scenario, text mining

Procedia PDF Downloads 476
28587 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 150
28586 Use of In-line Data Analytics and Empirical Model for Early Fault Detection

Authors: Hyun-Woo Cho

Abstract:

Automatic process monitoring schemes are designed to give early warnings for unusual process events or abnormalities as soon as possible. For this end, various techniques have been developed and utilized in various industrial processes. It includes multivariate statistical methods, representation skills in reduced spaces, kernel-based nonlinear techniques, etc. This work presents a nonlinear empirical monitoring scheme for batch type production processes with incomplete process measurement data. While normal operation data are easy to get, unusual fault data occurs infrequently and thus are difficult to collect. In this work, noise filtering steps are added in order to enhance monitoring performance by eliminating irrelevant information of the data. The performance of the monitoring scheme was demonstrated using batch process data. The results showed that the monitoring performance was improved significantly in terms of detection success rate of process fault.

Keywords: batch process, monitoring, measurement, kernel method

Procedia PDF Downloads 305
28585 Knowledge Discovery from Production Databases for Hierarchical Process Control

Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata

Abstract:

The paper gives the results of the project that was oriented on the usage of knowledge discoveries from production systems for needs of the hierarchical process control. One of the main project goals was the proposal of knowledge discovery model for process control. Specifics data mining methods and techniques was used for defined problems of the process control. The gained knowledge was used on the real production system, thus, the proposed solution has been verified. The paper documents how it is possible to apply new discovery knowledge to be used in the real hierarchical process control. There are specified the opportunities for application of the proposed knowledge discovery model for hierarchical process control.

Keywords: hierarchical process control, knowledge discovery from databases, neural network, process control

Procedia PDF Downloads 460
28584 Occupational Safety and Health in the Wake of Drones

Authors: Hoda Rahmani, Gary Weckman

Abstract:

The body of research examining the integration of drones into various industries is expanding rapidly. Despite progress made in addressing the cybersecurity concerns for commercial drones, knowledge deficits remain in determining potential occupational hazards and risks of drone use to employees’ well-being and health in the workplace. This creates difficulty in identifying key approaches to risk mitigation strategies and thus reflects the need for raising awareness among employers, safety professionals, and policymakers about workplace drone-related accidents. The purpose of this study is to investigate the prevalence of and possible risk factors for drone-related mishaps by comparing the application of drones in construction with manufacturing industries. The chief reason for considering these specific sectors is to ascertain whether there exists any significant difference between indoor and outdoor flights since most construction sites use drones outside and vice versa. Therefore, the current research seeks to examine the causes and patterns of workplace drone-related mishaps and suggest possible ergonomic interventions through data collection. Potential ergonomic practices to mitigate hazards associated with flying drones could include providing operators with professional pieces of training, conducting a risk analysis, and promoting the use of personal protective equipment. For the purpose of data analysis, two data mining techniques, the random forest and association rule mining algorithms, will be performed to find meaningful associations and trends in data as well as influential features that have an impact on the occurrence of drone-related accidents in construction and manufacturing sectors. In addition, Spearman’s correlation and chi-square tests will be used to measure the possible correlation between different variables. Indeed, by recognizing risks and hazards, occupational safety stakeholders will be able to pursue data-driven and evidence-based policy change with the aim of reducing drone mishaps, increasing productivity, creating a safer work environment, and extending human performance in safe and fulfilling ways. This research study was supported by the National Institute for Occupational Safety and Health through the Pilot Research Project Training Program of the University of Cincinnati Education and Research Center Grant #T42OH008432.

Keywords: commercial drones, ergonomic interventions, occupational safety, pattern recognition

Procedia PDF Downloads 184
28583 Analysis Mechanized Boring (TBM) of Tehran Subway Line 7

Authors: Shahin Shabani, Pouya Pourmadadi

Abstract:

Tunnel boring machines (TBMs) have been used for the construction of various tunnels for mining projects for the purpose of access, conveyance of ore and waste, drainage, exploration, water supply and water diversion. Several mining projects have seen the successful and economic beneficial use of TBMs, and there is an increasing awareness of the benefits of TBMs for mining projects. Key technical considerations for the use of TBMs for the construction of tunnels for mining projects include geological issues (rock type, rock alteration, rock strength, rock abrasivity, durability, ground water inflows), depth of cover and the potential for overstressing/rockbursts, site access and terrain, portal locations, TBM constraints, minimum tunnel size, tunnel support requirements, contractor and labor experience, and project schedule demands. This study focuses on tunnelling mining, with the goal to develop methods and tools to be used to gain understanding of these processes, and to analyze metro of Tehran. The Metro Line 7 of Tehran is one of the Longest (26 Km) and deepest (27m) of projects that’s under implementation. Because of major differences like passing under all geotechnical layers of the town and encountering part of it with underground water table and also using mechanized excavation system, is one of special metro projects.

Keywords: TBM, tunnel boring machines economic, metro, line 7

Procedia PDF Downloads 370
28582 Using Implicit Data to Improve E-Learning Systems

Authors: Slah Alsaleh

Abstract:

In the recent years and with popularity of internet and technology, e-learning became a major part of majority of education systems. One of the advantages the e-learning systems provide is the large amount of information available about the students' behavior while communicating with the e-learning system. Such information is very rich and it can be used to improve the capability and efficiency of e-learning systems. This paper discusses how e-learning can benefit from implicit data in different ways including; creating homogeneous groups of student, evaluating students' learning, creating behavior profiles for students and identifying the students through their behaviors.

Keywords: e-learning, implicit data, user behavior, data mining

Procedia PDF Downloads 295
28581 Poultry in Motion: Text Mining Social Media Data for Avian Influenza Surveillance in the UK

Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves

Abstract:

Background: Avian influenza, more commonly known as Bird flu, is a viral zoonotic respiratory disease stemming from various species of poultry, including pets and migratory birds. Researchers have purported that the accessibility of health information online, in addition to the low-cost data collection methods the internet provides, has revolutionized the methods in which epidemiological and disease surveillance data is utilized. This paper examines the feasibility of using internet data sources, such as Twitter and livestock forums, for the early detection of the avian flu outbreak, through the use of text mining algorithms and social network analysis. Methods: Social media mining was conducted on Twitter between the period of 01/01/2021 to 31/12/2021 via the Twitter API in Python. The results were filtered firstly by hashtags (#avianflu, #birdflu), word occurrences (avian flu, bird flu, H5N1), and then refined further by location to include only those results from within the UK. Analysis was conducted on this text in a time-series manner to determine keyword frequencies and topic modeling to uncover insights in the text prior to a confirmed outbreak. Further analysis was performed by examining clinical signs (e.g., swollen head, blue comb, dullness) within the time series prior to the confirmed avian flu outbreak by the Animal and Plant Health Agency (APHA). Results: The increased search results in Google and avian flu-related tweets showed a correlation in time with the confirmed cases. Topic modeling uncovered clusters of word occurrences relating to livestock biosecurity, disposal of dead birds, and prevention measures. Conclusions: Text mining social media data can prove to be useful in relation to analysing discussed topics for epidemiological surveillance purposes, especially given the lack of applied research in the veterinary domain. The small sample size of tweets for certain weekly time periods makes it difficult to provide statistically plausible results, in addition to a great amount of textual noise in the data.

Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, avian influenza, social media

Procedia PDF Downloads 91
28580 Location Privacy Preservation of Vehicle Data In Internet of Vehicles

Authors: Ying Ying Liu, Austin Cooke, Parimala Thulasiraman

Abstract:

Internet of Things (IoT) has attracted a recent spark in research on Internet of Vehicles (IoV). In this paper, we focus on one research area in IoV: preserving location privacy of vehicle data. We discuss existing location privacy preserving techniques and provide a scheme for evaluating these techniques under IoV traffic condition. We propose a different strategy in applying Differential Privacy using k-d tree data structure to preserve location privacy and experiment on real world Gowalla data set. We show that our strategy produces differentially private data, good preservation of utility by achieving similar regression accuracy to the original dataset on an LSTM (Long Term Short Term Memory) neural network traffic predictor.

Keywords: differential privacy, internet of things, internet of vehicles, location privacy, privacy preservation scheme

Procedia PDF Downloads 161
28579 Exploring Twitter Data on Human Rights Activism on Olympics Stage through Social Network Analysis and Mining

Authors: Teklu Urgessa, Joong Seek Lee

Abstract:

Social media is becoming the primary choice of activists to make their voices heard. This fact is coupled by two main reasons. The first reason is the emergence web 2.0, which gave the users opportunity to become content creators than passive recipients. Secondly the control of the mainstream mass media outlets by the governments and individuals with their political and economic interests. This paper aimed at exploring twitter data of network actors talking about the marathon silver medalists on Rio2016, who showed solidarity with the Oromo protesters in Ethiopia on the marathon race finish line when he won silver. The aim is to discover important insight using social network analysis and mining. The hashtag #FeyisaLelisa was used for Twitter network search. The actors’ network was visualized and analyzed. It showed the central influencers during first 10 days in August, were international media outlets while it was changed to individual activist in September. The degree distribution of the network is scale free where the frequency of degrees decay by power low. Text mining was also used to arrive at meaningful themes from tweet corpus about the event selected for analysis. The semantic network indicated important clusters of concepts (15) that provided different insight regarding the why, who, where, how of the situation related to the event. The sentiments of the words in the tweets were also analyzed and indicated that 95% of the opinions in the tweets were either positive or neutral. Overall, the finding showed that Olympic stage protest of the marathoner brought the issue of Oromo protest to the global stage. The new research framework is proposed based for event-based social network analysis and mining based on the practical procedures followed in this research for event-based social media sense making.

Keywords: human rights, Olympics, social media, network analysis, social network ming

Procedia PDF Downloads 235
28578 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.

Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making

Procedia PDF Downloads 57
28577 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Continuous monitoring of industrial plants is one of necessary tasks when it comes to ensuring high-quality final products. In terms of monitoring and diagnosis, it is quite critical and important to detect some incipient abnormal events of manufacturing processes in order to improve safety and reliability of operations involved and to reduce related losses. In this work a new multivariate statistical online diagnostic method is presented using a case study. For building some reference models an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques especially in terms of incipient detection of any faults occurred.

Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring

Procedia PDF Downloads 387
28576 Constraining the Potential Nickel Laterite Area Using Geographic Information System-Based Multi-Criteria Rating in Surigao Del Sur

Authors: Reiner-Ace P. Mateo, Vince Paolo F. Obille

Abstract:

The traditional method of classifying the potential mineral resources requires a significant amount of time and money. In this paper, an alternative way to classify potential mineral resources with GIS application in Surigao del Sur. The three (3) analog map data inputs integrated to GIS are geologic map, topographic map, and land cover/vegetation map. The indicators used in the classification of potential nickel laterite integrated from the analog map data inputs are a geologic indicator, which is the presence of ultramafic rock from the geologic map; slope indicator and the presence of plateau edges from the topographic map; areas of forest land, grassland, and shrublands from the land cover/vegetation map. The potential mineral of the area was classified from low up to very high potential. The produced mineral potential classification map of Surigao del Sur has an estimated 4.63% low nickel laterite potential, 42.15% medium nickel laterite potential, 43.34% high nickel laterite potential, and 9.88% very high nickel laterite from its ultramafic terrains. For the validation of the produced map, it was compared with known occurrences of nickel laterite in the area using a nickel mining tenement map from the area with the application of remote sensing. Three (3) prominent nickel mining companies were delineated in the study area. The generated potential classification map of nickel-laterite in Surigao Del Sur may be of aid to the mining companies which are currently in the exploration phase in the study area. Also, the currently operating nickel mines in the study area can help to validate the reliability of the mineral classification map produced.

Keywords: mineral potential classification, nickel laterites, GIS, remote sensing, Surigao del Sur

Procedia PDF Downloads 109
28575 Relevance Feedback within CBIR Systems

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

We present here the results for a comparative study of some techniques, available in the literature, related to the relevance feedback mechanism in the case of a short-term learning. Only one method among those considered here is belonging to the data mining field which is the K-Nearest Neighbours Algorithm (KNN) while the rest of the methods is related purely to the information retrieval field and they fall under the purview of the following three major axes: Shifting query, Feature Weighting and the optimization of the parameters of similarity metric. As a contribution, and in addition to the comparative purpose, we propose a new version of the KNN algorithm referred to as an incremental KNN which is distinct from the original version in the sense that besides the influence of the seeds, the rate of the actual target image is influenced also by the images already rated. The results presented here have been obtained after experiments conducted on the Wang database for one iteration and utilizing colour moments on the RGB space. This compact descriptor, Colour Moments, is adequate for the efficiency purposes needed in the case of interactive systems. The results obtained allow us to claim that the proposed algorithm proves good results; it even outperforms a wide range of techniques available in the literature.

Keywords: CBIR, category search, relevance feedback, query point movement, standard Rocchio’s formula, adaptive shifting query, feature weighting, original KNN, incremental KNN

Procedia PDF Downloads 266
28574 Accountant Strategists Challenge the Dominant Business Model: A Strategy-as-Practice Perspective

Authors: Lindie Grebe

Abstract:

This paper reports on a study that explored the strategizing practices of professional accountants in the mining industry, based on Jarratt and Stiles’ dominant strategizing practice models framework. Drawing on a strategy-as-practice perspective, the paper recognises qualified professional accountants in strategic management such as Chief Executive Officers, as strategy practitioners that perform their strategizing practices and praxis within a specific context. The main findings of this paper were produced through semi-structured individual interviews with accountants that perform strategy on a business level in the South African mining industry. Qualitative data were analysed through conversation analysis over two coding-cycles. Findings describe accountant strategists as practitioners who challenge the dominant business model when a disconnect seems to exist between international corporate level strategy and business level strategy in the South African mining industry. Accountant strategy practitioners described their dominant strategizing practice model as incremental change during strategic planning and as a lived experience during strategy implementation. Findings portrayed these strategists as taking initiative as strategy leaders in a dynamic and volatile environment to combine their accounting background with strategic management and challenge the dominant business model. Understanding how accountant strategists perform strategizing offers insight into the social practice of strategic management. This understanding contributes to the body of knowledge on strategizing in the South African mining industry. In addition, knowledge on the transformation of accountants as strategists could provide valuable practice relevant insights for accounting educators and the accounting profession alike.

Keywords: accountant strategists, dominant strategizing practice models framework, mining industry, strategy-as-practice

Procedia PDF Downloads 156
28573 Optimize Data Evaluation Metrics for Fraud Detection Using Machine Learning

Authors: Jennifer Leach, Umashanger Thayasivam

Abstract:

The use of technology has benefited society in more ways than one ever thought possible. Unfortunately, though, as society’s knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate people. This has led to a simultaneous advancement in the world of fraud. Machine learning techniques can offer a possible solution to help decrease this advancement. This research explores how the use of various machine learning techniques can aid in detecting fraudulent activity across two different types of fraudulent data, and the accuracy, precision, recall, and F1 were recorded for each method. Each machine learning model was also tested across five different training and testing splits in order to discover which testing split and technique would lead to the most optimal results.

Keywords: data science, fraud detection, machine learning, supervised learning

Procedia PDF Downloads 169
28572 Evaluating 8D Reports Using Text-Mining

Authors: Benjamin Kuester, Bjoern Eilert, Malte Stonis, Ludger Overmeyer

Abstract:

Increasing quality requirements make reliable and effective quality management indispensable. This includes the complaint handling in which the 8D method is widely used. The 8D report as a written documentation of the 8D method is one of the key quality documents as it internally secures the quality standards and acts as a communication medium to the customer. In practice, however, the 8D report is mostly faulty and of poor quality. There is no quality control of 8D reports today. This paper describes the use of natural language processing for the automated evaluation of 8D reports. Based on semantic analysis and text-mining algorithms the presented system is able to uncover content and formal quality deficiencies and thus increases the quality of the complaint processing in the long term.

Keywords: 8D report, complaint management, evaluation system, text-mining

Procedia PDF Downloads 291
28571 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNN in multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. This final image is finally analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 99
28570 Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments

Authors: Talal Alshammari, Nasser Alshammari, Mohamed Sedky, Chris Howard

Abstract:

With the widespread adoption of the Internet-connected devices, and with the prevalence of the Internet of Things (IoT) applications, there is an increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs), is one prominent example. The ability of machine learning technique to find meaningful spatio-temporal relations of high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques to classify ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques. Our results show significant performance differences between the evaluated techniques. Such as AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron and Support Vector Machines (SVM). Overall, neural network based techniques have shown superiority over the other tested techniques.

Keywords: activities of daily living, classification, internet of things, machine learning, prediction, smart home

Procedia PDF Downloads 337
28569 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data

Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim

Abstract:

Smart-card data are expected to provide information on activity pattern as an alternative to conventional person trip surveys. The focus of this study is to propose a method for training the person trip surveys to supplement the smart-card data that does not contain the purpose of each trip. We selected only available features from smart card data such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations to train the survey data. XGboost, which is state-of-the-art tree-based ensemble classifier, was used to train data from multiple sources. This classifier uses a more regularized model formalization to control the over-fitting and show very fast execution time with well-performance. The validation results showed that proposed method efficiently estimated the trip purpose. GIS data of station and duration of stay at the destination were significant features in modeling trip purpose.

Keywords: activity pattern, data fusion, smart-card, XGboost

Procedia PDF Downloads 225
28568 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal will ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability using Diabetes Health Indicators Dataset from Kaggle’s data as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics like accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 54
28567 Principal Component Analysis in Drug-Excipient Interactions

Authors: Farzad Khajavi

Abstract:

Studies about the interaction between active pharmaceutical ingredients (API) and excipients are so important in the pre-formulation stage of development of all dosage forms. Analytical techniques such as differential scanning calorimetry (DSC), Thermal gravimetry (TG), and Furrier transform infrared spectroscopy (FTIR) are commonly used tools for investigating regarding compatibility and incompatibility of APIs with excipients. Sometimes the interpretation of data obtained from these techniques is difficult because of severe overlapping of API spectrum with excipients in their mixtures. Principal component analysis (PCA) as a powerful factor analytical method is used in these situations to resolve data matrices acquired from these analytical techniques. Binary mixtures of API and interested excipients are considered and produced. Peaks of FTIR, DSC, or TG of pure API and excipient and their mixtures at different mole ratios will construct the rows of the data matrix. By applying PCA on the data matrix, the number of principal components (PCs) is determined so that it contains the total variance of the data matrix. By plotting PCs or factors obtained from the score of the matrix in two-dimensional spaces if the pure API and its mixture with the excipient at the high amount of API and the 1:1mixture form a separate cluster and the other cluster comprise of the pure excipient and its blend with the API at the high amount of excipient. This confirms the existence of compatibility between API and the interested excipient. Otherwise, the incompatibility will overcome a mixture of API and excipient.

Keywords: API, compatibility, DSC, TG, interactions

Procedia PDF Downloads 109
28566 Redefining Surgical Innovation in Urology: A Historical Perspective of the Original Publications on Pioneering Techniques in Urology

Authors: Samuel Sii, David Homewood, Brendan Dittmer, Tony Nzembela, Jonathan O’Brien, Niall Corcoran, Dinesh Agarwal

Abstract:

Introduction: Innovation is key to the advancement of medicine and improvement in patient care. This is particularly true in surgery, where pioneering techniques have transformed operative management from a historically highly risky peri-morbid and disfiguring to the contemporary low-risk, sterile and minimally invasive treatment modality. There is a delicate balance between enabling innovation and minimizing patient harm. Publication and discussion of novel surgical techniques allow for independent expert review. Recent journals have increasingly stringent requirements for publications and often require larger case volumes for novel techniques to be published. This potentially impairs the initial publication of novel techniques and slows innovation. The historical perspective provides a better understanding of how requirements for the publication of new techniques have evolved over time. This is essential in overcoming challenges in developing novel techniques. Aims and Objectives: We explore how novel techniques in Urology have been published over the past 200 years. Our objective is to describe the trend and publication requirements of novel urological techniques, both historical and present. Methods: We assessed all major urological operations using multipronged historical analysis. An initial literature search was carried out through PubMed and Google Scholar for original literature descriptions, followed by reference tracing. The first publication of each pioneering urological procedure was recorded. Data collected includes the year of publication, description of the procedure, number of cases and outcomes. Results: 65 papers describing pioneering techniques in Urology were identified. These comprised of 2 experimental studies, 17 case reports and 46 case series. These papers described various pioneering urological techniques in urological oncology, reconstructive urology and endourology. We found that, historically, techniques were published with smaller case numbers. Often, the surgical technique itself was a greater focus of the publication than patient outcome data. These techniques were often adopted prior to larger publications. In contrast, the risks and benefits of recent novel techniques are often well-defined prior to adoption. This historical perspective is important as recent journals have requirements for larger case series and data outcomes. This potentially impairs the initial publication of novel techniques and slows innovation. Conclusion: A better understanding of historical publications and their effect on the adoption of urological techniques into common practice could assist the current generation of Urologists in formulating a safe, efficacious process in promoting surgical innovation and the development of novel surgical techniques. We propose the reassessment of requirements for the publication of novel operative techniques by splitting technical perspectives and data-orientated case series. Existing frameworks such as IDEAL and ASERNIP-S should be integrated into current processes when investigating and developing new surgical techniques to ensure efficacious and safe innovation within surgery is encouraged.

Keywords: urology, surgical innovation, novel surgical techniques, publications

Procedia PDF Downloads 18
28565 A Method to Evaluate and Compare Web Information Extractors

Authors: Patricia Jiménez, Rafael Corchuelo, Hassan A. Sleiman

Abstract:

Web mining is gaining importance at an increasing pace. Currently, there are many complementary research topics under this umbrella. Their common theme is that they all focus on applying knowledge discovery techniques to data that is gathered from the Web. Sometimes, these data are relatively easy to gather, chiefly when it comes from server logs. Unfortunately, there are cases in which the data to be mined is the data that is displayed on a web document. In such cases, it is necessary to apply a pre-processing step to first extract the information of interest from the web documents. Such pre-processing steps are performed using so-called information extractors, which are software components that are typically configured by means of rules that are tailored to extracting the information of interest from a web page and structuring it according to a pre-defined schema. Paramount to getting good mining results is that the technique used to extract the source information is exact, which requires to evaluate and compare the different proposals in the literature from an empirical point of view. According to Google Scholar, about 4 200 papers on information extraction have been published during the last decade. Unfortunately, they were not evaluated within a homogeneous framework, which leads to difficulties to compare them empirically. In this paper, we report on an original information extraction evaluation method. Our contribution is three-fold: a) this is the first attempt to provide an evaluation method for proposals that work on semi-structured documents; the little existing work on this topic focuses on proposals that work on free text, which has little to do with extracting information from semi-structured documents. b) It provides a method that relies on statistically sound tests to support the conclusions drawn; the previous work does not provide clear guidelines or recommend statistically sound tests, but rather a survey that collects many features to take into account as well as related work; c) We provide a novel method to compute the performance measures regarding unsupervised proposals; otherwise they would require the intervention of a user to compute them by using the annotations on the evaluation sets and the information extracted. Our contributions will definitely help researchers in this area make sure that they have advanced the state of the art not only conceptually, but from an empirical point of view; it will also help practitioners make informed decisions on which proposal is the most adequate for a particular problem. This conference is a good forum to discuss on our ideas so that we can spread them to help improve the evaluation of information extraction proposals and gather valuable feedback from other researchers.

Keywords: web information extractors, information extraction evaluation method, Google scholar, web

Procedia PDF Downloads 235
28564 Application of Optimization Techniques in Overcurrent Relay Coordination: A Review

Authors: Syed Auon Raza, Tahir Mahmood, Syed Basit Ali Bukhari

Abstract:

In power system properly coordinated protection scheme is designed to make sure that only the faulty part of the system will be isolated when abnormal operating condition of the system will reach. The complexity of the system as well as the increased user demand and the deregulated environment enforce the utilities to improve system reliability by using a properly coordinated protection scheme. This paper presents overview of over current relay coordination techniques. Different techniques such as Deterministic Techniques, Meta Heuristic Optimization techniques, Hybrid Optimization Techniques, and Trial and Error Optimization Techniques have been reviewed in terms of method of their implementation, operation modes, nature of distribution system, and finally their advantages as well as the disadvantages.

Keywords: distribution system, relay coordination, optimization, Plug Setting Multiplier (PSM)

Procedia PDF Downloads 380
28563 Analysis of Causality between Defect Causes Using Association Rule Mining

Authors: Sangdeok Lee, Sangwon Han, Changtaek Hyun

Abstract:

Construction defects are major components that result in negative impacts on project performance including schedule delays and cost overruns. Since construction defects generally occur when a few associated causes combine, a thorough understanding of defect causality is required in order to more systematically prevent construction defects. To address this issue, this paper uses association rule mining (ARM) to quantify the causality between defect causes, and social network analysis (SNA) to find indirect causality among them. The suggested approach is validated with 350 defect instances from concrete works in 32 projects in Korea. The results show that the interrelationships revealed by the approach reflect the characteristics of the concrete task and the important causes that should be prevented.

Keywords: causality, defect causes, social network analysis, association rule mining

Procedia PDF Downloads 345
28562 Comparative Study od Three Artificial Intelligence Techniques for Rain Domain in Precipitation Forecast

Authors: Nabilah Filzah Mohd Radzuan, Andi Putra, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Precipitation forecast is important to avoid natural disaster incident which can cause losses in the involved area. This paper reviews three techniques logistic regression, decision tree, and random forest which are used in making precipitation forecast. These combination techniques through the vector auto-regression (VAR) model help in finding the advantages and strengths of each technique in the forecast process. The data-set contains variables of the rain’s domain. Adaptation of artificial intelligence techniques involved in rain domain enables the forecast process to be easier and systematic for precipitation forecast.

Keywords: logistic regression, decisions tree, random forest, VAR model

Procedia PDF Downloads 429
28561 The Effects of Applying Linguistic Principles and Teaching Techniques in Teaching English at Secondary School in Thailand

Authors: Wannakarn Likitrattanaporn

Abstract:

The purposes of this investigation were to investigate the effects of applying linguistic principles and teaching techniques in teaching English through experimenting the Adapted English Lessons and to determine the teachers’ opinions as well as students’ opinions towards the Adapted Lessons. The subjects of the study were 5 Thai teachers, who teach English, and 85 Grade 10 mixed-ability students at Triamudom Suksa Pattanakarn Ratchada School, Bangkok, Thailand. The research instruments included the Adapted English Lessons, questionnaires asking teachers’ and students’ opinions towards the Adapted Lessons and the informal interview. The data from the research instruments was collected and analyzed concerning the teachers’ and students’ opinions towards adapting linguistic principles and teaching techniques. Linguistic principles of minimal pair and articulatory phonetics and teaching techniques of mimicry-memorization; vocabulary substitution drills, language pattern drills, reading comprehension exercise, practicing listening, speaking and writing skill and communicative activities; informal talk and free writing are applied. The data was statistically compiled according to an arithmetic percentage. The results showed that the teachers and students have very highly positive opinions towards adapting linguistic principles for teaching and learning phonological accuracy. Teaching techniques provided in the Adapted English Lessons can be used efficiently in the classroom. The teachers and students have positive opinions towards them too.

Keywords: applying linguistic principles and teaching techniques, teachers’ and students’ opinions, teaching English, the adapted English lessons

Procedia PDF Downloads 462
28560 Integrating Data Mining within a Strategic Knowledge Management Framework: A Platform for Sustainable Competitive Advantage within the Australian Minerals and Metals Mining Sector

Authors: Sanaz Moayer, Fang Huang, Scott Gardner

Abstract:

In the highly leveraged business world of today, an organisation’s success depends on how it can manage and organize its traditional and intangible assets. In the knowledge-based economy, knowledge as a valuable asset gives enduring capability to firms competing in rapidly shifting global markets. It can be argued that ability to create unique knowledge assets by configuring ICT and human capabilities, will be a defining factor for international competitive advantage in the mid-21st century. The concept of KM is recognized in the strategy literature, and increasingly by senior decision-makers (particularly in large firms which can achieve scalable benefits), as an important vehicle for stimulating innovation and organisational performance in the knowledge economy. This thinking has been evident in professional services and other knowledge intensive industries for over a decade. It highlights the importance of social capital and the value of the intellectual capital embedded in social and professional networks, complementing the traditional focus on creation of intellectual property assets. Despite the growing interest in KM within professional services there has been limited discussion in relation to multinational resource based industries such as mining and petroleum where the focus has been principally on global portfolio optimization with economies of scale, process efficiencies and cost reduction. The Australian minerals and metals mining industry, although traditionally viewed as capital intensive, employs a significant number of knowledge workers notably- engineers, geologists, highly skilled technicians, legal, finance, accounting, ICT and contracts specialists working in projects or functions, representing potential knowledge silos within the organisation. This silo effect arguably inhibits knowledge sharing and retention by disaggregating corporate memory, with increased operational and project continuity risk. It also may limit the potential for process, product, and service innovation. In this paper the strategic application of knowledge management incorporating contemporary ICT platforms and data mining practices is explored as an important enabler for knowledge discovery, reduction of risk, and retention of corporate knowledge in resource based industries. With reference to the relevant strategy, management, and information systems literature, this paper highlights possible connections (currently undergoing empirical testing), between an Strategic Knowledge Management (SKM) framework incorporating supportive Data Mining (DM) practices and competitive advantage for multinational firms operating within the Australian resource sector. We also propose based on a review of the relevant literature that more effective management of soft and hard systems knowledge is crucial for major Australian firms in all sectors seeking to improve organisational performance through the human and technological capability captured in organisational networks.

Keywords: competitive advantage, data mining, mining organisation, strategic knowledge management

Procedia PDF Downloads 397
28559 The Regulation of Reputational Information in the Sharing Economy

Authors: Emre Bayamlıoğlu

Abstract:

This paper aims to provide an account of the legal and the regulative aspects of the algorithmic reputation systems with a special emphasis on the sharing economy (i.e., Uber, Airbnb, Lyft) business model. The first section starts with an analysis of the legal and commercial nature of the tripartite relationship among the parties, namely, the host platform, individual sharers/service providers and the consumers/users. The section further examines to what extent an algorithmic system of reputational information could serve as an alternative to legal regulation. Shortcomings are explained and analyzed with specific examples from Airbnb Platform which is a pioneering success in the sharing economy. The following section focuses on the issue of governance and control of the reputational information. The section first analyzes the legal consequences of algorithmic filtering systems to detect undesired comments and how a delicate balance could be struck between the competing interests such as freedom of speech, privacy and the integrity of the commercial reputation. The third section deals with the problem of manipulation by users. Indeed many sharing economy businesses employ certain techniques of data mining and natural language processing to verify consistency of the feedback. Software agents referred as "bots" are employed by the users to "produce" fake reputation values. Such automated techniques are deceptive with significant negative effects for undermining the trust upon which the reputational system is built. The third section is devoted to explore the concerns with regard to data mobility, data ownership, and the privacy. Reputational information provided by the consumers in the form of textual comment may be regarded as a writing which is eligible to copyright protection. Algorithmic reputational systems also contain personal data pertaining both the individual entrepreneurs and the consumers. The final section starts with an overview of the notion of reputation as a communitarian and collective form of referential trust and further provides an evaluation of the above legal arguments from the perspective of public interest in the integrity of reputational information. The paper concludes with certain guidelines and design principles for algorithmic reputation systems, to address the above raised legal implications.

Keywords: sharing economy, design principles of algorithmic regulation, reputational systems, personal data protection, privacy

Procedia PDF Downloads 451