Search results for: big data PCA
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25245

Search results for: big data PCA

24105 Change Point Detection Using Random Matrix Theory with Application to Frailty in Elderly Individuals

Authors: Malika Kharouf, Aly Chkeir, Khac Tuan Huynh

Abstract:

Detecting change points in time series data is a challenging problem, especially in scenarios where there is limited prior knowledge regarding the data’s distribution and the nature of the transitions. We present a method designed for detecting changes in the covariance structure of high-dimensional time series data, where the number of variables closely matches the data length. Our objective is to achieve unbiased test statistic estimation under the null hypothesis. We delve into the utilization of Random Matrix Theory to analyze the behavior of our test statistic within a high-dimensional context. Specifically, we illustrate that our test statistic converges pointwise to a normal distribution under the null hypothesis. To assess the effectiveness of our proposed approach, we conduct evaluations on a simulated dataset. Furthermore, we employ our method to examine changes aimed at detecting frailty in the elderly.

Keywords: change point detection, hypothesis tests, random matrix theory, frailty in elderly

Procedia PDF Downloads 56
24104 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use

Authors: Isaura Esther Solano Núñez, David Suarez

Abstract:

The main purpose of this research is to discover what causes death in children of the Wayuu community, and deeply analyze those results in order to take corrective measures to properly control infant mortality. We consider important to determine the reasons that are producing early death in this specific type of population, since they are the most vulnerable to high risk environmental conditions. In this way, the government, through competent authorities, may develop prevention policies and the right measures to avoid an increase of this tragic fact. The methodology used to develop this investigation is data mining, which consists in gaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful to develop this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.

Keywords: malnutrition, data mining, analytical, descriptive, population, Wayuu, indigenous

Procedia PDF Downloads 160
24103 Application of the Mobile Phone for Occupational Self-Inspection Program in Small-Scale Industries

Authors: Jia-Sin Li, Ying-Fang Wang, Cheing-Tong Yan

Abstract:

In this study, an integrated approach of Google Spreadsheet and QR code which is free internet resources was used to improve the inspection procedure. The mobile phone Application(App)was also designed to combine with a web page to create an automatic checklist in order to provide a new integrated information of inspection management system. By means of client-server model, the client App is developed for Android mobile OS and the back end is a web server. It can set up App accounts including authorized data and store some checklist documents in the website. The checklist document URL could generate QR code first and then print and paste on the machine. The user can scan the QR code by the app and filled the checklist in the factory. In the meanwhile, the checklist data will send to the server, it not only save the filled data but also executes the related functions and charts. On the other hand, it also enables auditors and supervisors to facilitate the prevention and response to hazards, as well as immediate report data checks. Finally, statistics and professional analysis are performed using inspection records and other relevant data to not only improve the reliability, integrity of inspection operations and equipment loss control, but also increase plant safety and personnel performance. Therefore, it suggested that the traditional paper-based inspection method could be replaced by the APP which promotes the promotion of industrial security and reduces human error.

Keywords: checklist, Google spreadsheet, APP, self-inspection

Procedia PDF Downloads 118
24102 Industry 4.0 and Supply Chain Integration: Case of Tunisian Industrial Companies

Authors: Rym Ghariani, Ghada Soltane, Younes Boujelbene

Abstract:

Industry 4.0, a set of emerging smart and digital technologies, has been the main focus of operations management researchers and practitioners in recent years. The objective of this research paper is to study the impact of Industry 4.0 on the integration of the supply chain (SCI) in Tunisian industrial companies. A conceptual model to study the relationship between Industry 4.0 technologies and supply chain integration was designed. This model contains three explained variables (Big data, Internet of Things, and Robotics) and one variable to be explained (supply chain integration). In order to answer our research questions and investigate the research hypotheses, principal component analysis and discriminant analysis were used using SPSS26 software. The results reveal that there is a statistically positive impact significant impact of Industry 4.0 (Big data, Internet of Things and Robotics) on the integration of the supply chain. Interestingly, big data has a greater positive impact on supply chain integration than the Internet of Things and robotics.

Keywords: industry 4.0 (I4.0), big data, internet of things, robotics, supply chain integration

Procedia PDF Downloads 62
24101 Analysing Competitive Advantage of IoT and Data Analytics in Smart City Context

Authors: Petra Hofmann, Dana Koniel, Jussi Luukkanen, Walter Nieminen, Lea Hannola, Ilkka Donoghue

Abstract:

The Covid-19 pandemic forced people to isolate and become physically less connected. The pandemic has not only reshaped people’s behaviours and needs but also accelerated digital transformation (DT). DT of cities has become an imperative with the outlook of converting them into smart cities in the future. Embedding digital infrastructure and smart city initiatives as part of normal design, construction, and operation of cities provides a unique opportunity to improve the connection between people. The Internet of Things (IoT) is an emerging technology and one of the drivers in DT. It has disrupted many industries by introducing different services and business models, and IoT solutions are being applied in multiple fields, including smart cities. As IoT and data are fundamentally linked together, IoT solutions can only create value if the data generated by the IoT devices is analysed properly. Extracting relevant conclusions and actionable insights by using established techniques, data analytics contributes significantly to the growth and success of IoT applications and investments. Companies must grasp DT and be prepared to redesign their offerings and business models to remain competitive in today’s marketplace. As there are many IoT solutions available today, the amount of data is tremendous. The challenge for companies is to understand what solutions to focus on and how to prioritise and which data to differentiate from the competition. This paper explains how IoT and data analytics can impact competitive advantage and how companies should approach IoT and data analytics to translate them into concrete offerings and solutions in the smart city context. The study was carried out as a qualitative, literature-based research. A case study is provided to validate the preservation of company’s competitive advantage through smart city solutions. The results of the research contribution provide insights into the different factors and considerations related to creating competitive advantage through IoT and data analytics deployment in the smart city context. Furthermore, this paper proposes a framework that merges the factors and considerations with examples of offerings and solutions in smart cities. The data collected through IoT devices, and the intelligent use of it, can create competitive advantage to companies operating in smart city business. Companies should take into consideration the five forces of competition that shape industries and pay attention to the technological, organisational, and external contexts which define factors for consideration of competitive advantages in the field of IoT and data analytics. Companies that can utilise these key assets in their businesses will most likely conquer the markets and have a strong foothold in the smart city business.

Keywords: data analytics, smart cities, competitive advantage, internet of things

Procedia PDF Downloads 136
24100 Best Season for Seismic Survey in Zaria Area, Nigeria: Data Quality and Implications

Authors: Ibe O. Stephen, Egwuonwu N. Gabriel

Abstract:

Variations in seismic P-wave velocity and depth resolution resulting from variations in subsurface water saturation were investigated in this study in order to determine the season of the year that gives the most reliable P-wave velocity and depth resolution of the subsurface in Zaria Area, Nigeria. A 2D seismic refraction tomography technique involving an ABEM Terraloc MK6 Seismograph was used to collect data across a borehole of standard log with the centre of the spread situated at the borehole site. Using the same parameters this procedure was repeated along the same spread for at least once in a month for at least eight months in a year for four years. The choice for each survey time depended on when there was significant variation in rainfall data. The seismic data collected were tomographically inverted. The results suggested that the average P-wave velocity ranges of the subsurface in the area are generally higher when the ground was wet than when it was dry. The results also suggested that the overburden of about 9.0 m in thickness, the weathered basement of about 14.0 m in thickness and the fractured basement at a depth of about 23.0 m best fitted the borehole log. This best fit was consistently obtained in the months between March and May when the average total rainfall was about 44.8 mm in the area. The results had also shown that the velocity ranges in both dry and wet formations fall within the standard ranges as provided in literature. In terms of velocity, this study has not in any way clearly distinguished the quality of the results of the seismic data obtained when the subsurface was dry from the results of the data collected when the subsurface was wet. It was concluded that for more detailed and reliable seismic studies in Zaria Area and its environs with similar climatic condition, the surveys are best conducted between March and May. The most reliable seismic data for depth resolution are most likely obtainable in the area between March and May.

Keywords: best season, variations in depth resolution, variations in P-wave velocity, variations in subsurface water saturation, Zaria area

Procedia PDF Downloads 290
24099 Quick Sequential Search Algorithm Used to Decode High-Frequency Matrices

Authors: Mohammed M. Siddeq, Mohammed H. Rasheed, Omar M. Salih, Marcos A. Rodrigues

Abstract:

This research proposes a data encoding and decoding method based on the Matrix Minimization algorithm. This algorithm is applied to high-frequency coefficients for compression/encoding. The algorithm starts by converting every three coefficients to a single value; this is accomplished based on three different keys. The decoding/decompression uses a search method called QSS (Quick Sequential Search) Decoding Algorithm presented in this research based on the sequential search to recover the exact coefficients. In the next step, the decoded data are saved in an auxiliary array. The basic idea behind the auxiliary array is to save all possible decoded coefficients; this is because another algorithm, such as conventional sequential search, could retrieve encoded/compressed data independently from the proposed algorithm. The experimental results showed that our proposed decoding algorithm retrieves original data faster than conventional sequential search algorithms.

Keywords: matrix minimization algorithm, decoding sequential search algorithm, image compression, DCT, DWT

Procedia PDF Downloads 152
24098 Structuring and Visualizing Healthcare Claims Data Using Systems Architecture Methodology

Authors: Inas S. Khayal, Weiping Zhou, Jonathan Skinner

Abstract:

Healthcare delivery systems around the world are in crisis. The need to improve health outcomes while decreasing healthcare costs have led to an imminent call to action to transform the healthcare delivery system. While Bioinformatics and Biomedical Engineering have primarily focused on biological level data and biomedical technology, there is clear evidence of the importance of the delivery of care on patient outcomes. Classic singular decomposition approaches from reductionist science are not capable of explaining complex systems. Approaches and methods from systems science and systems engineering are utilized to structure healthcare delivery system data. Specifically, systems architecture is used to develop a multi-scale and multi-dimensional characterization of the healthcare delivery system, defined here as the Healthcare Delivery System Knowledge Base. This paper is the first to contribute a new method of structuring and visualizing a multi-dimensional and multi-scale healthcare delivery system using systems architecture in order to better understand healthcare delivery.

Keywords: health informatics, systems thinking, systems architecture, healthcare delivery system, data analytics

Procedia PDF Downloads 348
24097 Cleaning of Scientific References in Large Patent Databases Using Rule-Based Scoring and Clustering

Authors: Emiel Caron

Abstract:

Patent databases contain patent related data, organized in a relational data model, and are used to produce various patent statistics. These databases store raw data about scientific references cited by patents. For example, Patstat holds references to tens of millions of scientific journal publications and conference proceedings. These references might be used to connect patent databases with bibliographic databases, e.g. to study to the relation between science, technology, and innovation in various domains. Problematic in such studies is the low data quality of the references, i.e. they are often ambiguous, unstructured, and incomplete. Moreover, a complete bibliographic reference is stored in only one attribute. Therefore, a computerized cleaning and disambiguation method for large patent databases is developed in this work. The method uses rule-based scoring and clustering. The rules are based on bibliographic metadata, retrieved from the raw data by regular expressions, and are transparent and adaptable. The rules in combination with string similarity measures are used to detect pairs of records that are potential duplicates. Due to the scoring, different rules can be combined, to join scientific references, i.e. the rules reinforce each other. The scores are based on expert knowledge and initial method evaluation. After the scoring, pairs of scientific references that are above a certain threshold, are clustered by means of single-linkage clustering algorithm to form connected components. The method is designed to disambiguate all the scientific references in the Patstat database. The performance evaluation of the clustering method, on a large golden set with highly cited papers, shows on average a 99% precision and a 95% recall. The method is therefore accurate but careful, i.e. it weighs precision over recall. Consequently, separate clusters of high precision are sometimes formed, when there is not enough evidence for connecting scientific references, e.g. in the case of missing year and journal information for a reference. The clusters produced by the method can be used to directly link the Patstat database with bibliographic databases as the Web of Science or Scopus.

Keywords: clustering, data cleaning, data disambiguation, data mining, patent analysis, scientometrics

Procedia PDF Downloads 194
24096 A Human Centered Design of an Exoskeleton Using Multibody Simulation

Authors: Sebastian Kölbl, Thomas Reitmaier, Mathias Hartmann

Abstract:

Trial and error approaches to adapt wearable support structures to human physiology are time consuming and elaborate. However, during preliminary design, the focus lies on understanding the interaction between exoskeleton and the human body in terms of forces and moments, namely body mechanics. For the study at hand, a multi-body simulation approach has been enhanced to evaluate actual forces and moments in a human dummy model with and without a digital mock-up of an active exoskeleton. Therefore, different motion data have been gathered and processed to perform a musculosceletal analysis. The motion data are ground reaction forces, electromyography data (EMG) and human motion data recorded with a marker-based motion capture system. Based on the experimental data, the response of the human dummy model has been calibrated. Subsequently, the scalable human dummy model, in conjunction with the motion data, is connected with the exoskeleton structure. The results of the human-machine interaction (HMI) simulation platform are in particular resulting contact forces and human joint forces to compare with admissible values with regard to the human physiology. Furthermore, it provides feedback for the sizing of the exoskeleton structure in terms of resulting interface forces (stress justification) and the effect of its compliance. A stepwise approach for the setup and validation of the modeling strategy is presented and the potential for a more time and cost-effective development of wearable support structures is outlined.

Keywords: assistive devices, ergonomic design, inverse dynamics, inverse kinematics, multibody simulation

Procedia PDF Downloads 163
24095 Evaluation of the Nursing Management Course in Undergraduate Nursing Programs of State Universities in Turkey

Authors: Oznur Ispir, Oya Celebi Cakiroglu, Esengul Elibol, Emine Ceribas, Gizem Acikgoz, Hande Yesilbas, Merve Tarhan

Abstract:

This study was conducted to evaluate the academic staff teaching the 'Nursing Management' course in the undergraduate nursing programs of the state universities in Turkey and to assess the current content of the course. Design of the study is descriptive. Population of the study consists of seventy-eight undergraduate nursing programs in the state universities in Turkey. The questionnaire/survey prepared by the researchers was used as a data collection tool. The data were obtained by screening the content of the websites of nursing education programs between March and May 2016. Descriptive statistics were used to analyze the data. The research performed within the study indicated that 58% of the undergraduate nursing programs from which the data were derived were included in the school of health, 81% of the academic staff graduated from the undergraduate nursing programs, 40% worked as a lecturer and 37% specialized in a field other than the nursing. The research also implied that the above-mentioned course was included in 98% of the programs from which it was possible to obtain data. The full name of the course was 'Nursing Management' in 95% of the programs and 98% stated that the course was compulsory. Theory and application hours were 3.13 and 2.91, respectively. Moreover, the content of the course was not shared in 65% of the programs reviewed. This study demonstrated that the experience and expertise of the academic staff teaching the 'Nursing Management' course was not sufficient in the management area, and the schedule and content of the course were not sufficient although many nursing education programs provided the course. Comparison between the curricula of the course revealed significant differences.

Keywords: nursing, nursing management, nursing management course, undergraduate program

Procedia PDF Downloads 359
24094 The DAQ Debugger for iFDAQ of the COMPASS Experiment

Authors: Y. Bai, M. Bodlak, V. Frolov, S. Huber, V. Jary, I. Konorov, D. Levit, J. Novy, D. Steffen, O. Subrt, M. Virius

Abstract:

In general, state-of-the-art Data Acquisition Systems (DAQ) in high energy physics experiments must satisfy high requirements in terms of reliability, efficiency and data rate capability. This paper presents the development and deployment of a debugging tool named DAQ Debugger for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. Utilizing a hardware event builder, the iFDAQ is designed to be able to readout data at the average maximum rate of 1.5 GB/s of the experiment. In complex softwares, such as the iFDAQ, having thousands of lines of code, the debugging process is absolutely essential to reveal all software issues. Unfortunately, conventional debugging of the iFDAQ is not possible during the real data taking. The DAQ Debugger is a tool for identifying a problem, isolating the source of the problem, and then either correcting the problem or determining a way to work around it. It provides the layer for an easy integration to any process and has no impact on the process performance. Based on handling of system signals, the DAQ Debugger represents an alternative to conventional debuggers provided by most integrated development environments. Whenever problem occurs, it generates reports containing all necessary information important for a deeper investigation and analysis. The DAQ Debugger was fully incorporated to all processes in the iFDAQ during the run 2016. It helped to reveal remaining software issues and improved significantly the stability of the system in comparison with the previous run. In the paper, we present the DAQ Debugger from several insights and discuss it in a detailed way.

Keywords: DAQ Debugger, data acquisition system, FPGA, system signals, Qt framework

Procedia PDF Downloads 284
24093 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: information retrieval, unified medical language system, syntax based analysis, natural language processing, medical informatics

Procedia PDF Downloads 135
24092 Developing Logistics Indices for Turkey as an an Indicator of Economic Activity

Authors: Gizem İntepe, Eti Mizrahi

Abstract:

Investment and financing decisions are influenced by various economic features. Detailed analysis should be conducted in order to make decisions not only by companies but also by governments. Such analysis can be conducted either at the company level or on a sectoral basis to reduce risks and to maximize profits. Sectoral disaggregation caused by seasonality effects, subventions, data advantages or disadvantages may appear in sectors behaving parallel to BIST (Borsa Istanbul stock exchange) Index. Proposed logistic indices could serve market needs as a decision parameter in sectoral basis and also helps forecasting activities in import export volume changes. Also it is an indicator of logistic activity, which is also a sign of economic mobility at the national level. Publicly available data from “Ministry of Transport, Maritime Affairs and Communications” and “Turkish Statistical Institute” is utilized to obtain five logistics indices namely as; exLogistic, imLogistic, fLogistic, dLogistic and cLogistic index. Then, efficiency and reliability of these indices are tested.

Keywords: economic activity, export trade data, import trade data, logistics indices

Procedia PDF Downloads 337
24091 Using Non-Negative Matrix Factorization Based on Satellite Imagery for the Collection of Agricultural Statistics

Authors: Benyelles Zakaria, Yousfi Djaafar, Karoui Moussa Sofiane

Abstract:

Agriculture is fundamental and remains an important objective in the Algerian economy, based on traditional techniques and structures, it generally has a purpose of consumption. Collection of agricultural statistics in Algeria is done using traditional methods, which consists of investigating the use of land through survey and field survey. These statistics suffer from problems such as poor data quality, the long delay between collection of their last final availability and high cost compared to their limited use. The objective of this work is to develop a processing chain for a reliable inventory of agricultural land by trying to develop and implement a new method of extracting information. Indeed, this methodology allowed us to combine data from remote sensing and field data to collect statistics on areas of different land. The contribution of remote sensing in the improvement of agricultural statistics, in terms of area, has been studied in the wilaya of Sidi Bel Abbes. It is in this context that we applied a method for extracting information from satellite images. This method is called the non-negative matrix factorization, which does not consider the pixel as a single entity, but will look for components the pixel itself. The results obtained by the application of the MNF were compared with field data and the results obtained by the method of maximum likelihood. We have seen a rapprochement between the most important results of the FMN and those of field data. We believe that this method of extracting information from satellite data leads to interesting results of different types of land uses.

Keywords: blind source separation, hyper-spectral image, non-negative matrix factorization, remote sensing

Procedia PDF Downloads 423
24090 Estimation of Coefficient of Discharge of Side Trapezoidal Labyrinth Weir Using Group Method of Data Handling Technique

Authors: M. A. Ansari, A. Hussain, A. Uddin

Abstract:

A side weir is a flow diversion structure provided in the side wall of a channel to divert water from the main channel to a branch channel. The trapezoidal labyrinth weir is a special type of weir in which crest length of the weir is increased to pass higher discharge. Experimental and numerical studies related to the coefficient of discharge of trapezoidal labyrinth weir in an open channel have been presented in the present study. Group Method of Data Handling (GMDH) with the transfer function of quadratic polynomial has been used to predict the coefficient of discharge for the side trapezoidal labyrinth weir. A new model is developed for coefficient of discharge of labyrinth weir by regression method. Generalized models for predicting the coefficient of discharge for labyrinth weir using Group Method of Data Handling (GMDH) network have also been developed. The prediction based on GMDH model is more satisfactory than those given by traditional regression equations.

Keywords: discharge coefficient, group method of data handling, open channel, side labyrinth weir

Procedia PDF Downloads 160
24089 Integration of Resistivity and Seismic Refraction Using Combine Inversion for Ancient River Findings at Sungai Batu, Lembah Bujang, Malaysia

Authors: Rais Yusoh, Rosli Saad, Mokhtar Saidin, Fauzi Andika, Sabiu Bala Muhammad

Abstract:

Resistivity and seismic refraction profiling have become a common method in pre-investigations for visualizing subsurface structure. The integration of the methods could reduce an interpretation ambiguity. Both methods have their individual software packages for data inversion, but potential to combine certain geophysical methods are restricted; however, the research algorithms that have this functionality was existed and are evaluated personally. The interpretation of subsurface were improve by combining inversion data from both methods by influence each other models using closure coupling; thus, by implementing both methods to support each other which could improve the subsurface interpretation. These methods were applied on a field dataset from a pre-investigation for archeology in finding the ancient river. There were no major changes in the inverted model by combining data inversion for this archetype which probably due to complex geology. The combine data analysis provides an additional technique for interpretation such as an alluvium, which can have strong influence on the ancient river findings.

Keywords: ancient river, combine inversion, resistivity, seismic refraction

Procedia PDF Downloads 334
24088 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 63
24087 ROOP: Translating Sequential Code Fragments to Distributed Code Fragments Using Deep Reinforcement Learning

Authors: Arun Sanjel, Greg Speegle

Abstract:

Every second, massive amounts of data are generated, and Data Intensive Scalable Computing (DISC) frameworks have evolved into effective tools for analyzing such massive amounts of data. Since the underlying architecture of these distributed computing platforms is often new to users, building a DISC application can often be time-consuming and prone to errors. The automated conversion of a sequential program to a DISC program will consequently significantly improve productivity. However, synthesizing a user’s intended program from an input specification is complex, with several important applications, such as distributed program synthesizing and code refactoring. Existing works such as Tyro and Casper rely entirely on deductive synthesis techniques or similar program synthesis approaches. Our approach is to develop a data-driven synthesis technique to identify sequential components and translate them to equivalent distributed operations. We emphasize using reinforcement learning and unit testing as feedback mechanisms to achieve our objectives.

Keywords: program synthesis, distributed computing, reinforcement learning, unit testing, DISC

Procedia PDF Downloads 109
24086 A Novel Technological Approach to Maintaining the Cold Chain during Transportation

Authors: Philip J. Purnell

Abstract:

Innovators propose to use the Internet of Things to solve the problem of maintaining the cold chain during the transport of biopharmaceutical products. Sending a data logger with refrigerated goods is only useful to inform the recipient of the goods that they have either breached the cold chain and are therefore potentially spoiled or that they have not breached it and are therefore assumed to be in good condition. Connecting the data logger to the Internet of Things means that the supply chain manager will be informed in real-time of the exact location and the precise temperature of the material at any point on earth. Readable using a simple online interface, the supply chain manager will watch the progress of their material on a Google map together with accurate and crucially real-time temperature readings. The data logger will also send alarms to the supply chain manager if a cold chain breach becomes imminent allowing them time to contact the transporter and restore the cold chain before the material is affected. This development is expected to save billions of dollars in wasted biologics that currently arrive either spoiled or in an unreliable condition.

Keywords: internet of things, cold chain, data logger, transportation

Procedia PDF Downloads 442
24085 Positioning a Southern Inclusive Framework Embedded in the Social Model of Disability Theory Contextualised for Guyana

Authors: Lidon Lashley

Abstract:

This paper presents how the social model of disability can be used to reshape inclusive education practices in Guyana. Inclusive education in Guyana is metamorphosizing but still firmly held in the tenets of the Medical Model of Disability which influences the experiences of children with Special Education Needs and/or Disabilities (SEN/D). An ethnographic approach to data gathering was employed in this study. Qualitative data was gathered from the voices of children with and without SEN/D as well as their mainstream teachers to present the interplay of discourses and subjectivities in the situation. The data was analyzed using Adele Clarke's postmodern approach to grounded theory analysis called situational analysis. The data suggest that it is possible but will be challenging to fully contextualize and adopt Loreman's synthesis and Booths and Ainscow's Index in the two mainstream schools studied. In addition, the data paved the way for the presentation of the social model framework specific to Guyana called 'Southern Inclusive Education Framework for Guyana' and its support tool called 'The Inclusive Checker created for Southern mainstream primary classrooms.

Keywords: social model of disability, medical model of disability, subjectivities, metamorphosis, special education needs, postcolonial Guyana, inclusion, culture, mainstream primary schools, Loreman's synthesis, Booths and Ainscow's index

Procedia PDF Downloads 162
24084 An Extensible Software Infrastructure for Computer Aided Custom Monitoring of Patients in Smart Homes

Authors: Ritwik Dutta, Marylin Wolf

Abstract:

This paper describes the trade-offs and the design from scratch of a self-contained, easy-to-use health dashboard software system that provides customizable data tracking for patients in smart homes. The system is made up of different software modules and comprises a front-end and a back-end component. Built with HTML, CSS, and JavaScript, the front-end allows adding users, logging into the system, selecting metrics, and specifying health goals. The back-end consists of a NoSQL Mongo database, a Python script, and a SimpleHTTPServer written in Python. The database stores user profiles and health data in JSON format. The Python script makes use of the PyMongo driver library to query the database and displays formatted data as a daily snapshot of user health metrics against target goals. Any number of standard and custom metrics can be added to the system, and corresponding health data can be fed automatically, via sensor APIs or manually, as text or picture data files. A real-time METAR request API permits correlating weather data with patient health, and an advanced query system is implemented to allow trend analysis of selected health metrics over custom time intervals. Available on the GitHub repository system, the project is free to use for academic purposes of learning and experimenting, or practical purposes by building on it.

Keywords: flask, Java, JavaScript, health monitoring, long-term care, Mongo, Python, smart home, software engineering, webserver

Procedia PDF Downloads 391
24083 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two big databases from Swiss nutrition national survey (menuCH) and Swiss health national survey 2012 for data mining purposes. Each database has a demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied with food consumption. Design: Swiss nutrition national survey (menuCH) with approx. 2000 respondents from two different surveys, one by Phone and the other by questionnaire along with Swiss health national survey 2012 with 21500 respondents were pre-processed, cleaned and finally integrated to a unique relational database. Results: The result of this study is an integrated relational database from the Swiss nutritional and health databases.

Keywords: health informatics, data mining, nutritional and health databases, nutritional and chronical databases

Procedia PDF Downloads 112
24082 Clustering Performance Analysis using New Correlation-Based Cluster Validity Indices

Authors: Nathakhun Wiroonsri

Abstract:

There are various cluster validity measures used for evaluating clustering results. One of the main objectives of using these measures is to seek the optimal unknown number of clusters. Some measures work well for clusters with different densities, sizes and shapes. Yet, one of the weaknesses that those validity measures share is that they sometimes provide only one clear optimal number of clusters. That number is actually unknown and there might be more than one potential sub-optimal option that a user may wish to choose based on different applications. We develop two new cluster validity indices based on a correlation between an actual distance between a pair of data points and a centroid distance of clusters that the two points are located in. Our proposed indices constantly yield several peaks at different numbers of clusters which overcome the weakness previously stated. Furthermore, the introduced correlation can also be used for evaluating the quality of a selected clustering result. Several experiments in different scenarios, including the well-known iris data set and a real-world marketing application, have been conducted to compare the proposed validity indices with several well-known ones.

Keywords: clustering algorithm, cluster validity measure, correlation, data partitions, iris data set, marketing, pattern recognition

Procedia PDF Downloads 103
24081 Analyzing Competitive Advantage of Internet of Things and Data Analytics in Smart City Context

Authors: Petra Hofmann, Dana Koniel, Jussi Luukkanen, Walter Nieminen, Lea Hannola, Ilkka Donoghue

Abstract:

The Covid-19 pandemic forced people to isolate and become physically less connected. The pandemic hasnot only reshaped people’s behaviours and needs but also accelerated digital transformation (DT). DT of cities has become an imperative with the outlook of converting them into smart cities in the future. Embedding digital infrastructure and smart city initiatives as part of the normal design, construction, and operation of cities provides a unique opportunity to improve connection between people. Internet of Things (IoT) is an emerging technology and one of the drivers in DT. It has disrupted many industries by introducing different services and business models, and IoT solutions are being applied in multiple fields, including smart cities. As IoT and data are fundamentally linked together, IoT solutions can only create value if the data generated by the IoT devices is analysed properly. Extracting relevant conclusions and actionable insights by using established techniques, data analytics contributes significantly to the growth and success of IoT applications and investments. Companies must grasp DT and be prepared to redesign their offerings and business models to remain competitive in today’s marketplace. As there are many IoT solutions available today, the amount of data is tremendous. The challenge for companies is to understand what solutions to focus on and how to prioritise and which data to differentiate from the competition. This paper explains how IoT and data analytics can impact competitive advantage and how companies should approach IoT and data analytics to translate them into concrete offerings and solutions in the smart city context. The study was carried out as a qualitative, literature-based research. A case study is provided to validate the preservation of company’s competitive advantage through smart city solutions. The results of the researchcontribution provide insights into the different factors and considerations related to creating competitive advantage through IoT and data analytics deployment in the smart city context. Furthermore, this paper proposes a framework that merges the factors and considerations with examples of offerings and solutions in smart cities. The data collected through IoT devices, and the intelligent use of it, can create a competitive advantage to companies operating in smart city business. Companies should take into consideration the five forces of competition that shape industries and pay attention to the technological, organisational, and external contexts which define factors for consideration of competitive advantages in the field of IoT and data analytics. Companies that can utilise these key assets in their businesses will most likely conquer the markets and have a strong foothold in the smart city business.

Keywords: internet of things, data analytics, smart cities, competitive advantage

Procedia PDF Downloads 95
24080 Social Data-Based Users Profiles' Enrichment

Authors: Amel Hannech, Mehdi Adda, Hamid Mcheick

Abstract:

In this paper, we propose a generic model of user profile integrating several elements that may positively impact the research process. We exploit the classical behavior of users and integrate a delimitation process of their research activities into several research sessions enriched with contextual and temporal information, which allows reflecting the current interests of these users in every period of time and infer data freshness. We argue that the annotation of resources gives more transparency on users' needs. It also strengthens social links among resources and users, and can so increase the scope of the user profile. Based on this idea, we integrate the social tagging practice in order to exploit the social users' behavior to enrich their profiles. These profiles are then integrated into a recommendation system in order to predict the interesting personalized items of users allowing to assist them in their researches and further enrich their profiles. In this recommendation, we provide users new research experiences.

Keywords: user profiles, topical ontology, contextual information, folksonomies, tags' clusters, data freshness, association rules, data recommendation

Procedia PDF Downloads 265
24079 Impact of External Temperature on the Speleothem Growth in the Moravian Karst

Authors: Frantisek Odvarka

Abstract:

Based on the data from the Moravian Karst, the influence of the calcite speleothem growth by selected meteorological factors was evaluated. External temperature was determined as one of the main factors influencing speleothem growth in Moravian Karst. This factor significantly influences the CO₂ concentration in soil/epikarst, and cave atmosphere in the Moravian Karst and significantly contributes to the changes in the CO₂ partial pressure differences between soil/epikarst and cave atmosphere in Moravian Karst, which determines the drip water supersaturation with respect to the calcite and quantity of precipitated calcite in the Moravian Karst cave environment. External air temperatures and cave air temperatures were measured using a COMET S3120 data logger, which can measure temperatures in the range from -30 to +80 °C with an accuracy of ± 0.4 °C. CO₂ concentrations in the cave and soils were measured with a FT A600 CO₂H Ahlborn probe (value range 0 ppmv to 10,000 ppmv, accuracy 1 ppmv), which was connected to the data logger ALMEMO 2290-4, V5 Ahlborn. The soil temperature was measured with a FHA646E1 Ahlborn probe (temperature range -20 to 70 °C, accuracy ± 0.4 °C) connected to an ALMEMO 2290-4 V5 Ahlborn data logger. The airflow velocities into and out of the cave were monitored by a FVA395 TH4 Thermo anemometer (speed range from 0.05 to 2 m s⁻¹, accuracy ± 0.04 m s⁻¹), which was connected to the ALMEMO 2590-4 V5 Ahlborn data logger for recording. The flow was measured in the lower and upper entrance of the Imperial Cave. The data were analyzed in MS Office Excel 2019 and PHREEQC.

Keywords: speleothem growth, carbon dioxide partial pressure, Moravian Karst, external temperature

Procedia PDF Downloads 144
24078 Using Data Mining in Automotive Safety

Authors: Carine Cridelich, Pablo Juesas Cano, Emmanuel Ramasso, Noureddine Zerhouni, Bernd Weiler

Abstract:

Safety is one of the most important considerations when buying a new car. While active safety aims at avoiding accidents, passive safety systems such as airbags and seat belts protect the occupant in case of an accident. In addition to legal regulations, organizations like Euro NCAP provide consumers with an independent assessment of the safety performance of cars and drive the development of safety systems in automobile industry. Those ratings are mainly based on injury assessment reference values derived from physical parameters measured in dummies during a car crash test. The components and sub-systems of a safety system are designed to achieve the required restraint performance. Sled tests and other types of tests are then carried out by car makers and their suppliers to confirm the protection level of the safety system. A Knowledge Discovery in Databases (KDD) process is proposed in order to minimize the number of tests. The KDD process is based on the data emerging from sled tests according to Euro NCAP specifications. About 30 parameters of the passive safety systems from different data sources (crash data, dummy protocol) are first analysed together with experts opinions. A procedure is proposed to manage missing data and validated on real data sets. Finally, a procedure is developed to estimate a set of rough initial parameters of the passive system before testing aiming at reducing the number of tests.

Keywords: KDD process, passive safety systems, sled test, dummy injury assessment reference values, frontal impact

Procedia PDF Downloads 382
24077 Comparative Study of Greenhouse Locations through Satellite Images and Geographic Information System: Methodological Evaluation in Venezuela

Authors: Maria A. Castillo H., Andrés R. Leandro C.

Abstract:

During the last decades, agricultural productivity in Latin America has increased with precision agriculture and more efficient agricultural technologies. The use of automated systems, satellite images, geographic information systems, and tools for data analysis, and artificial intelligence have contributed to making more effective strategic decisions. Twenty years ago, the state of Mérida, located in the Venezuelan Andes, reported the largest area covered by greenhouses in the country, where certified seeds of potatoes, vegetables, ornamentals, and flowers were produced for export and consumption in the central region of the country. In recent years, it is estimated that production under greenhouses has changed, and the area covered has decreased due to different factors, but there are few historical statistical data in sufficient quantity and quality to support this estimate or to be used for analysis and decision making. The objective of this study is to compare data collected about geoposition, use, and covered areas of the greenhouses in 2007 to data available in 2021, as support for the analysis of the current situation of horticultural production in the main municipalities of the state of Mérida. The document presents the development of the work in the diagnosis and integration of geographic coordinates in GIS and data analysis phases. As a result, an evaluation of the process is made, a dashboard is presented with the most relevant data along with the geographical coordinates integrated into GIS, and an analysis of the obtained information is made. Finally, some recommendations for actions are added, and works that expand the information obtained and its geographical traceability over time are proposed. This study contributes to granting greater certainty in the supporting data for the evaluation of social, environmental, and economic sustainability indicators and to make better decisions according to the sustainable development goals in the area under review. At the same time, the methodology provides improvements to the agricultural data collection process that can be extended to other study areas and crops.

Keywords: greenhouses, geographic information system, protected agriculture, data analysis, Venezuela

Procedia PDF Downloads 93
24076 Modelling Consistency and Change of Social Attitudes in 7 Years of Longitudinal Data

Authors: Paul Campbell, Nicholas Biddle

Abstract:

There is a complex, endogenous relationship between individual circumstances, attitudes, and behaviour. This study uses longitudinal panel data to assess changes in social and political attitudes over a 7-year period. Attitudes are captured with the question 'what is the most important issue facing Australia today', collected at multiple time points in a longitudinal survey of 2200 Australians. Consistency of attitudes, and factors predicting change over time, are assessed. The consistency of responses has methodological implications for data collection, specifically how often such questions ought to be asked of a population. When change in attitude is observed, this study assesses the extent to which individual demographic characteristics, personality traits, and broader societal events predict change.

Keywords: attitudes, longitudinal survey analysis, personality, social values

Procedia PDF Downloads 136