Search results for: operational data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25551

Search results for: operational data

24711 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal will ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability using Diabetes Health Indicators Dataset from Kaggle’s data as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics like accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 66
24710 Numerical Study of Bubbling Fluidized Beds Operating at Sub-atmospheric Conditions

Authors: Lanka Dinushke Weerasiri, Subrat Das, Daniel Fabijanic, William Yang

Abstract:

Fluidization at vacuum pressure has been a topic that is of growing research interest. Several industrial applications (such as drying, extractive metallurgy, and chemical vapor deposition (CVD)) can potentially take advantage of vacuum pressure fluidization. Particularly, the fine chemical industry requires processing under safe conditions for thermolabile substances, and reduced pressure fluidized beds offer an alternative. Fluidized beds under vacuum conditions provide optimal conditions for treatment of granular materials where the reduced gas pressure maintains an operational environment outside of flammability conditions. The fluidization at low-pressure is markedly different from the usual gas flow patterns of atmospheric fluidization. The different flow regimes can be characterized by the dimensionless Knudsen number. Nevertheless, hydrodynamics of bubbling vacuum fluidized beds has not been investigated to author’s best knowledge. In this work, the two-fluid numerical method was used to determine the impact of reduced pressure on the fundamental properties of a fluidized bed. The slip flow model implemented by Ansys Fluent User Defined Functions (UDF) was used to determine the interphase momentum exchange coefficient. A wide range of operating pressures was investigated (1.01, 0.5, 0.25, 0.1 and 0.03 Bar). The gas was supplied by a uniform inlet at 1.5Umf and 2Umf. The predicted minimum fluidization velocity (Umf) shows excellent agreement with the experimental data. The results show that the operating pressure has a notable impact on the bed properties and its hydrodynamics. Furthermore, it also shows that the existing Gorosko correlation that predicts bed expansion is not applicable under reduced pressure conditions.

Keywords: computational fluid dynamics, fluidized bed, gas-solid flow, vacuum pressure, slip flow, minimum fluidization velocity

Procedia PDF Downloads 133
24709 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Nowadays, companies are facing lots of challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Furthermore, it is also challenged to integrate data from multiple systems and technologies. Although with these pains, companies are still pursuing digitalization because by embracing advanced technologies, companies can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. In this paper, the issue that data is stored in data silos with different schema and structures is focused. The conventional approaches to addressing this issue involve utilizing data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. In this session, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates the Asset Administration Shell technology to model and map the company’s data and utilize a knowledge graph for data storage and exploration.

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 88
24708 Innovative Design Considerations for Adaptive Spacecraft

Authors: K. Parandhama Gowd

Abstract:

Space technologies have changed the way we live in the present day society and manage many aspects of our daily affairs through Remote sensing, Navigation & Communications. Further, defense and military usage of spacecraft has increased tremendously along with civilian purposes. The number of satellites deployed in space in Low Earth Orbit (LEO), Medium Earth Orbit (MEO), and the Geostationary Orbit (GEO) has gone up. The dependency on remote sensing and operational capabilities are most invariably to be exploited more and more in future. Every country is acquiring spacecraft in one way or other for their daily needs, and spacecraft numbers are likely to increase significantly and create spacecraft traffic problems. The aim of this research paper is to propose innovative design concepts for adaptive spacecraft. The main idea here is to improve existing design methods of spacecraft design and development to further improve upon design considerations for futuristic adaptive spacecraft with inbuilt features for automatic adaptability and self-protection. In other words, the innovative design considerations proposed here are to have future spacecraft with self-organizing capabilities for orbital control and protection from anti-satellite weapons (ASAT). Here, an attempt is made to propose design and develop futuristic spacecraft for 2030 and beyond due to tremendous advancements in VVLSI, miniaturization, and nano antenna array technologies, including nano technologies are expected.

Keywords: satellites, low earth orbit (LEO), medium earth orbit (MEO), geostationary earth orbit (GEO), self-organizing control system, anti-satellite weapons (ASAT), orbital control, radar warning receiver, missile warning receiver, laser warning receiver, attitude and orbit control systems (AOCS), command and data handling (CDH)

Procedia PDF Downloads 289
24707 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services for encrypted information in the Cloud Computing without decrypting the encrypted data; therefore, it meets the yearning of computational encryption algorithmic aspiration model that could enhance the security of big data for privacy, confidentiality, availability of the users. The cryptographic model applied for the computational process of the encrypted data is the Fully Homomorphic Encryption Scheme. We contribute theoretical presentations in high-level computational processes that are based on number theory and algebra that can easily be integrated and leveraged in the Cloud computing with detail theoretic mathematical concepts to the fully homomorphic encryption models. This contribution enhances the full implementation of big data analytics based cryptographic security algorithm.

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 372
24706 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of the online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study focuses on a better understanding of those threats and what legislative measures have been taken to address those challenges. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies based on age groups. The threat to consumers’ privacy and data security is a serious hindrance in developing trust among consumers in online businesses. There are some legislative measures taken at the federal and state level to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 97
24705 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach clustering of GPS data. Evaluation has been made by comparing execution time of various clustering algorithms on GPS data. This paper aims to propose a parallel based on neighborhood K-means algorithm to make it faster. The proposed parallelization approach assumes that each GPS data represents a vehicle and to communicate between vehicles close to each other after vehicles are clustered. This parallelization approach has been examined on different sized continuously changing GPS data and compared with serial K-means algorithm and other serial clustering algorithms. The results demonstrated that proposed parallel K-means algorithm has been shown to work much faster than other clustering algorithms.

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 214
24704 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things is a concept of a large scale ecosystem of wireless actuators. The actuators are defined as things in the IoT, those which contribute or produces some data to the ecosystem. However, ubiquitous data collection, data security, privacy preserving, large volume data processing, and intelligent analytics are some of the key challenges into the IoT technologies. In order to solve the security requirements, challenges and threats in the IoT, we have discussed a message authentication mechanism for IoT applications. Finally, we have discussed data encryption mechanism for messages authentication before propagating into IoT networks.

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 377
24703 Cognitive Science Based Scheduling in Grid Environment

Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya

Abstract:

Grid is infrastructure that allows the deployment of distributed data in large size from multiple locations to reach a common goal. Scheduling data intensive applications becomes challenging as the size of data sets are very huge in size. Only two solutions exist in order to tackle this challenging issue. First, computation which requires huge data sets to be processed can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred since the servers are storage/data servers with little or no computational capability. Hence, the second scenario can be considered for further exploration. During scheduling, transferring huge data sets from one site to another site requires more network bandwidth. In order to mitigate this issue, this work focuses on incorporating cognitive science in scheduling. Cognitive Science is the study of human brain and its related activities. Current researches are mainly focused on to incorporate cognitive science in various computational modeling techniques. In this work, the problem solving approach of human brain is studied and incorporated during the data intensive scheduling in grid environments. Here, a cognitive engine is designed and deployed in various grid sites. The intelligent agents present in CE will help in analyzing the request and creating the knowledge base. Depending upon the link capacity, decision will be taken whether to transfer data sets or to partition the data sets. Prediction of next request is made by the agents to serve the requesting site with data sets in advance. This will reduce the data availability time and data transfer time. Replica catalog and Meta data catalog created by the agents assist in decision making process.

Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence

Procedia PDF Downloads 390
24702 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic changes, and new data appear all the time. In recent years the generation of a large volume of data was characterized, such as forums, blogs, and other sources, which have expanded over time and space, together they constitute large-scale Internet data, known as Big Data. This data of technological origin that derives from the use of devices and the activity of multiple users is becoming a source of great importance for the study of geography and the behavior of tourists. The study will focus on cultural heritage tourist practices in the context of Big Data. The research will focus on exploring the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, target image, perceptions in user-generated content will be studied through data analysis from Weibo -the largest social networks of blogs in China. Through the analysis of the behavior of heritage tourists in the Big Data environment, this study will understand the practices (activities, motivations, perceptions) of cultural tourists and then understand the needs and preferences of tourists in order to better guide the sustainable development of tourism in heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 133
24701 Towards A Framework for Using Open Data for Accountability: A Case Study of A Program to Reduce Corruption

Authors: Darusalam, Jorish Hulstijn, Marijn Janssen

Abstract:

Media has revealed a variety of corruption cases in the regional and local governments all over the world. Many governments pursued many anti-corruption reforms and have created a system of checks and balances. Three types of corruption are faced by citizens; administrative corruption, collusion and extortion. Accountability is one of the benchmarks for building transparent government. The public sector is required to report the results of the programs that have been implemented so that the citizen can judge whether the institution has been working such as economical, efficient and effective. Open Data is offering solutions for the implementation of good governance in organizations who want to be more transparent. In addition, Open Data can create transparency and accountability to the community. The objective of this paper is to build a framework of open data for accountability to combating corruption. This paper will investigate the relationship between open data, and accountability as part of anti-corruption initiatives. This research will investigate the impact of open data implementation on public organization.

Keywords: open data, accountability, anti-corruption, framework

Procedia PDF Downloads 324
24700 Generation of Roof Design Spectra Directly from Uniform Hazard Spectra

Authors: Amin Asgarian, Ghyslaine McClure

Abstract:

Proper seismic evaluation of Non-Structural Components (NSCs) mandates an accurate estimation of floor seismic demands (i.e. acceleration and displacement demands). Most of the current international codes incorporate empirical equations to calculate equivalent static seismic force for which NSCs and their anchorage system must be designed. These equations, in general, are functions of component mass and peak seismic acceleration to which NSCs are subjected to during the earthquake. However, recent studies have shown that these recommendations are suffered from several shortcomings such as neglecting the higher mode effect, tuning effect, NSCs damping effect, etc. which cause underestimation of the component seismic acceleration demand. This work is aimed to circumvent the aforementioned shortcomings of code provisions as well as improving them by proposing a simplified, practical, and yet accurate approach to generate acceleration Floor Design Spectra (FDS) directly from corresponding Uniform Hazard Spectra (UHS) (i.e. design spectra for structural components). A database of 27 Reinforced Concrete (RC) buildings in which Ambient Vibration Measurements (AVM) have been conducted. The database comprises 12 low-rise, 10 medium-rise, and 5 high-rise buildings all located in Montréal, Canada and designated as post-disaster buildings or emergency shelters. The buildings are subjected to a set of 20 compatible seismic records and Floor Response Spectra (FRS) in terms of pseudo acceleration are derived using the proposed approach for every floor of the building in both horizontal directions considering 4 different damping ratios of NSCs (i.e. 2, 5, 10, and 20% viscous damping). Several effective parameters on NSCs response are evaluated statistically. These parameters comprise NSCs damping ratios, tuning of NSCs natural period with one of the natural periods of supporting structure, higher modes of supporting structures, and location of NSCs. The entire spectral region is divided into three distinct segments namely short-period, fundamental period, and long period region. The derived roof floor response spectra for NSCs with 5% damping are compared with the 5% damping UHS and procedure are proposed to generate roof FDS for NSCs with 5% damping directly from 5% damped UHS in each spectral region. The generated FDS is a powerful, practical, and accurate tool for seismic design and assessment of acceleration-sensitive NSCs particularly in existing post-critical buildings which have to remain functional even after the earthquake and cannot tolerate any damage to NSCs.

Keywords: earthquake engineering, operational and functional components (OFCs), operational modal analysis (OMA), seismic assessment and design

Procedia PDF Downloads 235
24699 An E-Maintenance IoT Sensor Node Designed for Fleets of Diverse Heavy-Duty Vehicles

Authors: George Charkoftakis, Panagiotis Liosatos, Nicolas-Alexander Tatlas, Dimitrios Goustouridis, Stelios M. Potirakis

Abstract:

E-maintenance is a relatively new concept, generally referring to maintenance management by monitoring assets over the Internet. One of the key links in the chain of an e-maintenance system is data acquisition and transmission. Specifically for the case of a fleet of heavy-duty vehicles, where the main challenge is the diversity of the vehicles and vehicle-embedded self-diagnostic/reporting technologies, the design of the data acquisition and transmission unit is a demanding task. This clear if one takes into account that a heavy-vehicles fleet assortment may range from vehicles with only a limited number of analog sensors monitored by dashboard light indicators and gauges to vehicles with plethora of sensors monitored by a vehicle computer producing digital reporting. The present work proposes an adaptable internet of things (IoT) sensor node that is capable of addressing this challenge. The proposed sensor node architecture is based on the increasingly popular single-board computer – expansion boards approach. In the proposed solution, the expansion boards undertake the tasks of position identification by means of a global navigation satellite system (GNSS), cellular connectivity by means of 3G/long-term evolution (LTE) modem, connectivity to on-board diagnostics (OBD), and connectivity to analog and digital sensors by means of a novel design of expansion board. Specifically, the later provides eight analog plus three digital sensor channels, as well as one on-board temperature / relative humidity sensor. The specific device offers a number of adaptability features based on appropriate zero-ohm resistor placement and appropriate value selection for limited number of passive components. For example, although in the standard configuration four voltage analog channels with constant voltage sources for the power supply of the corresponding sensors are available, up to two of these voltage channels can be converted to provide power to the connected sensors by means of corresponding constant current source circuits, whereas all parameters of analog sensor power supply and matching circuits are fully configurable offering the advantage of covering a wide variety of industrial sensors. Note that a key feature of the proposed sensor node, ensuring the reliable operation of the connected sensors, is the appropriate supply of external power to the connected sensors and their proper matching to the IoT sensor node. In standard mode, the IoT sensor node communicates to the data center through 3G/LTE, transmitting all digital/digitized sensor data, IoT device identity, and position. Moreover, the proposed IoT sensor node offers WiFi connectivity to mobile devices (smartphones, tablets) equipped with an appropriate application for the manual registration of vehicle- and driver-specific information, and these data are also forwarded to the data center. All control and communication tasks of the IoT sensor node are performed by dedicated firmware. It is programmed with a high-level language (Python) on top of a modern operating system (Linux). Acknowledgment: This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship, and Innovation, under the call RESEARCH—CREATE—INNOVATE (project code: T1EDK- 01359, IntelligentLogger).

Keywords: IoT sensor nodes, e-maintenance, single-board computers, sensor expansion boards, on-board diagnostics

Procedia PDF Downloads 146
24698 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance is to detect or predict disease outbreaks through the analysis of medical sources of data. Using social media data like tweets to do syndromic surveillance becomes more and more popular with the aid of open platform to collect data and the advantage of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework is presented with machine learning kernel using tweets data analytics. Influenza and the three cities Abu Dhabi, Al Ain and Dubai of United Arabic Emirates are used as the test disease and trial areas. Hospital cases data provided by the Health Authority of Abu Dhabi (HAAD) are used for the correlation purpose. In our model, Latent Dirichlet allocation (LDA) engine is adapted to do supervised learning classification and N-Fold cross validation confusion matrix are given as the simulation results with overall system recall 85.595% performance achieved.

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 106
24697 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia

Authors: Yuyun Wabula, B. J. Dewancker

Abstract:

In the past decade, the social networking app has been growing very rapidly. Geolocation data is one of the important features of social media that can attach the user's location coordinate in the real world. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially on human mobility behavior. This paper aims to explore the relation between geolocation Twitter with the existence of people in the urban area. Firstly, the study will analyze the spread of people in the particular area, within the city using Twitter social media data. Secondly, we then match and categorize the existing place based on the same individuals visiting. Then, we combine the Twitter data from the tracking result and the questionnaire data to catch the Twitter user profile. To do that, we used the distribution frequency analysis to learn the visitors’ percentage. To validate the hypothesis, we compare it with the local population statistic data and land use mapping released by the city planning department of Makassar local government. The results show that there is the correlation between Twitter geolocation and questionnaire data. Thus, integration the Twitter data and survey data can reveal the profile of the social media users.

Keywords: geolocation, Twitter, distribution analysis, human mobility

Procedia PDF Downloads 308
24696 Application of Multidimensional Model of Evaluating Organisational Performance in Moroccan Sport Clubs

Authors: Zineb Jibraili, Said Ouhadi, Jorge Arana

Abstract:

Introduction: Organizational performance is recognized by some theorists as one-dimensional concept, and by others as multidimensional. This concept, which is already difficult to apply in traditional companies, is even harder to identify, to measure and to manage when voluntary organizations are concerned, essentially because of the complexity of that form of organizations such as sport clubs who are characterized by the multiple goals and multiple constituencies. Indeed, the new culture of professionalization and modernization around organizational performance emerges new pressures from the state, sponsors, members and other stakeholders which have required these sport organizations to become more performance oriented, or to build their capacity in order to better manage their organizational performance. The evaluation of performance can be made by evaluating the input (e.g. available resources), throughput (e.g. processing of the input) and output (e.g. goals achieved) of the organization. In non-profit organizations (NPOs), questions of performance have become increasingly important in the world of practice. To our knowledge, most of studies used the same methods to evaluate the performance in NPSOs, but no recent study has proposed a club-specific model. Based on a review of the studies that specifically addressed the organizational performance (and effectiveness) of NPSOs at operational level, the present paper aims to provide a multidimensional framework in order to understand, analyse and measure organizational performance of sport clubs. This paper combines all dimensions founded in literature and chooses the most suited of them to our model that we will develop in Moroccan sport clubs case. Method: We propose to implicate our unified model of evaluating organizational performance that takes into account all the limitations found in the literature. On a sample of Moroccan sport clubs ‘Football, Basketball, Handball and Volleyball’, for this purpose we use a qualitative study. The sample of our study comprises data from sport clubs (football, basketball, handball, volleyball) participating on the first division of the professional football league over the period from 2011 to 2016. Each football club had to meet some specific criteria in order to be included in the sample: 1. Each club must have full financial data published in their annual financial statements, audited by an independent chartered accountant. 2. Each club must have sufficient data. Regarding their sport and financial performance. 3. Each club must have participated at least once in the 1st division of the professional football league. Result: The study showed that the dimensions that constitute the model exist in the field with some small modifications. The correlations between the different dimensions are positive. Discussion: The aim of this study is to test the unified model emerged from earlier and narrower approaches for Moroccan case. Using the input-throughput-output model for the sketch of efficiency, it was possible to identify and define five dimensions of organizational effectiveness applied to this field of study.

Keywords: organisational performance, model multidimensional, evaluation organizational performance, sport clubs

Procedia PDF Downloads 316
24695 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 652
24694 Sensor Data Analysis for a Large Mining Major

Authors: Sudipto Shanker Dasgupta

Abstract:

One of the largest mining companies wanted to look at health analytics for their driverless trucks. These trucks were the key to their supply chain logistics. The automated trucks had multi-level sub-assemblies which would send out sensor information. The use case that was worked on was to capture the sensor signal from the truck subcomponents and analyze the health of the trucks from repair and replacement purview. Open source software was used to stream the data into a clustered Hadoop setup in Amazon Web Services cloud and Apache Spark SQL was used to analyze the data. All of this was achieved through a 10 node amazon 32 core, 64 GB RAM setup real-time analytics was achieved on ‘300 million records’. To check the scalability of the system, the cluster was increased to 100 node setup. This talk will highlight how Open Source software was used to achieve the above use case and the insights on the high data throughput on a cloud set up.

Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data

Procedia PDF Downloads 400
24693 Data-Centric Anomaly Detection with Diffusion Models

Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu

Abstract:

Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performances. With the addition of generated images via diffusion models (10% equivalence of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.

Keywords: diffusion models, anomaly detection, data-centric, generative AI

Procedia PDF Downloads 77
24692 A Visual Analytics Tool for the Structural Health Monitoring of an Aircraft Panel

Authors: F. M. Pisano, M. Ciminello

Abstract:

Aerospace, mechanical, and civil engineering infrastructures can take advantages from damage detection and identification strategies in terms of maintenance cost reduction and operational life improvements, as well for safety scopes. The challenge is to detect so called “barely visible impact damage” (BVID), due to low/medium energy impacts, that can progressively compromise the structure integrity. The occurrence of any local change in material properties, that can degrade the structure performance, is to be monitored using so called Structural Health Monitoring (SHM) systems, in charge of comparing the structure states before and after damage occurs. SHM seeks for any "anomalous" response collected by means of sensor networks and then analyzed using appropriate algorithms. Independently of the specific analysis approach adopted for structural damage detection and localization, textual reports, tables and graphs describing possible outlier coordinates and damage severity are usually provided as artifacts to be elaborated for information extraction about the current health conditions of the structure under investigation. Visual Analytics can support the processing of monitored measurements offering data navigation and exploration tools leveraging the native human capabilities of understanding images faster than texts and tables. Herein, a SHM system enrichment by integration of a Visual Analytics component is investigated. Analytical dashboards have been created by combining worksheets, so that a useful Visual Analytics tool is provided to structural analysts for exploring the structure health conditions examined by a Principal Component Analysis based algorithm.

Keywords: interactive dashboards, optical fibers, structural health monitoring, visual analytics

Procedia PDF Downloads 120
24691 Regulation on the Protection of Personal Data Versus Quality Data Assurance in the Healthcare System Case Report

Authors: Elizabeta Krstić Vukelja

Abstract:

Digitization of personal data is a consequence of the development of information and communication technologies that create a new work environment with many advantages and challenges, but also potential threats to privacy and personal data protection. Regulation (EU) 2016/679 of the European Parliament and of the Council is becoming a law and obligation that should address the issues of personal data protection and information security. The existence of the Regulation leads to the conclusion that national legislation in the field of virtual environment, protection of the rights of EU citizens and processing of their personal data is insufficiently effective. In the health system, special emphasis is placed on the processing of special categories of personal data, such as health data. The healthcare industry is recognized as a particularly sensitive area in which a large amount of medical data is processed, the digitization of which enables quick access and quick identification of the health insured. The protection of the individual requires quality IT solutions that guarantee the technical protection of personal categories. However, the real problems are the technical and human nature and the spatial limitations of the application of the Regulation. Some conclusions will be drawn by analyzing the implementation of the basic principles of the Regulation on the example of the Croatian health care system and comparing it with similar activities in other EU member states.

Keywords: regulation, healthcare system, personal dana protection, quality data assurance

Procedia PDF Downloads 32
24690 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.

Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing

Procedia PDF Downloads 259
24689 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that work exclusively with discrete data. While RA has been used in medical applications and social science extensively, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data are binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding type of schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The results of an RA session with a data set are a report on every combination of variables and their probability of landslide events occurring. In this way, every combination of informative state combinations can be examined.

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 55
24688 Data Analytics in Hospitality Industry

Authors: Tammy Wee, Detlev Remy, Arif Perdana

Abstract:

In the recent years, data analytics has become the buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet fully benefited from the insights of data analytics. Effective use of data analytics can change how hotels operate, market and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This research is a preliminary research on data analytics in the hospitality industry, using an in-depth face-to-face interview on one hotel as a start to a multi-level research. The main case study of this research, hotel A, is a chain brand of international hotel that has been systematically gathering and collecting data on its own customer for the past five years. The data collection points begin from the moment a guest book a room until the guest leave the hotel premises, which includes room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customer for some time, they have yet utilized the data to its fullest potential, and they are aware of their limitation as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited in the area of customer service improvement, namely to enhance the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance their service which in turn, encourage repeated customers. According to hotel A, 50% of their guests returned to their hotel, and 70% extended nights because of the personalized service. Apart from using the data analytics for enhancing customer service, hotel A also uses the data in marketing. Hotel A uses the data analytics to predict or forecast the change in consumer behavior and demand, by tracking their guest’s booking preference, payment preference and demand shift between properties. However, hotel A admitted that the data they have been collecting was not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is the data is not clean. At the moment, the data collection of one guest profile is meaningful only for one department in the hotel but meaningless for another department. Cleaning up the data and getting standards correctly for usage by different departments are some of the main concerns of hotel A. The second challenge of using data analytics in hotel A is the non-integral internal system. At the moment, the internal system used by hotel A do not integrate with each other well, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognized the potential of data analytics as reported in this research, however, the current challenges of implementing a system to collect data come with a cost. This research has identified the current utilization of data analytics and the challenges faced when it comes to implementing data analytics.

Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing

Procedia PDF Downloads 171
24687 Sustainability in Higher Education: A Case of Transition Management from a Private University in Turkey (Ongoing Study)

Authors: Ayse Collins

Abstract:

The Agenda 2030 puts Higher Education Institutions (HEIs) in the situation where they should emphasize ways to promote sustainability accordingly. However, it is still unclear: a) how sustainability is understood, and b) which actions have been taken in both discourse and practice by HEIs regarding the three pillars of sustainability, society, environment, and economy. There are models of sustainable universities developed by different authors from different countries; For Example, The Global Reporting Initiative (GRI) methodology which offers a variety of indicators to diagnose performance. However, these models have never been developed for universities in particular. Any model, in this sense, cannot be completed adequately without defining the appropriate tools to measure, analyze and control the performance of initiatives. There is a need to conduct researches in different universities from different countries to understand where we stand in terms of sustainable higher education. Therefore, this study aims at exploring the actions taken by a university in Ankara, Turkey, since Agenda 2030 should consider localizing its objectives and targets according to a certain geography. This university just announced 2021-2022 as “Sustainability Year.” Therefore, this research is a multi-methodology longitudinal study and uses the theoretical framework of the organization and transition management (TM). It is designed to examine the activities as being strategic, tactical, operational, and reflexive in nature and covers the six main aspects: academic community, administrative staff, operations and services, teaching, research, and extension. The preliminary research will answer the role of the top university governance, perception of the stakeholders (students, instructors, administrative and support staff) regarding sustainability, and the level of achievement at the mid-evaluation and final, end of year evaluation. TM Theory is a multi-scale, multi-actor, process-oriented approach with the analytical framework to explore and promote change in social systems. Therefore, the stages and respective methodology for collecting data in this research is: Pre-development Stage: a) semi-structured interviews with university governance, c) open-ended survey with faculty, students, and administrative staff d) Semi-structured interviews with support staff, and e) analysis of current secondary data for sustainability. Take-off Stage: a) semi-structured interviews with university governance, faculty, students, administrative and support staff, b) analysis of secondary data. Breakthrough stabilization a) survey with all stakeholders at the university, b) secondary data analysis by using selected indicators for the first sustainability report for universities The findings from the predevelopment stage highlight how stakeholders, coming from different faculties, different disciplines with different identities and characteristics, face the sustainability challenge differently. Though similar sustainable development goals ((social, environmental, and economic) are set in the institution, there are differences across disciplines and among different stakeholders, which need to be considered to reach the optimum goal. It is believed that the results will help changes in HEIs organizational culture to embed sustainability values in their strategic planning, academic and managerial work by putting enough time and resources to be successful in coping with sustainability.

Keywords: higher education, sustainability, sustainability auditing, transition management

Procedia PDF Downloads 101
24686 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.

Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry

Procedia PDF Downloads 404
24685 Predicting Photovoltaic Energy Profile of Birzeit University Campus Based on Weather Forecast

Authors: Muhammad Abu-Khaizaran, Ahmad Faza’, Tariq Othman, Yahia Yousef

Abstract:

This paper presents a study to provide sufficient and reliable information about constructing a Photovoltaic energy profile of the Birzeit University campus (BZU) based on the weather forecast. The developed Photovoltaic energy profile helps to predict the energy yield of the Photovoltaic systems based on the weather forecast and hence helps planning energy production and consumption. Two models will be developed in this paper; a Clear Sky Irradiance model and a Cloud-Cover Radiation model to predict the irradiance for a clear sky day and a cloudy day, respectively. The adopted procedure for developing such models takes into consideration two levels of abstraction. First, irradiance and weather data were acquired by a sensory (measurement) system installed on the rooftop of the Information Technology College building at Birzeit University campus. Second, power readings of a fully operational 51kW commercial Photovoltaic system installed in the University at the rooftop of the adjacent College of Pharmacy-Nursing and Health Professions building are used to validate the output of a simulation model and to help refine its structure. Based on a comparison between a mathematical model, which calculates Clear Sky Irradiance for the University location and two sets of accumulated measured data, it is found that the simulation system offers an accurate resemblance to the installed PV power station on clear sky days. However, these comparisons show a divergence between the expected energy yield and actual energy yield in extreme weather conditions, including clouding and soiling effects. Therefore, a more accurate prediction model for irradiance that takes into consideration weather factors, such as relative humidity and cloudiness, which affect irradiance, was developed; Cloud-Cover Radiation Model (CRM). The equivalent mathematical formulas implement corrections to provide more accurate inputs to the simulation system. The results of the CRM show a very good match with the actual measured irradiance during a cloudy day. The developed Photovoltaic profile helps in predicting the output energy yield of the Photovoltaic system installed at the University campus based on the predicted weather conditions. The simulation and practical results for both models are in a very good match.

Keywords: clear-sky irradiance model, cloud-cover radiation model, photovoltaic, weather forecast

Procedia PDF Downloads 124
24684 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current

Authors: Lei Ren, Michael Hartnett, Stephen Nash

Abstract:

The abundant measured surface current from HF radar system in coastal area is assimilated into model to improve the modeling forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from models are compared with HF radar measurements. Root-Mean-Squared-Error (RMSE) between modeling results and HF radar measurements is calculated during the last four days with no data assimilation.

Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion

Procedia PDF Downloads 565
24683 Measured versus Default Interstate Traffic Data in New Mexico, USA

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

This study investigates how the site specific traffic data differs from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed in Interstate-40 (I-40) and Interstate-25 (I-25) to developed site specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year data from November 2013 to October 2014 was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axle per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 significantly differ from the default values.

Keywords: AASHTOWare, traffic, weigh-in-motion, axle load distribution

Procedia PDF Downloads 335
24682 Design of Knowledge Management System with Geographic Information System

Authors: Angga Hidayah Ramadhan, Luciana Andrawina, M. Azani Hasibuan

Abstract:

Data will be as a core of the decision if it has a good treatment or process, which is process that data into information, and information into knowledge to make a wisdom or decision. Today, many companies have not realize it include XYZ University Admission Directorate as executor of National Admission called Seleksi Masuk Bersama (SMB) that during the time, the workers only uses their feeling to make a decision. Whereas if it done, then that company can analyze the data to make a right decision to get a pin sales from student candidate or registrant that follow SMB as many as possible. Therefore, needs Knowledge Management System (KMS) with Geographic Information System (GIS) use 5C4C that can process that company data becomes more useful and can help make decisions. This information system can process data into information based on the pin sold data with 5C (Contextualized, Categorize, Calculation, Correction, Condensed) and convert information into knowledge with 4C (Comparing, Consequence, Connection, Conversation) that has been several steps until these data can be useful to make easier to take a decision or wisdom, resolve problems, communicate, and quicker to learn to the employees have not experience and also for ease of viewing/visualization based on spatial data that equipped with GIS functionality that can be used to indicate events in each province with indicator that facilitate in this system. The system also have a function to save the tacit on the system then to be proceed into explicit in expert system based on the problems that will be found from the consequences of information. With the system each team can make a decision with same ways, structured, and the important is based on the actual event/data.

Keywords: 5C4C, data, information, knowledge

Procedia PDF Downloads 459