Search results for: data storage
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25981

24001 Thermoelectric Cooler as a Heat Transfer Device for Thermal Conductivity Test

Authors: Abdul Murad Zainal Abidin, Azahar Mohd, Nor Idayu Arifin, Siti Nor Azila Khalid, Mohd Julzaha Zahari Mohamad Yusof

Abstract:

A thermoelectric cooler (TEC) is an electronic component that uses the Peltier effect to create a temperature difference by transferring heat between two electrical junctions of two different types of materials. A TEC can also be used for heating, by reversing the direction of the electric current, and even for power generation. A heat flow meter (HFM) is an instrument for measuring the thermal conductivity of building materials. During a test, water is used as the heat transfer medium to cool the HFM. Existing re-circulating coolers on the market are very costly, and the alternative is to use piped tap water to extract heat from the HFM. However, the tap water temperature is not low enough for heat transfer to take place. The operating temperature of the isothermal plates in the HFM is 40°C within a range of ±0.02°C; when the temperature exceeds this operating range, the HFM stops working and the test cannot be conducted. The aim of the research is to develop a low-cost but energy-efficient TEC prototype that enables heat transfer without compromising the function of the HFM. The objectives of the research are a) to identify the potential of the TEC as a cooling device by evaluating its cooling rate and b) to determine the water savings achieved by the TEC compared with normal tap water. Four (4) Peltier sets were used, with two (2) sets serving as a pre-cooler. The cooling water is re-circulated from the reservoir into the HFM using a water pump. The thermal conductivity readings, the water flow rate, and the power consumption were measured while the HFM was operating. The measured data showed a decrease in average cooling temperature difference (ΔTave) of 2.42°C and an average cooling rate of 0.031°C/min. The water savings accrued from using the TEC are projected at 8,332.8 litres/year with the application of water re-circulation. The results suggest the prototype has achieved the stated objectives. Further research will include comparing the cooling rate of the TEC prototype against conventional tap water and optimizing the design in terms of size and portability. Possible applications of the prototype could also be expanded to portable storage for medicines and beverages.
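As a rough consistency check on the figures above, the cooling rate and projected water savings can be reproduced with a short script. The temperature readings, tap flow rate, and annual operating hours below are back-calculated assumptions chosen to match the reported numbers; none of them appear in the abstract:

```python
# Sketch: estimating the average cooling rate and annual water savings from
# logged HFM cooling-water temperatures. All inputs here are hypothetical.

temps_c = [31.50, 31.19, 30.88, 30.57]   # assumed readings, 10 min apart
interval_min = 10.0

delta_t = temps_c[0] - temps_c[-1]                            # total temperature drop
cooling_rate = delta_t / (interval_min * (len(temps_c) - 1))  # °C per minute
print(f"cooling rate: {cooling_rate:.3f} °C/min")             # ~0.031 °C/min

# Projected savings from re-circulation: tap water that would otherwise run
# to waste during tests (flow rate and annual test hours are assumptions).
tap_flow_l_per_min = 0.8          # assumed once-through tap flow
test_hours_per_year = 173.6       # assumed annual HFM operating time
savings_l = tap_flow_l_per_min * 60 * test_hours_per_year
print(f"projected savings: {savings_l:.1f} litres/year")      # 8332.8
```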

Keywords: energy efficiency, thermoelectric cooling, pre-cooling device, heat flow meter, sustainable technology, thermal conductivity

Procedia PDF Downloads 144
24000 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large sample sizes and high dimensionality undermine the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for extracting useful knowledge from a variety of databases; in supervised learning they produce classification models that describe important data classes, with the structure of the classifier based on the class attribute. Classification efficiency and accuracy are often strongly influenced by noisy and undesirable features in real application data sets, and the inherent nature of a data set can mask its quality and leave few practical analysis approaches. To our knowledge, we present for the first time an approach for investigating the structure and quality of data sets by providing a targeted analysis that localizes their noisy and irrelevant features. Machine learning relies on feature selection as a pre-processing step, which selects a subset of features and thereby reduces the search space according to a chosen evaluation criterion. The primary objective of this study is to narrow the scope of a given data sample by searching for a small set of important features that yields good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is applied, with an external classifier used for discriminative feature selection; features are selected based on their number of occurrences in the chosen chromosomes. Sample data sets are used to demonstrate the proposed idea. The proposed method improved the average accuracy across different data sets to about 95%. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
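A minimal sketch of the wrapper idea described above, using a genetic algorithm over binary feature masks with KNN as the external evaluator, is shown below. All GA settings (population size, generations, mutation rate) and the demonstration dataset are illustrative assumptions, not the authors' configuration:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)    # stand-in disease dataset
n_features = X.shape[1]

def fitness(mask):
    """Wrapper fitness: cross-validated KNN accuracy on the masked features."""
    if not mask.any():
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop = rng.integers(0, 2, size=(20, n_features)).astype(bool)   # random chromosomes
for _ in range(15):                                            # generations
    scores = np.array([fitness(ind) for ind in pop])
    top = pop[np.argsort(scores)[-10:]]                        # keep the fittest half
    children = []
    for _ in range(10):
        a, b = top[rng.integers(10)], top[rng.integers(10)]
        cut = rng.integers(1, n_features)
        child = np.concatenate([a[:cut], b[cut:]])             # one-point crossover
        flip = rng.random(n_features) < 0.02                   # mutation
        children.append(child ^ flip)
    pop = np.vstack([top, children])

# Rank features by how often they occur in the surviving chromosomes,
# mirroring the occurrence-count selection described in the abstract.
occurrence = pop.sum(axis=0)
selected = np.argsort(occurrence)[-10:]
print("selected feature indices:", sorted(selected.tolist()))
```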

Keywords: data mining, genetic algorithm, KNN algorithm, wrapper-based feature selection

Procedia PDF Downloads 305
23999 Improve Student Performance Prediction Using Majority Vote Ensemble Model for Higher Education

Authors: Wade Ghribi, Abdelmoty M. Ahmed, Ahmed Said Badawy, Belgacem Bouallegue

Abstract:

In higher education institutions, the most pressing priority is to improve student performance and retention. Large volumes of student data are used in Educational Data Mining techniques to find new hidden information in students' learning behavior, particularly to uncover early symptoms of at-risk students. On the other hand, data with noise, outliers, and irrelevant information may lead to incorrect conclusions. By identifying features of students' data that have the potential to improve performance prediction results, comparing and identifying the most appropriate ensemble learning technique after preprocessing the data, and optimizing the hyperparameters, this paper aims to develop a reliable student performance prediction model for higher education institutions. Data were gathered from two different systems: a student information system and an e-learning system for undergraduate students in the College of Computer Science of a Saudi Arabian state university. The cases of 4413 students were used in this study. The process includes data collection, data integration, data preprocessing (cleaning, normalization, and transformation), feature selection, pattern extraction, and, finally, model optimization and assessment. Random Forest, Bagging, Stacking, Majority Vote, and two types of Boosting techniques, AdaBoost and XGBoost, are the ensemble learning approaches considered, while Decision Tree, Support Vector Machine, and Artificial Neural Network serve as the base supervised learning techniques. Hyperparameters of the ensemble learners were fine-tuned to provide enhanced performance and optimal output. The findings imply that combining features of students' behavior from the e-learning and student information systems using Majority Vote produced better outcomes than the other ensemble techniques.
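A minimal sketch of the hard majority-vote idea with the three base learners named above, using scikit-learn; the synthetic data and hyperparameters are placeholders for the study's 4413 student records:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=4413, n_features=20, random_state=0)  # stand-in data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

vote = VotingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(max_depth=8)),
        ("svm", make_pipeline(StandardScaler(), SVC())),
        ("ann", make_pipeline(StandardScaler(), MLPClassifier(max_iter=500))),
    ],
    voting="hard",  # each learner casts one vote; the majority class wins
)
vote.fit(X_tr, y_tr)
print("majority-vote accuracy:", vote.score(X_te, y_te))
```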

Keywords: educational data mining, student performance prediction, e-learning, classification, ensemble learning, higher education

Procedia PDF Downloads 93
23998 Foundation of the Information Model for Connected-Cars

Authors: Hae-Won Seo, Yong-Gu Lee

Abstract:

Recent progress in the next generation of automobile technology is geared towards incorporating information technology into cars. Collectively called smart cars, these vehicles bring intelligence that provides comfort, convenience, and safety. One branch of smart cars is the connected-car system. The key concept in connected cars is the sharing of driving information among cars in a decentralized manner, enabling collective intelligence. This paper proposes a foundation for the information model needed to define driving information for smart cars. Road conditions are modeled through a unique data structure that unambiguously represents the time-variant traffic in the streets. The modeled data structure is then exemplified in a navigational scenario using UML. Optimal driving route searching under dynamically changing road conditions is also discussed using the proposed data structure.
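The kind of time-variant road-condition record such a model implies can be sketched as follows; this is an illustrative data structure with hypothetical field names, not the authors' UML model:

```python
from dataclasses import dataclass, field

@dataclass
class SegmentCondition:
    segment_id: str          # identifier of a street segment
    timestamp: float         # UNIX time of the observation
    mean_speed_kmh: float    # observed traffic speed on the segment
    vehicle_count: int       # vehicles reporting on the segment

@dataclass
class RoadConditionMap:
    """Keeps the latest observation per segment for route cost estimation."""
    latest: dict = field(default_factory=dict)

    def update(self, obs: SegmentCondition) -> None:
        cur = self.latest.get(obs.segment_id)
        if cur is None or obs.timestamp > cur.timestamp:
            self.latest[obs.segment_id] = obs    # newer observation wins

    def travel_cost(self, segment_id: str, length_km: float) -> float:
        obs = self.latest.get(segment_id)
        speed = obs.mean_speed_kmh if obs else 50.0   # fallback free-flow speed
        return length_km / max(speed, 1e-3)           # hours; feeds route search
```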

Keywords: connected-car, data modeling, route planning, navigation system

Procedia PDF Downloads 365
23997 Mesoporous Na2Ti3O7 Nanotube-Constructed Materials with Hierarchical Architecture: Synthesis and Properties

Authors: Neumoin Anton Ivanovich, Opra Denis Pavlovich

Abstract:

Materials based on titanium oxide compounds are widely used in such areas as solar energy, photocatalysis, the food industry and hygiene products, biomedical technologies, etc. Demand for them has also formed in the battery industry (an example is the commercialization of Li4Ti5O12), where much attention has recently been paid to the development of next-generation systems and technologies, such as sodium-ion batteries. This dictates the need to search for new materials with improved characteristics, as well as preparation routes that meet the requirements of scalability. One way to address these problems is the creation of nanomaterials, which often exhibit physicochemical properties that differ radically from those of their micro- or macroscopic counterparts. At the same time, it is important to control the texture (specific surface area, porosity) of such materials. In view of the above, the hydrothermal technique appears particularly suitable, as it allows a wide range of control over the synthesis conditions. In the present study, a method was developed for the preparation of mesoporous nanostructured sodium trititanate (Na2Ti3O7) with a hierarchical architecture. The materials were synthesized by hydrothermal processing and exhibit a complex, hierarchically organized two-level architecture: at the first level of the hierarchy, the materials consist of particles with a rough surface; at the second level, of one-dimensional nanotubes. The products were found to have a high specific surface area and porosity, with a narrow pore size distribution (about 6 nm). Specific surface area and porosity are important characteristics of functional materials, which largely determine the possibilities and directions of their practical application. Electrochemical impedance spectroscopy data show that the resulting sodium trititanate has sufficiently high electrical conductivity. The synthesized, complexly organized porous nanoarchitecture based on sodium trititanate may therefore find practical demand, for example, in next-generation electrochemical energy storage and conversion devices.

Keywords: sodium trititanate, hierarchical materials, mesoporosity, nanotubes, hydrothermal synthesis

Procedia PDF Downloads 95
23996 Empirical Analysis of the Effect of Cloud Movement in a Basic Off-Grid Photovoltaic System: Case Study Using Transient Response of DC-DC Converters

Authors: Asowata Osamede, Christo Pienaar, Johan Bekker

Abstract:

Mismatches in electrical energy supply, and outages from commercial providers in general, hold back development in the public and private sectors and ultimately limit the growth of industries. A well-structured photovoltaic (PV) system is therefore important for efficient and cost-effective monitoring. The major renewable energy potential on Earth is provided by solar radiation, and solar photovoltaics (PV) are considered a promising technological solution to support the global transformation to a low-carbon economy and reduce dependence on fossil fuels. Solar arrays, which consist of multiple PV modules, should be operated at the maximum power point in order to reduce the overall cost of the system, so power regulation and conditioning circuits should be incorporated into the set-up of a PV system. Power regulation circuits used in PV systems include maximum power point trackers (MPPT), DC-DC converters, and solar chargers. An inappropriate choice of power conditioning device in a basic off-grid PV system can contribute to power loss, so coupling the right power conditioning device to the system is essential. This paper presents the design and implementation of power conditioning devices to improve the overall yield from the available solar energy and the system's total efficiency. The power conditioning devices considered in the project include buck and boost DC-DC converters as well as solar chargers with MPPT. A logging interface circuit (LIC) is designed and employed in the system. The LIC is built on a printed circuit board and uses DC current sensors, specifically the LTS 6-NP. The LIC conditions the voltages in the system (the PV voltage and the power conditioning device voltage) so that they can be accommodated by the data logger. Preliminary results, which include the availability of power as well as power loss in the system and efficiency, will be presented and used to draw the final conclusions.
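For background, the ideal steady-state conversion ratios of the buck and boost stages mentioned above can be expressed in a few lines (continuous conduction mode and lossless components assumed; the duty-cycle and voltage values are illustrative):

```python
def buck_vout(v_in: float, duty: float) -> float:
    """Ideal buck converter: steps the PV voltage down, V_out = D * V_in."""
    return duty * v_in

def boost_vout(v_in: float, duty: float) -> float:
    """Ideal boost converter: steps the PV voltage up, V_out = V_in / (1 - D)."""
    return v_in / (1.0 - duty)

v_pv = 18.0   # assumed PV panel voltage near its maximum power point
print(f"buck  (D=0.67): {buck_vout(v_pv, 0.67):.1f} V  -> ~12 V battery bus")
print(f"boost (D=0.25): {boost_vout(v_pv, 0.25):.1f} V -> 24 V battery bus")
```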

Keywords: tilt and orientation angles, solar chargers, PV panels, storage devices, direct solar radiation

Procedia PDF Downloads 120
23995 Development and Adaptation of a LGBM Machine Learning Model, with a Suitable Concept Drift Detection and Adaptation Technique, for Barcelona Household Electric Load Forecasting During Covid-19 Pandemic Periods (Pre-Pandemic and Strict Lockdown)

Authors: Eric Pla Erra, Mariana Jimenez Martinez

Abstract:

While aggregated loads at a community level tend to be easier to predict, individual household load forecasting presents more challenges, with higher volatility and uncertainty. Furthermore, the drastic changes that our behavior patterns have undergone due to the COVID-19 pandemic have modified our daily electrical consumption curves and, therefore, further complicated the methods used to predict short-term electric load. Load forecasting is vital for the smooth and optimized planning and operation of our electric grids, but it also plays a crucial role for individual domestic consumers who rely on a HEMS (Home Energy Management System) to optimize their energy usage through self-generation, storage, or smart appliance management. Accurate forecasting leads to higher energy savings and overall energy efficiency of the household when paired with a proper HEMS. In order to study how COVID-19 has affected the accuracy of forecasting methods, the performance of a state-of-the-art LGBM (Light Gradient Boosting Model) is evaluated during the transition between the pre-pandemic and lockdown periods, considering day-ahead electric load forecasting. LGBM improves on standard decision tree models in both speed and memory consumption while still offering high accuracy. Even though LGBM has complex non-linear modelling capabilities, it has proven to be a competitive method under challenging forecasting scenarios such as short series, heterogeneous series, or data patterns with minimal prior knowledge. An adaptation of the LGBM model, called "resilient LGBM", is also tested, incorporating a concept drift detection technique for time series analysis, with the purpose of evaluating its capability to improve the model's accuracy during extreme events such as the COVID-19 lockdowns. The results for the LGBM and resilient LGBM are compared using the standard RMSE (Root Mean Squared Error) as the main performance metric. The models' performance is evaluated over a set of real households' hourly electricity consumption data measured before and during the COVID-19 pandemic. All households are located in the city of Barcelona, Spain, and present different consumption profiles. This study is carried out under the ComMit-20 project, financed by AGAUR (Agència de Gestió d'Ajuts Universitaris), which aims to determine the short- and long-term impacts of the COVID-19 pandemic on building energy consumption, increasing the resilience of electrical systems through the use of tools such as HEMS and artificial intelligence.
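A minimal sketch of the day-ahead LGBM baseline with the RMSE metric is given below. The lag and calendar features are a common baseline setup, stated here as an assumption rather than the authors' exact feature set, and the load series is synthetic:

```python
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.metrics import mean_squared_error

# Hypothetical hourly household load series indexed by timestamp.
idx = pd.date_range("2020-01-01", periods=24 * 200, freq="h")
load = pd.Series(np.sin(np.arange(len(idx)) * 2 * np.pi / 24) + 1.5, index=idx)

df = pd.DataFrame({"load": load})
df["hour"] = df.index.hour
df["dow"] = df.index.dayofweek
df["lag_24"] = df["load"].shift(24)     # same hour yesterday
df["lag_168"] = df["load"].shift(168)   # same hour last week
df = df.dropna()

split = int(len(df) * 0.8)              # chronological split, no shuffling
train, test = df.iloc[:split], df.iloc[split:]
features = ["hour", "dow", "lag_24", "lag_168"]

model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
model.fit(train[features], train["load"])

pred = model.predict(test[features])
rmse = np.sqrt(mean_squared_error(test["load"], pred))
print(f"day-ahead RMSE: {rmse:.3f}")
```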

Keywords: concept drift, forecasting, home energy management system (HEMS), light gradient boosting model (LGBM)

Procedia PDF Downloads 93
23994 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors contain important geographical information that can be used for remote sensing applications such as regional planning and disaster management. Spatial data classification and object recognition are important tasks for many applications, but classifying objects and identifying them manually from images is difficult. Object recognition is often treated as a classification problem, and this task can be performed using machine-learning techniques. Although many machine-learning algorithms exist, classification is done here using supervised classifiers, such as Support Vector Machines (SVM), since the area of interest is known. We propose a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications against neighboring classes for a semantic interpretation of the region of interest (ROI). A dataset has been created for training and testing purposes; we generated the attributes from pixel intensity values and mean reflectance values. We demonstrate the benefits of knowledge discovery and data-mining techniques applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.
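The neighborhood-based feature idea can be sketched as follows: each pixel is described by its own intensity plus the mean of its 3x3 neighborhood and then classified with an SVM. The image and ground truth are synthetic stand-ins:

```python
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
img = rng.random((64, 64))                 # stand-in single-band image
labels = (img > 0.5).astype(int)           # stand-in ground truth (e.g. water / not water)

neigh_mean = uniform_filter(img, size=3)   # 3x3 neighborhood mean reflectance
X = np.column_stack([img.ravel(), neigh_mean.ravel()])
y = labels.ravel()

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("pixel classification accuracy:", clf.score(X_te, y_te))
```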

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 324
23993 Automated Multisensory Data Collection System for Continuous Monitoring of Refrigerating Appliances Recycling Plants

Authors: Georgii Emelianov, Mikhail Polikarpov, Fabian Hübner, Jochen Deuse, Jochen Schiemann

Abstract:

Recycling refrigerating appliances plays a major role in protecting the Earth's atmosphere from ozone depletion and emissions of greenhouse gases. The performance of refrigerator recycling plants in terms of material retention is the subject of strict environmental certifications and is reviewed periodically through specialized audits. The continuous collection of refrigerator data required for the input-output analysis is still mostly manual, error-prone, and not digitalized. In this paper, we propose an automated data collection system for recycling plants in order to deduce the expected material contents of individual end-of-life refrigerating appliances. The system utilizes laser scanner measurements and optical data to extract attributes of individual refrigerators by applying transfer learning with pre-trained vision models and optical character recognition. Based on the recognized features, the system automatically provides material categories and target values for the contained material masses, especially foaming and cooling agents. The presented data collection system paves the way for continuous performance monitoring and efficient control of refrigerator recycling plants.
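A minimal sketch of the two recognition steps (a pre-trained vision backbone re-headed for appliance categories, plus OCR on the type plate) is given below; the class head, label count, and input image are hypothetical placeholders, and the backbone head is untrained here:

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image
import pytesseract

# Pre-trained backbone, re-headed for a small set of appliance categories.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Linear(backbone.fc.in_features, 4)   # e.g. 4 material categories
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

img = Image.open("fridge.jpg").convert("RGB")         # hypothetical input image
with torch.no_grad():
    logits = backbone(preprocess(img).unsqueeze(0))
category = int(logits.argmax(dim=1))                  # illustrative only (head untrained)

# OCR on the type plate to extract manufacturer/model attributes.
plate_text = pytesseract.image_to_string(img)
print(category, plate_text[:80])
```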

Keywords: automation, data collection, performance monitoring, recycling, refrigerators

Procedia PDF Downloads 149
23992 Sales Patterns Clustering Analysis on Seasonal Product Sales Data

Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho

Abstract:

As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.
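A minimal sketch of the clustering step, assuming weekly sales curves normalized to unit-sum profiles so that clusters reflect the shape of the season rather than its volume (the data and cluster count are synthetic assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
weeks = 26
early = np.exp(-((np.arange(weeks) - 6) ** 2) / 20)    # early-peaking season shape
late = np.exp(-((np.arange(weeks) - 18) ** 2) / 20)    # late-peaking season shape
sales = np.vstack([p * rng.uniform(50, 500) + rng.random(weeks)
                   for p in [early] * 30 + [late] * 30])   # 60 synthetic outlets

profiles = sales / sales.sum(axis=1, keepdims=True)    # volume-free sales patterns
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
print(np.bincount(km.labels_))                          # outlets per pattern cluster
```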

Keywords: clustering, distribution, sales pattern, seasonal product

Procedia PDF Downloads 581
23991 Safety Considerations of Furanics for Sustainable Applications in Advanced Biorefineries

Authors: Anitha Muralidhara, Victor Engelen, Christophe Len, Pascal Pandard, Guy Marlair

Abstract:

The production of bio-based chemicals and materials from lignocellulosic biomass is gaining tremendous importance in advanced biorefineries, which aim at the progressive replacement of petroleum-based chemicals in transportation fuels and commodity polymers. One such effort has resulted in the production of key furan derivatives (FD) such as furfural, HMF, and MMF via acid-catalyzed dehydration (ACD) of C6 and C5 sugars, which are further converted into key chemicals or intermediates (such as furandicarboxylic acid and furfuryl alcohol). In subsequent processes, many high-potential FD are produced that can be converted into high-added-value polymers or high-energy-density biofuels. During ACD, an unavoidable polyfuranic byproduct called humins is generated. The family of FD is very large, with varying chemical structures and diverse physicochemical properties; accordingly, the associated risk profiles may vary widely. Hazardous material (haz-mat) classification systems such as GHS (CLP in the EU) and the UN TDG Model Regulations for the transport of dangerous goods are a preliminary requirement for all chemicals, governing their classification, labelling, packaging, safe storage, and transportation. Considering the growing application routes of FD, it is notable how limited the safety-related information in these internationally recognized haz-mat classification systems is (safety data sheets are available only for well-known compounds such as HMF and furfural). Moreover, these classifications do not necessarily indicate the extent of risk involved when a chemical is used in a specific application. Factors such as thermal stability, speed of combustion, and chemical incompatibilities can equally influence the safety profile of a compound, yet they are clearly outside the scope of any haz-mat classification system. Irrespective of their bio-based origin, FD have so far received inconsistent assessments of their toxicity profiles. With such inconsistencies, there is a risk that the large family of FD may follow the extreme judgmental scenarios seen with ionic liquids, with some compounds ranked as extremely thermally stable, non-flammable, and so on. Unless clarified, such messages could lead to misleading judgements when ranking a chemical by its hazard rating. Safety is a key aspect of any sustainable biorefinery operation or facility, yet it is often overlooked or neglected. To fill these data gaps and address ambiguities and discrepancies, the current study provides preliminary insights into the safety assessment of FD and their potential targeted by-products. Drawing on the available literature and experimental results, the physicochemical, environmental, and (scenario-based) fire safety profiles of key FD, as well as of side streams such as humins and levulinic acid, are considered. The study thus aims to define patterns and trends that give coherent safety-related information for existing and newly synthesized FD on the market, supporting better functionality and sustainable applications.

Keywords: furanics, humins, safety, thermal and fire hazard, toxicity

Procedia PDF Downloads 155
23990 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling, despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users, who then identified "friend controls", and the other using a random sample of non-drug users (controls), who then identified "friend cases". Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using a bootstrapping method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. The area under the ROC curve for the model derived from the random-sample data set was similar when fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.
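The two comparisons described can be sketched as follows; ordinary logistic regression stands in for the conditional logistic model used in the matched-pair analysis, and the data are synthetic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.utils import resample

X, y = make_classification(n_samples=300, n_features=5, random_state=0)  # stand-in survey data

coefs = []
for i in range(100):                      # 100 bootstrap samples, as in the study
    Xb, yb = resample(X, y, random_state=i)
    coefs.append(LogisticRegression(max_iter=1000).fit(Xb, yb).coef_[0])
se = np.std(coefs, axis=0)                # bootstrap standard errors of the betas
print("bootstrap SEs:", np.round(se, 3))

model = LogisticRegression(max_iter=1000).fit(X, y)
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])   # area under the ROC curve
print(f"AUC: {auc:.2f}")
```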

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 482
23989 Bioinformatics High Performance Computation and Big Data

Authors: Javed Mohammed

Abstract:

Right now, biomedical infrastructure lags well behind the curve. Our healthcare system is dispersed and disjointed, medical records are a bit of a mess, and we do not yet have the capacity to store and process the huge amounts of data coming our way from widespread whole-genome sequencing. And then there are privacy issues. Despite these infrastructure challenges, some researchers are plunging into biomedical Big Data now, in hopes of extracting new and actionable knowledge: delving into molecular-level data to discover biomarkers that help classify patients based on their response to existing treatments, and pushing their results out to physicians in novel and creative ways. Computer scientists and biomedical researchers are able to transform data into models and simulations that will enable scientists, for the first time, to gain a profound understanding of the deepest biological functions. Solving biological problems may require high-performance computing (HPC), due either to the massive parallel computation required to solve a particular problem or to algorithmic complexity ranging from difficult to intractable. Many problems involve seemingly well-behaved polynomial-time algorithms (such as all-to-all comparisons) but have massive computational requirements due to the large data sets that must be analyzed. High-throughput techniques for DNA sequencing and gene expression analysis have led to exponential growth in the amount of publicly available genomic data. With this increased availability of genomic data, traditional database approaches are no longer sufficient for rapidly performing life science queries involving the fusion of data types. Computing systems are now so powerful that researchers can consider modeling the folding of a protein or even simulating an entire human body. This paper emphasizes computational biology's growing need for high-performance computing and Big Data, and its importance in meeting the scientific and engineering challenges of the twenty-first century. It shows how protein folding (the structure and function of proteins) and phylogeny reconstruction (the evolutionary history of a group of genes) can use HPC to evaluate or solve limited but meaningful problem instances. The paper also discusses solutions to optimization problems, the benefits for Big Data and computational biology, and the current state of the art and future generation of HPC computing with Big Data.

Keywords: high performance, big data, parallel computation, molecular data, computational biology

Procedia PDF Downloads 354
23988 Numerical Analysis and Design of Dielectric to Plasmonic Waveguides Couplers

Authors: Emanuela Paranhos Lima, Vitaly Félix Rodríguez Esquerre

Abstract:

In this work, an efficient directional coupler composed of dielectric waveguides and a metallic film has been analyzed in detail through simulations using the finite element method (FEM). The structure consists of a step-index fiber with a dielectric core and silica cladding, and a metal nanowire parallel to the core. The results show that an efficient conversion of optical dielectric modes to long-range plasmonic modes is possible. Low insertion losses, in conjunction with a short coupling length and broadband operation, can be achieved under certain conditions. This kind of coupler has potential applications in the design of photonic integrated circuits for signal routing between dielectric and plasmonic waveguides, sensing, lithography, and optical storage systems. Highly efficient focusing of light into a very small region can also be obtained.
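For reference, the coupling length governing complete power transfer in such a directional coupler can be written in terms of the propagation constants (or effective indices) of the symmetric and antisymmetric supermodes; this is the standard coupled-mode result, not a formula quoted from the paper:

```latex
L_c \;=\; \frac{\pi}{\beta_s - \beta_a} \;=\; \frac{\lambda}{2\left(n_{\mathrm{eff},s} - n_{\mathrm{eff},a}\right)}
```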

Keywords: directional coupler, finite element method, metallic nanowire, plasmonic, surface plasmon polariton, superfocusing

Procedia PDF Downloads 259
23987 Evaluating the Effectiveness of Science Teacher Training Programme in National Colleges of Education: A Preliminary Study, Perceptions of Prospective Teachers

Authors: A. S. V Polgampala, F. Huang

Abstract:

This is an overview of what is entailed in an evaluation and of the issues to be aware of when class observation is carried out. The study examined the evaluation of teaching practice during a 7-day 'block teaching' session in a pre-service science teacher training program at a reputed National College of Education in Sri Lanka. Effects were assessed in three areas: evaluation of the training process, evaluation of the training impact, and evaluation of the training procedure. Data for this study were collected by class observation of 18 trainee teachers from 9 to 16 February 2017. The prospective science teachers who participated in the study were evaluated based on a format newly introduced by the NIE. The data collected were analyzed qualitatively using the Miles and Huberman procedure for analyzing qualitative data: data reduction, data display, and conclusion drawing/verification. It was observed that the trainees showed confidence in teaching the required competencies and skills. Teacher educators' dissatisfaction, however, had a considerable impact on the evaluation process.

Keywords: evaluation, perceptions & perspectives, pre-service, science teaching

Procedia PDF Downloads 300
23986 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm

Authors: Sukhleen Kaur

Abstract:

In security groundwork, the Intrusion Detection System (IDS) has become an important component, and it has received increasing attention in recent years. An IDS is one of the effective ways to detect different kinds of attacks and malicious code in a network and helps us to secure the network. Data mining techniques can be applied to IDS to analyse large amounts of data and give better results; data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. Studies so far have concentrated on finding attacks, whereas this paper detects malicious files. Some intruders do not attack directly, but hide harmful code inside files, or corrupt those files and attack the system. These files are examined against a set of defined parameters, producing two lists of files: normal files and harmful files. After that, data mining is performed. In this paper, a hybrid classifier is used, combining the Naive Bayes and RIPPER classification methods. The results show how an uploaded file in the database is tested against the parameters and then characterised as either a normal or a harmful file, after which the mining is performed. Moreover, when a user tries to mine a harmful file, an exception is generated stating that mining cannot be performed on corrupted or harmful files.
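A minimal sketch of the two-stage hybrid idea follows. GaussianNB stands in for Naive Bayes, while a shallow decision tree is a stand-in for the RIPPER rule learner (RIPPER itself is not available in scikit-learn); the per-file features, labels, and combination rule are hypothetical:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Hypothetical per-file features: size deviation, entropy, header anomalies.
X = rng.random((500, 3))
y = (X[:, 1] + 0.5 * X[:, 2] > 0.9).astype(int)   # 1 = harmful, 0 = normal

nb = GaussianNB().fit(X, y)
rules = DecisionTreeClassifier(max_depth=3).fit(X, y)

def classify(x):
    """Conservative combination: flag harmful if either stage flags the file."""
    votes = nb.predict([x])[0] + rules.predict([x])[0]
    return "harmful" if votes >= 1 else "normal"

def mine(x):
    if classify(x) == "harmful":
        raise ValueError("mining cannot be performed on corrupted or harmful files")
    return "mining OK"

try:
    print(mine(X[0]))
except ValueError as e:
    print(e)          # harmful files raise instead of being mined
```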

Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper

Procedia PDF Downloads 403
23985 Generalized Approach to Linear Data Transformation

Authors: Abhijith Asok

Abstract:

This paper presents a generalized approach to the simple linear data transformation, Y=bX, through an integration of multidimensional coordinate geometry, vector space theory, and polygonal geometry. The scaling is performed by adding an additional 'dummy dimension' to the n-dimensional data, which makes it possible to plot two-dimensional component-wise straight lines on pairs of dimensions. The end result is a set of scaled extensions of the observations in any of the 2^n spatial divisions, where n is the total number of applicable dimensions (dataset variables), created by shifting the n-dimensional plane along the dummy axis. The derived scaling factor was found to depend on the coordinates of the common point of origin of the diverging straight lines and on the plane of extension, chosen on and perpendicular to the dummy axis, respectively. This result gives a geometrical interpretation of a linear data transformation and hence opens opportunities for a more informed choice of the factor b, based on a better choice of these coordinate values. The paper goes on to identify the effect of this transformation on certain popular distance metrics; for many of them, the distance metric retains the same scaling factor as the features.

Keywords: data transformation, dummy dimension, linear transformation, scaling

Procedia PDF Downloads 287
23984 Blockchain Platform Configuration for MyData Operator in Digital and Connected Health

Authors: Minna Pikkarainen, Yueqiang Xu

Abstract:

The integration of digital technology with existing healthcare processes has been painfully slow; a huge gap exists between the strictly regulated field of official medical care and the fast-moving field of health and wellness technology. We claim that the promises of preventive healthcare can only be fulfilled when this gap is closed and health care and self-care become a seamless continuum: "correct information, in the correct hands, at the correct time, allowing individuals and professionals to make better decisions", which is what we call the connected health approach. Currently, issues related to security, privacy, consumer consent, and data sharing are hindering the implementation of this new healthcare paradigm. This could be solved by following the MyData principles, which state that individuals should have the right and practical means to manage their data and privacy. A MyData infrastructure enables decentralized management of personal data, improves interoperability, makes it easier for companies to comply with tightening data protection regulations, and allows individuals to change service providers without proprietary data lock-ins. This paper tackles today's unprecedented challenges of enabling and stimulating multiple healthcare data providers and stakeholders to participate more actively in the digital health ecosystem. First, the paper systematically proposes the MyData approach for the healthcare and preventive health data ecosystem. The work targets health and wellness ecosystems, each consisting of key actors: 1) the individual (citizen or professional controlling/using the services), i.e., the data subject; 2) services providing personal data (e.g., startups providing data collection apps or devices); 3) health and wellness services utilizing the aforementioned data; and 4) services authorizing access to this data under the individual's explicit consent. Second, the research extends the existing four archetypes of orchestrator-driven healthcare data business models and proposes a fifth type, the MyData Blockchain Platform. This new architecture is developed through the Action Design Research approach, a prominent research methodology in the information systems domain. The key novelty of the paper is to expand the health data value chain architecture and design from centralization and pseudo-decentralization to full decentralization enabled by blockchain, thus the MyData blockchain platform. The study not only broadens the healthcare informatics literature but also contributes to the theoretical development of the digital healthcare and blockchain research domains with a systemic approach.

Keywords: blockchain, health data, platform, action design

Procedia PDF Downloads 89
23983 Possibilities of Postmortem CT for Detection of Gas Accumulations in the Vessels of Dead Newborns with Congenital Sepsis

Authors: Uliana N. Tumanova, Viacheslav M. Lyapin, Vladimir G. Bychenko, Alexandr I. Shchegolev, Gennady T. Sukhikh

Abstract:

It is well known that gas formed as a result of the postmortem decomposition of tissues can be detected as early as 24-48 hours after death. The conditions of keeping and storing the body (ambient temperature and humidity) significantly determine the rate of onset and development of postmortem changes, and the presence of sepsis is accompanied by faster postmortem decomposition and decay of the organs and tissues. The presence of gas in vessels and cavities can be fully revealed by postmortem CT. Radiologists should report any intraorganic or intravascular gas detected at postmortem CT to forensic experts or pathologists before the autopsy; such gas cannot be detected during the autopsy itself, yet it can be very important for establishing a diagnosis. The aim was to explore the potential of postmortem CT for the evaluation of gas accumulations in the vessels of newborns who died from congenital sepsis. The bodies of 44 newborns (25 male and 19 female, aged from 6 hours to 27 days) were examined 6-12 hours after death. The bodies were stored in a refrigerator at +4°C in the supine position. The study group comprised 12 bodies of newborns that died from congenital sepsis; the control group consisted of 32 bodies of newborns that died without signs of sepsis. The postmortem CT examination was performed on a GEMINI TF TOF16 scanner before the autopsy, and the localization of gas accumulations in the vessels was determined on the CT tomograms. The diagnosis of sepsis was made on the basis of clinical and laboratory data and autopsy results. Gas in the vessels was detected in 33.3% of cases in the sepsis group and in 34.4% in the control group. In the sepsis group, gas was most often localized in the vessels of the heart and liver (50% each of the observations with detected intravascular gas), followed by the heart cavities, aorta, and mesenteric vessels (25% each). In the control group, gas was most often detected in the vessels of the liver (63.6%) and abdominal cavity (54.5%); gas was localized in the cavities of the heart in 45.5% of cases and in its vessels in 36.4%, while gas was detected in the cerebral vessels and in the aorta in 27.3% and 9.1% of cases, respectively. Postmortem CT has high diagnostic capability for detecting free gas in vessels, and postmortem changes in newborns that died from sepsis do not affect intravascular gas production within 6-12 hours of death. Radiological methods should be used as a supplement to the autopsy, serving as a kind of 'guide' that points the forensic medical expert to changes identified during the CT study and thereby helps to define pathological processes during the autopsy. Postmortem CT can be recommended as a first stage of the autopsy.

Keywords: congenital sepsis, gas, newborn, postmortem CT

Procedia PDF Downloads 134
23982 Using Learning Apps in the Classroom

Authors: Janet C. Read

Abstract:

UCLan set up a collaboration with Lingokids to assess the Lingokids learning app's impact on learning outcomes in UK classrooms for children aged 3 to 5 years. Data gathered during the controlled study with 69 children include attitudinal data, engagement, and learning scores. The data show that children's enjoyment while learning was higher among those using the game-based app than among those using traditional methods. It is worth pointing out that, among older children, engagement when using the learning app was significantly higher than with traditional methods. According to the existing literature, there is a direct correlation between engagement, motivation, and learning. Therefore, this study provides relevant data points to conclude that the Lingokids learning app serves its purpose of encouraging learning through playful and interactive content. That being said, we believe that learning outcomes should be assessed with a wider range of methods in further studies. Likewise, it would be beneficial to assess the usability and playability of the app in order to evaluate the learning app from other angles.

Keywords: learning app, learning outcomes, rapid test activity, Smileyometer, early childhood education, innovative pedagogy

Procedia PDF Downloads 58
23981 Road Safety in the Great Britain: An Exploratory Data Analysis

Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari

Abstract:

Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse Great Britain's road safety situation and show response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. We carry out an exploratory data analysis using STATS19 data. For the past 30 years, the UK has had a good record in reducing fatalities, ranking third based on the number of road deaths per million inhabitants. Around 165,000 accidents were reported in Great Britain in 2009, and the figure decreased every year, falling to under 120,000 by 2019. The government continues to work to reduce road deaths, empowering responsible road users and identifying and addressing the factors that make roads less safe.
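An exploratory pass over a STATS19-style accident table might look like the sketch below; the column names follow the published STATS19 field naming but should be verified against the actual download, and the CSV path is a placeholder:

```python
import pandas as pd

df = pd.read_csv("dft-road-casualty-statistics-accident.csv")  # placeholder path

df["date"] = pd.to_datetime(df["date"], dayfirst=True)
per_year = df.groupby(df["date"].dt.year).size()
print(per_year)          # yearly totals: ~165k (2009) falling to <120k (2019)

# Severity breakdown per year (in STATS19: 1 = fatal, 2 = serious, 3 = slight).
severity = df.groupby([df["date"].dt.year, "accident_severity"]).size().unstack()
print(severity.tail())
```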

Keywords: road safety, data analysis, OpenStreetMap, feature expansion

Procedia PDF Downloads 120
23980 Intrusion Detection System Using Linear Discriminant Analysis

Authors: Zyad Elkhadir, Khalid Chougdali, Mohammed Benattou

Abstract:

Most existing intrusion detection systems work on quantitative network traffic data with many irrelevant and redundant features, which makes the detection process more time-consuming and less accurate. Several feature extraction methods, such as linear discriminant analysis (LDA), have been proposed. However, LDA suffers from the small sample size (SSS) problem, which occurs when the number of training samples is small compared with the dimension of the samples; hence, classical LDA cannot be applied directly to high-dimensional data such as network traffic data. In this paper, we propose two solutions to the SSS problem for LDA and apply them to a network IDS. The first method reduces the dimension of the original data using principal component analysis (PCA) and then applies LDA. In the second solution, we propose using the pseudo-inverse to avoid the singularity of the within-class scatter matrix caused by the SSS problem. After that, the KNN algorithm is used for the classification process. We chose two well-known datasets, KDDcup99 and NSL-KDD, for testing the proposed approaches. Results showed that the classification accuracy of the PCA+LDA method clearly outperforms the pseudo-inverse LDA method when a large amount of training data is available.
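A minimal sketch of the first solution (PCA before LDA so that the within-class scatter matrix is non-singular, followed by KNN) is shown below; the synthetic 41-feature data stand in for KDDcup99/NSL-KDD feature vectors:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=2000, n_features=41, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

pipe = make_pipeline(
    PCA(n_components=20),                 # reduce the dimension first, so the
    LinearDiscriminantAnalysis(),         # within-class scatter is non-singular
    KNeighborsClassifier(n_neighbors=5),
)
pipe.fit(X_tr, y_tr)
print("PCA+LDA+KNN accuracy:", pipe.score(X_te, y_te))
```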

Keywords: LDA, pseudo-inverse, PCA, IDS, NSL-KDD, KDDcup99

Procedia PDF Downloads 216
23979 Interoperable Design Coordination Method for Sharing Communication Information Using Building Information Model Collaboration Format

Authors: Jin Gang Lee, Hyun-Soo Lee, Moonseo Park

Abstract:

The utilization of BIM and IFC allows project participants to collaborate across different areas by consistently sharing interoperable product information represented in a model. Comments and markups generated during the coordination process can be categorized as communication information, which is shared in a less standardized manner and can be difficult to manage and reuse compared to the product information in a model. The present study proposes an interoperable coordination method using BCF (the BIM Collaboration Format) for managing and sharing communication information during the BIM-based coordination process. A management function for coordination in a BIM collaboration system is developed to assess its ability to share communication information in BIM collaboration projects. This approach systematically links communication information generated during the coordination process to the building model and serves as a storage system for retrieving knowledge created during BIM collaboration projects.

Keywords: design coordination, building information model, BIM collaboration format, industry foundation classes

Procedia PDF Downloads 409
23978 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — in the Case of Critical Dataset Size —

Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno

Abstract:

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induce if-then rules from a decision table, which is considered a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments in which rules were specified in advance, and by comparison with conventional methods. However, scope for further development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducing true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity for rule induction from datasets whose attribute values are contaminated by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical dataset size derived in the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data.
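The statistical-test idea at the core of STRIM can be illustrated with a generic one-proportion z-test: a candidate if-then rule is accepted only if the class distribution of the instances it covers deviates significantly from the marginal class distribution. This is an illustration of the principle, not the authors' exact statistic:

```python
from math import sqrt
from scipy.stats import norm

def rule_significant(n_covered, n_hits, p_class, alpha=0.01):
    """n_covered: instances matching the condition part of the rule;
    n_hits: those that also carry the rule's decision class;
    p_class: marginal frequency of that class in the whole decision table."""
    if n_covered == 0:
        return False
    p_hat = n_hits / n_covered
    z = (p_hat - p_class) / sqrt(p_class * (1 - p_class) / n_covered)
    return norm.sf(z) < alpha     # one-sided: the covered set is enriched

print(rule_significant(n_covered=120, n_hits=95, p_class=1/6))   # True: strong rule
print(rule_significant(n_covered=12, n_hits=4, p_class=1/6))     # False: too little data
```

The second call shows why a critical dataset size exists: the same enrichment that passes with 120 covered instances fails with 12, because the test's standard error shrinks only with the number of covered instances.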

Keywords: rule induction, decision table, missing data, noise

Procedia PDF Downloads 383
23977 Design and Integration of an Energy Harvesting Vibration Absorber for Rotating System

Authors: F. Infante, W. Kaal, S. Perfetto, S. Herold

Abstract:

In the last decade, the demand for wireless sensors and low-power electric devices for condition monitoring of mechanical structures has increased strongly. Networks of wireless sensors can potentially be applied in a huge variety of applications. Due to the reduction in both the size and power consumption of the electronic components and the increasing complexity of mechanical systems, interest in creating dense sensor-node networks has grown considerably. Nevertheless, with the development of large sensor networks with numerous nodes, the critical problem of powering them is drawing more and more attention. Batteries are not a viable alternative, considering their lifetime, size, and the effort involved in replacing them. Among possible alternative durable power sources usable in mechanical components, vibrations represent a suitable source for the amount of power required to feed a wireless sensor network. For this purpose, energy harvesting from structural vibrations has received much attention in the past few years. Suitable vibrations can be found in numerous mechanical environments, including moving automotive structures and household applications, but also civil engineering structures such as buildings and bridges. Similarly, the dynamic vibration absorber (DVA) is one of the most widely used devices to mitigate unwanted vibration of structures. This device transfers the primary structural vibration to an auxiliary system, so that the related energy is effectively localized in the secondary, less sensitive structure. The additional benefit of harvesting part of that energy can then be obtained by implementing dedicated components. This paper describes the design process of an energy harvesting tuned vibration absorber (EHTVA) for rotating systems using piezoelectric elements, in which the energy of the vibration is converted into electricity rather than dissipated. The proposed device is designed to mitigate torsional vibrations, as a conventional rotational TVA would, while harvesting energy as a power source for immediate use or storage. The resulting rotational multi-degree-of-freedom (MDOF) system is first reduced to an equivalent single-degree-of-freedom (SDOF) system. Den Hartog's theory is used to evaluate the optimal mechanical parameters of the initial DVA for the SDOF system thus defined. The performance of the TVA is assessed operationally, and the vibration reduction at the original resonance frequency is measured. The design is then modified for the integration of active piezoelectric patches without detuning the TVA. In order to estimate the real power generated, a complex storage circuit is implemented: a DC-DC step-down converter is connected to the device through a rectifier to deliver a fixed output voltage, and, by introducing a large capacitor, the stored energy is measured at different frequencies. Finally, the electromechanical prototype is tested and validated, achieving the reduction and harvesting functions simultaneously.
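For reference, Den Hartog's classical optimal tuning for an absorber on an undamped primary system can be computed directly; the mass ratio below is an illustrative assumption:

```python
from math import sqrt

def den_hartog(mu):
    """mu: absorber-to-primary mass ratio.
    Returns (optimal frequency ratio, optimal absorber damping ratio)."""
    f_opt = 1.0 / (1.0 + mu)                        # absorber/primary natural frequency
    zeta_opt = sqrt(3.0 * mu / (8.0 * (1.0 + mu) ** 3))
    return f_opt, zeta_opt

mu = 0.05                                           # assumed 5 % mass ratio
f_opt, zeta_opt = den_hartog(mu)
print(f"f_opt = {f_opt:.4f}, zeta_opt = {zeta_opt:.4f}")
# e.g. mu = 0.05 -> f_opt ~ 0.9524, zeta_opt ~ 0.1273
```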

Keywords: energy harvesting, piezoelectricity, torsional vibration, vibration absorber

Procedia PDF Downloads 133
23976 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.
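The entity-detection step, phrased here as an NLP problem, might look like the sketch below, using the public spaCy small English model; the sample sentence is invented:

```python
import spacy

nlp = spacy.load("en_core_web_sm")   # requires: python -m spacy download en_core_web_sm
text = ("Acme Holdings plc reported revenue of $4.2 billion for fiscal 2021, "
        "up 7% year over year, according to its annual report.")

doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)      # e.g. ORG, MONEY, PERCENT, DATE
```

Entities extracted this way would then feed the disambiguation and schema-based record extraction stages listed above.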

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 96
23975 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform

Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu

Abstract:

Given a fixed fund, purchasing fewer hosts of higher capability or, inversely, more hosts of lower capability is an unavoidable trade-off in building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing consists of SQL queries with aggregates, joins, and space-time condition selections executed over massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of candidate host clusters for the intended big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem, HDFS+Hive+Spark. A suitable metric was defined to measure the performance of Hadoop clusters in HBDP; it was tested and compared with its predicted counterpart on three kinds of typical SQL query tasks. Tests were conducted with respect to the factors of CPU benchmark, memory size, virtual host division, and the number of physical hosts in the cluster. The research has been applied to practical cluster procurement for housing big data computing.
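The regression-shaping step can be sketched as follows; all configuration and performance numbers are fabricated placeholders that only illustrate the fitting procedure, not HBDP measurements:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# columns: CPU benchmark score, memory (GB), number of physical hosts
configs = np.array([
    [1200, 64, 4], [1200, 128, 4], [1800, 64, 6],
    [1800, 128, 6], [2400, 128, 8], [2400, 256, 8],
])
perf = np.array([0.42, 0.55, 0.71, 0.83, 1.08, 1.21])  # placeholder query metric

model = LinearRegression().fit(configs, perf)           # shape the empirical formula
print("coefficients:", model.coef_, "intercept:", model.intercept_)

candidate = np.array([[1800, 256, 6]])                  # a configuration to evaluate
print("predicted performance:", model.predict(candidate)[0])
```

With the fitted formula, candidate configurations within the fixed fund can be ranked by predicted performance to suggest an optimal purchase.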

Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance predicting formula, typical SQL query tasks

Procedia PDF Downloads 217
23974 Model Predictive Controller for Pasteurization Process

Authors: Tesfaye Alamirew Dessie

Abstract:

Our study focuses on developing a Model Predictive Controller (MPC) and evaluating it against a traditional PID controller for a pasteurization process. Using system identification on the experimental data, the dynamics of the pasteurization process were identified. The quality of several model structures was evaluated using best fit against validation data, residual analysis, and stability analysis. The auto-regressive with exogenous input model ARX322 fit the validation data of the pasteurization process to roughly 80.37 percent. The ARX322 model structure was then used to design both the MPC and PID control strategies. Comparing controller performance in terms of settling time, overshoot percentage, and stability, it was found that the MPC controller outperforms the PID controller on these measures.
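For reference, one common reading of the "ARX322" label is an ARX structure with orders (na, nb, nk) = (3, 2, 2); stated here as an assumption, its difference-equation form is:

```latex
y(t) + a_1\,y(t-1) + a_2\,y(t-2) + a_3\,y(t-3) \;=\; b_1\,u(t-2) + b_2\,u(t-3) + e(t)
```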

Keywords: MPC, PID, ARX, pasteurization

Procedia PDF Downloads 145
23973 Copper Coil Heat Exchanger Performance for Greenhouse Heating: An Experimental and Theoretical Study

Authors: Maha Bakkari, R.Tadili

Abstract:

The present work studies the performance of a solar copper coil heating system in a greenhouse microclimate. Our system is based on the circulation of a heat transfer fluid, water in our case, in a closed loop under the greenhouse roof in order to store heat throughout the day; this heat then supplies the greenhouse during the night. In order to evaluate the system, we carried out an experimental study in two identical greenhouses, the first equipped with the heating system and the second (without heating) used as a control. The thermal balance established for the heating system determines the mass of water necessary for the process to ensure its operation during the night. The results obtained showed that the solar heating system improved the climatic parameters inside the experimental greenhouse, presenting a significant gain compared to the control greenhouse without a heating system. This research contributes one of the solutions that help mitigate the Earth's greenhouse effect, a problem of worldwide concern.
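The water-mass sizing implied by the thermal balance follows from Q = m * c_p * dT; a short sketch with illustrative values for the night heat demand and usable temperature drop (both assumptions, not measurements from the paper):

```python
C_P_WATER = 4186.0            # J/(kg.K), specific heat of water

def water_mass(q_night_joules: float, delta_t_kelvin: float) -> float:
    """Mass of water (kg) that releases q_night_joules when cooling by delta_t."""
    return q_night_joules / (C_P_WATER * delta_t_kelvin)

q_night = 15e6                # assumed night heat demand of the greenhouse, J
dt_usable = 12.0              # assumed usable storage temperature drop, K
m = water_mass(q_night, dt_usable)
print(f"required water mass: {m:.0f} kg (~{m:.0f} litres)")
```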

Keywords: solar energy, energy storage, greenhouse, environment

Procedia PDF Downloads 62
23972 Point Estimation for the Type II Generalized Logistic Distribution Based on Progressively Censored Data

Authors: Rana Rimawi, Ayman Baklizi

Abstract:

Skewed distributions are important models that are frequently used in applications. Generalized distributions form a class of skewed distributions and have gained widespread use in applications because of their flexibility in data analysis. More specifically, the generalized logistic distribution, with its different types, has received considerable attention recently. In this study, we consider point estimation for the Type II generalized logistic distribution (Type II GLD) based on progressively type II censored data. We develop several estimators for its unknown parameters, including maximum likelihood estimators (MLE), Bayes estimators, and best linear unbiased estimators (BLUE). The estimators are compared by simulation using the criteria of bias and mean squared error (MSE). An illustrative example based on a real data set is given.
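For reference, the Type II GLD density with shape parameter α is commonly written as below; this is quoted from the standard classification of generalized logistic types, not from the paper, and should be checked against the authors' parameterization:

```latex
f(x;\alpha) \;=\; \frac{\alpha\, e^{-\alpha x}}{\bigl(1 + e^{-x}\bigr)^{\alpha+1}}, \qquad -\infty < x < \infty,\ \alpha > 0
```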

Keywords: point estimation, type II generalized logistic distribution, progressive censoring, maximum likelihood estimation

Procedia PDF Downloads 187