Search results for: data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24733

Search results for: data sets

24463 Constructing the Joint Mean-Variance Regions for Univariate and Bivariate Normal Distributions: Approach Based on the Measure of Cumulative Distribution Functions

Authors: Valerii Dashuk

Abstract:

The usage of the confidence intervals in economics and econometrics is widespread. To be able to investigate a random variable more thoroughly, joint tests are applied. One of such examples is joint mean-variance test. A new approach for testing such hypotheses and constructing confidence sets is introduced. Exploring both the value of the random variable and its deviation with the help of this technique allows checking simultaneously the shift and the probability of that shift (i.e., portfolio risks). Another application is based on the normal distribution, which is fully defined by mean and variance, therefore could be tested using the introduced approach. This method is based on the difference of probability density functions. The starting point is two sets of normal distribution parameters that should be compared (whether they may be considered as identical with given significance level). Then the absolute difference in probabilities at each 'point' of the domain of these distributions is calculated. This measure is transformed to a function of cumulative distribution functions and compared to the critical values. Critical values table was designed from the simulations. The approach was compared with the other techniques for the univariate case. It differs qualitatively and quantitatively in easiness of implementation, computation speed, accuracy of the critical region (theoretical vs. real significance level). Stable results when working with outliers and non-normal distributions, as well as scaling possibilities, are also strong sides of the method. The main advantage of this approach is the possibility to extend it to infinite-dimension case, which was not possible in the most of the previous works. At the moment expansion to 2-dimensional state is done and it allows to test jointly up to 5 parameters. Therefore the derived technique is equivalent to classic tests in standard situations but gives more efficient alternatives in nonstandard problems and on big amounts of data.

Keywords: confidence set, cumulative distribution function, hypotheses testing, normal distribution, probability density function

Procedia PDF Downloads 146
24462 Comparative Study of Estimators of Population Means in Two Phase Sampling in the Presence of Non-Response

Authors: Syed Ali Taqi, Muhammad Ismail

Abstract:

A comparative study of estimators of population means in two phase sampling in the presence of non-response when Unknown population means of the auxiliary variable(s) and incomplete information of study variable y as well as of auxiliary variable(s) is made. Three real data sets of University students, hospital and unemployment are used for comparison of all the available techniques in two phase sampling in the presence of non-response with the newly generalized ratio estimators.

Keywords: two-phase sampling, ratio estimator, product estimator, generalized estimators

Procedia PDF Downloads 203
24461 Design and Development of Fleet Management System for Multi-Agent Autonomous Surface Vessel

Authors: Zulkifli Zainal Abidin, Ahmad Shahril Mohd Ghani

Abstract:

Agent-based systems technology has been addressed as a new paradigm for conceptualizing, designing, and implementing software systems. Agents are sophisticated systems that act autonomously across open and distributed environments in solving problems. Nevertheless, it is impractical to rely on a single agent to do all computing processes in solving complex problems. An increasing number of applications lately require multiple agents to work together. A multi-agent system (MAS) is a loosely coupled network of agents that interact to solve problems that are beyond the individual capacities or knowledge of each problem solver. However, the network of MAS still requires a main system to govern or oversees the operation of the agents in order to achieve a unified goal. We had developed a fleet management system (FMS) in order to manage the fleet of agents, plan route for the agents, perform real-time data processing and analysis, and issue sets of general and specific instructions to the agents. This FMS should be able to perform real-time data processing, communicate with the autonomous surface vehicle (ASV) agents and generate bathymetric map according to the data received from each ASV unit. The first algorithm is developed to communicate with the ASV via radio communication using standard National Marine Electronics Association (NMEA) protocol sentences. Next, the second algorithm will take care of the path planning, formation and pattern generation is tested using various sample data. Lastly, the bathymetry map generation algorithm will make use of data collected by the agents to create bathymetry map in real-time. The outcome of this research is expected can be applied on various other multi-agent systems.

Keywords: autonomous surface vehicle, fleet management system, multi agent system, bathymetry

Procedia PDF Downloads 231
24460 The Administration of Infection Diseases During the Pandemic COVID-19 and the Role of the Differential Diagnosis with Biomarkers VB10

Authors: Sofia Papadimitriou

Abstract:

INTRODUCTION: The differential diagnosis between acute viral and bacterial infections is an important cost-effectiveness parameter at the stage of the treatment process in order to achieve the maximum benefits in therapeutic intervention by combining the minimum cost to ensure the proper use of antibiotics.The discovery of sensitive and robust molecular diagnostic tests in response to the role of the host in infections has enhanced the accurate diagnosis and differentiation of infections. METHOD: The study used a sample of six independent blood samples (total=756) which are associated with human proteins-proteins, each of which at the transcription stage expresses a different response in the host network between viral and bacterial infections.Τhe individual blood samples are subjected to a sequence of computer filters that identify a gene panel corresponding to an autonomous diagnostic score. The data set and the correspondence of the gene panel to the diagnostic patents a new Bangalore -Viral Bacterial (BL-VB). FINDING: We use a biomarker based on the blood of 10 genes(Panel-VB) that are an important prognostic value for the detection of viruses from bacterial infections with a weighted average AUROC of 0.97(95% CL:0.96-0.99) in eleven independent samples (sets n=898). We discovered a base with a patient score (VB 10 ) according to the table, which is a significant diagnostic value with a weighted average of AUROC 0.94(95% CL: 0.91-0.98) in 2996 patient samples from 56 public sets of data from 19 different countries. We also studied VB 10 in a new cohort of South India (BL-VB,n=56) and found 97% accuracy in confirmed cases of viral and bacterial infections. We found that VB 10 (a)accurately identifies the type of infection even in unspecified cases negative to the culture (b) shows its clinical condition recovery and (c) applies to all age groups, covering a wide range of acute bacterial and viral infectious, including non-specific pathogens. We applied our VB 10 rating to publicly available COVID 19 data and found that our rating diagnosed viral infection in patient samples. RESULTS: Τhe results of the study showed the diagnostic power of the biomarker VB 10 as a diagnostic test for the accurate diagnosis of acute infections in recovery conditions. We look forward to helping you make clinical decisions about prescribing antibiotics and integrating them into your policies management of antibiotic stewardship efforts. CONCLUSIONS: Overall, we are developing a new property of the RNA-based biomarker and a new blood test to differentiate between viral and bacterial infections to assist a physician in designing the optimal treatment regimen to contribute to the proper use of antibiotics and reduce the burden on antimicrobial resistance, AMR.

Keywords: acute infections, antimicrobial resistance, biomarker, blood transcriptome, systems biology, classifier diagnostic score

Procedia PDF Downloads 128
24459 Grid and Market Integration of Large Scale Wind Farms using Advanced Predictive Data Mining Techniques

Authors: Umit Cali

Abstract:

The integration of intermittent energy sources like wind farms into the electricity grid has become an important challenge for the utilization and control of electric power systems, because of the fluctuating behaviour of wind power generation. Wind power predictions improve the economic and technical integration of large amounts of wind energy into the existing electricity grid. Trading, balancing, grid operation, controllability and safety issues increase the importance of predicting power output from wind power operators. Therefore, wind power forecasting systems have to be integrated into the monitoring and control systems of the transmission system operator (TSO) and wind farm operators/traders. The wind forecasts are relatively precise for the time period of only a few hours, and, therefore, relevant with regard to Spot and Intraday markets. In this work predictive data mining techniques are applied to identify a statistical and neural network model or set of models that can be used to predict wind power output of large onshore and offshore wind farms. These advanced data analytic methods helps us to amalgamate the information in very large meteorological, oceanographic and SCADA data sets into useful information and manageable systems. Accurate wind power forecasts are beneficial for wind plant operators, utility operators, and utility customers. An accurate forecast allows grid operators to schedule economically efficient generation to meet the demand of electrical customers. This study is also dedicated to an in-depth consideration of issues such as the comparison of day ahead and the short-term wind power forecasting results, determination of the accuracy of the wind power prediction and the evaluation of the energy economic and technical benefits of wind power forecasting.

Keywords: renewable energy sources, wind power, forecasting, data mining, big data, artificial intelligence, energy economics, power trading, power grids

Procedia PDF Downloads 482
24458 Extraction of Urban Land Features from TM Landsat Image Using the Land Features Index and Tasseled Cap Transformation

Authors: R. Bouhennache, T. Bouden, A. A. Taleb, A. Chaddad

Abstract:

In this paper we propose a method to map the urban areas. The method uses an arithmetic calculation processed from the land features indexes and Tasseled cap transformation TC of multi spectral Thematic Mapper Landsat TM image. For this purpose the derived indexes image from the original image such SAVI the soil adjusted vegetation index, UI the urban Index, and EBBI the enhanced built up and bareness index were staked to form a new image and the bands were uncorrelated, also the Spectral Angle Mapper (SAM) and Spectral Information Divergence (SID) supervised classification approaches were first applied on the new image TM data using the reference spectra of the spectral library and subsequently the four urban, vegetation, water and soil land cover categories were extracted with their accuracy assessment.The urban features were represented using a logic calculation applied to the brightness, UI-SAVI, NDBI-greenness and EBBI- brightness data sets. The study applied to Blida and mentioned that the urban features can be mapped with an accuracy ranging from 92 % to 95%.

Keywords: EBBI, SAVI, Tasseled Cap Transformation, UI

Procedia PDF Downloads 454
24457 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 251
24456 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 116
24455 4D Monitoring of Subsurface Conditions in Concrete Infrastructure Prior to Failure Using Ground Penetrating Radar

Authors: Lee Tasker, Ali Karrech, Jeffrey Shragge, Matthew Josh

Abstract:

Monitoring for the deterioration of concrete infrastructure is an important assessment tool for an engineer and difficulties can be experienced with monitoring for deterioration within an infrastructure. If a failure crack, or fluid seepage through such a crack, is observed from the surface often the source location of the deterioration is not known. Geophysical methods are used to assist engineers with assessing the subsurface conditions of materials. Techniques such as Ground Penetrating Radar (GPR) provide information on the location of buried infrastructure such as pipes and conduits, positions of reinforcements within concrete blocks, and regions of voids/cavities behind tunnel lining. This experiment underlines the application of GPR as an infrastructure-monitoring tool to highlight and monitor regions of possible deterioration within a concrete test wall due to an increase in the generation of fractures; in particular, during a time period of applied load to a concrete wall up to and including structural failure. A three-point load was applied to a concrete test wall of dimensions 1700 x 600 x 300 mm³ in increments of 10 kN, until the wall structurally failed at 107.6 kN. At each increment of applied load, the load was kept constant and the wall was scanned using GPR along profile lines across the wall surface. The measured radar amplitude responses of the GPR profiles, at each applied load interval, were reconstructed into depth-slice grids and presented at fixed depth-slice intervals. The corresponding depth-slices were subtracted from each data set to compare the radar amplitude response between datasets and monitor for changes in the radar amplitude response. At lower values of applied load (i.e., 0-60 kN), few changes were observed in the difference of radar amplitude responses between data sets. At higher values of applied load (i.e., 100 kN), closer to structural failure, larger differences in radar amplitude response between data sets were highlighted in the GPR data; up to 300% increase in radar amplitude response at some locations between the 0 kN and 100 kN radar datasets. Distinct regions were observed in the 100 kN difference dataset (i.e., 100 kN-0 kN) close to the location of the final failure crack. The key regions observed were a conical feature located between approximately 3.0-12.0 cm depth from surface and a vertical linear feature located approximately 12.1-21.0 cm depth from surface. These key regions have been interpreted as locations exhibiting an increased change in pore-space due to increased mechanical loading, or locations displaying an increase in volume of micro-cracks, or locations showing the development of a larger macro-crack. The experiment showed that GPR is a useful geophysical monitoring tool to assist engineers with highlighting and monitoring regions of large changes of radar amplitude response that may be associated with locations of significant internal structural change (e.g. crack development). GPR is a non-destructive technique that is fast to deploy in a production setting. GPR can assist with reducing risk and costs in future infrastructure maintenance programs by highlighting and monitoring locations within the structure exhibiting large changes in radar amplitude over calendar-time.

Keywords: 4D GPR, engineering geophysics, ground penetrating radar, infrastructure monitoring

Procedia PDF Downloads 144
24454 Heat Transfer Modeling of 'Carabao' Mango (Mangifera indica L.) during Postharvest Hot Water Treatments

Authors: Hazel James P. Agngarayngay, Arnold R. Elepaño

Abstract:

Mango is the third most important export fruit in the Philippines. Despite the expanding mango trade in world market, problems on postharvest losses caused by pests and diseases are still prevalent. Many disease control and pest disinfestation methods have been studied and adopted. Heat treatment is necessary to eliminate pests and diseases to be able to pass the quarantine requirements of importing countries. During heat treatments, temperature and time are critical because fruits can easily be damaged by over-exposure to heat. Modeling the process enables researchers and engineers to study the behaviour of temperature distribution within the fruit over time. Understanding physical processes through modeling and simulation also saves time and resources because of reduced experimentation. This research aimed to simulate the heat transfer mechanism and predict the temperature distribution in ‘Carabao' mangoes during hot water treatment (HWT) and extended hot water treatment (EHWT). The simulation was performed in ANSYS CFD Software, using ANSYS CFX Solver. The simulation process involved model creation, mesh generation, defining the physics of the model, solving the problem, and visualizing the results. Boundary conditions consisted of the convective heat transfer coefficient and a constant free stream temperature. The three-dimensional energy equation for transient conditions was numerically solved to obtain heat flux and transient temperature values. The solver utilized finite volume method of discretization. To validate the simulation, actual data were obtained through experiment. The goodness of fit was evaluated using mean temperature difference (MTD). Also, t-test was used to detect significant differences between the data sets. Results showed that the simulations were able to estimate temperatures accurately with MTD of 0.50 and 0.69 °C for the HWT and EHWT, respectively. This indicates good agreement between the simulated and actual temperature values. The data included in the analysis were taken at different locations of probe punctures within the fruit. Moreover, t-tests showed no significant differences between the two data sets. Maximum heat fluxes obtained at the beginning of the treatments were 394.15 and 262.77 J.s-1 for HWT and EHWT, respectively. These values decreased abruptly at the first 10 seconds and gradual decrease was observed thereafter. Data on heat flux is necessary in the design of heaters. If underestimated, the heating component of a certain machine will not be able to provide enough heat required by certain operations. Otherwise, over-estimation will result in wasting of energy and resources. This study demonstrated that the simulation was able to estimate temperatures accurately. Thus, it can be used to evaluate the influence of various treatment conditions on the temperature-time history in mangoes. When combined with information on insect mortality and quality degradation kinetics, it could predict the efficacy of a particular treatment and guide appropriate selection of treatment conditions. The effect of various parameters on heat transfer rates, such as the boundary and initial conditions as well as the thermal properties of the material, can be systematically studied without performing experiments. Furthermore, the use of ANSYS software in modeling and simulation can be explored in modeling various systems and processes.

Keywords: heat transfer, heat treatment, mango, modeling and simulation

Procedia PDF Downloads 226
24453 Towards End-To-End Disease Prediction from Raw Metagenomic Data

Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker

Abstract:

Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist in multiple steps including quality control, filtering, alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often provide variability and impact the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful and numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of the prediction for each genome. Using two public real-life data-sets as well a simulated one, we demonstrated that this original approach reaches high performance, comparable with the state-of-the-art methods applied directly on processed data though mainstream bioinformatics workflows. These results are encouraging for this proof of concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.

Keywords: deep learning, disease prediction, end-to-end machine learning, metagenomics, multiple instance learning, precision medicine

Procedia PDF Downloads 97
24452 Polarity Classification of Social Media Comments in Turkish

Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras

Abstract:

People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real-time and can have an important impact in shaping people’s way of living. This phenomenon is very well recognized and advantageously used by the market representatives, trying to earn the most from this means. Given the abundance of information, people and organizations are looking for efficient tools that filter the countless data into important information, ready to analyze. This paper is a modest contribution in this field, describing the process of automatically classifying social media comments in the Turkish language into positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are build according to the characteristics of language used in the texts. These features are used later to train, and test a system according to different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resultant high accuracies can be important feedback for decision-makers to improve the business strategies accordingly.

Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews

Procedia PDF Downloads 121
24451 Optimal Rest Interval between Sets in Robot-Based Upper-Arm Rehabilitation

Authors: Virgil Miranda, Gissele Mosqueda, Pablo Delgado, Yimesker Yihun

Abstract:

Muscular fatigue affects the muscle activation that is needed for producing the desired clinical outcome. Integrating optimal muscle relaxation periods into a variety of health care rehabilitation protocols is important to maximize the efficiency of the therapy. In this study, four muscle relaxation periods (30, 60, 90, and 120 seconds) and their effectiveness in producing consistent muscle activation of the muscle biceps brachii between sets of elbow flexion and extension task was investigated among a sample of 10 subjects with no disabilities. The same resting periods were then utilized in a controlled exoskeleton-based exercise for a sample size of 5 subjects and have shown similar results. On average, the muscle activity of the biceps brachii decreased by 0.3% when rested for 30 seconds, and it increased by 1.25%, 0.76%, and 0.82% when using muscle relaxation periods of 60, 90, and 120 seconds, respectively. The preliminary results suggest that a muscle relaxation period of about 60 seconds is needed for optimal continuous muscle activation within rehabilitation regimens. Robot-based rehabilitation is good to produce repetitive tasks with the right intensity, and knowing the optimal resting period will make the automation more effective.

Keywords: rest intervals, muscle biceps brachii, robot rehabilitation, muscle fatigue

Procedia PDF Downloads 155
24450 INCIPIT-CRIS: A Research Information System Combining Linked Data Ontologies and Persistent Identifiers

Authors: David Nogueiras Blanco, Amir Alwash, Arnaud Gaudinat, René Schneider

Abstract:

At a time when the access to and the sharing of information are crucial in the world of research, the use of technologies such as persistent identifiers (PIDs), Current Research Information Systems (CRIS), and ontologies may create platforms for information sharing if they respond to the need of disambiguation of their data by assuring interoperability inside and between other systems. INCIPIT-CRIS is a continuation of the former INCIPIT project, whose goal was to set up an infrastructure for a low-cost attribution of PIDs with high granularity based on Archival Resource Keys (ARKs). INCIPIT-CRIS can be interpreted as a logical consequence and propose a research information management system developed from scratch. The system has been created on and around the Schema.org ontology with a further articulation of the use of ARKs. It is thus built upon the infrastructure previously implemented (i.e., INCIPIT) in order to enhance the persistence of URIs. As a consequence, INCIPIT-CRIS aims to be the hinge between previously separated aspects such as CRIS, ontologies and PIDs in order to produce a powerful system allowing the resolution of disambiguation problems using a combination of an ontology such as Schema.org and unique persistent identifiers such as ARK, allowing the sharing of information through a dedicated platform, but also the interoperability of the system by representing the entirety of the data as RDF triplets. This paper aims to present the implemented solution as well as its simulation in real life. We will describe the underlying ideas and inspirations while going through the logic and the different functionalities implemented and their links with ARKs and Schema.org. Finally, we will discuss the tests performed with our project partner, the Swiss Institute of Bioinformatics (SIB), by the use of large and real-world data sets.

Keywords: current research information systems, linked data, ontologies, persistent identifier, schema.org, semantic web

Procedia PDF Downloads 97
24449 Forecasting Unusual Infection of Patient Used by Irregular Weighted Point Set

Authors: Seema Vaidya

Abstract:

Mining association rule is a key issue in data mining. In any case, the standard models ignore the distinction among the exchanges, and the weighted association rule mining does not transform on databases with just binary attributes. This paper proposes a novel continuous example and executes a tree (FP-tree) structure, which is an increased prefix-tree structure for securing compacted, discriminating data about examples, and makes a fit FP-tree-based mining system, FP enhanced capacity algorithm is used, for mining the complete game plan of examples by illustration incessant development. Here, this paper handles the motivation behind making remarkable and weighted item sets, i.e. rare weighted item set mining issue. The two novel brightness measures are proposed for figuring the infrequent weighted item set mining issue. Also, the algorithm are handled which perform IWI which is more insignificant IWI mining. Moreover we utilized the rare item set for choice based structure. The general issue of the start of reliable definite rules is troublesome for the grounds that hypothetically no inciting technique with no other person can promise the rightness of influenced theories. In this way, this framework expects the disorder with the uncommon signs. Usage study demonstrates that proposed algorithm upgrades the structure which is successful and versatile for mining both long and short diagnostics rules. Structure upgrades aftereffects of foreseeing rare diseases of patient.

Keywords: association rule, data mining, IWI mining, infrequent item set, frequent pattern growth

Procedia PDF Downloads 378
24448 Electron Impact Ionization Cross-Sections for e-C₅H₅N₅ Scattering

Authors: Manoj Kumar

Abstract:

Ionization cross sections of molecules due to electron impact play an important role in chemical processes in various branches of applied physics, such as radiation chemistry, gas discharges, plasmas etching in semiconductors, planetary upper atmospheric physics, mass spectrometry, etc. In the present work, we have calculated the total ionization cross sections for Adenine (C₅H₅N₅), a biologically important molecule, by electron impact in the incident electron energy range from ionization threshold to 2 keV employing a well-known Jain-Khare semiempirical formulation based on Bethe and Möllor cross sections. In the non-availability of the experimental results, the present results are in good agreement qualitatively as well as quantitatively with available theoretical results. The present results drive our confidence for further investigation of complex bio-molecule with better accuracy. Notwithstanding, the present method can deduce reliable cross-sectional data for complex targets with adequate accuracy and may facilitate the acclimatization of calculated cross-sections into atomic molecular cross-section data sets for modeling codes and other applications.

Keywords: electron impact ionization cross-sections, oscillator strength, jain-khare semiempirical approach

Procedia PDF Downloads 84
24447 An Expert System Designed to Be Used with MOEAs for Efficient Portfolio Selection

Authors: Kostas Metaxiotis, Kostas Liagkouras

Abstract:

This study presents an Expert System specially designed to be used with Multiobjective Evolutionary Algorithms (MOEAs) for the solution of the portfolio selection problem. The validation of the proposed hybrid System is done by using data sets from Hang Seng 31 in Hong Kong, DAX 100 in Germany and FTSE 100 in UK. The performance of the proposed system is assessed in comparison with the Non-dominated Sorting Genetic Algorithm II (NSGAII). The evaluation of the performance is based on different performance metrics that evaluate both the proximity of the solutions to the Pareto front and their dispersion on it. The results show that the proposed hybrid system is efficient for the solution of this kind of problems.

Keywords: expert systems, multi-objective optimization, evolutionary algorithms, portfolio selection

Procedia PDF Downloads 407
24446 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating comparison and integration between JSON data to XML and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 86
24445 Global Solar Irradiance: Data Imputation to Analyze Complementarity Studies of Energy in Colombia

Authors: Jeisson A. Estrella, Laura C. Herrera, Cristian A. Arenas

Abstract:

The Colombian electricity sector has been transforming through the insertion of new energy sources to generate electricity, one of them being solar energy, which is being promoted by companies interested in photovoltaic technology. The study of this technology is important for electricity generation in general and for the planning of the sector from the perspective of energy complementarity. Precisely in this last approach is where the project is located; we are interested in answering the concerns about the reliability of the electrical system when climatic phenomena such as El Niño occur or in defining whether it is viable to replace or expand thermoelectric plants. Reliability of the electrical system when climatic phenomena such as El Niño occur, or to define whether it is viable to replace or expand thermoelectric plants with renewable electricity generation systems. In this regard, some difficulties related to the basic information on renewable energy sources from measured data must first be solved, as these come from automatic weather stations. Basic information on renewable energy sources from measured data, since these come from automatic weather stations administered by the Institute of Hydrology, Meteorology and Environmental Studies (IDEAM) and, in the range of study (2005-2019), have significant amounts of missing data. For this reason, the overall objective of the project is to complete the global solar irradiance datasets to obtain time series to develop energy complementarity analyses in a subsequent project. Global solar irradiance data sets to obtain time series that will allow the elaboration of energy complementarity analyses in the following project. The filling of the databases will be done through numerical and statistical methods, which are basic techniques for undergraduate students in technical areas who are starting out as researchers technical areas who are starting out as researchers.

Keywords: time series, global solar irradiance, imputed data, energy complementarity

Procedia PDF Downloads 40
24444 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 349
24443 Formulating Rough Approximations in Information Tables with Possibilistic Information

Authors: Michinori Nakata, Hiroshi Sakai

Abstract:

A rough set, which consists of lower and upper approximations, is formulated in information tables containing possibilistic information. First, lower and upper approximations on the basis of possible world semantics in the same way as Lipski did in the field of incomplete databases are shown in order to clarify fundamentals of rough sets under possibilistic information. Possibility and necessity measures are used, as is done in possibilistic databases. As a result, each object has certain and possible membership degrees to lower and upper approximations, which degrees are the lower and upper bounds. Therefore, the degree that the object belongs to lower and upper approximations is expressed by an interval value. And the complementary property linked with the lower and upper approximations holds, as is valid under complete information. Second, the approach based on indiscernibility relations, which is proposed by Dubois and Prade, are extended in three cases. The first case is that objects used to approximate a set of objects are characterized by possibilistic information. The second case is that objects used to approximate a set of objects with possibilistic information are characterized by complete information. The third case is that objects that are characterized by possibilistic information approximate a set of objects with possibilistic information. The extended approach create the same results as the approach based on possible world semantics. This justifies our extension.

Keywords: rough sets, possibilistic information, possible world semantics, indiscernibility relations, lower approximations, upper approximations

Procedia PDF Downloads 295
24442 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays considering human involved in increasing data development some methods such as data mining to extract science are unavoidable. One of the discussions of data mining is inherent distribution of the data usually the bases creating or receiving such data belong to corporate or non-corporate persons and do not give their information freely to others. Yet there is no guarantee to enable someone to mine special data without entering in the owner’s privacy. Sending data and then gathering them by each vertical or horizontal software depends on the type of their preserving type and also executed to improve data privacy. In this study it was attempted to compare comprehensively preserving data methods; also general methods such as random data, coding and strong and weak points of each one are examined.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 489
24441 Crisis Management and Corporate Political Activism: A Qualitative Analysis of Online Reactions toward Tesla

Authors: Roxana D. Maiorescu-Murphy

Abstract:

In the US, corporations have recently embraced political stances in an attempt to respond to the external pressure exerted by activist groups. To date, research in this area remains in its infancy, and few studies have been conducted on the way stakeholder groups respond to corporate political advocacy in general and in the immediacy of such a corporate announcement in particular. The current study aims to fill in this research void. In addition, the study contributes to an emerging trajectory in the field of crisis management by focusing on the delineation between crises (unexpected events related to products and services) and scandals (crises that spur moral outrage). The present study looked at online reactions in the aftermath of Elon Musk’s endorsement of the Republican party on Twitter. Two data sets were collected from Twitter following two political endorsements made by Elon Musk on May 18, 2022, and June 15, 2022, respectively. The total sample of analysis stemming from the data two sets consisted of N=1,374 user comments written as a response to Musk’s initial tweets. Given the paucity of studies in the preceding research areas, the analysis employed a case study methodology, used in circumstances in which the phenomena to be studied had not been researched before. According to the case study methodology, which answers the questions of how and why a phenomenon occurs, this study responded to the research questions of how online users perceived Tesla and why they did so. The data were analyzed in NVivo by the use of the grounded theory methodology, which implied multiple exposures to the text and the undertaking of an inductive-deductive approach. Through multiple exposures to the data, the researcher ascertained the common themes and subthemes in the online discussion. Each theme and subtheme were later defined and labeled. Additional exposures to the text ensured that these were exhaustive. The results revealed that the CEO’s political endorsements triggered moral outrage, leading to Tesla’s facing a scandal as opposed to a crisis. The moral outrage revolved around the stakeholders’ predominant rejection of a perceived intrusion of an influential figure on a domain reserved for voters. As expected, Musk’s political endorsements led to polarizing opinions, and those who opposed his views engaged in online activism aimed to boycott the Tesla brand. These findings reveal that the moral outrage that characterizes a scandal requires communication practices that differ from those that practitioners currently borrow from the field of crisis management. Specifically, because scandals flourish in online settings, practitioners should regularly monitor stakeholder perceptions and address them in real-time. While promptness is essential when managing crises, it becomes crucial to respond immediately as a scandal is flourishing online. Finally, attempts should be made to distance a brand, its products, and its CEO from the latter’s political views.

Keywords: crisis management, communication management, Tesla, corporate political activism, Elon Musk

Procedia PDF Downloads 60
24440 Use of Socially Assistive Robots in Early Rehabilitation to Promote Mobility for Infants with Motor Delays

Authors: Elena Kokkoni, Prasanna Kannappan, Ashkan Zehfroosh, Effrosyni Mavroudi, Kristina Strother-Garcia, James C. Galloway, Jeffrey Heinz, Rene Vidal, Herbert G. Tanner

Abstract:

Early immobility affects the motor, cognitive, and social development. Current pediatric rehabilitation lacks the technology that will provide the dosage needed to promote mobility for young children at risk. The addition of socially assistive robots in early interventions may help increase the mobility dosage. The aim of this study is to examine the feasibility of an early intervention paradigm where non-walking infants experience independent mobility while socially interacting with robots. A dynamic environment is developed where both the child and the robot interact and learn from each other. The environment involves: 1) a range of physical activities that are goal-oriented, age-appropriate, and ability-matched for the child to perform, 2) the automatic functions that perceive the child’s actions through novel activity recognition algorithms, and decide appropriate actions for the robot, and 3) a networked visual data acquisition system that enables real-time assessment and provides the means to connect child behavior with robot decision-making in real-time. The environment was tested by bringing a two-year old boy with Down syndrome for eight sessions. The child presented delays throughout his motor development with the current being on the acquisition of walking. During the sessions, the child performed physical activities that required complex motor actions (e.g. climbing an inclined platform and/or staircase). During these activities, a (wheeled or humanoid) robot was either performing the action or was at its end point 'signaling' for interaction. From these sessions, information was gathered to develop algorithms to automate the perception of activities which the robot bases its actions on. A Markov Decision Process (MDP) is used to model the intentions of the child. A 'smoothing' technique is used to help identify the model’s parameters which are a critical step when dealing with small data sets such in this paradigm. The child engaged in all activities and socially interacted with the robot across sessions. With time, the child’s mobility was increased, and the frequency and duration of complex and independent motor actions were also increased (e.g. taking independent steps). Simulation results on the combination of the MDP and smoothing support the use of this model in human-robot interaction. Smoothing facilitates learning MDP parameters from small data sets. This paradigm is feasible and provides an insight on how social interaction may elicit mobility actions suggesting a new early intervention paradigm for very young children with motor disabilities. Acknowledgment: This work has been supported by NIH under grant #5R01HD87133.

Keywords: activity recognition, human-robot interaction, machine learning, pediatric rehabilitation

Procedia PDF Downloads 269
24439 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018 which will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows for data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating transferring personal data between IT environments (e.g.: applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 443
24438 Impact of Small and Medium Enterprises on Economic Development in the Gulf Cooperation Council: Quantitative Approaches

Authors: Hanadi Al-Mubaraki, Michael Busler

Abstract:

Both in the developed and developing countries as well as Gulf Cooperation Council (GCC), the small and medium-sized enterprises (SMEs) proven to be main drivers of jobs creation and tools to accelerate economic development and economic diversification. This paper seeks to investigate and identify the strengths and weakness of SME as a veritable tool in economic development. A survey method was used to gather data from 171 SME from Gulf Cooperation Council (GCC). The research methodology uses a quantitative approach (survey) while data were collected with a structured questionnaire and analyzed with several descriptive statistics. The results of the study, therefore, will present sets of the strengths of SME in GCC such as 1) government supported local products (59%), 2) promoting SME local products rather than international products (47%), 3) reduce the legal and administrative procedures of SME establishment (46%) and weakness of SME in GCC such as: 1) lack of funding during the initial phase of the project (46%), 2) lack of liquidity during project continuity (39%), and 3) strong competition in the domestic and global market (38%). The study findings will be guidelines for academia and practitioners such as governments, policymakers, funded organizations, universities and strategic institutions for successful implementation.

Keywords: SME, economic development, GCC, strengths and weaknesses

Procedia PDF Downloads 121
24437 Remote Vital Signs Monitoring in Neonatal Intensive Care Unit Using a Digital Camera

Authors: Fatema-Tuz-Zohra Khanam, Ali Al-Naji, Asanka G. Perera, Kim Gibson, Javaan Chahl

Abstract:

Conventional contact-based vital signs monitoring sensors such as pulse oximeters or electrocardiogram (ECG) may cause discomfort, skin damage, and infections, particularly in neonates with fragile, sensitive skin. Therefore, remote monitoring of the vital sign is desired in both clinical and non-clinical settings to overcome these issues. Camera-based vital signs monitoring is a recent technology for these applications with many positive attributes. However, there are still limited camera-based studies on neonates in a clinical setting. In this study, the heart rate (HR) and respiratory rate (RR) of eight infants at the Neonatal Intensive Care Unit (NICU) in Flinders Medical Centre were remotely monitored using a digital camera applying color and motion-based computational methods. The region-of-interest (ROI) was efficiently selected by incorporating an image decomposition method. Furthermore, spatial averaging, spectral analysis, band-pass filtering, and peak detection were also used to extract both HR and RR. The experimental results were validated with the ground truth data obtained from an ECG monitor and showed a strong correlation using the Pearson correlation coefficient (PCC) 0.9794 and 0.9412 for HR and RR, respectively. The RMSE between camera-based data and ECG data for HR and RR were 2.84 beats/min and 2.91 breaths/min, respectively. A Bland Altman analysis of the data also showed a close correlation between both data sets with a mean bias of 0.60 beats/min and 1 breath/min, and the lower and upper limit of agreement -4.9 to + 6.1 beats/min and -4.4 to +6.4 breaths/min for both HR and RR, respectively. Therefore, video camera imaging may replace conventional contact-based monitoring in NICU and has potential applications in other contexts such as home health monitoring.

Keywords: neonates, NICU, digital camera, heart rate, respiratory rate, image decomposition

Procedia PDF Downloads 82
24436 Determining the Extent and Direction of Relief Transformations Caused by Ski Run Construction Using LIDAR Data

Authors: Joanna Fidelus-Orzechowska, Dominika Wronska-Walach, Jaroslaw Cebulski

Abstract:

Mountain areas are very often exposed to numerous transformations connected with the development of tourist infrastructure. In mountain areas in Poland ski tourism is very popular, so agricultural areas are often transformed into tourist areas. The construction of new ski runs can change the direction and rate of slope development. The main aim of this research was to determine geomorphological and hydrological changes within slopes caused by ski run constructions. The study was conducted in the Remiaszów catchment in the Inner Polish Carpathians (southern Poland). The mean elevation of the catchment is 859 m a.s.l. and the maximum is 946 m a.s.l. The surface area of the catchment is 1.16 km2, of which 16.8% is the area of the two studied ski runs. The studied ski runs were constructed in 2014 and 2015. In order to determine the relief transformations connected with new ski run construction high resolution LIDAR data was analyzed. The general relief changes in the studied catchment were determined on the basis of ALS (Airborne Laser Scanning ) data obtained before (2013) and after (2016) ski run construction. Based on the two sets of ALS data a digital elevation models of differences (DoDs) was created, which made it possible to determine the quantitative relief changes in the entire studied catchment. Additionally, cross and longitudinal profiles were calculated within slopes where new ski runs were built. Detailed data on relief changes within selected test surfaces was obtained based on TLS (Terrestrial Laser Scanning). Hydrological changes within the analyzed catchment were determined based on the convergence and divergence index. The study shows that the construction of the new ski runs caused significant geomorphological and hydrological changes in the entire studied catchment. However, the most important changes were identified within the ski slopes. After the construction of ski runs the entire catchment area lowered about 0.02 m. Hydrological changes in the studied catchment mainly led to the interruption of surface runoff pathways and changes in runoff direction and geometry.

Keywords: hydrological changes, mountain areas, relief transformations, ski run construction

Procedia PDF Downloads 114
24435 Discovering Semantic Links Between Synonyms, Hyponyms and Hypernyms

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This proposal aims for semantic enrichment between glossaries using the Simple Knowledge Organization System (SKOS) vocabulary to discover synonyms, hyponyms and hyperonyms semiautomatically, in Brazilian Portuguese, generating new semantic relationships based on WordNet. To evaluate the quality of this proposed model, experiments were performed by the use of two sets containing new relations, being one generated automatically and the other manually mapped by the domain expert. The applied evaluation metrics were precision, recall, f-score, and confidence interval. The results obtained demonstrate that the applied method in the field of Oil Production and Extraction (E&P) is effective, which suggests that it can be used to improve the quality of terminological mappings. The procedure, although adding complexity in its elaboration, can be reproduced in others domains.

Keywords: ontology matching, mapping enrichment, semantic web, linked data, SKOS

Procedia PDF Downloads 179
24434 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 367