Search results for: data clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24589

Search results for: data clustering

24049 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 230
24048 Molecular Clustering and Velocity Increase in Converging-Diverging Nozzle in Molecular Dynamics Simulation

Authors: Jeoungsu Na, Jaehawn Lee, Changil Hong, Suhee Kim

Abstract:

A molecular dynamics simulation in a converging-diverging nozzle was performed to study molecular collisions and their influence to average flow velocity according to a variety of vacuum levels. The static pressures and the dynamic pressure exerted by the molecule collision on the selected walls were compared to figure out the intensity variances of the directional flows. With pressure differences constant between the entrance and the exit of the nozzle, the numerical experiment was performed for molecular velocities and directional flows. The result shows that the velocities increased at the nozzle exit as the vacuum level gets higher in that area because less molecular collisions.

Keywords: cavitation, molecular collision, nozzle, vacuum, velocity increase

Procedia PDF Downloads 413
24047 A Review on Intelligent Systems for Geoscience

Authors: R Palson Kennedy, P.Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of geosciences present unique difficulties for the study of intelligent systems. Geosciences data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second section contains key themes and sharing experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 119
24046 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 111
24045 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze big data increasing exponentially with traditional technologies. Hadoop is a new technology to make that possible. R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop that integrates R and Hadoop environment, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed our RHadoop system was much faster as the number of data nodes increases. We also compared the performance of our RHadoop with lm function and big lm packages available on big memory. The results showed that our RHadoop was faster than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases.

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 413
24044 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method, that only extracts part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 74
24043 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor network. One of the important tasks of data aggregation is positioning of the aggregator points. There are a lot of works done on data aggregation. But, efficient positioning of the aggregators points is not focused so much. In this paper, authors are focusing on the positioning or the placement of the aggregation points in wireless sensor network. Authors proposed an algorithm to select the aggregators positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 137
24042 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects for implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments on spatial econometrics to model count data, in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework, necessary for structural consistent spatial econometric count models, incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, to the constrains and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions on the processing of count data, in a spatial hierarchical Bayesian econometric context.

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 569
24041 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the secret of the continual progression data that new data management solutions have been emerged: The NoSQL databases. They crossed several areas like personalization, profile management, big data in real-time, content management, catalog, view of customers, mobile applications, internet of things, digital communication and fraud detection. Nowadays, these database management systems are increasing. These systems store data very well and with the trend of big data, a new challenge’s store demands new structures and methods for managing enterprise data. The new intelligent machine in the e-learning sector, thrives on more data, so smart machines can learn more and faster. The robotics are our use case to focus on our test. The implementation of NoSQL for Robotics wrestle all the data they acquire into usable form because with the ordinary type of robotics; we are facing very big limits to manage and find the exact information in real-time. Our original proposed approach was demonstrated by experimental studies and running example used as a use case.

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 327
24040 Microstructure Evolution and Pre-transformation Microstructure Reconstruction in Ti-6Al-4V Alloy

Authors: Shreyash Hadke, Manendra Singh Parihar, Rajesh Khatirkar

Abstract:

In the present investigation, the variation in the microstructure with the changes in the heat treatment conditions i.e. temperature and time was observed. Ti-6Al-4V alloy was subject to solution annealing treatments in β (1066C) and α+β phase (930C and 850C) followed by quenching, air cooling and furnace cooling to room temperature respectively. The effect of solution annealing and cooling on the microstructure was studied by using optical microscopy (OM), scanning electron microscopy (SEM), electron backscattered diffraction (EBSD) and x-ray diffraction (XRD). The chemical composition of the β phase for different conditions was determined with the help of energy dispersive spectrometer (EDS) attached to SEM. Furnace cooling resulted in the development of coarser structure (α+β), while air cooling resulted in much finer structure with widmanstatten morphology of α at the grain boundaries. Quenching from solution annealing temperature formed α’ martensite, their proportion being dependent on the temperature in β phase field. It is well known that the transformation of β to α follows Burger orientation relationship (OR). In order to reconstruct the microstructure of parent β phase, a MATLAB code was written using neighbor-to-neighbor, triplet method and Tari’s method. The code was tested on the annealed samples (1066C solution annealing temperature followed by furnace cooling to room temperature). The parent phase data thus generated was then plotted using the TSL-OIM software. The reconstruction results of the above methods were compared and analyzed. The Tari’s approach (clustering approach) gave better results compared to neighbor-to-neighbor and triplet method but the time taken by the triplet method was least compared to the other two methods.

Keywords: Ti-6Al-4V alloy, microstructure, electron backscattered diffraction, parent phase reconstruction

Procedia PDF Downloads 428
24039 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data

Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim

Abstract:

Smart-card data are expected to provide information on activity pattern as an alternative to conventional person trip surveys. The focus of this study is to propose a method for training the person trip surveys to supplement the smart-card data that does not contain the purpose of each trip. We selected only available features from smart card data such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations to train the survey data. XGboost, which is state-of-the-art tree-based ensemble classifier, was used to train data from multiple sources. This classifier uses a more regularized model formalization to control the over-fitting and show very fast execution time with well-performance. The validation results showed that proposed method efficiently estimated the trip purpose. GIS data of station and duration of stay at the destination were significant features in modeling trip purpose.

Keywords: activity pattern, data fusion, smart-card, XGboost

Procedia PDF Downloads 220
24038 Estimation of Source Parameters and Moment Tensor Solution through Waveform Modeling of 2013 Kishtwar Earthquake

Authors: Shveta Puri, Shiv Jyoti Pandey, G. M. Bhat, Neha Raina

Abstract:

TheJammu and Kashmir region of the Northwest Himalaya had witnessed many devastating earthquakes in the recent past and has remained unexplored for any kind of seismic investigations except scanty records of the earthquakes that occurred in this region in the past. In this study, we have used local seismic data of year 2013 that was recorded by the network of Broadband Seismographs in J&K. During this period, our seismic stations recorded about 207 earthquakes including two moderate events of Mw 5.7 on 1st May, 2013 and Mw 5.1 of 2nd August, 2013.We analyzed the events of Mw 3-4.6 and the main events only (for minimizing the error) for source parameters, b value and sense of movement through waveform modeling for understanding seismotectonic and seismic hazard of the region. It has been observed that most of the events are bounded between 32.9° N – 33.3° N latitude and 75.4° E – 76.1° E longitudes, Moment Magnitude (Mw) ranges from Mw 3 to 5.7, Source radius (r), from 0.21 to 3.5 km, stress drop, from 1.90 bars to 71.1 bars and Corner frequency, from 0.39 – 6.06 Hz. The b-value for this region was found to be 0.83±0 from these events which are lower than the normal value (b=1), indicating the area is under high stress. The travel time inversion and waveform inversion method suggest focal depth up to 10 km probably above the detachment depth of the Himalayan region. Moment tensor solution of the (Mw 5.1, 02:32:47 UTC) main event of 2ndAugust suggested that the source fault is striking at 295° with dip of 33° and rake value of 85°. It was found that these events form intense clustering of small to moderate events within a narrow zone between Panjal Thrust and Kishtwar Window. Moment tensor solution of the main events and their aftershocks indicating thrust type of movement is occurring in this region.

Keywords: b-value, moment tensor, seismotectonics, source parameters

Procedia PDF Downloads 294
24037 Genetic Structuring of Four Tectona grandis L. F. Seed Production Areas in Southern India

Authors: P. M. Sreekanth

Abstract:

Teak (Tectona grandis L. f.) is a tree species indigenous to India and other Southeastern countries. It produces high-value timber and is easily established in plantations. Reforestation requires a constant supply of high quality seeds. Seed Production Areas (SPA) of teak are improved stands used for collection of open-pollinated quality seeds in large quantities. Information on the genetic diversity of major teak SPAs in India is scanty. The genetic structure of four important seed production areas of Kerala State in Southern India was analyzed employing amplified fragment length polymorphism markers using ten selective primer combinations on 80 samples (4 populations X 20 trees). The study revealed that the gene diversity of the SPAs varied from 0.169 (Konni SPA) to 0.203 (Wayanad SPA). The percentage of polymorphic loci ranged from 74.42 (Parambikulam SPA) to 84.06 (Konni SPA). The mean total gene diversity index (HT) of all the four SPAs was 0.2296 ±0.02. A high proportion of genetic diversity was observed within the populations (83%) while diversity between populations was lower (17%) (GST = 0.17). Principal coordinate analysis and STRUCTURE analysis of the genotypes indicated that the pattern of clustering was in accordance with the origin and geographic location of SPAs, indicating specific identity of each population. A UPGMA dendrogram was prepared and showed that all the twenty samples from each of Konni and Parambikulam SPAs clustered into two separate groups, respectively. However, five Nilambur genotypes and one Wayanad genotype intruded into the Konni cluster. The higher gene flow estimated (Nm = 2.4) reflected the inclusion of Konni origin planting stock in the Nilambur and Wayanad plantations. Evidence for population structure investigated using 3D Principal Coordinate Analysis of FAMD software 1.30 indicated that the pattern of clustering was in accordance with the origin of SPAs. The present study showed that assessment of genetic diversity in seed production plantations can be achieved using AFLP markers. The AFLP fingerprinting was also capable of identifying the geographical origin of planting stock and there by revealing the occurrence of the errors in genotype labeling. Molecular marker-based selective culling of genetically similar trees from a stand so as to increase the genetic base of seed production areas could be a new proposition to improve quality of seeds required for raising commercial plantations of teak. The technique can also be used to assess the genetic diversity status of plus trees within provenances during their selection for raising clonal seed orchards for assuring the quality of seeds available for raising future plantations.

Keywords: AFLP, genetic structure, spa, teak

Procedia PDF Downloads 294
24036 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the model-agnostic meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to an exponential growth of computation, this paper also proposes a key data extraction method that only extract part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification.

Procedia PDF Downloads 116
24035 Revolutionizing Traditional Farming Using Big Data/Cloud Computing: A Review on Vertical Farming

Authors: Milind Chaudhari, Suhail Balasinor

Abstract:

Due to massive deforestation and an ever-increasing population, the organic content of the soil is depleting at a much faster rate. Due to this, there is a big chance that the entire food production in the world will drop by 40% in the next two decades. Vertical farming can help in aiding food production by leveraging big data and cloud computing to ensure plants are grown naturally by providing the optimum nutrients sunlight by analyzing millions of data points. This paper outlines the most important parameters in vertical farming and how a combination of big data and AI helps in calculating and analyzing these millions of data points. Finally, the paper outlines how different organizations are controlling the indoor environment by leveraging big data in enhancing food quantity and quality.

Keywords: big data, IoT, vertical farming, indoor farming

Procedia PDF Downloads 152
24034 Cardiac Biosignal and Adaptation in Confined Nuclear Submarine Patrol

Authors: B. Lefranc, C. Aufauvre-Poupon, C. Martin-Krumm, M. Trousselard

Abstract:

Isolated and confined environments (ICE) present several challenges which may adversely affect human’s psychology and physiology. Submariners in Sub-Surface Ballistic Nuclear (SSBN) mission exposed to these environmental constraints must be able to perform complex tasks as part of their normal duties, as well as during crisis periods when emergency actions are required or imminent. The operational and environmental constraints they face contribute to challenge human adaptability. The impact of such a constrained environment has yet to be explored. Establishing a knowledge framework is a determining factor, particularly in view of the next long space travels. Ensuring that the crews are maintained in optimal operational conditions is a real challenge because the success of the mission depends on them. This study focused on the evaluation of the impact of stress on mental health and sensory degradation of submariners during a mission on SSBN using cardiac biosignal (heart rate variability, HRV) clustering. This is a pragmatic exploratory study of a prospective cohort included 19 submariner volunteers. HRV was recorded at baseline to classify by clustering the submariners according to their stress level based on parasympathetic (Pa) activity. Impacts of high Pa (HPa) versus low Pa (LPa) level at baseline were assessed on emotional state and sensory perception (interoception and exteroception) as a cardiac biosignal during the patrol and at a recovery time one month after. Whatever the time, no significant difference was found in mental health between groups. There are significant differences in the interoceptive, exteroceptive and physiological functioning during the patrol and at recovery time. To sum up, compared to the LPa group, the HPa maintains a higher level in psychosensory functioning during the patrol and at recovery but exhibits a decrease in Pa level. The HPa group has less adaptable HRV characteristics, less unpredictability and flexibility of cardiac biosignals while the LPa group increases them during the patrol and at recovery time. This dissociation between psychosensory and physiological adaptation suggests two treatment modalities for ICE environments. To our best knowledge, our results are the first to highlight the impact of physiological differences in the HRV profile on the adaptability of submariners. Further studies are needed to evaluate the negative emotional and cognitive effects of ICEs based on the cardiac profile. Artificial intelligence offers a promising future for maintaining high level of operational conditions. These future perspectives will not only allow submariners to be better prepared, but also to design feasible countermeasures that will help support analog environments that bring us closer to a trip to Mars.

Keywords: adaptation, exteroception, HRV, ICE, interoception, SSBN

Procedia PDF Downloads 160
24033 Data Challenges Facing Implementation of Road Safety Management Systems in Egypt

Authors: A. Anis, W. Bekheet, A. El Hakim

Abstract:

Implementing a Road Safety Management System (SMS) in a crowded developing country such as Egypt is a necessity. Beginning a sustainable SMS requires a comprehensive reliable data system for all information pertinent to road crashes. In this paper, a survey for the available data in Egypt and validating it for using in an SMS in Egypt. The research provides some missing data, and refer to the unavailable data in Egypt, looking forward to the contribution of the scientific society, the authorities, and the public in solving the problem of missing or unreliable crash data. The required data for implementing an SMS in Egypt are divided into three categories; the first is available data such as fatality and injury rates and it is proven in this research that it may be inconsistent and unreliable, the second category of data is not available, but it may be estimated, an example of estimating vehicle cost is available in this research, the third is not available and can be measured case by case such as the functional and geometric properties of a facility. Some inquiries are provided in this research for the scientific society, such as how to improve the links among stakeholders of road safety in order to obtain a consistent, non-biased, and reliable data system.

Keywords: road safety management system, road crash, road fatality, road injury

Procedia PDF Downloads 100
24032 Big Data-Driven Smart Policing: Big Data-Based Patrol Car Dispatching in Abu Dhabi, UAE

Authors: Oualid Walid Ben Ali

Abstract:

Big Data has become one of the buzzwords today. The recent explosion of digital data has led the organization, either private or public, to a new era towards a more efficient decision making. At some point, business decided to use that concept in order to learn what make their clients tick with phrases like ‘sales funnel’ analysis, ‘actionable insights’, and ‘positive business impact’. So, it stands to reason that Big Data was viewed through green (read: money) colored lenses. Somewhere along the line, however someone realized that collecting and processing data doesn’t have to be for business purpose only, but also could be used for other purposes to assist law enforcement or to improve policing or in road safety. This paper presents briefly, how Big Data have been used in the fields of policing order to improve the decision making process in the daily operation of the police. As example, we present a big-data driven system which is sued to accurately dispatch the patrol cars in a geographic environment. The system is also used to allocate, in real-time, the nearest patrol car to the location of an incident. This system has been implemented and applied in the Emirate of Abu Dhabi in the UAE.

Keywords: big data, big data analytics, patrol car allocation, dispatching, GIS, intelligent, Abu Dhabi, police, UAE

Procedia PDF Downloads 466
24031 Using Nonhomogeneous Poisson Process with Compound Distribution to Price Catastrophe Options

Authors: Rong-Tsorng Wang

Abstract:

In this paper, we derive a pricing formula for catastrophe equity put options (or CatEPut) with non-homogeneous loss and approximated compound distributions. We assume that the loss claims arrival process is a nonhomogeneous Poisson process (NHPP) representing the clustering occurrences of loss claims, the size of loss claims is a sequence of independent and identically distributed random variables, and the accumulated loss distribution forms a compound distribution and is approximated by a heavy-tailed distribution. A numerical example is given to calibrate parameters, and we discuss how the value of CatEPut is affected by the changes of parameters in the pricing model we provided.

Keywords: catastrophe equity put options, compound distributions, nonhomogeneous Poisson process, pricing model

Procedia PDF Downloads 143
24030 Optimized Cluster Head Selection Algorithm Based on LEACH Protocol for Wireless Sensor Networks

Authors: Wided Abidi, Tahar Ezzedine

Abstract:

Low-Energy Adaptive Clustering Hierarchy (LEACH) has been considered as one of the effective hierarchical routing algorithms that optimize energy and prolong the lifetime of network. Since the selection of Cluster Head (CH) in LEACH is carried out randomly, in this paper, we propose an approach of electing CH based on LEACH protocol. In other words, we present a formula for calculating the threshold responsible for CH election. In fact, we adopt three principle criteria: the remaining energy of node, the number of neighbors within cluster range and the distance between node and CH. Simulation results show that our proposed approach beats LEACH protocol in regards of prolonging the lifetime of network and saving residual energy.

Keywords: wireless sensors networks, LEACH protocol, cluster head election, energy efficiency

Procedia PDF Downloads 311
24029 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. Experiments so far reveal that data mining methods discover useful knowledge from the multicity urban data. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: data mining, environmental modeling, sustainability, urban planning

Procedia PDF Downloads 275
24028 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition

Authors: Aref Ghafouri, Mohammad javad Mollakazemi, Farhad Asadi

Abstract:

In this paper, model order reduction method is used for approximation in linear and nonlinearity aspects in some experimental data. This method can be used for obtaining offline reduced model for approximation of experimental data and can produce and follow the data and order of system and also it can match to experimental data in some frequency ratios. In this study, the method is compared in different experimental data and influence of choosing of order of the model reduction for obtaining the best and sufficient matching condition for following the data is investigated in format of imaginary and reality part of the frequency response curve and finally the effect and important parameter of number of order reduction in nonlinear experimental data is explained further.

Keywords: frequency response, order of model reduction, frequency matching condition, nonlinear experimental data

Procedia PDF Downloads 377
24027 Integrating Radar Sensors with an Autonomous Vehicle Simulator for an Enhanced Smart Parking Management System

Authors: Mohamed Gazzeh, Bradley Null, Fethi Tlili, Hichem Besbes

Abstract:

The burgeoning global ownership of personal vehicles has posed a significant strain on urban infrastructure, notably parking facilities, leading to traffic congestion and environmental concerns. Effective parking management systems (PMS) are indispensable for optimizing urban traffic flow and reducing emissions. The most commonly deployed systems nowadays rely on computer vision technology. This paper explores the integration of radar sensors and simulation in the context of smart parking management. We concentrate on radar sensors due to their versatility and utility in automotive applications, which extends to PMS. Additionally, radar sensors play a crucial role in driver assistance systems and autonomous vehicle development. However, the resource-intensive nature of radar data collection for algorithm development and testing necessitates innovative solutions. Simulation, particularly the monoDrive simulator, an internal development tool used by NI the Test and Measurement division of Emerson, offers a practical means to overcome this challenge. The primary objectives of this study encompass simulating radar sensors to generate a substantial dataset for algorithm development, testing, and, critically, assessing the transferability of models between simulated and real radar data. We focus on occupancy detection in parking as a practical use case, categorizing each parking space as vacant or occupied. The simulation approach using monoDrive enables algorithm validation and reliability assessment for virtual radar sensors. It meticulously designed various parking scenarios, involving manual measurements of parking spot coordinates, orientations, and the utilization of TI AWR1843 radar. To create a diverse dataset, we generated 4950 scenarios, comprising a total of 455,400 parking spots. This extensive dataset encompasses radar configuration details, ground truth occupancy information, radar detections, and associated object attributes such as range, azimuth, elevation, radar cross-section, and velocity data. The paper also addresses the intricacies and challenges of real-world radar data collection, highlighting the advantages of simulation in producing radar data for parking lot applications. We developed classification models based on Support Vector Machines (SVM) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), exclusively trained and evaluated on simulated data. Subsequently, we applied these models to real-world data, comparing their performance against the monoDrive dataset. The study demonstrates the feasibility of transferring models from a simulated environment to real-world applications, achieving an impressive accuracy score of 92% using only one radar sensor. This finding underscores the potential of radar sensors and simulation in the development of smart parking management systems, offering significant benefits for improving urban mobility and reducing environmental impact. The integration of radar sensors and simulation represents a promising avenue for enhancing smart parking management systems, addressing the challenges posed by the exponential growth in personal vehicle ownership. This research contributes valuable insights into the practicality of using simulated radar data in real-world applications and underscores the role of radar technology in advancing urban sustainability.

Keywords: autonomous vehicle simulator, FMCW radar sensors, occupancy detection, smart parking management, transferability of models

Procedia PDF Downloads 61
24026 Efficient Subgoal Discovery for Hierarchical Reinforcement Learning Using Local Computations

Authors: Adrian Millea

Abstract:

In hierarchical reinforcement learning, one of the main issues encountered is the discovery of subgoal states or options (which are policies reaching subgoal states) by partitioning the environment in a meaningful way. This partitioning usually requires an expensive global clustering operation or eigendecomposition of the Laplacian of the states graph. We propose a local solution to this issue, much more efficient than algorithms using global information, which successfully discovers subgoal states by computing a simple function, which we call heterogeneity for each state as a function of its neighbors. Moreover, we construct a value function using the difference in heterogeneity from one step to the next, as reward, such that we are able to explore the state space much more efficiently than say epsilon-greedy. The same principle can then be applied to higher level of the hierarchy, where now states are subgoals discovered at the level below.

Keywords: exploration, hierarchical reinforcement learning, locality, options, value functions

Procedia PDF Downloads 143
24025 An Empirical Study of the Impacts of Big Data on Firm Performance

Authors: Thuan Nguyen

Abstract:

In the present time, data to a data-driven knowledge-based economy is the same as oil to the industrial age hundreds of years ago. Data is everywhere in vast volumes! Big data analytics is expected to help firms not only efficiently improve performance but also completely transform how they should run their business. However, employing the emergent technology successfully is not easy, and assessing the roles of big data in improving firm performance is even much harder. There was a lack of studies that have examined the impacts of big data analytics on organizational performance. This study aimed to fill the gap. The present study suggested using firms’ intellectual capital as a proxy for big data in evaluating its impact on organizational performance. The present study employed the Value Added Intellectual Coefficient method to measure firm intellectual capital, via its three main components: human capital efficiency, structural capital efficiency, and capital employed efficiency, and then used the structural equation modeling technique to model the data and test the models. The financial fundamental and market data of 100 randomly selected publicly listed firms were collected. The results of the tests showed that only human capital efficiency had a significant positive impact on firm profitability, which highlighted the prominent human role in the impact of big data technology.

Keywords: big data, big data analytics, intellectual capital, organizational performance, value added intellectual coefficient

Procedia PDF Downloads 217
24024 Automated Test Data Generation For some types of Algorithm

Authors: Hitesh Tahbildar

Abstract:

The cost of test data generation for a program is computationally very high. In general case, no algorithm to generate test data for all types of algorithms has been found. The cost of generating test data for different types of algorithm is different. Till date, people are emphasizing the need to generate test data for different types of programming constructs rather than different types of algorithms. The test data generation methods have been implemented to find heuristics for different types of algorithms. Some algorithms that includes divide and conquer, backtracking, greedy approach, dynamic programming to find the minimum cost of test data generation have been tested. Our experimental results say that some of these types of algorithm can be used as a necessary condition for selecting heuristics and programming constructs are sufficient condition for selecting our heuristics. Finally we recommend the different heuristics for test data generation to be selected for different types of algorithms.

Keywords: ongest path, saturation point, lmax, kL, kS

Procedia PDF Downloads 382
24023 Quantified Metabolomics for the Determination of Phenotypes and Biomarkers across Species in Health and Disease

Authors: Miroslava Cuperlovic-Culf, Lipu Wang, Ketty Boyle, Nadine Makley, Ian Burton, Anissa Belkaid, Mohamed Touaibia, Marc E. Surrette

Abstract:

Metabolic changes are one of the major factors in the development of a variety of diseases in various species. Metabolism of agricultural plants is altered the following infection with pathogens sometimes contributing to resistance. At the same time, pathogens use metabolites for infection and progression. In humans, metabolism is a hallmark of cancer development for example. Quantified metabolomics data combined with other omics or clinical data and analyzed using various unsupervised and supervised methods can lead to better diagnosis and prognosis. It can also provide information about resistance as well as contribute knowledge of compounds significant for disease progression or prevention. In this work, different methods for metabolomics quantification and analysis from Nuclear Magnetic Resonance (NMR) measurements that are used for investigation of disease development in wheat and human cells will be presented. One-dimensional 1H NMR spectra are used extensively for metabolic profiling due to their high reliability, wide range of applicability, speed, trivial sample preparation and low cost. This presentation will describe a new method for metabolite quantification from NMR data that combines alignment of spectra of standards to sample spectra followed by multivariate linear regression optimization of spectra of assigned metabolites to samples’ spectra. Several different alignment methods were tested and multivariate linear regression result has been compared with other quantification methods. Quantified metabolomics data can be analyzed in the variety of ways and we will present different clustering methods used for phenotype determination, network analysis providing knowledge about the relationships between metabolites through metabolic network as well as biomarker selection providing novel markers. These analysis methods have been utilized for the investigation of fusarium head blight resistance in wheat cultivars as well as analysis of the effect of estrogen receptor and carbonic anhydrase activation and inhibition on breast cancer cell metabolism. Metabolic changes in spikelet’s of wheat cultivars FL62R1, Stettler, MuchMore and Sumai3 following fusarium graminearum infection were explored. Extensive 1D 1H and 2D NMR measurements provided information for detailed metabolite assignment and quantification leading to possible metabolic markers discriminating resistance level in wheat subtypes. Quantification data is compared to results obtained using other published methods. Fusarium infection induced metabolic changes in different wheat varieties are discussed in the context of metabolic network and resistance. Quantitative metabolomics has been used for the investigation of the effect of targeted enzyme inhibition in cancer. In this work, the effect of 17 β -estradiol and ferulic acid on metabolism of ER+ breast cancer cells has been compared to their effect on ER- control cells. The effect of the inhibitors of carbonic anhydrase on the observed metabolic changes resulting from ER activation has also been determined. Metabolic profiles were studied using 1D and 2D metabolomic NMR experiments, combined with the identification and quantification of metabolites, and the annotation of the results is provided in the context of biochemical pathways.

Keywords: metabolic biomarkers, metabolic network, metabolomics, multivariate linear regression, NMR quantification, quantified metabolomics, spectral alignment

Procedia PDF Downloads 322
24022 Investigation and Analysis of Residential Building Energy End-Use Profile in Hot and Humid Area with Reference to Zhuhai City in China

Authors: Qingqing Feng, S. Thomas Ng, Frank Xu

Abstract:

Energy consumption in domestic sector has been increasing rapidly in China all along these years. Confronted with environmental challenges, the international society has made a concerted effort by setting the Paris Agreement, the Sustainable Development Goals, and the New Urban Agenda. Thus it’s very important for China to put forward reasonable countermeasures to boost building energy conservation which necessitates looking into the actuality of residential energy end-use profile and its influence factors. In this study, questionnaire surveys have been conducted in Zhuhai city in China, a typical city in hot summer warm winter climate zone. The data solicited mainly include the occupancy schedule, building’s information, residents’ information, household energy uses, the type, quantity and use patterns of appliances and occupants’ satisfaction. Over 200 valid samples have been collected through face-to-face interviews. Descriptive analysis, clustering analysis, correlation analysis and sensitivity analysis were then conducted on the dataset to understand the energy end-use profile. The findings identify: 1) several typical clusters of occupancy patterns and appliances utilization patterns; 2) the top three sensitive factors influencing energy consumption; 3) the correlations between satisfaction and energy consumption. For China with many different climates zones, it’s difficult to find a silver bullet on energy conservation. The aim of this paper is to provide a theoretical basis for multi-stakeholders including policy makers, residents, and academic communities to formulate reasonable energy saving blueprints for hot and humid urban residential buildings in China.

Keywords: residential building, energy end-use profile, questionnaire survey, sustainability

Procedia PDF Downloads 109
24021 The Perspective on Data Collection Instruments for Younger Learners

Authors: Hatice Kübra Koç

Abstract:

For academia, collecting reliable and valid data is one of the most significant issues for researchers. However, it is not the same procedure for all different target groups; meanwhile, during data collection from teenagers, young adults, or adults, researchers can use common data collection tools such as questionnaires, interviews, and semi-structured interviews; yet, for young learners and very young ones, these reliable and valid data collection tools cannot be easily designed or applied by the researchers. In this study, firstly, common data collection tools are examined for ‘very young’ and ‘young learners’ participant groups since it is thought that the quality and efficiency of an academic study is mainly based on its valid and correct data collection and data analysis procedure. Secondly, two different data collection instruments for very young and young learners are stated as discussing the efficacy of them. Finally, a suggested data collection tool – a performance-based questionnaire- which is specifically developed for ‘very young’ and ‘young learners’ participant groups in the field of teaching English to young learners as a foreign language is presented in this current study. The designing procedure and suggested items/factors for the suggested data collection tool are accordingly revealed at the end of the study to help researchers have studied with young and very learners.

Keywords: data collection instruments, performance-based questionnaire, young learners, very young learners

Procedia PDF Downloads 59
24020 Genome-Wide Identification and Characterization of MLO Family Genes in Pumpkin (Cucurbita maxima Duch.)

Authors: Khin Thanda Win, Chunying Zhang, Sanghyeob Lee

Abstract:

Mildew resistance locus o (Mlo), a plant-specific gene family with seven-transmembrane (TM), plays an important role in plant resistance to powdery mildew (PM). PM caused by Podosphaera xanthii is a widespread plant disease and probably represents the major fungal threat for many Cucurbits. The recent Cucurbita maxima genome sequence data provides an opportunity to identify and characterize the MLO gene family in this species. Total twenty genes (designated CmaMLO1 through CmaMLO20) have been identified by using an in silico cloning method with the MLO gene sequences of Cucumis sativus, Cucumis melo, Citrullus lanatus and Cucurbita pepo as probes. These CmaMLOs were evenly distributed on 15 chromosomes of 20 C. maxima chromosomes without any obvious clustering. Multiple sequence alignment showed that the common structural features of MLO gene family, such as TM domains, a calmodulin-binding domain and 30 important amino acid residues for MLO function, were well conserved. Phylogenetic analysis of the CmaMLO genes and other plant species reveals seven different clades (I through VII) and only clade IV is specific to monocots (rice, barley, and wheat). Phylogenetic and structural analyses provided preliminary evidence that five genes belonged to clade V could be the susceptibility genes which may play the importance role in PM resistance. This study is the first comprehensive report on MLO genes in C. maxima to our knowledge. These findings will facilitate the functional analysis of the MLOs related to PM susceptibility and are valuable resources for the development of disease resistance in pumpkin.

Keywords: Mildew resistance locus o (Mlo), powdery mildew, phylogenetic relationship, susceptibility genes

Procedia PDF Downloads 165