Search results for: LiDAR datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 796

526 1/Sigma Term Weighting Scheme for Sentiment Analysis

Authors: Hanan Alshaher, Jinsheng Xu

Abstract:

Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.

Keywords: 1/sigma, natural language processing, sentiment analysis, term weighting scheme, text classification

Procedia PDF Downloads 182
525 A Dynamic Spatial Panel Data Analysis on Renter-Occupied Multifamily Housing DC

Authors: Jose Funes, Jeff Sauer, Laixiang Sun

Abstract:

This research examines determinants of multifamily housing development and spillovers in the District of Columbia. A range of socioeconomic factors related to income distribution, productivity, and land use policies are thought to influence the development in contemporary U.S. multifamily housing markets. The analysis leverages data from the American Community Survey to construct panel datasets spanning from 2010 to 2019. Using spatial regression, we identify several socioeconomic measures and land use policies both positively and negatively associated with new housing supply. We contextualize housing estimates related to race in relation to uneven development in the contemporary D.C. housing supply.

Keywords: neighborhood effect, sorting, spatial spillovers, multifamily housing

Procedia PDF Downloads 69
524 Radar Cross Section Modelling of Lossy Dielectrics

Authors: Ciara Pienaar, J. W. Odendaal, J. Joubert, J. C. Smit

Abstract:

Radar cross section (RCS) of dielectric objects plays an important role in many applications, such as low observability technology development, drone detection and monitoring, and coastal surveillance. Various materials are used to construct the targets of interest, such as metal, wood, composite materials, radar absorbent materials, and other dielectrics. Since simulated datasets are increasingly being used to supplement in-field measurements, as simulation is more cost effective and allows a larger variety of targets to be covered, it is important to have a high level of confidence in the predicted results. Confidence can be attained through validation. Various computational electromagnetic (CEM) methods are capable of predicting the RCS of dielectric targets. This study will extend previous studies by validating full-wave and asymptotic RCS simulations of dielectric targets with measured data. The paper will provide measured RCS data of a number of canonical dielectric targets exhibiting different material properties. As stated previously, these measurements are used to validate numerous CEM methods. The dielectric properties are accurately characterized to reduce the uncertainties in the simulations. Finally, an analysis of the sensitivity of oblique and normal incidence scattering predictions to material characteristics is also presented. In this paper, the ability of several CEM methods, including method of moments (MoM) and physical optics (PO), to calculate the RCS of dielectrics was validated with measured data. A few dielectrics, exhibiting different material properties, were selected and several canonical targets, such as flat plates and cylinders, were manufactured. The RCS of these dielectric targets was measured in a compact range at the University of Pretoria, South Africa, over a frequency range of 2 to 18 GHz and a 360° azimuth angle sweep.
This study also investigated the effect of slight variations in the material properties on the calculated RCS results, by varying the material properties within a realistic tolerance range and comparing the calculated RCS results. Interesting measured and simulated results have been obtained. Large discrepancies were observed between the different methods as well as the measured data. It was also observed that the accuracy of the RCS data of the dielectrics can be frequency and angle dependent. The simulated RCS for some of these materials also exhibits high sensitivity to variations in the material properties. Comparison graphs between the measured and simulated RCS datasets will be presented and the validation thereof will be discussed. Finally, the effect that small tolerances in the material properties have on the calculated RCS results will be shown. Thus the importance of accurate dielectric material properties for validation purposes will be discussed.

Keywords: asymptotic, CEM, dielectric scattering, full-wave, measurements, radar cross section, validation

Procedia PDF Downloads 219
523 Tumor Detection Using Convolutional Neural Networks (CNN) Based Neural Network

Authors: Vinai K. Singh

Abstract:

In Neural Network-based Learning techniques, there are several models of Convolutional Networks. Whenever the methods are deployed with large datasets, only then can their applicability and appropriateness be determined. Clinical and pathological pictures of lobular carcinoma are thought to exhibit a large number of random formations and textures. Working with such pictures is a difficult problem in machine learning. Focusing on wet laboratories and following the outcomes, numerous studies have been published with fresh commentaries in the investigation. In this research, we provide a framework that can operate effectively on raw photos of various resolutions while easing the issues caused by the existence of patterns and texturing. The suggested approach produces very good findings that may be used to make decisions in the diagnosis of cancer.

Keywords: lobular carcinoma, convolutional neural networks (CNN), deep learning, histopathological imagery scans

Procedia PDF Downloads 110
522 Retrospective Demographic Analysis of Patients Lost to Follow-Up from Antiretroviral Therapy in Mulanje Mission Hospital, Malawi

Authors: Silas Webb, Joseph Hartland

Abstract:

Background: Long-term retention of patients on ART has become a major health challenge in Sub-Saharan Africa (SSA). In 2010 a systematic review of 39 papers found that 30% of patients were no longer taking their ARTs two years after starting treatment. In the same review, it was noted that there was a paucity of data as to why patients become lost to follow-up (LTFU) in SSA. This project was performed in Mulanje Mission Hospital (MMH) in Malawi as part of Swindon Academy’s Global Health eSSC. The HIV prevalence for Malawi is 10.3%, one of the highest rates in the world; however, prevalence soars to 18% in Mulanje. Therefore it is essential that patients at risk of being LTFU are identified early and managed appropriately to help them continue to participate in the service. Methodology: All patients on adult antiretroviral formulations at MMH who were classified as ‘defaulters’ (patients missing a scheduled follow-up visit by more than two months) over the last 12 months were included in the study. Demographic variables were collected from Mastercards for data analysis. A comparison group of patients currently not lost to follow-up was created by using all of the patients who attended the HIV clinic between 18th-22nd July 2016 and had never defaulted from ART. Data was analysed using the chi-squared (χ²) test, as the data collected was categorical, with alpha levels set at 0.05. Results: Overall, 136 patients had defaulted from ART over the past 12 months at MMH. Of these, 43 patients had missing Mastercards, so 93 defaulter datasets were analysed. In the comparison group 93 datasets were also analysed, and statistical analysis was done using chi-squared testing. A higher proportion of men in the defaulting group was noted (χ²=0.034) and defaulters tended to be younger (χ²=0.052). 94.6% of patients who defaulted were taking Tenofovir, Lamivudine and Efavirenz, the standard first-line ART therapy in Malawi.
The mean length of time on ART was 39.0 months (RR: -22.4-100.4) in the defaulters group and 47.3 months (RR: -19.71-114.23) in the control group, with a mean difference of 8.3 fewer months in the defaulters group (χ²=0.056). Discussion: The findings in this study echo the literature; however, this review expands on that and shows that the demographic for the patient at most risk of defaulting and being LTFU would be a young male who has missed more than 4 doses of ART and is within his first year of treatment. For the hospital, this data is important as it identifies significant areas for public health focus. For instance, fear of disclosure and stigma may be disproportionately affecting younger men, so interventions can be aimed specifically at them to improve their health outcomes. The mean length of time on medication was 8.3 months less in the defaulters group, with a p-value of 0.056, emphasising the need for more intensive follow-up in the early stages of treatment, when patients are at the highest risk of defaulting.
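The chi-squared test of independence named in the methodology can be illustrated with a short sketch. The counts below are hypothetical placeholders chosen only for illustration (they are not the study's data), and the function name `chi_squared` is likewise an assumption. The statistic is compared against 3.841, the critical value for one degree of freedom at an alpha level of 0.05.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts, NOT taken from the study:
# rows = defaulters / comparison group, columns = male / female.
table = [[50, 43], [35, 58]]
stat = chi_squared(table)

# A 2 x 2 table has one degree of freedom; 3.841 is the 0.05 critical value.
CRITICAL_DF1_ALPHA_05 = 3.841
significant = stat > CRITICAL_DF1_ALPHA_05
```

With two groups of 93 records each, as in the abstract, every expected count in such a table is well above 5, which is the usual rule of thumb for applying the test.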

Keywords: anti-retroviral therapy, ART, HIV, lost to follow up, Malawi

Procedia PDF Downloads 165
521 Analysis of Epileptic Electroencephalogram Using Detrended Fluctuation and Recurrence Plots

Authors: Mrinalini Ranjan, Sudheesh Chethil

Abstract:

Epilepsy is a common neurological disorder characterised by the recurrence of seizures. Electroencephalogram (EEG) signals are complex biomedical signals which exhibit nonlinear and nonstationary behavior. We use two methods, 1) Detrended Fluctuation Analysis (DFA) and 2) Recurrence Plots (RP), to capture this complex behavior of EEG signals. DFA considers fluctuation from local linear trends. Scale invariance of these signals is well captured in the multifractal characterisation using DFA. Analysis of long-range correlations is vital for understanding the dynamics of EEG signals. Correlation properties in the EEG signal are quantified by the calculation of a scaling exponent. We report the existence of two scaling behaviours in the epileptic EEG signals which quantify short and long-range correlations. To illustrate this, we perform DFA on extant ictal (seizure) and interictal (seizure free) datasets of different patients in different channels. We compute the short-term and long-term scaling exponents and report a decrease in the short-term scaling exponent during seizure as compared to pre-seizure and a subsequent increase during the post-seizure period, while the long-term scaling exponent shows an increase during seizure activity. Our calculation of the long-term scaling exponent yields a value between 0.5 and 1, thus pointing to power law behaviour of long-range temporal correlations (LRTC). We perform this analysis for multiple channels and report similar behaviour. We find an increase in the long-term scaling exponent during seizure in all channels, which we attribute to an increase in persistent LRTC during seizure. The magnitude of the scaling exponent and its distribution in different channels can help in better identification of the areas in the brain most affected during seizure activity. The nature of epileptic seizures varies from patient to patient.
To illustrate this, we report an increase in the long-term scaling exponent for some patients, which is also complemented by the recurrence plots (RP). An RP is a graph that shows the time indices at which a dynamical state recurs. We perform Recurrence Quantification Analysis (RQA) and calculate RQA parameters such as diagonal length, entropy, recurrence, determinism, etc. for ictal and interictal datasets. We find that the RQA parameters increase during seizure activity, indicating a transition. We observe that RQA parameters are higher during the seizure period as compared to post-seizure values, whereas for some patients post-seizure values exceeded those during seizure. We attribute this to the varying nature of seizure in different patients, indicating a different route or mechanism during the transition. Our results can help in better understanding of the characterisation of epileptic EEG signals from a nonlinear analysis.
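To make the DFA procedure described above concrete, here is a minimal pure-Python sketch of first-order DFA: integrate the mean-removed signal, remove a local linear trend in each window, and fit the slope of log fluctuation against log window size to obtain the scaling exponent. The window sizes, the function names, and the synthetic white-noise input are assumptions for illustration and do not reproduce the authors' EEG pipeline.

```python
import math
import random


def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(signal, window):
    """Root-mean-square fluctuation around local linear trends."""
    mean = sum(signal) / len(signal)
    profile, running = [], 0.0
    for value in signal:
        running += value - mean  # integrated (cumulative) profile
        profile.append(running)
    xs = list(range(window))
    squared_residuals, count = 0.0, 0
    # Non-overlapping windows; a trailing partial window is ignored.
    for start in range(0, len(profile) - window + 1, window):
        segment = profile[start:start + window]
        slope, intercept = linear_fit(xs, segment)
        squared_residuals += sum(
            (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment)
        )
        count += window
    return math.sqrt(squared_residuals / count)


def dfa_exponent(signal, windows):
    """Scaling exponent: slope of log F(n) against log n."""
    log_n = [math.log(w) for w in windows]
    log_f = [math.log(dfa_fluctuation(signal, w)) for w in windows]
    slope, _ = linear_fit(log_n, log_f)
    return slope


# Synthetic uncorrelated noise (a stand-in for real data).
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha = dfa_exponent(white_noise, [16, 32, 64, 128, 256])
```

For uncorrelated noise the exponent is expected to lie near 0.5, while values between 0.5 and 1, as reported in the abstract, indicate persistent long-range correlations. Fitting separate slopes over small and large windows would give the short-term and long-term exponents.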

Keywords: detrended fluctuation, epilepsy, long range correlations, recurrence plots

Procedia PDF Downloads 156
520 Demographic Factors Influencing Employees’ Salary Expectations and Labor Turnover

Authors: M. Osipova

Abstract:

Thanks to the development of information technologies, every sphere of the economy is becoming more and more data-centric, as people generate huge datasets containing information on every aspect of their lives. Applying research on such data to human resources management yields otherwise scarce statistics on the state of the labor market, including salary expectations and the typical career behavior of potential employees, and this information can become a reliable basis for management decisions. This article presents the results of career behavior research based on freely accessible resume data. The information used for the study is much broader than that usually used in human resources surveys, so there is enough data for statistically significant results even in subgroup analysis.

Keywords: human resources management, salary expectations, statistics, turnover

Procedia PDF Downloads 326
519 Wake Effects of Wind Turbines and Its Impacts on Power Curve Measurements

Authors: Sajan Antony Mathew, Bhukya Ramdas

Abstract:

The impetus of wind energy deployment over the last few decades has seen potential sites being harvested very actively for wind farm development. Due to the scarce availability of high-potential sites, turbine locations are increasingly optimized, with minimum spacing between turbines adopted without compromising on the optimization of energy yield. The optimization of the energy yield from a wind turbine is achieved by effective micrositing techniques. These time-tested techniques, which are applied from site to site on terrain conditions that meet the requirements of the international standard for power performance measurements of wind turbines, result in the positioning of wind turbines for optimized energy yields. The international standard for Power Curve Measurements has rules of procedure and methodology to evaluate the terrain, obstacles and sector for measurements. There are many challenges at the sites in complying with the requirements for terrain, obstacles and sector for measurements. Studies are being attempted to carry out these measurements within the scope of the international standard, as various other procedures specified in alternate standards, and the integration of LIDAR for Power Curve Measurements, are at a nascent stage. The paper strives to assist in the understanding of the fact that if the positioning of a wind turbine at a site is based on an optimized output, then there are no wake effects seen on the power curve of an adjacent wind turbine. The paper also demonstrates that an invalid sector for measurements could be used in the analysis, in deviation from the requirement of the international standard for power performance measurements.
Therefore the paper strives firstly to demonstrate that if a wind turbine is optimally positioned, no wake effects are seen and secondly the sector for measurements in such a case could include sectors which otherwise would have to be excluded as per the requirements of International standard for power performance measurements.

Keywords: micrositing, optimization, power performance, wake effects

Procedia PDF Downloads 443
518 Gender Recognition with Deep Belief Networks

Authors: Xiaoqi Jia, Qing Zhu, Hao Zhang, Su Yang

Abstract:

A gender recognition system is able to tell the gender of a given person from a few frontal facial images. An effective gender recognition approach can improve the performance of many other applications, including security monitoring, human-computer interaction, image or video retrieval and so on. In this paper, we present an effective method for the gender classification task on frontal facial images based on deep belief networks (DBNs), which pre-train the model and slightly improve accuracy. Our experiments show that the pre-training method with DBNs for gender classification is feasible and achieves a small improvement in accuracy on the FERET and CAS-PEAL-R1 facial datasets.

Keywords: gender recognition, deep belief networks, semi-supervised learning, greedy layer-wise RBMs

Procedia PDF Downloads 426
517 Development of an Autonomous Automated Guided Vehicle with Robot Manipulator under Robot Operation System Architecture

Authors: Jinsiang Shaw, Sheng-Xiang Xu

Abstract:

This paper presents the development of an autonomous automated guided vehicle (AGV) with a robot arm attached on top of it within the framework of robot operation system (ROS). ROS provides libraries and tools, including hardware abstraction, device drivers, visualizers, message-passing, package management, etc. For this reason, this AGV can provide automatic navigation, parts transportation, and pick-and-place tasks using the robot arm for typical industrial production line use. More specifically, this AGV is controlled by an on-board host computer running ROS software. Command signals for vehicle and robot arm control and measurement signals from various sensors are transferred to the respective microcontrollers. Users can operate the AGV remotely through the TCP/IP protocol and perform SLAM (Simultaneous Localization and Mapping). An RGBD camera and LIDAR sensors are installed on the AGV, and their data are used to perceive the environment. For SLAM, Gmapping is used to construct the environment map with a Rao-Blackwellized particle filter, and the AMCL method (Adaptive Monte Carlo localization) is employed for mobile robot localization. In addition, the current AGV position and orientation can be visualized with the ROS toolkit. As for robot navigation and obstacle avoidance, A* for global path planning and the dynamic window approach for local planning are implemented. The developed ROS AGV with a robot arm on it has been tested in the university factory. A 2-D and a 3-D map of the factory were successfully constructed by the SLAM method. Based on this map, robot navigation through the factory with and without dynamic obstacles is shown to perform well. Finally, pick-and-place of parts using the robot arm and ensuing delivery in the factory by the mobile robot are also accomplished.
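As an illustration of the A* global planner mentioned above, the following is a minimal sketch on a 4-connected occupancy grid with unit step costs and a Manhattan-distance heuristic. The grid, the function name `astar`, and the cost model are assumptions for illustration; the ROS navigation stack used by the authors operates on costmaps and is not reproduced here.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (occupied).

    Returns the list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    priority = new_cost + heuristic((nr, nc))
                    heapq.heappush(open_heap, (priority, new_cost, (nr, nc)))
    return None


# Hypothetical 4 x 4 map: 1 marks an obstacle cell.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(grid, (0, 0), (3, 3))
```

In a setup like the one described, a global path of this kind would then be handed to the local planner (the dynamic window approach in the abstract), which handles dynamic obstacles along the way.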

Keywords: automated guided vehicle, navigation, robot operation system, Simultaneous Localization and Mapping

Procedia PDF Downloads 126
516 Monocular Visual Odometry for Three Different View Angles by Intel Realsense T265 with the Measurement of Remote

Authors: Heru Syah Putra, Aji Tri Pamungkas Nurcahyo, Chuang-Jan Chang

Abstract:

The MOIL-SDK method refers to the spatial angle that forms a view with a different perspective from the fisheye image. Visual odometry is a trusted technique for extending projects by tracking with image sequences. We present a real-time, precise, and persistent approach that supports dataset acquisition and generates ground truth as a reference for the estimates of each image. It uses the FAST algorithm to find keypoints, which are evaluated during the tracking process with the 5-point algorithm with RANSAC, and produces accurate estimates of the camera trajectory for each rotational and translational movement on the X, Y, and Z axes.

Keywords: MOIL-SDK, intel realsense T265, Fisheye image, monocular visual odometry

Procedia PDF Downloads 109
515 Benchmarking Bert-Based Low-Resource Language: Case Uzbek NLP Models

Authors: Jamshid Qodirov, Sirojiddin Komolov, Ravilov Mirahmad, Olimjon Mirzayev

Abstract:

Nowadays, natural language processing tools play a crucial role in our daily lives, including various text processing techniques. Very advanced models exist for widely used languages such as English and Russian. For some languages, such as Uzbek, however, NLP models have been developed only recently, so only a few Uzbek NLP models exist. Moreover, no existing work shows how the Uzbek NLP models behave in different situations and when to use each of them. This work tries to close this gap and compares the Uzbek NLP models existing at the time this article was written. The authors compare the NLP models in two different scenarios, sentiment analysis and sentence similarity, which are implementations of the two most common problems in the industry: classification and similarity. Another outcome of this work is two datasets for classification and sentence similarity in the Uzbek language that we generated ourselves and that can be useful in both industry and academia.

Keywords: NLP, benchmark, BERT, vectorization

Procedia PDF Downloads 29
514 A Novel PSO Based Decision Tree Classification

Authors: Ali Farzan

Abstract:

Classification of data objects or patterns is a major part of most decision making systems. One of the popular and commonly used classification methods is the Decision Tree (DT). It is a hierarchical decision making system in which a binary tree is constructed and, starting from the root, some of the classes are rejected at each node until a leaf node is reached. Each leaf node is a representative of one specific class. Finding the splitting criteria at each node when constructing or training the tree is a major problem. Particle Swarm Optimization (PSO) has been adopted as a metaheuristic search method for finding the best splitting criteria. Evaluation of the proposed method on benchmark datasets indicates the higher accuracy of the new PSO based decision tree.
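The abstract does not specify how particles encode a split or which fitness function is used, so the following is only a sketch under stated assumptions: a single numeric feature, each particle representing one candidate threshold, and weighted Gini impurity as the cost to minimise. The inertia and acceleration coefficients, the toy data, and all function names are illustrative choices, not the author's settings.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_impurity(xs, ys, threshold):
    """Weighted Gini impurity after splitting on x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def pso_threshold(xs, ys, n_particles=10, n_iters=30, seed=0):
    """Search for a low-impurity split threshold with a basic PSO."""
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_cost = [split_impurity(xs, ys, p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g], pbest_cost[g]
    inertia, c_personal, c_global = 0.7, 1.5, 1.5  # assumed coefficients
    for _ in range(n_iters):
        for i in range(n_particles):
            vel[i] = (
                inertia * vel[i]
                + c_personal * rng.random() * (pbest[i] - pos[i])
                + c_global * rng.random() * (gbest - pos[i])
            )
            # Keep each candidate threshold inside the observed range.
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            cost = split_impurity(xs, ys, pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i], cost
    return gbest, gbest_cost


# Hypothetical one-feature data with two separable classes.
xs = [1, 2, 3, 7, 8, 9]
ys = [0, 0, 0, 1, 1, 1]
threshold, impurity = pso_threshold(xs, ys)
```

For a single feature an exhaustive scan over midpoints would be simpler. A swarm becomes more attractive when a particle encodes a richer split, such as an oblique combination of several features, which may be closer to what the paper intends.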

Keywords: decision tree, particle swarm optimization, splitting criteria, metaheuristic

Procedia PDF Downloads 384
513 Automatic Threshold Search for Heat Map Based Feature Selection: A Cancer Dataset Analysis

Authors: Carlos Huertas, Reyes Juarez-Ramirez

Abstract:

Public health is one of the most critical issues today; therefore, there is great interest in improving technologies in the area of disease detection. With machine learning and feature selection, it has been possible to aid the diagnosis of several diseases such as cancer. In this work, we present an extension to the Heat Map Based Feature Selection algorithm. This modification allows automatic threshold parameter selection that helps to improve the generalization performance on high dimensional data such as mass spectrometry. We have performed a comparative analysis using multiple cancer datasets, comparing against the well known Recursive Feature Elimination algorithm and our original proposal. The results show improved classification performance that is very competitive against current techniques.

Keywords: biomarker discovery, cancer, feature selection, mass spectrometry

Procedia PDF Downloads 305
512 Evaluating Alternative Structures for Prefix Trees

Authors: Feras Hanandeh, Izzat Alsmadi, Muhammad M. Kwafha

Abstract:

Prefix trees, or tries, are data structures used to store data or an index of data. The goal is to be able to store and retrieve data by executing queries quickly and reliably. In principle, the structure of the trie depends on having letters in nodes at the different levels that point to the actual words in the leaves. However, the exact structure of the trie may vary based on several aspects. In this paper, we evaluated different forms and structures for building tries, using datasets of words of different sizes. Results showed that some characteristics may significantly impact, positively or negatively, the size and the performance of the trie. Results also showed that using an array of pointers in each level to represent the different alphabet letters is the best choice.
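The array-of-pointers layout favoured in the results can be sketched as follows: each node holds a fixed array with one slot per letter, so following an edge is a direct index lookup. The 26-letter lowercase alphabet and the class and method names are assumptions for illustration, not the structures benchmarked in the paper.

```python
class TrieNode:
    """Trie node with one child slot per lowercase letter."""

    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = [None] * 26  # fixed array of child pointers
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    @staticmethod
    def _index(ch):
        idx = ord(ch) - ord("a")
        if not 0 <= idx < 26:
            raise ValueError(f"unsupported character: {ch!r}")
        return idx

    def insert(self, word):
        node = self.root
        for ch in word:
            idx = self._index(ch)
            if node.children[idx] is None:
                node.children[idx] = TrieNode()
            node = node.children[idx]
        node.is_word = True

    def _walk(self, text):
        """Return the node reached by following text, or None."""
        node = self.root
        for ch in text:
            node = node.children[self._index(ch)]
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None


trie = Trie()
for word in ("tree", "trie", "try"):
    trie.insert(word)
```

The trade-off is memory: every node reserves 26 slots even when most are empty, whereas a dictionary or linked list of children saves space at the cost of slower lookups.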

Keywords: data structures, indexing, tree structure, trie, information retrieval

Procedia PDF Downloads 436
511 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications, such as healthcare, where it is used to gather patients’ emotional behavior. Unlike typical ASR (Automated Speech Recognition) problems, which focus on 'what was said', it is equally important to understand 'how it was said.' There are certain emotions which are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state of the art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. The existing labelled emotion datasets are highly subject to the perception of the annotator. We address the first issue of feature selection by exploiting the use of traditional MFCC (Mel-Frequency Cepstral Coefficients) features in a Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNNs treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependent categorical accuracy for the SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions.
Monoamine neurotransmitters are chemical messengers in the brain that transmit signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), which is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 127
510 Developing an AI-Driven Application for Real-Time Emotion Recognition from Human Vocal Patterns

Authors: Sayor Ajfar Aaron, Mushfiqur Rahman, Sajjat Hossain Abir, Ashif Newaz

Abstract:

This study delves into the development of an artificial intelligence application designed for real-time emotion recognition from human vocal patterns. Utilizing advanced machine learning algorithms, including deep learning and neural networks, the paper highlights both the technical challenges and potential opportunities in accurately interpreting emotional cues from speech. Key findings demonstrate the critical role of diverse training datasets and the impact of ambient noise on recognition accuracy, offering insights into future directions for improving robustness and applicability in real-world scenarios.

Keywords: artificial intelligence, convolutional neural network, emotion recognition, vocal patterns

Procedia PDF Downloads 19
509 Changes in Geospatial Structure of Households in the Czech Republic: Findings from Population and Housing Census

Authors: Jaroslav Kraus

Abstract:

Spatial information about demographic processes is a standard part of outputs in the Czech Republic. That was also the case for the Population and Housing Census held in 2011. This is a starting point for a follow-up study devoted to two basic types of households: single-person households and households of one completed family. Single-person households and one-family households make up more than 80 percent of all households, but their share and spatial structure have been changing over the long term. The increase in single-person households is a result of the long-term decrease in fertility and increase in divorce, but also of the possibility of living separately. There are regions in the Czech Republic with traditional demographic behavior, and regions such as the capital Prague and some others with a changing pattern. The population census is based, according to international standards, on the concept of currently living population. Three types of geospatial approaches will be used for the analysis: (i) measures of geographic distribution; (ii) mapping clusters to identify the locations of statistically significant hot spots, cold spots, spatial outliers, and similar features; and (iii) pattern analysis as a starting point for more in-depth analyses (geospatial regression) in the future. For analysis of this type of data, the numbers of households by type should be distinct objects. All events in a meaningfully delimited study region (e.g. municipalities) will be included in the analysis. Commonly produced measures of central tendency and spread will include identification of the location of the center of the point set (at NUTS3 level) and identification of the median center; standard distance, weighted standard distance and standard deviational ellipses will also be used.
Identifying that clustering exists in census households datasets does not provide a detailed picture of the nature and pattern of clustering but will be helpful to apply simple hot-spot (and cold spot) identification techniques to such datasets. Once the spatial structure of households will be determined, any particular measure of autocorrelation can be constructed by defining a way of measuring the difference between location attribute values. The most widely used measure is Moran’s I that will be applied to municipal units where numerical ratio is calculated. Local statistics arise naturally out of any of the methods for measuring spatial autocorrelation and will be applied to development of localized variants of almost any standard summary statistic. Local Moran’s I will give an indication of household data homogeneity and diversity on a municipal level.
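For reference, global Moran's I for values x_i with spatial weights w_ij is I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The sketch below computes it together with one common form of local Moran's I (deviation times the weighted sum of neighbours' deviations, scaled by the variance). The four-unit chain of neighbours and the values are hypothetical, not census data, and the function names are assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)


def local_morans_i(values, weights):
    """Local Moran's I per unit: (z_i / m2) * sum_j w_ij * z_j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n  # second moment of the deviations
    return [
        dev[i] / m2 * sum(weights[i][j] * dev[j] for j in range(n))
        for i in range(n)
    ]


# Hypothetical example: four units in a row, neighbours share a border.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = [1.0, 2.0, 3.0, 4.0]        # similar values sit next to each other
alternating = [1.0, 0.0, 1.0, 0.0]  # neighbours always differ

i_trend = morans_i(trend, weights)
i_alternating = morans_i(alternating, weights)
local_trend = local_morans_i(trend, weights)
```

Positive values indicate that similar values cluster in neighbouring units and negative values indicate dispersion. With this scaling, the local values sum to W times the global statistic, which is a convenient consistency check.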

Keywords: census, geo-demography, households, the Czech Republic

Procedia PDF Downloads 82
508 Random Subspace Ensemble of CMAC Classifiers

Authors: Somaiyeh Dehghan, Mohammad Reza Kheirkhahan Haghighi

Abstract:

The rapid growth of domains whose data have a large number of features but a limited number of samples has made it difficult to construct strong classifiers. Reducing the dimensionality of the feature space therefore becomes an essential step in classification tasks. The random subspace method (or attribute bagging) is an ensemble classifier consisting of several base learners, each of which is trained on a subset of the features. In the present paper, we introduce a Random Subspace Ensemble of CMAC neural networks (RSE-CMAC), in which each network is trained on a subset of features. We then use this model for classification. To evaluate the performance of our model, we compare it with the bagging algorithm on 36 UCI datasets. The results reveal that the new model has better performance.
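The random subspace idea can be sketched in a few lines. Note the substitution: the base learner below is a nearest-centroid classifier standing in for the CMAC network, which is not implemented here, and the data, ensemble size, and subspace size are hypothetical. Each learner sees only a random subset of feature indices, and predictions are combined by majority vote.

```python
import random
from collections import Counter


def fit_centroids(X, y, features):
    """Per-class mean vector restricted to the chosen feature indices."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }


def predict_centroid(centroids, features, row):
    """Label of the closest class centroid in the feature subspace."""
    def squared_distance(centroid):
        return sum((row[f] - centroid[k]) ** 2 for k, f in enumerate(features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def fit_random_subspace(X, y, n_learners=5, subspace_size=2, seed=0):
    """Train each base learner on its own random subset of features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_learners):
        features = rng.sample(range(n_features), subspace_size)
        ensemble.append((features, fit_centroids(X, y, features)))
    return ensemble


def predict(ensemble, row):
    """Majority vote over the base learners."""
    votes = Counter(
        predict_centroid(centroids, features, row)
        for features, centroids in ensemble
    )
    return votes.most_common(1)[0][0]


# Hypothetical toy data: four features, two well separated classes.
X = [
    [0, 0, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1],
    [9, 9, 9, 9], [8, 9, 8, 9], [9, 8, 9, 8],
]
y = ["a", "a", "a", "b", "b", "b"]
ensemble = fit_random_subspace(X, y)
```

Bagging, the baseline in the abstract, resamples training rows instead of feature columns; the two can also be combined.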

Keywords: classification, random subspace, ensemble, CMAC neural network

Procedia PDF Downloads 308
507 Defect Detection for Nanofibrous Images with Deep Learning-Based Approaches

Authors: Gaokai Liu

Abstract:

Automatic defect detection for nanomaterial images is widely required in industrial scenarios. Deep learning approaches are considered the most effective solutions for the great majority of image-based tasks. In this paper, an edge guidance network for defect segmentation is proposed. First, an encoder path with multiple convolution and downsampling operations is applied to acquire shared features. Then two decoder paths are both connected to the last convolution layer of the encoder and supervised by the edge and segmentation labels, respectively, to guide the whole training process. Meanwhile, the edge and encoder outputs from the same stage are concatenated to the corresponding part of the segmentation path to further tune the segmentation result. Finally, the effectiveness of the proposed method is verified through experiments on open nanofibrous datasets.

Keywords: deep learning, defect detection, image segmentation, nanomaterials

Procedia PDF Downloads 120
506 GPS Refinement in Cities Using Statistical Approach

Authors: Ashwani Kumar

Abstract:

GPS plays an important role in everyday life for safe and convenient transportation. While pedestrians use handheld devices to determine their position in a city, vehicles in intelligent transport systems use relatively sophisticated GPS receivers to estimate their current position. However, in urban areas where the GPS satellites are occluded by tall buildings and trees, and where GPS signals are reflected by nearby vehicles, GPS position estimation becomes poor. In this work, an exhaustive set of GPS data is collected at a single point in an urban area at different times of day and under dynamic environmental conditions. The data are analyzed, and statistical refinement methods are used to obtain an optimal position estimate from all the measured positions. The results are compared with publicly available datasets, and the position refinement results are promising.
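
One simple way to refine repeated fixes taken at a single point is to discard outliers and average what remains, since the least-squares estimate of a constant position from repeated measurements is their mean. The sketch below illustrates that idea only; it is not the paper's method. The median-based outlier rule, the threshold `k`, the function name, and the sample coordinates are all assumptions, and degrees are treated as planar coordinates, which is a simplification that is reasonable only for a very small area.

```python
from statistics import fmean, median


def refine_position(fixes, k=3.0):
    """Estimate a static position from repeated (lat, lon) fixes.

    Fixes farther than k times the median residual from the coordinate-wise
    median are treated as outliers (e.g. multipath) and dropped; the mean of
    the remaining fixes is returned together with the number of fixes used.
    """
    lat0 = median([lat for lat, _ in fixes])
    lon0 = median([lon for _, lon in fixes])
    resid = [((lat - lat0) ** 2 + (lon - lon0) ** 2) ** 0.5
             for lat, lon in fixes]
    cutoff = k * median(resid)
    inliers = [fix for fix, r in zip(fixes, resid) if r <= cutoff]
    lat = fmean([p[0] for p in inliers])
    lon = fmean([p[1] for p in inliers])
    return lat, lon, len(inliers)


# Hypothetical fixes: four consistent readings and one reflected-signal outlier.
fixes = [(50.0000, 14.0000), (50.0002, 14.0002), (49.9998, 13.9998),
         (50.0001, 13.9999), (50.0100, 14.0100)]
lat, lon, used = refine_position(fixes)
```

At least half of the fixes always survive the cutoff when `k` is 1 or more, so the function never averages an empty list.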

Keywords: global positioning system, statistical approach, intelligent transport systems, least squares estimation

Procedia PDF Downloads 262
505 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. The extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships, or to assign unseen objects to one of the predefined groups. In this paper, we investigate four well-known data mining algorithms for predicting groundwater areas in Jordan: Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN), and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed the other algorithms in terms of the classification accuracy, precision, and F1 evaluation measures on the datasets of groundwater areas collected from the Jordanian Ministry of Water and Irrigation.
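
The evaluation measures mentioned above (accuracy, precision, F1) can be computed from true and predicted labels as in the following sketch. The function name and the toy labels are hypothetical and do not come from the groundwater datasets.

```python
def classification_measures(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical labels: 1 = groundwater area, 0 = no groundwater.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
accuracy, precision, recall, f1 = classification_measures(y_true, y_pred)
```

Reporting precision and F1 alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high for a classifier that rarely predicts the minority class.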

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 254
504 Mask-Prompt-Rerank: An Unsupervised Method for Text Sentiment Transfer

Authors: Yufen Qin

Abstract:

Text sentiment transfer is an important branch of text style transfer. The goal is to generate text with a different sentiment attribute from a text with a specific sentiment attribute, while keeping the content and the semantic information unrelated to sentiment unchanged. There are currently two main challenges in this field: the lack of a parallel corpus and the entanglement of text attributes. In response to these problems, this paper proposes a novel solution, Mask-Prompt-Rerank, which masks the sentiment words and then uses prompt-based regeneration to transfer the sentiment of the sentence. Experiments on two sentiment benchmark datasets and one formality transfer benchmark dataset show that this approach makes the performance of small pre-trained language models comparable to that of the most advanced large models, while consuming two orders of magnitude less computing and memory.

Keywords: language model, natural language processing, prompt, text sentiment transfer

Procedia PDF Downloads 50
503 Exploring the Spatial Characteristics of Mortality Map: A Statistical Area Perspective

Authors: Jung-Hong Hong, Jing-Cen Yang, Cai-Yu Ou

Abstract:

The analysis of geographic inequality relies heavily on location-enabled statistical data and quantitative measures to present the spatial patterns of the selected phenomena and analyze their differences. To protect the privacy of individual instances and to link to administrative units, point-based datasets are spatially aggregated into area-based statistical datasets, where only the overall status for the selected levels of spatial units is used for decision making. The partition of the spatial units thus has a dominant influence on the analyzed results, a phenomenon known as the Modifiable Areal Unit Problem (MAUP). A new spatial reference framework, the Taiwan Geographical Statistical Classification (TGSC), was recently introduced in Taiwan based on spatial partition principles that aim at homogeneous numbers of population and households. Compared with the traditional township units, TGSC provides additional levels of spatial units with finer granularity for presenting spatial phenomena and enables domain experts to select an appropriate dissemination level for publishing statistical data. This paper compares the results of using TGSC and township units on mortality data and examines the spatial characteristics of the outcomes. For the mortality data of Taitung County between January 1st, 2008 and December 31st, 2010, the all-cause age-standardized death rate (ASDR) ranges from 571 to 1757 per 100,000 persons for township units, whereas the 2nd dissemination area (TGSC) shows greater variation, ranging from 0 to 2222 per 100,000. The finer granularity of the TGSC spatial units clearly provides better outcomes for identifying and evaluating geographic inequality, and the results can be further analyzed with statistical measures from other perspectives (e.g., population, area, environment).
The management and analysis of the statistical data referring to the TGSC in this research are strongly supported by Geographic Information System (GIS) technology. An integrated workflow is developed that consists of the processing of death certificates, the geocoding of street addresses, the quality assurance of geocoded results, the automatic calculation of statistical measures, the standardized encoding of measures, and the geo-visualization of statistical outcomes. This paper also introduces a set of auxiliary measures from a geographic distribution perspective to further examine the hidden spatial characteristics of mortality data and justify the analyzed results. With a common statistical area framework like TGSC, the preliminary results demonstrate promising potential for developing a web-based statistical service that can effectively access domain statistical data and present the analyzed outcomes in meaningful ways to avoid wrong decision making.
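
To make the ASDR figures concrete, the sketch below computes an age-standardized death rate by direct standardization, which is one common definition; the abstract does not say which standardization method or standard population the authors used. The function name, age groups, counts, and weights are hypothetical.

```python
def age_standardized_death_rate(deaths, population, std_weights, per=100_000):
    """Directly standardized death rate for one spatial unit.

    deaths[i] and population[i] are the counts for age group i in the unit;
    std_weights[i] is the share of age group i in the standard population.
    """
    total_weight = sum(std_weights)
    weighted = sum((d / p) * w
                   for d, p, w in zip(deaths, population, std_weights)
                   if p > 0)  # age groups with no residents contribute nothing
    return weighted / total_weight * per


# Hypothetical unit with two age groups (young, old).
asdr = age_standardized_death_rate(
    deaths=[1, 10], population=[1000, 1000], std_weights=[0.8, 0.2])

# A small unit with no recorded deaths yields a rate of zero.
asdr_small = age_standardized_death_rate(
    deaths=[0, 0], population=[40, 15], std_weights=[0.8, 0.2])
```

Rates for small units are computed from very few events, so a handful of deaths can move the rate between zero and a very high value. This is one plausible reason finer spatial units show a wider range than townships.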

Keywords: mortality map, spatial patterns, statistical area, variation

Procedia PDF Downloads 231
502 From Two-Way to Multi-Way: A Comparative Study for Map-Reduce Join Algorithms

Authors: Marwa Hussien Mohamed, Mohamed Helmy Khafagy

Abstract:

Map-Reduce is a programming model that is widely used to extract valuable information from enormous volumes of data. It is designed to support heterogeneous datasets. Apache Hadoop Map-Reduce is used extensively to uncover hidden patterns in applications such as data mining and SQL processing. The most important operation for data analysis is the join, but the Map-Reduce framework does not directly support join algorithms. This paper explains and compares two-way and multi-way Map-Reduce join algorithms; we also implement the MR join algorithms and show the performance of each phase. Our experimental results show that, among the two-way join algorithms, map-side join and map-merge join take the longest time because of the preprocessing step that sorts the data, and that, among the multi-way join algorithms, reduce-side cascade join takes the longest time.
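
Because the framework has no built-in join, a join has to be expressed through the map, shuffle, and reduce phases. The following single-process Python sketch simulates a two-way reduce-side join to show the idea; it does not use Hadoop, and the function names, tags, and sample tables are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag, key_index):
    """Emit (join key, (source tag, record)) for every input record."""
    for rec in records:
        yield rec[key_index], (tag, rec)


def shuffle(pairs):
    """Group the mapped values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each key, pair every left record with every right record."""
    for key in sorted(groups):
        left = [rec for tag, rec in groups[key] if tag == "L"]
        right = [rec for tag, rec in groups[key] if tag == "R"]
        for l_rec, r_rec in product(left, right):
            yield key, l_rec, r_rec


def reduce_side_join(left, right, left_key=0, right_key=0):
    """Inner join of two record lists on the given key positions."""
    pairs = list(map_phase(left, "L", left_key))
    pairs += list(map_phase(right, "R", right_key))
    return list(reduce_phase(shuffle(pairs)))


# Hypothetical tables: users(id, name) and orders(user_id, item).
users = [(1, "ana"), (2, "bo")]
orders = [(1, "book"), (1, "pen"), (3, "cup")]
joined = reduce_side_join(users, orders)
```

A multi-way cascade join would repeat this step, feeding the output of one two-way join into the next, which is why every extra table adds another full shuffle.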

Keywords: Hadoop, MapReduce, multi-way join, two-way join, Ubuntu

Procedia PDF Downloads 460
501 A Multi-Agent Urban Traffic Simulator for Generating Autonomous Driving Training Data

Authors: Florin Leon

Abstract:

This paper describes a simulator of traffic scenarios tailored to facilitate autonomous driving model training for urban environments. With the rising prominence of self-driving vehicles, the need for diverse datasets is growing. The proposed simulator provides a flexible framework that allows the generation of custom scenarios needed for the validation and enhancement of trajectory prediction algorithms. Its controlled yet dynamic environment addresses the challenges associated with real-world data acquisition and ensures adaptability to diverse driving scenarios. By providing an adaptable solution for scenario creation and algorithm testing, this tool proves to be a valuable resource for advancing autonomous driving technology that aims to ensure safe and efficient self-driving vehicles.

Keywords: autonomous driving, car simulator, machine learning, model training, urban simulation environment

Procedia PDF Downloads 26
500 Simulation-Based Unmanned Surface Vehicle Design Using PX4 and Robot Operating System With Kubernetes and Cloud-Native Tooling

Authors: Norbert Szulc, Jakub Wilk, Franciszek Górski

Abstract:

This paper presents an approach for simulating and testing robotic systems based on PX4, using a local Kubernetes cluster. The approach leverages modern cloud-native tools and runs on single-board computers. Additionally, this solution enables the creation of datasets for computer vision and the evaluation of control system algorithms in an end-to-end manner. This paper compares this approach with the commonly used Docker-based approach. The approach was used to develop a simulation environment for an unmanned surface vehicle (USV) for RoboBoat 2023 by running a containerized configuration of the PX4 Open-source Autopilot connected to ROS and the Gazebo simulation environment.

Keywords: cloud computing, Kubernetes, single board computers, simulation, ROS

Procedia PDF Downloads 52
499 Big Data for Local Decision-Making: Indicators Identified at International Conference on Urban Health 2017

Authors: Dana R. Thomson, Catherine Linard, Sabine Vanhuysse, Jessica E. Steele, Michal Shimoni, Jose Siri, Waleska Caiaffa, Megumi Rosenberg, Eleonore Wolff, Tais Grippa, Stefanos Georganos, Helen Elsey

Abstract:

The Sustainable Development Goals (SDGs) and Urban Health Equity Assessment and Response Tool (Urban HEART) identify dozens of key indicators to help local decision-makers prioritize and track inequalities in health outcomes. However, presentations and discussions at the International Conference on Urban Health (ICUH) 2017 suggested that additional indicators are needed to make decisions and policies. A local decision-maker may realize that malaria or road accidents are a top priority. However, s/he needs additional health determinant indicators, for example about standing water or traffic, to address the priority and reduce inequalities. Health determinants reflect the physical and social environments that influence health outcomes often at community- and societal-levels and include such indicators as access to quality health facilities, access to safe parks, traffic density, location of slum areas, air pollution, social exclusion, and social networks. Indicator identification and disaggregation are necessarily constrained by available datasets – typically collected about households and individuals in surveys, censuses, and administrative records. Continued advancements in earth observation, data storage, computing and mobile technologies mean that new sources of health determinants indicators derived from 'big data' are becoming available at fine geographic scale. Big data includes high-resolution satellite imagery and aggregated, anonymized mobile phone data. While big data are themselves not representative of the population (e.g., satellite images depict the physical environment), they can provide information about population density, wealth, mobility, and social environments with tremendous detail and accuracy when combined with population-representative survey, census, administrative and health system data. 
The aim of this paper is to (1) flag to data scientists important indicators needed by health decision-makers at the city and sub-city scale - ideally free and publicly available, and (2) summarize for local decision-makers new datasets that can be generated from big data, with layperson descriptions of difficulties in generating them. We include SDGs and Urban HEART indicators, as well as indicators mentioned by decision-makers attending ICUH 2017.

Keywords: health determinant, health outcome, mobile phone, remote sensing, satellite imagery, SDG, urban HEART

Procedia PDF Downloads 184
498 MULTI-FLGANs: Multi-Distributed Adversarial Networks for Non-Independent and Identically Distributed Distribution

Authors: Akash Amalan, Rui Wang, Yanqi Qiao, Emmanouil Panaousis, Kaitai Liang

Abstract:

Federated learning is an emerging concept in the domain of distributed machine learning. This concept has enabled Generative Adversarial Networks (GANs) to benefit from rich distributed training data while preserving privacy. However, in a non-IID setting, current federated GAN architectures are unstable, struggle to learn the distinct features, and are vulnerable to mode collapse. In this paper, we propose an architecture, MULTI-FLGAN, to solve the problems of low-quality images, mode collapse, and instability for non-IID datasets. Our results show that MULTI-FLGAN is four times as stable and performant (i.e., high inception score) on average over 20 clients compared to baseline FLGAN.

Keywords: federated learning, generative adversarial network, inference attack, non-IID data distribution

Procedia PDF Downloads 128
497 Real Time Multi Person Action Recognition Using Pose Estimates

Authors: Aishrith Rao

Abstract:

Human activity recognition is an important aspect of video analytics, and many approaches have been recommended to enable action recognition. In this approach, the model is used to identify the actions of multiple people in the frame and classify them accordingly. A few approaches use RNNs and 3D CNNs, which are computationally expensive and cannot be trained with the small datasets that are currently available. Multi-person action recognition has been performed in order to understand the positions and actions of the people present in the video frame. The size of the video frame can be adjusted as a hyper-parameter depending on the hardware resources available. OpenPose has been used to compute pose estimates using a CNN that produces heatmaps, one of which provides skeleton features, which are essentially joint features. The features are then extracted, and a classification algorithm can be applied to classify the action.

Keywords: human activity recognition, computer vision, pose estimates, convolutional neural networks

Procedia PDF Downloads 114