Search results for: fine-grained dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1165

Search results for: fine-grained dataset

655 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 148
654 Recognizing Human Actions by Multi-Layer Growing Grid Architecture

Authors: Z. Gharaee

Abstract:

Recognizing actions performed by others is important in our daily lives since it is necessary for communicating with others in a proper way. We perceive an action by observing the kinematics of motions involved in the performance. We use our experience and concepts to make a correct recognition of the actions. Although building the action concepts is a life-long process, which is repeated throughout life, we are very efficient in applying our learned concepts in analyzing motions and recognizing actions. Experiments on the subjects observing the actions performed by an actor show that an action is recognized after only about two hundred milliseconds of observation. In this study, hierarchical action recognition architecture is proposed by using growing grid layers. The first-layer growing grid receives the pre-processed data of consecutive 3D postures of joint positions and applies some heuristics during the growth phase to allocate areas of the map by inserting new neurons. As a result of training the first-layer growing grid, action pattern vectors are generated by connecting the elicited activations of the learned map. The ordered vector representation layer receives action pattern vectors to create time-invariant vectors of key elicited activations. Time-invariant vectors are sent to second-layer growing grid for categorization. This grid creates the clusters representing the actions. Finally, one-layer neural network developed by a delta rule labels the action categories in the last layer. System performance has been evaluated in an experiment with the publicly available MSR-Action3D dataset. There are actions performed by using different parts of human body: Hand Clap, Two Hands Wave, Side Boxing, Bend, Forward Kick, Side Kick, Jogging, Tennis Serve, Golf Swing, Pick Up and Throw. The growing grid architecture was trained by applying several random selections of generalization test data fed to the system during on average 100 epochs for each training of the first-layer growing grid and around 75 epochs for each training of the second-layer growing grid. The average generalization test accuracy is 92.6%. A comparison analysis between the performance of growing grid architecture and self-organizing map (SOM) architecture in terms of accuracy and learning speed show that the growing grid architecture is superior to the SOM architecture in action recognition task. The SOM architecture completes learning the same dataset of actions in around 150 epochs for each training of the first-layer SOM while it takes 1200 epochs for each training of the second-layer SOM and it achieves the average recognition accuracy of 90% for generalization test data. In summary, using the growing grid network preserves the fundamental features of SOMs, such as topographic organization of neurons, lateral interactions, the abilities of unsupervised learning and representing high dimensional input space in the lower dimensional maps. The architecture also benefits from an automatic size setting mechanism resulting in higher flexibility and robustness. Moreover, by utilizing growing grids the system automatically obtains a prior knowledge of input space during the growth phase and applies this information to expand the map by inserting new neurons wherever there is high representational demand.

Keywords: action recognition, growing grid, hierarchical architecture, neural networks, system performance

Procedia PDF Downloads 157
653 Using Artificial Vision Techniques for Dust Detection on Photovoltaic Panels

Authors: Gustavo Funes, Eduardo Peters, Jose Delpiano

Abstract:

It is widely known that photovoltaic technology has been massively distributed over the last decade despite its low-efficiency ratio. Dust deposition reduces this efficiency even more, lowering the energy production and module lifespan. In this work, we developed an artificial vision algorithm based on CIELAB color space to identify dust over panels in an autonomous way. We performed several experiments photographing three different types of panels, 30W, 340W and 410W. Those panels were soiled artificially with uniform and non-uniform distributed dust. The algorithm proposed uses statistical tools to provide a simulation with a 100% soiled panel and then performs a comparison to get the percentage of dirt in the experimental data set. The simulation uses a seed that is obtained by taking a dust sample from the maximum amount of dust from the dataset. The final result is the dirt percentage and the possible distribution of dust over the panel. Dust deposition is a key factor for plant owners to determine cleaning cycles or identify nonuniform depositions that could lead to module failure and hot spots.

Keywords: dust detection, photovoltaic, artificial vision, soiling

Procedia PDF Downloads 47
652 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Continuous monitoring of industrial plants is one of necessary tasks when it comes to ensuring high-quality final products. In terms of monitoring and diagnosis, it is quite critical and important to detect some incipient abnormal events of manufacturing processes in order to improve safety and reliability of operations involved and to reduce related losses. In this work a new multivariate statistical online diagnostic method is presented using a case study. For building some reference models an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques especially in terms of incipient detection of any faults occurred.

Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring

Procedia PDF Downloads 400
651 Machine Learning Driven Analysis of Kepler Objects of Interest to Identify Exoplanets

Authors: Akshat Kumar, Vidushi

Abstract:

This paper identifies 27 KOIs, 26 of which are currently classified as candidates and one as false positives that have a high probability of being confirmed. For this purpose, 11 machine learning algorithms were implemented on the cumulative kepler dataset sourced from the NASA exoplanet archive; it was observed that the best-performing model was HistGradientBoosting and XGBoost with a test accuracy of 93.5%, and the lowest-performing model was Gaussian NB with a test accuracy of 54%, to test model performance F1, cross-validation score and RUC curve was calculated. Based on the learned models, the significant characteristics for confirm exoplanets were identified, putting emphasis on the object’s transit and stellar properties; these characteristics were namely koi_count, koi_prad, koi_period, koi_dor, koi_ror, and koi_smass, which were later considered to filter out the potential KOIs. The paper also calculates the Earth similarity index based on the planetary radius and equilibrium temperature for each KOI identified to aid in their classification.

Keywords: Kepler objects of interest, exoplanets, space exploration, machine learning, earth similarity index, transit photometry

Procedia PDF Downloads 73
650 Comparison of Classical Computer Vision vs. Convolutional Neural Networks Approaches for Weed Mapping in Aerial Images

Authors: Paulo Cesar Pereira Junior, Alexandre Monteiro, Rafael da Luz Ribeiro, Antonio Carlos Sobieranski, Aldo von Wangenheim

Abstract:

In this paper, we present a comparison between convolutional neural networks and classical computer vision approaches, for the specific precision agriculture problem of weed mapping on sugarcane fields aerial images. A systematic literature review was conducted to find which computer vision methods are being used on this specific problem. The most cited methods were implemented, as well as four models of convolutional neural networks. All implemented approaches were tested using the same dataset, and their results were quantitatively and qualitatively analyzed. The obtained results were compared to a human expert made ground truth for validation. The results indicate that the convolutional neural networks present better precision and generalize better than the classical models.

Keywords: convolutional neural networks, deep learning, digital image processing, precision agriculture, semantic segmentation, unmanned aerial vehicles

Procedia PDF Downloads 258
649 Finding Related Scientific Documents Using Formal Concept Analysis

Authors: Nadeem Akhtar, Hira Javed

Abstract:

An important aspect of research is literature survey. Availability of a large amount of literature across different domains triggers the need for optimized systems which provide relevant literature to researchers. We propose a search system based on keywords for text documents. This experimental approach provides a hierarchical structure to the document corpus. The documents are labelled with keywords using KEA (Keyword Extraction Algorithm) and are automatically organized in a lattice structure using Formal Concept Analysis (FCA). This groups the semantically related documents together. The hierarchical structure, based on keywords gives out only those documents which precisely contain them. This approach open doors for multi-domain research. The documents across multiple domains which are indexed by similar keywords are grouped together. A hierarchical relationship between keywords is obtained. To signify the effectiveness of the approach, we have carried out the experiment and evaluation on Semeval-2010 Dataset. Results depict that the presented method is considerably successful in indexing of scientific papers.

Keywords: formal concept analysis, keyword extraction algorithm, scientific documents, lattice

Procedia PDF Downloads 330
648 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum

Authors: Abdulrahman Sumayli, Saad M. AlShahrani

Abstract:

For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectively

Keywords: temperature, pressure variations, machine learning, oil treatment

Procedia PDF Downloads 67
647 Change Point Detection Using Random Matrix Theory with Application to Frailty in Elderly Individuals

Authors: Malika Kharouf, Aly Chkeir, Khac Tuan Huynh

Abstract:

Detecting change points in time series data is a challenging problem, especially in scenarios where there is limited prior knowledge regarding the data’s distribution and the nature of the transitions. We present a method designed for detecting changes in the covariance structure of high-dimensional time series data, where the number of variables closely matches the data length. Our objective is to achieve unbiased test statistic estimation under the null hypothesis. We delve into the utilization of Random Matrix Theory to analyze the behavior of our test statistic within a high-dimensional context. Specifically, we illustrate that our test statistic converges pointwise to a normal distribution under the null hypothesis. To assess the effectiveness of our proposed approach, we conduct evaluations on a simulated dataset. Furthermore, we employ our method to examine changes aimed at detecting frailty in the elderly.

Keywords: change point detection, hypothesis tests, random matrix theory, frailty in elderly

Procedia PDF Downloads 51
646 Assessing Bus Service Quality in Dhaka City from the Perspective of Female Passengers

Authors: S. K. Subah, R. Tasnim, M. I. Jahan, M. R. Islam

Abstract:

While talking about how comfortable and convenient Dhaka's bus service is, the minimum emphasis is placed on the female commuters of the Dhaka city. Recognizing the contemporary situation, the supreme focus is to develop experimental model based on statistical methods. SEM has been adopted to quantify passenger satisfaction, which is affected by the perceived service quality. The study deals with 16 observed variables and three latent variables, which were correlated to identify their significance on the regulation of perceived SQ (Service Quality). To calibrate the model, a dataset of 250 responses from female users of local buses has been utilized through survey. A questionnaire structured with SQ variables was prepared in consultation with prevailing literature, practitioners, academicians, and users. The result concludes that the attributes of safe and secured environment have the most significant impact on the overall bus service quality according to the insight of female respondents. The study outcome might be a great help for the policymakers, women's organizations, and NGOs to formulate transport policy that will ensure a women-friendly public bus service.

Keywords: bus service quality, female perception, structural equation modelling, safety-security, women friendly bus

Procedia PDF Downloads 154
645 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving the bioassay predictions. The first step is instance selection in which dataset is categorized into training, testing, and validation sets. The second step is discretization that partitions the data in consideration of accuracy vs. precision. The third step is normalization where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combination of preprocessing steps and machine learning algorithms in more consistent and accurate prediction.

Keywords: bioassay, machine learning, preprocessing, virtual screen

Procedia PDF Downloads 273
644 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems uses artificial intelligence practices for filtering obscure information and can predict if a user likes a specified item. Kernel mapping Recommender systems have been proposed which are accurate and state-of-the-art algorithms and resolve recommender system’s design objectives such as; long tail, cold-start, and sparsity. The aim of research is to propose hybrid framework that can efficiently integrate different versions— namely item-based and user-based KMR— of KMR algorithm. We have proposed various heuristic algorithms that integrate different versions of KMR (into a unified framework) resulting in improved accuracy and elimination of problems associated with conventional recommender system. We have tested our system on publically available movies dataset and benchmark with KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 337
643 Integration of Resistivity and Seismic Refraction Using Combine Inversion for Ancient River Findings at Sungai Batu, Lembah Bujang, Malaysia

Authors: Rais Yusoh, Rosli Saad, Mokhtar Saidin, Fauzi Andika, Sabiu Bala Muhammad

Abstract:

Resistivity and seismic refraction profiling have become a common method in pre-investigations for visualizing subsurface structure. The integration of the methods could reduce an interpretation ambiguity. Both methods have their individual software packages for data inversion, but potential to combine certain geophysical methods are restricted; however, the research algorithms that have this functionality was existed and are evaluated personally. The interpretation of subsurface were improve by combining inversion data from both methods by influence each other models using closure coupling; thus, by implementing both methods to support each other which could improve the subsurface interpretation. These methods were applied on a field dataset from a pre-investigation for archeology in finding the ancient river. There were no major changes in the inverted model by combining data inversion for this archetype which probably due to complex geology. The combine data analysis provides an additional technique for interpretation such as an alluvium, which can have strong influence on the ancient river findings.

Keywords: ancient river, combine inversion, resistivity, seismic refraction

Procedia PDF Downloads 330
642 On the Bias and Predictability of Asylum Cases

Authors: Panagiota Katsikouli, William Hamilton Byrne, Thomas Gammeltoft-Hansen, Tijs Slaats

Abstract:

An individual who demonstrates a well-founded fear of persecution or faces real risk of being subjected to torture is eligible for asylum. In Danish law, the exact legal thresholds reflect those established by international conventions, notably the 1951 Refugee Convention and the 1950 European Convention for Human Rights. These international treaties, however, remain largely silent when it comes to how states should assess asylum claims. As a result, national authorities are typically left to determine an individual’s legal eligibility on a narrow basis consisting of an oral testimony, which may itself be hampered by several factors, including imprecise language interpretation, insecurity or lacking trust towards the authorities among applicants. The leaky ground, on which authorities must assess their subjective perceptions of asylum applicants' credibility, questions whether, in all cases, adjudicators make the correct decision. Moreover, the subjective element in these assessments raises questions on whether individual asylum cases could be afflicted by implicit biases or stereotyping amongst adjudicators. In fact, recent studies have uncovered significant correlations between decision outcomes and the experience and gender of the assigned judge, as well as correlations between asylum outcomes and entirely external events such as weather and political elections. In this study, we analyze a publicly available dataset containing approximately 8,000 summaries of asylum cases, initially rejected, and re-tried by the Refugee Appeals Board (RAB) in Denmark. First, we look for variations in the recognition rates, with regards to a number of applicants’ features: their country of origin/nationality, their identified gender, their identified religion, their ethnicity, whether torture was mentioned in their case and if so, whether it was supported or not, and the year the applicant entered Denmark. In order to extract those features from the text summaries, as well as the final decision of the RAB, we applied natural language processing and regular expressions, adjusting for the Danish language. We observed interesting variations in recognition rates related to the applicants’ country of origin, ethnicity, year of entry and the support or not of torture claims, whenever those were made in the case. The appearance (or not) of significant variations in the recognition rates, does not necessarily imply (or not) bias in the decision-making progress. None of the considered features, with the exception maybe of the torture claims, should be decisive factors for an asylum seeker’s fate. We therefore investigate whether the decision can be predicted on the basis of these features, and consequently, whether biases are likely to exist in the decisionmaking progress. We employed a number of machine learning classifiers, and found that when using the applicant’s country of origin, religion, ethnicity and year of entry with a random forest classifier, or a decision tree, the prediction accuracy is as high as 82% and 85% respectively. tentially predictive properties with regards to the outcome of an asylum case. Our analysis and findings call for further investigation on the predictability of the outcome, on a larger dataset of 17,000 cases, which is undergoing.

Keywords: asylum adjudications, automated decision-making, machine learning, text mining

Procedia PDF Downloads 92
641 The Potential and Economic Viability Analysis of Grid-Connected Solar PV Power in Kenya

Authors: Remember Samu, Kathy Kiema, Murat Fahrioglu

Abstract:

This present study is aimed at minimizing the dependence on fossil fuels thus reducing greenhouse gas (GHG) emissions and also to curb for the rising energy demands in Kenya. In this analysis, 35 locations were each considered for their techno-economic potential of installation of a 10MW grid-connected PV plant. The sites are scattered across the country but are mostly concentrated in the eastern region and were selected based on their accessibility to the national grid and availability of their meteorological parameters from NASA Solar Energy Dataset. RETScreen software 4.0 version will be employed for the analysis in this present paper. The capacity factor, simple payback, equity payback, the net present value (NPV), annual life cycle savings, energy production cost, net annual greenhouse gas emission reduction and the equivalent barrels of crude oil not consumed are outlined. Energy accounting is performed and compared to the existing grid tariff for an effective feasibility argument of this 10MW grid-connected PV power system.

Keywords: photovoltaics, project viability analysis, PV module, renewable energy

Procedia PDF Downloads 312
640 Flexible Cities: A Multisided Spatial Application of Tracking Livability of Urban Environment

Authors: Maria Christofi, George Plastiras, Rafaella Elia, Vaggelis Tsiourtis, Theocharis Theocharides, Miltiadis Katsaros

Abstract:

The rapidly expanding urban areas of the world constitute a challenge of how we need to make the transition to "the next urbanization", which will be defined by new analytical tools and new sources of data. This paper is about the production of a spatial application, the ‘FUMapp’, where space and its initiative will be available literally, in meters, but also abstractly, at a sensed level. While existing spatial applications typically focus on illustrations of the urban infrastructure, the suggested application goes beyond the existing: It investigates how our environment's perception adapts to the alterations of the built environment through a dataset construction of biophysical measurements (eye-tracking, heart beating), and physical metrics (spatial characteristics, size of stimuli, rhythm of mobility). It explores the intersections between architecture, cognition, and computing where future design can be improved and identifies the flexibility and livability of the ‘available space’ of specific examined urban paths.

Keywords: biophysical data, flexibility of urban, livability, next urbanization, spatial application

Procedia PDF Downloads 142
639 Pawn or Potentates: Corporate Governance Structure in Indian Central Public Sector Enterprises

Authors: Ritika Jain, Rajnish Kumar

Abstract:

The Department of Public Enterprises had made submissions of Self Evaluation Reports, for the purpose of corporate governance, mandatory for all central government owned enterprises. Despite this, an alarming 40% of the enterprises did not do so. This study examines the impact of external policy tools and internal firm-specific factors on corporate governance of central public sector enterprises (CPSEs). We use a dataset of all manufacturing and non-financial services owned by the central government of India for the year 2010-11. Using probit, ordered logit and Heckman’s sample selection models, the study finds that the probability and quality of corporate governance is positively influenced by the CPSE getting into a Memorandum of Understanding (MoU) with the central government of India, and hence, enjoying more autonomy in terms of day to day operations. Besides these, internal factors, including bigger size and lower debt size contribute significantly to better corporate governance.

Keywords: corporate governance, central public sector enterprises (CPSEs), sample selection, Memorandum of Understanding (MoU), ordered logit, disinvestment

Procedia PDF Downloads 257
638 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

By the evolvement in technology, the way of expressing opinions switched the direction to the digital world. The domain of politics as one of the hottest topics of opinion mining research merged together with the behavior analysis for affiliation determination in text which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 are constituted by Linguistic Inquiry and Word Count (LIWC) features are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers when used with SVM and IGR feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “function” as an aggregate feature of the linguistic category, is obtained as the most differentiating feature among the 68 features with 81% accuracy by itself in classifying articles either as Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 381
637 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces an output in the form of a text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually training on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in the case of these low-resource scenarios, building an ASR model is considered as a complex task due to the lack of labeled data, resulting in an under-trained system. Semi-supervised learning approaches arise as necessary tasks given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audios and their respective transcriptions. A DNN with a one-hidden-layer network was initialized; increasing the number of hidden layers in training, to a five. A refinement, which consisted of the weight matrix plus bias term and a Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion. (b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three confidence scores or metrics, regarding the lattice concept (based on the graph cost, the acoustic cost and a combination of both), was performed as selection technique. The performance of the ASR system will be calculated by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the DNN proposed system was also made under the same conditions. Results showed that the semi-supervised ASR-model based on DNNs outperformed the GMM-model, in terms of WER, in all tested cases. The best result obtained an improvement of 6% relative WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 339
636 Local Boundary Analysis for Generative Theory of Tonal Music: From the Aspect of Classic Music Melody Analysis

Authors: Po-Chun Wang, Yan-Ru Lai, Sophia I. C. Lin, Alvin W. Y. Su

Abstract:

The Generative Theory of Tonal Music (GTTM) provides systematic approaches to recognizing local boundaries of music. The rules have been implemented in some automated melody segmentation algorithms. Besides, there are also deep learning methods with GTTM features applied to boundary detection tasks. However, these studies might face constraints such as a lack of or inconsistent label data. The GTTM database is currently the most widely used GTTM database, which includes manually labeled GTTM rules and local boundaries. Even so, we found some problems with these labels. They are sometimes discrepancies with GTTM rules. In addition, since it is labeled at different times by multiple musicians, they are not within the same scope in some cases. Therefore, in this paper, we examine this database with musicians from the aspect of classical music and relabel the scores. The relabeled database - GTTM Database v2.0 - will be released for academic research usage. Despite the experimental and statistical results showing that the relabeled database is more consistent, the improvement in boundary detection is not substantial. It seems that we need more clues than GTTM rules for boundary detection in the future.

Keywords: dataset, GTTM, local boundary, neural network

Procedia PDF Downloads 144
635 Generative Adversarial Network for Bidirectional Mappings between Retinal Fundus Images and Vessel Segmented Images

Authors: Haoqi Gao, Koichi Ogawara

Abstract:

Retinal vascular segmentation of color fundus is the basis of ophthalmic computer-aided diagnosis and large-scale disease screening systems. Early screening of fundus diseases has great value for clinical medical diagnosis. The traditional methods depend on the experience of the doctor, which is time-consuming, labor-intensive, and inefficient. Furthermore, medical images are scarce and fraught with legal concerns regarding patient privacy. In this paper, we propose a new Generative Adversarial Network based on CycleGAN for retinal fundus images. This method can generate not only synthetic fundus images but also generate corresponding segmentation masks, which has certain application value and challenge in computer vision and computer graphics. In the results, we evaluate our proposed method from both quantitative and qualitative. For generated segmented images, our method achieves dice coefficient of 0.81 and PR of 0.89 on DRIVE dataset. For generated synthetic fundus images, we use ”Toy Experiment” to verify the state-of-the-art performance of our method.

Keywords: retinal vascular segmentations, generative ad-versarial network, cyclegan, fundus images

Procedia PDF Downloads 142
634 The Inequality Effects of Natural Disasters: Evidence from Thailand

Authors: Annop Jaewisorn

Abstract:

This study explores the relationship between natural disasters and inequalities -both income and expenditure inequality- at a micro-level of Thailand as the first study of this nature for this country. The analysis uses a unique panel and remote-sensing dataset constructed for the purpose of this research. It contains provincial inequality measures and other economic and social indicators based on the Thailand Household Survey during the period between 1992 and 2019. Meanwhile, the data on natural disasters, which are remote-sensing data, are received from several official geophysical or meteorological databases. Employing a panel fixed effects, the results show that natural disasters significantly reduce household income and expenditure inequality as measured by the Gini index, implying that rich people in Thailand bear a higher cost of natural disasters when compared to poor people. The effect on income inequality is mainly driven by droughts, while the effect on expenditure inequality is mainly driven by flood events. The results are robust across heterogeneity of the samples, lagged effects, outliers, and an alternative inequality measure.

Keywords: inequality, natural disasters, remote-sensing data, Thailand

Procedia PDF Downloads 122
633 Longitudinal Study of the Phenomenon of Acting White in Hungarian Elementary Schools Analysed by Fixed and Random Effects Models

Authors: Lilla Dorina Habsz, Marta Rado

Abstract:

Popularity is affected by a variety of factors in the primary school such as academic achievement and ethnicity. The main goal of our study was to analyse whether acting white exists in Hungarian elementary schools. In other words, we observed whether Roma students penalize those in-group members who obtain the high academic achievement. Furthermore, to show how popularity is influenced by changes in academic achievement in inter-ethnic relations. The empirical basis of our research was the 'competition and negative networks' longitudinal dataset, which was collected by the MTA TK 'Lendület' RECENS research group. This research followed 11 and 12-year old students for a two-year period. The survey was analysed using fixed and random effect models. Overall, we found a positive correlation between grades and popularity, but no evidence for the acting white effect. However, better grades were more positively evaluated within the majority group than within the minority group, which may further increase inequalities.

Keywords: academic achievement, elementary school, ethnicity, popularity

Procedia PDF Downloads 200
632 Weed Classification Using a Two-Dimensional Deep Convolutional Neural Network

Authors: Muhammad Ali Sarwar, Muhammad Farooq, Nayab Hassan, Hammad Hassan

Abstract:

Pakistan is highly recognized for its agriculture and is well known for producing substantial amounts of wheat, cotton, and sugarcane. However, some factors contribute to a decline in crop quality and a reduction in overall output. One of the main factors contributing to this decline is the presence of weed and its late detection. This process of detection is manual and demands a detailed inspection to be done by the farmer itself. But by the time detection of weed, the farmer will be able to save its cost and can increase the overall production. The focus of this research is to identify and classify the four main types of weeds (Small-Flowered Cranesbill, Chick Weed, Prickly Acacia, and Black-Grass) that are prevalent in our region’s major crops. In this work, we implemented three different deep learning techniques: YOLO-v5, Inception-v3, and Deep CNN on the same Dataset, and have concluded that deep convolutions neural network performed better with an accuracy of 97.45% for such classification. In relative to the state of the art, our proposed approach yields 2% better results. We devised the architecture in an efficient way such that it can be used in real-time.

Keywords: deep convolution networks, Yolo, machine learning, agriculture

Procedia PDF Downloads 116
631 Single Cell Analysis of Circulating Monocytes in Prostate Cancer Patients

Authors: Leander Van Neste, Kirk Wojno

Abstract:

The innate immune system reacts to foreign insult in several unique ways, one of which is phagocytosis of perceived threats such as cancer, bacteria, and viruses. The goal of this study was to look for evidence of phagocytosed RNA from tumor cells in circulating monocytes. While all monocytes possess phagocytic capabilities, the non-classical CD14+/FCGR3A+ monocytes and the intermediate CD14++/FCGR3A+ monocytes most actively remove threatening ‘external’ cellular materials. Purified CD14-positive monocyte samples from fourteen patients recently diagnosed with clinically localized prostate cancer (PCa) were investigated by single-cell RNA sequencing using the 10X Genomics protocol followed by paired-end sequencing on Illumina’s NovaSeq. Similarly, samples were processed and used as controls, i.e., one patient underwent biopsy but was found not to harbor prostate cancer (benign), three young, healthy men, and three men previously diagnosed with prostate cancer that recently underwent (curative) radical prostatectomy (post-RP). Sequencing data were mapped using 10X Genomics’ CellRanger software and viable cells were subsequently identified using CellBender, removing technical artifacts such as doublets and non-cellular RNA. Next, data analysis was performed in R, using the Seurat package. Because the main goal was to identify differences between PCa patients and ‘control’ patients, rather than exploring differences between individual subjects, the individual Seurat objects of all 21 patients were merged into one Seurat object per Seurat’s recommendation. Finally, the single-cell dataset was normalized as a whole prior to further analysis. Cell identity was assessed using the SingleR and cell dex packages. The Monaco Immune Data was selected as the reference dataset, consisting of bulk RNA-seq data of sorted human immune cells. The Monaco classification was supplemented with normalized PCa data obtained from The Cancer Genome Atlas (TCGA), which consists of bulk RNA sequencing data from 499 prostate tumor tissues (including 1 metastatic) and 52 (adjacent) normal prostate tissues. SingleR was subsequently run on the combined immune cell and PCa datasets. As expected, the vast majority of cells were labeled as having a monocytic origin (~90%), with the most noticeable difference being the larger number of intermediate monocytes in the PCa patients (13.6% versus 7.1%; p<.001). In men harboring PCa, 0.60% of all purified monocytes were classified as harboring PCa signals when the TCGA data were included. This was 3-fold, 7.5-fold, and 4-fold higher compared to post-RP, benign, and young men, respectively (all p<.001). In addition, with 7.91%, the number of unclassified cells, i.e., cells with pruned labels due to high uncertainty of the assigned label, was also highest in men with PCa, compared to 3.51%, 2.67%, and 5.51% of cells in post-RP, benign, and young men, respectively (all p<.001). It can be postulated that actively phagocytosing cells are hardest to classify due to their dual immune cell and foreign cell nature. Hence, the higher number of unclassified cells and intermediate monocytes in PCa patients might reflect higher phagocytic activity due to tumor burden. This also illustrates that small numbers (~1%) of circulating peripheral blood monocytes that have interacted with tumor cells might still possess detectable phagocytosed tumor RNA.

Keywords: circulating monocytes, phagocytic cells, prostate cancer, tumor immune response

Procedia PDF Downloads 161
630 Intended-Actual First Asking/Offer Price Discrepancies and Their Impact on Negotiation Behaviour and Outcomes

Authors: Liuyao Chai, Colin Clark

Abstract:

Analysis of 574 participants in a simulated two-person distributive negotiation revealed that the first price 245 (42.7%) of these participants actually asked/offered for the item under negotiation (a used car) differed from the first price they previously stated they intended to ask/offer during their negotiation. This discrepancy between a negotiator’s intended first asking/offer price and his/her actual first asking/offer price had a significant and economically consequential impact on both the course and the outcomes of the negotiations studied. Participants whose actual first price remained the same as their intended first price tended to secure better negotiation outcomes. Moreover, participants who changed their intended first price tended to obtain relatively lower outcomes regardless of whether their modified first announced price had created a negotiating position that was ‘stronger’ or ‘weaker’ than if they had opened with their intended first price. Subsequent investigation of over twenty negotiation behaviours and pre-negotiation perceptual variables within this dataset indicated that the three types of first price announcers—i.e. intended first asking/offer price ‘weakeners’, ‘maintainers’ and ‘strengtheners’— comprised persons who tended to have significantly different pre-negotiation perceptions and behaved in systematically different ways during their negotiation. Typically, the most negative, outcome-compromising consequences of changing, weakening or strengthening an intended first price occurred at the very beginning of a negotiation when participants exchanged their actual first asking/offer prices.

Keywords: business communication, negotiation, persuasion, intended first asking/offer prices, bargaining

Procedia PDF Downloads 369
629 Living Arrangement of Elderly in India: An Exploration from BKPAI Study

Authors: Jitendra Gouda, Chander Shekhar

Abstract:

With the addition of 27 million elderly in India in past census decade from 2001 to 2011, it is imperative to work towards exploring the issues and concerns of this increasingly aged population. In Indian society, the elderly person is assumed to be looked after by the family members, especially by children but with changing economy, society, and lifestyle, this assumption demands examining. This paper is an attempt to explore the living arrangement of the elderly and their perceptions about this in India. The findings are based on the BKPAI dataset of 2011, which was conducted in seven states – Himachal Pradesh, Kerala, Maharashtra, Odisha, Punjab, Tamil Nadu, and West Bengal. The result shows that three fourth of elderly lives with their children. Having son and staying with children is positively associated among elderly. More than 40 percent as compared to 37 percent of elderly feels comfortable living with sons and daughters respectively. Half of elderly across sexes viewed that sons are the best person to live with. The result of discriminant analysis suggest that health status and living arrangement of elderly are the good discriminators to ensure their importance in the family.

Keywords: discriminant analysis, elderly, India, living arrangment

Procedia PDF Downloads 324
628 Conflict and Hunger Revisit: Evidences from Global Surveys, 1989-2020

Authors: Manasse Elusma, Thung-Hong Lin, Chun-yin Lee

Abstract:

The relationship between hunger and war or conflict remains to be discussed. Do wars or conflicts cause hunger and food scarcity, or is the reverse relationship is true? As the world becomes more peaceful and wealthier, some countries are still suffered from hunger and food shortage. So, eradicating hunger calls for a more comprehensive understanding of the relationship between conflict and hunger. Several studies are carried out to detect the importance of conflict or war on food security. Most of these studies, however, perform only descriptive analysis and largely use food security indicators instead of the global hunger index. Few studies have employed cross-country panel data to explicitly analyze the association between conflict and chronic hunger, including hidden hunger. Herein, this study addresses this issue and the knowledge gap. We combine global datasets to build a new panel dataset including 143 countries from 1989 to 2020. This study examines the effect of conflict on hunger with fixed effect models, and the results show that the increase of conflict frequency deteriorates hunger. Peacebuilding efforts and war prevention initiative are required to eradicate global hunger.

Keywords: armed conflict, food scarcity, hidden hunger, hunger, malnutrition

Procedia PDF Downloads 171
627 Security Practices of the European Union on Migration: An Analysis of the Frontex Within the Framework of Biopolitics

Authors: Gizem Ertürk, Nursena Dinç

Abstract:

The Aegean area has always been an important transit point for migration; however, the establishment of the European Union gave further impetus to the migration phenomenon and increased the significance of the area within this context. The migration waves have been more visible in the area in recent decades, and particularly after the “2015 Migration Crisis,” this issue has been subject to further securitization in the EU. In this conjuncture, the Frontex, which is an agency set up by the EU in 2005 for the purpose of managing and coordinating the border control efforts, has become more functional in the relevant area, but at the same time, have some questionable actions within the context of human rights. This paper problematizes the rationality behind the existence and practices of such a structure and attempts to make a political and legal analysis of the security practices of the European Union against migration within a framework based on the biopolitics approaches of Michel Foucault and Giorgio Agamben. The dataset of this paper, which focuses on the agency in question by taking it as a case, is formed by making use of the existing literature on the EU’s security policies, the relevant official texts of the Union and Frontex reports on migration practices in and around the Aegean Sea.

Keywords: migration, biopolitics, Frontex, security, European union, securitization

Procedia PDF Downloads 137
626 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Authors: Sanaa Chafik, Imane Daoudi, Mounim A. El Yacoubi, Hamid El Ouardi

Abstract:

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH that has been successfully applied in many multimedia applications. However, the Euclidean LSH presents limitations that affect structure and query performances. The main limitation of the Euclidean LSH is the large memory consumption. In order to achieve a good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm to overcome the storage space problem and improve query time, while keeping a good accuracy as similar to that achieved by the original Euclidean LSH. The Experimental results on a real large-scale dataset show that the proposed approach achieves good performances and consumes less memory than the Euclidean LSH.

Keywords: approximate nearest neighbor search, content based image retrieval (CBIR), curse of dimensionality, locality sensitive hashing, multidimensional indexing, scalability

Procedia PDF Downloads 320