Search results for: webpage classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2182

Search results for: webpage classification

1492 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 49
1491 Classification of Multiple Cancer Types with Deep Convolutional Neural Network

Authors: Nan Deng, Zhenqiu Liu

Abstract:

Thousands of patients with metastatic tumors were diagnosed with cancers of unknown primary sites each year. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. Nowadays, a large amount of genomics and transcriptomics cancer data has been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human cancer tumors and healthy controls, which provides an abundance of resource to differentiate cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy on classification among a large number of image object categories. Here, we utilize 25 cancer primary tumors and 3 normal tissues from TCGA and convert their RNA-Seq gene expression profiling to color images; train, validate and test a CNN classifier directly from these images. The performance result shows that our CNN classifier can archive >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to their primary tumors, the CNN classifier may provide a potential computational strategy on identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.

Keywords: bioinformatics, cancer, convolutional neural network, deep leaning, gene expression pattern

Procedia PDF Downloads 299
1490 Semantic Differences between Bug Labeling of Different Repositories via Machine Learning

Authors: Pooja Khanal, Huaming Zhang

Abstract:

Labeling of issues/bugs, also known as bug classification, plays a vital role in software engineering. Some known labels/classes of bugs are 'User Interface', 'Security', and 'API'. Most of the time, when a reporter reports a bug, they try to assign some predefined label to it. Those issues are reported for a project, and each project is a repository in GitHub/GitLab, which contains multiple issues. There are many software project repositories -ranging from individual projects to commercial projects. The labels assigned for different repositories may be dependent on various factors like human instinct, generalization of labels, label assignment policy followed by the reporter, etc. While the reporter of the issue may instinctively give that issue a label, another person reporting the same issue may label it differently. This way, it is not known mathematically if a label in one repository is similar or different to the label in another repository. Hence, the primary goal of this research is to find the semantic differences between bug labeling of different repositories via machine learning. Independent optimal classifiers for individual repositories are built first using the text features from the reported issues. The optimal classifiers may include a combination of multiple classifiers stacked together. Then, those classifiers are used to cross-test other repositories which leads the result to be deduced mathematically. The produce of this ongoing research includes a formalized open-source GitHub issues database that is used to deduce the similarity of the labels pertaining to the different repositories.

Keywords: bug classification, bug labels, GitHub issues, semantic differences

Procedia PDF Downloads 200
1489 Object-Scene: Deep Convolutional Representation for Scene Classification

Authors: Yanjun Chen, Chuanping Hu, Jie Shao, Lin Mei, Chongyang Zhang

Abstract:

Traditional image classification is based on encoding scheme (e.g. Fisher Vector, Vector of Locally Aggregated Descriptor) with low-level image features (e.g. SIFT, HoG). Compared to these low-level local features, deep convolutional features obtained at the mid-level layer of convolutional neural networks (CNN) have richer information but lack of geometric invariance. For scene classification, there are scattered objects with different size, category, layout, number and so on. It is crucial to find the distinctive objects in scene as well as their co-occurrence relationship. In this paper, we propose a method to take advantage of both deep convolutional features and the traditional encoding scheme while taking object-centric and scene-centric information into consideration. First, to exploit the object-centric and scene-centric information, two CNNs that trained on ImageNet and Places dataset separately are used as the pre-trained models to extract deep convolutional features at multiple scales. This produces dense local activations. By analyzing the performance of different CNNs at multiple scales, it is found that each CNN works better in different scale ranges. A scale-wise CNN adaption is reasonable since objects in scene are at its own specific scale. Second, a fisher kernel is applied to aggregate a global representation at each scale and then to merge into a single vector by using a post-processing method called scale-wise normalization. The essence of Fisher Vector lies on the accumulation of the first and second order differences. Hence, the scale-wise normalization followed by average pooling would balance the influence of each scale since different amount of features are extracted. Third, the Fisher vector representation based on the deep convolutional features is followed by a linear Supported Vector Machine, which is a simple yet efficient way to classify the scene categories. Experimental results show that the scale-specific feature extraction and normalization with CNNs trained on object-centric and scene-centric datasets can boost the results from 74.03% up to 79.43% on MIT Indoor67 when only two scales are used (compared to results at single scale). The result is comparable to state-of-art performance which proves that the representation can be applied to other visual recognition tasks.

Keywords: deep convolutional features, Fisher Vector, multiple scales, scale-specific normalization

Procedia PDF Downloads 331
1488 Machine Learning Approach for Automating Electronic Component Error Classification and Detection

Authors: Monica Racha, Siva Chandrasekaran, Alex Stojcevski

Abstract:

The engineering programs focus on promoting students' personal and professional development by ensuring that students acquire technical and professional competencies during four-year studies. The traditional engineering laboratory provides an opportunity for students to "practice by doing," and laboratory facilities aid them in obtaining insight and understanding of their discipline. Due to rapid technological advancements and the current COVID-19 outbreak, the traditional labs were transforming into virtual learning environments. Aim: To better understand the limitations of the physical laboratory, this research study aims to use a Machine Learning (ML) algorithm that interfaces with the Augmented Reality HoloLens and predicts the image behavior to classify and detect the electronic components. The automated electronic components error classification and detection automatically detect and classify the position of all components on a breadboard by using the ML algorithm. This research will assist first-year undergraduate engineering students in conducting laboratory practices without any supervision. With the help of HoloLens, and ML algorithm, students will reduce component placement error on a breadboard and increase the efficiency of simple laboratory practices virtually. Method: The images of breadboards, resistors, capacitors, transistors, and other electrical components will be collected using HoloLens 2 and stored in a database. The collected image dataset will then be used for training a machine learning model. The raw images will be cleaned, processed, and labeled to facilitate further analysis of components error classification and detection. For instance, when students conduct laboratory experiments, the HoloLens captures images of students placing different components on a breadboard. The images are forwarded to the server for detection in the background. A hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm will be used to train the dataset for object recognition and classification. The convolution layer extracts image features, which are then classified using Support Vector Machine (SVM). By adequately labeling the training data and classifying, the model will predict, categorize, and assess students in placing components correctly. As a result, the data acquired through HoloLens includes images of students assembling electronic components. It constantly checks to see if students appropriately position components in the breadboard and connect the components to function. When students misplace any components, the HoloLens predicts the error before the user places the components in the incorrect proportion and fosters students to correct their mistakes. This hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm automating electronic component error classification and detection approach eliminates component connection problems and minimizes the risk of component damage. Conclusion: These augmented reality smart glasses powered by machine learning provide a wide range of benefits to supervisors, professionals, and students. It helps customize the learning experience, which is particularly beneficial in large classes with limited time. It determines the accuracy with which machine learning algorithms can forecast whether students are making the correct decisions and completing their laboratory tasks.

Keywords: augmented reality, machine learning, object recognition, virtual laboratories

Procedia PDF Downloads 134
1487 A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘effective’ states is both academically and technologically challenging. By exposing participants to an effective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and Heart Rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows, with 28 different timing lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30, using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four effective clusters (defined based on a 3-dimensional space – valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on effective clusters or emotion labels.

Keywords: virtual reality, effective computing, effective VR, emotion-based effective physiological database

Procedia PDF Downloads 233
1486 Hacking's 'Between Goffman and Foucault': A Theoretical Frame for Criminology

Authors: Tomás Speziale

Abstract:

This paper aims to analyse how Ian Hacking states the theoretical basis of his research on the classification of people. Although all his early philosophical education had been based in Foucault, it is also true that Erving Goffman’s perspective provided him with epistemological and methodological tools for understanding face-to-face relationships. Hence, all his works must be thought of as social science texts that combine the research on how the individuals are constituted ‘top-down’ (as in Foucault), with the inquiry into how people renegotiate ‘bottom-up’ the classifications about them. Thus, Hacking´s proposal constitutes a middle ground between the French Philosopher and the American Sociologist. Placing himself between both authors allows Hacking to build a frame that is expected to adjust to Social Sciences’ main particularity: the fact that they study interactive kinds. These are kinds of people, which imply that those who are classified can change in certain ways that prompt the need for changing previous classifications themselves. It is all about the interaction between the labelling of people and the people who are classified. Consequently, understanding the way in which Hacking uses Foucault’s and Goffman’s theories is essential to fully comprehend the social dynamic between individuals and concepts, what Bert Hansen had called dialectical realism. His theoretical proposal, therefore, is not only valuable because it combines diverse perspectives, but also because it constitutes an utterly original and relevant framework for Sociological theory and particularly for Criminology.

Keywords: classification of people, Foucault's archaeology, Goffman's interpersonal sociology, interactive kinds

Procedia PDF Downloads 343
1485 Technologic Information about Photovoltaic Applied in Urban Residences

Authors: Stephanie Fabris Russo, Daiane Costa Guimarães, Jonas Pedro Fabris, Maria Emilia Camargo, Suzana Leitão Russo, José Augusto Andrade Filho

Abstract:

Among renewable energy sources, solar energy is the one that has stood out. Solar radiation can be used as a thermal energy source and can also be converted into electricity by means of effects on certain materials, such as thermoelectric and photovoltaic panels. These panels are often used to generate energy in homes, buildings, arenas, etc., and have low pollution emissions. Thus, a technological prospecting was performed to find patents related to the use of photovoltaic plates in urban residences. The patent search was based on ESPACENET, associating the keywords photovoltaic and home, where we found 136 patent documents in the period of 1994-2015 in the fields title and abstract. Note that the years 2009, 2010, 2011, 2012, 2013 and 2014 had the highest number of applicants, with respectively, 11, 13, 23, 29, 15 and 21. Regarding the country that deposited about this technology, it is clear that China leads with 67 patent deposits, followed by Japan with 38 patents applications. It is important to note that most depositors, 50% are companies, 44% are individual inventors and only 6% are universities. On the International Patent classification (IPC) codes, we noted that the most present classification in results was H02J3/38, which represents provisions in parallel to feed a single network by two or more generators, converters or transformers. Among all categories, there is the H session, which means Electricity, with 70% of the patents.

Keywords: photovoltaic, urban residences, technology forecasting, prospecting

Procedia PDF Downloads 300
1484 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.

Keywords: computational social science, movie preference, machine learning, SVM

Procedia PDF Downloads 260
1483 An Improved Parallel Algorithm of Decision Tree

Authors: Jiameng Wang, Yunfei Yin, Xiyu Deng

Abstract:

Parallel optimization is one of the important research topics of data mining at this stage. Taking Classification and Regression Tree (CART) parallelization as an example, this paper proposes a parallel data mining algorithm based on SSP-OGini-PCCP. Aiming at the problem of choosing the best CART segmentation point, this paper designs an S-SP model without data association; and in order to calculate the Gini index efficiently, a parallel OGini calculation method is designed. In addition, in order to improve the efficiency of the pruning algorithm, a synchronous PCCP pruning strategy is proposed in this paper. In this paper, the optimal segmentation calculation, Gini index calculation, and pruning algorithm are studied in depth. These are important components of parallel data mining. By constructing a distributed cluster simulation system based on SPARK, data mining methods based on SSP-OGini-PCCP are tested. Experimental results show that this method can increase the search efficiency of the best segmentation point by an average of 89%, increase the search efficiency of the Gini segmentation index by 3853%, and increase the pruning efficiency by 146% on average; and as the size of the data set increases, the performance of the algorithm remains stable, which meets the requirements of contemporary massive data processing.

Keywords: classification, Gini index, parallel data mining, pruning ahead

Procedia PDF Downloads 123
1482 Remote Sensing Application in Environmental Researches: Case Study of Iran Mangrove Forests Quantitative Assessment

Authors: Neda Orak, Mostafa Zarei

Abstract:

Environmental assessment is an important session in environment management. Since various methods and techniques have been produces and implemented. Remote sensing (RS) is widely used in many scientific and research fields such as geology, cartography, geography, agriculture, forestry, land use planning, environment, etc. It can show earth surface objects cyclical changes. Also, it can show earth phenomena limits on basis of electromagnetic reflectance changes and deviations records. The research has been done on mangrove forests assessment by RS techniques. Mangrove forests quantitative analysis in Basatin and Bidkhoon estuaries was the aim of this research. It has been done by Landsat satellite images from 1975- 2013 and match to ground control points. This part of mangroves are the last distribution in northern hemisphere. It can provide a good background to improve better management on this important ecosystem. Landsat has provided valuable images to earth changes detection to researchers. This research has used MSS, TM, +ETM, OLI sensors from 1975, 1990, 2000, 2003-2013. Changes had been studied after essential corrections such as fix errors, bands combination, georeferencing on 2012 images as basic image, by maximum likelihood and IPVI Index. It was done by supervised classification. 2004 google earth image and ground points by GPS (2010-2012) was used to compare satellite images obtained changes. Results showed mangrove area in bidkhoon was 1119072 m2 by GPS and 1231200 m2 by maximum likelihood supervised classification and 1317600 m2 by IPVI in 2012. Basatin areas is respectively: 466644 m2, 88200 m2, 63000 m2. Final results show forests have been declined naturally. It is due to human activities in Basatin. The defect was offset by planting in many years. Although the trend has been declining in recent years again. So, it mentioned satellite images have high ability to estimation all environmental processes. This research showed high correlation between images and indexes such as IPVI and NDVI with ground control points.

Keywords: IPVI index, Landsat sensor, maximum likelihood supervised classification, Nayband National Park

Procedia PDF Downloads 293
1481 Deciphering Orangutan Drawing Behavior Using Artificial Intelligence

Authors: Benjamin Beltzung, Marie Pelé, Julien P. Renoult, Cédric Sueur

Abstract:

To this day, it is not known if drawing is specifically human behavior or if this behavior finds its origins in ancestor species. An interesting window to enlighten this question is to analyze the drawing behavior in genetically close to human species, such as non-human primate species. A good candidate for this approach is the orangutan, who shares 97% of our genes and exhibits multiple human-like behaviors. Focusing on figurative aspects may not be suitable for orangutans’ drawings, which may appear as scribbles but may have meaning. A manual feature selection would lead to an anthropocentric bias, as the features selected by humans may not match with those relevant for orangutans. In the present study, we used deep learning to analyze the drawings of a female orangutan named Molly († in 2011), who has produced 1,299 drawings in her last five years as part of a behavioral enrichment program at the Tama Zoo in Japan. We investigate multiple ways to decipher Molly’s drawings. First, we demonstrate the existence of differences between seasons by training a deep learning model to classify Molly’s drawings according to the seasons. Then, to understand and interpret these seasonal differences, we analyze how the information spreads within the network, from shallow to deep layers, where early layers encode simple local features and deep layers encode more complex and global information. More precisely, we investigate the impact of feature complexity on classification accuracy through features extraction fed to a Support Vector Machine. Last, we leverage style transfer to dissociate features associated with drawing style from those describing the representational content and analyze the relative importance of these two types of features in explaining seasonal variation. Content features were relevant for the classification, showing the presence of meaning in these non-figurative drawings and the ability of deep learning to decipher these differences. The style of the drawings was also relevant, as style features encoded enough information to have a classification better than random. The accuracy of style features was higher for deeper layers, demonstrating and highlighting the variation of style between seasons in Molly’s drawings. Through this study, we demonstrate how deep learning can help at finding meanings in non-figurative drawings and interpret these differences.

Keywords: cognition, deep learning, drawing behavior, interpretability

Procedia PDF Downloads 165
1480 Video Object Segmentation for Automatic Image Annotation of Ethernet Connectors with Environment Mapping and 3D Projection

Authors: Marrone Silverio Melo Dantas Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner, Djamel Fawzi Hadj Sadok

Abstract:

The creation of a dataset is time-consuming and often discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted for the automation of this process. Both optimize valuable user time and resources and support video object segmentation with object tracking and 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate distinct annotated datasets. We evaluated the precision of the annotations by comparing these with a manually annotated dataset, as well as the efficiency in the context of detection and classification problems. For detection support, we used YOLO and obtained for the projection dataset an F1-Score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. Concerning the tracking dataset, we achieved an F1-Score of 0.861, an accuracy of 0.932, whereas mAP reached 0.894. In order to evaluate the quality of the annotated images used for classification problems, we employed deep learning architectures. We adopted metrics accuracy and F1-Score, for VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others for both projection and tracking datasets. It reached an accuracy and F1-score of 0.997 and 0.993, respectively. Similarly, for the tracking dataset, it achieved an accuracy of 0.991 and an F1-Score of 0.981.

Keywords: RJ45, automatic annotation, object tracking, 3D projection

Procedia PDF Downloads 167
1479 Quality Assessment and Classification of Recycled Aggregates from CandDW According to the European Standards

Authors: M. Eckert, D. Mendes, J P. Gonçalves, C. Moço, M. Oliveira

Abstract:

The intensive extraction of natural aggregates leads to both depletion of natural resources and unwanted environmental impacts. On the other hand, uncontrolled disposal of Construction and Demolition Wastes (C&DW) causes the lifetime reduction of landfills. It is known that the European Union produces, each year, about 850 million tons of C&DW. For all the member States of the European Union, one of the milestones to be reached by 2020, according to the Resource Efficiency Roadmap (COM (2011) 571) of the European Commission, is to recycle 70% of the C&DW. In this work, properties of different types of recycled C&DW aggregates and natural aggregates were compared. Assays were performed according to European Standards (EN 13285; EN 13242+A1; EN 12457-4; EN 12620; EN 13139) for the characterization of there: physical, mechanical and chemical properties. Not standardized tests such as water absorption over time, mass stability and post compaction sieve analysis were also carried out. The tested recycled C&DW aggregates were classified according to the requirements of the European Standards regarding there potential use in concrete, mortar, unbound layers of road pavements and embankments. The results of the physical and mechanical properties of recycled C&DW aggregates indicated, in general, lower quality properties when compared to natural aggregates, particularly, for concrete preparation and unbound layers of road pavements. The results of the chemical properties attested that the C&DW aggregates constitute no environmental risk. It was concluded that recycled aggregates produced from C&DW have the potential to be used in many applications.

Keywords: recycled aggregate, sustainability, aggregate properties, European Standard Classification

Procedia PDF Downloads 674
1478 Classification Framework of Production Planning and Scheduling Solutions from Supply Chain Management Perspective

Authors: Kwan Hee Han

Abstract:

In today’s business environments, frequent change of customer requirements is a tough challenge to manufacturing company. To cope with these challenges, a production planning and scheduling (PP&S) function might be established to provide accountability for both customer service and operational efficiency. Nowadays, many manufacturing firms have utilized PP&S software solutions to generate a realistic production plan and schedule to adapt to external changes efficiently. However, companies which consider the introduction of PP&S software solution, still have difficulties for selecting adequate solution to meet their specific needs. Since the task of PP&S is the one of major building blocks of SCM (Supply Chain Management) architecture, which deals with short term decision making in the production process of SCM, it is needed that the functionalities of PP&S should be analysed within the whole SCM process. The aim of this paper is to analyse the PP&S functionalities and its system architecture from the SCM perspective by using the criteria of level of planning hierarchy, major 4 SCM processes and problem-solving approaches, and finally propose a classification framework of PP&S solutions to facilitate the comparison among various commercial software solutions. By using proposed framework, several major PP&S solutions are classified and positioned according to their functional characteristics in this paper. By using this framework, practitioners who consider the introduction of computerized PP&S solutions in manufacturing firms can prepare evaluation and benchmarking sheets for selecting the most suitable solution with ease and in less time.

Keywords: production planning, production scheduling, supply chain management, the advanced planning system

Procedia PDF Downloads 198
1477 Gingival Tissue Appearance Changes According Hormonal Oscillations at Female Patients

Authors: Ilma Robo, Saimir Heta, Vera Ostreni, Elsaida Agrushi, Eduart Kapaj

Abstract:

Introduction: Cyclic hormonal fluctuations are known from literature to have a clinically visible effects on gingival tissue reactions, to the diagnosed processes of gingival inflammation. Materials and methods: At a total of 47 female patients, ad-hock presented at the University Clinic, were recorded data on effect of hormonal oscillations at periodontal treatment protocol. Oral examination was performed on soft tissue of gingiva and the oral mucous membrane, always respecting the air-drying procedure and then checking with free eye differences in oral mucosal relief. After the patients were informed about the study protocol, the purpose of the study and the ongoing procedure, verbal consensus was required. Results: The study was conducted in a total of 47 patients, out of which 13 patients were under the gingivitis classification, and 24 patients under the periodontal classification. Patients included in the study are divided by age, cycle week respectively 1,2,3 and 4.The younger age of female patients is more prone to the appearance of gingivitis, which is further aggravated by the effects of sexual hormones and the effect of the controlled or non-regulated fluctuations of the latter. Conclusions: The healing process is more fuel-intensive in the absence of high hormone levels, as they are these pro-inflammatory hormones, both in or near the ho Younger women are more open to volunteering in studies that record individual and study data that may last in time.

Keywords: gingiva, hormonal oscillations, female patients, mucosa, periodontal non-surgical treatment

Procedia PDF Downloads 81
1476 Strategies for Synchronizing Chocolate Conching Data Using Dynamic Time Warping

Authors: Fernanda A. P. Peres, Thiago N. Peres, Flavio S. Fogliatto, Michel J. Anzanello

Abstract:

Batch processes are widely used in food industry and have an important role in the production of high added value products, such as chocolate. Process performance is usually described by variables that are monitored as the batch progresses. Data arising from these processes are likely to display a strong correlation-autocorrelation structure, and are usually monitored using control charts based on multiway principal components analysis (MPCA). Process control of a new batch is carried out comparing the trajectories of its relevant process variables with those in a reference set of batches that yielded products within specifications; it is clear that proper determination of the reference set is key for the success of a correct signalization of non-conforming batches in such quality control schemes. In chocolate manufacturing, misclassifications of non-conforming batches in the conching phase may lead to significant financial losses. In such context, the accuracy of process control grows in relevance. In addition to that, the main assumption in MPCA-based monitoring strategies is that all batches are synchronized in duration, both the new batch being monitored and those in the reference set. Such assumption is often not satisfied in chocolate manufacturing process. As a consequence, traditional techniques as MPCA-based charts are not suitable for process control and monitoring. To address that issue, the objective of this work is to compare the performance of three dynamic time warping (DTW) methods in the alignment and synchronization of chocolate conching process variables’ trajectories, aimed at properly determining the reference distribution for multivariate statistical process control. The power of classification of batches in two categories (conforming and non-conforming) was evaluated using the k-nearest neighbor (KNN) algorithm. Real data from a milk chocolate conching process was collected and the following variables were monitored over time: frequency of soybean lecithin dosage, rotation speed of the shovels, current of the main motor of the conche, and chocolate temperature. A set of 62 batches with durations between 495 and 1,170 minutes was considered; 53% of the batches were known to be conforming based on lab test results and experts’ evaluations. Results showed that all three DTW methods tested were able to align and synchronize the conching dataset. However, synchronized datasets obtained from these methods performed differently when inputted in the KNN classification algorithm. Kassidas, MacGregor and Taylor’s (named KMT) method was deemed the best DTW method for aligning and synchronizing a milk chocolate conching dataset, presenting 93.7% accuracy, 97.2% sensitivity and 90.3% specificity in batch classification, being considered the best option to determine the reference set for the milk chocolate dataset. Such method was recommended due to the lowest number of iterations required to achieve convergence and highest average accuracy in the testing portion using the KNN classification technique.

Keywords: batch process monitoring, chocolate conching, dynamic time warping, reference set distribution, variable duration

Procedia PDF Downloads 167
1475 Spatio-Temporal Pest Risk Analysis with ‘BioClass’

Authors: Vladimir A. Todiras

Abstract:

Spatio-temporal models provide new possibilities for real-time action in pest risk analysis. It should be noted that estimation of the possibility and probability of introduction of a pest and of its economic consequences involves many uncertainties. We present a new mapping technique that assesses pest invasion risk using online BioClass software. BioClass is a GIS tool designed to solve multiple-criteria classification and optimization problems based on fuzzy logic and level set methods. This research describes a method for predicting the potential establishment and spread of a plant pest into new areas using a case study: corn rootworm (Diabrotica spp.), tomato leaf miner (Tuta absoluta) and plum fruit moth (Grapholita funebrana). Our study demonstrated that in BioClass we can combine fuzzy logic and geographic information systems with knowledge of pest biology and environmental data to derive new information for decision making. Pests are sensitive to a warming climate, as temperature greatly affects their survival and reproductive rate and capacity. Changes have been observed in the distribution, frequency and severity of outbreaks of Helicoverpa armigera on tomato. BioClass has demonstrated to be a powerful tool for applying dynamic models and map the potential future distribution of a species, enable resource to make decisions about dangerous and invasive species management and control.

Keywords: classification, model, pest, risk

Procedia PDF Downloads 282
1474 A Proposed Treatment Protocol for the Management of Pars Interarticularis Pathology in Children and Adolescents

Authors: Paul Licina, Emma M. Johnston, David Lisle, Mark Young, Chris Brady

Abstract:

Background: Lumbar pars pathology is a common cause of pain in the growing spine. It can be seen in young athletes participating in at-risk sports and can affect sporting performance and long-term health due to its resistance to traditional management. There is a current lack of consensus of classification and treatment for pars injuries. Previous systems used CT to stage pars defects but could not assess early stress reactions. A modified classification is proposed that considers findings on MRI, significantly improving early treatment guidance. The treatment protocol is designed for patients aged 5 to 19 years. Method: Clinical screening identifies patients with a low, medium, or high index of suspicion for lumbar pars injury using patient age, sport participation and pain characteristics. MRI of the at-risk cohort enables augmentation of existing CT-based classification while avoiding ionising radiation. Patients are classified into five categories based on MRI findings. A type 0 lesion (stress reaction) is present when CT is normal and MRI shows high signal change (HSC) in the pars/pedicle on T2 images. A type 1 lesion represents the ‘early defect’ CT classification. The group previously referred to as a 'progressive stage' defect on CT can be split into 2A and 2B categories. 2As have HSC on MRI, whereas 2Bs do not. This distinction is important with regard to healing potential. Type 3 lesions are terminal stage defects on CT, characterised by pseudarthrosis. MRI shows no HSC. Results: Stress reactions (type 0) and acute fractures (1 and 2a) can heal and are treated in a custom-made hard brace for 12 weeks. It is initially worn 23 hours per day. At three weeks, patients commence basic core rehabilitation. At six weeks, in the absence of pain, the brace is removed for sleeping. Exercises are progressed to positions of daily living. Patients with continued pain remain braced 23 hours per day without exercise progression until becoming symptom-free. At nine weeks, patients commence supervised exercises out of the brace for 30 minutes each day. This allows them to re-learn muscular control without rigid support of the brace. At 12 weeks, bracing ceases and MRI is repeated. For patients with near or complete resolution of bony oedema and healing of any cortical defect, rehabilitation is focused on strength and conditioning and sport-specific exercise for the full return to activity. The length of this final stage is approximately nine weeks but depends on factors such as development and level of sports participation. If significant HSC remains on MRI, CT scan is considered to definitively assess cortical defect healing. For these patients, return to high-risk sports is delayed for up to three months. Chronic defects (2b and 3) cannot heal and are not braced, and rehabilitation follows traditional protocols. Conclusion: Appropriate clinical screening and imaging with MRI can identify pars pathology early. In those with potential for healing, we propose hard bracing and appropriate rehabilitation as part of a multidisciplinary management protocol. The validity of this protocol will be tested in future studies.

Keywords: adolescents, MRI classification, pars interticularis, treatment protocol

Procedia PDF Downloads 153
1473 A Dynamic Solution Approach for Heart Disease Prediction

Authors: Walid Moudani

Abstract:

The healthcare environment is generally perceived as being information rich yet knowledge poor. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. In fact, valuable knowledge can be discovered from application of data mining techniques in healthcare system. In this study, a proficient methodology for the extraction of significant patterns from the coronary heart disease warehouses for heart attack prediction, which unfortunately continues to be a leading cause of mortality in the whole world, has been presented. For this purpose, we propose to enumerate dynamically the optimal subsets of the reduced features of high interest by using rough sets technique associated to dynamic programming. Therefore, we propose to validate the classification using Random Forest (RF) decision tree to identify the risky heart disease cases. This work is based on a large amount of data collected from several clinical institutions based on the medical profile of patient. Moreover, the experts’ knowledge in this field has been taken into consideration in order to define the disease, its risk factors, and to establish significant knowledge relationships among the medical factors. A computer-aided system is developed for this purpose based on a population of 525 adults. The performance of the proposed model is analyzed and evaluated based on set of benchmark techniques applied in this classification problem.

Keywords: multi-classifier decisions tree, features reduction, dynamic programming, rough sets

Procedia PDF Downloads 410
1472 Use of Machine Learning Algorithms to Pediatric MR Images for Tumor Classification

Authors: I. Stathopoulos, V. Syrgiamiotis, E. Karavasilis, A. Ploussi, I. Nikas, C. Hatzigiorgi, K. Platoni, E. P. Efstathopoulos

Abstract:

Introduction: Brain and central nervous system (CNS) tumors form the second most common group of cancer in children, accounting for 30% of all childhood cancers. MRI is the key imaging technique used for the visualization and management of pediatric brain tumors. Initial characterization of tumors from MRI scans is usually performed via a radiologist’s visual assessment. However, different brain tumor types do not always demonstrate clear differences in visual appearance. Using only conventional MRI to provide a definite diagnosis could potentially lead to inaccurate results, and so histopathological examination of biopsy samples is currently considered to be the gold standard for obtaining definite diagnoses. Machine learning is defined as the study of computational algorithms that can use, complex or not, mathematical relationships and patterns from empirical and scientific data to make reliable decisions. Concerning the above, machine learning techniques could provide effective and accurate ways to automate and speed up the analysis and diagnosis for medical images. Machine learning applications in radiology are or could potentially be useful in practice for medical image segmentation and registration, computer-aided detection and diagnosis systems for CT, MR or radiography images and functional MR (fMRI) images for brain activity analysis and neurological disease diagnosis. Purpose: The objective of this study is to provide an automated tool, which may assist in the imaging evaluation and classification of brain neoplasms in pediatric patients by determining the glioma type, grade and differentiating between different brain tissue types. Moreover, a future purpose is to present an alternative way of quick and accurate diagnosis in order to save time and resources in the daily medical workflow. Materials and Methods: A cohort, of 80 pediatric patients with a diagnosis of posterior fossa tumor, was used: 20 ependymomas, 20 astrocytomas, 20 medulloblastomas and 20 healthy children. The MR sequences used, for every single patient, were the following: axial T1-weighted (T1), axial T2-weighted (T2), FluidAttenuated Inversion Recovery (FLAIR), axial diffusion weighted images (DWI), axial contrast-enhanced T1-weighted (T1ce). From every sequence only a principal slice was used that manually traced by two expert radiologists. Image acquisition was carried out on a GE HDxt 1.5-T scanner. The images were preprocessed following a number of steps including noise reduction, bias-field correction, thresholding, coregistration of all sequences (T1, T2, T1ce, FLAIR, DWI), skull stripping, and histogram matching. A large number of features for investigation were chosen, which included age, tumor shape characteristics, image intensity characteristics and texture features. After selecting the features for achieving the highest accuracy using the least number of variables, four machine learning classification algorithms were used: k-Nearest Neighbour, Support-Vector Machines, C4.5 Decision Tree and Convolutional Neural Network. The machine learning schemes and the image analysis are implemented in the WEKA platform and MatLab platform respectively. Results-Conclusions: The results and the accuracy of images classification for each type of glioma by the four different algorithms are still on process.

Keywords: image classification, machine learning algorithms, pediatric MRI, pediatric oncology

Procedia PDF Downloads 149
1471 Changes in Financial Reporting of Polish Entities Resulting from the Implementation of Directive 34/EU and Evaluation of the Changes by Accountants

Authors: Piotr Prewysz-Kwinto, Grazyna Voss

Abstract:

In June 2013, the European Parliament and the Council adopted a directive on financial reporting (Directive 2013/34/EU). The main objective was to simplify the principles of the preparation of financial statements, including the principles of the presentation and disclosures of financial information by adapting reporting burdens to the type and size of an undertaking. Therefore, the Directive introduced a classification of all undertakings into five groups, i.e. micro, small, medium-sized, large and public-interest entities, and defined in detail the classification criteria. The principles of the preparation of financial statements and the presentation of financial information as well as applicable simplifications were defined for each group. The EU Member States had to implement the provisions of Directive 34 relating to accounting and financial reporting into domestic norms until January 1, 2016. In Poland, the provisions of Directive 34 were implemented into domestic accounting norms specified in the Polish Accounting Act on a gradual basis. On July 11, 2014, the Polish Parliament adopted an amendment to the Act, introducing the Directive's solutions for micro-undertakings and on July 23, 2015, for the remaining undertakings. The aim of this paper is to present Polish solutions relating to financial reporting after the implementation of Directive 34 and the results of the survey conducted among accountants regarding the evaluation of the implemented simplifications for micro and small undertakings.

Keywords: accounting standards, financial reporting, financial statement, simplification

Procedia PDF Downloads 278
1470 Information Management Approach in the Prediction of Acute Appendicitis

Authors: Ahmad Shahin, Walid Moudani, Ali Bekraki

Abstract:

This research aims at presenting a predictive data mining model to handle an accurate diagnosis of acute appendicitis with patients for the purpose of maximizing the health service quality, minimizing morbidity/mortality, and reducing cost. However, acute appendicitis is the most common disease which requires timely accurate diagnosis and needs surgical intervention. Although the treatment of acute appendicitis is simple and straightforward, its diagnosis is still difficult because no single sign, symptom, laboratory or image examination accurately confirms the diagnosis of acute appendicitis in all cases. This contributes in increasing morbidity and negative appendectomy. In this study, the authors propose to generate an accurate model in prediction of patients with acute appendicitis which is based, firstly, on the segmentation technique associated to ABC algorithm to segment the patients; secondly, on applying fuzzy logic to process the massive volume of heterogeneous and noisy data (age, sex, fever, white blood cell, neutrophilia, CRP, urine, ultrasound, CT, appendectomy, etc.) in order to express knowledge and analyze the relationships among data in a comprehensive manner; and thirdly, on applying dynamic programming technique to reduce the number of data attributes. The proposed model is evaluated based on a set of benchmark techniques and even on a set of benchmark classification problems of osteoporosis, diabetes and heart obtained from the UCI data and other data sources.

Keywords: healthcare management, acute appendicitis, data mining, classification, decision tree

Procedia PDF Downloads 350
1469 Adversarial Attacks and Defenses on Deep Neural Networks

Authors: Jonathan Sohn

Abstract:

Deep neural networks (DNNs) have shown state-of-the-art performance for many applications, including computer vision, natural language processing, and speech recognition. Recently, adversarial attacks have been studied in the context of deep neural networks, which aim to alter the results of deep neural networks by modifying the inputs slightly. For example, an adversarial attack on a DNN used for object detection can cause the DNN to miss certain objects. As a result, the reliability of DNNs is undermined by their lack of robustness against adversarial attacks, raising concerns about their use in safety-critical applications such as autonomous driving. In this paper, we focus on studying the adversarial attacks and defenses on DNNs for image classification. There are two types of adversarial attacks studied which are fast gradient sign method (FGSM) attack and projected gradient descent (PGD) attack. A DNN forms decision boundaries that separate the input images into different categories. The adversarial attack slightly alters the image to move over the decision boundary, causing the DNN to misclassify the image. FGSM attack obtains the gradient with respect to the image and updates the image once based on the gradients to cross the decision boundary. PGD attack, instead of taking one big step, repeatedly modifies the input image with multiple small steps. There is also another type of attack called the target attack. This adversarial attack is designed to make the machine classify an image to a class chosen by the attacker. We can defend against adversarial attacks by incorporating adversarial examples in training. Specifically, instead of training the neural network with clean examples, we can explicitly let the neural network learn from the adversarial examples. In our experiments, the digit recognition accuracy on the MNIST dataset drops from 97.81% to 39.50% and 34.01% when the DNN is attacked by FGSM and PGD attacks, respectively. If we utilize FGSM training as a defense method, the classification accuracy greatly improves from 39.50% to 92.31% for FGSM attacks and from 34.01% to 75.63% for PGD attacks. To further improve the classification accuracy under adversarial attacks, we can also use a stronger PGD training method. PGD training improves the accuracy by 2.7% under FGSM attacks and 18.4% under PGD attacks over FGSM training. It is worth mentioning that both FGSM and PGD training do not affect the accuracy of clean images. In summary, we find that PGD attacks can greatly degrade the performance of DNNs, and PGD training is a very effective way to defend against such attacks. PGD attacks and defence are overall significantly more effective than FGSM methods.

Keywords: deep neural network, adversarial attack, adversarial defense, adversarial machine learning

Procedia PDF Downloads 194
1468 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 328
1467 Determination of the Effective Economic and/or Demographic Indicators in Classification of European Union Member and Candidate Countries Using Partial Least Squares Discriminant Analysis

Authors: Esra Polat

Abstract:

Partial Least Squares Discriminant Analysis (PLSDA) is a statistical method for classification and consists a classical Partial Least Squares Regression (PLSR) in which the dependent variable is a categorical one expressing the class membership of each observation. PLSDA can be applied in many cases when classical discriminant analysis cannot be applied. For example, when the number of observations is low and when the number of independent variables is high. When there are missing values, PLSDA can be applied on the data that is available. Finally, it is adapted when multicollinearity between independent variables is high. The aim of this study is to determine the economic and/or demographic indicators, which are effective in grouping the 28 European Union (EU) member countries and 7 candidate countries (including potential candidates Bosnia and Herzegovina (BiH) and Kosova) by using the data set obtained from database of the World Bank for 2014. Leaving the political issues aside, the analysis is only concerned with the economic and demographic variables that have the potential influence on country’s eligibility for EU entrance. Hence, in this study, both the performance of PLSDA method in classifying the countries correctly to their pre-defined groups (candidate or member) and the differences between the EU countries and candidate countries in terms of these indicators are analyzed. As a result of the PLSDA, the value of percentage correctness of 100 % indicates that overall of the 35 countries is classified correctly. Moreover, the most important variables that determine the statuses of member and candidate countries in terms of economic indicators are identified as 'external balance on goods and services (% GDP)', 'gross domestic savings (% GDP)' and 'gross national expenditure (% GDP)' that means for the 2014 economical structure of countries is the most important determinant of EU membership. Subsequently, the model validated to prove the predictive ability by using the data set for 2015. For prediction sample, %97,14 of the countries are correctly classified. An interesting result is obtained for only BiH, which is still a potential candidate for EU, predicted as a member of EU by using the indicators data set for 2015 as a prediction sample. Although BiH has made a significant transformation from a war-torn country to a semi-functional state, ethnic tensions, nationalistic rhetoric and political disagreements are still evident, which inhibit Bosnian progress towards the EU.

Keywords: classification, demographic indicators, economic indicators, European Union, partial least squares discriminant analysis

Procedia PDF Downloads 280
1466 Detection and Classification Strabismus Using Convolutional Neural Network and Spatial Image Processing

Authors: Anoop T. R., Otman Basir, Robert F. Hess, Eileen E. Birch, Brooke A. Koritala, Reed M. Jost, Becky Luu, David Stager, Ben Thompson

Abstract:

Strabismus refers to a misalignment of the eyes. Early detection and treatment of strabismus in childhood can prevent the development of permanent vision loss due to abnormal development of visual brain areas. We developed a two-stage method for strabismus detection and classification based on photographs of the face. The first stage detects the presence or absence of strabismus, and the second stage classifies the type of strabismus. The first stage comprises face detection using Haar cascade, facial landmark estimation, face alignment, aligned face landmark detection, segmentation of the eye region, and detection of strabismus using VGG 16 convolution neural networks. Face alignment transforms the face to a canonical pose to ensure consistency in subsequent analysis. Using facial landmarks, the eye region is segmented from the aligned face and fed into a VGG 16 CNN model, which has been trained to classify strabismus. The CNN determines whether strabismus is present and classifies the type of strabismus (exotropia, esotropia, and vertical deviation). If stage 1 detects strabismus, the eye region image is fed into stage 2, which starts with the estimation of pupil center coordinates using mask R-CNN deep neural networks. Then, the distance between the pupil coordinates and eye landmarks is calculated along with the angle that the pupil coordinates make with the horizontal and vertical axis. The distance and angle information is used to characterize the degree and direction of the strabismic eye misalignment. This model was tested on 100 clinically labeled images of children with (n = 50) and without (n = 50) strabismus. The True Positive Rate (TPR) and False Positive Rate (FPR) of the first stage were 94% and 6% respectively. The classification stage has produced a TPR of 94.73%, 94.44%, and 100% for esotropia, exotropia, and vertical deviations, respectively. This method also had an FPR of 5.26%, 5.55%, and 0% for esotropia, exotropia, and vertical deviation, respectively. The addition of one more feature related to the location of corneal light reflections may reduce the FPR, which was primarily due to children with pseudo-strabismus (the appearance of strabismus due to a wide nasal bridge or skin folds on the nasal side of the eyes).

Keywords: strabismus, deep neural networks, face detection, facial landmarks, face alignment, segmentation, VGG 16, mask R-CNN, pupil coordinates, angle deviation, horizontal and vertical deviation

Procedia PDF Downloads 93
1465 Deep Feature Augmentation with Generative Adversarial Networks for Class Imbalance Learning in Medical Images

Authors: Rongbo Shen, Jianhua Yao, Kezhou Yan, Kuan Tian, Cheng Jiang, Ke Zhou

Abstract:

This study proposes a generative adversarial networks (GAN) framework to perform synthetic sampling in feature space, i.e., feature augmentation, to address the class imbalance problem in medical image analysis. A feature extraction network is first trained to convert images into feature space. Then the GAN framework incorporates adversarial learning to train a feature generator for the minority class through playing a minimax game with a discriminator. The feature generator then generates features for minority class from arbitrary latent distributions to balance the data between the majority class and the minority class. Additionally, a data cleaning technique, i.e., Tomek link, is employed to clean up undesirable conflicting features introduced from the feature augmentation and thus establish well-defined class clusters for the training. The experiment section evaluates the proposed method on two medical image analysis tasks, i.e., mass classification on mammogram and cancer metastasis classification on histopathological images. Experimental results suggest that the proposed method obtains superior or comparable performance over the state-of-the-art counterparts. Compared to all counterparts, our proposed method improves more than 1.5 percentage of accuracy.

Keywords: class imbalance, synthetic sampling, feature augmentation, generative adversarial networks, data cleaning

Procedia PDF Downloads 127
1464 Classification of Emotions in Emergency Call Center Conversations

Authors: Magdalena Igras, Joanna Grzybowska, Mariusz Ziółko

Abstract:

The study of emotions expressed in emergency phone call is presented, covering both statistical analysis of emotions configurations and an attempt to automatically classify emotions. An emergency call is a situation usually accompanied by intense, authentic emotions. They influence (and may inhibit) the communication between caller and responder. In order to support responders in their responsible and psychically exhaustive work, we studied when and in which combinations emotions appeared in calls. A corpus of 45 hours of conversations (about 3300 calls) from emergency call center was collected. Each recording was manually tagged with labels of emotions valence (positive, negative or neutral), type (sadness, tiredness, anxiety, surprise, stress, anger, fury, calm, relief, compassion, satisfaction, amusement, joy) and arousal (weak, typical, varying, high) on the basis of perceptual judgment of two annotators. As we concluded, basic emotions tend to appear in specific configurations depending on the overall situational context and attitude of speaker. After performing statistical analysis we distinguished four main types of emotional behavior of callers: worry/helplessness (sadness, tiredness, compassion), alarm (anxiety, intense stress), mistake or neutral request for information (calm, surprise, sometimes with amusement) and pretension/insisting (anger, fury). The frequency of profiles was respectively: 51%, 21%, 18% and 8% of recordings. A model of presenting the complex emotional profiles on the two-dimensional (tension-insecurity) plane was introduced. In the stage of acoustic analysis, a set of prosodic parameters, as well as Mel-Frequency Cepstral Coefficients (MFCC) were used. Using these parameters, complex emotional states were modeled with machine learning techniques including Gaussian mixture models, decision trees and discriminant analysis. Results of classification with several methods will be presented and compared with the state of the art results obtained for classification of basic emotions. Future work will include optimization of the algorithm to perform in real time in order to track changes of emotions during a conversation.

Keywords: acoustic analysis, complex emotions, emotion recognition, machine learning

Procedia PDF Downloads 398
1463 Methodology for Temporary Analysis of Production and Logistic Systems on the Basis of Distance Data

Authors: M. Mueller, M. Kuehn, M. Voelker

Abstract:

In small and medium-sized enterprises (SMEs), the challenge is to create a well-grounded and reliable basis for process analysis, optimization and planning due to a lack of data. SMEs have limited access to methods with which they can effectively and efficiently analyse processes and identify cause-and-effect relationships in order to generate the necessary database and derive optimization potential from it. The implementation of digitalization within the framework of Industry 4.0 thus becomes a particular necessity for SMEs. For these reasons, the abstract presents an analysis methodology that is subject to the objective of developing an SME-appropriate methodology for efficient, temporarily feasible data collection and evaluation in flexible production and logistics systems as a basis for process analysis and optimization. The overall methodology focuses on retrospective, event-based tracing and analysis of material flow objects. The technological basis consists of Bluetooth low energy (BLE)-based transmitters, so-called beacons, and smart mobile devices (SMD), e.g. smartphones as receivers, between which distance data can be measured and derived motion profiles. The distance is determined using the Received Signal Strength Indicator (RSSI), which is a measure of signal field strength between transmitter and receiver. The focus is the development of a software-based methodology for interpretation of relative movements of transmitters and receivers based on distance data. The main research is on selection and implementation of pattern recognition methods for automatic process recognition as well as methods for the visualization of relative distance data. Due to an existing categorization of the database regarding process types, classification methods (e.g. Support Vector Machine) from the field of supervised learning are used. The necessary data quality requires selection of suitable methods as well as filters for smoothing occurring signal variations of the RSSI, the integration of methods for determination of correction factors depending on possible signal interference sources (columns, pallets) as well as the configuration of the used technology. The parameter settings on which respective algorithms are based have a further significant influence on result quality of the classification methods, correction models and methods for visualizing the position profiles used. The accuracy of classification algorithms can be improved up to 30% by selected parameter variation; this has already been proven in studies. Similar potentials can be observed with parameter variation of methods and filters for signal smoothing. Thus, there is increased interest in obtaining detailed results on the influence of parameter and factor combinations on data quality in this area. The overall methodology is realized with a modular software architecture consisting of independently modules for data acquisition, data preparation and data storage. The demonstrator for initialization and data acquisition is available as mobile Java-based application. The data preparation, including methods for signal smoothing, are Python-based with the possibility to vary parameter settings and to store them in the database (SQLite). The evaluation is divided into two separate software modules with database connection: the achievement of an automated assignment of defined process classes to distance data using selected classification algorithms and the visualization as well as reporting in terms of a graphical user interface (GUI).

Keywords: event-based tracing, machine learning, process classification, parameter settings, RSSI, signal smoothing

Procedia PDF Downloads 131