Search results for: classification of big data actors
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8153

7463 Detecting HCC Tumor in Three Phasic CT Liver Images with Optimization of Neural Network

Authors: Mahdieh Khalilinezhad, Silvana Dellepiane, Gianni Vernazza

Abstract:

The aim of this work is to build a model based on tissue characterization that is able to discriminate pathological and non-pathological regions from three-phasic CT images. Based on feature selection in different phases, we aim to design a neural network system with an optimal number of neurons in the hidden layer. Our approach consists of three steps: feature selection, feature reduction, and classification. For each region of interest (ROI), six distinct sets of texture features are extracted: first-order histogram parameters, absolute gradient, run-length matrix, co-occurrence matrix, autoregressive model, and wavelet, for a total of 270 texture features. When analyzing more phases, we show that the injection of liquid causes changes to the highly relevant features in each region. Our results demonstrate that, for detecting HCC tumors, phase 3 is the best one for most of the features that we apply to the classification algorithm. The detection accuracy between pathological and healthy classes, according to our method, is obtained with first-order histogram parameters: 85% in phase 1, 95% in phase 2, and 95% in phase 3.

Keywords: Feature selection, Multi-phasic liver images, Neural network, Texture analysis.

Downloads: 2534
7462 Eclectic Rule-Extraction from Support Vector Machines

Authors: Nahla Barakat, Joachim Diederich

Abstract:

Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability, which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors as well as the parameters associated with them. The approach includes three stages: training, propositional rule-extraction and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.

Keywords: Data mining, hybrid rule-extraction algorithms, medical diagnosis, SVMs

Downloads: 1707
7461 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The need for ever-increasing use of distributed data in computer networks is evident. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics for some overus scenario and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads: 1634
7460 Information Retrieval: A Comparative Study of Textual Indexing Using an Oriented Object Database (db4o) and the Inverted File

Authors: Mohammed Erritali

Abstract:

The growth over centuries in the volume of text data, such as books and articles in libraries, has made it necessary to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (a corpus), allowing a user to find the relevant ones, that is to say, the contents which match the user's information needs. Most information retrieval models use a specific data structure to index a corpus, called an "inverted file" or "reverse index". This inverted file collects information on all terms over the corpus documents, specifying the identifiers of the documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word, and so on. In this paper we use an object-oriented database (db4o) instead of the inverted file; that is to say, instead of searching for a term in the inverted file, we search for it in the db4o database. The purpose of this work is a comparative study to see whether object-oriented databases can compete with the inverted index in terms of access speed and resource consumption using a large volume of data.
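
The inverted file described in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the paper's implementation: the function names `build_inverted_index` and `lookup`, the whitespace tokenizer, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text.

    Tokenization is a plain lowercase whitespace split (an assumption made
    for brevity; real systems normalize and stem terms).
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def lookup(index, term):
    """Return {doc_id: term frequency} for a single query term."""
    postings = index.get(term.lower(), {})
    return {doc_id: len(positions) for doc_id, positions in postings.items()}


# Hypothetical two-document corpus.
docs = {
    1: "big data needs big indexes",
    2: "an index maps terms to documents",
}
index = build_inverted_index(docs)
print(lookup(index, "big"))
```

Each posting stores the document identifier, the positions of the term, and (through the length of the position list) its frequency, which are the three pieces of information the abstract attributes to the inverted file.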

Keywords: Information Retrieval, indexation, oriented object database (db4o), inverted file.

Downloads: 1734
7459 e/b-Learning Activities and High School Pedagogy

Authors: Rui Antunes

Abstract:

This article presents the implementation of several different e/b-Learning collaborative activities, used to improve the students' learning process in a high school Polytechnic Institution. A new learning model arises, based on a combination of face-to-face and distance learning. Learning is now becoming centered on the development of collaborative activities, and its actors (teachers and students) have to be re-socialized to a new e/b-Learning paradigm. Measuring approaches are proposed for this model and results are presented, showing a prospective correlation between students' learning success and the use of online collaborative activities.

Keywords: e/b-Learning, Collaborative Learning, Teaching Communities, Web-based Courseware

Downloads: 1703
7458 Statistics over Lyapunov Exponents for Feature Extraction: Electroencephalographic Changes Detection Case

Authors: Elif Derya UBEYLI, Inan GULER

Abstract:

A new approach based on the consideration that electroencephalogram (EEG) signals are chaotic signals was presented for automated diagnosis of electroencephalographic changes. This consideration was tested successfully using nonlinear dynamics tools, such as the computation of Lyapunov exponents. This paper presented the usage of statistics over the set of the Lyapunov exponents in order to reduce the dimensionality of the extracted feature vectors. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Multilayer perceptron neural network (MLPNN) architectures were formulated and used as a basis for detection of electroencephalographic changes. Three types of EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures) were classified. The selected Lyapunov exponents of the EEG signals were used as inputs of the MLPNN trained with the Levenberg-Marquardt algorithm. The classification results confirmed that the proposed MLPNN has potential in detecting the electroencephalographic changes.

Keywords: Chaotic signal, Electroencephalogram (EEG) signals, Feature extraction/selection, Lyapunov exponents

Downloads: 2507
7457 Use of Data of the Remote Sensing for Spatiotemporal Analysis Land Use Changes in the Eastern Aurès (Algeria)

Authors: A. Bouzekri, H. Benmassaud

Abstract:

The Aurès region is one of the arid and semi-arid areas that have suffered climate crises and overexploitation of natural resources, which have led to significant land degradation. The use of remote sensing data allowed us to analyze the land and its spatiotemporal changes in the Aurès between 1987 and 2013. For this work, we adopted a method of analysis based on Landsat TM 1987 and Landsat OLI 2013 satellite images, using supervised likelihood classification coupled with field surveys from the missions of May and September 2013. Using ENVI EX software to superimpose the ground cover maps from 1987 and 2013, one can extract a spatial change map of the different land cover units. The results show that between 1987 and 2013 vegetation suffered negative changes, namely significant degradation of forests and steppe rangelands, while sandy soils and bare land recorded a considerable increase. The spatial change map of land cover units between 1987 and 2013 allows us to understand the extensive or regressive orientation of vegetation and soil. This map shows that dense forests give way to clear forests, that steppe vegetation develops from degraded forest vegetation, and that bare, sandy soils gain large steppe surfaces, which explains their remarkable extension. The analysis of remote sensing data highlights the profound changes in our environment over time and supports quantitative monitoring of the risk of desertification.

Keywords: Aurès, Land use, remote sensing, spatiotemporal.

Downloads: 5029
7456 Manufacturers-Retailers: The New Actor in the U.S. Furniture Industry. Characteristics and Implications for the Chinese Industry

Authors: Lidia Martínez Murillo

Abstract:

Since the 1990s, the American furniture industry has faced a transition period. Manufacturers, one of its most important actors, have entered the retail industry. This shift has had deep consequences not only for the American furniture industry as a whole, but also for other international furniture industries, especially the Chinese. The present work aims to analyze this actor based on the distinction provided by the Global Commodity Chain Theory. It stresses this actor's characteristics, structure, mode of operation and importance for both the U.S. and the Chinese furniture industries.

Keywords: M&RC, blended strategy, U.S. furniture industry, Chinese furniture industry.

Downloads: 1968
7455 Computer Aided Diagnostic System for Detection and Classification of a Brain Tumor through MRI Using Level Set Based Segmentation Technique and ANN Classifier

Authors: Atanu K Samanta, Asim Ali Khan

Abstract:

Due to the acquisition of huge amounts of brain tumor magnetic resonance images (MRI) in clinics, it is very difficult for radiologists to manually interpret and segment these images within a reasonable span of time. Computer-aided diagnosis (CAD) systems can enhance the diagnostic capabilities of radiologists and reduce the time required for accurate diagnosis. An intelligent computer-aided technique for automatic detection of a brain tumor through MRI is presented in this paper. The technique uses the following computational methods: the Level Set method for segmentation of a brain tumor from other brain parts, extraction of features from this segmented tumor portion using the gray-level co-occurrence matrix (GLCM), and an Artificial Neural Network (ANN) to classify brain tumor images according to their respective types. The entire work is carried out on 50 images covering five types of brain tumor. The overall classification accuracy using this method is found to be 98%, which is significantly good.
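
To make the GLCM feature-extraction step concrete, here is a minimal pure-Python sketch of a gray-level co-occurrence matrix and one texture feature (contrast) derived from it. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the single horizontal offset, and the tiny 3-level image are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: sum of (i - j)^2 weighted by normalized counts."""
    total = sum(map(sum, matrix))
    return sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    ) / total


# Hypothetical 3x3 patch with gray levels 0..2.
patch = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
cooccurrence = glcm(patch, levels=3)
print(cooccurrence, contrast(cooccurrence))
```

In practice several offsets and several features (energy, homogeneity, correlation) are usually computed and concatenated into the feature vector passed to the classifier.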

Keywords: Artificial neural network, ANN, brain tumor, computer-aided diagnostic, CAD system, gray-level co-occurrence matrix, GLCM, level set method, tumor segmentation.

Downloads: 1363
7454 Integrating Computational Intelligence Techniques and Assessment Agents in ELearning Environments

Authors: Konstantinos C. Giotopoulos, Christos E. Alexakos, Grigorios N. Beligiannis, Spiridon D. Likothanassis

Abstract:

In this contribution an innovative platform is presented that integrates intelligent agents and evolutionary computation techniques in legacy e-learning environments. It introduces the design and development of a scalable and interoperable integration platform supporting: I) various assessment agents for e-learning environments, II) a specific resource retrieval agent for the provision of additional information from Internet sources matching the needs and profile of the specific user and III) a genetic algorithm designed to extract efficient information (classifying rules) based on the students' answering input data. The agents are implemented in order to provide intelligent assessment services based on computational intelligence techniques such as Bayesian Networks and Genetic Algorithms. The proposed Genetic Algorithm (GA) is used in order to extract efficient information (classifying rules) based on the students' answering input data. The idea of using a GA to fulfil this difficult task came from the fact that GAs have been widely used in applications including classification of unknown data. The utilization of new and emerging technologies like web services allows integrating the provided services into any web-based legacy e-learning environment.

Keywords: Bayesian Networks, Computational Intelligence techniques, E-learning legacy systems, Service Oriented Integration, Intelligent Agents, Genetic Algorithms.

Downloads: 1743
7453 Artificial Neural Networks for Classifying Magnetic Measurements in Tokamak Reactors

Authors: A. Greco, N. Mammone, F.C. Morabito, M. Versaci

Abstract:

This paper is mainly concerned with the application of a novel technique of data interpretation to the characterization and classification of measurements of plasma columns in Tokamak reactors for nuclear fusion applications. The proposed method exploits several concepts derived from soft computing theory. In particular, Artificial Neural Networks have been exploited to classify magnetic variables useful to determine the shape and position of the plasma with a reduced computational complexity. The proposed technique is used to analyze simulated databases of plasma equilibria based on the ITER geometry configuration. As well as demonstrating the successful recovery of scalar equilibrium parameters, we show that the technique can yield practical advantages compared with earlier methods.

Keywords: Tokamak, sensors, artificial neural network.

Downloads: 1821
7452 Manufacturers-Retailers: The New Actor in the U.S. Furniture Industry. Characteristics and Implications for the Chinese Furniture Industry

Authors: Lidia Martínez Murillo

Abstract:

Since the 1990s, the American furniture industry has faced a transition period. Manufacturers, one of its most important actors, have entered the retail industry. This shift has had deep consequences not only for the American furniture industry as a whole, but also for other international furniture industries, especially the Chinese. The present work aims to analyze this actor based on the distinction provided by the Global Commodity Chain Theory. It stresses this actor's characteristics, structure, mode of operation and importance for both the U.S. and the Chinese furniture industries.

Keywords: M&RC, blended strategy, U.S. furniture industry, Chinese furniture industry.

Downloads: 2255
7451 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data is difficult in many ways, especially when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 2008
7450 Performance Comparison and Evaluation of AdaBoost and SoftBoost Algorithms on Generic Object Recognition

Authors: Doaa Hegazy, Joachim Denzler

Abstract:

SoftBoost is a recently presented boosting algorithm, which trades off the size of the achieved classification margin and generalization performance. This paper presents a performance evaluation of the SoftBoost algorithm on the generic object recognition problem. An appearance-based generic object recognition model is used. The evaluation experiments are performed using a difficult object recognition benchmark. An assessment with respect to different degrees of label noise, as well as a comparison to the well-known AdaBoost algorithm, is performed. The obtained results reveal that SoftBoost is recommended in cases when the training data is known to have a high degree of noise. Otherwise, using AdaBoost can achieve better performance.

Keywords: SoftBoost algorithm, AdaBoost algorithm, Generic object recognition.

Downloads: 1828
7449 European Radical Right Parties as Actors in Securitization of Migration

Authors: Mehmet Gökay Özerim

Abstract:

This study reveals that anti-immigrant policies in Europe result from a process of securitization, and that, within this process, radical right parties have been formulating discourses and approaches through a construction process by using some common security themes. These security themes can be classified as national security, economic security, cultural security and internal security. The frequency with which radical right parties use these themes may vary according to the specific historical, social and cultural characteristics of a particular country.

Keywords: European Union, International Migration, Radical Right Parties, Securitization.

Downloads: 3411
7448 Improving Activity Recognition Classification of Repetitious Beginner Swimming Using a 2-Step Peak/Valley Segmentation Method with Smoothing and Resampling for Machine Learning

Authors: Larry Powell, Seth Polsley, Drew Casey, Tracy Hammond

Abstract:

Human activity recognition (HAR) systems have shown positive performance when recognizing repetitive activities like walking, running, and sleeping. Water-based activities are a reasonably new area for activity recognition. However, water-based activity recognition has largely focused on supporting the elite and competitive swimming population, which already has amazing coordination and proper form. Beginner swimmers are not perfect, and activity recognition needs to support the individual motions to help beginners. Activity recognition algorithms are traditionally built around short segments of timed sensor data. Using a time window input can cause performance issues in the machine learning model. The window’s size can be too small or large, requiring careful tuning and precise data segmentation. In this work, we present a method that uses a time window as the initial segmentation, then separates the data based on the change in the sensor value. Our system uses a multi-phase segmentation method that pulls all peaks and valleys for each axis of an accelerometer placed on the swimmer’s lower back. This results in high recognition performance using leave-one-subject-out validation on our study with 20 beginner swimmers, with our model optimized from our final dataset resulting in an F-Score of 0.95.
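
The idea of cutting a time window at the peaks and valleys of a sensor axis can be sketched as follows. This is a simplified illustration, not the authors' 2-step method: it omits the smoothing and resampling named in the title, ignores plateaus, and the function names and sample window are hypothetical.

```python
def find_extrema(signal):
    """Return (peaks, valleys) as index lists of strict local extrema."""
    peaks, valleys = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            peaks.append(i)
        elif signal[i - 1] > signal[i] < signal[i + 1]:
            valleys.append(i)
    return peaks, valleys


def segment_at_extrema(signal):
    """Split one window into sub-segments bounded by consecutive extrema."""
    peaks, valleys = find_extrema(signal)
    cuts = sorted(set([0] + peaks + valleys + [len(signal) - 1]))
    # Adjacent segments share their boundary sample.
    return [signal[a:b + 1] for a, b in zip(cuts, cuts[1:])]


# Hypothetical accelerometer readings for one axis within a time window.
window = [0, 2, 5, 3, 1, 4, 6, 2]
print(find_extrema(window))
print(segment_at_extrema(window))
```

Each resulting segment covers a single rise or fall of the signal, so features extracted per segment describe one individual motion rather than an arbitrary slice of time.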

Keywords: Time window, peak/valley segmentation, feature extraction, beginner swimming, activity recognition.

Downloads: 204
7447 A Few Descriptive and Optimization Issues on the Material Flow at a Research-Academic Institution: The Role of Simulation

Authors: D. R. Delgado Sobrino, P. Košťál, J. Oravcová

Abstract:

Lately, significant work in the area of Intelligent Manufacturing has become public and has mainly been applied for industrial purposes. Special efforts have been made in the implementation of new technologies and management and control systems, among many others, all of which have advanced the field. Aware of all this, and given the scope of new projects and the need to turn the existing flexible ideas into more autonomous and intelligent ones, i.e., Intelligent Manufacturing, the present paper aims to contribute to the design and analysis of the material flow in systems, cells or workstations under this new “intelligent” denomination. To this end, besides offering a conceptual basis for some of the key points and general principles to consider in the design and analysis of the material flow, the paper also offers some tips on how to define other possible alternative material flow scenarios and a classification of the states of a system, cell or workstation. All this is done with the intention of relating it to the use of simulation tools, which are briefly addressed with a special focus on the Witness simulation package. For better comprehension, the previous elements are supported by a detailed layout, other figures and a few expressions which could help in obtaining the necessary data. Such data and others will be used in the future when simulating the scenarios in search of the best material flow configurations.

Keywords: Flexible/Intelligent Manufacturing System/Cell (F/IMS/C), material flow/design/configuration (MF/D/C), workstation.

Downloads: 1610
7446 Support Vector Machine based Intelligent Watermark Decoding for Anticipated Attack

Authors: Syed Fahad Tahir, Asifullah Khan, Abdul Majid, Anwar M. Mirza

Abstract:

In this paper, we present an innovative scheme for blindly extracting message bits from an image distorted by an attack. A Support Vector Machine (SVM) is used to nonlinearly classify the bits of the embedded message. Traditionally, a hard decoder is used with the assumption that the underlying modeling of the Discrete Cosine Transform (DCT) coefficients does not appreciably change. In case of an attack, the distribution of the image coefficients is heavily altered. The distributions of the sufficient statistics at the receiving end corresponding to the antipodal signals overlap, and a simple hard decoder fails to classify them properly. We consider message retrieval of an antipodal signal as a binary classification problem. Machine learning techniques like SVM are used to retrieve the message when a certain specific class of attacks is most probable. In order to validate the SVM-based decoding scheme, we have taken Gaussian noise as a test case. We generate a data set using 125 images and 25 different keys. The polynomial kernel of SVM has achieved 100 percent accuracy on test data.

Keywords: Bit Correct Ratio (BCR), Grid Search, Intelligent Decoding, Jackknife Technique, Support Vector Machine (SVM), Watermarking.

Downloads: 1669
7445 Concentrated Animal Feeding Operations and Planning in the United States: Evidences from North Carolina

Authors: Asmaa Benbaba

Abstract:

This paper aims to reconsider relationships between concentrated animal feeding operations (CAFOs) and planning. It stresses the necessity of a methodological revolution in order to increase the chances for dialogue between different actors and various planning agencies and to create possibilities to manage conflicts. The explored case of North Carolina shows limitations in environmental agencies’ actions and methods. It also calls for a more integrated approach among agencies, including the local agencies.

Keywords: CAFOs, North Carolina, Planning, United States.

Downloads: 2011
7444 On the Theory of Persecution

Authors: Aleksander V. Zakharov, Marat R. Bogdanov, Ramil F. Malikov, Irina N. Dumchikova

Abstract:

A classification of persecution movement laws is proposed. Modes of persecution in a number of specific cases were researched. Modes of movement control using GLONASS/GPS are discussed.

Keywords: Controlled Dynamic Motion, Unmanned Aerial Vehicles, GPS.

Downloads: 1594
7443 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge facing bioinformaticians, due to the complexity of using statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads: 2790
7442 Applications of Support Vector Machines on Smart Phone Systems for Emotional Speech Recognition

Authors: Wernhuar Tarng, Yuan-Yuan Chen, Chien-Lung Li, Kun-Rong Hsie, Mingteh Chen

Abstract:

An emotional speech recognition system for the applications on smart phones was proposed in this study to combine with 3G mobile communications and social networks to provide users and their groups with more interaction and care. This study developed a mechanism using the support vector machines (SVM) to recognize the emotions of speech such as happiness, anger, sadness and normal. The mechanism uses a hierarchical classifier to adjust the weights of acoustic features and divides various parameters into the categories of energy and frequency for training. In this study, 28 commonly used acoustic features including pitch and volume were proposed for training. In addition, a time-frequency parameter obtained by continuous wavelet transforms was also used to identify the accent and intonation in a sentence during the recognition process. The Berlin Database of Emotional Speech was used by dividing the speech into male and female data sets for training. According to the experimental results, the accuracies of male and female test sets were increased by 4.6% and 5.2% respectively after using the time-frequency parameter for classifying happy and angry emotions. For the classification of all emotions, the average accuracy, including male and female data, was 63.5% for the test set and 90.9% for the whole data set.

Keywords: Smart phones, emotional speech recognition, social networks, support vector machines, time-frequency parameter, Mel-scale frequency cepstral coefficients (MFCC).

Downloads: 1841
7441 Triangular Geometric Feature for Offline Signature Verification

Authors: Zuraidasahana Zulkarnain, Mohd Shafry Mohd Rahim, Nor Anita Fairos Ismail, Mohd Azhar M. Arsad

Abstract:

The handwritten signature is widely accepted as a biometric characteristic for personal authentication. The use of appropriate features plays an important role in determining the accuracy of signature verification; therefore, this paper presents a feature based on a geometrical concept. To this end, triangle attributes are exploited to design a new feature, since the triangle possesses orientation, angle and transformation properties that would improve accuracy. The proposed feature uses a triangulation geometric set comprising the sides, angles and perimeter of a triangle, which is derived from the center of gravity of a signature image. For classification, a Euclidean classifier along with a voting-based classifier is used to verify the tendency of forged signatures. This classification process is tested using the triangular geometric feature and selected global features. Based on an experiment validated using the Grupo de Senales 960 (GPDS-960) signature database, the proposed triangular geometric feature achieves a lower Average Error Rate (AER) of 34%, compared to 43% for the selected global features. In conclusion, the proposed triangular geometric feature proves to be a more reliable feature for accurate signature verification.
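
As an illustration of the kind of triangle attributes mentioned above (sides, angles and perimeter), the sketch below computes them for three 2-D points. It is not the authors' feature definition: how the triangle vertices are derived from the signature's center of gravity is not specified in the abstract, so the vertices here are hypothetical inputs, and `triangle_features` is an assumed name. It requires Python 3.8 or later for `math.dist`.

```python
import math


def triangle_features(a, b, c):
    """Return side lengths, interior angles in degrees, and the perimeter.

    `a`, `b`, `c` are (x, y) vertices of a non-degenerate triangle.
    """
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)

    def angle(opposite, side1, side2):
        # Law of cosines: angle enclosed by side1 and side2.
        cosine = (side1 ** 2 + side2 ** 2 - opposite ** 2) / (2 * side1 * side2)
        return math.degrees(math.acos(cosine))

    angles = (
        angle(bc, ab, ca),  # angle at vertex a
        angle(ca, ab, bc),  # angle at vertex b
        angle(ab, bc, ca),  # angle at vertex c
    )
    return {"sides": (ab, bc, ca), "angles": angles, "perimeter": ab + bc + ca}


# Hypothetical vertices, e.g. a centroid and two reference points.
features = triangle_features((0, 0), (3, 0), (0, 4))
print(features)
```

The resulting values can be concatenated into a fixed-length vector, which is the form a distance-based classifier such as a Euclidean classifier expects.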

Keywords: biometrics, Euclidean classifier, feature extraction, offline signature verification, voting-based classifier

Downloads: 1977
7440 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing private patient data with people other than clinicians or hospital staff is a security problem, so we have to remove personal identification information from the medical data. After a de-identification process, the medical data without private data are available for any research purposes. In this paper, we introduce a universal automatic rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The identical identification is used for all of a patient's medical data, so relationships within the data are kept. The hospital can take advantage of research feedback based on the results.
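
The core idea of replacing a patient identifier with a consistent unique number can be sketched as below. This is a toy illustration under stated assumptions, not the described application: the class name `Pseudonymizer`, the record field names, and the `PAT-` numbering scheme are hypothetical, and real de-identification of DICOM or HL7 data involves many more rules (including the burned-in pixel annotations mentioned above).

```python
import itertools


class Pseudonymizer:
    """Assign each original patient key one stable replacement identifier."""

    def __init__(self):
        self._ids = {}
        self._counter = itertools.count(1)

    def pseudonym(self, patient_key):
        # The same input key always maps to the same pseudonym, which keeps
        # relationships between records of one patient intact.
        if patient_key not in self._ids:
            self._ids[patient_key] = f"PAT-{next(self._counter):06d}"
        return self._ids[patient_key]

    def deidentify(self, record, id_field="patient_id",
                   drop_fields=("name", "birth_date")):
        """Return a copy without direct identifiers and with a pseudonymous id."""
        cleaned = {k: v for k, v in record.items() if k not in drop_fields}
        cleaned[id_field] = self.pseudonym(record[id_field])
        return cleaned


# Hypothetical records from two modalities for the same patient.
pseudonymizer = Pseudonymizer()
ct = pseudonymizer.deidentify(
    {"patient_id": "750101/1234", "name": "Jan Novak", "modality": "CT"})
mr = pseudonymizer.deidentify(
    {"patient_id": "750101/1234", "name": "Jan Novak", "modality": "MR"})
print(ct, mr)
```

Note that the mapping table itself is sensitive: whoever holds it can reverse the pseudonyms, so in practice it would have to stay inside the hospital.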

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads: 1641
7439 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system which can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is neither a defective bean nor a normal bean. It forms as a single, relatively round seed in a coffee cherry instead of the usual flat-sided pair of beans, and it has its own value and flavor. To improve the taste of the coffee, it is necessary to separate peaberries from normal beans before roasting the green coffee beans. Otherwise, the flavors of the beans will be mixed and the taste will suffer. During roasting, all the beans must be uniform in shape, size, and weight; otherwise, the larger beans will take more time to roast inside. Peaberries have a different size and shape even though they have the same weight as normal beans, and they roast more slowly than normal beans. Therefore, neither technique provides a good option for selecting the peaberries. Defective beans, e.g., sour, broken, black, and faded beans, are easy to check and pick out by hand. On the other hand, picking out peaberries is very difficult even for trained specialists because the shape and color of the peaberry are similar to those of normal beans. In this study, we use image processing and machine learning techniques to discriminate between normal and peaberry beans as part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate between peaberries and normal beans. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The artificial neural network trained in this work on a high-performance CPU and GPU will simply be installed on an inexpensive Raspberry Pi system with low computing power. We assume that this system will be used in underdeveloped countries. The study evaluates and compares the feasibility of the methods in terms of classification accuracy and processing speed.

Keywords: Convolutional neural networks, coffee bean, peaberry, sorting, support vector machine.

Downloads 1552
7438 A Comparative Analysis of Machine Learning Techniques for PM10 Forecasting in Vilnius

Authors: M. A. S. Fahim, J. Sužiedelytė Visockienė

Abstract:

Growing concern over air pollution (AP) has given the issue more prominence than ever before. Awareness has increased, and those who understand the problem now have a duty to pass that knowledge on to others. This realization often follows an understanding of how poor air quality, as measured by air quality indices (AQI), damages human health. The study assesses air pollution prediction models specifically for Lithuania, addressing a substantial need for empirical research in the region. Concentrating on Vilnius, it examines concentrations of particulate matter 10 micrometers or less in diameter (PM10). Using Gaussian Process Regression (GPR), Regression Tree Ensemble, and Regression Tree methodologies, predictive forecasting models are validated and tested on hourly data from January 2020 to December 2022. The study also explores the classification of AP data into anthropogenic and natural sources, the impact of AP on human health, and its connection to cardiovascular diseases. The models varied in accuracy, with GPR the most accurate, as indicated by an RMSE of 4.14 in validation and 3.89 in testing.
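
The RMSE values quoted above summarize how far a model's forecasts fall from the observed concentrations, in the same units as the data. As a minimal sketch of how such a score is computed (the readings below are invented for illustration and are not the study's data; the function name `rmse` is our own):

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the squared forecast errors, then take the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical hourly PM10 readings and forecasts (illustrative only).
observed = [20.0, 25.0, 30.0, 22.0]
predicted = [18.0, 27.0, 33.0, 22.0]
score = rmse(observed, predicted)
print(f"RMSE: {score:.2f}")
```

A lower RMSE means the forecasts sit closer to the observations; because errors are squared before averaging, a few large misses weigh more heavily than many small ones.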

Keywords: Air pollution, anthropogenic and natural sources, machine learning, Gaussian process regression, tree ensemble, forecasting models, particulate matter.

Downloads 114
7437 Hydrochemical Assessment and Quality Classification of Water in Torogh and Kardeh Dam Reservoirs, North-East Iran

Authors: Mojtaba Heydarizad

Abstract:

Khorasan Razavi, the second most important province in north-east Iran, faces a water shortage crisis due to recent droughts and heavy water consumption. The Kardeh and Torogh dam reservoirs in this province supply a notable part of the potable water needs of the Mashhad metropolitan area (more than 4.5 million inhabitants). Hydrochemical analyses of samples from these reservoirs show that the MgHCO3 water type is dominant in Kardeh, while CaHCO3 and, to a lesser extent, MgHCO3 water types are dominant in Torogh. The Gibbs binary diagram shows that rock weathering is the main factor controlling water quality in the reservoirs. Plots of the reservoir samples on Mg2+/Na+ and HCO3-/Na+ vs. Ca2+/Na+ diagrams show that the dissolution of evaporite and carbonate minerals is the dominant rock-weathering source of ions in these reservoirs. Cluster Analysis (CA) also demonstrates the strong role of rock weathering, mainly the dissolution of carbonate and evaporite minerals, in the water quality of these reservoirs. Assessment of water quality with the U.S. National Sanitation Foundation Water Quality Index (NSF-WQI), the Oregon Water Quality Index (OWQI), and the Canadian Water Quality Index (DWQI) shows moderate and good quality.

Keywords: Hydrochemistry, water quality classification, water quality indexes, Torogh and Kardeh Dam Reservoirs.

Downloads 1142
7436 Face Authentication for Access Control based on SVM using Class Characteristics

Authors: SeHun Lim, Sanghoon Kim, Sun-Tae Chung, Seongwon Cho

Abstract:

Face authentication for access control is a form of face membership authentication: based on face recognition, it admits the person presenting a face if that person turns out to be one of the enrolled members and rejects the person otherwise. Face membership authentication is a two-class classification problem, to which SVM (Support Vector Machine) has been successfully applied, showing better performance than conventional threshold-based classification. However, most previous SVMs have been trained using image feature vectors extracted from face images of each class member (enrolled class/unenrolled class), so they are not robust to variations in illumination, pose, and facial expression and are strongly affected by changes in the member configuration of the enrolled class. In this paper, we propose an effective face membership authentication method based on SVM using class discriminating features, which represent an incoming face image's associability with each class distinctively. These class discriminating features are only weakly related to image features, so they are less affected by variations in illumination, pose, and facial expression. Experiments show that the proposed face membership authentication method performs better than the threshold rule-based and conventional SVM-based authentication methods and is relatively less affected by changes in member size and membership.

Keywords: Face Authentication, Access control, membership authentication, SVM.

Downloads 1507
7435 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the margarine content of adulterated butter samples was investigated. The fingerprint region (1400-800 cm⁻¹) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprint region shows a clustering of the two sample types. The SIMCA approach assigned all samples to their correct class; however, nine adulterated samples (between 1% and 30% w/w of margarine) were assigned to both the butter class and the non-butter class. The two-class PLS-DA model (R2 = 0.73; Root Mean Square Error of Prediction, RMSEP = 0.26% w/w) had a sensitivity of 71.4% and a Positive Predictive Value (PPV) of 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, a PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. The results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to distinguish pure butter samples from adulterated ones and to determine the degree of margarine adulteration in butter samples.
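
Sensitivity and PPV, the two classification figures reported above, both derive from confusion-matrix counts: sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The sketch below illustrates the arithmetic. The counts are hypothetical, chosen only because they happen to reproduce the same percentages; the abstract does not report the actual counts, and the function names are our own.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that the classifier detects."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of predicted positives that are truly positive."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts (not taken from the paper): 5 positives detected,
# 2 positives missed, and no negatives wrongly flagged as positive.
tp, fn, fp = 5, 2, 0

sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
print(f"sensitivity = {sens:.1%}, PPV = {ppv:.1%}")
```

A PPV of 100% with a lower sensitivity means every sample flagged as positive was truly positive, but some positives went undetected.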

Keywords: Adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA.

Downloads 778
7434 The "Project" Approach in Urban: A Response to Uncertainty

Authors: Nedjima Mouhoubi, Souad Sassi Boudemagh

Abstract:

In this paper, we try to demonstrate the importance of the project approach for dealing with uncertainty in the urban field, the importance of involving all stakeholders in the urban project process (the absence of a single actor can lead to project failure), and the importance of urban project management. These points are addressed through the following questions: Does the urban field conform to complexity theory? Does the project approach offer hope and a solution for making urban planning "sustainable"? How can the visions of different actors converge on the same project? Is urban project management the solution for supporting the urban project approach?

Keywords: Strategic planning, project, urban project stakeholders, management.

Downloads 1286