Search results for: Pima Indians diabetes dataset

405 Scaling up Detection Rates and Reducing False Positives in Intrusion Detection using NBTree

Authors: Dewan Md. Farid, Nguyen Huu Hoa, Jerome Darmont, Nouria Harbi, Mohammad Zahidur Rahman

Abstract:

In this paper, we present a new learning algorithm for anomaly-based network intrusion detection using an improved self-adaptive naïve Bayesian tree (NBTree), which induces a hybrid of a decision tree and a naïve Bayesian classifier. The proposed approach scales up balanced detection across different attack types and keeps false positives at an acceptable level in intrusion detection. On complex, dynamic, and large intrusion detection datasets, the detection accuracy of the naïve Bayesian classifier does not scale up as well as that of a decision tree. Tests in other problem domains have shown that the naïve Bayesian tree improves classification rates on large datasets. In a naïve Bayesian tree, the internal nodes split the data as in a regular decision tree, but the leaves contain naïve Bayesian classifiers. Experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that this new approach scales up the detection rates for different attack types and reduces false positives in network intrusion detection.

Keywords: Detection rates, false positives, network intrusion detection, naïve Bayesian tree.

Downloads: 2280

404 Exploring the Safety of Sodium Glucose Co-Transporter-2 Inhibitors at the Imperial College London Diabetes Centre, UAE

Authors: Raad Nari, Maura Moriaty, Maha T. Barakat

Abstract:

Introduction: Sodium-glucose co-transporter-2 (SGLT2) inhibitors are a new class of oral anti-diabetic drugs with a unique mechanism of action. They are used to improve glycaemic control in adults with type 2 diabetes by enhancing urinary glucose excretion. In the UAE, there has certainly been an increased use of these medications. As with any new medication, there are safety considerations related to their use in patients with type 2 diabetes. A retrospective study was conducted at the three main centres of the Imperial College London Diabetes Centre. Methodology: All patients in the electronic database (Diamond) from October 2014 to October 2017 with a minimum of six months' usage of SGLT2 inhibitors (canagliflozin, dapagliflozin and empagliflozin) were included. There were 15 paired-sample biochemical and clinical correlations. The analysis was done at the start of the study and at three and six months. SPSS version 24 was used for this study. Conclusion: This study of SGLT2 inhibitor use showed significant reductions in weight, glycated haemoglobin A1C, and systolic and diastolic blood pressures. As is the case in systematic reviews, there were similar changes in liver enzymes and raised total cholesterol, low-density lipoprotein and high-density lipoprotein. There was also a slight improvement in estimated glomerular filtration rate. Our analysis also showed that they increased the incidence of urinary tract symptoms and urinary tract infections.

Keywords: SGLT2 inhibitors, dapagliflozin, empagliflozin, canagliflozin, adverse effects, amputation, diabetic ketoacidosis (DKA), urinary tract infection.

Downloads: 722

403 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Five implementations of three data mining classification techniques were tested for extracting important insights from tourism data. The aim was to find the best-performing algorithm among those compared for tourism knowledge discovery. The knowledge discovery from data process was used as the process model, and 10-fold cross-validation was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built under different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before applying information-gain-based attribute selection, and J48 (C4.5) (75%) after selecting the attributes most relevant to the class (target) attribute. In terms of model-building time, attribute selection improved the efficiency of all algorithms; the Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented; they reveal intricate, non-trivial knowledge that simple statistical analysis would not uncover, despite the mediocre accuracy of the classification algorithms.
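Note: the 10-fold cross-validation protocol mentioned in this abstract can be sketched as below. This is only an illustration of how the folds are formed, not the authors' setup: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled (real experiments usually shuffle or stratify first), and no classifier is trained.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous folds.

    Hypothetical helper for illustration; it does not shuffle or stratify.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Each sample appears in exactly one test fold across the k rounds.
folds = list(k_fold_indices(25, k=10))
```

The reported accuracy of a model would then be the average of its accuracy on the ten held-out test folds.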

Keywords: Classification algorithms; data mining; tourism; knowledge discovery.

Downloads: 2546

402 Block-Based 2D to 3D Image Conversion Method

Authors: S. Sowmyayani, V. Murugan

Abstract:

With the advent of three-dimensional (3D) technology, there has been a great deal of research on converting 2D images to 3D images. The main difference between 2D and 3D is the visual illusion of depth in 3D images. In recent years, the number of depth estimation techniques has grown. The objective of this paper is to convert 2D images to 3D images with less computation time. For this, the input image is divided into blocks from which the depth information is obtained. From the depth information, a depth map is generated. The 3D image is then warped using the original image and the depth map. The proposed method is tested on the Make3D and NYU-V2 datasets. The experimental results are compared with other recent methods. The proposed method proved to work with less computation time and good accuracy.

Keywords: Depth map, 3D image warping, image rendering, bilateral filter, minimum spanning tree.

Downloads: 359

401 Comparison of Deep Convolutional Neural Networks Models for Plant Disease Identification

Authors: Megha Gupta, Nupur Prakash

Abstract:

Identification of plant diseases has been performed using machine learning and deep learning models on the datasets containing images of healthy and diseased plant leaves. The current study carries out an evaluation of some of the deep learning models based on convolutional neural network architectures for identification of plant diseases. For this purpose, the publicly available New Plant Diseases Dataset, an augmented version of PlantVillage dataset, available on Kaggle platform, containing 87,900 images has been used. The dataset contained images of 26 diseases of 14 different plants and images of 12 healthy plants. The CNN models selected for the study presented in this paper are AlexNet, ZFNet, VGGNet (four models), GoogLeNet, and ResNet (three models). The selected models are trained using PyTorch, an open-source machine learning library, on Google Colaboratory. A comparative study has been carried out to analyze the high degree of accuracy achieved using these models. The highest test accuracy and F1-score of 99.59% and 0.996, respectively, were achieved by using GoogLeNet with Mini-batch momentum based gradient descent learning algorithm.

Keywords: comparative analysis, convolutional neural networks, deep learning, plant disease identification

Downloads: 638

400 Improving Classification Accuracy with Discretization on Datasets Including Continuous Valued Features

Authors: Mehmet Hacibeyoglu, Ahmet Arslan, Sirzat Kahramanli

Abstract:

This study analyzes the effect of discretization on the classification of datasets that include continuous-valued features. Six datasets from UCI that contain continuous-valued features are discretized with an entropy-based discretization method. The performance of the datasets with original features and with discretized features is compared using the k-nearest neighbors, Naive Bayes, C4.5 and CN2 data mining classification algorithms. As a result, the classification accuracies of the six datasets are improved on average by 1.71% to 12.31%.
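Note: the core step of entropy-based discretization can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it finds a single binary cut point that minimizes the weighted class entropy of the two resulting intervals, and it omits the recursive splitting and stopping criterion that a full method would need. The names `entropy` and `best_cut` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut_point, weighted_entropy) for the best binary split.

    Candidate cuts are midpoints between consecutive distinct sorted values.
    Returns (None, inf) if all values are identical.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_point, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_point, best_score


# Toy feature whose classes separate cleanly between 3.0 and 8.0.
cut, score = best_cut([1.0, 2.0, 3.0, 8.0, 9.0, 10.0],
                      ["a", "a", "a", "b", "b", "b"])
```

Applying the cut turns the continuous feature into two bins; repeating the search inside each bin yields finer intervals.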

Keywords: Data mining classification algorithms, entropy-based discretization method

Downloads: 2460

399 Olive Leaves Extract Restored the antioxidant Perturbations in Red Blood Cells Hemolysate in Streptozotocin Induced Diabetic Rats

Authors: Ismail I. Abo Ghanema, Kadry M. Sadek

Abstract:

Oxidative stress and overwhelming free radicals associated with diabetes mellitus are likely to be linked with the development of certain complications such as retinopathy, nephropathy and neuropathy. Treatment of diabetic subjects with antioxidants may be of advantage in attenuating these complications. Olive leaf (Olea europaea) has been endowed with many beneficial and health-promoting properties, mostly linked to its antioxidant activity. This study aimed to evaluate the significance of supplementation with olive leaves extract (OLE) in reducing oxidative stress, hyperglycemia and hyperlipidemia in streptozotocin (STZ)-induced diabetic rats. After induction of diabetes, a significant rise in plasma glucose, lipid profiles (except high-density lipoprotein cholesterol, HDLc) and malondialdehyde (MDA), and a significant decrease in plasma insulin, HDLc and plasma reduced glutathione (GSH), as well as alterations in enzymatic antioxidants, were observed in all diabetic animals. During treatment of diabetic rats with 0.5 g/kg body weight of OLE, the levels of plasma MDA, GSH, insulin and lipid profiles, along with blood glucose and erythrocyte enzymatic antioxidants, were significantly restored to values that were not different from those of normal control rats. Untreated diabetic rats, on the other hand, demonstrated persistent alterations in the oxidative stress marker (MDA), blood glucose, insulin, lipid profiles and the antioxidant parameters. These results demonstrate that OLE may be of advantage in inhibiting hyperglycemia, hyperlipidemia and oxidative stress induced by diabetes, and suggest that administration of OLE may be helpful in the prevention, or at least the reduction, of diabetic complications associated with oxidative stress.

Keywords: Diabetes mellitus, olive leaves, oxidative stress, red blood cells

Downloads: 3064

398 Time and Frequency Domain Analysis of Heart Rate Variability and their Correlations in Diabetes Mellitus

Authors: P. T. Ahamed Seyd, V. I. Thajudin Ahamed, Jeevamma Jacob, Paul Joseph K

Abstract:

Diabetes mellitus (DM) is frequently characterized by autonomic nervous dysfunction. Analysis of heart rate variability (HRV) has become a popular noninvasive tool for assessing the activity of the autonomic nervous system (ANS). In this paper, changes in ANS activity are quantified by means of frequency and time domain analysis of R-R interval variability. Electrocardiograms (ECG) of 16 patients suffering from DM and of 16 healthy volunteers were recorded. Frequency domain analysis of the extracted normal-to-normal interval (NN interval) data indicates significant differences in very low frequency (VLF) power, low frequency (LF) power and high frequency (HF) power between the DM patients and the control group. The time domain measures, namely the standard deviation of NN intervals (SDNN), the root mean square of successive NN interval differences (RMSSD), the number of successive NN intervals differing by more than 50 ms (NN50 count), the percentage value of the NN50 count (pNN50), the HRV triangular index and the triangular interpolation of NN intervals (TINN), also show significant differences between the DM patients and the control group.
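Note: four of the time domain measures named in this abstract follow directly from their definitions and can be sketched as below. This is an illustrative sketch, not the authors' code: the function name `hrv_time_domain` is hypothetical, SDNN is computed here as a sample standard deviation (an assumption, since some tools use the population form), and the HRV triangular index and TINN are omitted because they need a histogram of NN intervals.

```python
import math
import statistics


def hrv_time_domain(nn_ms):
    """Compute SDNN, RMSSD, NN50 and pNN50 from NN intervals in milliseconds."""
    if len(nn_ms) < 2:
        raise ValueError("need at least two NN intervals")
    # Differences between successive NN intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    sdnn = statistics.stdev(nn_ms)  # sample standard deviation (assumption)
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    pnn50 = 100.0 * nn50 / len(diffs)
    return {"SDNN": sdnn, "RMSSD": rmssd, "NN50": nn50, "pNN50": pnn50}


# Made-up NN intervals; only one successive difference (70 ms) exceeds 50 ms.
example = hrv_time_domain([800, 810, 790, 860, 850])
```

In a study like the one described, these values would be computed per subject and then compared between the patient and control groups.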

Keywords: Autonomic nervous system, diabetes mellitus, frequency domain and time domain analysis, heart rate variability.

Downloads: 3110

397 An Efficient Motion Recognition System Based on LMA Technique and a Discrete Hidden Markov Model

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Interest in human motion recognition has grown considerably in recent years due to its importance in a wide range of applications, such as human-computer interaction, intelligent surveillance, augmented reality, and content-based video compression and retrieval. However, it is still regarded as a challenging task, especially in realistic scenarios. It can be seen as a general machine learning problem which requires an effective human motion representation and an efficient learning method. In this work, we introduce a descriptor based on the Laban Movement Analysis (LMA) technique, a formal and universal language for human movement, to capture both quantitative and qualitative aspects of movement. We use a Discrete Hidden Markov Model (DHMM) for training and classifying motions. We improve the classification algorithm by proposing two DHMMs for each motion class to process the motion sequence in two different directions, forward and backward. This modification avoids the misclassification that can happen when recognizing similar motions. Two experiments are conducted. In the first one, we evaluate our method on a public dataset, the Microsoft Research Cambridge-12 Kinect gesture dataset (MSRC-12), which is widely used for evaluating action/gesture recognition methods. In the second experiment, we build a dataset composed of 10 gestures (introduce yourself, waving, dance, move, turn left, turn right, stop, sit down, increase velocity, decrease velocity) performed by 20 persons. The evaluation of the system includes testing the efficiency of our LMA-based descriptor vector with the basic DHMM method and comparing the recognition results of the modified DHMM with the original one. Experimental results demonstrate that our method outperforms most existing methods that used the MSRC-12 dataset and achieves a near-perfect classification rate on our dataset.

Keywords: Human Motion Recognition, Motion representation, Laban Movement Analysis, Discrete Hidden Markov Model.

Downloads: 728

396 Apolipoprotein E Gene Polymorphism and Its Association with Cardiovascular Heart Disease Risk Factors in Type 2 Diabetes Mellitus

Authors: Amani Ashari, Julia Omar, Arif Hashim, Shahrul Hamid

Abstract:

Apolipoprotein E (APOE) gene polymorphism influences serum lipids, which relate to cardiovascular risk. The purpose of this study was to determine the frequency distribution of APOE alleles among Malaysian Type 2 Diabetes Mellitus (DM) patients with and without coronary artery disease (CAD) and their association with serum lipid profiles. A total of 115 patients were recruited, of whom 78 had Type 2 DM without CAD and 37 had Type 2 DM with CAD. The APOE polymorphism was detected by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). The APOE ɛ3 allele was the most common one in both groups. There was no significant association between the APOE genotypes and CAD status in Type 2 DM using the Pearson χ² test. Further analysis indicated there were no significant differences in any lipid parameter between the E2, E3 and E4 subgroups in either group. The study showed that E4 allele carriers among Type 2 DM patients with CAD had higher LDL-C and lower HDL-C levels compared to the other allele carriers. However, analyses showed these levels were not statistically different. The study also showed that the Type 2 DM with CAD group with the E2 allele had higher triglyceride (TG). In conclusion, further study with a larger sample size is needed to confirm the role of E4 as a marker of CAD among Type 2 DM patients in the Malaysian population.

Keywords: Apolipoprotein E, diabetes mellitus, cardiovascular disease, lipids.

Downloads: 1263

395 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First, the dataset is mapped into a binary image plane. The synthesized image is then processed using efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a small subset of the data that contains critical boundary information sufficient to identify the contained clusters. Compared to available data clustering techniques, the proposed algorithm produces results of similar quality and outperforms them in execution time and storage requirements.
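Note: the general idea of mapping data onto a binary image and clustering in image space can be sketched as below. This is not the IMDC algorithm itself, whose image processing steps and boundary handling are not given in the abstract; as a stand-in, this sketch rasterizes 2-D points onto a grid and labels 8-connected occupied pixels with a breadth-first flood fill. The function name `image_mapped_clusters` and the `cell` parameter are hypothetical.

```python
from collections import deque


def image_mapped_clusters(points, cell=1.0):
    """Assign a cluster label to each 2-D point via a binary occupancy grid."""
    # Map each point to the pixel (grid cell) that contains it.
    pixel_of = [(int(x // cell), int(y // cell)) for x, y in points]
    occupied = set(pixel_of)  # the "on" pixels of the binary image
    label = {}
    next_label = 0
    for start in sorted(occupied):
        if start in label:
            continue
        # Flood-fill one connected component of occupied pixels.
        label[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in occupied and nb not in label:
                        label[nb] = next_label
                        queue.append(nb)
        next_label += 1
    return [label[p] for p in pixel_of]


# Two well-separated groups of made-up points.
labels = image_mapped_clusters(
    [(0.2, 0.3), (1.1, 0.9), (1.8, 1.5), (10.0, 10.2), (10.9, 11.1)]
)
```

The work per point is constant once the grid is built, which is the kind of saving over pairwise distance searches that the abstract alludes to; the cell size controls how close points must be to join a cluster.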

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Downloads: 1500

394 The Links between Brain Insulin Resistance and Alzheimer’s Disease

Authors: Negar Khezri, Golnaz Yaghoubnezhadzanganeh, Amirreza Attarzadeh

Abstract:

Type 2 diabetes mellitus (T2DM) and Alzheimer's disease (AD) are two major health problems affecting millions of people worldwide. The neuron loss and synaptic impairment that interfere with cognition and memory account for the behavioral signs of AD. While it is now accepted that insulin has a central neuromodulatory role, including in memory and learning, which are impaired in AD, the brain was for many years thought to be insensitive to insulin. The common characteristics of AD and T2DM are impaired insulin signaling, oxidative stress, the activation of inflammatory pathways and abnormal glucose metabolism. This review summarizes how the recognition of these mechanisms may lead to the development of alternative therapeutic approaches.

Keywords: Alzheimer’s disease, diabetes, insulin resistance, neurodegenerative.

Downloads: 1137

393 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Authors: Loai AbdAllah, Mahmoud Kaiyal

Abstract:

Missing values in real-world datasets are a common problem. Many algorithms have been developed to deal with this problem; most of them replace the missing values with a fixed value computed from the observed values. In our work, we used a distance function based on the Bhattacharyya distance, which measures the similarity of two probability distributions, to measure the distance between objects with missing values. The proposed distance distinguishes between known and unknown values: the distance between two known values is the Mahalanobis distance, whereas when one of them is missing, the distance is computed from the distribution of the known values for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve prevention of chronic diseases such as diabetes and cancer. For Wikaya’s recommendation system to work, the distance between users needs to be measured. Since the collected data contain missing values, a distance function is needed that measures distances between incomplete user profiles. To evaluate how accurately the proposed distance function reflects the actual similarity between objects when some of them contain missing values, we integrated it within the framework of the k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over the diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that the kNN classifier using our proposed distance function outperforms kNN using other existing methods.
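Note: the general shape of a distance that treats known and missing coordinates differently can be sketched as below. This is not the authors' Bhattacharyya-based function, whose formula is not given in the abstract. As stated assumptions, the sketch uses a variance-scaled squared difference for two known values (a diagonal Mahalanobis form) and, when one value is missing, the expected scaled squared difference under the empirical distribution of that coordinate. The names `fit_columns` and `missing_aware_distance` are hypothetical, and `None` marks a missing value.

```python
import math
import statistics


def fit_columns(rows):
    """Return (mean, variance) of the known values in each column."""
    stats = []
    for col in zip(*rows):
        known = [v for v in col if v is not None]
        stats.append((statistics.mean(known), statistics.pvariance(known)))
    return stats


def missing_aware_distance(a, b, stats):
    """Distance between two rows that may contain None (missing) entries.

    Assumes every column has non-zero variance among its known values.
    """
    total = 0.0
    for x, y, (mu, var) in zip(a, b, stats):
        if x is not None and y is not None:
            total += (x - y) ** 2 / var          # both known
        elif x is None and y is None:
            total += 2.0                          # E[(X - Y)^2] / var
        else:
            known = x if x is not None else y
            total += ((known - mu) ** 2 + var) / var  # E[(known - Y)^2] / var
    return math.sqrt(total)


# Made-up rows; the second row is missing its second feature.
rows = [[1.0, 10.0], [3.0, None], [5.0, 30.0]]
stats = fit_columns(rows)
d01 = missing_aware_distance(rows[0], rows[1], stats)
```

A kNN classifier can use such a function directly in place of the Euclidean distance, so incomplete rows do not need to be imputed or dropped first.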

Keywords: Missing values, distance metric, Bhattacharyya distance.

Downloads: 781

392 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — In the Case of Critical Dataset Size —

Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno

Abstract:

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induce if-then rules from a decision table, which is considered a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world datasets. The first requirement is to determine the size of the dataset needed for inducing true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity for rule induction from datasets with attribute values contaminated by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, using the critical dataset size derived in the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data.

Keywords: Rule induction, decision table, missing data, noise.

Downloads: 1463

391 Multi-Objective Optimal Threshold Selection for Similarity Functions in Siamese Networks for Semantic Textual Similarity Tasks

Authors: Kriuk Boris, Kriuk Fedor

Abstract:

This paper presents a comparative study of fundamental similarity functions for Siamese networks in semantic textual similarity (STS) tasks. We evaluate various similarity functions using the STS Benchmark dataset, analyzing their performance and stability. Additionally, we present a multi-objective approach for optimal threshold selection. Our findings provide insights into the effectiveness of different similarity functions and offer a straightforward method for threshold selection optimization, contributing to the advancement of Siamese network architectures in STS applications.

Keywords: Siamese networks, Semantic textual similarity, Similarity functions, STS Benchmark dataset, Threshold selection.

Downloads: 76

390 An Application-Driven Procedure for Optimal Signal Digitization of Automotive-Grade Ultrasonic Sensors

Authors: Mohamed Shawki Elamir, Heinrich Gotzig, Raoul Zoellner, Patrick Maeder

Abstract:

In this work, a methodology is presented for identifying the optimal digitization parameters for the analog signal of ultrasonic sensors. These digitization parameters are the resolution of the analog-to-digital conversion and the sampling rate. This is accomplished through the derivation of characteristic curves based on Fano's inequality and the calculation of the mutual information content over a given dataset. The mutual information is calculated between the examples in the dataset and the corresponding variation in the feature that needs to be estimated. The optimal parameters are identified in a manner that ensures optimal estimation performance while avoiding the inefficiency of using unnecessarily powerful analog-to-digital converters.

Keywords: Analog to digital conversion, digitization, sampling rate, ultrasonic sensors.

Downloads: 446

389 Screening of Potential Sources of Tannin and Its Therapeutic Application

Authors: Mamta Kumari, Shashi Jain

Abstract:

Tannins are a unique category of plant phytochemicals, especially in terms of their vast potential health-benefiting properties. Researchers have described the capacity of tannins to enhance glucose uptake and inhibit adipogenesis, making them potential drugs for the treatment of non-insulin-dependent diabetes mellitus. The present research was therefore conducted to determine the tannin content of food products. The percentage of tannin in the various analyzed sources ranged from 0.0 to 108.53%; it was highest in kathaa and lowest in ker and mango bark. The percentage of tannins present in the plants, however, varies. Numerous studies have confirmed that naturally occurring polyphenols are a key factor in the beneficial effects of herbal medicines. Isolation and identification of active constituents from plants, and preparation of standardized doses and dosage regimens, can play a significant role in improving the hypoglycaemic action.

Keywords: Tannins, Diabetes, Polyphenols, Antioxidants, Hypoglycemia.

Downloads: 2313

388 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available at all times by handling intruder attacks. An Intrusion Detection System (IDS) is one of the available mechanisms used to sense and classify abnormal actions. The IDS must therefore always be up to date with the latest intruder attack signatures to preserve the confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue, as is learning new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different machine learning techniques. It mainly focuses on the KDD preprocessing step in order to prepare a decent and fair experimental dataset. The J48, MLP, and Bayes Network classifiers were chosen for this study. The J48 classifier achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.

Keywords: IDS, DDoS, MLP, KDD.

Downloads: 727

387 Wound Healing Effect of Ocimum sanctum Leaves Extract in Diabetic Rats

Authors: Manish Kumar Gautam, Raj Kumar Goel

Abstract:

Delayed wound healing in diabetes is primarily associated with hyperglycemia, over-expression of inflammatory markers, oxidative stress and delayed collagen synthesis. Such unmanaged wounds impose a high economic burden on society. Thus, research is required to develop new and effective treatment strategies to deal with this emerging issue. The present study evaluates the wound healing effects of a 50% ethanol extract of Ocimum sanctum (OSE) in streptozotocin (45 mg/kg)-induced diabetic rats with concurrent wound ulcers. Animals showing diabetes (blood glucose level >140 and <250 mg/dL) were selected for the wound healing study using the standard dead space wound model. Wounds were created by implanting two polypropylene tubes (0.5 x 2.5 cm2 each), one on either side in the lumbar region on the dorsal surface of each rat. On the 10th post-wounding day, the animals were sacrificed and the granulation tissue formed on the implanted tubes was carefully dissected out to study the status of antioxidants (superoxide dismutase, SOD, and glutathione, GSH), free radicals (lipid peroxidation, LPO, and nitric oxide, NO), an acute inflammatory marker (myeloperoxidase, MPO) and the connective tissue determinants hydroxyproline, hexosamine and hexuronic acid, which play a major role in wound healing and diabetes. Besides the anti-diabetic parameters (serum blood glucose, triglycerides and total cholesterol), the above wound healing parameters were studied in normal rats and in untreated and OSE-treated diabetic rats. The effects of the extract on these parameters were compared with those of known standard antioxidant (Vitamin E) and anti-diabetic (Glybenclamide) drugs. OSE at 400 mg/kg significantly decreased serum blood glucose, triglycerides and total cholesterol. OSE also decreased granulation tissue free radicals (LPO, 58.1% and NO, 52.7%) and myeloperoxidase (MPO, 63.3%), and enhanced antioxidants (GSH, 116.4% and SOD, 201.1%).

Keywords: Wound healing, diabetes, Ocimum sanctum, Antioxidant, Free radical, Myeloperoxidase

Downloads: 3160

386 Extrapolation of Clinical Data from an Oral Glucose Tolerance Test Using a Support Vector Machine

Authors: Jianyin Lu, Masayoshi Seike, Wei Liu, Peihong Wu, Lihua Wang, Yihua Wu, Yasuhiro Naito, Hiromu Nakajima, Yasuhiro Kouchi

Abstract:

To extract the important physiological factors related to diabetes from an oral glucose tolerance test (OGTT) by mathematical modeling, highly informative but convenient protocols are required. Current models require a large number of samples and an extended period of testing, which is not practical for daily use. The purpose of this study is to make model assessments possible even from a reduced number of samples taken over a relatively short period. For this purpose, test values were extrapolated using a support vector machine. A good correlation was found between reference and extrapolated values in the 741 OGTTs evaluated. This result indicates that a reduction in the number of clinical tests is possible through a computational approach.

Keywords: SVM regression, OGTT, diabetes, mathematical model

Downloads: 1613

385 On Identity Disclosure Risk Measurement for Shared Microdata

Authors: M. N. Huda, S. Yamada, N. Sonehara

Abstract:

Probability-based identity disclosure risk measurement may give the same overall risk for different anonymization strategies applied to the same dataset. Some entities in the anonymous dataset may have higher identification risks than others. Individuals are more concerned about higher-than-average risks and are more interested in knowing whether they may be at higher risk. The notion of overall risk in the above measurement method doesn't indicate whether some of the involved entities have a higher identity disclosure risk than others. In this paper, we introduce an identity disclosure risk measurement method that not only gives the overall risk, but also indicates whether some of the members have a higher risk than others. The proposed method quantifies the overall risk based on the individual risk values, the percentage of records that have a risk value higher than the average, and how much larger the higher risk values are compared to the average. We have analyzed the disclosure risks for different disclosure control techniques applied to original microdata and present the results.
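Note: the three ingredients this abstract lists (individual risk values, the share of records above the average, and how much larger those risks are than the average) can be computed as sketched below. This does not reproduce the authors' measure, whose combining formula is not given here. As stated assumptions, the per-record risk is taken to be 1 divided by the number of records sharing the same quasi-identifier values, and the three quantities are reported separately. The name `risk_summary` is hypothetical.

```python
from collections import Counter


def risk_summary(quasi_identifiers):
    """Summarize re-identification risk for a list of quasi-identifier tuples."""
    group_size = Counter(quasi_identifiers)
    # Assumed per-record risk: 1 / size of the record's equivalence class.
    risks = [1.0 / group_size[q] for q in quasi_identifiers]
    average = sum(risks) / len(risks)
    # Small tolerance so records equal to the average are not counted as above.
    above = [r for r in risks if r > average + 1e-12]
    share_above = len(above) / len(risks)
    # Ratio of the mean above-average risk to the overall average.
    excess_ratio = (sum(above) / len(above)) / average if above else 1.0
    return {
        "average_risk": average,
        "share_above_average": share_above,
        "excess_ratio": excess_ratio,
    }


# Made-up microdata: four records share one group, one record is unique.
records = [("30-39", "F")] * 4 + [("40-49", "M")]
summary = risk_summary(records)
```

Two anonymizations with the same average risk can differ sharply in the last two quantities, which is the distinction the abstract argues an overall figure alone fails to show.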

Keywords: Anonymization, microdata, disclosure risk, privacy.

Downloads: 1364

384 Feature Selection for Web Page Classification Using Swarm Optimization

Authors: B. Leela Devi, A. Sankar

Abstract:

The web’s increased popularity has brought with it a huge amount of information, making automated web page classification systems essential for improving search engines’ performance. Web pages have many features, such as HTML or XML tags, hyperlinks, URLs and text content, which can be considered during an automated classification process. It is known that web page classification is enhanced by hyperlinks, as they reflect web page linkages. The aim of this study is to reduce the number of features used in order to improve the accuracy of web page classification. In this paper, a novel feature selection method using an improved Particle Swarm Optimization (PSO) based on principles of evolution is proposed. The extracted features were tested on the WebKB dataset using a parallel neural network to reduce the computational cost.

Keywords: Web page classification, WebKB Dataset, Term Frequency-Inverse Document Frequency (TF-IDF), Particle Swarm Optimization (PSO).

Downloads 3259

383 Neural Network Based Approach for Face Detection cum Face Recognition

Authors: Kesari Verma, Aniruddha S. Thoke, Pritam Singh

Abstract:

Automatic face detection is a complex problem in image processing. Many methods exist to solve this problem, such as template matching, Fisher Linear Discriminant, Neural Networks, SVM, and MRC. Each method has achieved success to varying degrees and at varying levels of complexity. The proposed algorithm uses upright, frontal faces in single grayscale images with decent resolution and under good lighting conditions. In face recognition, a single face is matched against a single face from the training dataset. The authors propose a neural network based algorithm that detects faces in photographs and checks any incoming test data against the online scanned training dataset. Experimental results show that the algorithm achieved up to 95% accuracy for any image.

Keywords: Face Detection, Face Recognition, NN Approach, PCA Algorithm.

Downloads 2301

382 Optimized Brain Computer Interface System for Unspoken Speech Recognition: Role of Wernicke Area

Authors: Nassib Abdallah, Pierre Chauvet, Abd El Salam Hajjar, Bassam Daya

Abstract:

In this paper, we propose an optimized brain computer interface (BCI) system for unspoken speech recognition, based on the fact that the construction of unspoken words relies strongly on the Wernicke area, situated in the temporal lobe. Our BCI system has four modules: (i) the EEG Acquisition module, based on a non-invasive headset with 14 electrodes; (ii) the Preprocessing module, which removes noise and artifacts using the Common Average Reference method; (iii) the Features Extraction module, using the Wavelet Packet Transform (WPT); (iv) the Classification module, based on a one-hidden-layer artificial neural network. The present study compares the recognition accuracy of 5 Arabic words when using all the headset electrodes or only the 4 electrodes situated near the Wernicke area, as well as the effect of selecting among the subbands produced by the WPT module. After applying the artificial neural network to the produced database, we obtain, on the test dataset, an accuracy of 83.4% with all the electrodes and all the subbands of 8 levels of the WPT decomposition. However, by using only the 4 electrodes near the Wernicke area and the 6 middle subbands of the WPT, we obtain a large reduction of the dataset size, to approximately 19% of the total dataset, with an accuracy rate of 67.5%. This reduction appears particularly important for improving the design of a low-cost and simple-to-use BCI trained for several words.
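
The Common Average Reference step named in the preprocessing module can be illustrated as below. This is a minimal sketch under stated assumptions: the function name and the toy 3-channel data are hypothetical, and it shows only the generic CAR operation (subtracting the across-channel mean at each time sample), not the authors' full preprocessing chain.

```python
def common_average_reference(eeg):
    """Re-reference EEG by subtracting, at each time sample, the mean over channels.

    `eeg` is a list of channels, each a list of samples of equal length.
    """
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    referenced = [[0.0] * n_samples for _ in range(n_channels)]
    for t in range(n_samples):
        # Average over all electrodes at this time sample.
        avg = sum(channel[t] for channel in eeg) / n_channels
        for c in range(n_channels):
            referenced[c][t] = eeg[c][t] - avg
    return referenced

# Toy data: 3 channels x 2 samples (the headset in the abstract has 14 electrodes).
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
car = common_average_reference(toy)
print(car)
```

After re-referencing, the channels sum to zero at every time sample, which removes signal components common to all electrodes.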

Keywords: Brain-computer interface, speech recognition, electroencephalography EEG, Wernicke area, artificial neural network.

Downloads 918

381 Fusion of ETM+ Multispectral and Panchromatic Texture for Remote Sensing Classification

Authors: Mahesh Pal

Abstract:

This paper proposes to use ETM+ multispectral data and the panchromatic band, as well as texture features derived from the panchromatic band, for land cover classification. Four texture features, including one 'internal texture' and three GLCM-based textures (correlation, entropy, and inverse difference moment), were used in combination with ETM+ multispectral data. Two datasets involving combinations of multispectral data, the panchromatic band and its texture were used, and the results were compared with those obtained using multispectral data alone. A decision tree classifier with and without boosting was used to classify the different datasets. Results from this study suggest that the dataset consisting of the panchromatic band, four of its texture features and multispectral data was able to increase the classification accuracy by about 2%. In comparison, a boosted decision tree was able to increase the classification accuracy by about 3% with the same dataset.
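
The three GLCM-based textures named in this abstract can be computed from a gray-level co-occurrence matrix along these lines. This is a minimal sketch with several assumptions not taken from the paper: a horizontal offset of one pixel, non-symmetric pair counts, hypothetical function names, and a correlation of 0.0 when it is undefined. The 'internal texture' feature is not defined in the abstract and is not sketched.

```python
from collections import Counter
from math import log2, sqrt

def glcm(img, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix as {(i, j): probability}."""
    h, w = len(img), len(img[0])
    counts = Counter()
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                counts[(img[y][x], img[yy][xx])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def glcm_features(p):
    """Entropy, inverse difference moment and correlation of a GLCM."""
    entropy = -sum(v * log2(v) for v in p.values() if v > 0)
    idm = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    mu_i = sum(i * v for (i, _), v in p.items())
    mu_j = sum(j * v for (_, j), v in p.items())
    var_i = sum((i - mu_i) ** 2 * v for (i, _), v in p.items())
    var_j = sum((j - mu_j) ** 2 * v for (_, j), v in p.items())
    cov = sum((i - mu_i) * (j - mu_j) * v for (i, j), v in p.items())
    # Correlation is undefined for a constant image; 0.0 is a convention chosen here.
    corr = cov / sqrt(var_i * var_j) if var_i > 0 and var_j > 0 else 0.0
    return {"entropy": entropy, "idm": idm, "correlation": corr}

# Toy 2 x 4 image with two gray levels.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
features = glcm_features(glcm(image))
print(features)
```

In practice such features are usually computed over a sliding window on the panchromatic band, so that each pixel receives its own texture values.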

Keywords: Internal texture; GLCM; decision tree; boosting; classification accuracy.

Downloads 1735

380 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening

Authors: Ksheeraj Sai Vepuri, Nada Attar

Abstract:

We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset, which contains static images. Instead of using histogram equalization to preprocess the dataset, we used an Unsharp Mask to emphasize texture and details and to sharpen the edges. We also used ImageDataGenerator from the Keras library for data augmentation. Then we used a Convolutional Neural Network (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that image preprocessing such as the sharpening technique can improve the performance of a CNN model, even when the CNN model is relatively simple.
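
The unsharp masking idea mentioned in this abstract can be sketched as follows: blur the image, then add back a scaled copy of the difference between the original and the blurred version. This is a minimal illustration under stated assumptions, not the authors' pipeline: a 3x3 mean blur stands in for the Gaussian blur commonly used, and the function names and the `amount` parameter are hypothetical.

```python
def box_blur(img):
    """3x3 mean blur with edge replication (stand-in for a Gaussian blur)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out

def unsharp_mask(img, amount=1.0):
    """Sharpen: original + amount * (original - blurred), clipped to [0, 255]."""
    blurred = box_blur(img)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, brow)]
        for row, brow in zip(img, blurred)
    ]

# Toy grayscale image with a vertical edge between columns 1 and 2.
edge = [[50, 50, 200, 200] for _ in range(4)]
sharpened = unsharp_mask(edge)
print(sharpened[0])
```

Flat regions are left unchanged, while pixels next to the edge are pushed further apart, which is what makes edges and texture stand out.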

Keywords: Facial expression recognition, image pre-processing, deep learning, CNN.

Downloads 544

379 Assessing and Visualizing the Stability of Feature Selectors: A Case Study with Spectral Data

Authors: R. Guzman-Martinez, Oscar Garcia-Olalla, R. Alaiz-Rodriguez

Abstract:

Feature selection plays an important role in applications with high dimensional data. Assessing the stability of feature selection/ranking algorithms becomes an important issue when the dataset is small and the aim is to gain insight into the underlying process by analyzing the most relevant features. In this work, we propose a graphical approach that makes it possible to analyze the similarity between feature ranking techniques as well as their individual stability. Moreover, it works with any stability metric (Canberra distance, Spearman's rank correlation coefficient, Kuncheva's stability index, etc.). We illustrate this visualization technique by evaluating the stability of several feature selection techniques on a spectral binary dataset. Experimental results with a neural-based classifier show that stability and ranking quality may not be linked and that both issues have to be studied jointly in order to offer answers to the domain experts.
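
The three stability metrics listed in this abstract have standard textbook definitions, sketched below for a pair of rankings or feature subsets. The function names and toy inputs are hypothetical, the Spearman formula assumes rankings without ties, and nothing here reproduces the paper's graphical approach.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two tie-free rankings of the same features."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def canberra(rank_a, rank_b):
    """Canberra distance between two rank vectors (0 means identical)."""
    return sum(
        abs(a - b) / (abs(a) + abs(b))
        for a, b in zip(rank_a, rank_b)
        if a or b
    )

def kuncheva_index(subset_a, subset_b, n_features):
    """Kuncheva's consistency index for two subsets of equal size k."""
    k = len(subset_a)
    if k != len(subset_b) or not 0 < k < n_features:
        raise ValueError("subsets must share a size k with 0 < k < n_features")
    r = len(set(subset_a) & set(subset_b))
    # Corrects the overlap r for the overlap expected by chance (k^2 / n).
    return (r * n_features - k * k) / (k * (n_features - k))

# Toy rankings of 5 features: position i holds the rank of feature i.
rank_a = [1, 2, 3, 4, 5]
rank_b = [2, 1, 3, 4, 5]
print(spearman_rho(rank_a, rank_b))
print(canberra(rank_a, rank_b))
print(kuncheva_index({0, 1}, {1, 2}, 5))
```

Spearman and Canberra compare full rankings, whereas Kuncheva's index compares selected subsets, so the latter applies after truncating each ranking to its top k features.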

Keywords: Feature Selection Stability, Spectral data, Data visualization

Downloads 1525

378 Predictive Clustering Hybrid Regression (pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production

Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja

Abstract:

A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using a dataset from an H2-producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using the information available about the envirome and metabolome of the bioprocess. Self-organizing maps (SOM) and the Sammon map were used to visualize the dataset and to identify the main metabolic patterns and clusters in the bioprocess data. Three metabolic clusters were detected: acetate coupled with other metabolites, butyrate only, and transition phases. The developed pCHR model combines the principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate, with mean square error values of 0.0014 and 0.0032, respectively.

Keywords: Biohydrogen, bioprocess modeling, clustering hybrid regression.

Downloads 1776

377 Quantification of Heart Rate Variability: A Measure based on Unique Heart Rates

Authors: V. I. Thajudin Ahamed, P. Dhanasekaran, A. Naseem, N. G. Karthick, T. K. Abdul Jaleel, Paul K. Joseph

Abstract:

It is established that the instantaneous heart rate (HR) of healthy humans keeps on changing. Analysis of heart rate variability (HRV) has become a popular non-invasive tool for assessing the activity of the autonomic nervous system. Depressed HRV has been found in several disorders characterised by autonomic nervous dysfunction, such as diabetes mellitus (DM) and coronary artery disease. A new technique, which searches for pattern repeatability in a time series, is proposed specifically for the analysis of heart rate data. The resulting set of indices, termed the pattern repeatability measure and the pattern repeatability ratio, is compared with approximate entropy and sample entropy. In our analysis, based on the method developed, it is observed that heart rate variability is significantly different for DM patients, particularly for patients with diabetic foot ulcer.

Keywords: Autonomic nervous system, diabetes mellitus, heart rate variability, pattern identification, sample entropy

Downloads 1908

376 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen big advances in recent years because of the spread of the internet, which generates a tremendous volume of data every day, and because of immense advances in the technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines which group each data instance belongs to within a given dataset. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning, and each type has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an f-measure exceeding 97% on the IRIS dataset.
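
For reference, the f-measure reported in this abstract is the harmonic mean of precision and recall. The sketch below computes it per class and then macro-averages; the function names and the toy labels are hypothetical, and macro-averaging is an assumption, since the abstract does not say how the multi-class score was aggregated.

```python
def f_measure(y_true, y_pred, positive):
    """F-measure (F1) for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class f-measures (an assumed aggregation)."""
    classes = sorted(set(y_true))
    return sum(f_measure(y_true, y_pred, c) for c in classes) / len(classes)

# Toy labels using the three Iris class names; not results from the paper.
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]
print(macro_f_measure(y_true, y_pred))
```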

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Downloads 1527