Search results for: associative classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2140

1630 lncRNA Gene Expression Profiling Analysis by TCGA RNA-Seq Data of Breast Cancer

Authors: Xiaoping Su, Gabriel G. Malouf

Abstract:

Introduction: Breast cancer is a heterogeneous disease that can be classified into four subgroups using transcriptional profiling. The role of lncRNA expression in human breast cancer biology, prognosis, and molecular classification remains unknown. Methods and results: Using an integrative, comprehensive analysis of lncRNA, mRNA and DNA methylation in 900 breast cancer patients from The Cancer Genome Atlas (TCGA) project, we unraveled the molecular portraits of 1,700 expressed lncRNAs. Some of those lncRNAs (e.g., HOTAIR) have been previously reported and others are novel (e.g., HOTAIRM1, MAPT-AS1). The lncRNA classification correlated well with the PAM50 classification for the basal-like, Her-2 enriched and luminal B subgroups, in contrast to the luminal A subgroup, which behaved differently. Importantly, estrogen receptor (ESR1) expression was associated with distinct lncRNA networks in lncRNA clusters III and IV. Gene set enrichment analysis for cis- and trans-acting lncRNAs showed enrichment for breast cancer signatures driven by breast cancer master regulators. Almost two thirds of those lncRNAs were marked by enhancer chromatin modifications (i.e., H3K27ac), suggesting that lncRNA expression may result in increased activity of neighboring genes. Differential analysis of gene expression profiling data showed that lncRNA HOTAIRM1 was significantly down-regulated in the basal-like subtype, and DNA methylation profiling data showed that lncRNA HOTAIRM1 was highly methylated in the basal-like subtype. Thus, our integrative analysis of gene expression and DNA methylation strongly suggests that lncRNA HOTAIRM1 acts as a tumor suppressor in the basal-like subtype. Conclusion and significance: Our study depicts the first lncRNA molecular portrait of breast cancer and shows that lncRNA HOTAIRM1 might be a novel tumor suppressor.

Keywords: lncRNA profiling, breast cancer, HOTAIRM1, tumor suppressor

Procedia PDF Downloads 83
1629 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is a major cause of disability in adults and a main cause of death in developed countries. In this study, data mining techniques including Decision Trees, Artificial Neural Networks (ANNs), and Support Vector Machines (SVMs) are used to analyze CAD data. Data from 4948 patients who had suffered from heart disease were included in the analysis. CAD is the target variable, and 24 input (predictor) variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability of being diagnosed with CAD. The SVM algorithm is the most useful technique for evaluating and predicting CAD patients as opposed to non-CAD ones. Applying data mining techniques to coronary artery disease data is a good method for investigating the existing relationships between variables.
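The three comparison measures named in this abstract can be computed from a binary confusion matrix. The sketch below is only illustrative: the function name `binary_metrics` and the counts in the usage note are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) for a binary classifier.

    tp/fn count CAD patients predicted as CAD / non-CAD;
    tn/fp count non-CAD patients predicted as non-CAD / CAD.
    """
    sensitivity = tp / (tp + fn)            # share of true CAD cases that are detected
    specificity = tn / (tn + fp)            # share of non-CAD cases correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```

With hypothetical counts, `binary_metrics(tp=80, fp=10, tn=90, fn=20)` gives a sensitivity of 0.8, a specificity of 0.9 and an accuracy of 0.85.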

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 634
1628 Astronomical Object Classification

Authors: Alina Muradyan, Lina Babayan, Arsen Nanyan, Gohar Galstyan, Vigen Khachatryan

Abstract:

We present a photometric method for identifying stars, galaxies and quasars in multi-color surveys, which uses a library of ∼> 65000 color templates for comparison with observed objects. The method aims for extracting the information content of object colors in a statistically correct way, and performs a classification as well as a redshift estimation for galaxies and quasars in a unified approach based on the same probability density functions. For the redshift estimation, we employ an advanced version of the Minimum Error Variance estimator which determines the redshift error from the redshift dependent probability density function itself. The method was originally developed for the Calar Alto Deep Imaging Survey (CADIS), but is now used in a wide variety of survey projects. We checked its performance by spectroscopy of CADIS objects, where the method provides high reliability (6 errors among 151 objects with R < 24), especially for the quasar selection, and redshifts accurate within σz ≈ 0.03 for galaxies and σz ≈ 0.1 for quasars. For an optimization of future survey efforts, a few model surveys are compared, which are designed to use the same total amount of telescope time but different sets of broad-band and medium-band filters. Their performance is investigated by Monte-Carlo simulations as well as by analytic evaluation in terms of classification and redshift estimation. If photon noise were the only error source, broad-band surveys and medium-band surveys should perform equally well, as long as they provide the same spectral coverage. In practice, medium-band surveys show superior performance due to their higher tolerance for calibration errors and cosmic variance. Finally, we discuss the relevance of color calibration and derive important conclusions for the issues of library design and choice of filters. 
The calibration accuracy poses strong constraints on an accurate classification, which are most critical for surveys with few, broad and deeply exposed filters, but less severe for surveys with many, narrow and less deep filters.

Keywords: VO, ArVO, DFBS, FITS, image processing, data analysis

Procedia PDF Downloads 52
1627 Analyses of Adverse Drug Reactions Reported of Hospital in Taiwan

Authors: Yu-Hong Lin

Abstract:

Background: An adverse drug reaction (ADR) is an injury caused by taking medicines. The severity of a reported ADR may be minor, but it can also be life-threatening. To provide healthcare professionals with a better reference for clinical practice, we collected and analyzed ADR reports from our hospital. Methods: This was a retrospective study of ADRs reported from 2014 to 2015 in our hospital in Taiwan. We collected the assessment items of reported ADRs, including gender and age, source of occurrence, Anatomical Therapeutic Chemical (ATC) classification of suspected drugs, types of adverse reactions, and the Naranjo score calculated with the Naranjo Adverse Drug Reaction Probability Scale, among others. Results: The investigation included 207 reported ADRs. Most reported ADRs occurred in the outpatient department (92%). The average age of patients with reported ADRs was 65.3 years, and patients younger than 65 years were in the majority (54%). Males accounted for the majority of reported ADRs (51%). According to the ATC classification system, the major classes of suspected drugs were cardiovascular system (19%) and antiinfectives for systemic use (18%). Among the adverse reactions, dermatologic effects (35%) were the major type. The Naranjo scores of most reported ADRs ranged from 1 to 4 points (91%), which represents a possible correlation between the reported ADRs and the suspected drugs. Conclusions: Reported ADRs remain extremely important information for healthcare professionals. For that reason, we enter all information on reported ADRs into our hospital's computer system to improve the safety of medication use. The system reminds prescribers of a patient's reported ADRs. No drugs are administered without risk.
Therefore, all healthcare professionals should have a responsibility to their patients, who themselves are becoming more aware of problems associated with drug therapy.

Keywords: adverse drug reaction, Taiwan, healthcare professionals, safe use of medicines

Procedia PDF Downloads 210
1626 A Two-Week and Six-Month Stability of Cancer Health Literacy Classification Using the CHLT-6

Authors: Levent Dumenci, Laura A. Siminoff

Abstract:

Health literacy has been shown to predict a variety of health outcomes. Reliable identification of persons with limited cancer health literacy (LCHL) has proved questionable with existing instruments that use an arbitrary cut point along a continuum. The CHLT-6, however, uses a latent mixture modeling approach to identify persons with LCHL. The purpose of this study was to estimate the two-week and six-month stability of identifying persons with LCHL using the CHLT-6 with a discrete latent variable approach as the underlying measurement structure. Using a test-retest design, the CHLT-6 was administered to cancer patients at two-week (N=98) and six-month (N=51) intervals. The two-week and six-month latent test-retest agreements were 89% and 88%, respectively. The chance-corrected latent agreements estimated from Dumenci’s latent kappa were 0.62 (95% CI: 0.41 – 0.82) and 0.47 (95% CI: 0.14 – 0.80) for the two-week and six-month intervals, respectively. High levels of latent test-retest agreement between the limited and adequate categories of the cancer health literacy construct, coupled with moderate to good levels of chance-corrected latent agreement, indicated that the CHLT-6 classification of limited versus adequate cancer health literacy is relatively stable over time. In conclusion, the measurement structure underlying the instrument allows for estimating classification errors, circumventing limitations due to the arbitrary approaches adopted by all other instruments. The CHLT-6 can be used to identify persons with LCHL in oncology clinics and in intervention studies to accurately estimate treatment effectiveness.
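To make the idea of chance-corrected agreement concrete, the sketch below computes ordinary Cohen's kappa from an observed test-retest table. This is a swapped-in, simpler statistic and not Dumenci's latent kappa, which is estimated from a latent class model; the function name and the table in the usage note are hypothetical.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of persons classified in category i at test
    and category j at retest.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of persons on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance from the row and column marginals.
    p_chance = sum(
        (sum(table[i]) / n) * (sum(row[i] for row in table) / n)
        for i in range(k)
    )
    return (p_observed - p_chance) / (1 - p_chance)
```

For a hypothetical table `[[40, 10], [10, 40]]` the raw agreement is 80%, chance agreement is 50%, and kappa is 0.6, which shows how a high raw agreement can correspond to a more moderate chance-corrected value.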

Keywords: limited cancer health literacy, the CHLT-6, discrete latent variable modeling, latent agreement

Procedia PDF Downloads 156
1625 Fake Accounts Detection in Twitter Based on Minimum Weighted Feature Set

Authors: Ahmed ElAzab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny

Abstract:

Social networking sites such as Twitter and Facebook attract over 500 million users across the world. For those users, their social life, and even their practical life, has become intertwined with these sites, and their interaction with social networking has affected their lives permanently. Accordingly, social networking sites have become one of the main channels responsible for the vast dissemination of different kinds of information during real-time events. This popularity has led to various problems, including the possibility of exposing users to incorrect information through fake accounts, which results in the spread of malicious content during life events. This situation can cause serious real-world damage to society in general, including citizens, business entities, and others. In this paper, we present a classification method for detecting fake accounts on Twitter. The study determines a minimized set of the main factors that influence the detection of fake accounts on Twitter. The determined factors are then applied using different classification techniques, the results of these techniques are compared, and the most accurate algorithm is selected. The study has been compared with recent research in the same area, and this comparison supports the accuracy of the proposed approach. We claim that this study can be applied continuously on the Twitter social network to detect fake accounts automatically. Moreover, it can be applied to other social network sites such as Facebook with minor changes according to the nature of the social network, which are discussed in this paper.

Keywords: fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques

Procedia PDF Downloads 381
1624 Rapid Classification of Soft Rot Enterobacteriaceae Phyto-Pathogens Pectobacterium and Dickeya Spp. Using Infrared Spectroscopy and Machine Learning

Authors: George Abu-Aqil, Leah Tsror, Elad Shufan, Shaul Mordechai, Mahmoud Huleihel, Ahmad Salman

Abstract:

Pectobacterium and Dickeya spp., which negatively affect a wide range of crops, are the main causes of aggressive diseases of agricultural crops. These diseases are responsible for huge economic losses in agriculture, including a severe decrease in the quality of stored vegetables and fruits. Therefore, it is important to detect these pathogenic bacteria at the early stages of infection to control their spread and consequently reduce the economic losses. In addition, early detection is vital for producing non-infected propagative material for future generations. The molecular techniques currently used to identify these bacteria at the strain level are expensive and laborious. Other techniques require a long time (~48 h) for detection. Thus, there is a clear need for rapid, inexpensive, accurate and reliable techniques for early detection of these bacteria. In this study, infrared spectroscopy, which is a well-known technique, was used for rapid detection of Pectobacterium and Dickeya spp. at the strain level. The bacteria were isolated from potato plants and tubers with soft rot symptoms and measured by infrared spectroscopy. The obtained spectra were analyzed using different machine learning algorithms. The performance of our approach for taxonomic classification among the bacterial samples was evaluated in terms of success rates. The success rates for correct classification at the genus, species and strain levels were ~100%, 95.2% and 92.6%, respectively.

Keywords: soft rot enterobacteriaceae (SRE), pectobacterium, dickeya, plant infections, potato, solanum tuberosum, infrared spectroscopy, machine learning

Procedia PDF Downloads 80
1623 Design of Bacterial Pathogens Identification System Based on Scattering of Laser Beam Light and Classification of Binned Plots

Authors: Mubashir Hussain, Mu Lv, Xiaohan Dong, Zhiyang Li, Bin Liu, Nongyue He

Abstract:

Detection and classification of microbes have a vast range of applications in biomedical engineering, especially in the detection, characterization, and quantification of bacterial contaminants. Different techniques for identifying pathogens are emerging in the field of biomedical engineering. The latest technology uses light scattering, which can identify different pathogens without any need for biochemical processing. The Bacterial Pathogens Identification System (BPIS) uses a laser beam that passes through the sample, where the light scatters. An assembly of photodetectors surrounds the sample at different angles to detect the scattered light. The algorithm of the system consists of two parts: (a) library files, and (b) a comparator. Library files contain data of known bacterial species in the form of binned plots, while the comparator compares data from an unknown sample with the library files. From the data collected for an unknown bacterial species, the highest voltage values are stored as peaks and arranged in 3D histograms to find their frequency of occurrence. The resulting data are compared with the library files of known bacterial species. If the sample data match any library file, the sample is identified as the matched microbe. An experiment was performed to identify three different bacteria: Enterococcus faecalis, Pseudomonas aeruginosa, and Escherichia coli. Applying the algorithm with library files of the given samples gave promising results. This system is potentially applicable to several biomedical areas, especially those related to cell morphology.
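The library-and-comparator idea can be sketched in a simplified form. The example below makes several assumptions that are not in the abstract: it bins peak voltages into a one-dimensional normalized histogram instead of a 3D histogram, it compares histograms with an L1 distance, and the names `binned`, `best_match` and all data values are hypothetical.

```python
def binned(values, lo, hi, bins):
    """Normalized histogram of peak values over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            index = min(bins - 1, int((v - lo) / width))
            counts[index] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def best_match(sample_hist, library):
    """Return the library entry whose histogram is closest in L1 distance."""
    def distance(name):
        return sum(abs(a - b) for a, b in zip(sample_hist, library[name]))
    return min(library, key=distance)
```

As a usage illustration, a library built from two hypothetical species, `{"A": binned([0.1, 0.2, 0.15, 0.9], 0, 1, 4), "B": binned([0.8, 0.9, 0.85, 0.1], 0, 1, 4)}`, matches the sample `binned([0.12, 0.22, 0.18, 0.95], 0, 1, 4)` to species `"A"`. A real comparator would also need a rejection threshold for samples that match no library entry.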

Keywords: microbial identification, laser scattering, peak identification, binned plots classification

Procedia PDF Downloads 128
1622 Investigating the Relationship between the Kuwait Stock Market and Its Marketing Sectors

Authors: Mohamad H. Atyeh, Ahmad Khaldi

Abstract:

The main objective of this research is to measure the relationship between the Kuwait Stock Exchange (KSE) index and its two marketing sectors after the new market classification. The findings of this research are important for public economic policymakers, who need to know whether, and to what extent, the new classification system is efficient in order to monitor the markets and intervene with appropriate measures. The data used are the daily index of the whole Kuwaiti market and the daily closing price, number of deals and volume of shares traded for two marketing sectors (consumer goods and consumer services) for the period from 13 May 2012 to 12 December 2016. The results indicate a positive direct impact of the closing price, volume and deals indexes of the consumer goods and consumer services companies on the overall index, volume and deals of the Kuwaiti stock market (KSE).

Keywords: correlation, market capitalization, Kuwait Stock Exchange (KSE), marketing sectors, stock performance

Procedia PDF Downloads 305
1621 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral. Sentiment analysis of documents holds great importance in today's world, when vast amounts of information are stored in databases and on the World Wide Web. An efficient algorithm to elicit such information would be beneficial for social, economic and medical purposes. In this project, we have developed an algorithm to classify a document as positive or negative. Using our algorithm, we obtained a feature set from the data and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption made by many procedures such as Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use the empirical distribution for estimation, but developed a method based on the degree of close clustering of the data points. We applied our algorithm to a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 377
1620 A Feature Clustering-Based Sequential Selection Approach for Color Texture Classification

Authors: Mohamed Alimoussa, Alice Porebski, Nicolas Vandenbroucke, Rachid Oulad Haj Thami, Sana El Fkihi

Abstract:

Color and texture are highly discriminant visual cues that provide essential information in many types of images. Color texture representation and classification is therefore one of the most challenging problems in computer vision and image processing applications. Color textures can be represented in different color spaces by using multiple image descriptors, which generate a high-dimensional set of texture features. To reduce the dimensionality of the feature set, feature selection techniques can be used. The goal of feature selection is to find a relevant subset of an original feature space that can improve the accuracy and efficiency of a classification algorithm. Traditionally, feature selection has focused on removing irrelevant features, neglecting the possible redundancy between relevant ones. This is why some feature selection approaches use feature clustering analysis to aid and guide the search. These techniques can be divided into two categories. i) Feature clustering-based ranking algorithms use feature clustering as an analysis that comes before feature ranking: after dividing the feature set into groups, these approaches perform a feature ranking in order to select the most discriminant feature of each group. ii) Feature clustering-based subset search algorithms can use feature clustering following one of three strategies: as an initial step that comes before the search, bound to and combined with the search, or as an alternative to and replacement for the search. In this paper, we propose a new feature clustering-based sequential selection approach for the purpose of color texture representation and classification. Our approach is a three-step algorithm. First, irrelevant features are removed from the feature set using a class-correlation measure. Then, a new automatic feature clustering algorithm divides the feature set into several feature clusters.
Finally, a sequential search algorithm, based on a filter model and a separability measure, builds a relevant and non-redundant feature subset: at each step, a feature is selected and the features of the same cluster are removed and thus not considered thereafter. This significantly speeds up the selection process, since a large number of redundant features are eliminated at each step. The proposed algorithm uses the clustering algorithm bound to and combined with the search. Experiments using a combination of two well-known texture descriptors, namely Haralick features extracted from Reduced Size Chromatic Co-occurrence Matrices (RSCCMs) and features extracted from Local Binary Pattern (LBP) image histograms, on five color texture data sets (Outex, NewBarktex, Parquet, Stex and USPtex) demonstrate the efficiency of our method compared to seven state-of-the-art methods in terms of accuracy and computation time.
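The final step, selecting a feature and then discarding the rest of its cluster, can be sketched as follows. This is a simplified illustration under stated assumptions: each feature carries a fixed, precomputed relevance score, whereas the abstract describes a separability measure evaluated during the search, and the cluster assignment is given rather than computed. The names `cluster_sequential_select`, `scores` and `cluster_of` are hypothetical.

```python
def cluster_sequential_select(scores, cluster_of):
    """Greedy selection that keeps one feature per cluster.

    scores:     dict mapping feature name -> relevance score (higher is better)
    cluster_of: dict mapping feature name -> cluster identifier
    """
    remaining = set(scores)
    selected = []
    while remaining:
        # Pick the most relevant feature still under consideration.
        best = max(remaining, key=lambda f: scores[f])
        selected.append(best)
        # Drop every feature in the same cluster, including the one just picked.
        remaining = {f for f in remaining if cluster_of[f] != cluster_of[best]}
    return selected
```

For example, with scores `{"f1": 0.9, "f2": 0.8, "f3": 0.7, "f4": 0.4}` and clusters `{"f1": 0, "f2": 0, "f3": 1, "f4": 1}`, the function returns `["f1", "f3"]`. Each iteration removes a whole cluster, so the loop runs once per cluster rather than once per feature.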

Keywords: feature selection, color texture classification, feature clustering, color LBP, chromatic cooccurrence matrix

Procedia PDF Downloads 111
1619 Colored Image Classification Using Quantum Convolutional Neural Networks Approach

Authors: Farina Riaz, Shahab Abdulla, Srinjoy Ganguly, Hajime Suzuki, Ravinesh C. Deo, Susan Hopkins

Abstract:

Recently, quantum machine learning has received significant attention. For various types of data, including text and images, numerous quantum machine learning (QML) models have been created and are being tested. Images are exceedingly complex data components that demand more processing power. Despite being mature, classical machine learning still has difficulties with big data applications. Furthermore, quantum technology has revolutionized how machine learning is thought of, by employing quantum features to address optimization issues. Since quantum hardware is currently extremely noisy, it is not practicable to run machine learning algorithms on it without risking the production of inaccurate results. To discover the advantages of quantum versus classical approaches, this research has concentrated on colored image data. Deep learning classification models are currently being created on quantum platforms, but they are still at a very early stage. Black and white benchmark image datasets like MNIST and Fashion-MNIST have been used in recent research. MNIST and CIFAR-10 were compared for binary classification, but the comparison showed that MNIST performed more accurately than colored CIFAR-10. This research will evaluate the performance of the QML algorithm on the colored benchmark dataset CIFAR-10 to advance QML's real-time applicability. However, deep learning classification models such as the Quantum Convolutional Neural Network (QCNN) have not yet been developed for colored images to determine how much better they are than classical models. Only a few models, such as quantum variational circuits, take colored images. The methodology adopted in this research is a hybrid approach using PennyLane as a simulator. To process the 10 classes of CIFAR-10, the image data were converted to grayscale, and 28 × 28-pixel images comprising 10,000 test and 50,000 training images were used.
The objective of this work is to determine how much the quantum approach can outperform a classical approach for a comprehensive dataset of color images. After pre-processing 50,000 images from a classical computer, the QCNN model adopted a hybrid method and encoded the images into a quantum simulator for feature extraction using quantum gate rotations. The measurements were carried out on the classical computer after the rotations were applied. According to the results, we note that the QCNN approach is ~12% more effective than the traditional classical CNN approaches and it is possible that applying data augmentation may increase the accuracy. This study has demonstrated that quantum machine and deep learning models can be relatively superior to the classical machine learning approaches in terms of their processing speed and accuracy when used to perform classification on colored classes.

Keywords: CIFAR-10, quantum convolutional neural networks, quantum deep learning, quantum machine learning

Procedia PDF Downloads 100
1618 Small Target Recognition Based on Trajectory Information

Authors: Saad Alkentar, Abdulkareem Assalem

Abstract:

Recognizing small targets has always posed a significant challenge in image analysis. Over long distances, the image signal-to-noise ratio tends to be low, limiting the amount of useful information available to detection systems. Consequently, visual target recognition becomes an intricate task to tackle. In this study, we introduce a Track Before Detect (TBD) approach that leverages target trajectory information (coordinates) to effectively distinguish between noise and potential targets. By reframing the problem as a multivariate time series classification, we have achieved remarkable results. Specifically, our TBD method achieves an impressive 97% accuracy in separating target signals from noise within a mere half-second time span (consisting of 10 data points). Furthermore, when classifying the identified targets into our predefined categories—airplane, drone, and bird—we achieve an outstanding classification accuracy of 96% over a more extended period of 1.5 seconds (comprising 30 data points).

Keywords: small targets, drones, trajectory information, TBD, multivariate time series

Procedia PDF Downloads 24
1617 Rheology Study of Polyurethane (COAPUR 6050) For Composite Materials Usage

Authors: Sabrina Boutaleb, Kouider Halim Benrahou, François Schosseler, Abdelouahed Tounsi, El Abbas Adda Bedia

Abstract:

The use of polyurethanes in different areas is becoming more frequent, owing to significant advantages including their lightness and resistance. However, their use requires a mastery of their mechanical performance. In this work, we present COAPUR 6050, which can be used to develop composite materials. COAPUR 6050 is an associative polyurethane thickener allowing fine rheological adjustment of flat or semi-gloss paints. It is characterised by its thickening efficiency at low shear rate. It is a solvent-free liquid product. It promotes good paint pick-up while maintaining a low yield point after shearing, and consequently good levelling. We determine its rheological behaviour experimentally with a rotational rheometer (Rheometer-Mars III) using different annular gaps. The size of the annular gap influences the behaviour as well as the rheological parameters of COAPUR 6050. The rheological data of COAPUR 6050 were fitted by nonlinear regression, and the resulting rheological models are characterised by a yield-pseudoplastic model. In this case, it is essential to make a viscometric correction, which was developed and is presented in the experimental results.

Keywords: COAPUR 6050, Couette flow, polyurethane, rheological behaviours

Procedia PDF Downloads 477
1616 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk for hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor called Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset which consists of more than 70,000 patients with 50 attributes. We applied data preprocessing using different techniques in order to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieved classification accuracy of 80% compared to other models that only use k Nearest Neighbor.
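As background for the k Nearest Neighbor family this abstract builds on, the sketch below shows a generic distance-weighted k-NN vote that also reports normalized class memberships. It is not the authors' Hybrid Fuzzy Weighted k Nearest Neighbor algorithm, whose details are not given here; the function name, the `eps` smoothing term and the toy data are assumptions.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Distance-weighted k-NN.

    train: list of (feature_tuple, label) pairs
    query: feature tuple to classify
    Returns (predicted_label, memberships), where memberships maps each
    label seen among the k neighbours to its share of the total vote.
    """
    # Sort training points by Euclidean distance to the query.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        # Closer neighbours contribute a larger vote; eps avoids division by zero.
        votes[label] += 1.0 / (math.dist(features, query) + eps)
    total = sum(votes.values())
    memberships = {label: vote / total for label, vote in votes.items()}
    return max(memberships, key=memberships.get), memberships
```

With the toy training set `[((0, 0), "no"), ((0, 1), "no"), ((5, 5), "yes"), ((6, 5), "yes")]` and the query `(0.2, 0.1)`, the two nearby "no" points dominate the vote, so the predicted label is `"no"` with a membership well above 0.9.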

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 166
1615 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifiers system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifiers systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfill requirements of target tracking due to improved sensors classification performances with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: single classifier, ensemble learning, multi-target tracking, multiple classifiers

Procedia PDF Downloads 240
1614 Clustering of Association Rules of ISIS & Al-Qaeda Based on Similarity Measures

Authors: Tamanna Goyal, Divya Bansal, Sanjeev Sofat

Abstract:

Terrorist attacks are a worldwide threat for which early detection, distinction, and prediction are effective diagnosis techniques, and many data mining and statistical approaches exist for the accurate and precise analysis of terrorism data. The computational extraction of derived patterns is a non-trivial task that comprises specific domain discovery by means of sophisticated algorithm design and analysis. This paper proposes an approach for similarity extraction that obtains the useful attributes from the available datasets of terrorist attacks, applies a feature selection technique based on statistical impurity measures, and then applies clustering techniques on the basis of similarity measures. The associative dependencies between the attacks are analyzed on the basis of the degree of participation of attributes in the rules. To compute the similarity among the discovered rules, we applied a weighted similarity measure. Finally, the rules are grouped using hierarchical clustering. We applied the approach to an open source dataset to determine the usability and efficiency of our technique, and a literature search was also carried out to support the efficiency and accuracy of our results.
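Grouping rules by similarity can be illustrated with a small agglomerative procedure. The sketch below rests on assumptions that differ from the abstract: it uses an unweighted Jaccard similarity over the items of each rule instead of the authors' weighted measure, it uses single linkage with a stopping threshold, and the rule contents are placeholder strings.

```python
def jaccard(a, b):
    """Jaccard similarity between two item sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def single_linkage(rules, threshold):
    """Merge the two most similar clusters until similarity drops below threshold.

    rules: list of item sets, one per association rule
    Returns a list of clusters, each a list of rule indices.
    """
    clusters = [[i] for i in range(len(rules))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: similarity of the closest pair of members.
                sim = max(jaccard(rules[p], rules[q])
                          for p in clusters[i] for q in clusters[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        if best[0] < threshold:
            break
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters
```

For the placeholder rules `[{"a", "b", "c"}, {"a", "b", "d"}, {"x", "y"}, {"x", "y", "c"}]` and a threshold of 0.4, rules 2 and 3 merge first (similarity 2/3), then rules 0 and 1 (similarity 0.5), and merging stops because the remaining similarity is 0.2, giving `[[0, 1], [2, 3]]`.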

Keywords: association rules, clustering, similarity measure, statistical approaches

Procedia PDF Downloads 298
1613 Identification and Classification of Fiber-Fortified Semolina by Near-Infrared Spectroscopy (NIR)

Authors: Amanda T. Badaró, Douglas F. Barbin, Sofia T. Garcia, Maria Teresa P. S. Clerici, Amanda R. Ferreira

Abstract:

Food fortification is the intentional addition of a nutrient to a food matrix and has been widely used to overcome the lack of nutrients in the diet or to increase the nutritional value of food. Fortified food must meet the demand of the population, taking into account their habits and the risks that these foods may cause. Wheat and its by-products, such as semolina, have been strongly recommended as food vehicles since they are widely consumed and used in the production of other foods. These products have been strategically used to add some nutrients, such as fibers. Methods of analysis and quantification of these kinds of components are destructive and require lengthy sample preparation and analysis. Therefore, the industry has searched for faster and less invasive methods, such as Near-Infrared Spectroscopy (NIR). NIR is a rapid and cost-effective method; however, it is based on indirect measurements, yielding a large amount of data. Therefore, NIR spectroscopy requires calibration with mathematical and statistical tools (chemometrics), such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), to extract analytical information from the corresponding spectra. PCA is well suited for NIR, since it can handle many spectra at a time and be used for unsupervised classification. An advantage of PCA, which is also a data reduction technique, is that it reduces the spectral data to a smaller number of latent variables for further interpretation. On the other hand, LDA is a supervised method that searches for the Canonical Variables (CV) with the maximum separation among different categories. In LDA, the first CV is the direction of the maximum ratio between inter- and intra-class variances. The present work used a portable near-infrared (NIR) spectrometer for identification and classification of pure and fiber-fortified semolina samples. 
The fiber was added to semolina in two different concentrations, and after spectra acquisition, the data were used for PCA and LDA to identify and discriminate the samples. The results showed that NIR spectroscopy combined with PCA was very effective in identifying pure and fiber-fortified semolina. Additionally, the classification rate of the samples using LDA ranged from 78.3% to 95% for calibration and from 75% to 95% for cross-validation. Thus, after multivariate analysis such as PCA and LDA, it was possible to verify that NIR combined with chemometric methods is able to identify and classify the different samples in a fast and non-destructive way.
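The statement that the first canonical variable maximizes the ratio of between-class to within-class variance can be made concrete with a two-class, two-feature sketch. For two classes that direction is proportional to the inverse within-class scatter matrix applied to the difference of the class means (Fisher's discriminant). The feature values below are made up and merely stand in for, say, two PCA scores per sample; nothing here reproduces the authors' data or software.

```python
def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_lda(class_a, class_b):
    """Return the unit discriminant direction w and a midpoint threshold."""
    ma, mb = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    # Pooled within-class scatter matrix Sw.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # w = Sw^{-1} (mb - ma), using the closed-form inverse of a 2x2 matrix.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = (wx * wx + wy * wy) ** 0.5
    w = (wx / norm, wy / norm)
    # Threshold halfway between the two projected class means.
    threshold = 0.5 * ((w[0] * ma[0] + w[1] * ma[1]) + (w[0] * mb[0] + w[1] * mb[1]))
    return w, threshold


def classify(point, w, threshold, labels=("pure", "fortified")):
    """Project the point on w and compare with the threshold."""
    projection = w[0] * point[0] + w[1] * point[1]
    return labels[1] if projection > threshold else labels[0]


# Hypothetical two-feature samples for each class.
pure = [(1.0, 2.0), (1.2, 1.9), (0.8, 2.2), (1.1, 2.1)]
fortified = [(2.0, 1.0), (2.2, 1.1), (1.9, 0.8), (2.1, 0.9)]

w, threshold = fit_lda(pure, fortified)
```

A new sample is then labeled with `classify(sample, w, threshold)`. Real spectra have many more variables, which is why a data reduction step such as PCA usually comes first.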

Keywords: Chemometrics, fiber, linear discriminant analysis, near-infrared spectroscopy, principal component analysis, semolina

Procedia PDF Downloads 190
1612 Classification of IoT Traffic Security Attacks Using Deep Learning

Authors: Anum Ali, Kashaf ad Dooja, Asif Saleem

Abstract:

The future trend in smart cities will be towards the Internet of Things (IoT); IoT creates dynamic connections in a ubiquitous manner. Smart cities offer ease and flexibility in daily life. As small IoT devices are connected to cloud servers, network traffic between these devices is growing exponentially, and its security is a concern, since the growing rate of cyber attacks may leave network traffic vulnerable. This paper discusses the latest machine learning approaches in related work; further, to tackle the increasing rate of cyber attacks, a machine learning algorithm is applied to IoT-based network traffic data. The proposed algorithm trains itself on the data and identifies different sections of device interaction using supervised learning, acting as a classifier for a specific IoT device class. The simulation results clearly identify the attacks and produce fewer false detections.

Keywords: IoT, traffic security, deep learning, classification

Procedia PDF Downloads 130
1611 A Hybrid System for Boreholes Soil Sample

Authors: Ali Ulvi Uzer

Abstract:

Data reduction is an important topic in the field of pattern recognition applications. The basic concept is the reduction of multitudinous amounts of data down to the meaningful parts. The Principal Component Analysis (PCA) method is frequently used for data reduction. The Support Vector Machine (SVM) method is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data, the algorithm outputs an optimal hyperplane which categorizes new examples. This study offers a hybrid approach that uses PCA for data reduction and SVM for classification. To assess the accuracy of the suggested system, soil samples taken from two boreholes were used. The classification accuracies for this dataset were obtained using the ten-fold cross-validation method. The results suggest that this system, which relies on dimensionality reduction, is feasible for faster recognition of the dataset, so the outcome of our study appears very promising.

Keywords: feature selection, sequential forward selection, support vector machines, soil sample

Procedia PDF Downloads 430
1610 Greyscale: A Tree-Based Taxonomy for Grey Literature Published by Fisheries Agencies

Authors: Tatiana Tunon, Gottfried Pestal

Abstract:

Government agencies responsible for the management of fisheries resources publish many types of grey literature, and these materials are increasingly accessible to the public on agency websites. However, scope and quality vary considerably, and end-users need meta-data about the report series when deciding whether to use the information (e.g. apply the methods, include the results in a systematic review), or when prioritizing materials for archiving (e.g. library holdings, reference databases). A proposed taxonomy for these report series was developed based on a review of 41 report series from 6 government agencies in 4 countries (Canada, New Zealand, Scotland, and United States). Each report series was categorized according to multiple criteria describing peer-review process, content, and purpose. A robust classification tree was then fitted to these descriptions, and the resulting taxonomic groups were used to compare agency output from 4 countries using reports available in their online repositories.

Keywords: classification tree, fisheries, government, grey literature

Procedia PDF Downloads 255
1609 Non-Uniform Filter Banks-based Minimum Distance to Riemannian Mean Classification in Motor Imagery Brain-Computer Interface

Authors: Ping Tan, Xiaomeng Su, Yi Shen

Abstract:

The motion intention in the motor imagery brain-computer interface is identified by classifying the event-related desynchronization (ERD) and event-related synchronization (ERS) characteristics of the sensorimotor rhythm (SMR) in EEG signals. When the subject imagines different limbs or different body parts moving, the rhythm components and bandwidth change, and these changes vary from person to person. Finding the effective sensorimotor frequency band of each subject is directly related to the classification accuracy of the brain-computer interface. To solve this problem, this paper proposes a Minimum Distance to Riemannian Mean classification method based on non-uniform filter banks. During the training phase, the EEG signals are first decomposed into multiple signals of different bandwidths by using multiple band-pass filters. Then the spatial covariance characteristics of each frequency band signal are computed and used as feature vectors. These feature vectors are classified by the MDRM (Minimum Distance to Riemannian Mean) method, and cross-validation is employed to obtain the effective sensorimotor frequency bands. During the test phase, the test signals are filtered by the band-pass filters of the effective sensorimotor frequency bands, and the extracted spatial covariance feature vectors are classified using MDRM. Experiments on the BCI Competition IV 2a dataset show that the proposed method is superior to other classification methods.

Keywords: non-uniform filter banks, motor imagery, brain-computer interface, minimum distance to Riemannian mean

Procedia PDF Downloads 93
1608 Turkish Validation of the Nursing Outcomes for Urinary Incontinence and Their Sensitivities on Nursing Interventions

Authors: Dercan Gencbas, Hatice Bebis, Sue Moorhead

Abstract:

Many nursing classification systems have been created for international use in the nursing process, among them NANDA-I, the Nursing Outcomes Classification (NOC), and the Nursing Interventions Classification (NIC). In this direction, the main objective of this study is to establish a model for caregivers in hospitals and communities in Turkey and to ensure that nursing outputs are assessed by NOC-based measures. There are many scales to measure urinary incontinence (UI), which is very common in children, in old age, and after vaginal birth; NOC scales are ideal for use in the nursing process for comprehensive and holistic assessment, with surveys available. For this reason, the purpose of this study is to evaluate the validity of the NOC outputs and indicators used for UI NANDA-I. This research is a methodological study. In addition to the validity of the scale indicators, experts assessed how much the indicators would contribute to recovery after the nursing intervention. Content validity was applied and calculated according to Fehring's 1987 model. Accordingly, nursing inclusion criteria and scores were determined. For example, experts with at least four years of clinical experience received 4 points, and those with at least one year of experience with the nursing classification system received 1 point. Experts with publications on nursing classification received 1 point, those with a doctoral degree in nursing received 2 points, and those with a master's degree received 1 point. A total of 55 experts were rated as “senior degree” under Fehring's expert scoring, with a score of 90. The experts were asked to what extent these indicators would contribute to recovery after the nursing interventions to be applied. For content validity following Fehring's model, the specialists were asked to score each NOC and NOC indicator from 1 to 5. The score for the significance of indicators ranged from 1=no precaution to 5=very important. 
After the expert opinion, the weighted scores obtained for each NOC and NOC indicator were classified as critical (0.8 or above), supplemental (between 0.5 and 0.8), or excluded (below 0.5). In the NANDA-I / NOC / NIC system (guideline), five NOCs are proposed for nursing diagnoses for UI. These outputs are: Urinary Continence, Urinary Elimination, Tissue Integrity, Self-Care: Toileting, and Medication Response. After the scales were translated into Turkish, the weighted average of the scores obtained from specialists for the coverage of all 5 NOCs and the contribution of nursing initiatives exceeded 0.8. After the opinions of the experts, 79 of the 82 indicators were calculated as critical and 3 as supplemental. Because no score fell below 0.5, no item was removed. All NOC outputs were identified as valid and usable scales in Turkey. In this study, five NOC outcomes were verified for the evaluation of the output of individuals who have received nursing knowledge of UI and variant types. Nurses in Turkey can benefit from the outputs of the NOC scale in caring for elderly people with incontinence.
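The weighted-score arithmetic described in this abstract can be sketched as follows. The mapping of the 1 to 5 ratings onto the weights 0, 0.25, 0.5, 0.75, and 1 is the one commonly associated with Fehring's model and is an assumption here, since the abstract does not spell it out; the treatment of scores that fall exactly on 0.8 or 0.5 is likewise assumed, and the indicator names and ratings are invented.

```python
# Assumed mapping from a 1-5 expert rating to a weight.
RATING_WEIGHTS = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}


def weighted_score(ratings):
    """Average the weights of all expert ratings for one indicator."""
    return sum(RATING_WEIGHTS[r] for r in ratings) / len(ratings)


def classify_indicator(score):
    """Label an indicator from its weighted score (cut-off handling assumed)."""
    if score >= 0.8:
        return "critical"
    if score >= 0.5:
        return "supplemental"
    return "excluded"


# Hypothetical expert ratings for three made-up indicators.
ratings_by_indicator = {
    "indicator_a": [5, 5, 4, 5],
    "indicator_b": [3, 4, 3, 4],
    "indicator_c": [1, 2, 2, 3],
}

labels = {
    name: classify_indicator(weighted_score(ratings))
    for name, ratings in ratings_by_indicator.items()
}
```

For example, ratings of 5, 5, 4, and 5 give weights 1, 1, 0.75, and 1, whose mean of 0.9375 puts the indicator in the critical group.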

Keywords: nursing outcomes, content validity, nursing diagnosis, urinary incontinence

Procedia PDF Downloads 113
1607 Prediction of Remaining Life of Industrial Cutting Tools with Deep Learning-Assisted Image Processing Techniques

Authors: Gizem Eser Erdek

Abstract:

This study investigates predicting the remaining life of cutting tools used in industrial production processes with deep learning methods. As the life of cutting tools decreases, they damage the raw material they are processing. This study aims to predict the remaining life of the cutting tool based on the damage the tool causes to the raw material. For this purpose, hole photos were collected from a hole-drilling machine for 8 months. The photos were labeled in 5 classes according to hole quality. In this way, the problem was transformed into a classification problem. Using the prepared data set, a model was created with convolutional neural networks, a deep learning method. In addition, the VGGNet and ResNet architectures, which have been successful in the literature, were tested on the data set. A hybrid model using convolutional neural networks and support vector machines was also used for comparison. When all models were compared, the model using convolutional neural networks gave the most successful results, with a 74% accuracy rate. In the preliminary studies, the data set was arranged to include only the best and worst classes, and the study gave ~93% accuracy when a binary classification model was applied. The results of this study showed that the remaining life of cutting tools could be predicted by deep learning methods based on the damage to the raw material. The experiments indicate that deep learning methods can be used as an alternative for cutting tool life estimation.

Keywords: classification, convolutional neural network, deep learning, remaining life of industrial cutting tools, ResNet, support vector machine, VggNet

Procedia PDF Downloads 53
1606 Profit and Nonprofit Sports Clubs, Financial and Organizational Comparison in Poland

Authors: Igor Perechuda, Wojciech Cieśliński

Abstract:

The paper identifies the features of Polish sports clubs in two organizational forms: for-profit and nonprofit. These features are identified and described in terms of the financial efficiency of the given organizational form. In terms of efficiency, the research makes it possible to specify the advantages and limitations of each organizational form of sports club. The paper considers the features of sports clubs under Polish conditions, such as legal regulations. The sources of the functioning efficiency of sports clubs may lie in the organizational forms in which they operate. Each of the available forms can be considered either a for-profit or nonprofit enterprise. Depending on this classification, a given sports club has different capabilities for increasing its organizational and financial efficiency. The authors start with a general classification of, and the differences between, for-profit and nonprofit sports clubs. They then identify the specific financial and organizational conditions of both organizational forms and show examples of mixed forms of activity and their effect on efficiency.

Keywords: financial efficiency, for-profit, non-profit, sports club

Procedia PDF Downloads 523
1605 Corporate Culture and Subcultures: Corporate Culture Analysis in a Company without a Public Relations Department

Authors: Sibel Kurt

Abstract:

This study uses Goffee and Jones’s corporate culture classification, and the scale based on it, to analyze the corporate culture of a company that does not have a public relations or communication department. First, the type of corporate culture in the company was determined. Then the study examined whether there are subcultures formed according to demographics or department of work. The survey questionnaire contains 53 questions in total: 6 about demographics and 47 about corporate culture. A total of 152 company personnel answered the survey, and the data were evaluated using frequency, descriptive, and compare-means tests. The company’s corporate culture was determined to be 'communal', in its positive form, in the typology of Goffee and Jones. No subcultures based on demographics were found in the company, but one subculture was identified according to department of work. As a result, the absence of a public relations department, the personnel’s low level of awareness of corporate culture, and the lack of information flow between management and employees were revealed.

Keywords: corporate culture, subculture, public relations, organizational communication

Procedia PDF Downloads 142
1604 A World Map of Seabed Sediment Based on 50 Years of Knowledge

Authors: T. Garlan, I. Gabelotaud, S. Lucas, E. Marchès

Abstract:

Production of a global sedimentological seabed map was initiated in 1995 to provide the necessary tool for searches of aircraft and boats lost at sea, to give sedimentary information for nautical charts, and to provide input data for acoustic propagation modelling. This original approach had already been initiated one century ago, when the French hydrographic service and the University of Nancy produced maps of the distribution of marine sediments of the French coasts and then sediment maps of the continental shelves of Europe and North America. The current map of ocean sediments was initiated with UNESCO's general map of the deep ocean floor. This map was adapted using a unique sediment classification to present all types of sediments: from beaches to the deep seabed and from glacial deposits to tropical sediments. In order to allow good visualization and to suit the different applications, only the granularity of sediments is represented. Published seabed maps are studied; if they are of interest, the nature of the seabed is extracted from them, the sediment classification is transcribed, and the resulting map is integrated into the world map. Data also come from interpretations of Multibeam Echo Sounder (MES) imagery from large hydrographic surveys of the deep ocean. These allow very high-quality mapping of areas that until then were represented as homogeneous. The third and principal source of data is the integration of regional maps produced specifically for this project. These regional maps are produced using all the bathymetric and sedimentary data of a region. This step makes it possible to produce a regional synthesis map, with generalizations made where the data are over-precise. 86 regional maps of the Atlantic Ocean, the Mediterranean Sea, and the Indian Ocean have been produced and integrated into the world sedimentary map. 
This work is ongoing and yields a new digital version every two years, with the integration of some new maps. This article describes the choices made in terms of sediment classification, the scale of source data, and the zonation of the variability of the quality. This map is the final step in a system comprising the Shom Sedimentary Database, enriched by more than one million point and surface items of data, and four series of coastal seabed maps at 1:10,000, 1:50,000, 1:200,000 and 1:1,000,000. This step-by-step approach makes it possible to take into account the progress in knowledge made in the field of seabed characterization during the last decades. Thus, the arrival of new classification systems for the seafloor has improved recent seabed maps, and the compilation of these new maps with those previously published allows a gradual enrichment of the world sedimentary map. However, much work remains to improve some regions, which are still based on data acquired more than half a century ago.

Keywords: marine sedimentology, seabed map, sediment classification, world ocean

Procedia PDF Downloads 216
1603 Establishment of Air Quality Zones in Italy

Authors: M. G. Dirodi, G. Gugliotta, C. Leonardi

Abstract:

The member states shall establish zones and agglomerations throughout their territory to assess and manage air quality in order to comply with European directives. In Italy, Decree 155/2010, transposing Directive 2008/50/EC on ambient air quality and cleaner air for Europe, merged into a single act the previous provisions on ambient air quality assessment and management, including those resulting from the implementation of Directive 2004/107/EC relating to arsenic, cadmium, nickel, mercury, and polycyclic aromatic hydrocarbons in ambient air. Decree 155/2010 introduced stricter rules for identifying zones on the basis of the characteristics of the territory instead of considering pollution levels, as was done in the past. The implementation of these new criteria has reduced the great variability of the previous zoning, leading to a significant reduction in the total number of zones and to a complete and uniform ambient air quality assessment and management throughout the country. The present document concerns the definition of the new zones in Italy according to Decree 155/2010. In particular, the paper contains the description and the analysis of the outcome of zoning and classification.

Keywords: zones, agglomerations, air quality assessment, classification

Procedia PDF Downloads 312
1602 Optimizing Load Shedding Schedule Problem Based on Harmony Search

Authors: Almahd Alshereef, Ahmed Alkilany, Hammad Said, Azuraliza Abu Bakar

Abstract:

From time to time, the National Electricity Operator directs the electrical power grid to conduct load shedding, which involves power outages lasting hours in the area of this study, the Southern Electrical Grid of Libya (SEGL). Load shedding is conducted in order to alleviate pressure on the National Electricity Grid at times of peak demand. This approach selects a set of categories to study the load-shedding problem, considering the effect of demand priorities on the operation of the power system during emergencies. The classification of category regions for the load-shedding problem is solved by a new algorithm (the harmony algorithm) based on the "random generation list of category region", which is a possible solution with a degree of proximity to the optimum. The results obtained show additional improvements compared to other heuristic approaches. The case studies are carried out on SEGL.

Keywords: optimization, harmony algorithm, load shedding, classification

Procedia PDF Downloads 368
1601 Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principles in Noisy Environments

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principles of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleargrams as the image. In this paper, atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arrangement of the atomic indices in the T-F vector is used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speaker has 10 sentences. Two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between the probabilities from the training sentences and the test sentences. Training is done at a 30 dB Signal-to-Noise Ratio (SNR). Testing is performed at SNRs of 0 dB, 5 dB, 10 dB and 30 dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN, and the algorithm achieves ~93% accuracy at 0 dB SNR.
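The final classification step, picking the training sentence whose atomic index probabilities lie closest to those of the test sentence, can be sketched in a few lines. The sketch assumes that the probabilities are simply the relative frequencies of the dictionary indices selected for a sentence; the abstract does not define them further. The speaker names, the dictionary size of 4, and the index sequences are made up, and the matching pursuit and autoencoder stages are not shown.

```python
import math
from collections import Counter


def index_probabilities(atom_indices, dictionary_size):
    """Relative frequency of each dictionary index in one sentence."""
    counts = Counter(atom_indices)
    total = len(atom_indices)
    return [counts.get(i, 0) / total for i in range(dictionary_size)]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def classify(test_indices, training, dictionary_size):
    """Return the speaker owning the closest training sentence."""
    test_p = index_probabilities(test_indices, dictionary_size)
    best_speaker, best_distance = None, math.inf
    for speaker, sentences in training.items():
        for sentence in sentences:
            train_p = index_probabilities(sentence, dictionary_size)
            distance = euclidean(test_p, train_p)
            if distance < best_distance:
                best_speaker, best_distance = speaker, distance
    return best_speaker


# Hypothetical atom indices chosen for two training sentences per speaker.
training = {
    "speaker_a": [[0, 0, 1, 2], [0, 1, 0, 0]],
    "speaker_b": [[3, 3, 2, 3], [2, 3, 3, 1]],
}

predicted = classify([0, 0, 0, 1], training, dictionary_size=4)
```

Because only index frequencies are compared, the amplitude weights of the atoms play no role in this simplified version.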

Keywords: time-frequency plane, atomic decomposition, envelope sampling, Gabor atoms, matching pursuit, sparse dictionary learning, sparse autoencoder

Procedia PDF Downloads 276