Search results for: ABC-VED inventory classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2910

Search results for: ABC-VED inventory classification

2220 Review of Cyber Security in Oil and Gas Industry with Cloud Computing Perspective: Taxonomy, Issues and Future Direction

Authors: Irfan Mohiuddin, Ahmad Al Mogren

Abstract:

In recent years, cloud computing has earned substantial attention in the Oil and Gas Industry and provides services in all the phases of the industry lifecycle. Oil and gas supply infrastructure, in particular, is more vulnerable to accidental, natural and intentional threats because of its widespread distribution. Numerous surveys have been conducted on cloud security and privacy. However, to the best of our knowledge, hardly any survey is carried out that reviews cyber security in all phases with a cloud computing perspective. Moreover, a distinctive classification is performed for all the cloud-based cyber security measures based on the cloud component in use. The classification approach will enable researchers to identify the required technique used to enhance the security in specific cloud components. Also, the limitation of each component will allow the researchers to design optimal algorithms. Lastly, future directions are given to point out the imminent challenges that can pave the way for researchers to further enhance the resilience to cyber security threats in the oil and gas industry.

Keywords: cyber security, cloud computing, safety and security, oil and gas industry, security threats, oil and gas pipelines

Procedia PDF Downloads 143
2219 The Problems of Women over 65 with Incontinence Diagnosis: A Case Study in Turkey

Authors: Birsel Canan Demirbag, Kıymet Yesilcicek Calik, Hacer Kobya Bulut

Abstract:

Objective: This study was conducted to evaluate the problems of women over 65 with incontinence diagnosis. Methods: This descriptive study was conducted with women over 65 with incontinence diagnosis in four Family Health Centers in a city in Eastern Black Sea region between November 1, and December 20, 2015. 203, 107, 178, 180 women over 65 were registered in these centers and 262 had incontinence diagnosis at least once and had an ongoing complaint. 177 women were volunteers for the study. During home visits and using face-to-face survey methodology, participants were given socio-demographic characteristics survey, Sandvik severity scale, Incontinence Quality of Life Scale, Urogenital Distress Inventory and a questionnaire including challenges experienced due to incontinence developed by the researcher. Data were analyzed with SPSS program using percentages, numbers, Chi-square, Man-Whitney U and t test with 95% confidence interval and a significance level p <0.05. Findings: 67 ± 1.4 was the mean age, 2.05 ± 0.04 was parity, 44.5 ± 2.12 was menopause age, 66.3% were primary school graduates, 45.7% had deceased spouse, 44.4% lived in a large family, 67.2% had their own room, 77.8% had income, 89.2% could meet self- care, 73.2% had a diagnosis of mixed incontinence, 87.5% suffered for 6-20 years % 78.2 had diuretics, antidepressants and heart medicines, 20.5% had urinary fecal cases, 80.5% had bladder training at least once, 90.1% didn’t have bladder diary calendar/control training programs, 31.1% had hysterectomy for prolapse, 97.1'i% was treated with lower urinary tract infection at least once, 66.3% saw a doctor to get drug in the last three months, 76.2 could not go out alone, 99.2 % had at least one chronic disease, 87.6 % had constipation complain, 2.9% had chronic cough., 45.1% fell due to a sudden rise for toilet. Incontinence Impact Questionnaire Average score was (QOL) 54.3 ± 21.1, Sandvik score was 12.1 ± 2.5, Urogenital Distress Inventory was 47.7 ± 9.2. Difficulties experienced due to incontinence were 99.5% feeling of unhappiness, 67.1% constant feeling of urine smell due to failing to change briefs frequently, % 87.2 move away from social life, 89.7 unable to use pad, 99.2% feeling of disturbing households / other individuals, 87.5% feel dizziness/fall due to sudden rise, 87.4% feeling of others’ imperceptions about the situation, % 94.3 insomnia, 78.2 lack of assistance, 84.7% couldn’t afford urine protection briefs. Results: With this study, it was found out that there were a lot of unsolved issues at individual and community level affecting the life quality of women with incontinence. In accordance with this common problem in women, to facilitate daily life it is obvious that regular home care training programs at institutional level in our country will be effective.

Keywords: health problems, incontinence, incontinence quality of life questionnaire, old age, urinary urogenital distress inventory, Sandviken severity, women

Procedia PDF Downloads 321
2218 A Study of Applying the Use of Breathing Training to Palliative Care Patients, Based on the Bio-Psycho-Social Model

Authors: Wenhsuan Lee, Yachi Chang, Yingyih Shih

Abstract:

In clinical practices, it is common that while facing the unknown progress of their disease, palliative care patients may easily feel anxious and depressed. These types of reactions are a cause of psychosomatic diseases and may also influence treatment results. However, the purpose of palliative care is to provide relief from all kinds of pains. Therefore, how to make patients more comfortable is an issue worth studying. This study adopted the “bio-psycho-social model” proposed by Engel and applied spontaneous breathing training, in the hope of seeing patients’ psychological state changes caused by their physiological state changes, improvements in their anxious conditions, corresponding adjustments of their cognitive functions, and further enhancement of their social functions and the social support system. This study will be a one-year study. Palliative care outpatients will be recruited and assigned to the experimental group or the control group for six outpatient visits (once a month), with 80 patients in each group. The patients of both groups agreed that this study can collect their physiological quantitative data using an HRV device before the first outpatient visit. They also agreed to answer the “Beck Anxiety Inventory (BAI)”, the “Taiwanese version of the WHOQOL-BREF questionnaire” before the first outpatient visit, to fill a self-report questionnaire after each outpatient visit, and to answer the “Beck Anxiety Inventory (BAI)”, the “Taiwanese version of the WHOQOL-BREF questionnaire” after the last outpatient visit. The patients of the experimental group agreed to receive the breathing training under HRV monitoring during the first outpatient visit of this study. Before each of the following three outpatient visits, they were required to fill a self-report questionnaire regarding their breathing practices after going home. After the outpatient visits, they were taught how to practice breathing through an HRV device and asked to practice it after going home. Later, based on the results from the HRV data analyses and the pre-tests and post-tests of the “Beck Anxiety Inventory (BAI)”, the “Taiwanese version of the WHOQOL-BREF questionnaire”, the influence of the breathing training in the bio, psycho, and social aspects were evaluated. The data collected through the self-report questionnaires of the patients of both groups were used to explore the possible interfering factors among the bio, psycho, and social changes. It is expected that this study will support the “bio-psycho-social model” proposed by Engel, meaning that bio, psycho, and social supports are closely related, and that breathing training helps to transform palliative care patients’ psychological feelings of anxiety and depression, to facilitate their positive interactions with others, and to improve the quality medical care for them.

Keywords: palliative care, breathing training, bio-psycho-social model, heart rate variability

Procedia PDF Downloads 260
2217 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters

Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu

Abstract:

An accurate prediction of TBM(Tunnel Boring Machine) performance is very difficult for reliable estimation of the construction period and cost in preconstruction stage. For this purpose, the aim of this study is to analyze the evaluation process of various prediction models published since 2000 for TBM performance, and to select the optimal input parameters for the prediction model. A classification system of TBM performance prediction model and applied methodology are proposed in this research. Input and output parameters applied for prediction models are also represented. Based on these results, a statistical analysis is performed using the collected data from shield TBM tunnel in South Korea. By performing a simple regression and residual analysis utilizinFg statistical program, R, the optimal input parameters are selected. These results are expected to be used for development of prediction model of TBM performance.

Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters

Procedia PDF Downloads 309
2216 lncRNA Gene Expression Profiling Analysis by TCGA RNA-Seq Data of Breast Cancer

Authors: Xiaoping Su, Gabriel G. Malouf

Abstract:

Introduction: Breast cancer is a heterogeneous disease that can be classified in 4 subgroups using transcriptional profiling. The role of lncRNA expression in human breast cancer biology, prognosis, and molecular classification remains unknown. Methods and results: Using an integrative comprehensive analysis of lncRNA, mRNA and DNA methylation in 900 breast cancer patients from The Cancer Genome Atlas (TCGA) project, we unraveled the molecular portraits of 1,700 expressed lncRNA. Some of those lncRNAs (i.e, HOTAIR) are previously reported and others are novel (i.e, HOTAIRM1, MAPT-AS1). The lncRNA classification correlated well with the PAM50 classification for basal-like, Her-2 enriched and luminal B subgroups, in contrast to the luminal A subgroup which behaved differently. Importantly, estrogen receptor (ESR1) expression was associated with distinct lncRNA networks in lncRNA clusters III and IV. Gene set enrichment analysis for cis- and trans-acting lncRNA showed enrichment for breast cancer signatures driven by breast cancer master regulators. Almost two third of those lncRNA were marked by enhancer chromatin modifications (i.e., H3K27ac), suggesting that lncRNA expression may result in increased activity of neighboring genes. Differential analysis of gene expression profiling data showed that lncRNA HOTAIRM1 was significantly down-regulated in basal-like subtype, and DNA methylation profiling data showed that lncRNA HOTAIRM1 was highly methylated in basal-like subtype. Thus, our integrative analysis of gene expression and DNA methylation strongly suggested that lncRNA HOTAIRM1 should be a tumor suppressor in basal-like subtype. Conclusion and significance: Our study depicts the first lncRNA molecular portrait of breast cancer and shows that lncRNA HOTAIRM1 might be a novel tumor suppressor.

Keywords: lncRNA profiling, breast cancer, HOTAIRM1, tumor suppressor

Procedia PDF Downloads 106
2215 National Assessment for Schools in Saudi Arabia: Score Reliability and Plausible Values

Authors: Dimiter M. Dimitrov, Abdullah Sadaawi

Abstract:

The National Assessment for Schools (NAFS) in Saudi Arabia consists of standardized tests in Mathematics, Reading, and Science for school grade levels 3, 6, and 9. One main goal is to classify students into four categories of NAFS performance (minimal, basic, proficient, and advanced) by schools and the entire national sample. The NAFS scoring and equating is performed on a bounded scale (D-scale: ranging from 0 to 1) in the framework of the recently developed “D-scoring method of measurement.” The specificity of the NAFS measurement framework and data complexity presented both challenges and opportunities to (a) the estimation of score reliability for schools, (b) setting cut-scores for the classification of students into categories of performance, and (c) generating plausible values for distributions of student performance on the D-scale. The estimation of score reliability at the school level was performed in the framework of generalizability theory (GT), with students “nested” within schools and test items “nested” within test forms. The GT design was executed via a multilevel modeling syntax code in R. Cut-scores (on the D-scale) for the classification of students into performance categories was derived via a recently developed method of standard setting, referred to as “Response Vector for Mastery” (RVM) method. For each school, the classification of students into categories of NAFS performance was based on distributions of plausible values for the students’ scores on NAFS tests by grade level (3, 6, and 9) and subject (Mathematics, Reading, and Science). Plausible values (on the D-scale) for each individual student were generated via random selection from a statistical logit-normal distribution with parameters derived from the student’s D-score and its conditional standard error, SE(D). All procedures related to D-scoring, equating, generating plausible values, and classification of students into performance levels were executed via a computer program in R developed for the purpose of NAFS data analysis.

Keywords: large-scale assessment, reliability, generalizability theory, plausible values

Procedia PDF Downloads 21
2214 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 659
2213 Astronomical Object Classification

Authors: Alina Muradyan, Lina Babayan, Arsen Nanyan, Gohar Galstyan, Vigen Khachatryan

Abstract:

We present a photometric method for identifying stars, galaxies and quasars in multi-color surveys, which uses a library of ∼> 65000 color templates for comparison with observed objects. The method aims for extracting the information content of object colors in a statistically correct way, and performs a classification as well as a redshift estimation for galaxies and quasars in a unified approach based on the same probability density functions. For the redshift estimation, we employ an advanced version of the Minimum Error Variance estimator which determines the redshift error from the redshift dependent probability density function itself. The method was originally developed for the Calar Alto Deep Imaging Survey (CADIS), but is now used in a wide variety of survey projects. We checked its performance by spectroscopy of CADIS objects, where the method provides high reliability (6 errors among 151 objects with R < 24), especially for the quasar selection, and redshifts accurate within σz ≈ 0.03 for galaxies and σz ≈ 0.1 for quasars. For an optimization of future survey efforts, a few model surveys are compared, which are designed to use the same total amount of telescope time but different sets of broad-band and medium-band filters. Their performance is investigated by Monte-Carlo simulations as well as by analytic evaluation in terms of classification and redshift estimation. If photon noise were the only error source, broad-band surveys and medium-band surveys should perform equally well, as long as they provide the same spectral coverage. In practice, medium-band surveys show superior performance due to their higher tolerance for calibration errors and cosmic variance. Finally, we discuss the relevance of color calibration and derive important conclusions for the issues of library design and choice of filters. The calibration accuracy poses strong constraints on an accurate classification, which are most critical for surveys with few, broad and deeply exposed filters, but less severe for surveys with many, narrow and less deep filters.

Keywords: VO, ArVO, DFBS, FITS, image processing, data analysis

Procedia PDF Downloads 80
2212 Analyses of Adverse Drug Reactions Reported of Hospital in Taiwan

Authors: Yu-Hong Lin

Abstract:

Background: An adverse drug reaction (ADR) reported is an injury which caused by taking medicines. Sometimes the severity of ADR reported may be minor, but sometimes it could be a life-threatening situation. In order to provide healthcare professionals as a better reference in clinical practice, we do data collection and analysis from our hospital. Methods: This was a retrospective study of ADRs reported performed from 2014 to 2015 in our hospital in Taiwan. We collected assessment items of ADRs reported, which contain gender and age, occurring sources, Anatomical Therapeutic Chemical (ATC) classification of suspected drugs, types of adverse reactions, Naranjo score calculating by Naranjo Adverse Drug Reaction Probability Scale and so on. Results: The investigation included two hundred and seven ADRs reported. Most of ADRs reported were occurring in outpatient department (92%). The average age of ADRs reported was 65.3 years. Less than 65 years of age were in the majority in this study (54%). Majority of all ADRs reported were males (51%). According to ATC classification system, the major classification of suspected drugs was cardiovascular system (19%) and antiinfectives for systemic use (18%) respectively. Among the adverse reactions, Dermatologic Effects (35%) were the major type of ADRs. Also, the major Naranjo scores of all ADRs reported ranged from 1 to 4 points (91%), which represents a possible correlation between ADRs reported and suspected drugs. Conclusions: Definitely, ADRs reported is still an extremely important information for healthcare professionals. For that reason, we put all information of ADRs reported into our hospital's computer system, and it will improve the safety of medication use. By hospital's computer system, it can remind prescribers to think of information about patient's ADRs reported. No drugs are administered without risk. Therefore, all healthcare professionals should have a responsibility to their patients, who themselves are becoming more aware of problems associated with drug therapy.

Keywords: adverse drug reaction, Taiwan, healthcare professionals, safe use of medicines

Procedia PDF Downloads 230
2211 A Two-Week and Six-Month Stability of Cancer Health Literacy Classification Using the CHLT-6

Authors: Levent Dumenci, Laura A. Siminoff

Abstract:

Health literacy has been shown to predict a variety of health outcomes. Reliable identification of persons with limited cancer health literacy (LCHL) has been proved questionable with existing instruments using an arbitrary cut point along a continuum. The CHLT-6, however, uses a latent mixture modeling approach to identify persons with LCHL. The purpose of this study was to estimate two-week and six-month stability of identifying persons with LCHL using the CHLT-6 with a discrete latent variable approach as the underlying measurement structure. Using a test-retest design, the CHLT-6 was administered to cancer patients with two-week (N=98) and six-month (N=51) intervals. The two-week and six-month latent test-retest agreements were 89% and 88%, respectively. The chance-corrected latent agreements estimated from Dumenci’s latent kappa were 0.62 (95% CI: 0.41 – 0.82) and .47 (95% CI: 0.14 – 0.80) for the two-week and six-month intervals, respectively. High levels of latent test-retest agreement between limited and adequate categories of cancer health literacy construct, coupled with moderate to good levels of change-corrected latent agreements indicated that the CHLT-6 classification of limited versus adequate cancer health literacy is relatively stable over time. In conclusion, the measurement structure underlying the instrument allows for estimating classification errors circumventing limitations due to arbitrary approaches adopted by all other instruments. The CHLT-6 can be used to identify persons with LCHL in oncology clinics and intervention studies to accurately estimate treatment effectiveness.

Keywords: limited cancer health literacy, the CHLT-6, discrete latent variable modeling, latent agreement

Procedia PDF Downloads 179
2210 Fake Accounts Detection in Twitter Based on Minimum Weighted Feature Set

Authors: Ahmed ElAzab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny

Abstract:

Social networking sites such as Twitter and Facebook attracts over 500 million users across the world, for those users, their social life, even their practical life, has become interrelated. Their interaction with social networking has affected their life forever. Accordingly, social networking sites have become among the main channels that are responsible for vast dissemination of different kinds of information during real time events. This popularity in Social networking has led to different problems including the possibility of exposing incorrect information to their users through fake accounts which results to the spread of malicious content during life events. This situation can result to a huge damage in the real world to the society in general including citizens, business entities, and others. In this paper, we present a classification method for detecting fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of the fake accounts on Twitter, then the determined factors have been applied using different classification techniques, a comparison of the results for these techniques has been performed and the most accurate algorithm is selected according to the accuracy of the results. The study has been compared with different recent research in the same area, this comparison has proved the accuracy of the proposed study. We claim that this study can be continuously applied on Twitter social network to automatically detect the fake accounts, moreover, the study can be applied on different Social network sites such as Facebook with minor changes according to the nature of the social network which are discussed in this paper.

Keywords: fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques

Procedia PDF Downloads 419
2209 Rapid Classification of Soft Rot Enterobacteriaceae Phyto-Pathogens Pectobacterium and Dickeya Spp. Using Infrared Spectroscopy and Machine Learning

Authors: George Abu-Aqil, Leah Tsror, Elad Shufan, Shaul Mordechai, Mahmoud Huleihel, Ahmad Salman

Abstract:

Pectobacterium and Dickeya spp which negatively affect a wide range of crops are the main causes of the aggressive diseases of agricultural crops. These aggressive diseases are responsible for a huge economic loss in agriculture including a severe decrease in the quality of the stored vegetables and fruits. Therefore, it is important to detect these pathogenic bacteria at their early stages of infection to control their spread and consequently reduce the economic losses. In addition, early detection is vital for producing non-infected propagative material for future generations. The currently used molecular techniques for the identification of these bacteria at the strain level are expensive and laborious. Other techniques require a long time of ~48 h for detection. Thus, there is a clear need for rapid, non-expensive, accurate and reliable techniques for early detection of these bacteria. In this study, infrared spectroscopy, which is a well-known technique with all its features, was used for rapid detection of Pectobacterium and Dickeya spp. at the strain level. The bacteria were isolated from potato plants and tubers with soft rot symptoms and measured by infrared spectroscopy. The obtained spectra were analyzed using different machine learning algorithms. The performances of our approach for taxonomic classification among the bacterial samples were evaluated in terms of success rates. The success rates for the correct classification of the genus, species and strain levels were ~100%, 95.2% and 92.6% respectively.

Keywords: soft rot enterobacteriaceae (SRE), pectobacterium, dickeya, plant infections, potato, solanum tuberosum, infrared spectroscopy, machine learning

Procedia PDF Downloads 103
2208 Integrating Optuna and Synthetic Data Generation for Optimized Medical Transcript Classification Using BioBERT

Authors: Sachi Nandan Mohanty, Shreya Sinha, Sweeti Sah, Shweta Sharma4

Abstract:

The advancement of natural language processing has majorly influenced the field of medical transcript classification, providing a robust framework for enhancing the accuracy of clinical data processing. It has enormous potential to transform healthcare and improve people's livelihoods. This research focuses on improving the accuracy of medical transcript categorization using Bidirectional Encoder Representations from Transformers (BERT) and its specialized variants, including BioBERT, ClinicalBERT, SciBERT, and BlueBERT. The experimental work employs Optuna, an optimization framework, for hyperparameter tuning to identify the most effective variant, concluding that BioBERT yields the best performance. Furthermore, various optimizers, including Adam, RMSprop, and Layerwise adaptive large batch optimization (LAMB), were evaluated alongside BERT's default AdamW optimizer. The findings show that the LAMB optimizer achieves a performance that is equally good as AdamW's. Synthetic data generation techniques from Gretel were utilized to augment the dataset, expanding the original dataset from 5,000 to 10,000 rows. Subsequent evaluations demonstrated that the model maintained its performance with synthetic data, with the LAMB optimizer showing marginally better results. The enhanced dataset and optimized model configurations improved classification accuracy, showcasing the efficacy of the BioBERT variant and the LAMB optimizer. It resulted in an accuracy of up to 98.2% and 90.8% for the original and combined datasets.

Keywords: BioBERT, clinical data, healthcare AI, transformer models

Procedia PDF Downloads 4
2207 Design of Bacterial Pathogens Identification System Based on Scattering of Laser Beam Light and Classification of Binned Plots

Authors: Mubashir Hussain, Mu Lv, Xiaohan Dong, Zhiyang Li, Bin Liu, Nongyue He

Abstract:

Detection and classification of microbes have a vast range of applications in biomedical engineering especially in detection, characterization, and quantification of bacterial contaminants. For identification of pathogens, different techniques are emerging in the field of biomedical engineering. Latest technology uses light scattering, capable of identifying different pathogens without any need for biochemical processing. Bacterial Pathogens Identification System (BPIS) which uses a laser beam, passes through the sample and light scatters off. An assembly of photodetectors surrounded by the sample at different angles to detect the scattering of light. The algorithm of the system consists of two parts: (a) Library files, and (b) Comparator. Library files contain data of known species of bacterial microbes in the form of binned plots, while comparator compares data of unknown sample with library files. Using collected data of unknown bacterial species, highest voltage values stored in the form of peaks and arranged in 3D histograms to find the frequency of occurrence. Resulting data compared with library files of known bacterial species. If sample data matching with any library file of known bacterial species, sample identified as a matched microbe. An experiment performed to identify three different bacteria particles: Enterococcus faecalis, Pseudomonas aeruginosa, and Escherichia coli. By applying algorithm using library files of given samples, results were compromising. This system is potentially applicable to several biomedical areas, especially those related to cell morphology.

Keywords: microbial identification, laser scattering, peak identification, binned plots classification

Procedia PDF Downloads 150
2206 Investigating the Relationship between the Kuwait Stock Market and Its Marketing Sectors

Authors: Mohamad H. Atyeh, Ahmad Khaldi

Abstract:

The main objective of this research is to measure the relationship between the Kuwait stock Exchange (KSE) index and its two marketing sectors after the new market classification. The findings of this research are important for Public economic policy makers as they need to know if the new system (new classification) is efficient and to what level, to monitor the markets and intervene with appropriate measures. The data used are the daily index of the whole Kuwaiti market and the daily closing price, number of deals and volume of shares traded of two marketing sectors (consumer goods and consumer services) for the period from the 13th of May 2012 till the 12th of December 2016. The results indicate a positive direct impact of the closing price, volume and deals indexes of the consumer goods and the consumer services companies on the overall KSE index, volume and deals of the Kuwaiti stock market (KSE).

Keywords: correlation, market capitalization, Kuwait Stock Exchange (KSE), marketing sectors, stock performance

Procedia PDF Downloads 328
2205 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral.Sentiment analysis of documents holds great importance in today's world, when numerous information is stored in databases and in the world wide web. An efficient algorithm to illicit such information, would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document into positive or negative. Using our algorithm, we obtained a feature set from the data, and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is considered by many procedures like the Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use empirical distribution for estimation, but developed a method by finding degree of close clustering of the data points. We have applied our algorithm on a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 404
2204 A Feature Clustering-Based Sequential Selection Approach for Color Texture Classification

Authors: Mohamed Alimoussa, Alice Porebski, Nicolas Vandenbroucke, Rachid Oulad Haj Thami, Sana El Fkihi

Abstract:

Color and texture are highly discriminant visual cues that provide an essential information in many types of images. Color texture representation and classification is therefore one of the most challenging problems in computer vision and image processing applications. Color textures can be represented in different color spaces by using multiple image descriptors which generate a high dimensional set of texture features. In order to reduce the dimensionality of the feature set, feature selection techniques can be used. The goal of feature selection is to find a relevant subset from an original feature space that can improve the accuracy and efficiency of a classification algorithm. Traditionally, feature selection is focused on removing irrelevant features, neglecting the possible redundancy between relevant ones. This is why some feature selection approaches prefer to use feature clustering analysis to aid and guide the search. These techniques can be divided into two categories. i) Feature clustering-based ranking algorithm uses feature clustering as an analysis that comes before feature ranking. Indeed, after dividing the feature set into groups, these approaches perform a feature ranking in order to select the most discriminant feature of each group. ii) Feature clustering-based subset search algorithms can use feature clustering following one of three strategies; as an initial step that comes before the search, binded and combined with the search or as the search alternative and replacement. In this paper, we propose a new feature clustering-based sequential selection approach for the purpose of color texture representation and classification. Our approach is a three step algorithm. First, irrelevant features are removed from the feature set thanks to a class-correlation measure. Then, introducing a new automatic feature clustering algorithm, the feature set is divided into several feature clusters. Finally, a sequential search algorithm, based on a filter model and a separability measure, builds a relevant and non redundant feature subset: at each step, a feature is selected and features of the same cluster are removed and thus not considered thereafter. This allows to significantly speed up the selection process since large number of redundant features are eliminated at each step. The proposed algorithm uses the clustering algorithm binded and combined with the search. Experiments using a combination of two well known texture descriptors, namely Haralick features extracted from Reduced Size Chromatic Co-occurence Matrices (RSCCMs) and features extracted from Local Binary patterns (LBP) image histograms, on five color texture data sets, Outex, NewBarktex, Parquet, Stex and USPtex demonstrate the efficiency of our method compared to seven of the state of the art methods in terms of accuracy and computation time.

Keywords: feature selection, color texture classification, feature clustering, color LBP, chromatic cooccurrence matrix

Procedia PDF Downloads 138
2203 Numerical Simulation of Air Pollutant Using Coupled AERMOD-WRF Modeling System over Visakhapatnam: A Case Study

Authors: Amit Kumar

Abstract:

Accurate identification of deteriorated air quality regions is very helpful in devising better environmental practices and mitigation efforts. In the present study, an attempt has been made to identify the air pollutant dispersion patterns especially NOX due to vehicular and industrial sources over a rapidly developing urban city, Visakhapatnam (17°42’ N, 83°20’ E), India, during April 2009. Using the emission factors of different vehicles as well as the industry, a high resolution 1 km x 1 km gridded emission inventory has been developed for Visakhapatnam city. A dispersion model AERMOD with explicit representation of planetary boundary layer (PBL) dynamics and offline coupled through a developed coupler mechanism with a high resolution mesoscale model WRF-ARW resolution for simulating the dispersion patterns of NOX is used in the work. The meteorological as well as PBL parameters obtained by employing two PBL schemes viz., non-local Yonsei University (YSU) and local Mellor-Yamada-Janjic (MYJ) of WRF-ARW model, which are reasonably representing the boundary layer parameters are considered for integrating AERMOD. Significantly different dispersion patterns of NOX have been noticed between summer and winter months. The simulated NOX concentration is validated with available six monitoring stations of Central Pollution Control Board, India. Statistical analysis of model evaluated concentrations with the observations reveals that WRF-ARW of YSU scheme with AERMOD has shown better performance. The deteriorated air quality locations are identified over Visakhapatnam based on the validated model simulations of NOX concentrations. The present study advocates the utility of tNumerical Simulation of Air Pollutant Using Coupled AERMOD-WRF Modeling System over Visakhapatnam: A Case Studyhe developed gridded emission inventory of NOX with coupled WRF-AERMOD modeling system for air quality assessment over the study region.

Keywords: WRF-ARW, AERMOD, planetary boundary layer, air quality

Procedia PDF Downloads 282
2202 Cataloguing Beetle Fauna (Insecta: Coleoptera) of India: Estimating Diversity, Distribution, and Taxonomic Challenges

Authors: Devanshu Gupta, Kailash Chandra, Priyanka Das, Joyjit Ghosh

Abstract:

Beetles, in the insect order Coleoptera are the most species-rich group on this planet today. They represent about 40% of the total insect diversity of the world. With a considerable range of landform types including significant mountain ranges, deserts, fertile irrigational plains, and hilly forested areas, India is one of the mega-diverse countries and includes more than 0.1 million faunal species. Despite having rich biodiversity, the efforts to catalogue the beetle diversity of the extant species/taxa reported from India have been less. Therefore, in this paper, the information on the beetle fauna of India is provided based on the data available with the museum collections of Zoological Survey of India and taxa extracted from zoological records and published literature. The species were listed with their valid names, synonyms, type localities, type depositories, and their distribution in states and biogeographic zones of India. The catalogue also incorporates the bibliography on Indian Coleoptera. The exhaustive species inventory, prepared by us include distributional records from Himalaya, Trans Himalaya, Desert, Semi-Arid, Western Ghats, Deccan Peninsula, Gangetic Plains, Northeast, Islands, and Coastal areas of the country. Our study concludes that many of the species are still known from their type localities only, so there is need to revisit and resurvey those collection localities for the taxonomic evaluation of those species. There are species which exhibit single locality records, and taxa-specific biodiversity assessments are required to be undertaken to understand the distributional range of such species. The primary challenge is taxonomic identifications of the species which were described before independence, and the type materials are present in overseas museums. For such species, taxonomic revisions of the different group of beetles are required to solve the problems of identification and classification.

Keywords: checklist, taxonomy, museum collections, biogeographic zones

Procedia PDF Downloads 277
2201 Colored Image Classification Using Quantum Convolutional Neural Networks Approach

Authors: Farina Riaz, Shahab Abdulla, Srinjoy Ganguly, Hajime Suzuki, Ravinesh C. Deo, Susan Hopkins

Abstract:

Recently, quantum machine learning has received significant attention. For various types of data, including text and images, numerous quantum machine learning (QML) models have been created and are being tested. Images are exceedingly complex data components that demand more processing power. Despite being mature, classical machine learning still has difficulties with big data applications. Furthermore, quantum technology has revolutionized how machine learning is thought of, by employing quantum features to address optimization issues. Since quantum hardware is currently extremely noisy, it is not practicable to run machine learning algorithms on it without risking the production of inaccurate results. To discover the advantages of quantum versus classical approaches, this research has concentrated on colored image data. Deep learning classification models are currently being created on Quantum platforms, but they are still in a very early stage. Black and white benchmark image datasets like MNIST and Fashion MINIST have been used in recent research. MNIST and CIFAR-10 were compared for binary classification, but the comparison showed that MNIST performed more accurately than colored CIFAR-10. This research will evaluate the performance of the QML algorithm on the colored benchmark dataset CIFAR-10 to advance QML's real-time applicability. However, deep learning classification models have not been developed to compare colored images like Quantum Convolutional Neural Network (QCNN) to determine how much it is better to classical. Only a few models, such as quantum variational circuits, take colored images. The methodology adopted in this research is a hybrid approach by using penny lane as a simulator. To process the 10 classes of CIFAR-10, the image data has been translated into grey scale and the 28 × 28-pixel image containing 10,000 test and 50,000 training images were used. The objective of this work is to determine how much the quantum approach can outperform a classical approach for a comprehensive dataset of color images. After pre-processing 50,000 images from a classical computer, the QCNN model adopted a hybrid method and encoded the images into a quantum simulator for feature extraction using quantum gate rotations. The measurements were carried out on the classical computer after the rotations were applied. According to the results, we note that the QCNN approach is ~12% more effective than the traditional classical CNN approaches and it is possible that applying data augmentation may increase the accuracy. This study has demonstrated that quantum machine and deep learning models can be relatively superior to the classical machine learning approaches in terms of their processing speed and accuracy when used to perform classification on colored classes.

Keywords: CIFAR-10, quantum convolutional neural networks, quantum deep learning, quantum machine learning

Procedia PDF Downloads 130
2200 Small Target Recognition Based on Trajectory Information

Authors: Saad Alkentar, Abdulkareem Assalem

Abstract:

Recognizing small targets has always posed a significant challenge in image analysis. Over long distances, the image signal-to-noise ratio tends to be low, limiting the amount of useful information available to detection systems. Consequently, visual target recognition becomes an intricate task to tackle. In this study, we introduce a Track Before Detect (TBD) approach that leverages target trajectory information (coordinates) to effectively distinguish between noise and potential targets. By reframing the problem as a multivariate time series classification, we have achieved remarkable results. Specifically, our TBD method achieves an impressive 97% accuracy in separating target signals from noise within a mere half-second time span (consisting of 10 data points). Furthermore, when classifying the identified targets into our predefined categories—airplane, drone, and bird—we achieve an outstanding classification accuracy of 96% over a more extended period of 1.5 seconds (comprising 30 data points).

Keywords: small targets, drones, trajectory information, TBD, multivariate time series

Procedia PDF Downloads 48
2199 Floristic Diversity, Carbon Stocks and Degradation Factors in Two Sacred Forests in the West Cameroon Region

Authors: Maffo Maffo Nicole Liliane, Mounmeni Kpoumie Hubert, Mbaire Matindje Karl Marx, Zapfack Louis

Abstract:

Sacred forests play a valuable role in conserving local biodiversity and provide numerous ecosystem services in Cameroon. The study was carried out in the sacred forests of Bandrefam and Batoufam (western Cameroon). The aim was to estimate the diversity of woody species, carbon stocks and degradation factors in these sacred forests. The floristic inventory was carried out in plots measuring 25m × 25m for trees with diameters greater than 10 cm and 5m × 5m for trees with diameters less than 10 cm. Carbon stocks were estimated using the non-destructive method and the allometric equations. Data on degradation factors were collected using semi-structured surveys in the Bandrefam and Batoufam neighborhoods. The floristic inventory identified 65 species divided into 57 genera and 30 families in the Bandrefam Sacred Forest and 45 species divided into 42 genera and 27 families in the Batoufam Sacres Forest. The families common to both sacred forests are as follows: Phyllanthaceae, Fabaceae, Moraceae, Lamiaceae, Malvaceae, Rubiaceae, Meliaceae, Anacardiaceae, and Sapindaceae. Three genera are present in both sites. These are: Albizia, Macaranga, Trichillia. In addition, there are 27 species in common between the two sites. The total carbon stock is 469.26 tC/ha at Batoufam and 291.41 tC/ha at Bandrefam. The economic value varies between 15 823 877.05 fcfa at Batoufam and 9 825 530.528 fcfa at Bandrefam. The study shows that despite the sacred nature of these forests, they are subject to degradation factors such as bushfires (35.42 %), the creation of plantations (23.96 %), illegal timber exploitation (21.88 %), young people's lack of interest in the notion of conservation (9.38 %), climate change (7.29 %) and growing urbanization (2.08 %). These factors threaten biodiversity and reduce carbon storage in these forests.

Keywords: sacred forests, degradation factors, carbon stocks, semi-structured surveys

Procedia PDF Downloads 49
2198 The Use of Unmanned Aerial System (UAS) in Improving the Measurement System on the Example of Textile Heaps

Authors: Arkadiusz Zurek

Abstract:

The potential of using drones is visible in many areas of logistics, especially in terms of their use for monitoring and control of many processes. The technologies implemented in the last decade concern new possibilities for companies that until now have not even considered them, such as warehouse inventories. Unmanned aerial vehicles are no longer seen as a revolutionary tool for Industry 4.0, but rather as tools in the daily work of factories and logistics operators. The research problem is to develop a method for measuring the weight of goods in a selected link of the clothing supply chain by drones. However, the purpose of this article is to analyze the causes of errors in traditional measurements, and then to identify adverse events related to the use of drones for the inventory of a heap of textiles intended for production purposes. On this basis, it will be possible to develop guidelines to eliminate the causes of these events in the measurement process using drones. In a real environment, work was carried out to determine the volume and weight of textiles, including, among others, weighing a textile sample to determine the average density of the assortment, establishing a local geodetic network, terrestrial laser scanning and photogrammetric raid using an unmanned aerial vehicle. As a result of the analysis of measurement data obtained in the facility, the volume and weight of the assortment and the accuracy of their determination were determined. In this article, this work presents how such heaps are currently being tested, what adverse events occur, indicate and describes the current use of photogrammetric techniques of this type of measurements so far performed by external drones for the inventory of wind farms or construction of the station and compare them with the measurement system of the aforementioned textile heap inside a large-format facility.

Keywords: drones, unmanned aerial system, UAS, indoor system, security, process automation, cost optimization, photogrammetry, risk elimination, industry 4.0

Procedia PDF Downloads 87
2197 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk for hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor called Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset which consists of more than 70,000 patients with 50 attributes. We applied data preprocessing using different techniques in order to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieved classification accuracy of 80% compared to other models that only use k Nearest Neighbor.

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 186
2196 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifiers system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifiers systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfill requirements of target tracking due to improved sensors classification performances with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: single classifier, ensemble learning, multi-target tracking, multiple classifiers

Procedia PDF Downloads 271
2195 Identification and Classification of Fiber-Fortified Semolina by Near-Infrared Spectroscopy (NIR)

Authors: Amanda T. Badaró, Douglas F. Barbin, Sofia T. Garcia, Maria Teresa P. S. Clerici, Amanda R. Ferreira

Abstract:

Food fortification is the intentional addition of a nutrient in a food matrix and has been widely used to overcome the lack of nutrients in the diet or increasing the nutritional value of food. Fortified food must meet the demand of the population, taking into account their habits and risks that these foods may cause. Wheat and its by-products, such as semolina, has been strongly indicated to be used as a food vehicle since it is widely consumed and used in the production of other foods. These products have been strategically used to add some nutrients, such as fibers. Methods of analysis and quantification of these kinds of components are destructive and require lengthy sample preparation and analysis. Therefore, the industry has searched for faster and less invasive methods, such as Near-Infrared Spectroscopy (NIR). NIR is a rapid and cost-effective method, however, it is based on indirect measurements, yielding high amount of data. Therefore, NIR spectroscopy requires calibration with mathematical and statistical tools (Chemometrics) to extract analytical information from the corresponding spectra, as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). PCA is well suited for NIR, once it can handle many spectra at a time and be used for non-supervised classification. Advantages of the PCA, which is also a data reduction technique, is that it reduces the data spectra to a smaller number of latent variables for further interpretation. On the other hand, LDA is a supervised method that searches the Canonical Variables (CV) with the maximum separation among different categories. In LDA, the first CV is the direction of maximum ratio between inter and intra-class variances. The present work used a portable infrared spectrometer (NIR) for identification and classification of pure and fiber-fortified semolina samples. The fiber was added to semolina in two different concentrations, and after the spectra acquisition, the data was used for PCA and LDA to identify and discriminate the samples. The results showed that NIR spectroscopy associate to PCA was very effective in identifying pure and fiber-fortified semolina. Additionally, the classification range of the samples using LDA was between 78.3% and 95% for calibration and 75% and 95% for cross-validation. Thus, after the multivariate analysis such as PCA and LDA, it was possible to verify that NIR associated to chemometric methods is able to identify and classify the different samples in a fast and non-destructive way.

Keywords: Chemometrics, fiber, linear discriminant analysis, near-infrared spectroscopy, principal component analysis, semolina

Procedia PDF Downloads 213
2194 Multiple-Material Flow Control in Construction Supply Chain with External Storage Site

Authors: Fatmah Almathkour

Abstract:

Managing and controlling the construction supply chain (CSC) are very important components of effective construction project execution. The goals of managing the CSC are to reduce uncertainty and optimize the performance of a construction project by improving efficiency and reducing project costs. The heart of much SC activity is addressing risk, and the CSC is no different. The delivery and consumption of construction materials is highly variable due to the complexity of construction operations, rapidly changing demand for certain components, lead time variability from suppliers, transportation time variability, and disruptions at the job site. Current notions of managing and controlling CSC, involve focusing on one project at a time with a push-based material ordering system based on the initial construction schedule and, then, holding a tremendous amount of inventory. A two-stage methodology was proposed to coordinate the feed-forward control of advanced order placement with a supplier to a feedback local control in the form of adding the ability to transship materials between projects to improve efficiency and reduce costs. It focused on the single supplier integrated production and transshipment problem with multiple products. The methodology is used as a design tool for the CSC because it includes an external storage site not associated with one of the projects. The idea is to add this feature to a highly constrained environment to explore its effectiveness in buffering the impact of variability and maintaining project schedule at low cost. The methodology uses deterministic optimization models with objectives that minimizing the total cost of the CSC. To illustrate how this methodology can be used in practice and the types of information that can be gleaned, it is tested on a number of cases based on the real example of multiple construction projects in Kuwait.

Keywords: construction supply chain, inventory control supply chain, transshipment

Procedia PDF Downloads 122
2193 Classification of IoT Traffic Security Attacks Using Deep Learning

Authors: Anum Ali, Kashaf ad Dooja, Asif Saleem

Abstract:

The future smart cities trend will be towards Internet of Things (IoT); IoT creates dynamic connections in a ubiquitous manner. Smart cities offer ease and flexibility for daily life matters. By using small devices that are connected to cloud servers based on IoT, network traffic between these devices is growing exponentially, whose security is a concerned issue, since ratio of cyber attack may make the network traffic vulnerable. This paper discusses the latest machine learning approaches in related work further to tackle the increasing rate of cyber attacks, machine learning algorithm is applied to IoT-based network traffic data. The proposed algorithm train itself on data and identify different sections of devices interaction by using supervised learning which is considered as a classifier related to a specific IoT device class. The simulation results clearly identify the attacks and produce fewer false detections.

Keywords: IoT, traffic security, deep learning, classification

Procedia PDF Downloads 154
2192 A Hybrid System for Boreholes Soil Sample

Authors: Ali Ulvi Uzer

Abstract:

Data reduction is an important topic in the field of pattern recognition applications. The basic concept is the reduction of multitudinous amounts of data down to the meaningful parts. The Principal Component Analysis (PCA) method is frequently used for data reduction. The Support Vector Machine (SVM) method is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data, the algorithm outputs an optimal hyperplane which categorizes new examples. This study offers a hybrid approach that uses the PCA for data reduction and Support Vector Machines (SVM) for classification. In order to detect the accuracy of the suggested system, two boreholes taken from the soil sample was used. The classification accuracies for this dataset were obtained through using ten-fold cross-validation method. As the results suggest, this system, which is performed through size reduction, is a feasible system for faster recognition of dataset so our study result appears to be very promising.

Keywords: feature selection, sequential forward selection, support vector machines, soil sample

Procedia PDF Downloads 455
2191 Greyscale: A Tree-Based Taxonomy for Grey Literature Published by Fisheries Agencies

Authors: Tatiana Tunon, Gottfried Pestal

Abstract:

Government agencies responsible for the management of fisheries resources publish many types of grey literature, and these materials are increasingly accessible to the public on agency websites. However, scope and quality vary considerably, and end-users need meta-data about the report series when deciding whether to use the information (e.g. apply the methods, include the results in a systematic review), or when prioritizing materials for archiving (e.g. library holdings, reference databases). A proposed taxonomy for these report series was developed based on a review of 41 report series from 6 government agencies in 4 countries (Canada, New Zealand, Scotland, and United States). Each report series was categorized according to multiple criteria describing peer-review process, content, and purpose. A robust classification tree was then fitted to these descriptions, and the resulting taxonomic groups were used to compare agency output from 4 countries using reports available in their online repositories.

Keywords: classification tree, fisheries, government, grey literature

Procedia PDF Downloads 286