Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4178

Search results for: cluster detection

1538 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin

Abstract:

Proper water quality management requires the establishment of a monitoring network. Therefore, evaluation of the efficiency of water quality monitoring networks is needed to ensure high-quality data collection for critical chemical quality parameters. Unfortunately, in some Latin American countries water quality monitoring programs are not sustainable in terms of recording historical data or selecting environmentally representative sites, which wastes time, money and valuable information. In this study, multivariate statistical techniques, such as principal component analysis (PCA) and hierarchical cluster analysis (HCA), are applied to identify the most significant monitoring sites as well as the critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to the economic activities developed in this area. Water pollution by trace elements in the upper part of the basin is mainly related to mining activity, and the loss of agricultural land to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been assessed non-continuously by public and private organizations, and recently the National Water Authority established permanent water quality networks in 45 basins in Peru. Although many countries use multivariate statistical techniques for assessing water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that the application of multivariate statistical techniques could serve as an instrument for optimizing monitoring networks, using the smallest number of monitoring sites and the most significant water quality parameters, which would reduce costs and improve water quality management in Peru.
The main socio-economic activities and the principal stakeholders related to water management in the basin are also identified. Finally, water quality management programs are discussed in terms of their efficiency and sustainability.
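
As an illustration of how hierarchical cluster analysis can group monitoring sites with similar water chemistry, the following is a minimal sketch in plain Python. It is not the authors' procedure: the average-linkage rule, the z-score standardization, the function names and the six sample sites (conductivity, copper, pH) are all assumptions made for the example.

```python
import math

def standardize(rows):
    """Z-score each column so parameters with large units do not dominate."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Average Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(dist(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical sites: (conductivity in uS/cm, Cu in mg/L, pH)
sites = [
    [120, 0.02, 7.8], [130, 0.03, 7.7],    # low-impact sites
    [900, 0.01, 8.1], [950, 0.015, 8.0],   # saline, lower-basin-like sites
    [125, 0.50, 6.5], [135, 0.55, 6.4],    # metal-rich, upper-basin-like sites
]
groups = average_linkage(standardize(sites), n_clusters=3)
print(groups)
```

Sites that fall in the same group carry largely redundant information, which is the kind of evidence a network assessment could use when deciding which sites to keep.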

Keywords: PCA, HCA, Jequetepeque, multivariate statistical

Procedia PDF Downloads 341

1537 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve the safety and reliability of manufacturing operations and to reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is essential in deciding when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods is evaluated using real data. The results show that the calibration model based on a supervised probabilistic model yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from the model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 341

1536 Charging-Vacuum Helium Mass Spectrometer Leak Detection Technology in the Application of Space Products Leak Testing and Error Control

Authors: Jijun Shi, Lichen Sun, Jianchao Zhao, Lizhi Sun, Enjun Liu, Chongwu Guo

Abstract:

Because of its consistent pressure direction, shorter cycle, and high sensitivity, charging-vacuum helium mass spectrometer leak testing is the most popular technology for seal testing of spacecraft parts, especially small and medium-sized ones. Usually an auxiliary pump is used, and the minimum detectable leak rate can reach 5E-9 Pa·m³/s, or even better on certain occasions. Relative error is more important when evaluating the results. The choice of reference leak, the background level of helium, and the record format all affect the measured leak rate. Within the linearity range of the leak testing system, using a reference leak with a larger leak rate reduces the relative error by 10%, and the relative error is reduced markedly if the helium background is sufficiently low, the decimal record format is used, and the more stable data are recorded.

Keywords: leak testing, spacecraft parts, relative error, error control

Procedia PDF Downloads 441

1535 Dynamic Process Monitoring of an Ammonia Synthesis Fixed-Bed Reactor

Authors: Bothinah Altaf, Gary Montague, Elaine B. Martin

Abstract:

This study involves the modeling and monitoring of an ammonia synthesis fixed-bed reactor using partial least squares (PLS) and its variants. The process exhibits complex dynamic behavior due to the presence of heat recycling and feed quench. One limitation of a static PLS model in this situation is that it does not take account of the process dynamics, and hence dynamic PLS was used. Although it showed superior performance to static PLS in terms of prediction, the monitoring scheme was inappropriate, and hence adaptive PLS was considered. A limitation of adaptive PLS is that non-conforming observations also contribute to the model; therefore, a new adaptive approach was developed, robust adaptive dynamic PLS. This approach updates a dynamic PLS model and is robust to non-representative data. The developed methodology showed a clear improvement over existing approaches in terms of the modeling of the reactor and the detection of faults.

Keywords: ammonia synthesis fixed-bed reactor, dynamic partial least squares modeling, recursive partial least squares, robust modeling

Procedia PDF Downloads 375

1534 Early Detection of Major Earthquakes Using Broadband Accelerometers

Authors: Umberto Cerasani, Luca Cerasani

Abstract:

Methods for earthquake forecasting have been intensively investigated in recent decades, but there is still no universal solution agreed upon by seismologists. Rock failure is most often preceded by a tiny elastic movement in the failure area and by the appearance of micro-cracks. These micro-cracks could be detected at the soil surface and represent useful earthquake precursors. The aim of this study was to verify whether tiny raw acceleration signals (in the 10⁻¹ to 10⁻⁴ cm/s² range) prior to the arrival of the main primary waves (P-waves) could be exploited and related to earthquake magnitude. Mathematical tools such as the Fast Fourier Transform (FFT), moving average and wavelets were applied to raw acceleration data available on the ITACA web site, and the study focused on one of the most unpredictable earthquakes, the one that occurred in central Italy on August 24th, 2016 at 01:36. It appeared that these tiny acceleration signals preceding the main P-waves have different patterns in both the frequency and time domains for high-magnitude earthquakes compared to lower-magnitude ones.
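
Of the tools named above, the moving average is the simplest to show. The sketch below smooths a short acceleration trace with a sliding window; the function name, the window length and the sample values are assumptions for illustration and do not come from the ITACA data or the authors' processing chain.

```python
def moving_average(signal, window):
    """Return the mean of each consecutive `window`-sample slice of `signal`."""
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    running = sum(signal[:window])
    smoothed = [running / window]
    for i in range(window, len(signal)):
        # Slide the window: add the newest sample, drop the oldest one.
        running += signal[i] - signal[i - window]
        smoothed.append(running / window)
    return smoothed

# Hypothetical raw acceleration samples in cm/s^2 (pre-event noise, then a small burst)
trace = [0.001, -0.002, 0.001, 0.000, 0.004, 0.009, 0.012, 0.008]
print(moving_average(trace, window=4))
```

A longer window suppresses more sensor noise but also blurs the onset time of any weak precursor, so the window length is a trade-off rather than a fixed choice.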

Keywords: earthquake, accelerometer, earthquake forecasting, seism

Procedia PDF Downloads 125

1533 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech (POS) tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is one of the most fundamental tasks in almost all natural language processing. In natural language processing, the need to provide a large amount of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the limited availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising direction for tackling this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for the Amharic language using K-means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which includes word frequency and the syntactic and morphological features of a word. The third phase is clustering. Among the different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part-of-speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging.
In order to increase the performance of the unsupervised part-of-speech tagger, other features that are not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
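
The clustering and mapping phases described above can be sketched as follows. This is a minimal illustration, not the authors' tagger: the two numeric features per word (a positional score and a morphological score), the initial centroids, the tag names and both function names are hypothetical.

```python
import math
from collections import Counter

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm with caller-supplied initial centroids (deterministic)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

def map_clusters_to_tags(labels, sample_tags):
    """Give each cluster the most common tag among its hand-checked members."""
    votes = {}
    for label, tag in zip(labels, sample_tags):
        if tag is not None:  # None marks a word nobody inspected
            votes.setdefault(label, Counter())[tag] += 1
    return {label: counter.most_common(1)[0][0] for label, counter in votes.items()}

# Hypothetical word features: (positional score, morphological score)
features = [(0.10, 0.00), (0.20, 0.10), (0.90, 1.00), (0.80, 0.90), (0.15, 0.05)]
labels = kmeans(features, centroids=[features[0], features[2]])
tags = map_clusters_to_tags(labels, ["N", None, "V", "V", "N"])
print(labels, tags)
```

Only a handful of words per cluster need to be inspected for the mapping step, which is what keeps the approach cheap compared with annotating a full corpus.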

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 426

1532 Hyper Tuned RBF SVM: Approach for the Prediction of the Breast Cancer

Authors: Surita Maini, Sanjay Dhanka

Abstract:

Machine learning (ML) involves developing algorithms and statistical models that enable computers to learn and make predictions or decisions based on data without being explicitly programmed. Because of its wide-ranging capabilities, ML is gaining popularity in medical applications such as medical imaging, electronic health records, genomic data analysis, wearable devices, disease outbreak prediction and disease diagnosis. In the last few decades, many researchers have tried to diagnose breast cancer (BC) using ML, because early detection of any disease can save millions of lives. Working in this direction, the authors propose a hybrid ML technique, RBF SVM, to predict BC at an earlier stage. The proposed method is implemented on the Breast Cancer UCI ML dataset with 569 instances and 32 attributes. The authors recorded the following performance metrics for the proposed model: accuracy 98.24%, sensitivity 98.67%, specificity 97.43%, F1 score 98.67%, precision 98.67%, and run time 0.044769 seconds. The proposed method is validated by K-fold cross-validation.
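
For readers unfamiliar with the metrics listed above, the sketch below shows how they follow from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and are not the authors' results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that were detected (recall)
    specificity = tn / (tn + fp)   # share of negative cases correctly cleared
    precision = tp / (tp + fp)     # share of positive predictions that were right
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts for one test fold
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is the harmonic mean of precision and sensitivity, it equals both of them whenever the two coincide, which is consistent with the three identical 98.67% figures reported in the abstract.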

Keywords: breast cancer, support vector classifier, machine learning, hyperparameter tuning

Procedia PDF Downloads 56

1531 Extraction of Polystyrene from Styrofoam Waste: Synthesis of Novel Chelating Resin for the Enrichment and Speciation of Cr(III)/Cr(VI) Ions in Industrial Effluents

Authors: Ali N. Siyal, Saima Q. Memon, Latif Elçi, Aydan Elçi

Abstract:

Polystyrene (PS) was extracted from Styrofoam (expanded polystyrene foam) waste, the so-called white pollutant. The PS was functionalized with the N,N-Bis(2-aminobenzylidene)benzene-1,2-diamine (ABA) ligand through an azo spacer. The resin was characterized by FT-IR spectroscopy and elemental analysis. The PS-N=N-ABA resin was used for the enrichment and speciation of Cr(III)/Cr(VI) ions and for total Cr determination in aqueous samples by flame atomic absorption spectrometry (FAAS). The separation of Cr(III)/Cr(VI) ions was achieved at pH 2. A recovery of Cr(VI) ions of ≥ 95.0% was achieved at the optimum parameters: pH 2; resin amount 300 mg; flow rates of 2.0 mL min⁻¹ for the solution and 2.0 mL min⁻¹ for the eluent (2.0 mol L⁻¹ HNO₃). Total Cr was determined by oxidation of Cr(III) to Cr(VI) ions using H₂O₂. The limit of detection (LOD) and limit of quantification (LOQ) of Cr(VI) were found to be 0.40 and 1.20 μg L⁻¹, respectively, with a preconcentration factor of 250. The total saturation and breakthrough capacities of the resin for Cr(VI) ions were found to be 0.181 and 0.531 mmol g⁻¹, respectively. The proposed method was successfully applied for the preconcentration/speciation of Cr(III)/Cr(VI) ions and the determination of total Cr in industrial effluents.

Keywords: styrofoam waste, polymeric resin, preconcentration, speciation, Cr(III)/Cr(VI) ions, FAAS

Procedia PDF Downloads 274

1530 Rapid Detection of MBL Genes by SYBR Green Based Real-Time PCR

Authors: Taru Singh, Shukla Das, V. G. Ramachandran

Abstract:

Objectives: To develop a SYBR green based real-time PCR assay to detect carbapenemase (NDM, IMP) genes in E. coli. Methods: A total of 40 E. coli isolates from stool samples were tested. Six were previously characterized as resistant to carbapenems and documented by PCR. The remaining 34 isolates had previously tested susceptible to carbapenems and were negative for these genes. Bacterial RNA was extracted using a manual method. The real-time PCR was performed using the Light Cycler III 480 instrument (Roche), and specific primers for each carbapenemase target were used. Results: Each of the two carbapenemase genes tested presented a different melting curve after PCR amplification. The melting temperature (Tm) analysis of the amplicons identified was as follows: blaIMP type (Tm 82.18°C), blaNDM-1 (Tm 78.8°C). No amplification was detected among the negative samples. The results showed 100% concordance with the genotypes previously identified. Conclusions: The new assay was able to detect the presence of two different carbapenemase gene types by real-time PCR.

Keywords: resistance, b-lactamases, E. coli, real-time PCR

Procedia PDF Downloads 396

1529 Water Leakage Detection System of Pipe Line using Radial Basis Function Neural Network

Authors: A. Ejah Umraeni Salam, M. Tola, M. Selintung, F. Maricar

Abstract:

Clean water is an essential and fundamental human need. Therefore, its supply must be assured by maintaining its quality, quantity and pressure. In practice, however, leakage occurs in distribution systems and has become a common issue worldwide. One of the technical causes of leakage is a leaking pipe. The purpose of this research is to use the Radial Basis Function Neural Network (RBFNN) model to detect the location and magnitude of pipeline leakage rapidly and efficiently. In this study, the RBFNN is trained and tested on data from the EPANET hydraulic modeling system. The RBFNN method proved capable of detecting the location and magnitude of pipeline leakage; the accuracy of the predictions, measured by the root mean square error (RMSE) between predicted and actual values, approaches 0.000049 for the whole pipeline system.
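
The RMSE used above as the accuracy measure is the square root of the mean squared difference between predicted and measured values. A minimal sketch follows; the function name and the sample leak magnitudes are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical leak magnitudes in L/s: network output vs. simulated reference
predicted = [0.50, 1.20, 0.00, 2.10]
actual = [0.52, 1.18, 0.01, 2.10]
print(rmse(predicted, actual))
```

RMSE has the same unit as the quantity being predicted, so a value such as the one reported is only meaningful relative to the scale of the leak magnitudes and node positions being estimated.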

Keywords: radial basis function neural network, leakage pipeline, EPANET, RMSE

Procedia PDF Downloads 344

1528 Machine Learning Automatic Detection on Twitter Cyberbullying

Authors: Raghad A. Altowairgi

Abstract:

With the widespread adoption of social media platforms, young people tend to use them extensively as their first means of communication due to their ease and modernity. But these platforms often create fertile ground for bullies to practice aggressive behavior against their victims. Platform usage cannot be reduced, but intelligent mechanisms can be implemented to reduce the abuse. This is where machine learning comes in. Understanding and classifying text can help minimize cyberbullying. Artificial intelligence techniques have expanded to provide applied tools for addressing the phenomenon of cyberbullying. In this research, machine learning models are built to classify text into two classes: cyberbullying and non-cyberbullying. The data are preprocessed in four stages: removing characters that do not provide meaningful information to the models, tokenization, removing stop words, and lowercasing the text. BoW and TF-IDF are then used as the main features for five classifiers: logistic regression, Naïve Bayes, Random Forest, XGBoost, and CatBoost. They score 92%, 90%, 92%, 91%, and 86%, respectively.
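
The four preprocessing stages and the two feature types can be sketched in plain Python as below. This is an illustration only: the tiny stop-word list, the function names and the textbook TF-IDF formula (term frequency times the natural log of N over document frequency, without the smoothing some libraries apply) are assumptions, not the authors' pipeline.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "the", "is", "you", "are", "to", "and"}

def preprocess(text):
    """Apply the four stages: clean, tokenize, drop stop words, lowercase."""
    text = re.sub(r"[^A-Za-z\s]", " ", text)          # 1. remove uninformative characters
    tokens = text.split()                              # 2. tokenize on whitespace
    tokens = [t for t in tokens if t.lower() not in STOP_WORDS]  # 3. remove stop words
    return [t.lower() for t in tokens]                 # 4. lowercase

def bag_of_words(tokens):
    """Raw term counts for one document."""
    return Counter(tokens)

def tf_idf(docs):
    """TF-IDF weights per document, using tf = count/len and idf = ln(N/df)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return weights

tokens = preprocess("You are SO stupid!!! @user123")
print(tokens)
print(bag_of_words(tokens))
print(tf_idf([["so", "stupid"], ["so", "nice"]]))
```

A term that appears in every document receives an IDF of zero under this formula, which is why TF-IDF tends to down-weight common words that plain BoW counts would emphasize.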

Keywords: cyberbullying, machine learning, Bag-of-Words, term frequency-inverse document frequency, natural language processing, Catboost

Procedia PDF Downloads 113

1527 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study has been carried out on the Saranda area in Jharkhand and a part of Odisha state. Geospatial data from Hyperion, a remote sensing satellite sensor, have been used. This study has used a wide variety of image processing patterns to enhance and extract the mining class of Fe and Mn ores. Landsat-8 OLI sensor data have also been used to correctly explore related minerals. In this way, various processes have been applied to enhance the mineralogy class, and a comparative evaluation with the related frequency has been done. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the shortwave infrared band range used. The abundant spatial and spectral information contained in hyperspectral images enables different objects to be distinguished for targeted applications such as exploration, detection and mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8

Procedia PDF Downloads 105

1526 The Relationship between Amplitude and Stability of Circadian Rhythm with Sleep Quality and Sleepiness: A Population Study, Kerman 2018

Authors: Akram Sadat Jafari Roodbandi, Farzaneh Akbari, Vafa Feyzi, Zahra Zare, Zohreh Foroozanfar

Abstract:

Introduction: The circadian rhythm, or 24-hour sleep-wake cycle, is one of the important factors affecting physiological and psychological characteristics in humans; it contributes to biochemical, physiological and behavioral processes and helps people prepare the brain and body for sleep or active wakefulness during certain hours. The purpose of this study was to investigate the relationship of circadian rhythm characteristics with sleep quality and sleepiness according to demographic characteristics such as age. Methods: This cross-sectional descriptive-analytic study was carried out among the general population of Kerman, aged 15-84 years. After the population was divided into 10-year age groups, the demographic characteristics questionnaire, the circadian type questionnaire, the Pittsburgh sleep quality questionnaire and the Epworth sleepiness questionnaire were completed by equal numbers of men and women in each age group. Using cluster sampling with a design effect of 2, 1300 questionnaires were distributed at various hours of the day in public places in Kerman city. Data analysis was done using SPSS software, with univariate tests and linear regressions at a significance level of 0.05. Results: A total of 1147 subjects were included in the study; 584 (50.9%) were male and the rest were female. The mean age was 39.50 ± 15.38 years. Of the participants, 133 (11.60%) had sleepiness and 308 (26.90%) had undesirable sleep quality. In linear regression, sleep quality was significantly associated with sex, hours of sleep needed per 24 hours, chronic illness, sleepiness, and circadian rhythm amplitude. Sleepiness was significantly associated with marital status, the sleep-wake schedule of other family members and the stability of the circadian rhythm. In both women and men, sleep quality decreased and sleepiness increased with age.
Conclusion: Age, sex, circadian type, the need for sleep per 24 hours, marital status, and the sleep-wake schedule of other family members are significant factors related to sleep quality and sleepiness and to adaptation to night shift work.

Keywords: circadian type, sleep quality, sleepiness, age, shift work

Procedia PDF Downloads 137

1525 Detection of Arterial Stiffness in Diabetes Using Photoplethysmograph

Authors: Neelamshobha Nirala, R. Periyasamy, Awanish Kumar

Abstract:

Diabetes is a metabolic disorder, and with the increase in the global prevalence of diabetes, cardiovascular disease and mortality related to diabetes have also increased. Diabetes increases arterial stiffness through elusive hormonal and metabolic abnormalities. We used the photoplethysmograph (PPG), a simple non-invasive method, to study the change in arterial stiffness due to diabetes. Toe PPG signals were taken from 29 diabetic subjects with a mean age of (65±8.4) years and 21 non-diabetic subjects with a mean age of (49±14) years. The mean duration of diabetes was 12±8 years for the diabetic group. Rise time (RT) and area under rise time (AUR) were calculated from the PPG signal of each subject, and Welch's t-test was used to test for a significant difference between the two groups. We obtained significant differences between diabetic and non-diabetic subjects, with p-values of 0.0005 and 0.03 for RT and AUR, respectively. The average values of RT and AUR were 0.298±0.003 msec and 14.4±4.2 arbitrary units (a.u.), respectively, for diabetic subjects compared to 0.277±0.0005 msec and 13.66±2.3 a.u., respectively, for non-diabetic subjects. In conclusion, this study supports the view that arterial stiffness is increased in diabetes and can be detected early using PPG.
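
Welch's t-test, used above, compares two group means without assuming equal variances or equal group sizes. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom with the standard library; the function name and the sample rise-time values are hypothetical, and turning the statistic into a p-value would additionally require the t-distribution CDF, which is not shown.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df

# Hypothetical rise-time samples (arbitrary units) for two small groups
diabetic = [0.301, 0.297, 0.299, 0.295, 0.300]
control = [0.277, 0.278, 0.276, 0.277]
t_stat, dof = welch_t(diabetic, control)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

The unequal-variance form is a natural fit here because the two groups in the study differ both in size (29 versus 21) and in spread.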

Keywords: area under rise-time, AUR, arterial stiffness, diabetes, photoplethysmograph, PPG, rise-time (RT)

Procedia PDF Downloads 241

1524 Mobile Microscope for the Detection of Pathogenic Cells Using Image Processing

Authors: P. S. Surya Meghana, K. Lingeshwaran, C. Kannan, V. Raghavendran, C. Priya

Abstract:

One of the most basic and powerful tools in all of science and medicine is the light microscope, the fundamental device for laboratory as well as research purposes. With improving technology, portable, economical and user-friendly instruments are in high demand. The conventional microscope fails to live up to this emerging trend. Also, adequate access to healthcare is not widely available, especially in developing countries. The most basic step towards curing a malady is the diagnosis of the disease itself. The main aim of this paper is to diagnose malaria with the most common device, the cell phone, which has proved to be the immediate solution for most modern-day needs thanks to the development of wireless infrastructure that allows users to compute and communicate on the move. This has opened up the opportunity to develop novel imaging, sensing, and diagnostics platforms using mobile phones as an underlying platform to address the global demand for accurate, sensitive, cost-effective, and field-portable measurement devices for use in remote and resource-limited settings around the world.

Keywords: cellular, hand-held, health care, image processing, malarial parasites, microscope

Procedia PDF Downloads 250

1523 Two-Level Separation of High Air Conditioner Consumers and Demand Response Potential Estimation Based on Set Point Change

Authors: Mehdi Naserian, Mohammad Jooshaki, Mahmud Fotuhi-Firuzabad, Mohammad Hossein Mohammadi Sanjani, Ashknaz Oraee

Abstract:

In recent years, the development of communication infrastructure and smart meters has facilitated the utilization of demand-side resources, which can enhance the stability and economic efficiency of power systems. Direct load control programs can play an important role in the utilization of demand-side resources in the residential sector. However, the investments required for installing control equipment can be a limiting factor in the development of such demand response programs. Thus, the selection of consumers with higher potential is crucial to the success of a direct load control program. Heating, ventilation, and air conditioning (HVAC) systems, which feature relatively high flexibility due to the heat capacity of buildings, make up a major part of household consumption. Considering that the consumption of HVAC systems depends highly on the ambient temperature, and bearing in mind the high investments required for control systems enabling direct load control demand response programs, this paper presents a recent solution for identifying consumers with high air conditioner demand among a large number of consumers and for measuring the demand response potential of such consumers. This can pave the way for estimating the investments needed for the implementation of direct load control programs for residential HVAC systems and for estimating the demand response potential in a distribution system. In doing so, we first cluster consumers into several groups based on the correlation coefficients between hourly consumption data and hourly temperature data using the K-means algorithm. Then, by applying a recent algorithm to the hourly consumption and temperature data, consumers with high air conditioner consumption are identified. Finally, the demand response potential of such consumers is estimated based on the equivalent desired temperature setpoint changes.
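
The first separation level described above, grouping consumers by how strongly their hourly load follows temperature, can be sketched as follows. This is an illustrative simplification and not the authors' algorithm: it uses Pearson correlation and a one-dimensional two-cluster K-means, and the function names, consumer IDs and readings are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def two_means_1d(values, iterations=20):
    """1-D K-means with k=2; returns True for members of the higher cluster."""
    low, high = min(values), max(values)   # deterministic initial centroids
    for _ in range(iterations):
        upper = [v for v in values if abs(v - high) < abs(v - low)]
        lower = [v for v in values if abs(v - high) >= abs(v - low)]
        if upper:
            high = sum(upper) / len(upper)
        if lower:
            low = sum(lower) / len(lower)
    return [abs(v - high) < abs(v - low) for v in values]

# Hypothetical hourly readings: ambient temperature (deg C) and load (kWh)
temperature = [24, 26, 30, 34, 36, 33]
consumption = {
    "A": [1.0, 1.2, 2.0, 3.1, 3.5, 2.9],   # load tracks temperature
    "B": [1.5, 1.4, 1.6, 1.5, 1.4, 1.5],   # flat load
    "C": [0.8, 1.0, 1.9, 2.7, 3.0, 2.6],   # load tracks temperature
}
correlations = {cid: pearson(load, temperature) for cid, load in consumption.items()}
flags = two_means_1d(list(correlations.values()))
temperature_sensitive = [cid for cid, flag in zip(correlations, flags) if flag]
print(temperature_sensitive)
```

A high correlation only indicates temperature-dependent load, so a second step such as the one the abstract mentions is still needed to confirm that the load comes from air conditioning.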

Keywords: communication infrastructure, smart meters, power systems, HVAC system, residential HVAC systems

Procedia PDF Downloads 44

1522 Detection of Antibiotic Resistance Genes and Antibiotic Residues in Plant-based Products

Authors: Morello Sara, Pederiva Sabina, Bianchi Manila, Martucci Francesca, Marchis Daniela, Decastelli Lucia

Abstract:

Vegetables represent an integral part of a healthy diet due to their valuable nutritional properties, and the growth in consumer demand in recent years is particularly remarkable for a diet rich in vitamins and micronutrients. However, plant-based products are involved in several food outbreaks connected to various sources of contamination, and quite often the bacteria responsible for side effects have shown high resistance to antibiotics. The abuse of antibiotics can be one of the main mechanisms responsible for increasing antibiotic resistance (AR). Plants grown for food use can be contaminated directly by spraying antibiotics on crops or indirectly through the use of manure, which may contain both antibiotics and antibiotic resistance genes (ARG). Antibiotic residues could represent a potential human health risk due to exposure through the consumption of plant-based foods. The presence of antibiotic-resistant bacteria might pose a particular risk to consumers. The present work aims to investigate, through a multidisciplinary approach, the occurrence of ARG by means of a biomolecular approach (PCR) and the prevalence of antibiotic residues using a multi-residue LC-MS/MS method, both in different plant-based products. During the period from July 2020 to October 2021, a total of 74 plant samples (33 lettuces and 41 tomatoes) were collected from 57 farms located throughout the Piedmont area, and 18 out of 74 samples (11 lettuces and 7 tomatoes) were selected for LC-MS/MS analyses. DNA extracted (ExtractME, Blirt, Poland) from plants used on crops and from isolated bacteria was analyzed with 6 sets of end-point multiplex PCR (Qiagen, Germany) to detect the presence of resistance genes of the main antibiotic families, such as tet genes (tetracyclines), bla (β-lactams) and mcr (colistin).
Simultaneous detection of 43 antibiotic molecules belonging to 10 different classes (tetracyclines, sulphonamides, quinolones, penicillins, amphenicols, macrolides, pleuromutilins, lincosamides, diaminopyrimidines) was performed using an Exion LC system (AB SCIEX) coupled to a QTRAP 5500 triple quadrupole mass spectrometer (AB SCIEX). The PCR assays showed the presence of ARG in 57% (n=42): tetB (4.8%; n=2), tetA (9.5%; n=4), tetE (2.4%; n=1), tetL (12%; n=5), tetM (26%; n=11), blaSHV (21.5%; n=9), blaTEM (4.8%; n=2) and blaCTX-M (19%; n=8). The mcr gene responsible for colistin resistance was not detected in any of the analyzed samples. Results obtained from the LC-MS/MS analyses showed that none of the tested antibiotics exceeded the LOQ (100 ppb). The data obtained confirmed the presence of bacterial populations containing determinants of resistance to antibiotics widely used in human medicine, such as tet genes (tetracyclines) and bla genes (beta-lactams), which can enter the food chain and represent a risk for consumers, especially with raw products. The presence of traces of antibiotic residues in vegetables, at concentrations below the LOQ of the LC-MS/MS method applied, cannot be excluded. In conclusion, traces of antibiotic residues could be a health risk to the consumer due to their potential involvement in the spread of AR. PCR represents a useful and effective approach to characterize and monitor AR carried by bacteria from the entire food chain.

Keywords: plant-based products, ARG, PCR, antibiotic residues

Procedia PDF Downloads 74

1521 A Sector-Wise Study on Detecting Earnings Management in India

Authors: Raghuveer Kaur, Kartikay Sharma, Ashu Khanna

Abstract:

Earnings management has been present since time immemorial. The recent downfall of giant enterprises like Enron, Satyam and WorldCom has brought a lot of focus on the study and detection of earnings management. The present study is an attempt to study earnings management in one of the fastest emerging economies, India. The study attempts to understand earnings management in different sectors of the economy. The paper first tests a hypothesis to check whether different sectors of India are engaged in earnings management or not. In the later section, the paper studies the level of earnings management in 6 popular sectors of India: IT & BPO, retail, telecom, biotech, hotels and coffee. To measure earnings management, two popular detection techniques have been employed: the Modified Jones Model and the Beneish M-Score. A total of 332 companies were studied. Publicly available data from the Capitaline database have been used. The paper also classifies the top and bottom five performers on the basis of sales turnover in each sector and identifies whether they manage their earnings or not.
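
To make the second detection technique concrete, the sketch below evaluates the Beneish M-Score from its eight input ratios. The coefficients are those commonly cited for the eight-variable model and the -2.22 cut-off is one commonly quoted threshold; both are assumptions here, since the abstract does not state which specification the authors used. The function names and the sample ratios are hypothetical.

```python
def beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, tata, lvgi):
    """Eight-variable Beneish M-Score (commonly cited coefficients; an assumption)."""
    return (-4.84
            + 0.920 * dsri    # days' sales in receivables index
            + 0.528 * gmi     # gross margin index
            + 0.404 * aqi     # asset quality index
            + 0.892 * sgi     # sales growth index
            + 0.115 * depi    # depreciation index
            - 0.172 * sgai    # SG&A expenses index
            + 4.679 * tata    # total accruals to total assets
            - 0.327 * lvgi)   # leverage index

def flags_possible_manipulation(m_score, threshold=-2.22):
    """Scores above the threshold are conventionally read as a warning sign."""
    return m_score > threshold

# Hypothetical year-over-year ratios for one company
score = beneish_m_score(dsri=1.3, gmi=1.1, aqi=1.2, sgi=1.4,
                        depi=1.0, sgai=0.9, tata=0.05, lvgi=1.0)
print(round(score, 3), flags_possible_manipulation(score))
```

A company whose indices are all 1.0 with zero accruals scores about -2.48 under these coefficients, so it sits just below the usual cut-off; the score is a screening signal, not proof of manipulation.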

Keywords: earnings management, India, modified Jones model, Beneish M score

Procedia PDF Downloads 499

1520 Using Machine Learning Techniques for Autism Spectrum Disorder Analysis and Detection in Children

Authors: Norah Mohammed Alshahrani, Abdulaziz Almaleh

Abstract:

Autism Spectrum Disorder (ASD) is a condition related to brain development that affects how a person recognises and communicates with others, resulting in difficulties with social interaction and communication, and its prevalence is constantly growing. Early recognition of ASD allows children to lead safe and healthy lives and helps doctors with accurate diagnoses and management of the condition. Therefore, it is crucial to develop a method that achieves good results with high accuracy for the detection of ASD in children. In this paper, ASD datasets of toddlers and children have been analyzed. We employed the following machine learning techniques to explore ASD: Random Forest (RF), Decision Tree (DT), Naïve Bayes (NB) and Support Vector Machine (SVM). Feature selection was then used to reduce the number of attributes in the ASD datasets while preserving model performance. As a result, we found that the best result was provided by the Support Vector Machine (SVM), achieving 0.98% in the toddler dataset and 0.99% in the children dataset.

Keywords: autism spectrum disorder, machine learning, feature selection, support vector machine

Procedia PDF Downloads 130

1519 Lanthanide-Mediated Aggregation of Glutathione-Capped Gold Nanoclusters Exhibiting Strong Luminescence and Fluorescence Turn-on for Sensing Alkaline Phosphatase

Authors: Jyun-Guo You, Wei-Lung Tseng

Abstract:

This study presents a synthetic route for producing highly luminescent gold nanoclusters (AuNCs) based on the integration of two concepts: thiol-induced luminescence enhancement of ligand-insufficient glutathione-capped AuNCs (GSH-AuNCs) and Ce³⁺-induced aggregation of GSH-AuNCs. The GSH-AuNCs were synthesized by modifying a previously reported procedure. To produce more Au(I)-GSH complexes on the surface of ligand-insufficient GSH-AuNCs, extra GSH is added to attach onto the AuNC surface. The resulting ligand-sufficient GSH-AuNCs (LS-GSH-AuNCs) emit relatively strong luminescence. The luminescence of LS-GSH-AuNCs is further enhanced by coordination between the two carboxylic groups of GSH (pKa1 = 2 and pKa2 = 3.5) and lanthanide ions, which induces the self-assembly of LS-GSH-AuNCs. As a result, the quantum yield of the self-assembled LS-GSH-AuNCs (SA-AuNCs) improved to 13%. Interestingly, the SA-AuNCs disassembled into LS-GSH-AuNCs in the presence of adenosine triphosphate (ATP) because of the formation of ATP-lanthanide ion complexes. Our assay was employed to detect alkaline phosphatase (ALP) activity over the range of 0.1−10 U/mL with a limit of detection (LOD) of 0.03 U/mL.

Keywords: self-assembly, lanthanide ion, adenosine triphosphate, alkaline phosphatase

Procedia PDF Downloads 159

1518 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and can be used for data filtering in artificial intelligence. Its broad applicability to all kinds of data has led to its use in nearly every field of modern life. Classification helps us group items according to the features deemed interesting and useful. In this paper, we compare two classification methods, Naïve Bayes and ADTree, used to detect spam e-mail. This choice is motivated by the fact that the Naïve Bayes algorithm is based on probability calculus, while the ADTree algorithm is based on decision trees. The parameter settings of both classifiers are chosen to maximize the true positive rate and minimize the false positive rate. The experimental results present classification accuracy and cost analysis with a view to choosing the optimal classifier for spam detection. We also point out the number of attributes that gives a trade-off between attribute count and classification accuracy.

Keywords: classification, data mining, spam filtering, naive bayes, decision tree

Procedia PDF Downloads 397
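
As a rough illustration of the probability-based side of the spam-filtering comparison above, the following is a minimal multinomial Naïve Bayes sketch with Laplace smoothing, plus the true positive and false positive rates the abstract mentions. The training messages, function names, and whitespace tokenisation are all hypothetical; the authors' actual features and tooling are not given in the abstract, and ADTree is not covered here.

```python
import math
from collections import Counter


def train_naive_bayes(messages, labels):
    """Count word frequencies per class ("spam" or "ham")."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def tpr_fpr(y_true, y_pred, positive="spam"):
    """True positive rate and false positive rate for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Hypothetical toy training set.
train_msgs = ["win cash prize now", "cheap prize offer",
              "meeting agenda attached", "lunch at noon tomorrow"]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(train_msgs, train_labels)
print(predict("win a cheap prize", model))
print(predict("agenda for lunch meeting", model))
```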

1517 Determination of Water Pollution and Water Quality with Decision Trees

Authors: Çiğdem Bakır, Mecit Yüzkat

Abstract:

With the increasing emphasis on water quality worldwide, the search for and the market for new and intelligent monitoring systems have grown. The current method is a laboratory process in which samples are taken from bodies of water and tested in laboratories. This method is time-consuming, labor-intensive, and uneconomical. To solve this problem, we used machine learning methods to detect water pollution. We created decision trees with the Orange3 software and tried to determine all the factors that cause water pollution. An automatic prediction model based on water quality was developed with machine learning methods, taking many model inputs such as water temperature, pH, transparency, conductivity, dissolved oxygen, and ammonia nitrogen. The proposed approach consists of three stages: preprocessing of the data, feature detection, and classification. We evaluated our study with different accuracy metrics and presented the results comparatively. In addition, we achieved approximately 98% success with the decision tree.

Keywords: decision tree, water quality, water pollution, machine learning

Procedia PDF Downloads 71
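
To make the decision-tree idea in the water-quality abstract above concrete, here is a minimal sketch of how a single tree node might pick a split threshold on one input by minimising weighted Gini impurity. It is not the Orange3 implementation the authors used; the Gini criterion, the function names, and the dissolved-oxygen readings are assumptions for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split on one feature."""
    best = (None, float("inf"))
    ordered = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(ordered, ordered[1:]):
        thr = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= thr]
        right = [lab for v, lab in zip(values, labels) if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (thr, score)
    return best


# Hypothetical dissolved-oxygen readings (mg/L) with made-up labels.
dissolved_oxygen = [2.1, 3.0, 3.8, 6.5, 7.2, 8.0]
status = ["polluted", "polluted", "polluted", "clean", "clean", "clean"]
thr, score = best_threshold(dissolved_oxygen, status)
print(thr, score)
```

A full tree repeats this search over every input (temperature, pH, conductivity, and so on) and recurses on each side of the chosen split.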

1516 Ultrafiltration Process Intensification for Municipal Wastewater Reuse: Water Quality, Optimization of Operating Conditions and Fouling Management

Authors: J. Yang, M. Monnot, T. Eljaddi, L. Simonian, L. Ercolei, P. Moulin

Abstract:

The application of membrane technology to wastewater treatment has expanded rapidly under increasingly stringent legislation and environmental protection requirements. At the same time, water resources are becoming precious, and water reuse has gained popularity. In particular, ultrafiltration (UF) is a very promising technology for water reuse, as it can retain organic matter, suspended solids, colloids, and microorganisms. Nevertheless, few studies in the literature deal with the operating optimization of UF as a tertiary treatment for water reuse on a semi-industrial scale. Therefore, this study aims to explore the permeate water quality and to optimize operating parameters (maximizing productivity and minimizing irreversible fouling) through the operation of a UF pilot plant under real conditions. A fully automatic semi-industrial UF pilot plant with periodic classic backwashes (CB) and air backwashes (AB) was set up to filter the secondary effluent of an urban wastewater treatment plant (WWTP) in France. In this plant, the secondary treatment consists of a conventional activated sludge process followed by a sedimentation tank. The UF process was thus defined as a tertiary treatment and was operated under constant flux. It is important to note that a combination of CB and chlorinated AB was used for better fouling management. A 200 kDa hollow fiber membrane was used in the UF module, with an initial permeability (for WWTP outlet water) of 600 L·m⁻²·h⁻¹·bar⁻¹ and a total filtration surface of 9 m². Fifteen filtration conditions with different fluxes, filtration times, and air backwash frequencies were each operated for more than 40 hours to observe their hydraulic filtration performance. Through comparison, the best sustainable condition was a flux of 60 L·h⁻¹·m⁻², a filtration time of 60 min, and a backwash frequency of 1 AB every 3 CBs.
The optimized condition stands out from the others with > 92% water recovery, better irreversible fouling control, stable permeability variation, efficient backwash reversibility (80% for CB and 150% for AB), and no chemical washing during 40 h of filtration. For all tested conditions, the permeate water quality met the water reuse guidelines of the World Health Organization (WHO), French standards, and the regulation of the European Parliament adopted in May 2020, which sets minimum requirements for water reuse in agriculture. In the permeate, total suspended solids, biochemical oxygen demand, and turbidity were decreased to < 2 mg·L⁻¹, ≤ 10 mg·L⁻¹, and < 0.5 NTU, respectively; Escherichia coli and Enterococci showed > 5 log removal; and the other required microorganisms were below the detection limits. Additionally, because of the COVID-19 pandemic, the coronavirus SARS-CoV-2 was measured in the raw wastewater of the WWTP, the UF feed, and the UF permeate in November 2020. The raw wastewater tested positive, above the detection limit but below the quantification limit. Interestingly, the UF feed and UF permeate tested negative for SARS-CoV-2 by these PCR assays. In summary, this work confirms the great interest in UF as an intensified tertiary treatment for water reuse and gives operational indications for future industrial-scale production of reclaimed water.

Keywords: semi-industrial UF pilot plant, water reuse, fouling management, coronavirus

Procedia PDF Downloads 100
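
The figures quoted in the ultrafiltration abstract above can be tied together with a few lines of arithmetic. The sketch below assumes the usual definitions (permeate flow = flux × membrane area, permeability = flux / transmembrane pressure, log removal = log10 of feed count over permeate count); the abstract gives the flux, area, and initial permeability but not the transmembrane pressure or any microbial counts, so those counts are hypothetical.

```python
import math

FLUX_LMH = 60.0               # L·h⁻¹·m⁻², optimized flux from the abstract
MEMBRANE_AREA_M2 = 9.0        # m², total filtration surface from the abstract
INITIAL_PERMEABILITY = 600.0  # L·m⁻²·h⁻¹·bar⁻¹, from the abstract


def permeate_flow(flux_lmh, area_m2):
    """Permeate flow rate in L/h."""
    return flux_lmh * area_m2


def transmembrane_pressure(flux_lmh, permeability):
    """Transmembrane pressure in bar, assuming permeability = flux / TMP."""
    return flux_lmh / permeability


def log_removal(feed_count, permeate_count):
    """Log removal value: log10 of the feed-to-permeate concentration ratio."""
    return math.log10(feed_count / permeate_count)


flow = permeate_flow(FLUX_LMH, MEMBRANE_AREA_M2)
tmp = transmembrane_pressure(FLUX_LMH, INITIAL_PERMEABILITY)
print(flow, tmp)

# Hypothetical counts (per 100 mL) chosen only to illustrate a 5-log removal.
print(log_removal(1_000_000, 10))
```

The pressure figure only applies to the clean membrane at its initial permeability; under constant-flux operation the pressure rises as fouling lowers the permeability.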

1515 On the Bootstrap P-Value Method in Identifying out of Control Signals in Multivariate Control Chart

Authors: O. Ikpotokin

Abstract:

In any production process, every product is intended to attain a certain standard, but the presence of assignable causes of variability affects the process and leads to low product quality. The ability to identify and remove this type of variability reduces its overall effect, thereby improving the quality of the product. When a univariate control chart signals, it is easy to detect the problem and provide a solution, since the signal relates to a single quality characteristic. However, the problems involved in the use of multivariate control charts are the violation of the multivariate normality assumption and the difficulty of identifying the quality characteristic(s) responsible for the out-of-control signals. The purpose of this paper is to examine the use of a non-parametric control chart (the bootstrap approach) for obtaining control limits that overcome the problem of the multivariate distributional assumption, together with the p-value method for detecting out-of-control signals. Results from a performance study show that the proposed bootstrap method enables the setting of control limits that enhance the detection of out-of-control signals, while the p-value method also improves the identification of out-of-control variables.

Keywords: bootstrap control limit, p-value method, out-of-control signals, p-value, quality characteristics

Procedia PDF Downloads 333
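
One common way to obtain a distribution-free control limit, in the spirit of the bootstrap approach described in the abstract above, is to resample the in-control monitoring statistics and average a high percentile across resamples. The sketch below assumes that scheme and a plain empirical p-value; the abstract does not spell out the author's exact procedure, and the statistic values here are simulated placeholders rather than real process data.

```python
import math
import random


def percentile(sorted_vals, q):
    """Return the q-quantile (0 < q < 1) using the nearest-rank rule."""
    idx = math.ceil(q * len(sorted_vals)) - 1
    idx = max(0, min(len(sorted_vals) - 1, idx))
    return sorted_vals[idx]


def bootstrap_control_limit(stats, alpha=0.05, n_boot=1000, seed=0):
    """Average the (1 - alpha) percentile over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(stats)
    limits = []
    for _ in range(n_boot):
        sample = sorted(rng.choices(stats, k=n))  # resample with replacement
        limits.append(percentile(sample, 1 - alpha))
    return sum(limits) / n_boot


def empirical_p_value(stats, observed):
    """Share of in-control statistics at least as large as the observed one."""
    return sum(s >= observed for s in stats) / len(stats)


# Placeholder in-control statistics (for example T² values); simulated here.
rng = random.Random(42)
in_control = [rng.expovariate(0.5) for _ in range(200)]
limit = bootstrap_control_limit(in_control)
new_value = 9.0  # hypothetical new observation of the monitoring statistic
print(new_value > limit, empirical_p_value(in_control, new_value))
```

A new observation signals when its statistic exceeds the limit; a small empirical p-value points the same way without assuming multivariate normality.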

1514 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge that calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 150
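
The k-mer representation mentioned in the abstract above can be sketched as a sliding-window count over the sequence, followed by a fixed-order count vector that a classifier could consume. The function names and the toy sequence are hypothetical, and skipping windows with non-ACGT characters is an assumption, not something the abstract specifies.

```python
from collections import Counter

DNA_ALPHABET = set("ACGT")


def count_kmers(sequence, k):
    """Count every overlapping k-mer, skipping windows with non-ACGT symbols."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= DNA_ALPHABET:
            counts[kmer] += 1
    return counts


def kmer_vector(sequence, k, vocabulary):
    """Turn a sequence into a count vector ordered by the given vocabulary."""
    counts = count_kmers(sequence, k)
    return [counts[kmer] for kmer in vocabulary]


# Toy sequence for illustration; real genomes are millions of bases long.
counts = count_kmers("ACGTACGT", 3)
print(counts)
print(kmer_vector("ACGTACGT", 3, ["ACG", "CGT", "GTA", "TAC", "AAA"]))
```

Because the number of possible k-mers grows as 4^k, larger values of k increase memory and compute requirements, which matches the trade-off the abstract describes.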

1513 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge that calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 138
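
The five evaluation measures reported in the abstract above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts passed in are hypothetical and are not the study's confusion matrix, which the abstract does not provide.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics; assumes both classes and both predictions occur."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)        # sensitivity, true positive rate
    precision = tp / (tp + fp)     # positive predictive value
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical confusion-matrix counts for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=7, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient stays informative when the two phenotype classes are imbalanced, which is one reason it is often reported alongside the other four measures.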

1512 Synthesis of Silver Nanoparticle: An Analytical Method Based Approach for the Quantitative Assessment of Drug

Authors: Zeid A. Alothman

Abstract:

Silver nanoparticles (AgNPs) have been synthesized using adrenaline. Adrenaline readily undergoes an autoxidation reaction with dissolved oxygen in an alkaline medium to form adrenochrome, thus behaving as a mild reducing agent for the dissolved oxygen. This reducing behavior of adrenaline, when employed to reduce Ag⁺ ions, yielded a large enhancement in absorbance intensity in the visible region. Transmission electron microscopy (TEM) and X-ray diffraction (XRD) studies have been performed to confirm the surface morphology of the AgNPs. Further, metallic nanoparticles larger than 2 nm cause a strong and broad absorption band in the UV-visible spectrum, called the surface plasmon band or Mie resonance. The formation of AgNPs caused a large enhancement in absorbance, with λmax at 436 nm, through excitation of the surface plasmon band. The formation of AgNPs was adapted for the quantitative assessment of adrenaline using spectrophotometry, with a lower detection limit and higher precision.

Keywords: silver nanoparticle, adrenaline, XRD, TEM, analysis

Procedia PDF Downloads 190

1511 Two-Step Patterning of Microfluidic Structures in Paper by Laser Cutting and Wax Printing for Mass Fabrication of Biosensor

Authors: Bong Keun Kang, Sung Suk Oh, Jeong-Woo Sohn, Jong-Ryul Choi, Young Ho Kim

Abstract:

In this paper, we describe two-step micro-patterning using laser cutting and wax printing. Wax printing is performed only on the bridges, to form hydrophobic barriers. We prepared a 405 nm blue-violet laser module and a wax pencil module, and combined the two modules with an x-y plotter. The hollow microstructures formed by laser patterning define the hydrophilic flow paths. However, bridges are essential to prevent the cut-out areas from becoming isolated islands. Through these support bridges, the microfluidic solution spreads into unwanted areas. Chromatography blotting paper was purchased from Whatman, in sheets of 20 × 20 cm and 46 × 57 cm. The axis moving speed of the x-y plotter was the main optimization parameter. To align the two patterning steps, the paper sheet was taped at the bottom. After the two-step patterning, a thermal curing step was carried out at 110-130 °C. The resolution of the fabrication and the potential for multiplex detection were investigated.

Keywords: µPADs, microfluidic, biosensor, mass-fabrication

Procedia PDF Downloads 452

1510 Process Data-Driven Representation of Abnormalities for Efficient Process Control

Authors: Hyun-Woo Cho

Abstract:

Unexpected operational events or abnormalities in industrial processes have a serious impact on the quality of the final product. In statistical process control, fault detection and diagnosis is one of the essential tasks needed to run a process safely. In this work, a nonlinear representation of process measurement data is presented and evaluated using a simulated process. The effect of different representation methods on diagnosis performance is tested in terms of computational efficiency and data handling. The results show that the nonlinear representation technique produced more reliable diagnosis results and outperformed linear methods. The use of a data filtering step improved computational speed and diagnosis performance for the test data sets. The presented scheme differs from existing ones in that it attempts to extract the fault pattern in the reduced space rather than in the original process variable space. This scheme thus helps to reduce the sensitivity of empirical models to noise.

Keywords: fault diagnosis, nonlinear technique, process data, reduced spaces

Procedia PDF Downloads 232

1509 Syndrome of Irreversible Lithium-Effectuated Neurotoxicity: Case Report and Review of Literature

Authors: David J. Thomson, Joshua C. J. Chew

Abstract:

Background: Syndrome of Irreversible Lithium-Effectuated Neurotoxicity (SILENT) is a rare complication of lithium toxicity that typically causes irreversible cerebellar dysfunction. These patients may require hemodialysis and extensive support in intensive care. Methods: A review was performed of the available literature on SILENT, with a focus on current pathophysiological hypotheses and advances in treatment. Articles were restricted to the English language. Results: Although the exact mechanism is unclear, CNS demyelination, especially in the cerebellum, was seen on the brain biopsies of a proportion of patients. There is no definitive management of SILENT; instead, current management focuses on primary and tertiary prevention: detection of those at risk, and rehabilitation after the onset of neurological deficits. Conclusions: This review draws conclusions from a limited amount of available literature, most of which consists of isolated case reports. Greater awareness of SILENT and further investigation into its risk factors and pathogenesis are required so that this serious and irreversible syndrome may be avoided.

Keywords: lithium toxicity, pathogenesis, SILENT, syndrome of irreversible lithium-effectuated neurotoxicity

Procedia PDF Downloads 481