Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3726

Search results for: pointing accuracy

2916 A TgCNN-Based Surrogate Model for Subsurface Oil-Water Phase Flow under Multi-Well Conditions

Authors: Jian Li

Abstract:

The uncertainty quantification and inversion problems of subsurface oil-water phase flow usually require extensive repeated forward calculations for new runs with changed conditions. To reduce the computational time, various forms of surrogate models have been built. Related research shows that deep learning has emerged as an effective surrogate model, while most surrogate models with deep learning are purely data-driven, which always leads to poor robustness and abnormal results. To guarantee the model more consistent with the physical laws, a coupled theory-guided convolutional neural network (TgCNN) based surrogate model is built to facilitate computation efficiency under the premise of satisfactory accuracy. The model is a convolutional neural network based on multi-well reservoir simulation. The core notion of this proposed method is to bridge two separate blocks on top of an overall network. They underlie the TgCNN model in a coupled form, which reflects the coupling nature of pressure and water saturation in the two-phase flow equation. The model is driven by not only labeled data but also scientific theories, including governing equations, stochastic parameterization, boundary, and initial conditions, well conditions, and expert knowledge. The results show that the TgCNN-based surrogate model exhibits satisfactory accuracy and efficiency in subsurface oil-water phase flow under multi-well conditions.

Keywords: coupled theory-guided convolutional neural network, multi-well conditions, surrogate model, subsurface oil-water phase

Procedia PDF Downloads 68

2915 The Role of Chennai NGOs in Combatting Human Trafficking

Authors: Nisha James, Shubha Ranganathan

Abstract:

Sex trafficking is a type of human trafficking involving prostitution of individuals for sexual exploitation. The stigma and social isolation they face in the society often makes it difficult for them to become rehabilitated from trafficking, due to which many of them continue in prostitution for years after being sex trafficked. Victims are subjected to violations of their fundamental human rights, deprived of basic medical facilities and undergo long-term abuse. This paper focuses on the role of Non-Governmental Organizations (NGOs) in the rescue and rehabilitation of victims of sex trafficking. Semi-structured interviews were conducted with 26 survivors of sex trafficking, five sex workers and 14 non-community staff members of a project running NGO in the city of Chennai in South India. Chennai has a number of NGOs that are involved in HIV/AIDS awareness and prevention programs. In many cases, rehabilitation of sex trafficking victims is also a mandate of these NGOs. This particular NGO was also involved in development activities towards the eradication of HIV/AIDS. For instance, they were engaged in inculcating safe sex practices among high-risk groups such as sex workers or in fighting for sex worker rights. The study found that the NGO’s role in combatting sex trafficking is overrun by the way it approaches these issue related to HIV/AIDS. Further, their activities are dependent solely on funding. Given that gradually, international funding for HIV/AIDS has slowly been withdrawn, there have been problems such as reduction in the salary of the project staff, the outreach workers and peer educators, many of whom were survivors of sex trafficking who have been able to survive on their wages instead of continuing in prostitution. Therefore, till date, the project funding has helped in making them aware of the health and social consequences of continuing in prostitution, and in supporting them socioeconomically, but the lack of funding may also lead the NGO workers into a state of unemployment, poverty and eventually into being re-trafficked. The study concludes by pointing to the need for disengaging anti-trafficking efforts from the HIV/AIDS related programs.

Keywords: non-governmental organization role, non-governmental organization staff, sex trafficking survivors, sex workers

Procedia PDF Downloads 289

2914 Using Deep Learning for the Detection of Faulty RJ45 Connectors on a Radio Base Station

Authors: Djamel Fawzi Hadj Sadok, Marrone Silvério Melo Dantas Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner

Abstract:

A radio base station (RBS), part of the radio access network, is a particular type of equipment that supports the connection between a wide range of cellular user devices and an operator network access infrastructure. Nowadays, most of the RBS maintenance is carried out manually, resulting in a time consuming and costly task. A suitable candidate for RBS maintenance automation is repairing faulty links between devices caused by missing or unplugged connectors. A suitable candidate for RBS maintenance automation is repairing faulty links between devices caused by missing or unplugged connectors. This paper proposes and compares two deep learning solutions to identify attached RJ45 connectors on network ports. We named connector detection, the solution based on object detection, and connector classification, the one based on object classification. With the connector detection, we get an accuracy of 0:934, mean average precision 0:903. Connector classification, get a maximum accuracy of 0:981 and an AUC of 0:989. Although connector detection was outperformed in this study, this should not be viewed as an overall result as connector detection is more flexible for scenarios where there is no precise information about the environment and the possible devices. At the same time, the connector classification requires that information to be well-defined.

Keywords: radio base station, maintenance, classification, detection, deep learning, automation

Procedia PDF Downloads 179

2913 Early Depression Detection for Young Adults with a Psychiatric and AI Interdisciplinary Multimodal Framework

Authors: Raymond Xu, Ashley Hua, Andrew Wang, Yuru Lin

Abstract:

During COVID-19, the depression rate has increased dramatically. Young adults are most vulnerable to the mental health effects of the pandemic. Lower-income families have a higher ratio to be diagnosed with depression than the general population, but less access to clinics. This research aims to achieve early depression detection at low cost, large scale, and high accuracy with an interdisciplinary approach by incorporating clinical practices defined by American Psychiatric Association (APA) as well as multimodal AI framework. The proposed approach detected the nine depression symptoms with Natural Language Processing sentiment analysis and a symptom-based Lexicon uniquely designed for young adults. The experiments were conducted on the multimedia survey results from adolescents and young adults and unbiased Twitter communications. The result was further aggregated with the facial emotional cues analyzed by the Convolutional Neural Network on the multimedia survey videos. Five experiments each conducted on 10k data entries reached consistent results with an average accuracy of 88.31%, higher than the existing natural language analysis models. This approach can reach 300+ million daily active Twitter users and is highly accessible by low-income populations to promote early depression detection to raise awareness in adolescents and young adults and reveal complementary cues to assist clinical depression diagnosis.

Keywords: artificial intelligence, COVID-19, depression detection, psychiatric disorder

Procedia PDF Downloads 117

2912 Chatbots as Language Teaching Tools for L2 English Learners

Authors: Feiying Wu

Abstract:

Chatbots are computer programs that attempt to engage a human in a dialogue, which originated in the 1960s with MIT's Eliza. However, they have become widespread more recently as advances in language technology have produced chatbots with increasing linguistic quality and sophistication, leading to their potential to serve as a tool for Computer-Assisted Language Learning(CALL). The aim of this article is to assess the feasibility of using two chatbots, Mitsuku and CleverBot, as pedagogical tools for learning English as a second language by stimulating L2 learners with distinct English proficiencies. Speaking of the input of stimulated learners, they are measured by AntWordProfiler to match the user's expected vocabulary proficiency. Totally, there are four chat sessions as each chatbot will converse with both beginners and advanced learners. For evaluation, it focuses on chatbots' responses from a linguistic standpoint, encompassing vocabulary and sentence levels. The vocabulary level is determined by the vocabulary range and the reaction to misspelled words. Grammatical accuracy and responsiveness to poorly formed sentences are assessed for the sentence level. In addition, the assessment of this essay sets 25% lexical and grammatical incorrect input to determine chatbots' corrective ability towards different linguistic forms. Based on statistical evidence and illustration of examples, despite the small sample size, neither Mitsuku nor CleverBot is ideal as educational tools based on their performance through word range, grammatical accuracy, topic range, and corrective feedback for incorrect words and sentences, but rather as a conversational tool for beginners of L2 English.

Keywords: chatbots, CALL, L2, corrective feedback

Procedia PDF Downloads 63

2911 Prediction of Pounding between Two SDOF Systems by Using Link Element Based On Mathematic Relations and Suggestion of New Equation for Impact Damping Ratio

Authors: Seyed M. Khatami, H. Naderpour, R. Vahdani, R. C. Barros

Abstract:

Many previous studies have been carried out to calculate the impact force and the dissipated energy between two neighboring buildings during seismic excitation, when they collide with each other. Numerical studies are an important part of impact, which several researchers have tried to simulate the impact by using different formulas. Estimation of the impact force and the dissipated energy depends significantly on some parameters of impact. Mass of bodies, stiffness of spring, coefficient of restitution, damping ratio of dashpot and impact velocity are some known and unknown parameters to simulate the impact and measure dissipated energy during collision. Collision is usually shown by force-displacement hysteresis curve. The enclosed area of the hysteresis loop explains the dissipated energy during impact. In this paper, the effect of using different types of impact models is investigated in order to calculate the impact force. To increase the accuracy of impact model and to optimize the results of simulations, a new damping equation is assumed and is validated to get the best results of impact force and dissipated energy, which can show the accuracy of suggested equation of motion in comparison with other formulas. This relation is called "n-m". Based on mathematical relation, an initial value is selected for the mentioned coefficients and kinetic energy loss is calculated. After each simulation, kinetic energy loss and energy dissipation are compared with each other. If they are equal, selected parameters are true and, if not, the constant of parameters are modified and a new analysis is performed. Finally, two unknown parameters are suggested to estimate the impact force and calculate the dissipated energy.

Keywords: impact force, dissipated energy, kinetic energy loss, damping relation

Procedia PDF Downloads 539

2910 Development and Validation of High-Performance Liquid Chromatography Method for the Determination and Pharmacokinetic Study of Linagliptin in Rat Plasma

Authors: Hoda Mahgoub, Abeer Hanafy

Abstract:

Linagliptin (LNG) belongs to dipeptidyl-peptidase-4 (DPP-4) inhibitor class. DPP-4 inhibitors represent a new therapeutic approach for the treatment of type 2 diabetes in adults. The aim of this work was to develop and validate an accurate and reproducible HPLC method for the determination of LNG with high sensitivity in rat plasma. The method involved separation of both LNG and pindolol (internal standard) at ambient temperature on a Zorbax Eclipse XDB C18 column and a mobile phase composed of 75% methanol: 25% formic acid 0.1% pH 4.1 at a flow rate of 1.0 mL.min-1. UV detection was performed at 254nm. The method was validated in compliance with ICH guidelines and found to be linear in the range of 5–1000ng.mL-1. The limit of quantification (LOQ) was found to be 5ng.mL-1 based on 100µL of plasma. The variations for intra- and inter-assay precision were less than 10%, and the accuracy values were ranged between 93.3% and 102.5%. The extraction recovery (R%) was more than 83%. The method involved a single extraction step of a very small plasma volume (100µL). The assay was successfully applied to an in-vivo pharmacokinetic study of LNG in rats that were administered a single oral dose of 10mg.kg-1 LNG. The maximum concentration (Cmax) was found to be 927.5 ± 23.9ng.mL-1. The area under the plasma concentration-time curve (AUC0-72) was 18285.02 ± 605.76h.ng.mL-1. In conclusion, the good accuracy and low LOQ of the bioanalytical HPLC method were suitable for monitoring the full pharmacokinetic profile of LNG in rats. The main advantages of the method were the sensitivity, small sample volume, single-step extraction procedure and the short time of analysis.

Keywords: HPLC, linagliptin, pharmacokinetic study, rat plasma

Procedia PDF Downloads 232

2909 Internet of Things Networks: Denial of Service Detection in Constrained Application Protocol Using Machine Learning Algorithm

Authors: Adamu Abdullahi, On Francisca, Saidu Isah Rambo, G. N. Obunadike, D. T. Chinyio

Abstract:

The paper discusses the potential threat of Denial of Service (DoS) attacks in the Internet of Things (IoT) networks on constrained application protocols (CoAP). As billions of IoT devices are expected to be connected to the internet in the coming years, the security of these devices is vulnerable to attacks, disrupting their functioning. This research aims to tackle this issue by applying mixed methods of qualitative and quantitative for feature selection, extraction, and cluster algorithms to detect DoS attacks in the Constrained Application Protocol (CoAP) using the Machine Learning Algorithm (MLA). The main objective of the research is to enhance the security scheme for CoAP in the IoT environment by analyzing the nature of DoS attacks and identifying a new set of features for detecting them in the IoT network environment. The aim is to demonstrate the effectiveness of the MLA in detecting DoS attacks and compare it with conventional intrusion detection systems for securing the CoAP in the IoT environment. Findings: The research identifies the appropriate node to detect DoS attacks in the IoT network environment and demonstrates how to detect the attacks through the MLA. The accuracy detection in both classification and network simulation environments shows that the k-means algorithm scored the highest percentage in the training and testing of the evaluation. The network simulation platform also achieved the highest percentage of 99.93% in overall accuracy. This work reviews conventional intrusion detection systems for securing the CoAP in the IoT environment. The DoS security issues associated with the CoAP are discussed.

Keywords: algorithm, CoAP, DoS, IoT, machine learning

Procedia PDF Downloads 54

2908 A Two-Stage Bayesian Variable Selection Method with the Extension of Lasso for Geo-Referenced Data

Authors: Georgiana Onicescu, Yuqian Shen

Abstract:

Due to the complex nature of geo-referenced data, multicollinearity of the risk factors in public health spatial studies is a commonly encountered issue, which leads to low parameter estimation accuracy because it inflates the variance in the regression analysis. To address this issue, we proposed a two-stage variable selection method by extending the least absolute shrinkage and selection operator (Lasso) to the Bayesian spatial setting, investigating the impact of risk factors to health outcomes. Specifically, in stage I, we performed the variable selection using Bayesian Lasso and several other variable selection approaches. Then, in stage II, we performed the model selection with only the selected variables from stage I and compared again the methods. To evaluate the performance of the two-stage variable selection methods, we conducted a simulation study with different distributions for the risk factors, using geo-referenced count data as the outcome and Michigan as the research region. We considered the cases when all candidate risk factors are independently normally distributed, or follow a multivariate normal distribution with different correlation levels. Two other Bayesian variable selection methods, Binary indicator, and the combination of Binary indicator and Lasso were considered and compared as alternative methods. The simulation results indicated that the proposed two-stage Bayesian Lasso variable selection method has the best performance for both independent and dependent cases considered. When compared with the one-stage approach, and the other two alternative methods, the two-stage Bayesian Lasso approach provides the highest estimation accuracy in all scenarios considered.

Keywords: Lasso, Bayesian analysis, spatial analysis, variable selection

Procedia PDF Downloads 123

2907 Creep Analysis and Rupture Evaluation of High Temperature Materials

Authors: Yuexi Xiong, Jingwu He

Abstract:

The structural components in an energy facility such as steam turbine machines are operated under high stress and elevated temperature in an endured time period and thus the creep deformation and creep rupture failure are important issues that need to be addressed in the design of such components. There are numerous creep models being used for creep analysis that have both advantages and disadvantages in terms of accuracy and efficiency. The Isochronous Creep Analysis is one of the simplified approaches in which a full-time dependent creep analysis is avoided and instead an elastic-plastic analysis is conducted at each time point. This approach has been established based on the rupture dependent creep equations using the well-known Larson-Miller parameter. In this paper, some fundamental aspects of creep deformation and the rupture dependent creep models are reviewed and the analysis procedures using isochronous creep curves are discussed. Four rupture failure criteria are examined from creep fundamental perspectives including criteria of Stress Damage, Strain Damage, Strain Rate Damage, and Strain Capability. The accuracy of these criteria in predicting creep life is discussed and applications of the creep analysis procedures and failure predictions of simple models will be presented. In addition, a new failure criterion is proposed to improve the accuracy and effectiveness of the existing criteria. Comparisons are made between the existing criteria and the new one using several examples materials. Both strain increase and stress relaxation form a full picture of the creep behaviour of a material under high temperature in an endured time period. It is important to bear this in mind when dealing with creep problems. Accordingly there are two sets of rupture dependent creep equations. While the rupture strength vs LMP equation shows how the rupture time depends on the stress level under load controlled condition, the strain rate vs rupture time equation reflects how the rupture time behaves under strain-controlled condition. Among the four existing failure criteria for rupture life predictions, the Stress Damage and Strain Damage Criteria provide the most conservative and non-conservative predictions, respectively. The Strain Rate and Strain Capability Criteria provide predictions in between that are believed to be more accurate because the strain rate and strain capability are more determined quantities than stress to reflect the creep rupture behaviour. A modified Strain Capability Criterion is proposed making use of the two sets of creep equations and therefore is considered to be more accurate than the original Strain Capability Criterion.

Keywords: creep analysis, high temperature mateials, rapture evalution, steam turbine machines

Procedia PDF Downloads 273

2906 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to noise ratio (SNR).In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the inuence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.

Keywords: DBSCAN, potential function, speech signal, the UBSS model

Procedia PDF Downloads 120

2905 Influence of Alcohol Consumption on Attention in Wistar Albino Rats

Authors: Adekunle Adesina, Dorcas Adesina

Abstract:

This Research investigated the influence of alcohol consumption on attention in Wister albino rats. It was designed to test whether or not alcohol consumption affected visual and auditory attention. The sample of this study comprise of 3males albino rats and 3 females albino rats which were randomly assigned to 3 (male/female each) groups, 1, 2 and 3. The first group which was experimental Group 1 received 4ml of alcohol ingestion with cannula twice daily (morning and evening). The second group which was experimental group 2 received 2ml of alcohol ingestion with cannula twice daily (morning and evening). Third group which was the control group only received water (placebo), all these happened within a period of 2 days. Three hypotheses were advanced and testedf in the study. Hypothesis 1 stated that there will be no significant difference between the response speed of albino rats that consume alcohol and those that consume water on visual attention using 5-CSRTT. This was confirmed (DF (2, 9) = 0.72, P <.05). Hypothesis 2 stated that albino rats who consumed alcohol will perform better than those who consume water on auditory accuracy using 5-CSRTT. This was also tested but not confirmed (DF (2, 9) = 2.10, P< .05). The third hypothesis which stated that female albino rats who consumed alcohol would not perform better than male albino rats who consumed alcohol on auditory accuracy using 5-CSRTT was tested and not confirmed. (DF (4) = 0.17, P < .05). Data was analyzed using one-way ANOVA and T-test for independent measures. It was therefore recommended that government policies and programs should be directed at reducing to the barest minimum the rate of alcohol consumption especially among males as it is detrimental to the human auditory attentional organ.

Keywords: alcohol, attention, influence, rats, Wistar

Procedia PDF Downloads 244

2904 An Enhanced Approach in Validating Analytical Methods Using Tolerance-Based Design of Experiments (DoE)

Authors: Gule Teri

Abstract:

The effective validation of analytical methods forms a crucial component of pharmaceutical manufacturing. However, traditional validation techniques can occasionally fail to fully account for inherent variations within datasets, which may result in inconsistent outcomes. This deficiency in validation accuracy is particularly noticeable when quantifying low concentrations of active pharmaceutical ingredients (APIs), excipients, or impurities, introducing a risk to the reliability of the results and, subsequently, the safety and effectiveness of the pharmaceutical products. In response to this challenge, we introduce an enhanced, tolerance-based Design of Experiments (DoE) approach for the validation of analytical methods. This approach distinctly measures variability with reference to tolerance or design margins, enhancing the precision and trustworthiness of the results. This method provides a systematic, statistically grounded validation technique that improves the truthfulness of results. It offers an essential tool for industry professionals aiming to guarantee the accuracy of their measurements, particularly for low-concentration components. By incorporating this innovative method, pharmaceutical manufacturers can substantially advance their validation processes, subsequently improving the overall quality and safety of their products. This paper delves deeper into the development, application, and advantages of this tolerance-based DoE approach and demonstrates its effectiveness using High-Performance Liquid Chromatography (HPLC) data for verification. This paper also discusses the potential implications and future applications of this method in enhancing pharmaceutical manufacturing practices and outcomes.

Keywords: tolerance-based design, design of experiments, analytical method validation, quality control, biopharmaceutical manufacturing

Procedia PDF Downloads 56

2903 Data Augmentation for Early-Stage Lung Nodules Using Deep Image Prior and Pix2pix

Authors: Qasim Munye, Juned Islam, Haseeb Qureshi, Syed Jung

Abstract:

Lung nodules are commonly identified in computed tomography (CT) scans by experienced radiologists at a relatively late stage. Early diagnosis can greatly increase survival. We propose using a pix2pix conditional generative adversarial network to generate realistic images simulating early-stage lung nodule growth. We have applied deep images prior to 2341 slices from 895 computed tomography (CT) scans from the Lung Image Database Consortium (LIDC) dataset to generate pseudo-healthy medical images. From these images, 819 were chosen to train a pix2pix network. We observed that for most of the images, the pix2pix network was able to generate images where the nodule increased in size and intensity across epochs. To evaluate the images, 400 generated images were chosen at random and shown to a medical student beside their corresponding original image. Of these 400 generated images, 384 were defined as satisfactory - meaning they resembled a nodule and were visually similar to the corresponding image. We believe that this generated dataset could be used as training data for neural networks to detect lung nodules at an early stage or to improve the accuracy of such networks. This is particularly significant as datasets containing the growth of early-stage nodules are scarce. This project shows that the combination of deep image prior and generative models could potentially open the door to creating larger datasets than currently possible and has the potential to increase the accuracy of medical classification tasks.

Keywords: medical technology, artificial intelligence, radiology, lung cancer

Procedia PDF Downloads 53

2902 Effect of Knowledge of Bubble Point Pressure on Estimating PVT Properties from Correlations

Authors: Ahmed El-Banbi, Ahmed El-Maraghi

Abstract:

PVT properties are needed as input data in all reservoir, production, and surface facilities engineering calculations. In the absence of PVT reports on valid reservoir fluid samples, engineers rely on PVT correlations to generate the required PVT data. The accuracy of PVT correlations varies, and no correlation group has been found to provide accurate results for all oil types. The effect of inaccurate PVT data can be significant in engineering calculations and is well documented in the literature. Bubble point pressure can sometimes be obtained from external sources. In this paper, we show how to utilize the known bubble point pressure to improve the accuracy of calculated PVT properties from correlations. We conducted a systematic study using around 250 reservoir oil samples to quantify the effect of pre-knowledge of bubble point pressure. The samples spanned a wide range of oils, from very volatile oils to black oils and all the way to low-GOR oils. A method for shifting both undersaturated and saturated sections of the PVT properties curves to the correct bubble point is explained. Seven PVT correlation families were used in this study. All PVT properties (e.g., solution gas-oil ratio, formation volume factor, density, viscosity, and compressibility) were calculated using the correct bubble point pressure and the correlation estimated bubble point pressure. Comparisons between the calculated PVT properties and actual laboratory-measured values were made. It was found that pre-knowledge of bubble point pressure and using the shifting technique presented in the paper improved the correlation-estimated values by 10% to more than 30%. The most improvement was seen in the solution gas-oil ratio and formation volume factor.

Keywords: PVT data, PVT properties, PVT correlations, bubble point pressure

Procedia PDF Downloads 47

2901 Automatic Staging and Subtype Determination for Non-Small Cell Lung Carcinoma Using PET Image Texture Analysis

Authors: Seyhan Karaçavuş, Bülent Yılmaz, Ömer Kayaaltı, Semra İçer, Arzu Taşdemir, Oğuzhan Ayyıldız, Kübra Eset, Eser Kaya

Abstract:

In this study, our goal was to perform tumor staging and subtype determination automatically using different texture analysis approaches for a very common cancer type, i.e., non-small cell lung carcinoma (NSCLC). Especially, we introduced a texture analysis approach, called Law’s texture filter, to be used in this context for the first time. The 18F-FDG PET images of 42 patients with NSCLC were evaluated. The number of patients for each tumor stage, i.e., I-II, III or IV, was 14. The patients had ~45% adenocarcinoma (ADC) and ~55% squamous cell carcinoma (SqCCs). MATLAB technical computing language was employed in the extraction of 51 features by using first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and Laws’ texture filters. The feature selection method employed was the sequential forward selection (SFS). Selected textural features were used in the automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). In the automatic classification of tumor stage, the accuracy was approximately 59.5% with k-NN classifier (k=3) and 69% with SVM (with one versus one paradigm), using 5 features. In the automatic classification of tumor subtype, the accuracy was around 92.7% with SVM one vs. one. Texture analysis of FDG-PET images might be used, in addition to metabolic parameters as an objective tool to assess tumor histopathological characteristics and in automatic classification of tumor stage and subtype.

Keywords: cancer stage, cancer cell type, non-small cell lung carcinoma, PET, texture analysis

Procedia PDF Downloads 308

2900 Evaluation of Machine Learning Algorithms and Ensemble Methods for Prediction of Students’ Graduation

Authors: Soha A. Bahanshal, Vaibhav Verdhan, Bayong Kim

Abstract:

Graduation rates at six-year colleges are becoming a more essential indicator for incoming fresh students and for university rankings. Predicting student graduation is extremely beneficial to schools and has a huge potential for targeted intervention. It is important for educational institutions since it enables the development of strategic plans that will assist or improve students' performance in achieving their degrees on time (GOT). A first step and a helping hand in extracting useful information from these data and gaining insights into the prediction of students' progress and performance is offered by machine learning techniques. Data analysis and visualization techniques are applied to understand and interpret the data. The data used for the analysis contains students who have graduated in 6 years in the academic year 2017-2018 for science majors. This analysis can be used to predict the graduation of students in the next academic year. Different Predictive modelings such as logistic regression, decision trees, support vector machines, Random Forest, Naïve Bayes, and KNeighborsClassifier are applied to predict whether a student will graduate. These classifiers were evaluated with k folds of 5. The performance of these classifiers was compared based on accuracy measurement. The results indicated that Ensemble Classifier achieves better accuracy, about 91.12%. This GOT prediction model would hopefully be useful to university administration and academics in developing measures for assisting and boosting students' academic performance and ensuring they graduate on time.

Keywords: prediction, decision trees, machine learning, support vector machine, ensemble model, student graduation, GOT graduate on time

Procedia PDF Downloads 60

2899 Path-Tracking Controller for Tracked Mobile Robot on Rough Terrain

Authors: Toshifumi Hiramatsu, Satoshi Morita, Manuel Pencelli, Marta Niccolini, Matteo Ragaglia, Alfredo Argiolas

Abstract:

Automation technologies for agriculture field are needed to promote labor-saving. One of the most relevant problems in automated agriculture is represented by controlling the robot along a predetermined path in presence of rough terrain or incline ground. Unfortunately, disturbances originating from interaction with the ground, such as slipping, make it quite difficult to achieve the required accuracy. In general, it is required to move within 5-10 cm accuracy with respect to the predetermined path. Moreover, lateral velocity caused by gravity on the incline field also affects slipping. In this paper, a path-tracking controller for tracked mobile robots moving on rough terrains of incline field such as vineyard is presented. The controller is composed of a disturbance observer and an adaptive controller based on the kinematic model of the robot. The disturbance observer measures the difference between the measured and the reference yaw rate and linear velocity in order to estimate slip. Then, the adaptive controller adapts “virtual” parameter of the kinematics model: Instantaneous Centers of Rotation (ICRs). Finally, target angular velocity reference is computed according to the adapted parameter. This solution allows estimating the effects of slip without making the model too complex. Finally, the effectiveness of the proposed solution is tested in a simulation environment.

Keywords: the agricultural robot, autonomous control, path-tracking control, tracked mobile robot

Procedia PDF Downloads 160

2898 Performance and Limitations of Likelihood Based Information Criteria and Leave-One-Out Cross-Validation Approximation Methods

Authors: M. A. C. S. Sampath Fernando, James M. Curran, Renate Meyer

Abstract:

Model assessment, in the Bayesian context, involves evaluation of the goodness-of-fit and the comparison of several alternative candidate models for predictive accuracy and improvements. In posterior predictive checks, the data simulated under the fitted model is compared with the actual data. Predictive model accuracy is estimated using information criteria such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC). The goal of an information criterion is to obtain an unbiased measure of out-of-sample prediction error. Since posterior checks use the data twice; once for model estimation and once for testing, a bias correction which penalises the model complexity is incorporated in these criteria. Cross-validation (CV) is another method used for examining out-of-sample prediction accuracy. Leave-one-out cross-validation (LOO-CV) is the most computationally expensive variant among the other CV methods, as it fits as many models as the number of observations. Importance sampling (IS), truncated importance sampling (TIS) and Pareto-smoothed importance sampling (PSIS) are generally used as approximations to the exact LOO-CV and utilise the existing MCMC results avoiding expensive computational issues. The reciprocals of the predictive densities calculated over posterior draws for each observation are treated as the raw importance weights. These are in turn used to calculate the approximate LOO-CV of the observation as a weighted average of posterior densities. In IS-LOO, the raw weights are directly used. In contrast, the larger weights are replaced by their modified truncated weights in calculating TIS-LOO and PSIS-LOO. Although, information criteria and LOO-CV are unable to reflect the goodness-of-fit in absolute sense, the differences can be used to measure the relative performance of the models of interest. However, the use of these measures is only valid under specific circumstances. This study has developed 11 models using normal, log-normal, gamma, and student’s t distributions to improve the PCR stutter prediction with forensic data. These models are comprised of four with profile-wide variances, four with locus specific variances, and three which are two-component mixture models. The mean stutter ratio in each model is modeled as a locus specific simple linear regression against a feature of the alleles under study known as the longest uninterrupted sequence (LUS). The use of AIC, BIC, DIC, and WAIC in model comparison has some practical limitations. Even though, IS-LOO, TIS-LOO, and PSIS-LOO are considered to be approximations of the exact LOO-CV, the study observed some drastic deviations in the results. However, there are some interesting relationships among the logarithms of pointwise predictive densities (lppd) calculated under WAIC and the LOO approximation methods. The estimated overall lppd is a relative measure that reflects the overall goodness-of-fit of the model. Parallel log-likelihood profiles for the models conditional on equal posterior variances in lppds were observed. This study illustrates the limitations of the information criteria in practical model comparison problems. In addition, the relationships among LOO-CV approximation methods and WAIC with their limitations are discussed. Finally, useful recommendations that may help in practical model comparisons with these methods are provided.

Keywords: cross-validation, importance sampling, information criteria, predictive accuracy

Procedia PDF Downloads 378

2897 On the Added Value of Probabilistic Forecasts Applied to the Optimal Scheduling of a PV Power Plant with Batteries in French Guiana

Authors: Rafael Alvarenga, Hubert Herbaux, Laurent Linguet

Abstract:

The uncertainty concerning the power production of intermittent renewable energy is one of the main barriers to the integration of such assets into the power grid. Efforts have thus been made to develop methods to quantify this uncertainty, allowing producers to ensure more reliable and profitable engagements related to their future power delivery. Even though a diversity of probabilistic approaches was proposed in the literature giving promising results, the added value of adopting such methods for scheduling intermittent power plants is still unclear. In this study, the profits obtained by a decision-making model used to optimally schedule an existing PV power plant connected to batteries are compared when the model is fed with deterministic and probabilistic forecasts generated with two of the most recent methods proposed in the literature. Moreover, deterministic forecasts with different accuracy levels were used in the experiments, testing the utility and the capability of probabilistic methods of modeling the progressively increasing uncertainty. Even though probabilistic approaches are unquestionably developed in the recent literature, the results obtained through a study case show that deterministic forecasts still provide the best performance if accurate, ensuring a gain of 14% on final profits compared to the average performance of probabilistic models conditioned to the same forecasts. When the accuracy of deterministic forecasts progressively decreases, probabilistic approaches start to become competitive options until they completely outperform deterministic forecasts when these are very inaccurate, generating 73% more profits in the case considered compared to the deterministic approach.

Keywords: PV power forecasting, uncertainty quantification, optimal scheduling, power systems

Procedia PDF Downloads 64

2896 A Robust and Efficient Segmentation Method Applied for Cardiac Left Ventricle with Abnormal Shapes

Authors: Peifei Zhu, Zisheng Li, Yasuki Kakishita, Mayumi Suzuki, Tomoaki Chono

Abstract:

Segmentation of left ventricle (LV) from cardiac ultrasound images provides a quantitative functional analysis of the heart to diagnose disease. Active Shape Model (ASM) is a widely used approach for LV segmentation but suffers from the drawback that initialization of the shape model is not sufficiently close to the target, especially when dealing with abnormal shapes in disease. In this work, a two-step framework is proposed to improve the accuracy and speed of the model-based segmentation. Firstly, a robust and efficient detector based on Hough forest is proposed to localize cardiac feature points, and such points are used to predict the initial fitting of the LV shape model. Secondly, to achieve more accurate and detailed segmentation, ASM is applied to further fit the LV shape model to the cardiac ultrasound image. The performance of the proposed method is evaluated on a dataset of 800 cardiac ultrasound images that are mostly of abnormal shapes. The proposed method is compared to several combinations of ASM and existing initialization methods. The experiment results demonstrate that the accuracy of feature point detection for initialization was improved by 40% compared to the existing methods. Moreover, the proposed method significantly reduces the number of necessary ASM fitting loops, thus speeding up the whole segmentation process. Therefore, the proposed method is able to achieve more accurate and efficient segmentation results and is applicable to unusual shapes of heart with cardiac diseases, such as left atrial enlargement.

Keywords: hough forest, active shape model, segmentation, cardiac left ventricle

Procedia PDF Downloads 327

2895 Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values

Authors: M. Aghili, S. Tabarestani, C. Freytes, M. Shojaie, M. Cabrerizo, A. Barreto, N. Rishe, R. E. Curiel, D. Loewenstein, R. Duara, M. Adjouadi

Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN) which is essential for developing effective and early treatment methods. To address the aforementioned challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and in the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes of AD, NC, EMCI, and LMCI. Using 10-fold cross validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and recall of 80.51%, supporting the more natural and promising multiclass classification.

Keywords: eXtreme gradient boosting, missing data, Alzheimer disease, early mild cognitive impairment, late mild cognitive impair, multiclass classification, ADNI, support vector machine, random forest

Procedia PDF Downloads 170

2894 Automatic Detection of Traffic Stop Locations Using GPS Data

Authors: Areej Salaymeh, Loren Schwiebert, Stephen Remias, Jonathan Waddell

Abstract:

Extracting information from new data sources has emerged as a crucial task in many traffic planning processes, such as identifying traffic patterns, route planning, traffic forecasting, and locating infrastructure improvements. Given the advanced technologies used to collect Global Positioning System (GPS) data from dedicated GPS devices, GPS equipped phones, and navigation tools, intelligent data analysis methodologies are necessary to mine this raw data. In this research, an automatic detection framework is proposed to help identify and classify the locations of stopped GPS waypoints into two main categories: signalized intersections or highway congestion. The Delaunay triangulation is used to perform this assessment in the clustering phase. While most of the existing clustering algorithms need assumptions about the data distribution, the effectiveness of the Delaunay triangulation relies on triangulating geographical data points without such assumptions. Our proposed method starts by cleaning noise from the data and normalizing it. Next, the framework will identify stoppage points by calculating the traveled distance. The last step is to use clustering to form groups of waypoints for signalized traffic and highway congestion. Next, a binary classifier was applied to find distinguish highway congestion from signalized stop points. The binary classifier uses the length of the cluster to find congestion. The proposed framework shows high accuracy for identifying the stop positions and congestion points in around 99.2% of trials. We show that it is possible, using limited GPS data, to distinguish with high accuracy.

Keywords: Delaunay triangulation, clustering, intelligent transportation systems, GPS data

Procedia PDF Downloads 260

2893 A Human Factors Approach to Workload Optimization for On-Screen Review Tasks

Authors: Christina Kirsch, Adam Hatzigiannis

Abstract:

Rail operators and maintainers worldwide are increasingly replacing walking patrols in the rail corridor with mechanized track patrols -essentially data capture on trains- and on-screen reviews of track infrastructure in centralized review facilities. The benefit is that infrastructure workers are less exposed to the dangers of the rail corridor. The impact is a significant change in work design from walking track sections and direct observation in the real world to sedentary jobs in the review facility reviewing captured data on screens. Defects in rail infrastructure can have catastrophic consequences. Reviewer performance regarding accuracy and efficiency of reviews within the available time frame is essential to ensure safety and operational performance. Rail operators must optimize workload and resource loading to transition to on-screen reviews successfully. Therefore, they need to know what workload assessment methodologies will provide reliable and valid data to optimize resourcing for on-screen reviews. This paper compares objective workload measures, including track difficulty ratings and review distance covered per hour, and subjective workload assessments (NASA TLX) and analyses the link between workload and reviewer performance, including sensitivity, precision, and overall accuracy. An experimental study was completed with eight on-screen reviewers, including infrastructure workers and engineers, reviewing track sections with different levels of track difficulty over nine days. Each day the reviewers completed four 90-minute sessions of on-screen inspection of the track infrastructure. Data regarding the speed of review (km/ hour), detected defects, false negatives, and false positives were collected. Additionally, all reviewers completed a subjective workload assessment (NASA TLX) after each 90-minute session and a short employee engagement survey at the end of the study period that captured impacts on job satisfaction and motivation. The results showed that objective measures for tracking difficulty align with subjective mental demand, temporal demand, effort, and frustration in the NASA TLX. Interestingly, review speed correlated with subjective assessments of physical and temporal demand, but to mental demand. Subjective performance ratings correlated with all accuracy measures and review speed. The results showed that subjective NASA TLX workload assessments accurately reflect objective workload. The analysis of the impact of workload on performance showed that subjective mental demand correlated with high precision -accurately detected defects, not false positives. Conversely, high temporal demand was negatively correlated with sensitivity and the percentage of detected existing defects. Review speed was significantly correlated with false negatives. With an increase in review speed, accuracy declined. On the other hand, review speed correlated with subjective performance assessments. Reviewers thought their performance was higher when they reviewed the track sections faster, despite the decline in accuracy. The study results were used to optimize resourcing and ensure that reviewers had enough time to review the allocated track sections to improve defect detection rates in accordance with the efficiency-thoroughness trade-off. Overall, the study showed the importance of a multi-method approach to workload assessment and optimization, combining subjective workload assessments with objective workload and performance measures to ensure that recommendations for work system optimization are evidence-based and reliable.

Keywords: automation, efficiency-thoroughness trade-off, human factors, job design, NASA TLX, performance optimization, subjective workload assessment, workload analysis

Procedia PDF Downloads 102

2892 Hourly Solar Radiations Predictions for Anticipatory Control of Electrically Heated Floor: Use of Online Weather Conditions Forecast

Authors: Helene Thieblemont, Fariborz Haghighat

Abstract:

Energy storage systems play a crucial role in decreasing building energy consumption during peak periods and expand the use of renewable energies in buildings. To provide a high building thermal performance, the energy storage system has to be properly controlled to insure a good energy performance while maintaining a satisfactory thermal comfort for building’s occupant. In the case of passive discharge storages, defining in advance the required amount of energy is required to avoid overheating in the building. Consequently, anticipatory supervisory control strategies have been developed forecasting future energy demand and production to coordinate systems. Anticipatory supervisory control strategies are based on some predictions, mainly of the weather forecast. However, if the forecasted hourly outdoor temperature may be found online with a high accuracy, solar radiations predictions are most of the time not available online. To estimate them, this paper proposes an advanced approach based on the forecast of weather conditions. Several methods to correlate hourly weather conditions forecast to real hourly solar radiations are compared. Results show that using weather conditions forecast allows estimating with an acceptable accuracy solar radiations of the next day. Moreover, this technique allows obtaining hourly data that may be used for building models. As a result, this solar radiation prediction model may help to implement model-based controller as Model Predictive Control.

Keywords: anticipatory control, model predictive control, solar radiation forecast, thermal storage

Procedia PDF Downloads 257

2891 Characteristing Aquifer Layers of Karstic Springs in Nahavand Plain Using Geoelectrical and Electromagnetic Methods

Authors: A. Taheri Tizro, Rojin Fasihi

Abstract:

Geoelectrical method is one of the most effective tools in determining subsurface lithological layers. The electromagnetic method is also a newer method that can play an important role in determining and separating subsurface layers with acceptable accuracy. In the present research, 10 electromagnetic soundings were collected in the upstream of 5 karstic springs of Famaseb, Faresban, Ghale Baroodab, Gian and Gonbad kabood in Nahavand plain of Hamadan province. By using the emerging data, the belectromagnetic logs were prepared at different depths and compared with 5 logs of the geoelectric method. The comparison showed that the value of NRMSE in the geoelectric method for the 5 springs of Famaseb, Faresban, Ghale Baroodab, Gian and Gonbad kabood were 7.11, 7.50, respectively. It is 44.93, 3.99, and 2.99, and in the electromagnetic method, the value of this coefficient for the investigated springs is about 1.4, 1.1, 1.2, 1.5, and 1.3, respectively. In addition to the similarity of the results of the two methods, it is found that, the accuracy of the electromagnetic method based on the NRMSE value is higher than the geoelectric method. The advantage of the electromagnetic method compared to geoelectric is on less time consuming and its cost prohibitive. The depth to water table is the final result of this research work , which showed that in the springs of Famaseb, Faresban, Ghale Baroodab, Gian and Gonbad kabood, having depth of about 6, 20, 10, 2 36 meters respectively. The maximum thickness of the aquifer layer was estimated in Gonbad kabood spring (36 meters) and the lowest in Gian spring (2 meters). These results can be used to identify the water potential of the region in order to better manage water resources.

Keywords: karst spring, geoelectric, aquifer layers, nahavand

Procedia PDF Downloads 58

2890 Development and Application of an Intelligent Masonry Modulation in BIM Tools: Literature Review

Authors: Sara A. Ben Lashihar

Abstract:

The heritage building information modelling (HBIM) of the historical masonry buildings has expanded lately to meet the urgent needs for conservation and structural analysis. The masonry structures are unique features for ancient building architectures worldwide that have special cultural, spiritual, and historical significance. However, there is a research gap regarding the reliability of the HBIM modeling process of these structures. The HBIM modeling process of the masonry structures faces significant challenges due to the inherent complexity and uniqueness of their structural systems. Most of these processes are based on tracing the point clouds and rarely follow documents, archival records, or direct observation. The results of these techniques are highly abstracted models where the accuracy does not exceed LOD 200. The masonry assemblages, especially curved elements such as arches, vaults, and domes, are generally modeled with standard BIM components or in-place models, and the brick textures are graphically input. Hence, future investigation is necessary to establish a methodology to generate automatically parametric masonry components. These components are developed algorithmically according to mathematical and geometric accuracy and the validity of the survey data. The main aim of this paper is to provide a comprehensive review of the state of the art of the existing researches and papers that have been conducted on the HBIM modeling of the masonry structural elements and the latest approaches to achieve parametric models that have both the visual fidelity and high geometric accuracy. The paper reviewed more than 800 articles, proceedings papers, and book chapters focused on "HBIM and Masonry" keywords from 2017 to 2021. The studies were downloaded from well-known, trusted bibliographic databases such as Web of Science, Scopus, Dimensions, and Lens. As a starting point, a scientometric analysis was carried out using VOSViewer software. This software extracts the main keywords in these studies to retrieve the relevant works. It also calculates the strength of the relationships between these keywords. Subsequently, an in-depth qualitative review followed the studies with the highest frequency of occurrence and the strongest links with the topic, according to the VOSViewer's results. The qualitative review focused on the latest approaches and the future suggestions proposed in these researches. The findings of this paper can serve as a valuable reference for researchers, and BIM specialists, to make more accurate and reliable HBIM models for historic masonry buildings.

Keywords: HBIM, masonry, structure, modeling, automatic, approach, parametric

Procedia PDF Downloads 149

2889 Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images

Authors: Mehrnoosh Omati, Mahmod Reza Sahebi

Abstract:

The information on land use/land cover changing plays an essential role for environmental assessment, planning and management in regional development. Remotely sensed imagery is widely used for providing information in many change detection applications. Polarimetric Synthetic aperture radar (PolSAR) image, with the discrimination capability between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, first, two PolSAR images are segmented using integration of marker-controlled watershed algorithm and coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/no changed image objects. Compared with pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification in object-based level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. Also, the proposed method can correctly distinguish homogeneous image parcels.

Keywords: coupled Markov random field (MRF), environment, object-based analysis, polarimetric SAR (PolSAR) images

Procedia PDF Downloads 203

2888 Development of a Direct Immunoassay for Human Ferritin Using Diffraction-Based Sensing Method

Authors: Joel Ballesteros, Harriet Jane Caleja, Florian Del Mundo, Cherrie Pascual

Abstract:

Diffraction-based sensing was utilized in the quantification of human ferritin in blood serum to provide an alternative to label-based immunoassays currently used in clinical diagnostics and researches. The diffraction intensity was measured by the diffractive optics technology or dotLab™ system. Two methods were evaluated in this study: direct immunoassay and direct sandwich immunoassay. In the direct immunoassay, human ferritin was captured by human ferritin antibodies immobilized on an avidin-coated sensor while the direct sandwich immunoassay had an additional step for the binding of a detector human ferritin antibody on the analyte complex. Both methods were repeatable with coefficient of variation values below 15%. The direct sandwich immunoassay had a linear response from 10 to 500 ng/mL which is wider than the 100-500 ng/mL of the direct immunoassay. The direct sandwich immunoassay also has a higher calibration sensitivity with value 0.002 Diffractive Intensity (ng mL-1)-1) compared to the 0.004 Diffractive Intensity (ng mL-1)-1 of the direct immunoassay. The limit of detection and limit of quantification values of the direct immunoassay were found to be 29 ng/mL and 98 ng/mL, respectively, while the direct sandwich immunoassay has a limit of detection (LOD) of 2.5 ng/mL and a limit of quantification (LOQ) of 8.2 ng/mL. In terms of accuracy, the direct immunoassay had a percent recovery of 88.8-93.0% in PBS while the direct sandwich immunoassay had 94.1 to 97.2%. Based on the results, the direct sandwich immunoassay is a better diffraction-based immunoassay in terms of accuracy, LOD, LOQ, linear range, and sensitivity. The direct sandwich immunoassay was utilized in the determination of human ferritin in blood serum and the results are validated by Chemiluminescent Magnetic Immunoassay (CMIA). The calculated Pearson correlation coefficient was 0.995 and the p-values of the paired-sample t-test were less than 0.5 which show that the results of the direct sandwich immunoassay was comparable to that of CMIA and could be utilized as an alternative analytical method.

Keywords: biosensor, diffraction, ferritin, immunoassay

Procedia PDF Downloads 337

2887 Fake Accounts Detection in Twitter Based on Minimum Weighted Feature Set

Authors: Ahmed ElAzab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny

Abstract:

Social networking sites such as Twitter and Facebook attracts over 500 million users across the world, for those users, their social life, even their practical life, has become interrelated. Their interaction with social networking has affected their life forever. Accordingly, social networking sites have become among the main channels that are responsible for vast dissemination of different kinds of information during real time events. This popularity in Social networking has led to different problems including the possibility of exposing incorrect information to their users through fake accounts which results to the spread of malicious content during life events. This situation can result to a huge damage in the real world to the society in general including citizens, business entities, and others. In this paper, we present a classification method for detecting fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of the fake accounts on Twitter, then the determined factors have been applied using different classification techniques, a comparison of the results for these techniques has been performed and the most accurate algorithm is selected according to the accuracy of the results. The study has been compared with different recent research in the same area, this comparison has proved the accuracy of the proposed study. We claim that this study can be continuously applied on Twitter social network to automatically detect the fake accounts, moreover, the study can be applied on different Social network sites such as Facebook with minor changes according to the nature of the social network which are discussed in this paper.

Keywords: fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques

Procedia PDF Downloads 387