Search results for: Markov classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2348

Search results for: Markov classification

1328 Changing Emphases in Mental Health Research Methodology: Opportunities for Occupational Therapy

Authors: Jeffrey Chase

Abstract:

Historically the profession of Occupational Therapy was closely tied to the treatment of those suffering from mental illness; more recently, and especially in the U.S., the percentage of OTs identifying as working in the mental health area has declined significantly despite the estimate that by 2020 behavioral health disorders will surpass physical illnesses as the major cause of disability worldwide. In the U.S. less than 10% of OTs identify themselves as working with the mentally ill and/or practicing in mental health settings. Such a decline has implications for both those suffering from mental illness and the profession of Occupational Therapy. One reason cited for the decline of OT in mental health has been the limited research in the discipline addressing mental health practice. Despite significant advances in technology and growth in the field of neuroscience, major institutions and funding sources such as the National Institute of Mental Health (NIMH) have noted that research into the etiology and treatment of mental illness have met with limited success over the past 25 years. One major reason posited by NIMH is that research has been limited by how we classify individuals, that being mostly on what is observable. A new classification system being developed by NIMH, the Research Domain Criteria (RDoc), has the goal to look beyond just descriptors of disorders for common neural, genetic, and physiological characteristics that cut across multiple supposedly separate disorders. The hope is that by classifying individuals along RDoC measures that both reliability and validity will improve resulting in greater advances in the field. As a result of this change NIH and NIMH will prioritize research funding to those projects using the RDoC model. Multiple disciplines across many different setting will be required for RDoC or similar classification systems to be developed. During this shift in research methodology OT has an opportunity to reassert itself into the research and treatment of mental illness, both in developing new ways to more validly classify individuals, and to document the legitimacy of previously ill-defined and validated disorders such as sensory integration.

Keywords: global mental health and neuroscience, research opportunities for ot, greater integration of ot in mental health research, research and funding opportunities, research domain criteria (rdoc)

Procedia PDF Downloads 275
1327 Influence of Travel Time Reliability on Elderly Drivers Crash Severity

Authors: Ren Moses, Emmanuel Kidando, Eren Ozguven, Yassir Abdelrazig

Abstract:

Although older drivers (defined as those of age 65 and above) are less involved with speeding, alcohol use as well as night driving, they are more vulnerable to severe crashes. The major contributing factors for severe crashes include frailty and medical complications. Several studies have evaluated the contributing factors on severity of crashes. However, few studies have established the impact of travel time reliability (TTR) on road safety. In particular, the impact of TTR on senior adults who face several challenges including hearing difficulties, decreasing of the processing skills and cognitive problems in driving is not well established. Therefore, this study focuses on determining possible impacts of TTR on the traffic safety with focus on elderly drivers. Historical travel speed data from freeway links in the study area were used to calculate travel time and the associated TTR metrics that is, planning time index, the buffer index, the standard deviation of the travel time and the probability of congestion. Four-year information on crashes occurring on these freeway links was acquired. The binary logit model estimated using the Markov Chain Monte Carlo (MCMC) sampling technique was used to evaluate variables that could be influencing elderly crash severity. Preliminary results of the analysis suggest that TTR is statistically significant in affecting the severity of a crash involving an elderly driver. The result suggests that one unit increase in the probability of congestion reduces the likelihood of the elderly severe crash by nearly 22%. These findings will enhance the understanding of TTR and its impact on the elderly crash severity.

Keywords: highway safety, travel time reliability, elderly drivers, traffic modeling

Procedia PDF Downloads 493
1326 Automated Prediction of HIV-associated Cervical Cancer Patients Using Data Mining Techniques for Survival Analysis

Authors: O. J. Akinsola, Yinan Zheng, Rose Anorlu, F. T. Ogunsola, Lifang Hou, Robert Leo-Murphy

Abstract:

Cervical Cancer (CC) is the 2nd most common cancer among women living in low and middle-income countries, with no associated symptoms during formative periods. With the advancement and innovative medical research, there are numerous preventive measures being utilized, but the incidence of cervical cancer cannot be truncated with the application of only screening tests. The mortality associated with this invasive cervical cancer can be nipped in the bud through the important role of early-stage detection. This study research selected an array of different top features selection techniques which was aimed at developing a model that could validly diagnose the risk factors of cervical cancer. A retrospective clinic-based cohort study was conducted on 178 HIV-associated cervical cancer patients in Lagos University teaching Hospital, Nigeria (U54 data repository) in April 2022. The outcome measure was the automated prediction of the HIV-associated cervical cancer cases, while the predictor variables include: demographic information, reproductive history, birth control, sexual history, cervical cancer screening history for invasive cervical cancer. The proposed technique was assessed with R and Python programming software to produce the model by utilizing the classification algorithms for the detection and diagnosis of cervical cancer disease. Four machine learning classification algorithms used are: the machine learning model was split into training and testing dataset into ratio 80:20. The numerical features were also standardized while hyperparameter tuning was carried out on the machine learning to train and test the data. Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbor (KNN). Some fitting features were selected for the detection and diagnosis of cervical cancer diseases from selected characteristics in the dataset using the contribution of various selection methods for the classification cervical cancer into healthy or diseased status. The mean age of patients was 49.7±12.1 years, mean age at pregnancy was 23.3±5.5 years, mean age at first sexual experience was 19.4±3.2 years, while the mean BMI was 27.1±5.6 kg/m2. A larger percentage of the patients are Married (62.9%), while most of them have at least two sexual partners (72.5%). Age of patients (OR=1.065, p<0.001**), marital status (OR=0.375, p=0.011**), number of pregnancy live-births (OR=1.317, p=0.007**), and use of birth control pills (OR=0.291, p=0.015**) were found to be significantly associated with HIV-associated cervical cancer. On top ten 10 features (variables) considered in the analysis, RF claims the overall model performance, which include: accuracy of (72.0%), the precision of (84.6%), a recall of (84.6%) and F1-score of (74.0%) while LR has: an accuracy of (74.0%), precision of (70.0%), recall of (70.0%) and F1-score of (70.0%). The RF model identified 10 features predictive of developing cervical cancer. The age of patients was considered as the most important risk factor, followed by the number of pregnancy livebirths, marital status, and use of birth control pills, The study shows that data mining techniques could be used to identify women living with HIV at high risk of developing cervical cancer in Nigeria and other sub-Saharan African countries.

Keywords: associated cervical cancer, data mining, random forest, logistic regression

Procedia PDF Downloads 83
1325 Outcome of Unilateral Retinoblastoma: A Ten Years Experience of Children's Cancer, Hospital Egypt

Authors: Ahmed Elhussein, Hossam El-Zomor, Adel Alieldin, Mahmoud A. Afifi, Abdullah Elhusseiny, Hala Taha, Amal Refaat, Soha Ahmed, Mohamed S. Zagloul

Abstract:

Background: A majority of children with retinoblastoma (60%) have a disease in one eye only (unilateral disease). This is a retrospective study to evaluate two different treatment modalities in those patients for saving their lives and vision. Methods: Four hundred and four patients were diagnosed with unilateral intraocular retinoblastoma at Children’s Cancer, Hospital Egypt (CCHE) through the period of July/2007 until December/2017. Management strategies included primary enucleation versus ocular salvage treatment. Results: Patients presented with mean age 24.5 months with range (1.2-154.3 months). According to the international retinoblastoma classification, Group D (n=172, 42%) was the most common, followed by group E (n=142, 35%), group C (n=63, 16%), and group B (n=27, 7%). All patients were alive at the end of the study except four patients who died, with 5-years overall survival 98.3% [CI, (96.5-100%)]. Patients presented with advanced disease and poor visual prognosis (n=241, 59.6%) underwent primary enucleation with 6 cycles adjuvant chemotherapy if they had high-risk features in the enucleated eye; only four patients out of 241 ended-up either with extraocular metastasis (n=3) or death (n=1). While systemic chemotherapy and focal therapy were the primary treatment for those who presented with favorable disease status and good visual prognosis (n=163, 40.4%); seventy-seven patients of them (47%) ended up with a pre-defined event (enucleation, EBRT, off protocol chemotherapy or 2ry malignancy). Ocular survival for patients received primary chemotherapy + focal therapy was [50.9% (CI, 43.5-59.6%)] at 3 years and [46.9% (CI,39.3-56%)] at 5 years. Comparison between upfront enucleation and primary chemotherapy for occurrence of extraocular metastasis revealed that there was no statistical difference between them except in group D (p value). While for occurrence of death, no statistical difference in all classification groups. Conclusion: In retinoblastoma, primary chemotherapy is a reasonable option and has a good probability for ocular salvage without increasing the risk of metastasis in comparison to upfront enucleation except in group D.

Keywords: CCHE, chemotherapy, enucleation, retinoblastoma

Procedia PDF Downloads 155
1324 Evaluation of the Efficacy of Basic Life Support Teaching in Second and Third Year Medical Students

Authors: Bianca W. O. Silva, Adriana C. M. Andrade, Gustavo C. M. Lucena, Virna M. S. Lima

Abstract:

Introduction: Basic life support (BLS) involves the immediate recognition of cardiopulmonary arrest. Each year, 359.400 and 275.000 individuals with cardiac arrest are attended in emergency departments in USA and Europe. Brazilian data shows that 200.000 cardiac arrests occur every year, and half of them out of the hospital. Medical schools around the world teach BLS in the first years of the course, but studies show that there is a decline of the knowledge as the years go by, affecting the chain of survival. The objective was to analyze the knowledge of medical students about BLS and the retention of this learning throughout the course. Methods: This study included 150 students who were at the second and third year of a medical school in Salvador, Bahia, Brazil. The instrument of data collection was a structured questionnaire composed of 20 questions based on the 2015 American Heart Association guideline. The Pearson Chi-square test was used in order to study the association between previous training, sex and semester with the degree of knowledge of the students. The Kruskal-Wallis test was used to evaluate the different yields obtained between the various semesters. The number of correct answers was described by average and quartiles. Results: Regarding the degree of knowledge, 19.6% of the female students reached the optimal classification, a better outcome than the achieved by the male participants. Of those with previous training, 33.33% were classified as good and optimal, none of the students reached the optimal classification and only 2.2% of them were classified as bad (those who did not have 52.6% of correct answers). The analysis of the degree of knowledge related to each semester revealed that the 5th semester had the highest outcome: 30.5%. However, the acquaintance presented by the semesters was generally unsatisfactory, since 50% of the students, or more, demonstrated knowledge levels classified as bad or regular. When confronting the different semesters and the achieved scores, the value of p was 0.831. Conclusion: It is important to focus on the training of medical professionals that are capable of facing emergency situations, improving the systematization of care, and thereby increasing the victims' possibility of survival.

Keywords: basic life support, cardiopulmonary ressucitacion, education, medical students

Procedia PDF Downloads 186
1323 Machine Learning Techniques to Predict Cyberbullying and Improve Social Work Interventions

Authors: Oscar E. Cariceo, Claudia V. Casal

Abstract:

Machine learning offers a set of techniques to promote social work interventions and can lead to support decisions of practitioners in order to predict new behaviors based on data produced by the organizations, services agencies, users, clients or individuals. Machine learning techniques include a set of generalizable algorithms that are data-driven, which means that rules and solutions are derived by examining data, based on the patterns that are present within any data set. In other words, the goal of machine learning is teaching computers through 'examples', by training data to test specifics hypothesis and predict what would be a certain outcome, based on a current scenario and improve that experience. Machine learning can be classified into two general categories depending on the nature of the problem that this technique needs to tackle. First, supervised learning involves a dataset that is already known in terms of their output. Supervising learning problems are categorized, into regression problems, which involve a prediction from quantitative variables, using a continuous function; and classification problems, which seek predict results from discrete qualitative variables. For social work research, machine learning generates predictions as a key element to improving social interventions on complex social issues by providing better inference from data and establishing more precise estimated effects, for example in services that seek to improve their outcomes. This paper exposes the results of a classification algorithm to predict cyberbullying among adolescents. Data were retrieved from the National Polyvictimization Survey conducted by the government of Chile in 2017. A logistic regression model was created to predict if an adolescent would experience cyberbullying based on the interaction and behavior of gender, age, grade, type of school, and self-esteem sentiments. The model can predict with an accuracy of 59.8% if an adolescent will suffer cyberbullying. These results can help to promote programs to avoid cyberbullying at schools and improve evidence based practice.

Keywords: cyberbullying, evidence based practice, machine learning, social work research

Procedia PDF Downloads 168
1322 Fight against Money Laundering with Optical Character Recognition

Authors: Saikiran Subbagari, Avinash Malladhi

Abstract:

Anti Money Laundering (AML) regulations are designed to prevent money laundering and terrorist financing activities worldwide. Financial institutions around the world are legally obligated to identify, assess and mitigate the risks associated with money laundering and report any suspicious transactions to governing authorities. With increasing volumes of data to analyze, financial institutions seek to automate their AML processes. In the rise of financial crimes, optical character recognition (OCR), in combination with machine learning (ML) algorithms, serves as a crucial tool for automating AML processes by extracting the data from documents and identifying suspicious transactions. In this paper, we examine the utilization of OCR for AML and delve into various OCR techniques employed in AML processes. These techniques encompass template-based, feature-based, neural network-based, natural language processing (NLP), hidden markov models (HMMs), conditional random fields (CRFs), binarizations, pattern matching and stroke width transform (SWT). We evaluate each technique, discussing their strengths and constraints. Also, we emphasize on how OCR can improve the accuracy of customer identity verification by comparing the extracted text with the office of foreign assets control (OFAC) watchlist. We will also discuss how OCR helps to overcome language barriers in AML compliance. We also address the implementation challenges that OCR-based AML systems may face and offer recommendations for financial institutions based on the data from previous research studies, which illustrate the effectiveness of OCR-based AML.

Keywords: anti-money laundering, compliance, financial crimes, fraud detection, machine learning, optical character recognition

Procedia PDF Downloads 144
1321 Calibration and Validation of ArcSWAT Model for Estimation of Surface Runoff and Sediment Yield from Dhangaon Watershed

Authors: M. P. Tripathi, Priti Tiwari

Abstract:

Soil and Water Assessment Tool (SWAT) is a distributed parameter continuous time model and was tested on daily and fortnightly basis for a small agricultural watershed (Dhangaon) of Chhattisgarh state in India. The SWAT model recently interfaced with ArcGIS and called as ArcSWAT. The watershed and sub-watershed boundaries, drainage networks, slope and texture maps were generated in the environment of ArcGIS of ArcSWAT. Supervised classification method was used for land use/cover classification from satellite imageries of the years 2009 and 2012. Manning's roughness coefficient 'n' for overland flow and channel flow and Fraction of Field Capacity (FFC) were calibrated for monsoon season of the years 2009 and 2010. The model was validated on a daily basis for the years 2011 and 2012 by using the observed daily rainfall and temperature data. Calibration and validation results revealed that the model was predicting the daily surface runoff and sediment yield satisfactorily. Sensitivity analysis showed that the annual sediment yield was inversely proportional to the overland and channel 'n' values whereas; annual runoff and sediment yields were directly proportional to the FFC. The model was also tested (calibrated and validated) for the fortnightly runoff and sediment yield for the year 2009-10 and 2011-12, respectively. Simulated values of fortnightly runoff and sediment yield for the calibration and validation years compared well with their observed counterparts. The calibration and validation results revealed that the ArcSWAT model could be used for identification of critical sub-watershed and for developing management scenarios for the Dhangaon watershed. Further, the model should be tested for simulating the surface runoff and sediment yield using generated rainfall and temperature before applying it for developing the management scenario for the critical or priority sub-watersheds.

Keywords: watershed, hydrologic and water quality, ArcSWAT model, remote sensing, GIS, runoff and sediment yield

Procedia PDF Downloads 379
1320 Image Processing-Based Maize Disease Detection Using Mobile Application

Authors: Nathenal Thomas

Abstract:

In the food chain and in many other agricultural products, corn, also known as maize, which goes by the scientific name Zea mays subsp, is a widely produced agricultural product. Corn has the highest adaptability. It comes in many different types, is employed in many different industrial processes, and is more adaptable to different agro-climatic situations. In Ethiopia, maize is among the most widely grown crop. Small-scale corn farming may be a household's only source of food in developing nations like Ethiopia. The aforementioned data demonstrates that the country's requirement for this crop is excessively high, and conversely, the crop's productivity is very low for a variety of reasons. The most damaging disease that greatly contributes to this imbalance between the crop's supply and demand is the corn disease. The failure to diagnose diseases in maize plant until they are too late is one of the most important factors influencing crop output in Ethiopia. This study will aid in the early detection of such diseases and support farmers during the cultivation process, directly affecting the amount of maize produced. The diseases in maize plants, such as northern leaf blight and cercospora leaf spot, have distinct symptoms that are visible. This study aims to detect the most frequent and degrading maize diseases using the most efficiently used subset of machine learning technology, deep learning so, called Image Processing. Deep learning uses networks that can be trained from unlabeled data without supervision (unsupervised). It is a feature that simulates the exercises the human brain goes through when digesting data. Its applications include speech recognition, language translation, object classification, and decision-making. Convolutional Neural Network (CNN) for Image Processing, also known as convent, is a deep learning class that is widely used for image classification, image detection, face recognition, and other problems. it will also use this algorithm as the state-of-the-art for my research to detect maize diseases by photographing maize leaves using a mobile phone.

Keywords: CNN, zea mays subsp, leaf blight, cercospora leaf spot

Procedia PDF Downloads 74
1319 Frequency Decomposition Approach for Sub-Band Common Spatial Pattern Methods for Motor Imagery Based Brain-Computer Interface

Authors: Vitor M. Vilas Boas, Cleison D. Silva, Gustavo S. Mafra, Alexandre Trofino Neto

Abstract:

Motor imagery (MI) based brain-computer interfaces (BCI) uses event-related (de)synchronization (ERS/ ERD), typically recorded using electroencephalography (EEG), to translate brain electrical activity into control commands. To mitigate undesirable artifacts and noise measurements on EEG signals, methods based on band-pass filters defined by a specific frequency band (i.e., 8 – 30Hz), such as the Infinity Impulse Response (IIR) filters, are typically used. Spatial techniques, such as Common Spatial Patterns (CSP), are also used to estimate the variations of the filtered signal and extract features that define the imagined motion. The CSP effectiveness depends on the subject's discriminative frequency, and approaches based on the decomposition of the band of interest into sub-bands with smaller frequency ranges (SBCSP) have been suggested to EEG signals classification. However, despite providing good results, the SBCSP approach generally increases the computational cost of the filtering step in IM-based BCI systems. This paper proposes the use of the Fast Fourier Transform (FFT) algorithm in the IM-based BCI filtering stage that implements SBCSP. The goal is to apply the FFT algorithm to reduce the computational cost of the processing step of these systems and to make them more efficient without compromising classification accuracy. The proposal is based on the representation of EEG signals in a matrix of coefficients resulting from the frequency decomposition performed by the FFT, which is then submitted to the SBCSP process. The structure of the SBCSP contemplates dividing the band of interest, initially defined between 0 and 40Hz, into a set of 33 sub-bands spanning specific frequency bands which are processed in parallel each by a CSP filter and an LDA classifier. A Bayesian meta-classifier is then used to represent the LDA outputs of each sub-band as scores and organize them into a single vector, and then used as a training vector of an SVM global classifier. Initially, the public EEG data set IIa of the BCI Competition IV is used to validate the approach. The first contribution of the proposed method is that, in addition to being more compact, because it has a 68% smaller dimension than the original signal, the resulting FFT matrix maintains the signal information relevant to class discrimination. In addition, the results showed an average reduction of 31.6% in the computational cost in relation to the application of filtering methods based on IIR filters, suggesting FFT efficiency when applied in the filtering step. Finally, the frequency decomposition approach improves the overall system classification rate significantly compared to the commonly used filtering, going from 73.7% using IIR to 84.2% using FFT. The accuracy improvement above 10% and the computational cost reduction denote the potential of FFT in EEG signal filtering applied to the context of IM-based BCI implementing SBCSP. Tests with other data sets are currently being performed to reinforce such conclusions.

Keywords: brain-computer interfaces, fast Fourier transform algorithm, motor imagery, sub-band common spatial patterns

Procedia PDF Downloads 128
1318 Understanding the Classification of Rain Microstructure and Estimation of Z-R Relationship using a Micro Rain Radar in Tropical Region

Authors: Tomiwa, Akinyemi Clement

Abstract:

Tropical regions experience diverse and complex precipitation patterns, posing significant challenges for accurate rainfall estimation and forecasting. This study addresses the problem of effectively classifying tropical rain types and refining the Z-R (Reflectivity-Rain Rate) relationship to enhance rainfall estimation accuracy. Through a combination of remote sensing, meteorological analysis, and machine learning, the research aims to develop an advanced classification framework capable of distinguishing between different types of tropical rain based on their unique characteristics. This involves utilizing high-resolution satellite imagery, radar data, and atmospheric parameters to categorize precipitation events into distinct classes, providing a comprehensive understanding of tropical rain systems. Additionally, the study seeks to improve the Z-R relationship, a crucial aspect of rainfall estimation. One year of rainfall data was analyzed using a Micro Rain Radar (MRR) located at The Federal University of Technology Akure, Nigeria, measuring rainfall parameters from ground level to a height of 4.8 km with a vertical resolution of 0.16 km. Rain rates were classified into low (stratiform) and high (convective) based on various microstructural attributes such as rain rates, liquid water content, Drop Size Distribution (DSD), average fall speed of the drops, and radar reflectivity. By integrating diverse datasets and employing advanced statistical techniques, the study aims to enhance the precision of Z-R models, offering a more reliable means of estimating rainfall rates from radar reflectivity data. This refined Z-R relationship holds significant potential for improving our understanding of tropical rain systems and enhancing forecasting accuracy in regions prone to heavy precipitation.

Keywords: remote sensing, precipitation, drop size distribution, micro rain radar

Procedia PDF Downloads 33
1317 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for students' academic performance and study strategy prediction, which could be generalized to all courses in higher education. Key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models are chosen based on prior studies, which revealed that the attributes are essential in students’ learning process. Previous studies revealed the individual effects of each of these attributes on students’ learning progress. However, few studies have investigated the combined effect of the attributes in predicting student study strategy and academic performance to reduce the dropout rate. To bridge this gap, we used Scikit-learn in python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks to perform our analysis. The models were trained, evaluated, and tested for accuracy using 924 university dentistry students' data collected by Chilean authors through quantitative research design. A comparative analysis of the models revealed that the tree-based models such as the random forest (with prediction accuracy of 94.9%) and decision tree show the best results compared to the linear, support vector, and k-nearest neighbours. The models built in this research can be used in predicting student performance and study strategy so that appropriate interventions could be implemented to improve student learning progress. Thus, incorporating strategies that could improve diverse student learning attributes in the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes could be modelled together and used to adapt/personalize the learning process.

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 128
1316 A Qualitative Research of Online Fraud Decision-Making Process

Authors: Semire Yekta

Abstract:

Many online retailers set up manual review teams to overcome the limitations of automated online fraud detection systems. This study critically examines the strategies they adapt in their decision-making process to set apart fraudulent individuals from non-fraudulent online shoppers. The study uses a mix method research approach. 32 in-depth interviews have been conducted alongside with participant observation and auto-ethnography. The study found out that all steps of the decision-making process are significantly affected by a level of subjectivity, personal understandings of online fraud, preferences and judgments and not necessarily by objectively identifiable facts. Rather clearly knowing who the fraudulent individuals are, the team members have to predict whether they think the customer might be a fraudster. Common strategies used are relying on the classification and fraud scorings in the automated fraud detection systems, weighing up arguments for and against the customer and making a decision, using cancellation to test customers’ reaction and making use of personal experiences and “the sixth sense”. The interaction in the team also plays a significant role given that some decisions turn into a group discussion. While customer data represent the basis for the decision-making, fraud management teams frequently make use of Google search and Google Maps to find out additional information about the customer and verify whether the customer is the person they claim to be. While this, on the one hand, raises ethical concerns, on the other hand, Google Street View on the address and area of the customer puts customers living in less privileged housing and areas at a higher risk of being classified as fraudsters. Phone validation is used as a final measurement to make decisions for or against the customer when previous strategies and Google Search do not suffice. However, phone validation is also characterized by individuals’ subjectivity, personal views and judgment on customer’s reaction on the phone that results in a final classification as genuine or fraudulent.

Keywords: online fraud, data mining, manual review, social construction

Procedia PDF Downloads 343
1315 Histopathological Features of Basal Cell Carcinoma: A Ten Year Retrospective Statistical Study in Egypt

Authors: Hala M. El-hanbuli, Mohammed F. Darweesh

Abstract:

The incidence rates of any tumor vary hugely with geographical location. Basal Cell Carcinoma (BCC) is one of the most common skin cancer that has many histopathologic subtypes. Objective: The aim was to study the histopathological features of BCC cases that were received in the Pathology Department, Kasr El-Aini hospital, Cairo University, Egypt during the period from Jan 2004 to Dec 2013 and to evaluate the clinical characters through the patient data available in the request sheets. Methods: Slides and data of BCC cases were collected from the archives of the pathology department, Kasr El-Aini hospital. Revision of all available slides and histological classification of BCC according to WHO (2006) was done. Results: A total number of 310 cases of BCC representing about 65% from the total number of malignant skin tumors examined during the 10-years duration in the department. The age ranged from 8 to 84 years, the mean age was (55.7 ± 15.5). Most of the patients (85%) were above the age of 40 years. There was a slight male predominance (55%). Ulcerated BCC was the most common gross picture (60%), followed by nodular lesion (30%) and finally the ulcerated nodule (10%). Most of the lesions situated in the high-risk sites (77%) where the nose was the most common site (35%) followed by the periocular area (22%), then periauricular (15%) and finally perioral (5%). No lesion was reported outside the head. The tumor size was less than 2 centimeters in 65% of cases, and from 2-5 centimeters in the lesions' greatest dimension in the rest of cases. Histopathological reclassification revealed that the nodular BCC was the most common (68%) followed by the pigmented nodular (18.75%). The histologic high-risk groups represented (7.5%) about half of them (3.75%) being basosquamous carcinoma. The total incidence for multiple BCC and 2nd primary was 12%. Recurrent BCC represented 8%. All of the recurrent lesions of BCC belonged to the histologic high-risk group. Conclusion: Basal Cell Carcinoma is the most common skin cancer in the 10-year survey. Histopathological diagnosis and classification of BCC cases are essential for the determination of the tumor type and its biological behavior.

Keywords: basal cell carcinoma, high risk, histopathological features, statistical analysis

Procedia PDF Downloads 149
1314 Modelling Urban Rigidity and Elasticity Growth Boundaries: A Spatial Constraints-Suitability Based Perspective

Authors: Pengcheng Xiang Jr., Xueqing Sun, Dong Ngoduy

Abstract:

In the context of rapid urbanization, urban sprawl has brought about extensive negative impacts on ecosystems and the environment, resulting in a gradual shift from "incremental growth" to ‘stock growth’ in cities. A detailed urban growth boundary is a prerequisite for urban renewal and management. This study takes Shenyang City, China, as the study area and evaluates the spatial distribution of urban spatial suitability in the study area from the perspective of spatial constraints-suitability using multi-source data and simulates the future rigid and elastic growth boundaries of the city in the study area using the CA-Markov model. The results show that (1) the suitable construction area and moderate construction area in the study area account for 8.76% and 19.01% of the total area, respectively, and the suitable construction area and moderate construction area show a trend of distribution from the urban centre to the periphery, mainly in Shenhe District, the southern part of Heping District, the western part of Dongling District, and the central part of Dadong District; (2) the area of expansion of construction land in the study area in the period of 2023-2030 is 153274.6977hm2, accounting for 44.39% of the total area of the study area; (3) the rigid boundary of the study area occupies an area of 153274.6977 hm2, accounting for 44.39% of the total area of the study area, and the elastic boundary of the study area contains an area of 75362.61 hm2, accounting for 21.69% of the total area of the study area. The study constructed a method for urban growth boundary delineation, which helps to apply remote sensing to guide future urban spatial growth management and urban renewal.

Keywords: urban growth boundary, spatial constraints, spatial suitability, urban sprawl

Procedia PDF Downloads 32
1313 A Methodology for Developing New Technology Ideas to Avoid Patent Infringement: F-Term Based Patent Analysis

Authors: Kisik Song, Sungjoo Lee

Abstract:

With the growing importance of intangible assets recently, the impact of patent infringement on the business of a company has become more evident. Accordingly, it is essential for firms to estimate the risk of patent infringement risk before developing a technology and create new technology ideas to avoid the risk. Recognizing the needs, several attempts have been made to help develop new technology opportunities and most of them have focused on identifying emerging vacant technologies from patent analysis. In these studies, the IPC (International Patent Classification) system or keywords from text-mining application to patent documents was generally used to define vacant technologies. Unlike those studies, this study adopted F-term, which classifies patent documents according to the technical features of the inventions described in them. Since the technical features are analyzed by various perspectives by F-term, F-term provides more detailed information about technologies compared to IPC while more systematic information compared to keywords. Therefore, if well utilized, it can be a useful guideline to create a new technology idea. Recognizing the potential of F-term, this paper aims to suggest a novel approach to developing new technology ideas to avoid patent infringement based on F-term. For this purpose, we firstly collected data about F-term and then applied text-mining to the descriptions about classification criteria and attributes. From the text-mining results, we could identify other technologies with similar technical features of the existing one, the patented technology. Finally, we compare the technologies and extract the technical features that are commonly used in other technologies but have not been used in the existing one. These features are presented in terms of “purpose”, “function”, “structure”, “material”, “method”, “processing and operation procedure” and “control means” and so are useful for creating new technology ideas that help avoid infringing patent rights of other companies. Theoretically, this is one of the earliest attempts to adopt F-term to patent analysis; the proposed methodology can show how to best take advantage of F-term with the wealth of technical information. In practice, the proposed methodology can be valuable in the ideation process for successful product and service innovation without infringing the patents of other companies.

Keywords: patent infringement, new technology ideas, patent analysis, F-term

Procedia PDF Downloads 269
1312 Categorical Metadata Encoding Schemes for Arteriovenous Fistula Blood Flow Sound Classification: Scaling Numerical Representations Leads to Improved Performance

Authors: George Zhou, Yunchan Chen, Candace Chien

Abstract:

Kidney replacement therapy is the current standard of care for end-stage renal diseases. In-center or home hemodialysis remains an integral component of the therapeutic regimen. Arteriovenous fistulas (AVF) make up the vascular circuit through which blood is filtered and returned. Naturally, AVF patency determines whether adequate clearance and filtration can be achieved and directly influences clinical outcomes. Our aim was to build a deep learning model for automated AVF stenosis screening based on the sound of blood flow through the AVF. A total of 311 patients with AVF were enrolled in this study. Blood flow sounds were collected using a digital stethoscope. For each patient, blood flow sounds were collected at 6 different locations along the patient’s AVF. The 6 locations are artery, anastomosis, distal vein, middle vein, proximal vein, and venous arch. A total of 1866 sounds were collected. The blood flow sounds are labeled as “patent” (normal) or “stenotic” (abnormal). The labels are validated from concurrent ultrasound. Our dataset included 1527 “patent” and 339 “stenotic” sounds. We show that blood flow sounds vary significantly along the AVF. For example, the blood flow sound is loudest at the anastomosis site and softest at the cephalic arch. Contextualizing the sound with location metadata significantly improves classification performance. How to encode and incorporate categorical metadata is an active area of research1. Herein, we study ordinal (i.e., integer) encoding schemes. The numerical representation is concatenated to the flattened feature vector. We train a vision transformer (ViT) on spectrogram image representations of the sound and demonstrate that using scalar multiples of our integer encodings improves classification performance. Models are evaluated using a 10-fold cross-validation procedure. The baseline performance of our ViT without any location metadata achieves an AuROC and AuPRC of 0.68 ± 0.05 and 0.28 ± 0.09, respectively. Using the following encodings of Artery:0; Arch: 1; Proximal: 2; Middle: 3; Distal 4: Anastomosis: 5, the ViT achieves an AuROC and AuPRC of 0.69 ± 0.06 and 0.30 ± 0.10, respectively. Using the following encodings of Artery:0; Arch: 10; Proximal: 20; Middle: 30; Distal 40: Anastomosis: 50, the ViT achieves an AuROC and AuPRC of 0.74 ± 0.06 and 0.38 ± 0.10, respectively. Using the following encodings of Artery:0; Arch: 100; Proximal: 200; Middle: 300; Distal 400: Anastomosis: 500, the ViT achieves an AuROC and AuPRC of 0.78 ± 0.06 and 0.43 ± 0.11. respectively. Interestingly, we see that using increasing scalar multiples of our integer encoding scheme (i.e., encoding “venous arch” as 1,10,100) results in progressively improved performance. In theory, the integer values do not matter since we are optimizing the same loss function; the model can learn to increase or decrease the weights associated with location encodings and converge on the same solution. However, in the setting of limited data and computation resources, increasing the importance at initialization either leads to faster convergence or helps the model escape a local minimum.

Keywords: arteriovenous fistula, blood flow sounds, metadata encoding, deep learning

Procedia PDF Downloads 87
1311 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented but by suitable traffic engineering and management the accident rate can be reduced to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA Data mining tool. These techniques use on the NH (National highway) dataset. With the C4.5 and ID3 technique it gives best results and high accuracy with less computation time and error rate.

Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool

Procedia PDF Downloads 338
1310 Fault-Tolerant Control Study and Classification: Case Study of a Hydraulic-Press Model Simulated in Real-Time

Authors: Jorge Rodriguez-Guerra, Carlos Calleja, Aron Pujana, Iker Elorza, Ana Maria Macarulla

Abstract:

Society demands more reliable manufacturing processes capable of producing high quality products in shorter production cycles. New control algorithms have been studied to satisfy this paradigm, in which Fault-Tolerant Control (FTC) plays a significant role. It is suitable to detect, isolate and adapt a system when a harmful or faulty situation appears. In this paper, a general overview about FTC characteristics are exposed; highlighting the properties a system must ensure to be considered faultless. In addition, a research to identify which are the main FTC techniques and a classification based on their characteristics is presented in two main groups: Active Fault-Tolerant Controllers (AFTCs) and Passive Fault-Tolerant Controllers (PFTCs). AFTC encompasses the techniques capable of re-configuring the process control algorithm after the fault has been detected, while PFTC comprehends the algorithms robust enough to bypass the fault without further modifications. The mentioned re-configuration requires two stages, one focused on detection, isolation and identification of the fault source and the other one in charge of re-designing the control algorithm by two approaches: fault accommodation and control re-design. From the algorithms studied, one has been selected and applied to a case study based on an industrial hydraulic-press. The developed model has been embedded under a real-time validation platform, which allows testing the FTC algorithms and analyse how the system will respond when a fault arises in similar conditions as a machine will have on factory. One AFTC approach has been picked up as the methodology the system will follow in the fault recovery process. In a first instance, the fault will be detected, isolated and identified by means of a neural network. In a second instance, the control algorithm will be re-configured to overcome the fault and continue working without human interaction.

Keywords: fault-tolerant control, electro-hydraulic actuator, fault detection and isolation, control re-design, real-time

Procedia PDF Downloads 177
1309 Mondoc: Informal Lightweight Ontology for Faceted Semantic Classification of Hypernymy

Authors: M. Regina Carreira-Lopez

Abstract:

Lightweight ontologies seek to concrete union relationships between a parent node, and a secondary node, also called "child node". This logic relation (L) can be formally defined as a triple ontological relation (LO) equivalent to LO in ⟨LN, LE, LC⟩, and where LN represents a finite set of nodes (N); LE is a set of entities (E), each of which represents a relationship between nodes to form a rooted tree of ⟨LN, LE⟩; and LC is a finite set of concepts (C), encoded in a formal language (FL). Mondoc enables more refined searches on semantic and classified facets for retrieving specialized knowledge about Atlantic migrations, from the Declaration of Independence of the United States of America (1776) and to the end of the Spanish Civil War (1939). The model looks forward to increasing documentary relevance by applying an inverse frequency of co-ocurrent hypernymy phenomena for a concrete dataset of textual corpora, with RMySQL package. Mondoc profiles archival utilities implementing SQL programming code, and allows data export to XML schemas, for achieving semantic and faceted analysis of speech by analyzing keywords in context (KWIC). The methodology applies random and unrestricted sampling techniques with RMySQL to verify the resonance phenomena of inverse documentary relevance between the number of co-occurrences of the same term (t) in more than two documents of a set of texts (D). Secondly, the research also evidences co-associations between (t) and their corresponding synonyms and antonyms (synsets) are also inverse. The results from grouping facets or polysemic words with synsets in more than two textual corpora within their syntagmatic context (nouns, verbs, adjectives, etc.) state how to proceed with semantic indexing of hypernymy phenomena for subject-heading lists and for authority lists for documentary and archival purposes. Mondoc contributes to the development of web directories and seems to achieve a proper and more selective search of e-documents (classification ontology). It can also foster on-line catalogs production for semantic authorities, or concepts, through XML schemas, because its applications could be used for implementing data models, by a prior adaptation of the based-ontology to structured meta-languages, such as OWL, RDF (descriptive ontology). Mondoc serves to the classification of concepts and applies a semantic indexing approach of facets. It enables information retrieval, as well as quantitative and qualitative data interpretation. The model reproduces a triple tuple ⟨LN, LE, LT, LCF L, BKF⟩ where LN is a set of entities that connect with other nodes to concrete a rooted tree in ⟨LN, LE⟩. LT specifies a set of terms, and LCF acts as a finite set of concepts, encoded in a formal language, L. Mondoc only resolves partial problems of linguistic ambiguity (in case of synonymy and antonymy), but neither the pragmatic dimension of natural language nor the cognitive perspective is addressed. To achieve this goal, forthcoming programming developments should target at oriented meta-languages with structured documents in XML.

Keywords: hypernymy, information retrieval, lightweight ontology, resonance

Procedia PDF Downloads 125
1308 Deep Reinforcement Learning-Based Computation Offloading for 5G Vehicle-Aware Multi-Access Edge Computing Network

Authors: Ziying Wu, Danfeng Yan

Abstract:

Multi-Access Edge Computing (MEC) is one of the key technologies of the future 5G network. By deploying edge computing centers at the edge of wireless access network, the computation tasks can be offloaded to edge servers rather than the remote cloud server to meet the requirements of 5G low-latency and high-reliability application scenarios. Meanwhile, with the development of IOV (Internet of Vehicles) technology, various delay-sensitive and compute-intensive in-vehicle applications continue to appear. Compared with traditional internet business, these computation tasks have higher processing priority and lower delay requirements. In this paper, we design a 5G-based Vehicle-Aware Multi-Access Edge Computing Network (VAMECN) and propose a joint optimization problem of minimizing total system cost. In view of the problem, a deep reinforcement learning-based joint computation offloading and task migration optimization (JCOTM) algorithm is proposed, considering the influences of multiple factors such as concurrent multiple computation tasks, system computing resources distribution, and network communication bandwidth. And, the mixed integer nonlinear programming problem is described as a Markov Decision Process. Experiments show that our proposed algorithm can effectively reduce task processing delay and equipment energy consumption, optimize computing offloading and resource allocation schemes, and improve system resource utilization, compared with other computing offloading policies.

Keywords: multi-access edge computing, computation offloading, 5th generation, vehicle-aware, deep reinforcement learning, deep q-network

Procedia PDF Downloads 117
1307 The Effects of Lithofacies on Oil Enrichment in Lucaogou Formation Fine-Grained Sedimentary Rocks in Santanghu Basin, China

Authors: Guoheng Liu, Zhilong Huang

Abstract:

For more than the past ten years, oil and gas production from marine shale such as the Barnett shale. In addition, in recent years, major breakthroughs have also been made in lacustrine shale gas exploration, such as the Yanchang Formation of the Ordos Basin in China. Lucaogou Formation shale, which is also lacustrine shale, has also yielded a high production in recent years, for wells such as M1, M6, and ML2, yielding a daily oil production of 5.6 tons, 37.4 tons and 13.56 tons, respectively. Lithologic identification and classification of reservoirs are the base and keys to oil and gas exploration. Lithology and lithofacies obviously control the distribution of oil and gas in lithological reservoirs, so it is of great significance to describe characteristics of lithology and lithofacies of reservoirs finely. Lithofacies is an intrinsic property of rock formed under certain conditions of sedimentation. Fine-grained sedimentary rocks such as shale formed under different sedimentary conditions display great particularity and distinctiveness. Hence, to our best knowledge, no constant and unified criteria and methods exist for fine-grained sedimentary rocks regarding lithofacies definition and classification. Consequently, multi-parameters and multi-disciplines are necessary. A series of qualitative descriptions and quantitative analysis were used to figure out the lithofacies characteristics and its effect on oil accumulation of Lucaogou formation fine-grained sedimentary rocks in Santanghu basin. The qualitative description includes core description, petrographic thin section observation, fluorescent thin-section observation, cathode luminescence observation and scanning electron microscope observation. The quantitative analyses include X-ray diffraction, total organic content analysis, ROCK-EVAL.II Methodology, soxhlet extraction, porosity and permeability analysis and oil saturation analysis. Three types of lithofacies were mainly well-developed in this study area, which is organic-rich massive shale lithofacies, organic-rich laminated and cloddy hybrid sedimentary lithofacies and organic-lean massive carbonate lithofacies. Organic-rich massive shale lithofacies mainly include massive shale and tuffaceous shale, of which quartz and clay minerals are the major components. Organic-rich laminated and cloddy hybrid sedimentary lithofacies contain lamina and cloddy structure. Rocks from this lithofacies chiefly consist of dolomite and quartz. Organic-lean massive carbonate lithofacies mainly contains massive bedding fine-grained carbonate rocks, of which fine-grained dolomite accounts for the main part. Organic-rich massive shale lithofacies contain the highest content of free hydrocarbon and solid organic matter. Moreover, more pores were developed in organic-rich massive shale lithofacies. Organic-lean massive carbonate lithofacies contain the lowest content solid organic matter and develop the least amount of pores. Organic-rich laminated and cloddy hybrid sedimentary lithofacies develop the largest number of cracks and fractures. To sum up, organic-rich massive shale lithofacies is the most favorable type of lithofacies. Organic-lean massive carbonate lithofacies is impossible for large scale oil accumulation.

Keywords: lithofacies classification, tuffaceous shale, oil enrichment, Lucaogou formation

Procedia PDF Downloads 220
1306 Fast Bayesian Inference of Multivariate Block-Nearest Neighbor Gaussian Process (NNGP) Models for Large Data

Authors: Carlos Gonzales, Zaida Quiroz, Marcos Prates

Abstract:

Several spatial variables collected at the same location that share a common spatial distribution can be modeled simultaneously through a multivariate geostatistical model that takes into account the correlation between these variables and the spatial autocorrelation. The main goal of this model is to perform spatial prediction of these variables in the region of study. Here we focus on a geostatistical multivariate formulation that relies on sharing common spatial random effect terms. In particular, the first response variable can be modeled by a mean that incorporates a shared random spatial effect, while the other response variables depend on this shared spatial term, in addition to specific random spatial effects. Each spatial random effect is defined through a Gaussian process with a valid covariance function, but in order to improve the computational efficiency when the data are large, each Gaussian process is approximated to a Gaussian random Markov field (GRMF), specifically to the block nearest neighbor Gaussian process (Block-NNGP). This approach involves dividing the spatial domain into several dependent blocks under certain constraints, where the cross blocks allow capturing the spatial dependence on a large scale, while each individual block captures the spatial dependence on a smaller scale. The multivariate geostatistical model belongs to the class of Latent Gaussian Models; thus, to achieve fast Bayesian inference, it is used the integrated nested Laplace approximation (INLA) method. The good performance of the proposed model is shown through simulations and applications for massive data.

Keywords: Block-NNGP, geostatistics, gaussian process, GRMF, INLA, multivariate models.

Procedia PDF Downloads 97
1305 Reconnaissance Investigation of Thermal Springs in the Middle Benue Trough, Nigeria by Remote Sensing

Authors: N. Tochukwu, M. Mukhopadhyay, A. Mohamed

Abstract:

It is no new that Nigeria faces a continual power shortage problem due to its vast population power demand and heavy reliance on nonrenewable forms of energy such as thermal power or fossil fuel. Many researchers have recommended using geothermal energy as an alternative; however, Past studies focus on the geophysical & geochemical investigation of this energy in the sedimentary and basement complex; only a few studies incorporated the remote sensing methods. Therefore, in this study, the preliminary examination of geothermal resources in the Middle Benue was carried out using satellite imagery in ArcMap. Landsat 8 scene (TIR, NIR, Red spectral bands) was used to estimate the Land Surface Temperature (LST). The Maximum Likelihood Classification (MLC) technique was used to classify sites with very low, low, moderate, and high LST. The intermediate and high classification happens to be possible geothermal zones, and they occupy 49% of the study area (38077km2). Riverline were superimposed on the LST layer, and the identification tool was used to locate high temperate sites. Streams that overlap on the selected sites were regarded as geothermal springs as. Surprisingly, the LST results show lower temperatures (<36°C) at the famous thermal springs (Awe & Wukari) than some unknown rivers/streams found in Kwande (38°C), Ussa, (38°C), Gwer East (37°C), Yola Cross & Ogoja (36°C). Studies have revealed that temperature increases with depth. However, this result shows excellent geothermal resources potential as it is expected to exceed the minimum geothermal gradient of 25.47 with an increase in depth. Therefore, further investigation is required to estimate the depth of the causative body, geothermal gradients, and the sustainability of the reservoirs by geophysical and field exploration. This method has proven to be cost-effective in locating geothermal resources in the study area. Consequently, the same procedure is recommended to be applied in other regions of the Precambrian basement complex and the sedimentary basins in Nigeria to save a preliminary field survey cost.

Keywords: ArcMap, geothermal resources, Landsat 8, LST, thermal springs, MLC

Procedia PDF Downloads 190
1304 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 296
1303 Recurrent Neural Networks for Classifying Outliers in Electronic Health Record Clinical Text

Authors: Duncan Wallace, M-Tahar Kechadi

Abstract:

In recent years, Machine Learning (ML) approaches have been successfully applied to an analysis of patient symptom data in the context of disease diagnosis, at least where such data is well codified. However, much of the data present in Electronic Health Records (EHR) are unlikely to prove suitable for classic ML approaches. Furthermore, as scores of data are widely spread across both hospitals and individuals, a decentralized, computationally scalable methodology is a priority. The focus of this paper is to develop a method to predict outliers in an out-of-hours healthcare provision center (OOHC). In particular, our research is based upon the early identification of patients who have underlying conditions which will cause them to repeatedly require medical attention. OOHC act as an ad-hoc delivery of triage and treatment, where interactions occur without recourse to a full medical history of the patient in question. Medical histories, relating to patients contacting an OOHC, may reside in several distinct EHR systems in multiple hospitals or surgeries, which are unavailable to the OOHC in question. As such, although a local solution is optimal for this problem, it follows that the data under investigation is incomplete, heterogeneous, and comprised mostly of noisy textual notes compiled during routine OOHC activities. Through the use of Deep Learning methodologies, the aim of this paper is to provide the means to identify patient cases, upon initial contact, which are likely to relate to such outliers. To this end, we compare the performance of Long Short-Term Memory, Gated Recurrent Units, and combinations of both with Convolutional Neural Networks. A further aim of this paper is to elucidate the discovery of such outliers by examining the exact terms which provide a strong indication of positive and negative case entries. While free-text is the principal data extracted from EHRs for classification, EHRs also contain normalized features. Although the specific demographical features treated within our corpus are relatively limited in scope, we examine whether it is beneficial to include such features among the inputs to our neural network, or whether these features are more successfully exploited in conjunction with a different form of a classifier. In this section, we compare the performance of randomly generated regression trees and support vector machines and determine the extent to which our classification program can be improved upon by using either of these machine learning approaches in conjunction with the output of our Recurrent Neural Network application. The output of our neural network is also used to help determine the most significant lexemes present within the corpus for determining high-risk patients. By combining the confidence of our classification program in relation to lexemes within true positive and true negative cases, with an inverse document frequency of the lexemes related to these cases, we can determine what features act as the primary indicators of frequent-attender and non-frequent-attender cases, providing a human interpretable appreciation of how our program classifies cases.

Keywords: artificial neural networks, data-mining, machine learning, medical informatics

Procedia PDF Downloads 131
1302 A Kruskal Based Heuxistic for the Application of Spanning Tree

Authors: Anjan Naidu

Abstract:

In this paper we first discuss the minimum spanning tree, then we use the Kruskal algorithm to obtain minimum spanning tree. Based on Kruskal algorithm we propose Kruskal algorithm to apply an application to find minimum cost applying the concept of spanning tree.

Keywords: Minimum Spanning tree, algorithm, Heuxistic, application, classification of Sub 97K90

Procedia PDF Downloads 444
1301 Towards Learning Query Expansion

Authors: Ahlem Bouziri, Chiraz Latiri, Eric Gaussier

Abstract:

The steady growth in the size of textual document collections is a key progress-driver for modern information retrieval techniques whose effectiveness and efficiency are constantly challenged. Given a user query, the number of retrieved documents can be overwhelmingly large, hampering their efficient exploitation by the user. In addition, retaining only relevant documents in a query answer is of paramount importance for an effective meeting of the user needs. In this situation, the query expansion technique offers an interesting solution for obtaining a complete answer while preserving the quality of retained documents. This mainly relies on an accurate choice of the added terms to an initial query. Interestingly enough, query expansion takes advantage of large text volumes by extracting statistical information about index terms co-occurrences and using it to make user queries better fit the real information needs. In this respect, a promising track consists in the application of data mining methods to extract dependencies between terms, namely a generic basis of association rules between terms. The key feature of our approach is a better trade off between the size of the mining result and the conveyed knowledge. Thus, face to the huge number of derived association rules and in order to select the optimal combination of query terms from the generic basis, we propose to model the problem as a classification problem and solve it using a supervised learning algorithm such as SVM or k-means. For this purpose, we first generate a training set using a genetic algorithm based approach that explores the association rules space in order to find an optimal set of expansion terms, improving the MAP of the search results. The experiments were performed on SDA 95 collection, a data collection for information retrieval. It was found that the results were better in both terms of MAP and NDCG. The main observation is that the hybridization of text mining techniques and query expansion in an intelligent way allows us to incorporate the good features of all of them. As this is a preliminary attempt in this direction, there is a large scope for enhancing the proposed method.

Keywords: supervised leaning, classification, query expansion, association rules

Procedia PDF Downloads 324
1300 Human Gait Recognition Using Moment with Fuzzy

Authors: Jyoti Bharti, Navneet Manjhi, M. K.Gupta, Bimi Jain

Abstract:

A reliable gait features are required to extract the gait sequences from an images. In this paper suggested a simple method for gait identification which is based on moments. Moment values are extracted on different number of frames of gray scale and silhouette images of CASIA database. These moment values are considered as feature values. Fuzzy logic and nearest neighbour classifier are used for classification. Both achieved higher recognition.

Keywords: gait, fuzzy logic, nearest neighbour, recognition rate, moments

Procedia PDF Downloads 757
1299 Adolescent-Parent Relationship as the Most Important Factor in Preventing Mood Disorders in Adolescents: An Application of Artificial Intelligence to Social Studies

Authors: Elżbieta Turska

Abstract:

Introduction: One of the most difficult times in a person’s life is adolescence. The experiences in this period may shape the future life of this person to a large extent. This is the reason why many young people experience sadness, dejection, hopelessness, sense of worthlessness, as well as losing interest in various activities and social relationships, all of which are often classified as mood disorders. As many as 15-40% adolescents experience depressed moods and for most of them they resolve and are not carried into adulthood. However, (5-6%) of those affected by mood disorders develop the depressive syndrome and as many as (1-3%) develop full-blown clinical depression. Materials: A large questionnaire was given to 2508 students, aged 13–16 years old, and one of its parts was the Burns checklist, i.e. the standard test for identifying depressed mood. The questionnaire asked about many aspects of the student’s life, it included a total of 53 questions, most of which had subquestions. It is important to note that the data suffered from many problems, the most important of which were missing data and collinearity. Aim: In order to identify the correlates of mood disorders we built predictive models which were then trained and validated. Our aim was not to be able to predict which students suffer from mood disorders but rather to explore the factors influencing mood disorders. Methods: The problems with data described above practically excluded using all classical statistical methods. For this reason, we attempted to use the following Artificial Intelligence (AI) methods: classification trees with surrogate variables, random forests and xgboost. All analyses were carried out with the use of the mlr package for the R programming language. Resuts: The predictive model built by classification trees algorithm outperformed the other algorithms by a large margin. As a result, we were able to rank the variables (questions and subquestions from the questionnaire) from the most to least influential as far as protection against mood disorder is concerned. Thirteen out of twenty most important variables reflect the relationships with parents. This seems to be a really significant result both from the cognitive point of view and also from the practical point of view, i.e. as far as interventions to correct mood disorders are concerned.

Keywords: mood disorders, adolescents, family, artificial intelligence

Procedia PDF Downloads 101