Search results for: semantic clinical classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5932

Search results for: semantic clinical classification

5752 A Custom Convolutional Neural Network with Hue, Saturation, Value Color for Malaria Classification

Authors: Ghazala Hcini, Imen Jdey, Hela Ltifi

Abstract:

Malaria disease should be considered and handled as a potential restorative catastrophe. One of the most challenging tasks in the field of microscopy image processing is due to differences in test design and vulnerability of cell classifications. In this article, we focused on applying deep learning to classify patients by identifying images of infected and uninfected cells. We performed multiple forms, counting a classification approach using the Hue, Saturation, Value (HSV) color space. HSV is used since of its superior ability to speak to image brightness; at long last, for classification, a convolutional neural network (CNN) architecture is created. Clusters of focus were used to deliver the classification. The highlights got to be forbidden, and a few more clamor sorts are included in the information. The suggested method has a precision of 99.79%, a recall value of 99.55%, and provides 99.96% accuracy.

Keywords: deep learning, convolutional neural network, image classification, color transformation, HSV color, malaria diagnosis, malaria cells images

Procedia PDF Downloads 75
5751 Reinforcement Learning for Classification of Low-Resolution Satellite Images

Authors: Khadija Bouzaachane, El Mahdi El Guarmah

Abstract:

The classification of low-resolution satellite images has been a worthwhile and fertile field that attracts plenty of researchers due to its importance in monitoring geographical areas. It could be used for several purposes such as disaster management, military surveillance, agricultural monitoring. The main objective of this work is to classify efficiently and accurately low-resolution satellite images by using novel technics of deep learning and reinforcement learning. The images include roads, residential areas, industrial areas, rivers, sea lakes, and vegetation. To achieve that goal, we carried out experiments on the sentinel-2 images considering both high accuracy and efficiency classification. Our proposed model achieved a 91% accuracy on the testing dataset besides a good classification for land cover. Focus on the parameter precision; we have obtained 93% for the river, 92% for residential, 97% for residential, 96% for the forest, 87% for annual crop, 84% for herbaceous vegetation, 85% for pasture, 78% highway and 100% for Sea Lake.

Keywords: classification, deep learning, reinforcement learning, satellite imagery

Procedia PDF Downloads 190
5750 Using Self Organizing Feature Maps for Classification in RGB Images

Authors: Hassan Masoumi, Ahad Salimi, Nazanin Barhemmat, Babak Gholami

Abstract:

Artificial neural networks have gained a lot of interest as empirical models for their powerful representational capacity, multi input and output mapping characteristics. In fact, most feed-forward networks with nonlinear nodal functions have been proved to be universal approximates. In this paper, we propose a new supervised method for color image classification based on self organizing feature maps (SOFM). This algorithm is based on competitive learning. The method partitions the input space using self-organizing feature maps to introduce the concept of local neighborhoods. Our image classification system entered into RGB image. Experiments with simulated data showed that separability of classes increased when increasing training time. In additional, the result shows proposed algorithms are effective for color image classification.

Keywords: classification, SOFM algorithm, neural network, neighborhood, RGB image

Procedia PDF Downloads 459
5749 Attention Multiple Instance Learning for Cancer Tissue Classification in Digital Histopathology Images

Authors: Afaf Alharbi, Qianni Zhang

Abstract:

The identification of malignant tissue in histopathological slides holds significant importance in both clinical settings and pathology research. This paper introduces a methodology aimed at automatically categorizing cancerous tissue through the utilization of a multiple-instance learning framework. This framework is specifically developed to acquire knowledge of the Bernoulli distribution of the bag label probability by employing neural networks. Furthermore, we put forward a neural network based permutation-invariant aggregation operator, equivalent to attention mechanisms, which is applied to the multi-instance learning network. Through empirical evaluation of an openly available colon cancer histopathology dataset, we provide evidence that our approach surpasses various conventional deep learning methods.

Keywords: attention multiple instance learning, MIL and transfer learning, histopathological slides, cancer tissue classification

Procedia PDF Downloads 88
5748 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by the laboratory tests is expensive and, sometimes it is intolerable for patients. Filling out the questionnaire and then using classification method can be the first step in decision-making process, so only in the cases with a high probability of infertility we can use the laboratory tests. In this paper, we evaluated the performance of four classification methods including naive Bayesian, neural network, logistic regression and fuzzy c-means clustering as a classification, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, the ROC curves are most suitable method for the comparison. In this paper, we also have selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each methods; generally, most of the methods had better performance after applying the filter. We have showed that using fuzzy c-means clustering as a classification has a good performance according to the ROC curves and its performance is comparable to other classification methods like logistic regression.

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 319
5747 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo

Abstract:

This paper presents an automatic normal and abnormal heart sound classification model developed based on deep learning algorithm. MITHSDB heart sounds datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heart beat, and each sub-segment is converted to form a square intensity matrix, and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The result proves that the prediction model is able to provide reasonable and comparable classification accuracy despite simple implementation. This approach can be used for real-time classification of heart sounds in Internet of Medical Things (IoMT), e.g. remote monitoring applications of PCG signal.

Keywords: convolutional neural network, discrete wavelet transform, deep learning, heart sound classification

Procedia PDF Downloads 332
5746 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 451
5745 Statistical Wavelet Features, PCA, and SVM-Based Approach for EEG Signals Classification

Authors: R. K. Chaurasiya, N. D. Londhe, S. Ghosh

Abstract:

The study of the electrical signals produced by neural activities of human brain is called Electroencephalography. In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach is used to classify the EEG signal into two classes: epileptic seizure or not. In the proposed approach, we start with extracting the features by applying Discrete Wavelet Transform (DWT) in order to decompose the EEG signals into sub-bands. These features, extracted from details and approximation coefficients of DWT sub-bands, are used as input to Principal Component Analysis (PCA). The classification is based on reducing the feature dimension using PCA and deriving the support-vectors using Support Vector Machine (SVM). The experimental are performed on real and standard dataset. A very high level of classification accuracy is obtained in the result of classification.

Keywords: discrete wavelet transform, electroencephalogram, pattern recognition, principal component analysis, support vector machine

Procedia PDF Downloads 620
5744 Lipschitz Classifiers Ensembles: Usage for Classification of Target Events in C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev

Abstract:

This paper introduces an original method for guaranteed estimation of the accuracy of an ensemble of Lipschitz classifiers. The solution was obtained as a finite closed set of alternative hypotheses, which contains an object of classification with a probability of not less than the specified value. Thus, the classification is represented by a set of hypothetical classes. In this case, the smaller the cardinality of the discrete set of hypothetical classes is, the higher is the classification accuracy. Experiments have shown that if the cardinality of the classifiers ensemble is increased then the cardinality of this set of hypothetical classes is reduced. The problem of the guaranteed estimation of the accuracy of an ensemble of Lipschitz classifiers is relevant in the multichannel classification of target events in C-OTDR monitoring systems. Results of suggested approach practical usage to accuracy control in C-OTDR monitoring systems are present.

Keywords: Lipschitz classifiers, confidence set, C-OTDR monitoring, classifiers accuracy, classifiers ensemble

Procedia PDF Downloads 479
5743 A Review of Effective Gene Selection Methods for Cancer Classification Using Microarray Gene Expression Profile

Authors: Hala Alshamlan, Ghada Badr, Yousef Alohali

Abstract:

Cancer is one of the dreadful diseases, which causes considerable death rate in humans. DNA microarray-based gene expression profiling has been emerged as an efficient technique for cancer classification, as well as for diagnosis, prognosis, and treatment purposes. In recent years, a DNA microarray technique has gained more attraction in both scientific and in industrial fields. It is important to determine the informative genes that cause cancer to improve early cancer diagnosis and to give effective chemotherapy treatment. In order to gain deep insight into the cancer classification problem, it is necessary to take a closer look at the proposed gene selection methods. We believe that they should be an integral preprocessing step for cancer classification. Furthermore, finding an accurate gene selection method is a very significant issue in a cancer classification area because it reduces the dimensionality of microarray dataset and selects informative genes. In this paper, we classify and review the state-of-art gene selection methods. We proceed by evaluating the performance of each gene selection approach based on their classification accuracy and number of informative genes. In our evaluation, we will use four benchmark microarray datasets for the cancer diagnosis (leukemia, colon, lung, and prostate). In addition, we compare the performance of gene selection method to investigate the effective gene selection method that has the ability to identify a small set of marker genes, and ensure high cancer classification accuracy. To the best of our knowledge, this is the first attempt to compare gene selection approaches for cancer classification using microarray gene expression profile.

Keywords: gene selection, feature selection, cancer classification, microarray, gene expression profile

Procedia PDF Downloads 433
5742 Semantic Features of Turkish and Spanish Phraseological Units with a Somatic Component ‘Hand’

Authors: Narmina Mammadova

Abstract:

In modern linguistics, the comparative study of languages is becoming increasingly popular, the typology and comparison of languages that have different structures is expanding and deepening. Of particular interest is the study of phraseological units, which makes it possible to identify the specific features of the compared languages in all their national identity. This paper gives a brief analysis of the comparative study of somatic phraseological units (SFU) of the Spanish and Turkish languages with the component "hand" in the semantic aspect; identification of equivalents, analogs and non-equivalent units, as well as a description of methods of translation of non-equivalent somatic phraseological units. Comparative study of the phraseology of unrelated languages is of particular relevance since it allows us to identify both general, universal features and differential and specific features characteristic of a particular language. Based on the results of the generalization of the study, it can be assumed that phraseological units containing a somatic component have a high interlingual phraseological activity, which contributes to an increase in the degree of interlingual equivalence.

Keywords: Linguoculturology, Turkish, Spanish, language picture of the world, phraseological units, semantic microfield

Procedia PDF Downloads 178
5741 Preliminary Study of Sediment-Derived Plastiglomerate: Proposal to Classification

Authors: Agung Rizki Perdana, Asrofi Mursalin, Adniwan Shubhi Banuzaki, M. Indra Novian

Abstract:

The understanding about sediment-derived plastiglomerate has a wide-range of merit in the academic realm. It can cover discussions about the Anthropocene Epoch in the scope of geoscience knowledge to even provide a solution for the environmental problem of plastic waste. Albeit its importance, very few research has been done regarding this issue. This research aims to create a classification as a pioneer for the study of sediment-derived plastiglomerate. This research was done in Bantul Regency, Daerah Istimewa Yogyakarta Province as an analogue of plastic debris sedimentation process. Observation is carried out in five observation points that shows three different depositional environments, which are terrestrial, fluvial, and transitional environment. The resulting classification uses three parameters and forms in a taxonomical manner. These parameters are composition, degree of lithification, and abundance of matrix respectively in advancing order. There is also a compositional ternary diagram which should be followed before entering the plastiglomerate nomenclature classification.

Keywords: plastiglomerate, classification, sedimentary mechanism, microplastic

Procedia PDF Downloads 120
5740 Use of Interpretable Evolved Search Query Classifiers for Sinhala Documents

Authors: Prasanna Haddela

Abstract:

Document analysis is a well matured yet still active research field, partly as a result of the intricate nature of building computational tools but also due to the inherent problems arising from the variety and complexity of human languages. Breaking down language barriers is vital in enabling access to a number of recent technologies. This paper investigates the application of document classification methods to new Sinhalese datasets. This language is geographically isolated and rich with many of its own unique features. We will examine the interpretability of the classification models with a particular focus on the use of evolved Lucene search queries generated using a Genetic Algorithm (GA) as a method of document classification. We will compare the accuracy and interpretability of these search queries with other popular classifiers. The results are promising and are roughly in line with previous work on English language datasets.

Keywords: evolved search queries, Sinhala document classification, Lucene Sinhala analyzer, interpretable text classification, genetic algorithm

Procedia PDF Downloads 100
5739 Academic Literacy: Semantic-Discursive Resource and the Relationship with the Constitution of Genre for the Development of Writing

Authors: Lucia Rottava

Abstract:

The present study focuses on academic literacy and addresses the impact of semantic-discursive resources on the constitution of genres that are produced in such context. The research considers the development of writing in the academic context in Portuguese. Researches that address academic literacy and the characteristics of the texts produced in this context are rare, mainly with focus on the development of writing, considering three variables: the constitution of the writer, the perception of the reader/interlocutor and the organization of the informational text flow. The research aims to map the semantic-discursive resources of the written register in texts of several genres and produced by students in the first semester of the undergraduate course in letters. The hypothesis raised is that writing in the academic environment is not a recurrent literacy practice for these learners and can be explained by the ontogenetic and phylogenetic nature of language development. Qualitative in nature, the present research has as empirical data texts produced in a half-yearly course of Reading and Textual Production; these data result from the proposition of four different writing proposals, in a total of 600 texts. The corpus is analyzed based on semantic-discursive resources, seeking to contemplate relevant aspects of language (grammar, discourse and social context) that reveal the choices made in the reader/writer interrelationship and the organizational flow of the text. Among the semantic-discursive resources, the analysis includes three resources, including (a) appraisal and negotiation to understand the attitudes negotiated (roles of the participants of the discourse and their relationship with the other); (b) ideation to explain the construction of the experience (activities performed and participants); and (c) periodicity to outline the flow of information in the organization of the text according to the genre it instantiates. The results indicate the organizational difficulties of the flow of the text information. Cartography contributes to the understanding of the way writers use language in an effort to present themselves, evaluate someone else’s work, and communicate with readers.

Keywords: academic writing, portuguese mother tongue, semantic-discursive resources, sistemic funcional linguistic

Procedia PDF Downloads 110
5738 Classification Systems of Peat Soils Based on Their Geotechnical, Physical and Chemical Properties

Authors: Mohammad Saberian, Reza Porhoseini, Mohammad Ali Rahgozar

Abstract:

Peat is a partially carbonized vegetable tissue which is formed in wet conditions by decomposition of various plants, mosses and animal remains. This restricted definition, including only materials which are entirely of vegetative origin, conflicts with several established soil classification systems. Peat soils are usually defined as soils having more than 75 percent organic matter. Due to this composition, the structure of peat soil is highly different from the mineral soils such as silt, clay and sand. Peat has high compressibility, high moisture content, low shear strength and low bearing capacity, so it is considered to be in the category of problematic. Since this kind of soil is generally found in many countries and various zones, except for desert and polar zones, recognizing this soil is inevitably significant. The objective of this paper is to review the classification of peats based on various properties of peat soils such as organic contents, water content, color, odor, and decomposition, scholars offer various classification systems which Von Post classification system is one of the most well-known and efficient system.

Keywords: peat soil, degree of decomposition, organic content, water content, Von Post classification

Procedia PDF Downloads 578
5737 Factorization of Computations in Bayesian Networks: Interpretation of Factors

Authors: Linda Smail, Zineb Azouz

Abstract:

Given a Bayesian network relative to a set I of discrete random variables, we are interested in computing the probability distribution P(S) where S is a subset of I. The general idea is to write the expression of P(S) in the form of a product of factors where each factor is easy to compute. More importantly, it will be very useful to give an interpretation of each of the factors in terms of conditional probabilities. This paper considers a semantic interpretation of the factors involved in computing marginal probabilities in Bayesian networks. Establishing such a semantic interpretations is indeed interesting and relevant in the case of large Bayesian networks.

Keywords: Bayesian networks, D-Separation, level two Bayesian networks, factorization of computation

Procedia PDF Downloads 510
5736 On Supporting a Meta-Design Approach in Socio-Technical Ontology Engineering

Authors: Mesnan Silalahi, Dana Indra Sensuse, Indra Budi

Abstract:

Many research have revealed the fact of the complexity of ontology building process that there is a need to have a new approach which addresses the socio-technical aspects in the collaboration to reach a consensus. Meta-design approach is considered applicable as a method in the methodological model in a socio-technical ontology engineering. Principles in the meta-design framework is applied in the construction phases on the ontology. A portal is developed to support the meta-design principles requirements. To validate the methodological model semantic web applications were developed and integrated in the portal and also used as a way to show the usefulness of the ontology. The knowledge based system will be filled with data of Indonesian medicinal plants. By showing the usefulness of the developed ontology in a web semantic application, we motivate all stakeholders to participate in the development of knowledge based system of medicinal plants in Indonesia.

Keywords: socio-technical, metadesign, ontology engineering methodology, semantic web application

Procedia PDF Downloads 424
5735 Social Media, Networks and Related Technology: Business and Governance Perspectives

Authors: M. A. T. AlSudairi, T. G. K. Vasista

Abstract:

The concept of social media is becoming the top of the agenda for many business executives and public sector executives today. Decision makers as well as consultants, try to identify ways in which firms and enterprises can make profitable use of social media and network related applications such as Wikipedia, Face book, YouTube, Google+, Twitter. While it is fun and useful to participating in this media and network for achieving the communication effectively and efficiently, semantic and sentiment analysis and interpretation becomes a crucial issue. So, the objective of this paper is to provide literature review on social media, network and related technology related to semantics and sentiment or opinion analysis covering business and governance perspectives. In this regard, a case study on the use and adoption of Social media in Saudi Arabia has been discussed. It is concluded that semantic web technology play a significant role in analyzing the social networks and social media content for extracting the interpretational knowledge towards strategic decision support.

Keywords: CRASP methodology, formative assessment, literature review, semantic web services, social media, social networks

Procedia PDF Downloads 437
5734 Clinical Features, Diagnosis and Treatment Outcomes in Necrotising Autoimmune Myopathy: A Rare Entity in the Spectrum of Inflammatory Myopathies

Authors: Tamphasana Wairokpam

Abstract:

Inflammatory myopathies (IMs) have long been recognised as a heterogenous family of myopathies with acute, subacute, and sometimes chronic presentation and are potentially treatable. Necrotizing autoimmune myopathies (NAM) are a relatively new subset of myopathies. Patients generally present with subacute onset of proximal myopathy and significantly elevated creatinine kinase (CK) levels. It is being increasingly recognised that there are limitations to the independent diagnostic utility of muscle biopsy. Immunohistochemistry tests may reveal important information in these cases. The traditional classification of IMs failed to recognise NAM as a separate entity and did not adequately emphasize the diversity of IMs. This review and case report on NAM aims to highlight the heterogeneity of this entity and focus on the distinct clinical presentation, biopsy findings, specific auto-antibodies implicated, and available treatment options with prognosis. This article is a meta-analysis of literatures on NAM and a case report illustrating the clinical course, investigation and biopsy findings, antibodies implicated, and management of a patient with NAM. The main databases used for the search were Pubmed, Google Scholar, and Cochrane Library. Altogether, 67 publications have been taken as references. Two biomarkers, anti-signal recognition protein (SRP) and anti- hydroxyl methylglutaryl-coenzyme A reductase (HMGCR) Abs, have been found to have an association with NAM in about 2/3rd of cases. Interestingly, anti-SRP associated NAM appears to be more aggressive in its clinical course when compared to its anti-HMGCR associated counterpart. Biopsy shows muscle fibre necrosis without inflammation. There are reports of statin-induced NAM where progression of myopathy has been seen even after discontinuation of statins, pointing towards an underlying immune mechanism. Diagnosisng NAM is essential as it requires more aggressive immunotherapy than other types of IMs. Most cases are refractory to corticosteroid monotherapy. Immunosuppressive therapy with other immunotherapeutic agents such as IVIg, rituximab, mycophenolate mofetil, azathioprine has been explored and found to have a role in the treatment of NAM. In conclusion,given the heterogeneity of NAM, it appears that NAM is not just a single entity but consists of many different forms, despite the similarities in presentation and its classification remains an evolving field. A thorough understanding of underlying mechanism and the clinical correlation with antibodies associated with NAM is essential for efficacious management and disease prognostication.

Keywords: inflammatory myopathies, necrotising autoimmune myopathies, anti-SRP antibody, anti-HMGCR antibody, statin induced myopathy

Procedia PDF Downloads 94
5733 The Diminished Online Persona: A Semantic Change of Chinese Classifier Mei on Weibo

Authors: Hui Shi

Abstract:

This study investigates a newly emerged usage of Chinese numeral classifier mei (枚) in the cyberspace. In modern Chinese grammar, mei as a classifier should occupy the pre-nominal position, and its valid accompanying nouns are restricted to small, flat, fragile inanimate objects rather than humans. To examine the semantic change of mei, two types of data from Weibo.com were collected. First, 500 mei-included Weibo posts constructed a corpus for analyzing this classifier's word order distribution (post-nominal or pre-nominal) as well as its accompanying nouns' semantics (inanimate or human). Second, considering that mei accompanies a remarkable number of human nouns in the first corpus, the second corpus is composed of mei-involved Weibo IDs from users located in first and third-tier cities (n=8 respectively). The findings show that in the cyber community, mei frequently classifies human-related neologisms at the archaic post-normal position. Besides, the 23 to 29-year-old females as well as Weibo users from third-tier cities are the major populations who adopt mei in their user IDs for self-description and identity expression. This paper argues that the creative usage of mei gains popularity in the Chinese internet due to a humor effect. The marked word order switch and semantic misapplication combined to trigger incongruity and jocularity. This study has significance for research on Chinese cyber neologism. It may also lay a foundation for further studies on Chinese classifier change and Chinese internet communication.

Keywords: Chinese classifier, humor, neologism, semantic change

Procedia PDF Downloads 242
5732 INRAM-3DCNN: Multi-Scale Convolutional Neural Network Based on Residual and Attention Module Combined with Multilayer Perceptron for Hyperspectral Image Classification

Authors: Jianhong Xiang, Rui Sun, Linyu Wang

Abstract:

In recent years, due to the continuous improvement of deep learning theory, Convolutional Neural Network (CNN) has played a great superior performance in the research of Hyperspectral Image (HSI) classification. Since HSI has rich spatial-spectral information, only utilizing a single dimensional or single size convolutional kernel will limit the detailed feature information received by CNN, which limits the classification accuracy of HSI. In this paper, we design a multi-scale CNN with MLP based on residual and attention modules (INRAM-3DCNN) for the HSI classification task. We propose to use multiple 3D convolutional kernels to extract the packet feature information and fully learn the spatial-spectral features of HSI while designing residual 3D convolutional branches to avoid the decline of classification accuracy due to network degradation. Secondly, we also design the 2D Inception module with a joint channel attention mechanism to quickly extract key spatial feature information at different scales of HSI and reduce the complexity of the 3D model. Due to the high parallel processing capability and nonlinear global action of the Multilayer Perceptron (MLP), we use it in combination with the previous CNN structure for the final classification process. The experimental results on two HSI datasets show that the proposed INRAM-3DCNN method has superior classification performance and can perform the classification task excellently.

Keywords: INRAM-3DCNN, residual, channel attention, hyperspectral image classification

Procedia PDF Downloads 58
5731 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task: MiniLM and Roberta, and further fine-tuned them on Legal Contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 89
5730 Deep Learning-Based Automated Structure Deterioration Detection for Building Structures: A Technological Advancement for Ensuring Structural Integrity

Authors: Kavita Bodke

Abstract:

Structural health monitoring (SHM) is experiencing growth, necessitating the development of distinct methodologies to address its expanding scope effectively. In this study, we developed automatic structure damage identification, which incorporates three unique types of a building’s structural integrity. The first pertains to the presence of fractures within the structure, the second relates to the issue of dampness within the structure, and the third involves corrosion inside the structure. This study employs image classification techniques to discern between intact and impaired structures within structural data. The aim of this research is to find automatic damage detection with the probability of each damage class being present in one image. Based on this probability, we know which class has a higher probability or is more affected than the other classes. Utilizing photographs captured by a mobile camera serves as the input for an image classification system. Image classification was employed in our study to perform multi-class and multi-label classification. The objective was to categorize structural data based on the presence of cracks, moisture, and corrosion. In the context of multi-class image classification, our study employed three distinct methodologies: Random Forest, Multilayer Perceptron, and CNN. For the task of multi-label image classification, the models employed were Rasnet, Xceptionet, and Inception.

Keywords: SHM, CNN, deep learning, multi-class classification, multi-label classification

Procedia PDF Downloads 14
5729 Competing Risks Modeling Using within Node Homogeneity Classification Tree

Authors: Kazeem Adesina Dauda, Waheed Babatunde Yahya

Abstract:

To design a tree that maximizes within-node homogeneity, there is a need for a homogeneity measure that is appropriate for event history data with multiple risks. We consider the use of Deviance and Modified Cox-Snell residuals as a measure of impurity in Classification Regression Tree (CART) and compare our results with the results of Fiona (2008) in which homogeneity measures were based on Martingale Residual. Data structure approach was used to validate the performance of our proposed techniques via simulation and real life data. The results of univariate competing risk revealed that: using Deviance and Cox-Snell residuals as a response in within node homogeneity classification tree perform better than using other residuals irrespective of performance techniques. Bone marrow transplant data and double-blinded randomized clinical trial, conducted in other to compare two treatments for patients with prostate cancer were used to demonstrate the efficiency of our proposed method vis-à-vis the existing ones. Results from empirical studies of the bone marrow transplant data showed that the proposed model with Cox-Snell residual (Deviance=16.6498) performs better than both the Martingale residual (deviance=160.3592) and Deviance residual (Deviance=556.8822) in both event of interest and competing risks. Additionally, results from prostate cancer also reveal the performance of proposed model over the existing one in both causes, interestingly, Cox-Snell residual (MSE=0.01783563) outfit both the Martingale residual (MSE=0.1853148) and Deviance residual (MSE=0.8043366). Moreover, these results validate those obtained from the Monte-Carlo studies.

Keywords: within-node homogeneity, Martingale residual, modified Cox-Snell residual, classification and regression tree

Procedia PDF Downloads 257
5728 Neural Network Approach to Classifying Truck Traffic

Authors: Ren Moses

Abstract:

The process of classifying vehicles on a highway is hereby viewed as a pattern recognition problem in which connectionist techniques such as artificial neural networks (ANN) can be used to assign vehicles to their correct classes and hence to establish optimum axle spacing thresholds. In the United States, vehicles are typically classified into 13 classes using a methodology commonly referred to as “Scheme F”. In this research, the ANN model was developed, trained, and applied to field data of vehicles. The data comprised of three vehicular features—axle spacing, number of axles per vehicle, and overall vehicle weight. The ANN reduced the classification error rate from 9.5 percent to 6.2 percent when compared to an existing classification algorithm that is not ANN-based and which uses two vehicular features for classification, that is, axle spacing and number of axles. The inclusion of overall vehicle weight as a third classification variable further reduced the error rate from 6.2 percent to only 3.0 percent. The promising results from the neural networks were used to set up new thresholds that reduce classification error rate.

Keywords: artificial neural networks, vehicle classification, traffic flow, traffic analysis, and highway opera-tions

Procedia PDF Downloads 296
5727 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 137
5726 Radar-Based Classification of Pedestrian and Dog Using High-Resolution Raw Range-Doppler Signatures

Authors: C. Mayr, J. Periya, A. Kariminezhad

Abstract:

In this paper, we developed a learning framework for the classification of vulnerable road users (VRU) by their range-Doppler signatures. The frequency-modulated continuous-wave (FMCW) radar raw data is first pre-processed to obtain robust object range-Doppler maps per coherent time interval. The complex-valued range-Doppler maps captured from our outdoor measurements are further fed into a convolutional neural network (CNN) to learn the classification. This CNN has gone through a hyperparameter optimization process for improved learning. By learning VRU range-Doppler signatures, the three classes 'pedestrian', 'dog', and 'noise' are classified with an average accuracy of almost 95%. Interestingly, this classification accuracy holds for a combined longitudinal and lateral object trajectories.

Keywords: machine learning, radar, signal processing, autonomous driving

Procedia PDF Downloads 222
5725 Multi-Layer Perceptron and Radial Basis Function Neural Network Models for Classification of Diabetic Retinopathy Disease Using Video-Oculography Signals

Authors: Ceren Kaya, Okan Erkaymaz, Orhan Ayar, Mahmut Özer

Abstract:

Diabetes Mellitus (Diabetes) is a disease based on insulin hormone disorders and causes high blood glucose. Clinical findings determine that diabetes can be diagnosed by electrophysiological signals obtained from the vital organs. 'Diabetic Retinopathy' is one of the most common eye diseases resulting on diabetes and it is the leading cause of vision loss due to structural alteration of the retinal layer vessels. In this study, features of horizontal and vertical Video-Oculography (VOG) signals have been used to classify non-proliferative and proliferative diabetic retinopathy disease. Twenty-five features are acquired by using discrete wavelet transform with VOG signals which are taken from 21 subjects. Two models, based on multi-layer perceptron and radial basis function, are recommended in the diagnosis of Diabetic Retinopathy. The proposed models also can detect level of the disease. We show comparative classification performance of the proposed models. Our results show that proposed the RBF model (100%) results in better classification performance than the MLP model (94%).

Keywords: diabetic retinopathy, discrete wavelet transform, multi-layer perceptron, radial basis function, video-oculography (VOG)

Procedia PDF Downloads 245
5724 Classification of Poverty Level Data in Indonesia Using the Naïve Bayes Method

Authors: Anung Style Bukhori, Ani Dijah Rahajoe

Abstract:

Poverty poses a significant challenge in Indonesia, requiring an effective analytical approach to understand and address this issue. In this research, we applied the Naïve Bayes classification method to examine and classify poverty data in Indonesia. The main focus is on classifying data using RapidMiner, a powerful data analysis platform. The analysis process involves data splitting to train and test the classification model. First, we collected and prepared a poverty dataset that includes various factors such as education, employment, and health..The experimental results indicate that the Naïve Bayes classification model can provide accurate predictions regarding the risk of poverty. The use of RapidMiner in the analysis process offers flexibility and efficiency in evaluating the model's performance. The classification produces several values to serve as the standard for classifying poverty data in Indonesia using Naive Bayes. The accuracy result obtained is 40.26%, with a moderate recall result of 35.94%, a high recall result of 63.16%, and a low recall result of 38.03%. The precision for the moderate class is 58.97%, for the high class is 17.39%, and for the low class is 58.70%. These results can be seen from the graph below.

Keywords: poverty, classification, naïve bayes, Indonesia

Procedia PDF Downloads 38
5723 Drone Classification Using Classification Methods Using Conventional Model With Embedded Audio-Visual Features

Authors: Hrishi Rakshit, Pooneh Bagheri Zadeh

Abstract:

This paper investigates the performance of drone classification methods using conventional DCNN with different hyperparameters, when additional drone audio data is embedded in the dataset for training and further classification. In this paper, first a custom dataset is created using different images of drones from University of South California (USC) datasets and Leeds Beckett university datasets with embedded drone audio signal. The three well-known DCNN architectures namely, Resnet50, Darknet53 and Shufflenet are employed over the created dataset tuning their hyperparameters such as, learning rates, maximum epochs, Mini Batch size with different optimizers. Precision-Recall curves and F1 Scores-Threshold curves are used to evaluate the performance of the named classification algorithms. Experimental results show that Resnet50 has the highest efficiency compared to other DCNN methods.

Keywords: drone classifications, deep convolutional neural network, hyperparameters, drone audio signal

Procedia PDF Downloads 84