Search results for: user classification accuracy
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7125

Search results for: user classification accuracy

6375 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, ANOVA three-way test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: stacking, multi-layers, ensemble, multi-class

Procedia PDF Downloads 266
6374 Between AACR2 and RDA What Changes Occurs in Them

Authors: Ibrahim Abdullahi Mohammad

Abstract:

A library catalogue exists not only as an inventory of the collections of the particular library, but also as a retrieval device. It is provided to assist the library user in finding whatever information or information resources they may be looking for. The paper proposes that this location objective of the library catalogue can only be fulfilled, if the library catalogue is constructed, bearing in mind the information needs and searching behavior of the library user. Comparing AACR2 and RDA viz-a-viz the changes RDA has introduced into bibliographic standards, the paper tries to establish the level of viability of RDA in relation to AACR2.

Keywords: library catalogue, information retrieval, AACR2, RDA

Procedia PDF Downloads 47
6373 Gauging Floral Resources for Pollinators Using High Resolution Drone Imagery

Authors: Nicholas Anderson, Steven Petersen, Tom Bates, Val Anderson

Abstract:

Under the multiple-use management regime established in the United States for federally owned lands, government agencies have come under pressure from commercial apiaries to grant permits for the summer pasturing of honeybees on government lands. Federal agencies have struggled to integrate honeybees into their management plans and have little information to make regulations that resolve how many colonies should be allowed in a single location and at what distance sets of hives should be placed. Many conservation groups have voiced their concerns regarding the introduction of honeybees to these natural lands, as they may outcompete and displace native pollinating species. Assessing the quality of an area in regard to its floral resources, pollen, and nectar can be important when attempting to create regulations for the integration of commercial honeybee operations into a native ecosystem. Areas with greater floral resources may be able to support larger numbers of honeybee colonies, while poorer resource areas may be less resilient to introduced disturbances. Attempts are made in this study to determine flower cover using high resolution drone imagery to help assess the floral resource availability to pollinators in high elevation, tall forb communities. This knowledge will help in determining the potential that different areas may have for honeybee pasturing and honey production. Roughly 700 images were captured at 23m above ground level using a drone equipped with a Sony QX1 RGB 20-megapixel camera. These images were stitched together using Pix4D, resulting in a 60m diameter high-resolution mosaic of a tall forb meadow. Using the program ENVI, a supervised maximum likelihood classification was conducted to calculate the percentage of total flower cover and flower cover by color (blue, white, and yellow). A complete vegetation inventory was taken on site, and the major flowers contributing to each color class were noted. An accuracy assessment was performed on the classification yielding an 89% overall accuracy and a Kappa Statistic of 0.855. With this level of accuracy, drones provide an affordable and time efficient method for the assessment of floral cover in large areas. The proximal step of this project will now be to determine the average pollen and nectar loads carried by each flower species. The addition of this knowledge will result in a quantifiable method of measuring pollen and nectar resources of entire landscapes. This information will not only help land managers determine stocking rates for honeybees on public lands but also has applications in the agricultural setting, aiding producers in the determination of the number of honeybee colonies necessary for proper pollination of fruit and nut crops.

Keywords: honeybee, flower, pollinator, remote sensing

Procedia PDF Downloads 137
6372 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 305
6371 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 157
6370 Automated Prediction of HIV-associated Cervical Cancer Patients Using Data Mining Techniques for Survival Analysis

Authors: O. J. Akinsola, Yinan Zheng, Rose Anorlu, F. T. Ogunsola, Lifang Hou, Robert Leo-Murphy

Abstract:

Cervical Cancer (CC) is the 2nd most common cancer among women living in low and middle-income countries, with no associated symptoms during formative periods. With the advancement and innovative medical research, there are numerous preventive measures being utilized, but the incidence of cervical cancer cannot be truncated with the application of only screening tests. The mortality associated with this invasive cervical cancer can be nipped in the bud through the important role of early-stage detection. This study research selected an array of different top features selection techniques which was aimed at developing a model that could validly diagnose the risk factors of cervical cancer. A retrospective clinic-based cohort study was conducted on 178 HIV-associated cervical cancer patients in Lagos University teaching Hospital, Nigeria (U54 data repository) in April 2022. The outcome measure was the automated prediction of the HIV-associated cervical cancer cases, while the predictor variables include: demographic information, reproductive history, birth control, sexual history, cervical cancer screening history for invasive cervical cancer. The proposed technique was assessed with R and Python programming software to produce the model by utilizing the classification algorithms for the detection and diagnosis of cervical cancer disease. Four machine learning classification algorithms used are: the machine learning model was split into training and testing dataset into ratio 80:20. The numerical features were also standardized while hyperparameter tuning was carried out on the machine learning to train and test the data. Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbor (KNN). Some fitting features were selected for the detection and diagnosis of cervical cancer diseases from selected characteristics in the dataset using the contribution of various selection methods for the classification cervical cancer into healthy or diseased status. The mean age of patients was 49.7±12.1 years, mean age at pregnancy was 23.3±5.5 years, mean age at first sexual experience was 19.4±3.2 years, while the mean BMI was 27.1±5.6 kg/m2. A larger percentage of the patients are Married (62.9%), while most of them have at least two sexual partners (72.5%). Age of patients (OR=1.065, p<0.001**), marital status (OR=0.375, p=0.011**), number of pregnancy live-births (OR=1.317, p=0.007**), and use of birth control pills (OR=0.291, p=0.015**) were found to be significantly associated with HIV-associated cervical cancer. On top ten 10 features (variables) considered in the analysis, RF claims the overall model performance, which include: accuracy of (72.0%), the precision of (84.6%), a recall of (84.6%) and F1-score of (74.0%) while LR has: an accuracy of (74.0%), precision of (70.0%), recall of (70.0%) and F1-score of (70.0%). The RF model identified 10 features predictive of developing cervical cancer. The age of patients was considered as the most important risk factor, followed by the number of pregnancy livebirths, marital status, and use of birth control pills, The study shows that data mining techniques could be used to identify women living with HIV at high risk of developing cervical cancer in Nigeria and other sub-Saharan African countries.

Keywords: associated cervical cancer, data mining, random forest, logistic regression

Procedia PDF Downloads 82
6369 Role of Web Graphics and Interface in Creating Visitor Trust

Authors: Pramika J. Muthya

Abstract:

This paper investigates the impact of web graphics and interface design on building visitor trust in websites. A quantitative survey approach was used to examine how aesthetic and usability elements of website design influence user perceptions of trustworthiness. 133 participants aged 18-25 who live in urban Bangalore and engage in online transactions were recruited via convenience sampling. Data was collected through an online survey measuring trust levels based on website design, using validated constructs like the Visual Aesthetic of Websites Inventory (VisAWI). Statistical analysis, including ordinal regression, was conducted to analyze the results. The findings show a statistically significant relationship between web graphics and interface design and the level of trust visitors place in a website. The goodness-of-fit statistics and highly significant model fitting information provide strong evidence for rejecting the null hypothesis of no relationship. Well-designed visual aesthetics like simplicity, diversity, colorfulness, and craftsmanship are key drivers of perceived credibility. Intuitive navigation and usability also increase trust. The results emphasize the strategic importance for companies to invest in appealing graphic design, consistent with existing theoretical frameworks. There are also implications for taking a user-centric approach to web design and acknowledging the reciprocal link between pre-existing user trust and perception of visuals. While generalizable, limitations include possible sampling and self-report biases. Further research can build on these findings to deepen understanding of nuanced cultural and temporal factors influencing online trust. Overall, this study makes a significant contribution by providing empirical evidence that reinforces the crucial impact of thoughtful graphic design in fostering lasting user trust in websites.

Keywords: web graphics, interface design, visitor trust, website design, aesthetics, user experience, online trust, visual design, graphic design, user perceptions, user expectations

Procedia PDF Downloads 48
6368 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: multilayer neural networks, k- nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 528
6367 [Keynote Talk]: Formal Specification and Description Language and Message Sequence Chart to Model and Validate Session Initiation Protocol Services

Authors: Sa’ed Abed, Mohammad H. Al Shayeji, Ovais Ahmed, Sahel Alouneh

Abstract:

Session Initiation Protocol (SIP) is a signaling layer protocol for building, adjusting and ending sessions among participants including Internet conferences, telephone calls and multimedia distribution. SIP facilitates user movement by proxying and forwarding requests to the present location of the user. In this paper, we provide a formal Specification and Description Language (SDL) and Message Sequence Chart (MSC) to model and define the Internet Engineering Task Force (IETF) SIP protocol and its sample services resulted from informal SIP specification. We create an “Abstract User Interface” using case analysis so that can be applied to identify SIP services more explicitly. The issued sample SIP features are then used as case scenarios; they are revised in MSCs format and validated to their corresponding SDL models.

Keywords: modeling, MSC, SDL, SIP, validating

Procedia PDF Downloads 208
6366 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

DNA Barcode, a short mitochondrial DNA fragment, made up of three subunits; a phosphate group, sugar and nucleic bases (A, T, C, and G). They provide good sources of information needed to classify living species. Such intuition has been confirmed by many experimental results. Species classification with DNA Barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their Barcode. This task has to be supported with reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use similarity sequence methods. A large set of sequences can be simultaneously compared using Multiple Sequence Alignment which is known to be NP-complete. To make this type of analysis feasible, heuristics, like progressive alignment, have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This method permits to avoid the complex problem of form and structure in different classes of organisms. On empirical data and their classification performances are compared with other methods. Our system consists of three phases. The first is called transformation, which is composed of three steps; Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA Barcodes, Fourier Transform and Power Spectrum Signal Processing. The second is called approximation, which is empowered by the use of Multi Llibrary Wavelet Neural Networks (MLWNN).The third is called the classification of DNA Barcodes, which is realized by applying the algorithm of hierarchical classification.

Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)

Procedia PDF Downloads 315
6365 Predictive Modelling of Aircraft Component Replacement Using Imbalanced Learning and Ensemble Method

Authors: Dangut Maren David, Skaf Zakwan

Abstract:

Adequate monitoring of vehicle component in other to obtain high uptime is the goal of predictive maintenance, the major challenge faced by businesses in industries is the significant cost associated with a delay in service delivery due to system downtime. Most of those businesses are interested in predicting those problems and proactively prevent them in advance before it occurs, which is the core advantage of Prognostic Health Management (PHM) application. The recent emergence of industry 4.0 or industrial internet of things (IIoT) has led to the need for monitoring systems activities and enhancing system-to-system or component-to- component interactions, this has resulted to a large generation of data known as big data. Analysis of big data represents an increasingly important, however, due to complexity inherently in the dataset such as imbalance classification problems, it becomes extremely difficult to build a model with accurate high precision. Data-driven predictive modeling for condition-based maintenance (CBM) has recently drowned research interest with growing attention to both academics and industries. The large data generated from industrial process inherently comes with a different degree of complexity which posed a challenge for analytics. Thus, imbalance classification problem exists perversely in industrial datasets which can affect the performance of learning algorithms yielding to poor classifier accuracy in model development. Misclassification of faults can result in unplanned breakdown leading economic loss. In this paper, an advanced approach for handling imbalance classification problem is proposed and then a prognostic model for predicting aircraft component replacement is developed to predict component replacement in advanced by exploring aircraft historical data, the approached is based on hybrid ensemble-based method which improves the prediction of the minority class during learning, we also investigate the impact of our approach on multiclass imbalance problem. We validate the feasibility and effectiveness in terms of the performance of our approach using real-world aircraft operation and maintenance datasets, which spans over 7 years. Our approach shows better performance compared to other similar approaches. We also validate our approach strength for handling multiclass imbalanced dataset, our results also show good performance compared to other based classifiers.

Keywords: prognostics, data-driven, imbalance classification, deep learning

Procedia PDF Downloads 168
6364 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environment conservations, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g. animal population densities, topography, behavior patterns of the criminals within the area, etc) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching. This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies it to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month vice entire seasons, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 287
6363 Dimensional Accuracy of CNTs/PMMA Parts and Holes Produced by Laser Cutting

Authors: A. Karimzad Ghavidel, M. Zadshakouyan

Abstract:

Laser cutting is a very common production method for cutting 2D polymeric parts. Developing of polymer composites with nano-fibers makes important their other properties like laser workability. The aim of this research is investigation of the influence different laser cutting conditions on the dimensional accuracy of parts and holes from poly methyl methacrylate (PMMA)/carbon nanotubes (CNTs) material. Experiments were carried out by considering of CNTs (in four level 0,0.5, 1 and 1.5% wt.%), laser power (60, 80, and 100 watt) and cutting speed 20, 30, and 40 mm/s as input variable factors. The results reveal that CNTs adding improves the laser workability of PMMA and the increasing of power has a significant effect on the part and hole size. The findings also show cutting speed is effective parameter on the size accuracy. Eventually, the statistical analysis of results was done, and calculated mathematical equations by the regression are presented for determining relation between input and output factor.

Keywords: dimensional accuracy, PMMA, CNTs, laser cutting

Procedia PDF Downloads 304
6362 Graphic User Interface Design Principles for Designing Augmented Reality Applications

Authors: Afshan Ejaz, Syed Asim Ali

Abstract:

The reality is a combination of perception, reconstruction, and interaction. Augmented Reality is the advancement that layer over consistent everyday existence which includes content based interface, voice-based interfaces, voice-based interface and guide based or gesture-based interfaces, so designing augmented reality application interfaces is a difficult task for the maker. Designing a user interface which is not only easy to use and easy to learn but its more interactive and self-explanatory which have high perceived affordability, perceived usefulness, consistency and high discoverability so that the user could easily recognized and understand the design. For this purpose, a lot of interface design principles such as learnability, Affordance, Simplicity, Memorability, Feedback, Visibility, Flexibly and others are introduced but there no such principles which explain the most appropriate interface design principles for designing an Augmented Reality application interfaces. Therefore, the basic goal of introducing design principles for Augmented Reality application interfaces is to match the user efforts and the computer display (‘plot user input onto computer output’) using an appropriate interface action symbol (‘metaphors’) or to make that application easy to use, easy to understand and easy to discover. In this study by observing Augmented reality system and interfaces, few of well-known design principle related to GUI (‘user-centered design’) are identify and through them, few issues are shown which can be determined through the design principles. With the help of multiple studies, our study suggests different interface design principles which makes designing Augmented Reality application interface more easier and more helpful for the maker as these principles make the interface more interactive, learnable and more usable. To accomplish and test our finding, Pokémon Go an Augmented Reality game was selected and all the suggested principles are implement and test on its interface. From the results, our study concludes that our identified principles are most important principles while developing and testing any Augmented Reality application interface.

Keywords: GUI, augmented reality, metaphors, affordance, perception, satisfaction, cognitive burden

Procedia PDF Downloads 165
6361 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen , T. Nazmy

Abstract:

Since the appearance of the Semantic web, many semantic search techniques and models were proposed to exploit the information in ontology to enhance the traditional keyword-based search. Many advances were made in languages such as English, German, French and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper we present a framework for ontology based information retrieval for Arabic language. Our system consists of four main modules, namely query parser, indexer, search and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies: Sports, Economic and Politics as example for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We have done many retrieval operations on a sample of 40,316 documents with a size 320 MB of pure text. The results show that the semantic search enhanced with text classification gives better performance results than the system without classification.

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 522
6360 On-Road Text Detection Platform for Driver Assistance Systems

Authors: Guezouli Larbi, Belkacem Soundes

Abstract:

The automation of the text detection process can help the human in his driving task. Its application can be very useful to help drivers to have more information about their environment by facilitating the reading of road signs such as directional signs, events, stores, etc. In this paper, a system consisting of two stages has been proposed. In the first one, we used pseudo-Zernike moments to pinpoint areas of the image that may contain text. The architecture of this part is based on three main steps, region of interest (ROI) detection, text localization, and non-text region filtering. Then, in the second step, we present a convolutional neural network architecture (On-Road Text Detection Network - ORTDN) which is considered a classification phase. The results show that the proposed framework achieved ≈ 35 fps and an mAP of ≈ 90%, thus a low computational time with competitive accuracy.

Keywords: text detection, CNN, PZM, deep learning

Procedia PDF Downloads 80
6359 Performance Evaluation of Routing Protocol in Cognitive Radio with Multi Technological Environment

Authors: M. Yosra, A. Mohamed, T. Sami

Abstract:

Over the past few years, mobile communication technologies have seen significant evolution. This fact promoted the implementation of many systems in a multi-technological setting. From one system to another, the Quality of Service (QoS) provided to mobile consumers gets better. The growing number of normalized standards extends the available services for each consumer, moreover, most of the available radio frequencies have already been allocated, such as 3G, Wifi, Wimax, and LTE. A study by the Federal Communications Commission (FCC) found that certain frequency bands are partially occupied in particular locations and times. So, the idea of Cognitive Radio (CR) is to share the spectrum between a primary user (PU) and a secondary user (SU). The main objective of this spectrum management is to achieve a maximum rate of exploitation of the radio spectrum. In general, the CR can greatly improve the quality of service (QoS) and improve the reliability of the link. The problem will reside in the possibility of proposing a technique to improve the reliability of the wireless link by using the CR with some routing protocols. However, users declared that the links were unreliable and that it was an incompatibility with QoS. In our case, we choose the QoS parameter "bandwidth" to perform a supervised classification. In this paper, we propose a comparative study between some routing protocols, taking into account the variation of different technologies on the existing spectral bandwidth like 3G, WIFI, WIMAX, and LTE. Due to the simulation results, we observe that LTE has significantly higher availability bandwidth compared with other technologies. The performance of the OLSR protocol is better than other on-demand routing protocols (DSR, AODV and DSDV), in LTE technology because of the proper receiving of packets, less packet drop and the throughput. Numerous simulations of routing protocols have been made using simulators such as NS3.

Keywords: cognitive radio, multi technology, network simulator (NS3), routing protocol

Procedia PDF Downloads 55
6358 Melanoma and Non-Melanoma, Skin Lesion Classification, Using a Deep Learning Model

Authors: Shaira L. Kee, Michael Aaron G. Sy, Myles Joshua T. Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract:

Skin diseases are considered the fourth most common disease, with melanoma and non-melanoma skin cancer as the most common type of cancer in Caucasians. The alarming increase in Skin Cancer cases shows an urgent need for further research to improve diagnostic methods, as early diagnosis can significantly improve the 5-year survival rate. Machine Learning algorithms for image pattern analysis in diagnosing skin lesions can dramatically increase the accuracy rate of detection and decrease possible human errors. Several studies have shown the diagnostic performance of computer algorithms outperformed dermatologists. However, existing methods still need improvements to reduce diagnostic errors and generate efficient and accurate results. Our paper proposes an ensemble method to classify dermoscopic images into benign and malignant skin lesions. The experiments were conducted using the International Skin Imaging Collaboration (ISIC) image samples. The dataset contains 3,297 dermoscopic images with benign and malignant categories. The results show improvement in performance with an accuracy of 88% and an F1 score of 87%, outperforming other existing models such as support vector machine (SVM), Residual network (ResNet50), EfficientNetB0, EfficientNetB4, and VGG16.

Keywords: deep learning - VGG16 - efficientNet - CNN – ensemble – dermoscopic images - melanoma

Procedia PDF Downloads 77
6357 Frequency Decomposition Approach for Sub-Band Common Spatial Pattern Methods for Motor Imagery Based Brain-Computer Interface

Authors: Vitor M. Vilas Boas, Cleison D. Silva, Gustavo S. Mafra, Alexandre Trofino Neto

Abstract:

Motor imagery (MI) based brain-computer interfaces (BCI) uses event-related (de)synchronization (ERS/ ERD), typically recorded using electroencephalography (EEG), to translate brain electrical activity into control commands. To mitigate undesirable artifacts and noise measurements on EEG signals, methods based on band-pass filters defined by a specific frequency band (i.e., 8 – 30Hz), such as the Infinity Impulse Response (IIR) filters, are typically used. Spatial techniques, such as Common Spatial Patterns (CSP), are also used to estimate the variations of the filtered signal and extract features that define the imagined motion. The CSP effectiveness depends on the subject's discriminative frequency, and approaches based on the decomposition of the band of interest into sub-bands with smaller frequency ranges (SBCSP) have been suggested to EEG signals classification. However, despite providing good results, the SBCSP approach generally increases the computational cost of the filtering step in IM-based BCI systems. This paper proposes the use of the Fast Fourier Transform (FFT) algorithm in the IM-based BCI filtering stage that implements SBCSP. The goal is to apply the FFT algorithm to reduce the computational cost of the processing step of these systems and to make them more efficient without compromising classification accuracy. The proposal is based on the representation of EEG signals in a matrix of coefficients resulting from the frequency decomposition performed by the FFT, which is then submitted to the SBCSP process. The structure of the SBCSP contemplates dividing the band of interest, initially defined between 0 and 40Hz, into a set of 33 sub-bands spanning specific frequency bands which are processed in parallel each by a CSP filter and an LDA classifier. A Bayesian meta-classifier is then used to represent the LDA outputs of each sub-band as scores and organize them into a single vector, and then used as a training vector of an SVM global classifier. Initially, the public EEG data set IIa of the BCI Competition IV is used to validate the approach. The first contribution of the proposed method is that, in addition to being more compact, because it has a 68% smaller dimension than the original signal, the resulting FFT matrix maintains the signal information relevant to class discrimination. In addition, the results showed an average reduction of 31.6% in the computational cost in relation to the application of filtering methods based on IIR filters, suggesting FFT efficiency when applied in the filtering step. Finally, the frequency decomposition approach improves the overall system classification rate significantly compared to the commonly used filtering, going from 73.7% using IIR to 84.2% using FFT. The accuracy improvement above 10% and the computational cost reduction denote the potential of FFT in EEG signal filtering applied to the context of IM-based BCI implementing SBCSP. Tests with other data sets are currently being performed to reinforce such conclusions.

Keywords: brain-computer interfaces, fast Fourier transform algorithm, motor imagery, sub-band common spatial patterns

Procedia PDF Downloads 127
6356 Estimating Tree Height and Forest Classification from Multi Temporal Risat-1 HH and HV Polarized Satellite Aperture Radar Interferometric Phase Data

Authors: Saurav Kumar Suman, P. Karthigayani

Abstract:

In this paper the height of the tree is estimated and forest types is classified from the multi temporal RISAT-1 Horizontal-Horizontal (HH) and Horizontal-Vertical (HV) Polarised Satellite Aperture Radar (SAR) data. The novelty of the proposed project is combined use of the Back-scattering Coefficients (Sigma Naught) and the Coherence. It uses Water Cloud Model (WCM). The approaches use two main steps. (a) Extraction of the different forest parameter data from the Product.xml, BAND-META file and from Grid-xxx.txt file come with the HH & HV polarized data from the ISRO (Indian Space Research Centre). These file contains the required parameter during height estimation. (b) Calculation of the Vegetation and Ground Backscattering, Coherence and other Forest Parameters. (c) Classification of Forest Types using the ENVI 5.0 Tool and ROI (Region of Interest) calculation.

Keywords: RISAT-1, classification, forest, SAR data

Procedia PDF Downloads 402
6355 Integrated Model for Enhancing Data Security Performance in Cloud Computing

Authors: Amani A. Saad, Ahmed A. El-Farag, El-Sayed A. Helali

Abstract:

Cloud computing is an important and promising field in the recent decade. Cloud computing allows sharing resources, services and information among the people of the whole world. Although the advantages of using clouds are great, but there are many risks in a cloud. The data security is the most important and critical problem of cloud computing. In this research a new security model for cloud computing is proposed for ensuring secure communication system, hiding information from other users and saving the user's times. In this proposed model Blowfish encryption algorithm is used for exchanging information or data, and SHA-2 cryptographic hash algorithm is used for data integrity. For user authentication process a user-name and password is used, the password uses SHA-2 for one way encryption. The proposed system shows an improvement of the processing time of uploading and downloading files on the cloud in secure form.

Keywords: cloud Ccomputing, data security, SAAS, PAAS, IAAS, Blowfish

Procedia PDF Downloads 472
6354 The Role of Situational Factors in User Experience during Human-Robot Interaction

Authors: Da Tao, Tieyan Wang, Mingfu Qin

Abstract:

While social robots have been increasingly developed and rapidly applied in our daily life, how robots should interact with humans is still an urgent problem to be explored. Appropriate use of interactive behavior is likely to create a good user experience in human-robot interaction situations, which in turn can improve people’s acceptance of robots. This paper aimed to systematically and quantitatively examine the effects of several important situational factors (i.e., interaction distance, interaction posture, and feedback style) on user experience during human-robot interaction. A three-factor mixed designed experiment was adopted in this study, where subjects were asked to interact with a social robot in different interaction situations by combinations of varied interaction distance, interaction posture, and feedback style. A set of data on users’ behavioral performance, subjective perceptions, and eye movement measures were tracked and collected, and analyzed by repeated measures analysis of variance. The results showed that the three situational factors showed no effects on behavioral performance in tasks during human-robot interaction. Interaction distance and feedback style yielded significant main effects and interaction effects on the proportion of fixation times. The proportion of fixation times on the robot is higher for negative feedback compared with positive feedback style. While the proportion of fixation times on the robot generally decreased with the increase of the interaction distance, it decreased more under the positive feedback style than under the negative feedback style. In addition, there were significant interaction effects on pupil diameter between interaction distance and posture. As interaction distance increased, mean pupil diameter became smaller in side interaction, while it became larger in frontal interaction. Moreover, the three situation factors had significant interaction effects on user acceptance of the interaction mode. The findings are helpful in the underlying mechanism of user experience in human-robot interaction situations and provide important implications for the design of robot behavioral expression and for optimal strategies to improve user experience during human-robot interaction.

Keywords: social robots, human-robot interaction, interaction posture, interaction distance, feedback style, user experience

Procedia PDF Downloads 126
6353 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases

Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou

Abstract:

A large amount of data is typically stored in relational databases (DB). The latter can efficiently handle user queries which intend to elicit the appropriate information from data sources. However, direct access and use of this data requires the end users to have an adequate technical background, while they should also cope with the internal data structure and values presented. Consequently the information retrieval is a quite difficult process even for IT or DB experts, taking into account the limited contributions of relational databases from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and relations among them and hence they can be used for unambiguously specifying the information captured by the relational database. However, accessing information residing in a database using ontologies is feasible, provided that the users are keen on using semantic web technologies. For enabling users form different disciplines to retrieve the appropriate data, the design of a Graphical User Interface is necessary. In this work, we will present an interactive, ontology-based, semantically enable web tool that can be used for information retrieval purposes. The tool is totally based on the ontological representation of underlying database schema while it provides a user friendly environment through which the users can graphically form and execute their queries.

Keywords: ontologies, relational databases, SPARQL, web interface

Procedia PDF Downloads 270
6352 Tumor Size and Lymph Node Metastasis Detection in Colon Cancer Patients Using MR Images

Authors: Mohammadreza Hedyehzadeh, Mahdi Yousefi

Abstract:

Colon cancer is one of the most common cancer, which predicted to increase its prevalence due to the bad eating habits of peoples. Nowadays, due to the busyness of people, the use of fast foods is increasing, and therefore, diagnosis of this disease and its treatment are of particular importance. To determine the best treatment approach for each specific colon cancer patients, the oncologist should be known the stage of the tumor. The most common method to determine the tumor stage is TNM staging system. In this system, M indicates the presence of metastasis, N indicates the extent of spread to the lymph nodes, and T indicates the size of the tumor. It is clear that in order to determine all three of these parameters, an imaging method must be used, and the gold standard imaging protocols for this purpose are CT and PET/CT. In CT imaging, due to the use of X-rays, the risk of cancer and the absorbed dose of the patient is high, while in the PET/CT method, there is a lack of access to the device due to its high cost. Therefore, in this study, we aimed to estimate the tumor size and the extent of its spread to the lymph nodes using MR images. More than 1300 MR images collected from the TCIA portal, and in the first step (pre-processing), histogram equalization to improve image qualities and resizing to get the same image size was done. Two expert radiologists, which work more than 21 years on colon cancer cases, segmented the images and extracted the tumor region from the images. The next step is feature extraction from segmented images and then classify the data into three classes: T0N0، T3N1 و T3N2. In this article, the VGG-16 convolutional neural network has been used to perform both of the above-mentioned tasks, i.e., feature extraction and classification. This network has 13 convolution layers for feature extraction and three fully connected layers with the softmax activation function for classification. In order to validate the proposed method, the 10-fold cross validation method used in such a way that the data was randomly divided into three parts: training (70% of data), validation (10% of data) and the rest for testing. It is repeated 10 times, each time, the accuracy, sensitivity and specificity of the model are calculated and the average of ten repetitions is reported as the result. The accuracy, specificity and sensitivity of the proposed method for testing dataset was 89/09%, 95/8% and 96/4%. Compared to previous studies, using a safe imaging technique (MRI) and non-use of predefined hand-crafted imaging features to determine the stage of colon cancer patients are some of the study advantages.

Keywords: colon cancer, VGG-16, magnetic resonance imaging, tumor size, lymph node metastasis

Procedia PDF Downloads 53
6351 Measurement of IMRT Dose Distribution in Rando Head and Neck Phantom using EBT3 Film

Authors: Pegah Safavi, Mehdi Zehtabian, Mohammad Amin Mosleh-Shirazi

Abstract:

Cancer is one of the leading causes of death in the world. Radiation therapy is one of the main choices for cancer treatment. Intensity-modulated radiation therapy is a new type of radiation therapy technique available for vital structures such as the parathyroid glands. It is very important to check the accuracy of the delivered IMRT treatment because any mistake may lead to more complications for the patient. This paper describes an experiment to determine the accuracy of a dose measured by EBT3 film. To test this method, the EBT3 film on the head and neck of the Rando phantom was irradiated by an IMRT device and the irradiation was repeated twice. Finally, the dose designed by the irradiation system was compared with the dose measured by the EBT3 film. Using this criterion, the accuracy of the EBT3 film was evaluated. When using this criterion, a 95% agreement was reached between the planned treatment and the measured values.

Keywords: EBT3, phantom, accuracy, cancer, IMRT

Procedia PDF Downloads 145
6350 Classification for Obstructive Sleep Apnea Syndrome Based on Random Forest

Authors: Cheng-Yu Tsai, Wen-Te Liu, Shin-Mei Hsu, Yin-Tzu Lin, Chi Wu

Abstract:

Background: Obstructive Sleep apnea syndrome (OSAS) is a common respiratory disorder during sleep. In addition, Body parameters were identified high predictive importance for OSAS severity. However, the effects of body parameters on OSAS severity remain unclear. Objective: In this study, the objective is to establish a prediction model for OSAS by using body parameters and investigate the effects of body parameters in OSAS. Methodologies: Severity was quantified as the polysomnography and the mean hourly number of greater than 3% dips in oxygen saturation during examination in a hospital in New Taipei City (Taiwan). Four levels of OSAS severity were classified by the apnea and hypopnea index (AHI) with American Academy of Sleep Medicine (AASM) guideline. Body parameters, including neck circumference, waist size, and body mass index (BMI) were obtained from questionnaire. Next, dividing the collecting subjects into two groups: training and testing groups. The training group was used to establish the random forest (RF) to predicting, and test group was used to evaluated the accuracy of classification. Results: There were 3330 subjects recruited in this study, whom had been done polysomnography for evaluating severity for OSAS. A RF of 1000 trees achieved correctly classified 79.94 % of test cases. When further evaluated on the test cohort, RF showed the waist and BMI as the high import factors in OSAS. Conclusion It is possible to provide patient with prescreening by body parameters which can pre-evaluate the health risks.

Keywords: apnea and hypopnea index, Body parameters, obstructive sleep apnea syndrome, Random Forest

Procedia PDF Downloads 149
6349 Radar on Bike: Coarse Classification based on Multi-Level Clustering for Cyclist Safety Enhancement

Authors: Asma Omri, Noureddine Benothman, Sofiane Sayahi, Fethi Tlili, Hichem Besbes

Abstract:

Cycling, a popular mode of transportation, can also be perilous due to cyclists' vulnerability to collisions with vehicles and obstacles. This paper presents an innovative cyclist safety system based on radar technology designed to offer real-time collision risk warnings to cyclists. The system incorporates a low-power radar sensor affixed to the bicycle and connected to a microcontroller. It leverages radar point cloud detections, a clustering algorithm, and a supervised classifier. These algorithms are optimized for efficiency to run on the TI’s AWR 1843 BOOST radar, utilizing a coarse classification approach distinguishing between cars, trucks, two-wheeled vehicles, and other objects. To enhance the performance of clustering techniques, we propose a 2-Level clustering approach. This approach builds on the state-of-the-art Density-based spatial clustering of applications with noise (DBSCAN). The objective is to first cluster objects based on their velocity, then refine the analysis by clustering based on position. The initial level identifies groups of objects with similar velocities and movement patterns. The subsequent level refines the analysis by considering the spatial distribution of these objects. The clusters obtained from the first level serve as input for the second level of clustering. Our proposed technique surpasses the classical DBSCAN algorithm in terms of geometrical metrics, including homogeneity, completeness, and V-score. Relevant cluster features are extracted and utilized to classify objects using an SVM classifier. Potential obstacles are identified based on their velocity and proximity to the cyclist. To optimize the system, we used the View of Delft dataset for hyperparameter selection and SVM classifier training. The system's performance was assessed using our collected dataset of radar point clouds synchronized with a camera on an Nvidia Jetson Nano board. The radar-based cyclist safety system is a practical solution that can be easily installed on any bicycle and connected to smartphones or other devices, offering real-time feedback and navigation assistance to cyclists. We conducted experiments to validate the system's feasibility, achieving an impressive 85% accuracy in the classification task. This system has the potential to significantly reduce the number of accidents involving cyclists and enhance their safety on the road.

Keywords: 2-level clustering, coarse classification, cyclist safety, warning system based on radar technology

Procedia PDF Downloads 75
6348 Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data

Authors: Gayathri Nagarajan, L. D. Dhinesh Babu

Abstract:

Health care is one of the prominent industries that generate voluminous data thereby finding the need of machine learning techniques with big data solutions for efficient processing and prediction. Missing data, incomplete data, real time streaming data, sensitive data, privacy, heterogeneity are few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are related to the human life directly. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks are proved to be effective for different applications largely depending on the characteristics of the datasets. In this paper, we present a framework that uses ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on Spark platform which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metric chosen for comparison is misclassification error rate and the run time of the algorithms. The goal of this paper is to i) Compare the performance of gradient boosted trees with other machine learning techniques in Spark platform specifically for health care big data and ii) Discuss the results from the experiments conducted on datasets of different characteristics thereby drawing inference and conclusion. The experimental results show that the accuracy is largely dependent on the characteristics of the datasets for other machine learning techniques whereas gradient boosting trees yields reasonably stable results in terms of accuracy without largely depending on the dataset characteristics.

Keywords: big data analytics, ensemble machine learning, gradient boosted trees, Spark platform

Procedia PDF Downloads 234
6347 One-Class Classification Approach Using Fukunaga-Koontz Transform and Selective Multiple Kernel Learning

Authors: Abdullah Bal

Abstract:

This paper presents a one-class classification (OCC) technique based on Fukunaga-Koontz Transform (FKT) for binary classification problems. The FKT is originally a powerful tool to feature selection and ordering for two-class problems. To utilize the standard FKT for data domain description problem (i.e., one-class classification), in this paper, a set of non-class samples which exist outside of positive class (target class) describing boundary formed with limited training data has been constructed synthetically. The tunnel-like decision boundary around upper and lower border of target class samples has been designed using statistical properties of feature vectors belonging to the training data. To capture higher order of statistics of data and increase discrimination ability, the proposed method, termed one-class FKT (OC-FKT), has been extended to its nonlinear version via kernel machines and referred as OC-KFKT for short. Multiple kernel learning (MKL) is a favorable family of machine learning such that tries to find an optimal combination of a set of sub-kernels to achieve a better result. However, the discriminative ability of some of the base kernels may be low and the OC-KFKT designed by this type of kernels leads to unsatisfactory classification performance. To address this problem, the quality of sub-kernels should be evaluated, and the weak kernels must be discarded before the final decision making process. MKL/OC-FKT and selective MKL/OC-FKT frameworks have been designed stimulated by ensemble learning (EL) to weight and then select the sub-classifiers using the discriminability and diversities measured by eigenvalue ratios. The eigenvalue ratios have been assessed based on their regions on the FKT subspaces. The comparative experiments, performed on various low and high dimensional data, against state-of-the-art algorithms confirm the effectiveness of our techniques, especially in case of small sample size (SSS) conditions.

Keywords: ensemble methods, fukunaga-koontz transform, kernel-based methods, multiple kernel learning, one-class classification

Procedia PDF Downloads 14
6346 Unveiling Comorbidities in Irritable Bowel Syndrome: A UK BioBank Study utilizing Supervised Machine Learning

Authors: Uswah Ahmad Khan, Muhammad Moazam Fraz, Humayoon Shafique Satti, Qasim Aziz

Abstract:

Approximately 10-14% of the global population experiences a functional disorder known as irritable bowel syndrome (IBS). The disorder is defined by persistent abdominal pain and an irregular bowel pattern. IBS significantly impairs work productivity and disrupts patients' daily lives and activities. Although IBS is widespread, there is still an incomplete understanding of its underlying pathophysiology. This study aims to help characterize the phenotype of IBS patients by differentiating the comorbidities found in IBS patients from those in non-IBS patients using machine learning algorithms. In this study, we extracted samples coding for IBS from the UK BioBank cohort and randomly selected patients without a code for IBS to create a total sample size of 18,000. We selected the codes for comorbidities of these cases from 2 years before and after their IBS diagnosis and compared them to the comorbidities in the non-IBS cohort. Machine learning models, including Decision Trees, Gradient Boosting, Support Vector Machine (SVM), AdaBoost, Logistic Regression, and XGBoost, were employed to assess their accuracy in predicting IBS. The most accurate model was then chosen to identify the features associated with IBS. In our case, we used XGBoost feature importance as a feature selection method. We applied different models to the top 10% of features, which numbered 50. Gradient Boosting, Logistic Regression and XGBoost algorithms yielded a diagnosis of IBS with an optimal accuracy of 71.08%, 71.427%, and 71.53%, respectively. Among the comorbidities most closely associated with IBS included gut diseases (Haemorrhoids, diverticular diseases), atopic conditions(asthma), and psychiatric comorbidities (depressive episodes or disorder, anxiety). This finding emphasizes the need for a comprehensive approach when evaluating the phenotype of IBS, suggesting the possibility of identifying new subsets of IBS rather than relying solely on the conventional classification based on stool type. Additionally, our study demonstrates the potential of machine learning algorithms in predicting the development of IBS based on comorbidities, which may enhance diagnosis and facilitate better management of modifiable risk factors for IBS. Further research is necessary to confirm our findings and establish cause and effect. Alternative feature selection methods and even larger and more diverse datasets may lead to more accurate classification models. Despite these limitations, our findings highlight the effectiveness of Logistic Regression and XGBoost in predicting IBS diagnosis.

Keywords: comorbidities, disease association, irritable bowel syndrome (IBS), predictive analytics

Procedia PDF Downloads 113