Search results for: classification tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2861

Search results for: classification tree

2321 Dual-Channel Reliable Breast Ultrasound Image Classification Based on Explainable Attribution and Uncertainty Quantification

Authors: Haonan Hu, Shuge Lei, Dasheng Sun, Huabin Zhang, Kehong Yuan, Jian Dai, Jijun Tang

Abstract:

This paper focuses on the classification task of breast ultrasound images and conducts research on the reliability measurement of classification results. A dual-channel evaluation framework was developed based on the proposed inference reliability and predictive reliability scores. For the inference reliability evaluation, human-aligned and doctor-agreed inference rationals based on the improved feature attribution algorithm SP-RISA are gracefully applied. Uncertainty quantification is used to evaluate the predictive reliability via the test time enhancement. The effectiveness of this reliability evaluation framework has been verified on the breast ultrasound clinical dataset YBUS, and its robustness is verified on the public dataset BUSI. The expected calibration errors on both datasets are significantly lower than traditional evaluation methods, which proves the effectiveness of the proposed reliability measurement.

Keywords: medical imaging, ultrasound imaging, XAI, uncertainty measurement, trustworthy AI

Procedia PDF Downloads 78
2320 A Multi-Output Network with U-Net Enhanced Class Activation Map and Robust Classification Performance for Medical Imaging Analysis

Authors: Jaiden Xuan Schraut, Leon Liu, Yiqiao Yin

Abstract:

Computer vision in medical diagnosis has achieved a high level of success in diagnosing diseases with high accuracy. However, conventional classifiers that produce an image to-label result provides insufficient information for medical professionals to judge and raise concerns over the trust and reliability of a model with results that cannot be explained. In order to gain local insight into cancerous regions, separate tasks such as imaging segmentation need to be implemented to aid the doctors in treating patients, which doubles the training time and costs which renders the diagnosis system inefficient and difficult to be accepted by the public. To tackle this issue and drive AI-first medical solutions further, this paper proposes a multi-output network that follows a U-Net architecture for image segmentation output and features an additional convolutional neural networks (CNN) module for auxiliary classification output. Class activation maps are a method of providing insight into a convolutional neural network’s feature maps that leads to its classification but in the case of lung diseases, the region of interest is enhanced by U-net-assisted Class Activation Map (CAM) visualization. Therefore, our proposed model combines image segmentation models and classifiers to crop out only the lung region of a chest X-ray’s class activation map to provide a visualization that improves the explainability and is able to generate classification results simultaneously which builds trust for AI-led diagnosis systems. The proposed U-Net model achieves 97.61% accuracy and a dice coefficient of 0.97 on testing data from the COVID-QU-Ex Dataset which includes both diseased and healthy lungs.

Keywords: multi-output network model, U-net, class activation map, image classification, medical imaging analysis

Procedia PDF Downloads 182
2319 Bayesian System and Copula for Event Detection and Summarization of Soccer Videos

Authors: Dhanuja S. Patil, Sanjay B. Waykar

Abstract:

Event detection is a standout amongst the most key parts for distinctive sorts of area applications of video data framework. Recently, it has picked up an extensive interest of experts and in scholastics from different zones. While detecting video event has been the subject of broad study efforts recently, impressively less existing methodology has considered multi-model data and issues related efficiency. Start of soccer matches different doubtful circumstances rise that can't be effectively judged by the referee committee. A framework that checks objectively image arrangements would prevent not right interpretations because of some errors, or high velocity of the events. Bayesian networks give a structure for dealing with this vulnerability using an essential graphical structure likewise the probability analytics. We propose an efficient structure for analysing and summarization of soccer videos utilizing object-based features. The proposed work utilizes the t-cherry junction tree, an exceptionally recent advancement in probabilistic graphical models, to create a compact representation and great approximation intractable model for client’s relationships in an interpersonal organization. There are various advantages in this approach firstly; the t-cherry gives best approximation by means of junction trees class. Secondly, to construct a t-cherry junction tree can be to a great extent parallelized; and at last inference can be performed utilizing distributed computation. Examination results demonstrates the effectiveness, adequacy, and the strength of the proposed work which is shown over a far reaching information set, comprising more soccer feature, caught at better places.

Keywords: summarization, detection, Bayesian network, t-cherry tree

Procedia PDF Downloads 306
2318 Fungi Associated with Decline of Kikar (Acacia nilotica) and Red River Gum (Eucalyptus camaldulensis) in Faisalabad

Authors: I. Ahmad, A. Hannan, S. Ahmad, M. Asif, M. F. Nawaz, M. A. Tanvir, M. F. Azhar

Abstract:

During this research, a comprehensive survey of tree growing areas of Faisalabad district of Pakistan was conducted to observe the symptoms, spectrum, occurrence and severity of A. nilotica and E. camaldulensis decline. Objective of current research was to investigate specific fungal pathogens involved in decline of A. nilotica and E. camaldulensis. For this purpose, infected roots, bark, neck portion, stem, branches, leaves and infected soils were collected to identify associated fungi. Potato dextrose agar (PDA) and Czepak dox agar media were used for isolations. Identification of isolated fungi was done microscopically and different fungi were identified. During survey of urban locations of Faisalabad, disease incidence on Kikar and Eucalyptus was recorded as 3.9-7.9% and 2.6-7.1% respectively. Survey of Agroforest zones of Faisalabad revealed decline incidence on kikar 7.5% from Sargodha road while on Satiana and Jhang road it was not planted. In eucalyptus trees, 4%, 8% and 0% disease incidence was observed on Jhang road, Sargodha road and Satiana road respectively. The maximum fungus isolated from the kikar tree was Drechslera australiensis (5.00%) from the stem part. Aspergillus flavus also gave the maximum value of (3.05%) from the bark. Alternaria alternata gave the maximum value of (2.05%) from leaves. Rhizopus and Mucor spp. were recorded minimum as compared to the Drechslera, Alternaria and Aspergillus. The maximum fungus isolated from the Eucalyptus tree was Armillaria luteobubalina (5.00%) from the stem part. The other fungi isolated were Macrophamina phaseolina and A. niger.

Keywords: decline, frequency of mycoflora, A. nilotica and E. camaldulensis, Drechslera australiensis, Armillaria luteobubalina

Procedia PDF Downloads 356
2317 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 156
2316 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 142
2315 Chemical Composition and Nutritional Value of Leaves and Pods of Leucaena Leucocephala, Prosopis Laevigata and Acacia Farnesiana in a Xerophyllous Shrubland

Authors: Miguel Mellado, Cecilia Zapata

Abstract:

Goats can be exploited in harsh environments due to their capacity to adjust to limited quantity and quality forage sources. In these environments, leguminous trees can be used as supplementary feeds as foliage and fruits of these trees can contribute to maintain or improve production efficiency in ruminants. The objective of this study was to determine the nutritional value of three leguminous trees heavily selected by goats in a xerophyllous shrubland. Chemical composition and in vitro dry matter disappearance (IVDMD) of leaves and pods from leucaena (Leucaena leucocephala), mesquite (Prosopis laevigata) and huisache (Acacia farnesiana) is presented. Crude protein (CP) ranged from 17.3% for leaves of huisache to 21.9% for leucaena. The neutral detergent fiber (NDF) content ranged from 39.0 to 40.3 with no difference among fodder threes. Across tree species, mean IVDMD was 61.6% for pods and 52.2% for leaves. IVDMD for leaves was highest (P < 0.01) for leucaena (54.9%) and lowest for huisache (47.3%). Condensed tannins in an acetonic extract were highest for leaves of huisache (45.3 mg CE/g DM) and lowest for mesquite (25.9 mg CE/g DM). Pods and leaves of huisache presented the highest number of secondary metabolites, mainly related to hydrobenzoic acid and flavonols; leucaena and mesquite presented mainly flavonols and anthocyanins. It was concluded that leaves and pods of leucaena, mesquite and huisache constitute valuable forages for ruminant livestock due to its low fiber, high CP levels, moderate in vitro fermentation characteristics and high mineral content. Keywords: Fodder tree; ruminants; secondary metabolites; minerals; tannins

Keywords: fodder tree, ruminants, secondary metabolites, minerals, tannins

Procedia PDF Downloads 132
2314 Classification of EEG Signals Based on Dynamic Connectivity Analysis

Authors: Zoran Šverko, Saša Vlahinić, Nino Stojković, Ivan Markovinović

Abstract:

In this article, the classification of target letters is performed using data from the EEG P300 Speller paradigm. Neural networks trained with the results of dynamic connectivity analysis between different brain regions are used for classification. Dynamic connectivity analysis is based on the adaptive window size and the imaginary part of the complex Pearson correlation coefficient. Brain dynamics are analysed using the relative intersection of confidence intervals for the imaginary component of the complex Pearson correlation coefficient method (RICI-imCPCC). The RICI-imCPCC method overcomes the shortcomings of currently used dynamical connectivity analysis methods, such as the low reliability and low temporal precision for short connectivity intervals encountered in constant sliding window analysis with wide window size and the high susceptibility to noise encountered in constant sliding window analysis with narrow window size. This method overcomes these shortcomings by dynamically adjusting the window size using the RICI rule. This method extracts information about brain connections for each time sample. Seventy percent of the extracted brain connectivity information is used for training and thirty percent for validation. Classification of the target word is also done and based on the same analysis method. As far as we know, through this research, we have shown for the first time that dynamic connectivity can be used as a parameter for classifying EEG signals.

Keywords: dynamic connectivity analysis, EEG, neural networks, Pearson correlation coefficients

Procedia PDF Downloads 197
2313 Accuracy Analysis of the American Society of Anesthesiologists Classification Using ChatGPT

Authors: Jae Ni Jang, Young Uk Kim

Abstract:

Background: Chat Generative Pre-training Transformer-3 (ChatGPT; San Francisco, California, Open Artificial Intelligence) is an artificial intelligence chatbot based on a large language model designed to generate human-like text. As the usage of ChatGPT is increasing among less knowledgeable patients, medical students, and anesthesia and pain medicine residents or trainees, we aimed to evaluate the accuracy of ChatGPT-3 responses to questions about the American Society of Anesthesiologists (ASA) classification based on patients’ underlying diseases and assess the quality of the generated responses. Methods: A total of 47 questions were submitted to ChatGPT using textual prompts. The questions were designed for ChatGPT-3 to provide answers regarding ASA classification in response to common underlying diseases frequently observed in adult patients. In addition, we created 18 questions regarding the ASA classification for pediatric patients and pregnant women. The accuracy of ChatGPT’s responses was evaluated by cross-referencing with Miller’s Anesthesia, Morgan & Mikhail’s Clinical Anesthesiology, and the American Society of Anesthesiologists’ ASA Physical Status Classification System (2020). Results: Out of the 47 questions pertaining to adults, ChatGPT -3 provided correct answers for only 23, resulting in an accuracy rate of 48.9%. Furthermore, the responses provided by ChatGPT-3 regarding children and pregnant women were mostly inaccurate, as indicated by a 28% accuracy rate (5 out of 18). Conclusions: ChatGPT provided correct responses to questions relevant to the daily clinical routine of anesthesiologists in approximately half of the cases, while the remaining responses contained errors. Therefore, caution is advised when using ChatGPT to retrieve anesthesia-related information. Although ChatGPT may not yet be suitable for clinical settings, we anticipate significant improvements in ChatGPT and other large language models in the near future. Regular assessments of ChatGPT's ASA classification accuracy are essential due to the evolving nature of ChatGPT as an artificial intelligence entity. This is especially important because ChatGPT has a clinically unacceptable rate of error and hallucination, particularly in pediatric patients and pregnant women. The methodology established in this study may be used to continue evaluating ChatGPT.

Keywords: American Society of Anesthesiologists, artificial intelligence, Chat Generative Pre-training Transformer-3, ChatGPT

Procedia PDF Downloads 30
2312 Assessing the Effectiveness of Machine Learning Algorithms for Cyber Threat Intelligence Discovery from the Darknet

Authors: Azene Zenebe

Abstract:

Deep learning is a subset of machine learning which incorporates techniques for the construction of artificial neural networks and found to be useful for modeling complex problems with large dataset. Deep learning requires a very high power computational and longer time for training. By aggregating computing power, high performance computer (HPC) has emerged as an approach to resolving advanced problems and performing data-driven research activities. Cyber threat intelligence (CIT) is actionable information or insight an organization or individual uses to understand the threats that have, will, or are currently targeting the organization. Results of review of literature will be presented along with results of experimental study that compares the performance of tree-based and function-base machine learning including deep learning algorithms using secondary dataset collected from darknet.

Keywords: deep-learning, cyber security, cyber threat modeling, tree-based machine learning, function-based machine learning, data science

Procedia PDF Downloads 140
2311 Vegetation Assessment Under the Influence of Environmental Variables; A Case Study from the Yakhtangay Hill of Himalayan Range, Pakistan

Authors: Hameed Ullah, Shujaul Mulk Khan, Zahid Ullah, Zeeshan Ahmad Sadia Jahangir, Abdullah, Amin Ur Rahman, Muhammad Suliman, Dost Muhammad

Abstract:

The interrelationship between vegetation and abiotic variables inside an ecosystem is one of the main jobs of plant scientists. This study was designed to investigate the vegetation structure and species diversity along with the environmental variables in the Yakhtangay hill district Shangla of the Himalayan Mountain series Pakistan by using multivariate statistical analysis. Quadrat’s method was used and a total of 171 Quadrats were laid down 57 for Tree, Shrubs and Herbs, respectively, to analyze the phytosociological attributes of the vegetation. The vegetation of the selected area was classified into different Life and leaf-forms according to Raunkiaer classification, while PCORD software version 5 was used to classify the vegetation into different plants communities by Two-way indicator species Analysis (TWINSPAN). The CANOCCO version 4.5 was used for DCA and CCA analysis to find out variation directories of vegetation with different environmental variables. A total of 114 plants species belonging to 45 different families was investigated inside the area. The Rosaceae (12 species) was the dominant family followed by Poaceae (10 species) and then Asteraceae (7 species). Monocots were more dominant than Dicots and Angiosperms were more dominant than Gymnosperms. Among the life forms the Hemicryptophytes and Nanophanerophytes were dominant, followed by Therophytes, while among the leaf forms Microphylls were dominant, followed by Leptophylls. It is concluded that among the edaphic factors such as soil pH, the concentration of soil organic matter, Calcium Carbonates concentration in soil, soil EC, soil TDS, and physiographic factors such as Altitude and slope are affecting the structure of vegetation, species composition and species diversity at the significant level with p-value ≤0.05. The Vegetation of the selected area was classified into four major plants communities and the indicator species for each community was recorded. Classification of plants into 4 different communities based upon edaphic gradients favors the individualistic hypothesis. Indicator Species Analysis (ISA) shows the indicators of the study area are mostly indicators to the Himalayan or moist temperate ecosystem, furthermore, these indicators could be considered for micro-habitat conservation and respective ecosystem management plans.

Keywords: species richness, edaphic gradients, canonical correspondence analysis (CCA), TWCA

Procedia PDF Downloads 139
2310 The Impact on the Composition of Survey Refusals΄ Demographic Profile When Implementing Different Classifications

Authors: Eva Tsouparopoulou, Maria Symeonaki

Abstract:

The internationally documented declining survey response rates of the last two decades are mainly attributed to refusals. In fieldwork, a refusal may be obtained not only from the respondent himself/herself, but from other sources on the respondent’s behalf, such as other household members, apartment building residents or administrator(s), and neighborhood residents. In this paper, we investigate how the composition of the demographic profile of survey refusals changes when different classifications are implemented and the classification issues arising from that. The analysis is based on the 2002-2018 European Social Survey (ESS) datasets for Belgium, Germany, and United Kingdom. For these three countries, the size of selected sample units coded as a type of refusal for all nine under investigation rounds was large enough to meet the purposes of the analysis. The results indicate the existence of four different possible classifications that can be implemented and the significance of choosing the one that strengthens the contrasts of the different types of respondents' demographic profiles. Since the foundation of social quantitative research lies in the triptych of definition, classification, and measurement, this study aims to identify the multiplicity of the definition of survey refusals as a methodological tool for the continually growing research on non-response.

Keywords: non-response, refusals, European social survey, classification

Procedia PDF Downloads 74
2309 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on expert knowledge of trained individuals. Image classification in agriculture problems historically has been based on classical machine learning strategies that make use of hand-engineered features in the top of a classification algorithm. This approach tends to not produce results with high accuracy and generalization to the classes classified by the system when the nature of the elements has a significant variability. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have great resourcefulness of learning and have been applied successfully to image classification and object detection tasks in the last years. The objective of this work was to propose a new method based on deep learning convolutional neural networks towards the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease levels presence were produced, in collaboration with agronomists, for the algorithm classification. Disease level tasks performed by experts provided ground truth data for the disease score of the same winter wheat plots were RGB images were acquired. The system had an overall accuracy of 84% on the discrimination of the disease level classes.

Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks

Procedia PDF Downloads 311
2308 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

Due to the rapid development of the Internet and the increased availability of digital documents, the excessive information on the Internet has led to information overflow problem. In order to solve these problems for effective information retrieval, document clustering in text mining becomes a popular research topic. Clustering is the unsupervised classification of data items into groups without the need of training data. Many conventional document clustering methods perform inefficiently for large document collections because they were originally designed for relational database. Therefore they are impractical in real-world document clustering and require special handling for high dimensionality and high volume. We propose the FIHC (Frequent Itemset-based Hierarchical Clustering) method, which is a hierarchical clustering method developed for document clustering, where the intuition of FIHC is that there exist some common words for each cluster. FIHC uses such words to cluster documents and builds hierarchical topic tree. In this paper, we combine FIHC algorithm with ontology to solve the semantic problem and mine the meaning behind the words in documents. Furthermore, we use the closed frequent itemsets instead of only use frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than those of well-known document clustering algorithms.

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

Procedia PDF Downloads 381
2307 Population Dynamics and Land Use/Land Cover Change on the Chilalo-Galama Mountain Range, Ethiopia

Authors: Yusuf Jundi Sado

Abstract:

Changes in land use are mostly credited to human actions that result in negative impacts on biodiversity and ecosystem functions. This study aims to analyze the dynamics of land use and land cover changes for sustainable natural resources planning and management. Chilalo-Galama Mountain Range, Ethiopia. This study used Thematic Mapper 05 (TM) for 1986, 2001 and Landsat 8 (OLI) data 2017. Additionally, data from the Central Statistics Agency on human population growth were analyzed. Semi-Automatic classification plugin (SCP) in QGIS 3.2.3 software was used for image classification. Global positioning system, field observations and focus group discussions were used for ground verification. Land Use Land Cover (LU/LC) change analysis was using maximum likelihood supervised classification and changes were calculated for the 1986–2001 and the 2001–2017 and 1986-2017 periods. The results show that agricultural land increased from 27.85% (1986) to 44.43% and 51.32% in 2001 and 2017, respectively with the overall accuracies of 92% (1986), 90.36% (2001), and 88% (2017). On the other hand, forests decreased from 8.51% (1986) to 7.64 (2001) and 4.46% (2017), and grassland decreased from 37.47% (1986) to 15.22%, and 15.01% in 2001 and 2017, respectively. It indicates for the years 1986–2017 the largest area cover gain of agricultural land was obtained from grassland. The matrix also shows that shrubland gained land from agricultural land, afro-alpine, and forest land. Population dynamics is found to be one of the major driving forces for the LU/LU changes in the study area.

Keywords: Landsat, LU/LC change, Semi-Automatic classification plugin, population dynamics, Ethiopia

Procedia PDF Downloads 71
2306 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper demonstrates analysis of the cervical cancer based on a probabilistic model. It involves technique for classification and prediction by recognizing typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrences in no recurrence (first time detection) cases. The combination of the conventional statistical and machine learning tools is applied for the analysis. Experimental study with real data demonstrates the feasibility and potential of the proposed approach for the said cause.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 350
2305 Structured Access Control Mechanism for Mesh-based P2P Live Streaming Systems

Authors: Chuan-Ching Sue, Kai-Chun Chuang

Abstract:

Peer-to-Peer (P2P) live streaming systems still suffer a challenge when thousands of new peers want to join into the system in a short time, called flash crowd, and most of new peers suffer long start-up delay. Recent studies have proposed a slot-based user access control mechanism, which periodically determines a certain number of new peers to enter the system, and a user batch join mechanism, which divides new peers into several tree structures with fixed tree size. However, the slot-based user access control mechanism is difficult for accurately determining the optimal time slot length, and the user batch join mechanism is hard for determining the optimal tree size. In this paper, we propose a structured access control (SAC) mechanism, which constructs new peers to a multi-layer mesh structure. The SAC mechanism constructs new peer connections layer by layer to replace periodical access control, and determines the number of peers in each layer according to the system’s remaining upload bandwidth and average video rate. Furthermore, we propose an analytical model to represent the behavior of the system growth if the system can utilize the upload bandwidth efficiently. The analytical result has shown the similar trend in system growth as the SAC mechanism. Additionally, the extensive simulation is conducted to show the SAC mechanism outperforms two previously proposed methods in terms of system growth and start-up delay.

Keywords: peer-to-peer, live video streaming system, flash crowd, start-up delay, access control

Procedia PDF Downloads 307
2304 Immature Palm Tree Detection Using Morphological Filter for Palm Counting with High Resolution Satellite Image

Authors: Nur Nadhirah Rusyda Rosnan, Nursuhaili Najwa Masrol, Nurul Fatiha MD Nor, Mohammad Zafrullah Mohammad Salim, Sim Choon Cheak

Abstract:

Accurate inventories of oil palm planted areas are crucial for plantation management as this would impact the overall economy and production of oil. One of the technological advancements in the oil palm industry is semi-automated palm counting, which is replacing conventional manual palm counting via digitizing aerial imagery. Most of the semi-automated palm counting method that has been developed was limited to mature palms due to their ideal canopy size represented by satellite image. Therefore, immature palms were often left out since the size of the canopy is barely visible from satellite images. In this paper, an approach using a morphological filter and high-resolution satellite image is proposed to detect immature palm trees. This approach makes it possible to count the number of immature oil palm trees. The method begins with an erosion filter with an appropriate window size of 3m onto the high-resolution satellite image. The eroded image was further segmented using watershed segmentation to delineate immature palm tree regions. Then, local minimum detection was used because it is hypothesized that immature oil palm trees are located at the local minimum within an oil palm field setting in a grayscale image. The detection points generated from the local minimum are displaced to the center of the immature oil palm region and thinned. Only one detection point is left that represents a tree. The performance of the proposed method was evaluated on three subsets with slopes ranging from 0 to 20° and different planting designs, i.e., straight and terrace. The proposed method was able to achieve up to more than 90% accuracy when compared with the ground truth, with an overall F-measure score of up to 0.91.

Keywords: immature palm count, oil palm, precision agriculture, remote sensing

Procedia PDF Downloads 63
2303 Phylogenetic Relationships of Common Reef Fish Species in Vietnam

Authors: Dang Thuy Binh, Truong Thi Oanh, Le Phan Khanh Hung, Luong thi Tuong Vy

Abstract:

One of the greatest environmental challenges facing Asia is the management and conservation of the marine biodiversity threaten by fisheries overexploitation, pollution, habitat destruction, and climate change. To date, a few molecular taxonomical studies has been conducted on marine fauna in Vietnam. The purpose of this study was to clarify the phylogeny of economic and ecological reef fish species in Vietnam Reef fish species covering Labridae, Scaridae, Nemipteridae, Serranidae, Acanthuridae, Lutjanidae, Lethrinidae, Mullidae, Balistidae, Pseudochromidae, Pinguipedidae, Fistulariidae, Holocentridae, Synodontidae, and Pomacentridae representing 28 genera were collected from South and Center, Vietnam. Combine with Genbank sequences, a phylogenetic tree was constructed based on 16S gene of mitochondrial DNA using maximum parsimony, maximum likelihood, and Bayesian inference approaches. The phylogram showed the well-resolved clades at genus and family level. Perciformes is the major order of reef fish species in Vietnam. The monophyly of Perciformes is not strongly supported as it was clustered in the same clade with Tetraodontiformes syngnathiformes and Beryciformes. Continue sampling of commercial fish species and classification based on morphology and genetics to build DNA barcoding of fish species in Vietnam is really necessary.

Keywords: reef fish, 16s rDNA, Vietnam, phylogeny

Procedia PDF Downloads 429
2302 A Machine Learning Approach for Classification of Directional Valve Leakage in the Hydraulic Final Test

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Due to increasing cost pressure in global markets, artificial intelligence is becoming a technology that is decisive for competition. Predictive quality enables machinery and plant manufacturers to ensure product quality by using data-driven forecasts via machine learning models as a decision-making basis for test results. The use of cross-process Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the quality characteristics of workpieces.

Keywords: predictive quality, hydraulics, machine learning, classification, supervised learning

Procedia PDF Downloads 219
2301 Time-Frequency Feature Extraction Method Based on Micro-Doppler Signature of Ground Moving Targets

Authors: Ke Ren, Huiruo Shi, Linsen Li, Baoshuai Wang, Yu Zhou

Abstract:

Since some discriminative features are required for ground moving targets classification, we propose a new feature extraction method based on micro-Doppler signature. Firstly, the time-frequency analysis of measured data indicates that the time-frequency spectrograms of the three kinds of ground moving targets, i.e., single walking person, two people walking and a moving wheeled vehicle, are discriminative. Then, a three-dimensional time-frequency feature vector is extracted from the time-frequency spectrograms to depict these differences. At last, a Support Vector Machine (SVM) classifier is trained with the proposed three-dimensional feature vector. The classification accuracy to categorize ground moving targets into the three kinds of the measured data is found to be over 96%, which demonstrates the good discriminative ability of the proposed micro-Doppler feature.

Keywords: micro-doppler, time-frequency analysis, feature extraction, radar target classification

Procedia PDF Downloads 398
2300 Clustering the Wheat Seeds Using SOM Artificial Neural Networks

Authors: Salah Ghamari

Abstract:

In this study, the ability of self organizing map artificial (SOM) neural networks in clustering the wheat seeds varieties according to morphological properties of them was considered. The SOM is one type of unsupervised competitive learning. Experimentally, five morphological features of 300 seeds (including three varieties: gaskozhen, Md and sardari) were obtained using image processing technique. The results show that the artificial neural network has a good performance (90.33% accuracy) in classification of the wheat varieties despite of high similarity in them. The highest classification accuracy (100%) was achieved for sardari.

Keywords: artificial neural networks, clustering, self organizing map, wheat variety

Procedia PDF Downloads 635
2299 SEM Image Classification Using CNN Architectures

Authors: Güzi̇n Ti̇rkeş, Özge Teki̇n, Kerem Kurtuluş, Y. Yekta Yurtseven, Murat Baran

Abstract:

A scanning electron microscope (SEM) is a type of electron microscope mainly used in nanoscience and nanotechnology areas. Automatic image recognition and classification are among the general areas of application concerning SEM. In line with these usages, the present paper proposes a deep learning algorithm that classifies SEM images into nine categories by means of an online application to simplify the process. The NFFA-EUROPE - 100% SEM data set, containing approximately 21,000 images, was used to train and test the algorithm at 80% and 20%, respectively. Validation was carried out using a separate data set obtained from the Middle East Technical University (METU) in Turkey. To increase the accuracy in the results, the Inception ResNet-V2 model was used in view of the Fine-Tuning approach. By using a confusion matrix, it was observed that the coated-surface category has a negative effect on the accuracy of the results since it contains other categories in the data set, thereby confusing the model when detecting category-specific patterns. For this reason, the coated-surface category was removed from the train data set, hence increasing accuracy by up to 96.5%.

Keywords: convolutional neural networks, deep learning, image classification, scanning electron microscope

Procedia PDF Downloads 111
2298 Mixed Integer Programming-Based One-Class Classification Method for Process Monitoring

Authors: Younghoon Kim, Seoung Bum Kim

Abstract:

One-class classification plays an important role in detecting outlier and abnormality from normal observations. In the previous research, several attempts were made to extend the scope of application of the one-class classification techniques to statistical process control problems. For most previous approaches, such as support vector data description (SVDD) control chart, the design of the control limits is commonly based on the assumption that the proportion of abnormal observations is approximately equal to an expected Type I error rate in Phase I process. Because of the limitation of the one-class classification techniques based on convex optimization, we cannot make the proportion of abnormal observations exactly equal to expected Type I error rate: controlling Type I error rate requires to optimize constraints with integer decision variables, but convex optimization cannot satisfy the requirement. This limitation would be undesirable in theoretical and practical perspective to construct effective control charts. In this work, to address the limitation of previous approaches, we propose the one-class classification algorithm based on the mixed integer programming technique, which can solve problems formulated with continuous and integer decision variables. The proposed method minimizes the radius of a spherically shaped boundary subject to the number of normal data to be equal to a constant value specified by users. By modifying this constant value, users can exactly control the proportion of normal data described by the spherically shaped boundary. Thus, the proportion of abnormal observations can be made theoretically equal to an expected Type I error rate in Phase I process. Moreover, analogous to SVDD, the boundary can be made to describe complex structures by using some kernel functions. New multivariate control chart applying the effectiveness of the algorithm is proposed. This chart uses a monitoring statistic to characterize the degree of being an abnormal point as obtained through the proposed one-class classification. The control limit of the proposed chart is established by the radius of the boundary. The usefulness of the proposed method was demonstrated through experiments with simulated and real process data from a thin film transistor-liquid crystal display.

Keywords: control chart, mixed integer programming, one-class classification, support vector data description

Procedia PDF Downloads 165
2297 Tea (Camellia sinensis (L.) O. Kuntze) Typology in Kenya: A Review

Authors: Joseph Kimutai Langat

Abstract:

Tea typology is the science of classifying tea. This study was carried out between November 2023 and July 2024, whose main objective was to investigate the typological classification nomenclature of processed tea in the world, narrowing down to Kenya. Centres of origin, historical background, tea growing region, scientific naming system, market, fermentation levels, processing/ oxidation levels and cultural reasons are used to classify tea at present. Of these, the most common typology is by oxidation, and more specifically, by the production methods within the oxidation categories. While the Asian tea producing countries categorises tea products based on the decreasing oxidation levels during the manufacturing process: black tea, green tea, oolong tea and instant tea, Kenya’s tea typology system is based on the degree of fermentation process, i.e. black tea, purple tea, green tea and white tea. Tea is also classified into five categories: black tea, green tea, white tea, oolong tea, and dark tea. Black tea is the main tea processed and exported in Kenya, manufactured mainly by withering, rolling, or by use of cutting-tearing-curling (CTC) method that ensures efficient conversion of leaf herbage to made tea, oxidizing, and drying before being sorted into different grades. It is from these varied typological methods that this review paper concludes that different regions of the world use different classification nomenclature. Therefore, since tea typology is not standardized, it is recommended that a global tea regulator dealing in tea classification be created to standardize tea typology, with domestic in-country regulatory bodies in tea growing countries accredited to implement the global-wide typological agreements and resolutions.

Keywords: classification, fermentation, oxidation, tea, typology

Procedia PDF Downloads 18
2296 A Proposed Optimized and Efficient Intrusion Detection System for Wireless Sensor Network

Authors: Abdulaziz Alsadhan, Naveed Khan

Abstract:

In recent years intrusions on computer network are the major security threat. Hence, it is important to impede such intrusions. The hindrance of such intrusions entirely relies on its detection, which is primary concern of any security tool like Intrusion Detection System (IDS). Therefore, it is imperative to accurately detect network attack. Numerous intrusion detection techniques are available but the main issue is their performance. The performance of IDS can be improved by increasing the accurate detection rate and reducing false positive. The existing intrusion detection techniques have the limitation of usage of raw data set for classification. The classifier may get jumble due to redundancy, which results incorrect classification. To minimize this problem, Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Local Binary Pattern (LBP) can be applied to transform raw features into principle features space and select the features based on their sensitivity. Eigen values can be used to determine the sensitivity. To further classify, the selected features greedy search, back elimination, and Particle Swarm Optimization (PSO) can be used to obtain a subset of features with optimal sensitivity and highest discriminatory power. These optimal feature subset used to perform classification. For classification purpose, Support Vector Machine (SVM) and Multilayer Perceptron (MLP) used due to its proven ability in classification. The Knowledge Discovery and Data mining (KDD’99) cup dataset was considered as a benchmark for evaluating security detection mechanisms. The proposed approach can provide an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rates.

Keywords: Particle Swarm Optimization (PSO), Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA), Local Binary Pattern (LBP), Support Vector Machine (SVM), Multilayer Perceptron (MLP)

Procedia PDF Downloads 353
2295 Classification of Germinatable Mung Bean by Near Infrared Hyperspectral Imaging

Authors: Kaewkarn Phuangsombat, Arthit Phuangsombat, Anupun Terdwongworakul

Abstract:

Hard seeds will not grow and can cause mold in sprouting process. Thus, the hard seeds need to be separated from the normal seeds. Near infrared hyperspectral imaging in a range of 900 to 1700 nm was implemented to develop a model by partial least squares discriminant analysis to discriminate the hard seeds from the normal seeds. The orientation of the seeds was also studied to compare the performance of the models. The model based on hilum-up orientation achieved the best result giving the coefficient of determination of 0.98, and root mean square error of prediction of 0.07 with classification accuracy was equal to 100%.

Keywords: mung bean, near infrared, germinatability, hard seed

Procedia PDF Downloads 289
2294 Performance Study of Classification Algorithms for Consumer Online Shopping Attitudes and Behavior Using Data Mining

Authors: Rana Alaa El-Deen Ahmed, M. Elemam Shehab, Shereen Morsy, Nermeen Mekawie

Abstract:

With the growing popularity and acceptance of e-commerce platforms, users face an ever increasing burden in actually choosing the right product from the large number of online offers. Thus, techniques for personalization and shopping guides are needed by users. For a pleasant and successful shopping experience, users need to know easily which products to buy with high confidence. Since selling a wide variety of products has become easier due to the popularity of online stores, online retailers are able to sell more products than a physical store. The disadvantage is that the customers might not find products they need. In this research the customer will be able to find the products he is searching for, because recommender systems are used in some ecommerce web sites. Recommender system learns from the information about customers and products and provides appropriate personalized recommendations to customers to find the needed product. In this paper eleven classification algorithms are comparatively tested to find the best classifier fit for consumer online shopping attitudes and behavior in the experimented dataset. The WEKA knowledge analysis tool, which is an open source data mining workbench software used in comparing conventional classifiers to get the best classifier was used in this research. In this research by using the data mining tool (WEKA) with the experimented classifiers the results show that decision table and filtered classifier gives the highest accuracy and the lowest accuracy classification via clustering and simple cart.

Keywords: classification, data mining, machine learning, online shopping, WEKA

Procedia PDF Downloads 341
2293 Comparative Study Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBC) are the most common types of blood cells and are the most intensively studied in cell biology. The lack of RBCs is a condition in which the amount of hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs will affect the exchange of oxygen. This paper presents a comparative study for various techniques for classifying the RBCs as normal, or abnormal (anemic) using WEKA. WEKA is an open source consists of different machine learning algorithms for data mining applications. The algorithm tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cells images. The first set, exclusively consist of geometrical features, was used to identify whether the tested blood cell has a spherical shape or non-spherical cells. While the second set, consist mainly of textural features was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBCs image dataset which were obtained from Serdang Hospital-alaysia, and measuring the accuracy of test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively.

Keywords: K-nearest neighbors algorithm, radial basis function neural network, red blood cells, support vector machine

Procedia PDF Downloads 400
2292 Application of Fuzzy Clustering on Classification Agile Supply Chain

Authors: Hamidreza Fallah Lajimi , Elham Karami, Fatemeh Ali nasab, Mostafa Mahdavikia

Abstract:

Being responsive is an increasingly important skill for firms in today’s global economy; thus firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iran manufacturing companies in order to identify some of the factors for agile organizations in managing their supply chains. Then we classification this company in four cluster with fuzzy c-mean technique and with four validations functional determine automatically the optimal number of clusters.

Keywords: agile supply chain, clustering, fuzzy clustering

Procedia PDF Downloads 455