Search results for: classification and regression tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5926

Search results for: classification and regression tree

5176 Quantitative Texture Analysis of Shoulder Sonography for Rotator Cuff Lesion Classification

Authors: Chung-Ming Lo, Chung-Chien Lee

Abstract:

In many countries, the lifetime prevalence of shoulder pain is up to 70%. In America, the health care system spends 7 billion per year about the healthy issues of shoulder pain. With respect to the origin, up to 70% of shoulder pain is attributed to rotator cuff lesions This study proposed a computer-aided diagnosis (CAD) system to assist radiologists classifying rotator cuff lesions with less operator dependence. Quantitative features were extracted from the shoulder ultrasound images acquired using an ALOKA alpha-6 US scanner (Hitachi-Aloka Medical, Tokyo, Japan) with linear array probe (scan width: 36mm) ranging from 5 to 13 MHz. During examination, the postures of the examined patients are standard sitting position and are followed by the regular routine. After acquisition, the shoulder US images were drawn out from the scanner and stored as 8-bit images with pixel value ranging from 0 to 255. Upon the sonographic appearance, the boundary of each lesion was delineated by a physician to indicate the specific pattern for analysis. The three lesion categories for classification were composed of 20 cases of tendon inflammation, 18 cases of calcific tendonitis, and 18 cases of supraspinatus tear. For each lesion, second-order statistics were quantified in the feature extraction. The second-order statistics were the texture features describing the correlations between adjacent pixels in a lesion. Because echogenicity patterns were expressed via grey-scale. The grey-scale co-occurrence matrixes with four angles of adjacent pixels were used. The texture metrics included the mean and standard deviation of energy, entropy, correlation, inverse different moment, inertia, cluster shade, cluster prominence, and Haralick correlation. Then, the quantitative features were combined in a multinomial logistic regression classifier to generate a prediction model of rotator cuff lesions. Multinomial logistic regression classifier is widely used in the classification of more than two categories such as the three lesion types used in this study. In the classifier, backward elimination was used to select a feature subset which is the most relevant. They were selected from the trained classifier with the lowest error rate. Leave-one-out cross-validation was used to evaluate the performance of the classifier. Each case was left out of the total cases and used to test the trained result by the remaining cases. According to the physician’s assessment, the performance of the proposed CAD system was shown by the accuracy. As a result, the proposed system achieved an accuracy of 86%. A CAD system based on the statistical texture features to interpret echogenicity values in shoulder musculoskeletal ultrasound was established to generate a prediction model for rotator cuff lesions. Clinically, it is difficult to distinguish some kinds of rotator cuff lesions, especially partial-thickness tear of rotator cuff. The shoulder orthopaedic surgeon and musculoskeletal radiologist reported greater diagnostic test accuracy than general radiologist or ultrasonographers based on the available literature. Consequently, the proposed CAD system which was developed according to the experiment of the shoulder orthopaedic surgeon can provide reliable suggestions to general radiologists or ultrasonographers. More quantitative features related to the specific patterns of different lesion types would be investigated in the further study to improve the prediction.

Keywords: shoulder ultrasound, rotator cuff lesions, texture, computer-aided diagnosis

Procedia PDF Downloads 284
5175 Chemical Composition and Nutritional Value of Leaves and Pods of Leucaena Leucocephala, Prosopis Laevigata and Acacia Farnesiana in a Xerophyllous Shrubland

Authors: Miguel Mellado, Cecilia Zapata

Abstract:

Goats can be exploited in harsh environments due to their capacity to adjust to limited quantity and quality forage sources. In these environments, leguminous trees can be used as supplementary feeds as foliage and fruits of these trees can contribute to maintain or improve production efficiency in ruminants. The objective of this study was to determine the nutritional value of three leguminous trees heavily selected by goats in a xerophyllous shrubland. Chemical composition and in vitro dry matter disappearance (IVDMD) of leaves and pods from leucaena (Leucaena leucocephala), mesquite (Prosopis laevigata) and huisache (Acacia farnesiana) is presented. Crude protein (CP) ranged from 17.3% for leaves of huisache to 21.9% for leucaena. The neutral detergent fiber (NDF) content ranged from 39.0 to 40.3 with no difference among fodder threes. Across tree species, mean IVDMD was 61.6% for pods and 52.2% for leaves. IVDMD for leaves was highest (P < 0.01) for leucaena (54.9%) and lowest for huisache (47.3%). Condensed tannins in an acetonic extract were highest for leaves of huisache (45.3 mg CE/g DM) and lowest for mesquite (25.9 mg CE/g DM). Pods and leaves of huisache presented the highest number of secondary metabolites, mainly related to hydrobenzoic acid and flavonols; leucaena and mesquite presented mainly flavonols and anthocyanins. It was concluded that leaves and pods of leucaena, mesquite and huisache constitute valuable forages for ruminant livestock due to its low fiber, high CP levels, moderate in vitro fermentation characteristics and high mineral content. Keywords: Fodder tree; ruminants; secondary metabolites; minerals; tannins

Keywords: fodder tree, ruminants, secondary metabolites, minerals, tannins

Procedia PDF Downloads 144
5174 Dual-Channel Reliable Breast Ultrasound Image Classification Based on Explainable Attribution and Uncertainty Quantification

Authors: Haonan Hu, Shuge Lei, Dasheng Sun, Huabin Zhang, Kehong Yuan, Jian Dai, Jijun Tang

Abstract:

This paper focuses on the classification task of breast ultrasound images and conducts research on the reliability measurement of classification results. A dual-channel evaluation framework was developed based on the proposed inference reliability and predictive reliability scores. For the inference reliability evaluation, human-aligned and doctor-agreed inference rationals based on the improved feature attribution algorithm SP-RISA are gracefully applied. Uncertainty quantification is used to evaluate the predictive reliability via the test time enhancement. The effectiveness of this reliability evaluation framework has been verified on the breast ultrasound clinical dataset YBUS, and its robustness is verified on the public dataset BUSI. The expected calibration errors on both datasets are significantly lower than traditional evaluation methods, which proves the effectiveness of the proposed reliability measurement.

Keywords: medical imaging, ultrasound imaging, XAI, uncertainty measurement, trustworthy AI

Procedia PDF Downloads 99
5173 A Multi-Output Network with U-Net Enhanced Class Activation Map and Robust Classification Performance for Medical Imaging Analysis

Authors: Jaiden Xuan Schraut, Leon Liu, Yiqiao Yin

Abstract:

Computer vision in medical diagnosis has achieved a high level of success in diagnosing diseases with high accuracy. However, conventional classifiers that produce an image to-label result provides insufficient information for medical professionals to judge and raise concerns over the trust and reliability of a model with results that cannot be explained. In order to gain local insight into cancerous regions, separate tasks such as imaging segmentation need to be implemented to aid the doctors in treating patients, which doubles the training time and costs which renders the diagnosis system inefficient and difficult to be accepted by the public. To tackle this issue and drive AI-first medical solutions further, this paper proposes a multi-output network that follows a U-Net architecture for image segmentation output and features an additional convolutional neural networks (CNN) module for auxiliary classification output. Class activation maps are a method of providing insight into a convolutional neural network’s feature maps that leads to its classification but in the case of lung diseases, the region of interest is enhanced by U-net-assisted Class Activation Map (CAM) visualization. Therefore, our proposed model combines image segmentation models and classifiers to crop out only the lung region of a chest X-ray’s class activation map to provide a visualization that improves the explainability and is able to generate classification results simultaneously which builds trust for AI-led diagnosis systems. The proposed U-Net model achieves 97.61% accuracy and a dice coefficient of 0.97 on testing data from the COVID-QU-Ex Dataset which includes both diseased and healthy lungs.

Keywords: multi-output network model, U-net, class activation map, image classification, medical imaging analysis

Procedia PDF Downloads 200
5172 Assessing the Effectiveness of Machine Learning Algorithms for Cyber Threat Intelligence Discovery from the Darknet

Authors: Azene Zenebe

Abstract:

Deep learning is a subset of machine learning which incorporates techniques for the construction of artificial neural networks and found to be useful for modeling complex problems with large dataset. Deep learning requires a very high power computational and longer time for training. By aggregating computing power, high performance computer (HPC) has emerged as an approach to resolving advanced problems and performing data-driven research activities. Cyber threat intelligence (CIT) is actionable information or insight an organization or individual uses to understand the threats that have, will, or are currently targeting the organization. Results of review of literature will be presented along with results of experimental study that compares the performance of tree-based and function-base machine learning including deep learning algorithms using secondary dataset collected from darknet.

Keywords: deep-learning, cyber security, cyber threat modeling, tree-based machine learning, function-based machine learning, data science

Procedia PDF Downloads 152
5171 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques

Authors: Jonathan Iworiso

Abstract:

Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies on the choice of variables and suitable techniques among scholars. This research focuses mainly on the application of Regression Training (RT) techniques to forecast monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models involving model complexity was employed. The RT models include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression were trained and used to forecast the equity premium out-of-sample. In this study, the empirical investigation of the RT models demonstrates significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to guarantee an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains

Procedia PDF Downloads 106
5170 Self-Image of Police Officers

Authors: Leo Carlo B. Rondina

Abstract:

Self-image is an important factor to improve the self-esteem of the personnel. The purpose of the study is to determine the self-image of the police. The respondents were the 503 policemen assigned in different Police Station in Davao City, and they were chosen with the used of random sampling. With the used of Exploratory Factor Analysis (EFA), latent construct variables of police image were identified as follows; professionalism, obedience, morality and justice and fairness. Further, ordinal regression indicates statistical characteristics on ages 21-40 which means the age of the respondent statistically improves self-image.

Keywords: police image, exploratory factor analysis, ordinal regression, Galatea effect

Procedia PDF Downloads 284
5169 Urbanization in Delhi: A Multiparameter Study

Authors: Ishu Surender, M. Amez Khair, Ishan Singh

Abstract:

Urbanization is a multidimensional phenomenon. It is an indication of the long-term process for the shift of economics to industrial from rural. The significance of urbanization in modernization, socio-economic development, and poverty eradication is relevant in modern times. This paper aims to study the urbanization index model in the capital of India, Delhi using aspects such as demographic aspect, infrastructural development aspect, and economic development aspect. The urbanization index of all the nine districts of Delhi will be determined using multiple parameters such as population density and the availability of health and education facilities. The definition of the urban area varies from city to city and requires periodic classification which makes direct comparisons difficult. The urbanization index calculated in this paper can be employed to measure the urbanization of a district and compare the level of urbanization in different districts.

Keywords: multiparameter, population density, multiple regression, normalized urbanization index

Procedia PDF Downloads 111
5168 Regression Analysis of Travel Indicators and Public Transport Usage in Urban Areas

Authors: Mehdi Moeinaddini, Zohreh Asadi-Shekari, Muhammad Zaly Shah, Amran Hamzah

Abstract:

Currently, planners try to have more green travel options to decrease economic, social and environmental problems. Therefore, this study tries to find significant urban travel factors to be used to increase the usage of alternative urban travel modes. This paper attempts to identify the relationship between prominent urban mobility indicators and daily trips by public transport in 30 cities from various parts of the world. Different travel modes, infrastructures and cost indicators were evaluated in this research as mobility indicators. The results of multi-linear regression analysis indicate that there is a significant relationship between mobility indicators and the daily usage of public transport.

Keywords: green travel modes, urban travel indicators, daily trips by public transport, multi-linear regression analysis

Procedia PDF Downloads 546
5167 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of thermal conductivity of n-Alkanes and Alkenes. There is a minority of research and lack of correlation for thermal conductivity of liquids in the open literature. The available experimental data are collected covering the groups of n-Alkanes and Alkenes.The data were assumed to correlate to temperature using Filippov correlation. Nonparametric regression of Grace Algorithm was used to develop the generalized correlation model. A spread sheet program based on Microsoft Excel was used to plot and calculate the value of the coefficients. The results obtained were compared with the data that found in Perry's Chemical Engineering Hand Book. The experimental data correlated to the temperature ranged "between" 273.15 to 673.15 K, with R2 = 0.99.The developed correlation reproduced experimental data that which were not included in regression with absolute average percent deviation (AAPD) of less than 7 %. Thus the spread sheet was quite accurate which produces reliable data.

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 652
5166 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 166
5165 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 158
5164 Surface Hole Defect Detection of Rolled Sheets Based on Pixel Classification Approach

Authors: Samira Taleb, Sakina Aoun, Slimane Ziani, Zoheir Mentouri, Adel Boudiaf

Abstract:

Rolling is a pressure treatment technique that modifies the shape of steel ingots or billets between rotating rollers. During this process, defects may form on the surface of the rolled sheets and are likely to affect the performance and quality of the finished product. In our study, we developed a method for detecting surface hole defects using a pixel classification approach. This work includes several steps. First, we performed image preprocessing to delimit areas with and without hole defects on the sheet image. Then, we developed the histograms of each area to generate the gray level membership intervals of the pixels that characterize each area. As we noticed an intersection between the characteristics of the gray level intervals of the images of the two areas, we finally performed a learning step based on a series of detection tests to refine the membership intervals of each area, and to choose the defect detection criterion in order to optimize the recognition of the surface hole.

Keywords: classification, defect, surface, detection, hole

Procedia PDF Downloads 14
5163 Vegetation Assessment Under the Influence of Environmental Variables; A Case Study from the Yakhtangay Hill of Himalayan Range, Pakistan

Authors: Hameed Ullah, Shujaul Mulk Khan, Zahid Ullah, Zeeshan Ahmad Sadia Jahangir, Abdullah, Amin Ur Rahman, Muhammad Suliman, Dost Muhammad

Abstract:

The interrelationship between vegetation and abiotic variables inside an ecosystem is one of the main jobs of plant scientists. This study was designed to investigate the vegetation structure and species diversity along with the environmental variables in the Yakhtangay hill district Shangla of the Himalayan Mountain series Pakistan by using multivariate statistical analysis. Quadrat’s method was used and a total of 171 Quadrats were laid down 57 for Tree, Shrubs and Herbs, respectively, to analyze the phytosociological attributes of the vegetation. The vegetation of the selected area was classified into different Life and leaf-forms according to Raunkiaer classification, while PCORD software version 5 was used to classify the vegetation into different plants communities by Two-way indicator species Analysis (TWINSPAN). The CANOCCO version 4.5 was used for DCA and CCA analysis to find out variation directories of vegetation with different environmental variables. A total of 114 plants species belonging to 45 different families was investigated inside the area. The Rosaceae (12 species) was the dominant family followed by Poaceae (10 species) and then Asteraceae (7 species). Monocots were more dominant than Dicots and Angiosperms were more dominant than Gymnosperms. Among the life forms the Hemicryptophytes and Nanophanerophytes were dominant, followed by Therophytes, while among the leaf forms Microphylls were dominant, followed by Leptophylls. It is concluded that among the edaphic factors such as soil pH, the concentration of soil organic matter, Calcium Carbonates concentration in soil, soil EC, soil TDS, and physiographic factors such as Altitude and slope are affecting the structure of vegetation, species composition and species diversity at the significant level with p-value ≤0.05. The Vegetation of the selected area was classified into four major plants communities and the indicator species for each community was recorded. Classification of plants into 4 different communities based upon edaphic gradients favors the individualistic hypothesis. Indicator Species Analysis (ISA) shows the indicators of the study area are mostly indicators to the Himalayan or moist temperate ecosystem, furthermore, these indicators could be considered for micro-habitat conservation and respective ecosystem management plans.

Keywords: species richness, edaphic gradients, canonical correspondence analysis (CCA), TWCA

Procedia PDF Downloads 151
5162 Classification of EEG Signals Based on Dynamic Connectivity Analysis

Authors: Zoran Šverko, Saša Vlahinić, Nino Stojković, Ivan Markovinović

Abstract:

In this article, the classification of target letters is performed using data from the EEG P300 Speller paradigm. Neural networks trained with the results of dynamic connectivity analysis between different brain regions are used for classification. Dynamic connectivity analysis is based on the adaptive window size and the imaginary part of the complex Pearson correlation coefficient. Brain dynamics are analysed using the relative intersection of confidence intervals for the imaginary component of the complex Pearson correlation coefficient method (RICI-imCPCC). The RICI-imCPCC method overcomes the shortcomings of currently used dynamical connectivity analysis methods, such as the low reliability and low temporal precision for short connectivity intervals encountered in constant sliding window analysis with wide window size and the high susceptibility to noise encountered in constant sliding window analysis with narrow window size. This method overcomes these shortcomings by dynamically adjusting the window size using the RICI rule. This method extracts information about brain connections for each time sample. Seventy percent of the extracted brain connectivity information is used for training and thirty percent for validation. Classification of the target word is also done and based on the same analysis method. As far as we know, through this research, we have shown for the first time that dynamic connectivity can be used as a parameter for classifying EEG signals.

Keywords: dynamic connectivity analysis, EEG, neural networks, Pearson correlation coefficients

Procedia PDF Downloads 213
5161 Accuracy Analysis of the American Society of Anesthesiologists Classification Using ChatGPT

Authors: Jae Ni Jang, Young Uk Kim

Abstract:

Background: Chat Generative Pre-training Transformer-3 (ChatGPT; San Francisco, California, Open Artificial Intelligence) is an artificial intelligence chatbot based on a large language model designed to generate human-like text. As the usage of ChatGPT is increasing among less knowledgeable patients, medical students, and anesthesia and pain medicine residents or trainees, we aimed to evaluate the accuracy of ChatGPT-3 responses to questions about the American Society of Anesthesiologists (ASA) classification based on patients’ underlying diseases and assess the quality of the generated responses. Methods: A total of 47 questions were submitted to ChatGPT using textual prompts. The questions were designed for ChatGPT-3 to provide answers regarding ASA classification in response to common underlying diseases frequently observed in adult patients. In addition, we created 18 questions regarding the ASA classification for pediatric patients and pregnant women. The accuracy of ChatGPT’s responses was evaluated by cross-referencing with Miller’s Anesthesia, Morgan & Mikhail’s Clinical Anesthesiology, and the American Society of Anesthesiologists’ ASA Physical Status Classification System (2020). Results: Out of the 47 questions pertaining to adults, ChatGPT -3 provided correct answers for only 23, resulting in an accuracy rate of 48.9%. Furthermore, the responses provided by ChatGPT-3 regarding children and pregnant women were mostly inaccurate, as indicated by a 28% accuracy rate (5 out of 18). Conclusions: ChatGPT provided correct responses to questions relevant to the daily clinical routine of anesthesiologists in approximately half of the cases, while the remaining responses contained errors. Therefore, caution is advised when using ChatGPT to retrieve anesthesia-related information. Although ChatGPT may not yet be suitable for clinical settings, we anticipate significant improvements in ChatGPT and other large language models in the near future. Regular assessments of ChatGPT's ASA classification accuracy are essential due to the evolving nature of ChatGPT as an artificial intelligence entity. This is especially important because ChatGPT has a clinically unacceptable rate of error and hallucination, particularly in pediatric patients and pregnant women. The methodology established in this study may be used to continue evaluating ChatGPT.

Keywords: American Society of Anesthesiologists, artificial intelligence, Chat Generative Pre-training Transformer-3, ChatGPT

Procedia PDF Downloads 46
5160 Response Surface Methodology for the Optimization of Paddy Husker by Medium Brown Rice Peeling Machine 6 Rubber Type

Authors: S. Bangphan, P. Bangphan, C. Ketsombun, T. Sammana

Abstract:

Optimization of response surface methodology (RSM) was employed to study the effects of three factor (rubber of clearance, spindle of speed, and rice of moisture) in brown rice peeling machine of the optimal good rice yield (99.67, average of three repeats). The optimized composition derived from RSM regression was analyzed using Regression analysis and Analysis of Variance (ANOVA). At a significant level α=0.05, the values of Regression coefficient, R2 adjust were 96.55% and standard deviation were 1.05056. The independent variables are initial rubber of clearance, spindle of speed and rice of moisture parameters namely. The investigating responses are final rubber clearance, spindle of speed and moisture of rice.

Keywords: brown rice, response surface methodology (RSM), peeling machine, optimization, paddy husker

Procedia PDF Downloads 571
5159 Structured Access Control Mechanism for Mesh-based P2P Live Streaming Systems

Authors: Chuan-Ching Sue, Kai-Chun Chuang

Abstract:

Peer-to-Peer (P2P) live streaming systems still suffer a challenge when thousands of new peers want to join into the system in a short time, called flash crowd, and most of new peers suffer long start-up delay. Recent studies have proposed a slot-based user access control mechanism, which periodically determines a certain number of new peers to enter the system, and a user batch join mechanism, which divides new peers into several tree structures with fixed tree size. However, the slot-based user access control mechanism is difficult for accurately determining the optimal time slot length, and the user batch join mechanism is hard for determining the optimal tree size. In this paper, we propose a structured access control (SAC) mechanism, which constructs new peers to a multi-layer mesh structure. The SAC mechanism constructs new peer connections layer by layer to replace periodical access control, and determines the number of peers in each layer according to the system’s remaining upload bandwidth and average video rate. Furthermore, we propose an analytical model to represent the behavior of the system growth if the system can utilize the upload bandwidth efficiently. The analytical result has shown the similar trend in system growth as the SAC mechanism. Additionally, the extensive simulation is conducted to show the SAC mechanism outperforms two previously proposed methods in terms of system growth and start-up delay.

Keywords: peer-to-peer, live video streaming system, flash crowd, start-up delay, access control

Procedia PDF Downloads 315
5158 The Impact on the Composition of Survey Refusals΄ Demographic Profile When Implementing Different Classifications

Authors: Eva Tsouparopoulou, Maria Symeonaki

Abstract:

The internationally documented declining survey response rates of the last two decades are mainly attributed to refusals. In fieldwork, a refusal may be obtained not only from the respondent himself/herself, but from other sources on the respondent’s behalf, such as other household members, apartment building residents or administrator(s), and neighborhood residents. In this paper, we investigate how the composition of the demographic profile of survey refusals changes when different classifications are implemented and the classification issues arising from that. The analysis is based on the 2002-2018 European Social Survey (ESS) datasets for Belgium, Germany, and United Kingdom. For these three countries, the size of selected sample units coded as a type of refusal for all nine under investigation rounds was large enough to meet the purposes of the analysis. The results indicate the existence of four different possible classifications that can be implemented and the significance of choosing the one that strengthens the contrasts of the different types of respondents' demographic profiles. Since the foundation of social quantitative research lies in the triptych of definition, classification, and measurement, this study aims to identify the multiplicity of the definition of survey refusals as a methodological tool for the continually growing research on non-response.

Keywords: non-response, refusals, European social survey, classification

Procedia PDF Downloads 84
5157 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

Due to the rapid development of the Internet and the increased availability of digital documents, the excessive information on the Internet has led to information overflow problem. In order to solve these problems for effective information retrieval, document clustering in text mining becomes a popular research topic. Clustering is the unsupervised classification of data items into groups without the need of training data. Many conventional document clustering methods perform inefficiently for large document collections because they were originally designed for relational database. Therefore they are impractical in real-world document clustering and require special handling for high dimensionality and high volume. We propose the FIHC (Frequent Itemset-based Hierarchical Clustering) method, which is a hierarchical clustering method developed for document clustering, where the intuition of FIHC is that there exist some common words for each cluster. FIHC uses such words to cluster documents and builds hierarchical topic tree. In this paper, we combine FIHC algorithm with ontology to solve the semantic problem and mine the meaning behind the words in documents. Furthermore, we use the closed frequent itemsets instead of only use frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than those of well-known document clustering algorithms.

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

Procedia PDF Downloads 397
5156 Immature Palm Tree Detection Using Morphological Filter for Palm Counting with High Resolution Satellite Image

Authors: Nur Nadhirah Rusyda Rosnan, Nursuhaili Najwa Masrol, Nurul Fatiha MD Nor, Mohammad Zafrullah Mohammad Salim, Sim Choon Cheak

Abstract:

Accurate inventories of oil palm planted areas are crucial for plantation management as this would impact the overall economy and production of oil. One of the technological advancements in the oil palm industry is semi-automated palm counting, which is replacing conventional manual palm counting via digitizing aerial imagery. Most of the semi-automated palm counting method that has been developed was limited to mature palms due to their ideal canopy size represented by satellite image. Therefore, immature palms were often left out since the size of the canopy is barely visible from satellite images. In this paper, an approach using a morphological filter and high-resolution satellite image is proposed to detect immature palm trees. This approach makes it possible to count the number of immature oil palm trees. The method begins with an erosion filter with an appropriate window size of 3m onto the high-resolution satellite image. The eroded image was further segmented using watershed segmentation to delineate immature palm tree regions. Then, local minimum detection was used because it is hypothesized that immature oil palm trees are located at the local minimum within an oil palm field setting in a grayscale image. The detection points generated from the local minimum are displaced to the center of the immature oil palm region and thinned. Only one detection point is left that represents a tree. The performance of the proposed method was evaluated on three subsets with slopes ranging from 0 to 20° and different planting designs, i.e., straight and terrace. The proposed method was able to achieve up to more than 90% accuracy when compared with the ground truth, with an overall F-measure score of up to 0.91.

Keywords: immature palm count, oil palm, precision agriculture, remote sensing

Procedia PDF Downloads 74
5155 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on expert knowledge of trained individuals. Image classification in agriculture problems historically has been based on classical machine learning strategies that make use of hand-engineered features in the top of a classification algorithm. This approach tends to not produce results with high accuracy and generalization to the classes classified by the system when the nature of the elements has a significant variability. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have great resourcefulness of learning and have been applied successfully to image classification and object detection tasks in the last years. The objective of this work was to propose a new method based on deep learning convolutional neural networks towards the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease levels presence were produced, in collaboration with agronomists, for the algorithm classification. Disease level tasks performed by experts provided ground truth data for the disease score of the same winter wheat plots were RGB images were acquired. The system had an overall accuracy of 84% on the discrimination of the disease level classes.

Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks

Procedia PDF Downloads 331
5154 Phylogenetic Relationships of Common Reef Fish Species in Vietnam

Authors: Dang Thuy Binh, Truong Thi Oanh, Le Phan Khanh Hung, Luong thi Tuong Vy

Abstract:

One of the greatest environmental challenges facing Asia is the management and conservation of the marine biodiversity threaten by fisheries overexploitation, pollution, habitat destruction, and climate change. To date, a few molecular taxonomical studies has been conducted on marine fauna in Vietnam. The purpose of this study was to clarify the phylogeny of economic and ecological reef fish species in Vietnam Reef fish species covering Labridae, Scaridae, Nemipteridae, Serranidae, Acanthuridae, Lutjanidae, Lethrinidae, Mullidae, Balistidae, Pseudochromidae, Pinguipedidae, Fistulariidae, Holocentridae, Synodontidae, and Pomacentridae representing 28 genera were collected from South and Center, Vietnam. Combine with Genbank sequences, a phylogenetic tree was constructed based on 16S gene of mitochondrial DNA using maximum parsimony, maximum likelihood, and Bayesian inference approaches. The phylogram showed the well-resolved clades at genus and family level. Perciformes is the major order of reef fish species in Vietnam. The monophyly of Perciformes is not strongly supported as it was clustered in the same clade with Tetraodontiformes syngnathiformes and Beryciformes. Continue sampling of commercial fish species and classification based on morphology and genetics to build DNA barcoding of fish species in Vietnam is really necessary.

Keywords: reef fish, 16s rDNA, Vietnam, phylogeny

Procedia PDF Downloads 437
5153 Population Dynamics and Land Use/Land Cover Change on the Chilalo-Galama Mountain Range, Ethiopia

Authors: Yusuf Jundi Sado

Abstract:

Changes in land use are mostly credited to human actions that result in negative impacts on biodiversity and ecosystem functions. This study aims to analyze the dynamics of land use and land cover changes for sustainable natural resources planning and management. Chilalo-Galama Mountain Range, Ethiopia. This study used Thematic Mapper 05 (TM) for 1986, 2001 and Landsat 8 (OLI) data 2017. Additionally, data from the Central Statistics Agency on human population growth were analyzed. Semi-Automatic classification plugin (SCP) in QGIS 3.2.3 software was used for image classification. Global positioning system, field observations and focus group discussions were used for ground verification. Land Use Land Cover (LU/LC) change analysis was using maximum likelihood supervised classification and changes were calculated for the 1986–2001 and the 2001–2017 and 1986-2017 periods. The results show that agricultural land increased from 27.85% (1986) to 44.43% and 51.32% in 2001 and 2017, respectively with the overall accuracies of 92% (1986), 90.36% (2001), and 88% (2017). On the other hand, forests decreased from 8.51% (1986) to 7.64 (2001) and 4.46% (2017), and grassland decreased from 37.47% (1986) to 15.22%, and 15.01% in 2001 and 2017, respectively. It indicates for the years 1986–2017 the largest area cover gain of agricultural land was obtained from grassland. The matrix also shows that shrubland gained land from agricultural land, afro-alpine, and forest land. Population dynamics is found to be one of the major driving forces for the LU/LU changes in the study area.

Keywords: Landsat, LU/LC change, Semi-Automatic classification plugin, population dynamics, Ethiopia

Procedia PDF Downloads 83
5152 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper demonstrates analysis of the cervical cancer based on a probabilistic model. It involves technique for classification and prediction by recognizing typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrences in no recurrence (first time detection) cases. The combination of the conventional statistical and machine learning tools is applied for the analysis. Experimental study with real data demonstrates the feasibility and potential of the proposed approach for the said cause.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 357
5151 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: Anudeep Appe, Bhanu Poluparthi, Lakshmi Kasivajjula, Udai Mv, Sobha Bagadi, Punya Modi, Aditya Singh, Hemanth Gunupudi, Spenser Troiano, Jeff Paul, Justin Stovall, Justin Yamamoto

Abstract:

The necessity of data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework which helps identify factors impacting a healthcare provider facility or a hospital (from here on termed as facility) market share is of key importance. This pilot study aims at developing a data-driven machine learning-regression framework which aids strategists in formulating key decisions to improve the facility’s market share which in turn impacts in improving the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and the data spanning 60 key facilities in Washington State and about 3 years of historical data is considered. In the current analysis, market share is termed as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The current study proposes a two-pronged approach of competitor identification and regression approach to evaluate and predict market share, respectively. Leveraged model agnostic technique, SHAP, to quantify the relative importance of features impacting the market share. Typical techniques in literature to quantify the degree of competitiveness among facilities use an empirical method to calculate a competitive factor to interpret the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature level word vectors, and evaluates the key connected components at the facility level. This technique is robust since its data-driven, which minimizes the bias from empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities along with a placeholder to factor in various business rules (for ex. quantifying the patient exchanges, provider references, and sister facilities). Identified are the multiple groups of competitors among facilities. Leveraging the competitors' identified developed and fine-tuned Random Forest Regression model to predict the market share. To identify key drivers of market share at an overall level, permutation feature importance of the attributes was calculated. For relative quantification of features at a facility level, incorporated SHAP (SHapley Additive exPlanations), a model agnostic explainer. This helped to identify and rank the attributes at each facility which impacts the market share. This approach proposes an amalgamation of the two popular and efficient modeling practices, viz., machine learning with graphs and tree-based regression techniques to reduce the bias. With these, we helped to drive strategic business decisions.

Keywords: competition, DAGs, facility, healthcare, machine learning, market share, random forest, SHAP

Procedia PDF Downloads 89
5150 On the Performance of Improvised Generalized M-Estimator in the Presence of High Leverage Collinearity Enhancing Observations

Authors: Habshah Midi, Mohammed A. Mohammed, Sohel Rana

Abstract:

Multicollinearity occurs when two or more independent variables in a multiple linear regression model are highly correlated. The ridge regression is the commonly used method to rectify this problem. However, the ridge regression cannot handle the problem of multicollinearity which is caused by high leverage collinearity enhancing observation (HLCEO). Since high leverage points (HLPs) are responsible for inducing multicollinearity, the effect of HLPs needs to be reduced by using Generalized M estimator. The existing GM6 estimator is based on the Minimum Volume Ellipsoid (MVE) which tends to swamp some low leverage points. Hence an improvised GM (MGM) estimator is presented to improve the precision of the GM6 estimator. Numerical example and simulation study are presented to show how HLPs can cause multicollinearity. The numerical results show that our MGM estimator is the most efficient method compared to some existing methods.

Keywords: identification, high leverage points, multicollinearity, GM-estimator, DRGP, DFFITS

Procedia PDF Downloads 260
5149 A Machine Learning Approach for Classification of Directional Valve Leakage in the Hydraulic Final Test

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Due to increasing cost pressure in global markets, artificial intelligence is becoming a technology that is decisive for competition. Predictive quality enables machinery and plant manufacturers to ensure product quality by using data-driven forecasts via machine learning models as a decision-making basis for test results. The use of cross-process Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the quality characteristics of workpieces.

Keywords: predictive quality, hydraulics, machine learning, classification, supervised learning

Procedia PDF Downloads 228
5148 Time-Frequency Feature Extraction Method Based on Micro-Doppler Signature of Ground Moving Targets

Authors: Ke Ren, Huiruo Shi, Linsen Li, Baoshuai Wang, Yu Zhou

Abstract:

Since some discriminative features are required for ground moving targets classification, we propose a new feature extraction method based on micro-Doppler signature. Firstly, the time-frequency analysis of measured data indicates that the time-frequency spectrograms of the three kinds of ground moving targets, i.e., single walking person, two people walking and a moving wheeled vehicle, are discriminative. Then, a three-dimensional time-frequency feature vector is extracted from the time-frequency spectrograms to depict these differences. At last, a Support Vector Machine (SVM) classifier is trained with the proposed three-dimensional feature vector. The classification accuracy to categorize ground moving targets into the three kinds of the measured data is found to be over 96%, which demonstrates the good discriminative ability of the proposed micro-Doppler feature.

Keywords: micro-doppler, time-frequency analysis, feature extraction, radar target classification

Procedia PDF Downloads 403
5147 The Assessment Groundwater Geochemistry of Some Wells in Rafsanjan Plain, Southeast of Iran

Authors: Milad Mirzaei Aminiyan, Abdolreza Akhgar, Farzad Mirzaei Aminiyan

Abstract:

Water quality is the critical factor that influence on human health and quantity and quality of grain production in semi-humid and semi-arid area. Pistachio is a main crop that accounts for a considerable portion of Iranian agricultural exports. Give that pistachio tree is a tolerant type of tree to saline and alkaline soil and water conditions, but groundwater and irrigation water quality play important roles in main production this crop. For this purpose, 94 well water samples were taken from 25 wells and samples were analyzed. The results showed give that region’s geological, climatic characteristics, statistical analysis, and based on dominant cations and anions in well water samples (piper diagram); four main types of water were found: Na-Cl, K-Cl, Na-SO4, and K-SO4. It seems that most wells in terms of water quality (salinity and alkalinity) and based on Wilcox diagram have critical status. The analysis suggested that more than eighty-seven percentage of the well water samples have high values of EC that these values are higher than into critical limit EC value for irrigation water, which may be due to the sandy soils in this area. Most groundwater were relatively unsuitable for irrigation but it could be used by application of correct management such as removing and reducing the ion concentrations of Cl‾, SO42‾, Na+ and total hardness in groundwater and also the concentrated deep groundwater was required treatment to reduce the salinity and sodium hazard. Given that irrigation water quality in this area was relatively unsuitable for most agriculture production but pistachio tree was adapted to this area conditions. The integrated management of groundwater for irrigation is the way to solve water quality issues not only in Rafsanjan area, but also in other arid and semi-arid areas.

Keywords: groundwater quality, irrigation water quality, salinity, alkalinity, Rafsanjan plain, pistachio

Procedia PDF Downloads 415