Search results for: classification tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2861

2081 Web Page Design Optimisation Based on Segment Analytics

Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi

Abstract:

Web analytics is the measurement, collection and analysis of web page data in order to optimize information delivery and web usage. Page statistics and user metrics are the main factors in most web analytics tools, and this is their limitation: they do not provide design inputs for optimizing the presentation of information. This paper aims to extend the scope of web analytics to provide analysis and statistics for each segment of a web page. The click count is calculated and the concentration of links in a web page is obtained. These user metrics are used to guide the design of the displayed content in a web page with the help of the Vision Based Page Segmentation (VIPS) algorithm. When the algorithm is applied to a web page, it divides the entire page into a visual block tree. The visual block tree further divides the web page into visual blocks, or segments, which help us understand the usage of each segment in a page and its content. Dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. A space optimization concept is applied using the output of the VIPS algorithm. This technique gives us visibility of user interaction with web pages and helps us place important links in the appropriate segments of a web page, effectively managing the space in a page and the concentration of links.

Keywords: analytics, design optimization, visual block trees, vision based technology

Procedia PDF Downloads 260
2080 Applying Unmanned Aerial Vehicle on Agricultural Damage: A Case Study of the Meteorological Disaster on Taiwan Paddy Rice

Authors: Chiling Chen, Chiaoying Chou, Siyang Wu

Abstract:

Taiwan is located in the western Pacific Ocean, at the intersection of continental and marine climates. Typhoons frequently strike Taiwan and bring meteorological disasters, e.g., heavy flooding, landslides, and loss of life and property. Global climate change brings more extreme meteorological disasters. Techniques to improve disaster prevention and mitigation are therefore needed, and improving rescue processes and rehabilitation is important as well. In this study, UAVs (Unmanned Aerial Vehicles) are used to take instant images to improve disaster investigation and rescue processes. Paddy rice fields in central Taiwan are the study area. They were hit by heavy rain during the monsoon season in June 2016. UAV images provide high ground resolution (3.5 cm) with 3D point clouds, which are used to develop image discrimination techniques and a digital surface model (DSM) of rice lodging. Firstly, supervised image classification with the Maximum Likelihood Method (MLD) is used to delineate the area of rice lodging. Secondly, 3D point clouds generated by Pix4D Mapper are used to develop a DSM for classifying the lodging levels of paddy rice. As a result, the discrimination accuracy of rice lodging is 85% by supervised image classification, and the classification accuracy of lodging level is 87% by DSM. Therefore, UAVs not only provide instant images of agricultural damage after a meteorological disaster, but the image discrimination of rice lodging also reaches acceptable accuracy (>85%). In the future, UAV and image discrimination technologies will be applied to different crop fields. The results of image discrimination will be overlaid with the administrative boundaries of paddy rice to establish a GIS-based assistance system for agricultural damage discrimination. The time and labor required for damage detection and monitoring would thereby be greatly reduced.

Keywords: Monsoon, supervised classification, Pix4D, 3D point clouds, discriminate accuracy

Procedia PDF Downloads 294
2079 A Gene Selection Algorithm for Microarray Cancer Classification Using an Improved Particle Swarm Optimization

Authors: Arfan Ali Nagra, Tariq Shahzad, Meshal Alharbi, Khalid Masood Khan, Muhammad Mugees Asif, Taher M. Ghazal, Khmaies Ouahada

Abstract:

Gene selection is an essential step for the classification of microarray cancer data. Gene expression cancer data (DNA microarray) facilitates computing the robust and concurrent expression of various genes. Particle swarm optimization (PSO) requires simple operators and fewer parameters for tuning the model in gene selection. The selection of a prognostic gene with small redundancy is a great challenge for researchers, as there are a few complications in PSO-based selection methods. In this research, a new variant of PSO (self-inertia weight adaptive PSO) is proposed. In the proposed algorithm, SIW-APSO-ELM is explored to achieve gene selection prediction accuracies. This new algorithm balances the exploration capabilities of the improved inertia weight adaptive particle swarm optimization against exploitation. The self-inertia weight adaptive particle swarm optimization (SIW-APSO) is used to search for the solution. The SIW-APSO is updated with an evolutionary process in such a way that each particle iteratively improves its velocity and position. The extreme learning machine (ELM) has been designed for the selection procedure. The proposed method has been used to identify a number of genes in the cancer dataset. The classification stage uses ELM, K-centroid nearest neighbor (KCNN), and support vector machine (SVM) classifiers and attains high prediction accuracy compared to state-of-the-art methods on microarray cancer datasets, which shows the effectiveness of the proposed method.

Keywords: microarray cancer, improved PSO, ELM, SVM, evolutionary algorithms

Procedia PDF Downloads 73
2078 Detection of Phoneme [S] Mispronounciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. In the described study, a methodology for the classification of incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech was collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, and the other of normative ones. The proposed feature vector yields classification sensitivity and specificity above the 90% level for individual logo phones. The use of fricative formant-based information improves the MFCC-only classification results by an average of 5 percentage points.
The study shows that the use of specific parameters for the selected phones improves the efficiency of pathology detection compared to traditional methods of speech signal parameterization.
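The aggregation step described in this abstract (mean of each MFCC coefficient over the frames of a segment, 75th percentile for the other frame-level features) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the input layout (one list of values per frame) and the linear-interpolation percentile rule are assumptions.

```python
import statistics


def percentile(values, q):
    """Percentile q (0..100) of a non-empty list, using linear interpolation.

    The interpolation rule is an assumption; the abstract does not say
    which percentile definition was used.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def aggregate_segment(mfcc_frames, other_frames):
    """Collapse frame-level features of one segment into a single vector.

    mfcc_frames:  list of frames, each a list of MFCC values (hypothetical layout)
    other_frames: list of frames, each a list of other features such as RMS
    """
    # Mean of each MFCC coefficient across all frames of the segment.
    mfcc_means = [statistics.fmean(column) for column in zip(*mfcc_frames)]
    # 75th percentile of every other feature across the frames.
    other_p75 = [percentile(list(column), 75) for column in zip(*other_frames)]
    return mfcc_means + other_p75
```

Whatever the number of frames in a segment, the result has one value per feature, which is the size reduction the abstract refers to.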

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 270
2077 Transformation of Positron Emission Tomography Raw Data into Images for Classification Using Convolutional Neural Network

Authors: Paweł Konieczka, Lech Raczyński, Wojciech Wiślicki, Oleksandr Fedoruk, Konrad Klimaszewski, Przemysław Kopka, Wojciech Krzemień, Roman Shopa, Jakub Baran, Aurélien Coussat, Neha Chug, Catalina Curceanu, Eryk Czerwiński, Meysam Dadgar, Kamil Dulski, Aleksander Gajos, Beatrix C. Hiesmayr, Krzysztof Kacprzak, Łukasz Kapłon, Grzegorz Korcyl, Tomasz Kozik, Deepak Kumar, Szymon Niedźwiecki, Dominik Panek, Szymon Parzych, Elena Pérez Del Río, Sushil Sharma, Shivani Shivani, Magdalena Skurzok, Ewa Łucja Stępień, Faranak Tayefi, Paweł Moskal

Abstract:

This paper develops the transformation of non-image data into 2-dimensional matrices as a preparation stage for classification based on convolutional neural networks (CNNs). In positron emission tomography (PET) studies, a CNN may be applied directly to the reconstructed distribution of radioactive tracers injected into the patient's body, as a pattern recognition tool. Nonetheless, much PET data still exists in non-image format, which raises the question of whether it can be used for training a CNN. The main focus of this paper is the problem of processing vectors with a small number of features in comparison to the number of pixels in the output images. The proposed methodology was applied to the classification of PET coincidence events.

Keywords: convolutional neural network, kernel principal component analysis, medical imaging, positron emission tomography

Procedia PDF Downloads 126
2076 Volume Estimation of Trees: An Exploratory Study on Pterocarpus erinaceus Logging Operations within Forest Transition and Savannah Ecological Zones of Ghana

Authors: Albert Kwabena Osei Konadu

Abstract:

Pterocarpus erinaceus, also known as Rosewood, is a tropical wood endemic to forest savannah transition zones within the middle and northern portions of Ghana. Its economic viability has made it increasingly popular and in high demand, leading to widespread conservation concerns. Ghana’s forest resource management regime for these ecozones focuses mainly on conservation and very little on resource utilization. Consequently, commercial logging management standards are at a teething stage and not fully developed, leading to deficiencies in the monitoring of logging operations and the quantification of harvested tree volumes. The tree information form (TIF), a volume estimation and tracking regime, has proven to be an effective, sustainable management tool for regulating timber resource extraction in the high forest zones of the country. This work aims to generate a TIF that can track and capture the requisite parameters to accurately estimate the volume of harvested rosewood within forest savannah transition zones. Tree information forms were created for three scenarios: individual billets, stacked billets and conveying vessels. These TIFs were field-tested to deduce the most viable option for the tracking and estimation of harvested volumes of rosewood using Smalian’s and cubic volume estimation formulas. Overall, four districts were covered, with the individual billets, stacked billets and conveying vessel scenarios registering mean volumes of 25.83 m³, 45.08 m³ and 32.6 m³, respectively. These volumes were validated by benchmarking against the assigned volumes of the Forestry Commission of Ghana and the known standard volumes of conveying vessels. The results indicated an underestimation of extracted volumes under the quota regime, a situation that could lead to unintended overexploitation of the species.
The research revealed that the conveying vessel route is the most viable volume estimation and tracking regime for the sustainable management of the Pterocarpus erinaceus species, as it provided a more practical volume estimate and data extraction protocol.
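For readers unfamiliar with Smalian's formula mentioned above, it estimates the volume of a log (billet) as its length multiplied by the average of its two end cross-sectional areas. The sketch below shows only this standard formula. The function names and the choice of metres as units are assumptions, and the abstract's separate "cubic volume" formula is not reproduced because the abstract does not define it.

```python
import math


def end_area(diameter_m):
    """Cross-sectional area (m^2) of a circular log end with the given diameter (m)."""
    return math.pi * diameter_m ** 2 / 4.0


def smalian_volume(d_small_m, d_large_m, length_m):
    """Smalian's formula: length times the mean of the two end areas (result in m^3)."""
    return length_m * (end_area(d_small_m) + end_area(d_large_m)) / 2.0
```

For a perfectly cylindrical billet (equal end diameters) the formula reduces to the ordinary cylinder volume.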

Keywords: convention on international trade in endangered species, cubic volume formula, forest transition savannah zones, Pterocarpus erinaceus, Smalian’s volume formula, tree information form

Procedia PDF Downloads 82
2075 Using Probabilistic Neural Network (PNN) for Extracting Acoustic Microwaves (Bulk Acoustic Waves) in Piezoelectric Material

Authors: Hafdaoui Hichem, Mehadjebia Cherifa, Benatia Djamel

Abstract:

In this paper, we propose a new method for detecting the Bulk waves of an acoustic microwave signal during the propagation of acoustic microwaves in a piezoelectric substrate (lithium niobate, LiNbO3). We use classification by a probabilistic neural network (PNN) as a means of numerical analysis, in which we classify all the values of the real part and the imaginary part of the attenuation coefficient together with the acoustic velocity in order to build a model from which the Bulk waves can be identified easily. These singularities indicate the presence of Bulk waves in piezoelectric materials. From them, we obtain accurate values of the attenuation coefficient and the acoustic velocity for Bulk waves. This study will be of interest for the modeling and realization of acoustic microwave devices (ultrasound) based on the propagation of acoustic microwaves.

Keywords: piezoelectric material, probabilistic neural network (PNN), classification, acoustic microwaves, bulk waves, the attenuation coefficient

Procedia PDF Downloads 422
2074 Hyperspectral Imagery for Tree Speciation and Carbon Mass Estimates

Authors: Jennifer Buz, Alvin Spivey

Abstract:

The most common greenhouse gas emitted through human activities, carbon dioxide (CO2), is naturally consumed by plants during photosynthesis. This process is actively being monetized by companies wishing to offset their carbon dioxide emissions. For example, companies are now able to purchase protections for vegetated land due to be clear-cut or purchase barren land for reforestation. Therefore, by actively preventing the destruction/decay of plant matter or by introducing more plant matter (reforestation), a company can theoretically offset some of its emissions. One of the biggest issues in the carbon credit market is validating and verifying carbon offsets. There is a need for a system that can accurately and frequently ensure that the areas sold for carbon credits have the vegetation mass (and therefore the carbon offset capability) they claim. Traditional techniques for measuring vegetation mass and determining health are costly and require many person-hours. Orbital Sidekick offers an alternative approach that accurately quantifies carbon mass and assesses vegetation health through satellite hyperspectral imagery, a technique which enables us to remotely identify material composition (including plant species) and condition (e.g., health and growth stage). How much carbon a plant is capable of storing is ultimately tied to many factors, including material density (primarily species-dependent), plant size, and health (trees that are actively decaying are not effectively storing carbon). All of these factors can be observed through satellite hyperspectral imagery. This abstract focuses on speciation. To build a species classification model, we matched pixels in our remote sensing imagery to plants on the ground for which we know the species. To accomplish this, we collaborated with the researchers at the Teakettle Experimental Forest.
Our remote sensing data comes from our airborne “Kato” sensor, which flew over the study area and acquired hyperspectral imagery (400-2500 nm, 472 bands) at ~0.5 m/pixel resolution. Coverage of the entire Teakettle Experimental Forest required capturing dozens of individual hyperspectral images. In order to combine these images into a mosaic, we accounted for potential variations in atmospheric conditions throughout the data collection. To do this, we ran an open source atmospheric correction routine called ISOFIT (Imaging Spectrometer Optimal FITting), which converted all of our remote sensing data from radiance to reflectance. A database of reflectance spectra for each of the tree species within the study area was acquired using the Teakettle stem map and the geo-referenced hyperspectral images. We found that a wide variety of machine learning classifiers were able to identify the species within our images with high (>95%) accuracy. For the most robust quantification of carbon mass and the best assessment of the health of a vegetated area, speciation is critical. Through the use of high resolution hyperspectral data, ground-truth databases, and complex analytical techniques, we are able to determine the species present within a pixel to a high degree of accuracy. These species identifications will feed directly into our carbon mass model.

Keywords: hyperspectral, satellite, carbon, imagery, python, machine learning, speciation

Procedia PDF Downloads 109
2073 Early Stage Suicide Ideation Detection Using Supervised Machine Learning and Neural Network Classifier

Authors: Devendra Kr Tayal, Vrinda Gupta, Aastha Bansal, Khushi Singh, Sristi Sharma, Hunny Gaur

Abstract:

In today's world, suicide is a serious problem. In order to save lives, early detection and prevention of suicide attempts should be addressed. A good number of at-risk people use social media platforms to talk about their issues or to find information on related matters. Twitter and Reddit are two of the most common platforms used for expressing oneself. Extensive research has already been done in this field. Using supervised classification techniques such as Naive Bayes, Bernoulli Naive Bayes, and Multilayer Perceptron on a Reddit dataset, we demonstrate the early recognition of suicidal ideation. We also performed a comparative analysis of these approaches, using accuracy, recall, F1 score, and precision for the analysis.

Keywords: machine learning, suicide ideation detection, supervised classification, natural language processing

Procedia PDF Downloads 81
2072 Automatic Adult Age Estimation Using Deep Learning of the ResNeXt Model Based on CT Reconstruction Images of the Costal Cartilage

Authors: Ting Lu, Ya-Ru Diao, Fei Fan, Ye Xue, Lei Shi, Xian-e Tang, Meng-jun Zhan, Zhen-hua Deng

Abstract:

Accurate adult age estimation (AAE) is a significant and challenging task in forensic and archeology fields. Attempts have been made to explore optimal adult age metrics, and the rib is considered a potential age marker. The traditional way is to extract age-related features designed by experts from macroscopic or radiological images followed by classification or regression analysis. Those results still have not met the high-level requirements for practice, and the limitation of using feature design and manual extraction methods is loss of information since the features are likely not designed explicitly for extracting information relevant to age. Deep learning (DL) has recently garnered much interest in imaging learning and computer vision. It enables learning features that are important without a prior bias or hypothesis and could be supportive of AAE. This study aimed to develop DL models for AAE based on CT images and compare their performance to the manual visual scoring method. Chest CT data were reconstructed using volume rendering (VR). Retrospective data of 2500 patients aged 20.00-69.99 years were obtained between December 2019 and September 2021. Five-fold cross-validation was performed, and datasets were randomly split into training and validation sets in a 4:1 ratio for each fold. Before feeding the inputs into networks, all images were augmented with random rotation and vertical flip, normalized, and resized to 224×224 pixels. ResNeXt was chosen as the DL baseline due to its advantages of higher efficiency and accuracy in image classification. Mean absolute error (MAE) was the primary parameter. Independent data from 100 patients acquired between March and April 2022 were used as a test set. The manual method completely followed the prior study, which reported the lowest MAEs (5.31 in males and 6.72 in females) among similar studies. CT data and VR images were used. 
The radiation density of the first costal cartilage was recorded using CT data on the workstation. The osseous and calcified projections of the first to seventh costal cartilages were scored based on VR images using an eight-stage staging technique. According to the results of the prior study, the optimal models were the decision tree regression model in males and the stepwise multiple linear regression equation in females. Predicted ages of the test set were calculated separately by sex using the respective models. A total of 2600 patients (training and validation sets, mean age=45.19 years±14.20 [SD]; test set, mean age=46.57±9.66) were evaluated in this study. In ResNeXt model training, the MAEs were 3.95 in males and 3.65 in females. On the test set, DL achieved MAEs of 4.05 in males and 4.54 in females, which were far better than the MAEs of 8.90 and 6.42, respectively, for the manual method. These results showed that the DL of the ResNeXt model outperformed the manual method in AAE based on CT reconstruction of the costal cartilage, and the developed system may be a supportive tool for AAE.
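The evaluation protocol in this abstract (five-fold cross-validation with a 4:1 training/validation split per fold, scored by mean absolute error) can be illustrated with a small sketch. The function names, the shuffling seed and the round-robin fold assignment are assumptions made for illustration; the abstract does not describe how the authors built their folds.

```python
import random


def five_fold_indices(n_samples, seed=0):
    """Yield (train, validation) index lists for five folds of a shuffled dataset."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into five folds in round-robin order.
    folds = [indices[i::5] for i in range(5)]
    for k in range(5):
        validation = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages (same units)."""
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)
```

With the 2500 patients mentioned in the abstract, each fold would train on 2000 samples and validate on 500.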

Keywords: forensic anthropology, age determination by the skeleton, costal cartilage, CT, deep learning

Procedia PDF Downloads 61
2071 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues in the textile industry using data mining techniques. Data mining was applied to data on the stitching of garment products obtained from a textile company. The data were analyzed with the CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks. Classification-based analyses were used for the data mining, and a decision model for the production per person and the variables affecting production was obtained by this method. The results of the study show that as the daily working time increases, the production per person decreases. In addition, the relationship between total daily working time and production per person is negative, and the production per person shows the highest and negative relationship.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 339
2070 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets in themselves, and they can now be accepted as a primary intellectual output of research. The quality and usage of a dataset depend mainly on the context in which it has been collected, processed, analyzed, validated, and interpreted. This paper presents a collection of program educational objectives mapped to student outcomes, collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection was produced are described. The collection has been cleansed and preprocessed, some features have been selected, and a preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method showed the worst performance. To recap, the benchmark has achieved promising results by utilizing the preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.
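Three of the measurements named in this abstract (Hamming Loss, Micro-F and Macro-F) have standard definitions for multi-label predictions, sketched below for binary label matrices. The data layout (one 0/1 list per sample) and the convention that an F1 score with a zero denominator counts as 0 are assumptions; the abstract does not state which conventions the authors used.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def _counts(y_true, y_pred, label):
    """True positives, false positives and false negatives for one label column."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[label] == 1 and p[label] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[label] == 0 and p[label] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[label] == 1 and p[label] == 0)
    return tp, fp, fn


def _f1(tp, fp, fn):
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0  # assumed convention


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(y_true[0])
    return sum(_f1(*_counts(y_true, y_pred, k)) for k in range(n_labels)) / n_labels


def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all labels."""
    n_labels = len(y_true[0])
    pooled = [sum(c) for c in zip(*(_counts(y_true, y_pred, k) for k in range(n_labels)))]
    return _f1(*pooled)
```

Macro-F weights every label equally, while Micro-F is dominated by frequent labels, which is why benchmarks of this kind typically report both.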

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 158
2069 Early Diagnosis of Myocardial Ischemia Based on Support Vector Machine and Gaussian Mixture Model by Using Features of ECG Recordings

Authors: Merve Begum Terzi, Orhan Arikan, Adnan Abaci, Mustafa Candemir

Abstract:

Acute myocardial infarction is a major cause of death in the world. Therefore, its fast and reliable diagnosis is a major clinical need. ECG is the most important diagnostic methodology used to make decisions about the management of cardiovascular diseases. In patients with acute myocardial ischemia, temporary chest pains together with changes in the ST segment and T wave of the ECG occur shortly before the start of myocardial infarction. In this study, a technique which detects changes in the ST/T sections of the ECG is developed for the early diagnosis of acute myocardial ischemia. For this purpose, a database of real ECG recordings was compiled, containing records from 75 patients presenting symptoms of chest pain who underwent elective percutaneous coronary intervention (PCI). 12-lead ECGs of the patients were recorded before and during the PCI procedure. Two ECG epochs are analyzed for each patient: the pre-inflation ECG, which is acquired before any catheter insertion, and the occlusion ECG, which is acquired during balloon inflation. Using the pre-inflation and occlusion recordings, the ECG features that are critical in the detection of acute myocardial ischemia are identified and the most discriminative features are extracted. A classification technique is developed based on a support vector machine (SVM) approach with linear and radial basis function (RBF) kernels, which detects ischemic events using ST-T derived joint features from the non-ischemic and ischemic states of the patients. The dataset is randomly divided into training and testing sets, and the training set is used to optimize the SVM hyperparameters using the grid-search method and 10-fold cross-validation. SVMs are designed specifically for each patient by tuning the kernel parameters in order to obtain the optimal classification performance.
Applying the developed classification technique to real ECG recordings shows that the proposed technique provides highly reliable detection of anomalies in ECG signals. Furthermore, to develop a detection technique that can be used in the absence of an ECG recording obtained during the healthy stage, the detection of acute myocardial ischemia based on ECG recordings of the patients obtained during ischemia is also investigated. For this purpose, a Gaussian mixture model (GMM) is used to represent the joint pdf of the most discriminating ECG features of myocardial ischemia. Then, a Neyman-Pearson type of approach is developed to detect outliers that would correspond to acute myocardial ischemia. The Neyman-Pearson decision strategy is applied by computing the average log-likelihood values of ECG segments and comparing them with a range of different threshold values. For different discrimination threshold values and numbers of ECG segments, the probability of detection and the probability of false alarm are computed, and the corresponding ROC curves are obtained. The results indicate that increasing the number of ECG segments provides higher performance for GMM-based classification. Moreover, a comparison between the performance of SVM- and GMM-based classification showed that SVM provides higher classification performance on the ECG recordings of a considerable number of patients.
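The thresholding step described above (average the log-likelihood of several ECG segments under a Gaussian mixture, compare it with a sweep of thresholds, and record the probability of detection and of false alarm) can be sketched in a simplified form. Everything here is an assumption made for illustration: a one-dimensional mixture with known parameters stands in for the fitted joint pdf, and a group of segments is flagged as an outlier when its average log-likelihood falls below the threshold. The abstract does not specify these details.

```python
import math


def gmm_log_pdf(x, weights, means, stds):
    """Log density of a 1-D Gaussian mixture at x (hypothetical, pre-fitted parameters)."""
    density = sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )
    return math.log(density)


def average_log_likelihood(segment_features, weights, means, stds):
    """Mean log-likelihood over a group of segments (one scalar feature per segment)."""
    total = sum(gmm_log_pdf(x, weights, means, stds) for x in segment_features)
    return total / len(segment_features)


def roc_points(scores_normal, scores_outlier, thresholds):
    """For each threshold, return (threshold, P_false_alarm, P_detection).

    Assumption: a score below the threshold is declared an outlier.
    """
    points = []
    for t in thresholds:
        p_fa = sum(s < t for s in scores_normal) / len(scores_normal)
        p_d = sum(s < t for s in scores_outlier) / len(scores_outlier)
        points.append((t, p_fa, p_d))
    return points
```

Averaging over more segments reduces the variance of the score, which is consistent with the abstract's observation that more ECG segments improve GMM-based detection.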

Keywords: ECG classification, Gaussian mixture model, Neyman–Pearson approach, support vector machine

Procedia PDF Downloads 151
2068 Adapting Tools for Text Monitoring and for Scenario Analysis Related to the Field of Social Disasters

Authors: Svetlana Cojocaru, Mircea Petic, Inga Titchiev

Abstract:

Humanity is increasingly faced with different social disasters, which in turn can generate new accidents and catastrophes. To mitigate their consequences, it is important to obtain signals about events which are occurring or may occur as early as possible and to prepare the corresponding scenarios that could be applied. Our research is focused on solving two problems in this domain: identifying signals that an accident has occurred or may occur, and mitigating some of the consequences of disasters. To solve the first problem, methods of selecting and processing texts from the Internet are developed. Information in Romanian is of special interest to us. In order to obtain these tools, we follow several steps, divided into a preparatory stage and a processing stage. In the first stage, we manually collected over 724 news articles and classified them into 10 categories of social disasters. The collection amounts to more than 150 thousand words. Using this information, a controlled vocabulary of more than 300 keywords was developed, which will help in the process of classifying and identifying texts related to the field of social disasters. To solve the second problem, the formalism of Petri nets has been used. We deal with the problem of evacuating inhabitants in a timely manner. Analysis methods such as the reachability or coverability tree and the invariants technique will be used to determine the dynamic properties of the modeled systems. To perform a case study of the properties of the evacuation system extended by adding time, the analysis modules of PIPE, such as Generalized Stochastic Petri Nets (GSPN) Analysis, Simulation, State Space Analysis, and Invariant Analysis, have been used. These modules helped us obtain the average number of persons situated in the rooms and other quantitative properties and characteristics related to the system's dynamics.
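As an illustration of the reachability analysis mentioned in this abstract, the sketch below enumerates the reachable markings of a small place/transition Petri net by breadth-first search. It is a generic textbook construction, not the PIPE tool or the authors' evacuation model: the place names, the transition encoding (a pair of input and output token counts) and the safety limit are all hypothetical.

```python
from collections import deque


def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in pre.items())


def fire(marking, pre, post):
    """Return the marking after consuming input tokens and producing output tokens."""
    new_marking = dict(marking)
    for place, n in pre.items():
        new_marking[place] -= n
    for place, n in post.items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


def reachable_markings(initial, transitions, limit=1000):
    """Breadth-first exploration of the reachability set.

    transitions maps a name to (pre, post) dictionaries; limit is a safety
    bound (an assumption) because unbounded nets have infinitely many markings.
    """
    seen = {tuple(sorted(initial.items()))}
    queue = deque([initial])
    while queue and len(seen) < limit:
        marking = queue.popleft()
        for pre, post in transitions.values():
            if enabled(marking, pre):
                successor = fire(marking, pre, post)
                key = tuple(sorted(successor.items()))
                if key not in seen:
                    seen.add(key)
                    queue.append(successor)
    return seen
```

A toy evacuation net with two people in a room, one corridor and one exit would be written as `reachable_markings({"room": 2, "corridor": 0, "exit": 0}, {"leave_room": ({"room": 1}, {"corridor": 1}), "leave_building": ({"corridor": 1}, {"exit": 1})})`; the marking with both tokens in `exit` being reachable corresponds to a completed evacuation.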

Keywords: lexicon of disasters, modelling, Petri nets, text annotation, social disasters

Procedia PDF Downloads 191
2067 Modular Robotics and Terrain Detection Using Inertial Measurement Unit Sensor

Authors: Shubhakar Gupta, Dhruv Prakash, Apoorv Mehta

Abstract:

In this project, we design a modular robot capable of using and switching between multiple methods of propulsion and of classifying terrain based on Inertial Measurement Unit (IMU) input. We wanted to make a robot that is not only intelligent in its functioning but also versatile in its physical design. The advantage of a modular robot is that it can be designed to hold several movement apparatuses, such as wheels, legs for a hexapod or a quadpod setup, propellers for underwater locomotion, and any other solution that may be needed. The robot takes roughness input from a gyroscope and an accelerometer in the IMU and, based on the terrain classification from an artificial neural network, decides which method of propulsion would best optimize its movement. This gives the bot adaptability over a set of terrains, which means it can optimize its locomotion on a terrain based on its roughness. A feature like this would be a great asset in autonomous exploration or research drones.

Keywords: modular robotics, terrain detection, terrain classification, neural network

Procedia PDF Downloads 136
2066 ICanny: CNN Modulation Recognition Algorithm

Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng

Abstract:

To address the low recognition rate for composite signal modulation at low signal-to-noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. First, the radar signal is transformed into a time-frequency image by the Choi-Williams Distribution (CWD). Second, we propose an image processing algorithm using the Guided Filter and a threshold selection method, combined with hole filling and a mask operation. Finally, a shallow convolutional neural network (CNN) is combined with the ideas of depth-wise convolution (Dw Conv) and point-wise convolution (Pw Conv). The proposed CNN is designed to perform image classification and thereby recognize the modulation of radar signals. The simulation results show that the recognition rate of the proposed algorithm reaches 90.83% at 0 dB and 71.52% at -8 dB. Therefore, the proposed algorithm has good classification and anti-noise performance in radar signal modulation recognition and other fields.
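As a rough illustration of why depth-wise plus point-wise convolution suits a shallow network, the sketch below compares weight counts for a standard convolution and its depth-wise separable counterpart. The layer sizes (3x3 kernel, 32 input channels, 64 output channels) are assumed for illustration and are not taken from the paper; biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    # One kernel x kernel filter per (input channel, output channel) pair.
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    # Depth-wise: one kernel x kernel filter per input channel.
    depthwise = kernel * kernel * c_in
    # Point-wise: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise

# Assumed layer sizes, for illustration only.
kernel, c_in, c_out = 3, 32, 64
standard = standard_conv_params(kernel, c_in, c_out)
separable = separable_conv_params(kernel, c_in, c_out)
print(standard, separable, round(standard / separable, 2))
```

For these assumed sizes the separable layer needs roughly an eighth of the weights of the standard layer.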

Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm

Procedia PDF Downloads 180
2065 Language Shapes Thought: An Experimental Study on English and Mandarin Native Speakers' Sequencing of Size

Authors: Hsi Wei

Abstract:

Does the language we speak affect the way we think? This question has long been discussed from different angles. In this article, the issue is examined with an experiment on how speakers of different languages tend to sequence general objects differently by size. An essential difference between English and Mandarin usage is the way we sequence the size of places or objects. In English, when describing the location of something we may say, for example, ‘The pen is inside the trashcan next to the tree at the park.’ In Mandarin, however, we would say, ‘The pen is at the park next to the tree inside the trashcan.’ Clearly, English generally uses the sequence small to big, while Mandarin uses the opposite. The experiment was therefore conducted to test whether this difference between the languages affects the speakers’ ability to perform the two kinds of sequencing. There were two groups of subjects: one consisted of native English speakers, the other of native Mandarin speakers. In the experiment, three nouns were shown as a group to the subjects in their native language. Before they saw the nouns, they first received an instruction of ‘big to small’, ‘small to big’, or ‘repeat’. The subjects then had to sequence the following group of nouns according to the instruction or simply repeat the nouns. After completing each sequencing or repetition in their minds, they pushed a button as a reaction. The repetition condition was designed to measure the person's bare reading time. As the results of the experiment showed, native English speakers reacted more quickly to the sequence ‘small to big’; native Mandarin speakers, on the other hand, reacted more quickly to the sequence ‘big to small’. To conclude, this study may be of importance as support for linguistic relativism, the view that the language we speak does shape the way we think.

Keywords: language, linguistic relativism, size, sequencing

Procedia PDF Downloads 273
2064 Efficient Manageability and Intelligent Classification of Web Browsing History Using Machine Learning

Authors: Suraj Gururaj, Sumantha Udupa U.

Abstract:

Browsing the Web has emerged as the de facto activity performed on the Internet. Although browsing is tracked, the manageability of Web browsing history is very poor. In this paper, we present a workable solution, implemented using machine learning and natural language processing techniques, for efficient management of a user’s browsing history. The significance of adding such a capability to a Web browser is that it ensures efficient and quick information retrieval from browsing history, which is currently very challenging. Our solution ensures that important websites visited in the past remain easily accessible through intelligent and automatic classification. In a nutshell, the paper provides an implementation as a browser extension that automatically classifies the browsing history into the most relevant category without any user intervention. This ensures that no information is lost and increases productivity by saving time spent revisiting websites that were of much importance.

Keywords: adhoc retrieval, Chrome extension, supervised learning, tile, Web personalization

Procedia PDF Downloads 360
2063 The Qualitative and Quantitative Detection of Pistachio in Processed Food Products Using Florescence Dye Based PCR

Authors: Ergün Şakalar, Şeyma Özçirak Ergün

Abstract:

Pistachio nuts, the fruits of the pistachio tree (Pistacia vera), are edible tree nuts highly valued for their organoleptic properties. Pistachio nuts are used as ingredients in snack foods, chocolates, baklava, meat products, ice cream and other gourmet products. Undeclared pistachios may be present in food products as a consequence of fraudulent substitution. Control of food samples is very important for safety and for fraud prevention. Because pistachio is a considerably expensive nut, mixtures of pistachio, peanut (Arachis hypogaea) and pea (Pisum sativum L.) are used instead of pistachio in food products. To solve this problem, a sensitive polymerase chain reaction (PCR) assay has been developed. A real-time PCR assay for the detection of pea, peanut and pistachio in baklava was designed using EvaGreen fluorescence dye. Primers were selected from powerful regions for identification of pea, peanut and pistachio. DNA from reference samples and industrial products was successfully extracted with the GIDAGEN® Multi-Fast DNA Isolation Kit. Genomes were identified based on their specific melting peaks (Mp), which are 77°C, 85.5°C and 82.5°C for pea, peanut and pistachio, respectively. Homogenized mixtures of raw pistachio, pea and peanut were prepared with pistachio ratios of 0.01%, 0.1%, 1%, 10%, 40% and 70%. The quantitative detection limit of the assay was 0.1% for pistachio. The real-time PCR technique used in this study also allowed qualitative detection of as little as 0.001% peanut DNA, 0.000001% pistachio DNA and 0.000001% pea DNA in the experimental admixtures. This assay represents a potentially valuable diagnostic method for detection of nut species adulterated with pistachio as well as for highly specific and relatively rapid detection of small amounts of pistachio in food samples.
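Identification by melting peak can be sketched as a nearest-peak lookup. The reference temperatures below are the ones reported in the abstract; the function name and the 1.0°C tolerance are assumptions made for illustration, not parameters of the published assay.

```python
# Reference melting peaks (degrees Celsius) as reported in the abstract.
REFERENCE_PEAKS = {"pea": 77.0, "peanut": 85.5, "pistachio": 82.5}

def identify_species(melting_peak, tolerance=1.0):
    """Return the species whose reference peak is closest to the measured
    peak, or None if no reference lies within the (assumed) tolerance."""
    species, reference = min(
        REFERENCE_PEAKS.items(), key=lambda item: abs(item[1] - melting_peak)
    )
    if abs(reference - melting_peak) <= tolerance:
        return species
    return None

print(identify_species(82.3))  # close to the pistachio reference peak
print(identify_species(80.0))  # not within tolerance of any reference peak
```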

Keywords: pea, peanut, pistachio, real-time PCR

Procedia PDF Downloads 256
2062 Review of Cyber Security in Oil and Gas Industry with Cloud Computing Perspective: Taxonomy, Issues and Future Direction

Authors: Irfan Mohiuddin, Ahmad Al Mogren

Abstract:

In recent years, cloud computing has earned substantial attention in the oil and gas industry and provides services in all phases of the industry lifecycle. Oil and gas supply infrastructure, in particular, is more vulnerable to accidental, natural and intentional threats because of its widespread distribution. Numerous surveys have been conducted on cloud security and privacy. However, to the best of our knowledge, hardly any survey has reviewed cyber security in all phases from a cloud computing perspective. In this review, a distinctive classification is performed for all the cloud-based cyber security measures based on the cloud component in use. The classification approach will enable researchers to identify the technique required to enhance security in specific cloud components. Also, the limitations of each component will allow researchers to design optimal algorithms. Lastly, future directions are given to point out the imminent challenges that can pave the way for researchers to further enhance resilience to cyber security threats in the oil and gas industry.

Keywords: cyber security, cloud computing, safety and security, oil and gas industry, security threats, oil and gas pipelines

Procedia PDF Downloads 133
2061 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters

Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu

Abstract:

Accurate prediction of TBM (Tunnel Boring Machine) performance, which is needed for reliable estimation of the construction period and cost in the preconstruction stage, is very difficult. The aim of this study is therefore to analyze the evaluation process of the various TBM performance prediction models published since 2000 and to select the optimal input parameters for a prediction model. A classification system for TBM performance prediction models and the applied methodology are proposed in this research. The input and output parameters used by the prediction models are also presented. Based on these results, a statistical analysis is performed using data collected from a shield TBM tunnel in South Korea. By performing a simple regression and residual analysis with the statistical program R, the optimal input parameters are selected. These results are expected to be used in the development of a prediction model of TBM performance.

Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters

Procedia PDF Downloads 299
2060 lncRNA Gene Expression Profiling Analysis by TCGA RNA-Seq Data of Breast Cancer

Authors: Xiaoping Su, Gabriel G. Malouf

Abstract:

Introduction: Breast cancer is a heterogeneous disease that can be classified into four subgroups using transcriptional profiling. The role of lncRNA expression in human breast cancer biology, prognosis, and molecular classification remains unknown. Methods and results: Using an integrative comprehensive analysis of lncRNA, mRNA and DNA methylation in 900 breast cancer patients from The Cancer Genome Atlas (TCGA) project, we unraveled the molecular portraits of 1,700 expressed lncRNAs. Some of those lncRNAs (e.g., HOTAIR) have been reported previously and others are novel (e.g., HOTAIRM1, MAPT-AS1). The lncRNA classification correlated well with the PAM50 classification for the basal-like, Her-2 enriched and luminal B subgroups, in contrast to the luminal A subgroup, which behaved differently. Importantly, estrogen receptor (ESR1) expression was associated with distinct lncRNA networks in lncRNA clusters III and IV. Gene set enrichment analysis for cis- and trans-acting lncRNAs showed enrichment for breast cancer signatures driven by breast cancer master regulators. Almost two thirds of those lncRNAs were marked by enhancer chromatin modifications (i.e., H3K27ac), suggesting that lncRNA expression may result in increased activity of neighboring genes. Differential analysis of gene expression profiling data showed that lncRNA HOTAIRM1 was significantly down-regulated in the basal-like subtype, and DNA methylation profiling data showed that lncRNA HOTAIRM1 was highly methylated in the basal-like subtype. Thus, our integrative analysis of gene expression and DNA methylation strongly suggests that lncRNA HOTAIRM1 may be a tumor suppressor in the basal-like subtype. Conclusion and significance: Our study depicts the first lncRNA molecular portrait of breast cancer and shows that lncRNA HOTAIRM1 might be a novel tumor suppressor.

Keywords: lncRNA profiling, breast cancer, HOTAIRM1, tumor suppressor

Procedia PDF Downloads 94
2059 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is a major cause of disability in adults and a main cause of death in developed countries. In this study, data mining techniques including Decision Trees, Artificial Neural Networks (ANNs), and Support Vector Machine (SVM) are used to analyze CAD data. Data from 4948 patients who had suffered from heart disease were included in the analysis. CAD is the target variable, and 24 input (predictor) variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability of being diagnosed with CAD. The SVM algorithm is the most useful for evaluating and predicting CAD patients as compared to non-CAD ones. Applying data mining techniques to coronary artery disease data is a good method for investigating the existing relationships between variables.
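The three comparison metrics named in the abstract follow directly from a binary confusion matrix. The sketch below uses hypothetical counts chosen for illustration; they are not results from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    # Sensitivity: share of true CAD cases that the classifier detects.
    sensitivity = tp / (tp + fn)
    # Specificity: share of non-CAD cases that the classifier rejects.
    specificity = tn / (tn + fp)
    # Accuracy: share of all cases classified correctly.
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts, not taken from the paper.
sensitivity, specificity, accuracy = classification_metrics(tp=80, fn=20, tn=90, fp=10)
print(sensitivity, specificity, accuracy)
```

Reporting all three matters because accuracy alone can look high on imbalanced data even when sensitivity is poor.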

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 643
2058 Co-Development of an Assisted Manual Harvesting Tool for Peach Palm That Avoids the Harvest in Heights

Authors: Mauricio Quintero Angel, Alexander Pereira, Selene Alarcón

Abstract:

One of the most important elements of agricultural production is harvesting, an activity associated with different occupational health risks such as working at height, transporting heavy materials, and excessive muscle strain that leads to musculoskeletal disorders. There is therefore an urgent need to improve and validate interventions that reduce harvesters' exposure and risk. This article describes the co-development, under an ergonomic analysis framework, of an assisted manual harvesting tool for peach palm that aims to reduce the risk of death and accidents by avoiding harvesting at height. The peach palm is a palm tree cultivated in Colombia, Peru, Brazil, and Costa Rica, among other countries, that reaches heights of over 20 m, with stipes covered with spines. The fruits are drupes of variable size. To harvest peach palm, farmers in Colombia use the “Marota” or “Climber”, a wooden tool in a closed X shape with two supports fitted to the stipe, which are raised alternately until the climber reaches a point high enough to grab the bunch, which is brought down using a rope. This is a high-risk activity, since it is done at height without any protection or safety measures. The Marota is alternated with a rod of variable length, between 5 and 12 m, with a harness system at one end to hold the bunch, which is lowered together with the whole system (bamboo and bunch). The rod is used from the ground or from the Marota at height. As an alternative to the traditional tools, the Bajachonta was co-developed with farmers: a tool that employs a traditional bamboo hook system, modified so that it can be held with a rope that passes through a pulley. Once the bunch is hitched, the hook system is detached and stays attached to the peduncle of the palm tree. A pulling force is then exerted towards the ground by tensioning the rope, and the bunch comes loose and is lowered to the ground using the rope and pulley system, reducing the risk and effort involved in the operation. The Bajachonta was evaluated in three productive zones of Colombia with innovative farmers, where adoption is highly probable after some modifications to improve its efficiency and effectiveness, given that the farmers perceive in it an advantage in reducing deaths and accidents by not having to harvest at height.

Keywords: assisted harvesting, ergonomics, harvesting in high altitudes, participative design, peach palm

Procedia PDF Downloads 392
2057 Astronomical Object Classification

Authors: Alina Muradyan, Lina Babayan, Arsen Nanyan, Gohar Galstyan, Vigen Khachatryan

Abstract:

We present a photometric method for identifying stars, galaxies and quasars in multi-color surveys, which uses a library of ≳ 65000 color templates for comparison with observed objects. The method aims for extracting the information content of object colors in a statistically correct way, and performs a classification as well as a redshift estimation for galaxies and quasars in a unified approach based on the same probability density functions. For the redshift estimation, we employ an advanced version of the Minimum Error Variance estimator which determines the redshift error from the redshift dependent probability density function itself. The method was originally developed for the Calar Alto Deep Imaging Survey (CADIS), but is now used in a wide variety of survey projects. We checked its performance by spectroscopy of CADIS objects, where the method provides high reliability (6 errors among 151 objects with R < 24), especially for the quasar selection, and redshifts accurate within σz ≈ 0.03 for galaxies and σz ≈ 0.1 for quasars. For an optimization of future survey efforts, a few model surveys are compared, which are designed to use the same total amount of telescope time but different sets of broad-band and medium-band filters. Their performance is investigated by Monte-Carlo simulations as well as by analytic evaluation in terms of classification and redshift estimation. If photon noise were the only error source, broad-band surveys and medium-band surveys should perform equally well, as long as they provide the same spectral coverage. In practice, medium-band surveys show superior performance due to their higher tolerance for calibration errors and cosmic variance. Finally, we discuss the relevance of color calibration and derive important conclusions for the issues of library design and choice of filters. The calibration accuracy poses strong constraints on an accurate classification, which are most critical for surveys with few, broad and deeply exposed filters, but less severe for surveys with many, narrow and less deep filters.

Keywords: VO, ArVO, DFBS, FITS, image processing, data analysis

Procedia PDF Downloads 61
2056 Analyses of Adverse Drug Reactions Reported of Hospital in Taiwan

Authors: Yu-Hong Lin

Abstract:

Background: A reported adverse drug reaction (ADR) is an injury caused by taking a medicine. The severity of a reported ADR may be minor, but it can also be life-threatening. To give healthcare professionals a better reference for clinical practice, we collected and analyzed data from our hospital. Methods: This was a retrospective study of ADRs reported from 2014 to 2015 in our hospital in Taiwan. We collected the assessment items of the reported ADRs, which include gender and age, source of occurrence, Anatomical Therapeutic Chemical (ATC) classification of the suspected drugs, type of adverse reaction, the Naranjo score calculated with the Naranjo Adverse Drug Reaction Probability Scale, and so on. Results: The investigation included 207 reported ADRs. Most reported ADRs occurred in the outpatient department (92%). The average age of patients with reported ADRs was 65.3 years. Patients under 65 years of age were in the majority (54%). Slightly more than half of the reported ADRs involved males (51%). According to the ATC classification system, the major classes of suspected drugs were cardiovascular system (19%) and antiinfectives for systemic use (18%). Among the adverse reactions, dermatologic effects (35%) were the major type. The Naranjo scores of most reported ADRs ranged from 1 to 4 points (91%), which represents a possible correlation between the ADRs and the suspected drugs. Conclusions: Reported ADRs remain extremely important information for healthcare professionals. For that reason, we enter all information on reported ADRs into our hospital's computer system, which will improve the safety of medication use: the system can remind prescribers of a patient's reported ADRs. No drug is administered without risk. Therefore, all healthcare professionals have a responsibility to their patients, who themselves are becoming more aware of problems associated with drug therapy.
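The interpretation of a 1 to 4 point Naranjo score as a "possible" relationship follows the standard causality categories of the Naranjo scale. A minimal sketch of that mapping is given below; the function name is hypothetical, and the abstract itself does not describe any software implementation.

```python
def naranjo_category(score):
    """Map a total Naranjo score to its standard causality category."""
    if score >= 9:
        return "definite"
    if score >= 5:
        return "probable"
    if score >= 1:
        return "possible"
    return "doubtful"

# Scores of 1 to 4, the range covering 91% of reports in the abstract.
print([naranjo_category(s) for s in (1, 2, 3, 4)])
```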

Keywords: adverse drug reaction, Taiwan, healthcare professionals, safe use of medicines

Procedia PDF Downloads 218
2055 A Two-Week and Six-Month Stability of Cancer Health Literacy Classification Using the CHLT-6

Authors: Levent Dumenci, Laura A. Siminoff

Abstract:

Health literacy has been shown to predict a variety of health outcomes. Reliable identification of persons with limited cancer health literacy (LCHL) has proved questionable with existing instruments that use an arbitrary cut point along a continuum. The CHLT-6, however, uses a latent mixture modeling approach to identify persons with LCHL. The purpose of this study was to estimate the two-week and six-month stability of identifying persons with LCHL using the CHLT-6 with a discrete latent variable approach as the underlying measurement structure. Using a test-retest design, the CHLT-6 was administered to cancer patients at two-week (N=98) and six-month (N=51) intervals. The two-week and six-month latent test-retest agreements were 89% and 88%, respectively. The chance-corrected latent agreements estimated from Dumenci’s latent kappa were 0.62 (95% CI: 0.41 – 0.82) and 0.47 (95% CI: 0.14 – 0.80) for the two-week and six-month intervals, respectively. High levels of latent test-retest agreement between the limited and adequate categories of the cancer health literacy construct, coupled with moderate to good levels of chance-corrected latent agreement, indicated that the CHLT-6 classification of limited versus adequate cancer health literacy is relatively stable over time. In conclusion, the measurement structure underlying the instrument allows for estimating classification errors, circumventing limitations due to the arbitrary approaches adopted by all other instruments. The CHLT-6 can be used to identify persons with LCHL in oncology clinics and in intervention studies to accurately estimate treatment effectiveness.
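To show what "chance-corrected agreement" means, the sketch below computes the classical Cohen's kappa from a 2x2 test-retest table. This is a simpler, observed-score analogue and not Dumenci's latent kappa used in the study; the toy tables are invented for illustration.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table, where table[i][j] counts
    subjects placed in category i at test and category j at retest."""
    total = sum(sum(row) for row in table)
    size = len(table)
    # Observed agreement: share of subjects on the diagonal.
    observed = sum(table[i][i] for i in range(size)) / total
    # Expected agreement under independence of the two occasions.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)

# Toy tables (rows: test, columns: retest; categories: limited, adequate).
perfect = cohen_kappa([[10, 0], [0, 10]])       # full agreement
chance_level = cohen_kappa([[5, 5], [5, 5]])    # agreement no better than chance
print(perfect, chance_level)
```

The contrast between the two toy tables shows why raw agreement of 88 to 89% and a kappa of 0.47 to 0.62 can coexist: when one category dominates, much of the raw agreement is expected by chance.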

Keywords: limited cancer health literacy, the CHLT-6, discrete latent variable modeling, latent agreement

Procedia PDF Downloads 166
2054 Fake Accounts Detection in Twitter Based on Minimum Weighted Feature Set

Authors: Ahmed ElAzab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny

Abstract:

Social networking sites such as Twitter and Facebook attract over 500 million users across the world. For those users, social life and even practical life have become intertwined with these sites, and their interaction with social networking has affected their lives permanently. Accordingly, social networking sites have become among the main channels responsible for the vast dissemination of different kinds of information during real-time events. This popularity has led to different problems, including the possibility of exposing users to incorrect information through fake accounts, which results in the spread of malicious content during live events. This situation can cause huge damage in the real world to society in general, including citizens, business entities, and others. In this paper, we present a classification method for detecting fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of fake accounts on Twitter. The determined factors are then applied using different classification techniques, the results of these techniques are compared, and the most accurate algorithm is selected. The study has been compared with recent research in the same area, and this comparison has confirmed the accuracy of the proposed approach. We claim that this approach can be applied continuously to the Twitter social network to detect fake accounts automatically. Moreover, it can be applied to other social network sites such as Facebook with minor changes according to the nature of the social network, which are discussed in this paper.

Keywords: fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques

Procedia PDF Downloads 390
2053 Rapid Classification of Soft Rot Enterobacteriaceae Phyto-Pathogens Pectobacterium and Dickeya Spp. Using Infrared Spectroscopy and Machine Learning

Authors: George Abu-Aqil, Leah Tsror, Elad Shufan, Shaul Mordechai, Mahmoud Huleihel, Ahmad Salman

Abstract:

Pectobacterium and Dickeya spp., which negatively affect a wide range of crops, are the main causes of aggressive diseases of agricultural crops. These diseases are responsible for huge economic losses in agriculture, including a severe decrease in the quality of stored vegetables and fruits. Therefore, it is important to detect these pathogenic bacteria at the early stages of infection to control their spread and consequently reduce the economic losses. In addition, early detection is vital for producing non-infected propagative material for future generations. The molecular techniques currently used to identify these bacteria at the strain level are expensive and laborious. Other techniques require a long time of ~48 h for detection. Thus, there is a clear need for rapid, inexpensive, accurate and reliable techniques for early detection of these bacteria. In this study, infrared spectroscopy, a well-established technique, was used for rapid detection of Pectobacterium and Dickeya spp. at the strain level. The bacteria were isolated from potato plants and tubers with soft rot symptoms and measured by infrared spectroscopy. The obtained spectra were analyzed using different machine learning algorithms. The performance of our approach for taxonomic classification of the bacterial samples was evaluated in terms of success rates. The success rates for correct classification at the genus, species and strain levels were ~100%, 95.2% and 92.6%, respectively.

Keywords: soft rot enterobacteriaceae (SRE), pectobacterium, dickeya, plant infections, potato, solanum tuberosum, infrared spectroscopy, machine learning

Procedia PDF Downloads 89
2052 Design of Bacterial Pathogens Identification System Based on Scattering of Laser Beam Light and Classification of Binned Plots

Authors: Mubashir Hussain, Mu Lv, Xiaohan Dong, Zhiyang Li, Bin Liu, Nongyue He

Abstract:

Detection and classification of microbes have a vast range of applications in biomedical engineering, especially in the detection, characterization, and quantification of bacterial contaminants. Different techniques for identifying pathogens are emerging in the field. The latest technology uses light scattering, which can identify different pathogens without any biochemical processing. The Bacterial Pathogens Identification System (BPIS) passes a laser beam through the sample, and the light scatters off it. An assembly of photodetectors surrounds the sample at different angles to detect the scattered light. The algorithm of the system consists of two parts: (a) library files and (b) a comparator. Library files contain data on known bacterial species in the form of binned plots, while the comparator compares data from an unknown sample with the library files. From the data collected for an unknown bacterial species, the highest voltage values are stored as peaks and arranged in 3D histograms to find their frequency of occurrence. The resulting data are compared with the library files of known bacterial species. If the sample data match a library file, the sample is identified as the matched microbe. An experiment was performed to identify three different bacteria: Enterococcus faecalis, Pseudomonas aeruginosa, and Escherichia coli. Applying the algorithm with library files of the given samples gave promising results. This system is potentially applicable to several biomedical areas, especially those related to cell morphology.
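The library-and-comparator idea can be sketched with a simple histogram comparison. The abstract does not specify the matching criterion, so the histogram-intersection score, the 0.9 threshold, the one-dimensional bins, and the library contents below are all assumptions made for illustration.

```python
def normalize(histogram):
    # Convert bin counts to relative frequencies.
    total = sum(histogram)
    return [count / total for count in histogram]

def intersection(hist_a, hist_b):
    # Histogram intersection: 1.0 for identical shapes, lower otherwise.
    return sum(min(a, b) for a, b in zip(normalize(hist_a), normalize(hist_b)))

def match_sample(sample, library, threshold=0.9):
    """Return the best-matching library entry, or None if no entry
    reaches the (assumed) similarity threshold."""
    best_name, best_score = None, 0.0
    for name, reference in library.items():
        score = intersection(sample, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# Hypothetical binned peak counts for two known species.
library = {"species_a": [10, 30, 40, 20], "species_b": [40, 30, 20, 10]}
matched = match_sample([12, 28, 38, 22], library)
unmatched = match_sample([25, 25, 25, 25], library)
print(matched, unmatched)
```

Returning None below the threshold mirrors the abstract's rule that a sample is identified only when its data match a library file.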

Keywords: microbial identification, laser scattering, peak identification, binned plots classification

Procedia PDF Downloads 137