Search results for: classification methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 16655

Search results for: classification methods

16145 Unsupervised Learning of Spatiotemporally Coherent Metrics

Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Abstract:

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity. We establish a connection between slow feature learning to metric learning and show that the trained encoder can be used to define a more temporally and semantically coherent metric.

Keywords: machine learning, pattern clustering, pooling, classification

Procedia PDF Downloads 450
16144 Remote Sensing through Deep Neural Networks for Satellite Image Classification

Authors: Teja Sai Puligadda

Abstract:

Satellite images in detail can serve an important role in the geographic study. Quantitative and qualitative information provided by the satellite and remote sensing images minimizes the complexity of work and time. Data/images are captured at regular intervals by satellite remote sensing systems, and the amount of data collected is often enormous, and it expands rapidly as technology develops. Interpreting remote sensing images, geographic data mining, and researching distinct vegetation types such as agricultural and forests are all part of satellite image categorization. One of the biggest challenge data scientists faces while classifying satellite images is finding the best suitable classification algorithms based on the available that could able to classify images with utmost accuracy. In order to categorize satellite images, which is difficult due to the sheer volume of data, many academics are turning to deep learning machine algorithms. As, the CNN algorithm gives high accuracy in image recognition problems and automatically detects the important features without any human supervision and the ANN algorithm stores information on the entire network (Abhishek Gupta., 2020), these two deep learning algorithms have been used for satellite image classification. This project focuses on remote sensing through Deep Neural Networks i.e., ANN and CNN with Deep Sat (SAT-4) Airborne dataset for classifying images. Thus, in this project of classifying satellite images, the algorithms ANN and CNN are implemented, evaluated & compared and the performance is analyzed through evaluation metrics such as Accuracy and Loss. Additionally, the Neural Network algorithm which gives the lowest bias and lowest variance in solving multi-class satellite image classification is analyzed.

Keywords: artificial neural network, convolutional neural network, remote sensing, accuracy, loss

Procedia PDF Downloads 153
16143 The Classification Accuracy of Finance Data through Holder Functions

Authors: Yeliz Karaca, Carlo Cattani

Abstract:

This study focuses on the local Holder exponent as a measure of the function regularity for time series related to finance data. In this study, the attributes of the finance dataset belonging to 13 countries (India, China, Japan, Sweden, France, Germany, Italy, Australia, Mexico, United Kingdom, Argentina, Brazil, USA) located in 5 different continents (Asia, Europe, Australia, North America and South America) have been examined.These countries are the ones mostly affected by the attributes with regard to financial development, covering a period from 2012 to 2017. Our study is concerned with the most important attributes that have impact on the development of finance for the countries identified. Our method is comprised of the following stages: (a) among the multi fractal methods and Brownian motion Holder regularity functions (polynomial, exponential), significant and self-similar attributes have been identified (b) The significant and self-similar attributes have been applied to the Artificial Neuronal Network (ANN) algorithms (Feed Forward Back Propagation (FFBP) and Cascade Forward Back Propagation (CFBP)) (c) the outcomes of classification accuracy have been compared concerning the attributes that have impact on the attributes which affect the countries’ financial development. This study has enabled to reveal, through the application of ANN algorithms, how the most significant attributes are identified within the relevant dataset via the Holder functions (polynomial and exponential function).

Keywords: artificial neural networks, finance data, Holder regularity, multifractals

Procedia PDF Downloads 242
16142 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, researches such as virtual assistants and dialogue systems have received widespread attention from industry and academia. The response generation plays a crucial role in dialogue systems, so to push forward the research on this topic, this paper surveys various methods for response generation. We sort out these methods into three categories. First one includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base method, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based knowledge and deep learning. We compare their disadvantages and advantages and point out in which ways these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: deep learning, generative, knowledge, response generation, retrieval

Procedia PDF Downloads 130
16141 An Automatic Generating Unified Modelling Language Use Case Diagram and Test Cases Based on Classification Tree Method

Authors: Wassana Naiyapo, Atichat Sangtong

Abstract:

The processes in software development by Object Oriented methodology have many stages those take time and high cost. The inconceivable error in system analysis process will affect to the design and the implementation process. The unexpected output causes the reason why we need to revise the previous process. The more rollback of each process takes more expense and delayed time. Therefore, the good test process from the early phase, the implemented software is efficient, reliable and also meet the user’s requirement. Unified Modelling Language (UML) is the tool which uses symbols to describe the work process in Object Oriented Analysis (OOA). This paper presents the approach for automatically generated UML use case diagram and test cases. UML use case diagram is generated from the event table and test cases are generated from use case specifications and Graphic User Interfaces (GUI). Test cases are derived from the Classification Tree Method (CTM) that classify data to a node present in the hierarchy structure. Moreover, this paper refers to the program that generates use case diagram and test cases. As the result, it can reduce work time and increase efficiency work.

Keywords: classification tree method, test case, UML use case diagram, use case specification

Procedia PDF Downloads 159
16140 Airport Pavement Crack Measurement Systems and Crack Density for Pavement Evaluation

Authors: Ali Ashtiani, Hamid Shirazi

Abstract:

This paper reviews the status of existing practice and research related to measuring pavement cracking and using crack density as a pavement surface evaluation protocol. Crack density for pavement evaluation is currently not widely used within the airport community and its use by the highway community is limited. However, surface cracking is a distress that is closely monitored by airport staff and significantly influences the development of maintenance, rehabilitation and reconstruction plans for airport pavements. Therefore crack density has the potential to become an important indicator of pavement condition if the type, severity and extent of surface cracking can be accurately measured. A pavement distress survey is an essential component of any pavement assessment. Manual crack surveying has been widely used for decades to measure pavement performance. However, the accuracy and precision of manual surveys can vary depending upon the surveyor and performing surveys may disrupt normal operations. Given the variability of manual surveys, this method has shown inconsistencies in distress classification and measurement. This can potentially impact the planning for pavement maintenance, rehabilitation and reconstruction and the associated funding strategies. A substantial effort has been devoted for the past 20 years to reduce the human intervention and the error associated with it by moving toward automated distress collection methods. The automated methods refer to the systems that identify, classify and quantify pavement distresses through processes that require no or very minimal human intervention. This principally involves the use of a digital recognition software to analyze and characterize pavement distresses. The lack of established protocols for measurement and classification of pavement cracks captured using digital images is a challenge to developing a reliable automated system for distress assessment. Variations in types and severity of distresses, different pavement surface textures and colors and presence of pavement joints and edges all complicate automated image processing and crack measurement and classification. This paper summarizes the commercially available systems and technologies for automated pavement distress evaluation. A comprehensive automated pavement distress survey involves collection, interpretation, and processing of the surface images to identify the type, quantity and severity of the surface distresses. The outputs can be used to quantitatively calculate the crack density. The systems for automated distress survey using digital images reviewed in this paper can assist the airport industry in the development of a pavement evaluation protocol based on crack density. Analysis of automated distress survey data can lead to a crack density index. This index can be used as a means of assessing pavement condition and to predict pavement performance. This can be used by airport owners to determine the type of pavement maintenance and rehabilitation in a more consistent way.

Keywords: airport pavement management, crack density, pavement evaluation, pavement management

Procedia PDF Downloads 183
16139 Amplifying Sine Unit-Convolutional Neural Network: An Efficient Deep Architecture for Image Classification and Feature Visualizations

Authors: Jamshaid Ul Rahman, Faiza Makhdoom, Dianchen Lu

Abstract:

Activation functions play a decisive role in determining the capacity of Deep Neural Networks (DNNs) as they enable neural networks to capture inherent nonlinearities present in data fed to them. The prior research on activation functions primarily focused on the utility of monotonic or non-oscillatory functions, until Growing Cosine Unit (GCU) broke the taboo for a number of applications. In this paper, a Convolutional Neural Network (CNN) model named as ASU-CNN is proposed which utilizes recently designed activation function ASU across its layers. The effect of this non-monotonic and oscillatory function is inspected through feature map visualizations from different convolutional layers. The optimization of proposed network is offered by Adam with a fine-tuned adjustment of learning rate. The network achieved promising results on both training and testing data for the classification of CIFAR-10. The experimental results affirm the computational feasibility and efficacy of the proposed model for performing tasks related to the field of computer vision.

Keywords: amplifying sine unit, activation function, convolutional neural networks, oscillatory activation, image classification, CIFAR-10

Procedia PDF Downloads 105
16138 A Similar Image Retrieval System for Auroral All-Sky Images Based on Local Features and Color Filtering

Authors: Takanori Tanaka, Daisuke Kitao, Daisuke Ikeda

Abstract:

The aurora is an attractive phenomenon but it is difficult to understand the whole mechanism of it. An approach of data-intensive science might be an effective approach to elucidate such a difficult phenomenon. To do that we need labeled data, which shows when and what types of auroras, have appeared. In this paper, we propose an image retrieval system for auroral all-sky images, some of which include discrete and diffuse aurora, and the other do not any aurora. The proposed system retrieves images which are similar to the query image by using a popular image recognition method. Using 300 all-sky images obtained at Tromso Norway, we evaluate two methods of image recognition methods with or without our original color filtering method. The best performance is achieved when SIFT with the color filtering is used and its accuracy is 81.7% for discrete auroras and 86.7% for diffuse auroras.

Keywords: data-intensive science, image classification, content-based image retrieval, aurora

Procedia PDF Downloads 445
16137 Determination of Water Pollution and Water Quality with Decision Trees

Authors: Çiğdem Bakır, Mecit Yüzkat

Abstract:

With the increasing emphasis on water quality worldwide, the search for and expanding the market for new and intelligent monitoring systems has increased. The current method is the laboratory process, where samples are taken from bodies of water, and tests are carried out in laboratories. This method is time-consuming, a waste of manpower, and uneconomical. To solve this problem, we used machine learning methods to detect water pollution in our study. We created decision trees with the Orange3 software we used in our study and tried to determine all the factors that cause water pollution. An automatic prediction model based on water quality was developed by taking many model inputs such as water temperature, pH, transparency, conductivity, dissolved oxygen, and ammonia nitrogen with machine learning methods. The proposed approach consists of three stages: preprocessing of the data used, feature detection, and classification. We tried to determine the success of our study with different accuracy metrics and the results. We presented it comparatively. In addition, we achieved approximately 98% success with the decision tree.

Keywords: decision tree, water quality, water pollution, machine learning

Procedia PDF Downloads 77
16136 Activity Data Analysis for Status Classification Using Fitness Trackers

Authors: Rock-Hyun Choi, Won-Seok Kang, Chang-Sik Son

Abstract:

Physical activity is important for healthy living. Recently wearable devices which motivate physical activity are quickly developing, and become cheaper and more comfortable. In particular, fitness trackers provide a variety of information and need to provide well-analyzed, and user-friendly results. In this study, frequency analysis was performed to classify various data sets of Fitbit into simple activity status. The data from Fitbit cloud server consists of 263 subjects who were healthy factory and office workers in Korea from March 7th to April 30th, 2016. In the results, we found assumptions of activity state classification seem to be sufficient and reasonable.

Keywords: activity status, fitness tracker, heart rate, steps

Procedia PDF Downloads 379
16135 Classification of Traffic Complex Acoustic Space

Authors: Bin Wang, Jian Kang

Abstract:

After years of development, the study of soundscape has been refined to the types of urban space and building. Traffic complex takes traffic function as the core, with obvious design features of architectural space combination and traffic streamline. The acoustic environment is strongly characterized by function, space, material, user and other factors. Traffic complex integrates various functions of business, accommodation, entertainment and so on. It has various forms, complex and varied experiences, and its acoustic environment is turned rich and interesting with distribution and coordination of various functions, division and unification of the mass, separation and organization of different space and the cross and the integration of multiple traffic flow. In this study, it made field recordings of each space of various traffic complex, and extracted and analyzed different acoustic elements, including changes in sound pressure, frequency distribution, steady sound source, sound source information and other aspects, to make cluster analysis of each independent traffic complex buildings. It divided complicated traffic complex building space into several typical sound space from acoustic environment perspective, mainly including stable sound space, high-pressure sound space, rhythm sound space and upheaval sound space. This classification can further deepen the study of subjective evaluation and control of the acoustic environment of traffic complex.

Keywords: soundscape, traffic complex, cluster analysis, classification

Procedia PDF Downloads 247
16134 Classification of Myoelectric Signals Using Multilayer Perceptron Neural Network with Back-Propagation Algorithm in a Wireless Surface Myoelectric Prosthesis of the Upper-Limb

Authors: Kevin D. Manalo, Jumelyn L. Torres, Noel B. Linsangan

Abstract:

This paper focuses on a wireless myoelectric prosthesis of the upper-limb that uses a Multilayer Perceptron Neural network with back propagation. The algorithm is widely used in pattern recognition. The network can be used to train signals and be able to use it in performing a function on their own based on sample inputs. The paper makes use of the Neural Network in classifying the electromyography signal that is produced by the muscle in the amputee’s skin surface. The gathered data will be passed on through the Classification Stage wirelessly through Zigbee Technology. The signal will be classified and trained to be used in performing the arm positions in the prosthesis. Through programming using Verilog and using a Field Programmable Gate Array (FPGA) with Zigbee, the EMG signals will be acquired and will be used for classification. The classified signal is used to produce the corresponding Hand Movements (Open, Pick, Hold, and Grip) through the Zigbee controller. The data will then be processed through the MLP Neural Network using MATLAB which then be used for the surface myoelectric prosthesis. Z-test will be used to display the output acquired from using the neural network.

Keywords: field programmable gate array, multilayer perceptron neural network, verilog, zigbee

Procedia PDF Downloads 386
16133 Conformance to Spatial Planning between the Kampala Physical Development Plan of 2012 and the Existing Land Use in 2021

Authors: Brendah Nagula, Omolo Fredrick Okalebo, Ronald Ssengendo, Ivan Bamweyana

Abstract:

The Kampala Physical Development Plan (KPDP) was developed in 2012 and projected both long term and short term developments within the City .The purpose of the plan was to not only shape the city into a spatially planned area but also to control the urban sprawl trends that had expanded with pronounced instances of informal settlements. This plan was approved by the National Physical Planning Board and a signature was appended by the Minister in 2013. Much as the KPDP plan has been implemented using different approaches such as detailed planning, development control, subdivision planning, carrying out construction inspections, greening and beautification, there is still limited knowledge on the level of conformance towards this plan. Therefore, it is yet to be determined whether it has been effective in shaping the City into an ideal spatially planned area. Attaining a clear picture of the level of conformance towards the KPDP 2012 through evaluation between the planned and the existing land use in Kampala City was performed. Methods such as Supervised Classification and Post Classification Change Detection were adopted to perform this evaluation. Scrutiny of findings revealed Central Division registered the lowest level of conformance to the planning standards specified in the KPDP 2012 followed by Nakawa, Rubaga, Kawempe, and Makindye. Furthermore, mixed-use development was identified as the land use with the highest level of non-conformity of 25.11% and institutional land use registered the highest level of conformance of 84.45 %. The results show that the aspect of location was not carefully considered while allocating uses in the KPDP whereby areas located near the Central Business District have higher land rents and hence require uses that ensure profit maximization. Also, the prominence of development towards mixed-use denotes an increased demand for land towards compact development that was not catered for in the plan. Therefore in order to transform Kampala city into a spatially planned area, there is need to carefully develop detailed plans especially for all the Central Division planning precincts indicating considerations for land use densification.

Keywords: spatial plan, post classification change detection, Kampala city, landuse

Procedia PDF Downloads 86
16132 A Review of Feature Selection Methods Implemented in Neural Stem Cells

Authors: Natasha Petrovska, Mirjana Pavlovic, Maria M. Larrondo-Petrie

Abstract:

Neural stem cells (NSCs) are multi-potent, self-renewing cells that generate new neurons. Three subtypes of NSCs can be separated regarding the stages of NSC lineage: quiescent neural stem cells (qNSCs), activated neural stem cells (aNSCs) and neural progenitor cells (NPCs), but their gene expression signatures are not utterly understood yet. Single-cell examinations have started to elucidate the complex structure of NSC populations. Nevertheless, there is a lack of thorough molecular interpretation of the NSC lineage heterogeneity and an increasing need for tools to analyze and improve the efficiency and correctness of single-cell sequencing data. Feature selection and ordering can identify and classify the gene expression signatures of these subtypes and can discover novel subpopulations during the NSCs activation and differentiation processes. The aim here is to review the implementation of the feature selection technique on NSC subtypes and the classification techniques that have been used for the identification of gene expression signatures.

Keywords: feature selection, feature similarity, neural stem cells, genes, feature selection methods

Procedia PDF Downloads 146
16131 A Framework for Automated Nuclear Waste Classification

Authors: Seonaid Hume, Gordon Dobie, Graeme West

Abstract:

Detecting and localizing radioactive sources is a necessity for safe and secure decommissioning of nuclear facilities. An important aspect for the management of the sort-and-segregation process is establishing the spatial distributions and quantities of the waste radionuclides, their type, corresponding activity, and ultimately classification for disposal. The data received from surveys directly informs decommissioning plans, on-site incident management strategies, the approach needed for a new cell, as well as protecting the workforce and the public. Manual classification of nuclear waste from a nuclear cell is time-consuming, expensive, and requires significant expertise to make the classification judgment call. Also, in-cell decommissioning is still in its relative infancy, and few techniques are well-developed. As with any repetitive and routine tasks, there is the opportunity to improve the task of classifying nuclear waste using autonomous systems. Hence, this paper proposes a new framework for the automatic classification of nuclear waste. This framework consists of five main stages; 3D spatial mapping and object detection, object classification, radiological mapping, source localisation based on gathered evidence and finally, waste classification. The first stage of the framework, 3D visual mapping, involves object detection from point cloud data. A review of related applications in other industries is provided, and recommendations for approaches for waste classification are made. Object detection focusses initially on cylindrical objects since pipework is significant in nuclear cells and indeed any industrial site. The approach can be extended to other commonly occurring primitives such as spheres and cubes. This is in preparation of stage two, characterizing the point cloud data and estimating the dimensions, material, degradation, and mass of the objects detected in order to feature match them to an inventory of possible items found in that nuclear cell. Many items in nuclear cells are one-offs, have limited or poor drawings available, or have been modified since installation, and have complex interiors, which often and inadvertently pose difficulties when accessing certain zones and identifying waste remotely. Hence, this may require expert input to feature match objects. The third stage, radiological mapping, is similar in order to facilitate the characterization of the nuclear cell in terms of radiation fields, including the type of radiation, activity, and location within the nuclear cell. The fourth stage of the framework takes the visual map for stage 1, the object characterization from stage 2, and radiation map from stage 3 and fuses them together, providing a more detailed scene of the nuclear cell by identifying the location of radioactive materials in three dimensions. The last stage involves combining the evidence from the fused data sets to reveal the classification of the waste in Bq/kg, thus enabling better decision making and monitoring for in-cell decommissioning. The presentation of the framework is supported by representative case study data drawn from an application in decommissioning from a UK nuclear facility. This framework utilises recent advancements of the detection and mapping capabilities of complex radiation fields in three dimensions to make the process of classifying nuclear waste faster, more reliable, cost-effective and safer.

Keywords: nuclear decommissioning, radiation detection, object detection, waste classification

Procedia PDF Downloads 198
16130 Evaluation and Assessment of Bioinformatics Methods and Their Applications

Authors: Fatemeh Nokhodchi Bonab

Abstract:

Bioinformatics, in its broad sense, involves application of computer processes to solve biological problems. A wide range of computational tools are needed to effectively and efficiently process large amounts of data being generated as a result of recent technological innovations in biology and medicine. A number of computational tools have been developed or adapted to deal with the experimental riches of complex and multivariate data and transition from data collection to information or knowledge. These bioinformatics tools are being evaluated and applied in various medical areas including early detection, risk assessment, classification, and prognosis of cancer. The goal of these efforts is to develop and identify bioinformatics methods with optimal sensitivity, specificity, and predictive capabilities. The recent flood of data from genome sequences and functional genomics has given rise to new field, bioinformatics, which combines elements of biology and computer science. Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems.

Keywords: methods, applications, transcriptional regulatory systems, techniques

Procedia PDF Downloads 121
16129 Roof Material Detection Based on Object-Based Approach Using WorldView-2 Satellite Imagery

Authors: Ebrahim Taherzadeh, Helmi Z. M. Shafri, Kaveh Shahi

Abstract:

One of the most important tasks in urban area remote sensing is detection of impervious surface (IS), such as building roof and roads. However, detection of IS in heterogeneous areas still remains as one of the most challenging works. In this study, detection of concrete roof using an object-oriented approach was proposed. A new rule-based classification was developed to detect concrete roof tile. The proposed rule-based classification was applied to WorldView-2 image. Results showed that the proposed rule has good potential to predict concrete roof material from WorldView-2 images with 85% accuracy.

Keywords: object-based, roof material, concrete tile, WorldView-2

Procedia PDF Downloads 419
16128 Global Positioning System Match Characteristics as a Predictor of Badminton Players’ Group Classification

Authors: Yahaya Abdullahi, Ben Coetzee, Linda Van Den Berg

Abstract:

The study aimed at establishing the global positioning system (GPS) determined singles match characteristics that act as predictors of successful and less-successful male singles badminton players’ group classification. Twenty-two (22) male single players (aged: 23.39 ± 3.92 years; body stature: 177.11 ± 3.06cm; body mass: 83.46 ± 14.59kg) who represented 10 African countries participated in the study. Players were categorised as successful and less-successful players according to the results of five championships’ of the 2014/2015 season. GPS units (MinimaxX V4.0), Polar Heart Rate Transmitter Belts and digital video cameras were used to collect match data. GPS-related variables were corrected for match duration and independent t-tests, a cluster analysis and a binary forward stepwise logistic regression were calculated. A Receiver Operating Characteristic Curve (ROC) was used to determine the validity of the group classification model. High-intensity accelerations per second were identified as the only GPS-determined variable that showed a significant difference between groups. Furthermore, only high-intensity accelerations per second (p=0.03) and low-intensity efforts per second (p=0.04) were identified as significant predictors of group classification with 76.88% of players that could be classified back into their original groups by making use of the GPS-based logistic regression formula. The ROC showed a value of 0.87. The identification of the last-mentioned GPS-related variables for the attainment of badminton performances, emphasizes the importance of using badminton drills and conditioning techniques to not only improve players’ physical fitness levels but also their abilities to accelerate at high intensities.

Keywords: badminton, global positioning system, match analysis, inertial movement analysis, intensity, effort

Procedia PDF Downloads 187
16127 Revisiting the Swadesh Wordlist: How Long Should It Be

Authors: Feda Negesse

Abstract:

One of the most important indicators of research quality is a good data - collection instrument that can yield reliable and valid data. The Swadesh wordlist has been used for more than half a century for collecting data in comparative and historical linguistics though arbitrariness is observed in its application and size. This research compare s the classification results of the 100 Swadesh wordlist with those of its subsets to determine if reducing the size of the wordlist impact s its effectiveness. In the comparison, the 100, 50 and 40 wordlists were used to compute lexical distances of 29 Cushitic and Semitic languages spoken in Ethiopia and neighbouring countries. Gabmap, a based application, was employed to compute the lexical distances and to divide the languages into related clusters. The study shows that the subsets are not as effective as the 100 wordlist in clustering languages into smaller subgroups but they are equally effective in di viding languages into bigger groups such as subfamilies. It is noted that the subsets may lead to an erroneous classification whereby unrelated languages by chance form a cluster which is not attested by a comparative study. The chance to get a wrong result is higher when the subsets are used to classify languages which are not closely related. Though a further study is still needed to settle the issues around the size of the Swadesh wordlist, this study indicates that the 50 and 40 wordlists cannot be recommended as reliable substitute s for the 100 wordlist under all circumstances. The choice seems to be determined by the objective of a researcher and the degree of affiliation among the languages to be classified.

Keywords: classification, Cushitic, Swadesh, wordlist

Procedia PDF Downloads 295
16126 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark

Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos

Abstract:

This paper presents an investigation of the performance impacts regarding the variation of five factors (input data size, node number, cores, memory, and disks) when applying a distributed implementation of Naïve Bayes for text classification of a large Corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing before-hand the effects of each factor becomes especially critical as hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: the solid statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results about the small influence of cores and the neutrality of memory and disks on total execution time, and the non-significant impact of data input scale on costs, although notably impacts the execution time.

Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark

Procedia PDF Downloads 112
16125 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Keywords: data mining, digital libraries, digital preservation, file format

Procedia PDF Downloads 491
16124 Proposal for a Web System for the Control of Fungal Diseases in Grapes in Fruits Markets

Authors: Carlos Tarmeño Noriega, Igor Aguilar Alonso

Abstract:

Fungal diseases are common in vineyards; they cause a decrease in the quality of the products that can be sold, generating distrust of the customer towards the seller when buying fruit. Currently, technology allows the classification of fruits according to their characteristics thanks to artificial intelligence. This study proposes the implementation of a control system that allows the identification of the main fungal diseases present in the Italia grape, making use of a convolutional neural network (CNN), OpenCV, and TensorFlow. The methodology used was based on a collection of 20 articles referring to the proposed research on quality control, classification, and recognition of fruits through artificial vision techniques.

Keywords: computer vision, convolutional neural networks, quality control, fruit market, OpenCV, TensorFlow

Procedia PDF Downloads 79
16123 Valence and Arousal-Based Sentiment Analysis: A Comparative Study

Authors: Usama Shahid, Muhammad Zunnurain Hussain

Abstract:

This research paper presents a comprehensive analysis of a sentiment analysis approach that employs valence and arousal as its foundational pillars, in comparison to traditional techniques. Sentiment analysis is an indispensable task in natural language processing that involves the extraction of opinions and emotions from textual data. The valence and arousal dimensions, representing the intensity and positivity/negativity of emotions, respectively, enable the creation of four quadrants, each representing a specific emotional state. The study seeks to determine the impact of utilizing these quadrants to identify distinct emotional states on the accuracy and efficiency of sentiment analysis, in comparison to traditional techniques. The results reveal that the valence and arousal-based approach outperforms other approaches, particularly in identifying nuanced emotions that may be missed by conventional methods. The study's findings are crucial for applications such as social media monitoring and market research, where the accurate classification of emotions and opinions is paramount. Overall, this research highlights the potential of using valence and arousal as a framework for sentiment analysis and offers invaluable insights into the benefits of incorporating specific types of emotions into the analysis. These findings have significant implications for researchers and practitioners in the field of natural language processing, as they provide a basis for the development of more accurate and effective sentiment analysis tools.

Keywords: sentiment analysis, valence and arousal, emotional states, natural language processing, machine learning, text analysis, sentiment classification, opinion mining

Procedia PDF Downloads 93
16122 Development of a Biomechanical Method for Ergonomic Evaluation: Comparison with Observational Methods

Authors: M. Zare, S. Biau, M. Corq, Y. Roquelaure

Abstract:

A wide variety of observational methods have been developed to evaluate the ergonomic workloads in manufacturing. However, the precision and accuracy of these methods remain a subject of debate. The aims of this study were to develop biomechanical methods to evaluate ergonomic workloads and to compare them with observational methods. Two observational methods, i.e. SCANIA Ergonomic Standard (SES) and Rapid Upper Limb Assessment (RULA), were used to assess ergonomic workloads at two simulated workstations. They included four tasks such as tightening & loosening, attachment of tubes and strapping as well as other actions. Sensors were also used to measure biomechanical data (Inclinometers, Accelerometers, and Goniometers). Our findings showed that in assessment of some risk factors both RULA & SES were in agreement with the results of biomechanical methods. However, there was disagreement on neck and wrist postures. In conclusion, the biomechanical approach was more precise than observational methods, but some risk factors evaluated with observational methods were not measurable with the biomechanical techniques developed.

Keywords: ergonomic, observational method, biomechanical methods, workload

Procedia PDF Downloads 386
16121 2D Point Clouds Features from Radar for Helicopter Classification

Authors: Danilo Habermann, Aleksander Medella, Carla Cremon, Yusef Caceres

Abstract:

This paper aims to analyze the ability of 2d point clouds features to classify different models of helicopters using radars. This method does not need to estimate the blade length, the number of blades of helicopters, and the period of their micro-Doppler signatures. It is also not necessary to generate spectrograms (or any other image based on time and frequency domain). This work transforms a radar return signal into a 2D point cloud and extracts features of it. Three classifiers are used to distinguish 9 different helicopter models in order to analyze the performance of the features used in this work. The high accuracy obtained with each of the classifiers demonstrates that the 2D point clouds features are very useful for classifying helicopters from radar signal.

Keywords: helicopter classification, point clouds features, radar, supervised classifiers

Procedia PDF Downloads 219
16120 Fully Automated Methods for the Detection and Segmentation of Mitochondria in Microscopy Images

Authors: Blessing Ojeme, Frederick Quinn, Russell Karls, Shannon Quinn

Abstract:

The detection and segmentation of mitochondria from fluorescence microscopy are crucial for understanding the complex structure of the nervous system. However, the constant fission and fusion of mitochondria and image distortion in the background make the task of detection and segmentation challenging. In the literature, a number of open-source software tools and artificial intelligence (AI) methods have been described for analyzing mitochondrial images, achieving remarkable classification and quantitation results. However, the availability of combined expertise in the medical field and AI required to utilize these tools poses a challenge to its full adoption and use in clinical settings. Motivated by the advantages of automated methods in terms of good performance, minimum detection time, ease of implementation, and cross-platform compatibility, this study proposes a fully automated framework for the detection and segmentation of mitochondria using both image shape information and descriptive statistics. Using the low-cost, open-source python and openCV library, the algorithms are implemented in three stages: pre-processing, image binarization, and coarse-to-fine segmentation. The proposed model is validated using the mitochondrial fluorescence dataset. Ground truth labels generated using a Lab kit were also used to evaluate the performance of our detection and segmentation model. The study produces good detection and segmentation results and reports the challenges encountered during the image analysis of mitochondrial morphology from the fluorescence mitochondrial dataset. A discussion on the methods and future perspectives of fully automated frameworks conclude the paper.

Keywords: 2D, binarization, CLAHE, detection, fluorescence microscopy, mitochondria, segmentation

Procedia PDF Downloads 355
16119 Breaking the Barrier of Service Hostility: A Lean Approach to Achieve Operational Excellence

Authors: Mofizul Islam Awwal

Abstract:

Due to globalization, industries are rapidly growing throughout the world which leads to many manufacturing organizations. But recently, service industries are beginning to emerge in large numbers almost in all parts of the world including some developing countries. In this context, organizations need to have strong competitive advantage over their rivals to achieve their strategic business goals. Manufacturing industries are adopting many methods and techniques in order to achieve such competitive edge. Over the last decades, manufacturing industries have been successfully practicing lean concept to optimize their production lines. Due to its huge success in manufacturing context, lean has made its way into the service industry. Very little importance has been addressed to service in the area of operations management. Service industries are far behind than manufacturing industries in terms of operations improvement. It will be a hectic job to transfer the lean concept from production floor to service back/front office which will obviously yield possible improvement. Service processes are not as visible as production processes and can be very complex. Lack of research in this area made it quite difficult for service industries as there are no standardized frameworks for successfully implementing lean concept in service organization. The purpose of this research paper is to capture the present scenario of service industry in terms of lean implementation. Thorough analysis of past literature will be done on the applicability and understanding of lean in service structure. Classification of research papers will be done and critical factors will be unveiled for implementing lean in service industry to achieve operational excellence.

Keywords: lean service, lean literature classification, lean implementation, service industry, service excellence

Procedia PDF Downloads 372
16118 The Functions of “Question” and Its Role in Education Process: Quranic Approach

Authors: Sara Tusian, Zahra Salehi Motaahed, Narges Sajjadie, Nikoo Dialame

Abstract:

One of the methods which have frequently been used in Quran is the “question”. In the Quran, in addition to the content, methods are also important. Using analysis-interpretation method, the present study has investigated Quranic questions, and extracted its functions from educational perspective. In so doing, it has first investigated all the questions in Quran and then taking the three-stage classification of education into account, it has offered question functions. The results obtained from this study suggest that question functions in Quran are presented in three categories: the preparation stage (including preparation of the audience, revising the insights, and internal Evolution); main body (including the granting the insight, and elimination of intellectual negligence and the question of innate and logical axioms, the introducting of the realm of thinking, creating emotional arousal and alleged in the claim) and the third stage as modification and revision (including invitation to move in the framework of tasks using the individual beliefs to reveal the contradictions and, Error detection and contribution to change the function) that each of which has a special role in the education process.

Keywords: education, question, Quranic questions, Quran

Procedia PDF Downloads 499
16117 A New Conjugate Gradient Method with Guaranteed Descent

Authors: B. Sellami, M. Belloufi

Abstract:

Conjugate gradient methods are an important class of methods for unconstrained optimization, especially for large-scale problems. Recently, they have been much studied. In this paper, we propose a new two-parameter family of conjugate gradient methods for unconstrained optimization. The two-parameter family of methods not only includes the already existing three practical nonlinear conjugate gradient methods, but also has other family of conjugate gradient methods as subfamily. The two-parameter family of methods with the Wolfe line search is shown to ensure the descent property of each search direction. Some general convergence results are also established for the two-parameter family of methods. The numerical results show that this method is efficient for the given test problems. In addition, the methods related to this family are uniformly discussed.

Keywords: unconstrained optimization, conjugate gradient method, line search, global convergence

Procedia PDF Downloads 447
16116 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 47