Search results for: hierarchical text classification models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10057

Search results for: hierarchical text classification models

9547 Isolation and Classification of Red Blood Cells in Anemic Microscopic Images

Authors: Jameela Ali Alkrimi, Abdul Rahim Ahmad, Azizah Suliman, Loay E. George

Abstract:

Red blood cells (RBCs) are among the most commonly and intensively studied type of blood cells in cell biology. The lack of RBCs is a condition characterized by lower than normal hemoglobin level; this condition is referred to as 'anemia'. In this study, a software was developed to isolate RBCs by using a machine learning approach to classify anemic RBCs in microscopic images. Several features of RBCs were extracted using image processing algorithms, including principal component analysis (PCA). With the proposed method, RBCs were isolated in 34 second from an image containing 18 to 27 cells. We also proposed that PCA could be performed to increase the speed and efficiency of classification. Our classifier algorithm yielded accuracy rates of 100%, 99.99%, and 96.50% for K-nearest neighbor (K-NN) algorithm, support vector machine (SVM), and neural network ANN, respectively. Classification was evaluated in highly sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained for a short time period with more efficient when PCA was used.

Keywords: red blood cells, pre-processing image algorithms, classification algorithms, principal component analysis PCA, confusion matrix, kappa statistical parameters, ROC

Procedia PDF Downloads 405
9546 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

The quality of a product to be usable has become the basic requirement in consumer’s perspective while failing the requirement ends up the customer from not using the product. Identifying usability issues from analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids in the process of product design, yet the lack of studies and researches regarding analysis methodologies in qualitative text data of usability field inhibits the potential of these data for more useful applications. While the possibility of analyzing qualitative text data found with the rapid development of data analysis studies such as natural language processing field in understanding human language in computer, and machine learning field in providing predictive model and clustering tool. Therefore, this research aims to study the application capability of text processing algorithm in analysis of qualitative text data collected from usability activities. This research utilized datasets collected from LG neckband headset usability experiment in which the datasets consist of headset survey text data, subject’s data and product physical data. In the analysis procedure, which integrated with the text-processing algorithm, the process includes training of comments onto vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the cluster of comment vectors where centroid comments of a cluster emphasized more on button positions, while centroid comments of the other cluster emphasized more on button interface issues. When volume and music control buttons are designed separately, the participant experienced less confusion, and thus, the comments mentioned only about the buttons' positions. While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 285
9545 An Attempt at the Multi-Criterion Classification of Small Towns

Authors: Jerzy Banski

Abstract:

The basic aim of this study is to discuss and assess different classifications and research approaches to small towns that take their social and economic functions into account, as well as relations with surrounding areas. The subject literature typically includes three types of approaches to the classification of small towns: 1) the structural, 2) the location-related, and 3) the mixed. The structural approach allows for the grouping of towns from the point of view of the social, cultural and economic functions they discharge. The location-related approach draws on the idea of there being a continuum between the center and the periphery. A mixed classification making simultaneous use of the different approaches to research brings the most information to bear in regard to categories of the urban locality. Bearing in mind the approaches to classification, it is possible to propose a synthetic method for classifying small towns that takes account of economic structure, location and the relationship between the towns and their surroundings. In the case of economic structure, the small centers may be divided into two basic groups – those featuring a multi-branch structure and those that are specialized economically. A second element of the classification reflects the locations of urban centers. Two basic types can be identified – the small town within the range of impact of a large agglomeration, or else the town outside such areas, which is to say located peripherally. The third component of the classification arises out of small towns’ relations with their surroundings. In consequence, it is possible to indicate 8 types of small-town: from local centers enjoying good accessibility and a multi-branch economic structure to peripheral supra-local centers characterised by a specialized economic structure.

Keywords: small towns, classification, functional structure, localization

Procedia PDF Downloads 182
9544 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk for hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor called Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset which consists of more than 70,000 patients with 50 attributes. We applied data preprocessing using different techniques in order to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieved classification accuracy of 80% compared to other models that only use k Nearest Neighbor.

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 186
9543 The Difference of Learning Outcomes in Reading Comprehension between Text and Film as The Media in Indonesian Language for Foreign Speaker in Intermediate Level

Authors: Siti Ayu Ningsih

Abstract:

This study aims to find the differences outcomes in learning reading comprehension with text and film as media on Indonesian Language for foreign speaker (BIPA) learning at intermediate level. By using quantitative and qualitative research methods, the respondent of this study is a single respondent from D'Royal Morocco Integrative Islamic School in grade nine from secondary level. Quantitative method used to calculate the learning outcomes that have been given the appropriate action cycle, whereas qualitative method used to translate the findings derived from quantitative methods to be described. The technique used in this study is the observation techniques and testing work. Based on the research, it is known that the use of the text media is more effective than the film for intermediate level of Indonesian Language for foreign speaker learner. This is because, when using film the learner does not have enough time to take note the difficult vocabulary and don't have enough time to look for the meaning of the vocabulary from the dictionary. While the use of media texts shows the better effectiveness because it does not require additional time to take note the difficult words. For the words that are difficult or strange, the learner can immediately find its meaning from the dictionary. The presence of the text is also very helpful for Indonesian Language for foreign speaker learner to find the answers according to the questions more easily. By matching the vocabulary of the question into the text references.

Keywords: Indonesian language for foreign speaker, learning outcome, media, reading comprehension

Procedia PDF Downloads 197
9542 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for loan were analyzed through data mining. The study was composed of the information such as amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they are in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers have been determined on the decision tree. The classification of these types of customers has been created with the rating of those posing a risk for the bank branch and the customers have been classified according to the risk ratings.

Keywords: client classification, loan suitability, risk rating, CART analysis

Procedia PDF Downloads 338
9541 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, in the case of manual categorization, not only can the accuracy of the categorization be not guaranteed but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the limitation of the requirement of a multi-categorized training set by traditional multi-categorization algorithms, we previously proposed a new methodology that can extend a category of a single-categorized document to multiple categorizes by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: big data analysis, document classification, multi-category, text mining, topic analysis

Procedia PDF Downloads 272
9540 Motives for Reshoring from China to Europe: A Hierarchical Classification of Companies

Authors: Fabienne Fel, Eric Griette

Abstract:

Reshoring, whether concerning back-reshoring or near-reshoring, is a quite recent phenomenon. Despite the economic and political interest of this topic, academic research questioning determinants of reshoring remains rare. Our paper aims at contributing to fill this gap. In order to better understand the reasons for reshoring, we conducted a study among 280 French firms during spring 2016, three-quarters of which sourced, or source, in China. 105 firms in the sample have reshored all or part of their Chinese production or supply in recent years, and we aimed to establish a typology of the motives that drove them to this decision. We asked our respondents about the history of their Chinese supplies, their current reshoring strategies, and their motivations. Statistical analysis was performed with SPSS 22 and SPAD 8. Our results show that change in commercial and financial terms with China is the first motive explaining the current reshoring movement from this country (it applies to 54% of our respondents). A change in corporate strategy is the second motive (30% of our respondents); the reshoring decision follows a change in companies’ strategies (upgrading, implementation of a CSR policy, or a 'lean management' strategy). The third motive (14% of our sample) is a mere correction of the initial offshoring decision, considered as a mistake (under-estimation of hidden costs, non-quality and non-responsiveness problems). Some authors emphasize that developing a short supply chain, involving geographic proximity between design and production, gives a competitive advantage to companies wishing to offer innovative products. Admittedly 40% of our respondents indicate that this motive could have played a part in their decision to reshore, but this reason was not enough for any of them and is not an intrinsic motive leading to leaving Chinese suppliers. Having questioned our respondents about the importance given to various problems leading them to reshore, we then performed a Principal Components Analysis (PCA), associated with an Ascending Hierarchical Classification (AHC), based on Ward criterion, so as to point out more specific motivations. Three main classes of companies should be distinguished: -The 'Cost Killers' (23% of the sample), which reshore their supplies from China only because of higher procurement costs and so as to find lower costs elsewhere. -The 'Realists' (50% of the sample), giving equal weight or importance to increasing procurement costs in China and to the quality of their supplies (to a large extend). Companies being part of this class tend to take advantage of this changing environment to change their procurement strategy, seeking suppliers offering better quality and responsiveness. - The 'Voluntarists' (26% of the sample), which choose to reshore their Chinese supplies regardless of higher Chinese costs, to obtain better quality and greater responsiveness. We emphasize that if the main driver for reshoring from China is indeed higher local costs, it is should not be regarded as an exclusive motivation; 77% of the companies in the sample, are also seeking, sometimes exclusively, more reactive suppliers, liable to quality, respect for the environment and intellectual property.

Keywords: China, procurement, reshoring, strategy, supplies

Procedia PDF Downloads 326
9539 Data-Driven Market Segmentation in Hospitality Using Unsupervised Machine Learning

Authors: Rik van Leeuwen, Ger Koole

Abstract:

Within hospitality, marketing departments use segmentation to create tailored strategies to ensure personalized marketing. This study provides a data-driven approach by segmenting guest profiles via hierarchical clustering based on an extensive set of features. The industry requires understandable outcomes that contribute to adaptability for marketing departments to make data-driven decisions and ultimately driving profit. A marketing department specified a business question that guides the unsupervised machine learning algorithm. Features of guests change over time; therefore, there is a probability that guests transition from one segment to another. The purpose of the study is to provide steps in the process from raw data to actionable insights, which serve as a guideline for how hospitality companies can adopt an algorithmic approach.

Keywords: hierarchical cluster analysis, hospitality, market segmentation

Procedia PDF Downloads 108
9538 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Abstract—Attribute or feature selection is one of the basic strategies to improve the performances of data classification tasks, and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed by a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: Maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to a a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: evolutionary computation, feature selection, classification, clustering

Procedia PDF Downloads 370
9537 Mood Recognition Using Indian Music

Authors: Vishwa Joshi

Abstract:

The study of mood recognition in the field of music has gained a lot of momentum in the recent years with machine learning and data mining techniques and many audio features contributing considerably to analyze and identify the relation of mood plus music. In this paper we consider the same idea forward and come up with making an effort to build a system for automatic recognition of mood underlying the audio song’s clips by mining their audio features and have evaluated several data classification algorithms in order to learn, train and test the model describing the moods of these audio songs and developed an open source framework. Before classification, Preprocessing and Feature Extraction phase is necessary for removing noise and gathering features respectively.

Keywords: music, mood, features, classification

Procedia PDF Downloads 495
9536 Horizontal Gender Inequality and Segregation at Workplace in China: Understanding How Implicit and Unconscious Gender Stereotypes Produce and Reinforce Workplace Gender Inequality in China through Interview-Based Qualitative Analysis

Authors: Yiyan Wu

Abstract:

In the past several decades, the market transition in China has brought in not only more opportunities for women in the labor market but also more attention to gender inequality in workplace. Although some pieces of literature have mentioned gender inequality and segregation at workplace in China, the paper looks into the variations of gender inequality and segregation: working women have little feeling about 'hierarchical inequalities', which define the status and position of women at the workplace. However, at the same time, they unconsciously reinforced 'horizontal inequalities', which creates gender segregation across occupations and job titles. Using qualitative interviews with women employers and employees of various occupations and job titles in Eastern and Southern China, this paper finds evidence that working women's understandings of the division of labor based on the characteristics and expectations of women and men are not as a result of rationality and efficiency, but instead, are the products of gendered stereotypes and traditions. However, holding positive views of gender equality at workplace, working women are not aware of the existence and influence of such gendered stereotypes and traditions. By distinguishing the concepts of 'horizontal inequality' and 'hierarchical inequality' with a cultural sociological approach, this paper contributes to the understanding of gender inequality and segregation in contemporary Chinese society. Moreover, this paper explains the logic behind the paradox in which gender inequality and segregation at workplace persist while women are feeling equal.

Keywords: gender equality, segregation, hierarchical inequality, horizontal inequality, China

Procedia PDF Downloads 164
9535 Increased Reaction and Movement Times When Text Messaging during Simulated Driving

Authors: Adriana M. Duquette, Derek P. Bornath

Abstract:

Reaction Time (RT) and Movement Time (MT) are important components of everyday life that have an effect on the way in which we move about our environment. These measures become even more crucial when an event can be caused (or avoided) in a fraction of a second, such as the RT and MT required while driving. The purpose of this study was to develop a more simple method of testing RT and MT during simulated driving with or without text messaging, in a university-aged population (n = 170). In the control condition, a randomly-delayed red light stimulus flashed on a computer interface after the participant began pressing the ‘gas’ pedal on a foot switch mat. Simple RT was defined as the time between the presentation of the light stimulus and the initiation of lifting the foot from the switch mat ‘gas’ pedal; while MT was defined as the time after the initiation of lifting the foot, to the initiation of depressing the switch mat ‘brake’ pedal. In the texting condition, upon pressing the ‘gas’ pedal, a ‘text message’ appeared on the computer interface in a dialog box that the participant typed on their cell phone while waiting for the light stimulus to turn red. In both conditions, the sequence was repeated 10 times, and an average RT (seconds) and average MT (seconds) were recorded. Condition significantly (p = .000) impacted overall RTs, as the texting condition (0.47 s) took longer than the no-texting (control) condition (0.34 s). Longer MTs were also recorded during the texting condition (0.28 s) than in the control condition (0.23 s), p = .001. Overall increases in Response Time (RT + MT) of 189 ms during the texting condition would equate to an additional 4.2 meters (to react to the stimulus and begin braking) if the participant had been driving an automobile at 80 km per hour. In conclusion, increasing task complexity due to the dual-task demand of text messaging during simulated driving caused significant increases in RT (41%), MT (23%) and Response Time (34%), thus further strengthening the mounting evidence against text messaging while driving.

Keywords: simulated driving, text messaging, reaction time, movement time

Procedia PDF Downloads 523
9534 Hyper Parameter Optimization of Deep Convolutional Neural Networks for Pavement Distress Classification

Authors: Oumaima Khlifati, Khadija Baba

Abstract:

Pavement distress is the main factor responsible for the deterioration of road structure durability, damage vehicles, and driver comfort. Transportation agencies spend a high proportion of their funds on pavement monitoring and maintenance. The auscultation of pavement distress was based on the manual survey, which was extremely time consuming, labor intensive, and required domain expertise. Therefore, the automatic distress detection is needed to reduce the cost of manual inspection and avoid more serious damage by implementing the appropriate remediation actions at the right time. Inspired by recent deep learning applications, this paper proposes an algorithm for automatic road distress detection and classification using on the Deep Convolutional Neural Network (DCNN). In this study, the types of pavement distress are classified as transverse or longitudinal cracking, alligator, pothole, and intact pavement. The dataset used in this work is composed of public asphalt pavement images. In order to learn the structure of the different type of distress, the DCNN models are trained and tested as a multi-label classification task. In addition, to get the highest accuracy for our model, we adjust the structural optimization hyper parameters such as the number of convolutions and max pooling, filers, size of filters, loss functions, activation functions, and optimizer and fine-tuning hyper parameters that conclude batch size and learning rate. The optimization of the model is executed by checking all feasible combinations and selecting the best performing one. The model, after being optimized, performance metrics is calculated, which describe the training and validation accuracies, precision, recall, and F1 score.

Keywords: distress pavement, hyperparameters, automatic classification, deep learning

Procedia PDF Downloads 93
9533 Hybrid Hierarchical Routing Protocol for WSN Lifetime Maximization

Authors: H. Aoudia, Y. Touati, E. H. Teguig, A. Ali Cherif

Abstract:

Conceiving and developing routing protocols for wireless sensor networks requires considerations on constraints such as network lifetime and energy consumption. In this paper, we propose a hybrid hierarchical routing protocol named HHRP combining both clustering mechanism and multipath optimization taking into account residual energy and RSSI measures. HHRP consists of classifying dynamically nodes into clusters where coordinators nodes with extra privileges are able to manipulate messages, aggregate data and ensure transmission between nodes according to TDMA and CDMA schedules. The reconfiguration of the network is carried out dynamically based on a threshold value which is associated with the number of nodes belonging to the smallest cluster. To show the effectiveness of the proposed approach HHRP, a comparative study with LEACH protocol is illustrated in simulations.

Keywords: routing protocol, optimization, clustering, WSN

Procedia PDF Downloads 469
9532 A Survey on Taxpayer's Compliance in Prospect Theory Structure Using Hierarchical Bayesian Approach

Authors: Sahar Dehghan, Yeganeh Mousavi Jahromi, Ghahraman Abdoli

Abstract:

Since tax revenues are one of the most important sources of government revenue, it is essential to consider increasing taxpayers' compliance. One of the factors that can affect the taxpayers' compliance is the structure of the crimes and incentives envisaged in the tax law. In this research, by using the 'prospect theory', the effects of changes in the rate of crimes and the tax incentive in the direct tax law on the taxpayer’s compliance behavior have been investigated. To determine the preferences and preferences of taxpayer’s in the business sector and their degree of sensitivity to fines and incentives, a questionnaire with mixed gamble structure is designed. Estimated results using the Hierarchical Bayesian method indicate that the taxpayer’s that have been tested in this study are more sensitive to the incentives in the direct tax law, and the tax administration can use this to increase the level of collected tax and increase the level of compliance.

Keywords: tax compliance, prospect theory, value function, mixed gamble

Procedia PDF Downloads 174
9531 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García

Abstract:

In this paper, we present the use of the discriminant analysis to select evolutionary algorithms that better solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm from the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classification was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. We obtained a classification of the discriminant analysis of 66.7%.

Keywords: Intelligent Transportation Systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning

Procedia PDF Downloads 472
9530 Air Classification of Dust from Steel Converter Secondary De-dusting for Zinc Enrichment

Authors: C. Lanzerstorfer

Abstract:

The off-gas from the basic oxygen furnace (BOF), where pig iron is converted into steel, is treated in the primary ventilation system. This system is in full operation only during oxygen-blowing when the BOF converter vessel is in a vertical position. When pig iron and scrap are charged into the BOF and when slag or steel are tapped, the vessel is tilted. The generated emissions during charging and tapping cannot be captured by the primary off-gas system. To capture these emissions, a secondary ventilation system is usually installed. The emissions are captured by a canopy hood installed just above the converter mouth in tilted position. The aim of this study was to investigate the dependence of Zn and other components on the particle size of BOF secondary ventilation dust. Because of the high temperature of the BOF process it can be expected that Zn will be enriched in the fine dust fractions. If Zn is enriched in the fine fractions, classification could be applied to split the dust into two size fractions with a different content of Zn. For this air classification experiments with dust from the secondary ventilation system of a BOF were performed. The results show that Zn and Pb are highly enriched in the finest dust fraction. For Cd, Cu and Sb the enrichment is less. In contrast, the non-volatile metals Al, Fe, Mn and Ti were depleted in the fine fractions. Thus, air classification could be considered for the treatment of dust from secondary BOF off-gas cleaning.

Keywords: air classification, converter dust, recycling, zinc

Procedia PDF Downloads 425
9529 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 144
9528 Spare Part Inventory Optimization Policy: A Study Literature

Authors: Zukhrof Romadhon, Nani Kurniati

Abstract:

Availability of Spare parts is critical to support maintenance tasks and the production system. Managing spare part inventory deals with some parameters and objective functions, as well as the tradeoff between inventory costs and spare parts availability. Several mathematical models and methods have been developed to optimize the spare part policy. Many researchers who proposed optimization models need to be considered to identify other potential models. This work presents a review of several pertinent literature on spare part inventory optimization and analyzes the gaps for future research. Initial investigation on scholars and many journal database systems under specific keywords related to spare parts found about 17K papers. Filtering was conducted based on five main aspects, i.e., replenishment policy, objective function, echelon network, lead time, model solving, and additional aspects of part classification. Future topics could be identified based on the number of papers that haven’t addressed specific aspects, including joint optimization of spare part inventory and maintenance.

Keywords: spare part, spare part inventory, inventory model, optimization, maintenance

Procedia PDF Downloads 62
9527 Hindi Speech Synthesis by Concatenation of Recognized Hand Written Devnagri Script Using Support Vector Machines Classifier

Authors: Saurabh Farkya, Govinda Surampudi

Abstract:

Optical Character Recognition is one of the current major research areas. This paper is focussed on recognition of Devanagari script and its sound generation. This Paper consists of two parts. First, Optical Character Recognition of Devnagari handwritten Script. Second, speech synthesis of the recognized text. This paper shows an implementation of support vector machines for the purpose of Devnagari Script recognition. The Support Vector Machines was trained with Multi Domain features; Transform Domain and Spatial Domain or Structural Domain feature. Transform Domain includes the wavelet feature of the character. Structural Domain consists of Distance Profile feature and Gradient feature. The Segmentation of the text document has been done in 3 levels-Line Segmentation, Word Segmentation, and Character Segmentation. The pre-processing of the characters has been done with the help of various Morphological operations-Otsu's Algorithm, Erosion, Dilation, Filtration and Thinning techniques. The Algorithm was tested on the self-prepared database, a collection of various handwriting. Further, Unicode was used to convert recognized Devnagari text into understandable computer document. The document so obtained is an array of codes which was used to generate digitized text and to synthesize Hindi speech. Phonemes from the self-prepared database were used to generate the speech of the scanned document using concatenation technique.

Keywords: Character Recognition (OCR), Text to Speech (TTS), Support Vector Machines (SVM), Library of Support Vector Machines (LIBSVM)

Procedia PDF Downloads 499
9526 The Role of Inventory Classification in Supply Chain Responsiveness in a Build-to-Order and Build-To-Forecast Manufacturing Environment: A Comparative Analysis

Authors: Qamar Iqbal

Abstract:

Companies strive to improve their forecasting methods to predict the fluctuations in customer demand. These fluctuation and variation in demand affect the manufacturing operations and can limit a company’s ability to fulfill customer demand on time. Companies keep the inventory buffer and maintain the stocking levels to reduce the impact of demand variation. A mid-size company deals with thousands of stock keeping units (skus). It is neither easy and nor efficient to control and manage each sku. Inventory classification provides a tool to the management to increase their ability to support customer demand. The paper presents a framework that shows how inventory classification can play a role to increase supply chain responsiveness. A case study will be presented to further elaborate the method both for build-to-order and build-to-forecast manufacturing environments. Results will be compared that will show which manufacturing setting has advantage over another under different circumstances. The outcome of this study is very useful to the management because this will give them an insight on how inventory classification can be used to increase their ability to respond to changing customer needs.

Keywords: inventory classification, supply chain responsiveness, forecast, manufacturing environment

Procedia PDF Downloads 595
9525 On the Cyclic Property of Groups of Prime Order

Authors: Ying Yi Wu

Abstract:

The study of finite groups is a central topic in algebraic structures, and one of the most fundamental questions in this field is the classification of finite groups up to isomorphism. In this paper, we investigate the cyclic property of groups of prime order, which is a crucial result in the classification of finite abelian groups. We prove the following statement: If p is a prime, then every group G of order p is cyclic. Our proof utilizes the properties of group actions and the class equation, which provide a powerful tool for studying the structure of finite groups. In particular, we first show that any non-identity element of G generates a cyclic subgroup of G. Then, we establish the existence of an element of order p, which implies that G is generated by a single element. Finally, we demonstrate that any two generators of G are conjugate, which shows that G is a cyclic group. Our result has significant implications in the classification of finite groups, as it implies that any group of prime order is isomorphic to the cyclic group of the same order. Moreover, it provides a useful tool for understanding the structure of more complicated finite groups, as any finite abelian group can be decomposed into a direct product of cyclic groups. Our proof technique can also be extended to other areas of group theory, such as the classification of finite p-groups, where p is a prime. Therefore, our work has implications beyond the specific result we prove and can contribute to further research in algebraic structures.

Keywords: group theory, finite groups, cyclic groups, prime order, classification.

Procedia PDF Downloads 84
9524 Sentiment Analysis on the East Timor Accession Process to the ASEAN

Authors: Marcelino Caetano Noronha, Vosco Pereira, Jose Soares Pinto, Ferdinando Da C. Saores

Abstract:

One particularly popular social media platform is Youtube. It’s a video-sharing platform where users can submit videos, and other users can like, dislike or comment on the videos. In this study, we conduct a binary classification task on YouTube’s video comments and review from the users regarding the accession process of Timor Leste to become the eleventh member of the Association of South East Asian Nations (ASEAN). We scrape the data directly from the public YouTube video and apply several pre-processing and weighting techniques. Before conducting the classification, we categorized the data into two classes, namely positive and negative. In the classification part, we apply Support Vector Machine (SVM) algorithm. By comparing with Naïve Bayes Algorithm, the experiment showed SVM achieved 84.1% of Accuracy, 94.5% of Precision, and Recall 73.8% simultaneously.

Keywords: classification, YouTube, sentiment analysis, support sector machine

Procedia PDF Downloads 108
9523 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss

Procedia PDF Downloads 475
9522 Dissimilarity-Based Coloring for Symbolic and Multivariate Data Visualization

Authors: K. Umbleja, M. Ichino, H. Yaguchi

Abstract:

In this paper, we propose a coloring method for multivariate data visualization by using parallel coordinates based on dissimilarity and tree structure information gathered during hierarchical clustering. The proposed method is an extension for proximity-based coloring that suffers from a few undesired side effects if hierarchical tree structure is not balanced tree. We describe the algorithm by assigning colors based on dissimilarity information, show the application of proposed method on three commonly used datasets, and compare the results with proximity-based coloring. We found our proposed method to be especially beneficial for symbolic data visualization where many individual objects have already been aggregated into a single symbolic object.

Keywords: data visualization, dissimilarity-based coloring, proximity-based coloring, symbolic data

Procedia PDF Downloads 170
9521 The Influence of Alvar Aalto on the Early Work of Álvaro Siza

Authors: Eduardo Jorge Cabral dos Santos Fernandes

Abstract:

The expression ‘Porto School’, usually associated with an educational institution, the School of Fine Arts of Porto, is applied for the first time with the sense of an architectural trend by Nuno Portas in a text published in 1983. The expression is used to characterize a set of works by Porto architects, in which common elements are found, namely the desire to reuse languages and forms of the German and Dutch rationalism of the twenties, using the work of Alvar Aalto as a mediation for the reinterpretation of these models. In the same year, Álvaro Siza classifies the Finnish architect as a miscegenation agent who transforms experienced models and introduces them to different realities in a text published in Jornal de Letras, Artes e Ideias. The influence of foreign models and their adaptation to the context has been a recurrent theme in Portuguese architecture, which finds important contributions in the writings of Alexandre Alves Costa, at this time. However, the identification of these characteristics in Siza’s work is not limited to the Portuguese theoretical production: it is the recognition of this attitude towards the context that leads Kenneth Frampton to include Siza in the restricted group of architects who embody Critical Regionalism (in his book Modern architecture: a critical history). For Frampton, his work focuses on the territory and on the consequences of the intervention in the context, viewing architecture as a tectonic fact rather than a series of scenographic episodes and emphasizing site-specific aspects (topography, light, climate). Therefore, the motto of this paper is the dichotomous opposition between foreign influences and adaptation to the context in the early work of Álvaro Siza (designed in the sixties) in which the influence (theoretical, methodological, and formal) of Alvar Aalto manifests itself in the form and the language: the pool at Quinta da Conceição, the Seaside Pools and the Tea House (three works in Leça da Palmeira) and the Lordelo Cooperative (in Porto). This work is part of a more comprehensive project, which considers several case studies throughout the Portuguese architect's vast career, built in Portugal and abroad, in order to obtain a holistic view.

Keywords: Alvar Aalto, Álvaro Siza, foreign influences, adaptation to the context

Procedia PDF Downloads 30
9520 A Study on Information Structure in the Vajrachedika-Prajna-paramita Sutra and Translation Aspect

Authors: Yoon-Cheol Park

Abstract:

This research focuses on examining the information structures in the old Chinese character-Korean translation of the Vajrachedika-prajna-paramita sutra. The background of this research comes from the fact that there were no previous researches which looked into the information structures in the target text of the Vajrachedika-prajna-paramita sutra by now. The existing researches on the Buddhist scripture translation mainly put weight on message conveyance by literal and semantic translation methods. But the message conveyance from one language to another has a necessity to be delivered with equivalent information structure. Thus, this research is intended to investigate on the flow of old and new information in the target text of Buddhist scripture, compared with source text. The Vajrachedika-prajna-paramita sutra unlike other Buddhist scriptures is composed of conversational structures between Buddha and his disciple, Suboli. This implies that the information flow can be changed by utterance context and some propositions. So, this research tries to analyze the flow of old and new information within the source and target text. As a result of analysis, this research can discover the following facts; firstly, there are the differences of the information flow in the message conveyance between the old Chinese character and Korean by language features. The old Chinese character reveals that old-new information flow is developed, while Korean indicates new-old information flow because of word order. Secondly, the source text of the Vajrachedika-prajna-paramita sutra includes abstruse terminologies, jargon and abstract words. These make influence on the target text and cause the change of the information flow. But the repetitive expressions of these words provide the old information in the target text. Lastly, the Vajrachedika-prajna-paramita sutra offers the expository structure from conversations between Buddha and Suboli. It means that the information flow is developed in the way of explaining specific subjects and of paraphrasing unfamiliar phrases and expressions. From the results of analysis above, this research can verify that the information structures in the target text of the Vajrachedika-prajna-paramita sutra are changed by specific subjects and terminologies, developed with the new-old information flow by repetitive expressions or word order and reveal the information structures familiar to target culture. It also implies that the translation of the Vajrachedika-prajna-paramita sutra as a religious book needs the message conveyance to take into account the information structures of two languages.

Keywords: abstruse terminologies, the information structure, new and old information, old Chinese character-Korean translation

Procedia PDF Downloads 368
9519 Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values

Authors: M. Aghili, S. Tabarestani, C. Freytes, M. Shojaie, M. Cabrerizo, A. Barreto, N. Rishe, R. E. Curiel, D. Loewenstein, R. Duara, M. Adjouadi

Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN) which is essential for developing effective and early treatment methods. To address the aforementioned challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and in the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes of AD, NC, EMCI, and LMCI. Using 10-fold cross validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and recall of 80.51%, supporting the more natural and promising multiclass classification.

Keywords: eXtreme gradient boosting, missing data, Alzheimer disease, early mild cognitive impairment, late mild cognitive impair, multiclass classification, ADNI, support vector machine, random forest

Procedia PDF Downloads 188
9518 Contextual Distribution for Textual Alignment

Authors: Yuri Bizzoni, Marianne Reboul

Abstract:

Our program compares French and Italian translations of Homer’s Odyssey, from the XVIth to the XXth century. We focus on the third point, showing how distributional semantics systems can be used both to improve alignment between different French translations as well as between the Greek text and a French translation. Although we focus on French examples, the techniques we display are completely language independent.

Keywords: classical receptions, computational linguistics, distributional semantics, Homeric poems, machine translation, translation studies, text alignment

Procedia PDF Downloads 433