Search results for: mining software repositories
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2617

2137 Impact Analysis Based on Change Requirement Traceability in Object Oriented Software Systems

Authors: Sunil Tumkur Dakshinamurthy, Mamootil Zachariah Kurian

Abstract:

Change requirement traceability in object oriented software systems is a challenging research area. Trace links between different artifacts need to be automated or semi-automated across the software development life cycle (SDLC). The aim of this paper is to discuss and implement aspects of dynamically linking artifacts such as requirements, high level design, code and test cases through the Extensible Markup Language (XML) or by dynamically generating Object Oriented (OO) metrics. Non-functional requirement (NFR) aspects such as stability, completeness, clarity, validity, feasibility and precision are also discussed. We discuss this as a Fifth Taxonomy, which is a system vulnerability concern.

Keywords: Artifacts, NFRs, OO metrics, SDLC, XML.

Downloads 1122
2136 A Novel Approach to Optimal Cutting Tool Replacement

Authors: Cem Karacal, Sohyung Cho, William Yu

Abstract:

In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in a less than optimal replacement time, thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is to identify the optimal replacement time using an electronic nose to detect the airborne compounds released when tool wear reaches a chemical substrate doped into the tool material during fabrication. The study investigates the feasibility of the idea and possible doping materials and methods, along with data stream mining techniques for detecting and monitoring the different phases of tool wear.

Keywords: Tool condition monitoring, cutting tool replacement, data stream mining, e-Nose.

Downloads 1860
2135 Evaluation of Model Evaluation Criterion for Software Development Effort Estimation

Authors: S. K. Pillai, M. K. Jeyakumar

Abstract:

Estimation of model parameters is necessary to predict the behavior of a system. Model parameters are estimated using optimization criteria. Most algorithms use historical data to estimate model parameters: the known target (actual) values are compared with the output produced by the model, and the differences between the two form the basis for estimating the parameters. Different criteria are used to compare different models developed using the same data. The data obtained for short scale projects are used here. We consider the software effort estimation problem using a radial basis function network. The accuracy comparison is made using various existing criteria for one and two predictors. We then propose a new criterion based on linear least squares for evaluation and compare the results for one and two predictors. We also consider another data set and evaluate prediction accuracy using the new criterion. The new criterion is easier to comprehend than a single statistic. Although software effort estimation is considered here, the method is applicable to any modeling and prediction problem.
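
The abstract does not name the existing criteria or define the new one, so the following is only a minimal sketch under stated assumptions. It computes MMRE (mean magnitude of relative error), a criterion commonly used in effort estimation, and then fits a least-squares line of predicted against actual effort, where a slope near 1 and an intercept near 0 would indicate good agreement. The function names and the effort figures are hypothetical and are not taken from the paper.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error between actual and predicted values."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def least_squares_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical effort values (person-months), for illustration only.
actual_effort = [10, 20, 30, 40]
predicted_effort = [12, 18, 33, 38]

error = mmre(actual_effort, predicted_effort)
slope, intercept = least_squares_line(actual_effort, predicted_effort)
print(f"MMRE={error:.4f} slope={slope:.2f} intercept={intercept:.2f}")
```

Reading a slope and an intercept side by side shows whether a model over- or under-estimates systematically, which a single averaged statistic such as MMRE cannot show on its own.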

Keywords: Software effort estimation, accuracy, Radial Basis Function, linear least squares.

Downloads 2006
2134 The Impacts of Local Decision Making on Customisation Process Speed across Distributed Boundaries: A Case Study

Authors: A. M. Qahtani, G. B. Wills, A. M. Gravell

Abstract:

Communicating and managing customers’ requirements in software development projects play a vital role in the software development process. While it is difficult to do so locally, it is even more difficult to communicate these requirements over distributed boundaries and to convey them to multiple distribution customers. This paper discusses the communication of multiple distribution customers’ requirements in the context of customised software products. The main purpose is to understand the challenges of communicating and managing customisation requirements across distributed boundaries. We propose a model for Communicating Customisation Requirements of Multi-Clients in a Distributed Domain (CCRD). Thereafter, we evaluate that model by presenting the findings of a case study conducted with a company with customisation projects for 18 distributed customers. Then, we compare the outputs of the real case process and the outputs of the CCRD model using simulation methods. Our conjecture is that the CCRD model can reduce the challenge of communication requirements over distributed organisational boundaries, and the delay in decision making and in the entire customisation process time.

Keywords: Customisation Software Products, Global Software Engineering, Local Decision Making, Requirement Engineering, Simulation Model.

Downloads 1869
2133 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money, and their results are often out of date. Nowadays, in many countries, crowdsourced GPS-based traffic and navigation apps have emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion they cause. In this article we identify the zones, roads and specific times in Mexico City (CDMX) in which the largest number of road traffic accidents was concentrated during 2016. We built a database compiling information obtained from the social network Waze. The methodology employed was Knowledge Discovery in Databases (KDD), used to discover patterns in the accident reports, applying data mining techniques with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.
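
The grouping step can be illustrated with a small k-means sketch. The study used Weka, so this plain Python version (Lloyd's algorithm) is only a stand-in for the idea, not the authors' implementation. The coordinates are hypothetical (longitude, latitude) pairs, the number of clusters is fixed by hand rather than obtained from EM, and the seeding rule is an assumption chosen to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    # Assumption: seed the centroids with k evenly spaced input points.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


# Hypothetical accident report locations, for illustration only.
reports = [
    (-99.13, 19.43), (-99.14, 19.44), (-99.12, 19.42),
    (-99.20, 19.30), (-99.21, 19.31), (-99.19, 19.29),
]
centroids, labels = kmeans(reports, k=2)
print(labels)
```

Each resulting centroid marks the centre of one accident concentration, which is the kind of output that could then be plotted in a GIS layer.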

Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.

Downloads 1170
2132 Implementation of an On-Line PD Measurement System Using HFCT

Authors: F. Haghjoo, M. Sarlak, S.M. Shahrtash

Abstract:

In order to perform on-line measurement and detection of PD signals, a total solution composed of an HFCT, an A/D converter and a complete software package is proposed. The software package includes compensation of the HFCT contribution, filtering and noise reduction using the wavelet transform, and soft calibration routines. The results have shown good performance and high accuracy.

Keywords: Partial Discharge, Measurement, On-line, HFCT

Downloads 1792
2131 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, Job Recommender Systems have gained much attention in industry since they solve the problem of information overload on recruiting websites. We therefore propose an Extended Personalized Job System that is capable of providing appropriate jobs for job seekers and recommending suitable information to them using Data Mining Techniques and a Dynamic User Profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms, such as a web application and an Android mobile application. In this paper, User Profiles, Implicit User Action, User Feedback, and Clustering Techniques from the WEKA libraries were applied and implemented. In addition, open source tools such as the Yii Web Application Framework, the Bootstrap Front End Framework and Android Mobile Technology were also applied.

Keywords: Recommendation, user profile, data mining, web technology, mobile technology.

Downloads 2123
2130 Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

Authors: Chien-Hua Wang, Chin-Tzong Pang

Abstract:

In data mining, association rules are used to search for relations among the items of a transaction database. Once the data have been collected and stored, valuable rules can be found through association rule mining, helping managers to carry out marketing strategy and plan the market framework. In this paper, we apply fuzzy partition methods and determine the membership functions of the quantitative values of each transaction item. In addition, managers can express the importance of items as linguistic terms, which are transformed into fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth (FWFP-Growth) is used to complete the data mining process. The method is expected to be more efficient than the Apriori algorithm in mining the full set of association rules. An example is given to illustrate the proposed approach.
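
For readers unfamiliar with the two measures named in the title, the sketch below computes support and confidence for ordinary (crisp) transactions. It does not reproduce the fuzzy weighting or the FWFP-Growth algorithm of the paper; the function names and the shopping baskets are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0


# Hypothetical shopping baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```

In a fuzzy setting the 0/1 containment test would be replaced by membership degrees, and minimum support and confidence could be expressed as linguistic terms rather than fixed numbers.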

Keywords: Association Rule, Fuzzy Partition Methods, FWFP-Growth, Apriori algorithm

Downloads 1620
2129 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters are becoming more prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and of animals' interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine patterns of animal behavior, and with technological advances the ability to track animals and, consequently, behavioral studies have expanded. There is a lot of research on animal movement and behavior, but we note that a proposal is missing that combines resources, allows for exploratory analysis of animal movement, and provides statistical measures of individual animal behavior and its interaction with the environment. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in answering questions about individual animal behavior and interactions with other animals over time and space. We evaluated the framework using data from monitored jaguars in the city of Miranda Pantanal, Brazil, in order to verify whether AniMoveMineR makes it possible to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of the jaguars and about which jaguars have the highest or lowest correlation.

Keywords: Data mining, data science, trajectory, animal behavior.

Downloads 865
2128 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N. Poolsawad, C. Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigate the characteristics of a clinical dataset with respect to feature selection and classification measurements that deal with the missing values problem. We also propose appropriate techniques to achieve the aim of the activity; this research aims to find features that have a strong effect on mortality and the mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we propose a data mining process to cope with that complexity (missing values, high dimensionality, and the prediction problem) by using methods of missing value replacement, feature selection, and classification. The experimental results will be extended to develop a prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Downloads 3187
2127 Implementation and Demonstration of Software-Defined Traffic Grooming

Authors: Lei Guo, Xu Zhang, Weigang Hou

Abstract:

Since the traditional network is closed and has no architecture for creating applications, it has been unable to evolve with changing demands under the rapid innovation in services. Additionally, because a profile of the whole network is lacking, the quality of service cannot be well guaranteed in the traditional network. The Software Defined Network (SDN) utilizes global resources to support on-demand applications/services via open, standardized and programmable interfaces. In this paper, we implement the traffic grooming application in a real SDN environment and analyze the results. In our SDN: 1) we use the OpenFlow protocol to control the entire network through software applications running on the network operating system; 2) several virtual switches are combined into the data forwarding plane through Open vSwitch; 3) an OpenFlow controller, NOX, serves as a logically centralized control plane that dynamically configures the data forwarding plane; 4) SDN-based traffic grooming is demonstrated by dynamically modifying the idle time of flow entries. The experimental results demonstrate that SDN-based traffic grooming effectively reduces the end-to-end delay, and the improvement ratio reaches 99%.

Keywords: NOX, OpenFlow, software defined network, traffic grooming.

Downloads 1003
2126 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a scarce natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out to assess the water quality of Ga-Selati River in a mining area of Limpopo Province-Phalaborwa. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter, while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in the wet and dry seasons, respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in the dry season (929.29 mg/L) than in the wet season (640.72 mg/L). These values exceeded the recommended guideline of the South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 NTU in the wet season and 0.95-2.37 NTU in the dry season. Total hardness, computed as the concentration of CaCO3, was 312.50 mg/L in the wet season and 297.75 mg/L in the dry season, and the river water was categorised as very hard. Mean concentrations of the metals studied in the wet and dry seasons, respectively, were: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L).
Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.
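
The abstract does not say how total hardness was computed. A common convention expresses hardness as CaCO3 from the calcium and magnesium concentrations, using factors derived from molar masses; the sketch below applies that convention to the reported mean Ca and Mg values. Because the authors' exact method is not stated, the output should not be expected to match the reported hardness figures exactly.

```python
# Conversion factors to mg/L as CaCO3 (ratio of molar masses):
CA_FACTOR = 2.497  # 100.09 / 40.08
MG_FACTOR = 4.118  # 100.09 / 24.305


def total_hardness(ca_mg_per_l, mg_mg_per_l):
    """Total hardness in mg/L as CaCO3 from Ca and Mg concentrations in mg/L."""
    return CA_FACTOR * ca_mg_per_l + MG_FACTOR * mg_mg_per_l


# Mean Ca and Mg concentrations reported in the abstract (mg/L).
wet = total_hardness(45.60, 48.41)
dry = total_hardness(41.30, 44.71)
print(f"wet season: {wet:.1f} mg/L, dry season: {dry:.1f} mg/L as CaCO3")
```

Under the widely used classification in which water above 180 mg/L as CaCO3 counts as very hard, both seasonal estimates fall in the same category as the one stated in the abstract.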

Keywords: Contamination, mining activities, surface water, trace metals.

Downloads 1941
2125 Automated Java Testing: JUnit versus AspectJ

Authors: Manish Jain, Dinesh Gopalani

Abstract:

The growing dependence of mankind on software technology increases the need for thorough testing of software applications and for automated testing techniques that support testing activities. We have outlined our testing strategy for performing various types of automated testing of Java applications using AspectJ, which has become the de facto standard for Aspect Oriented Programming (AOP). Likewise, JUnit, a unit testing framework, is the most popular Java testing tool. In this paper, we evaluate our proposed AOP approach for automated testing and JUnit on various parameters. We first describe the similarities between the two approaches and then make a detailed comparison of the two testing techniques on factors such as lines of testing code, learning curve, and testing of private members. We establish that our AOP testing approach using AspectJ has several advantages and is thus more effective than JUnit.

Keywords: Aspect oriented programming, AspectJ, Aspects, JUnit, software testing.

Downloads 1870
2124 Studying on ARINC653 Partition Run-time Scheduling and Simulation

Authors: Dongliang Wang, Jun Han, Dianfu Ma, Xianqi Zhao

Abstract:

Avionics software is safety-critical embedded software, and its architecture is evolving from traditional federated architectures to Integrated Modular Avionics (IMA) to improve resource usability. ARINC 653 (Avionics Application Standard Software Interface) is a software specification for space and time partitioning in safety-critical avionics real-time operating systems. ARINC 653 uses a two-level scheduling strategy, but current modeling tools only apply to simple ARINC 653 two-level scheduling problems that involve only the time property. In the avionics industry, tasks are always allocated manually and the timing table of a real-time system is calculated by hand to ensure that it runs as designed. In this paper we present an automatic generation strategy that applies to two-level scheduling problems with dependent constraints in the ARINC 653 partition run-time environment. It automatically generates a scheduling policy from the task and partition models by allocating the tasks to the partitions while following the constraints. We then design a simulation mechanism to check whether the policy is schedulable.

Keywords: Arinc653, scheduling, task allocation, simulation.

Downloads 2312
2123 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of Web users into high level themes that express users' intentions and interests. However, such techniques have mostly focused on one salient feature of the Web documents visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate more concentrated themes, and hence build a sounder model of user behavior. The purpose of this paper is twofold: to use distance based clustering methods to recognize overall themes from the proxy log file, and to suggest efficient cut-off levels for the keyword and similarity thresholds that tend to produce better clusters with sharper focus and efficient size.

Keywords: Data mining, knowledge discovery, clustering, data analysis, Web log analysis, theme based searching.

Downloads 1429
2122 Analysis of the Impact of NVivo and EndNote on Academic Research Productivity

Authors: Sujit K. Basak

Abstract:

The aim of this paper is to analyze the impact of literature review software on researchers. The aim of this study was achieved by analyzing models in terms of perceived usefulness, perceived ease of use, and acceptance level. Collected data were analyzed using WarpPLS 4.0 software. This study used two theoretical frameworks, namely, Technology Acceptance Model and the Training Needs Assessment Model. The study was experimental and was conducted at a public university in South Africa. The results of the study showed that acceptance level has a high impact on research productivity followed by perceived usefulness and perceived ease of use.

Keywords: Technology acceptance model, training needs assessment model, literature review software, research productivity.

Downloads 2957
2121 Moving From Problem Space to Solution Space

Authors: Bilal Saeed Raja, M. Ali Iqbal, Imran Ihsan

Abstract:

Extracting and elaborating software requirements and transforming them into a viable software architecture is still an intricate task. This paper defines a solution architecture which is based on the blurred amalgamation of problem space and solution space. The dependencies between domain constraints, requirements and architecture, and their importance, are described; these are to be considered collectively while evolving from problem space to solution space. This paper proposes a revised version of the Twin Peaks Model, named the Win Peaks Model, that reconciles software requirements and architecture in a more consistent and adaptable manner. Further, the conflict between stakeholders' win-requirements is resolved by a proposed Voting methodology that is a simple adaptation of the win-win requirements negotiation model and QARCC.

Keywords: Functional Requirements, Non Functional Requirements, Twin Peaks Model, QARCC.

Downloads 1836
2120 Underlying Cognitive Complexity Measure Computation with Combinatorial Rules

Authors: Benjapol Auprasert, Yachai Limpiyakorn

Abstract:

Measuring the complexity of software has been an insoluble problem in software engineering. Complexity measures can be used to predict critical information about testability, reliability, and maintainability of software systems from automatic analysis of the source code. During the past few years, many complexity measures have been invented based on the emerging Cognitive Informatics discipline. These software complexity measures, including cognitive functional size, lend themselves to the approach of the total cognitive weights of basic control structures such as loops and branches. This paper shows that the existing calculation method can generate different results that are algebraically equivalent. However, analysis of the combinatorial meaning of this calculation method reveals a significant flaw in the measure, which also explains why it does not satisfy Weyuker's properties. Based on the findings, improvement directions, such as measure fusion and a cumulative variable counting scheme, are suggested to enhance the effectiveness of cognitive complexity measures.

Keywords: Cognitive Complexity Measure, Cognitive Weight of Basic Control Structure, Counting Rules, Cumulative Variable Counting Scheme.

Downloads 1861
2119 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Downloads 1555
2118 System-Level Energy Estimation for SoC based on the Dynamic Behavior of Embedded Software

Authors: Yoshifumi Sakamoto, Kouichi Ono, Takeo Nakada, Yousuke Kubo, Hiroto Yasuura

Abstract:

This paper describes a system-level SoC energy consumption estimation method based on the dynamic behavior of embedded software in the early stages of SoC development. A major problem of SoC development is development rework caused by unreliable energy consumption estimation at the early stages. The energy consumption of an SoC used in embedded systems is strongly affected by the dynamic behavior of the software. At the early stages of SoC development, modeling with a high level of abstraction is required for both the dynamic behavior of the software and the behavior of the SoC. We estimate the energy consumption by a UML model-based simulation. The proposed method is applied to an actual embedded system in an MFP. The energy consumption estimation of the SoC is more accurate than with conventional methods, and the proposed method is promising for reducing the chance of development rework in SoC development.

Keywords: SoC, Embedded System, Energy Consumption, Dynamic behavior, UML, Modeling, Model-based simulation

Downloads 2430
2117 A Common Automated Programming Platform for Knowledge Based Software Engineering

Authors: Ivan Stanev, Maria Koleva

Abstract:

The Common Platform for Automated Programming (CPAP) is defined in detail. Two versions of CPAP are described: a Cloud based version (including a set of components for classic programming and a set of components for combined programming), and a Knowledge Based Automated Software Engineering (KBASE) based version (including a set of components for automated programming and a set of components for ontology programming). Four KBASE products (Module for Automated Programming of Robots, Intelligent Product Manual, Intelligent Document Display, and Intelligent Form Generator) are analyzed, and CPAP contributions to automated programming are presented.

Keywords: Automated Programming, Cloud Computing, Knowledge Based Software Engineering, Service Oriented Architecture.

Downloads 1859
2116 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

A challenging task for educational institutions is to maximize the performance of students and minimize the failure rate of poor-performing students. An effective way to approach this task is to understand student learning patterns and their most influential factors, and to obtain an early prediction of student learning outcomes in time to set up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The aim of this work is to propose EDM techniques and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. To support intervention and improve learning outcomes, a feature selection method, MICHI, which combines the mutual information (MI) and chi-square (CHI) algorithms based on ranked feature scores, is introduced to select a dominant feature set that improves prediction performance; the obtained dominant set is then used as information for intervention. Using the proposed EDM techniques, an academic performance prediction system (APPS) is subsequently developed for educational stakeholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals intervene and improve student performance.

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Downloads 933
2115 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and the pressing need to handle the overwhelming amount of varied UGC is one of the paramount issues for management. However, current approaches (e.g., sentiment analysis) are often ineffective at leveraging textual information to detect the problems or issues that a given management suffers from. In this paper, we apply text mining with Latent Dirichlet Allocation (LDA) to a popular online review site dedicated to user complaints. We find that the employed LDA efficiently detects customer complaints, and that further inspection with a visualization technique is effective for categorizing the problems or issues. As such, management can identify the issues at stake and prioritize them in a timely manner given limited resources. The findings provide managerial insights into how analytics on social media can help maintain and improve reputation management. Our interdisciplinary approach also highlights several insights from applying machine learning techniques in the marketing research domain. On a broader technical note, this paper illustrates in detail how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since such instruction is still largely undocumented. In this regard, it will help lower the barrier for interdisciplinary researchers to conduct related research.

Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.

Downloads 1191
2114 An Eulerian Numerical Method and its Application to Explosion Problems

Authors: Li Hao, Yan Zhang, Jingan Cui

Abstract:

An Eulerian numerical method is proposed to analyze explosions in tunnels. Based on this method, an original software package, M-MMIC2D, is developed in the C++ programming language. With this software, the explosion problem in a tunnel with three expansion chambers is numerically simulated, and the results are found to be in full agreement with the observed experimental data.

Keywords: Eulerian method, numerical simulation, shock wave, tunnel

Downloads 1427
2113 Induction Heating Process Design Using Comsol® Multiphysics Software Version 4.2a

Authors: K. Djellabi, M. E. H. Latreche

Abstract:

Induction heating computer simulation is a powerful tool for process design and optimization, induction coil design, equipment selection, as well as education and business presentations. The authors share their vast experience in the practical use of computer simulation for different induction heating and heat treating processes. This paper deals with the mathematical modeling and numerical simulation of induction heating furnaces with axisymmetric geometries. For the numerical solution, we propose finite element methods (FEM) combined with boundary element methods for the electromagnetic model, using COMSOL® Multiphysics software. Some numerical results are shown for an industrial furnace operating at high frequency.

Keywords: Numerical methods, Induction furnaces, Induction Heating, Finite element method, Comsol Multiphysics software.

Downloads 8007
2112 Evaluating Hurst Parameters and Fractal Dimensions of Surveyed Dataset of Tailings Dam Embankment

Authors: I. Yakubu, Y. Y. Ziggah, C. Yeboah

Abstract:

In the mining environment, the tailings dam embankment is among the hazard and risk areas. The embankment could fail and result in damage to facilities, human injuries or even fatalities. Periodic monitoring is needed to help assess the safety of the tailings dam embankment. Artificial intelligence techniques such as fractals can be used to analyse the stability of the monitored dataset obtained from survey measurement techniques. In this paper, the fractal dimension (D) was determined using D = 2 - H. The Hurst parameter (H) of each monitored prism was determined using time-domain rescaled range analysis programmed in MATLAB. The fractal dimension of each monitored prism was then determined from its value of H. The results reveal that the determined values of H were all within the threshold of 0 ≤ H ≤ 1 m. The smaller the H, the bigger the fractal dimension. Fractal dimension values ranging from 1.359 × 10⁻⁴ m to 1.8843 × 10⁻³ m were obtained from the monitored prisms based on the tailings dam embankment dataset used. The ranges of values obtained indicate that the tailings dam embankment is stable.

Keywords: Hurst parameter, fractal dimension, tailings dam embankment, surveyed dataset.

Downloads 721
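The relation D = 2 - H and the rescaled range estimate of H mentioned in entry 2112 can be sketched as follows. This is an illustrative Python version, not the authors' MATLAB program; the function names, the window sizes, and the synthetic series are assumptions introduced here.

```python
import math
import random


def rescaled_range(chunk):
    """Return R/S for one window, or None if the window is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    if std == 0.0:
        return None
    cum = lo = hi = 0.0
    for x in chunk:
        cum += x - mean          # cumulative deviation from the window mean
        lo = min(lo, cum)
        hi = max(hi, cum)
    return (hi - lo) / std


def hurst_exponent(series, window_sizes):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    points = []
    for w in window_sizes:
        chunks = [series[i:i + w] for i in range(0, len(series) - w + 1, w)]
        values = [rs for rs in map(rescaled_range, chunks) if rs is not None]
        if values:
            points.append((math.log(w), math.log(sum(values) / len(values))))
    if len(points) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    # Ordinary least-squares slope through the log-log points.
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


def fractal_dimension(hurst):
    """Fractal dimension from the Hurst parameter, D = 2 - H."""
    return 2.0 - hurst


# Synthetic stand-in for one prism's displacement increments (hypothetical data).
rng = random.Random(0)
series = [rng.gauss(0.0, 1.0) for _ in range(512)]
h = hurst_exponent(series, [8, 16, 32, 64, 128])
print(h, fractal_dimension(h))
```

Because D = 2 - H, a smaller H always gives a larger D, which matches the trend stated in the abstract.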
2111 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created using the source code, processed metrics from the same or a previous version of the code, and related fault data. Some companies do not store and keep track of all the artifacts required for software fault prediction. To construct a fault prediction model for such a company, training data from other projects is one potential solution. The earlier a fault is predicted, the less it costs to correct. The training data consists of metrics data and related fault data at the function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. In this study, an empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as an initial guideline for projects where no previous fault data is available. We analyze seven datasets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project learning are comparable to those of within-company learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Downloads 2507
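The cross-project setup in entry 2111 (train on one project's design metrics, predict faults in another) can be sketched with a small Gaussian Naïve Bayes classifier. This is a minimal illustration only: the abstract does not say which Naïve Bayes variant was used, and the two metric columns and all values below are hypothetical, not drawn from the NASA Metrics Data Program.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Fit per-class log prior, feature means, and feature variances."""
    model = {}
    total = len(labels)
    for label in set(labels):
        members = [r for r, t in zip(rows, labels) if t == label]
        columns = list(zip(*members))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(len(members) / total), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one module."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            for x, mean, var in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical design metrics per module: [design complexity, branch count].
# Label 1 marks a faulty module, 0 a fault-free one.
source_metrics = [[2, 5], [3, 7], [1, 4], [2, 6],
                  [9, 30], [11, 35], [8, 28], [10, 33]]
source_faults = [0, 0, 0, 0, 1, 1, 1, 1]

# Modules from a different (target) project with no fault history.
target_metrics = [[2, 6], [10, 31]]

model = fit_gaussian_nb(source_metrics, source_faults)
print([predict(model, m) for m in target_metrics])
```

In practice the source and target projects would need the same metric definitions, and the prediction quality would be checked against known fault data where it exists.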
2110 Dynamic Mesh Based Airfoil Design Optimization

Authors: Zhu Xiong-feng, Hou Zhong-xi, Guo Zheng, Liu Zhao-Wei

Abstract:

A method of dynamic mesh based airfoil optimization is proposed to address the drawbacks of surrogate model based airfoil optimization. Programs are designed to generate the dynamic mesh. Boundary conditions are added by integrating the commercial software Pointwise, while the CFD calculation is carried out by the commercial software Fluent. Data exchange and communication between these software packages and programs have been implemented, and the whole optimization process is performed on the iSIGHT platform. A simplified airfoil optimization case study shows that the aerodynamic performance of the airfoil is significantly improved, while many repeated operations are avoided and the robustness and credibility of the optimization result are increased. The case study indicates that dynamic mesh based airfoil optimization is an effective and highly efficient method.

Keywords: unmanned air vehicles, dynamic mesh, airfoil optimization, CFD, genetic algorithm

Downloads 3373
2109 Prototype of Business Directory for Micro, Small and Medium Enterprises Using Google Maps API and Multimedia

Authors: Suselo Thomas, Suyoto, Dwiandiyanta B. Yudi

Abstract:

This paper describes a prototype of a business directory for micro, small and medium enterprises (SMEs), which constitutes the third phase of the research. The third phase is software development based on the previously developed SME business directory model, producing a prototype of the SME business directory software. In the fourth phase, implementation, the developed units are tested to obtain input from potential users. The fifth phase is testing, to determine the strengths and weaknesses of the developed software. The result of this phase is online (web-based) and multimedia-based software. If implemented, the business directory will facilitate and optimize access for SMEs, easing access to suppliers and marketing. The business directory will be equipped with geocoding, so that each SME location can be easily viewed on the map. The map will be constructed using the web-based Google Maps API. The information is presented in multimedia form to make it more interesting and interactive. The methodology used to achieve the goal comprises observation, interviews, modeling and classifying the business directory for SMEs.

Keywords: Business directories, SMEs, Google Maps API, Multimedia, Prototype.

Downloads 2100
2108 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors contains important geographical information that can be used for remote sensing applications such as regional planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, manually classifying and identifying objects in images is a difficult task. Object recognition is often treated as a classification problem, so it can be performed using machine-learning techniques. Although many machine-learning algorithms exist, the classification here is done using supervised classifiers such as Support Vector Machines (SVM) because the area of interest is known. We propose a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset has been created for training and testing purposes; we generated the attributes from pixel intensity values and mean reflectance values. We demonstrate the benefits of knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.

Downloads 2027
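The neighborhood-based feature extraction described in entry 2108 can be sketched as below. This is an assumption-laden illustration: the function name, the square window with edge clipping, and the tiny single-band image are hypothetical, and the SVM classification step that would consume these features is omitted.

```python
def neighborhood_features(image, radius=1):
    """For each pixel, return (own intensity, mean intensity of its window).

    The window is a square of side 2 * radius + 1 centred on the pixel and
    clipped at the image border.
    """
    n_rows, n_cols = len(image), len(image[0])
    features = []
    for r in range(n_rows):
        row_features = []
        for c in range(n_cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            row_features.append((image[r][c], sum(window) / len(window)))
        features.append(row_features)
    return features


# Hypothetical 3 x 3 single-band image with one bright pixel in the centre.
image = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
for row in neighborhood_features(image):
    print(row)
```

Each (intensity, neighborhood mean) pair would form one feature vector for a supervised classifier; pairing a pixel with its surroundings is what lets the classifier use regional context rather than isolated values.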