Search results for: Data mining andInformation Extraction
7531 Mining Sequential Patterns Using I-PrefixSpan
Authors: Dhany Saputra, Dayang R. A. Rambli, Oi Mean Foong
Abstract:
In this paper, we propose an improvement of pattern growth-based PrefixSpan algorithm, called I-PrefixSpan. The general idea of I-PrefixSpan is to use sufficient data structure for Seq-Tree framework and separator database to reduce the execution time and memory usage. Thus, with I-PrefixSpan there is no in-memory database stored after index set is constructed. The experimental result shows that using Java 2, this method improves the speed of PrefixSpan up to almost two orders of magnitude as well as the memory usage to more than one order of magnitude.Keywords: ArrayList, ArrayIntList, minimum support, sequence database, sequential patterns.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15647530 Performance Evaluation of ROI Extraction Models from Stationary Images
Authors: K.V. Sridhar, Varun Gunnala, K.S.R Krishna Prasad
Abstract:
In this paper three basic approaches and different methods under each of them for extracting region of interest (ROI) from stationary images are explored. The results obtained for each of the proposed methods are shown, and it is demonstrated where each method outperforms the other. Two main problems in ROI extraction: the channel selection problem and the saliency reversal problem are discussed and how best these two are addressed by various methods is also seen. The basic approaches are 1) Saliency based approach 2) Wavelet based approach 3) Clustering based approach. The saliency approach performs well on images containing objects of high saturation and brightness. The wavelet based approach performs well on natural scene images that contain regions of distinct textures. The mean shift clustering approach partitions the image into regions according to the density distribution of pixel intensities. The experimental results of various methodologies show that each technique performs at different acceptable levels for various types of images.Keywords: clustering, ROI, saliency, wavelets.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14097529 Extraction of Temporal Relation by the Creation of Historical Natural Disaster Archive
Authors: Suguru Yoshioka, Seiichi Tani, Seinosuke Toda
Abstract:
In historical science and social science, the influence of natural disaster upon society is a matter of great interest. In recent years, some archives are made through many hands for natural disasters, however it is inefficiency and waste. So, we suppose a computer system to create a historical natural disaster archive. As the target of this analysis, we consider newspaper articles. The news articles are considered to be typical examples that prescribe the temporal relations of affairs for natural disaster. In order to do this analysis, we identify the occurrences in newspaper articles by some index entries, considering the affairs which are specific to natural disasters, and show the temporal relation between natural disasters. We designed and implemented the automatic system of “extraction of the occurrences of natural disaster" and “temporal relation table for natural disaster."Keywords: Database, digital library, corpus, historical natural disaster, temporal relation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14047528 Genetic-based Anomaly Detection in Logs of Process Aware Systems
Authors: Hanieh Jalali, Ahmad Baraani
Abstract:
Nowaday-s, many organizations use systems that support business process as a whole or partially. However, in some application domains, like software development and health care processes, a normative Process Aware System (PAS) is not suitable, because a flexible support is needed to respond rapidly to new process models. On the other hand, a flexible Process Aware System may be vulnerable to undesirable and fraudulent executions, which imposes a tradeoff between flexibility and security. In order to make this tradeoff available, a genetic-based anomaly detection model for logs of Process Aware Systems is presented in this paper. The detection of an anomalous trace is based on discovering an appropriate process model by using genetic process mining and detecting traces that do not fit the appropriate model as anomalous trace; therefore, when used in PAS, this model is an automated solution that can support coexistence of flexibility and security.Keywords: Anomaly Detection, Genetic Algorithm, ProcessAware Systems, Process Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19257527 Fast Facial Feature Extraction and Matching with Artificial Face Models
Authors: Y. H. Tsai, Y. W. Chen
Abstract:
Facial features are frequently used to represent local properties of a human face image in computer vision applications. In this paper, we present a fast algorithm that can extract the facial features online such that they can give a satisfying representation of a face image. It includes one step for a coarse detection of each facial feature by AdaBoost and another one to increase the accuracy of the found points by Active Shape Models (ASM) in the regions of interest. The resulted facial features are evaluated by matching with artificial face models in the applications of physiognomy. The distance measure between the features and those in the fate models from the database is carried out by means of the Hausdorff distance. In the experiment, the proposed method shows the efficient performance in facial feature extractions and online system of physiognomy.Keywords: Facial feature extraction, AdaBoost, Active shapemodel, Hausdorff distance
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18127526 Knowledge Mining in Web-based Learning Environments
Authors: Nittaya Kerdprasop, Kittisak Kerdprasop
Abstract:
The state of the art in instructional design for computer-assisted learning has been strongly influenced by advances in information technology, Internet and Web-based systems. The emphasis of educational systems has shifted from training to learning. The course delivered has also been changed from large inflexible content to sequential small chunks of learning objects. The concepts of learning objects together with the advanced technologies of Web and communications support the reusability, interoperability, and accessibility design criteria currently exploited by most learning systems. These concepts enable just-in-time learning. We propose to extend theses design criteria further to include the learnability concept that will help adapting content to the needs of learners. The learnability concept offers a better personalization leading to the creation and delivery of course content more appropriate to performance and interest of each learner. In this paper we present a new framework of learning environments containing knowledge discovery as a tool to automatically learn patterns of learning behavior from learners' profiles and history.Keywords: Knowledge mining, Web-based learning, Learning environments.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17867525 Leaching Behaviour of a Low-grade South African Nickel Laterite
Authors: Catherine K. Thubakgale, Richard K.K. Mbaya, Kaby Kabongo
Abstract:
The morphology, mineralogical and chemical composition of a low-grade nickel ore from Mpumalanga, South Africa, were studied by scanning electron microscope (SEM), X-ray diffraction (XRD) and X-ray fluorescence (XRF), respectively. The ore was subjected to atmospheric agitation leaching using sulphuric acid to investigate the effects of acid concentration, leaching temperature, leaching time and particle size on extraction of nickel and cobalt. Analyses results indicated the ore to be a saprolitic nickel laterite belonging to the serpentine group of minerals. Sulphuric acid was found to be able to extract nickel from the ore. Increased acid concentration and temperature only produced low amounts of nickel but improved cobalt extraction. As high as 77.44% Ni was achieved when leaching a -106+75μm fraction with 4.0M acid concentration at 25oC. The kinetics of nickel leaching from the saprolitic ore were studied and the activation energy was determined to be 18.16kJ/mol. This indicated that nickel leaching reaction was diffusion controlled.Keywords: Laterite, sulphuric acid, atmospheric leaching, nickel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33167524 The Use of Microorganisms in the Bioleaching of Soils Polluted with Heavy Metals
Authors: I. M. Sur, A. M. Chirila-Babau, T. Gabor, V. Micle
Abstract:
This paper shows researches in order to extract Cr, Cu and Ni from the polluted soils. Research is based on preliminary studies regarding the usage of Thiobacillus ferrooxidans bacterium (9K medium) for bioleaching of soil polluted with heavy metal (Cu, Cr and Ni). The microorganisms (Thiobacillus ferooxidans) selected directly from polluted soil samples were used in this experimental work. Soil samples used in the experimental research were taken from an area polluted with heavy metals from Romania. The soil samples are subjected to the cleaning process using the 9K medium solution (20 mL and 40 mL, respectively), stirred 200 rpm for 20 hours at a controlled temperature (30 ˚C). During the experiment (0, 2, 4, 8 and 20 h), liquid samples have been extracted and analyzed using the Atomic Absorption Spectrophotometer AA-6800 (AAS) in order to determine the Cr, Cu and Ni concentration. Experiments led to the conclusion that these soils can be depolluted by bioleaching, being a biological treatment method involving the use of microorganisms to favor the extraction of Cr, Cu and Ni from polluted soils.
Keywords: Bioleaching, extraction, microorganisms, polluted soil, Thiobacillus ferooxidans.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9697523 Assessment of Negative Impacts Affecting Public Transportation Modes and Infrastructure in Burgersfort Town towards Building Urban Sustainability
Authors: Ntloana Hlabishi Peter
Abstract:
The availability of public transportation modes and qualitative infrastructure is a burning issue that affects urban sustainability. Public transportation is indispensable in providing adequate transportation means to people at an affordable price, and it promotes public transport reliance. Burgersfort town has a critical condition on the urban public transportation infrastructure which affects the bus and taxi public transport modes and the existing infrastructure. The municipality is regarded as one of the mining towns in Limpopo Province considering the availability of mining activities and proposal on establishment of a Special Economic Zone (SEZ). The study aim is to assess the efficacy of current public transportation infrastructure and to propose relevant recommendations that will unlock the possibility of future supportable public transportation systems. The Key Informant Interview (KII) was used to acquire data on the views from commuters and stakeholders involved. There KII incorporated three relevant questions in relation to services rendered in public transportation. Relevant literature relating to public transportation modes and infrastructure revealed the imperatives of public transportation infrastructure, and relevant legislation was reviewed concerning public transport infrastructure. The finding revealed poor conditions on the public transportation ranks and also inadequate parking space for public transportation modes. The study reveals that 100% of people interviewed were not satisfied with the condition of public transportation infrastructure and 100% are not satisfied with the services offered by public transportation sectors. The findings revealed that the municipality is the main player who can upgrade the existing conditions of public transportation. The study recommended that an intermodal transportation facility must be established to resolve the emerging challenges.
Keywords: Public transportation, modes, infrastructure, urban sustainability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6997522 Life Cycle Datasets for the Ornamental Stone Sector
Authors: Isabella Bianco, Gian Andrea Blengini
Abstract:
The environmental impact related to ornamental stones (such as marbles and granites) is largely debated. Starting from the industrial revolution, continuous improvements of machineries led to a higher exploitation of this natural resource and to a more international interaction between markets. As a consequence, the environmental impact of the extraction and processing of stones has increased. Nevertheless, if compared with other building materials, ornamental stones are generally more durable, natural, and recyclable. From the scientific point of view, studies on stone life cycle sustainability have been carried out, but these are often partial or not very significant because of the high percentage of approximations and assumptions in calculations. This is due to the lack, in life cycle databases (e.g. Ecoinvent, Thinkstep, and ELCD), of datasets about the specific technologies employed in the stone production chain. For example, databases do not contain information about diamond wires, chains or explosives, materials commonly used in quarries and transformation plants. The project presented in this paper aims to populate the life cycle databases with specific data of specific stone processes. To this goal, the methodology follows the standardized approach of Life Cycle Assessment (LCA), according to the requirements of UNI 14040-14044 and to the International Reference Life Cycle Data System (ILCD) Handbook guidelines of the European Commission. The study analyses the processes of the entire production chain (from-cradle-to-gate system boundaries), including the extraction of benches, the cutting of blocks into slabs/tiles and the surface finishing. Primary data have been collected in Italian quarries and transformation plants which use technologies representative of the current state-of-the-art. Since the technologies vary according to the hardness of the stone, the case studies comprehend both soft stones (marbles) and hard stones (gneiss). In particular, data about energy, materials and emissions were collected in marble basins of Carrara and in Beola and Serizzo basins located in the province of Verbano Cusio Ossola. Data were then elaborated through an appropriate software to build a life cycle model. The model was realized setting free parameters that allow an easy adaptation to specific productions. Through this model, the study aims to boost the direct participation of stone companies and encourage the use of LCA tool to assess and improve the stone sector environmental sustainability. At the same time, the realization of accurate Life Cycle Inventory data aims at making available, to researchers and stone experts, ILCD compliant datasets of the most significant processes and technologies related to the ornamental stone sector.
Keywords: LCA datasets, life cycle assessment, ornamental stone, stone environmental impact.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11557521 Mining and Visual Management of XML-Based Image Collections
Authors: Khalil Shihab, Nida Al-Chalabi
Abstract:
This article describes Uruk, the virtual museum of Iraq that we developed for visual exploration and retrieval of image collections. The system largely exploits the loosely-structured hierarchy of XML documents that provides a useful representation method to store semi-structured or unstructured data, which does not easily fit into existing database. The system offers users the capability to mine and manage the XML-based image collections through a web-based Graphical User Interface (GUI). Typically, at an interactive session with the system, the user can browse a visual structural summary of the XML database in order to select interesting elements. Using this intermediate result, queries combining structure and textual references can be composed and presented to the system. After query evaluation, the full set of answers is presented in a visual and structured way.Keywords: Data-centric XML, graphical user interfaces, information retrieval, case-based reasoning, fuzzy sets
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17907520 Programming Language Extension Using Structured Query Language for Database Access
Authors: Chapman Eze Nnadozie
Abstract:
Relational databases constitute a very vital tool for the effective management and administration of both personal and organizational data. Data access ranges from a single user database management software to a more complex distributed server system. This paper intends to appraise the use a programming language extension like structured query language (SQL) to establish links to a relational database (Microsoft Access 2013) using Visual C++ 9 programming language environment. The methodology used involves the creation of tables to form a database using Microsoft Access 2013, which is Object Linking and Embedding (OLE) database compliant. The SQL command is used to query the tables in the database for easy extraction of expected records inside the visual C++ environment. The findings of this paper reveal that records can easily be accessed and manipulated to filter exactly what the user wants, such as retrieval of records with specified criteria, updating of records, and deletion of part or the whole records in a table.
Keywords: Data access, database, database management system, OLE, programming language, records, relational database, software, SQL, table.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7217519 Optimization of Some Process Parameters to Produce Raisin Concentrate in Khorasan Region of Iran
Authors: Peiman Ariaii, Hamid Tavakolipour, Mohsen Pirdashti, Rabehe Izadi Amoli
Abstract:
Raisin Concentrate (RC) are the most important products obtained in the raisin processing industries. These RC products are now used to make the syrups, drinks and confectionery productions and introduced as natural substitute for sugar in food applications. Iran is a one of the biggest raisin exporter in the world but unfortunately despite a good raw material, no serious effort to extract the RC has been taken in Iran. Therefore, in this paper, we determined and analyzed affected parameters on extracting RC process and then optimizing these parameters for design the extracting RC process in two types of raisin (round and long) produced in Khorasan region. Two levels of solvent (1:1 and 2:1), three levels of extraction temperature (60°C, 70°C and 80°C), and three levels of concentration temperature (50°C, 60°C and 70°C) were the treatments. Finally physicochemical characteristics of the obtained concentrate such as color, viscosity, percentage of reduction sugar, acidity and the microbial tests (mould and yeast) were counted. The analysis was performed on the basis of factorial in the form of completely randomized design (CRD) and Duncan's multiple range test (DMRT) was used for the comparison of the means. Statistical analysis of results showed that optimal conditions for production of concentrate is round raisins when the solvent ratio was 2:1 with extraction temperature of 60°C and then concentration temperature of 50°C. Round raisin is cheaper than the long one, and it is more economical to concentrate production. Furthermore, round raisin has more aromas and the less color degree with increasing the temperature of concentration and extraction. Finally, according to mentioned factors the concentrate of round raisin is recommended.Keywords: Raisin concentrate, optimization, process parameters, round raisin, Iran.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16007518 Composite Kernels for Public Emotion Recognition from Twitter
Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang
Abstract:
The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.
Keywords: Public emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7737517 Analysis of Roasted and Ground Grains on the Seoul (Korea) Market for Their Contaminants of Aflatoxins, Ochratoxin A and Fusarium Toxins by LC-MS/MS
Authors: So-young Jung, Bu-chuhl Choe, Gi-young Shin, Jung-hun Kim, Young-zoo Chae
Abstract:
A sensitive and specific method for quantitative determination of aflatoxins(B1, B2, G1,G2), deoxynivalenol, fumonisin(B1,B2), ochratoxin A, zearalenone, T-2 and HT-2 in roasted and ground grains using liquid chromatography combined with tandem mass spectrometry. A double extraction using a phosphate buffer solution followed by methanol was applied to achieve effective co extraction of 11 mycotoxins. A multitoxin immunoaffinity column for all these mycotoxins was used to clean up the extract. The LODs of mycotoxins were 0.1~6.1 μg/kg, LOQs were 0.3~18.4 μg/kg. Forty seven samples collected from Seoul (Korea) for mycotoxin contamination monitoring. The results showed that the occurrence of zearalenone and deoxynivalenol were frequent. Zearalenone was detected in all samples and deoxynivalenol was detected in 80.9 % samples in the range 0.626 ~ 29.264 μg/kg and N.D ~ 48.332 μg/kg respectively. Fumonisins and ochratoxin A were detected in 46.8% samples and 17 % samples respectively, aflatoxins and T-2/HT-2 toxins were not detected all samples.Keywords: LC-MS/MS, mycotoxins, roasted and ground grains.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19967516 Text Mining Analysis of the Reconstruction Plans after the Great East Japan Earthquake
Authors: Minami Ito, Akihiro Iijima
Abstract:
On March 11, 2011, the Great East Japan Earthquake occurred off the coast of Sanriku, Japan. It is important to build a sustainable society through the reconstruction process rather than simply restoring the infrastructure. To compare the goals of reconstruction plans of quake-stricken municipalities, Japanese language morphological analysis was performed by using text mining techniques. Frequently-used nouns were sorted into four main categories of “life”, “disaster prevention”, “economy”, and “harmony with environment”. Because Soma City is affected by nuclear accident, sentences tagged to “harmony with environment” tended to be frequent compared to the other municipalities. Results from cluster analysis and principle component analysis clearly indicated that the local government reinforces the efforts to reduce risks from radiation exposure as a top priority.
Keywords: Eco-friendly reconstruction, harmony with environment, decontamination, nuclear disaster.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19667515 A Digital Twin Approach for Sustainable Territories Planning: A Case Study on District Heating
Authors: A. Amrani, O. Allali, A. Ben Hamida, F. Defrance, S. Morland, E. Pineau, T. Lacroix
Abstract:
The energy planning process is a very complex task that involves several stakeholders and requires the consideration of several local and global factors and constraints. In order to optimize and simplify this process, we propose a tool-based iterative approach applied to district heating planning. We build our tool with the collaboration of a French territory using actual district data and implementing the European incentives. We set up an iterative process including data visualization and analysis, identification and extraction of information related to the area concerned by the operation, design of sustainable planning scenarios leveraging local renewable and recoverable energy sources, and finally, the evaluation of scenarios. The last step is performed by a dynamic digital twin replica of the city. Territory’s energy experts confirm that the tool provides them with valuable support towards sustainable energy planning.
Keywords: Climate change, data management, decision support, digital twin, district heating, energy planning, renewables, smart city.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6547514 Using Data Mining in Automotive Safety
Authors: Carine Cridelich, Pablo Juesas Cano, Emmanuel Ramasso, Noureddine Zerhouni, Bernd Weiler
Abstract:
Safety is one of the most important considerations when buying a new car. While active safety aims at avoiding accidents, passive safety systems such as airbags and seat belts protect the occupant in case of an accident. In addition to legal regulations, organizations like Euro NCAP provide consumers with an independent assessment of the safety performance of cars and drive the development of safety systems in automobile industry. Those ratings are mainly based on injury assessment reference values derived from physical parameters measured in dummies during a car crash test. The components and sub-systems of a safety system are designed to achieve the required restraint performance. Sled tests and other types of tests are then carried out by car makers and their suppliers to confirm the protection level of the safety system. A Knowledge Discovery in Databases (KDD) process is proposed in order to minimize the number of tests. The KDD process is based on the data emerging from sled tests according to Euro NCAP specifications. About 30 parameters of the passive safety systems from different data sources (crash data, dummy protocol) are first analysed together with experts opinions. A procedure is proposed to manage missing data and validated on real data sets. Finally, a procedure is developed to estimate a set of rough initial parameters of the passive system before testing aiming at reducing the number of tests.
Keywords: KDD process, passive safety systems, sled test, dummy injury assessment reference values, frontal impact
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28447513 Rule Insertion Technique for Dynamic Cell Structure Neural Network
Authors: Osama Elsarrar, Marjorie Darrah, Richard Devin
Abstract:
This paper discusses the idea of capturing an expert’s knowledge in the form of human understandable rules and then inserting these rules into a dynamic cell structure (DCS) neural network. The DCS is a form of self-organizing map that can be used for many purposes, including classification and prediction. This particular neural network is considered to be a topology preserving network that starts with no pre-structure, but assumes a structure once trained. The DCS has been used in mission and safety-critical applications, including adaptive flight control and health-monitoring in aerial vehicles. The approach is to insert expert knowledge into the DCS before training. Rules are translated into a pre-structure and then training data are presented. This idea has been demonstrated using the well-known Iris data set and it has been shown that inserting the pre-structure results in better accuracy with the same training.
Keywords: Neural network, rule extraction, rule insertion, self-organizing map.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5307512 Comparisons of Surveying with Terrestrial Laser Scanner and Total Station for Volume Determination of Overburden and Coal Excavations in Large Open-Pit Mine
Authors: B. Keawaram, P. Dumrongchai
Abstract:
The volume of overburden and coal excavations in open-pit mine is generally determined by conventional survey such as total station. This study aimed to evaluate the accuracy of terrestrial laser scanner (TLS) used to measure overburden and coal excavations, and to compare TLS survey data sets with the data of the total station. Results revealed that, the reference points measured with the total station showed 0.2 mm precision for both horizontal and vertical coordinates. When using TLS on the same points, the standard deviations of 4.93 cm and 0.53 cm for horizontal and vertical coordinates, respectively, were achieved. For volume measurements covering the mining areas of 79,844 m2, TLS yielded the mean difference of about 1% and the surface error margin of 6 cm at the 95% confidence level when compared to the volume obtained by total station.
Keywords: Mine, survey, terrestrial laser scanner, total station.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16637511 Over-Height Vehicle Detection in Low Headroom Roads Using Digital Video Processing
Authors: Vahid Khorramshahi, Alireza Behrad, Neeraj K. Kanhere
Abstract:
In this paper we present a new method for over-height vehicle detection in low headroom streets and highways using digital video possessing. The accuracy and the lower price comparing to present detectors like laser radars and the capability of providing extra information like speed and height measurement make this method more reliable and efficient. In this algorithm the features are selected and tracked using KLT algorithm. A blob extraction algorithm is also applied using background estimation and subtraction. Then the world coordinates of features that are inside the blobs are estimated using a noble calibration method. As, the heights of the features are calculated, we apply a threshold to select overheight features and eliminate others. The over-height features are segmented using some association criteria and grouped using an undirected graph. Then they are tracked through sequential frames. The obtained groups refer to over-height vehicles in a scene.Keywords: Feature extraction, over-height vehicle detection, traffic monitoring, vehicle tracking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28287510 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers
Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice
Abstract:
In recent days, there is a change and the ongoing development of the telecommunications sector in the global market. In this sector, churn analysis techniques are commonly used for analysing why some customers terminate their service subscriptions prematurely. In addition, customer churn is utmost significant in this sector since it causes to important business loss. Many companies make various researches in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, their usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed aiming to obtain the feature reducts to predict customer churn. The framework is a cost based optional pre-processing stage to remove redundant features for churn management. In addition, this cost-based feature selection algorithm is applied in a telecommunication company in Turkey and the results obtained with this algorithm.
Keywords: Churn prediction, data mining, decision-theoretic rough set, feature selection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17637509 The Autoregresive Analysis for Wind Turbine Signal Postprocessing
Authors: Daniel Pereiro, Felix Martinez, Iker Urresti, Ana Gomez Gonzalez
Abstract:
Today modern simulations solutions in the wind turbine industry have achieved a high degree of complexity and detail in result. Limitations exist when it is time to validate model results against measurements. Regarding Model validation it is of special interest to identify mode frequencies and to differentiate them from the different excitations. A wind turbine is a complex device and measurements regarding any part of the assembly show a lot of noise. Input excitations are difficult or even impossible to measure due to the stochastic nature of the environment. Traditional techniques for frequency analysis or features extraction are widely used to analyze wind turbine sensor signals, but have several limitations specially attending to non stationary signals (Events). A new technique based on autoregresive analysis techniques is introduced here for a specific application, a comparison and examples related to different events in the wind turbine operations are presented.
Keywords: Wind turbine, signal processing, mode extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15677508 Application of Kansei Engineering and Association Rules Mining in Product Design
Authors: Pitaktiratham J., Sinlan T., Anuntavoranich P., Sinthupinyo S.
Abstract:
The Kansei engineering is a technology which converts human feelings into quantitative terms and helps designers develop new products that meet customers- expectation. Standard Kansei engineering procedure involves finding relationships between human feelings and design elements of which many researchers have found forward and backward relationship through various soft computing techniques. In this paper, we proposed the framework of Kansei engineering linking relationship not only between human feelings and design elements, but also the whole part of product, by constructing association rules. In this experiment, we obtain input from emotion score that subjects rate when they see the whole part of the product by applying semantic differentials. Then, association rules are constructed to discover the combination of design element which affects the human feeling. The results of our experiment suggest the pattern of relationship of design elements according to human feelings which can be derived from the whole part of product.Keywords: Association Rules Mining, Kansei Engineering, Product Design, Semantic Differentials
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25237507 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes
Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani
Abstract:
Development of a method to estimate gene functions is an important task in bioinformatics. One of the approaches for the annotation is the identification of the metabolic pathway that genes are involved in. Since gene expression data reflect various intracellular phenomena, those data are considered to be related with genes’ functions. However, it has been difficult to estimate the gene function with high accuracy. It is considered that the low accuracy of the estimation is caused by the difficulty of accurately measuring a gene expression. Even though they are measured under the same condition, the gene expressions will vary usually. In this study, we proposed a feature extraction method focusing on the variability of gene expressions to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which is a method for calculating the similarity between distributions. Finally, we utilized the similarity vectors as feature vectors and trained the multiclass SVM for identifying the genes' metabolic pathway. To evaluate our developed method, we applied the method to budding yeast and trained the multiclass SVM for identifying the seven metabolic pathways. As a result, the accuracy that calculated by our developed method was higher than the one that calculated from the raw gene expression data. Thus, our developed method combined with KL divergence is useful for identifying the genes' metabolic pathway.
Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23367506 A Hypercube Social Feature Extraction and Multipath Routing in Delay Tolerant Networks
Authors: S. Balaji, M. Rajaram, Y. Harold Robinson, E. Golden Julie
Abstract:
Delay Tolerant Networks (DTN) which have sufficient state information include trajectory and contact information, to protect routing efficiency. However, state information is dynamic and hard to obtain without a global and/or long-term collection process. To deal with these problems, the internal social features of each node are introduced in the network to perform the routing process. This type of application is motivated from several human contact networks where people contact each other more frequently if they have more social features in common. Two unique processes were developed for this process; social feature extraction and multipath routing. The routing method then becomes a hypercube–based feature matching process. Furthermore, the effectiveness of multipath routing is evaluated and compared to that of single-path routing.
Keywords: Delay tolerant networks, entropy, human contact networks, hyper cubes, multipath Routing, social features.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13057505 A Novel In-Place Sorting Algorithm with O(n log z) Comparisons and O(n log z) Moves
Authors: Hanan Ahmed-Hosni Mahmoud, Nadia Al-Ghreimil
Abstract:
In-place sorting algorithms play an important role in many fields such as very large database systems, data warehouses, data mining, etc. Such algorithms maximize the size of data that can be processed in main memory without input/output operations. In this paper, a novel in-place sorting algorithm is presented. The algorithm comprises two phases; rearranging the input unsorted array in place, resulting segments that are ordered relative to each other but whose elements are yet to be sorted. The first phase requires linear time, while, in the second phase, elements of each segment are sorted inplace in the order of z log (z), where z is the size of the segment, and O(1) auxiliary storage. The algorithm performs, in the worst case, for an array of size n, an O(n log z) element comparisons and O(n log z) element moves. Further, no auxiliary arithmetic operations with indices are required. Besides these theoretical achievements of this algorithm, it is of practical interest, because of its simplicity. Experimental results also show that it outperforms other in-place sorting algorithms. Finally, the analysis of time and space complexity, and required number of moves are presented, along with the auxiliary storage requirements of the proposed algorithm.
Keywords: Auxiliary storage sorting, in-place sorting, sorting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19107504 Forecasting Fraudulent Financial Statements using Data Mining
Authors: S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas
Abstract:
This paper explores the effectiveness of machine learning techniques in detecting firms that issue fraudulent financial statements (FFS) and deals with the identification of factors associated to FFS. To this end, a number of experiments have been conducted using representative learning algorithms, which were trained using a data set of 164 fraud and non-fraud Greek firms in the recent period 2001-2002. The decision of which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system incorporating a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we have implemented a hybrid decision support system that combines the representative algorithms using a stacking variant methodology and achieves better performance than any examined simple and ensemble method. To sum up, this study indicates that the investigation of financial information can be used in the identification of FFS and underline the importance of financial ratios.Keywords: Machine learning, stacking, classifier.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30537503 The Implementation of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications
Authors: Mohamed R. Mhereeg
Abstract:
The paper discusses the implementation of the MultiAgent classification System (MACS) and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies, which are the .NET widows service based agents, the Windows Communication Foundation (WCF) services, the Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). The Microsoft's .NET widows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW. The Monitoring Agents (MAs) were configured to execute automatically to monitor excel spreadsheets development activities by content. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent (DUA) residing in the .NET client application of the system. This agent then transfers and stores the data in Oracle server database via Oracle stored procedures for further processing that leads to the classification of the end user developers.
Keywords: MACS, Implementation, Multi-Agent, SOA, Autonomous, WCF.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17097502 Correlation-based Feature Selection using Ant Colony Optimization
Authors: M. Sadeghzadeh, M. Teshnehlab
Abstract:
Feature selection has recently been the subject of intensive research in data mining, specially for datasets with a large number of attributes. Recent work has shown that feature selection can have a positive effect on the performance of machine learning algorithms. The success of many learning algorithms in their attempts to construct models of data, hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building process phase can result in poor predictive performance and increased computation. In this paper, a novel feature search procedure that utilizes the Ant Colony Optimization (ACO) is presented. The ACO is a metaheuristic inspired by the behavior of real ants in their search for the shortest paths to food sources. It looks for optimal solutions by considering both local heuristics and previous knowledge. When applied to two different classification problems, the proposed algorithm achieved very promising results.
Keywords: Ant colony optimization, Classification, Datamining, Feature selection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2420