Search results for: Data Mining Techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9218

Search results for: Data Mining Techniques

8438 The Functional Magnetic Resonance Imaging and the Consumer Behaviour: Reviewing Recent Research

Authors: Mikel Alonso López

Abstract:

In the first decade of the twenty-first century, advanced imaging techniques began to be applied for neuroscience research. The Functional Magnetic Resonance Imaging (fMRI) is one of the most important and most used research techniques for the investigation of emotions, because of its ease to observe the brain areas that oxygenate when performing certain tasks. In this research, we make a review about the main research carried out on the influence of the emotions in the decision-making process that is exposed by using the fMRI.

Keywords: Decision making, emotions, fMRI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
8437 Genetic Programming Approach to Hierarchical Production Rule Discovery

Authors: Basheer M. Al-Maqaleh, Kamal K. Bharadwaj

Abstract:

Automated discovery of hierarchical structures in large data sets has been an active research area in the recent past. This paper focuses on the issue of mining generalized rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses flat rules as initial individuals of GP and discovers hierarchical structure. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Keywords: Genetic Programming, Hierarchy, Knowledge Discovery in Database, Subsumption Matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1434
8436 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: A. Appe, B. Poluparthi, L. Kasivajjula, U. Mv, S. Bagadi, P. Modi, A. Singh, H. Gunupudi, S. Troiano, J. Paul, J. Stovall, J. Yamamoto

Abstract:

The necessity of data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework which helps identify factors impacting a healthcare provider facility or a hospital (from here on termed as facility) market share is of key importance. This pilot study aims at developing a data-driven machine learning-regression framework which aids strategists in formulating key decisions to improve the facility’s market share which in turn impacts in improving the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and the data spanning 60 key facilities in Washington State and about 3 years of historical data are considered. In the current analysis, market share is termed as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The current study proposes a two-pronged approach of competitor identification and regression approach to evaluate and predict market share, respectively. Leveraged model agnostic technique, SHAP (SHapley Additive exPlanations), to quantify the relative importance of features impacting the market share. Typical techniques in literature to quantify the degree of competitiveness among facilities use an empirical method to calculate a competitive factor to interpret the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature level word vectors, and evaluates the key connected components at the facility level. This technique is robust since it is data-driven, which minimizes the bias from empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities along with a placeholder to factor in various business rules (for e.g., quantifying the patient exchanges, provider references, and sister facilities). Identified are the multiple groups of competitors among facilities. Leveraging the competitors' identified developed and fine-tuned Random Forest Regression model to predict the market share. To identify key drivers of market share at an overall level, permutation feature importance of the attributes was calculated. For relative quantification of features at a facility level, incorporated SHAP, a model agnostic explainer. This helped to identify and rank the attributes at each facility which impacts the market share. This approach proposes an amalgamation of the two popular and efficient modeling practices, viz., machine learning with graphs and tree-based regression techniques to reduce the bias. With these, we helped to drive strategic business decisions.

Keywords: Competition, DAGs, hospital, healthcare, machine learning, market share, random forest, SHAP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 233
8435 Coordinated Multi-Point Scheme Based On Channel State Information in MIMO-OFDM System

Authors: Su-Hyun Jung, Chang-Bin Ha, Hyoung-Kyu Song

Abstract:

Recently, increasing the quality of experience (QoE) is an important issue. Since performance degradation at cell edge extremely reduces the QoE, several techniques are defined at LTE/LTE-A standard to remove inter-cell interference (ICI). However, the conventional techniques have disadvantage because there is a trade-off between resource allocation and reliable communication. The proposed scheme reduces the ICI more efficiently by using channel state information (CSI) smartly. It is shown that the proposed scheme can reduce the ICI with fewer resources.

Keywords: Adaptive beam forming, CoMP, LTE-A, ICI reduction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2497
8434 Exploiting Machine Learning Techniques for the Enhancement of Acceptance Sampling

Authors: Aikaterini Fountoulaki, Nikos Karacapilidis, Manolis Manatakis

Abstract:

This paper proposes an innovative methodology for Acceptance Sampling by Variables, which is a particular category of Statistical Quality Control dealing with the assurance of products quality. Our contribution lies in the exploitation of machine learning techniques to address the complexity and remedy the drawbacks of existing approaches. More specifically, the proposed methodology exploits Artificial Neural Networks (ANNs) to aid decision making about the acceptance or rejection of an inspected sample. For any type of inspection, ANNs are trained by data from corresponding tables of a standard-s sampling plan schemes. Once trained, ANNs can give closed-form solutions for any acceptance quality level and sample size, thus leading to an automation of the reading of the sampling plan tables, without any need of compromise with the values of the specific standard chosen each time. The proposed methodology provides enough flexibility to quality control engineers during the inspection of their samples, allowing the consideration of specific needs, while it also reduces the time and the cost required for these inspections. Its applicability and advantages are demonstrated through two numerical examples.

Keywords: Acceptance Sampling, Neural Networks, Statistical Quality Control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1682
8433 A Rule-based Approach for Anomaly Detection in Subscriber Usage Pattern

Authors: Rupesh K. Gopal, Saroj K. Meher

Abstract:

In this report we present a rule-based approach to detect anomalous telephone calls. The method described here uses subscriber usage CDR (call detail record) data sampled over two observation periods: study period and test period. The study period contains call records of customers- non-anomalous behaviour. Customers are first grouped according to their similar usage behaviour (like, average number of local calls per week, etc). For customers in each group, we develop a probabilistic model to describe their usage. Next, we use maximum likelihood estimation (MLE) to estimate the parameters of the calling behaviour. Then we determine thresholds by calculating acceptable change within a group. MLE is used on the data in the test period to estimate the parameters of the calling behaviour. These parameters are compared against thresholds. Any deviation beyond the threshold is used to raise an alarm. This method has the advantage of identifying local anomalies as compared to techniques which identify global anomalies. The method is tested for 90 days of study data and 10 days of test data of telecom customers. For medium to large deviations in the data in test window, the method is able to identify 90% of anomalous usage with less than 1% false alarm rate.

Keywords: Subscription fraud, fraud detection, anomalydetection, maximum likelihood estimation, rule based systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2798
8432 Video Data Mining based on Information Fusion for Tamper Detection

Authors: Girija Chetty, Renuka Biswas

Abstract:

In this paper, we propose novel algorithmic models based on information fusion and feature transformation in crossmodal subspace for different types of residue features extracted from several intra-frame and inter-frame pixel sub-blocks in video sequences for detecting digital video tampering or forgery. An evaluation of proposed residue features – the noise residue features and the quantization features, their transformation in cross-modal subspace, and their multimodal fusion, for emulated copy-move tamper scenario shows a significant improvement in tamper detection accuracy as compared to single mode features without transformation in cross-modal subspace.

Keywords: image tamper detection, digital forensics, correlation features image fusion

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1884
8431 A Specification-Based Approach for Retrieval of Reusable Business Component for Software Reuse

Authors: Meng Fanchao, Zhan Dechen, Xu Xiaofei

Abstract:

Software reuse can be considered as the most realistic and promising way to improve software engineering productivity and quality. Automated assistance for software reuse involves the representation, classification, retrieval and adaptation of components. The representation and retrieval of components are important to software reuse in Component-Based on Software Development (CBSD). However, current industrial component models mainly focus on the implement techniques and ignore the semantic information about component, so it is difficult to retrieve the components that satisfy user-s requirements. This paper presents a method of business component retrieval based on specification matching to solve the software reuse of enterprise information system. First, a business component model oriented reuse is proposed. In our model, the business data type is represented as sign data type based on XML, which can express the variable business data type that can describe the variety of business operations. Based on this model, we propose specification match relationships in two levels: business operation level and business component level. In business operation level, we use input business data types, output business data types and the taxonomy of business operations evaluate the similarity between business operations. In the business component level, we propose five specification matches between business components. To retrieval reusable business components, we propose the measure of similarity degrees to calculate the similarities between business components. Finally, a business component retrieval command like SQL is proposed to help user to retrieve approximate business components from component repository.

Keywords: Business component, business operation, business data type, specification matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1394
8430 Towards Modeling for Crashes A Low-Cost Adaptive Methodology for Karachi

Authors: Mohammad Ahmed Rehmatullah

Abstract:

The aim of this paper is to discuss a low-cost methodology that can predict traffic flow conflicts and quantitatively rank crash expectancies (based on relative probability) for various traffic facilities. This paper focuses on the application of statistical distributions to model traffic flow and Monte Carlo techniques to simulate traffic and discusses how to create a tool in order to predict the possibility of a traffic crash. A low-cost data collection methodology has been discussed for the heterogeneous traffic flow that exists and a GIS platform has been proposed to thematically represent traffic flow from simulations and the probability of a crash. Furthermore, discussions have been made to reflect the dynamism of the model in reference to its adaptability, adequacy, economy, and efficiency to ensure adoption.

Keywords: Heterogeneous traffic data collection, Monte CarloSimulation, Traffic Flow Modeling, GIS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1418
8429 On-The-Spot Spectators- Motivations, Experiences, and Satisfactions at the 2011 TPGA Ever Rich Championship – North Bay Open

Authors: Li-Wei Liu, Cheng-Yu Tsai, Ming-Tsang Wu

Abstract:

The study investigated the 2011 TPGA Ever Rich Championship – North Bay Open spectators- on-the-site spectating motivations, experiences, and satisfactions. The research was conducted on a convenience sample of the on-the-spot spectators at the North Bay Golf and Country Club. A total of 200 questionnaires were distributed, of which 185 valid questionnaires were collected, approaching a 92.5% response rate. The data obtained was analyzed with statistical techniques. First, the data showed significant differences in motivations, experiences, and satisfactions relative to demographic variables among the on-the-spot spectators. Second, spectating motivation, experience, and satisfaction were significantly related to one another.

Keywords: Spectating motivation, spectating experience, spectating satisfaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1276
8428 Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method

Authors: Dalin Si, Azizan Aziz, Bertrand Lasternas

Abstract:

To encourage building owners to purchase electricity at the wholesale market and reduce building peak demand, this study aims to develop models that predict day-ahead hourly electricity consumption and demand using artificial neural network (ANN) and support vector machine (SVM). All prediction models are built in Python, with tool Scikit-learn and Pybrain. The input data for both consumption and demand prediction are time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU), supply air temperature and solar radiation. Solar radiation, which is unavailable a day-ahead, is predicted at first, and then this estimation is used as an input to predict consumption and demand. Models to predict consumption and demand are trained in both SVM and ANN, and depend on cooling or heating, weekdays or weekends. The results show that ANN is the better option for both consumption and demand prediction. It can achieve 15.50% to 20.03% coefficient of variance of root mean square error (CVRMSE) for consumption prediction and 22.89% to 32.42% CVRMSE for demand prediction, respectively. To conclude, the presented models have potential to help building owners to purchase electricity at the wholesale market, but they are not robust when used in demand response control.

Keywords: Building energy prediction, data mining, demand response, electricity market.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2195
8427 The Investigation of Enzymatic Activity in the Soils under the Impact of Metallurgical Industrial Activity in Lori Marz, Armenia

Authors: T. H. Derdzyan, K. A. Ghazaryan, G. A. Gevorgyan

Abstract:

Beta-glucosidase, chitinase, leucine-aminopeptidase, acid phosphomonoesterase and acetate-esterase enzyme activities in the soils under the impact of metallurgical industrial activity in Lori marz (district) were investigated. The results of the study showed that the activities of the investigated enzymes in the soils decreased with increasing distance from the Shamlugh copper mine, the Chochkan tailings storage facility and the ore transportation road. Statistical analysis revealed that the activities of the enzymes were positively correlated (significant) to each other according to the observation sites which indicated that enzyme activities were affected by the same anthropogenic factor. The investigations showed that the soils were polluted with heavy metals (Cu, Pb, As, Co, Ni, Zn) due to copper mining activity in this territory. The results of Pearson correlation analysis revealed a significant negative correlation between heavy metal pollution degree (Nemerow integrated pollution index) and soil enzyme activity. All of this indicated that copper mining activity in this territory causing the heavy metal pollution of the soils resulted in the inhabitation of the activities of the enzymes which are considered as biological catalysts to decompose organic materials and facilitate the cycling of nutrients.

Keywords: Armenia, metallurgical industrial activity, heavy metal pollutionl, soil enzyme activity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2555
8426 A Developmental Survey of Local Stereo Matching Algorithms

Authors: André Smith, Amr Abdel-Dayem

Abstract:

This paper presents an overview of the history and development of stereo matching algorithms. Details from its inception, up to relatively recent techniques are described, noting challenges that have been surmounted across these past decades. Different components of these are explored, though focus is directed towards the local matching techniques. While global approaches have existed for some time, and demonstrated greater accuracy than their counterparts, they are generally quite slow. Many strides have been made more recently, allowing local methods to catch up in terms of accuracy, without sacrificing the overall performance.

Keywords: Developmental survey, local stereo matching, stereo correspondence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1451
8425 Does Practice Reflect Theory? An Exploratory Study of a Successful Knowledge Management System

Authors: Janet L. Kourik, Peter E. Maher

Abstract:

To investigate the correspondence of theory and practice, a successfully implemented Knowledge Management System (KMS) is explored through the lens of Alavi and Leidner-s proposed KMS framework for the analysis of an information system in knowledge management (Framework-AISKM). The applied KMS system was designed to manage curricular knowledge in a distributed university environment. The motivation for the KMS is discussed along with the types of knowledge necessary in an academic setting. Elements of the KMS involved in all phases of capturing and disseminating knowledge are described. As the KMS matures the resulting data stores form the precursor to and the potential for knowledge mining. The findings from this exploratory study indicate substantial correspondence between the successful KMS and the theory-based framework providing provisional confirmation for the framework while suggesting factors that contributed to the system-s success. Avenues for future work are described.

Keywords: Applied KMS, education, knowledge management (KM), KM framework, knowledge management system (KMS).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1024
8424 Real-Time Visual Simulation and Interactive Animation of Shadow Play Puppets Using OpenGL

Authors: Tan Kian Lam, Abdullah Zawawi bin Haji Talib, Mohd. Azam Osman

Abstract:

This paper describes a method of modeling to model shadow play puppet using sophisticated computer graphics techniques available in OpenGL in order to allow interactive play in real-time environment as well as producing realistic animation. This paper proposes a novel real-time method is proposed for modeling of puppet and its shadow image that allows interactive play of virtual shadow play using texture mapping and blending techniques. Special effects such as lighting and blurring effects for virtual shadow play environment are also developed. Moreover, the use of geometric transformations and hierarchical modeling facilitates interaction among the different parts of the puppet during animation. Based on the experiments and the survey that were carried out, the respondents involved are very satisfied with the outcomes of these techniques.

Keywords: Animation, blending, hierarchical modeling, interactive play, real-time, shadow play, visual simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2491
8423 Development of a Technology Assessment Model by Patents and Customers' Review Data

Authors: Kisik Song, Sungjoo Lee

Abstract:

Recent years have seen an increasing number of patent disputes due to excessive competition in the global market and a reduced technology life-cycle; this has increased the risk of investment in technology development. While many global companies have started developing a methodology to identify promising technologies and assess for decisions, the existing methodology still has some limitations. Post hoc assessments of the new technology are not being performed, especially to determine whether the suggested technologies turned out to be promising. For example, in existing quantitative patent analysis, a patent’s citation information has served as an important metric for quality assessment, but this analysis cannot be applied to recently registered patents because such information accumulates over time. Therefore, we propose a new technology assessment model that can replace citation information and positively affect technological development based on post hoc analysis of the patents for promising technologies. Additionally, we collect customer reviews on a target technology to extract keywords that show the customers’ needs, and we determine how many keywords are covered in the new technology. Finally, we construct a portfolio (based on a technology assessment from patent information) and a customer-based marketability assessment (based on review data), and we use them to visualize the characteristics of the new technologies.

Keywords: Technology assessment, patents, citation information, opinion mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 978
8422 Research on IBR-Driven Distributed Collaborative Visualization System

Authors: Yin Runmin, Song Changfeng

Abstract:

Image-based Rendering(IBR) techniques recently reached in broad fields which leads to a critical challenge to build up IBR-Driven visualization platform where meets requirement of high performance, large bounds of distributed visualization resource aggregation and concentration, multiple operators deploying and CSCW design employing. This paper presents an unique IBR-based visualization dataflow model refer to specific characters of IBR techniques and then discusses prominent feature of IBR-Driven distributed collaborative visualization (DCV) system before finally proposing an novel prototype. The prototype provides a well-defined three level modules especially work as Central Visualization Server, Local Proxy Server and Visualization Aid Environment, by which data and control for collaboration move through them followed the previous dataflow model. With aid of this triple hierarchy architecture of that, IBR oriented application construction turns to be easy. The employed augmented collaboration strategy not only achieve convenient multiple users synchronous control and stable processing management, but also is extendable and scalable.

Keywords: Image-Based Rendering, Distributed CollaborativeVisualization, Computer Supported Cooperative Work, Model andSimulation, Modular Visualization Environment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1469
8421 Latent Semantic Inference for Agriculture FAQ Retrieval

Authors: Dawei Wang, Rujing Wang, Ying Li, Baozi Wei

Abstract:

FAQ system can make user find answer to the problem that puzzles them. But now the research on Chinese FAQ system is still on the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To enhance the efficiency, a small pool of the candidate question-answering pairs retrieved from the system for the follow-up work according to the concept of the agriculture domain extracted from user input .Input queries or questions are converted into four parts, the question word segment (QWS), the verb segment (VS), the concept of agricultural areas segment (CS), the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the pool of the candidate. A thesaurus constructed from the HowNet, a Chinese knowledge base, is adopted for word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and answer part in an FAQ question-answer pair is matched with the input query, respectively. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are experimented on an agriculture FAQ system. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in agriculture FAQ retrieval.

Keywords: FAQ, Semantic Inference, Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1369
8420 Churn Prediction: Does Technology Matter?

Authors: John Hadden, Ashutosh Tiwari, Rajkumar Roy, Dymitr Ruta

Abstract:

The aim of this paper is to identify the most suitable model for churn prediction based on three different techniques. The paper identifies the variables that affect churn in reverence of customer complaints data and provides a comparative analysis of neural networks, regression trees and regression in their capabilities of predicting customer churn.

Keywords: Churn, Decision Trees, Neural Networks, Regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3275
8419 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1250
8418 Identification of Promising Infant Clusters to Obtain Improved Block Layout Designs

Authors: Mustahsan Mir, Ahmed Hassanin, Mohammed A. Al-Saleh

Abstract:

The layout optimization of building blocks of unequal areas has applications in many disciplines including VLSI floorplanning, macrocell placement, unequal-area facilities layout optimization, and plant or machine layout design. A number of heuristics and some analytical and hybrid techniques have been published to solve this problem. This paper presents an efficient high-quality building-block layout design technique especially suited for solving large-size problems. The higher efficiency and improved quality of optimized solutions are made possible by introducing the concept of Promising Infant Clusters in a constructive placement procedure. The results presented in the paper demonstrate the improved performance of the presented technique for benchmark problems in comparison with published heuristic, analytic, and hybrid techniques.

Keywords: Block layout problem, building-block layout design, CAD, optimization, search techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1228
8417 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1739
8416 Comparison of Different Hydrograph Routing Techniques in XPSTORM Modelling Software: A Case Study

Authors: Fatema Akram, Mohammad Golam Rasul, Mohammad Masud Kamal Khan, Md. Sharif Imam Ibne Amir

Abstract:

A variety of routing techniques are available to develop surface runoff hydrographs from rainfall. The selection of runoff routing method is very vital as it is directly related to the type of watershed and the required degree of accuracy. There are different modelling softwares available to explore the rainfall-runoff process in urban areas. XPSTORM, a link-node based, integrated stormwater modelling software, has been used in this study for developing surface runoff hydrograph for a Golf course area located in Rockhampton in Central Queensland in Australia. Four commonly used methods, namely SWMM runoff, Kinematic wave, Laurenson, and Time-Area are employed to generate runoff hydrograph for design storm of this study area. In runoff mode of XPSTORM, the rainfall, infiltration, evaporation and depression storage for subcatchments were simulated and the runoff from the subcatchment to collection node was calculated. The simulation results are presented, discussed and compared. The total surface runoff generated by SWMM runoff, Kinematic wave and Time-Area methods are found to be reasonably close, which indicates any of these methods can be used for developing runoff hydrograph of the study area. Laurenson method produces a comparatively less amount of surface runoff, however, it creates highest peak of surface runoff among all which may be suitable for hilly region. Although the Laurenson hydrograph technique is widely acceptable surface runoff routing technique in Queensland (Australia), extensive investigation is recommended with detailed topographic and hydrologic data in order to assess its suitability for use in the case study area.

Keywords: ARI, design storm, IFD, rainfall temporal pattern, routing techniques, surface runoff, XPSTORM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5028
8415 Steganalysis of Data Hiding via Halftoning and Coordinate Projection

Authors: Woong Hee Kim, Ilhwan Park

Abstract:

Steganography is the art of hiding and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the data. A lot of steganography algorithms have been proposed recently. Many of them use the digital image data as a carrier. In data hiding scheme of halftoning and coordinate projection, still image data is used as a carrier, and the data of carrier image are modified for data embedding. In this paper, we present three features for analysis of data hiding via halftoning and coordinate projection. Also, we present a classifier using the proposed three features.

Keywords: Steganography, steganalysis, digital halftoning, data hiding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1587
8414 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows SOA will solve the problems that facing integration process and if the biologist scientists can access the biological data in easier way. There are several methods to implement SOA but web service is the most popular method. The Microsoft .Net Framework used to implement proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2449
8413 Synthesis of Y2O3 Films by Spray Coating with Milled EDTA·Y·H Complexes

Authors: Keiji Komatsu, Tetsuo Sekiya, Ayumu Toyama, Atsushi Nakamura, Ikumi Toda, Shigeo Ohshio, Hiroyuki Muramatsu, Hidetoshi Saitoh, Atsushi Nakamura, Ariyuki Kato

Abstract:

Yttrium oxide (Y2O3) films have been successfully deposited with yttrium-ethylenediamine tetraacetic acid (EDTA·Y·H) complexes prepared by various milling techniques. The effects of the properties of the EDTA·Y·H complex on the properties of the deposited Y2O3 films have been analyzed. Seven different types of the raw EDTA·Y·H complexes were prepared by various commercial milling techniques such as ball milling, hammer milling, commercial milling, and mortar milling. The milled EDTA·Y·H complexes exhibited various particle sizes and distributions, depending on the milling method. Furthermore, we analyzed the crystal structure, morphology and elemental distribution profile of the metal oxide films deposited on stainless steel substrate with the milled EDTA·Y·H complexes. Depending on the milling technique, the flow properties of the raw powders differed. The X-ray diffraction pattern of all the samples revealed the formation of Y2O3 crystalline phase, irrespective of the milling technique. Of all the different milling techniques, the hammer milling technique is considered suitable for fabricating dense Y2O3 films.

Keywords: Powder sizes and distributions, Flame spray coating techniques, Yttrium oxide.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2617
8412 Analysis of S.P.O Techniques for Prediction of Dynamic Behavior of the Plate

Authors: Byung-kyoo Jung, Weui-bong Jeong

Abstract:

In most cases, it is considerably difficult to directly measure structural vibration with a lot of sensors because of complex geometry, time and equipment cost. For this reason, this paper deals with the problem of locating sensors on a plate model by four advanced sensor placement optimization (S.P.O) techniques. It also suggests the evaluation index representing the characteristic of orthogonal between each of natural modes. The index value provides the assistance to selecting of proper S.P.O technique and optimal positions for monitoring of dynamic systems without the experiment.

Keywords: Genetic algorithm, Modal assurance criterion, Sensor placement optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665
8411 Spatial Data Science for Data Driven Urban Planning: The Youth Economic Discomfort Index for Rome

Authors: Iacopo Testi, Diego Pajarito, Nicoletta Roberto, Carmen Greco

Abstract:

Today, a consistent segment of the world’s population lives in urban areas, and this proportion will vastly increase in the next decades. Therefore, understanding the key trends in urbanization, likely to unfold over the coming years, is crucial to the implementation of sustainable urban strategies. In parallel, the daily amount of digital data produced will be expanding at an exponential rate during the following years. The analysis of various types of data sets and its derived applications have incredible potential across different crucial sectors such as healthcare, housing, transportation, energy, and education. Nevertheless, in city development, architects and urban planners appear to rely mostly on traditional and analogical techniques of data collection. This paper investigates the prospective of the data science field, appearing to be a formidable resource to assist city managers in identifying strategies to enhance the social, economic, and environmental sustainability of our urban areas. The collection of different new layers of information would definitely enhance planners' capabilities to comprehend more in-depth urban phenomena such as gentrification, land use definition, mobility, or critical infrastructural issues. Specifically, the research results correlate economic, commercial, demographic, and housing data with the purpose of defining the youth economic discomfort index. The statistical composite index provides insights regarding the economic disadvantage of citizens aged between 18 years and 29 years, and results clearly display that central urban zones and more disadvantaged than peripheral ones. The experimental set up selected the city of Rome as the testing ground of the whole investigation. The methodology aims at applying statistical and spatial analysis to construct a composite index supporting informed data-driven decisions for urban planning.

Keywords: Data science, spatial analysis, composite index, Rome, urban planning, youth economic discomfort index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 866
8410 Artificial Neural Network Development by means of Genetic Programming with Graph Codification

Authors: Daniel Rivero, Julián Dorado, Juan R. Rabuñal, Alejandro Pazos, Javier Pereira

Abstract:

The development of Artificial Neural Networks (ANNs) is usually a slow process in which the human expert has to test several architectures until he finds the one that achieves best results to solve a certain problem. This work presents a new technique that uses Genetic Programming (GP) for automatically generating ANNs. To do this, the GP algorithm had to be changed in order to work with graph structures, so ANNs can be developed. This technique also allows the obtaining of simplified networks that solve the problem with a small group of neurons. In order to measure the performance of the system and to compare the results with other ANN development methods by means of Evolutionary Computation (EC) techniques, several tests were performed with problems based on some of the most used test databases. The results of those comparisons show that the system achieves good results comparable with the already existing techniques and, in most of the cases, they worked better than those techniques.

Keywords: Artificial Neural Networks, Evolutionary Computation, Genetic Programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1448
8409 Rapid Processing Techniques Applied to Sintered Nickel Battery Technologies for Utility Scale Applications

Authors: J. D. Marinaccio, I. Mabbett, C. Glover, D. Worsley

Abstract:

Through use of novel modern/rapid processing techniques such as screen printing and Near-Infrared (NIR) radiative curing, process time for the sintering of sintered nickel plaques, applicable to alkaline nickel battery chemistries, has been drastically reduced from in excess of 200 minutes with conventional convection methods to below 2 minutes using NIR curing methods. Steps have also been taken to remove the need for forming gas as a reducing agent by implementing carbon as an in-situ reducing agent, within the ink formulation.

Keywords: Batteries, energy, iron, nickel, storage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2330