Search results for: review mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1563

Search results for: review mining.

963 Learning an Overcomplete Dictionary using a Cauchy Mixture Model for Sparse Decay

Authors: E. S. Gower, M. O. J. Hawksford

Abstract:

An algorithm for learning an overcomplete dictionary using a Cauchy mixture model for sparse decomposition of an underdetermined mixing system is introduced. The mixture density function is derived from a ratio sample of the observed mixture signals where 1) there are at least two but not necessarily more mixture signals observed, 2) the source signals are statistically independent and 3) the sources are sparse. The basis vectors of the dictionary are learned via the optimization of the location parameters of the Cauchy mixture components, which is shown to be more accurate and robust than the conventional data mining methods usually employed for this task. Using a well known sparse decomposition algorithm, we extract three speech signals from two mixtures based on the estimated dictionary. Further tests with additive Gaussian noise are used to demonstrate the proposed algorithm-s robustness to outliers.

Keywords: expectation-maximization, Pitman estimator, sparsedecomposition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1948
962 Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Authors: Eiad Yafi, M. A. Alam, Ranjit Biswas

Abstract:

Knowledge Discovery of Databases (KDD) is the process of extracting previously unknown but useful and significant information from large massive volume of databases. Data Mining is a stage in the entire process of KDD which applies an algorithm to extract interesting patterns. Usually, such algorithms generate huge volume of patterns. These patterns have to be evaluated by using interestingness measures to reflect the user requirements. Interestingness is defined in different ways, (i) Objective measures (ii) Subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user perspective. In this report, we try to brief the more widely spread and successful subjective measures and propose a new subjective measure of interestingness, i.e. shocking.

Keywords: Shocking rules (SHR).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1536
961 Literature-Based Discoveries in Lupus Treatment

Authors: Oluwaseyi Jaiyeoba, Vetria Byrd

Abstract:

Systemic lupus erythematosus (aka lupus) is a chronic disease known for its chameleon-like ability to mimic symptoms of other diseases rendering it hard to detect, diagnose and treat. The heterogeneous nature of the disease generates disparate data that are often multifaceted and multi-dimensional. Musculoskeletal manifestation of lupus is one of the most common clinical manifestations of lupus. This research links disparate literature on the treatment of lupus as it affects the musculoskeletal system using the discoveries from literature-based research articles available on the PubMed database. Several Natural Language Processing (NPL) tools exist to connect disjointed but related literature, such as Connected Papers, Bitola, and Gopalakrishnan. Literature-based discovery (LBD) has been used to bridge unconnected disciplines based on text mining procedures. The technical/medical literature consists of many technical/medical concepts, each having its  sub-literature. This approach has been used to link Parkinson’s, Raynaud, and Multiple Sclerosis treatment within works of literature.  Literature-based discovery methods can connect two or more related but disjointed literature concepts to produce a novel and plausible approach to solving a research problem. Data visualization techniques with the help of natural language processing tools are used to visually represent the result of literature-based discoveries. Literature search results can be voluminous, but Data visualization processes can provide insight and detect subtle patterns in large data. These insights and patterns can lead to discoveries that would have otherwise been hidden from disjointed literature. In this research, literature data are mined and combined with visualization techniques for heterogeneous data to discover viable treatments reported in the literature for lupus expression in the musculoskeletal system. This research answers the question of using literature-based discovery to identify potential treatments for a multifaceted disease like lupus. A three-pronged methodology is used in this research: text mining, natural language processing, and data visualization. These three research-related fields are employed to identify patterns in lupus-related data that, when visually represented, could aid research in the treatment of lupus. This work introduces a method for visually representing interconnections of various lupus-related literature. The methodology outlined in this work is the first step toward literature-based research and treatment planning for the musculoskeletal manifestation of lupus. The results also outline the interconnection of complex, disparate data associated with the manifestation of lupus in the musculoskeletal system. The societal impact of this work is broad. Advances in this work will improve the quality of life for millions of persons in the workforce currently diagnosed and silently living with a musculoskeletal disease associated with lupus.

Keywords: Systemic lupus erythematosus, LBD, Data Visualization, musculoskeletal system, treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 506
960 Integration of Support Vector Machine and Bayesian Neural Network for Data Mining and Classification

Authors: Essam Al-Daoud

Abstract:

Several combinations of the preprocessing algorithms, feature selection techniques and classifiers can be applied to the data classification tasks. This study introduces a new accurate classifier, the proposed classifier consist from four components: Signal-to- Noise as a feature selection technique, support vector machine, Bayesian neural network and AdaBoost as an ensemble algorithm. To verify the effectiveness of the proposed classifier, seven well known classifiers are applied to four datasets. The experiments show that using the suggested classifier enhances the classification rates for all datasets.

Keywords: AdaBoost, Bayesian neural network, Signal-to-Noise, support vector machine, MCMC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2020
959 The Investigation on the Relationship between Religion and Development: By Focusing on Islam

Authors: Dalal Benboutrif

Abstract:

Religion and Development relation is one of the most arguable phrases amongst politicians, philosophers, clerics, scientists, sociologists and even the public. The main objective of this research is to clarify the relations, contrasts and interactions between religion and the major types of development including social, political, economic and scientific developments, by focusing on Islam religion. A review of the literature was performed concerning religion and development relations and conflicts, by focusing on Islam religion and then the unprocessed tips of the review were characterized. Regarding clarification of the key points of the literature, three main sectors were considered in the research. The first sector of the research mainly focused on the philosophical views on religion, which were analyzed by main evaluation of three famous philosophers’ ideas: ‘Kant’, ‘Hegel’ and ‘Weber’, and then a critical discussion on Weber’s idea about Islam and development was applied. The second sector was specified to ‘Religion and Development’ and mainly discussed the role of religion in development through poverty reduction, the interconnection of religion, spirituality and social development, religious education effects on social development, and the relation of religion with political development. The third sector was specified to ‘Islam and Development’ and mainly discussed the Islamic golden age of science, major reasons of today’s backwardness (non-development) of most Islamic countries, and Quranic instructions regarding adaptability of Islam with development. The findings of the current research approved the research hypothesis as: ‘Religious instructions (included Islam) are not in conflict with development’, rather, it could have positive effects mainly on social development and it can pave the way for society to develop. Turkey was considered as a study model, as a successful developed Islamic country demonstrating the non-conflicting relation of Islam and development.

Keywords: Development, Islam, philosophy, religion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1673
958 Decision-Making Strategies on Smart Dairy Farms: A Review

Authors: L. Krpalkova, N. O' Mahony, A. Carvalho, S. Campbell, G. Corkery, E. Broderick, J. Walsh

Abstract:

Farm management and operations will drastically change due to access to real-time data, real-time forecasting and tracking of physical items in combination with Internet of Things (IoT) developments to further automate farm operations. Dairy farms have embraced technological innovations and procured vast amounts of permanent data streams during the past decade; however, the integration of this information to improve the whole farm decision-making process does not exist. It is now imperative to develop a system that can collect, integrate, manage, and analyze on-farm and off-farm data in real-time for practical and relevant environmental and economic actions. The developed systems, based on machine learning and artificial intelligence, need to be connected for useful output, a better understanding of the whole farming issue and environmental impact. Evolutionary Computing (EC) can be very effective in finding the optimal combination of sets of some objects and finally, in strategy determination. The system of the future should be able to manage the dairy farm as well as an experienced dairy farm manager with a team of the best agricultural advisors. All these changes should bring resilience and sustainability to dairy farming as well as improving and maintaining good animal welfare and the quality of dairy products. This review aims to provide an insight into the state-of-the-art of big data applications and EC in relation to smart dairy farming and identify the most important research and development challenges to be addressed in the future. Smart dairy farming influences every area of management and its uptake has become a continuing trend.

Keywords: Big data, evolutionary computing, cloud, precision technologies

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 756
957 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described based on various viewpoints. The method acquires 2- class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. The method infers whether the viewpoints are assigned to the new items or not by using the models. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1290
956 Bio-Inspired Design Approach Analysis: A Case Study of Antoni Gaudi and Santiago Calatrava

Authors: Marzieh Imani

Abstract:

Antoni Gaudi and Santiago Calatrava have reputation for designing bio-inspired creative and technical buildings. Even though they have followed different independent approaches towards design, the source of bio-inspiration seems to be common. Taking a closer look at their projects reveals that Calatrava has been influenced by Gaudi in terms of interpreting nature and applying natural principles into the design process. This research firstly discusses the dialogue between Biomimicry and architecture. This review also explores human/nature discourse during the history by focusing on how nature revealed itself to the fine arts. This is explained by introducing naturalism and romantic style in architecture as the outcome of designers’ inclination towards nature. Reviewing the literature, theoretical background and practical illustration of nature have been included. The most dominant practical aspects of imitating nature are form and function. Nature has been reflected in architectural science resulted in shaping different architectural styles such as organic, green, sustainable, bionic, and biomorphic. By defining a set of common aspects of Gaudi and Calatrava‘s design approach and by considering biomimetic design categories (organism, ecosystem, and behaviour as the main division and form, function, process, material, and construction as subdivisions), Gaudi’s and Calatrava’s project have been analysed. This analysis explores if their design approaches are equivalent or different. Based on this analysis, Gaudi’s architecture can be recognised as biomorphic while Calatrava’s projects are literally biomimetic. Referring to these architects, this review suggests a new set of principles by which a bio-inspired project can be determined either biomorphic or biomimetic.

Keywords: Biomimicry, Calatrava, Gaudi, nature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3310
955 Network Anomaly Detection using Soft Computing

Authors: Surat Srinoy, Werasak Kurutach, Witcha Chimphlee, Siriporn Chimphlee

Abstract:

One main drawback of intrusion detection system is the inability of detecting new attacks which do not have known signatures. In this paper we discuss an intrusion detection method that proposes independent component analysis (ICA) based feature selection heuristics and using rough fuzzy for clustering data. ICA is to separate these independent components (ICs) from the monitored variables. Rough set has to decrease the amount of data and get rid of redundancy and Fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us to recognize not only known attacks but also to detect activity that may be the result of a new, unknown attack. The experimental results on Knowledge Discovery and Data Mining- (KDDCup 1999) dataset.

Keywords: Network security, intrusion detection, rough set, ICA, anomaly detection, independent component analysis, rough fuzzy .

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1955
954 Sickle Cell Disease: Review of Managements in Pregnancy and the Outcome in Ampang Hospital, Selangor

Authors: Z. Nurzaireena, K. Azalea, T. Azirawaty, S. Jameela, G. Muralitharan

Abstract:

The aim of this study is the review of the management practices of sickle cell disease patients during pregnancy, as well as the maternal and neonatal outcome at Ampang Hospital, Selangor. The study consisted of a review of pregnant patients with sickle cell disease under follow up at the Hematology Clinic, Ampang Hospital over the last seven years to assess their management and maternal-fetal outcome. The results of the review show that Ampang Hospital is considered the public hematology centre for sickle cell disease and had successfully managed three pregnancies throughout the last seven years. Patients’ presentations, managements and maternal-fetal outcome were compared and reviewed for academic improvements. All three patients were seen very early in their pregnancy and had been given a regime of folic acid, antibiotics and thrombo-prophylactic drugs. Close monitoring of maternal and fetal well being was done by the hematologists and obstetricians. Among the patients, there were multiple admissions during the pregnancy for either a painful sickle cell bone crisis, haemolysis following an infection and anemia requiring phenotype- matched blood and exchange transfusions. Broad spectrum antibiotics coverage during and infection, hydration, pain management and venous-thrombolism prophylaxis were mandatory. The pregnancies managed to reach near term in the third trimester but all required emergency caesarean section for obstetric indications. All pregnancies resulted in live births with good fetal outcome. During post partum all were nursed closely in the high dependency units for further complications and were discharged well. Post partum follow up and contraception counseling was comprehensively given for future pregnancies. Sickle cell disease is uncommonly seen in the East, especially in the South East Asian region, yet more cases are seen in the current decade due to improved medical expertise and advance medical laboratory technologies. Pregnancy itself is a risk factor for sickle cell patients as increased thrombosis event and risk of infections can lead to multiple crisis, haemolysis, anemia and vaso-occlusive complications including eclampsia, cerebrovasular accidents and acute bone pain. Patients mostly require multiple blood product transfusions thus phenotype-matched blood is required to reduce the risk of alloimmunozation. Emphasizing the risks and complications in preconception counseling and establishing an ultimate pregnancy plan would probably reduce the risk of morbidity and mortality to the mother and unborn child. Early management for risk of infection, thromboembolic events and adequate hydration is mandatory. A holistic approach involving multidisciplinary team care between the hematologist, obstetricians, anesthetist, neonatologist and close nursing care for both mother and baby would ensure the best outcome. In conclusion, sickle cell disease by itself is a high risk medical condition and pregnancy would further amplify the risk. Thus, close monitoring with combine multidisciplinary care, counseling and educating the patients are crucial in achieving the safe outcome.

Keywords: Anemia, haemoglobinopathies, pregnancy, sickle cell disease.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1295
953 Web Usability : A Fuzzy Approach to the Navigation Structure Enhancement in a Website System, Case of Iranian Civil Aviation Organization Website

Authors: Hamed Qahri Saremi, Gholam Ali Montazer

Abstract:

With the proliferation of World Wide Web, development of web-based technologies and the growth in web content, the structure of a website becomes more complex and web navigation becomes a critical issue to both web designers and users. In this paper we define the content and web pages as two important and influential factors in website navigation and paraphrase the enhancement in the website navigation as making some useful changes in the link structure of the website based on the aforementioned factors. Then we suggest a new method for proposing the changes using fuzzy approach to optimize the website architecture. Applying the proposed method to a real case of Iranian Civil Aviation Organization (CAO) website, we discuss the results of the novel approach at the final section.

Keywords: Web content, Web navigation, Website system, Webusage mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1787
952 Context-aware Recommender Systems using Data Mining Techniques

Authors: Kyoung-jae Kim, Hyunchul Ahn, Sangwon Jeong

Abstract:

This study proposes a novel recommender system to provide the advertisements of context-aware services. Our proposed model is designed to apply a modified collaborative filtering (CF) algorithm with regard to the several dimensions for the personalization of mobile devices – location, time and the user-s needs type. In particular, we employ a classification rule to understand user-s needs type using a decision tree algorithm. In addition, we collect primary data from the mobile phone users and apply them to the proposed model to validate its effectiveness. Experimental results show that the proposed system makes more accurate and satisfactory advertisements than comparative systems.

Keywords: Location-based advertisement, Recommender system, Collaborative filtering, User needs type, Mobile user.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2174
951 Genetic Programming Approach to Hierarchical Production Rule Discovery

Authors: Basheer M. Al-Maqaleh, Kamal K. Bharadwaj

Abstract:

Automated discovery of hierarchical structures in large data sets has been an active research area in the recent past. This paper focuses on the issue of mining generalized rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses flat rules as initial individuals of GP and discovers hierarchical structure. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Keywords: Genetic Programming, Hierarchy, Knowledge Discovery in Database, Subsumption Matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1451
950 Towards Achieving Energy Efficiency in Kazakhstan

Authors: Aigerim Uyzbayeva, Valeriya Tyo, Nurlan Ibrayev

Abstract:

Kazakhstan is currently one of the dynamically developing states in its region. The stable growth in all sectors of the economy leads to a corresponding increase in energy consumption. Thus country consumes significant amount of energy due to the high level of industrialisation and the presence of energy-intensive manufacturing such as mining and metallurgy which in turn leads to low energy efficiency. With allowance for this the Government has set several priorities to adopt a transition of Republic of Kazakhstan to a “green economy”. This article provides an overview of Kazakhstan’s energy efficiency situation in for the period of 1991- 2014. First, the dynamics of production and consumption of conventional energy resources are given. Second, the potential of renewable energy sources is summarised followed by the description of GHG emissions trends in the country. Third, Kazakhstan’ national initiatives, policies and locally implemented projects in the field of energy efficiency are described.

Keywords: Energy efficiency in Kazakhstan, greenhouse gases, renewable energy, sustainable development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3538
949 Video Data Mining based on Information Fusion for Tamper Detection

Authors: Girija Chetty, Renuka Biswas

Abstract:

In this paper, we propose novel algorithmic models based on information fusion and feature transformation in crossmodal subspace for different types of residue features extracted from several intra-frame and inter-frame pixel sub-blocks in video sequences for detecting digital video tampering or forgery. An evaluation of proposed residue features – the noise residue features and the quantization features, their transformation in cross-modal subspace, and their multimodal fusion, for emulated copy-move tamper scenario shows a significant improvement in tamper detection accuracy as compared to single mode features without transformation in cross-modal subspace.

Keywords: image tamper detection, digital forensics, correlation features image fusion

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1899
948 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available all the time by handling intruder attacks. Intrusion Detection System (IDS) is one of the obtainable mechanisms that is used to sense and classify any abnormal actions. Therefore, the IDS must be always up to date with the latest intruder attacks signatures to preserve confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue as well learning the new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different Machine Learning Techniques. It mainly focuses on the KDD preprocess part in order to prepare a decent and fair experimental data set. The J48, MLP, and Bayes Network classifiers have been chosen for this study. It has been proven that the J48 classifier has achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.

Keywords: IDS, DDoS, MLP, KDD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 727
947 A Survey on Principal Aspects of Secure Image Transmission

Authors: Ali Soleymani, Zulkarnain Md Ali, Md Jan Nordin

Abstract:

This paper is a review on the aspects and approaches of design an image cryptosystem. First a general introduction given for cryptography and images encryption and followed by different techniques in image encryption and related works for each technique surveyed. Finally, general security analysis methods for encrypted images are mentioned.

Keywords: Image, cryptography, encryption, security, analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2385
946 A Review on WEB Resources in Teaching of Geotechnical Engineering

Authors: Amin Chegenizadeh, Hamid Nikraz

Abstract:

The use of computer hardware and software in education and training dates to the early 1940s, when American researchers developed flight simulators which used analog computers to generate simulated onboard instrument data.Computer software is widely used to help engineers and undergraduate student solve their problems quickly and more accurately. This paper presents the list of computer software in geotechnical engineering.

Keywords: Geotechnical, Teaching, Courseware

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1743
945 Does Practice Reflect Theory? An Exploratory Study of a Successful Knowledge Management System

Authors: Janet L. Kourik, Peter E. Maher

Abstract:

To investigate the correspondence of theory and practice, a successfully implemented Knowledge Management System (KMS) is explored through the lens of Alavi and Leidner-s proposed KMS framework for the analysis of an information system in knowledge management (Framework-AISKM). The applied KMS system was designed to manage curricular knowledge in a distributed university environment. The motivation for the KMS is discussed along with the types of knowledge necessary in an academic setting. Elements of the KMS involved in all phases of capturing and disseminating knowledge are described. As the KMS matures the resulting data stores form the precursor to and the potential for knowledge mining. The findings from this exploratory study indicate substantial correspondence between the successful KMS and the theory-based framework providing provisional confirmation for the framework while suggesting factors that contributed to the system-s success. Avenues for future work are described.

Keywords: Applied KMS, education, knowledge management (KM), KM framework, knowledge management system (KMS).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1037
944 Systems and Software Safety and Security

Authors: Marzieh Mokhtaripour

Abstract:

Security issue and the importance of the function of police to provide practical and psychological contexts in the community has been the main topics among researchers , police and security circles and this subject require to review and analysis mechanisms within the police and its interaction with other parts of the system for providing community safety. This paper examine national and social security in the Internet.

Keywords: Internet National security Social security

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1266
943 Importance of Simulation in Manufacturing

Authors: F. Hosseinpour, H. Hajihosseini

Abstract:

Simulation is a very helpful and valuable work tool in manufacturing. It can be used in industrial field allowing the system`s behavior to be learnt and tested. Simulation provides a low cost, secure and fast analysis tool. It also provides benefits, which can be reached with many different system configurations. Topics to be discussed include: Applications, Modeling, Validating, Software and benefits of simulation. This paper provides a comprehensive literature review on research efforts in simulation.

Keywords: Manufacturing, modeling, simulation, training.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8013
942 An Efficient and Generic Hybrid Framework for High Dimensional Data Clustering

Authors: Dharmveer Singh Rajput , P. K. Singh, Mahua Bhattacharya

Abstract:

Clustering in high dimensional space is a difficult problem which is recurrent in many fields of science and engineering, e.g., bioinformatics, image processing, pattern reorganization and data mining. In high dimensional space some of the dimensions are likely to be irrelevant, thus hiding the possible clustering. In very high dimensions it is common for all the objects in a dataset to be nearly equidistant from each other, completely masking the clusters. Hence, performance of the clustering algorithm decreases. In this paper, we propose an algorithmic framework which combines the (reduct) concept of rough set theory with the k-means algorithm to remove the irrelevant dimensions in a high dimensional space and obtain appropriate clusters. Our experiment on test data shows that this framework increases efficiency of the clustering process and accuracy of the results.

Keywords: High dimensional clustering, sub-space, k-means, rough set, discernibility matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1949
941 Knowledge Acquisition for the Construction of an Evolving Ontology: Application to Augmented Surgery

Authors: Nora Taleb, Sellami Mokhtar, Michel Simonet

Abstract:

This work concerns the evolution and the maintenance of an ontological resource in relation with the evolution of the corpus of texts from which it had been built. The knowledge forming a text corpus, especially in dynamic domains, is in continuous evolution. When a change in the corpus occurs, the domain ontology must evolve accordingly. Most methods manage ontology evolution independently from the corpus from which it is built; in addition, they treat evolution just as a process of knowledge addition, not considering other knowledge changes. We propose a methodology for managing an evolving ontology from a text corpus that evolves over time, while preserving the consistency and the persistence of this ontology. Our methodology is based on the changes made on the corpus to reflect the evolution of the considered domain - augmented surgery in our case. In this context, the results of text mining techniques, as well as the ARCHONTE method slightly modified, are used to support the evolution process.

Keywords: Corpus, Evolution, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1443
940 Using Pattern Search Methods for Minimizing Clustering Problems

Authors: Parvaneh Shabanzadeh, Malik Hj Abu Hassan, Leong Wah June, Maryam Mohagheghtabar

Abstract:

Clustering is one of an interesting data mining topics that can be applied in many fields. Recently, the problem of cluster analysis is formulated as a problem of nonsmooth, nonconvex optimization, and an algorithm for solving the cluster analysis problem based on nonsmooth optimization techniques is developed. This optimization problem has a number of characteristics that make it challenging: it has many local minimum, the optimization variables can be either continuous or categorical, and there are no exact analytical derivatives. In this study we show how to apply a particular class of optimization methods known as pattern search methods to address these challenges. These methods do not explicitly use derivatives, an important feature that has not been addressed in previous studies. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed method.

Keywords: Clustering functions, Non-smooth Optimization, Nonconvex Optimization, Pattern Search Method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1640
939 Characterizing Multivariate Thresholds in Industrial Engineering

Authors: Ali E. Abbas

Abstract:

This paper highlights some of the normative issues that might result by setting independent thresholds in risk analyses and particularly with safety regions. A second objective is to explain how such regions can be specified appropriately in a meaningful way. We start with a review of the importance of setting deterministic trade-offs among target requirements. We then show how to determine safety regions for risk analysis appropriately using utility functions.

Keywords: Decision analysis, thresholds, risk, reliability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1096
938 The Role of Fluid Catalytic Cracking in Process Optimisation for Petroleum Refineries

Authors: Chinwendu R. Nnabalu, Gioia Falcone, Imma Bortone

Abstract:

Petroleum refining is a chemical process in which the raw material (crude oil) is converted to finished commercial products for end users. The fluid catalytic cracking (FCC) unit is a key asset in refineries, requiring optimised processes in the context of engineering design. Following the first stage of separation of crude oil in a distillation tower, an additional 40 per cent quantity is attainable in the gasoline pool with further conversion of the downgraded product of crude oil (residue from the distillation tower) using a catalyst in the FCC process. Effective removal of sulphur oxides, nitrogen oxides, carbon and heavy metals from FCC gasoline requires greater separation efficiency and involves an enormous environmental significance. The FCC unit is primarily a reactor and regeneration system which employs cyclone systems for separation.  Catalyst losses in FCC cyclones lead to high particulate matter emission on the regenerator side and fines carryover into the product on the reactor side. This paper aims at demonstrating the importance of FCC unit design criteria in terms of technical performance and compliance with environmental legislation. A systematic review of state-of-the-art FCC technology was carried out, identifying its key technical challenges and sources of emissions.  Case studies of petroleum refineries in Nigeria were assessed against selected global case studies. The review highlights the need for further modelling investigations to help improve FCC design to more effectively meet product specification requirements while complying with stricter environmental legislation.

Keywords: Design, emissions, fluid catalytic cracking, petroleum refineries.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 879
937 Impovement of a Label Extraction Method for a Risk Search System

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes an improvement method of classification efficiency in a classification model. The model is used in a risk search system and extracts specific labels from articles posted at bulletin board sites. The system can analyze the important discussions composed of the articles. The improvement method introduces ensemble learning methods that use multiple classification models. Also, it introduces expressions related to the specific labels into generation of word vectors. The paper applies the improvement method to articles collected from three bulletin board sites selected by users and verifies the effectiveness of the improvement method.

Keywords: Text mining, Risk search system, Corporate reputation, Bulletin board site, Ensemble learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1325
936 An Improvement of Multi-Label Image Classification Method Based on Histogram of Oriented Gradient

Authors: Ziad Abdallah, Mohamad Oueidat, Ali El-Zaart

Abstract:

Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The big demand for image annotation and archiving in the web attracts the researchers to develop many algorithms for this application domain. The existing techniques for IMC have two drawbacks: The description of the elementary characteristics from the image and the correlation between labels are not taken into account. In this paper, we present an algorithm (MIML-HOGLPP), which simultaneously handles these limitations. The algorithm uses the histogram of gradients as feature descriptor. It applies the Label Priority Power-set as multi-label transformation to solve the problem of label correlation. The experiment shows that the results of MIML-HOGLPP are better in terms of some of the evaluation metrics comparing with the two existing techniques.

Keywords: Data mining, information retrieval system, multi-label, problem transformation, histogram of gradients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1315
935 Advanced Information Extraction with n-gram based LSI

Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız

Abstract:

Number of documents being created increases at an increasing pace while most of them being in already known topics and little of them introducing new concepts. This fact has started a new era in information retrieval discipline where the requirements have their own specialties. That is digging into topics and concepts and finding out subtopics or relations between topics. Up to now IR researches were interested in retrieving documents about a general topic or clustering documents under generic subjects. However these conventional approaches can-t go deep into content of documents which makes it difficult for people to reach to right documents they were searching. So we need new ways of mining document sets where the critic point is to know much about the contents of the documents. As a solution we are proposing to enhance LSI, one of the proven IR techniques by supporting its vector space with n-gram forms of words. Positive results we have obtained are shown in two different application area of IR domain; querying a document database, clustering documents in the document database.

Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803
934 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection system. A profitable stock-selection system should consider the stock’s investment value and the market timing. In this paper, we present a hybrid system including both engage for stock selection. This system uses a case-based reasoning (CBR) model to execute the stock classification, uses a decision-tree model to help with market timing and stock selection. The experiments show that the performance of this hybrid system is better than that of other techniques regarding to the classification accuracy, the average return and the Sharpe ratio.

Keywords: Case-based reasoning, decision tree, stock selection, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1705