Search results for: Sequential pattern mining
1275 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes
Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele
Abstract:
Objective: The objective of the study was to integrate two big databases from Swiss nutrition national survey (menuCH) and Swiss health national survey 2012 for data mining purposes. Each database has a demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied with food consumption. Design: Swiss nutrition national survey (menuCH) with approx. 2000 respondents from two different surveys, one by Phone and the other by questionnaire along with Swiss health national survey 2012 with 21500 respondents were pre-processed, cleaned and finally integrated to a unique relational database. Results: The result of this study is an integrated relational database from the Swiss nutritional and health databases.
Keywords: Health informatics, data mining, nutritional and health databases, nutritional and chronical databases.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17291274 Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma
Authors: Helyane Bronoski Borges, Júlio Cesar Nievola
Abstract:
The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.Keywords: Attribute selection, data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14171273 Online Forums Hotspot Detection and Analysis Using Aging Theory
Authors: K. Nirmala Devi, V. Murali Bhaskaran
Abstract:
The exponential growth of social media arouses much attention on public opinion information. The online forums, blogs, micro blogs are proving to be extremely valuable resources and are having bulk volume of information. However, most of the social media data is unstructured and semi structured form. So that it is more difficult to decipher automatically. Therefore, it is very much essential to understand and analyze those data for making a right decision. The online forums hotspot detection is a promising research field in the web mining and it guides to motivate the user to take right decision in right time. The proposed system consist of a novel approach to detect a hotspot forum for any given time period. It uses aging theory to find the hot terms and E-K-means for detecting the hotspot forum. Experimental results demonstrate that the proposed approach outperforms k-means for detecting the hotspot forums with the improved accuracy.
Keywords: Hotspot forums, Micro blog, Blog, Sentiment Analysis, Opinion Mining, Social media, Twitter, Web mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21851272 Prediction of Metals Available to Maize Seedlings in Crude Oil Contaminated Soil
Authors: Stella O. Olubodun, George E. Eriyamremu
Abstract:
The study assessed the effect of crude oil applied at rates, 0, 2, 5, and 10% on the fractional chemical forms and availability of some metals in soils from Usen, Edo State, with no known crude oil contamination and soil from a crude oil spill site in Ubeji, Delta State, Nigeria. Three methods were used to determine the bioavailability of metals in the soils: maize (Zea mays) plant, EDTA and BCR sequential extraction. The sequential extract acid soluble fraction of the BCR extraction (most labile fraction of the soils, normally associated with bioavailability) were compared with total metal concentration in maize seedlings as a means to compare the chemical and biological measures of bioavailability. Total Fe was higher in comparison to other metals for the crude oil contaminated soils. The metal concentrations were below the limits of 4.7% Fe, 190mg/kg Cu and 720mg/kg Zn intervention values and 36mg/kg Cu and 140mg/kg Zn target values for soils provided by the Department of Petroleum Resources (DPR) guidelines. The concentration of the metals in maize seedlings increased with increasing rates of crude oil contamination. Comparison of the metal concentrations in maize seedlings with EDTA extractable concentrations showed that EDTA extracted more metals than maize plant.Keywords: Availability, crude oil contamination, EDTA, maize, metals.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13801271 A Modified Cross Correlation in the Frequency Domain for Fast Pattern Detection Using Neural Networks
Authors: Hazem M. El-Bakry, Qiangfu Zhao
Abstract:
Recently, neural networks have shown good results for detection of a certain pattern in a given image. In our previous papers [1-5], a fast algorithm for pattern detection using neural networks was presented. Such algorithm was designed based on cross correlation in the frequency domain between the input image and the weights of neural networks. Image conversion into symmetric shape was established so that fast neural networks can give the same results as conventional neural networks. Another configuration of symmetry was suggested in [3,4] to improve the speed up ratio. In this paper, our previous algorithm for fast neural networks is developed. The frequency domain cross correlation is modified in order to compensate for the symmetric condition which is required by the input image. Two new ideas are introduced to modify the cross correlation algorithm. Both methods accelerate the speed of the fast neural networks as there is no need for converting the input image into symmetric one as previous. Theoretical and practical results show that both approaches provide faster speed up ratio than the previous algorithm.Keywords: Fast Pattern Detection, Neural Networks, Modified Cross Correlation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17461270 Enhance the Power of Sentiment Analysis
Authors: Yu Zhang, Pedro Desouza
Abstract:
Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.
Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35181269 A Novel Approach to Improve Users Search Goal in Web Usage Mining
Authors: R. Lokeshkumar, P. Sengottuvelan
Abstract:
Web mining is to discover and extract useful Information. Different users may have different search goals when they search by giving queries and submitting it to a search engine. The inference and analysis of user search goals can be very useful for providing an experience result for a user search query. In this project, we propose a novel approach to infer user search goals by analyzing search web logs. First, we propose a novel approach to infer user search goals by analyzing search engine query logs, the feedback sessions are constructed from user click-through logs and it efficiently reflect the information needed for users. Second we propose a preprocessing technique to clean the unnecessary data’s from web log file (feedback session). Third we propose a technique to generate pseudo-documents to representation of feedback sessions for clustering. Finally we implement k-medoids clustering algorithm to discover different user search goals and to provide a more optimal result for a search query based on feedback sessions for the user.Keywords: Data Preprocessing, Session Identification, Web log mining, Web Personalization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20231268 Local Steerable Pyramid Binary Pattern Sequence LSPBPS for Face Recognition Method
Authors: Mohamed El Aroussi, Mohammed El Hassouni, Sanaa Ghouzali, Mohammed Rziza, Driss Aboutajdine
Abstract:
In this paper the problem of face recognition under variable illumination conditions is considered. Most of the works in the literature exhibit good performance under strictly controlled acquisition conditions, but the performance drastically drop when changes in pose and illumination occur, so that recently number of approaches have been proposed to deal with such variability. The aim of this work is to introduce an efficient local appearance feature extraction method based steerable pyramid (SP) for face recognition. Local information is extracted from SP sub-bands using LBP(Local binary Pattern). The underlying statistics allow us to reduce the required amount of data to be stored. The experiments carried out on different face databases confirm the effectiveness of the proposed approach.
Keywords: Face recognition (FR), Steerable pyramid (SP), localBinary Pattern (LBP).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21871267 Central Pattern Generator Incorporating the Actuator Dynamics for a Hexapod Robot
Authors: Valeri A. Makarov, Ezequiel Del Rio, Manuel G. Bedia, Manuel G. Velarde, Werner Ebeling
Abstract:
We proposed the use of a Toda-Rayleigh ring as a central pattern generator (CPG) for controlling hexapodal robots. We show that the ring composed of six Toda-Rayleigh units coupled to the limb actuators reproduces the most common hexapodal gaits. We provide an electrical circuit implementation of the CPG and test our theoretical results obtaining fixed gaits. Then we propose a method of incorporation of the actuator (motor) dynamics in the CPG. With this approach we close the loop CPG – environment – CPG, thus obtaining a decentralized model for the leg control that does not require higher level intervention to the CPG during locomotion in a nonhomogeneous environments. The gaits generated by the novel CPG are not fixed, but adapt to the current robot bahvior.Keywords: Central pattern generator, electrical circuit, hexapod robot
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18041266 Total and Leachable Concentration of Trace Elements in Soil towards Human Health Risk, Related with Coal Mine in Jorong, South Kalimantan, Indonesia
Authors: Arie Pujiwati, Kengo Nakamura, Noriaki Watanabe, Takeshi Komai
Abstract:
Coal mining is well known to cause considerable environmental impacts, including trace element contamination of soil. This study aimed to assess the trace element (As, Cd, Co, Cu, Ni, Pb, Sb, and Zn) contamination of soil in the vicinity of coal mining activities, using the case study of Asam-asam River basin, South Kalimantan, Indonesia, and to assess the human health risk, incorporating total and bioavailable (water-leachable and acid-leachable) concentrations. The results show the enrichment of As and Co in soil, surpassing the background soil value. Contamination was evaluated based on the index of geo-accumulation, Igeo and the pollution index, PI. Igeo values showed that the soil was generally uncontaminated (Igeo ≤ 0), except for elevated As and Co. Mean PI for Ni and Cu indicated slight contamination. Regarding the assessment of health risks, the Hazard Index, HI showed adverse risks (HI > 1) for Ni, Co, and As. Further, Ni and As were found to pose unacceptable carcinogenic risk (risk > 1.10-5). Farming, settlement, and plantation were found to present greater risk than coal mines. These results show that coal mining activity in the study area contaminates the soils by particular elements and may pose potential human health risk in its surrounding area. This study is important for setting appropriate countermeasure actions and improving basic coal mining management in Indonesia.
Keywords: Coal mine, risk, soil, trace elements.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11751265 The Power of Indigenous Peoples in Decision-Making Processes of Mining Projects: The Pilbara Region
Authors: K. N. Penna, J. P. English
Abstract:
The destruction of the Juukan Gorge rock shelters in 2020 has catalysed impetus within Australian society for a significant change in engagement with Indigenous Peoples, and the approach to Indigenous cultural heritage, both within the Pilbara region and more broadly across Australia. Culture-based and people-centred approaches are inherent to inclusive sustainable development and Free, Prior, Informed Consent, outcomes encouraged by international and local recommendations on the human rights and cultural heritage preservation of Indigenous peoples. In this paper, we present an interpretive model of an evolved process for mining project development, incorporating culture-based and people-centred approaches, based on the Theory U system change method. The evolved process advocates a change in organisational mindset and culture, and a comprehensive understanding of Indigenous Peoples’ culture and values, as the foundations for increasing their influence and achieving mutually beneficial developments.
Keywords: Indigenous Engagement, mining industry, culture-based approach, people-centred approach, Theory U.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4401264 A Semantic Recommendation Procedure for Electronic Product Catalog
Authors: Hadi Khosravi Farsani, Mohammadali Nematbakhsh
Abstract:
To overcome the product overload of Internet shoppers, we introduce a semantic recommendation procedure which is more efficient when applied to Internet shopping malls. The suggested procedure recommends the semantic products to the customers and is originally based on Web usage mining, product classification, association rule mining, and frequently purchasing. We applied the procedure to the data set of MovieLens Company for performance evaluation, and some experimental results are provided. The experimental results have shown superior performance in terms of coverage and precision.Keywords: Personalization, Recommendation, OWL Ontology, Electronic Catalogs, Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19261263 Theoretical Isotope Generator: An Alternative towards Isotope Pattern Calculator
Authors: K. Massila, R. D. Stein, S. M. Suhaizan, A. A. Azlianor
Abstract:
A number of mass spectrometry applications are already available as web-based and windows-based systems to calculate isotope pattern and to display the mass spectrum based on the specific molecular formula besides providing necessary information. These applications were evaluated and compared with our new alternative application called Theoretical Isotope Generator (TIG) in terms of its functionality and features provided to prove this new application is working better and performing well. TIG provides extra features than others, complete with several functionality such as drawing, normalizing and zooming the generated graph that convey with the molecular information in a number of formats by providing the details of the calculation and molecules. Thus, any chemist, students, lecturers and researchers from anywhere could use TIG to gain related information on molecules and their relative intensity.
Keywords: Isotope pattern calculator, mass number, massspectrum, relative intensity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23371262 Comparative Study of Universities’ Web Structure Mining
Authors: Z. Abdullah, A. R. Hamdan
Abstract:
This paper is meant to analyze the ranking of University of Malaysia Terengganu, UMT’s website in the World Wide Web. There are only few researches have been done on comparing the ranking of universities’ websites so this research will be able to determine whether the existing UMT’s website is serving its purpose which is to introduce UMT to the world. The ranking is based on hub and authority values which are accordance to the structure of the website. These values are computed using two websearching algorithms, HITS and SALSA. Three other universities’ websites are used as the benchmarks which are UM, Harvard and Stanford. The result is clearly showing that more work has to be done on the existing UMT’s website where important pages according to the benchmarks, do not exist in UMT’s pages. The ranking of UMT’s website will act as a guideline for the web-developer to develop a more efficient website.Keywords: Algorithm, ranking, website, web structure mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16681261 A Testbed for the Experiments Performed in Missing Value Treatments
Authors: Dias de J. C. Lilian, Lobato M. F. Fábio, de Santana L. Ádamo
Abstract:
The occurrence of missing values in database is a serious problem for Data Mining tasks, responsible for degrading data quality and accuracy of analyses. In this context, the area has shown a lack of standardization for experiments to treat missing values, introducing difficulties to the evaluation process among different researches due to the absence in the use of common parameters. This paper proposes a testbed intended to facilitate the experiments implementation and provide unbiased parameters using available datasets and suited performance metrics in order to optimize the evaluation and comparison between the state of art missing values treatments.
Keywords: Data imputation, data mining, missing values treatment, testbed.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15151260 A Method for Iris Recognition Based on 1D Coiflet Wavelet
Authors: Agus Harjoko, Sri Hartati, Henry Dwiyasa
Abstract:
There have been numerous implementations of security system using biometric, especially for identification and verification cases. An example of pattern used in biometric is the iris pattern in human eye. The iris pattern is considered unique for each person. The use of iris pattern poses problems in encoding the human iris. In this research, an efficient iris recognition method is proposed. In the proposed method the iris segmentation is based on the observation that the pupil has lower intensity than the iris, and the iris has lower intensity than the sclera. By detecting the boundary between the pupil and the iris and the boundary between the iris and the sclera, the iris area can be separated from pupil and sclera. A step is taken to reduce the effect of eyelashes and specular reflection of pupil. Then the four levels Coiflet wavelet transform is applied to the extracted iris image. The modified Hamming distance is employed to measure the similarity between two irises. This research yields the identification success rate of 84.25% for the CASIA version 1.0 database. The method gives an accuracy of 77.78% for the left eyes of MMU 1 database and 86.67% for the right eyes. The time required for the encoding process, from the segmentation until the iris code is generated, is 0.7096 seconds. These results show that the accuracy and speed of the method is better than many other methods.Keywords: Biometric, iris recognition, wavelet transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19061259 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection
Authors: Florin Gorunescu
Abstract:
Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18691258 Analysis of a Spatiotemporal Phytoplankton Dynamics: Higher Order Stability and Pattern Formation
Authors: Randhir Singh Baghel, Joydip Dhar, Renu Jain
Abstract:
In this paper, for the understanding of the phytoplankton dynamics in marine ecosystem, a susceptible and an infected class of phytoplankton population is considered in spatiotemporal domain. Here, the susceptible phytoplankton is growing logistically and the growth of infected phytoplankton is due to the instantaneous Holling type-II infection response function. The dynamics are studied in terms of the local and global stabilities for the system and further explore the possibility of Hopf -bifurcation, taking the half saturation period as (i.e., ) the bifurcation parameter in temporal domain. It is also observe that the reaction diffusion system exhibits spatiotemporal chaos and pattern formation in phytoplankton dynamics, which is particularly important role play for the spatially extended phytoplankton system. Also the effect of the diffusion coefficient on the spatial system for both one and two dimensional case is obtained. Furthermore, we explore the higher-order stability analysis of the spatial phytoplankton system for both linear and no-linear system. Finally, few numerical simulations are carried out for pattern formation.Keywords: Phytoplankton dynamics, Reaction-diffusion system, Local stability, Hopf-bifurcation, Global stability, Chaos, Pattern Formation, Higher-order stability analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16531257 Power System Security Assessment using Binary SVM Based Pattern Recognition
Authors: S Kalyani, K Shanti Swarup
Abstract:
Power System Security is a major concern in real time operation. Conventional method of security evaluation consists of performing continuous load flow and transient stability studies by simulation program. This is highly time consuming and infeasible for on-line application. Pattern Recognition (PR) is a promising tool for on-line security evaluation. This paper proposes a Support Vector Machine (SVM) based binary classification for static and transient security evaluation. The proposed SVM based PR approach is implemented on New England 39 Bus and IEEE 57 Bus systems. The simulation results of SVM classifier is compared with the other classifier algorithms like Method of Least Squares (MLS), Multi- Layer Perceptron (MLP) and Linear Discriminant Analysis (LDA) classifiers.Keywords: Static Security, Transient Security, Pattern Recognition, Classifier, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18761256 The Water Level Detection Algorithm Using the Accumulated Histogram with Band Pass Filter
Authors: Sangbum Park, Namki Lee, Youngjoon Han, Hernsoo Hahn
Abstract:
In this paper, we propose the robust water level detection method based on the accumulated histogram under small changed image which is acquired from water level surveillance camera. In general surveillance system, this is detecting and recognizing invasion from searching area which is in big change on the sequential images. However, in case of a water level detection system, these general surveillance techniques are not suitable due to small change on the water surface. Therefore the algorithm introduces the accumulated histogram which is emphasizing change of water surface in sequential images. Accumulated histogram is based on the current image frame. The histogram is cumulating differences between previous images and current image. But, these differences are also appeared in the land region. The band pass filter is able to remove noises in the accumulated histogram Finally, this algorithm clearly separates water and land regions. After these works, the algorithm converts from the water level value on the image space to the real water level on the real space using calibration table. The detected water level is sent to the host computer with current image. To evaluate the proposed algorithm, we use test images from various situations.Keywords: accumulated histogram, water level detection, band pass filter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20001255 Hardware Implementation of Local Binary Pattern Based Two-Bit Transform Motion Estimation
Authors: Seda Yavuz, Anıl Çelebi, Aysun Taşyapı Çelebi, Oğuzhan Urhan
Abstract:
Nowadays, demand for using real-time video transmission capable devices is ever-increasing. So, high resolution videos have made efficient video compression techniques an essential component for capturing and transmitting video data. Motion estimation has a critical role in encoding raw video. Hence, various motion estimation methods are introduced to efficiently compress the video. Low bit‑depth representation based motion estimation methods facilitate computation of matching criteria and thus, provide small hardware footprint. In this paper, a hardware implementation of a two-bit transformation based low-complexity motion estimation method using local binary pattern approach is proposed. Image frames are represented in two-bit depth instead of full-depth by making use of the local binary pattern as a binarization approach and the binarization part of the hardware architecture is explained in detail. Experimental results demonstrate the difference between the proposed hardware architecture and the architectures of well-known low-complexity motion estimation methods in terms of important aspects such as resource utilization, energy and power consumption.
Keywords: Binarization, hardware architecture, local binary pattern, motion estimation, two-bit transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13771254 Evaluation of Sensor Pattern Noise Estimators for Source Camera Identification
Authors: Benjamin Anderson-Sackaney, Amr Abdel-Dayem
Abstract:
This paper presents a comprehensive survey of recent source camera identification (SCI) systems. Then, the performance of various sensor pattern noise (SPN) estimators was experimentally assessed, under common photo response non-uniformity (PRNU) frameworks. The experiments used 1350 natural and 900 flat-field images, captured by 18 individual cameras. 12 different experiments, grouped into three sets, were conducted. The results were analyzed using the receiver operator characteristic (ROC) curves. The experimental results demonstrated that combining the basic SPN estimator with a wavelet-based filtering scheme provides promising results. However, the phase SPN estimator fits better with both patch-based (BM3D) and anisotropic diffusion (AD) filtering schemes.Keywords: Sensor pattern noise, source camera identification, photo response non-uniformity, anisotropic diffusion, peak to correlation energy ratio.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11361253 Dimensional Modeling of HIV Data Using Open Source
Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer
Abstract:
Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19611252 Present Status, Driving Forces and Pattern Optimization of Territory in Hubei Province, China
Abstract:
“National Territorial Planning (2016-2030)” was issued by the State Council of China in 2017. As an important initiative of putting it into effect, territorial planning at provincial level makes overall arrangement of territorial development, resources and environment protection, comprehensive renovation and security system construction. Hubei province, as the pivot of the “Rise of Central China” national strategy, is now confronted with great opportunities and challenges in territorial development, protection, and renovation. Territorial spatial pattern experiences long time evolution, influenced by multiple internal and external driving forces. It is not clear what are the main causes of its formation and what are effective ways of optimizing it. By analyzing land use data in 2016, this paper reveals present status of territory in Hubei. Combined with economic and social data and construction information, driving forces of territorial spatial pattern are then analyzed. Research demonstrates that the three types of territorial space aggregate distinctively. The four aspects of driving forces include natural background which sets the stage for main functions, population and economic factors which generate agglomeration effect, transportation infrastructure construction which leads to axial expansion and significant provincial strategies which encourage the established path. On this basis, targeted strategies for optimizing territory spatial pattern are then put forward. Hierarchical protection pattern should be established based on development intensity control as respect for nature. By optimizing the layout of population and industry and improving the transportation network, polycentric network-based development pattern could be established. These findings provide basis for Hubei Territorial Planning, and reference for future territorial planning in other provinces.Keywords: Driving forces, Hubei, optimizing strategies, spatial pattern, territory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6261251 Integrated Method for Detection of Unknown Steganographic Content
Authors: Magdalena Pejas
Abstract:
This article concerns the presentation of an integrated method for detection of steganographic content embedded by new unknown programs. The method is based on data mining and aggregated hypothesis testing. The article contains the theoretical basics used to deploy the proposed detection system and the description of improvement proposed for the basic system idea. Further main results of experiments and implementation details are collected and described. Finally example results of the tests are presented.Keywords: Steganography, steganalysis, data embedding, data mining, feature extraction, knowledge base, system learning, hypothesis testing, error estimation, black box program, file structure.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15661250 Application of Data Mining Techniques for Tourism Knowledge Discovery
Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee
Abstract:
Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.
Keywords: Classification algorithms; data mining; tourism; knowledge discovery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25481249 Application of Computer Aided Engineering Tools in Performance Prediction and Fault Detection of Mechanical Equipment of Mining Process Line
Abstract:
Nowadays, to decrease the number of downtimes in the industries such as metal mining, petroleum and chemical industries, predictive maintenance is crucial. In order to have efficient predictive maintenance, knowing the performance of critical equipment of production line such as pumps and hydro-cyclones under variable operating parameters, selecting best indicators of this equipment health situations, best locations for instrumentation, and also measuring of these indicators are very important. In this paper, computer aided engineering (CAE) tools are implemented to study some important elements of copper process line, namely slurry pumps and cyclone to predict the performance of these components under different working conditions. These modeling and simulations can be used in predicting, for example, the damage tolerance of the main shaft of the slurry pump or wear rate and location of cyclone wall or pump case and impeller. Also, the simulations can suggest best-measuring parameters, measuring intervals, and their locations.Keywords: Computer aided engineering, predictive maintenance, fault detection, mining process line, slurry pump, hydrocyclone.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18341248 A Review and Comparative Analysis on Cluster Ensemble Methods
Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha
Abstract:
Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.
Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8241247 Improving Classification Accuracy with Discretization on Datasets Including Continuous Valued Features
Authors: Mehmet Hacibeyoglu, Ahmet Arslan, Sirzat Kahramanli
Abstract:
This study analyzes the effect of discretization on classification of datasets including continuous valued features. Six datasets from UCI which containing continuous valued features are discretized with entropy-based discretization method. The performance improvement between the dataset with original features and the dataset with discretized features is compared with k-nearest neighbors, Naive Bayes, C4.5 and CN2 data mining classification algorithms. As the result the classification accuracies of the six datasets are improved averagely by 1.71% to 12.31%.Keywords: Data mining classification algorithms, entropy-baseddiscretization method
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24631246 Response of the Residential Building Structureon Load Technical Seismicity due to Mining Activities
Authors: V. Salajka, Z. Kaláb, J. Kala, P. Hradil
Abstract:
In the territories where high-intensity earthquakes are frequent is paid attention to the solving of the seismic problems. In the paper are described two computational model variants based on finite element method of the construction with different subsoil simulation (rigid or elastic subsoil) is used. For simulation and calculations program system based on method final elements ANSYS was used. Seismic responses calculations of residential building structure were effected on loading characterized by accelerogram for comparing with the responses spectra method.Keywords: Accelerogram, ANSYS, mining induced seismic, residential building structure, spectra, subsoil.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538