Search results for: Spatial temporal data mining
7913 GeoSEMA: A Modelling Platform, Emerging “GeoSpatial-based Evolutionary and Mobile Agents“
Authors: Mohamed Dbouk, Ihab Sbeity
Abstract:
Spatial and mobile computing evolves. This paper describes a smart modeling platform called “GeoSEMA". This approach tends to model multidimensional GeoSpatial Evolutionary and Mobile Agents. Instead of 3D and location-based issues, there are some other dimensions that may characterize spatial agents, e.g. discrete-continuous time, agent behaviors. GeoSEMA is seen as a devoted design pattern motivating temporal geographic-based applications; it is a firm foundation for multipurpose and multidimensional special-based applications. It deals with multipurpose smart objects (buildings, shapes, missiles, etc.) by stimulating geospatial agents. Formally, GeoSEMA refers to geospatial, spatio-evolutive and mobile space constituents where a conceptual geospatial space model is given in this paper. In addition to modeling and categorizing geospatial agents, the model incorporates the concept of inter-agents event-based protocols. Finally, a rapid software-architecture prototyping GeoSEMA platform is also given. It will be implemented/ validated in the next phase of our work.Keywords: Location-Trajectory management, GIS, Mobile- Moving Objects/Agents, Multipurpose/Spatiotemporal data, Multi- Agent Systems.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16537912 Performance Enhancement of Cellular OFDM Based Wireless LANs by Exploiting Spatial Diversity Techniques
Authors: S. Ali. Tajer, Babak H. Khalaj
Abstract:
This paper represents an investigation on how exploiting multiple transmit antennas by OFDM based wireless LAN subscribers can mitigate physical layer error rate. Then by comparing the Wireless LANs that utilize spatial diversity techniques with the conventional ones it will reveal how PHY and TCP throughputs behaviors are ameliorated. In the next step it will assess the same issues based on a cellular context operation which is mainly introduced as an innovated solution that beside a multi cell operation scenario benefits spatio-temporal signaling schemes as well. Presented simulations will shed light on the improved performance of the wide range and high quality wireless LAN services provided by the proposed approach.
Keywords: Multiple Input Multiple Output (MIMO), Orthogonal Frequency Division Multiplexing (OFDM), and WirelessLocal Area Network (WLAN).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14047911 Research on Urban Point of Interest Generalization Method Based on Mapping Presentation
Authors: Chengming Li, Yong Yin, Peipei Guo, Xiaoli Liu
Abstract:
Without taking account of the attribute richness of POI (point of interest) data and spatial distribution limited by roads, a POI generalization method considering both attribute information and spatial distribution has been proposed against the existing point generalization algorithm merely focusing on overall information of point groups. Hierarchical characteristic of urban POI information expression has been firstly analyzed to point out the measurement feature of the corresponding hierarchy. On this basis, an urban POI generalizing strategy has been put forward: POIs urban road network have been divided into three distribution pattern; corresponding generalization methods have been proposed according to the characteristic of POI data in different distribution patterns. Experimental results showed that the method taking into account both attribute information and spatial distribution characteristics of POI can better implement urban POI generalization in the mapping presentation.
Keywords: POI, Road network, spatial information expression, selection method, distribution pattern.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10397910 Parallel and Distributed Mining of Association Rule on Knowledge Grid
Authors: U. Sakthi, R. Hemalatha, R. S. Bhuvaneswaran
Abstract:
In Virtual organization, Knowledge Discovery (KD) service contains distributed data resources and computing grid nodes. Computational grid is integrated with data grid to form Knowledge Grid, which implements Apriori algorithm for mining association rule on grid network. This paper describes development of parallel and distributed version of Apriori algorithm on Globus Toolkit using Message Passing Interface extended with Grid Services (MPICHG2). The creation of Knowledge Grid on top of data and computational grid is to support decision making in real time applications. In this paper, the case study describes design and implementation of local and global mining of frequent item sets. The experiments were conducted on different configurations of grid network and computation time was recorded for each operation. We analyzed our result with various grid configurations and it shows speedup of computation time is almost superlinear.Keywords: Association rule, Grid computing, Knowledge grid, Mobility prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21827909 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules
Authors: R.M.Rani
Abstract:
Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30947908 Video Mining for Creative Rendering
Authors: Mei Chen
Abstract:
More and more home videos are being generated with the ever growing popularity of digital cameras and camcorders. For many home videos, a photo rendering, whether capturing a moment or a scene within the video, provides a complementary representation to the video. In this paper, a video motion mining framework for creative rendering is presented. The user-s capture intent is derived by analyzing video motions, and respective metadata is generated for each capture type. The metadata can be used in a number of applications, such as creating video thumbnail, generating panorama posters, and producing slideshows of video.
Keywords: Motion mining, semantic abstraction, video mining, video representation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16517907 Determination and Comparison of Fabric Pills Distribution Using Image Processing and Spatial Data Analysis Tools
Authors: Lenka Techniková, Maroš Tunák, Jiří Janáček
Abstract:
This work deals with the determination and comparison of pill patterns in 2 sets of fabric samples which differ in way of pill creation. The first set contains fabric samples with the pills created by simulation on a Martindale abrasion machine, while pills in the second set originated during normal wearing and maintenance. The goal of the study is to determine whether the pattern of the fabric pills created by simulation is the same as the pattern of naturally occurring pills. The system of determination and comparison of the pills is based on image processing and spatial data analysis tools. Firstly, 3D reconstruction of the fabric surfaces with the pills is realized with using a gradient fields method. The gradient fields method creates a 3D fabric surface from a set of 4 images. Thereafter, the pills are detected in 3D fabric surfaces using image-processing tools in the MATLAB software. Determination and comparison of the pills patterns of two sets of fabric samples is based on spatial data analysis using tools in R software.
Keywords: 3D reconstruction of the surface, image analysis tools, distribution of the pills, spatial data analysis tools.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21737906 Specifying a Timestamp-based Protocol For Multi-step Transactions Using LTL
Authors: Rafat Alshorman, Walter Hussak
Abstract:
Most of the concurrent transactional protocols consider serializability as a correctness criterion of the transactions execution. Usually, the proof of the serializability relies on mathematical proofs for a fixed finite number of transactions. In this paper, we introduce a protocol to deal with an infinite number of transactions which are iterated infinitely often. We specify serializability of the transactions and the protocol using a specification language based on temporal logics. It is worthwhile using temporal logics such as LTL (Lineartime Temporal Logic) to specify transactions, to gain full automatic verification by using model checkers.Keywords: Multi-step transactions, LTL specifications, Model Checking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13817905 Navigation Patterns Mining Approach based on Expectation Maximization Algorithm
Authors: Norwati Mustapha, Manijeh Jalali, Abolghasem Bozorgniya, Mehrdad Jalali
Abstract:
Web usage mining algorithms have been widely utilized for modeling user web navigation behavior. In this study we advance a model for mining of user-s navigation pattern. The model makes user model based on expectation-maximization (EM) algorithm.An EM algorithm is used in statistics for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. The experimental results represent that by decreasing the number of clusters, the log likelihood converges toward lower values and probability of the largest cluster will be decreased while the number of the clusters increases in each treatment.Keywords: Web Usage Mining, Expectation maximization, navigation pattern mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15797904 Numerical Modeling of Artisanal and Small-Scale Mining of Coltan in the African Great Lakes Region
Authors: Sergio Perez Rodriguez
Abstract:
Findings of a production model of Artisanal and Small-Scale Mining (ASM) of coltan ore by an average Democratic Republic of Congo (DRC) mineworker are presented in this paper. These can be used as a reference for a similar characterization of the daily labor of counterparts from other countries in the Africa's Great Lakes region. To that end, the Fundamental Equation of Mineral Production has been applied in this paper, considering a miner's average daily output of coltan, estimated in the base of gross statistical data gathered from reputable sources. Results indicate daily yields of individual miners in the order of 300 g of coltan ore, with hourly peaks of production in the range of 30 to 40 g of the mineral. Yields are expected to be in the order of 5 g or less during the least productive hours. These outputs are expected to be achieved during the halves of the eight to 10 hours of daily working sessions that these artisanal laborers can attend during the mining season.
Keywords: Coltan, mineral production, Production to Reserve ratio, artisanal mining, small-scale mining, ASM, human work, Great Lakes region, Democratic Republic of Congo.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1947903 Investigation of Maritime Accidents with Exploratory Data Analysis in the Strait of Çanakkale (Dardanelles)
Authors: Gizem Kodak
Abstract:
The Strait of Çanakkale (Dardanelles), together with the Strait of Istanbul and the Sea of Marmara, form the Turkish Straits System. In other words, the Strait of Çanakkale is the southern gate of the system that connects the Black Sea countries with the other countries of the world. Due to the heavy maritime traffic, it is important to scientifically examine the accident characteristics in the region. In particular, the results indicated by the descriptive statistics are of critical importance in order to strengthen the safety of navigation. At this point, exploratory data analysis offers strategic outputs in terms of defining the problem and knowing the strengths and weaknesses against possible accident risk. The study aims to determine the accident characteristics in the Strait of Çanakkale with temporal and spatial analysis of historical data, using Exploratory Data Analysis (EDA) as the research method. The study's results will reveal the general characteristics of maritime accidents in the region and form the infrastructure for future studies. Therefore, the text provides a clear description of the research goals and methodology, and the study's contributions are well-defined.
Keywords: Maritime Accidents, EDA, Strait of Çanakkale, navigational safety.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1307902 Using Textual Pre-Processing and Text Mining to Create Semantic Links
Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo
Abstract:
This article offers a approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.Keywords: Semantic links, data mining, linked data, SKOS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10647901 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-
Authors: Nieto Bernal Wilson, Carmona Suarez Edgar
Abstract:
The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects. Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.
Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14787900 Integrated Method for Detection of Unknown Steganographic Content
Authors: Magdalena Pejas
Abstract:
This article concerns the presentation of an integrated method for detection of steganographic content embedded by new unknown programs. The method is based on data mining and aggregated hypothesis testing. The article contains the theoretical basics used to deploy the proposed detection system and the description of improvement proposed for the basic system idea. Further main results of experiments and implementation details are collected and described. Finally example results of the tests are presented.Keywords: Steganography, steganalysis, data embedding, data mining, feature extraction, knowledge base, system learning, hypothesis testing, error estimation, black box program, file structure.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15647899 A Review and Comparative Analysis on Cluster Ensemble Methods
Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha
Abstract:
Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.
Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8227898 On Pattern-Based Programming towards the Discovery of Frequent Patterns
Authors: Kittisak Kerdprasop, Nittaya Kerdprasop
Abstract:
The problem of frequent pattern discovery is defined as the process of searching for patterns such as sets of features or items that appear in data frequently. Finding such frequent patterns has become an important data mining task because it reveals associations, correlations, and many other interesting relationships hidden in a database. Most of the proposed frequent pattern mining algorithms have been implemented with imperative programming languages. Such paradigm is inefficient when set of patterns is large and the frequent pattern is long. We suggest a high-level declarative style of programming apply to the problem of frequent pattern discovery. We consider two languages: Haskell and Prolog. Our intuitive idea is that the problem of finding frequent patterns should be efficiently and concisely implemented via a declarative paradigm since pattern matching is a fundamental feature supported by most functional languages and Prolog. Our frequent pattern mining implementation using the Haskell and Prolog languages confirms our hypothesis about conciseness of the program. The comparative performance studies on line-of-code, speed and memory usage of declarative versus imperative programming have been reported in the paper.Keywords: Frequent pattern mining, functional programming, pattern matching, logic programming.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13437897 Biological Soil Conservation Planning by Spatial Multi-Criteria Evaluation Techniques (Case Study: Bonkuh Watershed in Iran)
Authors: Ali Akbar Jamali
Abstract:
This paper discusses site selection process for biological soil conservation planning. It was supported by a valuefocused approach and spatial multi-criteria evaluation techniques. A first set of spatial criteria was used to design a number of potential sites. Next, a new set of spatial and non-spatial criteria was employed, including the natural factors and the financial costs, together with the degree of suitability for the Bonkuh watershed to biological soil conservation planning and to recommend the most acceptable program. The whole process was facilitated by a new software tool that supports spatial multiple criteria evaluation, or SMCE in GIS software (ILWIS). The application of this tool, combined with a continual feedback by the public attentions, has provided an effective methodology to solve complex decisional problem in biological soil conservation planning.Keywords: GIS, Biological soil conservation planning, Spatial multi-criteria evaluation, Iran
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17167896 Opinion Mining and Sentiment Analysis on DEFT
Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala
Abstract:
Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.
Keywords: Opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8227895 Customer Need Type Classification Model using Data Mining Techniques for Recommender Systems
Authors: Kyoung-jae Kim
Abstract:
Recommender systems are usually regarded as an important marketing tool in the e-commerce. They use important information about users to facilitate accurate recommendation. The information includes user context such as location, time and interest for personalization of mobile users. We can easily collect information about location and time because mobile devices communicate with the base station of the service provider. However, information about user interest can-t be easily collected because user interest can not be captured automatically without user-s approval process. User interest usually represented as a need. In this study, we classify needs into two types according to prior research. This study investigates the usefulness of data mining techniques for classifying user need type for recommendation systems. We employ several data mining techniques including artificial neural networks, decision trees, case-based reasoning, and multivariate discriminant analysis. Experimental results show that CHAID algorithm outperforms other models for classifying user need type. This study performs McNemar test to examine the statistical significance of the differences of classification results. The results of McNemar test also show that CHAID performs better than the other models with statistical significance.Keywords: Customer need type, Data mining techniques, Recommender system, Personalization, Mobile user.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21467894 A CTL Specification of Serializability for Transactions Accessing Uniform Data
Authors: Rafat Alshorman, Walter Hussak
Abstract:
Existing work in temporal logic on representing the execution of infinitely many transactions, uses linear-time temporal logic (LTL) and only models two-step transactions. In this paper, we use the comparatively efficient branching-time computational tree logic CTL and extend the transaction model to a class of multistep transactions, by introducing distinguished propositional variables to represent the read and write steps of n multi-step transactions accessing m data items infinitely many times. We prove that the well known correspondence between acyclicity of conflict graphs and serializability for finite schedules, extends to infinite schedules. Furthermore, in the case of transactions accessing the same set of data items in (possibly) different orders, serializability corresponds to the absence of cycles of length two. This result is used to give an efficient encoding of the serializability condition into CTL.Keywords: computational tree logic, serializability, multi-step transactions.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11777893 Web Traffic Mining using Neural Networks
Authors: Farhad F. Yusifov
Abstract:
With the explosive growth of data available on the Internet, personalization of this information space become a necessity. At present time with the rapid increasing popularity of the WWW, Websites are playing a crucial role to convey knowledge and information to the end users. Discovering hidden and meaningful information about Web users usage patterns is critical to determine effective marketing strategies to optimize the Web server usage for accommodating future growth. The task of mining useful information becomes more challenging when the Web traffic volume is enormous and keeps on growing. In this paper, we propose a intelligent model to discover and analyze useful knowledge from the available Web log data.Keywords: Clustering, Self organizing map, Web log files, Web traffic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16037892 Color Image Segmentation using Adaptive Spatial Gaussian Mixture Model
Authors: M.Sujaritha, S. Annadurai
Abstract:
An adaptive spatial Gaussian mixture model is proposed for clustering based color image segmentation. A new clustering objective function which incorporates the spatial information is introduced in the Bayesian framework. The weighting parameter for controlling the importance of spatial information is made adaptive to the image content to augment the smoothness towards piecewisehomogeneous region and diminish the edge-blurring effect and hence the name adaptive spatial finite mixture model. The proposed approach is compared with the spatially variant finite mixture model for pixel labeling. The experimental results with synthetic and Berkeley dataset demonstrate that the proposed method is effective in improving the segmentation and it can be employed in different practical image content understanding applications.
Keywords: Adaptive; Spatial, Mixture model, Segmentation, Color.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24987891 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.
Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15777890 Urban Greenery in the Greatest Polish Cities: Analysis of Spatial Concentration
Authors: Elżbieta Antczak
Abstract:
Cities offer important opportunities for economic development and for expanding access to basic services, including health care and education, for large numbers of people. Moreover, green areas (as an integral part of sustainable urban development) present a major opportunity for improving urban environments, quality of lives and livelihoods. This paper examines, using spatial concentration and spatial taxonomic measures, regional diversification of greenery in the cities of Poland. The analysis includes location quotients, Lorenz curve, Locational Gini Index, and the synthetic index of greenery and spatial statistics tools: (1) To verify the occurrence of strong concentration or dispersion of the phenomenon in time and space depending on the variable category, and, (2) To study if the level of greenery depends on the spatial autocorrelation. The data includes the greatest Polish cities, categories of the urban greenery (parks, lawns, street greenery, and green areas on housing estates, cemeteries, and forests) and the time span 2004-2015. According to the obtained estimations, most of cites in Poland are already taking measures to become greener. However, in the country there are still many barriers to well-balanced urban greenery development (e.g. uncontrolled urban sprawl, poor management as well as lack of spatial urban planning systems).
Keywords: Greenery, urban areas, regional spatial diversification and concentration, spatial taxonomic measure.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12537889 Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study
Authors: Mauro Giacomini, Stefania Bertone, Carlo Mansi, Pietro Dulbecco, Vincenzo Savarino
Abstract:
The prevalence of non organic constipation differs from country to country and the reliability of the estimate rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown.. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age and gender stratified sample population from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptombased constipation found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups which correlated to subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.Keywords: Cluster analysis, constipation, data mining, statistical analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12977888 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection
Authors: Florin Gorunescu
Abstract:
Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18677887 Novelty as a Measure of Interestingness in Knowledge Discovery
Authors: Vasudha Bhatnagar, Ahmed Sultan Al-Hegami, Naveen Kumar
Abstract:
Rule Discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules to ultimately improve the overall efficiency of KDD process. In this paper we study novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach based on both objective and subjective measures to quantify novelty of the discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to the user specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are promising.Keywords: Knowledge Discovery in Databases (KDD), Interestingness, Subjective Measures, Novelty Index.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18077886 Detection of Temporal Change of Fishery and Island Activities by DNB and SAR on the South China Sea
Authors: I. Asanuma, T. Yamaguchi, J. Park, K. J. Mackin
Abstract:
Fishery lights on the surface could be detected by the Day and Night Band (DNB) of the Visible Infrared Imaging Radiometer Suite (VIIRS) on the Suomi National Polar-orbiting Partnership (Suomi-NPP). The DNB covers the spectral range of 500 to 900 nm and realized a higher sensitivity. The DNB has a difficulty of identification of fishing lights from lunar lights reflected by clouds, which affects observations for the half of the month. Fishery lights and lights of the surface are identified from lunar lights reflected by clouds by a method using the DNB and the infrared band, where the detection limits are defined as a function of the brightness temperature with a difference from the maximum temperature for each level of DNB radiance and with the contrast of DNB radiance against the background radiance. Fishery boats or structures on islands could be detected by the Synthetic Aperture Radar (SAR) on the polar orbit satellites using the reflected microwave by the surface reflecting targets. The SAR has a difficulty of tradeoff between spatial resolution and coverage while detecting the small targets like fishery boats. A distribution of fishery boats and island activities were detected by the scan-SAR narrow mode of Radarsat-2, which covers 300 km by 300 km with various combinations of polarizations. The fishing boats were detected as a single pixel of highly scattering targets with the scan-SAR narrow mode of which spatial resolution is 30 m. As the look angle dependent scattering signals exhibits the significant differences, the standard deviations of scattered signals for each look angles were taken into account as a threshold to identify the signal from fishing boats and structures on the island from background noise. It was difficult to validate the detected targets by DNB with SAR data because of time lag of observations for 6 hours between midnight by DNB and morning or evening by SAR. The temporal changes of island activities were detected as a change of mean intensity of DNB for circular area for a certain scale of activities. The increase of DNB mean intensity was corresponding to the beginning of dredging and the change of intensity indicated the ending of reclamation and following constructions of facilities.Keywords: Day night band, fishery, SAR, South China Sea.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10957885 Temporal Extension to OWL Ontologies
Authors: Sudeep Marwaha, Punam Bedi
Abstract:
Ontologies play an important role in semantic web applications and are often developed by different groups and continues to evolve over time. The knowledge in ontologies changes very rapidly that make the applications outdated if they continue to use old versions or unstable if they jump to new versions. Temporal frames using frame versioning and slot versioning are used to take care of dynamic nature of the ontologies. The paper proposes new tags and restructured OWL format enabling the applications to work with the old or new version of ontologies. Gene Ontology, a very dynamic ontology, has been used as a case study to explain the OWL Ontology with Temporal Tags.Keywords: Frame and slot Versioning, OWL, OntologyVersioning, Semantic Web.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13897884 Reduction of Plants Biodiversity in Hyrcanian Forest by Coal Mining Activities
Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch
Abstract:
Considering that coal mining is one of the important industrial activities, it may cause damages to environment. According to the author’s best knowledge, the effect of traditional coal mining activities on plant biodiversity has not been investigated in the Hyrcanian forests. Therefore, in this study, the effect of coal mining activities on vegetation and tree diversity was investigated in Hyrcanian forest, North Iran. After filed visiting and determining the mine, 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity, and it is considered as the control area. In each plot, the data about trees such as number and type of species were recorded. The biodiversity of vegetation cover was considered 5 square sub-plots (1 m2) in each plot. PAST software and Ecological Methodology were used to calculate Biodiversity indices. The value of Shannon Wiener and Simpson diversity indices for tree cover in control area (1.04±0.34 and 0.62±0.20) was significantly higher than mining area (0.78±0.27 and 0.45±0.14). The value of evenness indices for tree cover in the mining area was significantly lower than that of the control area. The value of Shannon Wiener and Simpson diversity indices for vegetation cover in the control area (1.37±0.06 and 0.69±0.02) was significantly higher than the mining area (1.02±0.13 and 0.50±0.07). The value of evenness index in the control area was significantly higher than the mining area. Plant communities are a good indicator of the changes in the site. Study about changes in vegetation biodiversity and plant dynamics in the degraded land can provide necessary information for forest management and reforestation of these areas.
Keywords: Vegetation biodiversity, species composition, traditional coal mining, caspian forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 898