Search results for: Data cutting and sorting method

13385 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data

Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch

Abstract:

It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method has been proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment presented that both methods have reasonable results in the case of the c-index. However, in some cases the calibration value (χ2) obtained quite a high result. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.

Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2523

13384 Principal Component Analysis-Ranking as a Variable Selection Method for the Simultaneous Spectrophotometric Determination of Phenol, Resorcinol and Catechol in Real Samples

Authors: Nahid Ghasemi, Mohammad Goodarzi, Morteza Khosravi

Abstract:

Simultaneous determination of multicomponents of phenol, resorcinol and catechol with a chemometric technique a PCranking artificial neural network (PCranking-ANN) algorithm is reported in this study. Based on the data correlation coefficient method, 3 representative PCs are selected from the scores of original UV spectral data (35 PCs) as the original input patterns for ANN to build a neural network model. The results obtained by iterating 8000 .The RMSEP for phenol, resorcinol and catechol with PCranking- ANN were 0.6680, 0.0766 and 0.1033, respectively. Calibration matrices were 0.50-21.0, 0.50-15.1 and 0.50-20.0 μg ml-1 for phenol, resorcinol and catechol, respectively. The proposed method was successfully applied for the determination of phenol, resorcinol and catechol in synthetic and water samples.

Keywords: Phenol, Resorcinol, Catechol, Principal componentrankingArtificial Neural Network, Chemometrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1427

13383 3D Face Modeling based on 3D Dense Morphable Face Shape Model

Authors: Yongsuk Jang Kim, Sun-Tae Chung, Boogyun Kim, Seongwon Cho

Abstract:

Realistic 3D face model is more precise in representing pose, illumination, and expression of face than 2D face model so that it can be utilized usefully in various applications such as face recognition, games, avatars, animations, and etc. In this paper, we propose a 3D face modeling method based on 3D dense morphable shape model. The proposed 3D modeling method first constructs a 3D dense morphable shape model from 3D face scan data obtained using a 3D scanner. Next, the proposed method extracts and matches facial landmarks from 2D image sequence containing a face to be modeled, and then reconstructs 3D vertices coordinates of the landmarks using a factorization-based SfM technique. Then, the proposed method obtains a 3D dense shape model of the face to be modeled by fitting the constructed 3D dense morphable shape model into the reconstructed 3D vertices. Also, the proposed method makes a cylindrical texture map using 2D face image sequence. Finally, the proposed method generates a 3D face model by rendering the 3D dense face shape model using the cylindrical texture map. Through building processes of 3D face model by the proposed method, it is shown that the proposed method is relatively easy, fast and precise.

Keywords: 3D Face Modeling, 3D Morphable Shape Model, 3DReconstruction, 3D Correspondence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2428

13382 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Crops

Authors: M. M. Ali, Ahmed Al-Ani, Derek Eamus, Daniel K. Y. Tan

Abstract:

In this glasshouse study, we developed a new imagebased non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. The plants were grown on a nutrient solution containing different P concentrations, e.g. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P). After 7 weeks of treatment, the plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. These data were further used in linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using leaf image and morphological data. Our proposed nondestructive imaging method is precise in estimating P requirements of different crop species.

Keywords: Image-based techniques, leaf area, leaf P contents, linear discriminant analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649

13381 CFD Modeling of Boiling in a Microchannel Based On Phase-Field Method

Authors: Rahim Jafari, Tuba Okutucu-Özyurt

Abstract:

The hydrodynamics and heat transfer characteristics of a vaporized elongated bubble in a rectangular microchannel have been simulated based on Cahn-Hilliard phase-field method. In the simulations, the initially nucleated bubble starts growing as it comes in contact with superheated water. The growing shape of the bubble compared well with the available experimental data in the literature.

Keywords: Microchannel, boiling, Cahn-Hilliard method, Two-phase flow, Simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3846

13380 Genetic Combined with a Simplex Algorithm as an Efficient Method for the Detection of a Depressed Ellipsoidal Flaw using the Boundary Element Method

Authors: Clio G. Vossou, Ioannis N. Koukoulis, Christopher G. Provatidis

Abstract:

The present work encounters the solution of the defect identification problem with the use of an evolutionary algorithm combined with a simplex method. In more details, a Matlab implementation of Genetic Algorithms is combined with a Simplex method in order to lead to the successful identification of the defect. The influence of the location and the orientation of the depressed ellipsoidal flaw was investigated as well as the use of different amount of static data in the cost function. The results were evaluated according to the ability of the simplex method to locate the global optimum in each test case. In this way, a clear impression regarding the performance of the novel combination of the optimization algorithms, and the influence of the geometrical parameters of the flaw in defect identification problems was obtained.

Keywords: Defect identification, genetic algorithms, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1290

13379 Detecting the Nonlinearity in Time Series from Continuous Dynamic Systems Based on Delay Vector Variance Method

Authors: Shumin Hou, Yourong Li, Sanxing Zhao

Abstract:

Much time series data is generally from continuous dynamic system. Firstly, this paper studies the detection of the nonlinearity of time series from continuous dynamics systems by applying the Phase-randomized surrogate algorithm. Then, the Delay Vector Variance (DVV) method is introduced into nonlinearity test. The results show that under the different sampling conditions, the opposite detection of nonlinearity is obtained via using traditional test statistics methods, which include the third-order autocovariance and the asymmetry due to time reversal. Whereas the DVV method can perform well on determining nonlinear of Lorenz signal. It indicates that the proposed method can describe the continuous dynamics signal effectively.

Keywords: Nonlinearity, Time series, continuous dynamics system, DVV method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1626

13378 Low Energy Method for Data Delivery in Ubiquitous Network

Authors: Tae Kyung Kim, Hee Suk Seo

Abstract:

Recent advances in wireless sensor networks have led to many routing methods designed for energy-efficiency in wireless sensor networks. Despite that many routing methods have been proposed in USN, a single routing method cannot be energy-efficient if the environment of the ubiquitous sensor network varies. We present the controlling network access to various hosts and the services they offer, rather than on securing them one by one with a network security model. When ubiquitous sensor networks are deployed in hostile environments, an adversary may compromise some sensor nodes and use them to inject false sensing reports. False reports can lead to not only false alarms but also the depletion of limited energy resource in battery powered networks. The interleaved hop-by-hop authentication scheme detects such false reports through interleaved authentication. This paper presents a LMDD (Low energy method for data delivery) algorithm that provides energy-efficiency by dynamically changing protocols installed at the sensor nodes. The algorithm changes protocols based on the output of the fuzzy logic which is the fitness level of the protocols for the environment.

Keywords: Data delivery, routing, simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1345

13377 A Message Passing Implementation of a New Parallel Arrangement Algorithm

Authors: Ezequiel Herruzo, Juan José Cruz, José Ignacio Benavides, Oscar Plata

Abstract:

This paper describes a new algorithm of arrangement in parallel, based on Odd-Even Mergesort, called division and concurrent mixes. The main idea of the algorithm is to achieve that each processor uses a sequential algorithm for ordering a part of the vector, and after that, for making the processors work in pairs in order to mix two of these sections ordered in a greater one, also ordered; after several iterations, the vector will be completely ordered. The paper describes the implementation of the new algorithm on a Message Passing environment (such as MPI). Besides, it compares the obtained experimental results with the quicksort sequential algorithm and with the parallel implementations (also on MPI) of the algorithms quicksort and bitonic sort. The comparison has been realized in an 8 processors cluster under GNU/Linux which is running on a unique PC processor.

Keywords: Parallel algorithm, arrangement, MPI, sorting, parallel program.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1692

13376 Evaluation of Dynamic Behavior a Machine Tool Spindle System through Modal and Unbalance Response Analysis

Authors: Khairul Jauhari, Achmad Widodo, Ismoyo Haryanto

Abstract:

The spindle system is one of the most important components of machine tool. The dynamic properties of the spindle affect the machining productivity and quality of the work pieces. Thus, it is important and necessary to determine its dynamic characteristics of spindles in the design and development in order to avoid forced resonance. The finite element method (FEM) has been adopted in order to obtain the dynamic behavior of spindle system. For this reason, obtaining the Campbell diagrams and determining the critical speeds are very useful to evaluate the spindle system dynamics. The unbalance response of the system to the center of mass unbalance at the cutting tool is also calculated to investigate the dynamic behavior. In this paper, we used an ANSYS Parametric Design Language (APDL) program which based on finite element method has been implemented to make the full dynamic analysis and evaluation of the results. Results show that the calculated critical speeds are far from the operating speed range of the spindle, thus, the spindle would not experience resonance, and the maximum unbalance response at operating speed is still with acceptable limit. ANSYS Parametric Design Language (APDL) can be used by spindle designer as tools in order to increase the product quality, reducing cost, and time consuming in the design and development stages.

Keywords: ANSYS parametric design language (APDL), Campbell diagram, Critical speeds, Unbalance response, The Spindle system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2830

13375 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 805

13374 The Pixel Value Data Approach for Rainfall Forecasting Based on GOES-9 Satellite Image Sequence Analysis

Authors: C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

To develop a process of extracting pixel values over the using of satellite remote sensing image data in Thailand. It is a very important and effective method of forecasting rainfall. This paper presents an approach for forecasting a possible rainfall area based on pixel values from remote sensing satellite images. First, a method uses an automatic extraction process of the pixel value data from the satellite image sequence. Then, a data process is designed to enable the inference of correlations between pixel value and possible rainfall occurrences. The result, when we have a high averaged pixel value of daily water vapor data, we will also have a high amount of daily rainfall. This suggests that the amount of averaged pixel values can be used as an indicator of raining events. There are some positive associations between pixel values of daily water vapor images and the amount of daily rainfall at each rain-gauge station throughout Thailand. The proposed approach was proven to be a helpful manual for rainfall forecasting from meteorologists by which using automated analyzing and interpreting process of meteorological remote sensing data.

Keywords: Pixel values, satellite image, water vapor, rainfall, image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1862

13373 A Hybrid Data Mining Method for the Medical Classification of Chest Pain

Authors: Sung Ho Ha, Seong Hyeon Joo

Abstract:

Data mining techniques have been used in medical research for many years and have been known to be effective. In order to solve such problems as long-waiting time, congestion, and delayed patient care, faced by emergency departments, this study concentrates on building a hybrid methodology, combining data mining techniques such as association rules and classification trees. The methodology is applied to real-world emergency data collected from a hospital and is evaluated by comparing with other techniques. The methodology is expected to help physicians to make a faster and more accurate classification of chest pain diseases.

Keywords: Data mining, medical decisions, medical domainknowledge, chest pain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2220

13372 Operational Risk – Scenario Analysis

Authors: Milan Rippel, Petr Teply

Abstract:

This paper focuses on operational risk measurement techniques and on economic capital estimation methods. A data sample of operational losses provided by an anonymous Central European bank is analyzed using several approaches. Loss Distribution Approach and scenario analysis method are considered. Custom plausible loss events defined in a particular scenario are merged with the original data sample and their impact on capital estimates and on the financial institution is evaluated. Two main questions are assessed – What is the most appropriate statistical method to measure and model operational loss data distribution? and What is the impact of hypothetical plausible events on the financial institution? The g&h distribution was evaluated to be the most suitable one for operational risk modeling. The method based on the combination of historical loss events modeling and scenario analysis provides reasonable capital estimates and allows for the measurement of the impact of extreme events on banking operations.

Keywords: operational risk, scenario analysis, economic capital, loss distribution approach, extreme value theory, stress testing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2429

13371 Surface Roughness Optimization in End Milling Operation with Damper Inserted End Milling Cutters

Authors: Krishna Mohana Rao, G. Ravi Kumar, P. Sowmya

Abstract:

This paper presents a study of the Taguchi design application to optimize surface quality in damper inserted end milling operation. Maintaining good surface quality usually involves additional manufacturing cost or loss of productivity. The Taguchi design is an efficient and effective experimental method in which a response variable can be optimized, given various factors, using fewer resources than a factorial design. This Study included spindle speed, feed rate, and depth of cut as control factors, usage of different tools in the same specification, which introduced tool condition and dimensional variability. An orthogonal array of L9(3^4)was used; ANOVA analyses were carried out to identify the significant factors affecting surface roughness, and the optimal cutting combination was determined by seeking the best surface roughness (response) and signal-to-noise ratio. Finally, confirmation tests verified that the Taguchi design was successful in optimizing milling parameters for surface roughness.

Keywords: ANOVA, Damper, End Milling, Optimization, Surface roughness, Taguchi design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2348

13370 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3281

13369 Generator of Hypotheses an Approach of Data Mining Based on Monotone Systems Theory

Authors: Rein Kuusik, Grete Lind

Abstract:

Generator of hypotheses is a new method for data mining. It makes possible to classify the source data automatically and produces a particular enumeration of patterns. Pattern is an expression (in a certain language) describing facts in a subset of facts. The goal is to describe the source data via patterns and/or IF...THEN rules. Used evaluation criteria are deterministic (not probabilistic). The search results are trees - form that is easy to comprehend and interpret. Generator of hypotheses uses very effective algorithm based on the theory of monotone systems (MS) named MONSA (MONotone System Algorithm).

Keywords: data mining, monotone systems, pattern, rule.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256

13368 Anisotropic Total Fractional Order Variation Model in Seismic Data Denoising

Authors: Jianwei Ma, Diriba Gemechu

Abstract:

In seismic data processing, attenuation of random noise is the basic step to improve quality of data for further application of seismic data in exploration and development in different gas and oil industries. The signal-to-noise ratio of the data also highly determines quality of seismic data. This factor affects the reliability as well as the accuracy of seismic signal during interpretation for different purposes in different companies. To use seismic data for further application and interpretation, we need to improve the signal-to-noise ration while attenuating random noise effectively. To improve the signal-to-noise ration and attenuating seismic random noise by preserving important features and information about seismic signals, we introduce the concept of anisotropic total fractional order denoising algorithm. The anisotropic total fractional order variation model defined in fractional order bounded variation is proposed as a regularization in seismic denoising. The split Bregman algorithm is employed to solve the minimization problem of the anisotropic total fractional order variation model and the corresponding denoising algorithm for the proposed method is derived. We test the effectiveness of theproposed method for synthetic and real seismic data sets and the denoised result is compared with F-X deconvolution and non-local means denoising algorithm.

Keywords: Anisotropic total fractional order variation, fractional order bounded variation, seismic random noise attenuation, Split Bregman Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1013

13367 A Spatial Point Pattern Analysis to Recognize Fail Bit Patterns in Semiconductor Manufacturing

Authors: Youngji Yoo, Seung Hwan Park, Daewoong An, Sung-Shick Kim, Jun-Geol Baek

Abstract:

The yield management system is very important to produce high-quality semiconductor chips in the semiconductor manufacturing process. In order to improve quality of semiconductors, various tests are conducted in the post fabrication (FAB) process. During the test process, large amount of data are collected and the data includes a lot of information about defect. In general, the defect on the wafer is the main causes of yield loss. Therefore, analyzing the defect data is necessary to improve performance of yield prediction. The wafer bin map (WBM) is one of the data collected in the test process and includes defect information such as the fail bit patterns. The fail bit has characteristics of spatial point patterns. Therefore, this paper proposes the feature extraction method using the spatial point pattern analysis. Actual data obtained from the semiconductor process is used for experiments and the experimental result shows that the proposed method is more accurately recognize the fail bit patterns.

Keywords: Semiconductor, wafer bin map (WBM), feature extraction, spatial point patterns, contour map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2500

13366 The RK1GL2X3 Method for Initial Value Problems in Ordinary Differential Equations

Authors: J.S.C. Prentice

Abstract:

The RK1GL2X3 method is a numerical method for solving initial value problems in ordinary differential equations, and is based on the RK1GL2 method which, in turn, is a particular case of the general RKrGLm method. The RK1GL2X3 method is a fourth-order method, even though its underlying Runge-Kutta method RK1 is the first-order Euler method, and hence, RK1GL2X3 is considerably more efficient than RK1. This enhancement is achieved through an implementation involving triple-nested two-point Gauss- Legendre quadrature.

Keywords: RK1GL2X3, RK1GL2, RKrGLm, Runge-Kutta, Gauss-Legendre, initial value problem, local error, global error.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1318

13365 An Efficient Iterative Updating Method for Damped Structural Systems

Authors: Jiashang Jiang

Abstract:

Model updating is an inverse eigenvalue problem which concerns the modification of an existing but inaccurate model with measured modal data. In this paper, an efficient gradient based iterative method for updating the mass, damping and stiffness matrices simultaneously using a few of complex measured modal data is developed. Convergence analysis indicates that the iterative solutions always converge to the unique minimum Frobenius norm symmetric solution of the model updating problem by choosing a special kind of initial matrices.

Keywords: Model updating, iterative algorithm, damped structural system, optimal approximation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2084

13364 Inspection of Geometrical Integrity of Work Piece and Measurement of Tool Wear by the Use of Photo Digitizing Method

Authors: R. Alipour, F. Nadjarian, A. Alinaghizade

Abstract:

Considering complexity of products, new geometrical design and investment tolerances that are necessary, measuring and dimensional controlling involve modern and more precise methods. Photo digitizing method using two cameras to record pictures and utilization of conventional method named “cloud points" and data analysis by the use of ATOUS software, is known as modern and efficient in mentioned context. In this paper, benefits of photo digitizing method in evaluating sampling of machining processes have been put forward. For example, assessment of geometrical integrity surface in 5-axis milling process and measurement of carbide tool wear in turning process, can be can be brought forward. Advantages of this method comparing to conventional methods have been expressed.

Keywords: photo digitizing, tool wear, geometrical integrity, cloud points

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1357

13363 Parametric Analysis and Optimal Design of Functionally Graded Plates Using Particle Swarm Optimization Algorithm and a Hybrid Meshless Method

Authors: Foad Nazari, Seyed Mahmood Hosseini, Mohammad Hossein Abolbashari, Mohammad Hassan Abolbashari

Abstract:

The present study is concerned with the optimal design of functionally graded plates using particle swarm optimization (PSO) algorithm. In this study, meshless local Petrov-Galerkin (MLPG) method is employed to obtain the functionally graded (FG) plate’s natural frequencies. Effects of two parameters including thickness to height ratio and volume fraction index on the natural frequencies and total mass of plate are studied by using the MLPG results. Then the first natural frequency of the plate, for different conditions where MLPG data are not available, is predicted by an artificial neural network (ANN) approach which is trained by back-error propagation (BEP) technique. The ANN results show that the predicted data are in good agreement with the actual one. To maximize the first natural frequency and minimize the mass of FG plate simultaneously, the weighted sum optimization approach and PSO algorithm are used. However, the proposed optimization process of this study can provide the designers of FG plates with useful data.

Keywords: Optimal design, natural frequency, FG plate, hybrid meshless method, MLPG method, ANN approach, particle swarm optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1433

13362 Identification of Coauthors in Scientific Database

Authors: Thiago M. R Dias, Gray F. Moita

Abstract:

The analysis of scientific collaboration networks has contributed significantly to improving the understanding of how does the process of collaboration between researchers and also to understand how the evolution of scientific production of researchers or research groups occurs. However, the identification of collaborations in large scientific databases is not a trivial task given the high computational cost of the methods commonly used. This paper proposes a method for identifying collaboration in large data base of curriculum researchers. The proposed method has low computational cost with satisfactory results, proving to be an interesting alternative for the modeling and characterization of large scientific collaboration networks.

Keywords: Extraction and data integration, Information Retrieval, Scientific Collaboration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1712

13361 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in Data Mining during the past two decades. Moreover, several approaches and methods have been emerged focusing on clustering diverse data types, features of cluster models and similarity rates of clusters. However, none of the single clustering algorithm exemplifies its best nature in extracting efficient clusters. Consequently, in order to rectify this issue, a new challenging technique called Cluster Ensemble method was bloomed. This new approach tends to be the alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate the diverse clustering solutions in such a way to attain accuracy and also to improve the eminence the individual clustering algorithms. Due to the massive and rapid development of new methods in the globe of data mining, it is highly mandatory to scrutinize a vital analysis of existing techniques and the future novelty. This paper shows the comparative analysis of different cluster ensemble methods along with their methodologies and salient features. Henceforth this unambiguous analysis will be very useful for the society of clustering experts and also helps in deciding the most appropriate one to resolve the problem in hand.

Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104

13360 Seat Assignment Problem Optimization

Authors: Mohammed Salem Alzahrani

Abstract:

In this paper the optimality of the solution of an existing real word assignment problem known as the seat assignment problem using Seat Assignment Method (SAM) is discussed. SAM is the newly driven method from three existing methods, Hungarian Method, Northwest Corner Method and Least Cost Method in a special way that produces the easiness & fairness among all methods that solve the seat assignment problem.

Keywords: Assignment Problem, Hungarian Method, Least Cost Method, Northwest Corner Method, Seat Assignment Method (SAM), A Real Word Assignment Problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3446

13359 Approaches and Schemes for Storing DTD-Independent XML Data in Relational Databases

Authors: Mehdi Emadi, Masoud Rahgozar, Adel Ardalan, Alireza Kazerani, Mohammad Mahdi Ariyan

Abstract:

The volume of XML data exchange is explosively increasing, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing XML DTD-independent documents in relational database systems. Benchmarking is the best way to highlight pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of Label Path, extracting values and storing in another table and type of join needed for each method's query answering.

Keywords: XML Data Management, XPath, DTD-IndependentXML Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1881

13358 Approaches and Schemes for Storing DTDIndependent XML Data in Relational Databases

Authors: Mehdi Emadi, Masoud Rahgozar, Adel Ardalan, Alireza Kazerani, Mohammad Mahdi Ariyan

Abstract:

The volume of XML data exchange is explosively increasing, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing XML DTD-independent documents in relational database systems. Benchmarking is the best way to highlight pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of Label Path, extracting values and storing in another table and type of join needed for each method-s query answering.

Keywords: XML Data Management, XPath, DTD-Independent XML Data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1453

13357 A Spatial Information Network Traffic Prediction Method Based on Hybrid Model

Authors: Jingling Li, Yi Zhang, Wei Liang, Tao Cui, Jun Li

Abstract:

Compared with terrestrial network, the traffic of spatial information network has both self-similarity and short correlation characteristics. By studying its traffic prediction method, the resource utilization of spatial information network can be improved, and the method can provide an important basis for traffic planning of a spatial information network. In this paper, considering the accuracy and complexity of the algorithm, the spatial information network traffic is decomposed into approximate component with long correlation and detail component with short correlation, and a time series hybrid prediction model based on wavelet decomposition is proposed to predict the spatial network traffic. Firstly, the original traffic data are decomposed to approximate components and detail components by using wavelet decomposition algorithm. According to the autocorrelation and partial correlation smearing and truncation characteristics of each component, the corresponding model (AR/MA/ARMA) of each detail component can be directly established, while the type of approximate component modeling can be established by ARIMA model after smoothing. Finally, the prediction results of the multiple models are fitted to obtain the prediction results of the original data. The method not only considers the self-similarity of a spatial information network, but also takes into account the short correlation caused by network burst information, which is verified by using the measured data of a certain back bone network released by the MAWI working group in 2018. Compared with the typical time series model, the predicted data of hybrid model is closer to the real traffic data and has a smaller relative root means square error, which is more suitable for a spatial information network.

Keywords: Spatial Information Network, Traffic prediction, Wavelet decomposition, Time series model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 637

13356 XML Data Management in Compressed Relational Database

Authors: Hongzhi Wang, Jianzhong Li, Hong Gao

Abstract:

XML is an important standard of data exchange and representation. As a mature database system, using relational database to support XML data may bring some advantages. But storing XML in relational database has obvious redundancy that wastes disk space, bandwidth and disk I/O when querying XML data. For the efficiency of storage and query XML, it is necessary to use compressed XML data in relational database. In this paper, a compressed relational database technology supporting XML data is presented. Original relational storage structure is adaptive to XPath query process. The compression method keeps this feature. Besides traditional relational database techniques, additional query process technologies on compressed relations and for special structure for XML are presented. In this paper, technologies for XQuery process in compressed relational database are presented..

Keywords: XML, compression, query processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805