Search results for: curve feature
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2505

Search results for: curve feature

2505 Comparison of the Distillation Curve Obtained Experimentally with the Curve Extrapolated by a Commercial Simulator

Authors: Lívia B. Meirelles, Erika C. A. N. Chrisman, Flávia B. de Andrade, Lilian C. M. de Oliveira

Abstract:

True Boiling Point distillation (TBP) is one of the most common experimental techniques for the determination of petroleum properties. This curve provides information about the performance of petroleum in terms of its cuts. The experiment is performed in a few days. Techniques are used to determine the properties faster with a software that calculates the distillation curve when a little information about crude oil is known. In order to evaluate the accuracy of distillation curve prediction, eight points of the TBP curve and specific gravity curve (348 K and 523 K) were inserted into the HYSYS Oil Manager, and the extended curve was evaluated up to 748 K. The methods were able to predict the curve with the accuracy of 0.6%-9.2% error (Software X ASTM), 0.2%-5.1% error (Software X Spaltrohr).

Keywords: distillation curve, petroleum distillation, simulation, true boiling point curve

Procedia PDF Downloads 412
2504 Detection of Keypoint in Press-Fit Curve Based on Convolutional Neural Network

Authors: Shoujia Fang, Guoqing Ding, Xin Chen

Abstract:

The quality of press-fit assembly is closely related to reliability and safety of product. The paper proposed a keypoint detection method based on convolutional neural network to improve the accuracy of keypoint detection in press-fit curve. It would provide an auxiliary basis for judging quality of press-fit assembly. The press-fit curve is a curve of press-fit force and displacement. Both force data and distance data are time-series data. Therefore, one-dimensional convolutional neural network is used to process the press-fit curve. After the obtained press-fit data is filtered, the multi-layer one-dimensional convolutional neural network is used to perform the automatic learning of press-fit curve features, and then sent to the multi-layer perceptron to finally output keypoint of the curve. We used the data of press-fit assembly equipment in the actual production process to train CNN model, and we used different data from the same equipment to evaluate the performance of detection. Compared with the existing research result, the performance of detection was significantly improved. This method can provide a reliable basis for the judgment of press-fit quality.

Keywords: keypoint detection, curve feature, convolutional neural network, press-fit assembly

Procedia PDF Downloads 185
2503 Approximating Maximum Speed on Road from Curvature Information of Bezier Curve

Authors: M. Yushalify Misro, Ahmad Ramli, Jamaludin M. Ali

Abstract:

Bezier curves have useful properties for path generation problem, for instance, it can generate the reference trajectory for vehicles to satisfy the path constraints. Both algorithms join cubic Bezier curve segment smoothly to generate the path. Some of the useful properties of Bezier are curvature. In mathematics, the curvature is the amount by which a geometric object deviates from being flat, or straight in the case of a line. Another extrinsic example of curvature is a circle, where the curvature is equal to the reciprocal of its radius at any point on the circle. The smaller the radius, the higher the curvature thus the vehicle needs to bend sharply. In this study, we use Bezier curve to fit highway-like curve. We use the different approach to finding the best approximation for the curve so that it will resemble highway-like curve. We compute curvature value by analytical differentiation of the Bezier Curve. We will then compute the maximum speed for driving using the curvature information obtained. Our research works on some assumptions; first the Bezier curve estimates the real shape of the curve which can be verified visually. Even, though, the fitting process of Bezier curve does not interpolate exactly on the curve of interest, we believe that the estimation of speed is acceptable. We verified our result with the manual calculation of the curvature from the map.

Keywords: speed estimation, path constraints, reference trajectory, Bezier curve

Procedia PDF Downloads 346
2502 Bifurcation Curve for Semipositone Problem with Minkowski-Curvature Operator

Authors: Shao-Yuan Huang

Abstract:

We study the shape of the bifurcation curve of positive solutions for the semipositone problem with the Minkowski-curvature operator. The Minkowski-curvature problem plays an important role in certain fundamental issues in differential geometry and in the special theory of relativity. In addition, it is well known that studying the multiplicity of positive solutions is equivalent to studying the shape of the bifurcation curve. By the shape of the bifurcation curve, we can understand the change in the multiplicity of positive solutions with varying parameters. In this paper, our main technique is a time-map method used in Corsato's PhD Thesis. By this method, studying the shape of the bifurcation curve is equivalent to studying the shape of a certain function T with improper integral. Generally speaking, it is difficult to study the shape of T. So, in this paper, we consider two cases that the nonlinearity is convex or concave. Thus we obtain the following results: (i) If f''(u) < 0 for u > 0, then the bifurcation curve is C-shaped. (ii) If f''(u) > 0 for u > 0, then there exists η>β such that the bifurcation curve does not exist for 0 η. Furthermore, we prove that the bifurcation is C-shaped for L > η under a certain condition.

Keywords: bifurcation curve, Minkowski-curvature problem, positive solution, time-map method

Procedia PDF Downloads 67
2501 Solving 94-Bit ECDLP with 70 Computers in Parallel

Authors: Shunsuke Miyoshi, Yasuyuki Nogami, Takuya Kusaka, Nariyoshi Yamai

Abstract:

Elliptic curve discrete logarithm problem (ECDLP) is one of problems on which the security of pairing-based cryptography is based. This paper considers Pollard's rho method to evaluate the security of ECDLP on Barreto-Naehrig (BN) curve that is an efficient pairing-friendly curve. Some techniques are proposed to make the rho method efficient. Especially, the group structure on BN curve, distinguished point method, and Montgomery trick are well-known techniques. This paper applies these techniques and shows its optimization. According to the experimental results for which a large-scale parallel system with MySQL is applied, 94-bit ECDLP was solved about 28 hours by parallelizing 71 computers.

Keywords: Pollard's rho method, BN curve, Montgomery multiplication

Procedia PDF Downloads 241
2500 Generating Arabic Fonts Using Rational Cubic Ball Functions

Authors: Fakharuddin Ibrahim, Jamaludin Md. Ali, Ahmad Ramli

Abstract:

In this paper, we will discuss about the data interpolation by using the rational cubic Ball curve. To generate a curve with a better and satisfactory smoothness, the curve segments must be connected with a certain amount of continuity. The continuity that we will consider is of type G1 continuity. The conditions considered are known as the G1 Hermite condition. A simple application of the proposed method is to generate an Arabic font satisfying the required continuity.

Keywords: data interpolation, rational ball curve, hermite condition, continuity

Procedia PDF Downloads 395
2499 An Optimized RDP Algorithm for Curve Approximation

Authors: Jean-Pierre Lomaliza, Kwang-Seok Moon, Hanhoon Park

Abstract:

It is well-known that Ramer Douglas Peucker (RDP) algorithm greatly depends on the method of choosing starting points. Therefore, this paper focuses on finding such starting points that will optimize the results of RDP algorithm. Specifically, this paper proposes a curve approximation algorithm that finds flat points, called essential points, of an input curve, divides the curve into corner-like sub-curves using the essential points, and applies the RDP algorithm to the sub-curves. The number of essential points play a role on optimizing the approximation results by balancing the degree of shape information loss and the amount of data reduction. Through experiments with curves of various types and complexities of shape, we compared the performance of the proposed algorithm with three other methods, i.e., the RDP algorithm itself and its variants. As a result, the proposed algorithm outperformed the others in term of maintaining the original shapes of the input curve, which is important in various applications like pattern recognition.

Keywords: curve approximation, essential point, RDP algorithm

Procedia PDF Downloads 505
2498 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 635
2497 A Cohesive Zone Model with Parameters Determined by Uniaxial Stress-Strain Curve

Authors: Y.J. Wang, C. Q. Ru

Abstract:

A key issue of cohesive zone models is how to determine the cohesive zone model parameters based on real material test data. In this paper, uniaxial nominal stress-strain curve (SS curve) is used to determine two key parameters of a cohesive zone model (CZM): The maximum traction and the area under the curve of traction-separation law (TSL). To this end, the true SS curve is obtained based on the nominal SS curve, and the relationship between the nominal SS curve and TSL is derived based on an assumption that the stress for cracking should be the same in both CZM and the real material. In particular, the true SS curve after necking is derived from the nominal SS curve by taking the average of the power law extrapolation and the linear extrapolation, and a damage factor is introduced to offset the true stress reduction caused by the voids generated at the necking zone. The maximum traction of the TSL is equal to the maximum true stress calculated based on the damage factor at the end of hardening. In addition, a simple specimen is modeled by Abaqus/Standard to calculate the critical J-integral, and the fracture energy calculated by the critical J-integral represents the stored strain energy in the necking zone calculated by the true SS curve. Finally, the CZM parameters obtained by the present method are compared to those used in a previous related work for a simulation of the drop-weight tear test.

Keywords: dynamic fracture, cohesive zone model, traction-separation law, stress-strain curve, J-integral

Procedia PDF Downloads 437
2496 Determination of Cohesive Zone Model’s Parameters Based On the Uniaxial Stress-Strain Curve

Authors: Y. J. Wang, C. Q. Ru

Abstract:

A key issue of cohesive zone models is how to determine the cohesive zone model (CZM) parameters based on real material test data. In this paper, uniaxial nominal stress-strain curve (SS curve) is used to determine two key parameters of a cohesive zone model: the maximum traction and the area under the curve of traction-separation law (TSL). To this end, the true SS curve is obtained based on the nominal SS curve, and the relationship between the nominal SS curve and TSL is derived based on an assumption that the stress for cracking should be the same in both CZM and the real material. In particular, the true SS curve after necking is derived from the nominal SS curve by taking the average of the power law extrapolation and the linear extrapolation, and a damage factor is introduced to offset the true stress reduction caused by the voids generated at the necking zone. The maximum traction of the TSL is equal to the maximum true stress calculated based on the damage factor at the end of hardening. In addition, a simple specimen is simulated by Abaqus/Standard to calculate the critical J-integral, and the fracture energy calculated by the critical J-integral represents the stored strain energy in the necking zone calculated by the true SS curve. Finally, the CZM parameters obtained by the present method are compared to those used in a previous related work for a simulation of the drop-weight tear test.

Keywords: dynamic fracture, cohesive zone model, traction-separation law, stress-strain curve, J-integral

Procedia PDF Downloads 481
2495 The Term Structure of Government Bond Yields in an Emerging Market: Empirical Evidence from Pakistan Bond Market

Authors: Wali Ullah, Muhammad Nishat

Abstract:

The study investigates the extent to which the so called Nelson-Siegel model (DNS) and its extended version that accounts for time varying volatility (DNS-EGARCH) can optimally fit the yield curve and predict its future path in the context of an emerging economy. For the in-sample fit, both models fit the curve remarkably well even in the emerging markets. However, the DNS-EGARCH model fits the curve slightly better than the DNS. Moreover, both specifications of yield curve that are based on the Nelson-Siegel functional form outperform the benchmark VAR forecasts at all forecast horizons. The DNS-EGARCH comes with more precise forecasts than the DNS for the 6- and 12-month ahead forecasts, while the two have almost similar performance in terms of RMSE for the very short forecast horizons.

Keywords: yield curve, forecasting, emerging markets, Kalman filter, EGARCH

Procedia PDF Downloads 513
2494 A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning

Authors: Samina Khalid, Shamila Nasreen

Abstract:

Dimensionality reduction as a preprocessing step to machine learning is effective in removing irrelevant and redundant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase of dimensionality of data poses a severe challenge to many existing feature selection and feature extraction methods with respect to efficiency and effectiveness. In the field of machine learning and pattern recognition, dimensionality reduction is important area, where many approaches have been proposed. In this paper, some widely used feature selection and feature extraction techniques have analyzed with the purpose of how effectively these techniques can be used to achieve high performance of learning algorithms that ultimately improves predictive accuracy of classifier. An endeavor to analyze dimensionality reduction techniques briefly with the purpose to investigate strengths and weaknesses of some widely used dimensionality reduction methods is presented.

Keywords: age related macular degeneration, feature selection feature subset selection feature extraction/transformation, FSA’s, relief, correlation based method, PCA, ICA

Procedia PDF Downloads 458
2493 Study of Bifurcation Curve with Aspect Ratio at Low Reynolds Number

Authors: Amit K. Singh, Subhankar Sen

Abstract:

The bifurcation curve of separation in steady two-dimensional viscous flow past an elliptic cylinder is studied by varying the angle of incidence (α) with different aspect ratio (ratio of minor to major axis). The solutions are based on numerical investigation, using finite element analysis, of the Navier-Stokes equations for incompressible flow. Results are presented for Reynolds number up to 50 and angle of incidence varies from 0° to 90°. Range of aspect ratio (Ar) is from 0.1 to 1 (in steps of 0.1) and flow is considered as unbounded flow. Bifurcation curve represents the locus of Reynolds numbers (Res) at which flow detaches or separates from the surface of the body at a given α and Ar. In earlier studies, effect of Ar on laminar separation curve or bifurcation curve is limited for Ar = 0.1, 0.2, 0.5 and 0.8. Some results are also available at α = 90° and 45°. The present study attempts to provide a systematic data and clear understanding on the effect of Ar at bifurcation curve and its point of maxima. In addition, issues regarding location of separation angle and maximum ratio of coefficient of lift to drag are studied. We found that nature of curve, separation angle and maximum ratio of lift to drag changes considerably with respect to change in Ar.

Keywords: aspect ratio, bifurcation curve, elliptic cylinder, GMRES, stabilized finite-element

Procedia PDF Downloads 307
2492 Choosing between the Regression Correlation, the Rank Correlation, and the Correlation Curve

Authors: Roger L. Goodwin

Abstract:

This paper presents a rank correlation curve. The traditional correlation coefficient is valid for both continuous variables and for integer variables using rank statistics. Since the correlation coefficient has already been established in rank statistics by Spearman, such a calculation can be extended to the correlation curve. This paper presents two survey questions. The survey collected non-continuous variables. We will show weak to moderate correlation. Obviously, one question has a negative effect on the other. A review of the qualitative literature can answer which question and why. The rank correlation curve shows which collection of responses has a positive slope and which collection of responses has a negative slope. Such information is unavailable from the flat, "first-glance" correlation statistics.

Keywords: Bayesian estimation, regression model, rank statistics, correlation, correlation curve

Procedia PDF Downloads 431
2491 Representation of the Solution of One Dynamical System on the Plane

Authors: Kushakov Kholmurodjon, Muhammadjonov Akbarshox

Abstract:

This present paper is devoted to a system of second-order nonlinear differential equations with a special right-hand side, exactly, the linear part and a third-order polynomial of a special form. It is shown that for some relations between the parameters, there is a second-order curve in which trajectories leaving the points of this curve remain in the same place. Thus, the curve is invariant with respect to the given system. Moreover, this system is invariant under a non-degenerate linear transformation of variables. The form of this curve, depending on the relations between the parameters and the eigenvalues of the matrix, is proved. All solutions of this system of differential equations are shown analytically.

Keywords: dynamic system, ellipse, hyperbola, Hess system, polar coordinate system

Procedia PDF Downloads 163
2490 A Bathtub Curve from Nonparametric Model

Authors: Eduardo C. Guardia, Jose W. M. Lima, Afonso H. M. Santos

Abstract:

This paper presents a nonparametric method to obtain the hazard rate “Bathtub curve” for power system components. The model is a mixture of the three known phases of a component life, the decreasing failure rate (DFR), the constant failure rate (CFR) and the increasing failure rate (IFR) represented by three parametric Weibull models. The parameters are obtained from a simultaneous fitting process of the model to the Kernel nonparametric hazard rate curve. From the Weibull parameters and failure rate curves the useful lifetime and the characteristic lifetime were defined. To demonstrate the model the historic time-to-failure of distribution transformers were used as an example. The resulted “Bathtub curve” shows the failure rate for the equipment lifetime which can be applied in economic and replacement decision models.

Keywords: bathtub curve, failure analysis, lifetime estimation, parameter estimation, Weibull distribution

Procedia PDF Downloads 415
2489 Weighted G2 Multi-Degree Reduction of Bezier Curves

Authors: Salisu ibrahim, Abdalla Rababah

Abstract:

In this research, we use Weighted G2-Multi-degree reduction of Bezier curve of degree n to a Bezier curve of degree m, m < n. The degree reduction of Bezier curves is used to represent a given Bezier curve of n by a Bezier curve of degree m, m < n. Exact degree reduction is not possible, and degree reduction is approximate process in nature. We derive a weighted degree reducing method that is geometrically continuous at the end points. Different norms will be considered, several error minimizations will be given. The proposed methods produce error function that are less than the errors of existing methods.

Keywords: Bezier curves, multiple degree reduction, geometric continuity, error function

Procedia PDF Downloads 450
2488 Hybrid Feature Selection Method for Sentiment Classification of Movie Reviews

Authors: Vishnu Goyal, Basant Agarwal

Abstract:

Sentiment analysis research provides methods for identifying the people’s opinion written in blogs, reviews, social networking websites etc. Sentiment analysis is to understand what opinion people have about any given entity, object or thing. Sentiment analysis research can be broadly categorised into three types of approaches i.e. semantic orientation, machine learning and lexicon based approaches. Feature selection methods improve the performance of the machine learning algorithms by eliminating the irrelevant features. Information gain feature selection method has been considered best method for sentiment analysis; however, it has the drawback of selection of threshold. Therefore, in this paper, we propose a hybrid feature selection methods comprising of information gain and proposed feature selection method. Initially, features are selected using Information Gain (IG) and further more noisy features are eliminated using the proposed feature selection method. Experimental results show the efficiency of the proposed feature selection methods.

Keywords: feature selection, sentiment analysis, hybrid feature selection

Procedia PDF Downloads 303
2487 Comparison of Receiver Operating Characteristic Curve Smoothing Methods

Authors: D. Sigirli

Abstract:

The Receiver Operating Characteristic (ROC) curve is a commonly used statistical tool for evaluating the diagnostic performance of screening and diagnostic test with continuous or ordinal scale results which aims to predict the presence or absence probability of a condition, usually a disease. When the test results were measured as numeric values, sensitivity and specificity can be computed across all possible threshold values which discriminate the subjects as diseased and non-diseased. There are infinite numbers of possible decision thresholds along the continuum of the test results. The ROC curve presents the trade-off between sensitivity and the 1-specificity as the threshold changes. The empirical ROC curve which is a non-parametric estimator of the ROC curve is robust and it represents data accurately. However, especially for small sample sizes, it has a problem of variability and as it is a step function there can be different false positive rates for a true positive rate value and vice versa. Besides, the estimated ROC curve being in a jagged form, since the true ROC curve is a smooth curve, it underestimates the true ROC curve. Since the true ROC curve is assumed to be smooth, several smoothing methods have been explored to smooth a ROC curve. These include using kernel estimates, using log-concave densities, to fit parameters for the specified density function to the data with the maximum-likelihood fitting of univariate distributions or to create a probability distribution by fitting the specified distribution to the data nd using smooth versions of the empirical distribution functions. In the present paper, we aimed to propose a smooth ROC curve estimation based on the boundary corrected kernel function and to compare the performances of ROC curve smoothing methods for the diagnostic test results coming from different distributions in different sample sizes. We performed simulation study to compare the performances of different methods for different scenarios with 1000 repetitions. It is seen that the performance of the proposed method was typically better than that of the empirical ROC curve and only slightly worse compared to the binormal model when in fact the underlying samples were generated from the normal distribution.

Keywords: empirical estimator, kernel function, smoothing, receiver operating characteristic curve

Procedia PDF Downloads 124
2486 Nanoindentation Behaviour and Microstructural Evolution of Annealed Single-Crystal Silicon

Authors: Woei-Shyan Lee, Shuo-Ling Chang

Abstract:

The nanoindentation behaviour and phase transformation of annealed single-crystal silicon wafers are examined. The silicon specimens are annealed at temperatures of 250, 350 and 450ºC, respectively, for 15 minutes and are then indented to maximum loads of 30, 50 and 70 mN. The phase changes induced in the indented specimens are observed using transmission electron microscopy (TEM) and micro-Raman scattering spectroscopy (RSS). For all annealing temperatures, an elbow feature is observed in the unloading curve following indentation to a maximum load of 30 mN. Under higher loads of 50 mN and 70 mN, respectively, the elbow feature is replaced by a pop-out event. The elbow feature reveals a complete amorphous phase transformation within the indented zone, whereas the pop-out event indicates the formation of Si XII and Si III phases. The experimental results show that the formation of these crystalline silicon phases increases with an increasing annealing temperature and indentation load. The hardness and Young’s modulus both decrease as the annealing temperature and indentation load are increased.

Keywords: nanoindentation, silicon, phase transformation, amorphous, annealing

Procedia PDF Downloads 331
2485 Feature Location Restoration for Under-Sampled Photoplethysmogram Using Spline Interpolation

Authors: Hangsik Shin

Abstract:

The purpose of this research is to restore the feature location of under-sampled photoplethysmogram using spline interpolation and to investigate feasibility for feature shape restoration. We obtained 10 kHz-sampled photoplethysmogram and decimated it to generate under-sampled dataset. Decimated dataset has 5 kHz, 2.5 k Hz, 1 kHz, 500 Hz, 250 Hz, 25 Hz and 10 Hz sampling frequency. To investigate the restoration performance, we interpolated under-sampled signals with 10 kHz, then compared feature locations with feature locations of 10 kHz sampled photoplethysmogram. Features were upper and lower peak of photplethysmography waveform. Result showed that time differences were dramatically decreased by interpolation. Location error was lesser than 1 ms in both feature types. In 10 Hz sampled cases, location error was also deceased a lot, however, they were still over 10 ms.

Keywords: peak detection, photoplethysmography, sampling, signal reconstruction

Procedia PDF Downloads 339
2484 Efficient Human Motion Detection Feature Set by Using Local Phase Quantization Method

Authors: Arwa Alzughaibi

Abstract:

Human Motion detection is a challenging task due to a number of factors including variable appearance, posture and a wide range of illumination conditions and background. So, the first need of such a model is a reliable feature set that can discriminate between a human and a non-human form with a fair amount of confidence even under difficult conditions. By having richer representations, the classification task becomes easier and improved results can be achieved. The Aim of this paper is to investigate the reliable and accurate human motion detection models that are able to detect the human motions accurately under varying illumination levels and backgrounds. Different sets of features are tried and tested including Histogram of Oriented Gradients (HOG), Deformable Parts Model (DPM), Local Decorrelated Channel Feature (LDCF) and Aggregate Channel Feature (ACF). However, we propose an efficient and reliable human motion detection approach by combining Histogram of oriented gradients (HOG) and local phase quantization (LPQ) as the feature set, and implementing search pruning algorithm based on optical flow to reduce the number of false positive. Experimental results show the effectiveness of combining local phase quantization descriptor and the histogram of gradient to perform perfectly well for a large range of illumination conditions and backgrounds than the state-of-the-art human detectors. Areaunder th ROC Curve (AUC) of the proposed method achieved 0.781 for UCF dataset and 0.826 for CDW dataset which indicates that it performs comparably better than HOG, DPM, LDCF and ACF methods.

Keywords: human motion detection, histograms of oriented gradient, local phase quantization, local phase quantization

Procedia PDF Downloads 230
2483 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

By the evolvement in technology, the way of expressing opinions switched the direction to the digital world. The domain of politics as one of the hottest topics of opinion mining research merged together with the behavior analysis for affiliation determination in text which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 are constituted by Linguistic Inquiry and Word Count (LIWC) features are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers when used with SVM and IGR feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “function” as an aggregate feature of the linguistic category, is obtained as the most differentiating feature among the 68 features with 81% accuracy by itself in classifying articles either as Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 354
2482 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 311
2481 Applications of Probabilistic Interpolation via Orthogonal Matrices

Authors: Dariusz Jacek Jakóbczak

Abstract:

Mathematics and computer science are interested in methods of 2D curve interpolation and extrapolation using the set of key points (knots). A proposed method of Hurwitz- Radon Matrices (MHR) is such a method. This novel method is based on the family of Hurwitz-Radon (HR) matrices which possess columns composed of orthogonal vectors. Two-dimensional curve is interpolated via different functions as probability distribution functions: polynomial, sinus, cosine, tangent, cotangent, logarithm, exponent, arcsin, arccos, arctan, arcctg or power function, also inverse functions. It is shown how to build the orthogonal matrix operator and how to use it in a process of curve reconstruction.

Keywords: 2D data interpolation, hurwitz-radon matrices, MHR method, probabilistic modeling, curve extrapolation

Procedia PDF Downloads 498
2480 The Growth Curve of Gompertz Model in Body Weight of Slovak Mixed-Sex Goose Breeds

Authors: Cyril Hrncar, Jozef Bujko, Widya P. B. Putra

Abstract:

The growth curve of poultry is important to evaluate the farming management system. This study was aimed to estimate the growth curve of body weight in goose. The growth curve in this study was estimated with non-linear Gompertz model through CurveExpert 1.4. software. Three Slovak mixed-sex goose breeds of Landes (L), Pomeranian (P) and Steinbacher (S) were used in this study. Total of 28 geese (10 L, 8 P and 10 S) were used to estimate the growth curve. Research showed that the asymptotic weight (A) in those geese were reached of 5332.51 g (L), 6186.14 g (P) and 5048.27 g (S). Thus, the maturing rate (k) in each breed were similar (0.05 g/day). The weight of inflection was reached of 1960.48 g (L), 2274.32 g (P) and 1855.98 g (S). The time of inflection (ti) was reached of 25.6 days (L), 26.2 days (P) and 27.80 days (S). The maximum growth rate (MGR) was reached of 98.02 g/day (L), 113.72 g/day (P) and 92.80 g/day (S). Hence, the coefficient of determination (R2) in Gompertz model was 0.99 for each breed. It can be concluded that Pomeranian geese had highest of growth trait than the other breeds.

Keywords: body weight, growth curve, inflection, Slovak geese, Gompertz model

Procedia PDF Downloads 114
2479 Modelling the Indonesian Goverment Securities Yield Curve Using Nelson-Siegel-Svensson and Support Vector Regression

Authors: Jamilatuzzahro, Rezzy Eko Caraka

Abstract:

The yield curve is the plot of the yield to maturity of zero-coupon bonds against maturity. In practice, the yield curve is not observed but must be extracted from observed bond prices for a set of (usually) incomplete maturities. There exist many methodologies and theory to analyze of yield curve. We use two methods (the Nelson-Siegel Method, the Svensson Method, and the SVR method) in order to construct and compare our zero-coupon yield curves. The objectives of this research were: (i) to study the adequacy of NSS model and SVR to Indonesian government bonds data, (ii) to choose the best optimization or estimation method for NSS model and SVR. To obtain that objective, this research was done by the following steps: data preparation, cleaning or filtering data, modeling, and model evaluation.

Keywords: support vector regression, Nelson-Siegel-Svensson, yield curve, Indonesian government

Procedia PDF Downloads 217
2478 Improved Qualitative Modeling of the Magnetization Curve B(H) of the Ferromagnetic Materials for a Transformer Used in the Power Supply for Magnetron

Authors: M. Bassoui, M. Ferfra, M. Chrayagne

Abstract:

This paper presents a qualitative modeling for the nonlinear B-H curve of the saturable magnetic materials for a transformer with shunts used in the power supply for the magnetron. This power supply is composed of a single phase leakage flux transformer supplying a cell composed of a capacitor and a diode, which double the voltage and stabilize the current, and a single magnetron at the output of the cell. A procedure consisting of a fuzzy clustering method and a rule processing algorithm is then employed for processing the constructed fuzzy modeling rules to extract the qualitative properties of the curve.

Keywords: B(H) curve, fuzzy clustering, magnetron, power supply

Procedia PDF Downloads 206
2477 K-Means Clustering-Based Infinite Feature Selection Method

Authors: Seyyedeh Faezeh Hassani Ziabari, Sadegh Eskandari, Maziar Salahi

Abstract:

Infinite Feature Selection (IFS) algorithm is an efficient feature selection algorithm that selects a subset of features of all sizes (including infinity). In this paper, we present an improved version of it, called clustering IFS (CIFS), by clustering the dataset in advance. To do so, first, we apply the K-means algorithm to cluster the dataset, then we apply IFS. In the CIFS method, the spatial and temporal complexities are reduced compared to the IFS method. Experimental results on 6 datasets show the superiority of CIFS compared to IFS in terms of accuracy, running time, and memory consumption.

Keywords: feature selection, infinite feature selection, clustering, graph

Procedia PDF Downloads 96
2476 Feature Evaluation Based on Random Subspace and Multiple-K Ensemble

Authors: Jaehong Yu, Seoung Bum Kim

Abstract:

Clustering analysis can facilitate the extraction of intrinsic patterns in a dataset and reveal its natural groupings without requiring class information. For effective clustering analysis in high dimensional datasets, unsupervised dimensionality reduction is an important task. Unsupervised dimensionality reduction can generally be achieved by feature extraction or feature selection. In many situations, feature selection methods are more appropriate than feature extraction methods because of their clear interpretation with respect to the original features. The unsupervised feature selection can be categorized as feature subset selection and feature ranking method, and we focused on unsupervised feature ranking methods which evaluate the features based on their importance scores. Recently, several unsupervised feature ranking methods were developed based on ensemble approaches to achieve their higher accuracy and stability. However, most of the ensemble-based feature ranking methods require the true number of clusters. Furthermore, these algorithms evaluate the feature importance depending on the ensemble clustering solution, and they produce undesirable evaluation results if the clustering solutions are inaccurate. To address these limitations, we proposed an ensemble-based feature ranking method with random subspace and multiple-k ensemble (FRRM). The proposed FRRM algorithm evaluates the importance of each feature with the random subspace ensemble, and all evaluation results are combined with the ensemble importance scores. Moreover, FRRM does not require the determination of the true number of clusters in advance through the use of the multiple-k ensemble idea. Experiments on various benchmark datasets were conducted to examine the properties of the proposed FRRM algorithm and to compare its performance with that of existing feature ranking methods. The experimental results demonstrated that the proposed FRRM outperformed the competitors.

Keywords: clustering analysis, multiple-k ensemble, random subspace-based feature evaluation, unsupervised feature ranking

Procedia PDF Downloads 308