Search results for: curse of dimensionality
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 137

137 A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work, the efficiency of completely nonparametric regression estimators such as Loess is compared to that of estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regard to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criterion is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when the nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.

Keywords: additive model, nonparametric regression, variable selection, Akaike Information Criteria

Procedia PDF Downloads 242
136 Enhanced Image Representation for Deep Belief Network Classification of Hyperspectral Images

Authors: Khitem Amiri, Mohamed Farah

Abstract:

Image classification is a challenging task that is attracting considerable interest because it helps us understand the content of images. Recently, Deep Learning (DL) based methods have given very interesting results on several benchmarks. For hyperspectral images (HSI), the application of DL techniques is still challenging due to the scarcity of labeled data and the curse of dimensionality. Among other approaches, Deep Belief Network (DBN) based approaches have given fair classification accuracy. In this paper, we address the curse of dimensionality by reducing the number of bands and replacing the HSI channels with channels representing radiometric indices. Instead of using all the HSI bands, we compute radiometric indices such as NDVI (Normalized Difference Vegetation Index) and NDWI (Normalized Difference Water Index), and we use the combination of these indices as input to the DBN-based classification model. Thus, we keep almost all the pertinent spectral information while considerably reducing the size of the image. To test our image representation, we applied our method to several HSI datasets, including the Indian Pines and Jasper Ridge datasets, and it gave results comparable to state-of-the-art methods while considerably reducing training and testing time.
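
As a rough illustration of the band-to-index replacement described in this abstract, the sketch below computes two normalized-difference indices for a single pixel. It is not the authors' code. The band names (`green`, `red`, `nir`), the sample reflectance values, and the helper names are assumptions made for the example, and NDWI is taken in its green/near-infrared form, which is only one of the definitions in use.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """NDWI in its green/NIR form (assumed here): (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


def index_features(pixel):
    """Map a dict of band reflectances to a short vector of indices."""
    return [ndvi(pixel["nir"], pixel["red"]), ndwi(pixel["green"], pixel["nir"])]


# Hypothetical reflectances for one vegetated pixel.
pixel = {"green": 0.10, "red": 0.08, "nir": 0.40}
features = index_features(pixel)
```

Each pixel is then described by a handful of index values rather than by every spectral band, which is the source of the size reduction the abstract reports.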

Keywords: hyperspectral images, deep belief network, radiometric indices, image classification

Procedia PDF Downloads 245
135 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 285
134 Oil and Development: The Case of Kuwait

Authors: Abdulaziz Abdulrahman Albahar

Abstract:

This paper aims to answer two questions: is oil, as a natural resource with all the wealth that it brings, an economic burden? And how can the resource curse be mitigated in such oil-dependent nations? The case of Kuwait is used as an example. The paper begins with an introduction to the resource curse and the Kuwaiti economy in general. It then examines whether the curse exists in the case of Kuwait. The analysis section explores how the economy depends on oil and how oil becomes more of a burden when it is mismanaged. To address how a resource curse can be mitigated, the case of Norway is explored. The results show that oil rentals affect the Kuwaiti economy via two main channels: government spending, which is mainly financed by oil rentals, and the export of oil-based products. The surprising result was that government spending had a negative impact on GDP (gross domestic product) growth when oil rentals were instrumented on government expenditure; this is due to rent seeking, in which government spending in Kuwait finances things such as stimulus packages and increases in nominal wages. Yet, when comparing the magnitude of both oil exportation and government spending, the latter has a stronger effect on GDP growth than the former. A resource curse does not seem to exist in the case of Kuwait; however, the characteristics of a curse do show in the form of rent seeking in the political sphere and the disruption of traditional sectors such as the pearl trade and fishing markets. Yet a curse does not show, because the currency of the nation is very stable and has not experienced any appreciation owing to the fixed exchange rate system. Moreover, even if we cannot say that a curse exists, it is clear that the Kuwaiti economy is heading towards one. Whether or not it faces a resource curse will depend on how judicious the nation is in exploiting its sovereign wealth fund and implementing diversification strategies to become less oil dependent, such as the vision “New Kuwait-2035”, which has been underway since 2017.

Keywords: economic development, Kuwait, oil curse, dutch disease

Procedia PDF Downloads 50
133 A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning

Authors: Samina Khalid, Shamila Nasreen

Abstract:

Dimensionality reduction as a preprocessing step to machine learning is effective in removing irrelevant and redundant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection and feature extraction methods with respect to efficiency and effectiveness. In the field of machine learning and pattern recognition, dimensionality reduction is an important area in which many approaches have been proposed. In this paper, some widely used feature selection and feature extraction techniques are analyzed to determine how effectively they can be used to achieve high performance of learning algorithms, which ultimately improves the predictive accuracy of classifiers. The paper briefly analyzes these dimensionality reduction techniques in order to investigate the strengths and weaknesses of some widely used methods.

Keywords: age related macular degeneration, feature selection, feature subset selection, feature extraction/transformation, FSAs, Relief, correlation based method, PCA, ICA

Procedia PDF Downloads 458
132 Oil Revenues Anticipation, Global Entanglements and Indigenous Rights: Negotiating a Potential Resource Curse in Uganda

Authors: Nsubuga Bright Titus

Abstract:

The resource curse is an unavoidable phenomenon among oil producing states in Africa. There is currently no oil production in Uganda, although exploration projections set 2020 as the year of initial production. But as exploration proceeds and Production Sharing Agreements (PSA) are negotiated, the anticipation of oil revenues grows. The Indigenous people of Bunyoro are claiming the right to their indigenous lands through the African Commission on Human and People’s Rights (ACHPR) of the African Union. They urge the commission to investigate the government of Uganda for violations of their human rights. In this paper, oil as a resource curse is examined through the Dutch disease. Regional and global entanglements, as well as the contestation between the indigenous Bunyoro group and the oil industry in Uganda, are explored. The paper also demonstrates that oil as a local possibility and national reality has propelled anxiety about oil revenues among various local actors, state actors, and regional and global actors.

Keywords: Entanglements, Extractive resources, Framing, web of relations

Procedia PDF Downloads 76
131 An Adaptive Dimensionality Reduction Approach for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah, Basel Solaiman

Abstract:

With the development of HyperSpectral Imagery (HSI) technology, the spectral resolution of HSI has become denser, which has resulted in a large number of spectral bands, high correlation between neighboring bands, and high data redundancy. As a result, semantic interpretation is a challenging task for HSI analysis due to the high dimensionality and the high correlation of the different spectral bands. This work presents a dimensionality reduction approach that overcomes these issues and improves the semantic interpretation of HSI. First, in order to preserve the spatial information, the Tensor Locality Preserving Projection (TLPP) is applied to transform the original HSI. In the second step, knowledge is extracted based on the adjacency graph to describe the different pixels. Based on the transformation matrix obtained with TLPP, a weighted matrix is constructed to rank the different spectral bands by their contribution score. The relevant bands are then adaptively selected based on the weighted matrix. The performance of the presented approach has been validated through several experiments, and the results demonstrate the efficiency of this approach compared to various existing dimensionality reduction techniques. According to the experimental results, we can also conclude that this approach can adaptively select the relevant spectral bands, improving the semantic interpretation of HSI.

Keywords: band selection, dimensionality reduction, feature extraction, hyperspectral imagery, semantic interpretation

Procedia PDF Downloads 330
130 Modelling Causal Effects from Complex Longitudinal Data via Point Effects of Treatments

Authors: Xiaoqin Wang, Li Yin

Abstract:

Background and purpose: In many practices, one estimates causal effects arising from a complex stochastic process, where a sequence of treatments is assigned to influence a certain outcome of interest, and there exist time-dependent covariates between treatments. When covariates are plentiful and/or continuous, statistical modeling is needed to reduce the huge dimensionality of the problem and allow for the estimation of causal effects. Recently, Wang and Yin (Annals of Statistics, 2020) derived a new general formula, which expresses these causal effects in terms of the point effects of treatments in single-point causal inference. As a result, it is possible to conduct the modeling via point effects. The purpose of the work is to study the modeling of these causal effects via point effects. Challenges and solutions: The time-dependent covariates are often influenced by earlier treatments and in turn influence subsequent treatments. Consequently, the standard parameters (i.e., the means of the outcome given all treatments and covariates) are essentially all different (null paradox). Furthermore, the dimension of the parameters is huge (curse of dimensionality). Therefore, it can be difficult to conduct the modeling in terms of standard parameters. Instead of standard parameters, we use point effects of treatments to develop a likelihood-based parametric approach to the modeling of these causal effects, and we are able to model the causal effects of a sequence of treatments by modeling a small number of point effects of individual treatments. Achievements: We are able to conduct the modeling of the causal effects from a sequence of treatments in the familiar framework of single-point causal inference. The simulation shows that our method achieves not only an unbiased estimate for the causal effect but also the nominal level of type I error and a low level of type II error for the hypothesis testing. We have applied this method to a longitudinal study of COVID-19 mortality among Scandinavian countries and found that the Swedish approach performed far worse than the other countries' approaches for COVID-19 mortality and that the poor performance was largely due to its early measures during the initial period of the pandemic.

Keywords: causal effect, point effect, statistical modelling, sequential causal inference

Procedia PDF Downloads 175
129 Examining Relationship between Resource-Curse and Under-Five Mortality in Resource-Rich Countries

Authors: Aytakin Huseynli

Abstract:

The paper reports findings of a study that examined the under-five mortality rate among resource-rich countries. Typically, when countries obtain wealth, citizens gain increased wellbeing. Societies with new wealth create equal opportunities for everyone, including vulnerable groups. But scholars claim that this is not the case for developing resource-rich countries, and that natural resources become a curse for them rather than a blessing. Spillovers from the natural resource curse negatively affect the social wellbeing of vulnerable people. They get excluded from mainstream society, and their situation becomes tangible. In order to test this hypothesis, the study compared the under-five mortality rate among resource-rich countries using an independent-samples one-way ANOVA. The data on the under-five mortality rate came from the World Bank. The natural resources for this study are oil, gas and minerals. The list of 67 resource-rich countries was taken from the Natural Resource Governance Institute. The sample was divided into four groups (low, lower-middle, upper-middle and high-income countries) based on the income classification of the World Bank. Results revealed a significant difference in the under-five mortality rate among low, lower-middle, upper-middle and high-income countries (F(3, 29.01) = 33.70, p < .001). To find out the difference among income groups, the Games-Howell test was performed, and it was found that infant mortality was an issue for low, lower-middle and upper-middle income countries but not for high-income countries. Results of this study are in agreement with previous research on the resource curse and the negative effects of resource-based development. Policy implications of the study for social workers, policy makers, academicians and social development specialists are to raise and discuss issues of marginalization and exclusion of vulnerable groups in developing resource-rich countries and to suggest interventions for avoiding them.

Keywords: children, natural resource, extractive industries, resource-based development, vulnerable groups

Procedia PDF Downloads 234
128 Using Confirmatory Factor Analysis to Test the Dimensional Structure of Tourism Service Quality

Authors: Ibrahim A. Elshaer, Alaa M. Shaker

Abstract:

Several previous empirical studies have operationalized service quality as either a multidimensional or a unidimensional construct. While a few earlier studies investigated some practices of the assumed dimensional structure of service quality, no study has been found to have tested the construct’s dimensionality using confirmatory factor analysis (CFA). To gain better insight into the dimensional structure of the service quality construct, this paper tests its dimensionality using three CFA models (higher order factor model, oblique factor model, and one factor model) on a set of data collected from 390 British tourists who visited Egypt. The results of the three test models indicate that the service quality construct is multidimensional. This result helps resolve the problems that might arise from the lack of clarity concerning the dimensional structure of service quality, as without testing the dimensional structure of a measure, researchers cannot assume that a significant correlation is a result of factors measuring the same construct.

Keywords: service quality, dimensionality, confirmatory factor analysis, Egypt

Procedia PDF Downloads 558
127 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Authors: Sanaa Chafik, Imane Daoudi, Mounim A. El Yacoubi, Hamid El Ouardi

Abstract:

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving the nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH and has been successfully applied in many multimedia applications. However, Euclidean LSH has limitations that affect structure and query performance. Its main limitation is large memory consumption: in order to achieve good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm that overcomes the storage space problem and improves query time, while keeping accuracy similar to that achieved by the original Euclidean LSH. Experimental results on a real large-scale dataset show that the proposed approach achieves good performance and consumes less memory than Euclidean LSH.
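
For readers unfamiliar with the baseline this abstract refers to, here is a minimal sketch of the standard Euclidean (p-stable) LSH hash, h(v) = floor((a · v + b) / w), with Gaussian `a` and uniform `b`. It illustrates the original scheme only, not the SC-LSH method proposed in the paper. The function names and the values of `dim`, `k` and `w` are hypothetical.

```python
import math
import random


def make_hash(dim, w, rng):
    """Build one hash h(v) = floor((a . v + b) / w) with random a and b."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # Gaussian projection vector
    b = rng.uniform(0.0, w)                        # random offset within one bucket width

    def h(v):
        projection = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((projection + b) / w)

    return h


def make_table_key(dim, k, w, rng):
    """Concatenate k hashes into the bucket key used by one hash table."""
    hashes = [make_hash(dim, w, rng) for _ in range(k)]
    return lambda v: tuple(h(v) for h in hashes)


rng = random.Random(0)  # fixed seed so the example is repeatable
key = make_table_key(dim=8, k=4, w=4.0, rng=rng)
p = [0.5] * 8
q = [0.5] * 8
bucket_p = key(p)
bucket_q = key(q)
```

In this scheme every additional table needs its own `key` function and its own copy of the bucket index, which is why memory grows with the number of tables needed for good accuracy.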

Keywords: approximate nearest neighbor search, content based image retrieval (CBIR), curse of dimensionality, locality sensitive hashing, multidimensional indexing, scalability

Procedia PDF Downloads 302
126 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies such as microarray sequencing have been proposed to generate high-resolution genetic data, in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension, reaching into the thousands, which hinders all attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with was generated for the identification of autism. It includes 142 samples, which is small compared to the large dimension of the data. Classifier systems trained on this data yield very low classification rates that are almost equivalent to a guess. We aim to reduce the data dimension and improve the data for classification. Here, we experiment with applying a multistage PCA to the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates, which increases the possibility of building an automated system for autism detection.

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 533
125 Solving Dimensionality Problem and Finding Statistical Constructs on Latent Regression Models: A Novel Methodology with Real Data Application

Authors: Sergio Paez Moncaleano, Alvaro Mauricio Montenegro

Abstract:

This paper presents a novel statistical methodology for measuring and finding constructs in Latent Regression Analysis. This approach uses the qualities of Factor Analysis in binary data with interpretations based on Item Response Theory (IRT). In addition, based on the fundamentals of submodel theory and drawing on many ideas from IRT, we propose an algorithm that not only solves the dimensionality problem (still an open discussion) but also opens a new research field that promises fairer and more realistic qualifications for examinees and a revolution in IRT and educational research. Finally, the methodology is applied to a real data set, with impressive results in terms of coherence, speed and precision. Acknowledgments: This research was financed by Colciencias through the project 'Multidimensional Item Response Theory Models for Practical Application in Large Test Designed to Measure Multiple Constructs', and both authors belong to the SICS Research Group at Universidad Nacional de Colombia.

Keywords: item response theory, dimensionality, submodel theory, factorial analysis

Procedia PDF Downloads 342
124 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye

Abstract:

When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is either chosen ad hoc or calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language processing, in genomics, and especially in genome sequence processing, there is no notion of a “word”; rather, there are sequence substrings of length k called k-mers. K-mer sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and the embedding dimension. This is done by studying the scaling capability of three embedding algorithms, namely Latent Semantic Analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurately computing embeddings as both the k-mer size and the vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.
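
To make the notion of a k-mer concrete, the short sketch below tokenizes a sequence into overlapping substrings of length k. The stride of one, the function name, and the example sequence are assumptions for illustration; the abstract does not say how the authors tokenize their data.

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k (stride 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


tokens = kmers("ATGCGA", 3)  # ['ATG', 'TGC', 'GCG', 'CGA']

# Over the four-letter DNA alphabet there are at most 4**k distinct k-mers,
# so the vocabulary an embedding has to cover grows quickly with k.
vocabulary_bound = 4 ** 3
```

These tokens play the role that words play in NLP when a sequence is passed to a word-embedding algorithm.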

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 102
123 The Spectral Power Amplification on the Regular Lattices

Authors: Kotbi Lakhdar, Hachi Mostefa

Abstract:

We show that a simple transformation between the regular lattices (the square, the triangular, and the honeycomb) belonging to the same dimensionality can explain in a natural way the universality of the critical exponents found in phase transitions and critical phenomena. It suffices that the Hamiltonian and the lattice have similar written forms. In addition, it appears that if a property can be calculated for a given lattice, then it can be extrapolated simply to any other lattice belonging to the same dimensionality. In this study, we have restricted ourselves to the spectral power amplification (SPA). We note that the SPA does not affect the critical exponents but does affect the critical temperature of the lattice; the generalisation to other lattices could be shown according to the containment principle.

Keywords: ising model, phase transitions, critical temperature, critical exponent, spectral power amplification

Procedia PDF Downloads 282
122 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn

Abstract:

Feature selection and attribute reduction are crucial problems and widely used techniques in the fields of machine learning, data mining and pattern recognition for overcoming the well-known phenomenon of the Curse of Dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRDD) of the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a modified version of a binary shuffled frog leaping algorithm (B-SFLA) is proposed. The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers have been employed to compare the proposed approach with several existing methods over twenty-two datasets, including nine high dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.

Keywords: binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct

Procedia PDF Downloads 186
121 Spatial Rank-Based High-Dimensional Monitoring through Random Projection

Authors: Chen Zhang, Nan Chen

Abstract:

High-dimensional process monitoring is becoming increasingly important in many application domains, where the process distribution is usually unknown and much more complicated than the normal distribution, and the between-stream correlation cannot be neglected. However, since the process dimension is generally much bigger than the reference sample size, most traditional nonparametric multivariate control charts fail in high-dimensional cases due to the curse of dimensionality. Furthermore, when the process goes out of control, the influenced variables are quite sparse compared with the whole dimension, which increases the detection difficulty. Targeting these issues, this paper proposes a new nonparametric monitoring scheme for high-dimensional processes. This scheme first projects the high-dimensional process into several subprocesses using random projections for dimension reduction. Then, for every subprocess with a dimension much smaller than the reference sample size, a local nonparametric control chart is constructed based on the spatial rank test to detect changes in this subprocess. Finally, the results of all the local charts are fused together for decision. Furthermore, after an out-of-control (OC) alarm is triggered, a diagnostic framework is proposed using the square-root LASSO. Numerical studies demonstrate that the chart has satisfactory detection power for sparse OC changes and robust performance for non-normally distributed data. The diagnostic framework is also effective at identifying truly changed variables. Finally, a real-data example is presented to demonstrate the application of the proposed method.

Keywords: random projection, high-dimensional process control, spatial rank, sequential change detection

Procedia PDF Downloads 275
120 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) so that the classifier performs better than it would without feature selection. Although there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies examine the effect of different feature selection algorithms on the classification performance of different classifiers over different types of datasets. In this paper, two widely used algorithms, the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. Three well-known classifiers are then constructed: the CART decision tree (DT), the multi-layer perceptron (MLP) neural network, and the support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.
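
Information gain, one of the two selectors compared in this abstract, scores a feature by how much knowing its value reduces the entropy of the class label. The sketch below shows that textbook definition for a discrete feature. It is not the authors' implementation, and the toy labels and feature values are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the expected entropy of labels given the feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: one feature that separates the classes and one that does not.
labels = ["a", "a", "b", "b"]
informative = ["x", "x", "y", "y"]
uninformative = ["x", "y", "x", "y"]

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

A filter-style selector would rank all features by this score and keep the highest-ranked ones before training the classifier.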

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 635
119 The Impact of Natural Resources on Financial Development: The Global Perspective

Authors: Remy Jonkam Oben

Abstract:

Using a time series approach, this study investigates how natural resources impact financial development from a global perspective over the 1980-2019 period. Some important determinants of financial development (economic growth, trade openness, population growth, and investment) have been added to the model as control variables. Unit root tests have revealed that all the variables are integrated of order one. Johansen's cointegration test has shown that the variables are in a long-run equilibrium relationship. The vector error correction model (VECM) has estimated the coefficient of the error correction term (ECT), which suggests that the short-run values of natural resources, economic growth, trade openness, population growth, and investment contribute to financial development converging to its long-run equilibrium level at an annual speed of adjustment of 23.63%. The estimated coefficients suggest that global natural resource rent has a statistically significant negative impact on global financial development in the long run (thereby validating the financial resource curse) but not in the short run. Causality test results imply that neither global natural resource rent nor global financial development Granger-causes the other.

Keywords: financial development, natural resources, resource curse hypothesis, time series analysis, Granger causality, global perspective

Procedia PDF Downloads 122
118 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In the field of machine learning, the ensemble has been employed as a common methodology to improve upon the performance of multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called the “curse of correlation”, which takes the form of strong interference among the predictions produced by the base classifiers. In addition, existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis of our experimental results, we conclude that the two problems are caused by some inherent deficiencies in the consensus approach. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ the well-known benchmark data set NSL-KDD (the improved version of the KDDCup99 dataset produced by the University of New Brunswick) to compare the proposed algorithm with 8 common ensemble algorithms. In particular, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in the improvements in accuracy and reliability over the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is shown to be a more effective ensemble solution than the traditional consensus approach, outperforming the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score and area under the receiver operating characteristic curve.

Keywords: consensus, curse of correlation, imbalance classification, rank-based chain-mode ensemble

Procedia PDF Downloads 108
117 Hierarchical Piecewise Linear Representation of Time Series Data

Authors: Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and the general shape of the time series. The representation permits coarse-to-fine processing at different levels of detail, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.
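The recursive splitting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' HPLA implementation: the function names are hypothetical, the "best point" is assumed to be the sample with the largest vertical deviation from the chord, and the initial choice of perceptually important points is omitted.

```python
import numpy as np

def chord_error(y, lo, hi):
    """Return (max abs deviation, its index) of y[lo..hi] from the straight chord."""
    x = np.arange(lo, hi + 1)
    line = y[lo] + (y[hi] - y[lo]) * (x - lo) / (hi - lo)
    dev = np.abs(y[lo:hi + 1] - line)
    k = int(np.argmax(dev))
    return float(dev[k]), lo + k

def split_segment(y, lo, hi, tol):
    """Recursively split [lo, hi] until each chord fits within tol.

    Returns the sorted list of break-point indices, endpoints included.
    """
    if hi - lo < 2:
        return [lo, hi]
    err, k = chord_error(y, lo, hi)
    if err <= tol:
        return [lo, hi]
    if k == lo or k == hi:
        # Assumed fallback: use the midpoint when the best point is an endpoint.
        k = (lo + hi) // 2
    left = split_segment(y, lo, k, tol)
    right = split_segment(y, k, hi, tol)
    return left[:-1] + right  # drop the duplicated shared break point

# Toy triangular series (made-up data for illustration only).
series = np.array([0, 1, 2, 3, 2, 1, 0], dtype=float)
breaks = split_segment(series, 0, len(series) - 1, tol=0.1)
```

Lowering `tol` yields more break points and a finer approximation, which is one way to read the coarse-to-fine levels mentioned in the abstract.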

Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation

Procedia PDF Downloads 246
116 A Local Invariant Generalized Hough Transform Method for Integrated Circuit Visual Positioning

Authors: Wei Feilong

Abstract:

In this study, a local invariant generalized Hough transform (LI-GHT) method is proposed for integrated circuit (IC) visual positioning. The original generalized Hough transform (GHT) is robust to external noise; however, it is not suitable for visual positioning of IC chips because its four-dimensional (4D) parameter space leads to substantial storage requirements and high computational complexity. The proposed LI-GHT method reduces the dimensionality of the parameter space to 2D thanks to the rotational invariance of local invariant geometric features, and it can estimate the accurate position and rotation angle of IC chips in real time under the influence of noise and blur. The experimental results show that the proposed LI-GHT can estimate the position and rotation angle of IC chips with high accuracy and speed. The proposed LI-GHT algorithm was implemented in the IC visual positioning system of radio frequency identification (RFID) packaging equipment.

Keywords: Integrated Circuit Visual Positioning, Generalized Hough Transform, Local invariant Generalized Hough Transform, IC packing equipment

Procedia PDF Downloads 244
115 Novel Recommender Systems Using Hybrid CF and Social Network Information

Authors: Kyoung-Jae Kim

Abstract:

Collaborative Filtering (CF) is a popular technique for personalization in the E-commerce domain to reduce information overload. In general, CF provides a list of recommended items based on other similar users’ preferences from the user-item matrix and uses them to predict the focal user’s preference for particular items. Many real-world recommender systems use CF techniques because of their excellent accuracy and robustness. However, CF has some limitations, including sparsity problems and complex dimensionality in the user-item matrix. In addition, traditional CF does not consider the emotional interaction between users. In this study, we propose recommender systems using social network information and singular value decomposition (SVD) to alleviate some of these limitations. The purpose of this study is to reduce the dimensionality of the data set using SVD and to improve the performance of CF by using emotional information from the social network data of the focal user. We test the usability of the hybrid model combining CF, SVD and social network information using real-world data. The experimental results show that the proposed model outperforms conventional CF models.
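As a rough illustration of how SVD reduces the dimensionality of a user-item matrix, the sketch below keeps only the top `k` singular directions. The ratings matrix is made up, the function name is hypothetical, and treating unrated items as zeros is a simplifying assumption; none of this is taken from the paper's model.

```python
import numpy as np

def truncated_svd(ratings, k):
    """Return rank-k user and item factors of a dense user-item matrix."""
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    user_factors = U[:, :k] * s[:k]   # each user described by k numbers
    item_factors = Vt[:k, :]          # each item described by k numbers
    return user_factors, item_factors

# Toy 4 users x 4 items matrix; zeros stand in for unrated items (assumption).
ratings = np.array([[5, 4, 0, 1],
                    [4, 5, 1, 0],
                    [1, 0, 5, 4],
                    [0, 1, 4, 5]], dtype=float)

user_factors, item_factors = truncated_svd(ratings, k=2)
approx = user_factors @ item_factors  # low-rank estimate of the ratings
```

Similarity between users can then be computed on the `k`-dimensional rows of `user_factors` rather than on the full rows of the original matrix.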

Keywords: recommender systems, collaborative filtering, social network information, singular value decomposition

Procedia PDF Downloads 258
114 The Curse of Natural Resources: An Empirical Analysis Applied to the Case of Copper Mining in Zambia

Authors: Chomba Kalunga

Abstract:

Many developing countries have a rich endowment of natural resources. Yet, amidst that wealth, living standards remain poor. At the same time, international markets have seen a surge in copper prices over the last twenty years. This paper presents findings on the causal economic impact of copper mines on living standards in Zambia, a country in sub-Saharan Africa endowed with vast copper deposits, using household data from 1996 to 2010 and exploiting an episode in which copper prices on the international market were rising. Using an Instrumental Variable approach and controlling for constituency-level and microeconomic factors, the results show a significant impact of copper production on living standards. After splitting the constituencies into those close to and those far away from the nearest mine, the results document that constituencies close to the mines benefited significantly from the increase in copper production, compared to their counterparts, through increased levels of employment. Finally, the results are not consistent with the natural resource curse hypothesis; findings show a positive causal relationship between the presence of natural resources and socioeconomic outcomes in less developed countries, particularly for constituencies close to the mines in Zambia. Some key policy implications follow from the findings. The finding that increased copper production led to an increase in employment suggests that, in Zambia’s context, policies that promote local employment may be more beneficial to residents. This means that government policies can help improve living standards, and that the government needs to work towards making this impact more substantial.

Keywords: copper prices, local development, mining, natural resources

Procedia PDF Downloads 188
113 Hydrothermally Fabricated 3-D Nanostructure Metal Oxide Sensors

Authors: Mohammad Alenezi

Abstract:

Hierarchical nanostructures with higher dimensionality, consisting of nanostructure building blocks such as nanowires, nanotubes, or nanosheets, are very attractive. They offer valuable properties such as a high surface-to-volume ratio and well-ordered porous structures, which can be very challenging to attain with mono-morphological nanostructures. Well-ordered hierarchical nanostructures with high surface-to-volume ratios facilitate gas diffusion into their surfaces as well as scattering of light. Therefore, hierarchical nanostructures are expected to perform well as gas sensors. A multistage controlled hydrothermal synthesis method to fabricate a high-performance single ZnO brush-like hierarchical nanostructure gas sensor from initial nanowires is reported. The performance of the sensor based on the brush-like hierarchical nanostructure is analyzed and compared to that of a nanowire gas sensor. The hierarchical gas sensor demonstrated high sensitivity toward low concentrations of acetone with a fast response. The enhancement in the hierarchical sensor performance is attributed to the increased surface-to-volume ratio, the reduction in dimensionality of the nanowire building blocks, the formation of junctions between the initial nanowire and the secondary nanowires, and enhanced gas diffusion into the surfaces of the hierarchical nanostructures.

Keywords: metal oxide, nanostructure, hydrothermal, sensor

Procedia PDF Downloads 243
112 Study of Thermal and Mechanical Properties of Ethylene/1-Octene Copolymer Based Nanocomposites

Authors: Sharmila Pradhan, Ralf Lach, George Michler, Jean Mark Saiter, Rameshwar Adhikari

Abstract:

Ethylene/1-octene copolymer (EOC) was modified by incorporating three types of nanofillers differing in their dimensionality in order to investigate the effect of filler dimensionality on mechanical properties such as tensile strength and microhardness. The samples were prepared by melt mixing followed by compression molding. The microstructure of the novel material was characterized by Fourier transform infrared spectroscopy (FTIR), X-ray diffraction (XRD) and transmission electron microscopy (TEM). Other important properties such as melting, crystallizing and thermal stability were also investigated via differential scanning calorimetry (DSC) and thermogravimetric analysis (TGA). The FTIR and XRD results showed that the composites were formed by physical mixing. The TEM results supported the homogeneous dispersion of nanofillers in the matrix. The mechanical characterization performed by tensile testing showed that the 1D nanofiller effectively reinforced the polymer. TGA results revealed that the thermal stability of pure EOC is marginally improved by the addition of nanofillers. Likewise, the melting and crystallizing properties of the composites are not much different from those of pure EOC.

Keywords: copolymer, differential scanning calorimetry, nanofiller, tensile strength

Procedia PDF Downloads 206
111 Music Genre Classification Based on Non-Negative Matrix Factorization Features

Authors: Soyon Kim, Edward Kim

Abstract:

In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. 
In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres and 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required a dimensionality of 460, while NMF-LSM and NMF-BFV each required a dimensionality of 10. Combining the basic features, NMF-LSM and NMF-BFV with an SVM using a radial basis function (RBF) kernel produced a significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.
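The two stages described here, learning NMF basis vectors in training and then extracting weighting values against a fixed basis at test time, can be sketched generically as follows. The abstract does not say which NMF algorithm was used; this sketch assumes the standard multiplicative updates for the squared-error objective, runs on synthetic non-negative data rather than audio features, and uses hypothetical function names.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (features x samples) as W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def nmf_weights(V, W, n_iter=200, seed=0, eps=1e-9):
    """Keep the basis W fixed and estimate only the non-negative weights H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Synthetic stand-in for training data: 20 features, 30 samples, true rank 2.
rng = np.random.default_rng(1)
V_train = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(V_train, rank=2)            # "training": learn the basis vectors

V_new = W @ rng.random((2, 5))         # unseen samples built from that basis
H_new = nmf_weights(V_new, W)          # "test": weights used as features
```

In the setting of the abstract, the columns of `H_new` would play the role of the low-dimensional NMF feature vectors passed to the classifier alongside the basic features.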

Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)

Procedia PDF Downloads 266
110 Globalisation, Growth and Sustainability in Sub-Saharan Africa

Authors: Ourvashi Bissoon

Abstract:

Sub-Saharan Africa, in addition to being resource rich, is increasingly seen as having huge growth potential and, as a result, is increasingly attracting MNEs to its soil. To empirically assess the effectiveness of GDP in tracking sustainable resource use and the role played by MNEs in Sub-Saharan Africa, a panel data analysis has been undertaken for 32 countries over thirty-five years. The time horizon spans the period 1980-2014 to reflect the evolution from before the publication of the pioneering Brundtland report on sustainable development to date. Multinationals’ presence is proxied by the level of FDI stocks. The empirical investigation first focuses on the impact of trade openness and MNE presence on the traditional measure of economic growth, namely the GDP growth rate, then on the genuine savings (GS) rate, a measure of weak sustainability developed by the World Bank which assumes substitutability between different forms of capital, and finally on the adjusted Net National Income (aNNI), a measure of green growth which accounts for the depletion of natural resources. For countries with significant exhaustible natural resources and important foreign investor presence, the adjusted net national income (aNNI) can be a better indicator of economic performance than GDP growth (World Bank, 2010). The issue of potential endogeneity and reverse causality is also addressed in addition to robustness tests. The findings indicate that FDI and openness contribute significantly and positively to the GDP growth of the countries in the sample; however, there is a threshold level of institutional quality below which FDI has a negative impact on growth. When the GDP growth rate is replaced by the GS rate, a natural resource curse becomes evident. The rents being generated from the exploitation of natural resources are not being re-invested into other forms of capital, namely human and physical capital. 
FDI and trade patterns may be setting the economies in the sample on an unsustainable path of resource depletion. The resource curse is confirmed when utilising the aNNI as well, thus implying that the GDP growth measure may not be a reliable measure of sustainable development.

Keywords: FDI, sustainable development, genuine savings, sub-Saharan Africa

Procedia PDF Downloads 187
109 Detection and Classification of Mammogram Images Using Principle Component Analysis and Lazy Classifiers

Authors: Rajkumar Kolangarakandy

Abstract:

Feature extraction and selection is the primary part of any mammogram classification algorithm. The choice of features, attributes or measurements has an important influence on any classification system. Discrete Wavelet Transformation (DWT) coefficients are one of the prominent features for representing images in the frequency domain. The features obtained after the decomposition of the mammogram images using wavelet transformations have a higher dimension. Even though the features are higher in dimension, they are highly correlated and redundant in nature. Dimensionality reduction techniques play an important role in selecting the optimum number of features from the higher-dimensional data, which are highly correlated. PCA is a mathematical tool that reduces the dimensionality of the data while retaining most of the variation in the dataset. In this paper, a multilevel classification of mammogram images using reduced discrete wavelet transformation coefficients and lazy classifiers is proposed. The classification is accomplished in two different levels. In the first level, mammogram ROIs extracted from the dataset are classified as normal or abnormal. In the second level, all the abnormal mammogram ROIs are classified as benign or malignant. A further classification is also accomplished based on the variation in structure and intensity distribution of the images in the dataset. The lazy classifiers Kstar, IBL and LWL are used for classification. The classification results obtained with the reduced feature set are highly promising, and the results are also compared with the performance obtained without dimension reduction.
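The statement that PCA reduces dimensionality while retaining most of the variation in correlated features can be illustrated with a small sketch. It uses synthetic correlated features rather than wavelet coefficients, and the helper name `pca` is hypothetical; it is not the pipeline used in the paper.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal components.

    Returns (scores, components, explained_variance_ratio).
    """
    Xc = X - X.mean(axis=0)                       # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # orthonormal directions
    ratio = (s ** 2) / np.sum(s ** 2)             # share of total variance
    return Xc @ components.T, components, ratio[:n_components]

# Three highly correlated synthetic features driven by one latent variable.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t,
                     2 * t + 0.01 * rng.normal(size=200),
                     -t + 0.01 * rng.normal(size=200)])

scores, components, ratio = pca(X, n_components=1)
```

Because the three columns are nearly redundant, a single component carries almost all of the variance, which is the situation the abstract describes for correlated wavelet features.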

Keywords: PCA, wavelet transformation, lazy classifiers, Kstar, IBL, LWL

Procedia PDF Downloads 314
108 Beyond Voluntary Corporate Social Responsibility: Examining the Impact of the New Mandatory Community Development Agreement in the Mining Sector of Sierra Leone

Authors: Wusu Conteh

Abstract:

Since the 1990s, neo-liberalization has become a global agenda. The free market ushered in an unprecedented drive by Multinational Corporations (MNCs) to secure mineral rights in resource-rich countries. Several governments in the Global South implemented a liberalized mining policy with support from the International Financial Institutions (IFIs). MNCs have maintained that voluntary Corporate Social Responsibility (CSR) has engendered socio-economic development in mining-affected communities. However, most resource-rich countries are struggling to transform the resources into sustainable socio-economic development. They are trapped in what has been widely described as the ‘resource curse.’ In an attempt to address this resource conundrum, the African Mining Vision (AMV) of 2009 developed a model on resource governance. The advent of the AMV has engendered the introduction of mandatory community development agreements (CDAs) into the legal framework of many countries in Africa. In 2009, Sierra Leone enacted the Mines and Minerals Act that obligates mining companies to invest in Primary Host Communities. The study employs interviews and field observation techniques to explicate the dynamics of the CDA program. A total of 25 respondents (government officials, NGOs/CSOs and community stakeholders) were interviewed. The study focuses on a case study of the Sierra Rutile CDA program in Sierra Leone. Extant scholarly works have extensively explored the resource curse and voluntary CSR. Few studies have examined the mandatory CDA and its impact on socio-economic development in mining-affected communities. Thus, the purpose of this study is to explicate the impact of the CDA in Sierra Leone. Using the theory of change helps to understand how the availability of mandatory funds can empower communities to take an active part in decision making related to the development of the communities. 
The results show that the CDA has engendered a predictable fund for community development. It has also empowered ordinary members of the community to determine the development program. However, the CDA has created a new ground for contestations between the pre-existing local governance structure (traditional authority) and the newly created community development committee (CDC) that is headed by an ordinary member of the community.

Keywords: community development agreement, impact, mandatory, participation

Procedia PDF Downloads 91