Search results for: Principal Component Analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28422

28392 Assessment of Social Vulnerability of Urban Population to Floods – a Case Study of Mumbai

Authors: Sherly M. A., Varsha Vijaykumar, Subhankar Karmakar, Terence Chan, Christian Rau

Abstract:

This study proposes an indicator-based framework for assessing the social vulnerability of any coastal megacity to floods. The final set of social vulnerability indicators is chosen from a set of feasible and available indicators, which are prepared in a Geographic Information System (GIS) framework at a fine scale using 1-km grid cells to give insight into the spatial variability of vulnerability. The optimal weight for each indicator is assigned using data envelopment analysis (DEA), as it avoids subjective weights and improves confidence in the results obtained. Principal component analysis (PCA) is applied to de-correlate the multivariate data and reduce its dimension. The proposed methodology is demonstrated on the twenty-four wards of Mumbai under the jurisdiction of the Municipal Corporation of Greater Mumbai (MCGM). This vulnerability assessment framework is not limited to the present study area and may be applied to other urban damage centers.
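The de-correlation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name pca_scores, the synthetic indicator matrix and its dimensions are all assumptions made for illustration. The point it shows is that the covariance matrix of the PCA scores is diagonal, so the transformed indicators are uncorrelated.

```python
import numpy as np

def pca_scores(X):
    """Return PCA scores and eigenvalues for a samples-by-indicators matrix X."""
    Xc = X - X.mean(axis=0)                 # center each indicator
    cov = np.cov(Xc, rowvar=False)          # indicator covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return Xc @ eigvecs, eigvals

# Hypothetical data: 200 grid cells, 3 strongly correlated indicators
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.3 * rng.normal(size=(200, 1)) for _ in range(3)])

scores, eigvals = pca_scores(X)
score_cov = np.cov(scores, rowvar=False)
off_diag = score_cov - np.diag(np.diag(score_cov))  # should be numerically zero
```

Dimension reduction then amounts to keeping only the leading columns of scores, which carry the largest share of the variance.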

Keywords: urban floods, vulnerability, data envelopment analysis, principal component analysis

Procedia PDF Downloads 332
28391 Application of Principle Component Analysis for Classification of Random Doppler-Radar Targets during the Surveillance Operations

Authors: G. C. Tikkiwal, Mukesh Upadhyay

Abstract:

During surveillance operations in war or peace time, the radar operator sees a scatter of targets on the screen. A target may be a tracked vehicle such as a T72 tank or a BMP, a wheeled vehicle such as an ALS, TATRA, 2.5 Tonne or Shaktiman, or moving troops or convoys. The radar operator switches one of the promising targets into Single Target Tracking (STT) mode. Once the target is locked, the operator hears a characteristic audible signal in the headphones. Drawing on training and accumulated experience, the operator then identifies the random target. This process is cumbersome and depends solely on the skill of the operator, and may therefore lead to misclassification of the object. In this paper we present a technique that uses mathematical and statistical methods such as the Fast Fourier Transform (FFT) and Principal Component Analysis (PCA) to identify random objects. Classification is based on transforming the audible signature of the target into musical octave notes. The whole methodology is then automated by developing suitable software. This automation increases the efficiency of identifying the random target by reducing the chances of misclassification. The whole study is based on live data.

Keywords: radar target, fft, principal component analysis, eigenvector, octave-notes, dsp

Procedia PDF Downloads 319
28390 Performance of the Cmip5 Models in Simulation of the Present and Future Precipitation over the Lake Victoria Basin

Authors: M. A. Wanzala, L. A. Ogallo, F. J. Opijah, J. N. Mutemi

Abstract:

The usefulness and limitations of climate information stem from the uncertainty inherent in the climate system. For any region to achieve sustainable development, it is important to incorporate climate information into its socio-economic strategic plans. The overall objective of the study was to assess the performance of the Coupled Model Inter-comparison Project Phase 5 (CMIP5) models over the Lake Victoria Basin. The datasets used included observed point station data, gridded rainfall data from the Climate Research Unit (CRU) and hindcast data from eight CMIP5 models. The methodology included trend analysis, spatial analysis, correlation analysis, Principal Component Analysis (PCA) regression analysis, and categorical statistical skill scores. Analysis of the trends in the observed rainfall records indicated an increase in rainfall variability in both space and time for all seasons. The spatial patterns of output from the MPI, MIROC, EC-EARTH and CNRM models were closest to the observed rainfall patterns.

Keywords: categorical statistics, coupled model inter-comparison project, principal component analysis, statistical downscaling

Procedia PDF Downloads 340
28389 Statistical Wavelet Features, PCA, and SVM-Based Approach for EEG Signals Classification

Authors: R. K. Chaurasiya, N. D. Londhe, S. Ghosh

Abstract:

The study of the electrical signals produced by neural activity in the human brain is called electroencephalography (EEG). In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach classifies the EEG signal into two classes: epileptic seizure or not. We start by extracting features with the Discrete Wavelet Transform (DWT), which decomposes the EEG signals into sub-bands. These features, extracted from the detail and approximation coefficients of the DWT sub-bands, are used as input to Principal Component Analysis (PCA). Classification is based on reducing the feature dimension using PCA and deriving the support vectors using a Support Vector Machine (SVM). The experiments are performed on a real, standard dataset. A very high level of classification accuracy is obtained.

Keywords: discrete wavelet transform, electroencephalogram, pattern recognition, principal component analysis, support vector machine

Procedia PDF Downloads 603
28388 Disparities in the Levels of Economic Development in Uttar Pradesh: A Regional Analysis

Authors: Naushaba Naseem Ahmed

Abstract:

Economic development depends not merely on the level of development but also on its distributive aspect. A serious issue is that the fruits of development are not equally distributed among different sections of the population and different parts of the country, which causes regional disparities in the levels of socio-economic development. Different parts of the country have different endowments of natural, human and capital resources. Under uniform conditions for growth, areas with better resources are favourably placed and grow comparatively faster than other areas. Thus, as development proceeds, the gap between resource-rich and resource-poor areas goes on widening. This paper attempts to highlight the levels of disparity in economic development with the help of selected variables. Principal component analysis, correlation, and the coefficient of variation are the techniques used in the paper, which draws on published data for the analysis. The results show that the Western region of Uttar Pradesh is the most developed, followed by the Central region. There is an urgent need for investment and developmental policies for backward regions such as the Bundelkhand region of Uttar Pradesh.

Keywords: coefficient of variation, correlation, economic development, principal component analysis

Procedia PDF Downloads 235
28387 Principal Component Regression in Amylose Content on the Malaysian Market Rice Grains Using Near Infrared Reflectance Spectroscopy

Authors: Syahira Ibrahim, Herlina Abdul Rahim

Abstract:

The amylose content is an essential element in determining the texture and taste of rice grains. This paper evaluates the use of VIS-SWNIRS in estimating the amylose content of seven varieties of rice grains available in the Malaysian market. Each variety consists of 30 samples, and all samples are scanned spectroscopically to obtain readings over the 680-1000 nm range. The Savitzky-Golay (SG) smoothing filter is applied to each sample’s data before the Principal Component Regression (PCR) technique is used to examine the data and produce a single value for each sample. This value is then compared with reference values obtained from the standard iodine colorimetric test in terms of the coefficient of determination, R². Results show that this technique produced low R² values of less than 0.50. To improve the result, the spectral range should be extended to include 1100-2500 nm, and the number of samples processed should also be increased.
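As a rough illustration of the Principal Component Regression step and the R² comparison described above, the following sketch projects centered spectra onto their leading principal components and regresses the reference value on the scores. It is a sketch under stated assumptions, not the authors' code: the function names, the synthetic spectra (30 samples by 50 wavelengths) and the two-factor data model are hypothetical, and the Savitzky-Golay smoothing step is omitted.

```python
import numpy as np

def pcr_fit_predict(X, y, n_components):
    """In-sample principal component regression of y on the leading PCs of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # SVD of the centered data: rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = Xc @ Vt[:n_components].T                 # PC scores
    coef, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    return T @ coef + y_mean                     # fitted values

def r_squared(y, y_hat):
    """Coefficient of determination between reference and fitted values."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical spectra driven by two latent factors plus small noise
rng = np.random.default_rng(42)
latent = rng.normal(size=(30, 2))
loadings = rng.normal(size=(2, 50))
X = latent @ loadings + 0.01 * rng.normal(size=(30, 50))
y = latent @ np.array([1.0, -0.5]) + 0.01 * rng.normal(size=30)

r2 = r_squared(y, pcr_fit_predict(X, y, n_components=2))
```

The R² here is computed in-sample on synthetic data built to be well explained by two components; on real spectra a held-out or cross-validated R² would be the more informative figure.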

Keywords: amylose content, diffuse reflectance, Malaysia rice grain, principal component regression (PCR), Visible and Shortwave near-infrared spectroscopy (VIS-SWNIRS)

Procedia PDF Downloads 352
28386 Assessment of Soil Quality Indicators in Rice Soil of Tamil Nadu

Authors: Kaleeswari R. K., Seevagan L.

Abstract:

Soil quality in an agroecosystem is influenced by the cropping system and by water and soil fertility management. A valid soil quality index would help to assess soil and crop management practices for desired productivity and soil health. Soil quality indices also provide an early indication of soil degradation and of the remedial and rehabilitation measures needed. Imbalanced fertilization and inadequate organic carbon dynamics deteriorate soil quality in an intensive cropping system. The rice soil ecosystem differs from other arable systems because rice is grown under submergence, which requires a different set of key soil attributes for enhancing soil quality and productivity. Assessment of the soil quality index involves indicator selection, indicator scoring and integration of the scores into one index. The most appropriate indicators for evaluating soil quality can be selected by establishing the minimum data set, which can be screened by linear and multiple regression, factor analysis and score functions. This investigation was carried out in the intensive rice cultivating regions (having >1.0 lakh hectares) of Tamil Nadu, viz., Thanjavur, Thiruvarur, Nagapattinam, Villupuram, Thiruvannamalai, Cuddalore and Ramanathapuram districts. In each district, an intensive rice growing block was identified. In each block, two sampling grids (10 x 10 sq.km) were used with a sampling depth of 10 – 15 cm. Using GIS coordinates, soil sampling was carried out at various locations in the study area. The number of soil sampling points was 41, 28, 28, 32, 37, 29 and 29 in Thanjavur, Thiruvarur, Nagapattinam, Cuddalore, Villupuram, Thiruvannamalai and Ramanathapuram districts, respectively. Principal Component Analysis is a data reduction tool used to select some of the potential indicators. A Principal Component is a linear combination of different variables that represents the maximum variance of the dataset. Principal Components with eigenvalues equal to or higher than 1.0 were retained for the minimum data set. Principal Component Analysis was used to select the representative soil quality indicators in rice soils based on factor loading values and contribution percent values. Variables having significant differences within the production system were used for the preparation of the minimum data set. Each Principal Component explained a certain amount of variation (%) in the total dataset. This percentage provided the weight for the variables. The final Principal Component Analysis based soil quality equation is $\mathrm{SQI} = \sum_{i=1}^{n} (W_i \times S_i)$, summed over the n selected indicators, where $S_i$ is the score for the subscripted variable and $W_i$ is the weighting factor derived from PCA. Higher index scores meant better soil quality. Soil respiration, soil available nitrogen and potentially mineralizable nitrogen were assessed as soil quality indicators in rice soil of the Cauvery Delta zone covering Thanjavur, Thiruvarur and Nagapattinam districts. Soil available phosphorus could be used as a soil quality indicator of rice soils in the Cuddalore district. In rain-fed rice ecosystems of coastal sandy soil, DTPA – Zn could be used as an effective soil quality indicator. Among the soil parameters selected from Principal Component Analysis, Microbial Biomass Nitrogen could be used as a quality indicator for rice soils of the Villupuram district. The Cauvery Delta zone has a better SQI than the other intensive rice growing zones of Tamil Nadu.
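The eigenvalue cut-off and the weighted-sum index described in this abstract can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function names, the synthetic soil data and the indicator scores are hypothetical, and normalizing the retained eigenvalues so that the weights sum to 1 is an assumption (the abstract only says that the percentage of variation explained provided the weights).

```python
import numpy as np

def kaiser_weights(X):
    """Weights from PCs of the correlation matrix whose eigenvalue is >= 1.0."""
    corr = np.corrcoef(X, rowvar=False)          # PCA on standardized variables
    eigvals = np.linalg.eigvalsh(corr)[::-1]     # descending eigenvalues
    retained = eigvals[eigvals >= 1.0]           # eigenvalue >= 1.0 criterion
    return retained / retained.sum()             # assumption: weights sum to 1

def soil_quality_index(weights, scores):
    """SQI as the weighted sum of indicator scores, sum_i W_i * S_i."""
    return float(np.dot(weights, scores))

# Hypothetical data: 40 sampling points, 5 measured soil attributes
rng = np.random.default_rng(7)
X = rng.normal(size=(40, 5))
weights = kaiser_weights(X)

# Hypothetical indicator scores on a 0-1 scale, one per retained component
scores = np.linspace(0.5, 0.9, len(weights))
sqi = soil_quality_index(weights, scores)
```

Because the weights are non-negative and sum to 1 under this assumption, the resulting index always lies between the lowest and highest indicator score.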

Keywords: soil quality index, soil attributes, soil mapping, rice soil

Procedia PDF Downloads 49
28385 Sensitivity of Credit Default Swaps Premium to Global Risk Factor: Evidence from Emerging Markets

Authors: Oguzhan Cepni, Doruk Kucuksarac, M. Hasan Yilmaz

Abstract:

Risk premiums of emerging markets move together, depending on the momentum and shifts in global risk appetite. However, the magnitudes of these changes in the risk premiums of emerging market economies might vary. In this paper, we focus on how the global risk factor affects credit default swap (CDS) premiums of emerging markets using principal component analysis (PCA) and rolling regressions. PCA results indicate that the first common component accounts for almost 76% of the common variation in CDS premiums of emerging markets. Additionally, the explanatory power of the first factor seems to be high over the sample period. However, the sensitivity to the global risk factor tends to change over time and across countries. In this regard, fixed effects panel regressions are employed to identify the macroeconomic factors driving the heterogeneity across emerging markets. Two main macroeconomic variables affect the sensitivity: government debt to GDP and international reserves to GDP. Countries with lower government debt and higher reserves tend to be less subject to variations in global risk appetite.
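To make the "share of common variation explained by the first component" concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the one-factor data model, the number of days and countries, and the variable names are hypothetical, and no figure from the paper is reproduced.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component of X."""
    Xc = X - X.mean(axis=0)                      # center each series
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    var = s ** 2                                 # proportional to PC variances
    return var / var.sum()

# Hypothetical panel: one global factor plus country-specific noise
rng = np.random.default_rng(1)
n_days, n_countries = 500, 10
global_factor = rng.normal(size=(n_days, 1))
betas = rng.uniform(0.5, 1.5, size=(1, n_countries))   # country sensitivities
idiosyncratic = 0.5 * rng.normal(size=(n_days, n_countries))
cds_changes = global_factor @ betas + idiosyncratic

ratios = explained_variance_ratio(cds_changes)
first_share = ratios[0]   # share attributed to the first (common) component
```

When a single factor drives most co-movement, first_share is large; country-level sensitivities to that factor could then be estimated by regressing each series on the first component's scores, for example over rolling windows.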

Keywords: emerging markets, principal component analysis, credit default swaps, sovereign risk

Procedia PDF Downloads 344
28384 Principal Component Analysis in Drug-Excipient Interactions

Authors: Farzad Khajavi

Abstract:

Studies of the interaction between active pharmaceutical ingredients (APIs) and excipients are very important in the pre-formulation stage of development of all dosage forms. Analytical techniques such as differential scanning calorimetry (DSC), thermogravimetry (TG), and Fourier transform infrared spectroscopy (FTIR) are commonly used tools for investigating the compatibility and incompatibility of APIs with excipients. Sometimes the data obtained from these techniques are difficult to interpret because the API spectrum overlaps severely with those of the excipients in their mixtures. Principal component analysis (PCA), a powerful factor analytical method, is used in these situations to resolve the data matrices acquired from these analytical techniques. Binary mixtures of the API and the excipients of interest are prepared. Peaks of FTIR, DSC, or TG of the pure API, the pure excipient and their mixtures at different mole ratios form the rows of the data matrix. By applying PCA to the data matrix, the number of principal components (PCs) is determined so that they account for the total variance of the data matrix. When the PCs (factors) obtained from the score matrix are plotted in two-dimensional space, compatibility between the API and the excipient of interest is confirmed if the pure API, its API-rich mixture with the excipient and the 1:1 mixture form one cluster, while the pure excipient and its excipient-rich blend with the API form another. Otherwise, the API and the excipient are incompatible in the mixture.

Keywords: API, compatibility, DSC, TG, interactions

Procedia PDF Downloads 87
28383 Principal Components Analysis of the Causes of High Blood Pressure at Komfo Anokye Teaching Hospital, Ghana

Authors: Joseph K. A. Johnson

Abstract:

Hypertension affects 20 percent of people aged 55 and above in Ghana. Of these, almost one-third are unaware of their condition. At the age of 55, more men than women tend to have hypertension; after that age, the condition becomes more prevalent among women. Hypertension is significantly more common in African Americans of both sexes than in other racial or ethnic groups. This study was conducted to determine the causes of high blood pressure in the Ashanti Region, Ghana. The study employed one hundred and seventy (170) respondents. The sample population was all respondents available at the time of data collection. The research was conducted using primary data, with convenience sampling used to locate the respondents. A questionnaire was used to gather the data, which were analysed using principal component analysis. The study revealed personal description, lifestyle behaviour and risk awareness as some of the causes of high blood pressure in the Ashanti Region. The study therefore recommends that people be advised to attend to personal characteristics that may contribute to high blood pressure, such as controlling their temper and reacting appropriately to stressful situations. They must be educated on the factors that may raise their blood pressure, including the importance of seeing a medical doctor before taking any drug. Public health officers must also make people aware of lifestyle behaviours such as smoking and drinking alcohol, which are major contributors to high blood pressure.

Keywords: high blood pressure, principal component analysis, hypertension, public health

Procedia PDF Downloads 457
28382 An Efficient Machine Learning Model to Detect Metastatic Cancer in Pathology Scans Using Principal Component Analysis Algorithm, Genetic Algorithm, and Classification Algorithms

Authors: Bliss Singhal

Abstract:

Machine learning (ML) is a branch of Artificial Intelligence (AI) in which computers analyze data and find patterns in the data. This study focuses on the detection of metastatic cancer using ML. Metastatic cancer is the stage at which cancer has spread to other parts of the body, and it is the cause of approximately 90% of cancer-related deaths. Normally, pathologists spend hours each day manually classifying tumors as benign or malignant. This tedious task contributes to metastasis being mislabeled over 60% of the time and emphasizes the importance of being aware of human error and other inefficiencies. ML is a good candidate for improving the correct identification of metastatic cancer, saving thousands of lives, and can also improve the speed and efficiency of the process, thereby requiring fewer resources and less time. So far, the deep learning methodology of AI has been used in research to detect cancer. This study is a novel approach to determining the potential of preprocessing algorithms combined with classification algorithms in detecting metastatic cancer. The study used two preprocessing algorithms, principal component analysis (PCA) and the genetic algorithm, to reduce the dimensionality of the dataset, and then used three classification algorithms, logistic regression, decision tree classifier, and k-nearest neighbors, to detect metastatic cancer in the pathology scans. The highest accuracy of 71.14% was produced by the ML pipeline comprising PCA, the genetic algorithm, and the k-nearest neighbors algorithm, suggesting that preprocessing and classification algorithms have great potential for detecting metastatic cancer.

Keywords: breast cancer, principal component analysis, genetic algorithm, k-nearest neighbors, decision tree classifier, logistic regression

Procedia PDF Downloads 49
28381 Research Attitude: Its Factor Structure and Determinants in the Graduate Level

Authors: Janet Lynn S. Montemayor

Abstract:

Falling survival and rising drop-out rates in graduate school are attributed to the demands that come with research-related requirements. Graduate students tend to withdraw from their studies when confronted with such requirements. This act of succumbing to the challenge is primarily due to a negative mindset. An understanding of students’ views of research is essential for teachers in facilitating research activities in graduate school. This study aimed to develop a tool that accurately measures attitude towards research. Psychometric properties of the Research Attitude Inventory (RAIn) were assessed. A pool of items (k=50) was initially constructed and administered to a development sample composed of Masters and Doctorate degree students (n=159). Results show that the RAIn is a reliable measure of research attitude (k=41, αmax = 0.894). Principal component analysis using orthogonal rotation with Kaiser normalization identified four underlying factors of research attitude, namely predisposition, purpose, perspective, and preparation. Research attitude among the respondents was analyzed using this measure.

Keywords: graduate education, principal component analysis, research attitude, scale development

Procedia PDF Downloads 159
28380 Parametric Appraisal of Robotic Arc Welding of Mild Steel Material by Principal Component Analysis-Fuzzy with Taguchi Technique

Authors: Amruta Rout, Golak Bihari Mahanta, Gunji Bala Murali, Bibhuti Bhusan Biswal, B. B. V. L. Deepak

Abstract:

The use of industrial robots for welding operations is one of the chief signs of contemporary welding. Modeling of the weld joint parameters and weld process parameters is one of the most crucial aspects of robotic welding. As weld process parameters affect the weld joint parameters differently, a multi-objective optimization technique has to be used to obtain the optimal setting of the weld process parameters. In this paper, a hybrid optimization technique, i.e., Principal Component Analysis (PCA) combined with fuzzy logic, is proposed to obtain the optimal setting of weld process parameters such as wire feed rate, welding current, gas flow rate, welding speed and nozzle tip to plate distance. The weld joint parameters considered for optimization are the depth of penetration, yield strength, and ultimate strength. PCA is a very efficient multi-objective technique for converting correlated and dependent parameters, such as the weld joint parameters, into uncorrelated and independent variables. Also, in this approach there is no need to check the correlation among responses, as no individual weight is assigned to the responses. The Fuzzy Inference Engine can efficiently accommodate these aspects within its internal hierarchy, thereby overcoming various limitations of existing optimization approaches. Finally, the Taguchi method is used to obtain the optimal setting of the weld process parameters. It is concluded that the hybrid technique has its own advantages, which can be used for quality improvement in industrial applications.

Keywords: robotic arc welding, weld process parameters, weld joint parameters, principal component analysis, fuzzy logic, Taguchi method

Procedia PDF Downloads 152
28379 Image Multi-Feature Analysis by Principal Component Analysis for Visual Surface Roughness Measurement

Authors: Wei Zhang, Yan He, Yan Wang, Yufeng Li, Chuanpeng Hao

Abstract:

Surface roughness is an important index for evaluating surface quality and needs to be accurately measured to ensure the performance of the workpiece. Roughness measurement based on machine vision involves various image features, some of which are redundant. These redundant features affect the accuracy and speed of the visual approach. Previous research used correlation analysis methods to select the appropriate features. However, this feature analysis treats features independently and cannot fully utilize the information in the data. Besides, blindly reducing features loses a lot of useful information, resulting in unreliable results. Therefore, the focus of this paper is on providing a redundant feature removal approach for visual roughness measurement. In this paper, statistical methods and the gray-level co-occurrence matrix (GLCM) are employed to extract the texture features of machined images effectively. Then, principal component analysis (PCA) is used to fuse all extracted features into a new one, which reduces the feature dimension and maintains the integrity of the original information. Finally, the relationship between the new features and roughness is established by a support vector machine (SVM). The experimental results show that the approach can effectively resolve the multi-feature information redundancy of machined surface images and provides a new idea for the visual evaluation of surface roughness.

Keywords: feature analysis, machine vision, PCA, surface roughness, SVM

Procedia PDF Downloads 181
28378 Emotion Recognition with Occlusions Based on Facial Expression Reconstruction and Weber Local Descriptor

Authors: Jadisha Cornejo, Helio Pedrini

Abstract:

Recognition of emotions based on facial expressions has received increasing attention from the scientific community in recent years. Several fields of application can benefit from facial emotion recognition, such as behavior prediction, interpersonal relations, human-computer interaction, and recommendation systems. In this work, we develop and analyze an emotion recognition framework based on facial expressions that is robust to occlusions through the Weber Local Descriptor (WLD). Initially, the occluded facial expressions are reconstructed following an extension of Robust Principal Component Analysis (RPCA). Then, WLD features are extracted from the facial expression representation, as well as Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG). The feature vector space is reduced using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, K-Nearest Neighbor (K-NN) and Support Vector Machine (SVM) classifiers are used to recognize the expressions. Experimental results on three public datasets demonstrated that the WLD representation achieved competitive accuracy rates for occluded and non-occluded facial expressions compared to other approaches available in the literature.

Keywords: emotion recognition, facial expression, occlusion, fiducial landmarks

Procedia PDF Downloads 152
28377 Use of Landsat OLI Images in the Mapping of Landslides: Case of the Taounate Province in Northern Morocco

Authors: S. Benchelha, H. Chennaoui, M. Hakdaoui, L. Baidder, H. Mansouri, H. Ejjaaouani, T. Benchelha

Abstract:

Northern Morocco is characterized by relatively young mountains that are far more dynamic than other areas of Morocco. The dynamics associated with the formation of the Rif chain (Alpine tectonics) are accompanied by instabilities essentially related to tectonic movements. The construction of major infrastructure (roads, highways, ...) is a factor that triggers and favors landslides. This paper is part of the establishment of a landslide susceptibility map and concerns the mapping of unstable areas in the province of Taounate. Landslides were identified using the components of a false color composite (FCC) of Landsat OLI images: i) the first independent component (IC1), ii) the principal component (PC), iii) the normalized difference index (NDI). This mapping of the landslide class is validated by in-situ surveys.

Keywords: landslides, False Color Composite (FCC), Independent Component Analysis (ICA), Principal Component Analysis (PCA), Normalized Difference Index (NDI), Normalized Difference Mid Red Index (NDMIDR)

Procedia PDF Downloads 261
28376 Gan Nanowire-Based Sensor Array for the Detection of Cross-Sensitive Gases Using Principal Component Analysis

Authors: Ashfaque Hossain Khan, Brian Thomson, Ratan Debnath, Abhishek Motayed, Mulpuri V. Rao

Abstract:

Despite the efforts made, the problem of cross-sensitivity in a single metal oxide-based sensor cannot be fully eliminated. In this work, a sensor array comprising platinum (Pt), copper (Cu), and silver (Ag) decorated TiO2 and ZnO functionalized GaN nanowires has been designed and fabricated using an industry-standard top-down fabrication approach. The metal/metal-oxide combinations within the array were determined from a prior molecular simulation study using first-principles calculations based on density functional theory (DFT). The gas responses were obtained for both single gases and mixtures of NO2, SO2, ethanol, and H2 in the presence of H2O and O2 gases under UV light at room temperature. Each gas leaves a unique response footprint across the array sensors, by which precise discrimination of cross-sensitive gases has been achieved. An unsupervised principal component analysis (PCA) technique has been applied to the array response. Results indicate that each target gas and mixture forms a distinct cluster in the score plot, indicating a clear separation among them. In addition, the developed array device consumes very low power compared with commercially available metal-oxide sensors because of ultra-violet (UV) assisted sensing. The nanowire sensor array, in combination with PCA, is a potential approach for precise real-time gas monitoring applications.

Keywords: cross-sensitivity, gas sensor, principal component analysis (PCA), sensor array

Procedia PDF Downloads 74
28375 Spatial Analysis of Flood Vulnerability in Highly Urbanized Area: A Case Study in Taipei City

Authors: Liang Weichien

Abstract:

Without adequate information and mitigation plans for natural disasters, the risk to urban populated areas will increase in the future as populations grow, especially in Taiwan. Taiwan is recognized as one of the world's high-risk areas, with an average of 5.7 floods per year, and should seek to strengthen coherence and consensus in how cities plan for floods and climate change. Therefore, this study aims to understand the vulnerability to flooding in Taipei city, Taiwan, by creating indicators and calculating the vulnerability of each study unit. The indicators were grouped into sensitivity and adaptive capacity based on the definition of vulnerability of the Intergovernmental Panel on Climate Change. The indicators were weighted using Principal Component Analysis. However, existing research assumes that the composition and influence of the indicators are the same in different areas. This disregards spatial correlation and might result in an inaccurate explanation of local vulnerability. The study used Geographically Weighted Principal Component Analysis, which adds a geographic weighting matrix, to obtain the main flood impact characteristics of different areas. Cross-validation and the Akaike Information Criterion were used to select the bandwidth, with a Gaussian kernel as the bandwidth weighting scheme. The outcome can be used to reduce damage potential by integrating the outputs into local mitigation plans and urban planning.
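The idea behind a geographically weighted PCA with a Gaussian weighting scheme can be sketched as a local PCA computed at one focal location. This is an illustrative sketch only, not the study's implementation: the function names, the synthetic coordinates and indicators, and the fixed bandwidth of 0.3 are hypothetical, and the cross-validation or AIC-based bandwidth selection mentioned in the abstract is not implemented here.

```python
import numpy as np

def gaussian_weights(coords, focus, bandwidth):
    """Gaussian kernel weights that decay with distance from the focal point."""
    d = np.linalg.norm(coords - focus, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def local_pca(X, coords, focus, bandwidth):
    """PCA of the geographically weighted covariance matrix at one location."""
    w = gaussian_weights(coords, focus, bandwidth)
    w = w / w.sum()                          # normalize weights to sum to 1
    mean = w @ X                             # locally weighted mean
    Xc = X - mean
    cov = (Xc * w[:, None]).T @ Xc           # locally weighted covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    return eigvals[order], eigvecs[:, order]

# Hypothetical study units: 100 locations in a unit square, 4 indicators
rng = np.random.default_rng(3)
coords = rng.uniform(size=(100, 2))
X = rng.normal(size=(100, 4))

eigvals, eigvecs = local_pca(X, coords, focus=coords[0], bandwidth=0.3)
local_loadings = eigvecs[:, 0]   # leading local component at the focal unit
```

Repeating the call with each study unit as the focus yields loadings that vary over space; with a very large bandwidth the weights become uniform and the result approaches an ordinary, global PCA.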

Keywords: flood vulnerability, geographically weighted principal components analysis, GWPCA, highly urbanized area, spatial correlation

Procedia PDF Downloads 265
28374 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The first objective of this paper is to determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as sensitivity (recall), specificity and overall accuracy. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month, the return for each stock portfolio was computed and compared with the stock market index return. The returns were 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio, while the stock market return was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: machine learning, stock market trading, logistic regression, cluster analysis, factor analysis, decision trees, neural networks, automated stock investment system

Procedia PDF Downloads 125
28373 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy technique used for material identification and analysis, with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short laser pulses onto the material in order to create plasma by excitation of the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depend on the material and the experiment’s environment. In the present work, the spectrum profiles of medicine samples were obtained via LIBS. The datasets include two different concentrations for each of two paracetamol-based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed by filling outliers based on quartiles, smoothing spectra to eliminate noise, and normalizing both the wavelength and intensity axes. Statistical information was obtained, and principal component analysis (PCA) was applied to both the preprocessed and raw datasets. The machine learning models were set up based on two different train-test splits: 70% training – 30% test and 80% training – 20% test. Cross-validation was preferred to protect the models against overfitting, since the sample size is small. The machine learning results of the preprocessed and raw datasets were compared for both splits. This is the first time that all supervised machine learning classification algorithms (Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-Nearest Neighbor (k-NN), Ensemble Learning, and Neural Network algorithms) were applied to LIBS data of paracetamol-based pharmaceutical samples and their different concentrations, on preprocessed and raw datasets, in order to observe the effect of preprocessing.

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 62
28372 Classification of Random Doppler-Radar Targets during the Surveillance Operations

Authors: G. C. Tikkiwal, Mukesh Upadhyay

Abstract:

During surveillance operations in wartime or peacetime, the radar operator sees a scatter of targets on the screen. A target may be a tracked vehicle such as a tank (viz. T72, BMP, etc.), a wheeled vehicle such as ALS, TATRA, 2.5Tonne, or Shaktiman, or a moving army, moving convoys, etc. The radar operator selects one of the promising targets in single target tracking (STT) mode. Once the target is locked, the operator hears a typical audible signal in the headphones. Drawing on experience and training gained over time, the operator then identifies the random target. But this process is cumbersome and depends solely on the skills of the operator, and thus may lead to misclassification of the object. In this paper, we present a technique using mathematical and statistical methods such as the fast Fourier transform (FFT) and principal component analysis (PCA) to identify the random objects. The process of classification is based on transforming the audible signature of the target into music octave-notes. The whole methodology is then automated by developing suitable software. This automation increases the efficiency of identification of the random target by reducing the chances of misclassification. The whole study is based on live data.

Keywords: radar target, FFT, principal component analysis, eigenvector, octave-notes, DSP

Procedia PDF Downloads 365
28371 Isolation and Classification of Red Blood Cells in Anemic Microscopic Images

Authors: Jameela Ali Alkrimi, Abdul Rahim Ahmad, Azizah Suliman, Loay E. George

Abstract:

Red blood cells (RBCs) are among the most commonly and intensively studied types of blood cells in cell biology. A lack of RBCs is a condition characterized by a lower than normal hemoglobin level; this condition is referred to as 'anemia'. In this study, software was developed to isolate RBCs and to classify anemic RBCs in microscopic images using a machine learning approach. Several features of RBCs were extracted using image processing algorithms, including principal component analysis (PCA). With the proposed method, RBCs were isolated in 34 seconds from an image containing 18 to 27 cells. We also proposed that PCA could be performed to increase the speed and efficiency of classification. Our classifiers yielded accuracy rates of 100%, 99.99%, and 96.50% for the K-nearest neighbor (K-NN) algorithm, support vector machine (SVM), and artificial neural network (ANN), respectively. Classification was evaluated in terms of sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained in a shorter time and more efficiently when PCA was used.
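
The kappa statistic mentioned as an evaluation parameter can be computed from a confusion matrix. The sketch below assumes Cohen's kappa, which the abstract does not specify, and uses a hypothetical two-class matrix. The function name `cohens_kappa` is illustrative only.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: true class, columns: predicted class)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only; they are not results from the paper.
confusion = [[45, 5],
             [10, 40]]
kappa = cohens_kappa(confusion)
print(f"kappa = {kappa:.2f}")
```

Unlike raw accuracy, kappa discounts the agreement that would be expected by chance given the class proportions.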

Keywords: red blood cells, pre-processing image algorithms, classification algorithms, principal component analysis PCA, confusion matrix, kappa statistical parameters, ROC

Procedia PDF Downloads 377
28370 Regeneration of Geological Models Using Support Vector Machine Assisted by Principal Component Analysis

Authors: H. Jung, N. Kim, B. Kang, J. Choe

Abstract:

History matching is a crucial procedure for predicting reservoir performance and making future decisions. However, it is difficult due to uncertainties in the initial reservoir models. Therefore, it is important to have reliable initial models for successful history matching of highly heterogeneous reservoirs such as channel reservoirs. In this paper, we propose a novel scheme for regenerating geological models using support vector machine (SVM) and principal component analysis (PCA). First, we perform PCA to identify the main geological characteristics of the models. Through this procedure, the permeability values of each model are transformed into new parameters by the principal components that have eigenvalues of large magnitude. Secondly, the parameters are projected onto a two-dimensional plane by multi-dimensional scaling (MDS) based on Euclidean distances. Finally, we train an SVM classifier using the 20% of models whose well oil production rates (WOPR) are most similar or most dissimilar to the true values (10% for each). Then, the other 80% of models are classified by the trained SVM. We select models on the side of low WOPR errors. One hundred channel reservoir models are initially generated by single normal equation simulation. By repeating the classification process, we can select models which have a geological trend similar to the true reservoir model. The average field of the selected models is utilized as a probability map for regeneration. Newly generated models can preserve correct channel features and exclude wrong geological properties while maintaining suitable uncertainty ranges. History matching with the initial models cannot provide trustworthy results; it fails to find the correct geological features of the true model. However, history matching with the regenerated ensemble offers reliable characterization results by identifying the proper channel trend. Furthermore, it gives dependable predictions of future performance with reduced uncertainties.
We propose a novel classification scheme which integrates PCA, MDS, and SVM for regenerating reservoir models. The scheme can easily sort out reliable models which have a channel trend similar to the reference in a lower-dimensional space.
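
The training-set selection step described in this abstract (10% of models with the lowest WOPR error, 10% with the highest, and the remaining 80% left for the classifier) could look roughly like the sketch below. The function name `split_by_wopr_error`, the +1/-1 labels, and the synthetic error values are assumptions made for illustration. The PCA, MDS, and SVM stages are not reproduced here.

```python
def split_by_wopr_error(errors, fraction=0.10):
    """Pick the lowest-error and highest-error models as labelled training data.

    errors: one WOPR mismatch value per model (list index = model id).
    Returns (train, rest): train maps model id -> +1 (similar) or -1 (dissimilar),
    rest lists the model ids left for the trained classifier to label.
    """
    n_each = max(1, int(round(len(errors) * fraction)))
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    train = {i: +1 for i in order[:n_each]}          # most similar to the true WOPR
    train.update({i: -1 for i in order[-n_each:]})   # most dissimilar
    rest = [i for i in order if i not in train]
    return train, rest


# Synthetic errors for 100 models, for illustration only.
errors = [abs(50 - i) + 0.01 * i for i in range(100)]
train, rest = split_by_wopr_error(errors)
print(len(train), "training models,", len(rest), "models to classify")
```

Training only on the two extremes gives the classifier clearly separated examples of good and poor models before it labels the ambiguous middle.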

Keywords: history matching, principal component analysis, reservoir modelling, support vector machine

Procedia PDF Downloads 132
28369 Understanding the Information in Principal Component Analysis of Raman Spectroscopic Data during Healing of Subcritical Calvarial Defects

Authors: Rafay Ahmed, Condon Lau

Abstract:

Bone healing is a complex and sequential process involving changes at the molecular level. Raman spectroscopy is a promising technique to study bone mineral and matrix environments simultaneously. In this study, subcritical calvarial defects are used to study bone composition during healing without disturbing the fracture. The model allowed the natural healing of bone to be monitored while avoiding mechanical harm to the callus. Calvarial defects were created using a 1 mm burr drill in the parietal bones of Sprague-Dawley rats (n=8); these served as in vivo defects. After 7 days, the rats were euthanized and their skulls were harvested. One additional defect per sample was created on the opposite parietal bone using the same calvarial defect procedure to serve as a control defect. Raman spectroscopy (785 nm) was used to investigate bone parameters of three different skull surfaces: in vivo defects, control defects, and normal surface. Principal component analysis (PCA) was utilized for the data analysis and interpretation of Raman spectra and helped in the classification of groups. PCA was able to distinguish in vivo defects from normal surface and control defects. PC1 shows the major variation at 958 cm⁻¹, which corresponds to the ν1 phosphate mineral band. PC2 shows the major variation at 1448 cm⁻¹, which is the characteristic band of CH2 deformation and corresponds to collagens. Raman parameters, namely mineral to matrix ratio and crystallinity, were found to be significantly decreased in the in vivo defects compared to surface and controls. Scanning electron microscope and optical microscope images show the formation of newly generated matrix by means of bony bridges of collagens. Optical profiler measurements show that surface roughness increased by 30% from controls to in vivo defects after 7 days. These results agree with Raman assessment parameters and confirm the new collagen formation during healing.

Keywords: Raman spectroscopy, principal component analysis, calvarial defects, tissue characterization

Procedia PDF Downloads 191
28368 The Sensitivity of Credit Defaults Swaps Premium to Global Risk Factor: Evidence from Emerging Markets

Authors: Oguzhan Cepni, Doruk Kucuksarac, M. Hasan Yilmaz

Abstract:

Changes in the global risk appetite cause co-movement in emerging market risk premiums. However, the sensitivity of the changes in risk premium to the global risk appetite may vary across emerging markets. In this study, the effect of the global risk appetite on Credit Default Swap (CDS) premiums in emerging markets is analyzed using Principal Component Analysis (PCA) and rolling regressions. The PCA results indicate that the first common component accounts for almost 76 percent of the common variation in CDS premiums. Additionally, the explanatory power of the first factor seems to be high over the sample period. However, the sensitivity to the global risk factor tends to change over time and across countries. In this regard, fixed effects panel regressions are used to identify the macroeconomic factors driving the heterogeneity across emerging markets. The panel regression results point to the significance of government debt to GDP and international reserves to GDP in explaining sensitivity. Accordingly, countries with lower government debt and higher reserves tend to be less subject to the variations in the global risk appetite.
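
A rolling regression of the kind mentioned here estimates a time-varying sensitivity by refitting a simple regression over a moving window. The sketch below is one possible reading, assuming a single-regressor OLS with intercept. The function name `rolling_beta`, the window length, and both series are hypothetical and are not data from the study.

```python
def rolling_beta(y, x, window):
    """OLS slope of y on x (with intercept), re-estimated over each trailing window."""
    betas = []
    for end in range(window, len(y) + 1):
        ys, xs = y[end - window:end], x[end - window:end]
        mean_x, mean_y = sum(xs) / window, sum(ys) / window
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
        var = sum((a - mean_x) ** 2 for a in xs)
        betas.append(cov / var)  # slope = cov(x, y) / var(x)
    return betas


# Hypothetical series: a global risk factor (e.g. the first principal component)
# and one country's CDS premium changes. The sensitivity is built to drift from 2 to 3.
factor = [0.5, -1.0, 1.5, 0.0, -0.5, 2.0, -1.5, 1.0]
cds = [1.0, -2.0, 3.0, 0.0, -1.5, 6.0, -4.5, 3.0]
betas = rolling_beta(cds, factor, window=4)
print([round(b, 2) for b in betas])
```

A slope that changes from one window to the next is what the abstract describes as sensitivity that varies over time.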

Keywords: credit default swaps, emerging markets, principal components analysis, sovereign risk

Procedia PDF Downloads 345
28367 Exploratory Study of the Influencing Factors for Hotels' Competitors

Authors: Asma Ameur, Dhafer Malouche

Abstract:

Hotel competitiveness research is an essential phase of the marketing strategy for any hotel. Knowing a hotel's competitors helps the hotelier to grasp its position in the market and helps the citizen to make the right choice in picking a hotel. Thus, competitiveness is an important indicator that can be influenced by various factors. In fact, competitiveness, the ability to cope with competition, remains a difficult and complex concept to define and to exploit. Therefore, the purpose of this article is to carry out an exploratory study to calculate a competitiveness indicator for hotels. The paper also makes it possible to determine the criteria that have a direct or indirect effect on the image and perception of a hotel. The present research looks for the right model of hotel competitiveness. For this reason, we exploit different theoretical contributions in the field of machine learning. We use statistical techniques such as Principal Component Analysis (PCA) to reduce the dimensions, as well as other statistical modeling techniques. This paper presents a survey of the techniques and methods used in hotel competitiveness research. Furthermore, this study allows us to deduce the significant variables that influence the determination of a hotel’s competitors. Lastly, the experiments discussed in this article show that hotel competitors are influenced by several factors to different degrees.

Keywords: competitiveness, e-reputation, hotels' competitors, online hotel review, principal component analysis, statistical modeling

Procedia PDF Downloads 83
28366 Evaluation of Yield and Yield Components of Malaysian Palm Oil Board-Senegal Oil Palm Germplasm Using Multivariate Tools

Authors: Khin Aye Myint, Mohd Rafii Yusop, Mohd Yusoff Abd Samad, Shairul Izan Ramlee, Mohd Din Amiruddin, Zulkifli Yaakub

Abstract:

The narrow genetic base is the main obstacle to breeding and genetic improvement in the oil palm industry. In order to broaden the genetic base, the Malaysian Palm Oil Board (MPOB) has extensively collected wild germplasm from its area of origin in 11 African countries: Nigeria, Senegal, Gambia, Guinea, Sierra Leone, Ghana, Cameroon, Zaire, Angola, Madagascar, and Tanzania. The germplasm collections were established and maintained as a field gene bank at the MPOB Research Station in Kluang, Johor, Malaysia to conserve a wide range of oil palm genetic resources for genetic improvement of the Malaysian oil palm industry. Therefore, assessing the performance and genetic diversity of the wild materials is very important for understanding the genetic structure of natural oil palm populations and for exploring genetic resources. Principal component analysis (PCA) and cluster analysis are very efficient multivariate tools for evaluating the genetic variation of germplasm and have been applied in many crops. In this study, eight populations of MPOB-Senegal oil palm germplasm were studied to explore the genetic variation pattern using PCA and cluster analysis. A total of 20 yield and yield component traits were analyzed by PCA and Ward’s clustering using SAS version 9.4 software. The first four principal components, which have eigenvalues >1, accounted for 93% of the total variation, with 44%, 19%, 18%, and 12% for each principal component, respectively. PC1 showed the highest positive correlation with fresh fruit bunch (0.315), bunch number (0.321), oil yield (0.317), kernel yield (0.326), total economic product (0.324), and total oil (0.324), while PC2 had the largest positive association with oil to wet mesocarp (0.397) and oil to fruit (0.458). The oil palm populations were grouped into four distinct clusters based on the 20 evaluated traits, which implies that high genetic variation exists among the germplasm.
Cluster 1 contains two populations, SEN 12 and SEN 10, while cluster 2 has only one population, SEN 3. Cluster 3 consists of three populations, SEN 4, SEN 6, and SEN 7, while SEN 2 and SEN 5 were grouped in cluster 4. Cluster 4 showed the highest mean values of fresh fruit bunch, bunch number, oil yield, kernel yield, total economic product, and total oil, and cluster 1 was characterized by high oil to wet mesocarp and oil to fruit. The desired traits that have the largest positive correlation with the extracted PCs could be utilized for the improvement of the oil palm breeding program. The populations from different clusters with the highest cluster means could be used for hybridization. The information from this study can be utilized for effective conservation and selection of the MPOB-Senegal oil palm germplasm for future breeding programs.
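
The component-retention rule used in this abstract (keep components with eigenvalue > 1, then report each one's share of total variation) can be sketched as below. The eigenvalues are illustrative, back-calculated from the reported percentages under the assumption of 20 standardized traits (total variance 20). They are not the study's actual eigenvalues, and the function name `kaiser_retained` is hypothetical.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Return (component number, eigenvalue, share of total variance) for eigenvalues above threshold."""
    total = sum(eigenvalues)
    return [
        (i + 1, ev, ev / total)
        for i, ev in enumerate(eigenvalues)
        if ev > threshold
    ]


# Illustrative eigenvalues only: four large values chosen to match shares of
# 44%, 19%, 18% and 12%, plus 16 small ones so that the total equals 20.
eigenvalues = [8.8, 3.8, 3.6, 2.4] + [0.0875] * 16

retained = kaiser_retained(eigenvalues)
cumulative = sum(share for _, _, share in retained)
for number, ev, share in retained:
    print(f"PC{number}: eigenvalue = {ev:.2f}, share = {share:.0%}")
print(f"cumulative share = {cumulative:.0%}")
```

With standardized traits each variable contributes one unit of variance, so an eigenvalue above 1 marks a component that explains more than any single original trait.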

Keywords: cluster analysis, genetic variability, germplasm, oil palm, principal component analysis

Procedia PDF Downloads 138
28365 Rural Households’ Resilience to Food Insecurity in Niger

Authors: Aboubakr Gambo, Adama Diaw, Tobias Wunscher

Abstract:

This study attempts to identify factors affecting rural households’ resilience to food insecurity in Niger. For this, we first create a resilience index by applying Principal Component Analysis to the following five household-level variables: income, food expenditure, duration of grain held in stock, livestock in Tropical Livestock Units, and number of farms exploited. Second, we apply Structural Equation Modelling to identify the determinants. Data from the 2010 National Survey on Households’ Vulnerability to Food Insecurity conducted by the National Institute of Statistics are used. The study shows that asset and social safety net indicators are significant and have a positive impact on households’ resilience. Climate change, approximated by long-term mean rainfall, has a negative and significant effect on households’ resilience to food insecurity. The results indicate that to strengthen households’ resilience to food insecurity, there is a need to increase assistance to households through social safety nets and to help them gather more resources in order to acquire more assets. Furthermore, early warning of climatic events could alert households, especially farmers, to be prepared and avoid the important losses that they experience whenever an adverse climatic event occurs.
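
A PCA-based index of this kind is commonly built by standardizing the variables and scoring each household on the first principal component. The sketch below follows that pattern with a plain power iteration. The household values, the min-max rescaling to [0, 1], and the function names are assumptions for illustration, not the authors' procedure or data.

```python
import math


def standardize(rows):
    """Z-score each column of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def first_component(z, iterations=500):
    """Leading eigenvector of the correlation matrix of z, found by power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)] for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # start vector; assumed not orthogonal to the leading eigenvector
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def resilience_index(rows):
    """Score each row on the first principal component and rescale scores to [0, 1]."""
    z = standardize(rows)
    loadings = first_component(z)
    if sum(loadings) < 0:  # fix the arbitrary sign so that higher means more resilient
        loadings = [-x for x in loadings]
    scores = [sum(l * x for l, x in zip(loadings, r)) for r in z]
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical households: income, food expenditure, months of grain stock,
# Tropical Livestock Units, number of farms. Not survey data.
households = [
    [120, 60, 2, 0.5, 1],
    [300, 110, 5, 2.0, 2],
    [80, 50, 1, 0.2, 1],
    [450, 150, 8, 4.0, 3],
    [200, 90, 4, 1.5, 2],
    [150, 70, 3, 1.0, 1],
]
index = resilience_index(households)
print([round(v, 2) for v in index])
```

Standardizing first keeps variables measured in large units, such as income, from dominating the component simply because of their scale.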

Keywords: food insecurity, principal component analysis, structural equation modelling, resilience

Procedia PDF Downloads 332
28364 Monitoring Blood Pressure Using Regression Techniques

Authors: Qasem Qananwah, Ahmad Dagamseh, Hiam AlQuran, Khalid Shaker Ibrahim

Abstract:

Blood pressure greatly helps physicians to gain deep insight into the cardiovascular system. The determination of individual blood pressure is a standard clinical procedure for cardiovascular system problems. Conventional techniques for measuring blood pressure (e.g., the cuff method) allow only a limited number of readings in a given period (e.g., every 5-10 minutes). Additionally, these systems cause turbulence in the blood flow, impeding continuous blood pressure monitoring, especially in emergency cases or for critically ill persons. In this paper, the most important statistical features in photoplethysmogram (PPG) signals were extracted to estimate blood pressure noninvasively. PPG signals from more than 40 subjects were measured and analyzed, and 12 features were extracted. The features were fed to principal component analysis (PCA) to find the most important independent features that have the highest correlation with blood pressure. The results show that the stiffness index mean and the standard deviation of the beat-to-beat heart rate were the most important features. A model representing both features for Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) was obtained using a statistical regression technique. Surface fitting was used to best fit the series of data, and the results show that the error in estimating the SBP is 4.95% and in estimating the DBP is 3.99%.

Keywords: blood pressure, noninvasive optical system, principal component analysis, PCA, continuous monitoring

Procedia PDF Downloads 126
28363 Chemometric Determination of the Geographical Origin of Milk Samples in Malaysia

Authors: Shima Behkami, Nor Shahirul Umirah Idris, Sharifuddin Md. Zain, Kah Hin Low, Mehrdad Gholami, Nima A. Behkami, Ahmad Firdaus Kamaruddin

Abstract:

In this work, Inductively Coupled Plasma Mass Spectrometry (ICP-MS), Isotopic Ratio Mass Spectrometry (IRMS) and Ultrasound Milko Tester were used to study milk samples obtained from various geographical locations in Malaysia. ICP-MS was used to determine the concentration of trace elements in milk, water and soil samples obtained from seven dairy farms at different geographical locations in peninsular Malaysia. IRMS was used to analyze the milk samples for isotopic ratios of δ13C, 15N and 18O. Nutritional parameters in the milk samples were determined using an ultrasound milko tester. Data obtained from these measurements were evaluated by Principal Component Analysis (PCA) and Hierarchical Analysis (HA) as a preliminary step in determining geographical origin of these milk samples. It is observed that the isotopic ratios and a number of the nutritional parameters are responsible for the discrimination of the samples. It was also observed that it is possible to determine the geographical origin of these milk samples solely by the isotopic ratios of δ13C, 15N and 18O. The accuracy of the geographical discrimination is demonstrated when several milk samples from a milk factory taken from one of the regions under study were appropriately assigned to the correct PCA cluster.

Keywords: inductively coupled plasma mass spectroscopy ICP-MS, isotope ratio mass spectroscopy IRMS, ultrasound, principal component analysis, hierarchical analysis, geographical origin, milk

Procedia PDF Downloads 336