Search results for: Unsupervised Clustering
99 Neural Network Optimal Power Flow(NN-OPF) based on IPSO with Developed Load Cluster Method
Authors: Mat Syai'in, Adi Soeprijanto
Abstract:
An Optimal Power Flow based on Improved Particle Swarm Optimization (OPF-IPSO) with Generator Capability Curve Constraint is used by NN-OPF as a reference to get pattern of generator scheduling. There are three stages in Designing NN-OPF. The first stage is design of OPF-IPSO with generator capability curve constraint. The second stage is clustering load to specific range and calculating its index. The third stage is training NN-OPF using constructive back propagation method. In training process total load and load index used as input, and pattern of generator scheduling used as output. Data used in this paper is power system of Java-Bali. Software used in this simulation is MATLAB.Keywords: Optimal Power Flow, Generator Capability Curve, Improved Particle Swarm Optimization, Neural Network
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 195198 An Improved C-Means Model for MRI Segmentation
Authors: Ying Shen, Weihua Zhu
Abstract:
Medical images are important to help identifying different diseases, for example, Magnetic resonance imaging (MRI) can be used to investigate the brain, spinal cord, bones, joints, breasts, blood vessels, and heart. Image segmentation, in medical image analysis, is usually the first step to find out some characteristics with similar color, intensity or texture so that the diagnosis could be further carried out based on these features. This paper introduces an improved C-means model to segment the MRI images. The model is based on information entropy to evaluate the segmentation results by achieving global optimization. Several contributions are significant. Firstly, Genetic Algorithm (GA) is used for achieving global optimization in this model where fuzzy C-means clustering algorithm (FCMA) is not capable of doing that. Secondly, the information entropy after segmentation is used for measuring the effectiveness of MRI image processing. Experimental results show the outperformance of the proposed model by comparing with traditional approaches.
Keywords: Magnetic Resonance Image, C-means model, image segmentation, information entropy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 91897 A Study of Gaps in CBMIR Using Different Methods and Prospective
Authors: Pradeep Singh, Sukhwinder Singh, Gurjinder Kaur
Abstract:
In recent years, rapid advances in software and hardware in the field of information technology along with a digital imaging revolution in the medical domain facilitate the generation and storage of large collections of images by hospitals and clinics. To search these large image collections effectively and efficiently poses significant technical challenges, and it raises the necessity of constructing intelligent retrieval systems. Content-based Image Retrieval (CBIR) consists of retrieving the most visually similar images to a given query image from a database of images[5]. Medical CBIR (content-based image retrieval) applications pose unique challenges but at the same time offer many new opportunities. On one hand, while one can easily understand news or sports videos, a medical image is often completely incomprehensible to untrained eyes.
Keywords: Classification, clustering, content-based image retrieval (CBIR), relevance feedback (RF), statistical similarity matching, support vector machine (SVM).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 178796 Neuro-fuzzy Model and Regression Model a Comparison Study of MRR in Electrical Discharge Machining of D2 Tool Steel
Authors: M. K. Pradhan, C. K. Biswas,
Abstract:
In the current research, neuro-fuzzy model and regression model was developed to predict Material Removal Rate in Electrical Discharge Machining process for AISI D2 tool steel with copper electrode. Extensive experiments were conducted with various levels of discharge current, pulse duration and duty cycle. The experimental data are split into two sets, one for training and the other for validation of the model. The training data were used to develop the above models and the test data, which was not used earlier to develop these models were used for validation the models. Subsequently, the models are compared. It was found that the predicted and experimental results were in good agreement and the coefficients of correlation were found to be 0.999 and 0.974 for neuro fuzzy and regression model respectively
Keywords: Electrical discharge machining, material removal rate, neuro-fuzzy model, regression model, mountain clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 138995 Performance and Emission Prediction in a Biodiesel Engine Fuelled with Honge Methyl Ester Using RBF Neural Networks
Authors: Shivakumar, G. S. Vijay, P. Srinivas Pai, B. R. Shrinivasa Rao
Abstract:
In the present study, RBF neural networks were used for predicting the performance and emission parameters of a biodiesel engine. Engine experiments were carried out in a 4 stroke diesel engine using blends of diesel and Honge methyl ester as the fuel. Performance parameters like BTE, BSEC, Tex and emissions from the engine were measured. These experimental results were used for ANN modeling. RBF center initialization was done by random selection and by using Clustered techniques. Network was trained by using fixed and varying widths for the RBF units. It was observed that RBF results were having a good agreement with the experimental results. Networks trained by using clustering technique gave better results than using random selection of centers in terms of reduced MRE and increased prediction accuracy. The average MRE for the performance parameters was 3.25% with the prediction accuracy of 98% and for emissions it was 10.4% with a prediction accuracy of 80%.Keywords: Radial Basis Function networks, emissions, Performance parameters, Fuzzy c means.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 172994 Registration Management System for the First Access to a Public Moroccan Institution: Case Sultan Moulay Slimane University, Beni Mellal
Authors: Khalid Ghoulam, Belaid Bouikhalene, Zakaria Harmouch, Hicham Mouncif
Abstract:
One of the essential topics in the information systems is the registration management. The objective of this project is to create a web portal designed to help new students on the first access to the Sultan Moulay Slimane University SMSU (Practical Information, Pre-Registration, Placement Test, Terms of use ... etc.) while creating a secure space protecting both data from the institutions of the University and student information. This portal is accessible from any computer connected to the Internet inside and outside the campus. In this work, we present a platform on the first access to the SMSU which is essential for authentication in the digital work space of the university. This platform allows university to make better decisions for students clustering, to avoid traditional manual method, and to reduce the cost in human and material resources.
Keywords: Registration, SMSU, Security, FAUSMS, digital work space, Placement test.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 113093 A Heuristics Approach for Fast Detecting Suspicious Money Laundering Cases in an Investment Bank
Authors: Nhien-An Le-Khac, Sammer Markos, M-Tahar Kechadi
Abstract:
Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most international financial institutions have been implementing anti-money laundering solutions (AML) to fight investment fraud. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered as well-suited techniques for detecting ML activities. Within the scope of a collaboration project for the purpose of developing a new solution for the AML Units in an international investment bank, we proposed a data mining-based solution for AML. In this paper, we present a heuristics approach to improve the performance for this solution. We also show some preliminary results associated with this method on analysing transaction datasets.Keywords: data mining, anti money laundering, clustering, heuristics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 358592 Automatic Facial Skin Segmentation Using Possibilistic C-Means Algorithm for Evaluation of Facial Surgeries
Authors: Elham Alaee, Mousa Shamsi, Hossein Ahmadi, Soroosh Nazem, Mohammadhossein Sedaaghi
Abstract:
Human face has a fundamental role in the appearance of individuals. So the importance of facial surgeries is undeniable. Thus, there is a need for the appropriate and accurate facial skin segmentation in order to extract different features. Since Fuzzy CMeans (FCM) clustering algorithm doesn’t work appropriately for noisy images and outliers, in this paper we exploit Possibilistic CMeans (PCM) algorithm in order to segment the facial skin. For this purpose, first, we convert facial images from RGB to YCbCr color space. To evaluate performance of the proposed algorithm, the database of Sahand University of Technology, Tabriz, Iran was used. In order to have a better understanding from the proposed algorithm; FCM and Expectation-Maximization (EM) algorithms are also used for facial skin segmentation. The proposed method shows better results than the other segmentation methods. Results include misclassification error (0.032) and the region’s area error (0.045) for the proposed algorithm.
Keywords: Facial image, segmentation, PCM, FCM, skin error, facial surgery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 199091 A Study on Finding Similar Document with Multiple Categories
Authors: R. Saraçoğlu, N. Allahverdi
Abstract:
Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.
Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 170790 Collaborative and Content-based Recommender System for Social Bookmarking Website
Authors: Cheng-Lung Huang, Cheng-Wei Lin
Abstract:
This study proposes a new recommender system based on the collaborative folksonomy. The purpose of the proposed system is to recommend Internet resources (such as books, articles, documents, pictures, audio and video) to users. The proposed method includes four steps: creating the user profile based on the tags, grouping the similar users into clusters using an agglomerative hierarchical clustering, finding similar resources based on the user-s past collections by using content-based filtering, and recommending similar items to the target user. This study examines the system-s performance for the dataset collected from “del.icio.us," which is a famous social bookmarking website. Experimental results show that the proposed tag-based collaborative and content-based filtering hybridized recommender system is promising and effectiveness in the folksonomy-based bookmarking website.
Keywords: Collaborative recommendation, Folksonomy, Social tagging
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 224889 A Survey of Business Component Identification Methods and Related Techniques
Authors: Zhongjie Wang, Xiaofei Xu, Dechen Zhan
Abstract:
With deep development of software reuse, componentrelated technologies have been widely applied in the development of large-scale complex applications. Component identification (CI) is one of the primary research problems in software reuse, by analyzing domain business models to get a set of business components with high reuse value and good reuse performance to support effective reuse. Based on the concept and classification of CI, its technical stack is briefly discussed from four views, i.e., form of input business models, identification goals, identification strategies, and identification process. Then various CI methods presented in literatures are classified into four types, i.e., domain analysis based methods, cohesion-coupling based clustering methods, CRUD matrix based methods, and other methods, with the comparisons between these methods for their advantages and disadvantages. Additionally, some insufficiencies of study on CI are discussed, and the causes are explained subsequently. Finally, it is concluded with some significantly promising tendency about research on this problem.Keywords: Business component, component granularity, component identification, reuse performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 197488 Handling Mobility using Virtual Grid in Static Wireless Sensor Networks
Authors: T.P. Sharma
Abstract:
Querying a data source and routing data towards sink becomes a serious challenge in static wireless sensor networks if sink and/or data source are mobile. Many a times the event to be observed either moves or spreads across wide area making maintenance of continuous path between source and sink a challenge. Also, sink can move while query is being issued or data is on its way towards sink. In this paper, we extend our already proposed Grid Based Data Dissemination (GBDD) scheme which is a virtual grid based topology management scheme restricting impact of movement of sink(s) and event(s) to some specific cells of a grid. This obviates the need for frequent path modifications and hence maintains continuous flow of data while minimizing the network energy consumptions. Simulation experiments show significant improvements in network energy savings and average packet delay for a packet to reach at sink.Keywords: Mobility in WSNs, virtual grid, GBDD, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 155087 A Study of Dynamic Clustering Method to Extend the Lifetime of Wireless Sensor Network
Authors: Wernhuar Tarng, Kun-Jie Huang, Li-Zhong Deng, Kun-Rong Hsie, Mingteh Chen
Abstract:
In recent years, the research in wireless sensor network has increased steadily, and many studies were focusing on reducing energy consumption of sensor nodes to extend their lifetimes. In this paper, the issue of energy consumption is investigated and two adaptive mechanisms are proposed to extend the network lifetime. This study uses high-energy-first scheme to determine cluster heads for data transmission. Thus, energy consumption in each cluster is balanced and network lifetime can be extended. In addition, this study uses cluster merging and dynamic routing mechanisms to further reduce energy consumption during data transmission. The simulation results show that the proposed method can effectively extend the lifetime of wireless sensor network, and it is suitable for different base station locations.Keywords: Wireless sensor network, high-energy-first scheme, adaptive mechanisms, network lifetime
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 153886 Neural Network based Texture Analysis of Liver Tumor from Computed Tomography Images
Authors: K.Mala, V.Sadasivam, S.Alagappan
Abstract:
Advances in clinical medical imaging have brought about the routine production of vast numbers of medical images that need to be analyzed. As a result an enormous amount of computer vision research effort has been targeted at achieving automated medical image analysis. Computed Tomography (CT) is highly accurate for diagnosing liver tumors. This study aimed to evaluate the potential role of the wavelet and the neural network in the differential diagnosis of liver tumors in CT images. The tumors considered in this study are hepatocellular carcinoma, cholangio carcinoma, hemangeoma and hepatoadenoma. Each suspicious tumor region was automatically extracted from the CT abdominal images and the textural information obtained was used to train the Probabilistic Neural Network (PNN) to classify the tumors. Results obtained were evaluated with the help of radiologists. The system differentiates the tumor with relatively high accuracy and is therefore clinically useful.
Keywords: Fuzzy c means clustering, texture analysis, probabilistic neural network, LVQ neural network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 298885 Space Telemetry Anomaly Detection Based on Statistical PCA Algorithm
Authors: B. Nassar, W. Hussein, M. Mokhtar
Abstract:
The critical concern of satellite operations is to ensure the health and safety of satellites. The worst case in this perspective is probably the loss of a mission, but the more common interruption of satellite functionality can result in compromised mission objectives. All the data acquiring from the spacecraft are known as Telemetry (TM), which contains the wealth information related to the health of all its subsystems. Each single item of information is contained in a telemetry parameter, which represents a time-variant property (i.e. a status or a measurement) to be checked. As a consequence, there is a continuous improvement of TM monitoring systems to reduce the time required to respond to changes in a satellite's state of health. A fast conception of the current state of the satellite is thus very important to respond to occurring failures. Statistical multivariate latent techniques are one of the vital learning tools that are used to tackle the problem above coherently. Information extraction from such rich data sources using advanced statistical methodologies is a challenging task due to the massive volume of data. To solve this problem, in this paper, we present a proposed unsupervised learning algorithm based on Principle Component Analysis (PCA) technique. The algorithm is particularly applied on an actual remote sensing spacecraft. Data from the Attitude Determination and Control System (ADCS) was acquired under two operation conditions: normal and faulty states. The models were built and tested under these conditions, and the results show that the algorithm could successfully differentiate between these operations conditions. Furthermore, the algorithm provides competent information in prediction as well as adding more insight and physical interpretation to the ADCS operation.Keywords: Space telemetry monitoring, multivariate analysis, PCA algorithm, space operations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 206284 Dempster-Shafer Evidence Theory for Image Segmentation: Application in Cells Images
Authors: S. Ben Chaabane, M. Sayadi, F. Fnaiech, E. Brassart
Abstract:
In this paper we propose a new knowledge model using the Dempster-Shafer-s evidence theory for image segmentation and fusion. The proposed method is composed essentially of two steps. First, mass distributions in Dempster-Shafer theory are obtained from the membership degrees of each pixel covering the three image components (R, G and B). Each membership-s degree is determined by applying Fuzzy C-Means (FCM) clustering to the gray levels of the three images. Second, the fusion process consists in defining three discernment frames which are associated with the three images to be fused, and then combining them to form a new frame of discernment. The strategy used to define mass distributions in the combined framework is discussed in detail. The proposed fusion method is illustrated in the context of image segmentation. Experimental investigations and comparative studies with the other previous methods are carried out showing thus the robustness and superiority of the proposed method in terms of image segmentation.Keywords: Fuzzy C-means, Color image, data fusion, Dempster-Shafer's evidence theory
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 220083 Evaluation of Robust Feature Descriptors for Texture Classification
Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo
Abstract:
Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scaling-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we have tried to compare the performance for texture classification using these feature descriptors with k-means clustering. Different classifiers including K-NN, Naive Bayes, Back Propagation Neural Network , Decision Tree and Kstar were applied in three texture image sets - UIUCTex, KTH-TIPS and Brodatz, respectively. Experimental results reveal SIFTS as the best average accuracy rate holder in UIUCTex, KTH-TIPS and SURF is advantaged in Brodatz texture set. BP neuro network works best in the test set classification among all used classifiers.Keywords: Texture classification, texture descriptor, SIFT, SURF, ORB.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 160082 A New Approach for Network Reconfiguration Problem in Order to Deviation Bus Voltage Minimization with Regard to Probabilistic Load Model and DGs
Authors: Mahmood Reza Shakarami, Reza Sedaghati
Abstract:
Recently, distributed generation technologies have received much attention for the potential energy savings and reliability assurances that might be achieved as a result of their widespread adoption. The distribution feeder reconfiguration (DFR) is one of the most important control schemes in the distribution networks, which can be affected by DGs. This paper presents a new approach to DFR at the distribution networks considering wind turbines. The main objective of the DFR is to minimize the deviation of the bus voltage. Since the DFR is a nonlinear optimization problem, we apply the Adaptive Modified Firefly Optimization (AMFO) approach to solve it. As a result of the conflicting behavior of the single- objective function, a fuzzy based clustering technique is employed to reach the set of optimal solutions called Pareto solutions. The approach is tested on the IEEE 32-bus standard test system.
Keywords: Adaptive Modified Firefly Optimization (AMFO), Pareto solutions, feeder reconfiguration, wind turbines, bus voltage.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 201781 A Machine Learning-based Analysis of Autism Prevalence Rates across US States against Multiple Potential Explanatory Variables
Authors: Ronit Chakraborty, Sugata Banerji
Abstract:
There has been a marked increase in the reported prevalence of Autism Spectrum Disorder (ASD) among children in the US over the past two decades. This research has analyzed the growth in state-level ASD prevalence against 45 different potentially explanatory factors including socio-economic, demographic, healthcare, public policy and political factors. The goal was to understand if these factors have adequate predictive power in modeling the differential growth in ASD prevalence across various states, and, if they do, which factors are the most influential. The key findings of this study include (1) there is a confirmation that the chosen feature set has considerable power in predicting the growth in ASD prevalence, (2) the most influential predictive factors are identified, (3) given the nature of the most influential predictive variables, an indication that a considerable portion of the reported ASD prevalence differentials across states could be attributable to over and under diagnosis, and (4) Florida is identified as a key outlier state pointing to a potential under-diagnosis of ASD.
Keywords: Autism Spectrum Disorder, ASD, clustering, Machine Learning, predictive modeling.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 67180 Normalizing Flow to Augmented Posterior: Conditional Density Estimation with Interpretable Dimension Reduction for High Dimensional Data
Authors: Cheng Zeng, George Michailidis, Hitoshi Iyatomi, Leo L Duan
Abstract:
The conditional density characterizes the distribution of a response variable y given other predictor x, and plays a key role in many statistical tasks, including classification and outlier detection. Although there has been abundant work on the problem of Conditional Density Estimation (CDE) for a low-dimensional response in the presence of a high-dimensional predictor, little work has been done for a high-dimensional response such as images. The promising performance of normalizing flow (NF) neural networks in unconditional density estimation acts a motivating starting point. In this work, we extend NF neural networks when external x is present. Specifically, they use the NF to parameterize a one-to-one transform between a high-dimensional y and a latent z that comprises two components [zP , zN]. The zP component is a low-dimensional subvector obtained from the posterior distribution of an elementary predictive model for x, such as logistic/linear regression. The zN component is a high-dimensional independent Gaussian vector, which explains the variations in y not or less related to x. Unlike existing CDE methods, the proposed approach, coined Augmented Posterior CDE (AP-CDE), only requires a simple modification on the common normalizing flow framework, while significantly improving the interpretation of the latent component, since zP represents a supervised dimension reduction. In image analytics applications, AP-CDE shows good separation of x-related variations due to factors such as lighting condition and subject id, from the other random variations. Further, the experiments show that an unconditional NF neural network, based on an unsupervised model of z, such as Gaussian mixture, fails to generate interpretable results.
Keywords: Conditional density estimation, image generation, normalizing flow, supervised dimension reduction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16579 Denoising by Spatial Domain Averaging for Wireless Local Area Network Terminal Localization
Authors: Diego Felix, Eugene Hyun, Michael McGuire, Mihai Sima
Abstract:
Terminal localization for indoor Wireless Local Area Networks (WLANs) is critical for the deployment of location-aware computing inside of buildings. A major challenge is obtaining high localization accuracy in presence of fluctuations of the received signal strength (RSS) measurements caused by multipath fading. This paper focuses on reducing the effect of the distance-varying noise by spatial filtering of the measured RSS. Two different survey point geometries are tested with the noise reduction technique: survey points arranged in sets of clusters and survey points uniformly distributed over the network area. The results show that the location accuracy improves by 16% when the filter is used and by 18% when the filter is applied to a clustered survey set as opposed to a straight-line survey set. The estimated locations are within 2 m of the true location, which indicates that clustering the survey points provides better localization accuracy due to superior noise removal.Keywords: Position measurement, Wireless LAN, Radio navigation, Filtering
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 152178 Human Action Recognition Based on Ridgelet Transform and SVM
Authors: A. Ouanane, A. Serir
Abstract:
In this paper, a novel algorithm based on Ridgelet Transform and support vector machine is proposed for human action recognition. The Ridgelet transform is a directional multi-resolution transform and it is more suitable for describing the human action by performing its directional information to form spatial features vectors. The dynamic transition between the spatial features is carried out using both the Principal Component Analysis and clustering algorithm K-means. First, the Principal Component Analysis is used to reduce the dimensionality of the obtained vectors. Then, the kmeans algorithm is then used to perform the obtained vectors to form the spatio-temporal pattern, called set-of-labels, according to given periodicity of human action. Finally, a Support Machine classifier is used to discriminate between the different human actions. Different tests are conducted on popular Datasets, such as Weizmann and KTH. The obtained results show that the proposed method provides more significant accuracy rate and it drives more robustness in very challenging situations such as lighting changes, scaling and dynamic environmentKeywords: Human action, Ridgelet Transform, PCA, K-means, SVM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 207077 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia
Authors: Carol Anne Hargreaves
Abstract:
A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.
Keywords: Machine learning, stock market trading, logistic principal component analysis, automated stock investment system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 109876 Construction of cDNALibrary and EST Analysis of Tenebriomolitorlarvae
Authors: JiEun Jeong, Se-Won Kang, Hee-Ju Hwang, Sung-Hwa Chae, Sang-Haeng Choi, Hong-SeogPark, YeonSoo Han, Bok-Reul Lee, Dae-Hyun Seog, Yong Seok Lee
Abstract:
Tofurther advance research on immune-related genes from T. molitor, we constructed acDNA library and analyzed expressed sequence taq (EST) sequences from 1,056 clones. After removing vector sequence and quality checkingthrough thePhred program (trim_alt 0.05 (P-score>20), 1039 sequences were generated. The average length of insert was 792 bp. In addition, we identified 162 clusters, 167 contigs and 391 contigs after clustering and assembling process using a TGICL package. EST sequences were searchedagainst NCBI nr database by local BLAST (blastx, E75 Research of Potential Cluster Development in Pannonian Croatia
Authors: Mirjana Radman-Funarić, Katarina Potnik Galić
Abstract:
The paper presents an analysis of linkages and structures of co-operation and their intensity like the potential for the establishment of clusters in the Central and Eastern (Pannonian) Croatian. Starting from the theoretical elaboration of the need for entrepreneurs to organize through the cluster model and the terms of their self-actualization, related to the importance of traditional values in terms of benefits, social capital and assess where the company now is, in order to prove the need to create their own identity in terms of clustering. The institutional dimensions of social capital where the public sector has the best role in creating the social structure of clusters, and social dimensions of social capital in terms of trust, cooperation and networking will be analyzed to what extent the trust and coherency are present between companies in the Brod posavina and Pozega slavonia County, expressed through the readiness of inclusion in clusters in the NUTS II region - Central and Eastern (Pannonian) Croatia, as a homogeneous economic entity, with emphasis on limiting factors that stand in the way of greater competitiveness.Keywords: Analysis of linkages, structures of co-operation, Cluster, Region
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 186774 Feature Selection with Kohonen Self Organizing Classification Algorithm
Authors: Francesco Maiorana
Abstract:
In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 305273 An Energy Efficient Cluster Formation Protocol with Low Latency In Wireless Sensor Networks
Authors: A. Allirani, M. Suganthi
Abstract:
Data gathering is an essential operation in wireless sensor network applications. So it requires energy efficiency techniques to increase the lifetime of the network. Similarly, clustering is also an effective technique to improve the energy efficiency and network lifetime of wireless sensor networks. In this paper, an energy efficient cluster formation protocol is proposed with the objective of achieving low energy dissipation and latency without sacrificing application specific quality. The objective is achieved by applying randomized, adaptive, self-configuring cluster formation and localized control for data transfers. It involves application - specific data processing, such as data aggregation or compression. The cluster formation algorithm allows each node to make independent decisions, so as to generate good clusters as the end. Simulation results show that the proposed protocol utilizes minimum energy and latency for cluster formation, there by reducing the overhead of the protocol.Keywords: Sensor networks, Low latency, Energy sorting protocol, data processing, Cluster formation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 274172 A New Hybrid RMN Image Segmentation Algorithm
Authors: Abdelouahab Moussaoui, Nabila Ferahta, Victor Chen
Abstract:
The development of aid's systems for the medical diagnosis is not easy thing because of presence of inhomogeneities in the MRI, the variability of the data from a sequence to the other as well as of other different source distortions that accentuate this difficulty. A new automatic, contextual, adaptive and robust segmentation procedure by MRI brain tissue classification is described in this article. A first phase consists in estimating the density of probability of the data by the Parzen-Rozenblatt method. The classification procedure is completely automatic and doesn't make any assumptions nor on the clusters number nor on the prototypes of these clusters since these last are detected in an automatic manner by an operator of mathematical morphology called skeleton by influence zones detection (SKIZ). The problem of initialization of the prototypes as well as their number is transformed in an optimization problem; in more the procedure is adaptive since it takes in consideration the contextual information presents in every voxel by an adaptive and robust non parametric model by the Markov fields (MF). The number of bad classifications is reduced by the use of the criteria of MPM minimization (Maximum Posterior Marginal).Keywords: Clustering, Automatic Classification, SKIZ, MarkovFields, Image segmentation, Maximum Posterior Marginal (MPM).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 141271 A New Hybrid K-Mean-Quick Reduct Algorithm for Gene Selection
Authors: E. N. Sathishkumar, K. Thangavel, T. Chandrasekhar
Abstract:
Feature selection is a process to select features which are more informative. It is one of the important steps in knowledge discovery. The problem is that all genes are not important in gene expression data. Some of the genes may be redundant, and others may be irrelevant and noisy. Here a novel approach is proposed Hybrid K-Mean-Quick Reduct (KMQR) algorithm for gene selection from gene expression data. In this study, the entire dataset is divided into clusters by applying K-Means algorithm. Each cluster contains similar genes. The high class discriminated genes has been selected based on their degree of dependence by applying Quick Reduct algorithm to all the clusters. Average Correlation Value (ACV) is calculated for the high class discriminated genes. The clusters which have the ACV value as 1 is determined as significant clusters, whose classification accuracy will be equal or high when comparing to the accuracy of the entire dataset. The proposed algorithm is evaluated using WEKA classifiers and compared. The proposed work shows that the high classification accuracy.
Keywords: Clustering, Gene Selection, K-Mean-Quick Reduct, Rough Sets.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 229870 Medical Image Segmentation and Detection of MR Images Based on Spatial Multiple-Kernel Fuzzy C-Means Algorithm
Authors: J. Mehena, M. C. Adhikary
Abstract:
In this paper, a spatial multiple-kernel fuzzy C-means (SMKFCM) algorithm is introduced for segmentation problem. A linear combination of multiples kernels with spatial information is used in the kernel FCM (KFCM) and the updating rules for the linear coefficients of the composite kernels are derived as well. Fuzzy cmeans (FCM) based techniques have been widely used in medical image segmentation problem due to their simplicity and fast convergence. The proposed SMKFCM algorithm provides us a new flexible vehicle to fuse different pixel information in medical image segmentation and detection of MR images. To evaluate the robustness of the proposed segmentation algorithm in noisy environment, we add noise in medical brain tumor MR images and calculated the success rate and segmentation accuracy. From the experimental results it is clear that the proposed algorithm has better performance than those of other FCM based techniques for noisy medical MR images.Keywords: Clustering, fuzzy C-means, image segmentation, MR images, multiple kernels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2129