Search results for: statistical classifiers.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1345

1045 Space Telemetry Anomaly Detection Based on Statistical PCA Algorithm

Authors: B. Nassar, W. Hussein, M. Mokhtar

Abstract:

The critical concern of satellite operations is to ensure the health and safety of satellites. The worst case in this respect is probably the loss of a mission, but the more common interruption of satellite functionality can also compromise mission objectives. All the data acquired from the spacecraft are known as telemetry (TM), which contains a wealth of information related to the health of all its subsystems. Each single item of information is contained in a telemetry parameter, which represents a time-variant property (i.e. a status or a measurement) to be checked. As a consequence, TM monitoring systems are continuously improved to reduce the time required to respond to changes in a satellite's state of health. A fast understanding of the current state of the satellite is thus very important for responding to failures as they occur. Statistical multivariate latent techniques are among the vital learning tools used to tackle this problem coherently. Information extraction from such rich data sources using advanced statistical methodologies is a challenging task due to the massive volume of data. To solve this problem, we present an unsupervised learning algorithm based on the Principal Component Analysis (PCA) technique. The algorithm is applied to an actual remote sensing spacecraft. Data from the Attitude Determination and Control System (ADCS) were acquired under two operating conditions: normal and faulty states. The models were built and tested under these conditions, and the results show that the algorithm could successfully differentiate between these operating conditions. Furthermore, the algorithm provides useful information for prediction and adds insight and physical interpretation to the ADCS operation.
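
As an illustration of the general idea (not the authors' algorithm, whose details the abstract does not give), PCA-based anomaly detection can be sketched as follows: fit a principal direction on normal data, then flag samples whose reconstruction error is unusually large. The synthetic two-channel data, the use of a single component, and the max-residual threshold are all assumptions made for this sketch.

```python
import random

def mean_center(rows):
    """Subtract the per-column mean; return the centered rows and the means."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means

def first_component(centered, iters=200):
    """Leading eigenvector of the sample covariance matrix, by power iteration."""
    n, d = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def residual(sample, means, v):
    """Squared reconstruction error after projecting the sample onto v."""
    x = [s - m for s, m in zip(sample, means)]
    score = sum(a * b for a, b in zip(x, v))
    return sum((a - score * b) ** 2 for a, b in zip(x, v))

# Synthetic "normal" telemetry (assumption): two channels that move together.
rng = random.Random(0)
normal = []
for _ in range(200):
    t = rng.gauss(0.0, 1.0)
    normal.append([t + rng.gauss(0.0, 0.05), 2 * t + rng.gauss(0.0, 0.05)])

centered, means = mean_center(normal)
pc1 = first_component(centered)

# Assumed threshold: the largest residual seen in the normal training data.
threshold = max(residual(s, means, pc1) for s in normal)

def is_anomalous(sample):
    """Flag a sample whose residual exceeds the training threshold."""
    return residual(sample, means, pc1) > threshold
```

A sample such as `[1.0, 2.0]` follows the learned correlation and is accepted, while `[1.0, -2.0]` breaks it and is flagged. A real telemetry model would retain several components and use a statistically derived control limit.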

Keywords: Space telemetry monitoring, multivariate analysis, PCA algorithm, space operations.

Downloads: 2034
1044 Transmitting a Distance Training Model to the Community in the Upper Northeastern Region

Authors: Teerawach Khamkorn, Laongtip Mathurasa, Savittree Rochanasmita Arnold, Witthaya Mekhum

Abstract:

This research seeks to transmit a distance training model to the community in the upper northeastern region. The sample group consists of 60 community leaders in the municipality of Kumphawapi sub-district, Kumphawapi District, Udonthani Province. The research instruments are: 1) an achievement test for the community leaders' training and 2) a satisfaction questionnaire for community leaders. The statistics used in data analysis are the mean, percentage, standard deviation, and t-test. The findings reveal that: 1) the distance training developed by the researcher was efficient, with the community leaders' average in-training and post-training scores higher than the set criterion, 2) the two groups of participants achieved higher knowledge than in their pre-training state, 3) a comparison of achievement between the two groups showed no difference, and 4) the community leaders reported high-to-highest satisfaction.

Keywords: Distance Training, Management, Technology, Transmitting.

Downloads: 1277
1043 The Relationships between Market Orientation and Competitiveness of Companies in Banking Sector

Authors: P. Jangl, M. Mikuláštík

Abstract:

The objective of the paper is to measure and compare the market orientation of Swiss and Czech banks, and to examine statistically the degree of influence it has on the competitiveness of the institutions. The analysis of market orientation is based on the collection, analysis and correct interpretation of the data. A descriptive analysis of market orientation describes the current situation. Research into the relation between competitiveness and market orientation in the sector of big international banks is suggested, with the expectation that a strong relationship exists. In part, the work served to reconfirm the suitability of classic methodologies for measuring banks’ market orientation.

Two types of data were gathered: first, by measuring the subjectively perceived market orientation of a company and second, by quantifying its competitiveness. All data were collected from a sample of small, mid-sized and large banks. We used numerical secondary data from Bureau Van Dijk’s BANKSCOPE database of international financial statistics.

Statistical analysis led to the following results. First, assuming classical market orientation measures to be scientifically justified, Czech banks are statistically less market-oriented than Swiss banks. Second, among small Swiss banks, which are not broadly internationally active, a weak relationship exists between market orientation measures and market share based competitiveness measures. Third, among all Swiss banks, a strong relationship exists between market orientation measures and market share based competitiveness measures. These results imply the existence of a strong relationship for this measure in the sector of big international banks. A strong statistical relationship has also been shown to exist between market orientation measures and the equity/total assets ratio in Switzerland.

Keywords: Market Orientation, Competitiveness, Marketing Strategy, Measurement of Market Orientation, Relation between Market Orientation and Competitiveness, Banking Sector.

Downloads: 2750
1042 Statistical Modeling for Permeabilization of a Novel Yeast Isolate for β-Galactosidase Activity Using Organic Solvents

Authors: Shweta Kumari, Parmjit S. Panesar, Manab B. Bera

Abstract:

The hydrolysis of lactose using β-galactosidase is one of the most promising biotechnological applications, with a wide range of potential uses in the food processing industries. However, the intracellular location of the yeast enzyme and expensive extraction methods hamper the industrial application of enzymatic hydrolysis processes. Permeabilization can help to overcome the problems associated with enzyme extraction and purification from yeast cells and to develop an economically viable process for the use of whole-cell biocatalysts in the food industries. In the present investigation, the permeabilization process for a novel yeast isolate was standardized using a statistical modeling approach known as Response Surface Methodology (RSM) to achieve maximal β-galactosidase activity. The optimum operating conditions obtained by RSM were a 1:1 ratio of toluene (25%, v/v) and ethanol (50%, v/v), a temperature of 25.0 °C and a treatment time of 12 min, which gave an enzyme activity of 1.71 IU/mg DW.

Keywords: β-galactosidase, optimization, permeabilization, response surface methodology, yeast.

Downloads: 4107
1041 Analysis of Air Quality in the Outdoor Environment of the City of Messina by an Application of the Pollution Index Method

Authors: G. Cannistraro, L. Ponterio

Abstract:

This paper reports an analysis of outdoor air pollution in the urban centre of the city of Messina. The variations in the concentrations of the most critical pollutants (PM10, O3, CO, C6H6) and their trends with respect to climatic parameters and vehicular traffic have been studied. Linear regressions were performed to represent the relations among the pollutants; the differences between pollutant concentrations on weekends and weekdays were also analyzed. In order to evaluate air pollution and its effects on human health, a method for calculating a pollution index was implemented and applied in the urban centre of the city. This index is based on the weighted mean of the concentrations of the most detrimental air pollutants relative to their limit values for the protection of human health. The analyzed data on the polluting substances were collected by the Assessorship of the Environment of the Regional Province of Messina in the year 2004. A statistical analysis of the air quality index trends is also reported.

Keywords: Environmental pollution, Pollutants levels, Linear regression, Air Quality Index, Statistical analysis.
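
One plausible reading of such an index, offered only as a sketch, is a weighted mean of each pollutant's concentration divided by its limit value. The function name, the weights, and the numeric limits below are hypothetical placeholders, not the authors' values or any regulatory thresholds.

```python
def pollution_index(concentrations, limits, weights):
    """Weighted mean of concentration-to-limit ratios.

    All three arguments are dicts keyed by pollutant name. A value
    of 1.0 means that, on weighted average, pollutants sit exactly
    at their limit values.
    """
    total_weight = sum(weights[p] for p in concentrations)
    weighted_sum = sum(
        weights[p] * concentrations[p] / limits[p] for p in concentrations
    )
    return weighted_sum / total_weight

# Hypothetical example values (units must match between the two dicts).
example_concentrations = {"PM10": 25.0, "O3": 60.0}
example_limits = {"PM10": 50.0, "O3": 120.0}
example_weights = {"PM10": 1.0, "O3": 1.0}

example_index = pollution_index(
    example_concentrations, example_limits, example_weights
)
```

With these placeholder numbers each pollutant is at half its limit, so the index evaluates to 0.5.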

Downloads: 1755
1040 A Hybrid Metaheuristic Framework for Evolving the PROAFTN Classifier

Authors: Feras Al-Obeidat, Nabil Belacel, Juan A. Carretero, Prabhat Mahanti

Abstract:

In this paper, a new learning algorithm based on a hybrid metaheuristic integrating Differential Evolution (DE) and Reduced Variable Neighborhood Search (RVNS) is introduced to train the classification method PROAFTN. To apply PROAFTN, values of several parameters need to be determined prior to classification. These parameters include boundaries of intervals and relative weights for each attribute. Based on these requirements, the hybrid approach, named DEPRO-RVNS, is presented in this study. In some cases, the major problem when applying DE to classification problems was the premature convergence of some individuals to local optima. To eliminate this shortcoming and to improve the exploration and exploitation capabilities of DE, such individuals were iteratively re-explored using RVNS. Based on the generated results on both training and testing data, it is shown that the performance of PROAFTN is significantly improved. Furthermore, the experimental study shows that DEPRO-RVNS outperforms well-known machine learning classifiers in a variety of problems.

Keywords: Knowledge Discovery, Differential Evolution, Reduced Variable Neighborhood Search, Multiple criteria classification, PROAFTN, Supervised Learning.

Downloads: 1450
1039 Assessing Basic Computer Applications’ Skills of College-Level Students in Saudi Arabia

Authors: Mohammed A. Gharawi, Majed M. Khoja

Abstract:

This paper reports the findings of a study conducted at the Institute of Public Administration (IPA) in Saudi Arabia. The paper applied both qualitative and quantitative approaches to assess the levels of basic computer applications’ skills among students enrolled in the preparatory programs of the institution. Qualitative data were collected from semi-structured interviews with instructors who had previously been assigned to teach Introduction to Information Technology courses. Quantitative data were collected by administering a self-report questionnaire and a written statistical test. Three hundred eighty enrolled students responded to the questionnaire and one hundred forty-two completed the statistical test. The results indicate that most of the students enrolled in the IPA’s preparatory programs lack the skills necessary to deal with computer applications.

Keywords: Assessment, Computer Applications, Computer Literacy, Institute of Public Administration, Saudi Arabia.

Downloads: 2650
1038 A Hybrid Classification Method using Artificial Neural Network Based Decision Tree for Automatic Sleep Scoring

Authors: Haoyu Ma, Bin Hu, Mike Jackson, Jingzhi Yan, Wen Zhao

Abstract:

In this paper we propose a new classification method for automatic sleep scoring using an artificial neural network based decision tree. It treats the sleep scoring process as a series of two-class problems and solves them with a decision tree made up of a group of neural network classifiers, each of which uses a special feature set and is aimed at only one specific sleep stage in order to maximize the classification effect. A single electroencephalogram (EEG) signal is used for our analysis rather than depending on multiple biological signals, which greatly simplifies the data acquisition process. Experimental results demonstrate that the average epoch-by-epoch agreement between visual scoring and the proposed method in separating 30 s wakefulness+S1, REM, S2 and SWS epochs was 88.83%. This study shows that the proposed method performed well in all four stages and can effectively limit error propagation at the same time. It could, therefore, be an efficient method for automatic sleep scoring. Additionally, since it requires only a small volume of data, it could be suited to pervasive applications.

Keywords: Sleep, Sleep stage, Automatic sleep scoring, Electroencephalography, Decision tree, Artificial neural network

Downloads: 2046
1037 Fuzzy based Security Threshold Determining for the Statistical En-Route Filtering in Sensor Networks

Authors: Hae Young Lee, Tae Ho Cho

Abstract:

In many sensor network applications, sensor nodes are deployed in open environments and hence are vulnerable to physical attacks, which can compromise a node's cryptographic keys. False sensing reports can be injected through compromised nodes, which can lead not only to false alarms but also to the depletion of limited energy resources in battery-powered networks. Ye et al. proposed a statistical en-route filtering scheme (SEF) to detect such false reports during the forwarding process. In this scheme, the choice of a security threshold value is important since it trades off detection power and overhead. In this paper, we propose a fuzzy logic-based method for determining a security threshold value in SEF-based sensor networks. The fuzzy logic determines a security threshold by considering the number of partitions in a global key pool, the number of compromised partitions, and the energy level of nodes. The fuzzy-based threshold value can conserve energy while providing sufficient detection power.

Keywords: Fuzzy logic, security, sensor network.

Downloads: 1545
1036 A Ground Observation Based Climatology of Winter Fog: Study over the Indo-Gangetic Plains, India

Authors: Sanjay Kumar Srivastava, Anu Rani Sharma, Kamna Sachdeva

Abstract:

Every year, fog formation over the Indo-Gangetic Plains (IGP) of the Indian region during the winter months of December and January is believed to create numerous hazards, inconvenience, and economic loss for the inhabitants of this densely populated region of the Indian subcontinent. The aim of the paper is to analyze the spatial and temporal variability of winter fog over the IGP. Long-term ground observations of visibility and other meteorological parameters (1971-2010) have been analyzed to understand the formation of fog and its relevance during the peak winter months of January and December over the IGP of India. In order to examine the temporal variability, time series and trend analyses were carried out using the Mann-Kendall statistical test. The trend analysis using the Mann-Kendall test accepts the alternative hypothesis at the 95% confidence level, indicating that a trend exists. Kendall's tau statistic showed a positive correlation between time and fog frequency. Further, the Theil-Sen median slope estimate showed that the magnitude of the trend is positive. The magnitude is higher in January than in December for the entire IGP, except over the western IGP, where it is high in December. Decade-wise time series analysis revealed a continuous increase in fog days. A net overall increase of 99% was observed over the IGP in the last four decades. Diurnal variability and average daily persistence were computed using descriptive statistical techniques. Geo-statistical analysis was carried out to understand the spatial variability of fog; it revealed that the IGP is a highly fog-prone zone, with fog occurring on more than 66% of days during the study period. Diurnal variability indicates that the peak occurrence of fog is between 06:00 and 10:00 local time and that average daily fog persistence extends to 5 to 7 hours during the peak winter season. The results offer a new perspective for taking proactive measures to reduce the irreparable damage that could be caused by changing fog trends.
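
For readers unfamiliar with the tests named in this abstract, the following is a minimal sketch of the Mann-Kendall trend test and the Theil-Sen slope. It assumes a series without tied values (so no tie correction is applied to the variance), and the yearly fog-day counts are invented for illustration; they are not data from the paper.

```python
import math
from statistics import median

def mann_kendall(series):
    """Return the Mann-Kendall S statistic and its normal-approximation Z score.

    Assumes no tied values, so the variance carries no tie correction.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

def sen_slope(series):
    """Theil-Sen estimator: the median of all pairwise slopes."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)

# Invented yearly fog-day counts (illustration only).
fog_days = [10, 12, 11, 15, 17, 16, 20, 22]
s_stat, z_score = mann_kendall(fog_days)
slope = sen_slope(fog_days)

# A two-sided test at the 95% confidence level rejects "no trend" when |Z| > 1.96.
has_trend = abs(z_score) > 1.96
```

For this invented series 26 of the 28 pairs are increasing and 2 are decreasing, so S = 24 and the test reports an upward trend.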

Keywords: Fog, climatology, Mann-Kendall test, trend analysis, spatial variability, temporal variability, visibility.

Downloads: 1720
1035 Scaling up Detection Rates and Reducing False Positives in Intrusion Detection using NBTree

Authors: Dewan Md. Farid, Nguyen Huu Hoa, Jerome Darmont, Nouria Harbi, Mohammad Zahidur Rahman

Abstract:

In this paper, we present a new learning algorithm for anomaly-based network intrusion detection using an improved self-adaptive naïve Bayesian tree (NBTree), which induces a hybrid of a decision tree and a naïve Bayesian classifier. The proposed approach scales up balanced detection for different attack types and keeps false positives at an acceptable level. On complex, dynamic, large intrusion detection datasets, the detection accuracy of the naïve Bayesian classifier does not scale up as well as that of the decision tree. It has been shown in other problem domains that the naïve Bayesian tree improves classification rates on large datasets. In a naïve Bayesian tree, internal nodes split as in regular decision trees, but the leaves contain naïve Bayesian classifiers. The experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that this new approach scales up the detection rates for different attack types and reduces false positives in network intrusion detection.

Keywords: Detection rates, false positives, network intrusion detection, naïve Bayesian tree.

Downloads: 2246
1034 Optimizing Performance of Tablet's Direct Compression Process Using Fuzzy Goal Programming

Authors: Abbas Al-Refaie

Abstract:

This paper aims at improving the performance of the tableting process using statistical quality control and fuzzy goal programming. The tableting process was studied. Statistical control tools were used to characterize the existing process for three critical responses including the averages of a tablet’s weight, hardness, and thickness. At initial process factor settings, the estimated process capability index values for the tablet’s averages of weight, hardness, and thickness were 0.58, 3.36, and 0.88, respectively. The L9 array was utilized to provide experimentation design. Fuzzy goal programming was then employed to find the combination of optimal factor settings. Optimization results showed that the process capability index values for a tablet’s averages of weight, hardness, and thickness were improved to 1.03, 4.42, and 1.42, respectively. Such improvements resulted in significant savings in quality and production costs.
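
The abstract does not say which capability index was estimated, so the sketch below shows the two textbook definitions, Cp and Cpk, under the assumption of a two-sided specification. The sample weights and specification limits are hypothetical.

```python
from statistics import mean, stdev

def process_capability(samples, lsl, usl):
    """Return (Cp, Cpk) for a lower and an upper specification limit.

    Cp compares the specification width with the process spread;
    Cpk additionally penalizes a process mean that is off-center.
    """
    mu = mean(samples)
    sigma = stdev(samples)  # sample standard deviation
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk

# Hypothetical tablet weights and specification limits (same units).
weights = [9.8, 10.0, 10.2, 10.0, 10.0]
cp, cpk = process_capability(weights, lsl=9.4, usl=10.6)
```

Because this hypothetical process is centered between its limits, Cp and Cpk coincide; shifting the mean toward either limit lowers Cpk while leaving Cp unchanged.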

Keywords: Fuzzy goal programming, control charts, process capability, tablet optimization.

Downloads: 975
1033 Multivariate Statistical Analysis of Decathlon Performance Results in Olympic Athletes (1988-2008)

Authors: Jaebum Park, Vladimir M. Zatsiorsky

Abstract:

The performance results of the athletes who competed in the 1988-2008 Olympic Games were analyzed (n = 166). The data were obtained from the IAAF official protocols. In the principal component analysis, the first three principal components explained 70% of the total variance. In the 1st principal component (with 43.1% of total variance explained) the largest factor loadings were for 100m (0.89), 400m (0.81), 110m hurdle run (0.76), and long jump (–0.72). This factor can be interpreted as the 'sprinting performance'. The loadings on the 2nd factor (15.3% of the total variance) presented a counter-intuitive throwing-jumping combination: the highest loadings were for throwing events (javelin throwing 0.76; shot put 0.74; and discus throwing 0.73) and also for jumping events (high jump 0.62; pole vaulting 0.58). On the 3rd factor (11.6% of total variance), the largest loading was for 1500 m running (0.88); all other loadings were below 0.4.

Keywords: Decathlon, principal component analysis, Olympic Games, multivariate statistical analysis.

Downloads: 2779
1032 Gene Expression Signature for Classification of Metastasis Positive and Negative Oral Cancer in Homosapiens

Authors: A. Shukla, A. Tarsauliya, R. Tiwari, S. Sharma

Abstract:

Classifying cancers into their corresponding cohorts has been a key area of research in bioinformatics, aiming at better prognosis of the disease. The high dimensionality of gene data makes this a complex task and requires a technique for identifying significant data in order to reduce the dimensionality and identify significant information. In this paper, we propose a novel approach for classifying oral cancer patients as metastasis positive or negative. We used significance analysis of microarrays (SAM) to identify the significant genes that constitute a gene signature. Three different gene signatures were identified using SAM from three different combinations of training datasets, and their classification accuracy was calculated on the corresponding testing datasets using k-Nearest Neighbour (kNN), Fuzzy C-Means Clustering (FCM), Support Vector Machine (SVM) and Backpropagation Neural Network (BPNN). A final gene signature of only 9 genes was obtained from the three individual gene signatures. The classification capability of the 9-gene signature was compared using the same classifiers on the same testing datasets. The experimental results show that the 9-gene signature classified all samples in the testing dataset accurately, while individual genes could not classify all of them accurately.

Keywords: Cancer, Gene Signature, SAM, Classification.

Downloads: 2045
1031 Early Recognition and Grading of Cataract Using a Combined Log Gabor/Discrete Wavelet Transform with ANN and SVM

Authors: Hadeer R. M. Tawfik, Rania A. K. Birry, Amani A. Saad

Abstract:

Eyes are considered to be the most sensitive and important organ for human beings. Thus, any eye disorder will affect the patient in all aspects of life. Cataract is one of those eye disorders that lead to blindness if not treated correctly and quickly. This paper demonstrates a model for automatic detection, classification, and grading of cataracts based on image processing techniques and artificial intelligence. The proposed system is developed to ease the cataract diagnosis process for both ophthalmologists and patients. The wavelet transform combined with the 2D Log Gabor wavelet transform was used for feature extraction on a dataset of 120 eye images, followed by a classification process that classified the image set into three classes: normal, early, and advanced stage. The two classifiers used, the support vector machine (SVM) and the artificial neural network (ANN), were compared on the same dataset of 120 eye images. It was concluded that SVM gave better results than ANN: SVM achieved 96.8% accuracy, whereas ANN achieved 92.3% accuracy.

Keywords: Cataract, classification, detection, feature extraction, grading, log-gabor, neural networks, support vector machines, wavelet.

Downloads: 952
1030 N-Grams: A Tool for Repairing Word Order Errors in Ill-formed Texts

Authors: Theologos Athanaselis, Stelios Bakamidis, Ioannis Dologlou, Konstantinos Mamouras

Abstract:

This paper presents an approach for repairing word order errors in English text by reordering words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. A possible way of reordering the words is to use all the permutations. The problem is that for a sentence of N words the number of permutations is N!. The novelty of this method concerns the use of an efficient confusion matrix technique for reordering the words. The confusion matrix technique has been designed to reduce the search space among permuted sentences. The search space is limited using the statistical inference of N-grams. The results of this technique are very interesting and show that the number of permuted sentences can be reduced by 98.16%. For experimental purposes a test set of TOEFL sentences was used, and the results show that more than 95% of them can be repaired using the proposed method.
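
The scoring idea can be illustrated with the brute-force baseline the abstract mentions: try every permutation and keep the one with the most trigram hits. This sketch deliberately does not reproduce the authors' confusion matrix technique, which the abstract does not describe; the tiny corpus and the function names are assumptions for illustration.

```python
from itertools import permutations

def trigrams(words):
    """All consecutive word triples in a sequence of words."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

def build_model(corpus_sentences):
    """A toy 'language model': the set of trigrams seen in the corpus."""
    model = set()
    for sentence in corpus_sentences:
        model.update(trigrams(sentence.split()))
    return model

def trigram_hits(words, model):
    """Number of trigrams in the word sequence that the model has seen."""
    return sum(t in model for t in trigrams(words))

def repair(sentence, model):
    """Brute force over all N! orderings; feasible only for short sentences."""
    words = sentence.split()
    best = max(permutations(words), key=lambda p: trigram_hits(p, model))
    return " ".join(best)

# Toy corpus standing in for a real language model (assumption).
toy_model = build_model(["the cat sat on the mat"])
repaired = repair("cat the sat on the mat", toy_model)
```

Even this six-word example enumerates 720 orderings, which shows why a method that prunes the search space matters for longer sentences.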

Keywords: Permutations filtering, Statistical language model N-grams, Word order errors, TOEFL

Downloads: 1636
1029 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering to the demand of the highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand, and total alkalinity were assessed. Water quality data obtained from eight study sites over one year were subjected to two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied to find spatial variability and to identify the sources responsible for the water quality of the river. Three varifactors were obtained after varimax rotation of the initial principal components. Cluster analysis was carried out to classify sampling stations by similarity, which grouped the eight sites into two clusters. The study reveals that anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: Cluster analysis, multivariate statistical technique, river Hindon, water Quality.

Downloads: 3775
1028 Classification Control for Discrimination between Interictal Epileptic and Non – Epileptic Pathological EEG Events

Authors: Sozon H. Papavlasopoulos, Marios S. Poulos, George D. Bokos, Angelos M. Evangelou

Abstract:

In this study, the problem of discriminating between interictal epileptic and non-epileptic pathological EEG cases that present episodic loss of consciousness is investigated. We verify the accuracy of the feature extraction method based on autocross-correlated coefficients, which were extracted and studied in a previous study. For this purpose we used, on the one hand, a suitably constructed supervised LVQ1 artificial neural network and, on the other, a cross-correlation technique. To reinforce this verification we used a statistical procedure based on a chi-square control. The classification and statistical results showed that the proposed feature extraction is a significantly accurate method for diagnostic discrimination between interictal and non-interictal EEG events; specifically, the classification procedure showed that the LVQ neural method is superior to the cross-correlation one.

Keywords: Cross-Correlation Methods, Diagnostic Test, Interictal Epileptic, LVQ1 neural network, Auto-Cross-Correlation Methods, chi-square test.

Downloads: 1493
1027 Superior Performances of the Neural Network on the Masses Lesions Classification through Morphological Lesion Differences

Authors: U. Bottigli, R. Chiarucci, B. Golosio, G. L. Masala, P. Oliva, S. Stumbo, D. Cascio, F. Fauci, M. Glorioso, M. Iacomi, R. Magro, G. Raso

Abstract:

The purpose of this work is to develop an automatic classification system that could be useful to radiologists in breast cancer investigation. The software has been designed in the framework of the MAGIC-5 collaboration. In an automatic classification system, the suspicious regions with a high probability of including a lesion are extracted from the image as regions of interest (ROIs). Each ROI is characterized by features based generally on morphological lesion differences. A study of the feature space representation is made and several classifiers are tested to distinguish the pathological regions from the healthy ones. The results, in terms of sensitivity and specificity, are presented through ROC (Receiver Operating Characteristic) curves. In particular, the best performance is obtained with neural networks, in comparison with K-Nearest Neighbours and the Support Vector Machine: the Radial Basis Function supplies the best results, with an area under the ROC curve of 0.89 ± 0.01, but similar results are obtained with the Probabilistic Neural Network and a Multi Layer Perceptron.
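
As a reminder of how the reported metrics are defined, the sketch below computes sensitivity and specificity at one threshold and the area under the ROC curve from classifier scores. It uses the rank interpretation of the AUC (the probability that a positive case outscores a negative one); the scores and labels are made up and unrelated to the MAGIC-5 data.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity (true positive rate) and specificity (true negative rate).

    A case is predicted positive when its score is >= threshold.
    Labels are 1 for pathological and 0 for healthy.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Made-up classifier outputs for five regions of interest.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0]

sens, spec = sensitivity_specificity(example_scores, example_labels, 0.5)
auc = roc_auc(example_scores, example_labels)
```

Sweeping the threshold over all score values traces the ROC curve itself; the pairwise formula gives the same area without building the curve.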

Keywords: Neural Networks, K-Nearest Neighbours, Support Vector Machine, Computer Aided Detection

Downloads: 1586
1026 Statistical Analysis of the Factors that Influence the Properties of Blueberries from Cultivar Bluecrop

Authors: Raquel P. F. Guiné, Susana R. Matos, Daniela V. T. A. Costa, Fernando J. Gonçalves

Abstract:

Because blueberries are recognized worldwide as a good source of beneficial components, their consumption has increased in the past decades, and so have the scientific works about their properties. Hence, this work was undertaken to evaluate the effect of some production and conservation factors on the properties of blueberries from cultivar Bluecrop. The physical and chemical analyses were done according to established methodologies, and all data were then treated using SPSS software to assess the possible differences among the factors investigated and/or the correlations between the variables under study. The results showed that the location of production influenced some of the berries' properties (caliber, sugars, antioxidant activity, color and texture) and that the age of the bushes was correlated with moisture, sugars and acidity, as well as lightness. On the other hand, the altitude of the farm was correlated only with sugar content. Conservation influenced only anthocyanin content and DPPH antioxidant activity. Finally, the type of extract and the order of extraction had a pronounced influence on all the phenolic properties evaluated.

Keywords: Antioxidant activity, blueberry, conservation, geographical origin, phenolic compounds, statistical analysis.

Downloads 2106
1025 Using Statistical Significance and Prediction to Test Long/Short Term Public Services and Patients Cohorts: A Case Study in Scotland

Authors: Sotirios Raptis

Abstract:

Health and Social care (HSc) services planning and scheduling are facing unprecedented challenges due to pandemic pressure, and also suffer from unplanned spending that is negatively impacted by the global financial crisis. Data-driven approaches can help to improve policies and to plan and design service provision schedules, using algorithms that assist healthcare managers in facing unexpected demands with fewer resources. The paper discusses services packing using statistical significance tests and machine learning (ML) to evaluate demand similarity and coupling. This is achieved by predicting the range of the demand (class) using ML methods such as Classification and Regression Trees (CART), Random Forests (RF), and Logistic Regression (LGR). The Chi-Squared and Student’s significance tests are applied to data covering a 39-year span for which records exist for services delivered in Scotland. The demands are associated using probabilities and form parts of statistical hypotheses. These hypotheses, as their null part, assume that the target demand is statistically dependent on other services’ demands. This linking is checked using the data. In addition, ML methods are used to linearly predict the above target demands from the statistically found associations and to extend the linear dependence of the target’s demand to independent demands, thus forming groups of services. Statistical tests confirmed the ML coupling, made the prediction statistically meaningful, and proved that a target service can be matched reliably to other services, while ML showed that such marked relationships can also be linear ones. Zero padding was used for records of missing years and better illustrated such relationships, both for limited years and for the entire span, offering long-term data visualizations, while limited-year periods explained how well patient numbers can be related over short periods of time, or that they can change over time, as opposed to behaviours across more years.
The prediction performance of the associations was measured using metrics such as Receiver Operating Characteristic (ROC), Area Under Curve (AUC) and Accuracy (ACC), as well as the Chi-Squared and Student statistical tests. Co-plots and comparison tables for the RF, CART, and LGR methods, as well as the p-values from the tests and Information Exchange (IE/MIE) measures, are provided, showing the relative performance of the ML methods and of the statistical tests as well as their behaviour under different learning ratios. The impact of first groupings by k-neighbours classification (k-NN), Cross-Correlation (CC) and C-Means (CM) was also studied over limited years and for the entire span. It was found that CART was generally behind RF and LGR, but in some interesting cases LGR reached an AUC = 0, falling below CART, while the ACC was as high as 0.912, showing that ML methods can be confused by zero padding, by irregularities in the data or by outliers. On average, 3 linear predictors were sufficient; LGR was found to compete well with RF, and CART followed with the same performance at higher learning ratios. Services were packed only when the significance level (p-value) of their association coefficient was more than 0.05. Social factor relationships were observed between home care services and treatment of old people, low birth weights, alcoholism, drug abuse, and emergency admissions. The work found that different HSc services can be well packed as plans of limited duration, across various service sectors and learning configurations, as confirmed by statistical hypotheses.
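
To make the Chi-Squared step concrete, here is a minimal sketch of a Pearson chi-squared test on a 2x2 contingency table. It follows the textbook convention in which the null hypothesis is independence, which differs from the framing described in this abstract; the function name and the counts are hypothetical, and no continuity correction is applied.

```python
import math


def chi2_independence_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 contingency table.

    Textbook convention: the null hypothesis is that rows and columns are
    independent. With 1 degree of freedom the upper-tail probability of the
    chi-squared distribution is erfc(sqrt(stat / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: years with high/low demand for two services.
table = [[30, 10],
         [10, 30]]
stat, p_value = chi2_independence_2x2(table)
```

With these counts every expected cell is 20, so the statistic is 20 and the p-value is far below 0.05; under the textbook convention independence would be rejected.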

Keywords: Class, cohorts, data frames, grouping, prediction, probabilities, services.

Downloads 409
1024 A Brain Controlled Robotic Gait Trainer for Neurorehabilitation

Authors: Qazi Umer Jamil, Abubakr Siddique, Mubeen Ur Rehman, Nida Aziz, Mohsin I. Tiwana

Abstract:

This paper discusses a brain controlled robotic gait trainer for neurorehabilitation of Spinal Cord Injury (SCI) patients. Patients suffering from SCI become unable to execute motion control of their lower extremities due to degeneration of spinal cord neurons. The presented approach can help SCI patients in neurorehabilitation training by directly translating patient motor imagery into walker motion commands, thus bypassing spinal cord neurons completely. A non-invasive EEG-based brain-computer interface is used for capturing patient neural activity. For signal processing and classification, open-source software (OpenVibe) is used. Classifiers categorize the patient motor imagery (MI) into a specific set of commands that are further translated into walker motion commands. The robotic walker also employs fall detection to ensure patient safety during gait training and can act as a support for SCI patients. The gait trainer was tested with subjects, and satisfactory results were achieved.

Keywords: Brain Computer Interface (BCI), gait trainer, Spinal Cord Injury (SCI), neurorehabilitation.

Downloads 1229
1023 Forecasting the Influences of Information and Communication Technology on the Structural Changes of Japanese Industrial Sectors: A Study Using Statistical Analysis

Authors: Ubaidillah Zuhdi, Shunsuke Mori, Kazuhisa Kamegai

Abstract:

The purpose of this study is to forecast the influences of information and communication technology (ICT) on the structural changes of Japanese economies. Input-output (IO) and statistical approaches are used as analysis instruments. More specifically, this study employs Leontief IO coefficients and a constrained multivariate regression (CMR) model to achieve this purpose. The initial and forecast periods in this study are 2005 and 2015, respectively. ICT is represented by ICT capital stocks. This study conducts two levels of analysis, namely macro and micro. The results of the macro-level analysis show that the dynamics of Japanese economies in the forecast period, relative to the initial period, are not so high. The micro-level analysis focuses on the (1) commerce, (2) business services and office supplies, and (3) personal services sectors, and examines their specific IO coefficients. The results of the analysis show that ICT has a strong influence on the changes of these coefficients from the initial to the forecast period.

Keywords: Forecast, ICT, Structural changes, Japanese economies.

Downloads 1654
1022 Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018

Authors: M. Sitoe, O. Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency toward evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias and variance and improve the performance of the models, the ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.
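
The cross-validation splitting mentioned in this abstract can be sketched as follows. This is a plain, unshuffled and unstratified k-fold split written for illustration only; it is not Weka's implementation, and the function name is hypothetical. The sample count of 388 and the choice of seven folds are taken from the abstract.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs for k-fold cross-validation.

    Samples are assigned to contiguous folds without shuffling; the first
    n_samples % k folds receive one extra sample so that sizes differ by at most 1.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, test))
        start += size
    return splits


# 388 students and seven folds, as reported in the abstract.
splits = k_fold_indices(388, 7)
fold_sizes = [len(test) for _, test in splits]
```

Each sample appears in exactly one test fold, so every student is used for evaluation once and for training in the remaining six folds.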

Keywords: Evasion and retention, cross validation, bagging, stacking.

Downloads 70
1021 Motor Imagery Signal Classification for a Four State Brain Machine Interface

Authors: Hema C. R., Paulraj M. P., S. Yaacob, A. H. Adom, R. Nagarajan

Abstract:

Motor imagery classification provides an important basis for designing Brain Machine Interfaces (BMI). A BMI captures and decodes brain EEG signals and transforms human thought into actions. The ability of an individual to control his EEG through imaginary mental tasks enables him to control devices through the BMI. This paper presents a method to design a four-state BMI using EEG signals recorded from the C3 and C4 locations. Principal features extracted through principal component analysis of the segmented EEG are analyzed using two novel classification algorithms based on an Elman recurrent neural network and a functional link neural network. The performance of both classifiers is evaluated using a particle swarm optimization (PSO) training algorithm; results are also compared with the conventional back propagation (BP) training algorithm. EEG motor imagery recorded from two subjects is used in the offline analysis. From the overall classification performance it is observed that the BP algorithm has a higher average classification rate of 93.5%, while the PSO algorithm has a better training time and maximum classification rate. The proposed methods promise to provide a useful alternative general procedure for motor imagery classification.

Keywords: Motor Imagery, Brain Machine Interfaces, Neural Networks, Particle Swarm Optimization, EEG signal processing.

Downloads 2421
1020 ABURAS Index: A Statistically Developed Index for Dengue-Transmitting Vector Population Prediction

Authors: Hani M. Aburas

Abstract:

“Dengue” is an African word meaning “bone breaking” because it causes severe joint and muscle pain that feels like bones are breaking. It is an infectious disease mainly transmitted by the female mosquito Aedes aegypti and is caused by four serotypes of dengue virus. In recent years, a dramatic increase in confirmed dengue fever cases around the equator's belt has been reported. Several conventional indices have been designed so far to monitor the transmitting vector populations, known as the House Index (HI), Container Index (CI) and Breteau Index (BI). However, none of them describes the adult mosquito population size, which is important for directing and guiding comprehensive control strategy operations, since the number of infected people has a direct relationship with the vector density. Therefore, it is crucial to know the population size of the transmitting vector in order to design a suitable and effective control program. In this context, a study is carried out to report a new statistical index, the ABURAS Index, using the Poisson distribution, based on the collection of vector population in Jeddah Governorate, Saudi Arabia.
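
The abstract does not give the formula of the ABURAS Index, so the sketch below shows only the generic Poisson machinery that such a count-based index could build on: the maximum-likelihood estimate of the rate and a tail probability. The trap counts and function names are hypothetical and do not come from the study.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_rate_mle(counts):
    """Maximum-likelihood estimate of the Poisson rate: the sample mean."""
    return sum(counts) / len(counts)


def prob_at_least(k, lam):
    """P(X >= k), computed as one minus the probability of fewer than k events."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


# Hypothetical adult mosquito counts per trap (not data from the study).
trap_counts = [2, 0, 3, 1, 4, 2, 1, 3]
lam_hat = poisson_rate_mle(trap_counts)   # 16 mosquitoes / 8 traps
p_five_or_more = prob_at_least(5, lam_hat)
```

Under these assumed counts the estimated rate is 2 mosquitoes per trap, and the probability of catching five or more in a single trap is about 5%.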

Keywords: Poisson distribution, statistical index, prediction, Aedes aegypti.

Downloads 1888
1019 Spike Sorting Method Using Exponential Autoregressive Modeling of Action Potentials

Authors: Sajjad Farashi

Abstract:

Neurons in the nervous system communicate with each other by producing electrical signals called spikes. To investigate the physiological function of the nervous system, it is essential to study the activity of neurons by detecting and sorting spikes in the recorded signal. In this paper, a method is proposed for the spike sorting problem, based on nonlinear modeling of spikes using an exponential autoregressive model. A genetic algorithm is utilized for model parameter estimation, and selected model coefficients are used as features for sorting purposes. For optimal selection of model coefficients, a self-organizing feature map is used. The results show that modeling of spikes with the nonlinear autoregressive model outperforms its linear counterpart. Also, the features extracted from the coefficients of the exponential autoregressive model are better than wavelet-based features and yield more compact and well-separated clusters. In the case of spikes that differ in small-scale structures, where principal component analysis fails to produce separated clouds in the feature space, the proposed method can obtain well-separated clusters, which removes the necessity of applying complex classifiers.

Keywords: Exponential autoregressive model, Neural data, spike sorting, time series modeling.

Downloads 1745
1018 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product on Nigeria’s Economy

Authors: K. P. Oyeduntan, K. Oshinubi

Abstract:

Nigeria is referred to as the ‘Giant of Africa’ due to its high population, land mass and large economy. However, it still trails far behind many smaller economies on the continent in terms of maritime operations. Since the maritime industry is the sparkplug for national growth, because it houses the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lags in maritime activities. In this research, we have studied how the Gross Domestic Product (GDP) of maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM) and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using 20 years of time-series data. The results showed that the MGDP is statistically significant to the Nigerian economy. Amongst the statistical tools applied, the PRM of order 4 describes the relationship better than the other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria.

Keywords: Economy, GDP, maritime transport, port, regression.

Downloads 76
1017 Probability Distribution of Rainfall Depth at Hourly Time-Scale

Authors: S. Dan'azumi, S. Shamsudin, A. A. Rahman

Abstract:

Rainfall data at fine resolution and knowledge of its characteristics play a major role in the efficient design and operation of agricultural, telecommunication, runoff and erosion control as well as water quality control systems. This paper studies the statistical distribution of hourly rainfall depth for 12 representative stations spread across Peninsular Malaysia. Hourly rainfall data covering periods of 10 to 22 years were collected and their statistical characteristics were estimated. Three probability distributions, namely the Generalized Pareto, Exponential and Gamma distributions, were proposed to model the hourly rainfall depth, and three goodness-of-fit tests, namely the Kolmogorov-Smirnov, Anderson-Darling and Chi-Squared tests, were used to evaluate their fitness. The results indicate that the east coast of the Peninsula receives a higher depth of rainfall than the west coast. However, the rainfall frequency is found to be irregular. The results of the goodness-of-fit tests also show that all three models fit the rainfall data at the 1% level of significance. However, the Generalized Pareto fits better than the Exponential and Gamma distributions and is therefore recommended as the best fit.
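
As a rough illustration of one of the goodness-of-fit tests listed in this abstract, the sketch below computes the Kolmogorov-Smirnov statistic for an Exponential fit. The rainfall depths and the function name are hypothetical; the rate is estimated from the same sample, in which case the standard KS critical values are no longer exact, so the code returns only the statistic and does not attempt a p-value.

```python
import math


def ks_statistic_exponential(data):
    """Kolmogorov-Smirnov distance between the sample and a fitted Exponential.

    The rate is the maximum-likelihood estimate n / sum(data). The statistic is
    the largest gap between the empirical CDF and the fitted CDF, checked on
    both sides of every jump of the empirical CDF.
    """
    xs = sorted(data)
    n = len(xs)
    rate = n / sum(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fitted_cdf = 1.0 - math.exp(-rate * x)
        d = max(d, (i + 1) / n - fitted_cdf, fitted_cdf - i / n)
    return d, rate


# Hypothetical hourly rainfall depths in mm (not data from the study).
depths_mm = [0.2, 0.5, 1.0, 1.4, 2.9]
d_stat, rate_hat = ks_statistic_exponential(depths_mm)
```

A smaller statistic indicates a closer fit, which is how competing candidates such as the Generalized Pareto and Gamma distributions could be ranked against one another on the same data.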

Keywords: Goodness-of-fit test, Hourly rainfall, Malaysia, Probability distribution.

Downloads 2881
1016 Customer Relationship Management on Social Media Affecting Brand Loyalty of Siam Commercial Bank in Bangkok

Authors: Charawee Butbumrung

Abstract:

The purpose of this research was to study customer relationship management on social media affecting brand loyalty of Siam Commercial Bank in Bangkok. The statistics used in data analysis were frequency, mean, standard deviation, and Pearson’s correlation coefficient, computed with a social science statistics program. The study found that the majority of the respondents were female, aged 37–47, held a bachelor's degree and had a monthly income between 10,001 and 15,000 Baht. In addition, customer relationship management scored highly overall and in each aspect of formulating, maintaining, and extending the customer relationship. Furthermore, hypothesis testing showed that differences in customers’ age, education, occupation and average monthly income were associated with differences in brand loyalty at the 0.05 level of statistical significance, and that customer relationship management was related to brand loyalty in the same direction, at a low level, at the 0.05 level of statistical significance.

Keywords: Brand loyalty, customer relationship, management, Siam Commercial Bank, social media.

Downloads 1106