Search results for: statistical features
7660 Development of Sleep Quality Index Using Heart Rate
Authors: Dongjoo Kim, Chang-Sik Son, Won-Seok Kang
Abstract:
Adequate sleep affects various parts of one’s overall physical and mental life. As one of the methods in determining the appropriate amount of sleep, this research presents a heart rate based sleep quality index. In order to evaluate sleep quality using the heart rate, sleep data from 280 subjects taken over one month are used. Their sleep data are categorized by a three-part heart rate range. After categorizing, some features are extracted, and the statistical significances are verified for these features. The results show that some features of this sleep quality index model have statistical significance. Thus, this heart rate based sleep quality index may be a useful discriminator of sleep.Keywords: sleep, sleep quality, heart rate, statistical analysis
Procedia PDF Downloads 3427659 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution
Authors: Haiyan Wu, Ying Liu, Shaoyun Shi
Abstract:
Authorship attribution is to extract features to identify authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length), content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers model syntactic trees or latent semantic information by neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, which is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction
Procedia PDF Downloads 1377658 Monitoring Blood Pressure Using Regression Techniques
Authors: Qasem Qananwah, Ahmad Dagamseh, Hiam AlQuran, Khalid Shaker Ibrahim
Abstract:
Blood pressure helps the physicians greatly to have a deep insight into the cardiovascular system. The determination of individual blood pressure is a standard clinical procedure considered for cardiovascular system problems. The conventional techniques to measure blood pressure (e.g. cuff method) allows a limited number of readings for a certain period (e.g. every 5-10 minutes). Additionally, these systems cause turbulence to blood flow; impeding continuous blood pressure monitoring, especially in emergency cases or critically ill persons. In this paper, the most important statistical features in the photoplethysmogram (PPG) signals were extracted to estimate the blood pressure noninvasively. PPG signals from more than 40 subjects were measured and analyzed and 12 features were extracted. The features were fed to principal component analysis (PCA) to find the most important independent features that have the highest correlation with blood pressure. The results show that the stiffness index means and standard deviation for the beat-to-beat heart rate were the most important features. A model representing both features for Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) was obtained using a statistical regression technique. Surface fitting is used to best fit the series of data and the results show that the error value in estimating the SBP is 4.95% and in estimating the DBP is 3.99%.Keywords: blood pressure, noninvasive optical system, principal component analysis, PCA, continuous monitoring
Procedia PDF Downloads 1617657 An Approach Based on Statistics and Multi-Resolution Representation to Classify Mammograms
Authors: Nebi Gedik
Abstract:
One of the significant and continual public health problems in the world is breast cancer. Early detection is very important to fight the disease, and mammography has been one of the most common and reliable methods to detect the disease in the early stages. However, it is a difficult task, and computer-aided diagnosis (CAD) systems are needed to assist radiologists in providing both accurate and uniform evaluation for mass in mammograms. In this study, a multiresolution statistical method to classify mammograms as normal and abnormal in digitized mammograms is used to construct a CAD system. The mammogram images are represented by wave atom transform, and this representation is made by certain groups of coefficients, independently. The CAD system is designed by calculating some statistical features using each group of coefficients. The classification is performed by using support vector machine (SVM).Keywords: wave atom transform, statistical features, multi-resolution representation, mammogram
Procedia PDF Downloads 2237656 Statistical Feature Extraction Method for Wood Species Recognition System
Authors: Mohd Iz'aan Paiz Bin Zamri, Anis Salwa Mohd Khairuddin, Norrima Mokhtar, Rubiyah Yusof
Abstract:
Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at custom checkpoints to avoid mislabeling of timber which will results to loss of income to the timber industry. The system focuses on analyzing the statistical pores properties of the wood images. This paper proposed a fuzzy-based feature extractor which mimics the experts’ knowledge on wood texture to extract the properties of pores distribution from the wood surface texture. The proposed feature extractor consists of two steps namely pores extraction and fuzzy pores management. The total number of statistical features extracted from each wood image is 38 features. Then, a backpropagation neural network is used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database composed of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts’ interpretation on wood texture which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.Keywords: classification, feature extraction, fuzzy, inspection system, image analysis, macroscopic images
Procedia PDF Downloads 4277655 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features
Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim
Abstract:
The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.Keywords: data mining, Korean linguistic feature, literary fiction, relationship extraction
Procedia PDF Downloads 3837654 Remaining Useful Life Estimation of Bearings Based on Nonlinear Dimensional Reduction Combined with Timing Signals
Authors: Zhongmin Wang, Wudong Fan, Hengshan Zhang, Yimin Zhou
Abstract:
In data-driven prognostic methods, the prediction accuracy of the estimation for remaining useful life of bearings mainly depends on the performance of health indicators, which are usually fused some statistical features extracted from vibrating signals. However, the existing health indicators have the following two drawbacks: (1) The differnet ranges of the statistical features have the different contributions to construct the health indicators, the expert knowledge is required to extract the features. (2) When convolutional neural networks are utilized to tackle time-frequency features of signals, the time-series of signals are not considered. To overcome these drawbacks, in this study, the method combining convolutional neural network with gated recurrent unit is proposed to extract the time-frequency image features. The extracted features are utilized to construct health indicator and predict remaining useful life of bearings. First, original signals are converted into time-frequency images by using continuous wavelet transform so as to form the original feature sets. Second, with convolutional and pooling layers of convolutional neural networks, the most sensitive features of time-frequency images are selected from the original feature sets. Finally, these selected features are fed into the gated recurrent unit to construct the health indicator. The results state that the proposed method shows the enhance performance than the related studies which have used the same bearing dataset provided by PRONOSTIA.Keywords: continuous wavelet transform, convolution neural net-work, gated recurrent unit, health indicators, remaining useful life
Procedia PDF Downloads 1377653 A New Internal Architecture Based On Feature Selection for Holonic Manufacturing System
Authors: Jihan Abdulazeez Ahmed, Adnan Mohsin Abdulazeez Brifcani
Abstract:
This paper suggests a new internal architecture of holon based on feature selection model using the combination of Bees Algorithm (BA) and Artificial Neural Network (ANN). BA is used to generate features while ANN is used as a classifier to evaluate the produced features. Proposed system is applied on the Wine data set, the statistical result proves that the proposed system is effective and has the ability to choose informative features with high accuracy.Keywords: artificial neural network, bees algorithm, feature selection, Holon
Procedia PDF Downloads 4577652 Lambda-Levelwise Statistical Convergence of a Sequence of Fuzzy Numbers
Authors: F. Berna Benli, Özgür Keskin
Abstract:
Lately, many mathematicians have been studied the statistical convergence of a sequence of fuzzy numbers. We know that Lambda-statistically convergence is a kind of convergence between ordinary convergence and statistical convergence. In this paper, we will introduce the new kind of convergence such as λ-levelwise statistical convergence. Then, we will define the concept of the λ-levelwise statistical cluster and limit points of a sequence of fuzzy numbers. Also, we will discuss the relations between the sets of λ-levelwise statistical cluster points and λ-levelwise statistical limit points of sequences of fuzzy numbers. This work has been extended in this paper, where some relations have been considered such that when lambda-statistical limit inferior and lambda-statistical limit superior for lambda-statistically convergent sequences of fuzzy numbers are equal. Furthermore, lambda-statistical boundedness condition for different sequences of fuzzy numbers has been studied.Keywords: fuzzy number, λ-levelwise statistical cluster points, λ-levelwise statistical convergence, λ-levelwise statistical limit points, λ-statistical cluster points, λ-statistical convergence, λ-statistical limit points
Procedia PDF Downloads 4787651 Credit Card Fraud Detection with Ensemble Model: A Meta-Heuristic Approach
Authors: Gong Zhilin, Jing Yang, Jian Yin
Abstract:
The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data using hybrid deep learning models. The projected model encapsulates five major phases are pre-processing, imbalance-data handling, feature extraction, optimal feature selection, and fraud detection with an ensemble classifier. The collected raw data (input) is pre-processed to enhance the quality of the data through alleviation of the missing data, noisy data as well as null values. The pre-processed data are class imbalanced in nature, and therefore they are handled effectively with the K-means clustering-based SMOTE model. From the balanced class data, the most relevant features like improved Principal Component Analysis (PCA), statistical features (mean, median, standard deviation) and higher-order statistical features (skewness and kurtosis). Among the extracted features, the most optimal features are selected with the Self-improved Arithmetic Optimization Algorithm (SI-AOA). This SI-AOA model is the conceptual improvement of the standard Arithmetic Optimization Algorithm. The deep learning models like Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and optimized Quantum Deep Neural Network (QDNN). The LSTM and CNN are trained with the extracted optimal features. The outcomes from LSTM and CNN will enter as input to optimized QDNN that provides the final detection outcome. Since the QDNN is the ultimate detector, its weight function is fine-tuned with the Self-improved Arithmetic Optimization Algorithm (SI-AOA).Keywords: credit card, data mining, fraud detection, money transactions
Procedia PDF Downloads 1317650 Music Genre Classification Based on Non-Negative Matrix Factorization Features
Authors: Soyon Kim, Edward Kim
Abstract:
In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres and 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided the increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required dimensionality of 460, but NMF-LSM and NMF-BFV required dimensionalities of 10 and 10, respectively. Combining the basic features, NMF-LSM and NMF-BFV together with the SVM with a radial basis function (RBF) kernel produced the significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)
Procedia PDF Downloads 3037649 Towards Integrating Statistical Color Features for Human Skin Detection
Authors: Mohd Zamri Osman, Mohd Aizaini Maarof, Mohd Foad Rohani
Abstract:
Human skin detection recognized as the primary step in most of the applications such as face detection, illicit image filtering, hand recognition and video surveillance. The performance of any skin detection applications greatly relies on the two components: feature extraction and classification method. Skin color is the most vital information used for skin detection purpose. However, color feature alone sometimes could not handle images with having same color distribution with skin color. A color feature of pixel-based does not eliminate the skin-like color due to the intensity of skin and skin-like color fall under the same distribution. Hence, the statistical color analysis will be exploited such mean and standard deviation as an additional feature to increase the reliability of skin detector. In this paper, we studied the effectiveness of statistical color feature for human skin detection. Furthermore, the paper analyzed the integrated color and texture using eight classifiers with three color spaces of RGB, YCbCr, and HSV. The experimental results show that the integrating statistical feature using Random Forest classifier achieved a significant performance with an F1-score 0.969.Keywords: color space, neural network, random forest, skin detection, statistical feature
Procedia PDF Downloads 4627648 Pantograph-Catenary Contact Force: Features Evaluation for Catenary Diagnostics
Authors: Mehdi Brahimi, Kamal Medjaher, Noureddine Zerhouni, Mohammed Leouatni
Abstract:
The Prognostics and Health Management is a system engineering discipline which provides solutions and models to the implantation of a predictive maintenance. The approach is based on extracting useful information from monitoring data to assess the “health” state of an industrial equipment or an asset. In this paper, we examine multiple extracted features from Pantograph-Catenary contact force in order to select the most relevant ones to achieve a diagnostics function. The feature extraction methodology is based on simulation data generated thanks to a Pantograph-Catenary simulation software called INPAC and measurement data. The feature extraction method is based on both statistical and signal processing analyses. The feature selection method is based on statistical criteria.Keywords: catenary/pantograph interaction, diagnostics, Prognostics and Health Management (PHM), quality of current collection
Procedia PDF Downloads 2907647 Image Multi-Feature Analysis by Principal Component Analysis for Visual Surface Roughness Measurement
Authors: Wei Zhang, Yan He, Yan Wang, Yufeng Li, Chuanpeng Hao
Abstract:
Surface roughness is an important index for evaluating surface quality, needs to be accurately measured to ensure the performance of the workpiece. The roughness measurement based on machine vision involves various image features, some of which are redundant. These redundant features affect the accuracy and speed of the visual approach. Previous research used correlation analysis methods to select the appropriate features. However, this feature analysis is independent and cannot fully utilize the information of data. Besides, blindly reducing features lose a lot of useful information, resulting in unreliable results. Therefore, the focus of this paper is on providing a redundant feature removal approach for visual roughness measurement. In this paper, the statistical methods and gray-level co-occurrence matrix(GLCM) are employed to extract the texture features of machined images effectively. Then, the principal component analysis(PCA) is used to fuse all extracted features into a new one, which reduces the feature dimension and maintains the integrity of the original information. Finally, the relationship between new features and roughness is established by the support vector machine(SVM). The experimental results show that the approach can effectively solve multi-feature information redundancy of machined surface images and provides a new idea for the visual evaluation of surface roughness.Keywords: feature analysis, machine vision, PCA, surface roughness, SVM
Procedia PDF Downloads 2137646 A Cross-Gender Statistical Analysis of Tuvinian Intonation Features in Comparison With Uzbek and Azerbaijani
Authors: Daria Beziakina, Elena Bulgakova
Abstract:
The paper deals with cross-gender and cross-linguistic comparison of pitch characteristics for Tuvinian with two other Turkic languages - Uzbek and Azerbaijani, based on the results of statistical analysis of pitch parameter values and intonation patterns used by male and female speakers. The main goal of our work is to obtain the ranges of pitch parameter values typical for Tuvinian speakers for the purpose of automatic language identification. We also propose a cross-gender analysis of declarative intonation in the poorly studied Tuvinian language. The ranges of pitch parameter values were obtained by means of specially developed software that deals with the distribution of pitch values and allows us to obtain statistical language-specific pitch intervals.Keywords: speech analysis, statistical analysis, speaker recognition, identification of person
Procedia PDF Downloads 3487645 Sleep Apnea Hypopnea Syndrom Diagnosis Using Advanced ANN Techniques
Authors: Sachin Singh, Thomas Penzel, Dinesh Nandan
Abstract:
Accurate identification of Sleep Apnea Hypopnea Syndrom Diagnosis is difficult problem for human expert because of variability among persons and unwanted noise. This paper proposes the diagonosis of Sleep Apnea Hypopnea Syndrome (SAHS) using airflow, ECG, Pulse and SaO2 signals. The features of each type of these signals are extracted using statistical methods and ANN learning methods. These extracted features are used to approximate the patient's Apnea Hypopnea Index(AHI) using sample signals in model. Advance signal processing is also applied to snore sound signal to locate snore event and SaO2 signal is used to support whether determined snore event is true or noise. Finally, Apnea Hypopnea Index (AHI) event is calculated as per true snore event detected. Experiment results shows that the sensitivity can reach up to 96% and specificity to 96% as AHI greater than equal to 5.Keywords: neural network, AHI, statistical methods, autoregressive models
Procedia PDF Downloads 1207644 A Communication Signal Recognition Algorithm Based on Holder Coefficient Characteristics
Authors: Hui Zhang, Ye Tian, Fang Ye, Ziming Guo
Abstract:
Communication signal modulation recognition technology is one of the key technologies in the field of modern information warfare. At present, communication signal automatic modulation recognition methods are mainly divided into two major categories. One is the maximum likelihood hypothesis testing method based on decision theory, the other is a statistical pattern recognition method based on feature extraction. Now, the most commonly used is a statistical pattern recognition method, which includes feature extraction and classifier design. With the increasingly complex electromagnetic environment of communications, how to effectively extract the features of various signals at low signal-to-noise ratio (SNR) is a hot topic for scholars in various countries. To solve this problem, this paper proposes a feature extraction algorithm for the communication signal based on the improved Holder cloud feature. And the extreme learning machine (ELM) is used which aims at the problem of the real-time in the modern warfare to classify the extracted features. The algorithm extracts the digital features of the improved cloud model without deterministic information in a low SNR environment, and uses the improved cloud model to obtain more stable Holder cloud features and the performance of the algorithm is improved. This algorithm addresses the problem that a simple feature extraction algorithm based on Holder coefficient feature is difficult to recognize at low SNR, and it also has a better recognition accuracy. The results of simulations show that the approach in this paper still has a good classification result at low SNR, even when the SNR is -15dB, the recognition accuracy still reaches 76%.Keywords: communication signal, feature extraction, Holder coefficient, improved cloud model
Procedia PDF Downloads 1577643 Statistical Comparison of Machine and Manual Translation: A Corpus-Based Study of Gone with the Wind
Authors: Yanmeng Liu
Abstract:
This article analyzes and compares the linguistic differences between machine translation and manual translation, through a case study of the book Gone with the Wind. As an important carrier of human feeling and thinking, the literature translation poses a huge difficulty for machine translation, and it is supposed to expose distinct translation features apart from manual translation. In order to display linguistic features objectively, tentative uses of computerized and statistical evidence to the systematic investigation of large scale translation corpora by using quantitative methods have been deployed. This study compiles bilingual corpus with four versions of Chinese translations of the book Gone with the Wind, namely, Piao by Chunhai Fan, Piao by Huairen Huang, translations by Google Translation and Baidu Translation. After processing the corpus with the software of Stanford Segmenter, Stanford Postagger, and AntConc, etc., the study analyzes linguistic data and answers the following questions: 1. How does the machine translation differ from manual translation linguistically? 2. Why do these deviances happen? This paper combines translation study with the knowledge of corpus linguistics, and concretes divergent linguistic dimensions in translated text analysis, in order to present linguistic deviances in manual and machine translation. Consequently, this study provides a more accurate and more fine-grained understanding of machine translation products, and it also proposes several suggestions for machine translation development in the future.Keywords: corpus-based analysis, linguistic deviances, machine translation, statistical evidence
Procedia PDF Downloads 1457642 Economics of Oil and Its Stability in the Gulf Region
Authors: Al Mutawa A. Amir, Liaqat Ali, Faisal Ali
Abstract:
After the World War II, the world economy was disrupted and changed due to oil and its prices. The research in this paper presents the basic statistical features and economic characteristics of the Gulf economy. The main features of the Gulf economies and its heavy dependence on oil exports, its dualism between modern and traditional sectors and its rapidly increasing affluences are particularly emphasized. In this context, the research in this paper discussed the problems of growth versus development and has attempted to draw the implications for the future economic development of this area.Keywords: oil prices, GCC, economic growth, gulf oil
Procedia PDF Downloads 3357641 Relevant LMA Features for Human Motion Recognition
Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier
Abstract:
Motion recognition from videos is actually a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially motion representation step with relevant features. Our descriptor vector is inspired from Laban Movement Analysis method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on MSRC-12 and UTKinect datasets.Keywords: discriminative LMA features, features reduction, human motion recognition, random forest
Procedia PDF Downloads 1977640 Iris Recognition Based on the Low Order Norms of Gradient Components
Authors: Iman A. Saad, Loay E. George
Abstract:
Iris pattern is an important biological feature of human body; it becomes very hot topic in both research and practical applications. In this paper, an algorithm is proposed for iris recognition and a simple, efficient and fast method is introduced to extract a set of discriminatory features using first order gradient operator applied on grayscale images. The gradient based features are robust, up to certain extents, against the variations may occur in contrast or brightness of iris image samples; the variations are mostly occur due lightening differences and camera changes. At first, the iris region is located, after that it is remapped to a rectangular area of size 360x60 pixels. Also, a new method is proposed for detecting eyelash and eyelid points; it depends on making image statistical analysis, to mark the eyelash and eyelid as a noise points. In order to cover the features localization (variation), the rectangular iris image is partitioned into N overlapped sub-images (blocks); then from each block a set of different average directional gradient densities values is calculated to be used as texture features vector. The applied gradient operators are taken along the horizontal, vertical and diagonal directions. The low order norms of gradient components were used to establish the feature vector. Euclidean distance based classifier was used as a matching metric for determining the degree of similarity between the features vector extracted from the tested iris image and template features vectors stored in the database. Experimental tests were performed using 2639 iris images from CASIA V4-Interival database, the attained recognition accuracy has reached up to 99.92%.Keywords: iris recognition, contrast stretching, gradient features, texture features, Euclidean metric
Procedia PDF Downloads 3367639 Students' Statistical Reasoning and Attitudes towards Statistics in Blended Learning, E-Learning and On-Campus Learning
Authors: Petros Roussos
Abstract:
The present study focused on students' statistical reasoning related to Null Hypothesis Statistical Testing and p-values. Its objective was to test the hypothesis that neither the place (classroom, at a distance, online) nor the medium that actually supports the learning (ICT, internet, books) has an effect on understanding of statistical concepts. In addition, it was expected that students' attitudes towards statistics would not predict understanding of statistical concepts. The sample consisted of 385 undergraduate and postgraduate students from six state and private universities (five in Greece and one in Cyprus). Students were administered two questionnaires: a) the Greek version of the Survey of Attitudes Toward Statistics, and b) a short instrument which measures students' understanding of statistical significance and p-values. Results suggest that attitudes towards statistics do not predict students' understanding of statistical concepts, whereas the medium did not have an effect.Keywords: attitudes towards statistics, blended learning, e-learning, statistical reasoning
Procedia PDF Downloads 3107638 Statistical Wavelet Features, PCA, and SVM-Based Approach for EEG Signals Classification
Authors: R. K. Chaurasiya, N. D. Londhe, S. Ghosh
Abstract:
The study of the electrical signals produced by neural activities of human brain is called Electroencephalography. In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach is used to classify the EEG signal into two classes: epileptic seizure or not. In the proposed approach, we start with extracting the features by applying Discrete Wavelet Transform (DWT) in order to decompose the EEG signals into sub-bands. These features, extracted from details and approximation coefficients of DWT sub-bands, are used as input to Principal Component Analysis (PCA). The classification is based on reducing the feature dimension using PCA and deriving the support-vectors using Support Vector Machine (SVM). The experimental are performed on real and standard dataset. A very high level of classification accuracy is obtained in the result of classification.Keywords: discrete wavelet transform, electroencephalogram, pattern recognition, principal component analysis, support vector machine
Procedia PDF Downloads 6397637 Classification of Computer Generated Images from Photographic Images Using Convolutional Neural Networks
Authors: Chaitanya Chawla, Divya Panwar, Gurneesh Singh Anand, M. P. S Bhatia
Abstract:
This paper presents a deep-learning mechanism for classifying computer generated images and photographic images. The proposed method accounts for a convolutional layer capable of automatically learning correlation between neighbouring pixels. In the current form, Convolutional Neural Network (CNN) will learn features based on an image's content instead of the structural features of the image. The layer is particularly designed to subdue an image's content and robustly learn the sensor pattern noise features (usually inherited from image processing in a camera) as well as the statistical properties of images. The paper was assessed on latest natural and computer generated images, and it was concluded that it performs better than the current state of the art methods.Keywords: image forensics, computer graphics, classification, deep learning, convolutional neural networks
Procedia PDF Downloads 3387636 Impact of Variability in Delineation on PET Radiomics Features in Lung Tumors
Authors: Mahsa Falahatpour
Abstract:
Introduction: This study aims to explore how inter-observer variability in manual tumor segmentation impacts the reliability of radiomic features in non–small cell lung cancer (NSCLC). Methods: The study included twenty-three NSCLC tumors. Each patient had three tumor segmentations (VOL1, VOL2, VOL3) contoured on PET/CT scans by three radiation oncologists. Dice coefficients (DCS) were used to measure the segmentation variability. Radiomic features were extracted with 3D-slicer software, consisting of 66 features: first-order (n=15), second-order (GLCM, GLDM, GLRLM, and GLSZM) (n=33). The inter-observer variability of radiomic features was assessed using the intraclass correlation coefficient (ICC). An ICC > 0.8 indicates good stability. Results: The mean DSC of VOL1, VOL2, and VOL3 was 0.80 ± 0.04, 0.85 ± 0.03, and 0.76 ± 0.06, respectively. 92% of all extracted radiomic features were found to be stable (ICC > 0.8). The GLCM texture features had the highest stability (96%), followed by GLRLM features (90%) and GLSZM features (87%). The DSC was found to be highly correlated with the stability of radiomic features. Conclusion: The variability in inter-observer segmentation significantly impacts radiomics analysis, leading to a reduction in the number of appropriate radiomic features.Keywords: PET/CT, radiomics, radiotherapy, segmentation, NSCLC
Procedia PDF Downloads 477635 Local Binary Patterns-Based Statistical Data Analysis for Accurate Soccer Match Prediction
Authors: Mohammad Ghahramani, Fahimeh Saei Manesh
Abstract:
Winning a soccer game is based on thorough and deep analysis of the ongoing match. On the other hand, giant gambling companies are in vital need of such analysis to reduce their loss against their customers. In this research work, we perform deep, real-time analysis on every soccer match around the world that distinguishes our work from others by focusing on particular seasons, teams and partial analytics. Our contributions are presented in the platform called “Analyst Masters.” First, we introduce various sources of information available for soccer analysis for teams around the world that helped us record live statistical data and information from more than 50,000 soccer matches a year. Our second and main contribution is to introduce our proposed in-play performance evaluation. The third contribution is developing new features from stable soccer matches. The statistics of soccer matches and their odds before and in-play are considered in the image format versus time including the halftime. Local Binary patterns, (LBP) is then employed to extract features from the image. Our analyses reveal incredibly interesting features and rules if a soccer match has reached enough stability. For example, our “8-minute rule” implies if 'Team A' scores a goal and can maintain the result for at least 8 minutes then the match would end in their favor in a stable match. We could also make accurate predictions before the match of scoring less/more than 2.5 goals. We benefit from the Gradient Boosting Trees, GBT, to extract highly related features. Once the features are selected from this pool of data, the Decision trees decide if the match is stable. A stable match is then passed to a post-processing stage to check its properties such as betters’ and punters’ behavior and its statistical data to issue the prediction. The proposed method was trained using 140,000 soccer matches and tested on more than 100,000 samples achieving 98% accuracy to select stable matches. Our database from 240,000 matches shows that one can get over 20% betting profit per month using Analyst Masters. Such consistent profit outperforms human experts and shows the inefficiency of the betting market. Top soccer tipsters achieve 50% accuracy and 8% monthly profit in average only on regional matches. Both our collected database of more than 240,000 soccer matches from 2012 and our algorithm would greatly benefit coaches and punters to get accurate analysis.Keywords: soccer, analytics, machine learning, database
Procedia PDF Downloads 2397634 Tree Species Classification Using Effective Features of Polarimetric SAR and Hyperspectral Images
Authors: Milad Vahidi, Mahmod R. Sahebi, Mehrnoosh Omati, Reza Mohammadi
Abstract:
Forest management organizations need information to perform their work effectively. Remote sensing is an effective method to acquire information from the Earth. Two datasets of remote sensing images were used to classify forested regions. Firstly, all of extractable features from hyperspectral and PolSAR images were extracted. The optical features were spectral indexes related to the chemical, water contents, structural indexes, effective bands and absorption features. Also, PolSAR features were the original data, target decomposition components, and SAR discriminators features. Secondly, the particle swarm optimization (PSO) and the genetic algorithms (GA) were applied to select optimization features. Furthermore, the support vector machine (SVM) classifier was used to classify the image. The results showed that the combination of PSO and SVM had higher overall accuracy than the other cases. This combination provided overall accuracy about 90.56%. The effective features were the spectral index, the bands in shortwave infrared (SWIR) and the visible ranges and certain PolSAR features.Keywords: hyperspectral, PolSAR, feature selection, SVM
Procedia PDF Downloads 4197633 Local Spectrum Feature Extraction for Face Recognition
Authors: Muhammad Imran Ahmad, Ruzelita Ngadiran, Mohd Nazrin Md Isa, Nor Ashidi Mat Isa, Mohd ZaizuIlyas, Raja Abdullah Raja Ahmad, Said Amirul Anwar Ab Hamid, Muzammil Jusoh
Abstract:
This paper presents two technique, local feature extraction using image spectrum and low frequency spectrum modelling using GMM to capture the underlying statistical information to improve the performance of face recognition system. Local spectrum features are extracted using overlap sub block window that are mapping on the face image. For each of this block, spatial domain is transformed to frequency domain using DFT. A low frequency coefficient is preserved by discarding high frequency coefficients by applying rectangular mask on the spectrum of the facial image. Low frequency information is non Gaussian in the feature space and by using combination of several Gaussian function that has different statistical properties, the best feature representation can be model using probability density function. The recognition process is performed using maximum likelihood value computed using pre-calculate GMM components. The method is tested using FERET data sets and is able to achieved 92% recognition rates.Keywords: local features modelling, face recognition system, Gaussian mixture models, Feret
Procedia PDF Downloads 6697632 Statistical Convergence for the Approximation of Linear Positive Operators
Authors: Neha Bhardwaj
Abstract:
In this paper, we consider positive linear operators and study the Voronovskaya type result of the operator then obtain an error estimate in terms of the higher order modulus of continuity of the function being approximated and its A-statistical convergence. Also, we compute the corresponding rate of A-statistical convergence for the linear positive operators.Keywords: Poisson distribution, Voronovskaya, modulus of continuity, a-statistical convergence
Procedia PDF Downloads 3337631 Active Features Determination: A Unified Framework
Authors: Meenal Badki
Abstract:
We address the issue of active feature determination, where the objective is to determine the set of examples on which additional data (such as lab tests) needs to be gathered, given a large number of examples with some features (such as demographics) and some examples with all the features (such as the complete Electronic Health Record). We note that certain features may be more costly, unique, or laborious to gather. Our proposal is a general active learning approach that is independent of classifiers and similarity metrics. It allows us to identify examples that differ from the full data set and obtain all the features for the examples that match. Our comprehensive evaluation shows the efficacy of this approach, which is driven by four authentic clinical tasks.Keywords: feature determination, classification, active learning, sample-efficiency
Procedia PDF Downloads 77