Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 27028

Search results for: statistical data

26788 Investigated Optimization of Davidson Path Loss Model for Digital Terrestrial Television (DTTV) Propagation in Urban Area

Authors: Pitak Keawbunsong, Sathaporn Promwong

Abstract:

This paper presents an investigation on the efficiency of the optimized Davison path loss model in order to look for a suitable path loss model to design and planning DTTV propagation for small and medium urban areas in southern Thailand. Hadyai City in Songkla Province is chosen as the case study to collect the analytical data on the electric field strength. The optimization is conducted through the least square method while the efficiency index is through the statistical value of relative error (RE). The result of the least square method is the offset and slop of the frequency to be used in the optimized process. The statistical result shows that RE of the old Davidson model is at the least when being compared with the optimized Davison and the Hata models. Thus, the old Davison path loss model is the most accurate that further becomes the most optimized for the plan on the propagation network design.

Keywords: DTTV propagation, path loss model, Davidson model, least square method

Procedia PDF Downloads 345

26787 Statistical Analysis for Overdispersed Medical Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

Many researchers have suggested the use of zero inflated Poisson (ZIP) and zero inflated negative binomial (ZINB) models in modeling over-dispersed medical count data with extra variations caused by extra zeros and unobserved heterogeneity. The studies indicate that ZIP and ZINB always provide better fit than using the normal Poisson and negative binomial models in modeling over-dispersed medical count data. In this study, we proposed the use of Zero Inflated Inverse Trinomial (ZIIT), Zero Inflated Poisson Inverse Gaussian (ZIPIG) and zero inflated strict arcsine models in modeling over-dispersed medical count data. These proposed models are not widely used by many researchers especially in the medical field. The results show that these three suggested models can serve as alternative models in modeling over-dispersed medical count data. This is supported by the application of these suggested models to a real life medical data set. Inverse trinomial, Poisson inverse Gaussian, and strict arcsine are discrete distributions with cubic variance function of mean. Therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tailed. They are recommended to be used in modeling over-dispersed medical count data when ZIP and ZINB are inadequate.

Keywords: zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit

Procedia PDF Downloads 552

26786 Wind Power Forecast Error Simulation Model

Authors: Josip Vasilj, Petar Sarajcev, Damir Jakus

Abstract:

One of the major difficulties introduced with wind power penetration is the inherent uncertainty in production originating from uncertain wind conditions. This uncertainty impacts many different aspects of power system operation, especially the balancing power requirements. For this reason, in power system development planing, it is necessary to evaluate the potential uncertainty in future wind power generation. For this purpose, simulation models are required, reproducing the performance of wind power forecasts. This paper presents a wind power forecast error simulation models which are based on the stochastic process simulation. Proposed models capture the most important statistical parameters recognized in wind power forecast error time series. Furthermore, two distinct models are presented based on data availability. First model uses wind speed measurements on potential or existing wind power plant locations, while the seconds model uses statistical distribution of wind speeds.

Keywords: wind power, uncertainty, stochastic process, Monte Carlo simulation

Procedia PDF Downloads 488

26785 Local Binary Patterns-Based Statistical Data Analysis for Accurate Soccer Match Prediction

Authors: Mohammad Ghahramani, Fahimeh Saei Manesh

Abstract:

Winning a soccer game is based on thorough and deep analysis of the ongoing match. On the other hand, giant gambling companies are in vital need of such analysis to reduce their loss against their customers. In this research work, we perform deep, real-time analysis on every soccer match around the world that distinguishes our work from others by focusing on particular seasons, teams and partial analytics. Our contributions are presented in the platform called “Analyst Masters.” First, we introduce various sources of information available for soccer analysis for teams around the world that helped us record live statistical data and information from more than 50,000 soccer matches a year. Our second and main contribution is to introduce our proposed in-play performance evaluation. The third contribution is developing new features from stable soccer matches. The statistics of soccer matches and their odds before and in-play are considered in the image format versus time including the halftime. Local Binary patterns, (LBP) is then employed to extract features from the image. Our analyses reveal incredibly interesting features and rules if a soccer match has reached enough stability. For example, our “8-minute rule” implies if 'Team A' scores a goal and can maintain the result for at least 8 minutes then the match would end in their favor in a stable match. We could also make accurate predictions before the match of scoring less/more than 2.5 goals. We benefit from the Gradient Boosting Trees, GBT, to extract highly related features. Once the features are selected from this pool of data, the Decision trees decide if the match is stable. A stable match is then passed to a post-processing stage to check its properties such as betters’ and punters’ behavior and its statistical data to issue the prediction. The proposed method was trained using 140,000 soccer matches and tested on more than 100,000 samples achieving 98% accuracy to select stable matches. Our database from 240,000 matches shows that one can get over 20% betting profit per month using Analyst Masters. Such consistent profit outperforms human experts and shows the inefficiency of the betting market. Top soccer tipsters achieve 50% accuracy and 8% monthly profit in average only on regional matches. Both our collected database of more than 240,000 soccer matches from 2012 and our algorithm would greatly benefit coaches and punters to get accurate analysis.

Keywords: soccer, analytics, machine learning, database

Procedia PDF Downloads 242

26784 Urbanization and Income Inequality in Thailand

Authors: Acumsiri Tantikarnpanit

Abstract:

This paper aims to examine the relationship between urbanization and income inequality in Thailand during the period 2002–2020. Using a panel of data for 76 provinces collected from Thailand’s National Statistical Office (Labor Force Survey: LFS), as well as geospatial data from the U.S. Air Force Defense Meteorological Satellite Program (DMSP) and the Visible Infrared Imaging Radiometer Suite Day/Night band (VIIRS-DNB) satellite for nineteen selected years. This paper employs two different definitions to identify urban areas: 1) Urban areas defined by Thailand's National Statistical Office (Labor Force Survey: LFS), and 2) Urban areas estimated using nighttime light data from the DMSP and VIIRS-DNB satellite. The second method includes two sub-categories: 2.1) Determining urban areas by calculating nighttime light density with a population density of 300 people per square kilometer, and 2.2) Calculating urban areas based on nighttime light density corresponding to a population density of 1,500 people per square kilometer. The empirical analysis based on Ordinary Least Squares (OLS), fixed effects, and random effects models reveals a consistent U-shaped relationship between income inequality and urbanization. The findings from the econometric analysis demonstrate that urbanization or population density has a significant and negative impact on income inequality. Moreover, the square of urbanization shows a statistically significant positive impact on income inequality. Additionally, there is a negative association between logarithmically transformed income and income inequality. This paper also proposes the inclusion of satellite imagery, geospatial data, and spatial econometric techniques in future studies to conduct quantitative analysis of spatial relationships.

Keywords: income inequality, nighttime light, population density, Thailand, urbanization

Procedia PDF Downloads 81

26783 Mixture statistical modeling for predecting mortality human immunodeficiency virus (HIV) and tuberculosis(TB) infection patients

Authors: Mohd Asrul Affendi Bi Abdullah, Nyi Nyi Naing

Abstract:

The purpose of this study was to identify comparable manner between negative binomial death rate (NBDR) and zero inflated negative binomial death rate (ZINBDR) with died patients with (HIV + T B+) and (HIV + T B−). HIV and TB is a serious world wide problem in the developing country. Data were analyzed with applying NBDR and ZINBDR to make comparison which a favorable model is better to used. The ZINBDR model is able to account for the disproportionately large number of zero within the data and is shown to be a consistently better fit than the NBDR model. Hence, as a results ZINBDR model is a superior fit to the data than the NBDR model and provides additional information regarding the died mechanisms HIV+TB. The ZINBDR model is shown to be a use tool for analysis death rate according age categorical.

Keywords: zero inflated negative binomial death rate, HIV and TB, AIC and BIC, death rate

Procedia PDF Downloads 437

26782 Predictive Analytics for Theory Building

Authors: Ho-Won Jung, Donghun Lee, Hyung-Jin Kim

Abstract:

Predictive analytics (data analysis) uses a subset of measurements (the features, predictor, or independent variable) to predict another measurement (the outcome, target, or dependent variable) on a single person or unit. It applies empirical methods in statistics, operations research, and machine learning to predict the future, or otherwise unknown events or outcome on a single or person or unit, based on patterns in data. Most analyses of metabolic syndrome are not predictive analytics but statistical explanatory studies that build a proposed model (theory building) and then validate metabolic syndrome predictors hypothesized (theory testing). A proposed theoretical model forms with causal hypotheses that specify how and why certain empirical phenomena occur. Predictive analytics and explanatory modeling have their own territories in analysis. However, predictive analytics can perform vital roles in explanatory studies, i.e., scientific activities such as theory building, theory testing, and relevance assessment. In the context, this study is to demonstrate how to use our predictive analytics to support theory building (i.e., hypothesis generation). For the purpose, this study utilized a big data predictive analytics platform TM based on a co-occurrence graph. The co-occurrence graph is depicted with nodes (e.g., items in a basket) and arcs (direct connections between two nodes), where items in a basket are fully connected. A cluster is a collection of fully connected items, where the specific group of items has co-occurred in several rows in a data set. Clusters can be ranked using importance metrics, such as node size (number of items), frequency, surprise (observed frequency vs. expected), among others. The size of a graph can be represented by the numbers of nodes and arcs. Since the size of a co-occurrence graph does not depend directly on the number of observations (transactions), huge amounts of transactions can be represented and processed efficiently. For a demonstration, a total of 13,254 metabolic syndrome training data is plugged into the analytics platform to generate rules (potential hypotheses). Each observation includes 31 predictors, for example, associated with sociodemographic, habits, and activities. Some are intentionally included to get predictive analytics insights on variable selection such as cancer examination, house type, and vaccination. The platform automatically generates plausible hypotheses (rules) without statistical modeling. Then the rules are validated with an external testing dataset including 4,090 observations. Results as a kind of inductive reasoning show potential hypotheses extracted as a set of association rules. Most statistical models generate just one estimated equation. On the other hand, a set of rules (many estimated equations from a statistical perspective) in this study may imply heterogeneity in a population (i.e., different subpopulations with unique features are aggregated). Next step of theory development, i.e., theory testing, statistically tests whether a proposed theoretical model is a plausible explanation of a phenomenon interested in. If hypotheses generated are tested statistically with several thousand observations, most of the variables will become significant as the p-values approach zero. Thus, theory validation needs statistical methods utilizing a part of observations such as bootstrap resampling with an appropriate sample size.

Keywords: explanatory modeling, metabolic syndrome, predictive analytics, theory building

Procedia PDF Downloads 283

26781 Radar Signal Detection Using Neural Networks in Log-Normal Clutter for Multiple Targets Situations

Authors: Boudemagh Naime

Abstract:

Automatic radar detection requires some methods of adapting to variations in the background clutter in order to control their false alarm rate. The problem becomes more complicated in non-Gaussian environment. In fact, the conventional approach in real time applications requires a complex statistical modeling and much computational operations. To overcome these constraints, we propose another approach based on artificial neural network (ANN-CMLD-CFAR) using a Back Propagation (BP) training algorithm. The considered environment follows a log-normal distribution in the presence of multiple Rayleigh-targets. To evaluate the performances of the considered detector, several situations, such as scale parameter and the number of interferes targets, have been investigated. The simulation results show that the ANN-CMLD-CFAR processor outperforms the conventional statistical one.

Keywords: radat detection, ANN-CMLD-CFAR, log-normal clutter, statistical modelling

Procedia PDF Downloads 368

26780 Troubleshooting Petroleum Equipment Based on Wireless Sensors Based on Bayesian Algorithm

Authors: Vahid Bayrami Rad

Abstract:

In this research, common methods and techniques have been investigated with a focus on intelligent fault finding and monitoring systems in the oil industry. In fact, remote and intelligent control methods are considered a necessity for implementing various operations in the oil industry, but benefiting from the knowledge extracted from countless data generated with the help of data mining algorithms. It is a avoid way to speed up the operational process for monitoring and troubleshooting in today's big oil companies. Therefore, by comparing data mining algorithms and checking the efficiency and structure and how these algorithms respond in different conditions, The proposed (Bayesian) algorithm using data clustering and their analysis and data evaluation using a colored Petri net has provided an applicable and dynamic model from the point of view of reliability and response time. Therefore, by using this method, it is possible to achieve a dynamic and consistent model of the remote control system and prevent the occurrence of leakage in oil pipelines and refineries and reduce costs and human and financial errors. Statistical data The data obtained from the evaluation process shows an increase in reliability, availability and high speed compared to other previous methods in this proposed method.

Keywords: wireless sensors, petroleum equipment troubleshooting, Bayesian algorithm, colored Petri net, rapid miner, data mining-reliability

Procedia PDF Downloads 72

26779 Comparative Study to Evaluate Chronological Age and Dental Age in North Indian Population Using Cameriere Method

Authors: Ranjitkumar Patil

Abstract:

Age estimation has its importance in forensic dentistry. Dental age estimation has emerged as an alternative to skeletal age determination. The methods based on stages of tooth formation, as appreciated on radiographs, seems to be more appropriate in the assessment of age than those based on skeletal development. The study was done to evaluate dental age in north Indian population using Cameriere’s method. Aims/Objectives: The study was conducted to assess the dental age of North Indian children using Cameriere’smethodand to compare the chronological age and dental age for validation of the Cameriere’smethod in the north Indian population. A comparative study of 02 year duration on the OPG (using PLANMECA Promax 3D) data of 497 individuals with age ranging from 5 to 15 years was done based on simple random technique ethical approval obtained from the institutional ethical committee. The data was obtained based on inclusion and exclusion criteria was analyzed by a software for dental age estimation. Statistical analysis: Student’s t test was used to compare the morphological variables of males with those of females and to compare observed age with estimated age. Regression formula was also calculated. Results: Present study was a comparative study of 497 subjects with a distribution between male and female, with their dental age assessed by using Panoramic radiograph, following the method described by Cameriere, which is widely accepted. Statistical analysis in our study indicated that gender does not have a significant influence on age estimation. (R2= 0.787). Conclusion: This infers that cameriere’s method can be effectively applied in north Indianpopulation.

Keywords: Forensic, Chronological Age, Dental Age, Skeletal Age

Procedia PDF Downloads 92

26778 Statistical Randomness Testing of Some Second Round Candidate Algorithms of CAESAR Competition

Authors: Fatih Sulak, Betül A. Özdemir, Beyza Bozdemir

Abstract:

In order to improve symmetric key research, several competitions had been arranged by organizations like National Institute of Standards and Technology (NIST) and International Association for Cryptologic Research (IACR). In recent years, the importance of authenticated encryption has rapidly increased because of the necessity of simultaneously enabling integrity, confidentiality and authenticity. Therefore, at January 2013, IACR announced the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR Competition) which will select secure and efficient algorithms for authenticated encryption. Cryptographic algorithms are anticipated to behave like random mappings; hence, it is important to apply statistical randomness tests to the outputs of the algorithms. In this work, the statistical randomness tests in the NIST Test Suite and the other recently designed randomness tests are applied to six second round algorithms of the CAESAR Competition. It is observed that AEGIS achieves randomness after 3 rounds, Ascon permutation function achieves randomness after 1 round, Joltik encryption function achieves randomness after 9 rounds, Morus state update function achieves randomness after 3 rounds, Pi-cipher achieves randomness after 1 round, and Tiaoxin achieves randomness after 1 round.

Keywords: authenticated encryption, CAESAR competition, NIST test suite, statistical randomness tests

Procedia PDF Downloads 318

26777 Quality of Age Reporting from Tanzania 2012 Census Results: An Assessment Using Whipple’s Index, Myer’s Blended Index, and Age-Sex Accuracy Index

Authors: A. Sathiya Susuman, Hamisi F. Hamisi

Abstract:

Background: Many socio-economic and demographic data are age-sex attributed. However, a variety of irregularities and misstatement are noted with respect to age-related data and less to sex data because of its biological differences between the genders. Noting the misstatement/misreporting of age data regardless of its significance importance in demographics and epidemiological studies, this study aims at assessing the quality of 2012 Tanzania Population and Housing Census Results. Methods: Data for the analysis are downloaded from Tanzania National Bureau of Statistics. Age heaping and digit preference were measured using summary indices viz., Whipple’s index, Myers’ blended index, and Age-Sex Accuracy index. Results: The recorded Whipple’s index for both sexes was 154.43; male has the lowest index of about 152.65 while female has the highest index of about 156.07. For Myers’ blended index, the preferences were at digits ‘0’ and ‘5’ while avoidance were at digits ‘1’ and ‘3’ for both sexes. Finally, Age-sex index stood at 59.8 where sex ratio score was 5.82 and age ratio scores were 20.89 and 21.4 for males and female respectively. Conclusion: The evaluation of the 2012 PHC data using the demographic techniques has qualified the data inaccurate as the results of systematic heaping and digit preferences/avoidances. Thus, innovative methods in data collection along with measuring and minimizing errors using statistical techniques should be used to ensure accuracy of age data.

Keywords: age heaping, digit preference/avoidance, summary indices, Whipple’s index, Myer’s index, age-sex accuracy index

Procedia PDF Downloads 479

26776 A Mixed-Method Exploration of the Interrelationship between Corporate Governance and Firm Performance

Authors: Chen Xiatong

Abstract:

The study aims to explore the interrelationship between corporate governance factors and firm performance in Mainland China using a mixed-method approach. To clarify the current effectiveness of corporate governance, uncover the complex interrelationships between governance factors and firm performance, and enhance understanding of corporate governance strategies in Mainland China. The research involves quantitative methods like statistical analysis of governance factors and firm performance data, as well as qualitative approaches including policy research, case studies, and interviews with staff members. The study aims to reveal the current effectiveness of corporate governance in Mainland China, identify complex interrelationships between governance factors and firm performance, and provide suggestions for companies to enhance their governance practices. The research contributes to enriching the literature on corporate governance by providing insights into the effectiveness of governance practices in Mainland China and offering suggestions for improvement. Quantitative data will be gathered through surveys and sampling methods, focusing on governance factors and firm performance indicators. Qualitative data will be collected through policy research, case studies, and interviews with staff members. Quantitative data will be analyzed using statistical, mathematical, and computational techniques. Qualitative data will be analyzed through thematic analysis and interpretation of policy documents, case study findings, and interview responses. The study addresses the effectiveness of corporate governance in Mainland China, the interrelationship between governance factors and firm performance, and staff members' perceptions of corporate governance strategies. The research aims to enhance understanding of corporate governance effectiveness, enrich the literature on governance practices, and contribute to the field of business management and human resources management in Mainland China.

Keywords: corporate governance, business management, human resources management, board of directors

Procedia PDF Downloads 59

26775 The Impact of Data Science on Geography: A Review

Authors: Roberto Machado

Abstract:

We conducted a systematic review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses methodology, analyzing 2,996 studies and synthesizing 41 of them to explore the evolution of data science and its integration into geography. By employing optimization algorithms, we accelerated the review process, significantly enhancing the efficiency and precision of literature selection. Our findings indicate that data science has developed over five decades, facing challenges such as the diversified integration of data and the need for advanced statistical and computational skills. In geography, the integration of data science underscores the importance of interdisciplinary collaboration and methodological innovation. Techniques like large-scale spatial data analysis and predictive algorithms show promise in natural disaster management and transportation route optimization, enabling faster and more effective responses. These advancements highlight the transformative potential of data science in geography, providing tools and methodologies to address complex spatial problems. The relevance of this study lies in the use of optimization algorithms in systematic reviews and the demonstrated need for deeper integration of data science into geography. Key contributions include identifying specific challenges in combining diverse spatial data and the necessity for advanced computational skills. Examples of connections between these two fields encompass significant improvements in natural disaster management and transportation efficiency, promoting more effective and sustainable environmental solutions with a positive societal impact.

Keywords: data science, geography, systematic review, optimization algorithms, supervised learning

Procedia PDF Downloads 40

26774 A User Identification Technique to Access Big Data Using Cloud Services

Authors: A. R. Manu, V. K. Agrawal, K. N. Balasubramanya Murthy

Abstract:

Authentication is required in stored database systems so that only authorized users can access the data and related cloud infrastructures. This paper proposes an authentication technique using multi-factor and multi-dimensional authentication system with multi-level security. The proposed technique is likely to be more robust as the probability of breaking the password is extremely low. This framework uses a multi-modal biometric approach and SMS to enforce additional security measures with the conventional Login/password system. The robustness of the technique is demonstrated mathematically using a statistical analysis. This work presents the authentication system along with the user authentication architecture diagram, activity diagrams, data flow diagrams, sequence diagrams, and algorithms.

Keywords: design, implementation algorithms, performance, biometric approach

Procedia PDF Downloads 481

26773 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 301

26772 Detection of Change Points in Earthquakes Data: A Bayesian Approach

Authors: F. A. Al-Awadhi, D. Al-Hulail

Abstract:

In this study, we applied the Bayesian hierarchical model to detect single and multiple change points for daily earthquake body wave magnitude. The change point analysis is used in both backward (off-line) and forward (on-line) statistical research. In this study, it is used with the backward approach. Different types of change parameters are considered (mean, variance or both). The posterior model and the conditional distributions for single and multiple change points are derived and implemented using BUGS software. The model is applicable for any set of data. The sensitivity of the model is tested using different prior and likelihood functions. Using Mb data, we concluded that during January 2002 and December 2003, three changes occurred in the mean magnitude of Mb in Kuwait and its vicinity.

Keywords: multiple change points, Markov Chain Monte Carlo, earthquake magnitude, hierarchical Bayesian mode

Procedia PDF Downloads 460

26771 Automatic Tuning for a Systemic Model of Banking Originated Losses (SYMBOL) Tool on Multicore

Authors: Ronal Muresano, Andrea Pagano

Abstract:

Nowadays, the mathematical/statistical applications are developed with more complexity and accuracy. However, these precisions and complexities have brought as result that applications need more computational power in order to be executed faster. In this sense, the multicore environments are playing an important role to improve and to optimize the execution time of these applications. These environments allow us the inclusion of more parallelism inside the node. However, to take advantage of this parallelism is not an easy task, because we have to deal with some problems such as: cores communications, data locality, memory sizes (cache and RAM), synchronizations, data dependencies on the model, etc. These issues are becoming more important when we wish to improve the application’s performance and scalability. Hence, this paper describes an optimization method developed for Systemic Model of Banking Originated Losses (SYMBOL) tool developed by the European Commission, which is based on analyzing the application's weakness in order to exploit the advantages of the multicore. All these improvements are done in an automatic and transparent manner with the aim of improving the performance metrics of our tool. Finally, experimental evaluations show the effectiveness of our new optimized version, in which we have achieved a considerable improvement on the execution time. The time has been reduced around 96% for the best case tested, between the original serial version and the automatic parallel version.

Keywords: algorithm optimization, bank failures, OpenMP, parallel techniques, statistical tool

Procedia PDF Downloads 372

26770 Statistical Characteristics of Distribution of Radiation-Induced Defects under Random Generation

Authors: P. Selyshchev

Abstract:

We consider fluctuations of defects density taking into account their interaction. Stochastic field of displacement generation rate gives random defect distribution. We determinate statistical characteristics (mean and dispersion) of random field of point defect distribution as function of defect generation parameters, temperature and properties of irradiated crystal.

Keywords: irradiation, primary defects, interaction, fluctuations

Procedia PDF Downloads 346

26769 Advances in Artificial intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden markov models (HMM), statistical models of speech recognition, human machine performance

Procedia PDF Downloads 480

26768 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the better fit of data. Sometimes, exogenous variables do not show a significant strength of their direct and indirect effect when the assumption of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main motive of this article is to investigate the efficacy of the copula-based regression approach over the classical regression approach and calculate the direct and indirect effects of variables when data violates the OLS assumption and variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is also presented to demonstrate the performance of the superiority of the copula approach.

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 77

26767 Statistical Analysis of Rainfall Change over the Blue Nile Basin

Authors: Hany Mustafa, Mahmoud Roushdi, Khaled Kheireldin

Abstract:

Rainfall variability is an important feature of semi-arid climates. Climate change is very likely to increase the frequency, magnitude, and variability of extreme weather events such as droughts, floods, and storms. The Blue Nile Basin is facing extreme climate change-related events such as floods and droughts and its possible impacts on ecosystem, livelihood, agriculture, livestock, and biodiversity are expected. Rainfall variability is a threat to food production in the Blue Nile Basin countries. This study investigates the long-term variations and trends of seasonal and annual precipitation over the Blue Nile Basin for 102-year period (1901-2002). Six statistical trend analysis of precipitation was performed with nonparametric Mann-Kendall test and Sen's slope estimator. On the other hands, four statistical absolute homogeneity tests: Standard Normal Homogeneity Test, Buishand Range test, Pettitt test and the Von Neumann ratio test were applied to test the homogeneity of the rainfall data, using XLSTAT software, which results of p-valueless than alpha=0.05, were significant. The percentages of significant trends obtained for each parameter in the different seasons are presented. The study recommends adaptation strategies to be streamlined to relevant policies, enhancing local farmers’ adaptive capacity for facing future climate change effects.

Keywords: Blue Nile basin, climate change, Mann-Kendall test, trend analysis

Procedia PDF Downloads 555

26766 Different Data-Driven Bivariate Statistical Approaches to Landslide Susceptibility Mapping (Uzundere, Erzurum, Turkey)

Authors: Azimollah Aleshzadeh, Enver Vural Yavuz

Abstract:

The main goal of this study is to produce landslide susceptibility maps using different data-driven bivariate statistical approaches; namely, entropy weight method (EWM), evidence belief function (EBF), and information content model (ICM), at Uzundere county, Erzurum province, in the north-eastern part of Turkey. Past landslide occurrences were identified and mapped from an interpretation of high-resolution satellite images, and earlier reports as well as by carrying out field surveys. In total, 42 landslide incidence polygons were mapped using ArcGIS 10.4.1 software and randomly split into a construction dataset 70 % (30 landslide incidences) for building the EWM, EBF, and ICM models and the remaining 30 % (12 landslides incidences) were used for verification purposes. Twelve layers of landslide-predisposing parameters were prepared, including total surface radiation, maximum relief, soil groups, standard curvature, distance to stream/river sites, distance to the road network, surface roughness, land use pattern, engineering geological rock group, topographical elevation, the orientation of slope, and terrain slope gradient. The relationships between the landslide-predisposing parameters and the landslide inventory map were determined using different statistical models (EWM, EBF, and ICM). The model results were validated with landslide incidences, which were not used during the model construction. In addition, receiver operating characteristic curves were applied, and the area under the curve (AUC) was determined for the different susceptibility maps using the success (construction data) and prediction (verification data) rate curves. The results revealed that the AUC for success rates are 0.7055, 0.7221, and 0.7368, while the prediction rates are 0.6811, 0.6997, and 0.7105 for EWM, EBF, and ICM models, respectively. Consequently, landslide susceptibility maps were classified into five susceptibility classes, including very low, low, moderate, high, and very high. Additionally, the portion of construction and verification landslides incidences in high and very high landslide susceptibility classes in each map was determined. The results showed that the EWM, EBF, and ICM models produced satisfactory accuracy. The obtained landslide susceptibility maps may be useful for future natural hazard mitigation studies and planning purposes for environmental protection.

Keywords: entropy weight method, evidence belief function, information content model, landslide susceptibility mapping

Procedia PDF Downloads 135

26765 Development of Energy Benchmarks Using Mandatory Energy and Emissions Reporting Data: Ontario Post-Secondary Residences

Authors: C. Xavier Mendieta, J. J McArthur

Abstract:

Governments are playing an increasingly active role in reducing carbon emissions, and a key strategy has been the introduction of mandatory energy disclosure policies. These policies have resulted in a significant amount of publicly available data, providing researchers with a unique opportunity to develop location-specific energy and carbon emission benchmarks from this data set, which can then be used to develop building archetypes and used to inform urban energy models. This study presents the development of such a benchmark using the public reporting data. The data from Ontario’s Ministry of Energy for Post-Secondary Educational Institutions are being used to develop a series of building archetype dynamic building loads and energy benchmarks to fill a gap in the currently available building database. This paper presents the development of a benchmark for college and university residences within ASHRAE climate zone 6 areas in Ontario using the mandatory disclosure energy and greenhouse gas emissions data. The methodology presented includes data cleaning, statistical analysis, and benchmark development, and lessons learned from this investigation are presented and discussed to inform the development of future energy benchmarks from this larger data set. The key findings from this initial benchmarking study are: (1) the importance of careful data screening and outlier identification to develop a valid dataset; (2) the key features used to develop a model of the data are building age, size, and occupancy schedules and these can be used to estimate energy consumption; and (3) policy changes affecting the primary energy generation significantly affected greenhouse gas emissions, and consideration of these factors was critical to evaluate the validity of the reported data.

Keywords: building archetypes, data analysis, energy benchmarks, GHG emissions

Procedia PDF Downloads 310

26764 Statistical Feature Extraction Method for Wood Species Recognition System

Authors: Mohd Iz'aan Paiz Bin Zamri, Anis Salwa Mohd Khairuddin, Norrima Mokhtar, Rubiyah Yusof

Abstract:

Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at custom checkpoints to avoid mislabeling of timber which will results to loss of income to the timber industry. The system focuses on analyzing the statistical pores properties of the wood images. This paper proposed a fuzzy-based feature extractor which mimics the experts’ knowledge on wood texture to extract the properties of pores distribution from the wood surface texture. The proposed feature extractor consists of two steps namely pores extraction and fuzzy pores management. The total number of statistical features extracted from each wood image is 38 features. Then, a backpropagation neural network is used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database composed of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts’ interpretation on wood texture which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.

Keywords: classification, feature extraction, fuzzy, inspection system, image analysis, macroscopic images

Procedia PDF Downloads 430

26763 Geostatistical and Geochemical Study of the Aquifer System Waters Complex Terminal in the Valley of Oued Righ-Arid Area Algeria

Authors: Asma Bettahar, Imed Eddine Nezli, Sameh Habes

Abstract:

Groundwater resources in the Oued Righ valley are represented like the parts of the eastern basin of the Algerian Sahara, superposed by two major aquifers: the Intercalary Continental (IC) and the Terminal Complex (TC). From a qualitative point of view, various studies have highlighted that the waters of this region showed excessive mineralization, including the waters of the terminal complex (EC Avg equal 5854.61 S/cm) .The present article is a statistical approach by two multi methods various complementary (ACP, CAH), applied to the analytical data of multilayered aquifer waters Terminal Complex of the Oued Righ valley. The approach is to establish a correlation between the chemical composition of water and the lithological nature of different aquifer levels formations, and predict possible connection between groundwater’s layers. The results show that the mineralization of water is from geological origin. They concern the composition of the layers that make up the complex terminal.

Keywords: complex terminal, mineralization, oued righ, statistical approach

Procedia PDF Downloads 395

26762 Anxiety and Depression in Caregivers of Autistic Children

Authors: Mou Juliet Rebeiro, S. M. Abul Kalam Azad

Abstract:

This study was carried out to see the anxiety and depression in caregivers of autistic children. The objectives of the research were to assess depression and anxiety among caregivers of autistic children and to find out the experience of caregivers. For this purpose, the research was conducted on a sample of 39 caregivers of autistic children. Participants were taken from a special school. To collect data for this study each of the caregivers were administered questionnaire comprising scales to measure anxiety and depression and some responses of the participants were taken through interview based on a topic guide. Obtained quantitative data were analyzed by using statistical analysis and qualitative data were analyzed according to themes. Mean of the anxiety score (55.85) and depression score (108.33) is above the cutoff point. Results showed that anxiety and depression is clinically present in caregivers of autistic children. Most of the caregivers experienced behavior, emotional, cognitive and social problems of their child that is linked with anxiety and depression.

Keywords: anxiety, autism, caregiver, depression

Procedia PDF Downloads 308

26761 A Comparative Study to Evaluate Chronological Age and Dental Age in the North Indian Population Using Cameriere's Method

Authors: Ranjitkumar Patil

Abstract:

Age estimation has importance in forensic dentistry. Dental age estimation has emerged as an alternative to skeletal age determination. The methods based on stages of tooth formation, as appreciated on radiographs, seem to be more appropriate in the assessment of age than those based on skeletal development. The study was done to evaluate dental age in the north Indian population using Cameriere’s method. Aims/Objectives: The study was conducted to assess the dental age of North Indian children using Cameriere’s method and to compare the chronological age and dental age for validation of the Cameriere’s method in the north Indian population. A comparative study of 02-year duration on the OPG (using PLANMECA Promax 3D) data of 497 individuals with ages ranging from 5 to 15 years was done based on simple random technique ethical approval obtained from institutional ethical committee. The data was obtained based on inclusion and exclusion criteria and was analyzed by software for dental age estimation. Statistical analysis: The student’s t-test was used to compare the morphological variables of males with those of females and to compare observed age with estimated age. The regression formula was also calculated. Results: Present study was a comparative study of 497 subjects with a distribution between males and females, with their dental age assessed by using a Panoramic radiograph, following the method described by Cameriere, which is widely accepted. Statistical analysis in our study indicated that gender does not have a significant influence on age estimation. (R2= 0.787). Conclusion: This infers that Cameriere’s method can be effectively applied to the north Indian population.

Keywords: forensic, dental age, skeletal age, chronological age, Cameriere’s method

Procedia PDF Downloads 118

26760 Time of Week Intensity Estimation from Interval Censored Data with Application to Police Patrol Planning

Authors: Jiahao Tian, Michael D. Porter

Abstract:

Law enforcement agencies are tasked with crime prevention and crime reduction under limited resources. Having an accurate temporal estimate of the crime rate would be valuable to achieve such a goal. However, estimation is usually complicated by the interval-censored nature of crime data. We cast the problem of intensity estimation as a Poisson regression using an EM algorithm to estimate the parameters. Two special penalties are added that provide smoothness over the time of day and day of the week. This approach presented here provides accurate intensity estimates and can also uncover day-of-week clusters that share the same intensity patterns. Anticipating where and when crimes might occur is a key element to successful policing strategies. However, this task is complicated by the presence of interval-censored data. The censored data refers to the type of data that the event time is only known to lie within an interval instead of being observed exactly. This type of data is prevailing in the field of criminology because of the absence of victims for certain types of crime. Despite its importance, the research in temporal analysis of crime has lagged behind the spatial component. Inspired by the success of solving crime-related problems with a statistical approach, we propose a statistical model for the temporal intensity estimation of crime with censored data. The model is built on Poisson regression and has special penalty terms added to the likelihood. An EM algorithm was derived to obtain maximum likelihood estimates, and the resulting model shows superior performance to the competing model. Our research is in line with the smart policing initiative (SPI) proposed by the Bureau Justice of Assistance (BJA) as an effort to support law enforcement agencies in building evidence-based, data-driven law enforcement tactics. The goal is to identify strategic approaches that are effective in crime prevention and reduction. In our case, we allow agencies to deploy their resources for a relatively short period of time to achieve the maximum level of crime reduction. By analyzing a particular area within cities where data are available, our proposed approach could not only provide an accurate estimate of intensities for the time unit considered but a time-variation crime incidence pattern. Both will be helpful in the allocation of limited resources by either improving the existing patrol plan with the understanding of the discovery of the day of week cluster or supporting extra resources available.

Keywords: cluster detection, EM algorithm, interval censoring, intensity estimation

Procedia PDF Downloads 69

26759 Research on Transmission Parameters Determination Method Based on Dynamic Characteristic Analysis

Authors: Baoshan Huang, Fanbiao Bao, Bing Li, Lianghua Zeng, Yi Zheng

Abstract:

Parameter control strategy based on statistical characteristics can analyze the choice of the transmission ratio of an automobile transmission. According to the difference of the transmission gear, the number and spacing of the gear can be determined. Transmission ratio distribution of transmission needs to satisfy certain distribution law. According to the statistic characteristics of driving parameters, the shift control strategy of the vehicle is analyzed. CVT shift schedule adjustment algorithm based on statistical characteristic parameters can be seen from the above analysis, if according to the certain algorithm to adjust the size of, can adjust the target point are in the best efficiency curve and dynamic curve between the location, to alter the vehicle characteristics. Based on the dynamic characteristics and the practical application of the vehicle, this paper presents the setting scheme of the transmission ratio.

Keywords: vehicle dynamics, transmission ratio, transmission parameters, statistical characteristics

Procedia PDF Downloads 409