Search results for: multivariate data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 40968

40818 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze exponentially growing big data with traditional technologies. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual data of different sizes. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop implementation with the lm function and the biglm package available on bigmemory. The results showed that RHadoop was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.
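The abstract does not describe how the regression was parallelized. One common way to express multiple regression in map/reduce form, shown here only as a minimal sketch and not as the authors' RHadoop code, is to let each map task compute the partial sums X'X and X'y for its chunk, add them in the reduce step, and solve the normal equations once. All function names and the toy data below are hypothetical.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: return the partial (X'X, X'y) for one chunk of (features, target) rows."""
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, target in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * target
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_pair(a, b):
    """Reduce step: element-wise sum of two partial (X'X, X'y) results."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta


def chunked_regression(chunks):
    """Combine per-chunk sums and solve the normal equations once."""
    xtx, xty = reduce(reduce_pair, map(map_chunk, chunks))
    return solve(xtx, xty)


# Toy data generated from y = 1 + 2*x1 + 3*x2, split into two chunks ("nodes").
data = [((x1, x2), 1 + 2 * x1 + 3 * x2) for x1 in range(4) for x2 in range(3)]
coefficients = chunked_regression([data[:6], data[6:]])
print(coefficients)
```

Because the partial sums are additive, the result does not depend on how the rows are split across chunks, which is what makes the computation easy to distribute.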

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 413
40817 The Effect of Transactional Analysis Group Training on Self-Knowledge and Its Ego States (The Child, Parent, and Adult): A Quasi-Experimental Study Applied to Counselors of Tehran

Authors: Mehravar Javid, Sadrieh Khajavi Mazanderani, Kelly Gleischman, Zoe Andris

Abstract:

The present study investigated the effectiveness of transactional analysis (TA) group training on self-knowledge and its dimensions (self, child, and adult) in counselors working in public and private high schools in Tehran. Counseling has become an important job for society, and there is a need for consultants in organizations. Providing better and more efficient counseling is one of the goals of the education system. The personal characteristics of counselors are important for the success of the therapy. In TA, humans have three ego states, named parent, adult, and child, and the main concept in transactional analysis is the self-state, which means a stable feeling and pattern of thinking related to behavioral patterns. Self-knowledge, considered a prerequisite to effective communication, fosters psychological growth, and recognizing it is pivotal for emotional development, leading to profound insights. The research sample included 30 working counselors (22 women and 8 men) in the academic year 2019-2020 who achieved the lowest scores on the self-knowledge questionnaire. The research method was quasi-experimental with a control group (15 people in the experimental group and 15 people in the control group). The research tool was a self-awareness questionnaire with 29 questions and three subscales (child, parent, and adult ego states). The experimental group received transactional analysis training in 10 once-weekly 2-hour sessions; the questionnaire was administered to both groups (post-test). Multivariate analysis of covariance was used to analyze the data. The data showed that the level of self-awareness of counselors who received transactional analysis training was higher than that of counselors who did not receive any training (p<0.01).
The result obtained from this analysis shows that transactional analysis training is an effective therapy for enhancing self-knowledge and its subscales (adult ego state, parent ego state, and child ego state). Teaching transactional analysis increases self-knowledge and self-realization and helps people achieve independence and overcome irresponsibility, improving intrapersonal and interpersonal relationships.

Keywords: ego state, group, transactional analysis, self-knowledge

Procedia PDF Downloads 52
40816 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method for applying Independent Topic Analysis (ITA) to a growing collection of document data. The volume of document data has been increasing since the spread of the Internet. ITA was proposed as one method for analyzing document data: it extracts independent topics from the documents using Independent Component Analysis (ICA), a technique from signal processing. However, it is difficult to apply ITA to a growing collection, because ITA must use all of the document data, so its time and space costs are very high. Therefore, we present Incremental ITA, which extracts independent topics from a growing collection of document data. Incremental ITA updates the independent topics when new document data are added, starting from the topics extracted from the preceding data. We also show the results of applying Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 286
40815 A Review of Spatial Analysis as a Geographic Information Management Tool

Authors: Chidiebere C. Agoha, Armstong C. Awuzie, Chukwuebuka N. Onwubuariri, Joy O. Njoku

Abstract:

Spatial analysis is a field of study that utilizes geographic or spatial information to understand and analyze patterns, relationships, and trends in data. It is characterized by the use of geographic or spatial information, which allows for the analysis of data in the context of its location and surroundings. It is different from non-spatial or aspatial techniques, which do not consider the geographic context and may not provide as complete an understanding of the data. Spatial analysis is applied in a variety of fields, including urban planning, environmental science, geosciences, epidemiology, and marketing, to gain insights and make decisions about complex spatial problems. This review paper explores definitions of spatial analysis from various sources, including examples of its application and different analysis techniques such as buffer analysis, interpolation, and kernel density analysis (multi-distance spatial cluster analysis). It also contrasts spatial analysis with non-spatial analysis.
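Two of the techniques named above can be illustrated in a few lines. The sketch below assumes planar coordinates and uses inverse distance weighting (IDW) as one example of interpolation; the review itself does not say which interpolation method it covers, and the site data and function names are hypothetical.

```python
import math


def within_buffer(center, sites, radius):
    """Buffer analysis: keep the sites whose distance to `center` is at most `radius`."""
    cx, cy = center
    return [s for s in sites if math.hypot(s[0] - cx, s[1] - cy) <= radius]


def idw(target, sites, power=2.0):
    """Inverse distance weighting: estimate the value at `target` from (x, y, value) sites."""
    tx, ty = target
    numerator = 0.0
    denominator = 0.0
    for x, y, value in sites:
        distance = math.hypot(x - tx, y - ty)
        if distance == 0.0:
            return value  # the target coincides with a measured site
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical measurement sites: (x, y, measured value).
sites = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 3.0, 40.0)]
nearby = within_buffer((0.0, 0.0), sites, radius=2.5)
estimate = idw((1.0, 0.0), sites)
print(len(nearby), estimate)
```

A non-spatial summary of the same data, such as the plain mean of the three values, would ignore where the sites are, which is the contrast the review draws.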

Keywords: aspatial technique, buffer analysis, epidemiology, interpolation

Procedia PDF Downloads 291
40814 Frailty Models for Modeling Heterogeneity: Simulation Study and Application to Quebec Pension Plan

Authors: Souad Romdhane, Lotfi Belkacem

Abstract:

In actuarial analysis of lifetimes, only models accounting for observable risk factors have been developed. Within this context, the Cox proportional hazards (CPH) model is commonly used to assess the effects of observable covariates such as gender, age, and smoking habits on the hazard rates. These covariates may fail to fully account for the true lifetime interval. This may be due to the existence of another random variable (frailty) that is still being ignored. The aim of this paper is to examine the shared frailty issue in the Cox proportional hazards model by including two different parametric forms of frailty in the hazard function. Four estimation methods are used to fit them. The performance of the parameter estimates is assessed and compared between the classical Cox model and these frailty models, first on a real-life data set from the Quebec Pension Plan and then in a more general simulation study. This performance is investigated in terms of the bias of point estimates and their empirical standard errors in both the fixed and random effect parts. Both the simulation and the real data set studies showed differences between the classical Cox model and the shared frailty model.
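For orientation, the usual textbook form of a shared frailty extension of the Cox model is given below. The abstract does not name the two frailty distributions it uses, so the gamma frailty with mean 1 and variance \theta is an assumed example; i indexes the group sharing a frailty and j the individual within it.

```latex
h_{ij}(t \mid z_i) = z_i \, h_0(t) \, \exp\!\left(x_{ij}^{\top}\beta\right),
\qquad z_i \sim \mathrm{Gamma}\!\left(\tfrac{1}{\theta}, \tfrac{1}{\theta}\right)
```

Setting z_i = 1 for every group recovers the classical Cox model, which is the baseline the paper compares against.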

Keywords: life insurance-pension plan, survival analysis, risk factors, cox proportional hazards model, multivariate failure-time data, shared frailty, simulations study

Procedia PDF Downloads 337
40813 Comparison of Power Generation Status of Photovoltaic Systems under Different Weather Conditions

Authors: Zhaojun Wang, Zongdi Sun, Qinqin Cui, Xingwan Ren

Abstract:

Based on multivariate statistical analysis theory, this paper uses principal component analysis, Mahalanobis distance analysis, and curve fitting to establish a photovoltaic health model for evaluating the health of photovoltaic panels. First, according to weather conditions, the photovoltaic panel variable data are classified into five categories: sunny, cloudy, rainy, foggy, and overcast. The health of photovoltaic panels in these five types of weather is studied. Second, scatterplots of the relationship between the amount of electricity generated under each kind of weather and the other variables were plotted. It was found that the amount of electricity generated by photovoltaic panels has a significant nonlinear relationship with time. The fitting method was used to fit the relationship between the amount of electricity generated and time, and a nonlinear equation was obtained. Then, principal component analysis was used to analyze the independent variables under the five kinds of weather conditions. According to the Kaiser-Meyer-Olkin test, three types of weather (overcast, foggy, and sunny) meet the conditions for factor analysis, while cloudy and rainy weather do not. Through principal component analysis, the main components of overcast weather are temperature, AQI, and PM2.5; the main component of foggy weather is temperature; and the main components of sunny weather are temperature, AQI, and PM2.5. Cloudy and rainy weather require analysis of all of their variables, namely temperature, AQI, PM2.5, solar radiation intensity, and time. Finally, taking the variable values in sunny weather as observed values and the main components of cloudy, foggy, overcast, and rainy weather as sample data, the Mahalanobis distances between the observed values and these sample values are obtained.
A comparative analysis of the degree of deviation of the Mahalanobis distance was carried out to determine the health of the photovoltaic panels under different weather conditions. Ordered from the smallest to the largest fluctuation in Mahalanobis distance, the weather conditions were foggy, cloudy, overcast, and rainy.
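The Mahalanobis distance used above measures how far an observation lies from a sample, scaled by the sample covariance. The sketch below is a minimal two-variable version, not the authors' SPSS or MATLAB workflow; the data values and the choice of two variables are hypothetical.

```python
import math


def mean_and_covariance(samples):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis(point, mean, cov):
    """sqrt((p - mean)^T cov^{-1} (p - mean)) using the closed-form 2x2 inverse."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (_, d) = cov
    det = a * d - b * b  # must be non-zero for the inverse to exist
    quadratic_form = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(quadratic_form)


# Hypothetical reference sample of two variables, e.g. (temperature, AQI) after scaling.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = mean_and_covariance(reference)
distance = mahalanobis((3.0, 1.0), mean, cov)
print(distance)
```

With more than two variables the same formula applies, but the covariance matrix has to be inverted numerically rather than in closed form.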

Keywords: fitting, principal component analysis, Mahalanobis distance, SPSS, MATLAB

Procedia PDF Downloads 120
40812 Copper Price Prediction Model for Various Economic Situations

Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin

Abstract:

Copper is an essential raw material used in the construction industry. During 2021 and the first half of 2022, the global market suffered significant fluctuations in copper raw material prices in the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war, which exposed consumers to unexpected financial risk. Therefore, this paper aims to develop two ANN-LSTM price prediction models, using Python, that can forecast the average monthly copper prices traded on the London Metal Exchange; the first is a multivariate model that forecasts the copper price of the next month, and the second is a univariate model that predicts the copper prices of the upcoming three months. Historical data of average monthly London Metal Exchange copper prices are collected from January 2009 to July 2022, and potential external factors are identified and employed in the multivariate model. These factors lie under three main categories: energy prices and economic indicators of the three major exporting countries of copper, depending on the data availability. Before developing the LSTM models, the collected external parameters are analyzed with respect to the copper prices using correlation and multicollinearity tests in R software; then, the parameters are further screened to select those that influence the copper prices. Then, the two LSTM models are developed, and the dataset is divided into training, validation, and testing sets. The results show that the 3-month prediction model performs better than the 1-month prediction model; still, both models can act as prediction tools for diverse economic situations.
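Before an LSTM can be trained on a monthly price series, the series is usually cut into fixed-length input windows with the following values as targets, and the windows are split in time order. The sketch below shows that preparation step only; the lookback length, split sizes, and placeholder series are assumptions, since the abstract does not state them.

```python
def make_windows(series, lookback, horizon):
    """Slice a series into (input window, target window) pairs for supervised training."""
    inputs, targets = [], []
    for start in range(len(series) - lookback - horizon + 1):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback:start + lookback + horizon])
    return inputs, targets


def chronological_split(items, n_val, n_test):
    """Split in time order so validation and test windows come after the training ones."""
    train_end = len(items) - n_val - n_test
    val_end = len(items) - n_test
    return items[:train_end], items[train_end:val_end], items[val_end:]


# Placeholder series standing in for monthly average prices.
prices = list(range(10))
one_month_inputs, one_month_targets = make_windows(prices, lookback=3, horizon=1)
three_month_inputs, three_month_targets = make_windows(prices, lookback=3, horizon=3)
train, val, test = chronological_split(one_month_inputs, n_val=1, n_test=2)
print(len(train), len(val), len(test))
```

Splitting chronologically rather than at random keeps future prices out of the training set, which matters for any time series forecast.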

Keywords: copper prices, prediction model, neural network, time series forecasting

Procedia PDF Downloads 88
40811 EWMA and MEWMA Control Charts for Monitoring Mean and Variance in Industrial Processes

Authors: L. A. Toro, N. Prieto, J. J. Vargas

Abstract:

There are many control charts for monitoring mean and variance. Among these, the X-bar and R, X-bar and S, S², Hotelling, and Shewhart control charts, to mention a few, are widely used for monitoring mean and variance in industrial processes. In particular, the Shewhart charts are based only on the information about the process contained in the current observation and ignore any information given by the entire sequence of points. In other words, the Shewhart chart is a control chart without memory. Consequently, Shewhart control charts are less sensitive in detecting small shifts, particularly those smaller than 1.5 standard deviations. Small shifts of this kind are important in many industrial applications. In this study, an effective alternative to the Shewhart control chart was implemented: an Exponentially Weighted Moving Average (EWMA) control chart was developed for univariate processes, and a Multivariate Exponentially Weighted Moving Average (MEWMA) control chart for multivariate processes. Both charts have memory and perform better than the Shewhart chart in detecting small shifts. In these charts, information from past samples is accumulated up to the current sample, and then the decision about process control is taken. This characteristic of EWMA and MEWMA charts is of paramount importance when controlling industrial processes, because it makes it possible to correct or predict problems in the processes before they reach a dangerous limit.
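The memory property described above comes from the EWMA recursion z_t = λ·x_t + (1 − λ)·z_{t−1}, with control limits that widen toward a steady-state value. The sketch below implements the standard univariate chart; the smoothing constant 0.2, the limit width 3, and the simulated observations are assumed values, not those used in the study.

```python
import math


def ewma_chart(observations, target, sigma, lam=0.2, width=3.0):
    """Return (z_t, lower limit, upper limit, out_of_control) for each observation."""
    z = target  # the EWMA statistic starts at the process target
    points = []
    for t, x in enumerate(observations, start=1):
        z = lam * x + (1.0 - lam) * z
        half_width = width * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        points.append((z, target - half_width, target + half_width,
                       abs(z - target) > half_width))
    return points


# Five in-control observations followed by a sustained 1.5-sigma shift.
observations = [10.0] * 5 + [11.5] * 15
chart = ewma_chart(observations, target=10.0, sigma=1.0)
signals = [out for _, _, _, out in chart]
print(signals)
```

Each shifted observation lies well inside Shewhart 3-sigma limits (8.5 to 11.5 would not even be tested at 11.5 against a limit of 13), yet the accumulated EWMA statistic eventually crosses its narrower limits, which is the behaviour the abstract describes.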

Keywords: control charts, multivariate exponentially weighted moving average (MEWMA), exponentially weighted moving average (EWMA), industrial control process

Procedia PDF Downloads 331
40810 Gender Justice and Feminist Self-Management Practices in the Solidarity Economy: A Quantitative Analysis of the Factors that Impact Enterprises Formed by Women in Brazil

Authors: Maria de Nazaré Moraes Soares, Silvia Maria Dias Pedro Rebouças, José Carlos Lázaro

Abstract:

The Solidarity Economy (SE) acts in the re-articulation of the economic field to the other spheres of social action. The significant participation of women in SE resulted in the formation of a national network of self-managed enterprises in Brazil: The Solidarity and Feminist Economy Network (SFEN). The objective of the research is to identify factors of gender justice and feminist self-management practices that adhere to the reality of women in SE enterprises. The conceptual apparatus related to feminist studies in this research covers Nancy Fraser approaches on gender justice, and Patricia Yancey Martin approaches on feminist management practices, and authors of postcolonial feminism such as Mohanty and Maria Lugones, who lead the discussion to peripheral contexts, a necessary perspective when observing the women’s movement in SE. The research has a quantitative nature in the phases of data collection and analysis. The data collection was performed through two data sources: the database mapped in Brazil in 2010-2013 by the National Information System in Solidary Economy and 150 questionnaires with women from 16 enterprises in SFEN, in a state of Brazilian northeast. The data were analyzed using the multivariate statistical technique of Factor Analysis. The results show that the factors that define gender justice and feminist self-management practices in SE are interrelated in several levels, proving statistically the intersectional condition of the issue of women. The evidence from the quantitative analysis allowed us to understand the dimensions of gender justice and feminist management practices intersectionality; in this sense, the non-distribution of domestic work interferes in non-representation of women in public spaces, especially in peripheral contexts. 
The study contributes with important reflections to the studies of this area and can be complemented in the future with a qualitative research that approaches the perspective of women in the context of the SE self-management paradigm.

Keywords: feminist management practices, gender justice, self-management, solidarity economy

Procedia PDF Downloads 105
40809 Prognostic Impact of Pre-transplant Ferritinemia: A Survival Analysis Among Allograft Patients

Authors: Mekni Sabrine, Nouira Mariem

Abstract:

Background and aim: Allogeneic hematopoietic stem cell transplantation is a curative treatment for several hematological diseases; however, it carries non-negligible morbidity and mortality depending on several prognostic factors, including pre-transplant hyperferritinemia. The aim of our study was to estimate the impact of hyperferritinemia on survival and on the occurrence of post-transplant complications. Methods: This was a longitudinal study conducted over 8 years and including all patients who had a first allograft. The impact of pre-transplant hyperferritinemia (ferritinemia ≥1500) on survival was studied using the Kaplan-Meier method and the Cox model for univariate and multivariate analysis. The chi-square test and binary logistic regression were used to study the association between pre-transplant ferritinemia and post-transplant complications. Results: One hundred forty patients were included, with an average age of 26.6 years and a sex ratio (M/F) of 1.4. Hyperferritinemia was found in 33% of patients. It had no significant impact on either overall survival (p=0.9) or event-free survival (p=0.6). In multivariate analysis, only the type of disease was independently associated with overall survival (p=0.04) and event-free survival (p=0.002). For post-allograft complications, the occurrence of early documented infections was independently associated with pre-transplant hyperferritinemia (p=0.02) and the presence of acute graft-versus-host disease (GVHD) (p<10⁻³). The occurrence of acute GVHD was associated with early documented infection (p=0.002) and cytomegalovirus reactivation (p<10⁻³). The occurrence of chronic GVHD was associated with the presence of cytomegalovirus reactivation (p=0.006) and graft source (p=0.009). Conclusion: Our study showed a significant impact of pre-transplant hyperferritinemia on the occurrence of early infections but not on survival.
Early and more accurate assessment of iron overload by other tests, such as liver magnetic resonance imaging, with initiation of chelating treatment could prevent the occurrence of such complications after transplantation.
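The Kaplan-Meier method mentioned in the abstract estimates survival as a product over event times of (1 − deaths / number at risk), with censored patients leaving the risk set without counting as events. The sketch below is a minimal implementation on made-up follow-up data; it is not the study's data or software.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations sharing the same time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up times (in years) and event indicators.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
print(curve)
```

Comparing two such curves, for example with and without hyperferritinemia, is what a log-rank test or a Cox model then formalizes.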

Keywords: allogeneic, transplants, ferritin, survival

Procedia PDF Downloads 50
40808 Analysis Customer Loyalty Characteristic and Segmentation Analysis in Mobile Phone Category in Indonesia

Authors: A. B. Robert, Adam Pramadia, Calvin Andika

Abstract:

The main purpose of this study is to explore the characteristics of consumer loyalty in the mobile phone category in Indonesia. Second, this research attempts to identify consumer segments and to explore the profile of each segment as a basis for marketing strategy formulation. The study uses multivariate analysis tools, namely discriminant analysis and cluster analysis. Discriminant analysis is used to discriminate between loyal and non-loyal consumers using particular variables. Cluster analysis is used to reveal the various segments in the mobile phone category. In addition, to gain a better understanding of the customers in each segment, the study applies descriptive analysis and cross-tabulation analysis to each segment defined by the cluster analysis. The study expects several findings. First, consumers can be divided into two large groups, loyal versus non-loyal, by a set of variables. Second, the study identifies customer segments in the mobile phone category. Third, it explores the customer profile of each identified segment. This study answers a call for additional empirical research into different product categories; therefore, replication research is advisable. By identifying customer loyalty characteristics and analyzing in depth the consumption behavior and profile of each segment, this study is useful for developing high-impact marketing strategies. It contributes to the body of knowledge by adding an empirical study of consumer loyalty and segmentation analysis in the mobile phone category based on multiple-brand analysis.

Keywords: customer loyalty, segmentation, marketing strategy, discriminant analysis, cluster analysis, mobile phone

Procedia PDF Downloads 575
40807 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or individual patient data (IPD). In practice, studies may be available at both the IPD and AD levels. In this situation, both the IPD and AD should be utilised in order to maximise the available information. The statistical advantages of combining studies from different levels have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analyses were used to assess the influence of the level of data on overall meta-analysis estimates based on IPD only, AD only, and the combination of IPD and AD (mixed data, MD), under different study scenarios. The percentage relative bias (PRB), root mean square error (RMSE), and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary-level data, as they significantly increase the accuracy of the estimates. On the other hand, if more than 80% of the available data are at the IPD level, including the AD does not make a significant difference to the accuracy of the estimates. Additionally, combining the IPD and AD has a moderating effect on the bias of the treatment effect estimates, as the IPD tends to overestimate the treatment effects, while the AD tends to underestimate them. These results may provide some guidance in deciding whether a significant benefit is gained by pooling the two levels of data when conducting a meta-analysis.
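The three performance measures named in the abstract have simple definitions over repeated simulations. The sketch below uses the common textbook forms, which may differ in detail from the authors' definitions; the estimates and intervals are made-up values for illustration.

```python
import math


def percentage_relative_bias(estimates, true_value):
    """PRB: 100 * (mean estimate - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value


def root_mean_square_error(estimates, true_value):
    """RMSE: square root of the mean squared deviation from the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


def coverage_probability(intervals, true_value):
    """Share of confidence intervals (lower, upper) that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Hypothetical pooled treatment-effect estimates from four simulated meta-analyses.
true_effect = 2.0
estimates = [1.8, 2.2, 2.0, 2.4]
intervals = [(1.5, 2.5), (2.1, 2.9), (1.0, 2.0), (1.9, 2.1)]
prb = percentage_relative_bias(estimates, true_effect)
rmse = root_mean_square_error(estimates, true_effect)
coverage = coverage_probability(intervals, true_effect)
print(prb, rmse, coverage)
```

A positive PRB indicates overestimation and a negative one underestimation, which is how the opposite tendencies of IPD and AD reported above would show up.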

Keywords: aggregate data, combined-level data, individual patient data, meta-analysis

Procedia PDF Downloads 354
40806 Private and Public Health Sector Difference on Client Satisfaction: Results from Secondary Data Analysis in Sindh, Pakistan

Authors: Wajiha Javed, Arsalan Jabbar, Nelofer Mehboob, Muhammad Tafseer, Zahid Memon

Abstract:

Introduction: Researchers globally have strived to explore diverse factors that augment the continuation and uptake of family planning methods. Clients' satisfaction is one of the core determinants facilitating continuation of family planning methods. There is a major debate, yet scanty evidence, contrasting the public and private sectors with respect to client satisfaction. The objective of this study is to compare the quality of care provided by the public and private sectors of Pakistan through a client satisfaction lens. Methods: We used the Pakistan Demographic and Health Survey 2012-13 dataset (Sindh province) on a total of 3133 Married Women of Reproductive Age (MWRA) aged 15-49 years. Source of family planning (public/private sector) was the main exposure variable. The outcome variable was client satisfaction, judged on ten different dimensions. Means and standard deviations were calculated for continuous variables, while frequencies and percentages were computed for categorical variables. For univariate analysis, the chi-square or Fisher exact test was used to find an association between clients' satisfaction in the public and private sectors. Ten different multivariate models were built. Variables were checked for multicollinearity, confounding, and interaction, and then logistic regression was used to explore the relationship between the source of family planning and client satisfaction after adjusting for all known confounding factors; results are presented as OR and AOR (95% CI). Results: Multivariate analyses showed that clients were less satisfied with contraceptive provision from the private sector than from the public sector (AOR 0.92, 95% CI 0.63-1.68), although the result was not statistically significant.
Clients were more satisfied with the private sector than with the public sector with respect to the other determinants of quality of care: follow-up care (AOR 3.29, 95% CI 1.95-5.55), infection prevention (AOR 2.41, 95% CI 1.60-3.62), counseling services (AOR 2.01, 95% CI 1.27-3.18), timely treatment (AOR 3.37, 95% CI 2.20-5.15), attitude of staff (AOR 2.23, 95% CI 1.50-3.33), punctuality of staff (AOR 2.28, 95% CI 1.92-4.13), timely referring (AOR 2.34, 95% CI 1.63-3.35), staff cooperation (AOR 1.75, 95% CI 1.22-2.51), and complications handling (AOR 2.27, 95% CI 1.56-3.29).
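To help read the figures above, the sketch below computes a crude (unadjusted) odds ratio and its 95% confidence interval from a 2x2 table using the standard log-odds (Woolf) method. The adjusted odds ratios in the abstract come from logistic regression with confounders, which this sketch does not reproduce, and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and confidence interval for the 2x2 table

                      satisfied   not satisfied
        private           a            b
        public            c            d
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the survey.
crude_or, ci_lower, ci_upper = odds_ratio_ci(a=30, b=10, c=20, d=20)
print(crude_or, ci_lower, ci_upper)
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level, whereas the contraceptive provision result above (CI 0.63-1.68) includes 1.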

Keywords: client satisfaction, family planning, public private partnership, quality of care

Procedia PDF Downloads 396
40805 Association of Musculoskeletal and Radiological Features with Clinical and Serological Findings in Systemic Sclerosis: A Single-Centre Registry Study

Authors: Rezvan Hosseinian

Abstract:

Aim: Systemic sclerosis (SSc) is a chronic connective tissue disease with the clinical hallmark of skin thickening and tethering. The correlation of musculoskeletal features with other parameters should be considered in SSc patients. Methods: We reviewed the records of all patients who had more than one visit and standard anteroposterior radiography of hand. We used univariate analysis, and factors with p<0.05 were included in logistic regression to find out dependent factors. Results: Overall, 180 SSc patients were enrolled in our study, 161 (89.4%) of whom were women. The median age (IQR) was 47.0 years (16), and 52% had a diffuse subtype of the disease. In multivariate analysis, tendon friction rubs (TFRs) were associated with the presence of calcinosis, muscle tenderness, and flexion contracture (FC) on physical examination (p<0.05). Arthritis showed no differences in the two subtypes of the disease (p=0.98), and in multivariate analysis, there were no correlations between radiographic arthritis and serological and clinical features. The radiographic results indicated that disease duration correlated with joint erosion, acro-osteolysis, resorption of the distal ulna, calcinosis and radiologic FC (p< 0.05). Acro-osteolysis was more frequent in the dcSSc subtype, TFRs, and anti-TOPO I antibody. Radiologic FC showed an association with skin score, calcinosis and haematocrit <30% (p<0.05). Joint flexion on radiography was associated with disease duration, modified Rodnan skin score, calcinosis, and low hematocrit (P<0.01). Conclusion: Disease duration was a main dependent factor for developing joint erosion, acro-osteolysis, bone resorption, calcinosis, and flexion contracture on hand radiography. Acro-osteolysis presented in the severe form of the disease. Acro-osteolysis was the only dependent variable associated with bone demineralization.

Keywords: disease subsets, hand radiography, joint erosion, sclerosis

Procedia PDF Downloads 63
40804 Association of Musculoskeletal and Radiological Features with Clinical and Serological Findings in Systemic Sclerosis: A Single-Centre Registry Study

Authors: Nasrin Azarbani

Abstract:

Aim: Systemic sclerosis (SSc) is a chronic connective tissue disease with the clinical hallmark of skin thickening and tethering. The correlation of musculoskeletal features with other parameters should be considered in SSc patients. Methods: We reviewed the records of all patients who had more than one visit and standard anteroposterior radiography of the hand. We used univariate analysis, and factors with p<0.05 were included in logistic regression to find dependent factors. Results: Overall, 180 SSc patients were enrolled in our study, 161 (89.4%) of whom were women. The median age (IQR) was 47.0 years (16), and 52% had the diffuse subtype of the disease. In multivariate analysis, tendon friction rubs (TFRs) were associated with the presence of calcinosis, muscle tenderness, and flexion contracture (FC) on physical examination (p<0.05). Arthritis showed no differences between the two subtypes of the disease (p=0.98), and in multivariate analysis, there were no correlations between radiographic arthritis and serological and clinical features. The radiographic results indicated that disease duration correlated with joint erosion, acro-osteolysis, resorption of the distal ulna, calcinosis, and radiologic FC (p<0.05). Acro-osteolysis was more frequent in the dcSSc subtype, TFRs, and anti-TOPO I antibody. Radiologic FC showed an association with skin score, calcinosis, and haematocrit <30% (p<0.05). Joint flexion on radiography was associated with disease duration, modified Rodnan skin score, calcinosis, and low haematocrit (p<0.01). Conclusion: Disease duration was a main dependent factor for developing joint erosion, acro-osteolysis, bone resorption, calcinosis, and flexion contracture on hand radiography. Acro-osteolysis presented in the severe form of the disease. Acro-osteolysis was the only dependent variable associated with bone demineralization.

Keywords: sclerosis, disease subsets, joint erosion, musculoskeletal

Procedia PDF Downloads 46
40803 Multidimensional Item Response Theory Models for Practical Application in Large Tests Designed to Measure Multiple Constructs

Authors: Maria Fernanda Ordoñez Martinez, Alvaro Mauricio Montenegro

Abstract:

This work presents a statistical methodology for measuring and finding constructs in Latent Semantic Analysis. The approach uses the properties of Factor Analysis for binary data together with interpretations from Item Response Theory. More precisely, we propose first reducing dimensionality through Principal Component Analysis of the linguistic data and then producing axes of groups obtained from a clustering analysis of the semantic data. This approach allows the user to give meaning to the preceding clusters and to find the real latent structure present in the data. The methodology is applied to a set of real semantic data, with impressive results in terms of coherence, speed, and precision.

Keywords: semantic analysis, factorial analysis, dimension reduction, penalized logistic regression

Procedia PDF Downloads 421
40802 Predicting Returns Volatilities and Correlations of Stock Indices Using Multivariate Conditional Autoregressive Range and Return Models

Authors: Shay Kee Tan, Kok Haur Ng, Jennifer So-Kuen Chan

Abstract:

This paper extends the conditional autoregressive range (CARR) model to multivariate CARR (MCARR) model and further to the two-stage MCARR-return model to model and forecast volatilities, correlations and returns of multiple financial assets. The first stage model fits the scaled realised Parkinson volatility measures using individual series and their pairwise sums of indices to the MCARR model to obtain in-sample estimates and forecasts of volatilities for these individual and pairwise sum series. Then covariances are calculated to construct the fitted variance-covariance matrix of returns which are imputed into the stage-two return model to capture the heteroskedasticity of assets’ returns. We investigate different choices of mean functions to describe the volatility dynamics. Empirical applications are based on the Standard and Poor 500, Dow Jones Industrial Average and Dow Jones United States Financial Service Indices. Results show that the stage-one MCARR models using asymmetric mean functions give better in-sample model fits than those based on symmetric mean functions. They also provide better out-of-sample volatility forecasts than those using CARR models based on two robust loss functions with the scaled realised open-to-close volatility measure as the proxy for the unobserved true volatility. We also find that the stage-two return models with constant means and multivariate Student-t errors give better in-sample fits than the Baba, Engle, Kraft, and Kroner type of generalized autoregressive conditional heteroskedasticity (BEKK-GARCH) models. The estimates and forecasts of value-at-risk (VaR) and conditional VaR based on the best MCARR-return models for each asset are provided and tested using Kupiec test to confirm the accuracy of the VaR forecasts.
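Two building blocks referred to above can be written down compactly: the Parkinson range-based variance estimate, (ln(H/L))² / (4 ln 2), and the basic univariate CARR(1,1) recursion for the conditional mean of the range. The sketch below shows only these standard forms; the paper's scaling of the realised measure, its asymmetric mean functions, and the multivariate extension are not reproduced, and all parameter values are hypothetical.

```python
import math


def parkinson_variance(high, low):
    """Parkinson range-based variance estimate for one period."""
    return math.log(high / low) ** 2 / (4.0 * math.log(2.0))


def carr_conditional_means(ranges, omega, alpha, beta):
    """CARR(1,1): lambda_t = omega + alpha * R_{t-1} + beta * lambda_{t-1}.

    Requires alpha + beta < 1; the recursion starts at the unconditional mean.
    """
    lam = omega / (1.0 - alpha - beta)
    path = [lam]
    for observed_range in ranges[:-1]:
        lam = omega + alpha * observed_range + beta * lam
        path.append(lam)
    return path


# Hypothetical daily high/low prices and CARR parameters.
daily_variance = parkinson_variance(high=110.0, low=100.0)
conditional_means = carr_conditional_means([1.0] * 5, omega=0.1, alpha=0.2, beta=0.7)
print(daily_variance, conditional_means)
```

The recursion has the same shape as a GARCH(1,1) variance equation, with the observed range taking the place of the squared return.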

Keywords: range-based volatility, correlation, multivariate CARR-return model, value-at-risk, conditional value-at-risk

Procedia PDF Downloads 80
40801 Collision Theory Based Sentiment Detection Using Discourse Analysis in Hadoop

Authors: Anuta Mukherjee, Saswati Mukherjee

Abstract:

Data is growing every day. Social networking sites such as Twitter are becoming an integral part of our daily lives and contribute substantially to this growth. Twitter is a rich source for sentiment detection or mining, since people often express honest opinions through tweets. However, although sentiment analysis of text is a well-researched topic, applying it to Twitter data poses additional challenges, since tweets are unstructured, contain abbreviations, and do not follow strict grammatical rules. We have employed collision theory to perform sentiment analysis on Twitter data. We have also incorporated discourse analysis into the collision theory based model to detect sentiment in tweets accurately. In addition, we have used the retweet field to assign weights to certain tweets and obtained the overall weightage of a topic provided in the form of a query. Hadoop has been exploited for speed. Our experiments show effective results.
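
The retweet weighting mentioned above might look roughly like the following. This is a sketch under loud assumptions: the abstract does not give the weighting formula or the collision theory model, so the rule "weight = 1 + retweet count", the function name and the sample scores are all hypothetical, and each tweet is assumed to carry a sentiment score in [-1, 1] produced elsewhere.

```python
def topic_sentiment(tweets):
    """Retweet-weighted mean sentiment for the tweets that match a query.

    Each tweet is a (sentiment_score, retweet_count) pair.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for score, retweets in tweets:
        # Assumed weighting: every retweet counts as one extra vote.
        weight = 1.0 + retweets
        weighted_sum += weight * score
        total_weight += weight
    # An empty result set is treated as neutral.
    return weighted_sum / total_weight if total_weight else 0.0


# Hypothetical tweets returned for one query.
sample = [(1.0, 3), (-1.0, 0), (0.5, 1)]
overall = topic_sentiment(sample)
```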

Keywords: sentiment analysis, twitter, collision theory, discourse analysis

Procedia PDF Downloads 508
40800 Spatial Interpolation Technique for the Optimisation of Geometric Programming Problems

Authors: Debjani Chakraborty, Abhijit Chatterjee, Aishwaryaprajna

Abstract:

Posynomials, a special type of polynomial with singularities, pose difficulties when solving geometric programming problems. In this paper, a methodology is proposed and used to obtain extreme values for geometric programming problems using an nth-degree polynomial interpolation technique. The main idea for optimising the posynomial is to fit a best polynomial that has continuous gradient values throughout the range of the function. The approximating polynomial is smoothed to remove the discontinuities present in the feasible region and the objective function. This spatial interpolation method is capable of optimising univariate and multivariate geometric programming problems. An example is solved to demonstrate the robustness of the methodology by considering a bivariate nonlinear geometric programming problem. The method is also applicable to signomial programming problems.
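
To make the idea above concrete, the following sketch approximates a univariate posynomial by an interpolating polynomial and then minimises the polynomial. It is an illustration under stated assumptions, not the authors' method: the test function f(x) = x + 1/x, the interval, the degree, the use of Lagrange interpolation on equally spaced nodes, and the grid search are all choices made here for demonstration.

```python
def posynomial(x):
    """Hypothetical test posynomial f(x) = x + 1/x, minimised at x = 1."""
    return x + 1.0 / x


def lagrange_interpolant(nodes, values):
    """Return the polynomial of degree len(nodes) - 1 through the points."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(nodes, values)):
            term = yi
            for j, xj in enumerate(nodes):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p


def minimise_on_grid(func, lo, hi, steps=3000):
    """Crude minimiser: evaluate func on a uniform grid and keep the best."""
    best_x, best_val = lo, func(lo)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        val = func(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Interpolate on [0.5, 2.0], away from the singularity at x = 0.
lo, hi, degree = 0.5, 2.0, 4
nodes = [lo + (hi - lo) * k / degree for k in range(degree + 1)]
poly = lagrange_interpolant(nodes, [posynomial(x) for x in nodes])
x_star, f_star = minimise_on_grid(poly, lo, hi)
```

The polynomial surrogate is smooth on the whole interval, so any standard optimiser could replace the grid search; the approximate minimiser should lie close to the true one at x = 1.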

Keywords: geometric programming problem, multivariate optimisation technique, posynomial, spatial interpolation

Procedia PDF Downloads 340
40799 Sexual Behaviours among Iranian Men and Women Aged 15 to 49 Years in Metropolitan Tehran, Iran: A Cross-Sectional Study

Authors: Mahnaz Motamedi, Mohammad Shahbazi, Shahrzad Rahimi-Naghani, Mehrdad Salehi

Abstract:

Introduction and Aim: This study assessed sexual behaviours among men and women aged 15 to 49 years in Tehran. Material and Methods: This was a cross-sectional study conducted on 755 men and women aged 15 to 49 years who were residents of Tehran. Participants were selected by multistage cluster random sampling covering different regions of Tehran. The data were collected using the WHO-endorsed Questionnaire of Sexual and Reproductive Health. Descriptive, bivariate, and multivariate analyses were conducted using SPSS version 20. Sexual and reproductive health (SRH) behaviour was a scale variable constructed from the items of six sections: sexual experiences, characteristics of the first sexual partner, characteristics of the first intercourse, next sexual contact and the consequences of the first sexual contact, homosexual experiences, and the causes of sexual abstinence. Results: The mean age at the time of penetrative sexual intercourse (vaginal, anal) was 19.88 in men and 21.82 in women. Multivariate analysis using linear regression showed that, after controlling for other variables, gender had a significant relationship with having sexual experience, mean age at first sexual intercourse, and having multiple partners. The score for having sexual experience was 0.158 units lower in women than in men. The mean age at first intercourse was 1.57 units higher in women than in men, and the score for having multiple partners was 0.247 units lower in women than in men (P < 0.001). Sexual experience in very religious and relatively religious individuals was 0.332 and 0.218 units lower, respectively, than in those for whom religion did not matter (P < 0.001). Among those who had no sexual experience at the time of the study, 25.6% of men and 40.7% of women stated that their reason for abstinence was unwillingness to have sex (P < 0.05), while 35.9% of men and 16.5% of women stated that the reason was the lack of a suitable opportunity (P < 0.001). Sexual attraction to the same sex was reported by 4.7% of men and 1.7% of women, and the difference between men and women was significant (P < 0.001). Conclusion: Sexual relations also occur among single and younger people and are not limited to those who are married or about to marry. Therefore, further evaluation is needed in national research, and interventions for sexual and reproductive health services should be made at the macro level of policy making.

Keywords: sexual behaviours, Iranian men and women, Iran, cross-sectional study

Procedia PDF Downloads 141
40798 Healthy Lifestyle and Risky Behaviors amongst Students of Physical Education High Schools

Authors: Amin Amani, Masomeh Reihany Shirvan, Mahla Nabizadeh Mashizi, Mohadese Khoshtinat, Mohammad Elyas Ansarinia

Abstract:

The purpose of this study is to examine the relationship between a healthy lifestyle and risky behavior among physical education students in Bojnourd schools. The study sample consisted of teenagers in the second and third grades of Bojnourd's high schools. Using stratified sampling, 604 second-grade students and 600 third-grade students from physical education schools in Bojnourd were tested. For sample selection, the population was divided into four areas: North, East, West and South. The sample size for each stratum was then determined according to the number of students in each area. Two questionnaires were used to collect data, covering three parts: demographic data, the Iranian teenagers' risk-taking scale (IARS), and prevention methods with emphasis on the importance of the family's role. Measures of central tendency and dispersion, such as the standard deviation, together with multiple analysis of variance and multivariate regression analysis, were used. Results showed that the observed F is significant (P ≤ 0.01) and that 21% of the variance in risky behavior is explained by lack of awareness. Given the significance of the regression, the coefficients of the prediction equation showed that each of the teenagers' risky behaviors can affect a healthy lifestyle.

Keywords: healthy lifestyle, high-risk behavior, students, physical education

Procedia PDF Downloads 170
40797 Analysis of Expression Data Using Unsupervised Techniques

Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

This study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer helps to improve the efficacy and reduce the toxicity of treatments by providing clues for finding targeted therapeutics. The process of gene expression data analysis is described in three steps: preprocessing, clustering, and cluster validation. Feature selection is important since genomic data are high dimensional, with a large number of features compared to samples. Hierarchical clustering and K-Means are often used in the analysis of gene expression data. Several cluster validation techniques are used to validate the clusters. Heatmaps are an effective external validation method that allows the identified classes to be compared with clinical variables and analyzed visually.
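
The clustering and cluster validation steps described above can be sketched as follows. This is a toy illustration rather than the pipeline reviewed in the paper: the two-gene expression profiles and the initial centroids are hypothetical, K-Means is implemented as plain Lloyd iterations, and the mean silhouette width is used as one possible internal validation index (it assumes at least two clusters).

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every sample to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def mean_silhouette(points, labels):
    """Average silhouette width; values near 1 indicate compact, separated clusters."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical expression levels of two genes in six tumor samples.
samples = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]
labels, centers = kmeans(samples, [samples[0], samples[3]])
score = mean_silhouette(samples, labels)
```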

Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation

Procedia PDF Downloads 128
40796 Variability of Metal Composition and Concentrations in Road Dust in the Urban Environment

Authors: Sandya Mummullage, Prasanna Egodawatta, Ashantha Goonetilleke, Godwin A. Ayoko

Abstract:

Urban road dust comprises a range of potentially toxic metal elements and plays a critical role in degrading the quality of urban receiving waters. Hence, assessing the metal composition and concentration in urban road dust is a high priority. This study investigated the variability of metal composition and concentrations in road dust in four different urban land uses in Gold Coast, Australia. Samples from 16 road sites were collected and tested for 12 selected metal species. The data set was analyzed using both univariate and multivariate techniques. The data analysis revealed that the metal concentrations in road dust differ considerably within and between different land uses. Iron, aluminum, magnesium and zinc are the most abundant metals in urban land uses. It was also noted that metal species such as titanium, nickel, copper, and zinc have the highest concentrations in industrial land use. The study outcomes identified soil and traffic-related sources as the key sources of metals deposited on road surfaces.

Keywords: metals build-up, pollutant accumulation, stormwater quality, urban road dust

Procedia PDF Downloads 271
40795 The Effectiveness of Metaphor Therapy on Depression among Female Students

Authors: Marzieh Talebzadeh Shoushtari

Abstract:

The present study aimed to determine the effectiveness of metaphor therapy on depression among female students. The sample included 60 female students with depression symptoms, selected by simple sampling and randomly divided into two equal groups (experimental and control). The Beck Depression Inventory was used to measure the variables. This was an experimental study with a pre-test/post-test design and a control group. Eight metaphor therapy sessions were held for the experimental group. A post-test was administered to both groups. Data were analyzed using multivariate analysis of covariance (MANCOVA). Results showed that metaphor therapy decreased depression in the experimental group compared to the control group.

Keywords: metaphor therapy, depression, female, students

Procedia PDF Downloads 436
40794 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before making a decision about money. The aim of this work is to analyze the factors that influence the final price of houses using data mining algorithms. Previous work was consulted only to compare results. Before the data set was used, a Z-transformation was applied to standardize the data to a common scale. The data were then classified into two groups so that they could be visualized in a readable format. A decision tree was built, and the results are displayed graphically so that the outcomes and the influence of each factor are easy to see. The definitions of these methods are described, as well as the results. Finally, conclusions and recommendations based on our results are presented, making it easier to apply these algorithms to a customized data set.
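
The Z-transformation mentioned above rescales each feature to zero mean and unit standard deviation, z = (x - mean) / std. The sketch below shows the idea on two made-up features; the feature names and values are hypothetical, and the population standard deviation is used as an assumption, since the abstract does not say which variant the authors applied.

```python
import statistics


def z_transform(values):
    """Standardize a feature to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales.
areas_m2 = [50.0, 100.0, 150.0]
prices_usd = [100000.0, 200000.0, 300000.0]

z_areas = z_transform(areas_m2)
z_prices = z_transform(prices_usd)
```

After the transformation both features are expressed in standard deviations from their own mean, so neither dominates a distance-based or threshold-based algorithm merely because of its units.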

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 352
40793 Comprehensive Profiling and Characterization of Untargeted Extracellular Metabolites in Fermentation Processes: Insights and Advances in Analysis and Identification

Authors: Marianna Ciaccia, Gennaro Agrimi, Isabella Pisano, Maurizio Bettiga, Silvia Rapacioli, Giulia Mensa, Monica Marzagalli

Abstract:

Objective: Untargeted metabolomic analysis of extracellular metabolites is a powerful approach that focuses on comprehensively profiling the metabolites present in the extracellular space. In this study, we applied extracellular metabolomic analysis to investigate the metabolism of two probiotic microorganisms with health benefits that extend far beyond the digestive tract and the immune system. Methods: The analytical techniques employed in extracellular metabolomic analysis encompass various technologies, including mass spectrometry (MS), which enables the identification of metabolites present in the fermentation media as well as the comparison of metabolic profiles under different experimental conditions. Multivariate statistical techniques such as principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) play a crucial role in uncovering metabolic signatures and understanding the dynamics of metabolic networks. Results: Different types of supernatants from fermentation processes, such as dairy-free and non-dairy-free media, cell-free media and pasteurized media, were subjected to metabolite profiling. The supernatants contained a complex mixture of metabolites, including substrates, intermediates, and end-products. This profiling provided insights into the metabolic activity of the microorganisms. The integration of advanced software tools facilitated the identification and characterization of metabolites in different fermentation conditions and microorganism strains. Conclusions: Untargeted extracellular metabolomic analysis, combined with software tools, allowed the study of the metabolites consumed and produced during the fermentation processes of probiotic microorganisms. Ongoing advancements in data analysis methods will further enhance the application of extracellular metabolomic analysis in fermentation research, leading to improved bioproduction and the advancement of sustainable manufacturing processes.

Keywords: biotechnology, metabolomics, lactic bacteria, probiotics, postbiotics

Procedia PDF Downloads 46
40792 A Modular Framework for Enabling Analysis for Educators with Different Levels of Data Mining Skills

Authors: Kyle De Freitas, Margaret Bernard

Abstract:

Enabling data mining analysis among a wider audience of educators is an active area of research within the educational data mining (EDM) community. The paper proposes a framework for developing an environment that caters for educators who have few technical data mining skills as well as for more advanced users with some data mining expertise. The framework architecture was developed through a review of the strengths and weaknesses of existing models in the literature. The proposed framework provides a modular architecture that lets future researchers focus on the development of specific areas within the EDM process. Finally, the paper highlights a strategy for enabling analysis through either predefined questions or a guided data mining process, and shows how the developed questions and the analysis conducted can be reused and extended over time.

Keywords: educational data mining, learning management system, learning analytics, EDM framework

Procedia PDF Downloads 304
40791 Quantile Coherence Analysis: Application to Precipitation Data

Authors: Yaeji Lim, Hee-Seok Oh

Abstract:

Coherence analysis measures the linear time-invariant relationship between two data sets and has been studied in various fields such as signal processing, engineering, and medical science. However, classical coherence analysis tends to be sensitive to outliers and focuses only on the mean relationship. In this paper, we generalize the cross periodogram to the quantile cross periodogram, which provides a richer picture of the inter-relationship between two data sets. This is a general version of the Laplace cross periodogram. We prove its asymptotic distribution under the long range process and compare it with ordinary coherence through numerical examples. We also present a real data example to confirm the usefulness of quantile coherence analysis.

Keywords: coherence, cross periodogram, spectrum, quantile

Procedia PDF Downloads 368
40790 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria

Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu

Abstract:

The research work is based on the statistical analysis of processing data. The aim is to analyze the data statistically and to generate a design model for the production mix of soap products at the Bejoy manufacturing company, Nkpologwu, Aguata Local Government Area, Anambra State, Nigeria. The statistical analysis examines the correlations in the data. A t-test, partial correlation and bivariate correlation were used to understand what the data portray. The design model developed was used to model the production yield, and the correlation of the variables shows an R² of 98.7%. The results confirm that the data are fit for further analysis and modeling, as shown by the correlation and the R-squared.
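
As a reminder of how the correlation and R-squared figures above are related, the sketch below computes the Pearson correlation coefficient for two variables and squares it, which equals R² for a simple linear regression with one predictor. The data are invented for illustration and are not taken from the study; the variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical raw-material input and production yield for five batches.
raw_material = [1.0, 2.0, 3.0, 4.0, 5.0]
production_yield = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(raw_material, production_yield)
r_squared = r ** 2  # share of yield variance explained by a straight-line fit
```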

Keywords: General Linear Model, correlation, variables, pearson, significance, T-test, soap, production mix and statistic

Procedia PDF Downloads 423
40789 Saving Energy at a Wastewater Treatment Plant through Electrical and Production Data Analysis

Authors: Adriano Araujo Carvalho, Arturo Alatrista Corrales

Abstract:

This paper shows how the analysis of electrical energy consumption and production data was used to find opportunities to save energy at the Taboada wastewater treatment plant in Callao, Peru. The data were accessed through independent data networks for the electrical and process instruments and were analyzed under an ISO 50001 energy audit, which considered Energy Performance Indexes for each process and a step-by-step guide presented in this text. By applying this methodology and data mining techniques to information gathered through electronic multimeters (conveniently placed on substation switchboards and connected to a cloud network), it was possible to characterize the performance of each process thoroughly and thus reveal saving opportunities that had previously been hidden. The data analysis reduced both costs and energy use, allowing the plant to save significant resources and to be certified under ISO 50001.
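
An Energy Performance Index of the kind mentioned above is commonly expressed as energy used per unit of output. The sketch below assumes a simple definition, kWh consumed per cubic metre of wastewater treated, which the abstract does not confirm; the process names and readings are hypothetical and serve only to show how per-process indexes could be compared.

```python
def energy_performance_index(energy_kwh, volume_m3):
    """Energy used per unit of treated volume (kWh per cubic metre)."""
    if volume_m3 <= 0:
        raise ValueError("treated volume must be positive")
    return energy_kwh / volume_m3


# Hypothetical daily readings: process -> (energy in kWh, treated volume in m3).
readings = {
    "pumping": (1200.0, 40000.0),
    "screening": (300.0, 40000.0),
}

indexes = {
    name: energy_performance_index(energy, volume)
    for name, (energy, volume) in readings.items()
}
# The process with the highest index is the first candidate for savings.
most_intensive = max(indexes, key=indexes.get)
```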

Keywords: energy and production data analysis, energy management, ISO 50001, wastewater treatment plant energy analysis

Procedia PDF Downloads 173