Search results for: multivariate data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 41259

40809 Density-based Denoising of Point Cloud

Authors: Faisal Zaman, Ya Ping Wong, Boon Yian Ng

Abstract:

Point cloud source data for surface reconstruction is usually contaminated with noise and outliers. To overcome this, we present a novel approach that uses a modified kernel density estimation (KDE) technique with bilateral filtering to remove noisy points and outliers. First, we present a method for estimating the optimal bandwidth of multivariate KDE using particle swarm optimization, which ensures robust density estimation. Then we use the mean-shift algorithm to find the local maxima of the density estimate, which give the centroids of the clusters. Next, we compute the distance of a given point from the centroid. Points identified as outliers are then removed by an automatic thresholding scheme, which yields an accurate and economical point surface. The experimental results show that our approach is comparably robust and efficient.
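
The density-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the bandwidth is fixed by hand rather than tuned by particle swarm optimization, no mean-shift or bilateral filtering step is included, and the threshold rule (a fraction of the mean density) and all function names are assumptions made for illustration.

```python
import math

def gaussian_kde_density(points, bandwidth):
    """Evaluate an isotropic Gaussian KDE at every input point."""
    n = len(points)
    dim = len(points[0])
    norm = n * (bandwidth * math.sqrt(2.0 * math.pi)) ** dim
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        densities.append(total / norm)
    return densities

def remove_low_density_points(points, bandwidth, fraction_of_mean=0.5):
    """Keep points whose density is at least a fraction of the mean density.

    The threshold rule is a hypothetical stand-in for the automatic
    thresholding scheme mentioned in the abstract.
    """
    densities = gaussian_kde_density(points, bandwidth)
    threshold = fraction_of_mean * sum(densities) / len(densities)
    return [p for p, d in zip(points, densities) if d >= threshold]

# Four tightly packed points plus one far-away outlier (made-up data).
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
         (0.1, 0.1, 0.0), (5.0, 5.0, 5.0)]
cleaned = remove_low_density_points(cloud, bandwidth=0.2)
```

The isolated point receives density only from itself, so it falls below the threshold while the clustered points reinforce each other.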

Keywords: point preprocessing, outlier removal, surface reconstruction, kernel density estimation

Procedia PDF Downloads 334
40808 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is an emerging technology that collects data from various sensors for use in many fields. Data retrieval is one of the major issues, since the exact data required must be extracted as needed. In this paper, a large data set is processed using feature selection. Feature selection helps to choose the data that are actually needed to process and execute the task. The key value is what helps to point out the exact data available in the storage space. Here, the available data is streamed, and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 329
40807 An Ensemble System of Classifiers for Computer-Aided Volcano Monitoring

Authors: Flavio Cannavo

Abstract:

Continuous evaluation of the status of potentially hazardous volcanoes plays a key role in civil protection. Monitoring volcanic activity, especially energetic paroxysms that usually come with tephra emissions, is crucial not only because of the exposure of the local population but also for airline traffic. Presently, real-time surveillance of most volcanoes worldwide is essentially delegated to one or more human experts in volcanology, who interpret data coming from different kinds of monitoring networks. Unfortunately, the high nonlinearity of the complex and coupled volcanic dynamics leads to a large variety of volcanic behaviors. Moreover, continuously measured parameters (e.g., seismic, deformation, infrasonic and geochemical signals) are often unable to fully explain the ongoing phenomenon, making fast assessment of the volcano state a very puzzling task for the personnel on duty in the control rooms. To aid these personnel in volcano surveillance, we introduce a system based on an ensemble of data-driven classifiers that automatically infers the ongoing volcano status from all the available kinds of measurements. The system consists of a heterogeneous set of independent classifiers, each built with its own data and algorithm. Each classifier gives an output about the volcanic status. The ensemble technique weights the individual classifier outputs to combine all the classifications into a single status that maximizes performance. We tested the model on the Mt. Etna (Italy) case study using a long record of multivariate data from 2011 to 2015 and cross-validated it. Results indicate that the proposed model is effective and of great power for decision-making purposes.
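
The weighting step described above can be sketched as a weighted vote. This is only an illustration of the general ensemble idea: the classifier names, status labels, and weights below are hypothetical, and the abstract does not state how the weights are actually chosen.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine per-classifier labels into one label by summing weights."""
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name]
    # The label with the largest accumulated weight wins.
    return max(scores, key=scores.get)

# Hypothetical outputs of three independent classifiers.
predictions = {"seismic": "unrest", "deformation": "quiet", "infrasound": "unrest"}
# Hypothetical weights, e.g. derived from each classifier's validation score.
weights = {"seismic": 0.5, "deformation": 0.3, "infrasound": 0.2}

status = weighted_vote(predictions, weights)
```

A single highly weighted classifier can outvote several weaker ones, which is the practical difference from a plain majority vote.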

Keywords: Bayesian networks, expert system, mount Etna, volcano monitoring

Procedia PDF Downloads 232
40806 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find a better fit to the data. Sometimes, exogenous variables do not show significant direct and indirect effects when the assumptions of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main aim of this article is to investigate the efficacy of the copula-based regression approach over the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.
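
For orientation, the classical (OLS) baseline that the abstract compares against can be sketched for the simplest path model, X -> M -> Y with an additional direct path X -> Y. The copula-based estimator itself is not shown. The variable names and data are made up, and the single-mediator structure is an assumption.

```python
def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def path_effects(x, m, y):
    """Direct, indirect and total effect of x on y with mediator m (OLS)."""
    a = cov(x, m) / cov(x, x)              # slope of M regressed on X
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2             # zero only if X and M are collinear
    direct = (sxy * smm - smy * sxm) / det # coefficient of X in Y ~ X + M
    b = (smy * sxx - sxy * sxm) / det      # coefficient of M in Y ~ X + M
    return {"direct": direct, "indirect": a * b, "total": direct + a * b}

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
m = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [3.0, 2.5, 5.5, 5.0, 8.0]
effects = path_effects(x, m, y)
```

Under OLS the total effect equals the slope of Y regressed on X alone, which gives a quick consistency check on the decomposition.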

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 60
40805 Analysing the Degree of Climate Risk Perception and Response Strategies of Farm Household Typologies in Northern Ghana

Authors: David Ahiamadia, Ramilan Thiagarajah, Peter Tozer

Abstract:

In Sub-Saharan Africa, farm typologies have been used as a practical way to address heterogeneity among farming systems, mostly by grouping farms into subsets with similar characteristics. Due to the complexity of farming systems among farm households, it is not possible to formulate policy recommendations for individual farmers. As a result, this study employs a multivariate statistical approach using Principal Component Analysis (PCA) coupled with cluster analysis to reduce heterogeneity in a 615-household data set from the Africa Rising Baseline Evaluation Survey for 25 farming communities in Northern Ghana. Variables selected for the study were mostly socio-economic, production potential, production intensity, production orientation, crop diversity, food security, resource endowments, and climate risk variables. To avoid making some individuals in the subpopulation worse off when a climate risk intervention is broadly implemented, the findings of the study also account for diversity in climate risk perception among the different farm types identified and their response strategies towards climate risk. The climate risk variables used in this study involve the most severe climate shock types perceived by the household, household response to climate shock type, and reason for crop failure (i.e., maize, rice, and groundnut). Eventually, four farm types, each with an adequate level of homogeneity in climate risk perception and response strategies, were identified. Farm types 1 and 3 were wealthy, with a lower degree of climate risk perception compared to farm types 2 and 4. Also, relatively wealthy farmers used asset liquidation as a climate risk management strategy, whereas poor farmers resorted to engaging in spiritual activities such as prayers, sacrifices, and divine consultations.

Keywords: smallholder, households, climate risk, variables, typologies

Procedia PDF Downloads 76
40804 Sentiment Analysis: An Enhancement of Ontological-Based Features Extraction Techniques and Word Equations

Authors: Mohd Ridzwan Yaakub, Muhammad Iqbal Abu Latiffi

Abstract:

Online business has become popular recently due to the massive amount of information and media available on the Internet. This has resulted in a huge number of reviews in which consumers share their opinions, criticisms, and satisfaction regarding the products they have purchased on websites or social media such as Facebook and Twitter. Analyzing customers’ behavior has therefore become very important for organizations seeking new market trends and insights. The reviews from websites or social media come as structured and unstructured data that require a sentiment analysis approach to analyze. In this article, the techniques used will be defined, together with the definition of an ontology and a description of its possible usage in sentiment analysis. This leads to empirical research related to mobile phones and to the ontology used in the experiment. The researchers also explore the role of data preprocessing and feature selection methodology. As a result, an ontology-based approach to sentiment analysis can help in achieving high accuracy for the classification task.

Keywords: feature selection, ontology, opinion, preprocessing data, sentiment analysis

Procedia PDF Downloads 187
40803 Social Network Analysis as a Research and Pedagogy Tool in Problem-Focused Undergraduate Social Innovation Courses

Authors: Sean McCarthy, Patrice M. Ludwig, Will Watson

Abstract:

This exploratory case study explores the deployment of Social Network Analysis (SNA) in mapping community assets in an interdisciplinary, undergraduate, team-taught course focused on income insecure populations in a rural area in the US. Specifically, it analyzes how students were taught to collect data on community assets and to visualize the connections between those assets using Kumu, an SNA data visualization tool. Further, the case study shows how social network data was also collected about student teams via their written communications in Slack, an enterprise messaging tool, which enabled instructors to manage and guide student research activity throughout the semester. The discussion presents how SNA methods can simultaneously inform both community-based research and social innovation pedagogy through the use of data visualization and collaboration-focused communication technologies.

Keywords: social innovation, social network analysis, pedagogy, problem-based learning, data visualization, information communication technologies

Procedia PDF Downloads 136
40802 Reliable Method for Estimating Rating Curves in the Natural Rivers

Authors: Arash Ahmadi, Amirreza Kavousizadeh, Sanaz Heidarzadeh

Abstract:

The stage-discharge curve is one of the conventional methods for continuous river flow measurement. In this paper, an innovative approach is proposed for predicting the stage-discharge relationship using isovel contours. With the proposed method, it is possible to estimate the stage-discharge curve for the whole section using discharge information from just one arbitrary water level. For this purpose, multivariate relationships are used to determine the mean velocity in a cross-section. The unknown exponents of the proposed relationship were obtained using the second version of the Strength Pareto Evolutionary Algorithm (SPEA2), and the appropriate equation was selected by applying the TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) approach. Results showed close agreement between the estimated and observed data in the different cross-sections.
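
The TOPSIS selection step mentioned above can be sketched as follows. This is a generic textbook-style implementation, not the authors' code: the candidate equations, the two criteria (prediction error and number of parameters), and the weights are hypothetical, and the SPEA2 optimization is not shown.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    benefit[j] is True if larger values of criterion j are better and
    False if smaller values are better. Assumes the rows are not all equal.
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)    # distance to the ideal solution
        d_worst = math.dist(row, anti)    # distance to the anti-ideal solution
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical candidates: (prediction error, number of parameters).
candidates = [[0.10, 3], [0.30, 2], [0.12, 5]]
scores = topsis(candidates, weights=[0.7, 0.3], benefit=[False, False])
best = max(range(len(scores)), key=scores.__getitem__)
```

With error weighted more heavily than parsimony, the low-error, moderately sized first candidate ranks highest in this made-up example.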

Keywords: rating curves, SPEA2, natural rivers, bed roughness distribution

Procedia PDF Downloads 146
40801 Utilization, Barriers and Determinants of Emergency Medical Services in Mekelle City, Tigray, Ethiopia: A Community-Based Cross-Sectional Study

Authors: Goitom Molalign Takele, Tsegalem Hailemariam Ballo, Kiros Belay Gebrekidan, Birhan Gebresilassie Gebregiorgis

Abstract:

Background: Emergency medical services (EMS) provide out-of-hospital emergency medical care to injured or ill people and transport them to definitive care. EMS is an integral part of the emergency medical system and has been associated with decreased morbidity and mortality related to emergency cases. The aim of this study was to assess the utilization, barriers, and determinants of EMS in Mekelle, Ethiopia. Methods: A community-based cross-sectional study was conducted in selected sub-cities of Mekelle. A multistage sampling method was employed to recruit study participants, and data were collected by trained data collectors using an interviewer-administered questionnaire. Multivariate logistic regression analysis was used to examine the statistical association of the determinants of EMS utilization. Results: Half (50.5%) of the respondents had experienced or witnessed an emergency incident in the past year. The most common means of transportation used were Bajajs (39.2%) and ambulances (22.7%). The majority (88.1%) of the respondents did not know the EMS access phone number of an ambulance. As their preferred mode of transportation in case of emergency conditions, 42.2% of the participants reported an ambulance, followed by Bajaj (33.7%). Participants who had gynecologic emergencies were 9.4 times (AOR=9.4, 95% CI: 1.04, 85, p=0.046), and those who knew any ambulance number were 3.6 times (AOR=3.6, 95% CI: 1.22, 10.8, p=0.02), more likely to use ambulance services in case of emergencies. Conclusion: The ambulance utilization level in Mekelle city was low, and victims of emergency conditions were being transported mainly by public transport such as Bajajs and taxis. Even though the public's perception of EMS is favorable, lack of awareness of EMS access and lack of an integrated EMS system in the city are barriers that may have contributed to the low utilization. Actions to improve EMS access and integrate the system are warranted to promote utilization of the services.
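
As a reading aid for the odds ratios and confidence intervals reported above, the sketch below computes an unadjusted odds ratio with a Wald-type 95% interval from a 2x2 table. The adjusted odds ratios (AOR) in the abstract come from multivariable logistic regression, which this does not reproduce, and the counts used here are entirely hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald interval from a 2x2 table.

    a: exposed, outcome present     b: exposed, outcome absent
    c: unexposed, outcome present   d: unexposed, outcome absent
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: knows an ambulance number (exposed) vs. does not,
# used an ambulance (outcome present) vs. did not.
odds_ratio, lower, upper = odds_ratio_ci(a=12, b=8, c=30, d=60)
```

Small cell counts inflate the standard error of the log odds ratio, which is one common reason an interval can be very wide.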

Keywords: emergency medical services, utilization, Mekelle, barriers

Procedia PDF Downloads 56
40800 Analysis of Commercial Cow and Camel Milk by Nuclear Magnetic Resonance

Authors: Lucia Pappalardo, Sara Abdul Majid Azzam

Abstract:

Camel milk is widely consumed by people living in arid areas of the world, where it is also known for its potential therapeutic and medical properties. Indeed, it has been used as a treatment for several diseases such as tuberculosis, dropsy, asthma, jaundice and leishmaniasis in India, Sudan and some parts of Russia. A wealth of references is available in the literature on the composition of milk from different dairy animals such as cows, goats and sheep. Camel milk, by contrast, has not been extensively studied, despite its nutritional value. In this study, commercial cow and camel milk samples, bought from the local market, were analyzed by 1D 1H-NMR and multivariate statistics in order to identify differences in the composition of the low-molecular-weight compounds in the milk mixtures. The samples were analyzed in their native conditions without any pre-treatment. Our preliminary study shows that the two types of milk samples differ in their content of metabolites such as orotate, fats and more.

Keywords: camel, cow, milk, Nuclear Magnetic Resonance (NMR)

Procedia PDF Downloads 548
40799 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases when entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets in need of analysis become ever larger. One type of such symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, histograms are complicated to analyze. This paper proposes a method that makes it possible to compare two histogram-valued variables and therefore find a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which makes it possible to compare histograms with different numbers of bins and bin widths (so-called general histograms). The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, together with the concept of cluster compactness to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 151
40798 Development and Validation of the University of Mindanao Needs Assessment Scale (UMNAS) for College Students

Authors: Ryan Dale B. Elnar

Abstract:

This study developed a multidimensional needs assessment scale for college students called the University of Mindanao Needs Assessment Scale (UMNAS). Although there are context-specific instruments measuring the needs of clinical and non-clinical samples, the literature reveals no standardized scales to measure the needs of college students; thus, a four-phase item development process was initiated to support content validity. Comprising seven broad facets, namely spiritual-moral, intrapersonal, socio-personal, psycho-emotional, cognitive, physical and sexual, a pyramid model of college needs was deconstructed through a focus group discussion (FGD) sample to support the literature review. Using various construct validity procedures, the model was further tested using a total of 881 Filipino college students. The results of the study revealed evidence of the reliability and validity of the UMNAS. The reliability indices range from .929 to .933. Exploratory and confirmatory factor analyses revealed a one-factor, six-dimensional instrument to measure the needs of college students. Using multivariate regression analysis, year level and course were found to be predictors of students’ needs. Content analysis attested to the usefulness of the instrument for diagnosing students’ personal and academic issues and concerns in conjunction with other measures. The norming process included 1728 students from the different colleges of the University of Mindanao. Further validation is recommended to establish a national norm for the instrument.
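
The abstract does not say which reliability index was used. One widely used internal-consistency index for scales of this kind is Cronbach's alpha, sketched below as an assumption rather than as the study's procedure. The response matrix is made up.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                                 # number of items
    item_variances = [variance(col) for col in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Made-up Likert-style answers: 5 respondents x 4 items.
answers = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 1],
]
alpha = cronbach_alpha(answers)
```

Values close to 1 indicate that the items vary together across respondents; items that move in perfect lockstep give exactly 1.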

Keywords: needs assessment scale, validity, factor analysis, college students

Procedia PDF Downloads 434
40797 Cryptographic Protocol for Secure Cloud Storage

Authors: Luvisa Kusuma, Panji Yudha Prakasa

Abstract:

Cloud storage, as a subservice of infrastructure as a service (IaaS) in cloud computing, is a model of networked storage in which data are stored on a server. In this paper, we propose a secure cloud storage system consisting of two main components: the client, a user who uses the cloud storage service, and the server, which provides the cloud storage service. In this system, we propose protocol schemes to guard against security attacks during data transmission. The protocols are a login protocol, an upload data protocol, a download protocol, and a push data protocol, which implement a hybrid cryptographic mechanism based on encrypting data before it is sent to the cloud, so the cloud storage provider does not know the user's data and cannot analyze it, because there is no correspondence between data and user.

Keywords: cloud storage, security, cryptographic protocol, artificial intelligence

Procedia PDF Downloads 345
40796 Sensor Monitoring of the Concentrations of Different Gases Present in Synthesis of Ammonia Based on Multi-Scale Entropy and Multivariate Statistics

Authors: S. Aouabdi, M. Taibi

Abstract:

The supervision of chemical processes is the subject of increased development because of the increasing demands on reliability and safety. An important aspect of the safe operation of a chemical process is the earlier detection of process faults or other special events, and the location and removal of the factors causing such events, than is possible with conventional limit and trend checks. With the aid of process models and estimation and decision methods, it is possible to monitor hundreds of variables in a single operating unit, and these variables may be recorded hundreds or thousands of times per day. In the absence of an appropriate processing method, only limited information can be extracted from these data. Hence, a tool is required that can project the high-dimensional process space into a low-dimensional space amenable to direct visualization, and that can also identify key variables and important features of the data. Our contribution is a new monitoring method based on multi-scale entropy (MSE) that characterizes the behaviour of the concentrations of the different gases present in the synthesis; a soft sensor based on PCA is applied to estimate these variables.
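
Multi-scale entropy is commonly computed by coarse-graining a signal at several scales and evaluating sample entropy on each coarse-grained series. The sketch below follows that general recipe under stated assumptions: the parameter choices (m = 2, tolerance equal to 0.2 times the standard deviation of the original signal) are conventional defaults rather than values from the abstract, the test signal is synthetic, and the PCA soft sensor is not shown.

```python
import math
from statistics import pstdev

def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def sample_entropy(series, m, r):
    """Sample entropy with template length m and absolute tolerance r."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")   # undefined: too few matches in a short series
    return -math.log(a / b)

def multiscale_entropy(series, max_scale, m=2, r_fraction=0.2):
    """Sample entropy of the coarse-grained series at scales 1..max_scale."""
    r = r_fraction * pstdev(series)   # tolerance fixed from the original signal
    return [sample_entropy(coarse_grain(series, s), m, r)
            for s in range(1, max_scale + 1)]

# Synthetic stand-in for a gas-concentration record.
signal = [math.sin(0.5 * i) for i in range(60)]
mse_profile = multiscale_entropy(signal, max_scale=3)
```

A perfectly periodic signal yields a sample entropy of zero, while irregular signals yield larger values; tracking the profile across scales is what distinguishes this from single-scale entropy.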

Keywords: ammonia synthesis, concentrations of different gases, soft sensor, multi-scale entropy, multivariate statistics

Procedia PDF Downloads 324
40795 Effect of Renin Angiotensin Pathway Inhibition on the Efficacy of Anti-programmed Cell Death (PD-1/L-1) Inhibitors in Advanced Non-small Cell Lung Cancer Patients- Comparison of Single Hospital Retrospective Assessment to the Published Literature

Authors: Esther Friedlander, Philip Friedlander

Abstract:

The use of immunotherapy that inhibits programmed death-1 (PD-1) or its ligand PD-L1 confers survival benefits in patients with non-small cell lung cancer (NSCLC). However, approximately 45% of patients experience primary treatment resistance, necessitating the development of strategies to improve efficacy. While the renin-angiotensin system (RAS) has systemic hemodynamic effects, tissue-specific regulation also exists, along with modulation of immune activity, in part through regulation of myeloid cell activity, leading to the hypothesis that RAS inhibition may improve anti-PD-1/L-1 efficacy. A retrospective analysis was conducted of 173 advanced solid tumor cancer patients treated with a PD-1/L-1 inhibitor in a defined time period at Valley Hospital, a community hospital in New Jersey, USA. It showed a statistically significant relationship between RAS pathway inhibition (RASi, through concomitant treatment with an ACE inhibitor or angiotensin receptor blocker) and positive efficacy of the immunotherapy that was independent of age, gender and cancer type. Subset analysis revealed a strong numerical benefit for efficacy in patients with both squamous and nonsquamous NSCLC, as determined by documented clinician assessment of efficacy and by duration of therapy. A PubMed literature search was then conducted to identify studies assessing the effect of RAS pathway inhibition on anti-PD-1/L1 efficacy in advanced solid tumor patients and to compare these findings to those seen in the Valley Hospital retrospective study, with a focus on NSCLC specifically. A total of 11 articles were identified assessing the effects of RAS pathway inhibition on the efficacy of checkpoint inhibitor immunotherapy in advanced cancer patients. Of the 11 studies, 10 assessed the effect on survival of RASi in the context of treatment with anti-PD-1/PD-L1, while one assessed the effect on CTLA-4 inhibition. Eight of the 10 studies included patients with NSCLC, while the remaining 2 were specific to genitourinary malignancies. Of the 8 studies, two were specific to NSCLC patients, with the remaining 6 including a range of cancer types, of which NSCLC was one. Of these 6 studies, only 2 reported specific survival data for the NSCLC subpopulation. Patient characteristics, multivariate analysis data and efficacy data from the 2 NSCLC-specific studies and the 2 basket studies that provided data on the NSCLC subpopulation were compared to those seen in the Valley Hospital retrospective study. The comparison supports a broader effect of RASi on anti-PD-1/L1 efficacy in advanced NSCLC, with the majority of studies showing statistically significant benefit or strong statistical trends but one study demonstrating worsened outcomes. This comparison of studies extends published findings to the community hospital setting and supports prospective assessment through randomized clinical trials of efficacy in NSCLC patients, with pharmacodynamic components to determine the effect on immune cell activity in tumors and on the composition of the tumor microenvironment.

Keywords: immunotherapy, cancer, angiotensin, efficacy, PD-1, lung cancer, NSCLC

Procedia PDF Downloads 59
40794 Sales Patterns Clustering Analysis on Seasonal Product Sales Data

Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho

Abstract:

As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.

Keywords: clustering, distribution, sales pattern, seasonal product

Procedia PDF Downloads 583
40793 Parents and Stakeholders’ Perspectives on Early Reading Intervention Implemented as a Curriculum for Children with Learning Disabilities

Authors: Bander Mohayya Alotaibi

Abstract:

Valuable partnerships between parents and teachers may develop positive and effective interactions between home and school. This helps these stakeholders share information and resources regarding student academics during ongoing interactions. Such partnerships thus build a solid foundation for both families and schools to help children succeed in school. Parental involvement can be seen as an effective tool that can change homes and communities and not just school systems. Seeking parents’ and stakeholders’ attitudes toward learning and learners can help schools design a curriculum. Subsequently, this information can be used to find ways to help improve the academic performance of students, especially in low-performing schools. There may be some conflicts when designing a curriculum. In addition, designing a curriculum might bring more educational expectations to all sides. There is a lack of research that targets the specific attitudes of parents toward specific concepts in curriculum content. More research is needed to study the perspectives that parents of children with learning disabilities (LD) have regarding the early reading curriculum. Parents’ and stakeholders’ perspectives on early reading intervention implemented as a curriculum for children with LD were studied through advanced quantitative research. This study seeks to understand stakeholders’ and parents’ perspectives on key concepts and essential early reading skills that impact the design of a curriculum that will serve as an intervention for early struggling readers who have LD. Those concepts or stages include phonics, phonological awareness, and reading fluency, as well as strategies used at home by parents. A survey instrument was used to gather the data. Participants were recruited through 29 schools and districts of the metropolitan area of the northern part of Saudi Arabia. Participants were stakeholders, including parents of children with learning disabilities. Data were collected by distributing a paper-and-pen survey to schools. The psychometric properties of the instrument were evaluated for validity and reliability; face validity, content validity, and construct validity, including an Exploratory Factor Analysis, were used to shape and reevaluate the structure of the instrument. Multivariate analysis of variance (MANOVA) was used to find differences between the variables. The study reports the perspectives of stakeholders toward reading strategies, phonics, phonological awareness, and reading fluency. Suggestions and limitations are also discussed.

Keywords: stakeholders, learning disability, early reading, perspectives, parents, intervention, curriculum

Procedia PDF Downloads 142
40792 Reliability Prediction of Tires Using Linear Mixed-Effects Model

Authors: Myung Hwan Na, Ho-Chun Song, EunHee Hong

Abstract:

The normal linear mixed-effects model is widely used to analyze repeated-measurement data. When heteroscedasticity and non-normality of the population distribution are detected at the same time, the normal linear mixed-effects model can give improper analysis results. To achieve more robust estimation, we use a heavy-tailed linear mixed-effects model, which gives more exact and reliable conclusions than the standard normal linear mixed-effects model.

Keywords: reliability, tires, field data, linear mixed-effects model

Procedia PDF Downloads 553
40791 Antenatal Factors Associated with Early Onset Neonatal Sepsis among Neonates 0-7 Days at Fort Portal Regional Referral Hospital

Authors: Moses Balina, Archbald Bahizi

Abstract:

Introduction: Early onset neonatal sepsis is a systemic infection in a newborn baby during the first week after birth and contributes to 50% of neonatal deaths each year. Risk factors for early onset neonatal sepsis, which can be maternal, health care provider, or health care facility associated, can be prevented with access to quality antenatal care. Objective: The objective of the study was to assess early onset neonatal sepsis and the antenatal factors associated with it at Fort Portal Regional Referral Hospital. Methodology: A cross-sectional study design was used. The study involved 60 respondents who were mothers of breastfeeding neonates being treated for early onset neonatal sepsis at the Fort Portal Regional Referral Hospital neonatal intensive care unit. Simple random sampling was used to select study participants. Data were collected using questionnaires, entered in Stata 16, and analysed using logistic regression. Results: The prevalence of early onset neonatal sepsis at Fort Portal Regional Referral Hospital was 25%. Multivariate analysis revealed that institutional factors were the only antenatal factors significantly associated with early onset neonatal sepsis at Fort Portal Regional Referral Hospital (p < 0.01). Bivariate analysis revealed that attending antenatal care at a health centre III or IV instead of a hospital (p = 0.011) and attending antenatal care in health care facilities with no laboratory investigations (p = 0.048) were risk factors for early onset neonatal sepsis in the newborn at Fort Portal Regional Referral Hospital. Conclusion: Antenatal factors were associated with early onset neonatal sepsis, and health care facility factors such as attendance at a lower-level health centre and the unavailability of quality laboratory investigations for pregnant women contributed to early onset neonatal sepsis in the newborn. Mentorship, equipping and stocking laboratories, and improving staffing levels are needed to reduce early onset neonatal sepsis.

Keywords: antenatal factors, early onset neonatal sepsis, neonates 0-7 days, fort portal regional referral hospital

Procedia PDF Downloads 91
40790 Helping the Development of Public Policies with Knowledge of Criminal Data

Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno

Abstract:

The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and extraction of tree decision rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.

Keywords: social data analysis, criminal records, computational techniques, data mining, big data

Procedia PDF Downloads 69
40789 Differential in Dynamics of Contraceptive Practices with Women's Sexual Empowerment in Selected South Asian Countries: Evidence from Two Decades DHS Surveys, 1990 and 2012

Authors: Brajesh

Abstract:

Introduction: It is generally believed that women's lack of decision-making power may restrict their use of modern contraceptive practices. However, few studies have examined the different dimensions of women's empowerment and contraceptive use in the Asian context. Pervasive gendered inequities and norms regarding the subordination of women give Asian men disproportionately more power than women, particularly in relation to sex. We hypothesize that lack of sexual empowerment may pose an important barrier to reproductive health and the adoption of family planning methods. Using the Demographic and Health Survey, we examine the association between women’s sexual empowerment and contraceptive use in Nepal, Bangladesh and Pakistan. Objectives: To understand the trend and pattern of contraceptive choices and use among women in relation to sexual empowerment in selected South Asian countries. To examine the association between women’s sexual empowerment and contraceptive practices among non-pregnant married and partnered women in Nepal, Bangladesh and Pakistan. Methods: Data came from the latest round of Demographic and Health Surveys, conducted in 2010-12, and from the round conducted in 1990-92, in Nepal, Bangladesh and Pakistan. Responses from married or cohabiting women aged 15-49 years were analyzed for six dimensions of empowerment and the current use of female-only methods or couple methods. Bivariate and multivariate multinomial regressions were used to identify associations between the empowerment dimensions and method use. Results: Positive associations were found between the overall empowerment score and method use in all countries (relative risk ratios, 1.1-1.3). In multivariate analysis, household economic decision-making was associated with the use of either female-only or couple methods (RRR 1.1 for all), as was agreement on fertility preferences (RRR 1.3-1.6) and the ability to negotiate sexual activity (RRR 1.1-1.2). In Bangladesh, women's negative attitudes toward domestic violence were correlated with the use of couple methods (RRR 1.1). Increasing levels of sexual empowerment were found to be associated with use of contraceptives, even after adjusting for demographic predictors of contraceptive use. This association is moderated by wealth. Formal education, increasing wealth, and being in an unmarried partnership are associated with contraceptive use, whereas women who identify as Muslim are less likely to use contraceptives than those who identify as Hindu or other. These findings suggest that to achieve universal access to reproductive health services, gendered disparities in sexual empowerment, particularly among economically disadvantaged women, need to be better addressed. Conclusions: Intervention programs aimed at increasing contraceptive use may need to involve different approaches, including promoting couples' discussion of fertility preferences and family planning, improving women's self-efficacy in negotiating sexual activity, and increasing their economic independence. Policies are needed to encourage rural families to give their girls a chance to attend higher education and professional courses so that they can get better job opportunities and support their families economically, as sons are expected to do.

Keywords: reproductive and child health (RCH), relative risk ratios (RRR), demographic and health survey (DHS), women’s sexual empowerment (WSE)

Procedia PDF Downloads 238
40788 Applying Big Data Analysis to Efficiently Exploit the Vast Unconventional Tight Oil Reserves

Authors: Shengnan Chen, Shuhua Wang

Abstract:

Successful production of hydrocarbons from unconventional tight oil reserves has changed the energy landscape in North America. The oil contained within these reservoirs typically will not flow to the wellbore at economic rates without assistance from advanced horizontal wells and multi-stage hydraulic fracturing. Efficient and economic development of these reserves is a priority of society, government, and industry, especially under the current low oil prices. Meanwhile, society needs technological and process innovations to enhance oil recovery while concurrently reducing environmental impacts. Recently, big data analysis and artificial intelligence have become very popular, developing data-driven insights for better designs and decisions in various engineering disciplines. However, the application of data mining in petroleum engineering is still in its infancy. The objective of this research is to apply intelligent data analysis and data-driven models to exploit unconventional oil reserves both efficiently and economically. More specifically, a comprehensive database including reservoir geological data, reservoir geophysical data, well completion data and production data for thousands of wells is first established to discover valuable insights and knowledge related to tight oil reserves development. Several data analysis methods are introduced to analyze such a huge dataset. For example, K-means clustering is used to partition all observations into clusters; principal component analysis is applied to emphasize the variation and bring out strong patterns in the dataset, making the big data easy to explore and visualize; exploratory factor analysis (EFA) is used to identify the complex interrelationships between well completion data and well production data. Different data mining techniques, such as artificial neural networks, fuzzy logic, and machine learning techniques, are then summarized, and appropriate ones are selected to analyze the database based on prediction accuracy, model robustness, and reproducibility. Advanced knowledge and patterns are finally recognized and integrated into a modified self-adaptive differential evolution optimization workflow to enhance oil recovery and maximize the net present value (NPV) of the unconventional oil resources. This research will advance knowledge in the development of unconventional oil reserves and bridge the gap between big data and performance optimization in these formations. The newly developed data-driven optimization workflow is a powerful approach to guide field operation, which leads to better designs, higher oil recovery and economic return of future wells in the unconventional oil reserves.
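The clustering and PCA steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' workflow: the data are synthetic, the four well attributes are made up, and the farthest-point initialisation of K-means is a simplification chosen here to keep the example deterministic.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        nearest = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(nearest)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # An empty cluster would produce NaN here; this sketch does not handle it.
        updated = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def pca(X, n_components):
    """Return component scores and explained-variance ratios via SVD."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    ratio = (s ** 2) / np.sum(s ** 2)
    return scores, ratio[:n_components]

# Synthetic stand-in for a well database: 100 wells, 4 hypothetical attributes.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(50, 4))
group_b = rng.normal(5.0, 1.0, size=(50, 4))
wells = standardize(np.vstack([group_a, group_b]))

labels, centroids = kmeans(wells, k=2)          # cluster membership per well
scores, explained = pca(wells, n_components=2)  # 2-D coordinates for plotting
```

Plotting `scores` coloured by `labels` would give the kind of low-dimensional view the abstract describes for exploring a large dataset.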

Keywords: big data, artificial intelligence, enhance oil recovery, unconventional oil reserves

Procedia PDF Downloads 272
40787 Statistical Model of Water Quality in Estero El Macho, Machala-El Oro

Authors: Rafael Zhindon Almeida

Abstract:

Surface water quality is an important concern for the evaluation and prediction of water quality conditions. The objective of this study is to develop a statistical model that can accurately predict the water quality of the El Macho estuary in the city of Machala, El Oro province. The methodology employed in this study is of a basic type that involves a thorough search for theoretical foundations to improve the understanding of statistical modeling for water quality analysis. The research design is correlational, using a multivariate statistical model involving multiple linear regression and principal component analysis. The results indicate that water quality parameters such as fecal coliforms, biochemical oxygen demand, chemical oxygen demand, iron and dissolved oxygen exceed the allowable limits. The water of the El Macho estuary is determined to be below the required water quality criteria. The multiple linear regression model, based on chemical oxygen demand and total dissolved solids, explains 99.9% of the variance of the dependent variable. In addition, principal component analysis shows that the model has an explanatory power of 86.242%. The study successfully developed a statistical model to evaluate the water quality of the El Macho estuary. The estuary did not meet the water quality criteria, with several parameters exceeding the allowable limits. The multiple linear regression model and principal component analysis provide valuable information on the relationship between the various water quality parameters. The findings of the study emphasize the need for immediate action to improve the water quality of the El Macho estuary to ensure the preservation and protection of this valuable natural resource.
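As a rough illustration of how the "variance explained" figure of a multiple linear regression is obtained, the sketch below fits ordinary least squares on two predictors and computes R² as 1 - SS_res / SS_tot. The data are synthetic, the variable names `cod`, `tds`, and `response` are placeholders, and nothing here reproduces the study's model or its reported values.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns coefficients and R^2."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    return coef, 1.0 - ss_res / ss_tot

# Synthetic data: two predictors standing in for chemical oxygen demand
# and total dissolved solids; the coefficients below are invented.
rng = np.random.default_rng(1)
cod = rng.uniform(20, 200, size=40)
tds = rng.uniform(100, 900, size=40)
response = 2.0 + 0.05 * cod + 0.01 * tds + rng.normal(0, 0.05, size=40)

coef, r_squared = fit_linear_regression(np.column_stack([cod, tds]), response)
```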

Keywords: statistical modeling, water quality, multiple linear regression, principal components, statistical models

Procedia PDF Downloads 78
40786 Wedding Organizer Strategy in the Era Covid-19 Pandemic In Surabaya, Indonesia

Authors: Rifky Cahya Putra

Abstract:

The coronavirus pandemic has created difficulties for many countries. As a result, many traders and companies find it difficult to operate in this pandemic era, and human activities in some fields must adopt a new lifestyle, known as the new normal. The transition from one activity to another requires a high degree of adaptation, so almost all sectors experience the impact of this phase, one of which is the wedding organizer sector. This research aims to find out what strategies are used so that a company can keep running during the pandemic. Data were collected through interviews with the owner of a wedding organizer and his team. The data were analysed using a qualitative descriptive approach with an interactive model consisting of three main components, namely data reduction, data presentation, and conclusion drawing. Based on the interviews, the conclusion is that three strategies are used: social media, sponsorship, and promotion.

Keywords: strategy, wedding organizer, pandemic, indonesia

Procedia PDF Downloads 124
40785 Intrusion Detection System Using Linear Discriminant Analysis

Authors: Zyad Elkhadir, Khalid Chougdali, Mohammed Benattou

Abstract:

Most existing intrusion detection systems work on quantitative network traffic data with many irrelevant and redundant features, which makes the detection process more time-consuming and inaccurate. Several feature extraction methods, such as linear discriminant analysis (LDA), have been proposed. However, LDA suffers from the small sample size (SSS) problem, which occurs when the number of training samples is small compared with the sample dimension. Hence, classical LDA cannot be applied directly to high-dimensional data such as network traffic data. In this paper, we propose two solutions to the SSS problem for LDA and apply them to a network IDS. The first method reduces the dimension of the original data using principal component analysis (PCA) and then applies LDA. In the second solution, we propose to use the pseudo-inverse to avoid the singularity of the within-class scatter matrix caused by the SSS problem. After that, the KNN algorithm is used for classification. We have chosen two well-known datasets, KDDcup99 and NSL-KDD, for testing the proposed approaches. Results showed that the classification accuracy of the (PCA+LDA) method clearly outperforms that of the pseudo-inverse LDA method when the training data are large.
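The two routes described here can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are random Gaussians rather than KDDcup99 or NSL-KDD records, the dimensions (50 features, 10 training samples per class, 5 principal components, k = 3) are arbitrary, and the function names are hypothetical.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        Sw += (Xc - class_mean).T @ (Xc - class_mean)
        diff = (class_mean - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    return Sw, Sb

def lda_directions(X, y, n_components):
    """LDA directions using the pseudo-inverse of Sw.

    When Sw is full rank the pseudo-inverse equals the ordinary inverse,
    so the same function serves both routes.
    """
    Sw, Sb = scatter_matrices(X, y)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw, rcond=1e-10) @ Sb)
    order = np.argsort(-eigvals.real)[:n_components]
    return eigvecs[:, order].real

def pca_basis(X, n_components):
    """Top principal directions of the centred data, via SVD."""
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return Vt[:n_components].T

def knn_predict(train_Z, train_y, test_Z, k=3):
    """Majority vote among the k nearest training points."""
    dists = np.linalg.norm(test_Z[:, None, :] - train_Z[None, :, :], axis=2)
    votes = train_y[np.argsort(dists, axis=1)[:, :k]]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic small-sample-size setting: 20 training samples, 50 features.
rng = np.random.default_rng(0)
d, n_train, n_test = 50, 10, 20
X_train = np.vstack([rng.normal(0.0, 1.0, (n_train, d)),
                     rng.normal(4.0, 1.0, (n_train, d))])
y_train = np.array([0] * n_train + [1] * n_train)
X_test = np.vstack([rng.normal(0.0, 1.0, (n_test, d)),
                    rng.normal(4.0, 1.0, (n_test, d))])
y_test = np.array([0] * n_test + [1] * n_test)

# Route 1: PCA to 5 dimensions, then LDA, then KNN.
mean = X_train.mean(axis=0)
P = pca_basis(X_train, 5)
Z_train, Z_test = (X_train - mean) @ P, (X_test - mean) @ P
W1 = lda_directions(Z_train, y_train, 1)
acc_pca_lda = np.mean(knn_predict(Z_train @ W1, y_train, Z_test @ W1) == y_test)

# Route 2: LDA directly on the 50-D data, with pinv handling the singular Sw.
W2 = lda_directions(X_train, y_train, 1)
acc_pinv_lda = np.mean(knn_predict(X_train @ W2, y_train, X_test @ W2) == y_test)
```

The explicit `rcond` cutoff keeps numerically zero singular values of the rank-deficient scatter matrix from being inverted; the value 1e-10 is a choice made for this sketch.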

Keywords: LDA, Pseudoinverse, PCA, IDS, NSL-KDD, KDDcup99

Procedia PDF Downloads 217
40784 Model Predictive Controller for Pasteurization Process

Authors: Tesfaye Alamirew Dessie

Abstract:

Our study focuses on developing a Model Predictive Controller (MPC) and evaluating it against a traditional PID for a pasteurization process. Utilizing system identification from the experimental data, the dynamics of the pasteurization process were calculated. Using best fit with data validation, residual, and stability analysis, the quality of several model architectures was evaluated. The validation data fit the auto-regressive with exogenous input (ARX322) model of the pasteurization process by roughly 80.37 percent. The ARX322 model structure was used to create MPC and PID control techniques. After comparing controller performance based on settling time, overshoot percentage, and stability analysis, it was found that MPC controllers outperform PID for those parameters.
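To make the system identification step more concrete, here is a minimal least-squares ARX sketch. It assumes that "ARX322" denotes orders na = 3, nb = 2 and input delay nk = 2, which the abstract does not state, and it uses simulated noise-free data with invented coefficients. The fit percentage follows the common normalised definition 100 × (1 − ‖y − ŷ‖ / ‖y − mean(y)‖) applied to one-step-ahead predictions, which may differ from the measure the author used.

```python
import numpy as np

def fit_arx(y, u, na=3, nb=2, nk=2):
    """Least-squares ARX fit; returns (a, b, fit_percent).

    Model: y[t] + a1*y[t-1] + ... + a_na*y[t-na]
           = b1*u[t-nk] + ... + b_nb*u[t-nk-nb+1] + e[t]
    """
    start = max(na, nk + nb - 1)
    rows = []
    for t in range(start, len(y)):
        past_outputs = [-y[t - i] for i in range(1, na + 1)]
        past_inputs = [u[t - nk - j] for j in range(nb)]
        rows.append(past_outputs + past_inputs)
    phi = np.array(rows)                      # regression matrix
    target = y[start:]
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)
    predicted = phi @ theta                   # one-step-ahead predictions
    fit = 100.0 * (1.0 - np.linalg.norm(target - predicted)
                   / np.linalg.norm(target - target.mean()))
    return theta[:na], theta[na:], fit

# Simulated plant with invented, stable coefficients (poles at 0.5, 0.4, 0.3).
rng = np.random.default_rng(0)
a_true = np.array([-1.2, 0.47, -0.06])
b_true = np.array([1.0, 0.5])
u = rng.standard_normal(300)
y = np.zeros(300)
for t in range(3, 300):
    y[t] = -a_true @ y[t - 3:t][::-1] + b_true @ np.array([u[t - 2], u[t - 3]])

a_hat, b_hat, fit_percent = fit_arx(y, u)
```

With real, noisy measurements the estimate would be validated on a separate data record, as the abstract describes, rather than on the data used for fitting.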

Keywords: MPC, PID, ARX, pasteurization

Procedia PDF Downloads 147
40783 New Two-Way Map-Reduce Join Algorithm: Hash Semi Join

Authors: Marwa Hussein Mohamed, Mohamed Helmy Khafagy, Samah Ahmed Senbel

Abstract:

MapReduce is a programming model used to handle and support massive data sets. The rapid increase in data size makes the analysis of big data one of the most important issues today. MapReduce is used to analyze data and extract more helpful information by using two simple functions, map and reduce, which are the only parts written by the programmer, and it includes load balancing, fault tolerance, and high scalability. The most important operation in data analysis is the join, but MapReduce does not directly support joins. This paper explains two two-way map-reduce join algorithms, semi-join and per-split semi-join, and proposes a new algorithm, hash semi-join, that uses a hash table to increase performance by eliminating unused records as early as possible and applies the join using the hash table, rather than using the map function, to match the join key with the other data table in the second phase. Using hash tables does not affect memory size because we only save matched records from the second table. Our experimental results show that the hash semi-join algorithm with a hash table has higher performance than the two other algorithms as the data size increases from 10 million records to 500 million, and that running time increases according to the size of the joined records between the two tables.
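The idea of filtering early and joining through a hash table can be sketched in a single process as below. This is only an in-memory illustration of the principle, not the paper's MapReduce implementation: the three phases, the function name `hash_semi_join`, and the sample tables are assumptions made for the example.

```python
from collections import defaultdict

def hash_semi_join(left, right, key_left, key_right):
    """Join two lists of dict records, keeping only matching right rows in memory."""
    # Phase 1: collect the distinct join keys that occur in the left table.
    left_keys = {row[key_left] for row in left}

    # Phase 2: discard right rows with no partner as early as possible and
    # store only the matched ones in a hash table keyed by the join key.
    matched = defaultdict(list)
    for row in right:
        k = row[key_right]
        if k in left_keys:
            matched[k].append(row)

    # Phase 3: probe the hash table for every left row to emit joined records.
    joined = []
    for lrow in left:
        for rrow in matched.get(lrow[key_left], ()):
            joined.append({**lrow, **rrow})
    return joined

# Hypothetical sample tables.
orders = [
    {"order_id": 1, "cust_id": 10},
    {"order_id": 2, "cust_id": 20},
    {"order_id": 3, "cust_id": 10},
]
customers = [
    {"cust_id": 10, "name": "Ada"},
    {"cust_id": 30, "name": "Bob"},
]
result = hash_semi_join(orders, customers, "cust_id", "cust_id")
```

Here the customer with `cust_id` 30 never enters the hash table, which mirrors the abstract's point that only matched records from the second table are saved.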

Keywords: map reduce, hadoop, semi join, two way join

Procedia PDF Downloads 501
40782 Spatio-Temporal Variation of Gaseous Pollutants and the Contribution of Particulate Matters in Chao Phraya River Basin, Thailand

Authors: Samart Porncharoen, Nisa Pakvilai

Abstract:

Elevated levels of air pollutants in regional atmospheric environments are a significant problem that affects human health in Thailand, particularly in the Chao Phraya River Basin. Of concern are issues surrounding ambient air pollution, such as particulate matter and gaseous pollutants, and more specifically air pollution along the river. Therefore, a spatio-temporal study of air pollution in this real environment can yield more accurate air quality data for making formalized environmental policy in river basins. In order to inform such a policy, a study was conducted from January to December 2015 to continually collect measurements of various pollutants in both urban and regional locations in the Chao Phraya River Basin. This study investigated the air pollutants in many diverse environments along the Chao Phraya River Basin, Thailand, in 2015. Multivariate analysis techniques such as principal component analysis (PCA) and path analysis were utilised to classify air pollution in the surveyed locations. Measurements were collected in both urban and rural areas to see if significant differences existed between the two locations in terms of air pollution levels. The meteorological parameters and various particulates were collected continually from a Thai Pollution Control Department monitoring station from January to December 2015. Of interest to this study were the readings of SO2, CO, NOx, O3, and PM10. Results showed daily arithmetic mean concentrations of SO2, CO, NOx, O3, and PM10 of 3±1 ppb, 0.5±0.5 ppm, 30±21 ppb, 19±16 ppb, and 40±20 µg/m3, respectively, in urban locations (Bangkok). During the same time period, the readings for the same measurements in rural areas (Ayutthaya) were 1±0.5 ppb, 0.1±0.05 ppm, 25±17 ppb, 30±21 ppb, and 35±10 µg/m3, respectively. This shows that Bangkok is a highly polluted environment dominated by emissions from vehicles. Further, results were analysed to ascertain whether significant seasonal variation existed in the measurements. It was found that levels of both gaseous pollutants and particulate matter were higher in the dry season than in the wet season. More broadly, the results show that pollutant levels were highest in locations along the Chao Phraya River Basin known to have a large number of vehicles and biomass burning. This correlation suggests that the principal pollutants were from these anthropogenic sources. This study contributes to the body of knowledge surrounding ambient air pollution, such as particulate matter and gaseous pollutants, and more specifically air pollution along the Chao Phraya River Basin. Further, this study is one of the first to utilise continuous mobile monitoring along a river in order to gain accurate measurements during a data collection period. Overall, the results of this study can be used for making formalized environmental policy in river basins in order to reduce the physical effects on human health.

Keywords: air pollution, Chao Phraya river basin, meteorology, seasonal variation, principal component analysis

Procedia PDF Downloads 270
40781 The Quality of the Presentation Influence the Audience Perceptions

Authors: Gilang Maulana, Dhika Rahma Qomariah, Yasin Fadil

Abstract:

Purpose: This research aims to measure the magnitude of the influence of presentation quality on the targeted audience's perception when receiving the information presented. Design/Methodology/Approach: This research uses a quantitative research method. The data used in this research are primary data. The population in this research is students of the Faculty of Economics of Semarang State University. The sampling technique used is purposive sampling. Data were collected using a questionnaire administered to 30 respondents. The data were analysed using descriptive analysis. Result: The quality of the presentation has a positive influence on the perception of the audience. This indicates that a higher-quality presentation will improve the perception of the audience. Limitation: Respondents were limited to only 30 people.

Keywords: quality of presentation, presentation, audience, perception, semarang state university

Procedia PDF Downloads 374
40780 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, and image analytics. Most of the data in the field are well-structured and available in numerical or categorical formats that can be used for experiments directly. At the opposite end of the spectrum, however, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature. Such data can be found in discharge summaries, clinical notes, and procedural notes, which are written in human narrative form and have neither a relational model nor a standard grammatical structure. An important step in using these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm that is indexed on curated knowledge sources and is both fast and configurable. The authors also briefly compare its performance with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.
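A dictionary-indexed string matcher of the general kind mentioned here might look like the sketch below. It is not Q-Map's algorithm, whose details the abstract does not give: the greedy longest-match strategy, the function names, and the concept identifiers (deliberately fake, not real UMLS codes) are all assumptions for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(concepts):
    """Map each term, as a tuple of tokens, to its concept identifier."""
    index = {}
    for concept_id, terms in concepts.items():
        for term in terms:
            index[tuple(tokenize(term))] = concept_id
    return index

def extract_concepts(text, index):
    """Scan left to right, always taking the longest term found in the index."""
    tokens = tokenize(text)
    max_len = max((len(term) for term in index), default=0)
    found = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in index:
                found.append((" ".join(candidate), index[candidate]))
                i += n          # skip past the matched term
                break
        else:
            i += 1              # no term starts at this token
    return found

# Hypothetical knowledge source; identifiers are placeholders.
concepts = {
    "HYPOTHETICAL-001": ["high blood pressure", "hypertension"],
    "HYPOTHETICAL-002": ["type 2 diabetes"],
    "HYPOTHETICAL-003": ["diabetes"],
}
index = build_index(concepts)
note = "Patient has high blood pressure and type 2 diabetes."
mentions = extract_concepts(note, index)
```

Because the longest match wins, "type 2 diabetes" is reported as one concept rather than as the shorter term "diabetes".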

Keywords: information retrieval, unified medical language system, syntax based analysis, natural language processing, medical informatics

Procedia PDF Downloads 121