Search results for: Data Envelopment Analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 41005

39865 A Data-Driven Monitoring Technique Using Combined Anomaly Detectors

Authors: Fouzi Harrou, Ying Sun, Sofiane Khadraoui

Abstract:

Anomaly detection based on Principal Component Analysis (PCA) has been studied intensively and widely applied to multivariate processes with highly cross-correlated process variables. Monitoring metrics such as Hotelling's T2 and the Q statistic are usually used in PCA-based monitoring to elucidate the pattern variations in the principal and residual subspaces, respectively. However, these metrics are ill suited to detecting small faults. In this paper, Exponentially Weighted Moving Average (EWMA) charts based on the T2 and Q statistics, T2-EWMA and Q-EWMA, were developed for detecting faults in the process mean. The performance of the proposed methods was compared with that of the conventional PCA-based fault detection method using synthetic data. The results clearly show the benefit and the effectiveness of the proposed methods over the conventional PCA method, especially for detecting small faults in highly correlated multivariate data.
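The EWMA idea in the abstract above can be illustrated with a minimal sketch. It assumes a monitoring statistic (for example Q) has already been computed per sample; the numbers, the smoothing factor `lam`, and the limit width are hypothetical and are not taken from the paper. The control limit uses the standard asymptotic EWMA formula.

```python
import math
from statistics import mean, stdev

def ewma(values, lam=0.2, start=None):
    """Return the EWMA sequence z_t = lam * x_t + (1 - lam) * z_{t-1}."""
    z = values[0] if start is None else start
    smoothed = []
    for x in values:
        z = lam * x + (1.0 - lam) * z
        smoothed.append(z)
    return smoothed

def ewma_upper_limit(mu0, sigma0, lam=0.2, width=3.0):
    """Asymptotic upper control limit of an EWMA chart."""
    return mu0 + width * sigma0 * math.sqrt(lam / (2.0 - lam))

# Hypothetical statistic values: fault-free training data, then test
# data with a small upward shift in the mean from index 5 onward.
training_q = [1.0, 1.2, 0.9, 1.1, 0.8, 1.0, 1.1, 0.9, 1.0, 1.0]
test_q = [1.0, 0.9, 1.1, 1.0, 1.0, 1.3, 1.4, 1.3, 1.5, 1.4]

mu0, sigma0 = mean(training_q), stdev(training_q)
limit = ewma_upper_limit(mu0, sigma0)
smoothed = ewma(test_q, start=mu0)

# Indices of test samples whose smoothed statistic exceeds the limit.
alarms = [i for i, z in enumerate(smoothed) if z > limit]
print(alarms)
```

Because each smoothed value accumulates information from earlier samples, a persistent small shift pushes the EWMA over its limit even when individual samples stay close to normal levels.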

Keywords: data-driven method, process control, anomaly detection, dimensionality reduction

Procedia PDF Downloads 278
39864 Housing Price Dynamics: Comparative Study of 1980-1999 and the New Millennium

Authors: Janne Engblom, Elias Oikarinen

Abstract:

The understanding of housing price dynamics is of importance to a great number of agents: portfolio investors, banks, real estate brokers and construction companies, as well as policy makers and households. A panel dataset is one that follows a given sample of individuals over time, and thus provides multiple observations on each individual in the sample. Panel data models include a variety of fixed and random effects models which form a wide range of linear models. A special case of panel data models is dynamic in nature. A complication regarding a dynamic panel data model that includes the lagged dependent variable is endogeneity bias of estimates. Several approaches have been developed to account for this problem. In this paper, the panel models were estimated using the Common Correlated Effects (CCE) estimator for dynamic panel data, which also accounts for the cross-sectional dependence caused by common structures of the economy. In the presence of cross-sectional dependence, standard OLS gives biased estimates. In this study, U.S. housing price dynamics were examined empirically using the dynamic CCE estimator, with the first difference of housing price as the dependent variable and first differences of per capita income, interest rate, housing stock and lagged price, together with the deviation of housing prices from their long-run equilibrium level, as independent variables. These deviations were also estimated from the data. The aim of the analysis was to compare estimates between 1980-1999 and 2000-2012. Based on data for 50 U.S. cities over 1980-2012, differences in short-run housing price dynamics estimates were mostly significant when the two time periods were compared. Significance tests of the differences were provided by a model containing interaction terms of the independent variables and a time dummy variable. Residual analysis showed very low cross-sectional correlation of the model residuals compared with the standard OLS approach. This indicates a good fit of the CCE estimator model. Estimates of the dynamic panel data model were in line with the theory of housing price dynamics. Results also suggest that the dynamics of the housing market are evolving over time.

Keywords: dynamic model, panel data, cross-sectional dependence, interaction model

Procedia PDF Downloads 238
39863 Analyzing the Sensation of Jogja Kembali Monument (Monjali): Case Study of Yogyakarta as the Implementation of Attraction Tour

Authors: Hutomo Abdurrohman, Muhammad Latief, Waridatun Nida, Ranta Dwi Irawati

Abstract:

Yogyakarta Kembali Monument (Monjali) is one of the most popular tourist attractions in Yogyakarta. Yogyakarta is known as the ‘Student City’, and Monjali is a good place to learn and explore more about Yogyakarta, especially for elementary and junior high school students on study tours. Monjali is located on the North Ringroad, Jongkang, Sariharjo village, Ngaglik Subdistrict, Sleman Regency, Yogyakarta. Monjali offers many historical replicas, along with the stories behind them. These concern the war between Indonesia's fighters, the TNI (Indonesian national army), and the Dutch colonizers in Yogyakarta on March 1st, 1949. That event could open the eyes of the whole of Indonesia, because at that time the TNI was placed by the invaders. This research is an effort to evaluate visitors' interest in Monjali as a special tourist attraction. The subjects of this research are Monjali visitors up to 17 years old, sampled by taking one respondent from every 15 persons who visit Monjali; 200 respondents were needed to assess the condition and facilities of Monjali. The data were collected from January 2017 until October 2017. We conducted interviews and distributed a questionnaire that had been tested for validity and reliability. The data analysis is a descriptive statistical analysis in which qualitative data are converted into quantitative data using the Likert scale. The result of this research shows that the interest of Monjali's visitors is high, at 75.6%. Based on this result, Monjali is an attraction that continues to undergo improvement and development. Monjali succeeds as a place that combines entertainment with education, in line with the vision of Yogyakarta as a Student City.

Keywords: descriptive statistical analysis, Jogja Kembali monument, Likert scale, sensation

Procedia PDF Downloads 172
39862 Experimental Modal Analysis of Reinforced Concrete Square Slabs

Authors: M. S. Ahmed, F. A. Mohammad

Abstract:

The aim of this paper is to perform experimental modal analysis (EMA) of reinforced concrete (RC) square slabs. EMA is the process of determining the modal parameters (natural frequencies, damping factors, modal vectors) of a structure from a set of frequency response functions (FRFs) by curve fitting. Although experimental modal analysis (or modal testing) has grown steadily in popularity since the advent of the digital FFT spectrum analyzer in the early 1970s, its application to all members and materials has not yet been well documented. Therefore, in this work, experimental tests were conducted on RC square specimens (0.6m x 0.6m with 40 mm). The experimental analysis is based on a freely supported boundary condition. Moreover, impact testing, a fast and economical means of finding the modes of vibration of a structure, was used during the experiments. In addition, a Pico Scope 6 device and MATLAB software were used to acquire data and to analyze and plot the Frequency Response Function (FRF). The experimental natural frequencies extracted from the measurements exhibit good agreement with analytical predictions. It is shown that the EMA method can be usefully employed to characterize the dynamic behavior of RC slabs.

Keywords: natural frequencies, mode shapes, modal analysis, RC slabs

Procedia PDF Downloads 394
39861 Study of Components and Effective Factors on Organizational Commitment of Khoramabad Branch Islamic Azad University’s Faculty Members

Authors: Mehry Daraei

Abstract:

The goal of this study was to survey the components and effective factors on organizational commitment (OC) of faculty members of Islamic Azad University, Khoramabad Branch. The research method was correlation by causal modeling, and data were gathered by questionnaire. The statistical population consisted of 147 faculty members of Islamic Azad University, Khoramabad Branch, and the sample size was determined as 106 persons using Morgan’s sample table; the sample was selected by class sampling. A correlation test, a T-single group test and a path analysis test were used to analyze the data. Data were analyzed with Lisrel software. The results showed that organizational corporate was the most effective element on organizational commitment, and that organizational corporate, experience work and organizational justice were only in direct relation with organizational commitment. Also, job security had direct and indirect effects on OC. Job security had an effect on OC by gender. The gender variable had direct and indirect effects on OC. Gender had an effect on OC by organizational corporate. Job opportunities outside the university also had direct and indirect effects on OC, which means job opportunities had an indirect effect on OC by organizational corporate.

Keywords: organization, commitment, job security, Islamic Azad University

Procedia PDF Downloads 303
39860 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. NoSQL has emerged to handle the problem of unstructured data. Association rules mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies is well suited to MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study evaluates the performance of our algorithm against the classical Apriori algorithm.
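For orientation, the classical Apriori baseline mentioned in the abstract above can be sketched in a few lines. This is a single-machine illustration only, not the authors' MapReduce or MongoDB implementation; the transactions and the `min_support` threshold are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets seen in at least
    min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(baskets, min_support=2)
for itemset, count in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The counting pass over all transactions is the expensive part, and it is the step that a MapReduce formulation would typically distribute.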

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 144
39859 Modeling Waiting and Service Time for Patients: A Case Study of Matawale Health Centre, Zomba, Malawi

Authors: Moses Aron, Elias Mwakilama, Jimmy Namangale

Abstract:

Spending long periods in queues for a basic service remains a common challenge in most developing countries, including Malawi. In the health sector in particular, Out-Patient Departments (OPD) experience long queues. This puts the lives of patients at risk. However, by using queuing analysis to understand the nature of the problems and the efficiency of service systems, such problems can be abated. Depending on the kind of service, the literature proposes different possible queuing models. However, rather than relying on the generalized assumed models proposed in the literature, the use of real-time case study data can give a deeper understanding of the particular problem model and of how such a model can vary from one day to another and from one case to another. As such, this study uses data obtained from one urban HC for BP, Pediatric and General OPD cases to investigate the average queuing time for patients within the system. It seeks to identify the proper queuing model by investigating the distribution functions of patients’ arrival time, inter-arrival time, waiting time and service time. Compared with the standard values set by the WHO, the study found that patients at this HC spend more time waiting than being served. On model investigation, different days presented different models, ranging from an assumed M/M/1 and M/M/2 to M/Er/2. Through sensitivity analysis, the commonly assumed M/M/1 model generally failed to fit the data, whereas an M/Er/2 model fit well. An M/Er/3 model seemed to be good in terms of measuring resource utilization, suggesting a need to increase medical personnel at this HC. However, an M/Er/4 model was shown to cause more idleness of human resources.
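As a point of reference for the M/M/1 and M/M/2 models named above, the following sketch computes steady-state waiting metrics for an M/M/c queue using the Erlang C formula. It does not cover the M/Er/c models that the study found to fit better, and the arrival and service rates are hypothetical, not values from the health centre.

```python
import math

def mmc_metrics(arrival_rate, service_rate, servers):
    """Steady-state metrics for an M/M/c queue (Erlang C formula)."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / servers
    if utilization >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    # Probability that an arriving patient has to wait (Erlang C).
    head = sum(offered_load**k / math.factorial(k) for k in range(servers))
    tail = offered_load**servers / (math.factorial(servers) * (1 - utilization))
    p_wait = tail / (head + tail)
    # Mean time spent waiting in the queue before service starts.
    mean_wait = p_wait / (servers * service_rate - arrival_rate)
    return {"utilization": utilization, "p_wait": p_wait, "mean_wait": mean_wait}

# Hypothetical rates in patients per hour.
one_server = mmc_metrics(arrival_rate=2.0, service_rate=3.0, servers=1)
two_servers = mmc_metrics(arrival_rate=2.0, service_rate=3.0, servers=2)
print(one_server)
print(two_servers)
```

Comparing the two results shows the trade-off the abstract describes: adding a server shortens waiting time but lowers utilization, which is the idleness effect reported for the larger models.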

Keywords: health care, out-patient department, queuing model, sensitivity analysis

Procedia PDF Downloads 418
39858 Rasch Analysis in the Development of 'Kohesif-Ques': An Instrument to Measure Social Cohesion

Authors: Paramita Sekar Ayu, Sunjaya Deni Kurniadi, Yamazaki Chiho, Hilfi Lukman, Koyama Hiroshi

Abstract:

Social cohesion, or closeness among members of society, is an important determinant of population health. A cohesive society is a crucial societal condition for a positive life evaluation and subjective wellbeing, and people living in a cohesive society are happier and more satisfied with life and achieve better health status. The objective of this study was to compose and validate a questionnaire for measuring social cohesion with Rasch analysis. We developed a set of 13 questions to measure 4 dimensions of social cohesion. A random sample of 166 Bandung citizens was selected to answer the questionnaire. To evaluate the questionnaire’s validity and reliability, Rasch analysis (a psychometric model for analyzing categorical data on questionnaire responses) was carried out using Winsteps version 3.75.0. Rasch analysis was performed on the responses given to the 13 items included in the questionnaire. The reliability coefficient, Cronbach’s alpha, was 0.70, model RMSE 0.08, SD 0.54, separation 7.14, and reliability 0.98. ‘Kohesif-Ques’ is a useful instrument to assess social cohesion.
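The Cronbach's alpha coefficient reported above can be computed from raw item scores as in the sketch below. The response matrix is hypothetical, and the sketch covers only the alpha coefficient, not the Rasch model fitted in Winsteps.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical 5-point responses from five respondents to three items.
answers = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(answers)
print(round(alpha, 3))
```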

Keywords: Rasch analysis, Rasch model, social cohesion, questionnaire

Procedia PDF Downloads 152
39857 Diagnosis of Logistics Processes: Bibliometric Review and Analysis

Authors: S. F. Bayona, J. Nunez, D. Paez

Abstract:

Diagnostic processes have been consolidated as fundamental tools for gaining adequate knowledge of organizations and their processes. Diagnosis is related to the interpretation of data, findings and relevant information in order to determine problems, causes, or simply the state and behavior of a process, without including a solution to the problems detected. The objective of this work is to identify the stages necessary to diagnose the logistics processes in a metalworking company, based on a literature review across different disciplines. A total of 62 articles were chosen to identify, through bibliometric analysis, the most cited articles, as well as the most frequent authors and journals. The results made it possible to identify the two fundamental stages in the diagnostic process: a primary phase (general) based on the logical subjectivity of the knowledge of the person who evaluates, and a secondary phase (specific), related to the interpretation of the results, findings or data. Also, two phases were identified, one related to the definition of the scope of the actions to be developed and the other, as an initial description of what was observed in the process.

Keywords: business, diagnostic, management, process

Procedia PDF Downloads 142
39856 Time-Domain Analysis of Pulse Parameters Effects on Crosstalk in High-Speed Circuits

Authors: Loubna Tani, Nabih Elouzzani

Abstract:

Crosstalk among interconnects and printed-circuit board (PCB) traces is a major limiting factor of signal quality in high-speed digital and communication equipment, especially when fast data buses are involved. Such a bus is considered as a planar multiconductor transmission line. This paper will demonstrate how the finite difference time domain (FDTD) method provides an exact solution of the transmission-line equations to analyze the near-end and far-end crosstalk. In addition, this study makes it possible to analyze the effect of rise time on the near-end and far-end voltages of the victim conductor. The paper also discusses a statistical analysis based upon a set of several simulations. Such analysis leads to a better understanding of the phenomenon and yields useful information.

Keywords: multiconductor transmission line, crosstalk, finite difference time domain (FDTD), printed-circuit board (PCB), rise time, statistical analysis

Procedia PDF Downloads 416
39855 Problems of Learning English Vowels Pronunciation in Nigeria

Authors: Wasila Lawan Gadanya

Abstract:

This paper examines the problems of learning English vowel pronunciation. The objective is to identify some of the factors that affect the learning of English vowel sounds and their proper realization in words. The theoretical framework adopted is based on both error analysis and contrastive analysis. The data collection instruments used in the study are questionnaire and word list for the respondents (students) and observation of some of their lecturers. All the data collected were analyzed using simple percentage. The findings show that it is not a single factor that affects the learning of English vowel pronunciation rather many factors concurrently do so. Among the factors examined, it has been found that lack of correlation between English orthography and its pronunciation, not mother-tongue (which most people consider as a factor affecting learning of the pronunciation of a second language), has the greatest influence on students’ learning and realization of English vowel sounds since the respondents in this study are from different ethnic groups of Nigeria and thus speak different languages but having the same or almost the same problem when pronouncing the English vowel sounds.

Keywords: English vowels, learning, Nigeria, pronunciation

Procedia PDF Downloads 422
39854 Mobile Devices and E-Learning Systems as a Cost-Effective Alternative for Digitizing Paper Quizzes and Questionnaires in Social Work

Authors: K. Myška, L. Pilařová

Abstract:

The article deals with the possibilities of using cheap mobile devices in combination with free or open source software tools as an alternative to professional hardware and software equipment. Especially in social work, it is important to find a cheap yet functional solution that can compete with complex but expensive solutions for digitizing paper materials. Our research focused on the analysis of cheap and affordable solutions for digitizing the paper materials most frequently used by field workers in social work. We used comparative analysis as the research method. Social workers need to process data from paper forms quite often. In many cases, it is still more affordable and more time- and cost-effective to use paper forms to get feedback. Collecting data from paper quizzes and questionnaires can be done with the help of professional scanners and software. These technologies are very powerful and have advanced options for digitizing and processing digitized data, but are also very expensive. According to the results of our study, the combination of open source software and a mobile phone or cheap scanner can be considered a cost-effective alternative to professional equipment.

Keywords: digitalization, e-learning, mobile devices, questionnaire

Procedia PDF Downloads 134
39853 Modified InVEST for Whatsapp Messages Forensic Triage and Search through Visualization

Authors: Agria Rhamdhan

Abstract:

WhatsApp, the most popular mobile messaging app, has been used as evidence in many criminal cases. As mobile messaging generates large amounts of data, forensic investigation faces the challenge of large data problems. Finding the important evidence is hard because current practice relies on tools and techniques that require manual analysis to check all messages. Analyzing large sets of mobile messaging data this way takes a lot of time and effort. Our work offers a methodology based on forensic triage that reduces large data to manageable sets that are easier to review in detail, then presents the results through interactive visualization that shows important terms, entities and relationships through intelligent ranking using Term Frequency-Inverse Document Frequency (TF-IDF) and a Latent Dirichlet Allocation (LDA) model. By implementing this methodology, investigators can improve investigation processing time and the accuracy of results.
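The TF-IDF ranking named in the abstract above can be illustrated with a small sketch. It uses the basic formulation (term frequency times the log of inverse document frequency) on pre-tokenized messages; the messages are hypothetical, and the paper's actual weighting variant and its LDA step are not reproduced here.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(documents)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        length = len(doc) or 1  # guard against empty messages
        scores.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical tokenized chat messages.
messages = [
    ["meet", "at", "the", "dock"],
    ["bring", "the", "cash"],
    ["the", "dock", "at", "noon"],
]
scores = tf_idf(messages)
for doc_scores in scores:
    ranked = sorted(doc_scores, key=doc_scores.get, reverse=True)
    print(ranked[:2])
```

Terms that occur in every message receive a score of zero, so the ranking surfaces the terms that distinguish one message from the rest, which is what makes it useful for triage.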

Keywords: forensics, triage, visualization, WhatsApp

Procedia PDF Downloads 153
39852 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA) and the CORE Group Polio Partners (CGPP) Secretariat have been working with the Global Alliance for Vaccines and Immunization (GAVI) to improve immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after the above interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative cross-sectional study was conducted on 51 health facilities. The baseline data were collected in May 2019, and the end line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvement was seen in the accuracy of the pentavalent vaccine (PT)1 (p = 0.012) data at the health posts (HP), and of the PT3 (p = 0.010) and Measles (p = 0.020) data at the health centers (HC). Besides, a highly significant improvement was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting for PT3 was found to be < 8% at the HP and < 10% at the HC. Data completeness also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities reported their respective immunization data on time, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for policies and programs targeting the improvement of immunization data quality in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 81
39851 A Proposal to Tackle Security Challenges of Distributed Systems in the Healthcare Sector

Authors: Ang Chia Hong, Julian Khoo Xubin, Burra Venkata Durga Kumar

Abstract:

Distributed systems offer many benefits to the healthcare industry. From big data analysis to business intelligence, the increased computational power and efficiency from distributed systems serve as an invaluable resource in the healthcare sector to utilize. However, as the usage of these distributed systems increases, many issues arise. The main focus of this paper will be on security issues. Many security issues stem from distributed systems in the healthcare industry, particularly information security. The data of people is especially sensitive in the healthcare industry. If important information gets leaked (Eg. IC, credit card number, address, etc.), a person’s identity, financial status, and safety might get compromised. This results in the responsible organization losing a lot of money in compensating these people and even more resources expended trying to fix the fault. Therefore, a framework for a blockchain-based healthcare data management system for healthcare was proposed. In this framework, the usage of a blockchain network is explored to store the encryption key of the patient’s data. As for the actual data, it is encrypted and its encrypted data, called ciphertext, is stored in a cloud storage platform. Furthermore, there are some issues that have to be emphasized and tackled for future improvements, such as a multi-user scheme that could be proposed, authentication issues that have to be tackled or migrating the backend processes into the blockchain network. Due to the nature of blockchain technology, the data will be tamper-proof, and its read-only function can only be accessed by authorized users such as doctors and nurses. This guarantees the confidentiality and immutability of the patient’s data.

Keywords: distributed, healthcare, efficiency, security, blockchain, confidentiality and immutability

Procedia PDF Downloads 165
39850 Investigation of Surface Electromyograph Signal Acquired from the around Shoulder Muscles of Upper Limb Amputees

Authors: Amanpreet Kaur, Ravinder Agarwal, Amod Kumar

Abstract:

Surface electromyography is a technique for measuring muscle activity from the skin surface. Sensors placed on the skin detect the electrical signal generated by active muscles. Much of the research has focused on detecting signals from upper limb amputees using the activity of the triceps and biceps muscles. The purpose of this study was to correlate phantom movement and sEMG activity in the residual stump muscles of transhumeral amputees, using the shoulder muscles. Eight non-amputees and seven right-hand amputees were recruited for this study. sEMG data were collected from the trapezius, pectoralis and teres muscles during elevation, protraction and retraction of the shoulder. The contrast between the muscle actions of amputees and non-amputees was investigated. Subsequently, to investigate class separability for the different shoulder motions, analysis of variance was carried out on the experimentally recorded data. Results were analyzed to recognize different shoulder movements and represent a step towards a surface electromyography controlled system for amputees. Differences in F ratio values (p < 0.05) indicate differences in the means; this analysis therefore helps to identify the independent motions. The identified signals could be used by researchers to design more accurate and efficient controllers for upper-limb amputees.
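The F ratio used above to judge class separability can be computed with a one-way analysis of variance, sketched below. The feature values per motion class are hypothetical, and the sketch returns only the F statistic; obtaining the p-value would additionally require the F distribution.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F ratio for a list of sample groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of samples around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical sEMG feature values for three shoulder motions.
elevation = [1.0, 2.0, 3.0]
protraction = [2.0, 3.0, 4.0]
retraction = [5.0, 6.0, 7.0]
f_ratio = one_way_anova_f([elevation, protraction, retraction])
print(f_ratio)
```

A larger F ratio means the class means differ more relative to the spread within each class, which is the sense in which it indicates separable motions.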

Keywords: around shoulder amputation, surface electromyography, analysis of variance, features

Procedia PDF Downloads 409
39849 Estimation of Energy Losses of Photovoltaic Systems in France Using Real Monitoring Data

Authors: Mohamed Amhal, Jose Sayritupac

Abstract:

Photovoltaic (PV) systems have risen as one of the modern renewable energy sources that are widely used to produce electricity and deliver it to the electrical grid. In parallel, monitoring systems have been deployed as a key element to track energy production and to forecast the total production for the following days. The reliability of PV energy production has become a crucial point in the analysis of PV systems. A deeper understanding of each phenomenon that causes a gain or a loss of energy is needed to better design, operate and maintain PV systems. This work analyzes the current distribution of losses in PV systems, starting from the available solar energy, going through the DC side and AC side, to the delivery point. Most of the phenomena linked to energy losses and gains are considered and modeled, based on real-time monitoring data and datasheets of the PV system components. The order of magnitude of each loss is compared to the current literature and commercial software. To date, the analysis of PV system performance based on a breakdown structure of energy losses and gains is not covered enough in the literature, except in some software where the concept is very common. The novelty of the current analysis is the implementation of software tools for estimating energy losses in PV systems based on several energy loss definitions and estimation techniques. The developed tools have been validated and tested on some PV plants in France that have been operating for years. Among the major findings of the current study: first, PV plants in France show very low rates of soiling and aging. Second, the distribution of other losses is comparable to the literature. Third, all losses reported are correlated to operational and environmental conditions. For future work, an extended analysis of further PV plants in France and abroad will be performed.

Keywords: energy gains, energy losses, losses distribution, monitoring, photovoltaic, photovoltaic systems

Procedia PDF Downloads 152
39848 Analysis of Process Methane Hydrate Formation That Include the Important Role of Deep-Sea Sediments with Analogy in Kerek Formation, Sub-Basin Kendeng, Central Java, Indonesia

Authors: Yan Bachtiar Muslih, Hangga Wijaya, Trio Fani, Putri Agustin

Abstract:

Energy demand in Indonesia increases by 5-6% a year, while conventional energy production decreases by 3-5% a year, which means that in 20-40 years conventional energy will not be able to meet Indonesia's energy demand. One solution is to use an unconventional energy source, gas hydrate. Gas hydrate is gas formed by a biogenic process and is stable under conditions of extreme depth and low temperature. It can form in two settings, polar and deep-sea. This research focuses on gas hydrate associated with methane, which forms methane hydrate in deep-sea conditions, usually at depths between 150 and 2000 m. It examines the process of methane hydrate formation, which is a biogenic process, and the important role of deep-sea sediment in producing accumulations of methane hydrate. Methane hydrate usually accumulates in fine sediment in deep-sea environments under high pressure and low temperature; these conditions also usually cause methane hydrate to form white nodules. The methodology of this research is geological field work and laboratory analysis. The field work will provide 10-15 samples taken at random from Kerek Formation outcrops, to picture the deep-sea environmental conditions that influence methane hydrate formation, as well as measured stratigraphy data from the Kerek Formation outcrops, which will help to picture deep-sea sediment processes such as energy flow and sediment supply. The laboratory analysis covers all data obtained from the field work. The results of this research can be used in methane hydrate exploration in other prospective deep-sea environments in Indonesia.

Keywords: methane hydrate, deep-sea sediment, Kerek Formation, Sub-basin of Kendeng, Central Java, Indonesia

Procedia PDF Downloads 448
39847 Socioeconomic Factors Associated with the Knowledge, Attitude, and Practices of Oil Palm Smallholders toward Ganoderma Disease

Authors: K. Assis, B. Bonaventure, A. Abdul Rahim, H. Affendy, A. Mohammad Amizi

Abstract:

Oil palm smallholders are considered very important producers of oil palm in Malaysia. They fall into two categories: organized smallholders and independent smallholders. In this study, 1000 oil palm smallholders were interviewed using a structured questionnaire. The main objective of the survey is to identify the relationship between the socioeconomic characteristics of smallholders and their knowledge, attitude, and practices toward Ganoderma disease. The study locations include Peninsular Malaysia and Sabah. Three important aspects were studied, namely knowledge of Ganoderma disease, attitude towards the disease, and practices in managing the disease. Cluster analysis, factor analysis, and binary logistic regression were used to analyze the data collected. The findings of the study should provide baseline data that can be used by the relevant agencies to conduct programs or to formulate a suitable development plan to improve the knowledge, attitude and practices of oil palm smallholders in managing Ganoderma disease.

Keywords: attitude, Ganoderma, knowledge, oil palm, practices, smallholders

Procedia PDF Downloads 378
39846 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets in themselves and can now be accepted as a primary intellectual output of research. The quality and usage of datasets depend mainly on the context in which they have been collected, processed, analyzed, validated, and interpreted. This paper presents a collection of program educational objectives mapped to student outcomes, collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection has been produced are described. The collection has been cleansed and preprocessed, some features have been selected, and a preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing the preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.
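Of the measurements listed above, Hamming Loss is the one most specific to multi-label evaluation. A minimal sketch follows, assuming each example's labels are given as a set of label indices; the label sets and the number of labels are hypothetical and unrelated to the ABET collection.

```python
def hamming_loss(true_labels, predicted_labels, n_labels):
    """Fraction of example-label pairs that are predicted incorrectly.

    true_labels and predicted_labels are lists of sets of label indices.
    """
    # The symmetric difference holds labels that are missed or wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(true_labels, predicted_labels))
    return wrong / (len(true_labels) * n_labels)

# Hypothetical label sets for two objectives and four possible outcomes.
truth = [{0, 1}, {2}]
prediction = [{0}, {2, 3}]
loss = hamming_loss(truth, prediction, n_labels=4)
print(loss)
```

Lower values are better, and a value of zero means every label of every example was predicted correctly.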

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 150
39845 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down but also to decrease model prediction accuracy. This study involves a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider when buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-squared) and root mean square error (RMSE).
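The two accuracy measures named in the abstract can be sketched as follows. This is a minimal illustration only, not the author's code: the function names and the `split_by_ratio` helper are hypothetical, and the abstract does not state which five partitioning ratios were used.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def split_by_ratio(rows, train_fraction):
    """Hypothetical helper: split rows into train and test parts.

    train_fraction is an assumed parameter (e.g. 0.5 keeps the first half
    for training); rows are assumed to be shuffled already.
    """
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

Under these assumptions, a model would be fitted on the first part returned by `split_by_ratio` and both measures computed on the second part, once per ratio.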

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

Procedia PDF Downloads 303
39844 An Assessment of Different Blade Tip Timing (BTT) Algorithms Using an Experimentally Validated Finite Element Model Simulator

Authors: Mohamed Mohamed, Philip Bonello, Peter Russhard

Abstract:

Blade Tip Timing (BTT) is a technology concerned with the estimation of both frequency and amplitude of rotating blades. A BTT system comprises two main parts: (a) the arrival time measurement system, and (b) the analysis algorithms. Simulators play an important role in the development of the analysis algorithms since they generate blade tip displacement data from the simulated blade vibration under controlled conditions. This enables an assessment of the performance of the different algorithms with respect to their ability to accurately reproduce the original simulated vibration. Such an assessment is usually not possible with real engine data since there is no practical alternative to BTT for blade vibration measurement. Most simulators used in the literature are based on a simple spring-mass-damper model to determine the vibration. In this work, a more realistic experimentally validated simulator based on the Finite Element (FE) model of a bladed disc (blisk) is first presented. It is then used to generate the necessary data for the assessment of different BTT algorithms. The FE modelling is validated using both a hammer test and two firewire cameras for the mode shapes. A number of autoregressive methods, fitting methods and state-of-the-art inverse methods (i.e. Russhard) are compared. All methods are compared with respect to both synchronous and asynchronous excitations with both single and simultaneous frequencies. The study assesses the applicability of each method for different conditions of vibration, amount of sampling data, and testing facilities, according to its performance and efficiency under these conditions.

Keywords: blade tip timing, blisk, finite element, vibration measurement

Procedia PDF Downloads 292
39843 Decoding the Natural Hazards: The Data Paradox, Juggling Data Flows, Transparency and Secrets, Analysis of Khuzestan and Lorestan Floods of Iran

Authors: Kiyanoush Ghalavand

Abstract:

We face a complex paradox in the agriculture and environment sectors in the age of technology. On the one hand, the achievements of the science and information ages are shaping a future that is more dangerous than in past decades. On the other hand, the progress of the past decades is historic, connecting people, empowering individuals, groups, and states, and lifting a thousand people out of land and poverty in the process. Floods are the most frequent, damaging, and recurring of all natural hazards in Iran. Additionally, floods have been morphing into new and even more devastating forms in recent years. Khuzestan and Lorestan Provinces experienced heavy rains that began on March 28, 2019, and led to unprecedented widespread flooding and landslides across the provinces. The study was based on both secondary and primary data. For the present study, a questionnaire-based primary survey was conducted. Data were collected using a specially designed questionnaire and other instruments, such as focus groups, interview schedules, inception workshops, and roundtable discussions with stakeholders at different levels. Farmers in Khuzestan and Lorestan provinces were the statistical population for this study. Data were analyzed with several software packages, such as ATLAS.ti, NVivo, SPSS for Windows, and EViews. According to a factor analysis conducted for the present study, the factors were categorized into 10 groups: climatic, economic, cultural, supportive, instructive, planning, military, policymaking, geographical, and human factors. Together they accounted for 71.6 percent of the explanatory factors of flood management obstacles in the agricultural sector in Lorestan and Khuzestan provinces. Several recommendations were finally made based on the study findings.

Keywords: chaos theory, natural hazards, risks, environmental risks, paradox

Procedia PDF Downloads 124
39842 Where do Pregnant Women Miss Out on Nutrition? Analysis of Survey Data from 22 Countries

Authors: Alexis D'Agostino, Celeste Sununtunasuk, Jack Fiedler

Abstract:

Background: Iron-folic acid (IFA) supplementation during antenatal care (ANC) has existed in many countries for decades. Despite this, low national coverage persists and women do not often consume appropriate amounts during pregnancy. USAID’s SPRING Project investigated pregnant women’s access to, and consumption of, IFA tablets through ANC. Cross-country analysis provided a global picture of the state of IFA-supplementation, while country-specific results noted key contextual issues, including geography, wealth, and ANC attendance. The analysis can help countries prioritize strategies for systematic performance improvements within one of the most common micronutrient supplementation programs aimed at reducing maternal anemia. Methodology: Using falter point analysis on Demographic and Health Survey (DHS) data collected from 162,958 women across 22 countries, SPRING identified four sequential falter points (ANC attendance, IFA receipt or purchase, IFA consumption, and number of tablets taken) where pregnant women fell out of the IFA distribution structure. SPRING analyzed data on IFA intake from DHS surveys with women of reproductive age. SPRING disaggregated these data by ANC participation during the most recent pregnancy, residency, and women’s socio-economic status. Results: Average sufficient IFA tablet use across all countries was only eight percent. Even in the best performing countries, only about one-third of pregnant women consumed 180 or more IFA tablets during their most recent pregnancy. ANC attendance was an important falter point for a quarter of women across all countries (with highest falter rates in Democratic Republic of the Congo, Nigeria, and Niger). 
Further analysis reveals patterns, with some countries having high ANC coverage but low IFA provision during ANC (DRC and Haiti), others having high ANC coverage and IFA provision but few women taking any tablets (Nigeria and Liberia), and countries that perform well in ANC, supplies, and initial consumption but where very few women consume the recommended 180 tablets (Malawi and Cambodia). Country-level analysis identifies further patterns of supplementation. In Indonesia, for example, only 62% of women in the poorest quintile took even one IFA tablet, while 86% of the wealthiest women did. This association between socioeconomic status and IFA intake held across nearly all countries where these data are available and was also visible in rural/urban comparisons. Analysis of ANC attendance data also suggests that higher numbers of ANC visits are associated with higher tablet intake. Conclusions: While it is difficult to disentangle which specific aspects of supply or demand cause the low rates of consumption, this tool allows policy-makers to identify major bottlenecks to scaling-up IFA supplementation during ANC. In turn, each falter point provides possible explanations of program performance and helps strategically identify areas for improved IFA supplementation. For example, improving the delivery of IFA supplementation in Ethiopia relies on increasing access to ANC, but also on identifying and addressing program gaps in IFA supply management and health workers’ practices in order to provide quality ANC services. While every country requires a customized approach to improving IFA supplementation, the multi-country analysis conducted by SPRING is a helpful first step in identifying country bottlenecks and prioritizing interventions.

Keywords: iron and folic acid, supplementation, antenatal care, micronutrient

Procedia PDF Downloads 375
39841 A Web and Cloud-Based Measurement System Analysis Tool for the Automotive Industry

Authors: C. A. Barros, Ana P. Barroso

Abstract:

Any industrial company needs to determine the amount of variation that exists within its measurement process and guarantee the reliability of its data by studying the performance of its measurement system in terms of linearity, bias, repeatability and reproducibility, and stability. This issue is critical for automotive industry suppliers, who are required to be certified to the IATF 16949:2016 standard (which replaces ISO/TS 16949) of the International Automotive Task Force, which defines the requirements of a quality management system for companies in the automotive industry. Measurement System Analysis (MSA) is one of the mandatory tools. Frequently, the measurement system in companies is not connected to the equipment and does not incorporate the methods proposed by the Automotive Industry Action Group (AIAG). To address these constraints, an R&D project is in progress, whose objective is to develop a web and cloud-based MSA tool. This MSA tool incorporates Industry 4.0 concepts, such as Internet of Things (IoT) protocols to assure the connection with the measuring equipment, cloud computing, artificial intelligence, statistical tools, and advanced mathematical algorithms. This paper presents the preliminary findings of the project. The web and cloud-based MSA tool is innovative because it implements all statistical tests proposed in the MSA-4 reference manual from AIAG as well as other emerging methods and techniques. As it is integrated with the measuring devices, it reduces the manual input of data and therefore the errors. The tool ensures traceability of all performed tests and can be used in quality laboratories and on production lines. In addition, it monitors MSAs over time, allowing both the analysis of deviations from the variation of the measurements performed and the management of measurement equipment and calibrations. To develop the MSA tool, a ten-step approach was implemented. 
First, a benchmarking analysis of current competitors and commercial solutions linked to MSA was performed with respect to the Industry 4.0 paradigm. Next, the size of the target market for the MSA tool was analysed. Afterwards, data flow and traceability requirements were analysed in order to implement an IoT data network that interconnects with the equipment, preferably wirelessly. The MSA web solution was designed under UI/UX principles, and an API in Python was developed to run the algorithms and the statistical analysis. Continuous validation of the tool by companies is being performed to assure real-time management of the ‘big data’. The main results of this R&D project are: a web and cloud-based MSA tool; a Python API; algorithms new to the market; and a UI/UX style guide for the tool. The proposed MSA tool adds value to the state of the art as it ensures an effective response to the new challenges of measurement systems, which are increasingly critical in production processes. Although the automotive industry has triggered the development of this innovative MSA tool, other industries would also benefit from it. Currently, companies from the molds and plastics, chemical, and food industries are already validating it.

Keywords: automotive industry, Industry 4.0, Internet of Things, IATF 16949:2016, measurement system analysis

Procedia PDF Downloads 200
39840 Correlation Analysis to Quantify Learning Outcomes for Different Teaching Pedagogies

Authors: Kanika Sood, Sijie Shang

Abstract:

A fundamental goal of education includes preparing students to become a part of the global workforce by making beneficial contributions to society. In this paper, we analyze student performance for multiple courses that involve different teaching pedagogies: a cooperative learning technique and an inquiry-based learning strategy. Student performance includes student engagement, grades, and attendance records. We perform this study in the Computer Science department for online and in-person courses for 450 students. We perform correlation analysis to study the relationship between student scores and other parameters such as gender and mode of learning. We use natural language processing and machine learning to analyze student feedback data and performance data. We assess the learning outcomes of two teaching pedagogies for undergraduate and graduate courses to showcase the impact of pedagogical adoption and learning outcome as determinants of academic achievement. Early findings suggest that when using the specified pedagogies, students become experts on their topics and show enhanced engagement with peers.

Keywords: bag-of-words, cooperative learning, education, inquiry-based learning, in-person learning, natural language processing, online learning, sentiment analysis, teaching pedagogy

Procedia PDF Downloads 60
39839 Mixture Statistical Modeling for Predicting Mortality in Human Immunodeficiency Virus (HIV) and Tuberculosis (TB) Infection Patients

Authors: Mohd Asrul Affendi Bi Abdullah, Nyi Nyi Naing

Abstract:

The purpose of this study was to compare the negative binomial death rate (NBDR) model and the zero-inflated negative binomial death rate (ZINBDR) model for deaths among (HIV+TB+) and (HIV+TB−) patients. HIV and TB are a serious worldwide problem in developing countries. Data were analyzed by applying NBDR and ZINBDR to determine which model is better to use. The ZINBDR model is able to account for the disproportionately large number of zeros within the data and is shown to be a consistently better fit than the NBDR model. Hence, the ZINBDR model is a superior fit to the data than the NBDR model and provides additional information regarding the mechanisms of death in HIV+TB patients. The ZINBDR model is shown to be a useful tool for analyzing death rates by age category.
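How zero inflation handles an excess of zeros can be illustrated with the probability mass functions alone. The sketch below is an assumption-laden illustration, not the authors' death rate regression models: it uses the standard negative binomial parameterization (size `r`, success probability `p`) and a hypothetical mixing weight `pi_zero` for the extra zeros.

```python
import math


def nb_pmf(y, r, p):
    """Negative binomial pmf: P(Y = y) for y = 0, 1, 2, ...

    r > 0 is the size (dispersion) parameter and 0 < p < 1 the success
    probability; the log-gamma form allows non-integer r.
    """
    log_coeff = math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
    return math.exp(log_coeff + r * math.log(p) + y * math.log(1.0 - p))


def zinb_pmf(y, r, p, pi_zero):
    """Zero-inflated negative binomial pmf.

    With probability pi_zero the count is a structural zero; otherwise it
    is drawn from the negative binomial distribution above.
    """
    base = (1.0 - pi_zero) * nb_pmf(y, r, p)
    return pi_zero + base if y == 0 else base
```

With `pi_zero = 0` the two functions coincide, so the negative binomial model is the special case without structural zeros.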

Keywords: zero inflated negative binomial death rate, HIV and TB, AIC and BIC, death rate

Procedia PDF Downloads 410
39838 Mediation Analysis of the Efficacy of the Nimotuzumab-Cisplatin-Radiation (NCR) Improve Overall Survival (OS): A HPV Negative Oropharyngeal Cancer Patient (HPVNOCP) Cohort

Authors: Akshay Patil

Abstract:

Objective: Mediation analysis identifies causal pathways by testing the relationships between nimotuzumab-cisplatin-radiation (NCR), overall survival (OS), and an intermediate variable that mediates the relationship between NCR and OS. Introduction: In randomized controlled trials, the primary interest is in the mechanisms by which an intervention exerts its effects on the outcomes. Clinicians are often interested in how the intervention works (or why it does not work) through hypothesized causal mechanisms. In this work, we highlight the value of understanding causal mechanisms in randomized trials by applying causal mediation analysis to a randomized trial in oncology. Methods: Data were obtained from a phase III randomized trial (subgroup of HPVNOCP). NCR is reported to significantly improve the OS of locally advanced head and neck cancer patients undergoing definitive chemoradiation. Here, based on trial data, the effect of NCR on patient overall survival mediated through progression-free survival (PFS), disease-free survival (DFS), loco-regional failure (LRF), disease control rate (DCR), and overall response rate (ORR) was systematically quantified. Effects of potential mediators on the HR for OS with NCR versus cisplatin-radiation (CR) were analyzed by Cox regression models. Statistical analyses were performed using R software Version 3.6.3 (The R Foundation for Statistical Computing). Results: PFS mediated the association between NCR treatment and OS, with an indirect effect (IE) of 0.76 (0.62 – 0.95), accounting for 60.69% of the treatment effect. Taking into account baseline confounders, the overall adjusted hazard ratio of death was 0.64 (95% CI: 0.43 – 0.96; P=0.03). DFS was also a significant mediator, with an IE of 0.77 (95% CI: 0.62 – 0.93; 58% mediated). Smaller mediation effects (maximum 27%) were observed for LRF, with an IE of 0.88 (0.74 – 1.06). DCR and ORR mediated 10% and 15%, respectively, of the effect of NCR vs. CR on OS, with IEs of 0.65 (95% CI: 0.81 – 1.08) and 0.94 (95% CI: 0.79 – 1.04). Conclusion: Our findings suggest that PFS and DFS were the most important mediators of the effect on OS of adding nimotuzumab to weekly cisplatin-radiation in HPVNOCP.

Keywords: mediation analysis, cancer data, survival, NCR, HPV negative oropharyngeal

Procedia PDF Downloads 125
39837 Seismic Interpretation and Petrophysical Evaluation of SM Field, Libya

Authors: Abdalla Abdelnabi, Yousf Abushalah

Abstract:

The G Formation is a major gas-producing reservoir in the SM Field, eastern Libya. It is called G limestone because it consists of shallow marine limestone. Well data and 3D seismic data, in conjunction with the results of a previous study, were used to delineate the hydrocarbon reservoir of the Middle Eocene G Formation in the SM Field area. The data include three-dimensional seismic data acquired in 2009. They cover an area of approximately 75 mi², with more than 9 wells penetrating the reservoir. Seismic data are used to identify stratigraphic and structural features, such as channels and faults, which may play a significant role in hydrocarbon traps. The well data are used for the petrophysical analysis of the SM Field. The average porosity of the Middle Eocene G Formation is very good, with porosity reaching 24%, especially around well W 6. Average water saturation was calculated for each well from porosity and resistivity logs using Archie’s formula. The average water saturation for the whole well is 25%. Structural mapping of the top and bottom of the Middle Eocene G Formation revealed that the highest area in the SM Field is at 4800 ft subsea around wells W4, W5, W6, and W7 and that the deepest point is at 4950 ft subsea. Correlation between wells using well data and structural maps created from seismic data revealed that the net thickness of the G Formation ranges from 0 ft in the northern part of the field to 235 ft in the southwestern and southern parts of the field. The gas-water contact is found at 4860 ft using the resistivity log. The net isopach map, with both the trapezoidal and pyramid rules, is used to calculate the total bulk volume. The original gas in place and the recoverable gas were calculated volumetrically to be 890 Billion Standard Cubic Feet (BSCF) and 630 BSCF, respectively.
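The calculations named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the Archie constants `a = 1`, `m = 2`, `n = 2` are common textbook defaults rather than values reported by the authors, and all function and parameter names are hypothetical.

```python
import math


def archie_water_saturation(porosity, rw, rt, a=1.0, m=2.0, n=2.0):
    """Archie's formula: Sw = ((a * Rw) / (phi**m * Rt)) ** (1 / n).

    porosity is a fraction, rw the formation water resistivity and rt the
    true formation resistivity (same units). a, m, n are assumed defaults.
    """
    return ((a * rw) / (porosity ** m * rt)) ** (1.0 / n)


def trapezoidal_volume(area_lower, area_upper, interval):
    """Bulk volume between two isopach contours by the trapezoidal rule."""
    return interval / 2.0 * (area_lower + area_upper)


def pyramidal_volume(area_lower, area_upper, interval):
    """Bulk volume between two isopach contours by the pyramid rule."""
    return interval / 3.0 * (
        area_lower + area_upper + math.sqrt(area_lower * area_upper)
    )
```

Summing the per-interval volumes over all contour pairs of the net isopach map gives the total bulk volume; the two rules agree when adjacent contour areas are equal and diverge as the upper area shrinks toward zero.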

Keywords: 3D seismic data, well logging, petrel, kingdom suite

Procedia PDF Downloads 135
39836 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations base their operations and decision making on the data at their disposal, so the quality of these data is remarkably important. Data Quality (DQ) is currently a relevant issue, and the literature is unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, an organizational culture focused on data quality, and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 201