Search results for: data standardization
22967 A Deep Learning Approach for the Predictive Quality of Directional Valves in the Hydraulic Final Test
Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter
Abstract:
The increasing use of deep learning applications in production is becoming a competitive advantage. Predictive quality enables the assurance of product quality by using data-driven forecasts via machine learning models as a basis for decisions on test results. The use of real Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the leakage of directional valves.
Keywords: artificial neural networks, classification, hydraulics, predictive quality, deep learning
Procedia PDF Downloads 243
22966 A Study of Various Ontology Learning Systems from Text and a Look into Future
Authors: Fatima Al-Aswadi, Chan Yong
Abstract:
With the large volume of unstructured data on the web increasing day by day, the motivation to represent the knowledge in this data in machine-processable form has grown. Ontology is one of the major cornerstones of representing information in a more meaningful way on the Semantic Web. The goal of ontology learning from text is to elicit and represent domain knowledge in machine-readable form. This paper aims to give a follow-up review of ontology learning systems from text and some of their defects. Furthermore, it discusses how far the ontology learning process may improve in the future.
Keywords: concept discovery, deep learning, ontology learning, semantic relation, semantic web
Procedia PDF Downloads 521
22965 Stature Prediction from Anthropometry of Extremities among Jordanians
Authors: Amal A. Mashali, Omar Eltaweel, Elerian Ekladious
Abstract:
The stature of an individual has an important role in identification, which is often required in medico-legal practice. The estimation of stature is an important step in the identification of dismembered remains or when only part of a skeleton is available, as in major disasters or with mutilation. There are no published anthropological data on the Jordanian population. The present study was designed to find the relationship of stature to some anthropometric measures among a sample of the Jordanian population and to determine the most accurate and reliable one for predicting the stature of an individual. A cross-sectional study was conducted on 336 healthy adult volunteers, free of bone diseases, nutritional diseases and abnormalities in the extremities, after taking their consent. Students of the Faculty of Medicine, Mutah University helped in collecting the data. The anthropometric measurements (anatomically defined) were stature, humerus length, hand length and breadth, foot length and breadth, foot index and knee height on both the right and left sides of the body. The measurements were typical on both sides of the bodies of the studied samples. All the anthropologic data showed a significant relation with age except the knee height. There was a significant difference between male and female measurements except for the foot index, where F = 0.269. There was a significant positive correlation between the different measures and the stature of the individuals. Three equations were developed for the estimation of stature. The most sensitive measure for the prediction of stature was found to be the humerus length.
Keywords: foot index, foot length, hand length, humerus length, stature
Procedia PDF Downloads 305
22964 Internalizing and Externalizing Problems as Predictors of Student Wellbeing
Authors: Nai-Jiin Yang, Tyler Renshaw
Abstract:
Prior research has suggested that youth internalizing and externalizing problems significantly correlate with student subjective wellbeing (SSW) and achievement problems (SAP). Yet, only a few studies have used data from a mental health screener based on the dual-factor model to explore the empirical relationships among internalizing problems, externalizing problems, academic problems, and student wellbeing. This study was conducted through a secondary analysis of data previously collected in school-wide mental health screening activities across secondary schools within a suburban school district in the western United States. The data set included 1880 student responses from a total of two schools. Findings suggest that both internalizing and externalizing problems are substantial predictors of both student wellbeing and academic problems. However, compared to internalizing problems, externalizing problems were a much stronger predictor of academic problems. Moreover, this study did not support the hypothesis that academic problems moderate the relationship between SSW and youth internalizing problems (YIP) or between youth externalizing problems (YEP) and SSW. Lastly, SAP is a stronger predictor of SSW than YIP and YEP.
Keywords: academic problems, externalizing problems, internalizing problems, school mental health, student wellbeing, universal mental health screening
Procedia PDF Downloads 84
22963 A Generative Adversarial Framework for Bounding Confounded Causal Effects
Authors: Yaowei Hu, Yongkai Wu, Lu Zhang, Xintao Wu
Abstract:
Causal inference from observational data is receiving wide applications in many fields. However, unidentifiable situations, where causal effects cannot be uniquely computed from observational data, pose critical barriers to applying causal inference to complicated real applications. In this paper, we develop a bounding method for estimating the average causal effect (ACE) under unidentifiable situations due to hidden confounders. We propose to parameterize the unknown exogenous random variables and structural equations of a causal model using neural networks and implicit generative models. Then, with an adversarial learning framework, we search the parameter space to explicitly traverse causal models that agree with the given observational distribution and find those that minimize or maximize the ACE to obtain its lower and upper bounds. The proposed method does not make any assumption about the data generating process and the type of the variables. Experiments using both synthetic and real-world datasets show the effectiveness of the method.
Keywords: average causal effect, hidden confounding, bound estimation, generative adversarial learning
Procedia PDF Downloads 191
22962 Measurement of Operational and Environmental Performance of the Coal-Fired Power Plants in India by Using Data Envelopment Analysis
Authors: Vijay Kumar Bajpai, Sudhir Kumar Singh
Abstract:
In this study, the performance analyses of twenty-five coal-fired power plants (CFPPs) used for electricity generation are carried out through various data envelopment analysis (DEA) models. Three efficiency indices are defined and pursued. In the calculation of operational performance, energy and non-energy variables are used as inputs, and net electricity produced is used as the desired output. CO2 emitted to the environment is used as the undesired output in the computation of pure environmental performance, while in Model-3 CO2 emissions are considered as a detrimental input in the calculation of operational and environmental performance. Empirical results show that most of the plants are operating in the increasing returns to scale region and that the Mettur plant is efficient with regard to energy use and the environment. The results also indicate that the undesirable output effect is insignificant in the research sample. The present study will provide clues to plant operators towards raising the operational and environmental performance of CFPPs.
Keywords: coal fired power plants, environmental performance, data envelopment analysis, operational performance
Procedia PDF Downloads 455
22961 Estimation of Maize Yield by Using a Process-Based Model and Remote Sensing Data in the Northeast China Plain
Authors: Jia Zhang, Fengmei Yao, Yanjing Tan
Abstract:
The accurate estimation of crop yield is of great importance for food security. In this study, a process-based mechanism model was modified to estimate the yield of a C4 crop by modifying the carbon metabolic pathway in the photosynthesis sub-module of the RS-P-YEC (Remote-Sensing-Photosynthesis-Yield estimation for Crops) model. The yield was calculated by multiplying net primary productivity (NPP) by the harvest index (HI) derived from the ratio of grain to stalk yield. The modified RS-P-YEC model was used to simulate maize yield in the Northeast China Plain during the period 2002-2011. Statistical data on maize yield from the study area were used to validate the simulated results at county level. The results showed that the Pearson correlation coefficient (R) between the simulated yield and the statistical data was 0.827 (P < 0.01), and the root mean square error (RMSE) was 712 kg/ha with a relative error (RE) of 9.3%. From 2002 to 2011, the yield of the maize planting zone in the Northeast China Plain increased, with a smaller coefficient of variation (CV). The spatial pattern of simulated maize yield was consistent with the actual distribution in the Northeast China Plain, with an increasing trend from the northeast to the southwest. Hence, the results demonstrated that the modified process-based model coupled with remote sensing data was suitable for yield prediction of maize in the Northeast China Plain at the spatial scale.
Keywords: process-based model, C4 crop, maize yield, remote sensing, Northeast China Plain
Procedia PDF Downloads 375
22960 Application of Artificial Intelligence to Schedule Operability of Waterfront Facilities in Macro Tide Dominated Wide Estuarine Harbour
Authors: A. Basu, A. A. Purohit, M. M. Vaidya, M. D. Kudale
Abstract:
With Mumbai traditionally the epicenter of India's trade and commerce, the existing major ports, such as Mumbai and Jawaharlal Nehru (JN) Ports situated in the Thane estuary, are also developing their waterfront facilities. Various developments over the decades in this region have changed the tidal flux entering and leaving the estuary. The intake at Pir-Pau faces a shortage of water in view of the advancement of the shoreline, while the jetty near Ulwe faces a ship scheduling problem due to shallower depths between JN Port and Ulwe Bunder. To solve these problems, information about tide levels over a long duration is needed from field measurements. However, because field measurement is tedious and costly, artificial intelligence was applied to predict water levels by training the network on the measured tide data for one lunar tidal cycle. A two-layered feed-forward Artificial Neural Network (ANN) with back-propagation training algorithms, namely Gradient Descent (GD) and Levenberg-Marquardt (LM), was used to predict the yearly tide levels at waterfront structures at Ulwe Bunder and Pir-Pau. The tide data collected at Apollo Bunder, Ulwe, and Vashi for a lunar tidal cycle (2013) was used to train, validate and test the neural networks. These trained networks, which had high correlation coefficients (R = 0.998), were used to predict the tide at Ulwe and Vashi for verification against the measured tide for the years 2000 and 2013. The results indicate that the tide levels predicted by ANN give a reasonably accurate estimation of the tide. Hence, the trained network is used to predict the yearly tide data (2015) for Ulwe. Subsequently, the yearly tide data (2015) at Pir-Pau was predicted by using the neural network trained on the measured tide data (2000) of Apollo and Pir-Pau.
The analysis of the measured data and the study reveal the following. The measured tidal data at Pir-Pau, Vashi and Ulwe indicate a maximum amplification of the tide of about 10-20 cm, with a phase lag of 10-20 minutes, with reference to the tide at Apollo Bunder (Mumbai). The LM training algorithm is faster than GD, and the performance of the network increases with the number of neurons in the hidden layer. The tide levels predicted by ANN at Pir-Pau and Ulwe provide valuable information about the occurrence of high and low water levels to plan the operation of pumping at Pir-Pau and improve the ship schedule at Ulwe.
Keywords: artificial neural network, back-propagation, tide data, training algorithm
Procedia PDF Downloads 483
22959 Algorithm Development of Individual Lumped Parameter Modelling for Blood Circulatory System: An Optimization Study
Authors: Bao Li, Aike Qiao, Gaoyang Li, Youjun Liu
Abstract:
Background: The lumped parameter model (LPM) is a common numerical model for hemodynamic calculation. LPM uses circuit elements to simulate the human blood circulatory system. Physiological indicators and characteristics can be acquired through the model. However, because physiological indicators differ between individuals, the parameters in LPM should be personalized to obtain convincing calculated results that reflect individual physiological information. This study aimed to develop an automatic and effective optimization method to personalize the parameters in the LPM of the blood circulatory system, which is of great significance to the numerical simulation of individual hemodynamics. Methods: A closed-loop LPM of the human blood circulatory system that is applicable to most persons was established based on anatomical structures and physiological parameters. The patient-specific physiological data of 5 volunteers were non-invasively collected as personalized objectives of the individual LPM. In this study, the blood pressure and flow rate of the heart, brain, and limbs were the main concerns. The collected systolic blood pressure, diastolic blood pressure, cardiac output, and heart rate were set as objective data, and the waveforms of carotid artery flow and ankle pressure were set as objective waveforms. For the collected data and waveforms, a sensitivity analysis of each parameter in LPM was conducted to determine the sensitive parameters that have an obvious influence on the objectives. Simulated annealing was adopted to iteratively optimize the sensitive parameters, and the objective function during optimization was the root mean square error between the collected waveforms and data and the simulated waveforms and data. Each parameter in LPM was optimized 500 times. Results: In this study, the sensitive parameters in LPM were optimized according to the collected data of 5 individuals. Results show a slight error between collected and simulated data. The average relative root mean square errors of all optimization objectives for the 5 samples were 2.21%, 3.59%, 4.75%, 4.24%, and 3.56%, respectively. Conclusions: The slight error demonstrated good optimization performance. The individual modeling algorithm developed in this study can effectively achieve the individualization of LPM for the blood circulatory system. LPM with individual parameters can output the individual physiological indicators after optimization, which are applicable to the numerical simulation of patient-specific hemodynamics.
Keywords: blood circulatory system, individual physiological indicators, lumped parameter model, optimization algorithm
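The optimization loop described in this abstract (simulated annealing driven by a root mean square error objective) can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_model` is a hypothetical two-parameter waveform standing in for the LPM, and the cooling schedule, step size, and iteration count are illustrative choices, not the authors' settings.

```python
import math
import random

def rmse(a, b):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def toy_model(params, times):
    # Hypothetical stand-in for the LPM: offset + amplitude * one sine period.
    offset, amplitude = params
    return [offset + amplitude * math.sin(2 * math.pi * t) for t in times]

def anneal(target, times, start, iterations=500, t0=1.0, cooling=0.99,
           step=0.5, seed=0):
    """Minimize RMSE(model(params), target) with simulated annealing."""
    rng = random.Random(seed)
    current = list(start)
    current_err = rmse(toy_model(current, times), target)
    best, best_err = list(current), current_err
    temperature = t0
    for _ in range(iterations):
        # Propose a random perturbation of every parameter.
        candidate = [p + rng.gauss(0.0, step) for p in current]
        cand_err = rmse(toy_model(candidate, times), target)
        delta = cand_err - current_err
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
        temperature *= cooling
    return best, best_err

times = [i / 50 for i in range(50)]
target = toy_model([80.0, 20.0], times)   # synthetic "measured" waveform
start = [70.0, 10.0]                      # deliberately wrong initial guess
best, best_err = anneal(target, times, start)
print(best, best_err)
```

Tracking the best parameter set separately from the current one means the returned error can never be worse than that of the starting guess, even though the search itself sometimes accepts worse moves.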
Procedia PDF Downloads 137
22958 Estimating Water Balance at Beterou Watershed, Benin Using Soil and Water Assessment Tool (SWAT) Model
Authors: Ella Sèdé Maforikan
Abstract:
Sustained water management requires quantitative information and knowledge of the spatiotemporal dynamics of the hydrological system within the basin. This can be achieved through research. Several studies have investigated both surface water and groundwater in the Beterou catchment. However, there are few published papers on the application of SWAT modeling in the Beterou catchment. The objective of this study was to evaluate the performance of SWAT in simulating the water balance within the watershed. The input data consist of a digital elevation model, land use maps, a soil map, climatic data and discharge records. The model was calibrated and validated using the Sequential Uncertainty Fitting (SUFI2) approach. Calibration ran from 1989 to 2006 with a four-year warm-up period (1985-1988), and validation ran from 2007 to 2020. The goodness of the model was assessed using five indices, i.e., Nash–Sutcliffe efficiency (NSE), the ratio of the root mean square error to the standard deviation of measured data (RSR), percent bias (PBIAS), the coefficient of determination (R²), and Kling-Gupta efficiency (KGE). Results showed that the SWAT model successfully simulated river flow in the Beterou catchment, with NSE = 0.79, R² = 0.80 and KGE = 0.83 for calibration, against NSE = 0.78, R² = 0.78 and KGE = 0.85 for validation, using site-based streamflow data. The relative error (PBIAS) ranges from -12.2% to 3.1%. The runoff curve number (CN2), moist bulk density (SOL_BD), base flow alpha factor (ALPHA_BF), and the available water capacity of the soil layer (SOL_AWC) were the most sensitive parameters. The study suggests further research with uncertainty analysis and provides recommendations for model improvement and for efficient means of improving rainfall and discharge measurement data.
Keywords: watershed, water balance, SWAT modeling, Beterou
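For readers unfamiliar with the goodness-of-fit indices named in this abstract, the following sketch computes NSE, RSR, PBIAS, and KGE from paired observed and simulated series using their commonly cited definitions. The sign convention for PBIAS (observed minus simulated, so positive values mean underestimation) and the flow values are assumptions made here for illustration; the abstract does not state which convention or data the authors used.

```python
import math

def _mean(xs):
    return sum(xs) / len(xs)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    m = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den

def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    m = _mean(obs)
    num = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)))
    den = math.sqrt(sum((o - m) ** 2 for o in obs))
    return num / den

def pbias(obs, sim):
    """Percent bias; assumed convention: positive means the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio and bias ratio."""
    mo, ms = _mean(obs), _mean(sim)
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / len(obs))
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / len(sim))
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (so * ss)
    return 1.0 - math.sqrt((r - 1) ** 2 + (ss / so - 1) ** 2 + (ms / mo - 1) ** 2)

# Hypothetical monthly discharges (m^3/s), for illustration only.
observed = [12.0, 30.0, 85.0, 140.0, 95.0, 40.0]
simulated = [15.0, 28.0, 80.0, 150.0, 90.0, 35.0]
print(nse(observed, simulated), rsr(observed, simulated),
      pbias(observed, simulated), kge(observed, simulated))
```

Note that `kge` divides by the standard deviation of the simulated series, so it is undefined for a constant simulation.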
Procedia PDF Downloads 55
22957 BER Estimate of WCDMA Systems with MATLAB Simulation Model
Authors: Suyeb Ahmed Khan, Mahmood Mian
Abstract:
Simulation plays an important role during all phases of the design and engineering of communications systems, from the early stages of conceptual design through the various stages of implementation, testing, and fielding of the system. In the present paper, a simulation model has been constructed for the WCDMA system in order to evaluate its performance. This model describes multi-user effects and the calculation of BER (Bit Error Rate) in 3G mobile systems using Simulink in MATLAB 7.1. A Gaussian approximation defines the multi-user effect on system performance. BER has been analyzed by comparing the transmitted data with the received data.
Keywords: WCDMA, simulations, BER, MATLAB
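The final step mentioned in this abstract, estimating BER by comparing transmitted and received data, reduces to counting mismatched bits. The sketch below shows that counting step only; it is written in Python rather than Simulink, the bit sequences are made up, and it does not model the WCDMA channel or the Gaussian approximation.

```python
def bit_error_rate(transmitted, received):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(transmitted) != len(received):
        raise ValueError("bit sequences must have the same length")
    if not transmitted:
        raise ValueError("bit sequences must not be empty")
    errors = sum(1 for t, r in zip(transmitted, received) if t != r)
    return errors / len(transmitted)

# Hypothetical bit streams; positions 2 and 5 were flipped in transit.
tx = [1, 0, 1, 1, 0, 0, 1, 0]
rx = [1, 0, 0, 1, 0, 1, 1, 0]
print(bit_error_rate(tx, rx))  # 2 of 8 bits differ, so this prints 0.25
```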
Procedia PDF Downloads 592
22956 Uncertainty Quantification of Corrosion Anomaly Length of Oil and Gas Steel Pipelines Based on Inline Inspection and Field Data
Authors: Tammeen Siraj, Wenxing Zhou, Terry Huang, Mohammad Al-Amin
Abstract:
The high-resolution inline inspection (ILI) tool is used extensively in the pipeline industry to identify, locate, and measure metal-loss corrosion anomalies on buried oil and gas steel pipelines. Corrosion anomalies may occur singly (i.e. individual anomalies) or as clusters (i.e. a colony of corrosion anomalies). Although ILI technology has advanced immensely, there are measurement errors associated with the sizes of corrosion anomalies reported by ILI tools due to limitations of the tools and associated sizing algorithms, and the detection threshold of the tools (i.e. the minimum detectable feature dimension). Quantifying the measurement error in the ILI data is crucial for corrosion management and for developing maintenance strategies that satisfy safety and economic constraints. The measurement error associated with the length of corrosion anomalies (in the longitudinal direction of the pipeline) has scarcely been reported in the literature and is investigated in the present study. Limitations in the ILI tool and clustering process can sometimes cause clustering error, which is defined as the error introduced during the clustering process by including or excluding a single anomaly or a group of anomalies in or from a cluster. Clustering error has been found to be one of the biggest contributory factors to the relatively high uncertainties associated with ILI-reported anomaly length. As such, this study focuses on developing a consistent and comprehensive framework to quantify the measurement errors in the ILI-reported anomaly length by comparing the ILI data and corresponding field measurements for individual and clustered corrosion anomalies. The analysis carried out in this study is based on the ILI and field measurement data for a set of anomalies collected from two segments of a buried natural gas pipeline currently in service in Alberta, Canada.
Data analyses showed that the measurement error associated with the ILI-reported length of anomalies without clustering error, denoted as Type I anomalies, is markedly less than that for anomalies with clustering error, denoted as Type II anomalies. A methodology employing data mining techniques is further proposed to classify the Type I and Type II anomalies based on the ILI-reported corrosion anomaly information.
Keywords: clustered corrosion anomaly, corrosion anomaly assessment, corrosion anomaly length, individual corrosion anomaly, metal-loss corrosion, oil and gas steel pipeline
Procedia PDF Downloads 309
22955 The Impact of Transformational Leadership on Individual Attributes
Authors: Bilal Liaqat, Muhammad Umar, Zara Bashir, Hassan Rafique, Mohsin Abbasi, Zarak Khan
Abstract:
Transformational leadership is one of the most studied topics in the organization sciences. However, the impact of transformational leadership on employees' individual attributes has not yet been studied. Purpose: This research aims to discover the relationship between transformational leadership and employee motivation, performance and creativity. Moreover, the study will also investigate the influence of transformational leadership on employee performance through employee motivation and employee creativity. Design-Methodology-Approach: In this cross-sectional study, survey data were collected from employees in different organizations. Structured interviews were also conducted to explain the outcomes of the survey. Findings: The results of this study reveal that transformational leadership has a positive impact on employees' individual attributes. Research Implications: Although this study expands our knowledge about the role of learning orientation between transformational leadership and employee motivation, performance and creativity, the prospects for further research are still present.
Keywords: employee creativity, employee motivation, employee performance, transformational leadership
Procedia PDF Downloads 228
22954 Proposal Method of Prediction of the Early Stages of Dementia Using IoT and Magnet Sensors
Authors: João Filipe Papel, Tatsuji Munaka
Abstract:
With society aging and the number of elderly people with dementia rising, researchers have been actively studying how to support the elderly in the early stages of dementia, with the objective of allowing them a better quality of life and as much independence as possible. To make this possible, most researchers in this field use the Internet of Things to monitor the activities of the elderly and assist them in performing those activities. The most common sensor used to monitor these activities is the camera, due to its easy installation and configuration. The other commonly used sensor is the sound sensor. However, privacy must be considered when using these sensors. This research aims to develop a system capable of predicting the early stages of dementia based on monitoring and controlling the elderly's activities of daily living. To make this system possible, some issues need to be addressed. The first is the privacy of the elderly when detecting their activities of daily living. Privacy is a serious concern when detecting and monitoring activities of daily living, and one of the purposes of this research is to achieve this detection and monitoring without putting the privacy of the elderly at risk. To make this possible, the study focuses on an approach that uses magnet sensors to collect binary data. The second is to use the data collected by monitoring activities of daily living to predict the early stages of dementia. To make this possible, the research team suggests developing a proprietary ontology combining data-driven and knowledge-driven approaches.
Keywords: dementia, activity recognition, magnet sensors, ontology, data driven and knowledge driven, IoT, activities of daily living
Procedia PDF Downloads 104
22953 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data
Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone
Abstract:
This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify the environmental and economic elements that contribute to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, Poisson linear regression and regularized regression methods with ridge, lasso, and elastic net penalties were employed. Cross-validation was performed to obtain the tuning parameters. The proposed methods can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease data set, the study successfully identified key factors, and the results were consistent with previous studies.
Keywords: Lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression
Procedia PDF Downloads 137
22952 Graph-Based Semantical Extractive Text Analysis
Authors: Mina Samizadeh
Abstract:
In the past few decades, there has been an explosion in the amount of available data produced from various sources on different topics. The availability of this enormous amount of data requires us to adopt effective computational tools to explore it. This has led to intensely growing interest in the research community in developing computational methods focused on processing text data. One line of study focuses on condensing text so that we can reach a higher level of understanding in a shorter time. The two important tasks for this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key words of a text, which familiarizes us with its general topic. In text summarization, we are interested in producing a short text that includes the important information in the document. The TextRank algorithm, an unsupervised learning method that extends PageRank (the base algorithm of the Google search engine for searching and ranking pages), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and return them as a result. However, it neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used on its own or as part of generating the summary to overcome coverage problems.
Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis
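The graph-ranking idea behind TextRank, as summarized in this abstract, can be illustrated with a small sketch: sentences are nodes, edge weights are pairwise similarities, and a PageRank-style iteration scores each sentence. The similarity used here is plain Jaccard word overlap, chosen only for brevity; it is neither the overlap measure of the original TextRank paper nor the semantic similarity this work proposes, whose details the abstract does not give. The damping factor, iteration count, and example sentences are likewise assumptions.

```python
import re

def tokenize(sentence):
    """Lowercased set of alphabetic words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def similarity(a, b):
    # Jaccard word overlap: a simple stand-in for a semantic similarity measure.
    wa, wb = tokenize(a), tokenize(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank iteration over a similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of its edge to i.
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores

sentences = [
    "Keyword extraction finds the important words in a text.",
    "Text summarization produces a short text with the important information.",
    "The weather was pleasant yesterday.",
]
scores = textrank(sentences)
for score, sentence in sorted(zip(scores, sentences), reverse=True):
    print(round(score, 3), sentence)
```

Swapping `similarity` for an embedding-based measure would be one way to bring semantic similarity into the graph, which is the direction the abstract describes.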
Procedia PDF Downloads 70
22951 One of the Missing Pieces of Inclusive Education: Sexual Orientations
Authors: Sıla Uzkul
Abstract:
As a requirement of human rights and children's rights, the basic condition of inclusive education is that it covers all children. However, the reforms made in the context of education in Turkey and around the world include a limited level of inclusiveness. Generally, the inclusiveness mentioned is for individuals who need special education. Educational reforms superficially state that differences are tolerated, but these differences are extremely limited and often do not include sexual orientation. The education modules of the Ministry of National Education within the scope of inclusive education in Turkey cover children with special needs, bilingual children, children exposed to violence, children under temporary protection, children affected by migration and terrorism, and children affected by natural disasters. No training modules or inclusion terms regarding sexual orientations could be found. This research aimed to understand the perspectives of research assistants working in the preschool education department regarding sexual orientations within the scope of inclusive education. Six research assistants working in the preschool teaching department at a public university in Ankara (Turkey) participated in this qualitative research study. Participants were determined by typical case sampling, which is one of the purposeful sampling methods. The data of this research were obtained through a "survey consisting of open-ended questions". Raw data from the surveys were analyzed and interpreted using the "content analysis technique" (Yıldırım & Şimşek, 2005). During the data analysis process, the data from the participants were first numbered, then all the data were read and content analysis was performed, and possible themes, categories, and codes were extracted. The opinions of the participants regarding sexual orientations in inclusive education are presented under three main headings within the scope of the research questions. These are: (a) their views on inclusive education, (b) their views on sexual orientations, and (c) their views on sexual orientations in the preschool period.
Keywords: sexual orientation, inclusive education, child rights, preschool education
Procedia PDF Downloads 63
22950 Discovering the Effects of Meteorological Variables on the Air Quality of Bogota, Colombia, by Data Mining Techniques
Authors: Fabiana Franceschi, Martha Cobo, Manuel Figueredo
Abstract:
Bogotá, the capital of Colombia, is its largest city and one of the most polluted in Latin America, owing to fast economic growth over the last ten years. Bogotá has been affected by high-pollution events that led to high concentrations of PM10 and NO2, exceeding the local 24-hour legal limits (100 and 150 µg/m³, respectively). The most important pollutants in the city are PM10 and PM2.5 (which are associated with respiratory and cardiovascular problems), and their concentrations in the atmosphere are known to depend on local meteorological factors. It is therefore necessary to establish the relationship between meteorological variables and the concentrations of atmospheric pollutants such as PM10, PM2.5, CO, SO2, NO2 and O3. This study aims to determine the interrelations between meteorological variables and air pollutants in Bogotá using data mining techniques. Data from 13 monitoring stations of the Bogotá Air Quality Monitoring Network were collected for the period 2010-2015. The Principal Component Analysis (PCA) algorithm was applied to obtain primary relations between all the parameters, and the K-means clustering technique was then used to corroborate those relations and to find patterns in the data. PCA was also applied on a per-shift basis (morning, afternoon, night and early morning) to check for variation in the trends, and on a per-year basis to verify that the identified trends persisted throughout the study period. Results showed that wind speed, wind direction, temperature, and NO2 are the most influential factors for PM10 concentrations. Furthermore, high-humidity episodes were confirmed to increase PM2.5 levels. O3 levels were found to be directly proportional to wind speed and radiation and inversely related to humidity. Concentrations of SO2 increase with the presence of PM10 and decrease with wind speed and wind direction. The results also showed a decreasing trend in pollutant concentrations over the last five years. In rainy periods (March-June and September-December), some trends related to precipitation were stronger. The K-means results showed that patterns could be found in the data, with similar conditions and data distributions among the Carvajal, Tunal and Puente Aranda stations, and between Parque Simon Bolivar and Las Ferias. Applying the same technique per year verified that these trends prevailed during the study period. It was concluded that the PCA algorithm is useful for establishing preliminary relationships among variables, and K-means clustering for finding patterns in the data and understanding its distribution. The patterns discovered allow these clusters to be used as input to an Artificial Neural Network prediction model.
Keywords: air pollution, air quality modelling, data mining, particulate matter
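The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the PCA step is omitted, the z-score standardization is an assumption added so that variables in different units are comparable, and the six (wind speed, PM10) readings are hypothetical values chosen only to show the mechanics of K-means.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column (assumed preprocessing, not stated in the abstract)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids

# Hypothetical readings: (wind speed in m/s, PM10 in µg/m³).
readings = [(1.0, 90.0), (4.0, 40.0), (1.2, 85.0),
            (3.8, 45.0), (0.9, 95.0), (4.2, 38.0)]
labels, centroids = kmeans(standardize(readings), k=2)
print(labels)
```

In this toy data the low-wind, high-PM10 readings fall into one cluster and the high-wind, low-PM10 readings into the other. A production analysis would normally use a tested library implementation and a less naive centroid initialization.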
Procedia PDF Downloads 258
22949 Comparative Analysis of Effecting Factors on Fertility by Birth Order: A Hierarchical Approach
Authors: Ali Hesari, Arezoo Esmaeeli
Abstract:
Given the dramatic changes in fertility and in higher-order births in Iran during recent decades, knowledge of the factors affecting different birth orders is crucially important. In this study, given the hierarchical structure of much social science data and the effect of variables at different levels of social phenomena, the determinants of different birth orders in the 365 days preceding the 1390 census are explored using a multilevel approach. The 2% individual raw data of the 1390 census are analyzed with HLM software. Three hierarchical linear regression models are estimated: one for the first and second birth orders, one for the third, and one for the fourth and higher. The results differ across the three models. The individual-level variables entered in the equation are region of residence (rural/urban), age, educational level and labor participation status; the province-level variable is GDP per capita. Results show that the individual-level variables have different effects in the three models, and that the second-level random and fixed effects also differ between the models.
Keywords: fertility, birth order, hierarchical approach, fixed effects, random effects
Procedia PDF Downloads 339
22948 Axial Load Capacity of Drilled Shafts from In-Situ Test Data at Semani Site, in Albania
Authors: Neritan Shkodrani, Klearta Rrushi, Anxhela Shaha
Abstract:
Generally, the design of the axial load capacity of deep foundations is based on data from field tests such as the SPT (Standard Penetration Test) and the CPT (Cone Penetration Test). This paper reports the results of an axial load capacity analysis of drilled shafts at a construction site at Semani, in Fier county, Fier prefecture, Albania. The analyses are based on data from 416 SPT tests and 12 CPTU tests carried out at this construction site in 12 boreholes (10 borings to a depth of 30.0 m and 2 borings to a depth of 80.0 m). The foundation widths considered range from 0.5 m to 2.5 m, and the foundation embedment length is fixed at 25 m. SPT-based analytical methods from Japanese design practice (Building Standard Law of Japan) and the CPT-based analytical methods of Eslami and Fellenius are used to obtain the ultimate axial load capacity of the drilled shafts. The drilled shaft considered (25 m long and 0.5 m to 2.5 m in diameter) is analyzed for the soil conditions of each borehole. The values obtained from the sets of calculations are shown in different charts. The axial load capacity values obtained from the SPT and CPTU data are then compared, and conclusions are drawn about the calculation methods used.
Keywords: deep foundations, drilled shafts, axial load capacity, ultimate load capacity, allowable load capacity, SPT test, CPTU test
Procedia PDF Downloads 104
22947 A Grounded Theory on Marist Spirituality/Charism from the Perspective of the Lay Marists in the Philippines
Authors: Nino M. Pizarro
Abstract:
To the author’s knowledge, despite the written documents about Marist spirituality/charism, no clear theoretical framework has been developed that highlights Marist spirituality/charism from the perspective or lived experience of the lay Marists of St. Marcellin Champagnat. The participants of the study are lay Marist educators from Marist schools in the Philippines. Since the study seeks the respondents’ own concepts and meanings of Marist spirituality/charism, a qualitative methodology is the approach adopted. In particular, the study will use the qualitative methods of Barney Glaser. The theory will be generated systematically from data collection, coding and analysis through memoing, theoretical sampling, sorting and writing, using the constant comparative method. The data collection method employed in this grounded theory research is the in-depth interview, semi-structured and participant driven. Data collection will be done through purposive snowball sampling. The study aims to develop a theoretical framework that will help the lay Marists deepen their understanding of Marist spirituality/charism and of their vocation as lay partners of the Marist Brothers of the Schools.
Keywords: grounded theory, Lay Marists, lived experience, Marist spirituality/charism
Procedia PDF Downloads 311
22946 Annexing the Strength of Information and Communication Technology (ICT) for Real-time TB Reporting Using TB Situation Room (TSR) in Nigeria: Kano State Experience
Authors: Ibrahim Umar, Ashiru Rajab, Sumayya Chindo, Emmanuel Olashore
Abstract:
INTRODUCTION: Kano is the most populous state in Nigeria and one of the two states with the highest TB burden in the country. The state notifies an average of more than 8,000 TB cases quarterly and had the highest yearly notification of all the states in Nigeria from 2020 to 2022. The contribution of the state TB program to the national TB notification varied from 9% to 10% quarterly between the first quarter of 2022 and the second quarter of 2023. The Kano State TB Situation Room is an innovative platform for timely data collection, collation and analysis for informed decision-making in the health system. During the second 2023 National TB Testing Week (NTBTW), the Kano TB program aimed at early TB detection, prevention and treatment. The state TB Situation Room gave the state an avenue for coordination and surveillance through real-time data reporting, review, analysis and use during the NTBTW. OBJECTIVES: To assess the role of an innovative information and communication technology platform for real-time TB reporting during the second National TB Testing Week in Nigeria in 2023, and to showcase the NTBTW data cascade analysis using the TSR as an innovative ICT platform. METHODOLOGY: The state TB program deployed a real-time virtual dashboard for NTBTW reporting, analysis and feedback. A data room team was set up to receive real-time data through a Google link. The data received were analyzed using the Power BI analytics tool with a statistical alpha level of significance of <0.05. RESULTS: By the end of the week-long activity, using the real-time dashboard with onsite mentorship of the field workers, the state TB program had screened a total of 52,054 people for TB out of 72,112 individuals eligible for screening (72% screening rate). A total of 9,910 presumptive TB clients were identified and evaluated for TB, leading to the diagnosis of 445 TB patients (5% yield from presumptives) and the placement of 435 TB patients on treatment (98% enrolment).
CONCLUSION: The TB Situation Room (TSR) has been a great asset to the Kano State TB Control Program in meeting the growing demand for timely data reporting in TB and other global health responses. The use of real-time surveillance data during the 2023 NTBTW has in no small measure improved the TB response and feedback in Kano State. Scaling up this intervention to other disease areas, states and nations is a positive step in the right direction towards global TB eradication.
Keywords: tuberculosis (tb), national tb testing week (ntbtw), tb situation room (tsr), information communication technology (ict)
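The cascade percentages reported in the RESULTS can be recomputed from the counts given in the abstract. The short sketch below does only that arithmetic; the variable and label names are illustrative, and the figures are the ones quoted above.

```python
def pct(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole

# Counts as reported in the abstract.
eligible, screened = 72_112, 52_054
presumptive, diagnosed, enrolled = 9_910, 445, 435

cascade = {
    "screening rate": pct(screened, eligible),
    "yield from presumptives": pct(diagnosed, presumptive),
    "treatment enrolment": pct(enrolled, diagnosed),
}
for name, value in cascade.items():
    print(f"{name}: {value:.1f}%")
```

The screening rate and enrolment come out at about 72% and 98%, matching the abstract. The yield from presumptives computes to roughly 4.5%, which the abstract reports as 5%.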
Procedia PDF Downloads 71
22945 Density Measurement of Mixed Refrigerants R32+R1234yf and R125+R290 from 0°C to 100°C and at Pressures up to 10 MPa
Authors: Xiaoci Li, Yonghua Huang, Hui Lin
Abstract:
Optimizing the concentration of components in mixed refrigerants can improve either the thermodynamic cycle performance or the safety performance of heat pumps and refrigerators. R32+R1234yf and R125+R290 are two promising binary mixed refrigerants for heat pumps operating in cold regions. The p-ρ-T data of these mixtures are among the fundamental properties needed to design heat pumps and evaluate their performance. Although the property data of mixtures can be predicted by mixing models based on the pure substances, as incorporated in programs such as the NIST database Refprop, direct property measurement is still helpful for revealing the true state behavior and verifying the models. Densities of the R32+R1234yf and R125+R290 mixtures are measured with an Anton Paar U-shaped oscillating-tube digital densimeter, DMA-4500, at temperatures from 0°C to 100°C and pressures up to 10 MPa. The accuracy of the measurement reaches 0.00005 g/cm³. The experimental data are compared with the Refprop predictions over the corresponding range of pressure and temperature.
Keywords: mixed refrigerant, density measurement, densimeter, thermodynamic property
Procedia PDF Downloads 295
22944 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting
Authors: Yiannis G. Smirlis
Abstract:
The classification and prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large-scale problems or when new units frequently enter the set under assessment. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals, combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores without applying any DEA models.
Keywords: data envelopment analysis, interval DEA, efficiency classification, efficiency prediction
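The grid idea can be sketched as follows. This is an illustrative reading of the abstract, not the paper's method: the breakpoints, the `cell_of` and `dominates` helpers, and the strict cell-level dominance rule are all assumptions. Each unit is mapped to the hyper-rectangle (cell) containing its input and output values, and one cell is taken to dominate another when it lies in strictly lower intervals on every input and strictly higher intervals on every output, so that every unit in the first cell outperforms every unit in the second.

```python
from bisect import bisect_right

def cell_of(values, breakpoints):
    """Interval index of each value; together they identify a hyper-rectangle."""
    return tuple(bisect_right(bp, v) for v, bp in zip(values, breakpoints))

def dominates(cell_a, cell_b, n_inputs):
    """True if cell_a is strictly lower on all inputs and higher on all outputs."""
    inputs_ok = all(a < b for a, b in zip(cell_a[:n_inputs], cell_b[:n_inputs]))
    outputs_ok = all(a > b for a, b in zip(cell_a[n_inputs:], cell_b[n_inputs:]))
    return inputs_ok and outputs_ok

# Hypothetical problem with one input and one output, three intervals each.
breakpoints = [[10, 20], [5, 15]]
unit_a = cell_of((8, 18), breakpoints)   # low input, high output
unit_b = cell_of((25, 3), breakpoints)   # high input, low output
unit_c = cell_of((12, 16), breakpoints)  # same output interval as unit_a

print(unit_a, unit_b, unit_c)
print(dominates(unit_a, unit_b, n_inputs=1))
print(dominates(unit_a, unit_c, n_inputs=1))
```

Units that share an interval on any dimension cannot be ranked from the grid alone, which is why the second comparison is not decided by this rule.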
Procedia PDF Downloads 164
22943 Mapping of Traffic Noise in Riyadh City-Saudi Arabia
Authors: Khaled A. Alsaif, Mosaad A. Foda
Abstract:
The present work aims to develop traffic noise maps for Riyadh City using the software LimA. Road traffic data were estimated or measured as accurately as possible in order to obtain consistent noise maps. The predicted noise levels at selected sites are validated against field measurements obtained with a system consisting of a sound level meter, a GPS receiver and a database to manage the measured data. The maps show that noise levels remain over 50 dBA and can exceed 70 dBA at the nearside of major roads and highways.
Keywords: noise pollution, road traffic noise, LimA predictor, GPS
Procedia PDF Downloads 384
22942 The Introduction of a Tourniquet Checklist to Identify and Record Tourniquet Related Complications
Authors: Akash Soogumbur
Abstract:
Tourniquets are commonly used in orthopaedic surgery to provide hemostasis during procedures on the upper and lower limbs. However, there is a risk of complications associated with tourniquet use, such as nerve damage, skin necrosis, and compartment syndrome. The British Orthopaedic Association (BOAST) guidelines recommend the use of tourniquets at a pressure of 300 mmHg or less for a maximum of 2 hours. Research Aim: The aim of this study was to evaluate the effectiveness of a tourniquet checklist in improving compliance with the BOAST guidelines. Methodology: This was a retrospective study of all orthopaedic procedures performed at a single institution over a 12-month period. The study population included patients who had a tourniquet applied during surgery. Data were collected from the patients' medical records, including the duration of tourniquet use, the pressure used, and the method of exsanguination. Findings: The results showed that the use of the tourniquet checklist significantly improved compliance with the BOAST guidelines. Prior to the introduction of the checklist, compliance with the guidelines was 83% for the duration of tourniquet use and 73% for pressure used. After the introduction of the checklist, compliance increased to 100% for both duration of tourniquet use and pressure used. Theoretical Importance: The findings of this study suggest that the use of a tourniquet checklist can be an effective way to improve compliance with the BOAST guidelines. This is important because it can help to reduce the risk of complications associated with tourniquet use. Data Collection: Data were collected from the patients' medical records. The data included the following information: Patient demographics, procedure performed, duration of tourniquet use, pressure used, method of exsanguination. Analysis Procedures: The data were analyzed using descriptive statistics. 
The compliance with the BOAST guidelines was calculated as the percentage of patients who met the guidelines for the duration of tourniquet use and pressure used. Question Addressed: The question addressed by this study was whether the use of a tourniquet checklist could improve compliance with the BOAST guidelines. Conclusion: The results of this study suggest that the use of a tourniquet checklist can be an effective way to improve compliance with the BOAST guidelines. This is important because it can help to reduce the risk of complications associated with tourniquet use.
Keywords: tourniquet, pressure, duration, complications, surgery
Procedia PDF Downloads 69
22941 Data Analysis for Taxonomy Prediction and Annotation of 16S rRNA Gene Sequences from Metagenome Data
Authors: Suchithra V., Shreedhanya, Kavya Menon, Vidya Niranjan
Abstract:
Skin metagenomics has a wide range of applications with direct relevance to the health of the organism. It gives us insight into the diverse community of microorganisms (the microbiome) harbored on the skin. In recent years, it has become increasingly apparent that the interaction between the skin microbiome and the human body plays a prominent role in immune system development, cancer development, disease pathology, and many other biological processes. Next Generation Sequencing has led to a faster and better understanding of environmental organisms and their mutual interactions. This project studies the human skin microbiome of individuals with varied skin conditions. Bacterial 16S rRNA data of the skin microbiome are downloaded using the SRA Toolkit provided by NCBI to perform metagenomic analysis. Twelve samples are selected, with two controls and three categories: sex (male/female), skin type (moist/intermittently moist/sebaceous) and occlusion (occluded/intermittently occluded/exposed). Data quality is improved using Cutadapt and assessed using FastQC. USearch, a tool for analyzing NGS data, provides a suitable platform for obtaining the taxonomic classification and abundance of bacteria from the metagenome data. The statistical tool used to analyze the USearch results is METAGENassist. The results revealed that the three most abundant organisms were Prevotella, Corynebacterium, and Anaerococcus. Prevotella is known to be an infectious bacterium found in wounds, tooth cavities, etc. Corynebacterium and Anaerococcus are opportunistic bacteria responsible for skin odor. This result suggests that Prevotella thrives easily in sebaceous skin conditions. It is therefore better to use intermittently occluded treatment, such as applying ointments or creams, to treat wounds on sebaceous skin. Exposing the wound should be avoided, as it leads to an increase in Prevotella abundance. Individuals with moist skin can opt for occluded or intermittently occluded treatment, as these have been shown to decrease the abundance of bacteria during treatment.
Keywords: bacterial 16S rRNA, next generation sequencing, skin metagenomics, skin microbiome, taxonomy
Procedia PDF Downloads 172
22940 Development of a Predictive Model to Prevent Financial Crisis
Authors: Tengqin Han
Abstract:
Delinquency has been a crucial factor in economics throughout the years. Commonly seen in credit cards and mortgages, it played a crucial role in causing the most recent financial crisis, in 2008. In each case, a delinquency is a sign that the borrower is unable to pay off the debt, and it may ultimately lead to a loss of property. Individually, one case of delinquency seems unimportant compared to the entire credit system. In China, an emerging economy, national and economic strength have grown rapidly, and the gross domestic product (GDP) growth rate has remained as high as 8% over the past decades. However, potential risks lie behind the appearance of prosperity, and among them the credit system is the most significant. Because mortgages have long terms and large balances, it is critical to monitor the risk during the performance period. In this project, data on about 300,000 mortgage accounts are analyzed in order to develop a model that predicts the probability of delinquency. The data are cleaned through univariate analysis, and the variables with strong predictive power are identified through bivariate analysis. The project is divided into two parts. In the first part, the 2005 analysis data are split into two parts: 60% for model development and 40% for in-time model validation. The KS for model development is 31, and the KS for in-time validation is 31, indicating that the model is stable. In addition, the model is further validated by out-of-time validation, which uses 40% of the 2006 data and gives a KS of 33. This indicates that the model is still stable and robust. In the second part, the model is improved by adding macroeconomic indexes, including GDP, the consumer price index, the unemployment rate, the inflation rate, etc. The data from 2005 to 2010 are used for model development and validation. Compared with the base model (without macroeconomic variables), KS increases from 41 to 44, indicating that the macroeconomic variables can be used to improve the separation power of the model and make the prediction more accurate.
Keywords: delinquency, mortgage, model development, model validation
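The abstract does not define KS. In credit scoring it conventionally denotes the Kolmogorov-Smirnov statistic: the largest gap between the cumulative score distributions of delinquent and non-delinquent accounts, quoted on a 0-100 scale. The sketch below computes it under that assumption; the function name and the scores are hypothetical and are not taken from the study.

```python
def ks_statistic(scores_bad, scores_good):
    """Largest gap between the two empirical CDFs, scaled to 0-100."""
    thresholds = sorted(set(scores_bad) | set(scores_good))
    best = 0.0
    for t in thresholds:
        # Fraction of each group scoring at or below the threshold.
        cdf_bad = sum(s <= t for s in scores_bad) / len(scores_bad)
        cdf_good = sum(s <= t for s in scores_good) / len(scores_good)
        best = max(best, abs(cdf_bad - cdf_good))
    return 100.0 * best

# Hypothetical predicted delinquency probabilities.
delinquent = [0.9, 0.8, 0.7, 0.4]
current = [0.1, 0.2, 0.3, 0.6, 0.5]

ks = ks_statistic(delinquent, current)
print(f"KS = {ks:.0f}")
```

A higher KS means the model separates the two groups more cleanly. Comparing the value on the development sample with the in-time and out-of-time samples, as the abstract does, is a common check that the separation holds on data the model has not seen.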
Procedia PDF Downloads 228
22939 Self-Supervised Learning for Hate-Speech Identification
Authors: Shrabani Ghosh
Abstract:
Automatic offensive language detection in social media has become an important task in today's NLP. Manual offensive language detection is tedious and laborious work, for which automatic methods based on machine learning are the only alternative. Previous works have performed sentiment analysis over social media in supervised, semi-supervised, and unsupervised ways. Semi-supervised domain adaptation has also been explored in NLP, where the source domain and the target domain differ. In domain adaptation, the source domain usually has a large amount of labeled data, while only a limited amount of labeled data is available in the target domain. Pretrained transformers such as BERT and RoBERTa are further pre-trained in an unsupervised manner on masked language modeling (MLM) tasks and fine-tuned to perform text classification. In previous work, hate speech detection has been explored on Gab.ai, a free speech platform described as hosting extremists of varying degrees. In the domain adaptation process, Twitter data are used as the source domain and Gab data as the target domain. The performance of domain adaptation also depends on cross-domain similarity. Different distance measures, such as L2 distance, cosine distance, Maximum Mean Discrepancy (MMD), Fisher Linear Discriminant (FLD), and CORAL, have been used to estimate domain similarity. In-domain distances are small, and between-domain distances are expected to be large. Previous work shows that a pretrained masked language model (MLM) fine-tuned with a mixture of posts from the source and target domains gives higher accuracy. However, the in-domain accuracy of the hate classifier on Twitter data is 71.78%, and its out-of-domain accuracy on Gab data drops to 56.53%. Recently, self-supervised learning has received a lot of attention, as it is more applicable when labeled data are scarce. A few works have already explored applying self-supervised learning to NLP tasks such as sentiment classification. The self-supervised language representation model ALBERT focuses on modeling inter-sentence coherence and helps downstream tasks with multi-sentence inputs. The self-supervised attention learning approach shows better performance, as it exploits extracted context words in the training process. In this work, a self-supervised attention mechanism is proposed to detect hate speech on Gab.ai. The framework first classifies the Gab dataset in an attention-based self-supervised manner. In the next step, a semi-supervised classifier is trained on the combination of labeled data from the first step and unlabeled data. The performance of the proposed framework will be compared with the results described earlier and with optimized outcomes obtained from different optimization techniques.
Keywords: attention learning, language model, offensive language detection, self-supervised learning
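Two of the distance measures named in this abstract, L2 and cosine distance, can be illustrated on domain centroids. The sketch is a simplification under stated assumptions: the two-dimensional "embeddings" are hypothetical, and comparing mean vectors is only one way to summarize a domain. With a linear kernel, MMD reduces to the same distance between mean embeddings, so the L2 value here also stands in for that special case.

```python
from math import sqrt

def centroid(vectors):
    """Mean vector of a set of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def l2_distance(u, v):
    """Euclidean distance between two vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """One minus cosine similarity; 0 for parallel vectors, 1 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

# Hypothetical post embeddings for the source and target domains.
twitter_posts = [[1.0, 0.0], [3.0, 0.0]]
gab_posts = [[0.0, 1.0], [0.0, 3.0]]

source_mean = centroid(twitter_posts)
target_mean = centroid(gab_posts)
print(l2_distance(source_mean, target_mean))
print(cosine_distance(source_mean, target_mean))
```

In practice the vectors would come from a language model encoder and have hundreds of dimensions. The toy domains here are deliberately orthogonal so that the between-domain distance is large.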
Procedia PDF Downloads 105
22938 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark
Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos
Abstract:
This paper investigates the performance impact of varying five factors (input data size, node number, cores, memory, and disks) when applying a distributed implementation of Naïve Bayes to text classification of a large corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing the effects of each factor beforehand becomes especially critical because hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: We applied the statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results: the small influence of cores and the neutrality of memory and disks on total execution time, and the non-significant impact of input data scale on costs, although it notably affects execution time.
Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark
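A two-level fractional factorial design for five factors can be sketched as below. The abstract does not state which fraction or generator was used, so the half fraction (16 runs) with the fifth factor set to the product of the other four is an assumption, as are the factor names and the `half_fraction` helper. Levels are coded -1 (low) and +1 (high).

```python
from itertools import product

# Factor names are illustrative labels for the five factors in the abstract.
FACTORS = ["data_size", "nodes", "cores", "memory", "disks"]

def half_fraction(factors):
    """2^(k-1) design: full factorial on k-1 factors, last one aliased with their product."""
    runs = []
    for levels in product((-1, 1), repeat=len(factors) - 1):
        last = 1
        for level in levels:
            last *= level  # assumed generator: last factor = product of the others
        runs.append(dict(zip(factors, levels + (last,))))
    return runs

design = half_fraction(FACTORS)
for run in design[:4]:
    print(run)
print(len(design), "runs")
```

Each factor appears at its low and high level equally often, which is what lets main effects be estimated from far fewer runs than the 32 of a full factorial. Three replications of such a design would give 48 runs, the same as the number of clusters reported, though the abstract does not say that this is how the 48 arose. In a real experiment the run order would also be randomized.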
Procedia PDF Downloads 120