Search results for: multivariate filtering
38 Financing Decision and Productivity Growth for the Venture Capital Industry Using High-Order Fuzzy Time Series
Authors: Shang-En Yu
Abstract:
Human society faces many uncertainties, such as forecasting economic growth rates during a financial crisis. Since Song and Chissom introduced the concept of fuzzy time series in 1993, many scholars have proposed different models to deal with these problems. Previous studies, however, usually do not consider the selection of relevant variables, and they discretize the fuzzy linguistic terms based solely on subjective opinion, so they cannot objectively reflect the characteristics of the data set. In addition, forecasts often treat all fuzzy rules as equally important and fail to consider the importance of each rule. For these reasons, this study performs variable selection (factor selection) with a self-organizing map (SOM) and proposes a high-order weighted multivariate fuzzy time series model based on a fuzzy neural network (Fuzzy-BPN), using the ordered weighted averaging (OWA) operator for weighted prediction. To verify the proposed method, the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) of the Taiwan Stock Exchange Corporation is used as the experimental forecast target, and appropriate variables are filtered in the experiment. Finally, models from other recent studies are compared with this study; the results show that the proposed method further improves predictive ability.
Keywords: Heterogeneity, residential mortgage loans, foreclosure.
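The ordered weighted averaging (OWA) operator named in the abstract can be sketched as follows. This uses the standard definition, in which the weights attach to rank positions after sorting rather than to particular inputs. The forecasts and weights are hypothetical, and the abstract does not say how the study derives its weights.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to ranks, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)  # largest value pairs with weights[0]
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical outputs of three fuzzy rules and illustrative rank weights.
forecasts = [0.2, 0.9, 0.5]
weights = [0.5, 0.3, 0.2]
combined = owa(forecasts, weights)
```

With equal weights the operator reduces to the arithmetic mean; putting all weight on the first rank gives the maximum.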
Downloads: 1388
37 Data Privacy and Safety with Large Language Models
Authors: Ashly Joseph, Jithu Paulose
Abstract:
Large language models (LLMs) have revolutionized natural language processing capabilities, enabling applications such as chatbots, dialogue agents, and image and video generators. Nevertheless, their training on extensive datasets containing personal information poses notable privacy and safety hazards. This study examines methods for addressing these challenges, specifically focusing on approaches to enhance the security of LLM outputs, safeguard user privacy, and adhere to data protection rules. We explore several methods including post-processing detection algorithms, content filtering, reinforcement learning from human and AI inputs, and the difficulties in maintaining a balance between model safety and performance. The study also emphasizes the dangers of unintentional data leakage, privacy issues related to user prompts, and the possibility of data breaches. We highlight the significance of corporate data governance rules and optimal methods for engaging with chatbots. In addition, we analyze the development of data protection frameworks, evaluate the adherence of LLMs to the General Data Protection Regulation (GDPR), and examine privacy legislation in academic and business policies. We demonstrate the difficulties and remedies involved in preserving data privacy and security in the age of sophisticated artificial intelligence by employing case studies and real-life instances. This article seeks to educate stakeholders on practical strategies for improving the security and privacy of LLMs, while also ensuring their responsible and ethical implementation.
Keywords: Data privacy, large language models, artificial intelligence, machine learning, cybersecurity, general data protection regulation, data safety.
Downloads: 104
36 Ownership, Management Responsibility and Corporate Performance of the Listed Firms in Kazakhstan
Authors: Gulnara Moldasheva
Abstract:
The research explores the relationship between management responsibility and corporate governance of listed companies in Kazakhstan. It employs firm-level data on selected listed non-financial firms and on the “operational” financial sector, consisting of the banking sector, insurance companies, and accumulated pension funds, using multivariate regression analysis under a fixed-effect model approach. Ownership structure includes institutional ownership, managerial ownership, and private investors' ownership. Management responsibility of the firm is expressed by the firm's decision on the amount of leverage. Results of the cross-sectional panel study for non-financial firms show that only institutional shareholding is significantly negatively correlated with the debt-to-equity ratio. Findings from the “operational” financial sector show that leverage is significantly affected only by CEO/Chair duality and the size of financial institutions, and insignificantly affected by ownership structure. The findings also show a significant negative relationship between profitability and the debt-to-equity ratio for non-financial firms, which is consistent with pecking order theory. Overall, the results suggest that corporate governance and management responsibility play an important role in the corporate performance of listed firms in Kazakhstan.
Keywords: Corporate governance, corporate performance, debt to equity ratio, ownership.
Downloads: 1657
35 The Effectiveness of Solution-Focused Group Therapy on Improving Depressed Mothers of Child Abuser Families
Authors: Roya Maqami, Kaveh Qaderi Bagajan, Mohammad Mahdi Yousefi, Saeed Moradi
Abstract:
The purpose of this study is to investigate the efficacy of solution-focused group therapy in improving the mood of depressed mothers in child abuser families. The study used a semi-pilot design with pre-test and post-test on two groups (experimental and control). Subjects included all mothers and their children who are members of the Shush and Naser Khosro child home. The Beck Depression Inventory and the Child Trauma Questionnaire were used to collect data. First, the child abuse questionnaire was completed by the children; then the Beck Depression Inventory was completed by their mothers, 22 of whom were identified as depressed and randomly divided into experimental and control groups. After the pre-test was applied to both groups, the solution-focused group therapy intervention was delivered to the experimental group in five sessions. Finally, the post-test was applied to both groups, and a follow-up test was performed one month later. T-tests, multivariate analysis of variance, and repeated-measures analysis of variance were used to analyze the data. The findings indicate that this therapy improves the mood of depressed mothers. The solution-focused group therapy intervention is therefore useful for improving the depressed mood of mothers in child abuser families.
Keywords: Child Abuse, Depressed Mothers, Child Abuser Families, Solution-focused Group Therapy.
Downloads: 1782
34 Application of Single Tuned Passive Filters in Distribution Networks at the Point of Common Coupling
Authors: M. Almutairi, S. Hadjiloucas
Abstract:
The harmonic distortion of voltage is important in relation to power quality due to the interaction between the large diffusion of non-linear and time-varying single-phase and three-phase loads with power supply systems. However, harmonic distortion levels can be reduced by improving the design of polluting loads or by applying arrangements and adding filters. The application of passive filters is an effective solution that can be used to achieve harmonic mitigation mainly because filters offer high efficiency, simplicity, and are economical. Additionally, possible different frequency response characteristics can work to achieve certain required harmonic filtering targets. With these ideas in mind, the objective of this paper is to determine what size single tuned passive filters work in distribution networks best, in order to economically limit violations caused at a given point of common coupling (PCC). This article suggests that a single tuned passive filter could be employed in typical industrial power systems. Furthermore, constrained optimization can be used to find the optimal sizing of the passive filter in order to reduce both harmonic voltage and harmonic currents in the power system to an acceptable level, and, thus, improve the load power factor. The optimization technique works to minimize voltage total harmonic distortions (VTHD) and current total harmonic distortions (ITHD), where maintaining a given power factor at a specified range is desired. According to the IEEE Standard 519, both indices are viewed as constraints for the optimal passive filter design problem. The performance of this technique will be discussed using numerical examples taken from previous publications.
Keywords: Harmonics, passive filter, power factor, power quality.
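As a rough illustration of the quantities this abstract refers to, the sketch below sizes an ideal single tuned (series L-C) filter from textbook relations and computes total harmonic distortion. It is not the paper's constrained optimization. Filter resistance is neglected, the capacitor reactance is approximated as V²/Q_C, and all numeric inputs are hypothetical.

```python
import math


def single_tuned_filter(v_ll, q_c, f1, h):
    """Size an ideal series L-C filter tuned to harmonic order h.

    v_ll: line-to-line voltage (V); q_c: capacitor reactive power (var);
    f1: fundamental frequency (Hz). Resistance is neglected (assumption).
    """
    x_c = v_ll ** 2 / q_c                 # capacitive reactance at f1
    x_l = x_c / h ** 2                    # makes X_L equal X_C at h * f1
    c = 1.0 / (2 * math.pi * f1 * x_c)    # capacitance in farads
    l = x_l / (2 * math.pi * f1)          # inductance in henries
    f_tuned = 1.0 / (2 * math.pi * math.sqrt(l * c))
    return l, c, f_tuned


def thd(magnitudes):
    """Total harmonic distortion; magnitudes[0] is the fundamental."""
    fundamental, harmonics = magnitudes[0], magnitudes[1:]
    return math.sqrt(sum(m * m for m in harmonics)) / fundamental


# Hypothetical 400 V, 50 Hz system with a 16 kvar capacitor tuned to h = 5.
inductance, capacitance, tuned_hz = single_tuned_filter(400.0, 16000.0, 50.0, 5)
vthd = thd([100.0, 3.0, 4.0])  # fundamental plus two harmonic magnitudes
```

An optimizer of the kind the abstract describes would vary the filter size and evaluate VTHD, ITHD, and power factor against its limits; that search is not shown here.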
Downloads: 2191
33 Technical, Environmental, and Financial Assessment for the Optimal Sizing of a Run-of-River Small Hydropower Project: A Case Study in Colombia
Authors: David Calderón Villegas, Thomas Kalitzky
Abstract:
Run-of-river (RoR) hydropower projects represent a viable, clean, and cost-effective alternative to dam-based plants and provide decentralized power production. However, RoR schemes’ cost-effectiveness depends on the proper selection of site and design flow, which is a challenging task because it requires multivariate analysis. In this respect, this study presents the development of an investment decision support tool for assessing the optimal size of an RoR scheme considering the technical, environmental, and cost constraints. The net present value (NPV) from a project perspective is used as an objective function for supporting the investment decision. The tool has been tested by applying it to an actual RoR project recently proposed in Colombia. The obtained results show that the optimum point in financial terms does not match the flow that maximizes energy generation from exploiting the river's available flow. For the case study, the flow that maximizes energy corresponds to a value of 5.1 m³/s. In comparison, a flow of 2.1 m³/s maximizes the investors' NPV. Finally, a sensitivity analysis is performed to determine the NPV as a function of changes in the debt rate, electricity prices, and CapEx. Even for the worst-case scenario, the optimal size represents a positive business case with an NPV of USD 2.2 million and an internal rate of return (IRR) 1.5 times higher than the discount rate.
Keywords: small hydropower, renewable energy, RoR schemes, optimal sizing, financial analysis
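The two financial indicators in this abstract, NPV and IRR, can be sketched with the standard discounting formula. The cash flows below are hypothetical and are not taken from the study. The IRR is found by bisection, which assumes the NPV changes sign exactly once between the two bracketing rates.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs at time 0 (undiscounted)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-9):
    """Internal rate of return by bisection on the NPV function."""
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return (lo + hi) / 2.0


# Hypothetical project: an initial outlay followed by two equal inflows.
flows = [-100.0, 60.0, 60.0]
project_npv = npv(0.10, flows)
project_irr = irr(flows)
```

A sizing tool of the kind described would recompute the cash flows for each candidate design flow and keep the design with the highest NPV.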
Downloads: 600
32 Regional Analysis of Streamflow Drought: A Case Study for Southwestern Iran
Authors: M. Byzedi, B. Saghafian
Abstract:
Droughts are complex natural hazards that, to a varying degree, affect some parts of the world every year. The range of drought impacts is related to drought occurring in different stages of the hydrological cycle, and different types of droughts, such as meteorological, agricultural, hydrological, and socioeconomic, are usually distinguished. Streamflow drought was analyzed by the truncation level method (at the 70% level) on daily discharges measured at 54 hydrometric stations in southwestern Iran. Frequency analysis was carried out for annual maximum series (AMS) of drought deficit volume and duration. Physiographic, climatic, geologic, and vegetation cover factors were studied as influential factors in the regional analysis. According to the results of factor analysis, the six most effective factors were identified as area, rainfall from December to February, the percent of area with Normalized Difference Vegetation Index (NDVI) <0.1, the percent of convex area, drainage density, and the minimum watershed elevation, which together explained 90.9% of variance. The homogeneous regions were determined by cluster analysis and discriminant function analysis. Suitable multivariate regression models were evaluated for streamflow drought deficit volume with a 2-year return period. The significance level of the regression models was 0.01. The results showed that watershed area is the most effective factor, with a high correlation with deficit volume. Also, drought duration was not a suitable drought index for regional analysis.
Keywords: Iran, Streamflow drought, truncation level method, regional analysis.
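A minimal sketch of the truncation level idea follows: a drought event is a run of consecutive days with discharge below a threshold, characterized by its duration and its deficit volume. The threshold is passed in directly here. In the study it would correspond to the 70% level, which is assumed (not confirmed by the abstract) to mean the flow exceeded 70% of the time. The discharge values are hypothetical.

```python
def drought_events(flows, threshold):
    """Return (duration, deficit) for each run of flows below threshold.

    duration is in time steps; deficit is the summed shortfall
    (threshold - flow), in flow units times the time step.
    """
    events = []
    duration, deficit = 0, 0.0
    for q in flows:
        if q < threshold:
            duration += 1
            deficit += threshold - q
        elif duration:
            events.append((duration, deficit))   # run just ended
            duration, deficit = 0, 0.0
    if duration:                                  # series ends mid-event
        events.append((duration, deficit))
    return events


# Hypothetical daily discharges (m^3/s) and threshold.
daily_flow = [5, 3, 2, 4, 6, 1, 1, 7]
events = drought_events(daily_flow, threshold=4)
```

Taking the largest deficit and the longest duration in each year would give the annual maximum series that the frequency analysis works on.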
Downloads: 1743
31 Dimensionality Reduction in Modal Analysis for Structural Health Monitoring
Authors: Elia Favarelli, Enrico Testi, Andrea Giorgetti
Abstract:
Autonomous structural health monitoring (SHM) of many structures and bridges has become a topic of paramount importance for maintenance purposes and safety reasons. This paper proposes a set of machine learning (ML) tools to perform automatic feature selection and detection of anomalies in a bridge from vibrational data, and compares different feature extraction schemes to increase the accuracy and reduce the amount of data collected. As a case study, the Z-24 bridge is considered because of the extensive database of accelerometric data in both standard and damaged conditions. The proposed framework starts from the first four fundamental frequencies extracted through operational modal analysis (OMA) and clustering, followed by time-domain filtering (tracking). The fundamental frequencies extracted are then fed to a dimensionality reduction block implemented through two different approaches: feature selection (intelligent multiplexer), which tries to estimate the most reliable frequencies based on the evaluation of some statistical features (i.e., entropy, variance, kurtosis), and feature extraction (auto-associative neural network (ANN)), which combines the fundamental frequencies to extract new damage-sensitive features in a low-dimensional feature space. Finally, one-class classification (OCC) algorithms perform anomaly detection, trained with standard condition points and tested with normal and anomalous ones. In particular, principal component analysis (PCA), kernel principal component analysis (KPCA), and auto-associative neural network (ANN) are presented and their performance is compared. It is also shown that, by evaluating the correct features, the anomaly can be detected with accuracy and an F1 score greater than 95%.
Keywords: Anomaly detection, dimensionality reduction, frequencies selection, modal analysis, neural network, structural health monitoring, vibration measurement.
Downloads: 708
30 Appraisal of Methods for Identifying, Mapping, and Modelling of Fluvial Erosion in a Mining Environment
Authors: F. F. Howard, I. Yakubu, C. B. Boye, J. S. Y. Kuma
Abstract:
Natural and human activities, such as mining operations, expose the natural soil to adverse environmental conditions, leading to contamination of soil, groundwater, and surface water, which has negative effects on humans, flora, and fauna. Bare or partly exposed soil is most liable to fluvial erosion. This paper enumerates various methods used to identify, map, and model fluvial erosion in a mining environment. Classical, Artificial Intelligence (AI), and GIS methods have been reviewed. One of the many classical methods used to estimate river erosion is the Revised Universal Soil Loss Equation (RUSLE) model. The RUSLE model is easy to use. Its reliance on empirical relationships that may not always be applicable to specific circumstances or locations is a flaw. Other classical models for estimating fluvial erosion are the Soil and Water Assessment Tool (SWAT) and the Universal Soil Loss Equation (USLE). These models offer a more complete understanding of the underlying physical processes and encompass a wider range of situations. Although more difficult to utilise, they depend on the availability and dependability of input data for correctness. AI can help deal with multivariate and complex difficulties and predict soil loss with higher accuracy than traditional methods, and also be used to build unique models for identifying degraded areas. AI techniques have become popular as an alternative predictor for degraded environments. However, this research proposed a hybrid of classical, AI, and GIS methods for efficient and effective modelling of fluvial erosion.
Keywords: Fluvial erosion, classical methods, Artificial Intelligence, Geographic Information System.
Downloads: 185
29 Infrastructure Change Monitoring Using Multitemporal Multispectral Satellite Images
Authors: U. Datta
Abstract:
The main objective of this study is to find a suitable approach to monitor land infrastructure growth over a period of time using multispectral satellite images. The bi-temporal change detection method cannot indicate continuous change occurring over a long period of time. To achieve this objective, the approach used here estimates a statistical model from a series of multispectral image data over a long period of time, assuming there is no considerable change during that period, and then compares it with the multispectral image data obtained at a later time. The change is estimated pixel-wise. A statistical composite hypothesis technique is used for estimating pixel-based change detection in a defined region. The generalized likelihood ratio test (GLRT) is used to detect the changed pixel from the probabilistic estimated model of the corresponding pixel. The changed pixel is detected assuming that the images have been co-registered prior to estimation. To minimize error due to co-registration, the 8-neighborhood pixels around the pixel under test are also considered. Multispectral images from Sentinel-2 and Landsat-8 from 2015 to 2018 are used for this purpose. The method faces several challenges. The foremost is obtaining a sufficiently large number of datasets for multivariate distribution modelling, since many images are discarded due to cloud coverage. Imperfect modelling also leads to a high probability of false alarm. The overall conclusion that can be drawn from this work is that the probabilistic method described in this paper has given some promising results, which need to be pursued further.
Keywords: Co-registration, GLRT, infrastructure growth, multispectral, multitemporal, pixel-based change detection.
Downloads: 728
28 Validation on 3D Surface Roughness Algorithm for Measuring Roughness of Psoriasis Lesion
Authors: M.H. Ahmad Fadzil, Esa Prakasa, Hurriyatul Fitriyah, Hermawan Nugroho, Azura Mohd Affandi, S.H. Hussein
Abstract:
Psoriasis is a widespread skin disease affecting up to 2% of the population, with plaque psoriasis accounting for about 80% of cases. It can be identified as a red lesion, and at higher severity the lesion is usually covered with rough scale. Psoriasis Area Severity Index (PASI) scoring is the gold standard method for measuring psoriasis severity. Scaliness is one of the PASI parameters that needs to be quantified in PASI scoring. Surface roughness of the lesion can be used as a scaliness feature, since scale on the lesion surface makes the lesion rougher. The dermatologist usually assesses severity through tactile sense, so direct contact between doctor and patient is required. The problem is that the doctor may not assess the lesion objectively. In this paper, a digital image analysis technique is developed to objectively determine the scaliness of the psoriasis lesion and provide the PASI scaliness score. The psoriasis lesion is modelled by a rough surface, created by superimposing a triangular waveform on a smooth average (curved) surface. For roughness determination, polynomial surface fitting is used to estimate the average surface, which is then subtracted from the rough surface to give the elevation surface (surface deviations). The roughness index is calculated by applying the average roughness equation to the height map matrix. The roughness algorithm has been tested on 444 lesion models. In the validation results, only 6 models are not acceptable (percentage error greater than 10%). These errors occur due to the scanned image quality. The roughness algorithm is also validated for roughness measurement on abrasive papers on a flat surface. The Pearson's correlation coefficient between the grade value (G) of the abrasive paper and Ra is -0.9488, which shows a strong relation between G and Ra. The algorithm needs to be improved by surface filtering, especially to overcome problems with noisy data.
Keywords: psoriasis, roughness algorithm, polynomial surface fitting.
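The fit, subtract, and average sequence described in this abstract can be illustrated on a simplified case. The sketch below works on a 1D height profile and fits a first-order polynomial (a straight line) by least squares, whereas the paper fits a polynomial surface to a 2D height map. Average roughness Ra is then the mean absolute deviation from the fitted line. The profile values are hypothetical.

```python
def average_roughness(heights):
    """Ra of a 1D profile after removing a least-squares straight line."""
    n = len(heights)
    if n < 2:
        raise ValueError("need at least two height samples")
    mean_x = (n - 1) / 2.0                # sample positions are 0..n-1
    mean_z = sum(heights) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in enumerate(heights))
    slope = sxz / sxx
    # Deviation of each sample from the fitted average line.
    deviations = [z - (mean_z + slope * (x - mean_x))
                  for x, z in enumerate(heights)]
    return sum(abs(d) for d in deviations) / n


# Hypothetical profile: a triangular waveform riding on a gentle slope.
triangle = [0.0, 1.0, 0.0, -1.0] * 5
profile = [0.05 * x + t for x, t in enumerate(triangle)]
ra = average_roughness(profile)
```

Extending this to a surface would replace the line fit with a least-squares fit of a 2D polynomial and average the absolute deviations over the whole height map.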
Downloads: 2490
27 Time Series Forecasting Using Various Deep Learning Models
Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan
Abstract:
Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (Recurrent Neural Network (RNN), Long Short-term Memory (LSTM), Gated Recurrent Units (GRU), and Transformer) along with a baseline method. The hourly dataset we used is the Beijing Air Quality Dataset from the website of the University of California, Irvine (UCI), which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance, with the lowest Mean Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RMSE = 23.573, 38.131) for most of our single-step and multi-step predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform best for predicting 3 hours into the future.
Keywords: Air quality prediction, deep learning algorithms, time series forecasting, look-back window.
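The look-back window and prediction horizon discussed in this abstract can be made concrete with a small sketch that slices a series into (input window, target) pairs and scores predictions with MAE and RMSE. The persistence forecast used below (repeat the last observed value) is an illustrative assumption, since the abstract does not name its baseline, and the series values are hypothetical.

```python
import math


def make_windows(series, look_back, horizon):
    """Pair each look_back-long window with the value horizon steps ahead."""
    inputs, targets = [], []
    for start in range(len(series) - look_back - horizon + 1):
        end = start + look_back
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical hourly readings; predict 1 step ahead from the last 2 values.
series = [1, 2, 3, 4, 5]
windows, targets = make_windows(series, look_back=2, horizon=1)
persistence = [w[-1] for w in windows]   # assumed baseline: repeat last value
baseline_mae = mae(targets, persistence)
baseline_rmse = rmse(targets, persistence)
```

Sweeping `look_back` and `horizon` over a grid and recording both errors for each model would reproduce the kind of comparison the abstract reports.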
Downloads: 1167
26 Effects of Different Meteorological Variables on Reference Evapotranspiration Modeling: Application of Principal Component Analysis
Authors: Akinola Ikudayisi, Josiah Adeyemo
Abstract:
The correct estimation of reference evapotranspiration (ETₒ) is required for effective irrigation water resources planning and management. However, there are some variables that must be considered while estimating and modeling ETₒ. This study therefore performs a multivariate analysis of the correlated variables involved in the estimation and modeling of ETₒ at Vaalharts irrigation scheme (VIS) in South Africa using the Principal Component Analysis (PCA) technique. Weather and meteorological data between 1994 and 2014 were obtained from both the South African Weather Service (SAWS) and the Agricultural Research Council (ARC) in South Africa for this study. Average monthly data of minimum and maximum temperature (°C), rainfall (mm), relative humidity (%), and wind speed (m/s) were the inputs to the PCA-based model, while ETₒ is the output. The PCA technique was adopted to extract the most important information from the dataset and also to analyze the relationship between the five variables and ETₒ, in order to determine the most significant variables affecting ETₒ estimation at VIS. From the model performances, two principal components with a variance of 82.7% were retained after the eigenvector extraction. The results of the two principal components were compared, and the model output shows that minimum temperature, maximum temperature, and wind speed are the most important variables in ETₒ estimation and modeling at VIS. In other words, ETₒ increases with temperature and wind speed. Other variables such as rainfall and relative humidity are less important and cannot be used to provide enough information about ETₒ estimation at VIS. The outcome of this study has helped to reduce input variable dimensionality from five to the three most significant variables in ETₒ modelling at VIS, South Africa.
Keywords: Irrigation, principal component analysis, reference evapotranspiration, Vaalharts.
Downloads: 1061
25 Spatial Integration at the Room-Level of 'Sequina' Slum Area in Alexandria, Egypt
Authors: Ali Essam El Shazly
Abstract:
The social logic of 'Sequina' slum area in Alexandria details the integral measure of space syntax at the room-level of twenty-building samples. The essence of spatial structure integrates the central 'visitor' domain with the 'living' frontage of the 'children' zone against the segregated privacy of the opposite 'parent' depth. Meanwhile, the multifunctioning of shallow rooms optimizes the integral 'visitor' structure through graph and visibility dimensions in contrast to the 'inhabitant' structure of graph-tails out of sight. Common theme of the layout integrity increases in compensation to the decrease of room visibility. Despite the 'pheno-type' of collective integration, the individual layouts observe 'geno-type' structure of spatial diversity per room adjoins. In this regard, the layout integrity alternates the cross-correlation of the 'kitchen & living' rooms with the 'inhabitant & visitor' domains of 'motherhood' dynamic structure. Moreover, the added 'grandparent' restructures the integral measure to become the deepest space, but opens to the 'living' of 'household' integrity. Some isomorphic layouts change the integral structure just through the 'balcony' extension of access, visual or ignored 'ringiness' of space syntax. However, the most integrated or segregated layouts invert the 'geno-type' into a shallow 'inhabitant' centrality versus the remote 'visitor' structure. An overview of the multivariate social logic of spatial integrity could never be clarified without the micro-data analysis.
Keywords: Alexandria, Sequina slum, spatial integration, space syntax.
Downloads: 1436
24 Issues in Spectral Source Separation Techniques for Plant-wide Oscillation Detection and Diagnosis
Authors: A.K. Tangirala, S. Babji
Abstract:
In the last few years, three multivariate spectral analysis techniques, namely Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Non-negative Matrix Factorization (NMF), have emerged as effective tools for oscillation detection and isolation. While the first method is used in determining the number of oscillatory sources, the latter two methods are used to identify source signatures by formulating the detection problem as a source identification problem in the spectral domain. In this paper, we present a critical drawback of the underlying linear (mixing) model which strongly limits the ability of the associated source separation methods to determine the number of sources and/or identify the physical source signatures. It is shown that the assumed mixing model is only valid if each unit of the process gives equal weighting (all-pass filter) to all oscillatory components in its inputs. This is in contrast to the fact that each unit, in general, acts as a filter with non-uniform frequency response. Thus, the model can only facilitate correct identification of a source with a single frequency component, which is again unrealistic. To overcome this deficiency, an iterative post-processing algorithm that correctly identifies the physical source(s) is developed. An additional issue with the existing methods is that they lack a procedure to pre-screen non-oscillatory/noisy measurements which obscure the identification of oscillatory sources. In this regard, a pre-screening procedure is prescribed based on the notion of sparseness index to eliminate the noisy and non-oscillatory measurements from the data set used for analysis.
Keywords: non-negative matrix factorization, PCA, source separation, plant-wide diagnosis
Downloads: 1533
23 Calibration of 2D and 3D Optical Measuring Instruments in Industrial Environments at Submillimeter Range
Authors: A. Mínguez-Martínez, J. de Vicente
Abstract:
Modern manufacturing processes have led to the miniaturization of systems and, as a result, parts at the micro and nanoscale are produced. This trend seems set to become increasingly important in the near future. Besides, as a requirement of Industry 4.0, the digitalization of the models of production and processes makes it very important to ensure that the dimensions of newly manufactured parts meet the specifications of the models. It then becomes possible to reduce scrap and the cost of non-conformities while ensuring the stability of production. To ensure the quality of manufactured parts, it becomes necessary to carry out traceable measurements at scales lower than one millimeter. Providing adequate traceability to the SI unit of length (the meter) for 2D and 3D measurements at this scale is a problem that does not have a unique solution in industrial environments. Researchers in the field of dimensional metrology all around the world are working on this issue. A solution for industrial environments, even if it is not complete, will enable working with some traceability. At this point, we believe that the study of surfaces could provide us with a first approximation to a solution. In this paper, we propose a calibration procedure for the scales of optical measuring instruments, particularizing for a confocal microscope, using material standards that are easy to find and calibrate in metrology and quality laboratories in industrial environments. Confocal microscopes are measuring instruments capable of filtering out the out-of-focus reflected light so that, when it reaches the detector, it is possible to take pictures of the part of the surface that is in focus. By varying the focus and taking pictures at different Z levels, specialized software interpolates between the different planes and can reconstruct the surface geometry into a 3D model. It follows that traceability must be given to each axis. As a complementary result, the roughness Ra parameter will be traced to the reference. Although the solution is designed for a confocal microscope, it may be used for the calibration of other optical measuring instruments by applying minor changes.
Keywords: Industrial environment, confocal microscope, optical measuring instrument, traceability.
Downloads: 410
22 Remote Vital Signs Monitoring in Neonatal Intensive Care Unit Using a Digital Camera
Authors: Fatema-Tuz-Zohra Khanam, Ali Al-Naji, Asanka G. Perera, Kim Gibson, Javaan Chahl
Abstract:
Conventional contact-based vital signs monitoring sensors such as pulse oximeters or electrocardiogram (ECG) may cause discomfort, skin damage, and infections, particularly in neonates with fragile, sensitive skin. Therefore, remote monitoring of vital signs is desired in both clinical and non-clinical settings to overcome these issues. Camera-based vital signs monitoring is a recent technology for these applications with many positive attributes. However, there are still limited camera-based studies on neonates in a clinical setting. In this study, the heart rate (HR) and respiratory rate (RR) of eight infants at the Neonatal Intensive Care Unit (NICU) in Flinders Medical Centre were remotely monitored using a digital camera applying color and motion-based computational methods. The region-of-interest (ROI) was efficiently selected by incorporating an image decomposition method. Furthermore, spatial averaging, spectral analysis, band-pass filtering, and peak detection were also used to extract both HR and RR. The experimental results were validated with the ground truth data obtained from an ECG monitor and showed a strong correlation, with Pearson correlation coefficients (PCC) of 0.9794 and 0.9412 for HR and RR, respectively. The root mean square errors (RMSE) between camera-based data and ECG data for HR and RR were 2.84 beats/min and 2.91 breaths/min, respectively. A Bland-Altman analysis of the data also showed close agreement between both data sets, with a mean bias of 0.60 beats/min and 1 breath/min, and lower and upper limits of agreement of -4.9 to +6.1 beats/min and -4.4 to +6.4 breaths/min for HR and RR, respectively. Therefore, video camera imaging may replace conventional contact-based monitoring in NICU and has potential applications in other contexts such as home health monitoring.
Keywords: Neonates, NICU, digital camera, heart rate, respiratory rate, image decomposition.
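The agreement measures named in this abstract (Pearson correlation, RMSE, and Bland-Altman bias with limits of agreement) can be illustrated with a minimal sketch. The readings below are hypothetical and are not data from the study; the function name `agreement_stats` and the 1.96 multiplier for 95% limits of agreement are assumptions of this sketch, not details confirmed by the authors.

```python
import math
from statistics import mean, stdev

def agreement_stats(reference, estimate):
    """Return Pearson r, RMSE, Bland-Altman bias and 95% limits of agreement."""
    if len(reference) != len(estimate) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    # Differences are taken as estimate minus reference.
    diffs = [e - r for r, e in zip(reference, estimate)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    rmse = math.sqrt(mean([d * d for d in diffs]))
    mr, me = mean(reference), mean(estimate)
    cov = sum((r - mr) * (e - me) for r, e in zip(reference, estimate))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_e = sum((e - me) ** 2 for e in estimate)
    pearson = cov / math.sqrt(var_r * var_e)
    return {
        "pearson": pearson,
        "rmse": rmse,
        "bias": bias,
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
    }

# Hypothetical heart-rate readings in beats/min (illustration only).
ecg_hr = [140.0, 152.0, 135.0, 160.0, 148.0]
camera_hr = [142.0, 151.0, 137.0, 163.0, 147.0]
stats = agreement_stats(ecg_hr, camera_hr)
print(stats)
```

A small bias with narrow limits of agreement indicates that the two measurement methods can be used interchangeably within that tolerance; a high correlation alone does not establish this.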
Downloads: 575
21 Neuromuscular Control and Performance during Sudden Acceleration in Subjects with and without Unilateral Acute Ankle Sprains
Authors: M. Qorbani
Abstract:
Neuromuscular control of posture, as understood through studies of automatic responses to sudden mechanical acceleration, has previously been demonstrated in individuals with chronic ankle instability (CAI), but the acute condition has not previously been explored, especially under sudden acceleration. The aim of this study was to determine the neuromuscular control pattern in those with and without unilateral acute ankle sprains. Design: Case-control. Setting: University research laboratory. The sinker–card protocol with surface translation was used as the sudden acceleration protocol, and EMG was recorded from 4 posture-stabilizing muscles on both sides of the body in response to sudden acceleration in the forward and backward directions. Participants were 20 young adult women in two groups (10 with lateral ankle sprain (LAS), 23.9 ± 2.03 yrs, and 10 normal, 26.4 ± 3.2 yrs). The EMG data were assessed using a multivariate test and one-way repeated measures 2×2×4 ANOVA (P < 0.05). The results showed a significant muscle by direction interaction. TA activity on the left and right sides was significantly higher in the LAS group than in the normal group in the forward direction. MGR activity was significantly higher in the normal group than in the LAS group in the backward direction. These findings rest on comparing the EMG activity of 4 muscles on both sides of the body in two directions, between and within groups, for neuromuscular control of posture in avoiding falls. EMG activation on the two sides of the body in LAS patients was significantly symmetric. Acute ankle instability following a single ankle sprain led to coordinated temporal-spatial patterns and strategy selection.
Keywords: Neuromuscular response, sEMG, Lateral Ankle Sprain, posture.
Downloads: 1034
20 The Comparison of Parental Childrearing Styles and Anxiety in Children with Stuttering and Normal Population
Authors: Pegah Farokhzad
Abstract:
The family has a crucial role in maintaining the physical, social and mental health of children. Most mental and anxiety problems of children reflect complex interpersonal situations among family members, especially parents. In other words, children's anxiety problems are correlated with deficient relationships among family members and improper childrearing styles. Parental childrearing styles lead to positive and negative consequences that affect children’s mental health. Therefore, the present research aimed to compare parental childrearing styles and anxiety in children with stuttering and in the normal population. It also aimed to study the relationship between parental childrearing styles and children's anxiety. The research sample included 54 boys with stuttering and 54 normal boys, aged 5 to 8 years, selected from the boys of Tehran, Iran, in 2013. To collect data, the Baumrind Childrearing Styles Inventory and the Spence Parental Anxiety Inventory were used. Appropriate descriptive statistical methods, multivariate analysis of variance, and the t test for independent groups were used to test the study hypotheses. Statistical analyses demonstrated a significant difference between stuttering boys and normal boys in anxiety (t = 7.601, p< 0.01), but no significant difference between stuttering boys and normal boys in parental childrearing styles (F = 0.129). No significant relationship was found between parental childrearing styles and children's anxiety (F = 0.135, p< 0.05). It can be concluded that the influential factors in children’s society are parents, school, teachers, peers and media. Parental childrearing styles are therefore not the only factors influencing children's anxiety; other factors, including genetics, environment and the child's experiences, also affect anxiety. Details are discussed.
Keywords: Anxiety, Childrearing Styles, Stuttering.
Downloads: 3073
19 Person Identification using Gait by Combined Features of Width and Shape of the Binary Silhouette
Authors: M.K. Bhuyan, Aragala Jagan.
Abstract:
Current image-based individual human recognition methods, such as fingerprint, face, or iris biometric modalities, generally require a cooperative subject, views from certain aspects, and physical contact or close proximity. These methods cannot reliably recognize non-cooperating individuals at a distance in the real world under changing environmental conditions. Gait, which concerns recognizing individuals by the way they walk, is a relatively new biometric without these disadvantages. The inherent gait characteristic of an individual makes it irreplaceable and useful in visual surveillance. In this paper, an efficient gait recognition system for human identification is proposed that extracts two features, namely the width vector of the binary silhouette and the MPEG-7 region-based shape descriptors. In the proposed method, foreground objects, i.e., humans and other moving objects, are extracted by estimating background information with a Gaussian Mixture Model (GMM); subsequently, a median filtering operation is performed to remove noise in the background-subtracted image. A moving target classification algorithm, which uses shape and boundary information, separates human beings (i.e., pedestrians) from other foreground objects (viz., vehicles). Subsequently, the width vector of the outer contour of the binary silhouette and the MPEG-7 Angular Radial Transform coefficients are taken as the feature vector. Next, Principal Component Analysis (PCA) is applied to the selected feature vector to reduce its dimensionality. These extracted feature vectors are used to train a Hidden Markov Model (HMM) for identification of some individuals. The proposed system is evaluated using some gait sequences, and the experimental results show the efficacy of the proposed algorithm.
Keywords: Gait Recognition, Gaussian Mixture Model, Principal Component Analysis, MPEG-7 Angular Radial Transform.
Downloads: 1911
18 Nuclear Fuel Safety Threshold Determined by Logistic Regression Plus Uncertainty
Authors: D. S. Gomes, A. T. Silva
Abstract:
Quantifying the uncertainty in the nuclear safety margins applied to a nuclear reactor is important for preventing future radioactive accidents. A nuclear fuel performance code may rely on tolerance levels determined by traditional deterministic models, which produce acceptable results at burn cycles under 62 GWd/MTU. The behavior of nuclear fuel can be simulated by applying a series of material properties under irradiation and physics models to calculate the safety limits. In this study, theoretical predictions of nuclear fuel failure under transient conditions investigate extended radiation cycles at 75 GWd/MTU, considering the behavior of fuel rods in light-water reactors under reactivity accident conditions. The fuel pellet can melt due to the quick increase of reactivity during a transient. Large power excursions in the reactor are the subject of interest, leading to a treatment known as the Fuchs-Hansen model. The point kinetic neutron equations show characteristics similar to those of non-linear differential equations. In this investigation, multivariate logistic regression is employed for a probabilistic forecast of fuel failure. The agreement between computational simulation and experimental results was acceptable. The experiments use pre-irradiated fuel rods subjected to a rapid energy pulse, which exhibit the same behavior as during a nuclear accident. The propagation of uncertainty uses Wilks' formulation. The variables chosen as essential to failure prediction were the fuel burnup, the applied peak power, the pulse width, the oxidation layer thickness, and the cladding type.
Keywords: Logistic regression, reactivity-initiated accident, safety margins, uncertainty propagation.
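For readers unfamiliar with the Wilks formulation mentioned in this abstract, the following minimal sketch computes the number of code runs needed for a first-order, one-sided tolerance limit. It assumes the common first-order form, in which the largest of n independent runs bounds the chosen quantile when 1 - coverage^n reaches the required confidence; the abstract does not state which order or which coverage and confidence levels the authors used, and the function name is hypothetical.

```python
def wilks_sample_size(coverage=0.95, confidence=0.95):
    """Smallest n such that the maximum of n independent runs bounds the
    `coverage` quantile with probability at least `confidence`
    (first-order, one-sided Wilks criterion: 1 - coverage**n >= confidence)."""
    if not (0.0 < coverage < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("coverage and confidence must lie strictly between 0 and 1")
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n

n_95_95 = wilks_sample_size(0.95, 0.95)
n_95_99 = wilks_sample_size(0.95, 0.99)
print(n_95_95, n_95_99)
```

The appeal of this approach is that the required number of runs does not depend on how many uncertain input parameters are sampled.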
Downloads: 1018
17 Distribution of Macrobenthic Polychaete Families in Relation to Environmental Parameters in North West Penang, Malaysia
Authors: Mohammad Gholizadeh, Khairun Yahya, Anita Talib, Omar Ahmad
Abstract:
The distribution of macrobenthic polychaetes along the coastal waters of Penang National Park was surveyed to estimate the effect of various environmental parameters at three stations (200 m, 600 m and 1200 m from the shoreline) during six sampling months, from June 2010 to April 2011. The use of polychaetes in descriptive ecology is reviewed in the light of recent investigations, particularly concerning soft-bottom biota environments. Polychaetes, often associated with the notion of opportunistic species able to proliferate after an increase in organic matter, play an important role, particularly in affected soft-bottom habitats. The objective of this survey was to investigate different environmental stresses on the soft-bottom polychaete community along Teluk Ketapang and Pantai Acheh (Penang National Park) over a one-year period. Variations in the polychaete community were evaluated using univariate and multivariate methods. The PCA results displayed a positive relation between macrobenthic community structure and environmental parameters such as sediment particle size and organic matter in the coastal water. A total of 604 individuals were examined and grouped into 23 families. Family Nereidae was the most abundant (22.68%), followed by Spionidae (22.02%), Hesionidae (12.58%), Nephtyidae (9.27%) and Orbiniidae (8.61%). Good results can only be obtained on the basis of good taxonomic resolution. We propose that, in monitoring surveys, operative time could be optimized not only by working at a higher taxonomic level on the entire macrobenthic data set, but also by choosing an especially indicative group and working at a lower, well-resolved taxonomic level.
Keywords: Polychaete families, environment parameters, Bioindicators, Pantai Acheh, Teluk Ketapang.
Downloads: 1999
16 Effect of Zidovudine on Hematological and Virologic Parameters among Female Sex Workers Receiving Antiretroviral Therapy (ART) in North – Western Nigeria
Authors: N. M. Sani, E. D. Jatau, O. S. Olonitola, M. Y. Gwarzo, P. Moodley, N. S. Mujahid
Abstract:
Hemoglobin (HB) indicates anemia level and by extension may reflect the nutritional level and perhaps the immunity of an individual. Some antiretroviral drugs, such as zidovudine, are known to cause anemia in people living with HIV/AIDS (PLWHA). A cross-sectional study using demographic data and blood specimens from 218 female commercial sex workers attending antiretroviral therapy (ART) clinics was conducted between December 2009 and July 2011 to assess the effect of zidovudine on the hematologic parameters and RNA viral load of female sex workers receiving antiretroviral treatment in north western Nigeria. Anemia is a common and serious complication of both HIV infection and its treatment. In the setting of HIV infection, anemia has been associated with decreased quality of life, functional status, and survival. Antiretroviral therapy, particularly highly active antiretroviral therapy (HAART), has been associated with a decrease in the incidence and severity of anemia in HIV-infected patients who have received a HAART regimen for at least 1 year. In this study, of the 218 patients, the 26 with a hemoglobin level of 5.1 – 10g/dl had the highest viral load, 300,000 – 350,000 copies/ml. Most patients (190), with HB of 10.1 – 15.0g/dl, had a viral load of 200,000 – 250,000 copies/ml. An inverse relationship therefore appears to exist, i.e., the lower the hemoglobin level, the higher the viral load, although the test statistic did not show a significant association between the two (P = 0.206). Multivariate logistic regression analysis demonstrated that anemia was associated with a CD4+ cell count below 50/μL, a viral load above 100,000 copies/mL, and zidovudine use among the female sex workers. Severe anemia was less prevalent in this study population than in historical comparators; however, mild to moderate anemia rates remain high. The study therefore recommends that hematological and virologic parameters be monitored closely in patients receiving a first-line ART regimen.
Keywords: Female sex worker, Zidovudine, Hemoglobin, Anemia.
Downloads: 1763
15 dynr.mi: An R Program for Multiple Imputation in Dynamic Modeling
Authors: Yanling Li, Linying Ji, Zita Oravecz, Timothy R. Brick, Michael D. Hunter, Sy-Miin Chow
Abstract:
Assessing several individuals intensively over time yields intensive longitudinal data (ILD). Even though ILD provide rich information, they also bring other data analytic challenges. One of these is the increased occurrence of missingness with increased study length, possibly under non-ignorable missingness scenarios. Multiple imputation (MI) handles missing data by creating several imputed data sets, and pooling the estimation results across imputed data sets to yield final estimates for inferential purposes. In this article, we introduce dynr.mi(), a function in the R package, Dynamic Modeling in R (dynr). The package dynr provides a suite of fast and accessible functions for estimating and visualizing the results from fitting linear and nonlinear dynamic systems models in discrete as well as continuous time. By integrating the estimation functions in dynr and the MI procedures available from the R package, Multivariate Imputation by Chained Equations (MICE), the dynr.mi() routine is designed to handle possibly non-ignorable missingness in the dependent variables and/or covariates in a user-specified dynamic systems model via MI, with convergence diagnostic check. We utilized dynr.mi() to examine, in the context of a vector autoregressive model, the relationships among individuals’ ambulatory physiological measures, and self-report affect valence and arousal. The results from MI were compared to those from listwise deletion of entries with missingness in the covariates. When we determined the number of iterations based on the convergence diagnostics available from dynr.mi(), differences in the statistical significance of the covariate parameters were observed between the listwise deletion and MI approaches. These results underscore the importance of considering diagnostic information in the implementation of MI procedures.
Keywords: Dynamic modeling, missing data, multiple imputation, physiological measures.
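The pooling step that this abstract describes for multiple imputation is conventionally done with Rubin's rules. The sketch below shows that general technique in Python for a single parameter; it is not the code of dynr.mi() or of the mice package, and the estimates, variances, and function name are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # sample variance between imputations
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, math.sqrt(total)

# Hypothetical results from three imputed data sets (illustration only).
pooled_estimate, pooled_se = pool_rubin([0.50, 0.54, 0.46], [0.010, 0.012, 0.011])
print(pooled_estimate, pooled_se)
```

The between-imputation term is what distinguishes MI from analysing a single filled-in data set: it carries the extra uncertainty caused by the missing values into the standard error.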
Downloads: 809
14 Disparity of Learning Styles and Cognitive Abilities in Vocational Education
Authors: Mimi Mohaffyza Mohamad, Yee Mei Heong, Nurfirdawati Muhammad Hanafi, Tee Tze Kiong
Abstract:
This study investigates the disparity between learning styles and cognitive abilities, specifically in Vocational Education. The Felder and Silverman Learning Styles Model (FSLSM) was applied to measure the students’ learning styles, while the content of the Building Construction subject, consisting of knowledge, skills and problem solving, was taken into account in constructing the elements of cognitive abilities. Building Construction is one of the vocational courses offered in the Vocational Education structure. Felder and Silverman propose four dimensions of learning styles intended to capture student learning preferences: processing (active or reflective), perception (sensing or intuitive), input of information (visual or verbal), and understanding (sequential or global). The Felder-Solomon Learning Styles Index was developed based on the FSLSM, and its questions were used to identify the students' learning preferences. The index consists of 44 items that characterize the learning style dimensions in the FSLSM. An achievement test was developed to determine the students’ cognitive abilities. The quantitative data were analyzed with descriptive and inferential statistics involving Multivariate Analysis of Variance (MANOVA). The study found that students tend to be visual learners and that the types of learner differ significantly, while for cognitive abilities the findings differ for each type of learner in knowledge, skills and problem solving. The study illustrates the gap between type of learner and cognitive abilities and explains how the two are connected. The findings may help teachers to facilitate students more effectively and to boost students’ cognitive abilities.
Keywords: Learning Styles, Cognitive Abilities, Dimension of Learning Styles, Learning Preferences.
Downloads: 2635
13 Agro-Morphological Characterization of Vicia faba L. Accessions in the Kingdom of Saudi Arabia
Authors: Zia Amjad, Salem S. Alghamdi
Abstract:
The study was conducted at the student educational farm at the College of Food and Agriculture in the Kingdom of Saudi Arabia. The aim of the study was to characterize 154 Vicia faba L. accessions using agro-morphological traits based on the International Union for the Protection of New Varieties of Plants (UPOV) and the International Board for Plant Genetic Resources (IBPGR) descriptors. This research is significant as it contributes to the understanding of the genetic diversity and potential yield of V. faba in Saudi Arabia. In the study, 24 agro-morphological characters, 11 quantitative and 13 qualitative, were observed for genetic variation. All the results were analyzed using multivariate analysis, i.e., principal component analysis (PCA). The first six principal components (PCs) had eigenvalues greater than one and accounted for 72% of the available V. faba genetic diversity. However, only the first three components each revealed more than 10% of the genetic diversity, i.e., 22.36%, 15.86% and 10.89%, respectively. PCA distributed the V. faba accessions into different groups based on their performance for the characters under observation. PC-1, which represented 22.36% of the genetic diversity, was positively associated with stipule spot pigmentation, intensity of streaks, pod degree of curvature and, to some extent, 100 seed weight. PC-2 covered 15.86% of the genetic diversity and showed positive association with average seed weight per plant, pod length, number of seeds per plant, 100 seed weight, stipule spot pigmentation and intensity of streaks (as in PC-1) and, to some extent, pod degree of curvature and number of pods per plant. PC-3 revealed 10.89% of the genetic diversity and showed positive association with number of pods per plant and number of leaflets per plant. By establishing a core collection of V. faba, the research provides a valuable resource for future conservation and utilization of this crop worldwide.
Keywords: Agro-morphological characterization, genetic diversity, core collection, PCA, Vicia faba L.
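The component-retention rule reported in this abstract (keep components with eigenvalues greater than one, then read off each component's share of the variance) can be sketched as follows. The eigenvalues are hypothetical and refer to an imagined five-trait data set; the sketch also assumes PCA on standardized traits, so that the eigenvalues sum to the number of traits, which the abstract does not state.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Return (rank, eigenvalue, share of total variance) for every
    component whose eigenvalue exceeds `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [
        (rank, ev, ev / total)
        for rank, ev in enumerate(ordered, start=1)
        if ev > threshold
    ]

# Hypothetical eigenvalues for five standardized traits (they sum to 5).
kept = retained_components([2.0, 1.5, 0.8, 0.5, 0.2])
cumulative_share = sum(share for _, _, share in kept)
print(kept, cumulative_share)
```

In this toy case two components pass the threshold and together explain 70% of the variance, analogous to the six components and 72% reported above.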
Downloads: 201
12 An Epidemiological Study on an Outbreak of Gastroenteritis Linked to Dinner Served at a Senior High School in Accra
Authors: Benjamin Osei Tutu, Rita Asante, Emefa Atsu
Abstract:
Background: An outbreak of gastroenteritis occurred in December 2019 after students of a Senior High School in Accra were served kenkey and fish for dinner. An investigation was conducted to characterize the affected people, the source of contamination, and the etiologic food and agent. Methods: An epidemiological study was conducted with cases selected from the student population who were ill. Controls were selected from among students who also ate from the school canteen during dinner but were not ill. The food history of each case and control was taken to assess their exposure status. Epi Info 7 was used to analyze the data obtained from the outbreak. Attack rates and odds ratios were calculated to determine the risk of foodborne infection for each of the foods consumed by the population. The source of contamination of the foods was ascertained by conducting an environmental risk assessment at the school. Results: Data were obtained from 126 students, of whom 57 (45.2%) were cases and 69 (54.8%) were controls. The cases presented with symptoms such as diarrhea (85.96%), abdominal cramps (66.67%), vomiting (50.88%), headache (21.05%), fever (17.86%) and nausea (3.51%). The peak incubation period was 18 hours, with minimum and maximum incubation periods of 6 and 50 hours, respectively. From the incubation period, duration of illness and the symptoms, non-typhoidal salmonellosis was suspected. Multivariate analysis indicated that the illness was associated with the consumption of the fried fish served; however, this was not statistically significant (AOR 3.1.00, P = 0.159). No stool, blood or food samples were available for organism isolation and confirmation of the suspected etiologic agent. The environmental risk assessment indicated poor hand washing practices on the part of both the food handlers and students. Conclusion: The outbreak was probably due to the consumption of fried fish that might have been contaminated with Salmonella sp. as a result of poor hand washing practices in the school.
Keywords: Case control study, food poisoning, handwashing, Salmonella, school.
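As an illustration of the odds-ratio calculation used in case-control outbreak investigations like this one, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval from a 2×2 table. The counts are hypothetical and deliberately do not match the study's totals; the adjusted odds ratio reported in the abstract would come from a multivariate model, which this sketch does not reproduce.

```python
import math

def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls):
    """Crude odds ratio and 95% confidence interval (log-scale Woolf method)."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, (lower, upper)

# Hypothetical exposure counts for one food item (illustration only).
or_fish, ci_fish = crude_odds_ratio(40, 10, 30, 20)
print(or_fish, ci_fish)
```

An interval that includes 1 corresponds to an association that is not statistically significant at the 5% level, which is the situation the abstract reports for the fried fish.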
Downloads: 668
11 Towards End-To-End Disease Prediction from Raw Metagenomic Data
Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker
Abstract:
Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist in multiple steps including quality control, filtering, alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often provide variability and impact the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful and numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of the prediction for each genome. 
Using two public real-life data sets as well as a simulated one, we demonstrated that this original approach reaches high performance, comparable with state-of-the-art methods applied directly to data processed through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
Keywords: Metagenomics, phenotype prediction, deep learning, embeddings, multiple instance learning.
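Step (i) of the approach described above, generating a vocabulary of k-mers, can be sketched in a few lines. This is a generic illustration and not the metagenome2vec implementation; the function names, the choice k = 3, and the toy reads are assumptions made for the example, and learning the embeddings themselves is out of scope here.

```python
from collections import Counter

def kmers(read, k):
    """Split a DNA read into its overlapping k-mers (sliding window, step 1)."""
    read = read.upper()
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def build_vocabulary(reads, k):
    """Map each distinct k-mer to an integer index, most frequent first
    (ties broken alphabetically so the result is deterministic)."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    ordered = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: idx for idx, km in enumerate(ordered)}

# Toy reads, not real sequencing data.
toy_reads = ["ACGTAC", "CGTACG"]
vocab = build_vocabulary(toy_reads, k=3)
encoded_first_read = [vocab[km] for km in kmers(toy_reads[0], 3)]
print(vocab, encoded_first_read)
```

Once reads are encoded as sequences of integer indices, they can be fed to any word-embedding style model in the same way as tokenized sentences.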
Downloads: 910
10 Gender Justice and Feminist Self-Management Practices in the Solidarity Economy: A Quantitative Analysis of the Factors that Impact Enterprises Formed by Women in Brazil
Authors: Maria de Nazaré Moraes Soares, Silvia Maria Dias Pedro Rebouças, José Carlos Lázaro
Abstract:
The Solidarity Economy (SE) acts in the re-articulation of the economic field to the other spheres of social action. The significant participation of women in SE resulted in the formation of a national network of self-managed enterprises in Brazil: The Solidarity and Feminist Economy Network (SFEN). The objective of the research is to identify factors of gender justice and feminist self-management practices that adhere to the reality of women in SE enterprises. The conceptual apparatus related to feminist studies in this research covers Nancy Fraser approaches on gender justice, and Patricia Yancey Martin approaches on feminist management practices, and authors of postcolonial feminism such as Mohanty and Maria Lugones, who lead the discussion to peripheral contexts, a necessary perspective when observing the women’s movement in SE. The research has a quantitative nature in the phases of data collection and analysis. The data collection was performed through two data sources: the database mapped in Brazil in 2010-2013 by the National Information System in Solidary Economy and 150 questionnaires with women from 16 enterprises in SFEN, in a state of Brazilian northeast. The data were analyzed using the multivariate statistical technique of Factor Analysis. The results show that the factors that define gender justice and feminist self-management practices in SE are interrelated in several levels, proving statistically the intersectional condition of the issue of women. The evidence from the quantitative analysis allowed us to understand the dimensions of gender justice and feminist management practices intersectionality; in this sense, the non-distribution of domestic work interferes in non-representation of women in public spaces, especially in peripheral contexts. 
The study contributes with important reflections to the studies of this area and can be complemented in the future with a qualitative research that approaches the perspective of women in the context of the SE self-management paradigm.
Keywords: Feminist management practices, gender justice, self-management, solidarity economy.
Downloads: 624
9 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups
Authors: Lily Ingsrisawang, Tasanee Nacharoen
Abstract:
Problems arising from unbalanced data sets appear frequently in real-world applications. Because of unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is a method that was proposed for classifying unbalanced classes with good performance. In this study, methods of discriminant analysis are of interest for investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project on maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, including the variables diabetes risk group, age, gender, blood glucose, and BMI, were analyzed and bootstrapped for 50 and 100 samples, with 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed non-normality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were applied to the 50 and 100 bootstrap samples and to the original data. In searching for the optimal classification rule, prior probabilities were set to both equal proportions (0.33: 0.33: 0.33) and unequal proportions of (0.90: 0.05: 0.05), (0.80: 0.10: 0.10) and (0.70: 0.15: 0.15). The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k=3 or k=4 and prior probabilities of non-risk: risk: diabetic defined as 0.90: 0.05: 0.05 or 0.80: 0.10: 0.10 gave the smallest misclassification error rate. The k-nearest neighbors approach is therefore suggested for classifying three-class imbalanced data of diabetes risk groups.
Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.
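To show how prior probabilities enter a k-nearest neighbors discriminant rule, here is a minimal sketch. It assumes the common formulation in which the posterior for class c is proportional to prior_c × (number of the k neighbors in class c) / (training size of class c); the abstract does not spell out the exact rule used by the authors. The training points (blood glucose, BMI) are hypothetical, and the features are left unscaled for brevity.

```python
import math
from collections import Counter

def knn_posteriors(train, query, k, priors):
    """Posterior class probabilities from a k-NN discriminant rule.

    train:  list of (feature_tuple, label)
    priors: dict mapping label -> prior probability
    """
    class_sizes = Counter(label for _, label in train)
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    neighbour_counts = Counter(label for _, label in nearest)
    # Score = prior * (neighbours in class / class size), then normalise.
    scores = {
        c: priors[c] * neighbour_counts.get(c, 0) / class_sizes[c]
        for c in class_sizes
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all neighbouring classes have zero prior")
    return {c: s / total for c, s in scores.items()}

# Hypothetical (blood glucose, BMI) training data, illustration only.
train = [
    ((90, 22), "non-risk"), ((95, 24), "non-risk"), ((88, 21), "non-risk"),
    ((92, 23), "non-risk"), ((97, 25), "non-risk"), ((91, 22), "non-risk"),
    ((110, 28), "risk"), ((112, 29), "risk"),
    ((140, 31), "diabetic"), ((150, 33), "diabetic"),
]
query = (108, 27)
equal = knn_posteriors(train, query, 3,
                       {"non-risk": 1 / 3, "risk": 1 / 3, "diabetic": 1 / 3})
skewed = knn_posteriors(train, query, 3,
                        {"non-risk": 0.80, "risk": 0.10, "diabetic": 0.10})
print(max(equal, key=equal.get), max(skewed, key=skewed.get))
```

In this toy example the same query point is assigned to the risk group under equal priors and to the non-risk group under priors of 0.80: 0.10: 0.10, which is why the choice of priors matters for imbalanced classes.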
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2008