Search results for: Vicon physical data.

7286 A New Evolutionary Algorithm for Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the kmeans algorithm. Solutions obtained from this technique depend on the initialization of cluster centers and the final solution converges to local minima. In order to overcome K-means algorithm shortcomings, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find better cluster partition. The performance is evaluated through several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means for partitional clustering problem.

Keywords: Data clustering, Hybrid evolutionary optimization algorithm, K-means algorithm, Simulated Annealing (SA), Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2277

7285 Impact of Individual Resilience on Organisational Resilience: An Exploratory Study

Authors: Mitansha, Suzanne Wilkinson, Regan Potangaroa

Abstract:

The built environment is designed, maintained, operated, and decommissioned by construction organisations, which play a significant role in providing physical resources and rebuilding infrastructures during major crises and disasters. It is evident that enhancing the resilience of construction organisations allows better responding ability and speedy recovery from disasters and acts as a boon for the nation in the face of significant disruptions. As individuals are the integral component of any organisation, hence, individual resilience is considered a critical aspect, which may boost organisational resilience of construction sector. It has been observed that individual resilience is indirectly supported by organisation’s citizenship behaviour, job performance, and career success. Not only this, it also tends to hold a directly proportional relation with job satisfaction, physical and emotional well-being affected by organisation’s work culture, whereas the resilience of organisation increases as a result of positive adaption, growth and collective learning of the employees as an entity. Moreover, indicators like situation awareness in staff and crisis related issues, effective vulnerability management, organisational leadership and culture ensured by approachable, encouraging and people-oriented leaders, are prominent for achieving organisational resilience. It, thus, becomes perceptible that both, organisational and individual resiliencies, have the potential to influence each other. Consequently, it arises a major question that how these characteristics are associated and tend to behave with respect to each other. The study, thus, aims to explore the overlapping dimensions of organisational and individual resilience to determine the impact boundaries. The research methodology of the paper would be based on systematic literature review specifically focused on the resilience of construction industry. This would provide a direct comparison of characteristics influencing individual and organisational resilience and will present the most significant indicators of individual resilience that can eventually help to enhance the resilience of construction organisations amidst any disaster or crisis.

Keywords: Construction industry, individual resilience, organisational resilience, overlapping dimension.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 225

7284 Research and Application of Consultative Committee for Space Data Systems Wireless Communications Standards for Spacecraft

Authors: Cuitao Zhang, Xiongwen He

Abstract:

According to the new requirements of the future spacecraft, such as networking, modularization and non-cable, this paper studies the CCSDS wireless communications standards, and focuses on the low data-rate wireless communications for spacecraft monitoring and control. The application fields and advantages of wireless communications are analyzed. Wireless communications technology has significant advantages in reducing the weight of the spacecraft, saving time in spacecraft integration, etc. Based on this technology, a scheme for spacecraft data system is put forward. The corresponding block diagram and key wireless interface design of the spacecraft data system are given. The design proposal of the wireless node and information flow of the spacecraft are also analyzed. The results show that the wireless communications scheme is reasonable and feasible. The wireless communications technology can meet the future spacecraft demands in networking, modularization and non-cable.

Keywords: CCSDS standards, information flow, non-cable, spacecraft, wireless communications.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 940

7283 The Use of Artificial Neural Network in Option Pricing: The Case of S and P 100 Index Options

Authors: Zeynep İltüzer Samur, Gül Tekin Temur

Abstract:

Due to the increasing and varying risks that economic units face with, derivative instruments gain substantial importance, and trading volumes of derivatives have reached very significant level. Parallel with these high trading volumes, researchers have developed many different models. Some are parametric, some are nonparametric. In this study, the aim is to analyse the success of artificial neural network in pricing of options with S&P 100 index options data. Generally, the previous studies cover the data of European type call options. This study includes not only European call option but also American call and put options and European put options. Three data sets are used to perform three different ANN models. One only includes data that are directly observed from the economic environment, i.e. strike price, spot price, interest rate, maturity, type of the contract. The others include an extra input that is not an observable data but a parameter, i.e. volatility. With these detail data, the performance of ANN in put/call dimension, American/European dimension, moneyness dimension is analyzed and whether the contribution of the volatility in neural network analysis make improvement in prediction performance or not is examined. The most striking results revealed by the study is that ANN shows better performance when pricing call options compared to put options; and the use of volatility parameter as an input does not improve the performance.

Keywords: Option Pricing, Neural Network, S&P 100 Index, American/European options

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3084

7282 Totally Integrated Smart Energy System through Data Acquisition via Remote Location

Authors: Muhammad Tahir Qadri, M. Irfan Anis, M. Nawaz Irshad Khan

Abstract:

This paper discusses the approach of real-time controlling of the energy management system using the data acquisition tool of LabVIEW. The main idea of this inspiration was to interface the Station (PC) with the system and publish the data on internet using LabVIEW. In this venture, controlling and switching of 3 phase AC loads are effectively and efficiently done. The phases are also sensed through devices. In case of any failure the attached generator starts functioning automatically. The computer sends command to the system and system respond to the request. The modern feature is to access and control the system world-wide using world wide web (internet). This controlling can be done at any time from anywhere to effectively use the energy especially in developing countries where energy management is a big problem. In this system totally integrated devices are used to operate via remote location.

Keywords: VI-server, Remote Access, Telemetry, Data Acquisition, web server.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1879

7281 Preservation of Molecular Ozone in a Clathrate Hydrate : Three-Phase (Gas + Liquid + Hydrate) Equilibrium Measurements for O3 + O2 + CO2 + H2O Systems

Authors: Kazutoshi Shishido, Sanehiro Muromachi, Ryo Ohmura

Abstract:

This paper reports the three-phase (gas + liquid + hydrate) equilibrium pressure versus temperature data for a (O3 + O2 + CO2 + H2O) system for developing the hydrate-based technology to preserve ozone, a chemically unstable substance, for various industrial, medical and consumer uses. These data cover the temperature range from 272 K to 277 K, corresponding to pressures from 1.6 MPa to 3.1 MPa, for each of the three different (O3 + O2)-to-CO2 or O2-to-CO2 molar ratios in the gas phase, which are approximately 4 : 6, 5 : 5, respectively. The mole fraction of ozone in the gas phase was ~0.03 , which are the densest ozone fraction to artificially form O3 containing hydrate ever reported in the literature. Based on these data, the formation of hydrate containing high-concentration ozone, as high as 1 mass %, will be expected.

Keywords: Clathrate hydrate, Ozone, Molecule storage, Sterilization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1508

7280 A Fuzzy TOPSIS Based Model for Safety Risk Assessment of Operational Flight Data

Authors: N. Borjalilu, P. Rabiei, A. Enjoo

Abstract:

Flight Data Monitoring (FDM) program assists an operator in aviation industries to identify, quantify, assess and address operational safety risks, in order to improve safety of flight operations. FDM is a powerful tool for an aircraft operator integrated into the operator’s Safety Management System (SMS), allowing to detect, confirm, and assess safety issues and to check the effectiveness of corrective actions, associated with human errors. This article proposes a model for safety risk assessment level of flight data in a different aspect of event focus based on fuzzy set values. It permits to evaluate the operational safety level from the point of view of flight activities. The main advantages of this method are proposed qualitative safety analysis of flight data. This research applies the opinions of the aviation experts through a number of questionnaires Related to flight data in four categories of occurrence that can take place during an accident or an incident such as: Runway Excursions (RE), Controlled Flight Into Terrain (CFIT), Mid-Air Collision (MAC), Loss of Control in Flight (LOC-I). By weighting each one (by F-TOPSIS) and applying it to the number of risks of the event, the safety risk of each related events can be obtained.

Keywords: F-TOPSIS, fuzzy set, FDM, flight safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 887

7279 Time Series Forecasting Using a Hybrid RBF Neural Network and AR Model Based On Binomial Smoothing

Authors: Fengxia Zheng, Shouming Zhong

Abstract:

ANNARIMA that combines both autoregressive integrated moving average (ARIMA) model and artificial neural network (ANN) model is a valuable tool for modeling and forecasting nonlinear time series, yet the over-fitting problem is more likely to occur in neural network models. This paper provides a hybrid methodology that combines both radial basis function (RBF) neural network and auto regression (AR) model based on binomial smoothing (BS) technique which is efficient in data processing, which is called BSRBFAR. This method is examined by using the data of Canadian Lynx data. Empirical results indicate that the over-fitting problem can be eased using RBF neural network based on binomial smoothing which is called BS-RBF, and the hybrid model–BS-RBFAR can be an effective way to improve forecasting accuracy achieved by BSRBF used separately.

Keywords: Binomial smoothing (BS), hybrid, Canadian Lynx data, forecasting accuracy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3686

7278 Context Aware Lightweight Energy Efficient Framework

Authors: D. Sathan, A. Meetoo, R. K. Subramaniam

Abstract:

Context awareness is a capability whereby mobile computing devices can sense their physical environment and adapt their behavior accordingly. The term context-awareness, in ubiquitous computing, was introduced by Schilit in 1994 and has become one of the most exciting concepts in early 21st-century computing, fueled by recent developments in pervasive computing (i.e. mobile and ubiquitous computing). These include computing devices worn by users, embedded devices, smart appliances, sensors surrounding users and a variety of wireless networking technologies. Context-aware applications use context information to adapt interfaces, tailor the set of application-relevant data, increase the precision of information retrieval, discover services, make the user interaction implicit, or build smart environments. For example: A context aware mobile phone will know that the user is currently in a meeting room, and reject any unimportant calls. One of the major challenges in providing users with context-aware services lies in continuously monitoring their contexts based on numerous sensors connected to the context aware system through wireless communication. A number of context aware frameworks based on sensors have been proposed, but many of them have neglected the fact that monitoring with sensors imposes heavy workloads on ubiquitous devices with limited computing power and battery. In this paper, we present CALEEF, a lightweight and energy efficient context aware framework for resource limited ubiquitous devices.

Keywords: Context-Aware, Energy-Efficient, Lightweight, Ubiquitous Devices.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1947

7277 Unsupervised Texture Classification and Segmentation

Authors: V.P.Subramanyam Rallabandi, S.K.Sett

Abstract:

An unsupervised classification algorithm is derived by modeling observed data as a mixture of several mutually exclusive classes that are each described by linear combinations of independent non-Gaussian densities. The algorithm estimates the data density in each class by using parametric nonlinear functions that fit to the non-Gaussian structure of the data. This improves classification accuracy compared with standard Gaussian mixture models. When applied to textures, the algorithm can learn basis functions for images that capture the statistically significant structure intrinsic in the images. We apply this technique to the problem of unsupervised texture classification and segmentation.

Keywords: Gaussian Mixture Model, Independent Component Analysis, Segmentation, Unsupervised Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591

7276 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

Abstract:

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Keywords: Daily activity recognition, healthcare, IoT sensors, transfer learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 892

7275 Visualization of Sediment Thickness Variation for Sea Bed Logging using Spline Interpolation

Authors: Hanita Daud, Noorhana Yahya, Vijanth Sagayan, Muizuddin Talib

Abstract:

This paper discusses on the use of Spline Interpolation and Mean Square Error (MSE) as tools to process data acquired from the developed simulator that shall replicate sea bed logging environment. Sea bed logging (SBL) is a new technique that uses marine controlled source electromagnetic (CSEM) sounding technique and is proven to be very successful in detecting and characterizing hydrocarbon reservoirs in deep water area by using resistivity contrasts. It uses very low frequency of 0.1Hz to 10 Hz to obtain greater wavelength. In this work the in house built simulator was used and was provided with predefined parameters and the transmitted frequency was varied for sediment thickness of 1000m to 4000m for environment with and without hydrocarbon. From series of simulations, synthetics data were generated. These data were interpolated using Spline interpolation technique (degree of three) and mean square error (MSE) were calculated between original data and interpolated data. Comparisons were made by studying the trends and relationship between frequency and sediment thickness based on the MSE calculated. It was found that the MSE was on increasing trends in the set up that has the presence of hydrocarbon in the setting than the one without. The MSE was also on decreasing trends as sediment thickness was increased and with higher transmitted frequency.

Keywords: Spline Interpolation, Mean Square Error, Sea Bed Logging, Controlled Source Electromagnetic

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1656

7274 The Consumer Private Space: What is and How it can be Approached without Affecting the Consumer's Privacy

Authors: Calin Veghes

Abstract:

The concept of privacy, seen in connection to the consumer's private space and personalization, has recently gained a higher importance as a consequence of the increasing marketing efforts of the organizations based on the capturing, processing and usage of consumer-s personal data.Paper intends to provide a definition of the consumer-s private space based on the types of personal data the consumer is willing to disclose, to assess the attitude toward personalization and to identify the means preferred by consumers to control their personal data and defend their private space. Several implications generated through the definition of the consumer-s private space are identified and weighted from both the consumers- and organizations- perspectives.

Keywords: Consumer private space, personalization, privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1566

7273 Quantifying the Methods of Monitoring Timers in Electric Water Heater for Grid Balancing on Demand Side Management: A Systematic Mapping Review

Authors: Yamamah Abdulrazaq, Lahieb A. Abrahim, Samuel E. Davies, Iain Shewring

Abstract:

Electric water heater (EWH) is a powerful appliance that uses electricity in residential, commercial, and industrial settings, and the ability to control them properly will result in cost savings and the prevention of blackouts on the national grid. This article discusses the usage of timers in EWH control strategies for demand-side management (DSM). To the authors' knowledge, there is no systematic mapping review focusing on the utilization of EWH control strategies in DSM has yet been conducted. Consequently, the purpose of this research is to identify and examine main papers exploring EWH procedures in DSM by quantifying and categorizing information with regard to publication year and source, kind of methods, and source of data for monitoring control techniques. In order to answer the research questions, a total of 31 publications published between 1999 and 2023 were selected depending on specific inclusion and exclusion criteria. The data indicate that direct load control (DLC) has been somewhat more prevalent than indirect load control (ILC). Additionally, the mix method is much lower than the other techniques, and the proportion of real-time data (RTD) to non-real-time data (NRTD) is about equal.

Keywords: Demand side management, direct load control, electric water heater, indirect load control, non-real-time data, real time data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 112

7272 A Comprehensive Evaluation of Supervised Machine Learning for the Phase Identification Problem

Authors: Brandon Foggo, Nanpeng Yu

Abstract:

Power distribution circuits undergo frequent network topology changes that are often left undocumented. As a result, the documentation of a circuit’s connectivity becomes inaccurate with time. The lack of reliable circuit connectivity information is one of the biggest obstacles to model, monitor, and control modern distribution systems. To enhance the reliability and efficiency of electric power distribution systems, the circuit’s connectivity information must be updated periodically. This paper focuses on one critical component of a distribution circuit’s topology - the secondary transformer to phase association. This topology component describes the set of phase lines that feed power to a given secondary transformer (and therefore a given group of power consumers). Finding the documentation of this component is call Phase Identification, and is typically performed with physical measurements. These measurements can take time lengths on the order of several months, but with supervised learning, the time length can be reduced significantly. This paper compares several such methods applied to Phase Identification for a large range of real distribution circuits, describes a method of training data selection, describes preprocessing steps unique to the Phase Identification problem, and ultimately describes a method which obtains high accuracy (> 96% in most cases, > 92% in the worst case) using only 5% of the measurements typically used for Phase Identification.

Keywords: Distribution network, machine learning, network topology, phase identification, smart grid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1074

7271 Analysis of Temperature Change under Global Warming Impact using Empirical Mode Decomposition

Authors: Md. Khademul Islam Molla, Akimasa Sumi, M. Sayedur Rahman

Abstract:

The empirical mode decomposition (EMD) represents any time series into a finite set of basis functions. The bases are termed as intrinsic mode functions (IMFs) which are mutually orthogonal containing minimum amount of cross-information. The EMD successively extracts the IMFs with the highest local frequencies in a recursive way, which yields effectively a set low-pass filters based entirely on the properties exhibited by the data. In this paper, EMD is applied to explore the properties of the multi-year air temperature and to observe its effects on climate change under global warming. This method decomposes the original time-series into intrinsic time scale. It is capable of analyzing nonlinear, non-stationary climatic time series that cause problems to many linear statistical methods and their users. The analysis results show that the mode of EMD presents seasonal variability. The most of the IMFs have normal distribution and the energy density distribution of the IMFs satisfies Chi-square distribution. The IMFs are more effective in isolating physical processes of various time-scales and also statistically significant. The analysis results also show that the EMD method provides a good job to find many characteristics on inter annual climate. The results suggest that climate fluctuations of every single element such as temperature are the results of variations in the global atmospheric circulation.

Keywords: Empirical mode decomposition, instantaneous frequency, Hilbert spectrum, Chi-square distribution, anthropogenic impact.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2149

7270 VaR Forecasting in Times of Increased Volatility

Authors: Ivo Jánský, Milan Rippel

Abstract:

The paper evaluates several hundred one-day-ahead VaR forecasting models in the time period between the years 2004 and 2009 on data from six world stock indices - DJI, GSPC, IXIC, FTSE, GDAXI and N225. The models model mean using the ARMA processes with up to two lags and variance with one of GARCH, EGARCH or TARCH processes with up to two lags. The models are estimated on the data from the in-sample period and their forecasting accuracy is evaluated on the out-of-sample data, which are more volatile. The main aim of the paper is to test whether a model estimated on data with lower volatility can be used in periods with higher volatility. The evaluation is based on the conditional coverage test and is performed on each stock index separately. The primary result of the paper is that the volatility is best modelled using a GARCH process and that an ARMA process pattern cannot be found in analyzed time series.

Keywords: VaR, risk analysis, conditional volatility, garch, egarch, tarch, moving average process, autoregressive process

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1429

7269 A Weighted Sum Technique for the Joint Optimization of Performance and Power Consumption in Data Centers

Authors: Samee Ullah Khan, C.Ardil

Abstract:

With data centers, end-users can realize the pervasiveness of services that will be one day the cornerstone of our lives. However, data centers are often classified as computing systems that consume the most amounts of power. To circumvent such a problem, we propose a self-adaptive weighted sum methodology that jointly optimizes the performance and power consumption of any given data center. Compared to traditional methodologies for multi-objective optimization problems, the proposed self-adaptive weighted sum technique does not rely on a systematical change of weights during the optimization procedure. The proposed technique is compared with the greedy and LR heuristics for large-scale problems, and the optimal solution for small-scale problems implemented in LINDO. the experimental results revealed that the proposed selfadaptive weighted sum technique outperforms both of the heuristics and projects a competitive performance compared to the optimal solution.

Keywords: Meta-heuristics, distributed systems, adaptive methods, resource allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1836

7268 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Crops

Authors: M. M. Ali, Ahmed Al-Ani, Derek Eamus, Daniel K. Y. Tan

Abstract:

In this glasshouse study, we developed a new imagebased non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. The plants were grown on a nutrient solution containing different P concentrations, e.g. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P). After 7 weeks of treatment, the plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. These data were further used in linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using leaf image and morphological data. Our proposed nondestructive imaging method is precise in estimating P requirements of different crop species.

Keywords: Image-based techniques, leaf area, leaf P contents, linear discriminant analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649

7267 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: Microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1284

7266 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

The problems arising from unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors’ nonparametric discriminant analysis is a method that was proposed for classifying unbalanced classes with good performance. In this study, the methods of discriminant analysis are of interest in investigating misclassification error rates for classimbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data including the variables of diabetes risk group, age, gender, blood glucose, and BMI were analyzed and bootstrapped for 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. Searching the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions of (0.90:0.05:0.05), (0.80: 0.10: 0.10) and (0.70, 0.15, 0.15). The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k=3 or k=4 and the defined prior probabilities of non-risk: risk: diabetic as 0.90: 0.05:0.05 or 0.80:0.10:0.10 gave the smallest error rate of misclassification. The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2008

7265 Localizing and Experiencing Electronic Questionnaires in an Educational Web Site

Authors: Theodore H. Kaskalis

Abstract:

One of the main research methods in humanistic studies is the collection and process of data through questionnaires. This paper reports our experiences of localizing and adapting the phpESP package of electronic surveys, which led to a friendly on-line questionnaire environment offered through our department web site. After presenting the characteristics of this environment, we identify the expected benefits and present a questionnaire carried out through both the traditional and electronic way. We present the respondents' feedback and then we report the researchers' opinions.Finally, we propose ideas we intend to implement in order to further assist and enhance the research based on this web accessed,electronic questionnaire environment.

Keywords: Electronic questionnaires, Computer assisted webinterviewing, Survey data collection, Survey data visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1286

7264 Design and Implementation of Security Middleware for Data Warehouse Signature Framework

Authors: Mayada AlMeghari

Abstract:

Recently, grid middlewares have provided large integrated use of network resources as the shared data and the CPU to become a virtual supercomputer. In this work, we present the design and implementation of the middleware for Data Warehouse Signature (DWS) Framework. The aim of using the middleware in the proposed DWS framework is to achieve the high performance by the parallel computing. This middleware is developed on Alchemi.Net framework to increase the security among the network nodes through the authentication and group-key distribution model. This model achieves the key security and prevents any intermediate attacks in the middleware. This paper presents the flow process structures of the middleware design. In addition, the paper ensures the implementation of security for DWS middleware enhancement with the authentication and group-key distribution model. Finally, from the analysis of other middleware approaches, the developed middleware of DWS framework is the optimal solution of a complete covering of security issues.

Keywords: Middleware, parallel computing, data warehouse, security, group-key, high performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 337

7263 Re-Optimization MVPP Using Common Subexpression for Materialized View Selection

Authors: Boontita Suchyukorn, Raweewan Auepanwiriyakul

Abstract:

A Data Warehouses is a repository of information integrated from source data. Information stored in data warehouse is the form of materialized in order to provide the better performance for answering the queries. Deciding which appropriated views to be materialized is one of important problem. In order to achieve this requirement, the constructing search space close to optimal is a necessary task. It will provide effective result for selecting view to be materialized. In this paper we have proposed an approach to reoptimize Multiple View Processing Plan (MVPP) by using global common subexpressions. The merged queries which have query processing cost not close to optimal would be rewritten. The experiment shows that our approach can help to improve the total query processing cost of MVPP and sum of query processing cost and materialized view maintenance cost is reduced as well after views are selected to be materialized.

Keywords: Data Warehouse, materialized views, query rewriting, common subexpressions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1678

7262 A Descriptive Study on Psychiatric Morbidity among Nurses Working in Selected Hospitals of Udupi and Mangalore Districts Karnataka, India

Authors: Tessy Treesa Jose, Sripathy M. Bhat

Abstract:

Nursing is recognized as a stressful occupation and has indicated a probable high prevalence of distress. It is a helping profession requiring a high degree of commitment and involvement. If stress is intense, continuous and repeated, it becomes a negative phenomenon or "distress," which can lead to physical illness and psychological disorders. The frequency of common psychosomatic symptoms including sleeping problems, tension headache, chronic fatigue, palpitation etc. may be an indicator of nurses’ work-related stress level. Objectives of the study were to determine psychiatric morbidity among nurses and to find its association with selected variables. The study population consisted of 1040 registered nurses working in selected medical college hospitals and government hospitals of Udupi and Mangalore districts. Descriptive survey design was used to conduct the study. Subjects were selected by using purposive sampling. Data were gathered by administering background proforma and General Health questionnaire. Severe distress was experienced by 0.9% of nurses and 5.6% had some evidence of distress. Subjects who did not have any distress were 93.5%. No significant association between psychiatric morbidity in nurses and demographic variables was observed. With regard to work variables significant association is observed between psychiatric morbidity and total years of experience (z=10.67, p=0.03) and experience in current area of work (z=9.43, p=0.02).

Keywords: Psychiatric morbidity, nurse, selected hospitals, working.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1224

7261 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3545

7260 Theoretical Investigation of Carbazole-Based D-D-π-A Organic Dyes for Efficient Dye-Sensitized Solar Cell

Authors: S. Jungsuttiwong, R. Tarsang, S. Pansay, T. Yakhantip, V. Promarak, T. Sudyoadsuk, T. Kaewin, S. Saengsuwan, S. Namuangrak

Abstract:

In this paper, four carbazole-based D-D-π-A organic dyes code as CCT2A, CCT3A, CCT1PA and CCT2PA were reported. A series of these organic dyes containing identical donor and acceptor group but different π-system. The effect of replacing of thiophene by phenyl thiophene as π-system on the physical properties has been focused. The structural, energetic properties and absorption spectra were theoretically investigated by means of Density Functional Theory (DFT) and Time-Dependent Density Functional Theory (TD-DFT). The results show that nonplanar conformation due to steric hindrance in donor part (cabazolecarbazole unit) of dye molecule can prevent unfavorable dye aggregation. By means of the TD-DFT method, the absorption spectra were calculated by B3LYP and BHandHLYP to study the affect of hybrid functional on the excitation energy (Eg). The results revealed the increasing of thiophene units not only resulted in decreasing of Eg, but also found the shifting of absorption spectra to higher wavelength. TD-DFT/BHandHLYP calculated results are more strongly agreed with the experimental data than B3LYP functions. Furthermore, the adsorptions of CCT2A and CCT3A on the TiO2 anatase (101) surface were carried out by mean of the chemical periodic calculation. The result exhibit the strong adsorption energy. The calculated results provide our new organic dyes can be effectively used as dye for Dye Sensitized Solar Cell (DSC).

Keywords: Dye-Sensitized Solar cell, Carbarzole, TD-DFT, D-D-π-A organic dye

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5153

7259 Positioning a Southern Inclusive Framework Embedded in the Social Model of Disability Theory Contextualized for Guyana

Authors: Lidon Lashley

Abstract:

This paper presents how the social model of disability can be used to reshape inclusive education practices in Guyana. Inclusive education in Guyana is metamorphosizing but still firmly held in the tenets of the Medical Model of Disability which influences the experiences of children with Special Education Needs and/or Disabilities (SEN/D). An ethnographic approach to data gathering was employed in this study. Qualitative data were gathered from the voices of children with and without SEN/D as well as their mainstream teachers to present the interplay of discourses and subjectivities in the situation. The data were analyzed using Adele Clarke's situational analysis. The data suggest that it is possible but will be challenging to fully contextualize and adopt Loreman's synthesis and Booths and Ainscow's Index in the two mainstream schools studied. In addition, the data paved the way for the presentation of the 'Southern Inclusive Education Framework for Guyana' and its support tool 'The Inclusive Checker created for Southern mainstream primary classrooms'.

Keywords: Social Model of Disability, Medical Model of Disability, subjectivities, metamorphosis, special education needs, postcolonial Guyana, Quasi-inclusion practices, Guyanese cultural challenges, mainstream primary schools, Loreman's Synthesis, Booths and Ainscow's Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 636

7258 Integration of Microarray Data into a Genome-Scale Metabolic Model to Study Flux Distribution after Gene Knockout

Authors: Mona Heydari, Ehsan Motamedian, Seyed Abbas Shojaosadati

Abstract:

Prediction of perturbations after genetic manipulation (especially gene knockout) is one of the important challenges in systems biology. In this paper, a new algorithm is introduced that integrates microarray data into the metabolic model. The algorithm was used to study the change in the cell phenotype after knockout of Gss gene in Escherichia coli BW25113. Algorithm implementation indicated that gene deletion resulted in more activation of the metabolic network. Growth yield was more and less regulating gene were identified for mutant in comparison with the wild-type strain.

Keywords: Metabolic network, gene knockout, flux balance analysis, microarray data, integration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 996

7257 Using Genetic Programming to Evolve a Team of Data Classifiers

Authors: Gregor A. Morrison, Dominic P. Searson, Mark J. Willis

Abstract:

The purpose of this paper is to demonstrate the ability of a genetic programming (GP) algorithm to evolve a team of data classification models. The GP algorithm used in this work is “multigene" in nature, i.e. there are multiple tree structures (genes) that are used to represent team members. Each team member assigns a data sample to one of a fixed set of output classes. A majority vote, determined using the mode (highest occurrence) of classes predicted by the individual genes, is used to determine the final class prediction. The algorithm is tested on a binary classification problem. For the case study investigated, compact classification models are obtained with comparable accuracy to alternative approaches.

Keywords: classification, genetic programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1782