Search results for: Statistical Data Analysis.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13685

Search results for: Statistical Data Analysis.

13655 Automated Process Quality Monitoring with Prediction of Fault Condition Using Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events is important to improve safety and reliability of machine operations and reduce losses caused by failures. Improper set-ups or aligning of parts often leads to severe problems in many machines. The construction of prediction models for predicting faulty conditions is quite essential in making decisions on when to perform machine maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of machine measurement data. The calibration model is used to predict two faulty conditions from historical reference data. This approach utilizes genetic algorithms (GA) based variable selection, and we evaluate the predictive performance of several prediction methods using real data. The results shows that the calibration model based on supervised probabilistic principal component analysis (SPPCA) yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: Prediction, operation monitoring, on-line data, nonlinear statistical methods, empirical model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1623
13654 The Relationships between Market Orientation and Competitiveness of Companies in Banking Sector

Authors: P. Jangl, M. Mikuláštík

Abstract:

The objective of the paper is to measure and compare market orientation of Swiss and Czech banks, as well as examine statistically the degree of influence it has on competitiveness of the institutions. The analysis of market orientation is based on the collecting, analysis and correct interpretation of the data. Descriptive analysis of market orientation describe current situation. Research of relation of competitiveness and market orientation in the sector of big international banks is suggested with the expectation of existence of a strong relationship. Partially, the work served as reconfirmation of suitability of classic methodologies to measurement of banks’ market orientation.

Two types of data were gathered. Firstly, by measuring subjectively perceived market orientation of a company and secondly, by quantifying its competitiveness. All data were collected from a sample of small, mid-sized and large banks. We used numerical secondary character data from the international statistical financial Bureau Van Dijk’s BANKSCOPE database.

 Statistical analysis led to the following results. Assuming classical market orientation measures to be scientifically justified, Czech banks are statistically less market-oriented than Swiss banks. Secondly, among small Swiss banks, which are not broadly internationally active, small relationship exist between market orientation measures and market share based competitiveness measures. Thirdly, among all Swiss banks, a strong relationship exists between market orientation measures and market share based competitiveness measures. Above results imply existence of a strong relation of this measure in sector of big international banks. A strong statistical relationship has been proven to exist between market orientation measures and equity/total assets ratio in Switzerland.

Keywords: Market Orientation, Competitiveness, Marketing Strategy, Measurement of Market Orientation, Relation between Market Orientation and Competitiveness, Banking Sector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2726
13653 The Wijma Delivery Expectancy/Experience Questionnaire (W-DEQ) with Turkish Sample: Confirmatory and Exploratory Factor Analysis

Authors: Oznur Korukcu, Kamile Kukulu, Mehmet Z. Firat

Abstract:

The propose of this study is to investigate the factor structures of the W-DEQ, originally developed on UK and Swedish women, were confirmed in Turkish samples, and to obtain a new modified factor structure appropriate to Turkish culture. Statistical analyses of the data obtained were performed using SPSS© for Windows version 13.0 and the SAS statistical software Version 9.1. Both confirmatory and exploratory factor analysis of W-DEQ were performed in the study. Factor analysis yielded four factors related to hope, fear, lack of positive anticipation and riskiness. The alpha estimates of the total W-DEQ score were somewhat higher, being 0.92 for the parous and 0.90 for the nulliparous sample. These are well above the accepted limit of 0.70 and indicate excellent levels of internal reliability, thus showing that the questions were appropriate to the Turkish culture and useful scale for the evaluation of fear of childbirth in Turkish pregnants.

Keywords: Confirmatory factor analysis, cross-cultural research, exploratory factor analysis, fear of childbirth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5069
13652 Wind Farm Power Performance Verification Using Non-Parametric Statistical Inference

Authors: M. Celeska, K. Najdenkoski, V. Dimchev, V. Stoilkov

Abstract:

Accurate determination of wind turbine performance is necessary for economic operation of a wind farm. At present, the procedure to carry out the power performance verification of wind turbines is based on a standard of the International Electrotechnical Commission (IEC). In this paper, nonparametric statistical inference is applied to designing a simple, inexpensive method of verifying the power performance of a wind turbine. A statistical test is explained, examined, and the adequacy is tested over real data. The methods use the information that is collected by the SCADA system (Supervisory Control and Data Acquisition) from the sensors embedded in the wind turbines in order to carry out the power performance verification of a wind farm. The study has used data on the monthly output of wind farm in the Republic of Macedonia, and the time measuring interval was from January 1, 2016, to December 31, 2016. At the end, it is concluded whether the power performance of a wind turbine differed significantly from what would be expected. The results of the implementation of the proposed methods showed that the power performance of the specific wind farm under assessment was acceptable.

Keywords: Canonical correlation analysis, power curve, power performance, wind energy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 967
13651 Data Mining on the Router Logs for Statistical Application Classification

Authors: M. Rahmati, S.M. Mirzababaei

Abstract:

With the advance of information technology in the new era the applications of Internet to access data resources has steadily increased and huge amount of data have become accessible in various forms. Obviously, the network providers and agencies, look after to prevent electronic attacks that may be harmful or may be related to terrorist applications. Thus, these have facilitated the authorities to under take a variety of methods to protect the special regions from harmful data. One of the most important approaches is to use firewall in the network facilities. The main objectives of firewalls are to stop the transfer of suspicious packets in several ways. However because of its blind packet stopping, high process power requirements and expensive prices some of the providers are reluctant to use the firewall. In this paper we proposed a method to find a discriminate function to distinguish between usual packets and harmful ones by the statistical processing on the network router logs. By discriminating these data, an administrator may take an approach action against the user. This method is very fast and can be used simply in adjacent with the Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packetclassification, Statistical Pattern Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1604
13650 Statistical and Land Planning Study of Tourist Arrivals in Greece during 2005-2016

Authors: Dimitra Alexiou

Abstract:

During the last 10 years, in spite of the economic crisis, the number of tourists arriving in Greece has increased, particularly during the tourist season from April to October. In this paper, the number of annual tourist arrivals is studied to explore their preferences with regard to the month of travel, the selected destinations, as well the amount of money spent. The collected data are processed with statistical methods, yielding numerical and graphical results. From the computation of statistical parameters and the forecasting with exponential smoothing, useful conclusions are arrived at that can be used by the Greek tourism authorities, as well as by tourist organizations, for planning purposes for the coming years. The results of this paper and the computed forecast can also be used for decision making by private tourist enterprises that are investing in Greece. With regard to the statistical methods, the method of Simple Exponential Smoothing of time series of data is employed. The search for a best forecast for 2017 and 2018 provides the value of the smoothing coefficient. For all statistical computations and graphics Microsoft Excel is used.

Keywords: Tourism, statistical methods, exponential smoothing, land spatial planning, economy, Microsoft Excel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 661
13649 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics

Authors: Farhad Asadi, Mohammad Javad Mollakazemi

Abstract:

In this paper, Bayesian online inference in models of data series are constructed by change-points algorithm, which separated the observed time series into independent series and study the change and variation of the regime of the data with related statistical characteristics. variation of statistical characteristics of time series data often represent separated phenomena in the some dynamical system, like a change in state of brain dynamical reflected in EEG signal data measurement or a change in important regime of data in many dynamical system. In this paper, prediction algorithm for studying change point location in some time series data is simulated. It is verified that pattern of proposed distribution of data has important factor on simpler and smother fluctuation of hazard rate parameter and also for better identification of change point locations. Finally, the conditions of how the time series distribution effect on factors in this approach are explained and validated with different time series databases for some dynamical system.

Keywords: Time series, fluctuation in statistical characteristics, optimal learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1762
13648 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis

Authors: Reza Nadimi, Fariborz Jolai

Abstract:

This article combines two techniques: data envelopment analysis (DEA) and Factor analysis (FA) to data reduction in decision making units (DMU). Data envelopment analysis (DEA), a popular linear programming technique is useful to rate comparatively operational efficiency of decision making units (DMU) based on their deterministic (not necessarily stochastic) input–output data and factor analysis techniques, have been proposed as data reduction and classification technique, which can be applied in data envelopment analysis (DEA) technique for reduction input – output data. Numerical results reveal that the new approach shows a good consistency in ranking with DEA.

Keywords: Effectiveness, Decision Making, Data EnvelopmentAnalysis, Factor Analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2380
13647 Time-Domain Analysis of Pulse Parameters Effects on Crosstalk (In High Speed Circuits)

Authors: L. Tani, N. El Ouzzani

Abstract:

Crosstalk among interconnects and printed-circuit board (PCB) traces is a major limiting factor of signal quality in highspeed digital and communication equipments especially when fast data buses are involved. Such a bus is considered as a planar multiconductor transmission line. This paper will demonstrate how the finite difference time domain (FDTD) method provides an exact solution of the transmission-line equations to analyze the near end and the far end crosstalk. In addition, this study makes it possible to analyze the rise time effect on the near and far end voltages of the victim conductor. The paper also discusses a statistical analysis, based upon a set of several simulations. Such analysis leads to a better understanding of the phenomenon and yields useful information.

Keywords: Multiconductor transmission line, Crosstalk, Finite difference time domain (FDTD), printed-circuit board (PCB), Rise time, Statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730
13646 Immobilization of Lipase Enzyme by Low Cost Material: A Statistical Approach

Authors: Md. Z. Alam, Devi R. Asih, Md. N. Salleh

Abstract:

Immobilization of lipase enzyme produced from palm oil mill effluent (POME) by the activated carbon (AC) among the low cost support materials was optimized. The results indicated that immobilization of 94% was achieved by AC as the most suitable support material. A sequential optimization strategy based on a statistical experimental design, including one-factor-at-a-time (OFAT) method was used to determine the equilibrium time. Three components influencing lipase immobilization were optimized by the response surface methodology (RSM) based on the face-centered central composite design (FCCCD). On the statistical analysis of the results, the optimum enzyme concentration loading, agitation rate and carbon active dosage were found to be 30 U/ml, 300 rpm and 8 g/L respectively, with a maximum immobilization activity of 3732.9 U/g-AC after 2 hrs of immobilization. Analysis of variance (ANOVA) showed a high regression coefficient (R2) of 0.999, which indicated a satisfactory fit of the model with the experimental data. The parameters were statistically significant at p<0.05.

Keywords: Activated carbon, adsorption, immobilization, POME based lipase.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2521
13645 Series-Parallel Systems Reliability Optimization Using Genetic Algorithm and Statistical Analysis

Authors: Essa Abrahim Abdulgader Saleem, Thien-My Dao

Abstract:

The main objective of this paper is to optimize series-parallel system reliability using Genetic Algorithm (GA) and statistical analysis; considering system reliability constraints which involve the redundant numbers of selected components, total cost, and total weight. To perform this work, firstly the mathematical model which maximizes system reliability subject to maximum system cost and maximum system weight constraints is presented; secondly, a statistical analysis is used to optimize GA parameters, and thirdly GA is used to optimize series-parallel systems reliability. The objective is to determine the strategy choosing the redundancy level for each subsystem to maximize the overall system reliability subject to total cost and total weight constraints. Finally, the series-parallel system case study reliability optimization results are showed, and comparisons with the other previous results are presented to demonstrate the performance of our GA.

Keywords: Genetic algorithm, optimization, reliability, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1108
13644 Reliability of Digital FSO Links in Europe

Authors: Zdenek Kolka, Otakar Wilfert, Viera Biolkova

Abstract:

The paper deals with an analysis of visibility records collected from 210 European airports to obtain a realistic estimation of the availability of Free Space Optical (FSO) data links. Commercially available optical links usually operate in the 850nm waveband. Thus the influence of the atmosphere on the optical beam and on the visible light is similar. Long-term visibility records represent an invaluable source of data for the estimation of the quality of service of FSO links. The model used characterizes both the statistical properties of fade depths and the statistical properties of individual fade durations. Results are presented for Italy, France, and Germany.

Keywords: Computer networks, free-space optical links, meteorology, quality of service.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2095
13643 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand, and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: Cluster analysis, multivariate statistical technique, river Hindon, water Quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3760
13642 Prediction of Basic Wind Speed for Ayeyarwady

Authors: Chaw Su Mon

Abstract:

Abstract— The paper presents a preliminary study on modeling and estimation of basic wind speed ( extreme wind gusts ) for the consideration of vulnerability and design of building in Ayeyarwady Region. The establishment of appropriate design wind speeds is a critical step towards the calculation of design wind loads for structures. In this paper the extreme value analysis of this prediction work is based on the anemometer data (1970-2009) maintained by the department of meteorology and hydrology of Pathein. Statistical and probabilistic approaches are used to derive formulas for estimating 3-second gusts from recorded data (10-minute sustained mean wind speeds).

Keywords: Basic Wind Speed, Building, Gusts, Statistical and probabilistic approaches

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1235
13641 Visual-Graphical Methods for Exploring Longitudinal Data

Authors: H. W. Ker

Abstract:

Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.

Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2045
13640 Tidal Data Analysis using ANN

Authors: Ritu Vijay, Rekha Govil

Abstract:

The design of a complete expansion that allows for compact representation of certain relevant classes of signals is a central problem in signal processing applications. Achieving such a representation means knowing the signal features for the purpose of denoising, classification, interpolation and forecasting. Multilayer Neural Networks are relatively a new class of techniques that are mathematically proven to approximate any continuous function arbitrarily well. Radial Basis Function Networks, which make use of Gaussian activation function, are also shown to be a universal approximator. In this age of ever-increasing digitization in the storage, processing, analysis and communication of information, there are numerous examples of applications where one needs to construct a continuously defined function or numerical algorithm to approximate, represent and reconstruct the given discrete data of a signal. Many a times one wishes to manipulate the data in a way that requires information not included explicitly in the data, which is done through interpolation and/or extrapolation. Tidal data are a very perfect example of time series and many statistical techniques have been applied for tidal data analysis and representation. ANN is recent addition to such techniques. In the present paper we describe the time series representation capabilities of a special type of ANN- Radial Basis Function networks and present the results of tidal data representation using RBF. Tidal data analysis & representation is one of the important requirements in marine science for forecasting.

Keywords: ANN, RBF, Tidal Data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
13639 On Methodologies for Analysing Sickness Absence Data: An Insight into a New Method

Authors: Xiaoshu Lu, Päivi Leino-Arjas, Kustaa Piha, Akseli Aittomäki, Peppiina Saastamoinen, Ossi Rahkonen, Eero Lahelma

Abstract:

Sickness absence represents a major economic and social issue. Analysis of sick leave data is a recurrent challenge to analysts because of the complexity of the data structure which is often time dependent, highly skewed and clumped at zero. Ignoring these features to make statistical inference is likely to be inefficient and misguided. Traditional approaches do not address these problems. In this study, we discuss model methodologies in terms of statistical techniques for addressing the difficulties with sick leave data. We also introduce and demonstrate a new method by performing a longitudinal assessment of long-term absenteeism using a large registration dataset as a working example available from the Helsinki Health Study for municipal employees from Finland during the period of 1990-1999. We present a comparative study on model selection and a critical analysis of the temporal trends, the occurrence and degree of long-term sickness absences among municipal employees. The strengths of this working example include the large sample size over a long follow-up period providing strong evidence in supporting of the new model. Our main goal is to propose a way to select an appropriate model and to introduce a new methodology for analysing sickness absence data as well as to demonstrate model applicability to complicated longitudinal data.

Keywords: Sickness absence, longitudinal data, methodologies, mix-distribution model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2228
13638 Accelerating Side Channel Analysis with Distributed and Parallelized Processing

Authors: Kyunghee Oh, Dooho Choi

Abstract:

Although there is no theoretical weakness in a cryptographic algorithm, Side Channel Analysis can find out some secret data from the physical implementation of a cryptosystem. The analysis is based on extra information such as timing information, power consumption, electromagnetic leaks or even sound which can be exploited to break the system. Differential Power Analysis is one of the most popular analyses, as computing the statistical correlations of the secret keys and power consumptions. It is usually necessary to calculate huge data and takes a long time. It may take several weeks for some devices with countermeasures. We suggest and evaluate the methods to shorten the time to analyze cryptosystems. Our methods include distributed computing and parallelized processing.

Keywords: DPA, distributed computing, parallelized processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1854
13637 Data Oriented Modeling of Uniform Random Variable: Applied Approach

Authors: Ahmad Habibizad Navin, Mehdi Naghian Fesharaki, Mirkamal Mirnia, Mohamad Teshnelab, Ehsan Shahamatnia

Abstract:

In this paper we introduce new data oriented modeling of uniform random variable well-matched with computing systems. Due to this conformity with current computers structure, this modeling will be efficiently used in statistical inference.

Keywords: Uniform random variable, Data oriented modeling, Statistical inference, Prodigraph, Statistically complete tree, Uniformdigital probability digraph, Uniform n-complete probability tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1582
13636 Evaluation of the Mechanical Behavior of a Retaining Wall Structure on a Weathered Soil through Probabilistic Methods

Authors: P. V. S. Mascarenhas, B. C. P. Albuquerque, D. J. F. Campos, L. L. Almeida, V. R. Domingues, L. C. S. M. Ozelim

Abstract:

Retaining slope structures are increasingly considered in geotechnical engineering projects due to extensive urban cities growth. These kinds of engineering constructions may present instabilities over the time and may require reinforcement or even rebuilding of the structure. In this context, statistical analysis is an important tool for decision making regarding retaining structures. This study approaches the failure probability of the construction of a retaining wall over the debris of an old and collapsed one. The new solution’s extension length will be of approximately 350 m and will be located over the margins of the Lake Paranoá, Brasilia, in the capital of Brazil. The building process must also account for the utilization of the ruins as a caisson. A series of in situ and laboratory experiments defined local soil strength parameters. A Standard Penetration Test (SPT) defined the in situ soil stratigraphy. Also, the parameters obtained were verified using soil data from a collection of masters and doctoral works from the University of Brasília, which is similar to the local soil. Initial studies show that the concrete wall is the proper solution for this case, taking into account the technical, economic and deterministic analysis. On the other hand, in order to better analyze the statistical significance of the factor-of-safety factors obtained, a Monte Carlo analysis was performed for the concrete wall and two more initial solutions. A comparison between the statistical and risk results generated for the different solutions indicated that a Gabion solution would better fit the financial and technical feasibility of the project.

Keywords: Economical analysis, probability of failure, retaining walls, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 980
13635 Analysis of Web User Identification Methods

Authors: Renáta Iváncsy, Sándor Juhász

Abstract:

Web usage mining has become a popular research area, as a huge amount of data is available online. These data can be used for several purposes, such as web personalization, web structure enhancement, web navigation prediction etc. However, the raw log files are not directly usable; they have to be preprocessed in order to transform them into a suitable format for different data mining tasks. One of the key issues in the preprocessing phase is to identify web users. Identifying users based on web log files is not a straightforward problem, thus various methods have been developed. There are several difficulties that have to be overcome, such as client side caching, changing and shared IP addresses and so on. This paper presents three different methods for identifying web users. Two of them are the most commonly used methods in web log mining systems, whereas the third on is our novel approach that uses a complex cookie-based method to identify web users. Furthermore we also take steps towards identifying the individuals behind the impersonal web users. To demonstrate the efficiency of the new method we developed an implementation called Web Activity Tracking (WAT) system that aims at a more precise distinction of web users based on log data. We present some statistical analysis created by the WAT on real data about the behavior of the Hungarian web users and a comprehensive analysis and comparison of the three methods

Keywords: Data preparation, Tracking individuals, Web useridentification, Web usage mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4348
13634 Georgia Case: Tourism Expenses of International Visitors on the Basis of Growing Attractiveness

Authors: Nino Abesadze, Marine Mindorashvili, Nino Paresashvili

Abstract:

At present actual tourism indicators cannot be calculated in Georgia, making it impossible to perform their quantitative analysis. Therefore, the study conducted by us is highly important from a theoretical as well as practical standpoint. The main purpose of the article is to make complex statistical analysis of tourist expenses of foreign visitors and to calculate statistical attractiveness indices of the tourism potential of Georgia. During the research, the method involving random and proportional selection has been applied. Computer software SPSS was used to compute statistical data for corresponding analysis. Corresponding methodology of tourism statistics was implemented according to international standards. Important information was collected and grouped from major Georgian airports, and a representative population of foreign visitors and a rule of selection of respondents were determined. The results show a trend of growth in tourist numbers and the share of tourists from post-soviet countries are constantly increasing. The level of satisfaction with tourist facilities and quality of service has improved, but still we have a problem of disparity between the service quality and the prices. The design of tourist expenses of foreign visitors is diverse; competitiveness of tourist products of Georgian tourist companies is higher. Attractiveness of popular cities of Georgia has increased by 43%.

Keywords: Tourist, expenses, indexes, statistics, analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 874
13633 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2799
13632 Thailand National Biodiversity Database System with webMathematica and Google Earth

Authors: W. Katsarapong, W. Srisang, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

National Biodiversity Database System (NBIDS) has been developed for collecting Thai biodiversity data. The goal of this project is to provide advanced tools for querying, analyzing, modeling, and visualizing patterns of species distribution for researchers and scientists. NBIDS data record two types of datasets: biodiversity data and environmental data. Biodiversity data are specie presence data and species status. The attributes of biodiversity data can be further classified into two groups: universal and projectspecific attributes. Universal attributes are attributes that are common to all of the records, e.g. X/Y coordinates, year, and collector name. Project-specific attributes are attributes that are unique to one or a few projects, e.g., flowering stage. Environmental data include atmospheric data, hydrology data, soil data, and land cover data collecting by using GLOBE protocols. We have developed webbased tools for data entry. Google Earth KML and ArcGIS were used as tools for map visualization. webMathematica was used for simple data visualization and also for advanced data analysis and visualization, e.g., spatial interpolation, and statistical analysis. NBIDS will be used by park rangers at Khao Nan National Park, and researchers.

Keywords: GLOBE protocol, Biodiversity, Database System, ArcGIS, Google Earth and webMathematica.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1930
13631 Analysis of Air Quality in the Outdoor Environment of the City of Messina by an Application of the Pollution Index Method

Authors: G. Cannistraro, L. Ponterio

Abstract:

In this paper is reported an analysis about the outdoor air pollution of the urban centre of the city of Messina. The variations of the most critical pollutants concentrations (PM10, O3, CO, C6H6) and their trends respect of climatic parameters and vehicular traffic have been studied. Linear regressions have been effectuated for representing the relations among the pollutants; the differences between pollutants concentrations on weekend/weekday were also analyzed. In order to evaluate air pollution and its effects on human health, a method for calculating a pollution index was implemented and applied in the urban centre of the city. This index is based on the weighted mean of the most detrimental air pollutants concentrations respect of their limit values for protection of human health. The analyzed data of the polluting substances were collected by the Assessorship of the Environment of the Regional Province of Messina in the year 2004. A statistical analysis of the air quality index trends is also reported.

Keywords: Environmental pollution, Pollutants levels, Linearregression, Air Quality Index, Statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
13630 A Modified AES Based Algorithm for Image Encryption

Authors: M. Zeghid, M. Machhout, L. Khriji, A. Baganne, R. Tourki

Abstract:

With the fast evolution of digital data exchange, security information becomes much important in data storage and transmission. Due to the increasing use of images in industrial process, it is essential to protect the confidential image data from unauthorized access. In this paper, we analyze the Advanced Encryption Standard (AES), and we add a key stream generator (A5/1, W7) to AES to ensure improving the encryption performance; mainly for images characterised by reduced entropy. The implementation of both techniques has been realized for experimental purposes. Detailed results in terms of security analysis and implementation are given. Comparative study with traditional encryption algorithms is shown the superiority of the modified algorithm.

Keywords: Cryptography, Encryption, Advanced EncryptionStandard (AES), ECB mode, statistical analysis, key streamgenerator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4991
13629 Space Telemetry Anomaly Detection Based on Statistical PCA Algorithm

Authors: B. Nassar, W. Hussein, M. Mokhtar

Abstract:

The critical concern of satellite operations is to ensure the health and safety of satellites. The worst case in this perspective is probably the loss of a mission, but the more common interruption of satellite functionality can result in compromised mission objectives. All the data acquiring from the spacecraft are known as Telemetry (TM), which contains the wealth information related to the health of all its subsystems. Each single item of information is contained in a telemetry parameter, which represents a time-variant property (i.e. a status or a measurement) to be checked. As a consequence, there is a continuous improvement of TM monitoring systems to reduce the time required to respond to changes in a satellite's state of health. A fast conception of the current state of the satellite is thus very important to respond to occurring failures. Statistical multivariate latent techniques are one of the vital learning tools that are used to tackle the problem above coherently. Information extraction from such rich data sources using advanced statistical methodologies is a challenging task due to the massive volume of data. To solve this problem, in this paper, we present a proposed unsupervised learning algorithm based on Principle Component Analysis (PCA) technique. The algorithm is particularly applied on an actual remote sensing spacecraft. Data from the Attitude Determination and Control System (ADCS) was acquired under two operation conditions: normal and faulty states. The models were built and tested under these conditions, and the results show that the algorithm could successfully differentiate between these operations conditions. Furthermore, the algorithm provides competent information in prediction as well as adding more insight and physical interpretation to the ADCS operation.

Keywords: Space telemetry monitoring, multivariate analysis, PCA algorithm, space operations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2019
13628 Application of Life Data Analysis for the Reliability Assessment of Numerical Overcurrent Relays

Authors: Mohd Iqbal Ridwan, Kerk Lee Yen, Aminuddin Musa, Bahisham Yunus

Abstract:

Protective relays are components of a protection system in a power system domain that provides decision making element for correct protection and fault clearing operations. Failure of the protection devices may reduce the integrity and reliability of the power system protection that will impact the overall performance of the power system. Hence it is imperative for power utilities to assess the reliability of protective relays to assure it will perform its intended function without failure. This paper will discuss the application of reliability analysis using statistical method called Life Data Analysis in Tenaga Nasional Berhad (TNB), a government linked power utility company in Malaysia, namely Transmission Division, to assess and evaluate the reliability of numerical overcurrent protective relays from two different manufacturers.

Keywords: Life data analysis, Protective relays, Reliability, Weibull Distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3944
13627 Screening of Process Variables for the Production of Extracellular Lipase from Palm Oil by Trichoderma Viride using Plackett-Burman Design

Authors: R. Rajendiran, S. Gayathri devi, B.T. SureshKumar, V. Arul Priya

Abstract:

Plackett-Burman statistical screening of media constituents and operational conditions for extracellular lipase production from isolate Trichoderma viride has been carried out in submerged fermentation. This statistical design is used in the early stages of experimentation to screen out unimportant factors from a large number of possible factors. This design involves screening of up to 'n-1' variables in just 'n' number of experiments. Regression coefficients and t-values were calculated by subjecting the experimental data to statistical analysis using Minitab version 15. The effects of nine process variables were studied in twelve experimental trials. Maximum lipase activity of 7.83 μmol /ml /min was obtained in the 6th trail. Pareto chart illustrates the order of significance of the variables affecting the lipase production. The present study concludes that the most significant variables affecting lipase production were found to be palm oil, yeast extract, K2HPO4, MgSO4 and CaCl2.

Keywords: lipase, submerged fermentation, statistical optimization, Trichoderma viride

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2267
13626 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become the effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we express and compare the performance of several clustering methods based on data preprocessing including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. Consequently, clustering methods which are common used in microarray data analysis are affected by normalization and degree of noise and clearness for datasets.

Keywords: Gene expression, clustering, data preprocessing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1697