Search results for: Statistical data analysis

13680 Part of Speech Tagging Using Statistical Approach for Nepali Text

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: Hidden Markov model, Viterbi algorithm, POS tagging, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1714

13679 Probability Distribution of Rainfall Depth at Hourly Time-Scale

Authors: S. Dan'azumi, S. Shamsudin, A. A. Rahman

Abstract:

Rainfall data at fine resolution and knowledge of its characteristics plays a major role in the efficient design and operation of agricultural, telecommunication, runoff and erosion control as well as water quality control systems. The paper is aimed to study the statistical distribution of hourly rainfall depth for 12 representative stations spread across Peninsular Malaysia. Hourly rainfall data of 10 to 22 years period were collected and its statistical characteristics were estimated. Three probability distributions namely, Generalized Pareto, Exponential and Gamma distributions were proposed to model the hourly rainfall depth, and three goodness-of-fit tests, namely, Kolmogorov-Sminov, Anderson-Darling and Chi-Squared tests were used to evaluate their fitness. Result indicates that the east cost of the Peninsular receives higher depth of rainfall as compared to west coast. However, the rainfall frequency is found to be irregular. Also result from the goodness-of-fit tests show that all the three models fit the rainfall data at 1% level of significance. However, Generalized Pareto fits better than Exponential and Gamma distributions and is therefore recommended as the best fit.

Keywords: Goodness-of-fit test, Hourly rainfall, Malaysia, Probability distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2925

13678 Time Series Modelling and Prediction of River Runoff: Case Study of Karkheh River, Iran

Authors: Karim Hamidi Machekposhti, Hossein Sedghi, Abdolrasoul Telvari, Hossein Babazadeh

Abstract:

Rainfall and runoff phenomenon is a chaotic and complex outcome of nature which requires sophisticated modelling and simulation methods for explanation and use. Time Series modelling allows runoff data analysis and can be used as forecasting tool. In the paper attempt is made to model river runoff data and predict the future behavioural pattern of river based on annual past observations of annual river runoff. The river runoff analysis and predict are done using ARIMA model. For evaluating the efficiency of prediction to hydrological events such as rainfall, runoff and etc., we use the statistical formulae applicable. The good agreement between predicted and observation river runoff coefficient of determination (R²) display that the ARIMA (4,1,1) is the suitable model for predicting Karkheh River runoff at Iran.

Keywords: Time series modelling, ARIMA model, River runoff, Karkheh River, CLS method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 801

13677 Materialized View Effect on Query Performance

Authors: Yusuf Ziya Ayık, Ferhat Kahveci

Abstract:

Currently, database management systems have various tools such as backup and maintenance, and also provide statistical information such as resource usage and security. In terms of query performance, this paper covers query optimization, views, indexed tables, pre-computation materialized view, query performance analysis in which query plan alternatives can be created and the least costly one selected to optimize a query. Indexes and views can be created for related table columns. The literature review of this study showed that, in the course of time, despite the growing capabilities of the database management system, only database administrators are aware of the need for dealing with archival and transactional data types differently. These data may be constantly changing data used in everyday life, and also may be from the completed questionnaire whose data input was completed. For both types of data, the database uses its capabilities; but as shown in the findings section, instead of repeating similar heavy calculations which are carrying out same results with the same query over a survey results, using materialized view results can be in a more simple way. In this study, this performance difference was observed quantitatively considering the cost of the query.

Keywords: Materialized view, pre-computation, query cost, query performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1349

13676 A Very Efficient Pseudo-Random Number Generator Based On Chaotic Maps and S-Box Tables

Authors: M. Hamdi, R. Rhouma, S. Belghith

Abstract:

Generating random numbers are mainly used to create secret keys or random sequences. It can be carried out by various techniques. In this paper we present a very simple and efficient pseudo random number generator (PRNG) based on chaotic maps and S-Box tables. This technique adopted two main operations one to generate chaotic values using two logistic maps and the second to transform them into binary words using random S-Box tables. The simulation analysis indicates that our PRNG possessing excellent statistical and cryptographic properties.

Keywords: Chaotic map, Cryptography, Random Numbers, Statistical tests, S-box.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3870

13675 Spread Spectrum Image Watermarking for Secured Multimedia Data Communication

Authors: Tirtha S. Das, Ayan K. Sau, Subir K. Sarkar

Abstract:

Digital watermarking is a way to provide the facility of secure multimedia data communication besides its copyright protection approach. The Spread Spectrum modulation principle is widely used in digital watermarking to satisfy the robustness of multimedia signals against various signal-processing operations. Several SS watermarking algorithms have been proposed for multimedia signals but very few works have discussed on the issues responsible for secure data communication and its robustness improvement. The current paper has critically analyzed few such factors namely properties of spreading codes, proper signal decomposition suitable for data embedding, security provided by the key, successive bit cancellation method applied at decoder which have greater impact on the detection reliability, secure communication of significant signal under camouflage of insignificant signals etc. Based on the analysis, robust SS watermarking scheme for secure data communication is proposed in wavelet domain and improvement in secure communication and robustness performance is reported through experimental results. The reported result also shows improvement in visual and statistical invisibility of the hidden data.

Keywords: Spread spectrum modulation, spreading code, signaldecomposition, security, successive bit cancellation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2783

13674 Improvement of the Q-System Using the Rock Engineering System: A Case Study of Water Conveyor Tunnel of Azad Dam

Authors: S. Golmohammadi, M. Noorian Bidgoli

Abstract:

Because the status and mechanical parameters of discontinuities in the rock mass are included in the calculations, various methods of rock engineering classification are often used as a starting point for the design of different types of structures. The Q-system is one of the most frequently used methods for stability analysis and determination of support systems of underground structures in rock, including tunnel. In this method, six main parameters of the rock mass, namely, the Rock Quality Designation (RQD), joint set number (Jn), joint roughness number (Jr), joint alteration number (Ja), joint water parameter (Jw) and Stress Reduction Factor (SRF) are required. In this regard, in order to achieve a reasonable and optimal design, identifying the effective parameters for the stability of the mentioned structures is one of the most important goals and the most necessary actions in rock engineering. Therefore, it is necessary to study the relationships between the parameters of a system and how they interact with each other and, ultimately, the whole system. In this research, it has been attempted to determine the most effective parameters (key parameters) from the six parameters of rock mass in the Q-system using the Rock Engineering System (RES) method to improve the relationships between the parameters in the calculation of the Q value. The RES system is, in fact, a method by which one can determine the degree of cause and effect of a system's parameters by making an interaction matrix. In this research, the geomechanical data collected from the water conveyor tunnel of Azad Dam were used to make the interaction matrix of the Q-system. For this purpose, instead of using the conventional methods that are always accompanied by defects such as uncertainty, the Q-system interaction matrix is coded using a technique that is actually a statistical analysis of the data and determining the correlation coefficient between them. So, the effect of each parameter on the system is evaluated with greater certainty. The results of this study show that the formed interaction matrix provides a reasonable estimate of the effective parameters in the Q-system. Among the six parameters of the Q-system, the SRF and Jr parameters have the maximum and minimum impact on the system, respectively, and also the RQD and Jw parameters have the maximum and minimum impact on the system, respectively. Therefore, by developing this method, we can obtain a more accurate relation to the rock mass classification by weighting the required parameters in the Q-system.

Keywords: Q-system, Rock Engineering System, statistical analysis, rock mass, tunnel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 300

13673 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, N. T. Puspito, S. W. Indratno

Abstract:

Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific and the Philippines plates. Between the Sangihe arc, west of the collision zone, and to the east of Halmahera arc is active collision and convex toward the Molluca Sea. This research will analyze the behavior of earthquake occurrence in Molluca Collision Zone related to the distributions of an earthquake in each partition regions, determining the type of distribution of a occurrence earthquake of partition regions, and the mean occurence of earthquakes each partition regions, and the correlation between the partitions region. We calculate number of earthquakes using partition method and its behavioral using conventional statistical methods. In this research, we used data of shallow earthquakes type and its magnitudes ≥4 SR (period 1964-2013). From the results, we can classify partitioned regions based on the correlation into two classes: strong and very strong. This classification can be used for early warning system in disaster management.

Keywords: Molluca Collision Zone, partition regions, conventional statistical methods, Earthquakes, classifications, disaster management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986

13672 Experimental Testing of Statistical Size Effect in Civil Engineering Structures

Authors: Jana Kaděrová, Miroslav Vořechovský

Abstract:

The presented paper copes with an experimental evaluation of a model based on modified Weibull size effect theory. Classical statistical Weibull theory was modified by introducing a new parameter (correlation length lp) representing the spatial autocorrelation of a random mechanical properties of material. This size effect modification was observed on two different materials used in civil engineering: unreinforced (plain) concrete and multi-filament yarns made of alkaliresistant (AR) glass which are used for textile-reinforced concrete. The behavior under flexural, resp. tensile loading was investigated by laboratory experiments. A high number of specimens of different sizes was tested to obtain statistically significant data which were subsequently corrected and statistically processed. Due to a distortion of the measured displacements caused by the unstiff experiment device, only the maximal load values were statistically evaluated. Results of the experiments showed a decreasing strength with an increasing sample length. Size effect curves were obtained and the correlation length was fitted according to measured data. Results did not exclude the existence of the proposed new parameter lp.

Keywords: Statistical size effect, concrete, multi filaments yarns, experiment, autocorrelation length.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986

13671 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: Classification algorithms; data mining; tourism; knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2550

13670 GIS-based Approach for Land-Use Analysis: A Case Study

Authors: M. Giannopoulou, I. Roukounis, A. Roukouni.

Abstract:

Geographical Information Systems are an integral part of planning in modern technical systems. Nowadays referred to as Spatial Decision Support Systems, as they allow synergy database management systems and models within a single user interface machine and they are important tools in spatial design for evaluating policies and programs at all levels of administration. This work refers to the creation of a Geographical Information System in the context of a broader research in the area of influence of an under construction station of the new metro in the Greek city of Thessaloniki, which included statistical and multivariate data analysis and diagrammatic representation, mapping and interpretation of the results.

Keywords: Databases, Geographical information systems (GIS), Land-use planning, Metro stations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1607

13669 Lineup Optimization Model of Basketball Players Based on the Prediction of Recursive Neural Networks

Authors: Wang Yichen, Haruka Yamashita

Abstract:

In recent years, in the field of sports, decision making such as member in the game and strategy of the game based on then analysis of the accumulated sports data are widely attempted. In fact, in the NBA basketball league where the world's highest level players gather, to win the games, teams analyze the data using various statistical techniques. However, it is difficult to analyze the game data for each play such as the ball tracking or motion of the players in the game, because the situation of the game changes rapidly, and the structure of the data should be complicated. Therefore, it is considered that the analysis method for real time game play data is proposed. In this research, we propose an analytical model for "determining the optimal lineup composition" using the real time play data, which is considered to be difficult for all coaches. In this study, because replacing the entire lineup is too complicated, and the actual question for the replacement of players is "whether or not the lineup should be changed", and “whether or not Small Ball lineup is adopted”. Therefore, we propose an analytical model for the optimal player selection problem based on Small Ball lineups. In basketball, we can accumulate scoring data for each play, which indicates a player's contribution to the game, and the scoring data can be considered as a time series data. In order to compare the importance of players in different situations and lineups, we combine RNN (Recurrent Neural Network) model, which can analyze time series data, and NN (Neural Network) model, which can analyze the situation on the field, to build the prediction model of score. This model is capable to identify the current optimal lineup for different situations. In this research, we collected all the data of accumulated data of NBA from 2019-2020. Then we apply the method to the actual basketball play data to verify the reliability of the proposed model.

Keywords: Recurrent Neural Network, players lineup, basketball data, decision making model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 837

13668 Comparison of Neural Network and Logistic Regression Methods to Predict Xerostomia after Radiotherapy

Authors: Hui-Min Ting, Tsair-Fwu Lee, Ming-Yuan Cho, Pei-Ju Chao, Chun-Ming Chang, Long-Chang Chen, Fu-Min Fang

Abstract:

To evaluate the ability to predict xerostomia after radiotherapy, we constructed and compared neural network and logistic regression models. In this study, 61 patients who completed a questionnaire about their quality of life (QoL) before and after a full course of radiation therapy were included. Based on this questionnaire, some statistical data about the condition of the patients’ salivary glands were obtained, and these subjects were included as the inputs of the neural network and logistic regression models in order to predict the probability of xerostomia. Seven variables were then selected from the statistical data according to Cramer’s V and point-biserial correlation values and were trained by each model to obtain the respective outputs which were 0.88 and 0.89 for AUC, 9.20 and 7.65 for SSE, and 13.7% and 19.0% for MAPE, respectively. These parameters demonstrate that both neural network and logistic regression methods are effective for predicting conditions of parotid glands.

Keywords: NPC, ANN, logistic regression, xerostomia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1641

13667 Posture Recognition using Combined Statistical and Geometrical Feature Vectors based on SVM

Authors: Omer Rashid, Ayoub Al-Hamadi, Axel Panning, Bernd Michaelis

Abstract:

It is hard to percept the interaction process with machines when visual information is not available. In this paper, we have addressed this issue to provide interaction through visual techniques. Posture recognition is done for American Sign Language to recognize static alphabets and numbers. 3D information is exploited to obtain segmentation of hands and face using normal Gaussian distribution and depth information. Features for posture recognition are computed using statistical and geometrical properties which are translation, rotation and scale invariant. Hu-Moment as statistical features and; circularity and rectangularity as geometrical features are incorporated to build the feature vectors. These feature vectors are used to train SVM for classification that recognizes static alphabets and numbers. For the alphabets, curvature analysis is carried out to reduce the misclassifications. The experimental results show that proposed system recognizes posture symbols by achieving recognition rate of 98.65% and 98.6% for ASL alphabets and numbers respectively.

Keywords: Feature Extraction, Posture Recognition, Pattern Recognition, Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523

13666 The Impact of Bus Rapid Transit on Land Development: A Case Study of Beijing, China

Authors: Taotao Deng, John D. Nelson

Abstract:

Bus Rapid Transit (BRT) has emerged as a cost-effective transport system for urban mobility. However its ability to stimulate land development remains largely unexplored. The study makes use of qualitative (interview method) and quantitative analysis (questionnaire survey and longitudinal analysis of property data) to investigate land development impact resulting from BRT in Beijing, China. The empirical analysis suggests that BRT has a positive impact on the residential and commercial property attractiveness along the busway corridor. The statistical analysis suggests that accessibility advantage conferred by BRT is capitalized into higher property price. The average price of apartments adjacent to a BRT station has gained a relatively faster increase than those not served by the BRT system. The capitalization effect mostly occurs after the full operation of BRT, and is more evident over time and particularly observed in areas which previously lack alternative mobility opportunity.

Keywords: accessibility, Bus Rapid Transit (BRT), Beijing, property value uplift

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4607

13665 Comprehensive Analysis of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi

Abstract:

Due to the fast and flawless technological innovation there is a tremendous amount of data dumping all over the world in every domain such as Pattern Recognition, Machine Learning, Spatial Data Mining, Image Analysis, Fraudulent Analysis, World Wide Web etc., This issue turns to be more essential for developing several tools for data mining functionalities. The major aim of this paper is to analyze various tools which are used to build a resourceful analytical or descriptive model for handling large amount of information more efficiently and user friendly. In this survey the diverse tools are illustrated with their extensive technical paradigm, outstanding graphical interface and inbuilt multipath algorithms in which it is very useful for handling significant amount of data more indeed.

Keywords: Classification, Clustering, Data Mining, Machine learning, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2444

13664 Health Monitoring and Failure Detection of Electronic and Structural Components in Small Unmanned Aerial Vehicles

Authors: Gopi Kandaswamy, P. Balamuralidhar

Abstract:

Fully autonomous small Unmanned Aerial Vehicles (UAVs) are increasingly being used in many commercial applications. Although a lot of research has been done to develop safe, reliable and durable UAVs, accidents due to electronic and structural failures are not uncommon and pose a huge safety risk to the UAV operators and the public. Hence there is a strong need for an automated health monitoring system for UAVs with a view to minimizing mission failures thereby increasing safety. This paper describes our approach to monitoring the electronic and structural components in a small UAV without the need for additional sensors to do the monitoring. Our system monitors data from four sources; sensors, navigation algorithms, control inputs from the operator and flight controller outputs. It then does statistical analysis on the data and applies a rule based engine to detect failures. This information can then be fed back into the UAV and a decision to continue or abort the mission can be taken automatically by the UAV and independent of the operator. Our system has been verified using data obtained from real flights over the past year from UAVs of various sizes that have been designed and deployed by us for various applications.

Keywords: Fault detection, health monitoring, unmanned aerial vehicles, vibration analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1501

13663 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1107

13662 Human Growth Curve Estimation through a Combination of Longitudinal and Cross-sectional Data

Authors: Sedigheh Mirzaei S., Debasis Sengupta

Abstract:

Parametric models have been quite popular for studying human growth, particularly in relation to biological parameters such as peak size velocity and age at peak size velocity. Longitudinal data are generally considered to be vital for fittinga parametric model to individual-specific data, and for studying the distribution of these biological parameters in a human population. However, cross-sectional data are easier to obtain than longitudinal data. In this paper, we present a method of combining longitudinal and cross-sectional data for the purpose of estimating the distribution of the biological parameters. We demonstrate, through simulations in the special case ofthePreece Baines model, how estimates based on longitudinal data can be improved upon by harnessing the information contained in cross-sectional data.We study the extent of improvement for different mixes of the two types of data, and finally illustrate the use of the method through data collected by the Indian Statistical Institute.

Keywords: Preece-Baines growth model, MCMC method, Mixed effect model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2142

13661 Generation Expansion Planning Strategies on Power System: A Review

Authors: V. Phupha, T. Lantharthong, N. Rugthaicharoencheep

Abstract:

The problem of generation expansion planning (GEP) has been extensively studied for many years. This paper presents three topics in GEP as follow: statistical model, models for generation expansion, and expansion problem. In the topic of statistical model, the main stages of the statistical modeling are briefly explained. Some works on models for GEP are reviewed in the topic of models for generation expansion. Finally for the topic of expansion problem, the major issues in the development of a longterm expansion plan are summarized.

Keywords: Generation expansion planning, strategies, power system

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3222

13660 A New Heuristic Statistical Methodology for Optimizing Queuing Networks Using Discreet Event Simulation

Authors: Mohamad Mahdavi

Abstract:

Most of the real queuing systems include special properties and constraints, which can not be analyzed directly by using the results of solved classical queuing models. Lack of Markov chains features, unexponential patterns and service constraints, are the mentioned conditions. This paper represents an applied general algorithm for analysis and optimizing the queuing systems. The algorithm stages are described through a real case study. It is consisted of an almost completed non-Markov system with limited number of customers and capacities as well as lots of common exception of real queuing networks. Simulation is used for optimizing this system. So introduced stages over the following article include primary modeling, determining queuing system kinds, index defining, statistical analysis and goodness of fit test, validation of model and optimizing methods of system with simulation.

Keywords: Estimation, queuing system, simulation model, probability distribution, non-Markov chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1622

13659 Hardness Variations as Affected by Bar Diameter of AISI 4140 Steel

Authors: Hamad K. Al-Khalid, Ayman M. Alaskari, Samy E. Oraby

Abstract:

Hardness of the widely used structural steel is of vital importance since it may help in the determination of many mechanical properties of a material under loading situations. In order to obtain reliable information for design, properties homogeneity should be validated. In the current study the hardness variation over the different diameters of the same AISI 4140 bar is investigated. Measurements were taken on the two faces of the stock at equally spaced eight sectors and fifteen layers. Statistical and graphical analysis are performed to asses the distribution of hardness measurements over the specified area. Hardness measurements showed some degree of dispersion with about ± 10% of its nominal value provided by manufacturer. Hardness value is found to have a slight decrease trend as the diameter is reduced. However, an opposite behavior is noticed regarding the sequence of the sector indicating a nonuniform distribution over the same area either on the same face or considering the corresponding sector on the other face (cross section) of the same material bar.

Keywords: Hardness; Hardness variation; AISI 4140 steel; Bardiameter; Statistical Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2933

13658 Semantic Support for Hypothesis-Based Research from Smart Environment Monitoring and Analysis Technologies

Authors: T. S. Myers, J. Trevathan

Abstract:

Improvements in the data fusion and data analysis phase of research are imperative due to the exponential growth of sensed data. Currently, there are developments in the Semantic Sensor Web community to explore efficient methods for reuse, correlation and integration of web-based data sets and live data streams. This paper describes the integration of remotely sensed data with web-available static data for use in observational hypothesis testing and the analysis phase of research. The Semantic Reef system combines semantic technologies (e.g., well-defined ontologies and logic systems) with scientific workflows to enable hypothesis-based research. A framework is presented for how the data fusion concepts from the Semantic Reef architecture map to the Smart Environment Monitoring and Analysis Technologies (SEMAT) intelligent sensor network initiative. The data collected via SEMAT and the inferred knowledge from the Semantic Reef system are ingested to the Tropical Data Hub for data discovery, reuse, curation and publication.

Keywords: Information architecture, Semantic technologies Sensor networks, Ontologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1717

13657 A Ground Observation Based Climatology of Winter Fog: Study over the Indo-Gangetic Plains, India

Authors: Sanjay Kumar Srivastava, Anu Rani Sharma, Kamna Sachdeva

Abstract:

Every year, fog formation over the Indo-Gangetic Plains (IGPs) of Indian region during the winter months of December and January is believed to create numerous hazards, inconvenience, and economic loss to the inhabitants of this densely populated region of Indian subcontinent. The aim of the paper is to analyze the spatial and temporal variability of winter fog over IGPs. Long term ground observations of visibility and other meteorological parameters (1971-2010) have been analyzed to understand the formation of fog phenomena and its relevance during the peak winter months of January and December over IGP of India. In order to examine the temporal variability, time series and trend analysis were carried out by using the Mann-Kendall Statistical test. Trend analysis performed by using the Mann-Kendall test, accepts the alternate hypothesis with 95% confidence level indicating that there exists a trend. Kendall tau’s statistics showed that there exists a positive correlation between time series and fog frequency. Further, the Theil and Sen’s median slope estimate showed that the magnitude of trend is positive. Magnitude is higher during January compared to December for the entire IGP except in December when it is high over the western IGP. Decade wise time series analysis revealed that there has been continuous increase in fog days. The net overall increase of 99 % was observed over IGP in last four decades. Diurnal variability and average daily persistence were computed by using descriptive statistical techniques. Geo-statistical analysis of fog was carried out to understand the spatial variability of fog. Geo-statistical analysis of fog revealed that IGP is a high fog prone zone with fog occurrence frequency of more than 66% days during the study period. Diurnal variability indicates the peak occurrence of fog is between 06:00 and 10:00 local time and average daily fog persistence extends to 5 to 7 hours during the peak winter season. The results would offer a new perspective to take proactive measures in reducing the irreparable damage that could be caused due to changing trends of fog.

Keywords: Fog, climatology, Mann-Kendall test, trend analysis, spatial variability, temporal variability, visibility.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1757

13656 Revealing Nonlinear Couplings between Oscillators from Time Series

Authors: B.P. Bezruchko, D.A. Smirnov

Abstract:

Quantitative characterization of nonlinear directional couplings between stochastic oscillators from data is considered. We suggest coupling characteristics readily interpreted from a physical viewpoint and their estimators. An expression for a statistical significance level is derived analytically that allows reliable coupling detection from a relatively short time series. Performance of the technique is demonstrated in numerical experiments.

Keywords: Nonlinear time series analysis, directional couplings, coupled oscillators.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1269

13655 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service

Authors: Martin Lnenicka

Abstract:

Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and also launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematic, in order to understand them better and assess the various types of value they generate, and identify the required improvements for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare the selected open data portals on the national level using content analysis and propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens to create eservices and applications that leverage on the datasets available from these portals.

Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3086

13654 Podemos Party Origin: From Social Protest to Spanish Parliament

Authors: Víctor Manuel Muñoz-Sánchez, Antonio Manuel Pérez-Flores

Abstract:

This paper analyzes the institutionalization of social protest in Spain. In the current crisis Podemos party seems to represent the political positions of the most affected citizens by the economic situation. It studies using quantitative techniques (statistical bivariate analysis), focusing on the exploitation of several bases of statistics data from the Center for Sociological and Research of Spanish Government, 15M movement characterization to its institutionalization in the Podemos party. Making a comparison between the participant's profile by the 15M and the social bases of Podemos votes. Data on the transformation of the socio-demographic profile of the fans, connoisseurs and 15M participants and voters are given.

Keywords: Collective action, emerging parties, political parties, social protest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2126

13653 Person-Environment Fit (PE Fit): Evidence from Brazil

Authors: Jucelia Appio, Danielle Deimling De Carli, Bruno Henrique Rocha Fernandes, Nelson Natalino Frizon

Abstract:

The purpose of this paper is to investigate if there are positive and significant correlations between the dimensions of Person-Environment Fit (Person-Job, Person-Organization, Person-Group and Person-Supervisor) at the “Best Companies to Work for” in Brazil in 2017. For that, a quantitative approach was used with a descriptive method being defined as a research sample the "150 Best Companies to Work for", according to data base collected in 2017 and provided by Fundação Instituto of Administração (FIA) of the University of São Paulo (USP). About the data analysis procedures, asymmetry and kurtosis, factorial analysis, Kaiser-Meyer-Olkin (KMO) tests, Bartlett sphericity and Cronbach's alpha were used for the 69 research variables, and as a statistical technique for the purpose of analyzing the hypothesis, Pearson's correlation analysis was performed. As a main result, we highlight that there was a positive and significant correlation between the dimensions of Person-Environment Fit, corroborating the H1 hypothesis that there is a positive and significant correlation between Person-Job Fit, Person-Organization Fit, Person-Group Fit and Person-Supervisor Fit.

Keywords: Human resource management, person-environment fit, strategic people management, best companies to work for.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1005

13652 Role of Association Rule Mining in Numerical Data Analysis

Authors: Sudhir Jagtap, Kodge B. G., Shinde G. N., Devshette P. M

Abstract:

Numerical analysis naturally finds applications in all fields of engineering and the physical sciences, but in the 21st century, the life sciences and even the arts have adopted elements of scientific computations. The numerical data analysis became key process in research and development of all the fields [6]. In this paper we have made an attempt to analyze the specified numerical patterns with reference to the association rule mining techniques with minimum confidence and minimum support mining criteria. The extracted rules and analyzed results are graphically demonstrated. Association rules are a simple but very useful form of data mining that describe the probabilistic co-occurrence of certain events within a database [7]. They were originally designed to analyze market-basket data, in which the likelihood of items being purchased together within the same transactions are analyzed.

Keywords: Numerical data analysis, Data Mining, Association Rule Mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2864

13651 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique of aggregating the data objects into meaningful groups such that the intra cluster similarity of objects are maximized and inter cluster similarity of objects are minimized. Over the past decades several clustering tools were emerged in which clustering algorithms are inbuilt and are easier to use and extract the expected results. Data mining mainly deals with the huge databases that inflicts on cluster analysis and additional rigorous computational constraints. These challenges pave the way for the emergence of powerful expansive data mining clustering softwares. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each software.

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2203