Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4634

Search results for: statistical methods.

4634 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 810
4633 Texture Feature Extraction of Infrared River Ice Images using Second-Order Spatial Statistics

Authors: Bharathi P. T, P. Subashini

Abstract:

Ice cover County has a significant impact on rivers as it affects with the ice melting capacity which results in flooding, restrict navigation, modify the ecosystem and microclimate. River ices are made up of different ice types with varying ice thickness, so surveillance of river ice plays an important role. River ice types are captured using infrared imaging camera which captures the images even during the night times. In this paper the river ice infrared texture images are analysed using first-order statistical methods and secondorder statistical methods. The second order statistical methods considered are spatial gray level dependence method, gray level run length method and gray level difference method. The performance of the feature extraction methods are evaluated by using Probabilistic Neural Network classifier and it is found that the first-order statistical method and second-order statistical method yields low accuracy. So the features extracted from the first-order statistical method and second-order statistical method are combined and it is observed that the result of these combined features (First order statistical method + gray level run length method) provides higher accuracy when compared with the features from the first-order statistical method and second-order statistical method alone.

Keywords: Gray Level Difference Method, Gray Level Run Length Method, Kurtosis, Probabilistic Neural Network, Skewness, Spatial Gray Level Dependence Method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2629
4632 Statistical and Land Planning Study of Tourist Arrivals in Greece during 2005-2016

Authors: Dimitra Alexiou

Abstract:

During the last 10 years, in spite of the economic crisis, the number of tourists arriving in Greece has increased, particularly during the tourist season from April to October. In this paper, the number of annual tourist arrivals is studied to explore their preferences with regard to the month of travel, the selected destinations, as well the amount of money spent. The collected data are processed with statistical methods, yielding numerical and graphical results. From the computation of statistical parameters and the forecasting with exponential smoothing, useful conclusions are arrived at that can be used by the Greek tourism authorities, as well as by tourist organizations, for planning purposes for the coming years. The results of this paper and the computed forecast can also be used for decision making by private tourist enterprises that are investing in Greece. With regard to the statistical methods, the method of Simple Exponential Smoothing of time series of data is employed. The search for a best forecast for 2017 and 2018 provides the value of the smoothing coefficient. For all statistical computations and graphics Microsoft Excel is used.

Keywords: Tourism, statistical methods, exponential smoothing, land spatial planning, economy, Microsoft Excel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 452
4631 Statistical Texture Analysis

Authors: G. N. Srinivasan, G. Shobha

Abstract:

This paper presents an overview of the methodologies and algorithms for statistical texture analysis of 2D images. Methods for digital-image texture analysis are reviewed based on available literature and research work either carried out or supervised by the authors.

Keywords: Image Texture, Texture Analysis, Statistical Approaches, Structural approaches, spectral approaches, Morphological approaches, Fractals, Fourier Transforms, Gabor Filters, Wavelet transforms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 347
4630 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models

Authors: Jihye Jeon

Abstract:

This paper analyzes the conceptual framework of three statistical methods, multiple regression, path analysis, and structural equation models. When establishing research model of the statistical modeling of complex social phenomenon, it is important to know the strengths and limitations of three statistical models. This study explored the character, strength, and limitation of each modeling and suggested some strategies for accurate explaining or predicting the causal relationships among variables. Especially, on the studying of depression or mental health, the common mistakes of research modeling were discussed.

Keywords: Multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7383
4629 An Evolutionary Statistical Learning Theory

Authors: Sung-Hae Jun, Kyung-Whan Oh

Abstract:

Statistical learning theory was developed by Vapnik. It is a learning theory based on Vapnik-Chervonenkis dimension. It also has been used in learning models as good analytical tools. In general, a learning theory has had several problems. Some of them are local optima and over-fitting problems. As well, statistical learning theory has same problems because the kernel type, kernel parameters, and regularization constant C are determined subjectively by the art of researchers. So, we propose an evolutionary statistical learning theory to settle the problems of original statistical learning theory. Combining evolutionary computing into statistical learning theory, our theory is constructed. We verify improved performances of an evolutionary statistical learning theory using data sets from KDD cup.

Keywords: Evolutionary computing, Local optima, Over-fitting, Statistical learning theory

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1470
4628 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, N. T. Puspito, S. W. Indratno

Abstract:

Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific and the Philippines plates. Between the Sangihe arc, west of the collision zone, and to the east of Halmahera arc is active collision and convex toward the Molluca Sea. This research will analyze the behavior of earthquake occurrence in Molluca Collision Zone related to the distributions of an earthquake in each partition regions, determining the type of distribution of a occurrence earthquake of partition regions, and the mean occurence of earthquakes each partition regions, and the correlation between the partitions region. We calculate number of earthquakes using partition method and its behavioral using conventional statistical methods. In this research, we used data of shallow earthquakes type and its magnitudes ≥4 SR (period 1964-2013). From the results, we can classify partitioned regions based on the correlation into two classes: strong and very strong. This classification can be used for early warning system in disaster management.

Keywords: Molluca Collision Zone, partition regions, conventional statistical methods, Earthquakes, classifications, disaster management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1641
4627 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6908
4626 One Dimensional Object Segmentation and Statistical Features of an Image for Texture Image Recognition System

Authors: Nang Thwe Thwe Oo

Abstract:

Traditional object segmentation methods are time consuming and computationally difficult. In this paper, onedimensional object detection along the secant lines is applied. Statistical features of texture images are computed for the recognition process. Example matrices of these features and formulae for calculation of similarities between two feature patterns are expressed. And experiments are also carried out using these features.

Keywords: 1-D object segmentation, secant lines, objectoccurrence(frequency) matrix, contiguity matrix, statistical features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1232
4625 Developing a Statistical Model for Electromagnetic Environment for Mobile Wireless Networks

Authors: C. Temaneh Nyah

Abstract:

The analysis of electromagnetic environment using deterministic mathematical models is characterized by the impossibility of analyzing a large number of interacting network stations with a priori unknown parameters, and this is characteristic, for example, of mobile wireless communication networks. One of the tasks of the tools used in designing, planning and optimization of mobile wireless network is to carry out simulation of electromagnetic environment based on mathematical modelling methods, including computer experiment, and to estimate its effect on radio communication devices. This paper proposes the development of a statistical model of electromagnetic environment of a mobile wireless communication network by describing the parameters and factors affecting it including the propagation channel and their statistical models.

Keywords: Electromagnetic Environment, Statistical model, Wireless communication network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591
4624 Content-Based Color Image Retrieval Based On 2-D Histogram and Statistical Moments

Authors: Khalid Elasnaoui, Brahim Aksasse, Mohammed Ouanan

Abstract:

In this paper, we are interested in the problem of finding similar images in a large database. For this purpose we propose a new algorithm based on a combination of the 2-D histogram intersection in the HSV space and statistical moments. The proposed histogram is based on a 3x3 window and not only on the intensity of the pixel. This approach overcome the drawback of the conventional 1-D histogram which is ignoring the spatial distribution of pixels in the image, while the statistical moments are used to escape the effects of the discretisation of the color space which is intrinsic to the use of histograms. We compare the performance of our new algorithm to various methods of the state of the art and we show that it has several advantages. It is fast, consumes little memory and requires no learning. To validate our results, we apply this algorithm to search for similar images in different image databases.

Keywords: 2-D histogram, Statistical moments, Indexing, Similarity distance, Histograms intersection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1423
4623 Recursive Filter for Coastal Displacement Estimation

Authors: Efstratios Doukakis, Nikolaos Petrelis

Abstract:

All climate models agree that the temperature in Greece will increase in the range of 1° to 2°C by the year 2030 and mean sea level in Mediterranean is expected to rise at the rate of 5 cm/decade. The aim of the present paper is the estimation of the coastline displacement driven by the climate change and sea level rise. In order to achieve that, all known statistical and non-statistical computational methods are employed on some Greek coastal areas. Furthermore, Kalman filtering techniques are for the first time introduced, formulated and tested. Based on all the above, shoreline change signals and noises are computed and an inter-comparison between the different methods can be deduced to help evaluating which method is most promising as far as the retrieve of shoreline change rate is concerned.

Keywords: Climate Change, Coastal Displacement, KalmanFilter

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1134
4622 Development of Sleep Quality Index Using Heart Rate

Authors: Dongjoo Kim, Chang-Sik Son, Won-Seok Kang

Abstract:

Adequate sleep affects various parts of one’s overall physical and mental life. As one of the methods in determining the appropriate amount of sleep, this research presents a heart rate based sleep quality index. In order to evaluate sleep quality using the heart rate, sleep data from 280 subjects taken over one month are used. Their sleep data are categorized by a three-part heart rate range. After categorizing, some features are extracted, and the statistical significances are verified for these features. The results show that some features of this sleep quality index model have statistical significance. Thus, this heart rate based sleep quality index may be a useful discriminator of sleep.

Keywords: Sleep, sleep quality, heart rate, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1099
4621 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski

Abstract:

Edgeworth approximation is one of the most important statistical methods that has a considered contribution in the reduction of the sum of standard deviation of the independent variables’ coefficients in a Quantile Regression Model. This model estimates the conditional median or other quantiles. In this paper, we have applied approximating statistical methods in an economical problem. We have created and generated a quantile regression model to see how the profit gained is connected with the realized sales of the cosmetic products in a real data, taken from a local business. The Linear Regression of the generated profit and the realized sales was not free of autocorrelation and heteroscedasticity, so this is the reason that we have used this model instead of Linear Regression. Our aim is to analyze in more details the relation between the variables taken into study: the profit and the finalized sales and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods that we have applied in our work are Edgeworth Approximation for Independent and Identical distributed (IID) cases, Bootstrap version of the Model and the Edgeworth approximation for Bootstrap Quantile Regression Model. The graphics and the results that we have presented here identify the best approximating model of our study.

Keywords: Bootstrap, Edgeworth approximation, independent and Identical distributed, quantile.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 93
4620 Comparing Transformational Leadership in Successful and Unsuccessful Companies

Authors: Gh. Jandaghi, H. Zarei Matin, A. Farjami

Abstract:

In this article, while it is attempted to describe the problem and its importance, transformational leadership is studied by considering leadership theories. Issues such as the definition of transformational leadership and its aspects are compared on the basis of the ideas of various connoisseurs and then it (transformational leadership) is examined in successful and unsuccessful companies. According to the methodology, the method of research, hypotheses, population and statistical sample are investigated and research findings are analyzed by using descriptive and inferential statistical methods in the framework of analytical tables. Finally, our conclusion is provided by considering the results of statistical tests. The final result shows that transformational leadership is significantly higher in successful companies than unsuccessful ones P<0.0001).

Keywords: Idealized influence, individualized considerations, inspirational motivation, intellectual stimulation, transformational leadership

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4162
4619 Wind Farm Power Performance Verification Using Non-Parametric Statistical Inference

Authors: M. Celeska, K. Najdenkoski, V. Dimchev, V. Stoilkov

Abstract:

Accurate determination of wind turbine performance is necessary for economic operation of a wind farm. At present, the procedure to carry out the power performance verification of wind turbines is based on a standard of the International Electrotechnical Commission (IEC). In this paper, nonparametric statistical inference is applied to designing a simple, inexpensive method of verifying the power performance of a wind turbine. A statistical test is explained, examined, and the adequacy is tested over real data. The methods use the information that is collected by the SCADA system (Supervisory Control and Data Acquisition) from the sensors embedded in the wind turbines in order to carry out the power performance verification of a wind farm. The study has used data on the monthly output of wind farm in the Republic of Macedonia, and the time measuring interval was from January 1, 2016, to December 31, 2016. At the end, it is concluded whether the power performance of a wind turbine differed significantly from what would be expected. The results of the implementation of the proposed methods showed that the power performance of the specific wind farm under assessment was acceptable.

Keywords: Canonical correlation analysis, power curve, power performance, wind energy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 564
4618 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment

Authors: Shishen Xie, Yingda L. Xie

Abstract:

Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test to find out if an individual is tuberculous positive or negative. This study applies statistical analysis to the clinical data of interferon-gamma levels of seventy-three subjects who diagnosed pulmonary TB in an anti-tuberculous treatment. Data analysis is performed to determine if there is a significant decline in interferon-gamma levels for the subjects during a period of six months, and to infer if the anti-tuberculous treatment is effective.

Keywords: Data analysis, interferon gamma release assay, statistical methods, tuberculosis infection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1471
4617 Finding an Optimized Discriminate Function for Internet Application Recognition

Authors: E. Khorram, S.M. Mirzababaei

Abstract:

Everyday the usages of the Internet increase and simply a world of the data become accessible. Network providers do not want to let the provided services to be used in harmful or terrorist affairs, so they used a variety of methods to protect the special regions from the harmful data. One of the most important methods is supposed to be the firewall. Firewall stops the transfer of such packets through several ways, but in some cases they do not use firewall because of its blind packet stopping, high process power needed and expensive prices. Here we have proposed a method to find a discriminate function to distinguish between usual packets and harmful ones by the statistical processing on the network router logs. So an administrator can alarm to the user. This method is very fast and can be used simply in adjacent with the Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packetclassification, Statistical Pattern Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1207
4616 Model Discovery and Validation for the Qsar Problem using Association Rule Mining

Authors: Luminita Dumitriu, Cristina Segal, Marian Craciun, Adina Cocu, Lucian P. Georgescu

Abstract:

There are several approaches in trying to solve the Quantitative 1Structure-Activity Relationship (QSAR) problem. These approaches are based either on statistical methods or on predictive data mining. Among the statistical methods, one should consider regression analysis, pattern recognition (such as cluster analysis, factor analysis and principal components analysis) or partial least squares. Predictive data mining techniques use either neural networks, or genetic programming, or neuro-fuzzy knowledge. These approaches have a low explanatory capability or non at all. This paper attempts to establish a new approach in solving QSAR problems using descriptive data mining. This way, the relationship between the chemical properties and the activity of a substance would be comprehensibly modeled.

Keywords: association rules, classification, data mining, Quantitative Structure - Activity Relationship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1560
4615 An Alternative Method for Generating Almost Infinite Sequence of Gaussian Variables

Authors: Nyah C. Temaneh, F. A. Phiri, E. Ruhunga

Abstract:

Most of the well known methods for generating Gaussian variables require at least one standard uniform distributed value, for each Gaussian variable generated. The length of the random number generator therefore, limits the number of independent Gaussian distributed variables that can be generated meanwhile the statistical solution of complex systems requires a large number of random numbers for their statistical analysis. We propose an alternative simple method of generating almost infinite number of Gaussian distributed variables using a limited number of standard uniform distributed random numbers.

Keywords: Gaussian variable, statistical analysis, simulation ofCommunication Network, Random numbers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1198
4614 Statistical Genetic Algorithm

Authors: Mohammad Ali Tabarzad, Caro Lucas, Ali Hamzeh

Abstract:

Adaptive Genetic Algorithms extend the Standard Gas to use dynamic procedures to apply evolutionary operators such as crossover, mutation and selection. In this paper, we try to propose a new adaptive genetic algorithm, which is based on the statistical information of the population as a guideline to tune its crossover, selection and mutation operators. This algorithms is called Statistical Genetic Algorithm and is compared with traditional GA in some benchmark problems.

Keywords: Genetic Algorithms, Statistical Information ofthe Population, PAUX, SSO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1525
4613 The Application of the Queuing Theory in the Traffic Flow of Intersection

Authors: Shuguo Yang, Xiaoyan Yang

Abstract:

It is practically significant to research the traffic flow of intersection because the capacity of intersection affects the efficiency of highway network directly. This paper analyzes the traffic conditions of an intersection in certain urban by the methods of queuing theory and statistical experiment, sets up a corresponding mathematical model and compares it with the actual values. The result shows that queuing theory is applied in the study of intersection traffic flow and it can provide references for the other similar designs.

Keywords: Intersection, Queuing theory, Statistical experiment, System metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7046
4612 On Preprocessing of Speech Signals

Authors: Ayaz Keerio, Bhargav Kumar Mitra, Philip Birch, Rupert Young, Chris Chatwin

Abstract:

Preprocessing of speech signals is considered a crucial step in the development of a robust and efficient speech or speaker recognition system. In this paper, we present some popular statistical outlier-detection based strategies to segregate the silence/unvoiced part of the speech signal from the voiced portion. The proposed methods are based on the utilization of the 3 σ edit rule, and the Hampel Identifier which are compared with the conventional techniques: (i) short-time energy (STE) based methods, and (ii) distribution based methods. The results obtained after applying the proposed strategies on some test voice signals are encouraging.

Keywords: STE based methods, Mahalanobis distance, 3 edit σ rule, Hampel Identifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1352
4611 Characteristic Function in Estimation of Probability Distribution Moments

Authors: Vladimir S. Timofeev

Abstract:

In this article the problem of distributional moments estimation is considered. The new approach of moments estimation based on usage of the characteristic function is proposed. By statistical simulation technique author shows that new approach has some robust properties. For calculation of the derivatives of characteristic function there is used numerical differentiation. Obtained results confirmed that author’s idea has a certain working efficiency and it can be recommended for any statistical applications.

Keywords: Characteristic function, distributional moments, robustness, outlier, statistical estimation problem, statistical simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2041
4610 An Approach Based on Statistics and Multi-Resolution Representation to Classify Mammograms

Authors: Nebi Gedik

Abstract:

One of the significant and continual public health problems in the world is breast cancer. Early detection is very important to fight the disease, and mammography has been one of the most common and reliable methods to detect the disease in the early stages. However, it is a difficult task, and computer-aided diagnosis (CAD) systems are needed to assist radiologists in providing both accurate and uniform evaluation for mass in mammograms. In this study, a multiresolution statistical method to classify mammograms as normal and abnormal in digitized mammograms is used to construct a CAD system. The mammogram images are represented by wave atom transform, and this representation is made by certain groups of coefficients, independently. The CAD system is designed by calculating some statistical features using each group of coefficients. The classification is performed by using support vector machine (SVM).

Keywords: Wave atom transform, statistical features, multi-resolution representation, mammogram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 654
4609 A Hybrid Ontology Based Approach for Ranking Documents

Authors: Sarah Motiee, Azadeh Nematzadeh, Mehrnoush Shamsfard

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques to extract phrases from documents and the query and doing stemming on words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done flexible and in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1365
4608 Investigating Real Ship Accidents with Descriptive Analysis in Turkey

Authors: İsmail Karaca, Ömer Söner

Abstract:

The use of advanced methods has been increasing day by day in the maritime sector, which is one of the sectors least affected by the COVID-19 pandemic. It is aimed to minimize accidents, especially by using advanced methods in the investigation of marine accidents. This research aimed to conduct an exploratory statistical analysis of particular ship accidents in the Transport Safety Investigation Center of Turkey database. 46 ship accidents, which occurred between 2010-2018, have been selected from the database. In addition to the availability of a reliable and comprehensive database, taking advantage of the robust statistical models for investigation is critical to improving the safety of ships. Thus, descriptive analysis has been used in the research to identify causes and conditional factors related to different types of ship accidents. The research outcomes underline the fact that environmental factors and day and night ratio have great influence on ship safety.

Keywords: Descriptive analysis, maritime industry, maritime safety, marine accident analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 236
4607 ORank: An Ontology Based System for Ranking Documents

Authors: Mehrnoush Shamsfard, Azadeh Nematzadeh, Sarah Motiee

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650
4606 Statistical Description of Counterpoise Effective Length Based On Regressive Formulas

Authors: Petar Sarajcev, Josip Vasilj, Damir Jakus

Abstract:

This paper presents a novel statistical description of the counterpoise effective length due to lightning surges, where the (impulse) effective length had been obtained by means of regressive formulas applied to the transient simulation results. The effective length is described in terms of a statistical distribution function, from which median, mean, variance, and other parameters of interest could be readily obtained. The influence of lightning current amplitude, lightning front duration, and soil resistivity on the effective length has been accounted for, assuming statistical nature of these parameters. A method for determining the optimal counterpoise length, in terms of the statistical impulse effective length, is also presented. It is based on estimating the number of dangerous events associated with lightning strikes. Proposed statistical description and the associated method provide valuable information which could aid the design engineer in optimising physical lengths of counterpoises in different grounding arrangements and soil resistivity situations.

Keywords: Counterpoise, Grounding conductor, Effective length, Lightning, Monte Carlo method, Statistical distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2270
4605 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.

Keywords: Data mining, Korean linguistic feature, literary fiction, relationship extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1360