Search results for: statistical similarity matching
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1826

1526 Monitoring Patents Using the Statistical Process Control

Authors: Stephanie Russo Fabris, Edmara Thays Neres Menezes, Ruirogeres dos Santos Cruz, Lucio Leonardo Siqueira Santos, Suzana Leitao Russo

Abstract:

Statistical process control (SPC) is one of the most powerful tools developed to assist in the effective control of quality; it involves collecting, organizing and interpreting data during production. This article aims to show how industries can use SPC to control and continuously improve product quality by monitoring production, detecting deviations in the parameters that represent the process and thereby reducing the amount of off-specification product and the costs of production. The study conducted a technological forecasting exercise to characterize the research being done on SPC. The survey was conducted in the Espacenet, WIPO and National Institute of Industrial Property (INPI) databases. Among the largest depositors are the United States and deposits via the PCT; the classification section that appeared most often was F.

Keywords: Statistical Process Control, Industries

Downloads: 1484
1525 Simultaneous Term Structure Estimation of Hazard and Loss Given Default with a Statistical Model using Credit Rating and Financial Information

Authors: Tomohiro Ando, Satoshi Yamashita

Abstract:

The objective of this study is to propose a statistical modeling method that enables simultaneous term structure estimation of the risk-free interest rate, hazard and loss given default, incorporating characteristics of the bond-issuing company such as credit rating and financial information. A reduced form model is used for this purpose. Statistical techniques such as spline estimation and the Bayesian information criterion are employed for parameter estimation and model selection. An empirical analysis is conducted using Japanese bond market data. The results of the empirical analysis confirm the usefulness of the proposed method.

Keywords: Empirical Bayes, Hazard term structure, Loss given default.

Downloads: 1622
1524 Online Signature Verification Using Angular Transformation for e-Commerce Services

Authors: Peerapong Uthansakul, Monthippa Uthansakul

Abstract:

E-Commerce services have grown rapidly over the past decade. However, methods for verifying authenticated users still depend largely on numeric approaches, so the search for other verification methods suitable for online e-Commerce is of interest. In this paper, a new online signature-verification method using angular transformation is presented. Delay shifts in online signatures are estimated by a method that relies on angle representation. In the proposed signature-verification algorithm, all components of the input signature are extracted by considering the discontinuous break points in the stream of angular values. The estimated delay shift is then obtained by comparison with the selected reference signature, and the matching error is computed as the main feature used in the verification process. The threshold offsets are calculated from the two types of error that characterize the signature verification problem, the False Rejection Rate (FRR) and the False Acceptance Rate (FAR). The level of these two error rates depends on the decision threshold, which is chosen so as to realize the Equal Error Rate (EER; FAR = FRR). Experimental results show that, with simple programming deployed on the Internet to demonstrate e-Commerce services, the proposed method achieves 95.39% correct verifications, 7% better than the DP matching based signature-verification method. In addition, verification based on extracted components gives more reliable results than a decision made on the whole signature.
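
The relationship between the decision threshold, FAR, FRR and the EER can be illustrated with a minimal sketch. It assumes each signature has already been reduced to a single matching error (lower means closer to the reference) and that a signature is accepted when its error does not exceed the threshold. The function names and the sample error values are hypothetical and are not taken from the paper.

```python
def far_frr(genuine_errors, forgery_errors, threshold):
    """Return (FAR, FRR) when a signature is accepted if error <= threshold."""
    # FRR: share of genuine signatures that are wrongly rejected.
    frr = sum(e > threshold for e in genuine_errors) / len(genuine_errors)
    # FAR: share of forgeries that are wrongly accepted.
    far = sum(e <= threshold for e in forgery_errors) / len(forgery_errors)
    return far, frr


def equal_error_threshold(genuine_errors, forgery_errors):
    """Scan the observed errors and pick the threshold where |FAR - FRR| is smallest."""
    candidates = sorted(set(genuine_errors) | set(forgery_errors))

    def gap(t):
        far, frr = far_frr(genuine_errors, forgery_errors, t)
        return abs(far - frr)

    best = min(candidates, key=gap)
    return best, far_frr(genuine_errors, forgery_errors, best)


# Hypothetical matching errors, for illustration only.
genuine = [0.1, 0.2, 0.3, 0.6]
forgery = [0.4, 0.7, 0.8, 0.9]
threshold, (far, frr) = equal_error_threshold(genuine, forgery)
```

With real data the two rates rarely cross exactly at an observed value, so the threshold with the smallest gap is only an approximation of the EER operating point.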

Keywords: Online signature verification, e-Commerce services, Angular transformation.

Downloads: 1526
1523 Exploring the Spatial Characteristics of Mortality Map: A Statistical Area Perspective

Authors: Jung-Hong Hong, Jing-Cen Yang, Cai-Yu Ou

Abstract:

The analysis of geographic inequality relies heavily on location-enabled statistical data and quantitative measures to present the spatial patterns of the selected phenomena and analyze their differences. To protect the privacy of individual instances and to link to administrative units, point-based datasets are spatially aggregated into area-based statistical datasets, where only the overall status for the selected levels of spatial units is used for decision making. The partition of the spatial units thus has a dominant influence on the analyzed results, a phenomenon well known as the Modifiable Areal Unit Problem (MAUP). A new spatial reference framework, the Taiwan Geographical Statistical Classification (TGSC), was recently introduced in Taiwan, based on spatial partition principles that aim for homogeneous numbers of population and households. Compared to the traditional township units, TGSC provides additional levels of spatial units with finer granularity for presenting spatial phenomena and enables domain experts to select an appropriate dissemination level for publishing statistical data. This paper compares the results of using TGSC and township units on mortality data and examines the spatial characteristics of their outcomes. For the mortality data of Taitung County between January 1st, 2008 and December 31st, 2010, the all-cause age-standardized death rate (ASDR) ranges from 571 to 1757 per 100,000 persons, whereas the 2nd dissemination area (TGSC) shows greater variation, ranging from 0 to 2222 per 100,000. The finer granularity of the TGSC spatial units clearly provides better outcomes for identifying and evaluating geographic inequality, and the results can be further analyzed with statistical measures from other perspectives (e.g., population, area, environment).
The management and analysis of the statistical data referring to the TGSC in this research is strongly supported by the use of Geographic Information System (GIS) technology. An integrated workflow is developed that consists of the processing of death certificates, the geocoding of street addresses, the quality assurance of geocoded results, the automatic calculation of statistical measures, the standardized encoding of measures and the geo-visualization of statistical outcomes. This paper also introduces a set of auxiliary measures from a geographic distribution perspective to further examine the hidden spatial characteristics of mortality data and justify the analyzed results. With a common statistical area framework like TGSC, the preliminary results demonstrate promising potential for developing a web-based statistical service that can effectively access domain statistical data and present the analyzed outcomes in meaningful ways to avoid wrong decision making.
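
The age-standardized death rate quoted above is commonly obtained by direct standardization: age-specific rates in each spatial unit are weighted by a standard population. The abstract does not say which standard population or age bands were used, so the sketch below treats them as inputs. The function name and the sample numbers are hypothetical.

```python
def age_standardized_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardized death rate for one spatial unit.

    deaths[i] and population[i] describe age band i in the unit;
    standard_weights[i] is the size (or share) of band i in the standard population.
    """
    total_weight = sum(standard_weights)
    weighted = sum(
        w * d / p for d, p, w in zip(deaths, population, standard_weights)
    )
    return weighted / total_weight * per


# Hypothetical unit with two age bands, for illustration only.
asdr = age_standardized_rate(
    deaths=[10, 50], population=[10_000, 5_000], standard_weights=[0.7, 0.3]
)
```

A unit with very few residents can produce a rate of 0 or a very large rate from a handful of deaths, which is one reason finer spatial units show a wider range.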

Keywords: Mortality map, spatial patterns, statistical area, variation.

Downloads: 949
1522 Statistical Optimization of the Enzymatic Saccharification of the Oil Palm Empty Fruit Bunches

Authors: Rashid S. S., Alam M. Z.

Abstract:

A statistical optimization of the saccharification process of EFB was studied. The statistical analysis was done by applying a face-centered central composite design (FCCCD) under response surface methodology (RSM). In this investigation, EFB dose, enzyme dose and saccharification period were examined, and a maximum reducing sugar yield of 53.45% (w/w) was found with 4% (w/v) of EFB and 10% (v/v) of enzyme after 120 hours of incubation. The calculated conversion rate of the cellulose content of the substrate is more than 75% (w/w), which can be considered a remarkable achievement. All the linear, quadratic and interaction coefficients were found to be highly significant, except for one quadratic and one interaction coefficient. The coefficient of determination (R²) is 0.9898, which indicates a satisfactory fit: approximately 98.98% of the variability in the dependent variable, saccharification of EFB, could be explained by this model.
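
For readers unfamiliar with the design, a face-centered central composite design places axial points on the faces of the factorial cube (axial distance 1 in coded units). The sketch below generates the coded runs for the three factors named in the abstract. The number of center replicates is not given in the abstract, so `n_center` is an assumption, and the function name is hypothetical.

```python
from itertools import product


def face_centered_ccd(n_factors, n_center=1):
    """Coded design points (-1, 0, +1) of a face-centered central composite design."""
    # Full two-level factorial: every corner of the cube.
    corners = [list(p) for p in product((-1, 1), repeat=n_factors)]
    # Axial points sit at the center of each face (alpha = 1).
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level
            axial.append(point)
    # Replicated center points, used to estimate pure error.
    center = [[0] * n_factors for _ in range(n_center)]
    return corners + axial + center


# Three factors: EFB dose, enzyme dose, saccharification period (coded units).
design = face_centered_ccd(3, n_center=6)
```

Mapping the coded levels back to real doses and times requires the low and high settings of each factor, which the abstract does not list.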

Keywords: Face centered central composite design (FCCCD), Liquid state bioconversion (LSB), Palm oil mill effluent, Trichoderma reesei RUT C-30.

Downloads: 2186
1521 An Approach Based on Statistics and Multi-Resolution Representation to Classify Mammograms

Authors: Nebi Gedik

Abstract:

Breast cancer is one of the most significant and persistent public health problems in the world. Early detection is very important in fighting the disease, and mammography has been one of the most common and reliable methods for detecting it in the early stages. However, reading mammograms is a difficult task, and computer-aided diagnosis (CAD) systems are needed to assist radiologists in providing both accurate and uniform evaluation of masses in mammograms. In this study, a multiresolution statistical method for classifying digitized mammograms as normal or abnormal is used to construct a CAD system. The mammogram images are represented by the wave atom transform, and this representation is formed from certain groups of coefficients, treated independently. The CAD system is designed by calculating statistical features from each group of coefficients. The classification is performed using a support vector machine (SVM).

Keywords: Wave atom transform, statistical features, multi-resolution representation, mammogram.

Downloads: 836
1520 Inferring User Preference Using Distance Dependent Chinese Restaurant Process and Weighted Distribution for a Content Based Recommender System

Authors: Bagher Rahimpour Cami, Hamid Hassanpour, Hoda Mashayekhi

Abstract:

Nowadays websites provide a vast number of resources for users. Recommender systems have been developed as an essential element of these websites to provide a personalized environment for users. They help users retrieve resources of interest from large sets of available resources. Because user preference is dynamic, constructing an appropriate model to estimate it is the major task of recommender systems. Profile matching and latent factors are the two main approaches to identifying user preference. In this paper, we employ latent factors and profile matching to cluster the user profile and identify user preference, respectively. The method uses the Distance Dependent Chinese Restaurant Process as a Bayesian nonparametric framework to extract the latent factors from the user profile. These latent factors are mapped to user interests, and a weighted distribution is used to identify user preferences. We evaluate the proposed method using a real-world dataset that contains news tweets from a news agency (BBC). The experimental results and comparisons show the superior recommendation accuracy of the proposed approach relative to existing methods, and its ability to evolve effectively over time.

Keywords: Content-based recommender systems, dynamic user modeling, extracting user interests, predicting user preference.

Downloads: 767
1519 Defect Detection of Tiles Using 2D-Wavelet Transform and Statistical Features

Authors: M. Ghazvini, S. A. Monadjemi, N. Movahhedinia, K. Jamshidi

Abstract:

In this article, a method is proposed to classify normal and defective tiles using the wavelet transform and artificial neural networks. The proposed algorithm calculates the max and min medians as well as the standard deviation and average of the detail images obtained from wavelet filters, then derives feature vectors and attempts to classify the given tile using a Perceptron neural network with a single hidden layer. Along with proposing the median of optimum points as the basic feature and comparing it with the other statistical features in the wavelet domain, this study investigates the relative advantages of the Haar wavelet. The method has been tested on a number of different tile designs and, on average, it was correct in over 90% of the cases. High speed and low computational load are prominent among its other advantages.
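
The feature-extraction idea can be sketched as follows: apply one level of a 2D Haar transform to a grayscale tile image and summarize each detail image with simple statistics. This is a simplified illustration, not the authors' algorithm. It uses an unnormalized Haar average/difference on 2x2 blocks and computes only the mean, standard deviation and median; the paper's "max and min medians" and the neural network classifier are not reproduced. All names are hypothetical.

```python
from statistics import mean, median, pstdev


def haar_2d(image):
    """One level of an unnormalized 2D Haar transform on an even-sized grayscale image."""
    bands = {"approx": [], "top_minus_bottom": [], "left_minus_right": [], "diagonal": []}
    for r in range(0, len(image), 2):
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            bands["approx"].append((a + b + d + e) / 4)
            bands["top_minus_bottom"].append((a + b - d - e) / 4)  # responds to horizontal edges
            bands["left_minus_right"].append((a - b + d - e) / 4)  # responds to vertical edges
            bands["diagonal"].append((a - b - d + e) / 4)
    return bands


def tile_features(image):
    """Mean, standard deviation and median of each detail band."""
    bands = haar_2d(image)
    return {
        name: (mean(values), pstdev(values), median(values))
        for name, values in bands.items()
        if name != "approx"
    }


# Hypothetical 4x4 tile with vertical stripes, for illustration only.
features = tile_features([[0, 1, 0, 1]] * 4)
```

A defect such as a crack or stain would typically show up as detail statistics that deviate from those of a clean tile of the same design.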

Keywords: Defect detection, tile and ceramic quality inspection, wavelet transform, classification, neural networks, statistical features.

Downloads: 2312
1518 Statistical Feature Extraction Method for Wood Species Recognition System

Authors: Mohd Iz'aan Paiz Bin Zamri, Anis Salwa Mohd Khairuddin, Norrima Mokhtar, Rubiyah Yusof

Abstract:

Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at customs checkpoints to avoid mislabeling of timber, which results in loss of income for the timber industry. The system focuses on analyzing the statistical properties of the pores in the wood images. This paper proposes a fuzzy-based feature extractor that mimics the experts' knowledge of wood texture to extract the properties of the pore distribution from the wood surface texture. The proposed feature extractor consists of two steps, namely pore extraction and fuzzy pore management. A total of 38 statistical features are extracted from each wood image. A backpropagation neural network is then used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database composed of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts' interpretation of wood texture, which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.

Keywords: Classification, fuzzy, inspection system, image analysis.

Downloads: 1693
1517 An Optimal Subclass Detection Method for Credit Scoring

Authors: Luciano Nieddu, Giuseppe Manfredi, Salvatore D'Acunto, Katia La Regina

Abstract:

In this paper, a non-parametric statistical pattern recognition algorithm for the problem of credit scoring is presented. The proposed algorithm is based on a k-means clustering algorithm and allows for the determination of subclasses of homogeneous elements in the data. The algorithm is tested on two benchmark datasets and its performance is compared with that of other well-known pattern recognition algorithms for credit scoring.

Keywords: Constrained clustering, Credit scoring, Statistical pattern recognition, Supervised classification.

Downloads: 2002
1516 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This study presents a retrospective review of speech recognition systems and artificial intelligence. Speech recognition has become a widely used technology, as it offers a convenient way to interact and communicate with automated machines, and it helps users perform their daily routine tasks more conveniently and effectively. This research illustrates recent technological advancements associated with artificial intelligence. Recent research has revealed that recognition is the foremost issue affecting the decoding of speech. To overcome this issue, researchers have developed different statistical models. Some of the most prominent include the acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are used for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence approaches. Artificial intelligence has been recognized as the most efficient and reliable of the methods used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Downloads: 7895
1515 Using Artificial Neural Network to Predict Collisions on Horizontal Tangents of 3D Two-Lane Highways

Authors: Omer F. Cansiz, Said M. Easa

Abstract:

The main purpose of this study is to predict collision frequency on horizontal tangents combined with vertical curves using artificial neural network (ANN) methods. The proposed ANN models are compared with existing regression models. First, the variables that affect collision frequency were investigated. It was found that only the annual average daily traffic, section length, access density, the rate of vertical curvature, and the smaller curve radius before and after the tangent were statistically significant, depending on the combination. Second, three statistical models (negative binomial, zero inflated Poisson and zero inflated negative binomial) were developed using the significant variables for three alignment combinations. Third, ANN models were developed using the same variables for each combination. The results clearly show that the ANN models have lower mean square error values than the statistical models. Similarly, the AIC values of the ANN models are smaller than those of the regression models for all the combinations. Consequently, the ANN models perform better statistically than the statistical models for estimating collision frequency. The ANN models presented in this paper are recommended for evaluating the safety impacts of 3D alignment elements on horizontal tangents.
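
The two comparison criteria mentioned above can be written down in a few lines. The sketch below computes the mean square error and the Akaike information criterion (AIC = 2k - 2 ln L). As a stand-in for the paper's count models it uses a plain Poisson log-likelihood, which is a simplification; the abstract does not say how a likelihood was obtained for the ANN models. All names and sample numbers are hypothetical.

```python
from math import lgamma, log


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted collision counts."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def poisson_log_likelihood(observed, predicted_means):
    """Log-likelihood of observed counts under a Poisson model with the given positive means."""
    return sum(
        y * log(mu) - mu - lgamma(y + 1) for y, mu in zip(observed, predicted_means)
    )


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical collision counts and model predictions, for illustration only.
observed = [0, 2, 1, 3]
predicted = [0.5, 1.5, 1.0, 2.5]
model_mse = mean_squared_error(observed, predicted)
model_aic = aic(poisson_log_likelihood(observed, predicted), n_params=3)
```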

Keywords: Collision frequency, horizontal tangent, 3D two-lane highway, negative binomial, zero inflated Poisson, artificial neural network.

Downloads: 1596
1514 Statistical Characteristics of Distribution of Radiation-Induced Defects under Random Generation

Authors: Pavlo Selyshchev

Abstract:

We consider fluctuations of defect density, taking into account the interaction between defects. A stochastic field of displacement generation rate gives a random defect distribution. We determine the statistical characteristics (mean and dispersion) of the random field of point defect distribution as functions of the defect generation parameters, temperature and the properties of the irradiated crystal.

 

Keywords: Irradiation, Primary Defects, Interaction, Fluctuations.

Downloads: 1794
1513 Relevance Feedback within CBIR Systems

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

We present the results of a comparative study of some techniques available in the literature for the relevance feedback mechanism in the case of short-term learning. Only one of the methods considered here, the K-nearest neighbors algorithm (KNN), belongs to the data mining field, while the rest are related purely to the information retrieval field and fall under three major axes: shifting query, feature weighting and optimization of the parameters of the similarity metric. As a contribution, and in addition to the comparison, we propose a new version of the KNN algorithm, referred to as incremental KNN, which differs from the original version in that, besides the influence of the seeds, the rating of the current target image is also influenced by the images already rated. The results presented here were obtained from experiments conducted on the Wang database for one iteration, using color moments in the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency needed in interactive systems. The results allow us to claim that the proposed algorithm yields good results; it even outperforms a wide range of techniques available in the literature.
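
The color moments descriptor mentioned above is usually defined as the first three moments of each color channel, giving nine numbers per RGB image. The sketch below follows that common definition (mean, standard deviation, and the signed cube root of the third central moment); the abstract does not spell out the exact variant used, so treat this as an assumption. The function name is hypothetical.

```python
from math import copysign


def color_moments(pixels):
    """Return [mean, std, skewness] for each of R, G and B (9 values in total).

    pixels is a non-empty list of (r, g, b) tuples.
    """
    n = len(pixels)
    descriptor = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mu = sum(values) / n
        variance = sum((v - mu) ** 2 for v in values) / n
        third = sum((v - mu) ** 3 for v in values) / n
        # Signed cube root, so that negative skew stays negative.
        skewness = copysign(abs(third) ** (1 / 3), third)
        descriptor.extend([mu, variance ** 0.5, skewness])
    return descriptor


# Hypothetical two-pixel image, for illustration only.
descriptor = color_moments([(0, 0, 0), (2, 0, 0)])
```

Because the descriptor has only nine components, comparing a query image against every database image stays cheap, which is the efficiency argument made in the abstract.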

Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.

Downloads: 2252
1512 Quality Parameters of Offset Printing Wastewater

Authors: Kiurski S. Jelena, Kecić S. Vesna, Aksentijević M. Snežana

Abstract:

Samples of tap water and wastewater were collected in three offset printing facilities in Novi Sad, Serbia. Ten physicochemical parameters were analyzed in all collected samples: pH, conductivity, m-alkalinity, p-alkalinity, acidity, carbonate concentration, hydrogen carbonate concentration, active oxygen content, chloride concentration and total alkali content. All measurements were conducted using standard analytical and instrumental methods. Comparing the results for tap water and wastewater, a clear difference in quality was noticeable, since all physicochemical parameters were significantly higher in the wastewater samples. The study also applies simple linear regression analysis to the obtained dataset. Using the software package ORIGIN 5, the pH value was correlated with the other physicochemical parameters. Based on the obtained values of the Pearson correlation coefficient, a strong negative correlation was determined between chloride concentration and pH (r = -0.943), as well as between acidity and pH (r = -0.855). In addition, a statistically significant relationship with pH was obtained only for acidity and chloride concentration, since the values of parameter F (247.634 and 182.536) were higher than Fcritical (5.59). In this way, the results of the statistical analysis highlighted acidity and chloride concentration as the most influential parameters of water contamination in offset printing. The results showed that the dependence between variables could be represented by the general regression model y = a0 + a1x + k, which matched the graphical regressions.
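
The regression quantities named in the abstract (slope, intercept, Pearson r and the F statistic) can be reproduced for any pair of parameters with a short least-squares routine. The sketch below is a generic implementation, not the ORIGIN 5 procedure, and the sample values are hypothetical rather than the study's measurements. It assumes the points do not lie exactly on a line, since the F statistic is undefined when r² equals 1.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Least-squares fit of y = a0 + a1 * x, with Pearson r and the F statistic."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx                  # slope
    a0 = mean_y - a1 * mean_x       # intercept
    r = sxy / sqrt(sxx * syy)       # Pearson correlation coefficient
    # F statistic with 1 and n - 2 degrees of freedom.
    f_stat = r * r * (n - 2) / (1 - r * r)
    return a0, a1, r, f_stat


# Hypothetical paired measurements, for illustration only.
a0, a1, r, f_stat = simple_linear_regression([1, 2, 3, 4], [8, 6, 5, 1])
```

The computed F value is then compared with the critical value of the F distribution for 1 and n - 2 degrees of freedom at the chosen significance level.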

Keywords: Pollution, printing industry, simple linear regression analysis, wastewater.

Downloads: 1625
1511 Authenticity of Lipid and Soluble Sugar Profiles of Various Oat Cultivars (Avena sativa)

Authors: Marijana M. Ačanski, Kristian A. Pastor, Djura N. Vujić

Abstract:

Lipid and soluble sugar components were identified in flour samples of different cultivars belonging to the common oat species (Avena sativa L.): spring oat, winter oat and hulless oat. Fatty acids were extracted from flour samples with n-hexane and derivatized into volatile methyl esters using TMSH (trimethylsulfonium hydroxide in methanol). Soluble sugars were then extracted from defatted and dried samples of oat flour with 96% ethanol and further derivatized into the corresponding TMS-oximes using hydroxylamine hydrochloride solution and BSTFA (N,O-bis-(trimethylsilyl)-trifluoroacetamide). The hexane and ethanol extracts of each oat cultivar were analyzed using a GC-MS system. Lipid and simple sugar compositions are very similar in all samples of the investigated cultivars. A chemometric tool was applied to the numeric values of the automatically integrated surface areas of the detected lipid and simple sugar components in their corresponding derivatized forms. Hierarchical cluster analysis shows a very high similarity between the investigated flour samples of oat cultivars according to fatty acid content (0.9955). Moderate similarity was observed according to the content of soluble sugars (0.50). These preliminary results support the idea of establishing methods for oat flour authentication and provide the means for distinguishing oat flour samples, regardless of variety, from flour samples made of other cereal species, just by lipid and simple sugar profile analysis.

Keywords: Authentication, chemometrics, GC-MS, lipid and soluble sugar composition, oat cultivars.

Downloads: 1319
1510 Visualization and Indexing of Spectral Databases

Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi

Abstract:

On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from a spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra using nearest neighbor algorithms and distance-based search methods. Searching for nearest neighbors in the spectral space is an NP-hard problem; the computational complexity increases with the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time, some kind of indexing can be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirements of estimation algorithms by providing a two-dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of the spectral database not only supports the application of distance and similarity based techniques but also enables the use of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means that prediction need not use the high-dimensional space but can be based on the mapped space as well. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance-based learning algorithms.

Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.

Downloads: 1723
1509 A Revisited View to the Paced Auditory Serial Addition Test (PASAT) in Female and Male Normal Subjects

Authors: Javad Razjouyan, Shahriar Gharibzadeh, Ali Fallah, Mehdi Moghaddasi, Mohsen Seyfi, Amir Kasaeian

Abstract:

The Paced Auditory Serial Addition Test (PASAT) has been used as a common research tool for different neurological disorders such as Multiple Sclerosis. Recently, technology has allowed researchers to introduce a visual version of the test, the paced visual serial addition test (PVSAT). In this paper, computerized versions of these two tests are introduced. Besides interpreting the number of true responses, the software calculates the reaction time of subjects. We hypothesize that paying attention to the reaction time may be valuable. For this purpose, sixty-eight female and fifty-eight male normal subjects were enrolled in the study. We investigate the similarity between PASAT3 and PVSAT3 in the number of true responses and in the new criterion (the average reaction time of each subject). The similarity between the two tests was rejected (p < 0.001), which means that the two tests differ. An effect of sex on the tests was not confirmed, since the p-value of the difference between PASAT3 and PVSAT3 was the same in both sexes (p < 0.001), which means that male and female subjects performed the tests at the same level. The new criterion shows a negative correlation with age, which suggests that older normal subjects may give the same number of true responses as young subjects but with delayed responses. This supports the importance of reaction time.

Keywords: Paced Auditory Serial Addition Test, Paced Visual Serial Addition Test, reaction time.

Downloads: 1909
1508 Wheat Yield Prediction through Agro Meteorological Indices for Ardebil District

Authors: Fariba Esfandiary, Ghafoor Aghaie, Ali Dolati Mehr

Abstract:

Wheat yield prediction was carried out using different meteorological variables together with agro meteorological indices in Ardebil district for the years 2004-2005 and 2005-2006. On the basis of correlation coefficients, the standard error of estimate and the relative deviation of predicted yield from actual yield under different statistical models, the best subset of agro meteorological indices was selected, including daily minimum temperature (Tmin), accumulated difference of maximum and minimum temperatures (TD), growing degree days (GDD), accumulated water vapor pressure deficit (VPD), sunshine hours (SH) and potential evapotranspiration (PET). Yield prediction was done two months before harvesting time, which coincided with the commencement of the reproductive stage of wheat (5th of June). The final statistical models revealed that 83% of wheat yield variability was accounted for by variation in the above agro meteorological indices.

Keywords: Wheat yields prediction, agro meteorological indices, statistical models

Downloads: 2094
1507 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Authors: Selvam M, Natarajan. A M, Thangarajan R

Abstract:

Parsing is important in linguistics and natural language processing for understanding the syntax and semantics of a natural language. Parsing natural language text is challenging because of problems such as ambiguity and inefficiency, and the interpretation of the text depends on context-based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics, thereby increasing the accuracy and efficiency of the parser. The Tamil language has some inherent features that make parsing more challenging. To address them, a lexicalized and statistical approach is applied to parsing with the aid of a language model. Statistical models mainly focus on the semantics of the language and are suitable for large-vocabulary tasks, whereas structural methods focus on syntax and model small-vocabulary tasks. A statistical language model based on trigrams has been built for the Tamil language with a medium vocabulary of 5000 words. Although statistical parsing gives better performance through trigram probabilities and a large vocabulary size, it has some disadvantages, such as its focus on semantics rather than syntax and its lack of support for free word order and long-term relationships. To overcome these disadvantages, a structural component is incorporated into statistical language models, which leads to hybrid language models. This paper attempts to build a phrase-structured hybrid language model that resolves the above-mentioned disadvantages. In developing the hybrid language model, a new part-of-speech tag set for the Tamil language has been developed, with more than 500 tags and wider coverage. A phrase-structured Treebank has been developed with 326 Tamil sentences covering more than 5000 words. The hybrid language model has been trained on the phrase-structured Treebank using the immediate-head parsing technique. A lexicalized and statistical parser that employs this hybrid language model and the immediate-head parsing technique gives better results than pure grammar-based and trigram-based models.
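
To illustrate the trigram component mentioned in the abstract, the following is a minimal sketch of a maximum-likelihood trigram model. It is not the authors' implementation: the function names, the padding symbols `<s>` and `</s>`, the toy tokens and the absence of smoothing are all assumptions made for illustration.

```python
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    trigram_counts, context_counts = Counter(), Counter()
    for tokens in sentences:
        # Two start symbols so the first real word also has a full context.
        padded = ["<s>", "<s>"] + list(tokens) + ["</s>"]
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigram_counts[context + (padded[i],)] += 1
            context_counts[context] += 1
    return trigram_counts, context_counts


def trigram_prob(trigram_counts, context_counts, w1, w2, w3):
    """Unsmoothed estimate of P(w3 | w1, w2); 0.0 if the context is unseen."""
    seen = context_counts[(w1, w2)]
    return trigram_counts[(w1, w2, w3)] / seen if seen else 0.0


# Hypothetical toy corpus; placeholder tokens stand in for Tamil words.
corpus = [["w1", "w2", "w3"], ["w1", "w2", "w4"]]
tri, ctx = train_trigram(corpus)
p = trigram_prob(tri, ctx, "w1", "w2", "w3")
```

A practical model over a 5000-word vocabulary would need smoothing or back-off for unseen trigrams, which this sketch leaves out.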

Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.

Downloads 3588
1506 Fuzzy Estimation of Parameters in Statistical Models

Authors: A. Falsafain, S. M. Taheri, M. Mashinchi

Abstract:

Using a set of confidence intervals, we develop a common approach to constructing a fuzzy set as an estimator for unknown parameters in statistical models. We investigate a method to derive the explicit and unique membership function of such fuzzy estimators. The proposed method has been used to derive the fuzzy estimators of the parameters of a Normal distribution and some functions of parameters of two Normal distributions, as well as the parameters of the Exponential and Poisson distributions.
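
As a rough illustration of building a fuzzy estimator from confidence intervals, the sketch below treats the (1 - alpha) confidence interval for a Normal mean with known standard deviation as the alpha-cut of the fuzzy estimator. This is one common construction and is assumed here; it is not necessarily the exact method of the paper. The function names and the known-sigma setting are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def alpha_cut(alpha, sample_mean, sigma, n):
    """(1 - alpha) confidence interval for the mean, used as the alpha-cut.

    alpha must lie in (0, 1]; alpha = 1 collapses to the sample mean.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def membership(theta, sample_mean, sigma, n):
    """Membership grade of theta: the alpha whose interval has theta as an endpoint."""
    z = abs(theta - sample_mean) * sqrt(n) / sigma
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical sample summary: mean 10.0, known sigma 2.0, n = 25.
low, high = alpha_cut(0.05, 10.0, 2.0, 25)
grade_at_endpoint = membership(low, 10.0, 2.0, 25)
```

Under this construction the membership grade is 1 at the sample mean and decreases symmetrically on both sides.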

Keywords: Confidence interval, fuzzy number, fuzzy estimation.

Downloads 2220
1505 The Estimation of Bird Diversity Loss and Gain as an Impact of Oil Palm Plantation: Study Case in KJNP Estate Riau Province

Authors: Yanto Santosa, Catharina Yudea

Abstract:

The rapid growth of the oil palm industry in Indonesia has drawn many accusations from various parties that oil palm plantations damage the environment and biodiversity, including birds. Since research on the impact of oil palm plantations on bird diversity is still limited, further study is needed. Data on bird diversity were collected in March 2018 in KJNP Estate, Riau Province, using the strip transect method on five land cover types (young, intermediate, and old-growth oil palm plantation, a high conservation value area, and crop fields as the baseline). The observations were conducted simultaneously, with three repetitions. The results show that the baseline has 19 bird species, while the land cover after oil palm plantation has 39 species. The HCV (high conservation value) area shows the largest increase in diversity value. Oil palm plantation has changed the composition of bird species. The highest similarity index, 0.65, is shown by the young-growth oil palm land cover, while the lowest, 0.43, is shown by the HCV area. Overall, the oil palm plantation had a positive impact by increasing bird species diversity, with a total of 23 species gained and 3 species lost.

Keywords: Bird diversity, crops field, impact of oil palm plantation, KJNP estate.

Downloads 720
1504 Isolation of a Bacterial Community with High Removal Efficiencies of the Insecticide Bendiocarb

Authors: Eusebio A. Jiménez-Arévalo, Deifilia Ahuatzi-Chacón, Juvencio Galíndez-Mayer, Cleotilde Juárez-Ramírez, Nora Ruiz-Ordaz

Abstract:

Bendiocarb is a known toxic xenobiotic that presents acute and chronic risks for freshwater invertebrates and estuarine and marine biota; thus, the treatment of water contaminated with the insecticide is of concern. In this paper, a bacterial community able to grow on bendiocarb as its sole carbon and nitrogen source was isolated by enrichment techniques in batch culture from samples of a composting plant located in the northeast of Mexico City. Eight cultivable bacteria were isolated from the microbial community and identified by PCR amplification of 16S rDNA: Pseudoxanthomonas spadix (NC_016147.2, 98%), Ochrobactrum anthropi (NC_009668.1, 97%), Staphylococcus capitis (NZ_CP007601.1, 99%), Bosea thiooxidans (NZ_LMAR01000067.1, 99%), Pseudomonas denitrificans (NC_020829.1, 99%), Agromyces sp. (NZ_LMKQ01000001.1, 98%), Bacillus thuringiensis (NC_022873.1, 97%), and Pseudomonas alkylphenolia (NZ_CP009048.1, 98%). NCBI accession numbers and percentages of similarity are indicated in parentheses. The isolates were assigned to these species because they gave the best similarity matches. The ability of the immobilized bacterial community to degrade bendiocarb was evaluated in a packed bed biofilm reactor, using volcanic stone fragments (tezontle) as support. The reactor system was operated in batch mode using mineral salts medium and 30 mg/L of bendiocarb as carbon and nitrogen source. With this system, an overall removal efficiency (ηbend) of around 90% was reached.

Keywords: Bendiocarb, biodegradation, biofilm reactor, carbamate insecticide.

Downloads 1101
1503 Investigation of the Main Trends of Tourist Expenses in Georgia

Authors: Nino Abesadze, Marine Mindorashvili, Nino Paresashvili

Abstract:

The main purpose of the article is to carry out a complex statistical analysis of the expenses of foreign visitors. We used a mixed selection technique that combines rules of random and proportional selection. SPSS software was used to process the statistical data for the analysis. The tourism statistics methodology was implemented according to international standards. Information was collected at the major Georgian airports and grouped. Techniques of statistical observation were prepared. A representative population of foreign visitors and a rule for selecting respondents were determined. Tourist numbers show a growth trend, and the share of tourists from post-Soviet countries is constantly increasing. The level of satisfaction with tourist facilities and quality of service has grown, but a disparity between quality of service and prices remains. The structure of foreign visitors' expenses is diverse; the competitiveness of the tourist products of Georgian tourist companies is higher.

Keywords: Tourist, expenses, methods, statistics, analysis.

Downloads 898
1502 Immobilization of Lipase Enzyme by Low Cost Material: A Statistical Approach

Authors: Md. Z. Alam, Devi R. Asih, Md. N. Salleh

Abstract:

Immobilization of lipase enzyme produced from palm oil mill effluent (POME) on activated carbon (AC), one of several low-cost support materials, was optimized. The results indicated that AC was the most suitable support material, achieving 94% immobilization. A sequential optimization strategy based on statistical experimental design, including the one-factor-at-a-time (OFAT) method, was used to determine the equilibrium time. Three components influencing lipase immobilization were optimized by response surface methodology (RSM) based on a face-centered central composite design (FCCCD). From the statistical analysis of the results, the optimum enzyme concentration loading, agitation rate and activated carbon dosage were found to be 30 U/ml, 300 rpm and 8 g/L respectively, with a maximum immobilization activity of 3732.9 U/g-AC after 2 h of immobilization. Analysis of variance (ANOVA) showed a high coefficient of determination (R²) of 0.999, which indicated a satisfactory fit of the model to the experimental data. The parameters were statistically significant at p<0.05.
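
For readers unfamiliar with the face-centered central composite design, the sketch below generates its design points in coded units (-1, 0, +1) for three factors. The number of center points and the factor range used in the decoding example are hypothetical; the abstract does not report the authors' design matrix.

```python
from itertools import product


def face_centered_ccd(n_factors, n_center=1):
    """Coded design points: factorial corners, face-centered axial points, center runs."""
    corners = [tuple(p) for p in product((-1, 1), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level  # axial distance is 1, so points sit on the cube faces
            axial.append(tuple(point))
    center = [tuple([0] * n_factors)] * n_center
    return corners + axial + center


def decode(coded, low, high):
    """Map a coded level in [-1, 1] to the real factor range [low, high]."""
    return (low + high) / 2 + coded * (high - low) / 2


design = face_centered_ccd(3, n_center=1)
# Hypothetical agitation range of 100 to 300 rpm, used only to show decoding.
agitation_levels = sorted({decode(run[1], 100, 300) for run in design})
```

With three factors the design has 8 corner runs, 6 axial runs and the chosen number of center runs; a second-order response surface model is then fitted to the measured responses.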

Keywords: Activated carbon, adsorption, immobilization, POME based lipase.

Downloads 2520
1501 Data Mining on the Router Logs for Statistical Application Classification

Authors: M. Rahmati, S.M. Mirzababaei

Abstract:

With the advance of information technology, the use of the Internet to access data resources has steadily increased, and huge amounts of data have become accessible in various forms. Network providers and agencies seek to prevent electronic attacks that may be harmful or related to terrorist applications. This has led the authorities to undertake a variety of methods to protect special regions from harmful data. One of the most important approaches is to use a firewall in the network facilities. The main objective of a firewall is to stop the transfer of suspicious packets in several ways. However, because of its blind packet stopping, high processing power requirements and high price, some providers are reluctant to use a firewall. In this paper we propose a method to find a discriminant function that distinguishes between usual packets and harmful ones by statistical processing of the network router logs. By discriminating these data, an administrator may take appropriate action against the user. This method is very fast and can be used simply alongside the Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packet classification, Statistical Pattern Recognition.

Downloads 1601
1500 The Application of the Queuing Theory in the Traffic Flow of Intersection

Authors: Shuguo Yang, Xiaoyan Yang

Abstract:

Researching the traffic flow of intersections is of practical significance because intersection capacity directly affects the efficiency of the highway network. This paper analyzes the traffic conditions of an intersection in a certain urban area using the methods of queuing theory and statistical experiment, sets up a corresponding mathematical model and compares it with the actual values. The results show that queuing theory can be applied to the study of intersection traffic flow and can provide a reference for other similar designs.
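
The abstract does not say which queuing model was used. As a minimal sketch, assuming the simplest single-server M/M/1 model (Poisson arrivals, exponential service times), the standard steady-state metrics for one approach lane could be computed as follows. The function name and the example rates are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 metrics; rates are in vehicles per second."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # server utilization
    if rho >= 1:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),
        "mean_in_queue": rho ** 2 / (1 - rho),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),
    }


# Hypothetical lane: one arrival every 5 s, one departure every 4 s on average.
metrics = mm1_metrics(arrival_rate=0.2, service_rate=0.25)
```

Comparing such model outputs with observed queue lengths and delays is one way to carry out the model-versus-actual comparison the abstract describes.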

Keywords: Intersection, Queuing theory, Statistical experiment, System metrics.

Downloads 7476
1499 High Aspect Ratio SiO2 Capillary Based On Silicon Etching and Thermal Oxidation Process for Optical Modulator

Authors: N. V. Toan, S. Sangu, T. Saitoh, N. Inomata, T. Ono

Abstract:

This paper presents the design and fabrication of an optical window for an optical modulator intended for image sensing applications. The optical window consists of micrometer-order SiO2 capillaries (a porous solid) that can modulate the transmitted light intensity by moving a liquid in and out of the porous solid. A high optical transmittance can be achieved through refractive index matching when the liquid penetrates the porous solid. Otherwise, the light transmittance is lower because of light reflection and scattering by air holes and capillary walls. Silicon capillaries fabricated by a deep reactive ion etching (DRIE) process are completely oxidized to form the SiO2 capillaries. Therefore, high aspect ratio SiO2 capillaries can be achieved based on silicon capillaries formed by the DRIE technique. The large compressive stress of the oxide causes bending of the capillary structure, which is reduced by optimizing the design of the device structure. The large stress of the optical window can be released via thin supporting beams. An optical window with an area of 7.2 mm x 9.6 mm, intended for full integration with the image sensor format, is successfully fabricated, and its optical transmittance is evaluated with and without inserted liquids (ethanol and matching oil). The achieved modulation range is approximately 20% to 35% with and without liquid penetration in the visible region (wavelength range from 450 nm to 650 nm).

Keywords: Thermal oxidation process, SiO2 capillaries, optical window, light transmittance, image sensor, liquid penetration.

Downloads 2224
1498 Bacteriological Screening and Antibiotic – Heavy Metal Resistance Profile of the Bacteria Isolated from Some Amphibian and Reptile Species of the Biga Stream in Turkey

Authors: Nurcihan Hacioglu, Cigdem Gul, Murat Tosunoglu

Abstract:

In this article, the antibiogram and heavy metal resistance profiles of the bacteria isolated from a total of 34 animals (Pelophylax ridibundus = 12; Mauremys rivulata = 14; Natrix natrix = 8) captured around the Biga Stream are described. No database information existed on the antibiogram and heavy metal resistance profiles of bacteria from this area’s amphibians and reptiles. A total of 200 bacteria were successfully isolated from cloacal and oral samples of the aquatic amphibians and reptiles as well as from the water sample. According to Jaccard’s similarity index, the degree of similarity in the bacterial flora was quite high among the amphibian and reptile species under examination, whereas it differed from the bacterial diversity in the water sample. The most frequent isolates were A. hydrophila (31.5%), B. pseudomallei (8.5%), and C. freundii (7%). The total numbers of bacteria obtained were as follows: 45 in P. ridibundus, 45 in N. natrix, 30 in M. rivulata, and 80 in the water sample. The results showed that cefmetazole was the most effective antibiotic against the bacteria isolated in this study and that approximately 93.33% of the bacterial isolates were sensitive to it. The multiple antibiotic resistance (MAR) index ranked the species as P. ridibundus (0.95) > N. natrix (0.89) > M. rivulata (0.39). Furthermore, all the tested heavy metals (Pb2+, Cu2+, Cr3+, and Mn2+) inhibited the growth of the bacterial isolates at different rates. These results indicate that the water source of the animals was contaminated with both antibiotic residues and heavy metals.
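
Jaccard's similarity index, used above to compare bacterial flora, is the size of the intersection of two sets divided by the size of their union. The sketch below shows the computation; the species labels are placeholders, not the study's data, and returning 0.0 for two empty sets is a convention chosen here.

```python
def jaccard_index(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| for two collections of labels."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # undefined for two empty sets; 0.0 is an arbitrary choice
    return len(set_a & set_b) / len(union)


# Hypothetical isolate lists for two hosts (placeholder labels).
host_1 = ["species_A", "species_B", "species_C"]
host_2 = ["species_B", "species_C", "species_D"]
similarity = jaccard_index(host_1, host_2)  # 2 shared out of 4 distinct -> 0.5
```

A value near 1 means the two hosts share nearly all isolated species, while a value near 0 means they share few.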

Keywords: Amphibian, Bacteriological Quality, Reptile, Antibiotic & Heavy Metal Resistance.

Downloads 2182
1497 Development of a Complex Meteorological Support System for UAVs

Authors: Z. Bottyán, F. Wantuch, A. Z. Gyöngyösi, Z. Tuba, K. Hadobács, P. Kardos, R. Kurunczi

Abstract:

The sensitivity of UAVs to atmospheric effects is apparent. Nevertheless, the meteorological support for UAV missions is often inadequate or partly missing. In this paper we present a new complex meteorological support system for pilots of different types of UAVs, as well as for specialists and decision makers. The system has two main parts with different forecasting approaches: statistical and dynamical. The statistical prediction approach is based on a large climatological database and a special analog method that selects similar weather situations from the database and applies them during the forecasting procedure. The dynamical approach uses specific WRF model runs twice a day to produce a 96-hour, high-resolution weather forecast for UAV users over Hungary. An easy-to-use web-based system provides important weather information over the Carpathian Basin in Central Europe. The products can be accessed over the Internet.
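
The idea behind an analog method can be sketched as a nearest-neighbour search: find the past situations most similar to the current one and use what followed them as the forecast. The sketch below is a generic illustration, not the authors' algorithm; the record layout, the Euclidean distance on pre-scaled features and the averaging of outcomes are all assumptions.

```python
from math import sqrt


def find_analogs(current, archive, k=3):
    """Return the k archive records whose features are closest to `current`."""
    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    ranked = sorted(archive, key=lambda rec: distance(current, rec["features"]))
    return ranked[:k]


def analog_forecast(current, archive, k=3):
    """Average the outcomes that followed the k most similar past situations."""
    analogs = find_analogs(current, archive, k)
    return sum(rec["outcome"] for rec in analogs) / len(analogs)


# Hypothetical archive: features are assumed to be standardized predictors,
# and outcome is the value observed afterwards (for example, visibility in m).
archive = [
    {"features": (0.0, 0.0), "outcome": 1000.0},
    {"features": (1.0, 1.0), "outcome": 3000.0},
    {"features": (5.0, 5.0), "outcome": 9000.0},
]
forecast = analog_forecast((0.2, 0.1), archive, k=2)
```

In practice the choice of predictors, their scaling and the similarity measure largely determine which past situations count as analogs.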

Keywords: Aviation meteorology, statistical weather prediction, unmanned aerial systems, WRF.

Downloads 2707