Search results for: Cover Text
802 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities
Authors: Khaled M. Alhawiti
Abstract:
This paper analyzes the role of natural language processing (NLP) in the context of automated data retrieval, automated question answering, and text structuring. NLP techniques are gaining wider acceptance in real-life applications and industrial concerns. Processing natural-language text in a way that satisfies the needs of decision makers involves various complexities. The paper begins with a description of the qualities of NLP practices, then focuses on the challenges in natural language processing, and also discusses major NLP techniques. The last section describes opportunities and challenges for future research.
Keywords: Data Retrieval, Information retrieval, Natural Language Processing, Text Structuring.
Downloads 2834
801 Different Multimedia Presentation Types and Students' Interpretation Achievement
Authors: Cenk Akbiyik, Gonul Altin Akbiyik
Abstract:
The main purpose of the study was to determine whether students' interpretation achievement differed with the use of various multimedia presentation types. Four groups of students, text only (T), audio only (A), text and audio (TA), and text and image (TI), were arranged, and they were presented the same story via different types of multimedia presentations. Inference achievement was measured by a critical thinking inference test. Higher mean scores were found for the TA group than for the other three groups. In pairwise comparisons, the interpretation achievement of the TA group also differed significantly from the scores of the T and TI groups. These differences were interpreted in terms of increased cognitive load. Increased cognitive load for the TA group may have invited students to put more effort into comprehending the text, thus resulting in better test scores. The findings of the study can be seen as a sign of the importance of learning situations and learning outcomes in multimedia-supported learning environments and may have practical benefits for instructional designers.
Keywords: Multimedia, cognitive multimedia, dual coding, cognitive load, critical thinking.
Downloads 3451
800 Evolutionary Feature Selection for Text Documents using the SVM
Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection (called SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with the GA_SVM method for a relatively small dimension of the feature vector.
Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, Classification.
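Of the three methods named in this abstract, Information Gain is the simplest to illustrate. The following is a minimal sketch, assuming documents are represented as sets of tokens and that a term is scored by the entropy reduction from splitting on its presence; the toy corpus and the function names are hypothetical and not taken from the paper.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting the documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Weighted entropy of the two partitions (empty partitions contribute nothing).
    remainder = sum(
        len(part) / n * entropy(part) for part in (with_term, without_term) if part
    )
    return entropy(labels) - remainder


# Hypothetical toy corpus: each document is a set of tokens.
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"market", "trade"}]
labels = ["sport", "sport", "finance", "finance"]

# Rank every term in the vocabulary by its information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranking = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

Keeping only the top-ranked terms is one way to shrink the document-representation vector before training a classifier.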
Downloads 1706
799 A Study of Touching Characters in Degraded Gurmukhi Text
Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma
Abstract:
Character segmentation is an important preprocessing step for text recognition. In degraded documents, the existence of touching characters drastically decreases the recognition rate of any optical character recognition (OCR) system. In this paper, a study of touching Gurmukhi characters is carried out, and these characters have been divided into various categories after a careful analysis. Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in the middle zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine-printed text.
Keywords: Character Segmentation, Middle Zone, Touching Characters.
Downloads 1841
798 Providing Medical Information in Braille: Research and Development of Automatic Braille Translation Program for Japanese “eBraille”
Authors: Aki Sugano, Mika Ohta, Mineko Ikegami, Kenji Miura, Sayo Tsukamoto, Akihiro Ichinose, Toshiko Ohshima, Eiichi Maeda, Masako Matsuura, Yutaka Takao
Abstract:
Along with the advances in medicine, providing medical information to individual patients is becoming more important. In Japan, such information is hardly ever provided in Braille to blind and partially sighted people. Thus we are researching and developing a Web-based automatic translation program, “eBraille”, to translate Japanese text into Japanese Braille. First, we analyzed the Japanese transcription rules to implement them in our program. We then added medical words to the dictionary of the program to improve its translation accuracy for medical text. Finally, we examined the efficacy of statistical learning models (SLMs) for further increasing word segmentation accuracy in braille translation. As a result, eBraille had the highest translation accuracy in the comparison with other translation programs, improved its accuracy for medical text, and is used to make hospital brochures in braille for outpatients and inpatients.
Keywords: Automatic Braille translation, Medical text, Partially sighted people.
Downloads 1601
797 Edit Distance Algorithm to Increase Storage Efficiency of Javanese Corpora
Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy
Abstract:
Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient: recording all possible combinations consumes more space in the database and is time-consuming. The original algorithm is modified by applying the edit distance coefficient to improve the data-storage efficiency. As a result, the data-storage consumption is reduced by 90%, and the learning period is reduced as well (42 s).
Keywords: edit distance coefficient, Javanese, parallel text alignment, phrase pair combination
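The abstract does not define its edit distance coefficient, so the sketch below shows only the standard Levenshtein edit distance on which such a coefficient could plausibly be built. The function name and the idea of using the distance to spot near-duplicate phrase pairs are assumptions for illustration, not the authors' method.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string `a` into string `b`."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,        # delete ca
                    curr[j - 1] + 1,    # insert cb
                    prev[j - 1] + cost  # substitute (or keep) the character
                )
            )
        prev = curr
    return prev[-1]


distance = levenshtein("kitten", "sitting")
```

A storage-saving scheme could, for instance, skip recording a phrase whose distance to an already stored phrase falls below a chosen threshold; whether the paper does this is not stated.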
Downloads 1728
796 A Text Clustering System based on k-means Type Subspace Clustering and Ontology
Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang
Abstract:
This paper presents a text clustering system based on a k-means type subspace clustering algorithm to cluster large, high-dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. For understanding and interpretation of clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of noun words and consolidate the synonymy and hyponymy words. Experimental results have shown that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting the words that represent the topics of the clusters.
Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology
Downloads 2462
795 The Negative Effect of Traditional Loops Style on the Performance of Algorithms
Authors: Mahmoud Moh'd Mhashi
Abstract:
A new algorithm called Character-Comparison to Character-Access (CCCA) is developed to test the effect of both: 1) converting character-comparison and number-comparison into character-access and 2) the starting point of checking on the performance of the checking operation in string searching. An experiment is performed using both English text and DNA text with different sizes. The results are compared with five algorithms, namely, Naive, BM, Inf_Suf_Pref, Raita, and Cycle. With the CCCA algorithm, the results suggest that the evaluation criteria of the average number of total comparisons are improved up to 35%. Furthermore, the results suggest that the clock time required by the other algorithms is improved in range from 22.13% to 42.33% by the new CCCA algorithm.
Keywords: Pattern matching, string searching, character-comparison, character-access, text type, and checking
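The evaluation criterion mentioned in this abstract, the total number of character comparisons, can be made concrete with the Naive baseline the paper compares against. This is a minimal sketch of naive string searching with a comparison counter; it does not implement CCCA, whose details are not given in the abstract, and the sample DNA text is hypothetical.

```python
def naive_search(text, pattern):
    """Return (match positions, number of character comparisons performed)."""
    positions = []
    comparisons = 0
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        matched = 0
        # Compare left to right until a mismatch or a full match.
        while matched < m:
            comparisons += 1
            if text[start + matched] != pattern[matched]:
                break
            matched += 1
        if matched == m:
            positions.append(start)
    return positions, comparisons


# Hypothetical DNA-like text and pattern.
positions, comparisons = naive_search("ACGTACGT", "CGT")
```

Counting comparisons this way gives a machine-independent figure that can be reported alongside clock time.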
Downloads 1270
794 Performance Evaluation of an Aboveground LNG Storage Tank Cover using Nondestructive and Destructive Tests
Authors: Sungnam Hong, Sun-Kyu Park, Jieun Jeong, Jinwoong Choi
Abstract:
In this study, a new procedure for inspecting damages on LNG storage tanks was proposed with the use of structural diagnostic techniques: i.e., nondestructive inspection techniques such as macrography, the hammer sounding test, the Schmidt hammer test, and the ultrasonic pulse velocity test, and destructive inspection techniques such as the compressive strength test, the chloride penetration test, and the carbonation test. From the analysis of all the test results, it was concluded that the LNG storage tank cover was in good condition. Such results were also compared with the Korean concrete standard specifications and design values. In addition, the remaining life of the LNG storage tank was estimated by using existing models. Based on the results, an LNG storage tank cover performance evaluation procedure was suggested.
Keywords: Destructive test, LNG storage tank, Nondestructive test, Performance evaluation procedure, Remaining life.
Downloads 3190
793 A Text Mining Technique Using Association Rules Extraction
Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey
Abstract:
This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It relies on keyword features to discover association rules among the keywords labeling the documents. In this work, the EART system ignores the order in which words occur and focuses instead on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TFIDF), for keyword/feature selection that automatically selects the most discriminative keywords for use in association rule generation, and uses a Data Mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were applied to web-page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and evaluation of the extracted association rules.
Keywords: Text mining, data mining, association rule mining
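Association rules among keywords are conventionally scored by support and confidence. The sketch below computes both for keyword sets, assuming each document has already been reduced to a set of keywords; it is a generic illustration rather than the GARW algorithm, and the sample documents are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of documents whose keyword set contains every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`.

    Assumes the antecedent occurs in at least one document.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical keyword sets, one per news document.
documents = [
    {"bird", "flu", "outbreak"},
    {"bird", "flu", "vaccine"},
    {"flu", "vaccine"},
    {"bird", "migration"},
]

rule_support = support(documents, {"bird", "flu"})
rule_confidence = confidence(documents, {"bird"}, {"flu"})
```

A rule such as bird -> flu would be kept only if both values exceed user-chosen minimum thresholds.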
Downloads 4437
792 Text Mining Technique for Data Mining Application
Authors: M. Govindarajan
Abstract:
Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. The decision tree approach is most useful in classification problems. With this technique, a tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that applies rulesets, cross-validation and boosting to the original C5.0 in order to reduce the error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of a medical data set, hypothyroid. It is shown that the performance of a classifier on the training cases from which it was constructed gives a poor estimate of its accuracy. Whether accuracy is estimated by sampling or by using a separate test file, the classifier is evaluated on cases that were not used to build it, and the numbers of cases used to build and to evaluate the classifier should both be large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772-case training set and a 1000-case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of See5 is its ability to generate classifiers called rulesets. The ruleset has an error rate of 0.5% on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive accuracy is f-fold cross-validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.
Keywords: C5.0, Error Ratio, text mining, training data, test data.
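The cross-validated error estimate described in this abstract (total errors on hold-out cases divided by total cases) can be sketched as follows. This is an illustration under stated assumptions: the folds are formed by index modulo the fold count, and the learner is a hypothetical stand-in that always predicts the majority class. It is not C5.0 or See5.

```python
from collections import Counter


def cross_validated_error(cases, labels, train, folds):
    """Estimate the error rate as total hold-out errors / total number of cases.

    `train` is any function that takes (cases, labels) and returns a
    predictor mapping a single case to a label.
    """
    n = len(cases)
    errors = 0
    for f in range(folds):
        held_out = [i for i in range(n) if i % folds == f]
        kept = [i for i in range(n) if i % folds != f]
        model = train([cases[i] for i in kept], [labels[i] for i in kept])
        errors += sum(model(cases[i]) != labels[i] for i in held_out)
    return errors / n


def train_majority(cases, labels):
    """Stand-in learner (hypothetical): always predicts the most common training label."""
    majority = Counter(labels).most_common(1)[0][0]
    return lambda case: majority


# Hypothetical data: 8 negative and 2 positive cases.
cases = list(range(10))
labels = ["neg"] * 8 + ["pos"] * 2
error_rate = cross_validated_error(cases, labels, train_majority, folds=5)
```

Every case is held out exactly once, so the estimate uses all the data for testing while never testing a classifier on cases it was built from.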
Downloads 2489
791 Simulation of Snow Covers Area by a Physical based Model
Authors: Hossein Zeinivand, Florimond De Smedt
Abstract:
Snow cover is an important phenomenon in hydrology; hence, modeling snow accumulation and melting is an important issue in places where snowmelt significantly contributes to runoff and has a significant effect on water balance. Physics-based models are invariably distributed, with the basin disaggregated into zones or grid cells. Satellite images provide valuable data to verify the accuracy of spatially distributed model outputs. In this study, a spatially distributed physically based model (WetSpa) was applied to predict snow cover and melting in the Latyan dam watershed in Iran. Snowmelt is simulated based on an energy balance approach. The model is applied and calibrated with one year of observed daily precipitation, air temperature, wind speed, and daily potential evaporation. The predicted snow-covered area is compared with remotely sensed images (MODIS). The results show that the simulated snow cover area (SCA) is in good agreement with the SCA derived from MODIS satellite images. The model performance is also tested by statistical and graphical comparison of simulated and measured discharges entering the Latyan dam reservoir.
Keywords: Physical based model, Satellite image, Snow covers.
Downloads 1865
790 Knowledge Acquisition for the Construction of an Evolving Ontology: Application to Augmented Surgery
Authors: Nora Taleb, Sellami Mokhtar, Michel Simonet
Abstract:
This work concerns the evolution and the maintenance of an ontological resource in relation to the evolution of the corpus of texts from which it was built. The knowledge forming a text corpus, especially in dynamic domains, is in continuous evolution. When a change in the corpus occurs, the domain ontology must evolve accordingly. Most methods manage ontology evolution independently of the corpus from which it is built; in addition, they treat evolution just as a process of knowledge addition, not considering other knowledge changes. We propose a methodology for managing an evolving ontology from a text corpus that evolves over time, while preserving the consistency and the persistence of this ontology. Our methodology is based on the changes made to the corpus to reflect the evolution of the considered domain, augmented surgery in our case. In this context, the results of text mining techniques, as well as a slightly modified ARCHONTE method, are used to support the evolution process.
Keywords: Corpus, Evolution, Ontology
Downloads 1443
789 Text Summarization for Oil and Gas Drilling Topic
Authors: Y. Y. Chen, O. M. Foong, S. P. Yong, Kurniawan Iwan
Abstract:
Information sharing and gathering are important in an era of rapid technological advancement. The existence of the WWW has caused an explosion of information. Readers are overloaded with lengthy text documents when they are more interested in shorter versions. The oil and gas industry cannot escape this predicament. In this paper, we develop an Automated Text Summarization System known as AutoTextSumm to extract the salient points of oil and gas drilling articles by incorporating a statistical approach, keyword identification, synonym words and sentence position. In this study, we conducted interviews with Petroleum Engineering experts and English Language experts to identify the list of most commonly used keywords in the oil and gas drilling domain. The system performance of AutoTextSumm is evaluated using the formulae of precision, recall and F-score. Based on the experimental results, AutoTextSumm has produced satisfactory performance with an F-score of 0.81.
Keywords: Keyword's probability, synonym sets.
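Precision, recall and F-score for an extractive summary can be computed by comparing the sentences a system selects with those a human reference contains. The sketch below assumes sentences are identified by index and uses the balanced F-score (F1); the abstract does not say which F-score variant was used, and the sample indices are hypothetical.

```python
def precision_recall_f1(selected, reference):
    """Compare system-selected sentence ids with reference sentence ids."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical example: the system picks sentences 1-4, the reference has 2-6.
precision, recall, f1 = precision_recall_f1({1, 2, 3, 4}, {2, 3, 4, 5, 6})
```

Precision penalizes selecting irrelevant sentences, recall penalizes missing relevant ones, and the F-score balances the two.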
Downloads 1731
788 The Main Principles of Text-to-Speech Synthesis System
Authors: K.R. Aida–Zade, C. Ardil, A.M. Sharifova
Abstract:
In this paper, the main principles of a text-to-speech synthesis system are presented. The problems that arise when developing a speech synthesis system are described. The approaches used and their application in speech synthesis systems for the Azerbaijani language are shown.
Keywords: synthesis of Azerbaijani language, morphemes, phonemes, sounds, sentence, speech synthesizer, intonation, accent, pronunciation.
Downloads 5652
787 Plants Cover Effects on Overland Flow and on Soil Erosion under Simulated Rainfall Intensity
Authors: H. Madi, L. Mouzai, M. Bouhadef
Abstract:
The purpose of this article is to study the effects of plant cover on overland flow and, therefore, its influence on the amount of eroded and transported soil. In this investigation, all the experiments were conducted in the LEGHYD laboratory using a rainfall simulator and a soil tray. The experimental plot (soil tray) is 2 m long, 0.5 m wide and 0.15 m deep. The soil used is an agricultural sandy soil (62.08% coarse sand, 19.14% fine sand, 11.57% silt and 7.21% clay). Plastic rods (4 mm in diameter) were used to simulate the plants at different densities: 0 stems/m² (bare soil), 126 stems/m², 203 stems/m², 461 stems/m² and 2500 stems/m². The rainfall intensity used is 73 mm/h and the soil tray slope is fixed at 3°. The results have shown that the overland flow velocities decreased with increasing stem density, and that the cover density has a great effect on sediment concentration. Darcy–Weisbach and Manning friction coefficients of overland flow increased when the stem density increased. Froude and Reynolds numbers decreased with increasing stem density and, consequently, the flow regime of all treatments was laminar and subcritical. From these findings, we conclude that increasing the plant cover can efficiently reduce soil loss and avoid denuding the plant roots.
Keywords: Soil erosion, vegetation, stems density, overland flow.
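The laminar and subcritical classification in this abstract rests on the Reynolds and Froude numbers. The sketch below uses the usual shallow-flow definitions, with flow depth as the length scale; the velocity, depth, viscosity value and the laminar threshold of roughly 500 are assumptions for illustration and are not taken from the experiments.

```python
import math

G = 9.81            # gravitational acceleration, m/s^2
NU_WATER = 1.0e-6   # kinematic viscosity of water near 20 °C, m^2/s (approximate)


def froude_number(velocity, depth):
    """Fr = v / sqrt(g * h); Fr < 1 indicates subcritical flow."""
    return velocity / math.sqrt(G * depth)


def reynolds_number(velocity, depth, viscosity=NU_WATER):
    """Re = v * h / nu, using flow depth as the characteristic length."""
    return velocity * depth / viscosity


# Hypothetical sheet-flow values: 5 cm/s velocity, 2 mm depth.
v, h = 0.05, 0.002
fr = froude_number(v, h)
re = reynolds_number(v, h)

is_subcritical = fr < 1.0
# Assumed threshold: open-channel flow is often treated as laminar below about Re = 500.
is_laminar = re < 500.0
```

Lower velocities at higher stem densities reduce both numbers, which is consistent with the trend the abstract reports.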
Downloads 3126
786 Hydrological Modelling of Geological Behaviours in Environmental Planning for Urban Areas
Authors: Sheetal Sharma
Abstract:
Runoff, decreasing water levels and recharge in urban areas have become a complex issue nowadays, pointing to defective urban design and demographic growth as causes. Very little has been discussed or analysed regarding water-sensitive Urban Master Plans or local area plans. Land use planning deals with land transformation from natural areas into developed ones, which leads to changes in the natural environment. Detailed knowledge of the relationship between existing patterns of land use-land cover and recharge, with respect to the prevailing soil below, lags behind the speed of development. The parameters of incompatibility between urban functions and the functions of the natural environment are becoming more varied. Changes in land patterns due to built-up areas, pavements, roads and similar land cover seriously affect surface water flow. They also change the permeability and absorption characteristics of the soil. Urban planners need to know natural processes along with modern means and the best technologies available, as there is a huge gap between basic knowledge of natural processes and what is required for balanced development planning with minimum impact on water recharge. The present paper analyzes the variations in land use-land cover and their impacts on surface flows and sub-surface recharge in the study area. The methodology adopted was to analyse the changes in land use and land cover using GIS and AutoCAD Civil 3D. The variations were used in computer modeling with the Storm Water Management Model to find the runoff for various soil groups and the resulting recharge, observing water levels in POW data for the last 40 years of the study area. Results were analyzed again to find the best correlations for sustainable recharge in urban areas.
Keywords: Geology, runoff, urban planning, land use-land cover.
Downloads 1318
785 Classifying Biomedical Text Abstracts based on Hierarchical 'Concept' Structure
Authors: Rozilawati Binti Dollah, Masaki Aono
Abstract:
Classifying biomedical literature is a difficult and challenging task, especially when a large number of biomedical articles must be organized into a hierarchical structure. In this paper, we present an approach for classifying a collection of biomedical text abstracts downloaded from the Medline database with the help of ontology alignment. To accomplish our goal, we construct two types of hierarchies, the OHSUMED disease hierarchy and the Medline abstract disease hierarchies, from the OHSUMED dataset and the Medline abstracts, respectively. Then, we enrich the OHSUMED disease hierarchy before adapting it to the ontology alignment process for finding probable concepts or categories. Subsequently, we compute the cosine similarity between the vector in probable concepts (in the “enriched” OHSUMED disease hierarchy) and the vector in the Medline abstract disease hierarchies. Finally, we assign a category to the new Medline abstracts based on the similarity score. The results obtained from the experiments show that the performance of our proposed approach for hierarchical classification is slightly better than the performance of multi-class flat classification.
Keywords: Biomedical literature, hierarchical text classification, ontology alignment, text mining.
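The category assignment step relies on cosine similarity between term vectors. The sketch below computes it for sparse vectors stored as dictionaries and picks the best-scoring category; the term weights and category names are hypothetical, and the construction of the vectors from the hierarchies is not shown because the abstract does not describe it.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors given as {term: weight}."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical concept vectors and one abstract vector.
concepts = {
    "heart disease": {"cardiac": 2.0, "artery": 1.0},
    "lung disease": {"pulmonary": 2.0, "asthma": 1.0},
}
abstract_vector = {"cardiac": 1.0, "artery": 1.0, "patient": 1.0}

# Assign the category whose concept vector is most similar to the abstract.
best_category = max(concepts, key=lambda c: cosine_similarity(concepts[c], abstract_vector))
```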
Downloads 2011
784 Image Steganography Using Least Significant Bit Technique
Authors: Preeti Kumari, Ridhi Kapoor
Abstract:
In today’s world, security is the most important issue in any communication. Steganography is the process of hiding important data in other data, such as text, audio, video, and images. The interest in this topic is to provide availability, confidentiality, integrity, and authenticity of data. The steganographic technique embeds hidden content in unremarkable cover media so as not to provoke the suspicion of eavesdroppers, third parties, or hackers. Many applications of compression, encryption, decryption, and embedding methods are used for digital image steganography. Compression produces noise in the image; to cope with this noise, the LSB insertion technique is used. The performance of the proposed embedding system with respect to providing security to the secret message and robustness is discussed. We also demonstrate the maximum steganography capacity and visual distortion.
Keywords: Steganography, LSB, encoding, information hiding, color image.
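LSB insertion replaces the least significant bit of each cover byte with one bit of the secret message. The following is a minimal sketch operating on raw bytes, assuming the cover is an uncompressed sequence of pixel values; it is a generic illustration rather than the embedding system proposed in the paper, and the function names are hypothetical.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bits of `cover`, one bit per byte."""
    # Message bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Recover `length` message bytes from the least significant bits of `stego`."""
    out = bytearray()
    for k in range(length):
        value = 0
        for byte in stego[8 * k : 8 * k + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


# Hypothetical cover: 32 pixel values; each can carry one message bit.
cover = bytes(range(100, 132))
stego = embed_lsb(cover, b"Hi")
recovered = extract_lsb(stego, 2)
```

Each cover byte changes by at most 1, which is why the visual distortion of this scheme is small; capacity in this sketch is one bit per cover byte.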
Downloads 1092
783 Wasting Human and Computer Resources
Authors: Mária Csernoch, Piroska Biró
Abstract:
The legends about “user-friendly” and “easy-to-use” birotical tools (computer-related office tools) have been spreading and misleading end-users. This approach has led to an extremely high number of incorrect documents, causing serious financial losses in the creating, modifying, and retrieving processes. Our research proved that there are at least two sources of this underachievement: (1) The lack of a definition of correctly edited, formatted documents. Consequently, end-users do not know whether their methods and results are correct or not. They are not aware of their ignorance. They are so ignorant that their ignorance does not allow them to realize their lack of knowledge. (2) The end-users’ problem-solving methods. We have found that in non-traditional programming environments end-users apply, almost exclusively, surface approach metacognitive methods to carry out their computer-related activities, which have proved less effective than deep approach methods. Based on these findings, we have developed deep approach methods which are based on and adapted from traditional programming languages. In this study, we focus on the most popular type of birotical documents, text-based documents. We have provided a definition of correctly edited text and, based on this definition, adapted the debugging method known from programming. According to the method, before actual text editing takes place, a thorough debugging of already existing texts and a categorization of errors are carried out. With this method, before doing real text editing, users learn the requirements of text-based documents and of correctly formatted text. The method has proved much more effective than the previously applied surface approach methods.
The advantages of the method are that real text handling requires far fewer human and computer resources than clicking aimlessly in the GUI (Graphical User Interface), and that data retrieval is much more effective than from error-prone documents.
Keywords: Deep approach metacognitive methods, error-prone birotical documents, financial losses, human and computer resources.
Downloads 1911
782 Image Classification and Accuracy Assessment Using the Confusion Matrix, Contingency Matrix, and Kappa Coefficient
Authors: F. F. Howard, C. B. Boye, I. Yakubu, J. S. Y. Kuma
Abstract:
Remote sensing is one way to produce land use and land cover maps, through a procedure known as image classification. Numerous elements ought to be taken into consideration, including the availability of highly satisfactory Landsat imagery, secondary data and a precise classification process. The goal of this study was to classify and map the land use and land cover of the study area using remote sensing and Geospatial Information System (GIS) analysis. The classification was done using Landsat 8 satellite images acquired in December 2020 covering the study area. The Landsat image was downloaded from the USGS. The Landsat image with 30 m resolution was geo-referenced to the WGS_84 datum and the Universal Transverse Mercator (UTM) Zone 30N coordinate projection system. A radiometric correction was applied to the image to reduce noise. This study consists of two sections: the Land Use/Land Cover (LULC) classification and the Accuracy Assessment using the confusion and contingency matrices and the Kappa coefficient. The LULC classes were vegetation (agriculture) (67.87%), water bodies (0.01%), mining areas (5.24%), forest (26.02%), and settlement (0.88%). An overall accuracy of 97.87% and a kappa coefficient (K) of 97.3% were obtained for the confusion matrix, while an overall accuracy of 95.7% and a Kappa coefficient of 0.947 were obtained for the contingency matrix. The kappa coefficients were rated as substantial; hence, the classified image is fit for further research.
Keywords: Confusion Matrix, contingency matrix, kappa coefficient, land use/land cover, accuracy assessment.
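Overall accuracy and Cohen's kappa are both derived from the confusion matrix. The sketch below shows the standard calculation, assuming rows hold the reference classes and columns the classified classes; the two-class matrix is hypothetical and unrelated to the study's results.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (rows: reference, columns: classified).
confusion = [
    [45, 5],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance alone.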
Downloads 252
781 Interannual Variations in Snowfall and Continuous Snow Cover Duration in Pelso, Central Finland, Linked to Teleconnection Patterns, 1944-2010
Authors: M. Irannezhad, E. H. N. Gashti, S. Mohammadighavam, M. Zarrini, B. Kløve
Abstract:
Climate warming would increase rainfall by shifting the form of precipitation from snow to rain, and would accelerate the disappearance of snow cover by increasing snowpack. Using temperature and precipitation data in the temperature-index snowmelt model, we evaluated the variability of snowfall and continuous snow cover duration (CSCD) during 1944-2010 over Pelso, central Finland. The Mann-Kendall non-parametric test determined that annual precipitation increased by 2.69 mm/year (p<0.05) during the study period, but found no clear trend in annual temperature. Annual rainfall and snowfall increased by 1.67 and 0.78 mm/year (p<0.05), respectively. CSCD was generally about 205 days, from 14 October to 6 May. No clear trend was found in CSCD over Pelso. Spearman’s rank correlation showed the most significant relationships of annual snowfall with the East Atlantic (EA) pattern, and of CSCD with the East Atlantic/West Russia (EA/WR) pattern. Increased precipitation without warming caused rainfall and snowfall to increase, while having no effect on CSCD.
Keywords: Variations, snowfall, snow cover duration, temperature-index snowmelt model, teleconnection patterns.
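The Mann-Kendall test used in this abstract detects a monotonic trend by counting concordant and discordant pairs in a time series. The sketch below computes the S statistic and its normal approximation Z, assuming no tied values (the tie correction to the variance is omitted); the precipitation series is hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, without tie correction."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # +1 for an increase, -1 for a decrease
    variance = n * (n - 1) * (2 * n + 5) / 18  # valid when there are no ties
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    return s, z


# Hypothetical annual precipitation totals (mm) with an upward trend.
s_stat, z_stat = mann_kendall([510.0, 523.0, 530.0, 541.0, 560.0])
```

An absolute Z above about 1.96 indicates a trend that is significant at the 5% level under the normal approximation.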
Downloads 1915
780 Automatic Text Summarization
Authors: Mohamed Abdel Fattah, Fuji Ren
Abstract:
This work proposes an approach to automatic text summarization. The approach is a trainable summarizer that takes into account several features to generate summaries: sentence position, positive keyword, negative keyword, sentence centrality, sentence resemblance to the title, sentence inclusion of named entities, sentence inclusion of numerical data, sentence relative length, bushy path of the sentence, and aggregated similarity for each sentence. First, we investigate the effect of each sentence feature on the summarization task. Then we use the score function of all features to train genetic algorithm (GA) and mathematical regression (MR) models to obtain a suitable combination of feature weights. The performance of the proposed approach is measured at several compression rates on a data corpus composed of 100 English religious articles. The results of the proposed approach are promising.
Keywords: Automatic Summarization, Genetic Algorithm, Mathematical Regression, Text Features.
Downloads 2335
779 FRP Bars Spacing Effect on Numerical Thermal Deformations in Concrete Beams under High Temperatures
Authors: A. Zaidi, F. Khelifi, R. Masmoudi, M. Bouhicha
Abstract:
In order to eradicate the degradation of reinforced concrete structures due to steel corrosion, construction professionals suggest using fiber reinforced polymers (FRP) for their excellent properties. Nevertheless, high temperatures may affect the bond between the FRP bar and concrete, and consequently the serviceability of FRP-reinforced concrete structures. This paper presents a nonlinear numerical investigation, using ADINA software, of the effect of the spacing between glass FRP (GFRP) bars embedded in concrete on circumferential thermal deformations and on the distribution of radial thermal cracks in reinforced concrete beams subjected to high temperature variations of up to 60 °C for asymmetrical problems. The thermal deformations predicted by the nonlinear finite element model, at the FRP bar/concrete interface and at the external surface of the concrete cover, were established as a function of the ratio of concrete cover thickness to FRP bar diameter (c/db) and the ratio of spacing between FRP bars to FRP bar diameter (e/db). Numerical results show that the circumferential thermal deformations at the external surface of the concrete cover are linear up to the cracking thermal load, which varies from 32 to 55 °C as the ratio e/db varies from 1.3 to 2.3, respectively. However, for ratios e/db > 2.3 and c/db > 1.6, the thermal deformations at the external surface of the concrete cover exhibit linear behavior, with no cracks observed on that surface. The numerical results are compared to those obtained from analytical models validated by experimental tests.
Keywords: Concrete beam, FRP bars, spacing effect, thermal deformation.
Downloads 632
778 Root System Production and Aboveground Biomass Production of Chosen Cover Crops
Authors: M. Hajzler, J. Klimesova, T. Streda, K. Vejrazka, V. Marecek, T. Cholastova
Abstract:
The most planted cover crops in the Czech Republic are mustard (Sinapis alba) and phacelia (Phacelia tanacetifolia Benth.). A field trial was carried out to evaluate root system size (RSS) in eight varieties of mustard and five varieties of phacelia at two locations, in three BBCH phases and in two years. The relationship between RSS and aboveground biomass was investigated. The root system was assessed by measuring its electrical capacitance. Aboveground mass and root samples for digital image analysis were collected at BBCH phase 70. The yield of aboveground biomass of mustard was always statistically significantly higher than that of phacelia. Mustard showed a statistically significant negative correlation between root length density (RLD) within 10 cm and aboveground biomass weight (r = -0.46*). Phacelia showed a statistically significant correlation between aboveground biomass production and nitrate nitrogen content in soil (r = 0.782**).
Keywords: Aboveground Biomass, Cover crop, Nitrogen content, Root system size
Downloads 1680
777 Text Summarization for Oil and Gas News Article
Authors: L. H. Chong, Y. Y. Chen
Abstract:
Information is increasing in volume, and companies are so overloaded that they may lose track of the information they actually need. Scanning through each lengthy document is a time-consuming task. A shorter version of the document that contains only the gist is more favourable for most information seekers. Therefore, in this paper, we implement a text summarization system that produces a summary containing the gist of oil and gas news articles. The summarization is intended to provide important information for oil and gas companies to monitor their competitors' behaviour and so help them formulate business strategies. The system integrates a statistical approach with three underlying concepts: keyword occurrences, the title of the news article, and the location of the sentence. The generated summaries were compared with human-generated summaries from an oil and gas company. Precision and recall ratios are used to evaluate the accuracy of the generated summaries. Based on the experimental results, the system is able to produce an effective summary, with an average recall value of 83% at a compression rate of 25%.
Keywords: Information retrieval, text summarization, statistical approach.
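The three concepts named in this abstract (keyword occurrences, the article title, and sentence location) can be combined into a simple extractive scorer, and the result checked with precision and recall. The sketch below is only an illustration under stated assumptions, not the authors' system: the function names, the equal weighting of the three scores, the `1 / (index + 1)` position score and the lack of stop-word removal are all choices made here for brevity.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; no stop-word removal (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(title, sentences, compression_rate=0.25):
    """Pick the top-scoring sentences, returned in document order."""
    title_words = set(tokenize(title))
    freq = Counter(w for s in sentences for w in tokenize(s))
    max_freq = max(freq.values()) if freq else 1
    scored = []
    for idx, sentence in enumerate(sentences):
        words = tokenize(sentence)
        if not words:
            continue
        # Keyword score: mean word frequency, normalised to [0, 1].
        keyword = sum(freq[w] for w in words) / (len(words) * max_freq)
        # Title score: share of title words that appear in the sentence.
        overlap = len(title_words & set(words)) / max(len(title_words), 1)
        # Location score: earlier sentences score higher (assumed form).
        position = 1.0 / (idx + 1)
        # Equal weights are an assumption, not taken from the paper.
        scored.append((keyword + overlap + position, idx))
    keep = max(1, round(len(sentences) * compression_rate))
    chosen = sorted(idx for _, idx in sorted(scored, reverse=True)[:keep])
    return [sentences[i] for i in chosen]


def precision_recall(system, reference):
    """Sentence-level precision and recall against a human summary."""
    system, reference = set(system), set(reference)
    hits = len(system & reference)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

With a compression rate of 25%, a twenty-sentence article yields a five-sentence summary; `precision_recall` then compares those sentences with the ones a human editor selected.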
Downloads 1608
776 A Framework for Urdu Language Translation using LESSA
Authors: Imran Sarwar Bajwa
Abstract:
The internet is one of the major sources of information for people in almost all fields of life. The major language used to publish information on the internet is English. This becomes a problem in a country like Pakistan, where Urdu is the national language and only 10% of the population can understand English. As a result, millions of people are deprived of the valuable information available on the internet. This paper presents a system for translation from English to Urdu. A module, LESSA, uses a rule-based algorithm to read the input text in English, understand it and translate it into Urdu. The designed approach was further incorporated to translate a complete website from English to Urdu. An option appears in the browser to translate the webpage in a new window. The designed system will help millions of internet users to benefit from the internet and to access the latest information and knowledge posted on it daily.
Keywords: Natural Language Translation, Text Understanding, Knowledge extraction, Text Processing
Downloads 2666
775 How Does Psychoanalysis Help in Reconstructing Political Thought? An Exercise of Interpretation
Authors: Subramaniam Chandran
Abstract:
The significance of psychology in studying politics is embedded in philosophical issues as well as behavioural pursuits. The former is often associated with Sigmund Freud and his followers. The latter is inspired by the writings of Harold Lasswell. Political psychology, or psychopolitics, has left its own impression on political thought ever since it deciphered the concept of human nature and political propaganda. More importantly, psychoanalysis views political thought as a textual content whose latent content must be explored through its manifest content. In other words, it reads the text symptomatically and interprets the hidden truth. This paper explains the paradigm of dream interpretation applied by Freud. The dream work is a process with four successive activities: condensation, displacement, representation and secondary revision. Texts dealing with political thought can also be interpreted on these principles. Freud's method of dream interpretation draws on the hermeneutic model of philological research. It provides a theoretical perspective and technical rules for the interpretation of symbolic structures. The task of interpretation remains a discovery of the equivalence of symbols and actions through perpetual analogies. Psychoanalysis can help in studying political thought in two ways: to study text distortion, Freud's dream interpretation is used as a paradigm for exploring the latent text from its manifest text; and Freud's psychoanalytic concepts and theories can be applied across a range from the individual mind to civilization, religion, war and politics.
Keywords: Psychoanalysis, political thought, dream interpretation, latent content, manifest content
Downloads 1562
774 SMaTTS: Standard Malay Text to Speech System
Authors: Othman O. Khalifa, Zakiah Hanim Ahmad, Teddy Surya Gunawan
Abstract:
This paper presents a rule-based text-to-speech (TTS) synthesis system for Standard Malay (SM), namely SMaTTS. The proposed system uses a sinusoidal method and some pre-recorded wave files to generate speech. The use of a phone database significantly decreases the amount of computer memory used, making the system very light and embeddable. The overall system comprises two phases. The first is Natural Language Processing (NLP), which consists of the high-level processing of text analysis, phonetic analysis, text normalization and a morphophonemic module. The module was designed specially for SM to overcome a few problems in defining the rules for the SM orthography system before it can be passed to the DSP module. The second phase is Digital Signal Processing (DSP), which operates on the low-level process of speech waveform generation. An intelligible and adequately natural-sounding formant-based speech synthesis system with a light and user-friendly Graphical User Interface (GUI) is introduced. An SM phoneme set and an inclusive phone database have been constructed carefully for this phone-based speech synthesizer. By applying generative phonology, comprehensive letter-to-sound (LTS) rules and a pronunciation lexicon have been developed for SMaTTS. For the evaluation tests, a Diagnostic Rhyme Test (DRT) word list was compiled, and several experiments were performed to evaluate the quality of the synthesized speech by analyzing the Mean Opinion Score (MOS) obtained. The overall performance of the system and the room for improvement are thoroughly discussed.
Keywords: Natural Language Processing, Text-To-Speech (TTS), Diphone, source filter, low-/high-level synthesis.
Downloads 1973
773 A Study on Finding Similar Document with Multiple Categories
Authors: R. Saraçoğlu, N. Allahverdi
Abstract:
Similar document search and document management have an important place in text mining. One of the most important parts of similar document research is the process of classifying or clustering the documents. In this study, a similar document search approach that addresses the case of documents belonging to multiple categories (the multiple categories problem) has been carried out. The proposed method, which is based on Fuzzy Similarity Classification (FSC), has been compared with the Rocchio algorithm and the naive Bayes method, both of which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, a multiple categories vector method, based on how frequently categories are seen together, has been used. Empirical results show that performance almost doubles when the proposed method is compared with the classical approach.
Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.
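For orientation, the Rocchio baseline mentioned in this abstract can be sketched in its centroid-only form, extended so that a document may receive more than one category. This is an illustrative sketch, not the authors' FSC method or their experimental setup: the function names, the raw term-count vectors, the cosine similarity and the `threshold` value are all assumptions made here.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (no weighting, a simplification)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled_docs):
    """Average the vectors of all documents that carry each category."""
    sums = defaultdict(Counter)
    counts = Counter()
    for text, categories in labelled_docs:
        vec = vectorize(text)
        for category in categories:
            sums[category].update(vec)
            counts[category] += 1
    return {
        category: Counter({t: v / counts[category] for t, v in total.items()})
        for category, total in sums.items()
    }


def classify(text, centroids, threshold=0.3):
    """Return every category whose centroid is similar enough."""
    vec = vectorize(text)
    return sorted(
        category
        for category, centroid in centroids.items()
        if cosine(vec, centroid) >= threshold
    )


# Hypothetical training documents, each tagged with one or more categories.
docs = [
    ("oil gas price", ["energy"]),
    ("football match goal", ["sport"]),
    ("oil sponsor football", ["energy", "sport"]),
]
centroids = train_centroids(docs)
print(classify("oil football", centroids))
```

Returning every category above a similarity threshold, rather than only the best match, is one simple way to let a document belong to several categories at once.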
Downloads 1707