Search results for: Association language features
2352 A Study of the Variability of Very Low Resolution Characters and the Feasibility of Their Discrimination Using Geometrical Features
Authors: Farshideh Einsele, Rolf Ingold
Abstract:
Current OCR technology does not allow to accurately recognizing small text images, such as those found in web images. Our goal is to investigate new approaches to recognize very low resolution text images containing antialiased character shapes. This paper presents a preliminary study on the variability of such characters and the feasibility to discriminate them by using geometrical features. In a first stage we analyze the distribution of these features. In a second stage we present a study on the discriminative power for recognizing isolated characters, using various rendering methods and font properties. Finally we present interesting results of our evaluation tests leading to our conclusion and future focus.Keywords: World Wide Web, document analysis, pattern recognition, Optical Character Recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13702351 Heritage Tree Expert Assessment and Classification: Malaysian Perspective
Authors: B.-Y.-S. Lau, Y.-C.-T. Jonathan, M.-S. Alias
Abstract:
Heritage trees are natural large, individual trees with exceptionally value due to association with age or event or distinguished people. In Malaysia, there is an abundance of tropical heritage trees throughout the country. It is essential to set up a repository of heritage trees to prevent valuable trees from being cut down. In this cross domain study, a web-based online expert system namely the Heritage Tree Expert Assessment and Classification (HTEAC) is developed and deployed for public to nominate potential heritage trees. Based on the nomination, tree care experts or arborists would evaluate and verify the nominated trees as heritage trees. The expert system automatically rates the approved heritage trees according to pre-defined grades via Delphi technique. Features and usability test of the expert system are presented. Preliminary result is promising for the system to be used as a full scale public system.Keywords: Arboriculture, Delphi, expert system, heritage tree, urban forestry.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14302350 SySRA: A System of a Continuous Speech Recognition in Arab Language
Authors: Samir Abdelhamid, Noureddine Bouguechal
Abstract:
We report in this paper the model adopted by our system of continuous speech recognition in Arab language SySRA and the results obtained until now. This system uses the database Arabdic-10 which is a corpus of word for the Arab language and which was manually segmented. Phonetic decoding is represented by an expert system where the knowledge base is translated in the form of production rules. This expert system transforms a vocal signal into a phonetic lattice. The higher level of the system takes care of the recognition of the lattice thus obtained by deferring it in the form of written sentences (orthographical Form). This level contains initially the lexical analyzer which is not other than the module of recognition. We subjected this analyzer to a set of spectrograms obtained by dictating a score of sentences in Arab language. The rate of recognition of these sentences is about 70% which is, to our knowledge, the best result for the recognition of the Arab language. The test set consists of twenty sentences from four speakers not having taken part in the training.Keywords: Continuous speech recognition, lexical analyzer, phonetic decoding, phonetic lattice, vocal signal.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13882349 On Developing a Core Guideline for English Language Training Programs in Business Settings
Authors: T. Ito, K. Kawaguchi, R. Ohta
Abstract:
The purpose of this study is to provide a guideline to assist globally-minded companies in developing task-based English- language programs for their employees. After conducting an online self-assessment questionnaire comprised of 45 job-related tasks, we analyzed responses received from 3,000 Japanese company employees and developed a checklist that considered three areas; i) the percentage of those who need to accomplish English-language tasks in their workplace (need for English), ii) a five-point self-assessment score (task performance level), and iii) the impact of previous task experience on perceived performance (experience factor). The 45 tasks were graded according to five proficiency levels. Our results helped us to create a core guideline that may assist companies in two ways: first, in helping determine which tasks employees with a certain English proficiency should be able to satisfactorily carry out, and secondly, to quickly prioritize which business-related English skills they would need in future English language programs.
Keywords: Business settings, Can-do statements, English language training programs, Self-assessment, Task experience.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14482348 Implementation of ADETRAN Language Using Message Passing Interface
Authors: Akiyoshi Wakatani
Abstract:
This paper describes the Message Passing Interface (MPI) implementation of ADETRAN language, and its evaluation on SX-ACE supercomputers. ADETRAN language includes pdo statement that specifies the data distribution and parallel computations and pass statement that specifies the redistribution of arrays. Two methods for implementation of pass statement are discussed and the performance evaluation using Splitting-Up CG method is presented. The effectiveness of the parallelization is evaluated and the advantage of one dimensional distribution is empirically confirmed by using the results of experiments.Keywords: Iterative methods, array redistribution, translator, distributed memory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11992347 Image Similarity: A Genetic Algorithm Based Approach
Authors: R. C. Joshi, Shashikala Tapaswi
Abstract:
The paper proposes an approach using genetic algorithm for computing the region based image similarity. The image is denoted using a set of segmented regions reflecting color and texture properties of an image. An image is associated with a family of image features corresponding to the regions. The resemblance of two images is then defined as the overall similarity between two families of image features, and quantified by a similarity measure, which integrates properties of all the regions in the images. A genetic algorithm is applied to decide the most plausible matching. The performance of the proposed method is illustrated using examples from an image database of general-purpose images, and is shown to produce good results.Keywords: Image Features, color descriptor, segmented classes, texture descriptors, genetic algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23252346 A Tool for Checking Conformance of UML Specification
Authors: Rosziati Ibrahim, Noraini Ibrahim
Abstract:
Unified Modeling Language (UML) is a standard language for modeling of a system. UML is used to visually specify the structure and behavior of a system. The system requirements are captured and then converted into UML specification. UML specification uses a set of rules and notations, and diagrams to specify the system requirements. In this paper, we present a tool for developing the UML specification. The tool will ease the use of the notations and diagrams for UML specification as well as increase the understanding and familiarity of the UML specification. The tool will also be able to check the conformance of the diagrams against each other for basic compliance of UML specification.Keywords: Software Engineering, Unified Modeling Language (UML), UML Specification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22062345 Hand Gesture Recognition Based on Combined Features Extraction
Authors: Mahmoud Elmezain, Ayoub Al-Hamadi, Bernd Michaelis
Abstract:
Hand gesture is an active area of research in the vision community, mainly for the purpose of sign language recognition and Human Computer Interaction. In this paper, we propose a system to recognize alphabet characters (A-Z) and numbers (0-9) in real-time from stereo color image sequences using Hidden Markov Models (HMMs). Our system is based on three main stages; automatic segmentation and preprocessing of the hand regions, feature extraction and classification. In automatic segmentation and preprocessing stage, color and 3D depth map are used to detect hands where the hand trajectory will take place in further step using Mean-shift algorithm and Kalman filter. In the feature extraction stage, 3D combined features of location, orientation and velocity with respected to Cartesian systems are used. And then, k-means clustering is employed for HMMs codeword. The final stage so-called classification, Baum- Welch algorithm is used to do a full train for HMMs parameters. The gesture of alphabets and numbers is recognized using Left-Right Banded model in conjunction with Viterbi algorithm. Experimental results demonstrate that, our system can successfully recognize hand gestures with 98.33% recognition rate.Keywords: Gesture Recognition, Computer Vision & Image Processing, Pattern Recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 40322344 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment
Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee
Abstract:
Over the past decade, there have been promising developments in Natural Language Processing (NLP) with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These models include models based on lexical similarities, models based on formal reasoning, and most recently deep neural models. In this paper, we present a sentence encoding model that exploits the sentence-to-sentence relation information for RTE. In terms of sentence modeling, Convolutional neural network (CNN) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNN is suited for the extraction of n-gram features through the filters and can learn ranges of relations via the pooling mechanism. We combine the strength of RNN and CNN as stated above to present a unified model for the RTE task. Our model basically combines relation vectors computed from the phrasal representation of each sentence and final encoded sentence representations. Firstly, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representation for each sentence from which the first relation vector is computed. Secondly, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short Term Memory (Bi-LSTM) to obtain the final sentence representations from which a second relation vector is computed. The relations vectors are combined and then used in then used in the same fashion as attention mechanism over the Bi-LSTM outputs to yield the final sentence representations for the classification. Experiment on the Stanford Natural Language Inference (SNLI) corpus suggests that this is a promising technique for RTE.Keywords: Deep neural models, natural language inference, recognizing textual entailment, sentence-to-sentence relation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14542343 Learning to Recognize Faces by Local Feature Design and Selection
Authors: Yanwei Pang, Lei Zhang, Zhengkai Liu
Abstract:
Studies in neuroscience suggest that both global and local feature information are crucial for perception and recognition of faces. It is widely believed that local feature is less sensitive to variations caused by illumination, expression and illumination. In this paper, we target at designing and learning local features for face recognition. We designed three types of local features. They are semi-global feature, local patch feature and tangent shape feature. The designing of semi-global feature aims at taking advantage of global-like feature and meanwhile avoiding suppressing AdaBoost algorithm in boosting weak classifies established from small local patches. The designing of local patch feature targets at automatically selecting discriminative features, and is thus different with traditional ways, in which local patches are usually selected manually to cover the salient facial components. Also, shape feature is considered in this paper for frontal view face recognition. These features are selected and combined under the framework of boosting algorithm and cascade structure. The experimental results demonstrate that the proposed approach outperforms the standard eigenface method and Bayesian method. Moreover, the selected local features and observations in the experiments are enlightening to researches in local feature design in face recognition.Keywords: Face recognition, local feature, AdaBoost, subspace analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15962342 Crossing Borders: In Research and Business Communication
Authors: E. Podhovnik
Abstract:
Cultures play a role in business communication and in research. At the example of language in international business, this paper addresses the issue of how the research cultures of management research and linguistics as well as cultures as such can be linked. After looking at existing research on language in international business, this paper approaches communication in international business from a linguistic angle and attempts to explain communication issues in businesses based on linguistic research. Thus the paper makes a step into cross-disciplinary research combining management research with linguistics.
Keywords: Language in international business, sociolinguistics, ethnopragmatics, cultural scripts.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23682341 Active Segment Selection Method in EEG Classification Using Fractal Features
Authors: Samira Vafaye Eslahi
Abstract:
BCI (Brain Computer Interface) is a communication machine that translates brain massages to computer commands. These machines with the help of computer programs can recognize the tasks that are imagined. Feature extraction is an important stage of the process in EEG classification that can effect in accuracy and the computation time of processing the signals. In this study we process the signal in three steps of active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we have used student’s 2D sample t-statistics on continuous wavelet transforms for active segment selection to reduce the computation time. In the next level, the features are extracted from some famous fractal dimension estimation of the signal. These fractal features are Katz and Higuchi. In the classification stage we used ANFIS (Adaptive Neuro-Fuzzy Inference System) classifier, FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminate Analysis), and SVM (Support Vector Machines). We resulted that active segment selection method would reduce the computation time and Fractal dimension features with ANFIS analysis on selected active segments is the best among investigated methods in EEG classification.
Keywords: EEG, Student’s t- statistics, BCI, Fractal Features, ANFIS, FKNN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21202340 Speaker Identification using Neural Networks
Authors: R.V Pawar, P.P.Kajave, S.N.Mali
Abstract:
The speech signal conveys information about the identity of the speaker. The area of speaker identification is concerned with extracting the identity of the person speaking the utterance. As speech interaction with computers becomes more pervasive in activities such as the telephone, financial transactions and information retrieval from speech databases, the utility of automatically identifying a speaker is based solely on vocal characteristic. This paper emphasizes on text dependent speaker identification, which deals with detecting a particular speaker from a known population. The system prompts the user to provide speech utterance. System identifies the user by comparing the codebook of speech utterance with those of the stored in the database and lists, which contain the most likely speakers, could have given that speech utterance. The speech signal is recorded for N speakers further the features are extracted. Feature extraction is done by means of LPC coefficients, calculating AMDF, and DFT. The neural network is trained by applying these features as input parameters. The features are stored in templates for further comparison. The features for the speaker who has to be identified are extracted and compared with the stored templates using Back Propogation Algorithm. Here, the trained network corresponds to the output; the input is the extracted features of the speaker to be identified. The network does the weight adjustment and the best match is found to identify the speaker. The number of epochs required to get the target decides the network performance.Keywords: Average Mean Distance function, Backpropogation, Linear Predictive Coding, MultilayeredPerceptron,
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18922339 Association of the p53 Codon 72 Polymorphism with Colorectal Cancer in South West of Iran
Authors: A. Doosti, P. Ghasemi Dehkordi, M. Zamani, S. Taheri, M. Banitalebi, M. Mahmoudzadeh
Abstract:
The p53 tumor suppressor gene plays two important roles in genomic stability: blocking cell proliferation after DNA damage until it has been repaired, and starting apoptosis if the damage is too critical. Codon 72 exon4 polymorphism (Arg72Pro) of the P53 gene has been implicated in cancer risk. Various studies have been done to investigate the status of p53 at codon 72 for arginine (Arg) and proline (Pro) alleles in different populations and also the association of this codon 72 polymorphism with various tumors. Our objective was to investigate the possible association between P53 Arg72Pro polymorphism and susceptibility to colorectal cancer among Isfahan and Chaharmahal Va Bakhtiari (a part of south west of Iran) population. We investigated the status of p53 at codon 72 for Arg/Arg, Arg/Pro and Pro/Pro allele polymorphisms in blood samples from 145 colorectal cancer patients and 140 controls by Nested-PCR of p53 exon 4 and digestion with BstUI restriction enzyme and the DNA fragments were then resolved by electrophoresis in 2% agarose gel. The Pro allele was 279 bp, while the Arg allele was restricted into two fragments of 160 and 119 bp. Among the 145 colorectal cancer cases 49 cases (33.79%) were homozygous for the Arg72 allele (Arg/Arg), 18 cases (12.41%) were homozygous for the Pro72 allele (Pro/Pro) and 78 cases (53.8%) found in heterozygous (Arg/Pro). In conclusion, it can be said that p53Arg/Arg genotype may be correlated with possible increased risk of this kind of cancers in south west of Iran.Keywords: TP53, Polymorphism, Colorectal Cancer, Iran
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23912338 Autistic Children and Different Tense Forms
Authors: Ameneh Zare, Shahin Nematzadeh, Shahla Raghibdoust, Iran Kalbassi
Abstract:
Autism spectrum disorder is characterized by abnormalities in social communication, language abilities and repetitive behaviors. The present study focused on some grammatical deficits in autistic children. We evaluated the impairment of correct use of different Persian verb tenses in autistic children-s speech. Two standardized Language Test were administered then gathered data were analyzed. The main result of this study was significant difference between the mean scores of correct responses to present tense in comparison with past tense in Persian language. This study demonstrated that tense is severely impaired in autistic children-s speech. Our findings indicated those autistic children-s production of simple present/ past tense opposition to be better than production of future and past periphrastic forms (past perfect, present perfect, past progressive).Keywords: Autism, Past, Persian Language, Present, Tense
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27542337 Developmental Differences in the Construction of Concepts by Children from 3 to 14-Year-Olds: Perception, Language and Instruction
Authors: Mehmet Ozcan
Abstract:
This study was designed to investigate the relationship between language and children’s construction of the concept of objects, actions, and states. Participants of this study are 120 children whose ages range from 3 to 14 years. Ten children participated from each age group and 10 adults participated as normative group. Data were collected using 28 words which were identified and grouped according to the purpose of this study. Participants were asked the question “What is x?’ for each word in a reserved room. The audio recorded data were transcribed and coded. The data were analyzed primarily qualitatively but quantitatively as well to support qualitative findings. The findings reveal that younger children rely more on their perceptual experience and linguistic input while 7-year-olds and older ones rely more on instructional language in the construction of the concepts related to objects, actions and states. Adults differ from all age groups with their usage of metaphors to refer to objects. It has been noted that linguistic, perceptual and instructional experiences work in an interwoven way but each one seems to be dominant at certain ages.
Keywords: Cognition, concept construction, first language acquisition, language, thought.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11152336 The Sign in the Communication Process
Authors: S. Pesina, T. Solonchak
Abstract:
In the process of information transmission (concept verbalization) we deal mostly with the substance (contents), and then pay attention to the form. Recalling events from the remote past, often we cannot exactly reproduce specific heard or pronounced words, as well as the syntactic structures. We remember events, feelings, images; we recall the general contents of the discourse. The thought gets a specific language form only during the concept verbalization phase. With minimum time for pondering, depending on the language competence level, the grammar and syntactic shaping often occurs automatically with the use of famous models and stereotypes. This means that the language form adapts itself to the consciousness, and not vice versa.
Keywords: Lexical eidos, phenomenology, noema, polysemantic word, semantic core.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19422335 Towards an Extended SQLf: Bipolar Query Language with Preferences
Authors: L. Ludovic, R. Daniel, S-E Tbahriti
Abstract:
Database management systems that integrate user preferences promise better solution for personalization, greater flexibility and higher quality of query responses. This paper presents a tentative work that studies and investigates approaches to express user preferences in queries. We sketch an extend capabilities of SQLf language that uses the fuzzy set theory in order to define the user preferences. For that, two essential points are considered: the first concerns the expression of user preferences in SQLf by so-called fuzzy commensurable predicates set. The second concerns the bipolar way in which these user preferences are expressed on mandatory and/or optional preferences.
Keywords: Flexible query language, relational database, userpreference.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10112334 Morpho-Anatomical and Ecological Studies on Endemic of Fritillaria oranensis Pomel. from the Mounts of Tessala (Western Algeria)
Abstract:
Fritillaria oranensis (Liliaceae) was described in 1874 by pomel from Algeria. Plant samples have been collected from the mount of Tessala (Sidi-Bel-Abbes). The morphological features of various organs of the plant are described in detail. In the morphological part of the study, features of various organs of the plants such as stem and leaf were determined and illustrated. Ecological studies provide information about the physical and chemical structure of soil types in Tessala Mountain. The aim of this original investigation is to put forth ecological and anatomical features of these species for the first time, but at the same time given detailed account of the morphological characteristics of the stem and leaf of Fritillaria oranensis.
Keywords: Anatomy, ecology, Liliaceae, morphology, Fritillaria oranensis Pomel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18252333 Atomic Force Microscopy (AFM)Topographical Surface Characterization of Multilayer-Coated and Uncoated Carbide Inserts
Authors: Samy E. Oraby, Ayman M. Alaskari
Abstract:
In recent years, scanning probe atomic force microscopy SPM AFM has gained acceptance over a wide spectrum of research and science applications. Most fields focuses on physical, chemical, biological while less attention is devoted to manufacturing and machining aspects. The purpose of the current study is to assess the possible implementation of the SPM AFM features and its NanoScope software in general machining applications with special attention to the tribological aspects of cutting tool. The surface morphology of coated and uncoated as-received carbide inserts is examined, analyzed, and characterized through the determination of the appropriate scanning setting, the suitable data type imaging techniques and the most representative data analysis parameters using the MultiMode SPM AFM in contact mode. The NanoScope operating software is used to capture realtime three data types images: “Height", “Deflection" and “Friction". Three scan sizes are independently performed: 2, 6, and 12 μm with a 2.5 μm vertical range (Z). Offline mode analysis includes the determination of three functional topographical parameters: surface “Roughness", power spectral density “PSD" and “Section". The 12 μm scan size in association with “Height" imaging is found efficient to capture every tiny features and tribological aspects of the examined surface. Also, “Friction" analysis is found to produce a comprehensive explanation about the lateral characteristics of the scanned surface. Configuration of many surface defects and drawbacks has been precisely detected and analyzed.Keywords: SPM AFM contact mode, carbide inserts, scan size, surface defects, surface roughness, PSD.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 72692332 Evaluation Pattern of Cognitive Processes in Language in Written Comprehension
Authors: Agnès Garletti
Abstract:
Our research aims at helping the tutor on line to evaluate the student-s cognitive processes. The student is a learner in French as a Second Language who studies an on-line socio-cognitive scenario in written communication. In our method, these cognitive processes are defined. For that, the language abilities and learning tasks are associated to cognitive operation. Moreover, the found cognitive processes are named with specific terms. The result was to create an instrumental pattern to question the learner about the cognitive processes used to build an item of written comprehension. Our research follows the principles of the third historical generation of studies on the cognitive activity of the text comprehension. The strength of our instrumental pattern stands in the precision and the logical articulation of the questions to the learner. However, the learner-s answers can still be subjective but the precision of the instrument restricts it.Keywords: Cognitive processes, Evaluation pattern, French as asecond language, Socio-cognitive scenario, Written comprehension.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14902331 Corporate Social Responsibility Disclosure, Tax Aggressiveness and Sustainability Report Assurance: Evidence from Thailand
Authors: Eko Budi Santoso, Kazia Laturette, Stanislaus Adnanto Mastan
Abstract:
This study aims to examine the association between disclosure of social responsibility and tax aggressiveness in developing countries, namely Thailand. This is due to the increasing trend of disclosure of social responsibility in developing countries, even though this disclosure of information is still voluntary. On the other hand, developing countries have low taxation rate and investor protection infrastructures that allow the disclosure of social responsibility to be used opportunistically as a tool to fool the attainment of interests. This study also examines the role of assurance on the association between corporate social responsibility disclosure and tax aggressiveness. The assurance aims to provide confidence that the disclosure of social responsibility by the company is valid. This research builds an index to measure the disclosure of social responsibility based on the rules issued by the innovative Global Reporting. The results of the study are based on a sample of publicly traded companies in Thailand, which showed a positive association between disclosure of corporate social responsibility and tax aggressiveness, but it was further discovered that these results were mitigated by the existence of assurance against disclosure of corporate social responsibility. The results of this study indicate that the disclosure of corporate social responsibility can show that the company cares about the issue of social responsibility but does not automatically make the company as one that holds ethical values in its business practices.Keywords: Corporate Social Responsibility disclosure, tax aggressiveness, sustainability assurance, business ethics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15082330 Continuous Text Translation Using Text Modeling in the Thetos System
Authors: Nina Suszczanska, Przemyslaw Szmal, Slawomir Kulikow
Abstract:
In the paper a method of modeling text for Polish is discussed. The method is aimed at transforming continuous input text into a text consisting of sentences in so called canonical form, whose characteristic is, among others, a complete structure as well as no anaphora or ellipses. The transformation is lossless as to the content of text being transformed. The modeling method has been worked out for the needs of the Thetos system, which translates Polish written texts into the Polish sign language. We believe that the method can be also used in various applications that deal with the natural language, e.g. in a text summary generator for Polish.Keywords: anaphora, machine translation, NLP, sign language, text syntax.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16552329 Improving Worm Detection with Artificial Neural Networks through Feature Selection and Temporal Analysis Techniques
Authors: Dima Stopel, Zvi Boger, Robert Moskovitch, Yuval Shahar, Yuval Elovici
Abstract:
Computer worm detection is commonly performed by antivirus software tools that rely on prior explicit knowledge of the worm-s code (detection based on code signatures). We present an approach for detection of the presence of computer worms based on Artificial Neural Networks (ANN) using the computer's behavioral measures. Identification of significant features, which describe the activity of a worm within a host, is commonly acquired from security experts. We suggest acquiring these features by applying feature selection methods. We compare three different feature selection techniques for the dimensionality reduction and identification of the most prominent features to capture efficiently the computer behavior in the context of worm activity. Additionally, we explore three different temporal representation techniques for the most prominent features. In order to evaluate the different techniques, several computers were infected with five different worms and 323 different features of the infected computers were measured. We evaluated each technique by preprocessing the dataset according to each one and training the ANN model with the preprocessed data. We then evaluated the ability of the model to detect the presence of a new computer worm, in particular, during heavy user activity on the infected computers.Keywords: Artificial Neural Networks, Feature Selection, Temporal Analysis, Worm Detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17272328 Support Vector Machine Prediction Model of Early-stage Lung Cancer Based on Curvelet Transform to Extract Texture Features of CT Image
Authors: Guo Xiuhua, Sun Tao, Wu Haifeng, He Wen, Liang Zhigang, Zhang Mengxia, Guo Aimin, Wang Wei
Abstract:
Purpose: To explore the use of Curvelet transform to extract texture features of pulmonary nodules in CT image and support vector machine to establish prediction model of small solitary pulmonary nodules in order to promote the ratio of detection and diagnosis of early-stage lung cancer. Methods: 2461 benign or malignant small solitary pulmonary nodules in CT image from 129 patients were collected. Fourteen Curvelet transform textural features were as parameters to establish support vector machine prediction model. Results: Compared with other methods, using 252 texture features as parameters to establish prediction model is more proper. And the classification consistency, sensitivity and specificity for the model are 81.5%, 93.8% and 38.0% respectively. Conclusion: Based on texture features extracted from Curvelet transform, support vector machine prediction model is sensitive to lung cancer, which can promote the rate of diagnosis for early-stage lung cancer to some extent.Keywords: CT image, Curvelet transform, Small pulmonary nodules, Support vector machines, Texture extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27652327 Combined Feature Based Hyperspectral Image Classification Technique Using Support Vector Machines
Authors: Mrs.K.Kavitha, S.Arivazhagan
Abstract:
A spatial classification technique incorporating a State of Art Feature Extraction algorithm is proposed in this paper for classifying a heterogeneous classes present in hyper spectral images. The classification accuracy can be improved if and only if both the feature extraction and classifier selection are proper. As the classes in the hyper spectral images are assumed to have different textures, textural classification is entertained. Run Length feature extraction is entailed along with the Principal Components and Independent Components. A Hyperspectral Image of Indiana Site taken by AVIRIS is inducted for the experiment. Among the original 220 bands, a subset of 120 bands is selected. Gray Level Run Length Matrix (GLRLM) is calculated for the selected forty bands. From GLRLMs the Run Length features for individual pixels are calculated. The Principle Components are calculated for other forty bands. Independent Components are calculated for next forty bands. As Principal & Independent Components have the ability to represent the textural content of pixels, they are treated as features. The summation of Run Length features, Principal Components, and Independent Components forms the Combined Features which are used for classification. SVM with Binary Hierarchical Tree is used to classify the hyper spectral image. Results are validated with ground truth and accuracies are calculated.
Keywords: Multi-class, Run Length features, PCA, ICA, classification and Support Vector Machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15212326 A Proposed Hybrid Approach for Feature Selection in Text Document Categorization
Authors: M. F. Zaiyadi, B. Baharudin
Abstract:
Text document categorization involves large amount of data or features. The high dimensionality of features is a troublesome and can affect the performance of the classification. Therefore, feature selection is strongly considered as one of the crucial part in text document categorization. Selecting the best features to represent documents can reduce the dimensionality of feature space hence increase the performance. There were many approaches has been implemented by various researchers to overcome this problem. This paper proposed a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also presented state-of-the-art algorithms by several other researchers.Keywords: Ant colony optimization, feature selection, information gain, text categorization, text representation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20682325 Addressing Scalability Issues of Named Entity Recognition Using Multi-Class Support Vector Machines
Authors: Mona Soliman Habib
Abstract:
This paper explores the scalability issues associated with solving the Named Entity Recognition (NER) problem using Support Vector Machines (SVM) and high-dimensional features. The performance results of a set of experiments conducted using binary and multi-class SVM with increasing training data sizes are examined. The NER domain chosen for these experiments is the biomedical publications domain, especially selected due to its importance and inherent challenges. A simple machine learning approach is used that eliminates prior language knowledge such as part-of-speech or noun phrase tagging thereby allowing for its applicability across languages. No domain-specific knowledge is included. The accuracy measures achieved are comparable to those obtained using more complex approaches, which constitutes a motivation to investigate ways to improve the scalability of multiclass SVM in order to make the solution more practical and useable. Improving training time of multi-class SVM would make support vector machines a more viable and practical machine learning solution for real-world problems with large datasets. An initial prototype results in great improvement of the training time at the expense of memory requirements.Keywords: Named entity recognition, support vector machines, language independence, bioinformatics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16892324 Features of Soil Formation in the North of Western Siberia in Cryogenic Conditions
Authors: Tatiana V. Raudina, Sergey P. Kulizhskiy
Abstract:
A large part of Russia is located in permafrost areas. These areas are widely used because there are concentrated valuable natural resources. Therefore to explore of cryosols it is important due to the significant increase of anthropogenic stress as well as the problem of global climate change. In the north of Western Siberia permafrost phenomena is widespread. Permafrost as a factor of soil formation and cryogenesis as a process have a great impact on the soil formation of these areas. Based on the research results of permafrost-affected soils tundra landscapes formed in the central part of the Tazovskiy Peninsula in cryogenic conditions, data were obtained which characterize the morphological features of soils. The specificity of soil cover distribution and manifestation of soil-forming processes within the study area are noted. Permafrost features such as frost cracking, cryoturbation, thixotropy, movement of humus are formed. The formation of these features is increased with the development of the territory. As a consequence, there is a change in the components of the environment and the destruction of the soil cover.
Keywords: Gleyed and nongleyed soils, permafrost, soil cryogenesis (pedocryogenesis), soil-forming macroprocesses.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20622323 Different Ergonomic Exposure Risk and Infrared Thermal Temperature on Low Back
Authors: Sihao Lin, Bo Shen, Xuexiang Dai, Xuyan Xu, Zhenyi Wu, Xianzhe Zeng
Abstract:
Infrared Thermography (IRT) has been little documented in the objective measurement of ergonomic exposure. We aimed to examine the association between different ergonomic exposures and low back skin temperature measured by IRT. A total of 114 subjects among sedentary students, sports students and cleaning workers were selected as different ergonomic exposure levels. Low back skin temperature was measured by IRT before and post ergonomic exposure. Ergonomic exposure was assessed by Quick Exposure Check (QEC) and quantitative scores were calculated on the low back. Multiple regressions were constructed to examine the possible associations between ergonomic risk exposures and the skin temperature over the low back. Compared to the two student groups, clean workers had significantly higher ergonomic exposure scores on the low back. The low back temperature variations were different among the three groups. The temperature decreased significantly among students with ergonomic exposure (P < 0.01), while it increased among cleaning workers. With adjustment of confounding, the post-exposure temperature and the temperature changes after exposure showed a significantly negative association with ergonomic exposure scores. For maximum temperature, one increasing ergonomic score decreased -0.23 °C (95% CI -0.37, -0.10) of temperature after ergonomic exposure over the low back. There was a significant association between ergonomic exposures and infrared thermal temperature over low back. IRT could be used as an objective assessment of ergonomic exposure on the low back.
Keywords: Ergonomic exposure, infrared thermography, musculoskeletal disorders, skin temperature, low back.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 139