Search results for: Hindi.
12 Functioning of Turkic Elements in Modern Hindi
Authors: B. S. Bokuleva, R. A. Avakova, A. A. Sultangubieva, U. Schamiloglu
Abstract:
This article discusses the modern usage of borrowed words and their vocabularies, the fields in which Turkisms are used, and the phonetic, grammatical and lexico-semantic assimilation of the typological-morphological structures entering Hindi, from a comparative typological perspective. The borrowed vocabulary is rich and widely distributed, and the study examines how Turkic elements entered the vocabulary of languages with large numbers of speakers. The research is based on Hindi vocabulary.
Keywords: Adopted words, language communications, Turkism, Turkic languages.
Downloads 2166
11 Strategic Risk Issues for Film Distributors of Hindi Film Industry in Mumbai: A Grounded Theory Approach
Abstract:
The purpose of the paper is to address the strategic risk issues surrounding Hindi film distribution in Mumbai for a film distributor, who acts as an entrepreneur when launching a product (movie) in the market (film territory). The paper undertakes a fundamental review of films and risk in the Hindi film industry and applies the Grounded Theory technique to understand the complex phenomena of risk-taking behavior of film distributors (both independent and studios) in Mumbai. Rich in-depth interviews with distributors are coded to develop core categories through constant comparison, leading to conceptualization of the phenomena of interest. This paper is a first-of-its-kind attempt to understand the risk behavior of a distributor, which is akin to entrepreneurial risk behavior under conditions of uncertainty.
Keywords: Entrepreneurial Risk Behavior, Film Distribution Strategy, Hindi Film Industry, Risk.
Downloads 2615
10 OCR for Script Identification of Hindi (Devnagari) Numerals using Feature Sub Selection by Means of End-Point with Neuro-Memetic Model
Authors: Banashree N. P., R. Vasanta
Abstract:
Recognition of Indian-language scripts is a challenging problem. In Optical Character Recognition (OCR), a character or symbol to be recognized can be a machine-printed or handwritten character or numeral. Several approaches deal with the problem of recognizing numerals and characters, depending on the type of feature extracted and the way it is extracted. This paper proposes a recognition scheme for handwritten Hindi (Devanagari) numerals, the most popular in the Indian subcontinent. Our work focuses on a feature extraction technique, i.e., a global-based approach using end-point information extracted from images of isolated numerals. These feature vectors are fed to a neuro-memetic model [18] that has been trained to recognize a Hindi numeral. The prototype of the system has been tested on a variety of numeral images. In the proposed scheme, data sets are fed to the neuro-memetic algorithm, which identifies the rule with the highest fitness value (nearly 100%); the template associated with this rule is the identified numeral. Experimental results show a recognition rate of 92-97% compared to other models.
Keywords: OCR, Global Feature, End-Points, Neuro-Memetic model.
Downloads 1759
9 Named Entity Recognition using Support Vector Machine: A Language Independent Approach
Authors: Asif Ekbal, Sivaji Bandyopadhyay
Abstract:
Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is nowadays considered fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports the development of a NER system for Bengali and Hindi using Support Vector Machine (SVM). Though this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its use for Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with a variety of features that are helpful in predicting the four different named entity (NE) classes: Person name, Location name, Organization name and Miscellaneous name. We have used annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL). In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm to generate lexical context patterns from a part of the unlabeled Bengali news corpus. Lexical patterns have been used as features of the SVM in order to improve the system performance. The NER system has been tested with gold standard test sets of 35K and 60K tokens for Bengali and Hindi, respectively. Evaluation results have demonstrated recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. Results show an improvement in the f-score of 5.13% with the use of context patterns.
A statistical analysis (ANOVA) is also performed to compare the performance of the proposed NER system with that of the existing HMM-based system for both languages.
Keywords: Named Entity (NE), Named Entity Recognition (NER), Support Vector Machine (SVM), Bengali, Hindi.
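The f-score values quoted in this abstract can be checked against the reported recall and precision. The sketch below assumes the balanced F-score (the harmonic mean of precision and recall); the abstract does not state the weighting, so that is an assumption, and the function name `f_score` is illustrative only.

```python
def f_score(recall_pct, precision_pct):
    """Balanced F-score (harmonic mean of recall and precision), in percent.

    Assumption: the abstract's "f-score" is the balanced F1 measure.
    """
    if recall_pct + precision_pct == 0:
        return 0.0
    return 2 * recall_pct * precision_pct / (recall_pct + precision_pct)


# (recall, precision, reported f-score) as quoted in the abstract
reported = {
    "Bengali": (88.61, 80.12, 84.15),
    "Hindi": (80.23, 74.34, 77.17),
}

for language, (recall, precision, f_reported) in reported.items():
    f_computed = f_score(recall, precision)
    print(f"{language}: computed F = {f_computed:.2f}, reported F = {f_reported:.2f}")
```

Under this assumption, the computed values agree with the reported ones to two decimal places for both languages.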
Downloads 3404
8 OCR for Script Identification of Hindi (Devnagari) Numerals using Error Diffusion Halftoning Algorithm with Neural Classifier
Authors: Banashree N. P., Andhe Dharani, R. Vasanta, P. S. Satyanarayana
Abstract:
Applications involving numbers are so widespread that there is much scope for study. The style of writing numbers is diverse and comes in a variety of forms, sizes and fonts. Identification of Indian-language scripts is a challenging problem. In Optical Character Recognition (OCR), machine-printed or handwritten characters and numerals are recognized. Many approaches deal with the problem of detecting numerals and characters, depending on the sort of feature extracted and the way it is extracted. This paper proposes a recognition scheme for handwritten Hindi (Devanagari) numerals, the most popular in the Indian subcontinent. Our work focuses on a feature extraction technique, i.e., a local-based approach using the 16-segment display concept, with features extracted from halftoned images and binary images of isolated numerals. These feature vectors are fed to a neural classifier model that has been trained to recognize a Hindi numeral. The prototype of the system has been tested on a variety of numeral images. Experimental results show that the recognition rate for halftoned images is 98%, compared to 95% for binary images.
Keywords: OCR, Halftoning, Neural classifier, 16-segment display concept.
Downloads 1716
7 Online Multilingual Dictionary Using Hamburg Notation for Avatar-Based Indian Sign Language Generation System
Authors: Sugandhi, Parteek Kumar, Sanmeet Kaur
Abstract:
Sign Language (SL) is used by deaf people and by others who can hear but cannot speak, or who have difficulty with spoken languages due to some disability. It is a visual gesture language that makes use of one or both hands, arms, face and body to convey meanings and thoughts. An SL automation system is an effective way to provide an interface for communicating with normal people using a computer. In this paper, an avatar-based dictionary is proposed for a text to Indian Sign Language (ISL) generation system. This research work also presents a literature review of the SL corpora available for various SLs over the years. An ISL generation system requires a written form of SL, and certain techniques are available for writing SL. The system uses the Hamburg sign language Notation System (HamNoSys) and Signing Gesture Mark-up Language (SiGML) for ISL generation. It is developed in PHP using Web Graphics Library (WebGL) technology for 3D avatar animation. A multilingual ISL dictionary is developed using HamNoSys for both English and Hindi. This dictionary will be used as a database to associate signs with words or phrases of a spoken language. It provides an admin panel interface to manage the dictionary, i.e., modification, addition, or deletion of a word. Through this interface, HamNoSys notations can be developed and stored in a database, and these notations can be converted into their corresponding SiGML files manually. The system takes a natural language input sentence in English or Hindi and generates a 3D sign animation using an avatar. SL generation systems have potential applications in many domains such as healthcare, media, educational institutes, commercial sectors and transportation services. This research work will help researchers understand the various techniques used for writing SL and for building Sign Language generation systems.
Keywords: Avatar, dictionary, HamNoSys, hearing-impaired, Indian Sign Language, sign language.
Downloads 1355
6 Using Interval Trees for Approximate Indexing of Instances
Authors: Khalil el Hindi
Abstract:
This paper presents a simple and effective method for approximate indexing of instances for instance based learning. The method uses an interval tree to determine a good starting search point for the nearest neighbor. The search stops when an early stopping criterion is met. The method proved to be very effective especially when only the first nearest neighbor is required.
Keywords: Instance based learning, interval trees, the knn algorithm, machine learning.
Downloads 1511
5 Study of Features for Hand-printed Recognition
Authors: Satish Kumar
Abstract:
The feature extraction methods used to recognize hand-printed characters play an important role in ICR applications. To achieve a high recognition rate, choosing a feature that suits the given script is an important task. Even if a new feature must be designed for a given script, it is essential to know the recognition ability of the existing features for that script. Devanagari script is used in various Indian languages besides Hindi, the mother tongue of the majority of Indians. This research examines a variety of feature extraction approaches that have been used in various ICR/OCR applications, in the context of Devanagari hand-printed script. The study is conducted theoretically and experimentally on more than 10 feature extraction methods. The feature extraction methods have been evaluated on a Devanagari hand-printed database comprising more than 25000 characters belonging to 43 alphabets. The recognition ability of the features has been evaluated using three classifiers, i.e., k-NN, MLP and SVM.
Keywords: Features, Hand-printed, Devanagari, Classifier, Database
Downloads 1729
4 Evaluation of Guaiacol and Syringol Emission upon Wood Pyrolysis for some Fast Growing Species
Authors: Sherif S. Z. Hindi
Abstract:
Wood pyrolysis for Casuarina glauca, Casuarina cunninghamiana, Eucalyptus camaldulensis and Eucalyptus microtheca was carried out at 450°C with 2.5°C/min. in a flowing N2-atmosphere. The Eucalyptus genus wood gave higher values of specific gravity, ash, total extractives, lignin, N2-liquid trap distillate (NLTD) and water trap distillate (WSP) than those for the Casuarina genus. The GHC of NLTD was higher for the Casuarina genus than for the Eucalyptus genus, with the highest value for Casuarina cunninghamiana. Guaiacol, 4-ethyl-2-methoxyphenol and syringol were observed in the NLTD of all four wood species, reflecting their parent hardwood lignin origin. Eucalyptus camaldulensis wood had the highest lignin content (28.89%) and was pyrolyzed to the highest values of phenolics (73.01%), guaiacol (11.2%) and syringol (32.28%) contents in the methylene chloride fraction (MCF) of NLTD. Accordingly, recovery of syringol and guaiacol from Eucalyptus camaldulensis may become economically attractive.
Keywords: Wood, Pyrolysis, Guaiacol, Syringol
Downloads 2364
3 A Collaborative Platform for Multilingual Ontology Development
Authors: Ahmed Tawfik, Fausto Giunchiglia, Vincenzo Maltese
Abstract:
Ontologies provide a common understanding of a specific domain of interest that can be communicated between people and used as background knowledge for automated reasoning in a wide range of applications. In this paper, we address the design of multilingual ontologies following well-defined knowledge engineering methodologies with the support of novel collaborative development approaches. In particular, we present a collaborative platform which allows ontologies to be developed incrementally in multiple languages. This is made possible via an appropriate mapping between language independent concepts and one lexicalization per language (or a lexical gap in case such lexicalization does not exist). The collaborative platform has been designed to support the development of the Universal Knowledge Core, a multilingual ontology currently in English, Italian, Chinese, Mongolian, Hindi and Bangladeshi. Its design follows a workflow-based development methodology that models resources as a set of collaborative objects and assigns customizable workflows to build and maintain each collaborative object in a community driven manner, with extensive support of modern web 2.0 social and collaborative features.
Keywords: Knowledge Diversity, Knowledge Representation, Ontology Development.
Downloads 2204
2 Indirect Solar Desalination: Value Engineering and Cost Benefit Analysis
Authors: Grace Rachid, Mutasem El-Fadel, Mahmoud Al-Hindi, Ibrahim Jamali, Daniel Abdel Nour
Abstract:
This study examines the feasibility of indirect solar desalination in oil producing countries in the Middle East and North Africa (MENA) region. It relies on value engineering (VE) and cost-benefit with sensitivity analyses to identify optimal coupling configurations of desalination and solar energy technologies. A comparative return on investment was assessed as a function of water costs for varied plant capacities (25,000 to 75,000 m3/day), project lifetimes (15 to 25 years), and discount rates (5 to 15%), taking into consideration water and energy subsidies, land cost, and environmental externalities in the form of carbon credit related to greenhouse gas (GHG) emissions reduction. The results showed reverse osmosis (RO) coupled with photovoltaic technologies (PVs) as the most promising configuration, robust across different prices for Brent oil, discount rates, and project lifetimes. Environmental externalities and subsidies analysis revealed that a 16% reduction in the existing subsidy on water tariffs would ensure economic viability. Additionally, while land costs affect investment attractiveness, the viability of RO coupled with PV remains possible for a land purchase cost <$80/m2 or a lease rate <$1/m2/yr. Beyond those rates, further subsidy lifting is required.
Keywords: Solar energy, desalination, value engineering, CBA, carbon credit, subsidies.
Downloads 2595
1 Locating Center Points for Radial Basis Function Networks Using Instance Reduction Techniques
Authors: Rana Yousef, Khalil el Hindi
Abstract:
The behavior of Radial Basis Function (RBF) Networks greatly depends on how the center points of the basis functions are selected. In this work we investigate the use of instance reduction techniques, originally developed to reduce the storage requirements of instance based learners, for this purpose. Five Instance-Based Reduction Techniques were used to determine the set of center points, and RBF networks were trained using these sets of centers. The performance of the RBF networks is studied in terms of classification accuracy and training time. The results obtained were compared with two Radial Basis Function Networks: RBF networks that use all instances of the training set as center points (RBF-ALL) and Probabilistic Neural Networks (PNN). The former achieves high classification accuracies and the latter requires smaller training time. Results showed that RBF networks trained using sets of centers located by noise-filtering techniques (ALLKNN and ENN) rather than pure reduction techniques produce the best results in terms of classification accuracy. The results show that these networks require smaller training time than that of RBF-ALL and higher classification accuracy than that of PNN. Thus, using ALLKNN and ENN to select center points gives better combination of classification accuracy and training time. Our experiments also show that using the reduced sets to train the networks is beneficial especially in the presence of noise in the original training sets.
Keywords: Radial basis function networks, Instance-based reduction, PNN.
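As a rough illustration of the noise-filtering step named in this abstract, the sketch below assumes that ENN refers to the Edited Nearest Neighbor rule: an instance is kept only if its label agrees with the majority label of its k nearest neighbors. The function name, the toy data and the choice k=3 are hypothetical and not taken from the paper; the surviving instances would then serve as candidate RBF center points.

```python
import math
from collections import Counter


def edited_nearest_neighbor(points, labels, k=3):
    """Return indices of instances that agree with the majority label
    of their k nearest neighbors (Edited Nearest Neighbor filtering).

    Assumption: Euclidean distance and simple majority voting.
    """
    kept = []
    for i, point in enumerate(points):
        # Rank all other instances by distance to the current one.
        neighbors = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )[:k]
        votes = Counter(labels[j] for j in neighbors)
        majority_label, _ = votes.most_common(1)[0]
        if majority_label == labels[i]:
            kept.append(i)
    return kept


# Hypothetical toy data: two clusters plus one mislabeled (noisy) instance.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),  # class "a"
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1),                # class "b"
    (0.1, 0.1),                                        # labeled "b" inside the "a" cluster
]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

kept = edited_nearest_neighbor(points, labels, k=3)
centers = [points[i] for i in kept]  # candidate center points for an RBF network
print(kept)
```

In this toy set the mislabeled instance at index 7 is filtered out while the two clusters are retained, which matches the abstract's point that noise filtering helps when the original training set is noisy.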
Downloads 1688