Search results for: Retrieval Accuracy.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1942

Search results for: Retrieval Accuracy.

1822 Exploiting Query Feedback for Efficient Query Routing in Unstructured Peer-to-peer Networks

Authors: Iskandar Ishak, Naomie Salim

Abstract:

Unstructured peer-to-peer networks are popular due to its robustness and scalability. Query schemes that are being used in unstructured peer-to-peer such as the flooding and interest-based shortcuts suffer various problems such as using large communication overhead long delay response. The use of routing indices has been a popular approach for peer-to-peer query routing. It helps the query routing processes to learn the routing based on the feedbacks collected. In an unstructured network where there is no global information available, efficient and low cost routing approach is needed for routing efficiency. In this paper, we propose a novel mechanism for query-feedback oriented routing indices to achieve routing efficiency in unstructured network at a minimal cost. The approach also applied information retrieval technique to make sure the content of the query is understandable and will make the routing process not just based to the query hits but also related to the query content. Experiments have shown that the proposed mechanism performs more efficient than flood-based routing.

Keywords: Unstructured peer-to-peer, Searching, Retrieval, Internet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1492
1821 A New Approach to Signal Processing for DC-Electromagnetic Flowmeters

Authors: Michael Schukat

Abstract:

Electromagnetic flowmeters with DC excitation are used for a wide range of fluid measurement tasks, but are rarely found in dosing applications with short measurement cycles due to the achievable accuracy. This paper will identify a number of factors that influence the accuracy of this sensor type when used for short-term measurements. Based on these results a new signal-processing algorithm will be described that overcomes the identified problems to some extend. This new method allows principally a higher accuracy of electromagnetic flowmeters with DC excitation than traditional methods.

Keywords: Electromagnetic Flowmeter, Kalman Filter, ShortMeasurement Cycles, Signal Estimation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1567
1820 AINA: Disney Animation Information as Educational Resources

Authors: Piedad Garrido, Fernando Repulles, Andy Bloor, Julio A. Sanguesa, Jesus Gallardo, Vicente Torres, Jesus Tramullas

Abstract:

With the emergence and development of Information and Communications Technologies (ICTs), Higher Education is experiencing rapid changes, not only in its teaching strategies but also in student’s learning skills. However, we have noticed that students often have difficulty when seeking innovative, useful, and interesting learning resources for their work. This is due to the lack of supervision in the selection of good query tools. This paper presents AINA, an Information Retrieval (IR) computer system aimed at providing motivating and stimulating content to both students and teachers working on different areas and at different educational levels. In particular, our proposal consists of an open virtual resource environment oriented to the vast universe of Disney comics and cartoons. Our test suite includes Disney’s long and shorts films, and we have performed some activities based on the Just In Time Teaching (JiTT) methodology. More specifically, it has been tested by groups of university and secondary school students.

Keywords: Information retrieval, animation, educational resources, JiTT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1154
1819 Mining Association Rules from Unstructured Documents

Authors: Hany Mahgoub

Abstract:

This paper presents a system for discovering association rules from collections of unstructured documents called EART (Extract Association Rules from Text). The EART system treats texts only not images or figures. EART discovers association rules amongst keywords labeling the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transform unstructured documents into structured documents) with Information Retrieval scheme (TF-IDF) and Data Mining technique for association rules extraction. EART depends on word feature to extract association rules. It consists of four phases: structure phase, index phase, text mining phase and visualization phase. Our work depends on the analysis of the keywords in the extracted association rules through the co-occurrence of the keywords in one sentence in the original text and the existing of the keywords in one sentence without co-occurrence. Experiments applied on a collection of scientific documents selected from MEDLINE that are related to the outbreak of H5N1 avian influenza virus.

Keywords: Association rules, information retrieval, knowledgediscovery in text, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2401
1818 Sounds Alike Name Matching for Myanmar Language

Authors: Yuzana, Khin Marlar Tun

Abstract:

Personal name matching system is the core of essential task in national citizen database, text and web mining, information retrieval, online library system, e-commerce and record linkage system. It has necessitated to the all embracing research in the vicinity of name matching. Traditional name matching methods are suitable for English and other Latin based language. Asian languages which have no word boundary such as Myanmar language still requires sounds alike matching system in Unicode based application. Hence we proposed matching algorithm to get analogous sounds alike (phonetic) pattern that is convenient for Myanmar character spelling. According to the nature of Myanmar character, we consider for word boundary fragmentation, collation of character. Thus we use pattern conversion algorithm which fabricates words in pattern with fragmented and collated. We create the Myanmar sounds alike phonetic group to help in the phonetic matching. The experimental results show that fragmentation accuracy in 99.32% and processing time in 1.72 ms.

Keywords: natural language processing, name matching, phonetic matching

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
1817 Research on Development and Accuracy Improvement of an Explosion Proof Combustible Gas Leak Detector Using an IR Sensor

Authors: Gyoutae Park, Seungho Han, Byungduk Kim, Youngdo Jo, Yongsop Shim, Yeonjae Lee, Sangguk Ahn, Hiesik Kim, Jungil Park

Abstract:

In this paper, we presented not only development technology of an explosion proof type and portable combustible gas leak detector but also algorithm to improve accuracy for measuring gas concentrations. The presented techniques are to apply the flame-proof enclosure and intrinsic safe explosion proof to an infrared gas leak detector at first in Korea and to improve accuracy using linearization recursion equation and Lagrange interpolation polynomial. Together, we tested sensor characteristics and calibrated suitable input gases and output voltages. Then, we advanced the performances of combustible gaseous detectors through reflecting demands of gas safety management fields. To check performances of two company's detectors, we achieved the measurement tests with eight standard gases made by Korea Gas Safety Corporation. We demonstrated our instruments better in detecting accuracy other than detectors through experimental results.

Keywords: Gas sensor, leak, detector, accuracy, interpolation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1356
1816 Fast and Accuracy Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor

Authors: Samir Brahim Belhaouari

Abstract:

By taking advantage of both k-NN which is highly accurate and K-means cluster which is able to reduce the time of classification, we can introduce Cluster-k-Nearest Neighbor as "variable k"-NN dealing with the centroid or mean point of all subclasses generated by clustering algorithm. In general the algorithm of K-means cluster is not stable, in term of accuracy, for that reason we develop another algorithm for clustering our space which gives a higher accuracy than K-means cluster, less subclass number, stability and bounded time of classification with respect to the variable data size. We find between 96% and 99.7 % of accuracy in the lassification of 6 different types of Time series by using K-means cluster algorithm and we find 99.7% by using the new clustering algorithm.

Keywords: Pattern recognition, Time series, k-Nearest Neighbor, k-means cluster, Gaussian Mixture Model, Classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926
1815 Evaluation of Robust Feature Descriptors for Texture Classification

Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo

Abstract:

Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scaling-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we have tried to compare the performance for texture classification using these feature descriptors with k-means clustering. Different classifiers including K-NN, Naive Bayes, Back Propagation Neural Network , Decision Tree and Kstar were applied in three texture image sets - UIUCTex, KTH-TIPS and Brodatz, respectively. Experimental results reveal SIFTS as the best average accuracy rate holder in UIUCTex, KTH-TIPS and SURF is advantaged in Brodatz texture set. BP neuro network works best in the test set classification among all used classifiers.

Keywords: Texture classification, texture descriptor, SIFT, SURF, ORB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1546
1814 Design and Implementation of a Microcontroller Based LCD Screen Digital Stop Watch

Authors: Mr. Khalid I. Saad, Ms. Nusrat Afrin, Mr. Rajib Mikail

Abstract:

The stop watch is used to measure the time required for a certain event. This is different from normal clocks in many ways, one of which is the accuracy of time. The stop watch requires much more accuracy than the normal clocks. In this paper, an ATmega8535 microcontroller was used to control the stop watch, by which perfect accuracy can be ensured. For compiling the C code and for loading the compiled .hex file into the microcontroller, AVR studio and PonyProg were used respectively. The stop watch is also different from traditional stop watches, as it contains two different timing modes namely 'Split timing' and 'Lap timing'.

Keywords: Stop Watch, Microcontroller, Split timing, Laptiming, LCD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9605
1813 Fast Database Indexing for Large Protein Sequence Collections Using Parallel N-Gram Transformation Algorithm

Authors: Jehad A. H. Hammad, Nur'Aini binti Abdul Rashid

Abstract:

With the rapid development in the field of life sciences and the flooding of genomic information, the need for faster and scalable searching methods has become urgent. One of the approaches that were investigated is indexing. The indexing methods have been categorized into three categories which are the lengthbased index algorithms, transformation-based algorithms and mixed techniques-based algorithms. In this research, we focused on the transformation based methods. We embedded the N-gram method into the transformation-based method to build an inverted index table. We then applied the parallel methods to speed up the index building time and to reduce the overall retrieval time when querying the genomic database. Our experiments show that the use of N-Gram transformation algorithm is an economical solution; it saves time and space too. The result shows that the size of the index is smaller than the size of the dataset when the size of N-Gram is 5 and 6. The parallel N-Gram transformation algorithm-s results indicate that the uses of parallel programming with large dataset are promising which can be improved further.

Keywords: Biological sequence, Database index, N-gram indexing, Parallel computing, Sequence retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2082
1812 Optimal Document Archiving and Fast Information Retrieval

Authors: Hazem M. El-Bakry, Ahmed A. Mohammed

Abstract:

In this paper, an intelligent algorithm for optimal document archiving is presented. It is kown that electronic archives are very important for information system management. Minimizing the size of the stored data in electronic archive is a main issue to reduce the physical storage area. Here, the effect of different types of Arabic fonts on electronic archives size is discussed. Simulation results show that PDF is the best file format for storage of the Arabic documents in electronic archive. Furthermore, fast information detection in a given PDF file is introduced. Such approach uses fast neural networks (FNNs) implemented in the frequency domain. The operation of these networks relies on performing cross correlation in the frequency domain rather than spatial one. It is proved mathematically and practically that the number of computation steps required for the presented FNNs is less than that needed by conventional neural networks (CNNs). Simulation results using MATLAB confirm the theoretical computations.

Keywords: Information Storage and Retrieval, Electronic Archiving, Fast Information Detection, Cross Correlation, Frequency Domain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532
1811 The Relationship between Representational Conflicts, Generalization, and Encoding Requirements in an Instance Memory Network

Authors: Mathew Wakefield, Matthew Mitchell, Lisa Wise, Christopher McCarthy

Abstract:

This paper aims to provide an interpretation of artificial neural networks (ANNs) and explore some of its implications. The interpretation views ANNs as a memory which encodes instances of experience. An experiment explores the behavior of encoding and retrieval of instances from memory. A localised representation ANN is created that allows control over encoding and retrieved memory sample size and is experimented with using the MNIST digits dataset. The relationship between input familiarity, conflict within retrieved samples, and error rates is described and demonstrated to be an effective driver for memory encoding. Results indicate that selective encoding and retrieval samples that allow detection of memory conflicts produce optimal performance, and that error rates are normally distributed with input familiarity and conflict. By using input familiarity and sample consistency to guide memory encoding, the number of encoding trials on the dataset were reduced to 18.33% of the training data while maintaining good recognition performance on the test data.

Keywords: Artificial Neural Networks, ANNs, representation, memory, conflict monitoring, confidence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 412
1810 Speech Recognition Using Scaly Neural Networks

Authors: Akram M. Othman, May H. Riadh

Abstract:

This research work is aimed at speech recognition using scaly neural networks. A small vocabulary of 11 words were established first, these words are “word, file, open, print, exit, edit, cut, copy, paste, doc1, doc2". These chosen words involved with executing some computer functions such as opening a file, print certain text document, cutting, copying, pasting, editing and exit. It introduced to the computer then subjected to feature extraction process using LPC (linear prediction coefficients). These features are used as input to an artificial neural network in speaker dependent mode. Half of the words are used for training the artificial neural network and the other half are used for testing the system; those are used for information retrieval. The system components are consist of three parts, speech processing and feature extraction, training and testing by using neural networks and information retrieval. The retrieve process proved to be 79.5-88% successful, which is quite acceptable, considering the variation to surrounding, state of the person, and the microphone type.

Keywords: Feature extraction, Liner prediction coefficients, neural network, Speech Recognition, Scaly ANN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1696
1809 Controlled Vocabularies and Information Retrieval: 1918 Pandemic’s Scientific Literature as an Example

Authors: M. Garcia-Alsina, J. Cobarsí

Abstract:

The role of controlled vocabularies in information retrieval is broadly recognized as a relevant feature. Besides, there is a standing demand that editors and databases should consider the effective introduction of controlled vocabularies in their procedures to index scientific literature. That is especially important because information retrieval is pointed out as a significant point to drive systematic literature review. Hence, a first question emerges: Are the controlled vocabularies at this moment considered? On the other hand, subject searching in the catalogs is complex mainly due to the dichotomy between keywords from authors versus keywords based on controlled vocabularies. Finally, there is some demand to unify the terminology related to health to make easier the medical history exploitation and research. Considering these features, this paper focuses on controlled vocabularies related to the health field and their role for storing, classifying, and retrieving relevant literature. The objective is knowing which role plays the controlled vocabularies related to the health field to index and retrieve research literature in data bases such as Web of Science (WoS) and Scopus. So, this exploratory research is grounded over two research questions: 1) Which are the terms considered in specific controlled vocabularies of the health field; and 2) How papers are indexed in relevant databases to be easily retrieved, considering keywords vs specific health’ controlled vocabularies? This research takes as fieldwork the controlled vocabularies related to health and the scientific interest for 1918 flu pandemic, also known equivocally as ‘Spanish flu’. This interest has been fostered by the emergence in the early 21st of epidemics of pneumonic diseases caused by virus. Searches about and with controlled vocabularies on WoS and Scopus databases are conducted. First results of this work in progress are surprising. There are different controlled vocabularies for the health field, into which the terms collected and preferred related to ‘1918 pandemic’ are identified. To summarize, ‘Spanish influenza epidemic’ or ‘Spanish flu’ are collected as not preferred terms. The preferred terms are: ‘influenza’ or ‘influenza pandemic, 1918-1919’. Although the controlled vocabularies are clear in their election, most of the literature about ‘1918 pandemic’ is retrievable either by ‘Spanish’ or by ‘1918’ disjunct, and the dominant word to retrieve literature is ‘Spanish’ rather than ‘1918’. This is surprising considering the existence of suitable controlled vocabularies related to health topics, and the modern guidelines of World Health Organization concerning naming of diseases that point out to other preferred terms. A first conclusion is the failure of using controlled vocabularies for a field such as health, and in consequence for WoS and Scopus. This research opens further research questions about which is the role that controlled vocabularies play in the instructions to authors that journals deliver to documents’ authors.

Keywords: Controlled vocabularies, indexing, 1918 influenza, information retrieval, keywords, 1918 pandemic, scientific databases.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 365
1808 Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Authors: Kanso Hassan, Elhore Ali, Soule-dupuy Chantal, Tazi Said

Abstract:

Information Retrieval has the objective of studying models and the realization of systems allowing a user to find the relevant documents adapted to his need of information. The information search is a problem which remains difficult because the difficulty in the representing and to treat the natural languages such as polysemia. Intentional Structures promise to be a new paradigm to extend the existing documents structures and to enhance the different phases of documents process such as creation, editing, search and retrieval. The intention recognition of the author-s of texts can reduce the largeness of this problem. In this article, we present intentions recognition system is based on a semi-automatic method of extraction the intentional information starting from a corpus of text. This system is also able to update the ontology of intentions for the enrichment of the knowledge base containing all possible intentions of a domain. This approach uses the construction of a semi-formal ontology which considered as the conceptualization of the intentional information contained in a text. An experiments on scientific publications in the field of computer science was considered to validate this approach.

Keywords: Information research, text analyzes, intentionalstructure, segmentation, ontology, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1601
1807 Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Authors: S. Logeswari, K. Premalatha

Abstract:

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. This expansion makes the existing techniques are not enough to extract the most interesting patterns from the collection as per the user requirement. Recent researches are concentrating more on semantic based searching than the traditional term based searches. Algorithms for semantic searches are implemented based on the relations exist between the words of the documents. Ontologies are used as domain knowledge for identifying the semantic relations as well as to structure the data for effective information retrieval. Annotation of data with concepts of ontology is one of the wide-ranging practices for clustering the documents. In this paper, indexing based on concept and annotation are proposed for clustering the biomedical documents. Fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performances of the proposed methods are analyzed with traditional term based clustering for PubMed articles in five different diseases communities. The experimental results show that the proposed methods outperform the term based fuzzy clustering.

Keywords: MeSH Ontology, Concept Indexing, Annotation, semantic relations, Fuzzy c-means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2255
1806 Business Domain Modelling Using an Integrated Framework

Authors: Mohammed Salahat, Steve Wade

Abstract:

This paper presents an application of a “Systematic Soft Domain Driven Design Framework” as a soft systems approach to domain-driven design of information systems development. The framework use SSM as a guiding methodology within which we have embedded a sequence of design tasks based on the UML leading to the implementation of a software system using the Naked Objects framework. This framework have been used in action research projects that have involved the investigation and modelling of business processes using object-oriented domain models and the implementation of software systems based on those domain models. Within this framework, Soft Systems Methodology (SSM) is used as a guiding methodology to explore the problem situation and to develop the domain model using UML for the given business domain. The framework is proposed and evaluated in our previous works, and a real case study “Information Retrieval System for academic research” is used, in this paper, to show further practice and evaluation of the framework in different business domain. We argue that there are advantages from combining and using techniques from different methodologies in this way for business domain modelling. The framework is overviewed and justified as multimethodology using Mingers multimethodology ideas.

Keywords: SSM, UML, domain-driven design, soft domaindriven design, naked objects, soft language, information retrieval, multimethodology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728
1805 Secure and Efficient Transmission of Aggregated Data for Mobile Wireless Sensor Networks

Authors: A. Krishna Veni, R.Geetha

Abstract:

Wireless Sensor Networks (WSNs) are suitable for many scenarios in the real world. The retrieval of data is made efficient by the data aggregation techniques. Many techniques for the data aggregation are offered and most of the existing schemes are not energy efficient and secure. However, the existing techniques use the traditional clustering approach where there is a delay during the packet transmission since there is no proper scheduling. The presented system uses the Velocity Energy-efficient and Link-aware Cluster-Tree (VELCT) scheme in which there is a Data Collection Tree (DCT) which improves the lifetime of the network. The VELCT scheme and the construction of DCT reduce the delay and traffic. The network lifetime can be increased by avoiding the frequent change in cluster topology. Secure and Efficient Transmission of Aggregated data (SETA) improves the security of the data transmission via the trust value of the nodes prior the aggregation of data. Since SETA considers the data only from the trustworthy nodes for aggregation, it is more secure in transmitting the data thereby improving the accuracy of aggregated data.

Keywords: Aggregation, lifetime, network security, wireless sensor network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1178
1804 Straightness Error Compensation Servo-system for Single-axis Linear Motor Stage

Authors: M. S. Kang, D. H. Kim, J. S. Yoon, B. S. Park, J. K. Lee

Abstract:

Since straightness error of linear motor stage is hardly dependent upon machining accuracy and assembling accuracy, there is limit on maximum realizable accuracy. To cope with this limitation, this paper proposed a servo system to compensate straightness error of a linear motor stage. The servo system is mounted on the slider of the linear motor stage and moves in the direction of the straightness error so as to compensate the error. From position dependency and repeatability of the straightness error of the slider, a feedforward compensation control is applied to the platform servo control. In the consideration of required fine positioning accuracy, a platform driven by an electro-magnetic actuator is suggested and a sliding mode control was applied. The effectiveness of the sliding mode control was verified along with some experimental results.

Keywords: Linear Motor Stage, Straightness Error, Friction, Sliding Mode Control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1887
1803 Dimensional Accuracy of CNTs/PMMA Parts and Holes Produced by Laser Cutting

Authors: A. Karimzad Ghavidel, M. Zadshakouyan

Abstract:

Laser cutting is a very common production method for cutting 2D polymeric parts. Developing of polymer composites with nano-fibers makes important their other properties like laser workability. The aim of this research is investigation of the influence different laser cutting conditions on the dimensional accuracy of parts and holes from poly methyl methacrylate (PMMA)/carbon nanotubes (CNTs) material. Experiments were carried out by considering of CNTs (in four level 0,0.5, 1 and 1.5% wt.%), laser power (60, 80, and 100 watt) and cutting speed 20, 30, and 40 mm/s as input variable factors. The results reveal that CNTs adding improves the laser workability of PMMA and the increasing of power has a significant effect on the part and hole size. The findings also show cutting speed is effective parameter on the size accuracy. Eventually, the statistical analysis of results was done, and calculated mathematical equations by the regression are presented for determining relation between input and output factor.

Keywords: Dimensional accuracy-PMMA-CNTs-laser cutting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1144
1802 Organization Model of Semantic Document Repository and Search Techniques for Studying Information Technology

Authors: Nhon Do, Thuong Huynh, An Pham

Abstract:

Nowadays, organizing a repository of documents and resources for learning on a special field as Information Technology (IT), together with search techniques based on domain knowledge or document-s content is an urgent need in practice of teaching, learning and researching. There have been several works related to methods of organization and search by content. However, the results are still limited and insufficient to meet user-s demand for semantic document retrieval. This paper presents a solution for the organization of a repository that supports semantic representation and processing in search. The proposed solution is a model which integrates components such as an ontology describing domain knowledge, a database of document repository, semantic representation for documents and a file system; with problems, semantic processing techniques and advanced search techniques based on measuring semantic similarity. The solution is applied to build a IT learning materials management system of a university with semantic search function serving students, teachers, and manager as well. The application has been implemented, tested at the University of Information Technology, Ho Chi Minh City, Vietnam and has achieved good results.

Keywords: document retrieval system, knowledgerepresentation, document representation, semantic search, ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665
1801 Logistic Model Tree and Expectation-Maximization for Pollen Recognition and Grouping

Authors: Endrick Barnacin, Jean-Luc Henry, Jack Molinié, Jimmy Nagau, Hélène Delatte, Gérard Lebreton

Abstract:

Palynology is a field of interest for many disciplines. It has multiple applications such as chronological dating, climatology, allergy treatment, and even honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, which is becoming increasingly rare due to economic and social conditions. So, the automation of this task is a necessity. Pollen slides analysis is mainly a visual process as it is carried out with the naked eye. That is the reason why a primary method to automate palynology is the use of digital image processing. This method presents the lowest cost and has relatively good accuracy in pollen retrieval. In this work, we propose a system combining recognition and grouping of pollen. It consists of using a Logistic Model Tree to classify pollen already known by the proposed system while detecting any unknown species. Then, the unknown pollen species are divided using a cluster-based approach. Success rates for the recognition of known species have been achieved, and automated clustering seems to be a promising approach.

Keywords: Pollen recognition, logistic model tree, expectation-maximization, local binary pattern.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 702
1800 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases

Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou

Abstract:

A large amount of data is typically stored in relational databases (DB). The latter can efficiently handle user queries which intend to elicit the appropriate information from data sources. However, direct access and use of this data requires the end users to have an adequate technical background, while they should also cope with the internal data structure and values presented. Consequently the information retrieval is a quite difficult process even for IT or DB experts, taking into account the limited contributions of relational databases from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and relations among them and hence they can be used for unambiguously specifying the information captured by the relational database. However, accessing information residing in a database using ontologies is feasible, provided that the users are keen on using semantic web technologies. For enabling users form different disciplines to retrieve the appropriate data, the design of a Graphical User Interface is necessary. In this work, we will present an interactive, ontology-based, semantically enable web tool that can be used for information retrieval purposes. The tool is totally based on the ontological representation of underlying database schema while it provides a user friendly environment through which the users can graphically form and execute their queries.

Keywords: Ontologies, Relational Databases, SPARQL, Web Interface.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1884
1799 Evaluation of the Accuracy of Time of Arrival Source Location Algorithm of Acoustic Emission in Concrete-Mortar Structure

Authors: Hisham A. Elfergani, Ayad A. Abdalla, Ahmed R. Ballil

Abstract:

Acoustic Emission (AE) is one of the most effective non-destructive tests that can be used to detect the defect process as it is occurring. AE techniques can be used to monitor a wide range of structures and materials such as metals, non-metals and combinations of these when load is applied. The current work investigates the effectiveness and accuracy of TOA method in AE tests involving reinforced composite concrete-mortar structures. A series of experimental tests were performed using the Hsu-Neilson (H-N) source to study 2-D location accuracy using this method on concrete-mortar (400×400 mm) specimens. Four AE sensors (R3I – resonant frequency 30 kHz) were mounted to the mortar surface and six sources were performed at each point of preselected locations on the upper surface of the mortar. Results show that the TOA method can be used effectively to locate signals on composite concrete/mortar specimen and has high accuracy.

Keywords: Acoustic emission, time of arrival, composite materials, reinforced concrete.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 580
1798 Comparison of Machine Learning Techniques for Single Imputation on Audiograms

Authors: Sarah Beaver, Renee Bryce

Abstract:

Audiograms detect hearing impairment, but missing values pose problems. This work explores imputations in an attempt to improve accuracy. This work implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125 Hz to 8000 Hz. The data contain patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on data combines and compares left and right ear audiograms versus single ear side audiograms. The accuracy achieved using Root Mean Square Error (RMSE) values for the best models for Random Forest ranges from 4.74 to 6.37. The R2 values for the best models for Random Forest ranges from .91 to .96. The accuracy achieved using RMSE values for the best models for KNN ranges from 5.00 to 7.72. The R2 values for the best models for KNN ranges from .89 to .95. The best imputation models received R2 between .89 to .96 and RMSE values less than 8dB. We also show that the accuracy of classification predictive models performed better with our imputation models versus constant imputations by a two percent increase.

Keywords: Machine Learning, audiograms, data imputations, single imputations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37
1797 MOSFET Based ADC for Accurate Positioning of Control Valves in Industry

Authors: K. Diwakar, N. Vasudevan, C. Senthilpari

Abstract:

This paper presents MOSFET based analog to digital converter which is simple in design, has high resolution, and conversion rate better than dual slope ADC. It has no DAC which will limit the performance, no error in conversion, can operate for wide range of inputs and never become unstable. One of the industrial applications, where the proposed high resolution MOSFET ADC can be used is, for the positioning of control valves in a multi channel data acquisition and control system (DACS), using stepper motors as actuators of control valves. It is observed that in a DACS having ten control valves, 0.02% of positional accuracy of control valves can be achieved with the data update period of 250ms and with stepper motors of maximum pulse rate 20 Kpulses per sec. and minimum pulse width of 2.5 μsec. The reported accuracy so far by other authors is 0.2%, with update period of 255 ms and with 8 bit DAC. The accuracy in the proposed configuration is limited by the available precision stepper motor and not by the MOSFET based ADC.

Keywords: MOSFET based ADC, Actuators, Positional accuracy, Stepper Motors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2558
1796 Support Vector Machine based Intelligent Watermark Decoding for Anticipated Attack

Authors: Syed Fahad Tahir, Asifullah Khan, Abdul Majid, Anwar M. Mirza

Abstract:

In this paper, we present an innovative scheme of blindly extracting message bits from an image distorted by an attack. Support Vector Machine (SVM) is used to nonlinearly classify the bits of the embedded message. Traditionally, a hard decoder is used with the assumption that the underlying modeling of the Discrete Cosine Transform (DCT) coefficients does not appreciably change. In case of an attack, the distribution of the image coefficients is heavily altered. The distribution of the sufficient statistics at the receiving end corresponding to the antipodal signals overlap and a simple hard decoder fails to classify them properly. We are considering message retrieval of antipodal signal as a binary classification problem. Machine learning techniques like SVM is used to retrieve the message, when certain specific class of attacks is most probable. In order to validate SVM based decoding scheme, we have taken Gaussian noise as a test case. We generate a data set using 125 images and 25 different keys. Polynomial kernel of SVM has achieved 100 percent accuracy on test data.

Keywords: Bit Correct Ratio (BCR), Grid Search, Intelligent Decoding, Jackknife Technique, Support Vector Machine (SVM), Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1624
1795 Study of Remote Sensing and Satellite Images Ability in Preparing Agricultural Land Use Map (ALUM)

Authors: Ali Gholami

Abstract:

In this research the Preparation of Land use map of scanner LISS III satellite data, belonging to the IRS in the Aghche region in Isfahan province, is studied carefully. For this purpose, the IRS satellite images of August 2008 and various land preparation uses in region including rangelands, irrigation farming, dry farming, gardens and urban areas were separated and identified. Therefore, the GPS and Erdas Imaging software were used and three methods of Maximum Likelihood, Mahalanobis Distance and Minimum Distance were analyzed. In each of these methods, matrix error and Kappa index were calculated and accuracy of each method, based on percentages: 53.13, 56.64 and 48.44, were obtained respectively. Considering the low accuracy of these methods in separation of land preparation use, the visual interpretation of the map was used. Finally, regional visits of 150 points were noted at random and no error was observed. It shows that the map prepared by visual interpretation is in high accuracy. Although the probable errors due to visual interpretation and geometric correction might happen but the desired accuracy of the map which is more than 85 percent is reliable.

Keywords: Land use map, Aghche Region, Erdas Imagine, satellite images

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515
1794 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3490
1793 Satellite Data Classification Accuracy Assessment Based from Reference Dataset

Authors: Mohd Hasmadi Ismail, Kamaruzaman Jusoff

Abstract:

In order to develop forest management strategies in tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. There are tremendous effort has been done in classification of land cover related to forest resource management in this country as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact classification process is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess classification accuracy of classified forest map on Landsat TM data from difference number of reference data (200 and 388 reference data). This comparison was made through observation (200 reference data), and interpretation and observation approaches (388 reference data). Five land cover classes namely primary forest, logged over forest, water bodies, bare land and agricultural crop/mixed horticultural can be identified by the differences in spectral wavelength. Result showed that an overall accuracy from 200 reference data was 83.5 % (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. However, when 200 reference data was increased to 388 in the confusion matrix, the accuracy slightly improved from 83.5% to 89.17%, with Kappa statistic increased from 0.7502459 to 0.8026135, respectively. The accuracy in this classification suggested that this strategy for the selection of training area, interpretation approaches and number of reference data used were importance to perform better classification result.

Keywords: Image Classification, Reference Data, Accuracy Assessment, Kappa Statistic, Forest Land Cover

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3091