Search results for: keyword spotting.

40 Investigation of Effective Parameters on Annealing and Hot Spotting Processes for Straightening of Bent Turbine Rotors

Authors: Esmaeil Poursaeidi, Mostafa Kamalzadeh Yazdi, Mohammadreza Mohammadi Arhani

Abstract:

The most severe damage to a turbine rotor is its distortion. The rotor straightening process must lead, at the first stage, to removal of the stresses from the material by annealing and next, to straightening of the plastic distortion without leaving any stress by hot spotting. The straightening method does not produce stress accumulations, and the heating technique, developed specifically for solid forged rotors and disks, makes it possible to avoid local overheating and structural changes in the material. This process also does not leave stresses in the shaft material. An experimental study of hot spotting is carried out on a large turbine rotor, and some of the most important parameters that must be considered in the annealing and hot spotting processes are investigated in this paper.

Keywords: Annealing, Hot Spotting, Effective Parameter, Rotor

Downloads: 1949

39 Keyword Network Analysis on the Research Trends of Life-Long Education for People with Disabilities in Korea

Authors: Jakyoung Kim, Sungwook Jang

Abstract:

The purpose of this study is to examine the research trends of life-long education for people with disabilities using a keyword network analysis. For this purpose, 151 papers were selected from 594 papers retrieved using keywords such as 'people with disabilities' and 'life-long education' in the Korean Education and Research Information Service. The keyword network was constructed by extracting and coding the keywords used in the titles of the selected papers. The frequency, degree centrality, and betweenness centrality of the extracted keywords were analyzed using the keyword network. The results of the keyword network analysis are as follows. First, the main keywords that appeared frequently in the study of life-long education for people with disabilities were 'people with disabilities', 'life-long education', 'developmental disabilities', 'current situations', and 'development'. The research trends of life-long education for people with disabilities are focused on the current status of life-long education and on program development. Second, the keyword network analysis and visualization showed that keywords with a high frequency of occurrence also generally have high degree centrality and betweenness centrality. In terms of the keyword network diagram, it was confirmed that research trends of life-long education for people with disabilities are centered on six prominent keywords. Based on these results, it was discussed that life-long education for people with disabilities needs, in the future, to expand its subjects and supporting areas, and that the research needs to be further expanded into more detailed and specific areas.

Keywords: Life-long education, people with disabilities, research trends, keyword network analysis.

Downloads: 1225
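The degree centrality mentioned in the abstract of entry 39 can be illustrated with a small sketch. It assumes that two keywords are linked whenever they appear in the same paper title, which is one common way to build a keyword network; the abstract does not state the exact construction used, and the function name `degree_centrality` is hypothetical. Betweenness centrality is omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(keyword_sets):
    """Degree centrality of each keyword in a co-occurrence network.

    Assumption (not from the paper): two keywords are connected when
    they occur together in the same title's keyword set.
    """
    neighbours = defaultdict(set)
    for keywords in keyword_sets:
        unique = sorted(set(keywords))
        for kw in unique:
            neighbours[kw]  # register the node even if it has no links
        for a, b in combinations(unique, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {kw: 0.0 for kw in neighbours}
    # Normalise by the maximum possible number of neighbours (n - 1).
    return {kw: len(linked) / (n - 1) for kw, linked in neighbours.items()}


if __name__ == "__main__":
    titles = [
        {"life-long education", "people with disabilities"},
        {"life-long education", "program development"},
        {"life-long education", "current situations"},
    ]
    for kw, score in sorted(degree_centrality(titles).items()):
        print(f"{kw}: {score:.2f}")
```

In this toy input, the keyword that co-occurs with every other keyword receives the highest score, which mirrors the abstract's observation that frequent keywords tend to be central.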

38 Identifying Potential Partnership for Open Innovation by using Bibliographic Coupling and Keyword Vector Mapping

Authors: Inchae Park, Byungun Yoon

Abstract:

As open innovation has received increasing attention in the management of innovation, the importance of identifying potential partnerships is increasing. This paper suggests a methodology to identify the interested parties, as one of the innovation intermediaries, to enable open innovation with a patent network. To implement the methodology, multi-stage patent citation analysis such as bibliographic coupling and an information visualization method such as keyword vector mapping are utilized. The contribution of this paper is that it can present meaningful collaboration keywords for the identified potential partners in the network, since not only citation information but also patent textual information is used.

Keywords: Open innovation, partner selection, bibliographic coupling, Keyword vector mapping, patent network.

Downloads: 1775

37 Architecture of Speech-based Registration System

Authors: Mayank Kumar, D B Mahesh Kumar, Ashwin S Kumar, N K Srinath

Abstract:

In this era of technology, fueled by the pervasive usage of the internet, security is a prime concern. The number of new attacks by so-called "bots", which are automated programs, is increasing at an alarming rate. They are most likely to attack online registration systems. A technology called "CAPTCHA" (Completely Automated Public Turing test to tell Computers and Humans Apart) exists, which can differentiate between automated programs and humans and prevent replay attacks. Traditionally, CAPTCHAs have been implemented with a challenge that involves recognizing textual images and reproducing the text. We propose an approach in which the visual challenge has to be read out, and randomly selected keywords from it are used to verify the correctness of the spoken text and, in turn, detect the presence of a human. This is supplemented with a speaker recognition system that can also identify the speaker. Thus, this framework fulfills both objectives: it can determine whether the user is a human and, if so, verify the user's identity.

Keywords: CAPTCHA, automatic speech recognition, keyword spotting.

Downloads: 1530

36 An Open Source Advertisement System

Authors: Pushkar Umaranikar, Chris Pollett

Abstract:

An online advertisement system and its implementation for the Yioop open source search engine are presented. This system supports both selling advertisements and displaying them within search results. The selling of advertisements is done using a system to auction off daily impressions for keyword searches. This is an open, ascending price auction system in which all accepted bids will receive a fraction of the auctioned day’s impressions. New bids in our system are required to be at least one half of the sum of all previous bids ensuring the number of accepted bids is logarithmic in the total ad spend on a keyword for a day. The mechanics of creating an advertisement, attaching keywords to it, and adding it to an advertisement inventory are described. The algorithm used to go from accepted bids for a keyword to which ads are displayed at search time is also presented. We discuss properties of our system and compare it to existing auction systems and systems for selling online advertisements.

Keywords: Online markets, online ad system, online auctions, search engines.

Downloads: 1346
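The bid-acceptance rule stated in the abstract of entry 36 (a new bid must be at least one half of the sum of all previous bids) can be sketched as follows. This is only an illustration, not the Yioop implementation: the function names are hypothetical, and splitting impressions in proportion to bid size is an assumption, since the abstract only says that each accepted bid receives "a fraction" of the day's impressions.

```python
def accept_bid(accepted, new_bid):
    """Record new_bid if it is at least half the sum of earlier accepted bids.

    Returns True when the bid is accepted, False otherwise.
    """
    if new_bid <= 0:
        return False
    if new_bid >= 0.5 * sum(accepted):
        accepted.append(new_bid)
        return True
    return False


def impression_shares(accepted):
    """Share of the day's impressions per accepted bid.

    Assumption (not from the abstract): shares are proportional to bid size.
    """
    total = sum(accepted)
    return [bid / total for bid in accepted] if total else []


if __name__ == "__main__":
    bids = []
    for offer in (10, 4, 5, 7, 8):
        status = "accepted" if accept_bid(bids, offer) else "rejected"
        print(offer, status)
    print(impression_shares(bids))
```

Because each accepted bid must be at least half of the running total, the total grows by a factor of at least 1.5 with every acceptance, which is why the number of accepted bids stays logarithmic in the total spend.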

35 Identification of Spam Keywords Using Hierarchical Category in C2C E-commerce

Authors: Shao Bo Cheng, Yong-Jin Han, Se Young Park, Seong-Bae Park

Abstract:

Consumer-to-Consumer (C2C) E-commerce has been growing at a very high speed in recent years. Since identical or nearly identical kinds of products compete with one another by relying on keyword search in C2C E-commerce, some sellers describe their products with spam keywords that are popular but are not related to their products. Though such products get more chances to be retrieved and selected by consumers than those without spam keywords, the spam keywords mislead the consumers and waste their time. This problem has been reported in many commercial services like eBay and Taobao, but there has been little research to solve it. As a solution, this paper proposes a method to classify whether the keywords of a product are spam or not. The proposed method assumes that a keyword for a given product is more reliable if the keyword is commonly observed in the specifications of products that are the same as, or of the same kind as, the given product. This is because the hierarchical category of a product is in general determined precisely by the seller of the product, and so is the specification of the product. Since higher layers of the hierarchical category represent more general kinds of products, the degree of reliability is determined differently according to the layer. Hence, the reliability degrees from different layers of a hierarchical category become features for keywords, and they are used together with features derived only from specifications for classification of the keywords. Support Vector Machines are adopted as the basic classifier using these features, since they are powerful and widely used in many classification tasks. In the experiments, the proposed method is evaluated with a gold standard dataset from Yi-han-wang, a Chinese C2C E-commerce service, and is compared with a baseline method that does not consider the hierarchical category. The experimental results show that the proposed method outperforms the baseline in F1-measure, indicating that spam keywords are effectively identified by a hierarchical category in C2C E-commerce.

Keywords: Spam Keyword, E-commerce, keyword features, spam filtering.

Downloads: 2485

34 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of Web users into high-level themes which express users' intentions and interests. However, such techniques have mostly focused on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate more concentrated themes, and hence build a sounder model of user behavior. The purpose of this paper is twofold: to use distance-based clustering methods to recognize overall themes from the proxy log file, and to suggest efficient cut-off levels for the keyword and similarity thresholds which tend to produce better clusters with sharper focus and efficient size.

Keywords: Data mining, knowledge discovery, clustering, data analysis, Web log analysis, theme based searching.

Downloads: 1436

33 Image Retrieval: Techniques, Challenge, and Trend

Authors: Hui Hui Wang, Dzulkifli Mohamad, N.A Ismail

Abstract:

This paper discusses the evolution of retrieval techniques, focusing on the development, challenges and trends of image retrieval. It highlights both the already addressed and the outstanding issues. The explosive growth of image data leads to the need for research and development in image retrieval. Image retrieval research is moving from keywords to low-level features and on to semantic features. The drive towards semantic features is due to the problem that keywords can be very subjective and time consuming to assign, while low-level features cannot always describe high-level concepts in the users' minds.

Keywords: content based image retrieval, keyword based image retrieval, semantic gap, semantic image retrieval.

Downloads: 2503

32 Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

Authors: Mario Kubek, Herwig Unger

Abstract:

Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graph-based paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that especially the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion on further fields of its application completes this contribution.

Keywords: Search algorithm, centroid, query, keyword, co-occurrence, categorisation.

Downloads: 605

31 Analyzing Keyword Networks for the Identification of Correlated Research Topics

Authors: Thiago M. R. Dias, Patrícia M. Dias, Gray F. Moita

Abstract:

The production and publication of scientific works have increased significantly in recent years, with the Internet being the main factor in the access to and distribution of these works. Faced with this, there is a growing interest in understanding how scientific research has evolved, in order to explore this knowledge to encourage research groups to become more productive. Therefore, the objective of this work is to explore repositories containing data from scientific publications and to characterize the keyword networks of these publications, in order to identify the most relevant keywords and to highlight those that have the greatest impact on the network. To do this, each article in the study repository has its keywords extracted, and in this way the network is characterized, after which several metrics from social network analysis are applied to identify the highlighted keywords.

Keywords: Extraction and data integration, bibliometrics, scientometrics.

Downloads: 671

30 Automatic Text Summarization

Authors: Mohamed Abdel Fattah, Fuji Ren

Abstract:

This work proposes an approach to address automatic text summarization. This approach is a trainable summarizer, which takes into account several features, including sentence position, positive keyword, negative keyword, sentence centrality, sentence resemblance to the title, sentence inclusion of named entities, sentence inclusion of numerical data, sentence relative length, bushy path of the sentence and aggregated similarity for each sentence, to generate summaries. First, we investigate the effect of each sentence feature on the summarization task. Then we use the score function of all features to train genetic algorithm (GA) and mathematical regression (MR) models to obtain a suitable combination of feature weights. The performance of the proposed approach is measured at several compression rates on a data corpus composed of 100 English religious articles. The results of the proposed approach are promising.

Keywords: Automatic Summarization, Genetic Algorithm, Mathematical Regression, Text Features.

Downloads: 2309

29 A Text Mining Technique Using Association Rules Extraction

Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey

Abstract:

This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique, called Extracting Association Rules from Text (EART), depends on keyword features to discover association rules amongst the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, focusing instead on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TFIDF) (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rule generation) and uses a Data Mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were applied to Web page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with another system that uses the Apriori algorithm, in terms of execution time and evaluation of the extracted association rules.

Keywords: Text mining, data mining, association rule mining

Downloads: 4410
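The TFIDF-based keyword selection mentioned in the abstract of entry 29 can be sketched with the textbook formulation below (term frequency times the logarithm of inverse document frequency). The exact weighting used by EART and GARW is not given in the abstract, so this is a generic stand-in; the function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenised document.

    Uses tf = count / document length and idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, k=2):
    """Pick the k highest-scoring terms per document (ties broken by name)."""
    return [
        [term for term, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tfidf_scores(docs)
    ]


if __name__ == "__main__":
    docs = [
        ["bird", "flu", "outbreak"],
        ["bird", "flu", "vaccine"],
        ["bird", "market"],
    ]
    print(top_keywords(docs, k=1))
```

A term that occurs in every document gets a score of zero, so only terms that discriminate between documents survive as candidate keywords for rule generation.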

28 Causes of Rotor Distortions and Applicable Common Straightening Methods for Turbine Rotors and Shafts

Authors: Esmaeil Poursaeidi, Mostafa Kamalzadeh Yazdi

Abstract:

Different problems may cause distortion of the rotor, and hence vibration, which is the most severe damage of turbine rotors. Over the years, different techniques have been developed for the straightening of bent rotors. The method for straightening can be selected according to initial information from preliminary inspections and tests such as nondestructive tests, chemical analysis and run-out tests, as well as knowledge of the shaft material. This article covers the various causes of excessive bends, and then some applicable common straightening methods are reviewed. Finally, hot spotting is selected for a particular bent rotor. A 325 MW steam turbine rotor is modeled, and finite element analyses are carried out to investigate this straightening process. Results of experimental data show that performing the exact hot spot straightening process reduced the bending of the rotor significantly.

Keywords: Distortion, FEM, Hot Spot Area, Rotor Straightening

Downloads: 6487

27 2-Dimensional Finger Gesture Based Mobile Robot Control Using Touch Screen

Authors: O. Ejale, N.B. Siddique, R. Seals

Abstract:

The purpose of this study was to present a reliable means of human-computer interfacing based on finger gestures made in two dimensions, which could be interpreted and adequately used in controlling a remote robot's movement. The gestures were captured and interpreted using an algorithm based on trigonometric functions, which calculates the angular displacement from one point of touch to another as the user's finger moves within a time interval, thereby allowing for pattern spotting of the captured gesture. In this paper, the design and implementation of such a gesture-based user interface is presented, utilizing the aforementioned algorithm. These techniques were then used to control a remote mobile robot's movement. A resistive touch screen was selected as the gesture sensor, with a programmed microcontroller used to interpret the gestures.

Keywords: 2-Dimensional interface, finger gesture, mobile robot control, touch screen.

Downloads: 1910
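The trigonometric step described in the abstract of entry 27 (computing the angular displacement between two touch points) can be sketched with `math.atan2`. The abstract does not give the authors' formula or gesture classes, so the four-way direction split, the y-axis-up convention and the function names here are assumptions for illustration only.

```python
import math


def angular_displacement(p0, p1):
    """Angle in degrees, in [0, 360), of the move from touch point p0 to p1.

    Assumes x grows to the right and y grows upward; many touch screens
    report y growing downward, in which case dy should be negated.
    """
    dx = p1[0] - p0[0]
    dy = p1[1] - p0[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def classify_direction(angle_deg):
    """Map an angle to one of four 90-degree sectors (hypothetical gesture set)."""
    sectors = ["right", "up", "left", "down"]
    # Shift by 45 degrees so that each sector is centred on its axis.
    return sectors[int(((angle_deg + 45.0) % 360.0) // 90.0)]


if __name__ == "__main__":
    start, end = (10, 10), (10, 60)
    angle = angular_displacement(start, end)
    print(angle, classify_direction(angle))
```

Sampling the touch position at fixed time intervals and classifying each successive displacement in this way yields a sequence of directions that a controller could match against known gesture patterns.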

26 A Practical Solution of a Plant Pipes Monitoring System Using Bio-mimetic Robots

Authors: Seung You Na, Daejung Shin, Jin Young Kim, Bae-Ho Lee, Ji-Sung Lee

Abstract:

There has been a growing interest in the field of bio-mimetic robots that resemble the shape of an insect or an aquatic animal, among many others. One bio-mimetic robot serves the purpose of exploring pipelines, spotting any troubled areas or malfunctions and reporting its data. Moreover, the robot is able to prepare for and react to any abnormal routes in the pipeline. In order to move effectively inside a pipeline, the robot's movement will resemble that of a lizard. When situated in massive pipelines with complex routes, the robot places fixed sensors in several important spots in order to complete its monitoring. This monitoring task is to prevent a major system failure by preemptively recognizing any minor or partial malfunctions. Areas not covered by fixed sensors usually cannot be observed and examined in real time, and thus are dependent on periodic offline monitoring. This paper provides a monitoring system that is able to monitor the entire area of pipelines, with and without fixed sensors, by using the bio-mimetic robot.

Keywords: Bio-mimetic robots, Plant pipes monitoring, Mobile and active monitoring.

Downloads: 1571

25 Pipelines Monitoring System Using Bio-mimetic Robots

Authors: Seung You Na, Daejung Shin, Jin Young Kim, Seong-Joon Baek, Bae-Ho Lee

Abstract:

Recently there has been a growing interest in the field of bio-mimetic robots that resemble the behaviors of an insect or an aquatic animal, among many others. One of the various bio-mimetic robot applications is to explore pipelines, spotting any troubled areas or malfunctions and reporting its data. Moreover, the robot is able to prepare for and react to any abnormal routes in the pipeline. Special types of mobile robots are necessary for pipeline monitoring tasks. In order to move effectively along a pipeline, the robot's movement will resemble that of insects or crawling animals. When situated in massive pipelines with complex routes, the robot places fixed sensors in several important spots in order to complete its monitoring. This monitoring task is to prevent a major system failure by preemptively recognizing any minor or partial malfunctions. Areas not covered by fixed sensors usually cannot be observed and examined in real time, and thus are dependent on periodic offline monitoring. This paper proposes a monitoring system that is able to monitor the entire area of pipelines, with and without fixed sensors, by using the bio-mimetic robot.

Keywords: Bio-mimetic robots, Plant pipes monitoring, Mobile and active monitoring.

Downloads: 2248

24 Lessons Learned from Observing User Behavior through Repeated Usability Evaluations

Authors: Hanmin Jung, Mikyoung Lee, Won-kyung Sung

Abstract:

An academic research information service is a must for surveying previous studies in the research and development process. OntoFrame is an academic research information service built on a Semantic Web framework, unlike simple keyword-based services such as CiteSeer and Google Scholar. The first purpose of this study is to reveal user behavior in their surveys, their objectives in using academic research information services, and their needs. The second is to apply lessons learned from the results to OntoFrame.

Keywords: User Behavior, Usability Evaluation, OntoFrame, CiteSeer, Google Scholar, Academic Research Information Service.

Downloads: 1497

23 A Review of Existing Turnover Intention Theories

Authors: Pauline E. Ngo-Henha

Abstract:

Existing turnover intention theories are reviewed in this paper. This review was conducted with the help of the search keyword “turnover intention theories” in Google Scholar during the month of July 2017. These theories include: The Theory of Organizational Equilibrium (TOE), Social Exchange Theory, Job Embeddedness Theory, Herzberg’s Two-Factor Theory, the Resource-Based View, Equity Theory, Human Capital Theory, and the Expectancy Theory. One of the limitations of this review paper is that data were only collected from Google Scholar where many papers were sometimes not freely accessible. However, this paper attempts to contribute to the research in clarifying the distinction between theories and models in the context of turnover intention.

Keywords: Job embeddedness theory, theory of organizational equilibrium (TOE), Herzberg’s two-factor theory, turnover intention theories, theories and models.

Downloads: 22583

22 A Secure Mobile OTP Authentication Scheme for User Mobility Cloud VDI Environment

Authors: Jong-won Lee

Abstract:

Since the cloud environment emerged as the most powerful keyword in the computing industry, the growth of VDI (Virtual Desktop Infrastructure) has become remarkable in the domestic market. In recent years, with mobile devices such as smartphones and pads spreading rapidly, the strength of VDI in letting people access and perform business on the move, along with companies' office needs, has accelerated the spread of VDI. In this paper, a mobile OTP (One-Time Password) authentication method is proposed to secure mobile device portability through rapid and secure authentication using mobile devices such as mobile phones or pads, which does not require users to purchase or carry additional OTP tokens. To facilitate diverse and wide use of services in the future, service should be continuous and stable and, above all, security should be considered the most important factor in meeting advanced portability and user accessibility, the strengths of VDI.

Keywords: Cloud, VDI, OTP, Mobility

Downloads: 2026

21 Text Summarization for Oil and Gas Drilling Topic

Authors: Y. Y. Chen, O. M. Foong, S. P. Yong, Kurniawan Iwan

Abstract:

Information sharing and gathering are important in this era of rapid technological advancement. The existence of the WWW has caused an explosive growth of information. Readers are overloaded with lengthy text documents when they are more interested in shorter versions. The oil and gas industry cannot escape this predicament. In this paper, we develop an Automated Text Summarization System known as AutoTextSumm to extract the salient points of oil and gas drilling articles by incorporating a statistical approach, keyword identification, synonym words and the sentence's position. In this study, we have conducted interviews with Petroleum Engineering experts and English Language experts to identify the list of most commonly used keywords in the oil and gas drilling domain. The system performance of AutoTextSumm is evaluated using the formulae of precision, recall and F-score. Based on the experimental results, AutoTextSumm has produced satisfactory performance with an F-score of 0.81.

Keywords: Keyword's probability, synonym sets.

Downloads: 1713
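The precision, recall and F-score measures named in the abstract of entry 21 follow the standard definitions, sketched below for an extractive summary treated as a set of selected sentences. How AutoTextSumm matches system sentences against reference sentences is not stated in the abstract, so exact set overlap is an assumption, as are the function name and the sample sentence identifiers.

```python
def precision_recall_f(system, reference):
    """Precision, recall and balanced F-score for two collections of sentence ids.

    precision = overlap / |system|, recall = overlap / |reference|,
    F = 2 * P * R / (P + R).
    """
    system, reference = set(system), set(reference)
    overlap = len(system & reference)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical sentence indices chosen by a system and by a human expert.
    p, r, f = precision_recall_f({1, 2, 3, 4}, {2, 3, 4, 5, 6})
    print(f"precision={p:.2f} recall={r:.2f} f={f:.2f}")
```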

20 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users search for and retrieve the portions of a lecture video that interest them. This is achieved by temporally segmenting and indexing the lecture video using topic keywords. For this purpose, we use transcribed text from the video and documents relevant to the video topic extracted from the web. The keywords for indexing are found by applying non-negative matrix factorization (NMF) topic modeling techniques to the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end times of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword, which is used to retrieve the specific portions of the video for the query provided by the users.

Keywords: Video indexing and retrieval, lecture videos, content based video search, multimodal indexing.

Downloads: 1529

19 A Keyword-Based Filtering Technique of Document-Centric XML using NFA Representation

Authors: Changwoo Byun, Kyounghan Lee, Seog Park

Abstract:

XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model are focused on the highly structured data marked up with XML tags. These techniques are efficient in filtering the documents of data-centric XML but are not effective in filtering the element contents of the document-centric XML. In this paper, we propose an extended XPath specification which includes a special matching character '%' used in the LIKE operation of SQL in order to solve the difficulty of writing some queries to adequately filter element contents using the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XMLs, called Pfilter, which is able to exploit the extended XPath specification. We show several performance studies, efficiency and scalability using the multi-query processing time (MQPT).

Keywords: XML Data Stream, Document-centric XML, Filtering Technique, Value-based Predicates.

Downloads: 1741
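The '%' matching character that the abstract of entry 19 borrows from the SQL LIKE operation can be illustrated as below. This sketch translates a pattern into a regular expression and is not the NFA-based Pfilter technique from the paper; the function name is hypothetical, and only '%' (match any run of characters) is handled.

```python
import re


def like_match(pattern, text):
    """Return True if text matches a LIKE-style pattern where '%' is a wildcard.

    '%' matches any sequence of characters, including the empty one; every
    other character must match literally.
    """
    # Escape the literal pieces so regex metacharacters are not interpreted.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


if __name__ == "__main__":
    content = "document-centric XML filtering"
    print(like_match("%XML%", content))   # substring test
    print(like_match("XML%", content))    # prefix test
```

A predicate of this kind lets a subscription match on part of an element's text content instead of requiring an exact value, which is the difficulty the abstract describes for document-centric XML.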

18 Text Summarization for Oil and Gas News Article

Authors: L. H. Chong, Y. Y. Chen

Abstract:

Information is increasing in volume; companies are so overloaded with information that they may lose track of the information they need. It is a time-consuming task to scan through each lengthy document. A shorter version of the document which contains only the gist is more favourable for most information seekers. Therefore, in this paper, we implement a text summarization system to produce a summary that contains the gist of oil and gas news articles. The summarization is intended to provide important information for oil and gas companies to monitor their competitors' behaviour, helping them to formulate business strategies. The system integrates a statistical approach with three underlying concepts: keyword occurrences, the title of the news article and the location of the sentence. The generated summaries were compared with human-generated summaries from an oil and gas company. Precision and recall ratios are used to evaluate the accuracy of the generated summary. Based on the experimental results, the system is able to produce an effective summary with an average recall value of 83% at a compression rate of 25%.

Keywords: Information retrieval, text summarization, statistical approach.

Downloads: 1582

17 Image Ranking to Assist Object Labeling for Training Detection Models

Authors: Tonislav Ivanov, Oleksii Nedashkivskyi, Denis Babeshko, Vadim Pinskiy, Matthew Putman

Abstract:

Training a machine learning model for object detection that generalizes well is known to benefit from a training dataset with diverse examples. However, training datasets usually contain many repeats of common examples of a class and lack rarely seen examples. This is due to the process commonly used during human annotation where a person would proceed sequentially through a list of images labeling a sufficiently high total number of examples. Instead, the method presented involves an active process where, after the initial labeling of several images is completed, the next subset of images for labeling is selected by an algorithm. This process of algorithmic image selection and manual labeling continues in an iterative fashion. The algorithm used for the image selection is a deep learning algorithm, based on the U-shaped architecture, which quantifies the presence of unseen data in each image in order to find images that contain the most novel examples. Moreover, the location of the unseen data in each image is highlighted, aiding the labeler in spotting these examples. Experiments performed using semiconductor wafer data show that labeling a subset of the data, curated by this algorithm, resulted in a model with a better performance than a model produced from sequentially labeling the same amount of data. Also, similar performance is achieved compared to a model trained on exhaustive labeling of the whole dataset. Overall, the proposed approach results in a dataset that has a diverse set of examples per class as well as more balanced classes, which proves beneficial when training a deep learning model.

Keywords: Computer vision, deep learning, object detection, semiconductor.

Downloads 791

16 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang

Abstract:

The Internet has grown into a powerful medium for information dispersion and social interaction, leading to the rapid growth of social media, which allows users to easily post their emotions and perspectives on certain topics online. Our research aims to use natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates a tree kernel with a linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation, in order to analyze the syntactic and content information in tweets. The experimental results demonstrate that our method can effectively detect the public emotion of tweets while outperforming the other compared methods.

Keywords: Public emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining.
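
A minimal sketch of how a tree kernel and a linear kernel can be combined into one composite kernel. The abstract does not say how the two kernels are integrated; the weighted sum with mixing weight `alpha`, the simplified tree kernel (which only counts shared production rules), and the nested-tuple tree encoding are all assumptions chosen for illustration. A weighted sum of two valid kernels with non-negative weights is itself a valid kernel.

```python
from collections import Counter

def productions(tree):
    """Yield (label, child labels) for every node of a nested-tuple tree.

    A tree is (label, child, child, ...) where each child is either
    another tuple (a subtree) or a string (a word).
    """
    label, *children = tree
    if children:
        yield (label, tuple(c[0] if isinstance(c, tuple) else c for c in children))
    for child in children:
        if isinstance(child, tuple):
            yield from productions(child)

def tree_kernel(t1, t2):
    """Simplified stand-in for a tree kernel: count matching production rules."""
    c1, c2 = Counter(productions(t1)), Counter(productions(t2))
    return float(sum(c1[p] * c2[p] for p in c1))

def linear_kernel(x, y):
    """Dot product of two feature vectors (e.g. emotion keyword features)."""
    return float(sum(a * b for a, b in zip(x, y)))

def composite_kernel(a, b, alpha=0.5):
    """Assumed combination: alpha * tree kernel + (1 - alpha) * linear kernel."""
    (tree_a, vec_a), (tree_b, vec_b) = a, b
    return alpha * tree_kernel(tree_a, tree_b) + (1 - alpha) * linear_kernel(vec_a, vec_b)
```

Each example is a `(parse_tree, feature_vector)` pair, so a Gram matrix built from `composite_kernel` could be passed to any classifier that accepts a precomputed kernel.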

Downloads 746

15 Effectiveness of Dominant Color Descriptor Technique in Medical Image Retrieval Application

Authors: Mohd Kamir Yusof

Abstract:

This paper presents a dominant color descriptor technique for medical image retrieval. The medical image system collects medical images and stores them in a medical database. The purpose of the dominant color descriptor (DCD) technique is to retrieve medical images and to display images similar to a queried image. First, the technique searches for and retrieves a medical image based on a keyword entered by the user. Once an image is found, the system assigns it as the queried image. The DCD technique then calculates the dominant color value of the image. Next, the system searches for and retrieves medical images again, based on the dominant color value of the queried image. Finally, the system displays images similar to the queried image to the user. A simple application has been developed and tested using the dominant color descriptor. The experimental results indicate that this technique is effective and can be used for medical image retrieval.

Keywords: Medical Image Retrieval, Dominant Color Descriptor.
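
A minimal sketch of the second retrieval step the abstract describes: compute a dominant color value for the queried image, then rank stored images by how close their dominant color is. The quantization into four levels per channel, the use of a single dominant color, the Euclidean distance, and the function names are assumptions for illustration, not the paper's implementation. Images are represented as plain lists of RGB tuples.

```python
import math
from collections import Counter

def dominant_color(pixels, levels=4):
    """Quantize RGB pixels into `levels` bins per channel and return
    the center of the most frequent bin as the dominant color."""
    step = 256 // levels
    bins = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    (qr, qg, qb), _ = bins.most_common(1)[0]
    half = step // 2
    return (qr * step + half, qg * step + half, qb * step + half)

def rank_by_dominant_color(query_pixels, database):
    """Return image names sorted from most to least similar dominant color.

    `database` maps a (hypothetical) image name to its list of RGB pixels.
    """
    query_color = dominant_color(query_pixels)
    return sorted(
        database,
        key=lambda name: math.dist(query_color, dominant_color(database[name])),
    )
```

In a full system, the dominant color of each stored image would be computed once at insertion time and stored alongside the image rather than recomputed for every query.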

Downloads 1726

14 Measuring Text-Based Semantics Relatedness Using WordNet

Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed

Abstract:

Measuring semantic similarity between texts means calculating the semantic relatedness between them using various techniques. Our web application (Measuring Relatedness of Concepts, MRC) allows the user to input two text corpora and obtain the semantic similarity percentage between them using WordNet. The application goes through five stages to compute semantic relatedness: Preprocessing (extracts keywords from the content), Feature Extraction (classifies words into parts of speech), Synonyms Extraction (retrieves synonyms for each keyword), Measuring Similarity (measures similarity using the keywords and synonyms) and Visualization (graphical representation of the similarity measure). Hence, the user can also measure similarity on the basis of features. The end result is a percentage score and the word(s) that form the basis of the similarity between the two texts, obtained using different tools on the same platform. In future work, we look forward to a Web-as-a-live-corpus application that provides a simpler and more user-friendly tool to compare documents and extract useful information.

Keywords: GraphViz representation, semantic relatedness, similarity measurement, WordNet similarity.
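
A minimal sketch of three of the stages listed in the abstract: preprocessing, synonym extraction, and similarity measurement. To keep it self-contained, a small hand-written synonym table (`TOY_SYNONYMS`, hypothetical) stands in for WordNet, and the part-of-speech and visualization stages are omitted. The scoring formula (share of keywords that match directly or through a synonym) is an assumption, since the abstract does not give the exact measure.

```python
import re

# Hypothetical stopword list and synonym table; a real system would query WordNet.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "in", "to"}
TOY_SYNONYMS = {
    "car": {"automobile"},
    "automobile": {"car"},
    "fast": {"quick"},
    "quick": {"fast"},
}

def extract_keywords(text):
    """Preprocessing: lowercase, keep alphabetic words, drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def expand(keywords):
    """Synonyms extraction: add the known synonyms of every keyword."""
    expanded = set(keywords)
    for word in keywords:
        expanded |= TOY_SYNONYMS.get(word, set())
    return expanded

def similarity_percentage(text_a, text_b):
    """Measuring similarity: return (percentage, words that produced the match)."""
    ka, kb = extract_keywords(text_a), extract_keywords(text_b)
    if not ka or not kb:
        return 0.0, set()
    expanded_a, expanded_b = expand(ka), expand(kb)
    matched_a = {w for w in ka if w in expanded_b}
    matched_b = {w for w in kb if w in expanded_a}
    score = 100.0 * (len(matched_a) + len(matched_b)) / (len(ka) + len(kb))
    return score, matched_a | matched_b
```

The function returns both the percentage score and the matching words, mirroring the "percentage score and the word(s)" output the abstract mentions.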

Downloads 808

13 Wireless Building Monitoring and Control System

Authors: J.-P. Skön, M. Johansson, O. Kauhanen, M. Raatikainen, K. Leiviskä, M. Kolehmainen

Abstract:

The building sector is the largest energy consumer and CO2 emitter in the European Union (EU); the active reduction of energy consumption and the elimination of energy wastage are therefore among its main goals. Healthy housing and energy efficiency are affected by many factors, which pose challenges to the monitoring, control and research of indoor air quality (IAQ) and energy consumption, especially in old buildings. These challenges include measurement and equipment costs, for example. Additionally, the measurement results are difficult to interpret, and their use in ventilation control is limited when the energy efficiency of housing is taken into account at the same time. The main goal of this study is to develop a cost-effective building monitoring and control system, especially for old buildings. The starting point or keyword of the development process is a wireless system; otherwise the installation costs become too high. As the main result, this paper describes an idea of a wireless building monitoring and control system. The first prototype of the system has been installed in 10 residential buildings and in 10 school buildings located in the City of Kuopio, Finland.

Keywords: Energy efficiency, Indoor air quality, Monitoring system, Building automation

Downloads 1782

12 Lexical Database for Multiple Languages: Multilingual Word Semantic Network

Authors: K. K. Yong, R. Mahmud, C. S. Woo

Abstract:

Data mining and knowledge engineering have become tough tasks due to the large amounts of data now available on the web. The validity and reliability of data have also become a major point of debate in knowledge acquisition. In addition, acquiring knowledge from different languages has become another concern. Many language translators and corpora have been developed, but their function is usually limited to certain languages and domains. Furthermore, search results from engines with the traditional 'keyword' approach are no longer satisfying. More intelligent knowledge engineering agents are needed. To address these problems, a system known as Multilingual Word Semantic Network is proposed. This system adapts a semantic network to organize words according to concepts and relations. The system also adopts open source as its development philosophy, enabling native language speakers and experts to contribute their knowledge to the system. The contributed words are then defined and linked using lexical and semantic relations. Thus, related words and derivatives can be identified and linked. The outcome of the system implementation contributes to the development of the semantic web and knowledge engineering.

Keywords: Multilingual, semantic network, intelligent knowledge engineering.

Downloads 1936

11 Categorizing Search Result Records Using Word Sense Disambiguation

Authors: R. Babisaraswathi, N. Shanthi, S. S. Kiruthika

Abstract:

Web search engines are designed to retrieve and extract information from web databases and to return dynamic web pages. The Semantic Web is an extension of the current web that includes semantic content in web pages. The main goal of the semantic web is to improve the quality of the current web by changing its contents into a machine-understandable form. Therefore, the milestone of the semantic web is to have semantic-level information in the web. Nowadays, people use different keyword-based search engines to find the relevant information they need from the web. However, many words are polysemous. When these words are used to query a search engine, it displays Search Result Records (SRRs) with different meanings. The SRRs with similar meanings are grouped together based on Word Sense Disambiguation (WSD). In addition, semantic annotation is performed to improve the efficiency of search result records. Semantic annotation is the process of adding semantic metadata to web resources. Thus, the grouped SRRs are annotated, and a summary describing the information in the SRRs is generated. However, automatic semantic annotation is a significant challenge in the semantic web. Here, ontology- and knowledge-based representations are used to annotate the web pages.

Keywords: Ontology, Semantic Web, WordNet, Word Sense Disambiguation.
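
A minimal sketch of grouping search result records by the sense of a polysemous query word. It uses a Lesk-style word overlap against a small hand-written sense inventory (`TOY_SENSES`, hypothetical) in place of WordNet or an ontology; the abstract does not specify the disambiguation algorithm, so this is only one plausible illustration of the idea.

```python
import re
from collections import defaultdict

# Hypothetical sense inventory: each sense lists words that signal it.
TOY_SENSES = {
    "jaguar": {
        "animal": {"cat", "wild", "jungle", "species", "predator"},
        "car": {"vehicle", "engine", "luxury", "dealer", "model"},
    }
}

def disambiguate(query, snippet):
    """Pick the sense whose signal words overlap most with the snippet."""
    words = set(re.findall(r"[a-z]+", snippet.lower()))
    senses = TOY_SENSES[query]
    best = max(senses, key=lambda sense: len(senses[sense] & words))
    # No overlap at all means the snippet gives no evidence for any sense.
    return best if senses[best] & words else "unknown"

def group_results(query, snippets):
    """Group SRR snippets under the sense assigned to each of them."""
    groups = defaultdict(list)
    for snippet in snippets:
        groups[disambiguate(query, snippet)].append(snippet)
    return dict(groups)
```

Each resulting group could then be annotated and summarized separately, as the abstract describes for the grouped SRRs.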

Downloads 1744