Search results for: HTML documents

210 Semantic Enhanced Social Media Sentiments for Stock Market Prediction

Authors: K. Nirmala Devi, V. Murali Bhaskaran

Abstract:

Traditional document representation for classification follows the Bag of Words (BoW) approach to represent term weights. The conventional method uses the Vector Space Model (VSM) to exploit the statistical information of terms in the documents, but it fails to address the semantic information as well as the order of the terms present in the documents. The phrase-based approach follows the order of the terms present in the documents but not the semantics behind the words. Therefore, a semantic concept based approach is used in this paper to enhance the semantics by incorporating ontology information. In this paper, a novel method is proposed to forecast the intraday directional movement of stock market prices based on sentiments from Twitter and money control news articles. Stock market forecasting is a very difficult and highly complicated task because it is affected by many factors, such as economic conditions, political events and investors' sentiment. Stock market series are generally dynamic, nonparametric, noisy and chaotic by nature. Sentiment analysis, along with the wisdom of crowds, can automatically compute the collective intelligence of future performance in many areas, such as the stock market, box office sales and election outcomes. The proposed method utilizes collective sentiments about the stock market to predict stock price directional movements. According to the Granger causality test, the collective sentiments in the above social media have strong predictive power for the direction (up/down) of stock price movements.
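The limitation of the Bag of Words representation described in this abstract can be illustrated with a minimal sketch. The code below is not the authors' method; the function names `bow_vector` and `cosine` and the two sample sentences are assumptions made for illustration. It shows that two texts with the same words in a different order receive identical term-count vectors.

```python
from collections import Counter
import math


def bow_vector(text):
    """Map a text to raw term counts; word order is discarded."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample sentences: same words, different order.
d1 = bow_vector("stock price rises after good news")
d2 = bow_vector("good news after stock price rises")
similarity = cosine(d1, d2)
```

Because the two vectors are equal, the similarity is 1 up to floating-point rounding, even though the sentences are ordered differently. This is the kind of information a semantic or phrase-aware representation would need to add.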

Keywords: Bag of Words, Collective Sentiments, Ontology, Semantic relations, Sentiments, Social media, Stock Prediction, Twitter, Vector Space Model, wisdom of crowds.

Downloads 2750

209 Information Extraction from Unstructured and Ungrammatical Data Sources for Semantic Annotation

Authors: Quratulain N. Rajput, Sajjad Haider, Nasir Touheed

Abstract:

The internet has become an attractive avenue for global e-business, e-learning, knowledge sharing, etc. Due to the continuous increase in the volume of web content, it is not practically possible for a user to extract information by browsing and integrating data from the huge number of web sources retrieved by existing search engines. Semantic web technology enables advances in information extraction by providing a suite of tools to integrate data from different sources. To take full advantage of the semantic web, it is necessary to annotate existing web pages into semantic web pages. This research develops a tool, named OWIE (Ontology-based Web Information Extraction), for semantic web annotation using domain-specific ontologies. The tool automatically extracts information from HTML pages with the help of pre-defined ontologies and gives it a semantic representation. Two case studies have been conducted to analyze the accuracy of OWIE.
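The general idea of ontology-guided extraction from HTML can be sketched as follows. This is not OWIE itself, whose internals the abstract does not describe; the tiny `ONTOLOGY` dictionary, the `annotate` function and the sample page are hypothetical. The sketch uses only Python's standard `html.parser` module to collect text fragments and label them with a concept when one of that concept's terms appears.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the non-empty text fragments of an HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical stand-in for a domain ontology: concept -> trigger terms.
ONTOLOGY = {
    "Price": ["price", "cost"],
    "Location": ["karachi", "lahore"],
}


def annotate(html):
    """Return (concept, text) pairs for fragments that mention a concept term."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    annotations = []
    for chunk in parser.chunks:
        lowered = chunk.lower()
        for concept, terms in ONTOLOGY.items():
            if any(term in lowered for term in terms):
                annotations.append((concept, chunk))
    return annotations


sample = "<ul><li>Price: 500</li><li>City: Lahore</li></ul>"
result = annotate(sample)
```

A real ontology would carry relations and constraints rather than a flat term list; the dictionary here only stands in for the lookup step.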

Keywords: Ontology, Semantic Annotation, Wrapper, Information Extraction.

Downloads 2070

208 Theoretical Literature Review on Lack of Cardiorespiratory Fitness and Its Effects on Children

Authors: E. Abdi

Abstract:

The purpose of this theoretical literature review is to study the relevant academic literature on lack of cardiorespiratory fitness and its effects on children. A total of thirty-eight relevant documents were identified and considered for this review, of which nineteen were original research articles published in peer-reviewed journals. The other nineteen were statistical documents. This literature review is structured to examine five effects of deficient cardiorespiratory fitness in school-aged children: (A) Relative Age Effect (RAE), (B) Obesity, (C) Inadequate fitness level, (D) Unhealthy lifestyle, and (E) Academics. The categories provide a theoretical framework for future studies where results are derived from the literature review. The study discusses how regular physical fitness helps children and adolescents develop healthy physical activity behaviors that can be sustained throughout adult life. The conclusion suggests that advocacy for increasing physical activity and decreasing sedentary behaviors at school and home is necessary.

Keywords: Cardiorespiratory, endurance, physical activity, physical fitness.

Downloads 2184

207 Concept Indexing using Ontology and Supervised Machine Learning

Authors: Rossitza M. Setchi, Qiao Tang

Abstract:

Nowadays, ontologies are the only widely accepted paradigm for the management of sharable and reusable knowledge in a way that allows its automatic interpretation. They are collaboratively created across the Web and used to index, search and annotate documents. The vast majority of the ontology based approaches, however, focus on indexing texts at document level. Recently, with the advances in ontological engineering, it became clear that information indexing can largely benefit from the use of general purpose ontologies which aid the indexing of documents at word level. This paper presents a concept indexing algorithm, which adds ontology information to words and phrases and allows full text to be searched, browsed and analyzed at different levels of abstraction. This algorithm uses a general purpose ontology, OntoRo, and an ontologically tagged corpus, OntoCorp, both developed for the purpose of this research. OntoRo and OntoCorp are used in a two-stage supervised machine learning process aimed at generating ontology tagging rules. The first experimental tests show a tagging accuracy of 78.91% which is encouraging in terms of the further improvement of the algorithm.

Keywords: Concepts, indexing, machine learning, ontology, tagging.

Downloads 1634

206 The Video Database for Teaching and Learning in Football Refereeing

Authors: M. Armenteros, A. Domínguez, M. Fernández, A. J. Benítez

Abstract:

The following paper describes the video database tool used by the Fédération Internationale de Football Association (FIFA) as part of the research project developed in collaboration with the Carlos III University of Madrid. The database project began in 2012, with the aim of creating an educational tool for the training of instructors, referees and assistant referees, and it has been used in all FUTURO III courses since 2013. The platform now contains 3,135 video clips of different match situations from FIFA competitions. It has 1,835 users (FIFA instructors, referees and assistant referees). In this work, the main features of the database are described, such as the use of a search tool and the creation of multimedia presentations and video quizzes. The database has been developed in MySQL, ActionScript, Ruby on Rails and HTML. This tool has been rated by users as "very good" in all courses, which prompts us to introduce it as an ideal tool for any other sport that requires the use of video analysis.

Keywords: Video database, FIFA, refereeing, e-learning.

Downloads 1275

205 A Recognition Method of Ancient Yi Script Based on Deep Learning

Authors: Shanxiong Chen, Xu Han, Xiaolong Wang, Hui Ma

Abstract:

The Yi are an ethnic group living mainly in mainland China, with their own spoken and written language systems developed over thousands of years. Ancient Yi is one of the six ancient languages in the world, which keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and dissemination convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers fulfilled the compression of the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests conducted show that the method has achieved higher precision compared with the traditional CNN model for handwriting recognition of the ancient Yi.

Keywords: Recognition, CNN, convolutional neural network, Yi character, divergence.

Downloads 683

204 Moving towards a General Definition of Public Happiness: A Grounded Theory Approach to the Recent Academic Research on Well-Being

Authors: Cristina Sanchez-Sanchez

Abstract:

Although there seems to be a growing interest in the study of citizens' happiness as an alternative measure of a country's progress to GDP, happiness as a public concern is still an ambiguous concept that is hard to define. Moreover, different notions are used indiscriminately to talk about the same thing. This investigation aims to determine the conceptions of happiness, well-being and quality of life that originate from the indexes that different governments and public institutions around the world have created to study them. Through the Scoping Review method, this study identifies the recent academic research in this field (a total of 267 documents between 2006 and 2016) from some of the most popular social sciences databases around the world (Web of Science, Scopus, JSTOR, Sage, EBSCO, IBSS and Google Scholar) and in Spain (ISOC and Dialnet). These 267 documents referenced 53 different indexes and studies. The Grounded Theory method has been applied to a sample of 13 indexes in order to identify the main categories they use to determine these three concepts. The results show that these are multi-dimensional concepts and that similar indicators are used interchangeably to measure happiness, well-being and quality of life.

Keywords: Grounded theory, happiness, happiness index, quality of life, scoping review, well-being.

Downloads 925

203 An Intelligent System for Phish Detection, using Dynamic Analysis and Template Matching

Authors: Chinmay Soman, Hrishikesh Pathak, Vishal Shah, Aniket Padhye, Amey Inamdar

Abstract:

Phishing, or the stealing of sensitive information on the web, has dealt a major blow to Internet security in recent times. Most of the existing anti-phishing solutions fail to handle the fuzziness involved in phish detection, thus leading to a large number of false positives. This fuzziness is attributed to the use of HTML, a language that is highly flexible and, at the same time, highly ambiguous. We introduce a new perspective against phishing that tries to systematically prove whether a given page is phished or not, using the corresponding original page as the basis of the comparison. It analyzes the layout of the pages under consideration to determine the percentage distortion between them, indicative of any form of malicious alteration. The system design represents an intelligent system, employing dynamic assessment, which accurately identifies brand new phishing attacks and will prove effective in reducing the number of false positives. This framework could potentially be used as a knowledge base for educating internet users against phishing.
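One simple way to picture a "percentage distortion" between the layouts of two pages is to compare their sequences of HTML tags. The sketch below is an assumption-laden illustration, not the template-matching system from the paper: the `distortion` measure (one minus the `difflib.SequenceMatcher` ratio, scaled to a percentage) and the sample pages are hypothetical choices.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Record the order of opening tags as a rough proxy for page layout."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def layout(html):
    parser = TagSequence()
    parser.feed(html)
    parser.close()
    return parser.tags


def distortion(original_html, suspect_html):
    """Hypothetical measure: percentage difference between two tag sequences."""
    ratio = SequenceMatcher(None, layout(original_html), layout(suspect_html)).ratio()
    return 100.0 * (1.0 - ratio)


# Hypothetical pages: the suspect page adds one extra input field.
original = "<form><input><input><button></button></form>"
suspect = "<form><input><input><input><button></button></form>"
score = distortion(original, suspect)
```

Identical layouts give a distortion of 0, and any structural change gives a positive value. A production system would also need to consider styling, rendered geometry and scripts, which this sketch ignores.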

Keywords: World Wide Web, Phishing, Internet security, data mining.

Downloads 1784

202 A Methodology for Investigating Public Opinion Using Multilevel Text Analysis

Authors: William Xiu Shun Wong, Myungsu Lim, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, many users have begun to frequently share their opinions on diverse issues using various social media. Therefore, numerous governments have attempted to establish or improve national policies according to the public opinions captured from various social media. In this paper, we indicate several limitations of the traditional approaches to analyzing public opinion on science and technology and provide an alternative methodology to overcome these limitations. First, we distinguish between the science and technology analysis phase and the social issue analysis phase to reflect the fact that public opinion can be formed only when a certain science and technology is applied to a specific social issue. Next, we successively apply a start list and a stop list to acquire clarified and interesting results. Finally, to identify the most appropriate documents that fit a given subject, we develop a new logical filter concept that consists of not only mere keywords but also a logical relationship among the keywords. This study then analyzes the possibilities for the practical use of the proposed methodology through its application to discover core issues and public opinions from 1,700,886 documents comprising SNS, blogs, news, and discussions.
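The three filtering steps mentioned in this abstract (start list, stop list, logical filter) can be sketched as below. The abstract does not give the actual lists or rules, so `START_LIST`, `STOP_LIST`, the rule inside `logical_filter` and the sample documents are all hypothetical; the sketch only assumes that a start list keeps allowed terms, a stop list drops unwanted ones, and a logical filter combines keywords with AND, OR and NOT.

```python
# Hypothetical term lists; the paper's actual lists are not given in the abstract.
START_LIST = {"drone", "uav", "privacy", "delivery", "toy"}
STOP_LIST = {"toy"}


def tokens(text):
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split()}


def apply_lists(words):
    """Keep only start-list terms, then drop stop-list terms."""
    return (words & START_LIST) - STOP_LIST


def logical_filter(words):
    """Hypothetical rule: (drone OR uav) AND privacy AND NOT delivery."""
    has_subject = "drone" in words or "uav" in words
    return has_subject and "privacy" in words and "delivery" not in words


docs = [
    "Drone cameras raise privacy concerns.",
    "Drone delivery and privacy rules.",
    "My new toy drone is fun.",
]
selected = [d for d in docs if logical_filter(apply_lists(tokens(d)))]
```

Only the first document passes: the second is excluded by the NOT clause and the third never mentions the required keyword. A keyword-only filter on "drone" would have kept all three.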

Keywords: Big data, social network analysis, text mining, topic modeling.

Downloads 1621

201 Organization Model of Semantic Document Repository and Search Techniques for Studying Information Technology

Authors: Nhon Do, Thuong Huynh, An Pham

Abstract:

Nowadays, organizing a repository of documents and resources for learning in a special field such as Information Technology (IT), together with search techniques based on domain knowledge or document content, is an urgent need in the practice of teaching, learning and researching. There have been several works related to methods of organization and search by content. However, the results are still limited and insufficient to meet user demand for semantic document retrieval. This paper presents a solution for the organization of a repository that supports semantic representation and processing in search. The proposed solution is a model which integrates components such as an ontology describing domain knowledge, a database of the document repository, semantic representation for documents and a file system, together with problems, semantic processing techniques and advanced search techniques based on measuring semantic similarity. The solution is applied to build an IT learning materials management system for a university, with a semantic search function serving students, teachers and managers. The application has been implemented and tested at the University of Information Technology, Ho Chi Minh City, Vietnam, and has achieved good results.

Keywords: document retrieval system, knowledge representation, document representation, semantic search, ontology.

Downloads 1665

200 Exploring the Narrative Communication: Representing Visual Information from Digital Travel Stories

Authors: Rocío Abascal-Mena, Erick López-Ornelas

Abstract:

We present the results of a case study aiming to assess the reflection of the tourism community on the Web and its usability in order to propose new ways to communicate visually. The wealth of information contained on the Web and the ease of communicating personal points of view make the social web a new space for exploration. In this way, the social web allows the sharing of information between communities with similar interests. However, the tourism community remains unexplored, as is the case with the information covered in travel stories. Across the Web, we find multiple sites allowing users to communicate their experiences and personal points of view of a particular place in the world. This cultural heritage is found in multiple documents, usually with very few accompanying photos, so they are difficult to explore due to the lack of visual information. This paper explores the possibility of analyzing travel stories to display them visually on maps and generate new knowledge such as patterns of travel routes. In this way, travel narratives published in electronic formats can be very important, especially to the tourism community, because of the great amount of knowledge that can be extracted. Our approach is based on the use of a Geoparsing Web Service to extract geographic coordinates from travel narratives in order to draw the geo-positions and link the documents to a map image.

Keywords: Social web, tourism community, visual communication, travel stories, geo references.

Downloads 1600

199 A study of Cancer-related MicroRNAs through Expression Data and Literature Search

Authors: Chien-Hung Huang, Chia-Wei Weng, Chang-Chih Chiang, Shih-Hua Wu, Chih-Hsien Huang, Ka-Lok Ng

Abstract:

MicroRNAs (miRNAs) are a class of non-coding RNAs that hybridize to mRNAs and induce either translation repression or mRNA cleavage. Recently, it has been reported that miRNAs could possibly play an important role in human diseases. By integrating miRNA target genes, cancer genes, miRNA and mRNA expression profiles information, a database is developed to link miRNAs to cancer target genes. The database provides experimentally verified human miRNA target genes information, including oncogenes and tumor suppressor genes. In addition, fragile sites information for miRNAs, and the strength of the correlation of miRNA and its target mRNA expression level for nine tissue types are computed, which serve as an indicator for suggesting miRNAs could play a role in human cancer. The database is freely accessible at http://ppi.bioinfo.asia.edu.tw/mirna_target/index.html.

Keywords: MicroRNA, miRNA expression profile, mRNA expression profile, cancer genes, oncogene, tumor suppressor gene

Downloads 1478

198 Exploring Social Impact of Emerging Technologies from Futuristic Data

Authors: Heeyeul Kwon, Yongtae Park

Abstract:

Despite the highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely incorporated in such circumstances. However, the existing literature faces a major limitation in its sole reliance on participatory workshop activities. It unfortunately overlooks the availability of a massive untapped data source of futuristic information flooding through the Internet. This research thus seeks to gain insights into the utilization of futuristic data, that is, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.

Keywords: Emerging technologies, futuristic data, scenario, text mining.

Downloads 2344

197 A Weighted-Profiling Using an Ontology Base for Semantic-Based Search

Authors: Hikmat A. M. Abd-El-Jaber, Tengku M. T. Sembok

Abstract:

The information on the Web increases tremendously. A number of search engines have been developed for searching Web information and retrieving relevant documents that satisfy inquirers' needs. Search engines return irrelevant documents among search results, since the search is text-based rather than semantic-based. The information retrieval research area has presented a number of approaches and methodologies, such as profiling, feedback, query modification and human-computer interaction, for improving search results. Moreover, information retrieval has employed artificial intelligence techniques and strategies, such as machine learning heuristics, tuning mechanisms, user and system vocabularies and logical theory, for capturing users' preferences and using them to guide the search based on semantic rather than syntactic analysis. Although a valuable improvement has been recorded in search results, the survey has shown that search engine users are still not really satisfied with their search results. Using ontologies for semantic-based searching is likely the key solution. Adopting a profiling approach and using ontology base characteristics, this work proposes a strategy for finding the exact meaning of the query terms in order to retrieve relevant information according to user needs. The evaluation of the conducted experiments has shown the effectiveness of the suggested methodology, and a conclusion is presented.

Keywords: information retrieval, user profiles, semantic Web, ontology, search engine.

Downloads 3167

196 Semantic Mobility Channel (SMC): Ubiquitous and Mobile Computing Meets the Semantic Web

Authors: José M. Cantera, Miguel Jiménez, Genoveva López, Javier Soriano

Abstract:

With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation is exploited for either an individual element or a set of consecutive elements in a Web document and results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced components platform, called SMC, enabling the development of mobility applications and services according to a channel model based on the principles of Services Oriented Architecture (SOA). It then goes on to describe the potential for integration with the Semantic Web through a novel framework of external semantic annotation that prescribes a scheme for representing semantic markup files and a way of associating Web documents with these external annotations. The role of semantic annotation in this framework is to describe the contents of individual documents themselves, assuring the preservation of the semantics during the process of adapting content rendering. Semantic Web content adaptation is a way of adding value to Web contents and facilitates repurposing of Web contents (enhanced browsing, Web Services location and access, etc).

Keywords: Semantic web, ubiquitous and mobile computing, web content transcoding, semantic mark-up, mobile computing, middleware and services.

Downloads 1761

195 A Bibliometric Assessment on Sustainability and Clustering

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner, David Gabriel F. de Barros

Abstract:

Review studies are useful for analyzing research problems. Among the types of review documents, we commonly find bibliometric studies. This type of study often helps the global visualization of a research problem and helps academics worldwide to understand the context of a research area better. In this document, a bibliometric view surrounding clustering techniques and sustainability problems is presented. The authors aimed to identify which issues most often use clustering techniques and which sustainability issue is the most impactful in current research. During the bibliometric analysis, we found 10 different groups of research in clustering applications for sustainability issues: Energy; Environmental; Non-urban Planning; Sustainable Development; Sustainable Supply Chain; Transport; Urban Planning; Water; Waste Disposal; and Others. Moreover, by analyzing the citations of each group, it was discovered that the Environmental group could be classified as the most impactful research cluster in the area mentioned. After the content analysis of each paper classified in the Environmental group, it was found that the k-means technique is preferred for solving sustainability problems with clustering methods, since it appeared the most often among the documents. The authors finally conclude that a bibliometric assessment could help indicate a gap in research on waste disposal, which was the group with the fewest publications, and the most impactful research on environmental problems.

Keywords: Bibliometric assessment, clustering, sustainability, territorial partitioning.

Downloads 319

194 Management Software for the Elaboration of an Electronic File in the Pharmaceutical Industry Following Mexican Regulations

Authors: M. Peña Aguilar Juan, Ríos Hernández Ezequiel, R. Valencia Luis

Abstract:

For the certification of certain goods of public interest, such as medicines and food, the preparation and delivery of a dossier is required. Its elaboration requires legal and administrative knowledge, as well as organization of the documents of the process and an order that allows the file to be verified. Therefore, a virtual platform was developed to support the process of management and elaboration of the dossier, providing accessibility to the information and interfaces that allow the user to know the status of projects. The development of the dossier system on the cloud allows the inclusion of the technical requirements for software management, including the validation and the manufacturing in the field industry. The platform guides and facilitates the dossier elaboration (report, file or history), considering Mexican legislation and regulations; it also has auxiliary tools for its management. This technological alternative provides organizational support for documents and accessibility to the information required for the successful development of a dossier. The platform is divided into the following modules: system control, catalog, dossier and enterprise management. The modules are designed according to the structure required in a dossier in those areas. However, the structure allows for flexibility, as its goal is to become a tool that facilitates and does not obstruct processes. The architecture and development of the software allow flexibility for future expansion to other fields; this would imply feeding the system with new regulations.

Keywords: Electronic dossier, technologies for management, web software, dossier elaboration, pharmaceutical industry.

Downloads 1157

193 Opinion Mining Framework in the Education Domain

Authors: A. M. H. Elyasir, K. S. M. Anbananthen

Abstract:

The internet is growing larger and becoming the most popular platform for people to share their opinions on different interests. We choose the education domain, specifically comparing some Malaysian universities against each other. This comparison produces a benchmark based on different criteria shared by online users in various online resources, including Twitter, Facebook and web pages. The comparison is accomplished using an opinion mining framework to extract and process the unstructured text and classify the result as positive, negative or neutral (polarity). Hence, we divide our framework into three main stages: opinion collection (extraction), unstructured text processing and polarity classification. The extraction stage includes web crawling, HTML parsing, sentence segmentation for punctuation classification, and Part of Speech (POS) tagging. The second stage processes the unstructured text with stemming and stop word removal and finally prepares the raw text for classification using Named Entity Recognition (NER). The last phase classifies the polarity and presents the overall result for the comparison among the Malaysian universities. The final result is useful for those who are interested in studying in Malaysia, as our final output declares clear winners based on public opinions from all over the web.
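The text-processing and polarity stages described in this abstract can be sketched in a few lines. This is a deliberately crude illustration, not the authors' framework: the stop word list, the suffix-stripping `stem` function, the tiny sentiment lexicons and the sample sentences are all hypothetical, and crawling, POS tagging and NER are left out.

```python
# Hypothetical resources; a real system would use much larger lists.
STOP_WORDS = {"the", "is", "are", "a", "and", "of", "in"}


def stem(word):
    """Very crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Lexicons are stored in stemmed form so they match the processed tokens.
POSITIVE = {stem(w) for w in ("good", "great", "excellent", "helpful")}
NEGATIVE = {stem(w) for w in ("bad", "poor", "boring")}


def classify(sentence):
    """Return 'positive', 'negative' or 'neutral' for one sentence."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    stems = [stem(w) for w in words if w and w not in STOP_WORDS]
    score = sum(s in POSITIVE for s in stems) - sum(s in NEGATIVE for s in stems)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


labels = [
    classify("The lectures are great and the staff is helpful."),
    classify("The library is boring and crowded."),
    classify("The campus is in the city."),
]
```

Counting lexicon hits is only one possible polarity rule; the abstract does not say which classifier the framework uses, so this choice is an assumption.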

Keywords: Entity Recognition, Education Domain, Opinion Mining, Unstructured Text.

Downloads 2926

192 CVOIP-FRU: Comprehensive VoIP Forensics Report Utility

Authors: Alejandro Villegas, Cihan Varol

Abstract:

Voice over Internet Protocol (VoIP) products are an emerging technology that can contain forensically important information about criminal activity. Even without the usernames and passwords, this forensically important information can still be gathered by investigators. Although there are a few VoIP forensic investigative applications available in the literature, most of them are designed specifically to collect evidence from the Skype product. Therefore, in order to assist law enforcement with collecting forensically important information from a variety of Betamax VoIP tools, the CVOIP-FRU framework was developed. CVOIP-FRU provides a data gathering solution that retrieves usernames, contact lists, as well as call and SMS logs from Betamax VoIP products. It is a scripting utility that searches for data within the registry, logs and the user roaming profiles in Windows and Mac OS X operating systems. Subsequently, it parses the output into readable text and HTML formats. One advantage of CVOIP-FRU over other applications is that, owing to its intelligent data filtering capabilities and cross-platform scripting back end, it is expandable to include other VoIP solutions as well. Overall, this paper reveals the exploratory analysis performed in order to find the key data paths and locations, the development stages of the framework, and the empirical testing and quality assurance of CVOIP-FRU.

Keywords: Betamax, digital forensics, report utility, VoIP, VoIP Buster, VoIPWise.

Downloads 3086

191 Highlighting Document's Structure

Authors: Sylvie Ratté, Wilfried Njomgue, Pierre-André Ménard

Abstract:

In this paper, we present symbolic recognition models to extract knowledge characterized by document structures. Focussing on the extraction and the meticulous exploitation of the semantic structure of documents, we obtain a meaningful contextual tagging corresponding to different unit types (title, chapter, section, enumeration, etc.).

Keywords: Information retrieval, document structures, symbolic grammars.

Downloads 1190

190 Information Filtering using Index Word Selection based on the Topics

Authors: Takeru YOKOI, Hidekazu YANAGIMOTO, Sigeru OMATU

Abstract:

We have proposed an information filtering system using index word selection from a document set based on the topics included in the set of documents. This method narrows down the particularly characteristic words in a document set, and the topics are obtained by Sparse Non-negative Matrix Factorization. In information filtering, a document is often represented by a vector whose elements correspond to the weights of the index words, and the dimension of the vector becomes larger as the number of documents increases. Therefore, words that are useless as index words for information filtering may be included. In order to address the problem, the dimension needs to be reduced. Our proposal reduces the dimension by selecting index words based on the topics included in a document set. We have applied Sparse Non-negative Matrix Factorization to the document set to obtain these topics. The filtering is carried out based on a centroid of the learning document set. The centroid is regarded as the user's interest. In addition, the centroid is represented by a document vector whose elements consist of the weights of the selected index words. Using the English test collection MEDLINE, we confirm the effectiveness of our proposal. Our proposed selection improves recommendation accuracy over previous methods when an appropriate number of index words is selected. In addition, we discuss the index words selected by our proposal and find that it was able to select index words covering some minor topics included in the document set.
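The centroid-based filtering step described in this abstract can be sketched as follows. The topic-based word selection via Sparse Non-negative Matrix Factorization is not reproduced here; the list `index_words` is simply assumed to be its output. The use of cosine similarity as the score, the function names and the toy vectors are also assumptions for illustration.

```python
import math


def restrict(vector, index_words):
    """Project a sparse document vector onto the selected index words."""
    return {w: vector.get(w, 0.0) for w in index_words}


def centroid(vectors, index_words):
    """Mean weight of each selected index word over the learning documents."""
    n = len(vectors)
    return {w: sum(v.get(w, 0.0) for v in vectors) / n for w in index_words}


def cosine(a, b):
    """Cosine similarity between two vectors over the same index words."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical selected index words and learning documents (term weights).
index_words = ["gene", "protein", "cancer"]
learning_set = [
    {"gene": 2.0, "cancer": 1.0, "the": 5.0},
    {"protein": 1.0, "cancer": 3.0},
]
profile = centroid(learning_set, index_words)  # stands in for the user's interest

relevant = restrict({"cancer": 2.0, "gene": 1.0}, index_words)
unrelated = restrict({"football": 4.0}, index_words)
score_relevant = cosine(profile, relevant)
score_unrelated = cosine(profile, unrelated)
```

Restricting vectors to the selected words is what keeps the dimension small: the frequent but uninformative term "the" never enters the profile, and a document sharing no index words with the profile scores zero.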

Keywords: Information Filtering, Sparse NMF, Index Word Selection, User Profile, Chi-squared Measure

Downloads 1409

189 Ethereum Based Smart Contracts for Trade and Finance

Authors: Rishabh Garg

Abstract:

Traditionally, business parties build trust with a centralized operating mechanism, such as payment by letter of credit. However, the increase in cyber-attacks and malicious hacking has jeopardized business operations and finance practices. Emerging markets, due to their high banking risks and the large presence of digital financing, are looking for technology that enables transparency and traceability of any transaction in trade, finance or supply chain management. Blockchain systems, in the absence of any central authority, enable transactions across the globe with the help of decentralized applications. DApps consist of a front-end, a blockchain back-end, and middleware, that is, the code that connects the two. The front-end can be a sophisticated web app or mobile app, which is used to implement the functions/methods on the smart contract. Web apps can employ technologies such as HTML, CSS, React and Express. In this wake, fintech and blockchain products are popping up in brokerages, digital wallets, exchanges, post-trade clearance, settlement, middleware, infrastructure and base protocols. The present paper provides a technology driven solution, financial inclusion and innovative working paradigm for business and finance.

Keywords: Authentication, blockchain, channel, cryptography, DApps, data portability, Decentralized Public Key Infrastructure, Ethereum, hash function, Hashgraph, Privilege creep, Proof of Work algorithm, revocation, storage variables, Zero Knowledge Proof.

Downloads 466

188 The Effects of T-Walls on Urban Landscape and Quality of Life and Anti-Terror Design Concept in Kabul, Afghanistan

Authors: Fakhrullah Sarwari, Hiroko Ono

Abstract:

Kabul city has suffered greatly during 40 years of conflict, including civil war and “the war on terror”. After the invasion of Afghanistan by the United States of America and its allies in 2001, the Taliban was removed from operational power, but the Taliban and other terrorist groups remained in remote areas of the country and began suicide attacks and bombings. To protect against these attacks, officials surrounded their office buildings and houses with concrete blast walls. These walls degrade the city's landscape and create traffic congestion. Our research comprises a questionnaire, a review of Kabul Municipality documents, and a literature review. Questionnaires were distributed to Kabul citizens to find out how people feel when they see the T-Walls on Kabul streets and what problems they face because of them. Documents of the Kabul Municipality's “T-Walls pull down commission” were reviewed to find out what caused the failure of this commission. A literature review was carried out to compare Kabul with Washington D.C. in terms of how the city was designed against the threat of terrorism without being turned into a lockdown. The urban happiness movement of Bogota, Colombia, is also reviewed and compared with Kabul. The findings reveal that citizens of Kabul want security, but not at the expense of the public realm and not by creating an architecture of fear. They also indicate that adding more T-Walls does not make people feel secure; instead, it increases terror and hatred and affects people’s optimism. Finally, a series of recommendations is made on the issue.

Keywords: Anti-terror design, Kabul, T-Walls, urban happiness.

Downloads 804

187 A Modification of Wireless and Internet Technologies for Logistics- Analysis

Authors: Apiwat Sangnoree

Abstract:

This research is designed to help users of WAP-based mobile phones analyze logistics in a traffic area by designing and applying the processes through which a mobile user accesses server databases. The design uses the MySQL 4.1.8-nt database system as the server, which holds three sub-databases: traffic light times at intersections in different periods of the day, road distances within the area blocks into which the main sample area is divided, and speeds of sample vehicles (motorcycle, personal car and truck) in different periods of the day. For the interconnection between server and user, PHP is used to calculate distances and travelling times from the starting point to the destination, while XHTML is used for receiving, sending and displaying data from PHP on the user's mobile phone. The main sample area is the Huakwang-Ratchada area of Bangkok, Thailand, which is usually a congested point, together with the surrounding 6.25 km² area, split into 25 blocks of 0.25 km² each. To simulate the results, the designed server database and all communication models of this research were uploaded to www.utccengineering.com/m4tg, and a mobile phone supporting a WAP 2.0 XHTML/HTML multimode browser was used to observe the values and displayed pictures. According to the simulated results, the user can view pictures of the route from the required starting point to the destination, along with the analyzed travel times for the sample vehicles in various periods of the day.

Keywords: WAP, logistics, XHTML, internet.

Downloads 1400

186 A New Version of Annotation Method with a XML-based Knowledge Base

Authors: Mohammad Yasrebi, Somayeh Khosravi

Abstract:

Machine-understandable data, when strongly interlinked, constitutes the basis for the Semantic Web. Annotating web documents is one of the major techniques for creating metadata on the Web. Annotating websites defines the data they contain in a form that is suitable for interpretation by machines. In this paper, we present an improved approach over the previous one [1] for annotating the texts of websites based on a knowledge base.

Keywords: Knowledge base, ontology, semantic annotation, XML.

Downloads 1528

185 Trends in Use of Millings in Pavement Maintenance

Authors: Rafiqul Tarefder, Mohiuddin Ahmad, Mohammad Hossain

Abstract:

While millings from old pavement surfaces can be an important component of cost-effective maintenance operations, their use in maintenance projects is neither uniform nor well documented. This study documents the different maintenance practices followed by four transportation districts of the New Mexico Department of Transportation (NMDOT) in an attempt to find out whether millings are being used in maintenance projects by those districts. Based on existing literature, a questionnaire was developed covering six common maintenance practices. NMDOT district personnel were interviewed face to face to discuss and answer that questionnaire. The interviews revealed that NMDOT districts mainly use chip seal and patching. Other maintenance procedures such as sand seal, scrub seal, slurry seal, and thin overlay have limited use. Two of the four participating districts do not have any documents on chip sealing; rather, they rely on the experience of the chip seal crew. All districts use polymer modified high float emulsion (HFE100P) for chip seal, with an application rate ranging from 0.4 to 0.56 gallons per square yard. The chip application rate varies from 15 to 40 lb per square yard. Statewide, the thickness of chip seal varies from 3/8'' to 1'' and its life varies from 3 to 10 years. NMDOT districts mainly use three types of patching: pothole, dig-out and blade patch. Pothole patches are used for small potholes and in emergencies, dig-out patches are used for all types of potholes, sometimes after pothole patching, and blade patches are used when a significant portion of the pavement is damaged. Pothole patches may last as little as three days, whereas blade patches last as long as 3 years. It was observed that all participating districts use millings in maintenance projects.

Keywords: Chip seal, sand seal, scrub seal, slurry seal, overlay, patching, millings.

Downloads 1951

184 Tourism Policy Challenges in Post-Soviet Georgia

Authors: Merab Khokhobaia

Abstract:

Within the framework of this research, the regulatory documents, which are in force in relation to this industry, were analyzed. The main attention is turned to their modernization and necessity of their compliance with European standards. It is a current issue to direct the efforts of state policy on support of business by implementing infrastructural projects, as well as by development of human resources, which may be possible by supporting the relevant higher and vocational studying-educational programs.

Keywords: Regional Development, Tourism Industry, Tourism Policy, Transition.

Downloads 2569

183 An Extensible Software Infrastructure for Computer Aided Custom Monitoring of Patients in Smart Homes

Authors: Ritwik Dutta, Marilyn Wolf

Abstract:

This paper describes the tradeoffs and the design from scratch of a self-contained, easy-to-use health dashboard software system that provides customizable data tracking for patients in smart homes. The system is made up of different software modules and comprises a front-end and a back-end component. Built with HTML, CSS, and JavaScript, the front-end allows adding users, logging into the system, selecting metrics, and specifying health goals. The backend consists of a NoSQL Mongo database, a Python script, and a SimpleHTTPServer written in Python. The database stores user profiles and health data in JSON format. The Python script makes use of the PyMongo driver library to query the database and displays formatted data as a daily snapshot of user health metrics against target goals. Any number of standard and custom metrics can be added to the system, and corresponding health data can be fed automatically, via sensor APIs or manually, as text or picture data files. A real-time METAR request API permits correlating weather data with patient health, and an advanced query system is implemented to allow trend analysis of selected health metrics over custom time intervals. Available on the GitHub repository system, the project is free to use for academic purposes of learning and experimenting, or practical purposes by building on it.

Keywords: Flask, Java, JavaScript, health monitoring, long term care, Mongo, Python, smart home, software engineering, webserver.

Downloads 2088

182 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret this information for different domains, e.g., product and service benchmarking, politics, and recommendation systems. This is why opinion detection is one of the most important research tasks. It consists of differentiating between opinion data and factual data. The difficulty of this task is to determine an approach that returns opinionated documents. Generally, two kinds of approaches are used for opinion detection: lexical based approaches and machine learning based approaches. In lexical based approaches, a dictionary of sentiment words is used, in which words are associated with weights. The opinion score of a document is derived from the occurrence of words from this dictionary. In machine learning approaches, a classifier is usually trained using a set of annotated documents containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. The majority of these works rely on the text of documents to determine the opinion score but do not take into account whether these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we develop a new way to compute the opinion score. We introduce the notion of a trust score. We determine not only which documents are opinionated but also whether these opinions are trustworthy information in relation to the topics. For that, we use the SentiWordNet lexicon to calculate opinion and trust scores, and we compute different features about users (the number of their comments, the number of their useful comments, and their average useful review). After that, we combine the opinion score and the trust score to obtain a final score. We applied our method to detect trustworthy opinions in the TRIPADVISOR collection. Our experimental results show that combining the opinion score and the trust score improves opinion detection.
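The idea of combining a lexicon-derived opinion score with a user-based trust score can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon stands in for SentiWordNet, the trust score is taken to be the fraction of a user's comments marked useful, and the linear combination with weight `alpha` is a hypothetical choice, since the abstract does not state how the two scores are combined.

```python
# Toy lexicon with hypothetical weights; a stand-in for SentiWordNet.
LEXICON = {"great": 0.8, "terrible": 0.9, "clean": 0.4, "dirty": 0.6}

def opinion_score(text):
    """Average lexicon weight per token; 0.0 for purely factual text."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(LEXICON.get(t, 0.0) for t in tokens) / len(tokens)

def trust_score(total_comments, useful_comments):
    """Assumed trust feature: share of the user's comments marked useful."""
    if total_comments == 0:
        return 0.0
    return useful_comments / total_comments

def final_score(text, total_comments, useful_comments, alpha=0.5):
    """Hypothetical linear combination of opinion and trust scores."""
    return (alpha * opinion_score(text)
            + (1 - alpha) * trust_score(total_comments, useful_comments))
```

With this formulation, the same review text receives a higher final score when it comes from a user whose comments are more often marked useful.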

Keywords: Tripadvisor, Opinion detection, SentiWordNet, trust score.

Downloads 691

181 An Semantic Algorithm for Text Categoritation

Authors: Xu Zhao

Abstract:

Text categorization techniques are widely used in many Information Retrieval (IR) applications. In this paper, we propose a simple but efficient method that automatically finds the relationship between any pair of terms and documents, and an indexing matrix is established for text categorization. We call this method the Indexing Matrix Categorization Machine (IMCM). Several experiments are conducted to show the efficiency and robustness of our algorithm.

Keywords: Text categorization, Sub-space learning, Latent Semantic Space

Downloads 1420