Search results for: document categorization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 307

217 Identification of Most Frequently Occurring Lexis in Winnings-announcing Unsolicited Bulk e-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

E-mail has become an important means of electronic communication, but its viability is marred by Unsolicited Bulk E-mail (UBE) messages. UBE comes in many types, including pornographic, virus-infected and 'cry-for-help' messages as well as fake and fraudulent offers of jobs, winnings and medicines. UBE poses technical and socio-economic challenges to the use of e-mail. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of nearly 3000 winnings-announcing UBE messages. Technically, this is an application of Text Parsing and Tokenization to unstructured textual documents, and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in winnings-announcing UBE documents. An analysis of the top 100 such lexis is also presented. We exhibit the relationship between the occurrence of a word from the identified lexis set in a given UBE and the probability that the UBE announces fake winnings. To the best of our knowledge and survey of related literature, this is the first formal attempt to identify the most frequently occurring lexis in winnings-announcing UBE by textual analysis. Finally, this is a sincere attempt to raise alertness against, and mitigate the threat of, such luring but fake UBE.
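
The bag-of-words frequency count described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the tokenizer regex, the function name `top_lexis`, and the three toy messages are all assumptions made for the example.

```python
import re
from collections import Counter


def top_lexis(documents, n=100):
    """Return the n most frequent lowercase word tokens across all documents."""
    counts = Counter()
    for text in documents:
        # Simplified tokenization: runs of letters and apostrophes only.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(n)


# Hypothetical toy messages, not data from the paper.
sample = [
    "You have won the lottery prize",
    "Claim your lottery winnings now",
    "Your prize is waiting, claim now",
]

for word, freq in top_lexis(sample, n=5):
    print(word, freq)
```

A real analysis would likely also remove stop words and normalize word forms before counting; those steps are omitted here.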

Keywords: Lexis, Unsolicited Bulk e-mail (UBE), Vector Space Document Model, Winnings, Lottery

Downloads: 1492
216 Software Architectural Design Ontology

Authors: Muhammad Irfan Marwat, Sadaqat Jan, Syed Zafar Ali Shah

Abstract:

Software Architecture plays a key role in software development, but the absence of a formal description of Software Architecture creates various impediments in software development. To cope with these difficulties, ontology has been used as an artifact. This paper proposes an ontology for Software Architectural design based on the IEEE model for architecture description and the Kruchten 4+1 model for viewpoint classification. For categorization of styles and views, ISO/IEC 42010 has been used. The corpus method has been used to evaluate the ontology. The main aim of the proposed ontology is to classify and locate Software Architectural design information.

Keywords: Software Architecture Ontology, Semantic based Software Architecture, Software Architecture, Ontology, Software Engineering.

Downloads: 4121
215 Towards Clustering of Web-based Document Structures

Authors: Matthias Dehmer, Frank Emmert Streib, Jürgen Kilian, Andreas Zulauf

Abstract:

Methods for organizing web data into groups, in order to analyze web-based hypertext data and facilitate data availability, are very important given the number of documents available online. The task of clustering web-based document structures therefore has many applications, e.g., improving information retrieval on the web, better understanding of user navigation behavior, improving the servicing of web users' requests, and increasing web information accessibility. In this paper we investigate a new approach for clustering web-based hypertexts on the basis of their graph structures. The hypertexts are represented as so-called generalized trees, which are more general than the usual directed rooted trees, e.g., DOM trees. As an important preprocessing step, we measure the structural similarity between the generalized trees on the basis of a similarity measure d. Then we apply agglomerative clustering to the resulting similarity matrix in order to create clusters of hypertext graph patterns representing navigation structures. In the present paper we run our approach on a data set of hypertext structures and obtain good results in Web Structure Mining. Furthermore, we outline the application of our approach in Web Usage Mining as future work.
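
As a rough illustration of applying agglomerative clustering to a similarity matrix, the sketch below repeatedly merges the two most similar clusters. The abstract does not state the linkage rule or stopping criterion, so single linkage and a similarity `threshold` are assumptions, and the matrix values are invented.

```python
def agglomerative(sim, threshold):
    """Single-linkage agglomerative clustering on a symmetric similarity matrix.

    Repeatedly merges the two most similar clusters until no pair of
    clusters reaches `threshold` (an assumed stopping rule).
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single linkage: cluster similarity is the maximum
                # pairwise similarity between their members.
                s = max(sim[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical similarity values for four hypertext graphs.
similarity = [
    [1.00, 0.90, 0.10, 0.20],
    [0.90, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.80],
    [0.20, 0.10, 0.80, 1.00],
]
print(agglomerative(similarity, threshold=0.5))
```

Complete or average linkage would only change the `max(...)` line; which rule suits hypertext structures best is not something the abstract settles.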

Keywords: Clustering methods, graph-based patterns, graph similarity, hypertext structures, web structure mining

Downloads: 1467
214 Application of a Similarity Measure for Graphs to Web-based Document Structures

Authors: Matthias Dehmer, Frank Emmert Streib, Alexander Mehler, Jürgen Kilian, Max Mühlhauser

Abstract:

Due to the tremendous amount of information provided by the World Wide Web (WWW), developing methods for mining the structure of web-based documents is of considerable interest. In this paper we present a similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings whose components represent structural properties of the graph. The similarity of two graphs is then defined as the optimal alignment of the underlying property strings. In this paper we apply the well-known technique of sequence alignment to a novel and challenging problem: measuring the structural similarity of generalized trees. In other words, we first transform our graphs, considered as high-dimensional objects, into linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem into a string similarity problem in order to develop an efficient graph similarity measure. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based document structures.
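
To make the idea of aligning integer property strings concrete, here is a minimal sketch of a global (Needleman-Wunsch style) alignment score. The scoring values and the example sequences are assumptions chosen for illustration; the paper's actual property strings and scoring scheme are not given in the abstract.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two integer sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


# Hypothetical property strings, e.g. one structural value per tree level.
graph_a = [1, 2, 3]
graph_b = [1, 3]
print(alignment_score(graph_a, graph_b))
```

A higher score indicates that the two property strings, and by assumption the graphs they encode, are structurally closer.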

Keywords: Graph similarity, hierarchical and directed graphs, hypertext, generalized trees, web structure mining.

Downloads: 1846
213 A Long Tail Study of eWOM Communities

Authors: M. Olmedilla, M. R. Martinez-Torres, S. L. Toral

Abstract:

Electronic Word-Of-Mouth (eWOM) communities are today an important source of information on which more and more customers base their purchasing decisions. They include thousands of reviews concerning very different products and services, posted by many individuals distributed all over the world. Due to their massive audience, eWOM communities can help users find the products they are looking for even if those products are less popular or rare. This is known as the long tail effect, which leads to a larger number of lower-selling niche products. This paper analyzes the long tail effect in a well-known eWOM community and defines a tool for finding niche products unavailable through conventional channels.

Keywords: eWOM, Online user reviews, Long tail theory, Product categorization, Social Network Analysis.

Downloads: 2300
212 Categorization and Estimation of Relative Connectivity of Genes from Meta-OFTEN Network

Authors: U. Kairov, T. Karpenyuk, E. Ramanculov, A. Zinovyev

Abstract:

The most common result of the analysis of high-throughput data in molecular biology is a global list of genes, ranked according to a certain score. The score can be a measure of differential expression. Recent work proposed a new method for selecting a number of genes in a ranked gene list from microarray gene expression data such that this set forms the Optimally Functionally Enriched Network (OFTEN), formed by known physical interactions between genes or their products. Here we present calculation results for the relative connectivity of genes from the META-OFTEN network and a tentative biological interpretation of the most reproducible signal. The relative connectivity and inbetweenness values of genes from the META-OFTEN network were estimated.

Keywords: Microarray, META-OFTEN, gene network.

Downloads: 1579
211 A Proposed Framework for Improving IT Utilization in the Energy Industry

Authors: Jin Kyung Park, Ji Yeon Cho, Yong Ho Shim, Su Jin Kim, Bong Gyou Lee

Abstract:

The purpose of this study is to suggest a direction for future study of the energy-IT industry that can serve as a framework for increasing IT utilization in the energy industry. Recently, Green IT has become a global issue because of global environmental pollution, and the role of IT in the energy industry is becoming more important. However, related studies have been oriented toward the IT industry, which is not sufficient for planning Green energy. Therefore, after analyzing existing studies related to Green energy and Green IT, a re-categorization of the Green energy-IT industry is suggested. The direction of the framework is based on the energy industry, which makes it possible to link energy and IT. The results of this study offer comprehensive insight into the Green energy-IT industry and can thus provide useful implications and guidelines for increasing IT utilization in the energy industry.

Keywords: Energy-IT Industry, Green Energy, Green IT, IT Utilization

Downloads: 1302
210 Systematic Functional Analysis Methods for Design Retrieval and Documentation

Authors: L. Zehtaban, D. Roller

Abstract:

Apart from geometry, functionality is one of the most significant hallmarks of a product. The functionality of a product can be considered the fundamental justification for the product's existence. Therefore, a functional analysis including a complete and reliable descriptor has high potential to improve the product development process in various fields, especially in knowledge-based design. One important application of functional analysis and indexing is in retrieval and the design reuse concept. More than 75% of design activity for new product development involves reusing earlier and existing design know-how. Thus, the analysis and categorization of product functions, concluded by functional indexing, directly influences design optimization. This paper elucidates and evaluates major classes of functional analysis by discussing their major methods. It concludes by presenting a novel hybrid approach for functional analysis.

Keywords: Functional analysis, design reuse, functional indexing and representation.

Downloads: 5110
209 Definition in Law: Transgender Identities and Marriage

Authors: Kimberly Tao

Abstract:

This paper looks at transgender identities and the law in the context of marriage. It particularly focuses on the role of language and definition in classifying transgendered individuals into a legal category. Two lines of cases in transgender jurisprudence are examined. The former cases decided the definition of 'man' and 'woman' on the basis of biological criteria while the latter cases held that biological factors should not be the sole criterion for defining a man or a woman. Three categories were found to classify transgender people, namely male, female and "monstrous". Since transgender people challenge the core gender distinction that the law stresses, they are often regarded as problematic and monstrous which caused them to be subjected to severe legal consequences. This paper discusses these issues by analyzing and comparing different cases in transgender jurisprudence as well as examining how these issues play out in contemporary Hong Kong.

Keywords: Transgender, Monstrousness, Categorization, Definition.

Downloads: 2150
208 Trends in IT Consulting in Austria

Authors: Michael Torggler

Abstract:

IT consultants often take on an important role as an interface between technological, organizational and managerial structures. As a result, the services offered are in many cases assigned to different disciplines, which can cause a lack of transparency on the market for consulting services. However, not all consulting products are suitable for every company because of different frameworks and business processes. In this context the question arises as to what consulting products are currently offered and how they can be compared, as well as how the market for IT consulting services is structured on the supply side. The presented study aims to shed light on the IT consulting market by giving an overview of the current structure of the supply side for IT consulting services, as well as proposing a categorization of the currently available consulting services (consulting fields) in order to provide a theoretical background for the empirical study. Apart from these theoretical considerations, the empirical results of field surveys on the Austrian IT consulting market are presented and analyzed.

Keywords: IT Consulting, Management Consulting, IS Consulting, Consulting Fields, Market study.

Downloads: 1368
207 A Hybrid Ontology Based Approach for Ranking Documents

Authors: Sarah Motiee, Azadeh Nematzadeh, Mehrnoush Shamsfard

Abstract:

The increasing volume of information on the internet creates a growing need to develop new (semi)automatic methods for retrieving documents and ranking them according to their relevance to the user query. In this paper, after a brief review of ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination preserves the precision of ranking without losing speed. Our approach exploits natural language processing techniques to extract phrases from documents and the query and to stem words. Then an ontology based conceptual method is used to annotate documents and expand the query. To expand a query, the spread activation algorithm is improved so that the expansion can be done flexibly and in various aspects. The annotated documents and the expanded query are processed to compute the relevance degree using statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing it to documents, (3) extracting and using both words and phrases to compute the relevance degree, (4) improving the spread activation algorithm to do the expansion based on a weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.
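
A generic form of spread activation over weighted concept relationships, used for query expansion, might look like the sketch below. It is not the improved algorithm from the paper or ORank's implementation: the `decay` and `min_activation` parameters, the ontology layout (concept mapped to a list of neighbour and relation-weight pairs), and the toy concepts are all assumptions.

```python
def expand_query(terms, ontology, decay=0.5, min_activation=0.2):
    """Spread activation from query terms through an ontology graph.

    `ontology` maps a concept to a list of (neighbour, relation_weight)
    pairs. Relation weights are assumed to lie in [0, 1] and decay in
    (0, 1], so activation never grows along a path.
    """
    activation = {t: 1.0 for t in terms}
    frontier = list(terms)
    while frontier:
        node = frontier.pop()
        for neighbour, weight in ontology.get(node, []):
            value = activation[node] * weight * decay
            # Keep only sufficiently strong activations, and only
            # propagate when a concept's activation actually improves.
            if value >= min_activation and value > activation.get(neighbour, 0.0):
                activation[neighbour] = value
                frontier.append(neighbour)
    return activation


# Hypothetical ontology fragment with per-relation weights.
toy_ontology = {
    "car": [("vehicle", 0.9), ("wheel", 0.6)],
    "vehicle": [("transport", 0.8)],
}
print(expand_query(["car"], toy_ontology))
```

The returned activations could then serve as term weights in the expanded query vector; concepts that fall below `min_activation` are left out.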

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Downloads: 1578
206 A Similarity Measure for Clustering and its Applications

Authors: Guadalupe J. Torres, Ram B. Basnet, Andrew H. Sung, Srinivas Mukkamala, Bernardete M. Ribeiro

Abstract:

This paper introduces a measure of similarity between two clusterings of the same dataset produced by two different algorithms, or even by the same algorithm (K-means, for instance, with different initializations usually produces different results when clustering the same dataset). We then apply the measure to calculate the similarity between pairs of clusterings, with special interest directed at comparing the similarity between various machine clusterings and human clustering of datasets. The similarity measure can thus be used to identify the best (in terms of most similar to human) clustering algorithm for a specific problem at hand. Experimental results pertaining to the text categorization problem of a Portuguese corpus (wherein a translation-into-English approach is used) are presented, as well as results on the well-known benchmark IRIS dataset. The significance and other potential applications of the proposed measure are discussed.
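
The abstract does not define the proposed measure, so the sketch below uses the classic Rand index as a stand-in to show what comparing two clusterings of the same dataset involves. The label lists are invented for the example.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    items together, or both put them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical machine and human clusterings of four documents.
machine = [0, 0, 1, 1]
human = [1, 1, 0, 0]
print(rand_index(machine, human))
```

Because the comparison is pair-based, it ignores how clusters are named: relabeling the clusters of one clustering does not change the score.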

Keywords: Clustering Algorithms, Clustering Applications, Similarity Measures, Text Clustering

Downloads: 1522
205 ORank: An Ontology Based System for Ranking Documents

Authors: Mehrnoush Shamsfard, Azadeh Nematzadeh, Sarah Motiee

Abstract:

The increasing volume of information on the internet creates a growing need to develop new (semi)automatic methods for retrieving documents and ranking them according to their relevance to the user query. In this paper, after a brief review of ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination preserves the precision of ranking without losing speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology based conceptual method is used to annotate documents and expand the query. To expand a query, the spread activation algorithm is improved so that the expansion can be done in various aspects. The annotated documents and the expanded query are processed to compute the relevance degree using statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing it to documents, (3) extracting and using both words and phrases to compute the relevance degree, (4) improving the spread activation algorithm to do the expansion based on a weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Downloads: 1841
204 Stochastic Learning Algorithms for Modeling Human Category Learning

Authors: Toshihiko Matsuka, James E. Corter

Abstract:

Most neural network (NN) models of human category learning use a gradient-based learning method, which assumes that locally optimal changes are made to model parameters on each learning trial. This method tends to underpredict variability in individual-level cognitive processes. In addition, many recent models of human category learning have been criticized for not being able to replicate the rapid changes in categorization accuracy and attention processes observed in empirical studies. In this paper we introduce stochastic learning algorithms for NN models of human category learning and show that use of the algorithms can result in (a) rapid changes in accuracy and attention allocation, and (b) different learning trajectories and more realistic variability at the individual level.

Keywords: category learning, cognitive modeling, radial basis function, stochastic optimization.

Downloads: 1580
203 Determination of Temperature and Velocity Fields in a Corridor at a Central Interim Spent Fuel Storage Facility Using Numerical Simulation

Authors: V. Salajka, J. Kala, P. Hradil

Abstract:

The presented article deals with the description of a numerical model of a corridor at a Central Interim Spent Fuel Storage Facility (hereinafter CISFSF). The model takes into account the effect of air flows on the temperature of stored waste. The computational model was implemented in the ANSYS/CFX programming environment in the form of a CFD task solution, which was compared with an approximate analytical calculation. The article includes a categorization of the individual alternatives for the ventilation of such underground systems. The aim was to evaluate a ventilation system for a CISFSF with regard to its stability and capacity to provide sufficient ventilation for the removal of heat produced by stored casks with spent nuclear fuel.

Keywords: Temperature fields, Spent Fuel, Interim storage facility, CFD.

Downloads: 1360
202 Encrypter Information Software Using Chaotic Generators

Authors: Cardoza-Avendaño L., López-Gutiérrez R.M., Inzunza-González E., Cruz-Hernández C., García-Guerrero E., Spirin V., Serrano H.

Abstract:

This document presents software that demonstrates different chaotic generators, in both continuous and discrete time. The software offers the option of obtaining the different signals using different parameters and initial condition values. The program then shows the critical parameter for each model. All of these models are capable of encrypting information, which the software also demonstrates.
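
As one illustration of encrypting information with a discrete-time chaotic generator, the sketch below derives a keystream from the logistic map and XORs it with the data. The abstract does not say which generators or encryption scheme the software uses, so the logistic map, the parameter `r`, the initial condition `x0`, and the byte quantization are all assumptions. The scheme is for illustration only and is not cryptographically secure.

```python
def logistic_keystream(x0, r, n):
    """Generate n key bytes from the logistic map x -> r * x * (1 - x).

    Assumes 0 < x0 < 1 and 0 < r < 4, which keeps every iterate
    strictly inside (0, 1).
    """
    x = x0
    stream = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        # Quantize the state to one byte (an assumed, simplistic mapping).
        stream.append(int(x * 256) % 256)
    return bytes(stream)


def xor_cipher(data, x0=0.3141, r=3.99):
    """Encrypt or decrypt bytes; the operation is its own inverse."""
    keystream = logistic_keystream(x0, r, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


message = b"chaotic generators"
ciphertext = xor_cipher(message)
print(ciphertext.hex())
print(xor_cipher(ciphertext))
```

Here the initial condition and the parameter together play the role of the key: decrypting with different values of `x0` or `r` produces a different keystream.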

Keywords: cryptography, chaotic attractors, software.

Downloads: 1451
201 Collaborative Environmental Management: A Case Study Research of Stakeholders’ Collaboration in the Nigerian Oil-producing Region

Authors: Favour Makuochukwu Orji, Yingkui Zhao

Abstract:

A myriad of environmental issues face the Nigerian industrial region, resulting from oil and gas production, mining, manufacturing and domestic wastes. Amidst these, much effort has been directed by stakeholders at the Nigerian oil-producing regions, because of the impact of these regions on the wider Nigerian economy. Although collaborative environmental management has been noted as an effective approach to managing environmental issues, little attention has been given to the roles and practices of stakeholders in effecting a collaborative environmental management framework for the Nigerian oil-producing region. This paper produces a framework to expand and deepen knowledge relating to stakeholders' aspects of collaborative roles in managing environmental issues in the Nigerian oil-producing region. The knowledge is derived from an analysis of stakeholders' practices, studied through multiple case studies using document analysis. Selected documents of key stakeholders (Nigerian government agencies, multinational oil companies and host communities) were analyzed. Open and selective coding was employed manually during document analysis of data collected from the offices and websites of the stakeholders. The findings showed that the stakeholders have a range of roles, practices, interests, drivers and barriers regarding their collaborative roles in managing environmental issues. While they have interests in efficient resource use, compliance with standards, sharing of responsibilities, generation of new solutions, and shared objectives, there is evidence of major barriers, including resource allocation, disjointed policy, ineffective monitoring, diverse socio-economic interests, lack of stakeholders' commitment and limited knowledge sharing. However, host communities hold deep concerns over the collaborative roles of stakeholders for economic interests, particularly where government agencies and multinational oil companies are involved.
With these barriers and concerns, genuine stakeholders' collaboration is found to be limited, and as a result, optimal environmental management practices and policies have not been successfully implemented in the Nigerian oil-producing region. A framework is produced that describes how practices that characterize collaborative environmental management might be employed to satisfy the stakeholders' interests. The framework recommends critical factors, based on the findings, which may guide collaborative environmental management in the oil-producing regions. The recommendations are designed to redefine the practices of stakeholders in managing environmental issues in the oil-producing regions, not as something wholly new, but as an approach essential for implementing a sustainable environmental policy. This research outcome may clarify areas for future research as well as contribute to industry guidance in the area of collaborative environmental management.

Keywords: Collaborative environmental management framework, document analysis, case studies, multinational oil companies, Nigerian oil-producing region, stakeholders analysis.

Downloads: 2404
200 Quantifying the Sustainable Building Criteria Based on Case Studies from Malaysia

Authors: Fahanim Abdul Rashid, Muhammad Azzam Ismail, Deo Prasad

Abstract:

In order to encourage the construction of green homes (GH) in Malaysia, a simple and attainable framework for designing and building GHs is needed. This can be achieved by aligning GH principles against Cole's 'Sustainable Building Criteria' (SBC). This set of considerations was used to categorize the GH features of three case studies from Malaysia. Although the categorization of building features is useful for exploring the sustainability inclinations of each house, the overall impact of building features in each of the five SBCs is unknown. Therefore, this paper explored the possibility of quantifying the impact of building features categorized in SBC1, “Buildings will have to adapt to the new environment and restore damaged ecology while mitigating resource use”, based on existing GH assessment tools and methods and other literature. The process reported in this paper could lead to a new dimension in green home rating and assessment methods.

Keywords: Green homes, Malaysia, Sustainable Building Criteria, Sustainable homes

Downloads: 2100
199 A K-Means Based Clustering Approach for Finding Faulty Modules in Open Source Software Systems

Authors: Parvinder S. Sandhu, Jagdeep Singh, Vikas Gupta, Mandeep Kaur, Sonia Manhas, Ramandeep Sidhu

Abstract:

Prediction of fault-prone modules provides one way to support software quality engineering. Clustering is used to determine the intrinsic grouping in a set of unlabeled data. Among the various clustering techniques available in the literature, the K-Means clustering approach is the most widely used. This paper introduces a K-Means based clustering approach for finding the fault proneness of Object-Oriented systems. The contribution of this paper is that it uses metric values of the JEdit open source software to generate rules for categorizing software modules as faulty or non-faulty, after which empirical validation is performed. The results are measured in terms of accuracy of prediction, probability of detection and probability of false alarms.
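
The K-Means step can be sketched on a single software metric as below. This is a minimal one-dimensional illustration under stated assumptions: the metric values are invented (not JEdit data), the deterministic initialization is a convenience for reproducibility, and treating the higher-valued cluster as "fault-prone" is an assumed reading, not the paper's rule set.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain K-Means on one-dimensional data; returns (centroids, labels)."""
    # Deterministic initialization: spread centroids over the sorted values.
    ordered = sorted(values)
    centroids = [ordered[(len(ordered) - 1) * i // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            groups[nearest].append(v)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        updated = [
            sum(g) / len(g) if g else centroids[i] for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
    return centroids, labels


# Hypothetical per-module metric values (e.g. a complexity measure).
metric = [1, 2, 3, 20, 22, 25]
centroids, labels = kmeans_1d(metric, k=2)
print(centroids, labels)
```

Real module data would have several metrics per module, in which case the absolute difference would be replaced by a multi-dimensional distance such as Euclidean distance.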

Keywords: K-Means, Software Fault, Classification, Object-Oriented Metrics.

Downloads: 2262
198 Virtual Training, Human-Computer and Software Interactions, and Social-Based Embodiness

Authors: Philippe Fauquet-Alekhine

Abstract:

For professions in high-risk industries, simulation training has always been conceived in terms of a high degree of fidelity to the real operational situation. Due to recent progress, this way of training is changing, modifying the human-computer and software interactions: the interactions between trainees during simulation training sessions tend to become virtual, transforming the social-based embodiness (the way subjects integrate social skills for interpersonal relationships with co-workers). On the basis of the analysis of eight different profession trainings, a categorization of interactions has helped to produce an analytical tool, the social interactions table. This tool may be very valuable for pointing out the changes in social interactions when training sessions move from a high-fidelity simulator to a virtual simulator. In this case, it helps the designers of professional training to analyze and assess the consequences of the potential lack of social-based embodiness.

Keywords: Interface, interaction, simulator, virtual training.

Downloads: 1752
197 Service-Oriented Architecture for Object-Centric Information Fusion

Authors: Jeffrey A. Dunne, Kevin Ligozio

Abstract:

In many applications there is a broad variety of information relevant to a focal “object” of interest, and the fusion of such heterogeneous data types is desirable for classification and categorization. While these various data types can sometimes be treated as orthogonal (such as the hull number, superstructure color, and speed of an oil tanker), there are instances where the inference and the correlation between quantities can provide improved fusion capabilities (such as the height, weight, and gender of a person). A service-oriented architecture has been designed and prototyped to support the fusion of information for such “object-centric” situations. It is modular, scalable, and flexible, and designed to support new data sources, fusion algorithms, and computational resources without affecting existing services. The architecture is designed to simplify the incorporation of legacy systems, support exact and probabilistic entity disambiguation, recognize and utilize multiple types of uncertainties, and minimize network bandwidth requirements.

Keywords: Data fusion, distributed computing, service-oriented architecture, SOA

Downloads: 1430
196 Sustainability Assessment of Municipal Wastewater Treatment

Authors: Yousra Zakaria Ahmed, Ahmed El Gendy, Salah El Haggar

Abstract:

In this paper, our methodology for assessing the sustainability of wastewater treatment technologies in Egypt is presented. The preliminary list of factors to be considered, as well as their ranking, is listed. The factors include, but are not limited to, pollutant removal efficiency and energy consumption under the environmental dimension; construction cost, operation and maintenance costs and required land area cost under the economic dimension; and public acceptance, noise and the generation of job opportunities for local residents under the social dimension. This methodology is intended to be a user-friendly screening tool to support the decision-making process when investigating different wastewater treatment technologies in Egypt. Based on the research results presented in this paper, it can be generally concluded that the categorization of some of the social and environmental aspects of sustainability is subjective and highly dependent on the local conditions and the researchers' background.

Keywords: Sustainability, wastewater treatment, sustainability assessment, Egypt.

Downloads: 1529
195 Unsupervised Feature Selection Using Feature Density Functions

Authors: Mina Alibeigi, Sattar Hashemi, Ali Hamzeh

Abstract:

Since dealing with high-dimensional data is computationally complex and sometimes even intractable, several feature reduction methods have recently been developed to reduce the dimensionality of the data in order to simplify the analysis in various applications such as text categorization, signal processing, image retrieval and gene expression. Among feature reduction techniques, feature selection is one of the most popular methods because it preserves the original features. In this paper, we propose a new unsupervised feature selection method which removes redundant features from the original feature space by using the probability density functions of the various features. To show the effectiveness of the proposed method, popular feature selection methods have been implemented and compared. Experimental results on several datasets derived from the UCI repository illustrate the effectiveness of our proposed method in comparison with the other methods in terms of both classification accuracy and the number of selected features.
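
The abstract does not spell out how the density functions are compared, so the following is only one possible reading, sketched under loud assumptions: each feature's density is estimated with a fixed-width histogram over values scaled to [0, 1], and a feature is dropped when its histogram overlaps an already kept feature's histogram by at least `max_overlap`. The function names, bin count, threshold, and data are all hypothetical.

```python
def histogram_density(values, bins=5):
    """Estimate a discrete density for values assumed to be scaled to [0, 1]."""
    counts = [0] * bins
    for v in values:
        # Clamp the bin index so 1.0 and out-of-range values stay in bounds.
        index = max(0, min(int(v * bins), bins - 1))
        counts[index] += 1
    return [c / len(values) for c in counts]


def overlap(p, q):
    """Histogram intersection: 1.0 for identical densities, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(p, q))


def select_features(columns, bins=5, max_overlap=0.9):
    """Keep a feature only if its density differs enough from every kept one."""
    kept = []
    for name, values in columns.items():
        density = histogram_density(values, bins)
        if all(
            overlap(density, histogram_density(columns[k], bins)) < max_overlap
            for k in kept
        ):
            kept.append(name)
    return kept


# Hypothetical feature columns; f2 is distributed almost exactly like f1.
data = {
    "f1": [0.10, 0.30, 0.85, 0.90],
    "f2": [0.15, 0.35, 0.88, 0.95],
    "f3": [0.50, 0.50, 0.50, 0.50],
}
print(select_features(data))
```

Note that matching marginal densities do not by themselves prove that two features carry the same information, so this criterion is a heuristic and may differ from what the paper proposes.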

Keywords: Feature, Feature Selection, Filter, Probability Density Function

Downloads 2032
194 The Functionality and Usage of CRM Systems

Authors: Michael Torggler

Abstract:

Modern information and communication technologies offer a variety of support options for the efficient handling of customer relationships. CRM systems have been developed, which are designed to support the processes in the areas of marketing, sales and service. Along with technological progress, CRM systems are constantly changing, i.e. the systems are continually enhanced by new functions. However, not all functions are suitable for every company because of different frameworks and business processes. In this context the question arises whether or not CRM systems are widely used in Austrian companies and which business processes are most frequently supported by CRM systems. This paper aims to shed light on the popularity of CRM systems in Austrian companies in general and the use of different functions to support their daily business. First of all, the paper provides a theoretical overview of the structure of modern CRM systems and proposes a categorization of currently available software functionality for collaborative, operational and analytical CRM processes, which provides the theoretical background for the empirical study. Apart from these theoretical considerations, the paper presents the empirical results of a field survey on the use of CRM systems in Austrian companies and analyzes its findings.

Keywords: CRM systems, CRM system adoption, CRM system diffusion, CRM functionality, Market study.

Downloads 3935
193 Inflammatory Markers in the Blood and Chronic Periodontitis

Authors: Saimir Heta, Ilma Robo, Nevila Alliu, Tea Meta

Abstract:

Background: Plasma levels of inflammatory markers are the expression of the infectious wastes of existing periodontitis, as well as of existing inflammation everywhere in the body. Materials and Methods: The study consists of the clinical part of the measurement of inflammatory markers of 23 patients diagnosed with chronic periodontitis and the recording of parental periodontal parameters of patient periodontal status: hemorrhage index and probe values, before and 7-10 days after non-surgical periodontal treatment. Results: The level of fibrinogen drops according to the categorization of disease progression, active and passive, with the biggest percentage (18%-30%) at the fluctuation of 10-20 mg/dL. Fluctuations in fibrinogen level according to the age of patients, under 40 years and over 40 years, were 13%-26% in the range 0-10 mg/dL, 26%-22% in the range 10-20 mg/dL, and 9%-4% in the range 20-40 mg/dL. Conclusions: Non-surgical periodontal treatment significantly reduces the level of inflammatory markers in the blood. Oral health significantly reduces the potential source for periodontal bacteria, with the potential of promoting thromboembolism, through interaction between thrombocytes.

Keywords: Chronic periodontitis, atherosclerosis, risk factor, inflammatory markers.

Downloads 496
192 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. Countless data are generated on social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to determine where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, and tourists can get a clear idea of places before visiting. From the technical perspective, natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and a web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The sentences were first classified with the VADER and RoBERTa models to get the polarity of the reviews. In this paper, we have studied methods for feature extraction, such as Count Vectorization and Term Frequency-Inverse Document Frequency (TF-IDF) Vectorization, and implemented a Convolutional Neural Network (CNN) classifier for the sentiment analysis to decide whether the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that, after pre-processing and cleaning the dataset, the CNN algorithm achieved an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: Count vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Downloads 112
191 Proffering a Brand New Methodology to Resource Discovery in Grid based on Economic Criteria Using Learning Automata

Authors: Ali Sarhadi, Mohammad Reza Meybodi, Ali Yousefi

Abstract:

Resource discovery is one of the chief services of a grid. This article proposes a new approach to discovering resources in a grid through learning automata. The objective of the resource-discovery service is to select a resource based on the user's applications and economic criteria, that is, to choose a resource that can accomplish the user's tasks in the most economical manner. The service operates in two phases. First, we propose an application-based categorization by means of an intelligent neural network. The user sets his or her application as the input vector of the neural network, and the output vector of the network describes the suitability of each resource for the submitted job. Second, the most economical option among those put forward in the first phase that can fulfil the task is selected. The resource choice is carried out by the presented algorithm, which is based on learning automata.

Keywords: Resource discovery, learning automata, neural network, economic policy

Downloads 1411
190 Restructuring of XML Documents in the Form of Ontologies

Authors: Jamal Bakkas, Mohamed Bahaj, Abdellatif Soklabi

Abstract:

The intense use of the web has made it a rapidly changing environment whose content is in permanent evolution to adapt to demand. Standards have accompanied this evolution, passing from standards that group data with their presentation without any structuring, such as HTML, to standards that separate the two and give more importance to the structural aspect of the content, such as XML and its derivatives. Currently, with the appearance of the Semantic Web, ontologies are becoming increasingly present on the web, and standards that allow their representation, such as OWL and RDF/RDFS, are beginning to gain momentum. This paper provides an automatic method that converts XML Schema documents to ontologies represented in OWL.

Keywords: XML Schema, OWL, RDB, Mapping, Ontology.

Downloads 2335
189 Juxtaposing South Africa’s Private Sector and Its Public Service Regarding Innovation Diffusion, to Explore the Obstacles to E-Governance

Authors: Petronella Jonck, Freda van der Walt

Abstract:

Despite the benefits of innovation diffusion in the South African public service, implementation thereof seems to be problematic, particularly with regard to e-governance which would enhance the quality of service delivery, especially accessibility, choice, and mode of operation. This paper reports on differences between the public service and the private sector in terms of innovation diffusion. Innovation diffusion will be investigated to explore identified obstacles that are hindering successful implementation of e-governance. The research inquiry is underpinned by the diffusion of innovation theory, which is premised on the assumption that innovation has a distinct channel, time, and mode of adoption within the organisation. A comparative thematic document analysis was conducted to investigate organisational differences with regard to innovation diffusion. A similar approach has been followed in other countries, where the same conceptual framework has been used to guide document analysis in studies in both the private and the public sectors. As per the recommended conceptual framework, three organisational characteristics were emphasised, namely the external characteristics of the organisation, the organisational structure, and the inherent characteristics of the leadership. The results indicated that the main difference in the external characteristics lies in the focus and the clientele of the private sector. With regard to organisational structure, private organisations have veto power, which is not the case in the public service. Regarding leadership, similarities were observed in social and environmental responsibility and employees’ attitudes towards immediate supervision. Differences identified included risk taking, the adequacy of leadership development, organisational approaches to motivation and involvement in decision making, and leadership style. 
Due to the organisational differences observed, it is recommended that differentiated strategies be employed to ensure effective innovation diffusion, and ultimately e-governance. It is recommended that the results of this research be used to stimulate discussion on ways to improve collaboration between the mentioned sectors, to capitalise on the benefits of each sector.

Keywords: E-governance, ICT, innovation diffusion, comparative analysis.

Downloads 1743
188 Design and Construction of PIC-Based IR Remote Control Moving Robot

Authors: Sanda Win, Tin Shein, Khin Maung Latt

Abstract:

This document describes an electronic speed control designed to drive two DC motors from a 6 V battery pack, controlled by a commercial universal infrared remote control handset. Conceived for a tank-like vehicle, one motor drives the left-side wheels or tracks and the other motor drives the right side. As shown here, there is a left-right steering input and a forward-backward throttle input, as would be used on a model car. It is designed using a PIC16F873A microcontroller.

Keywords: Assembly Language, Direction Control, Speed Control, PIC16F873A

Downloads 5133