Search results for: frequent and non-frequent items

393 A Hybrid Recommendation System Based On Association Rules

Abstract:

Recommendation systems are widely used in e-commerce applications. The engine of a current recommendation system recommends items to a particular user based on user preferences and previous high ratings. Various recommendation schemes, such as collaborative filtering and content-based approaches, are used to build a recommendation system. Most current recommendation systems were developed to fit a certain domain such as books, articles, or movies. We propose a hybrid framework recommendation system to be applied on two-dimensional spaces (User × Item) with a large number of Users and a small number of Items. Moreover, our proposed framework makes use of both favorite and non-favorite items of a particular user. The proposed framework is built upon the integration of association rules mining and the content-based approach. The results of experiments show that our proposed framework can provide accurate recommendations to users.
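As a rough illustration of the association-rule side of such a framework, the sketch below computes support and confidence for single-item rules over per-user sets of favorite items. The toy data, the thresholds and the function names are all assumptions made for this example; the abstract does not specify the mining algorithm, and the same functions could in principle be run over non-favorite item sets as well.

```python
from itertools import combinations

# Hypothetical toy data: each user's set of favorite items.
favorites = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"A", "C"},
    "u4": {"B", "C"},
}
baskets = list(favorites.values())


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def single_item_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for rules X -> Y.

    Only rules with one item on each side are considered, which keeps the
    sketch short; a full miner would also enumerate larger itemsets.
    """
    items = sorted(set().union(*baskets))
    found = []
    for x, y in combinations(items, 2):
        pair_support = support({x, y}, baskets)
        if pair_support < min_support:
            continue
        # Try both directions of the rule for this pair.
        for antecedent, consequent in ((x, y), (y, x)):
            confidence = pair_support / support({antecedent}, baskets)
            if confidence >= min_confidence:
                found.append((antecedent, consequent, pair_support, confidence))
    return found


rules = single_item_rules(baskets)
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent}: support={s:.2f}, confidence={c:.2f}")
```

A rule such as A -> B can then be read as "users who favor A also tend to favor B", which is the kind of signal a content-based component could re-rank.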

Keywords: Data Mining, Association Rules, Recommendation Systems, Hybrid Systems.

Downloads 3941

392 Correction of Frequent English Writing Errors by Using Coded Indirect Corrective Feedback and Error Treatment: The Case of Reading and Writing English for Academic Purposes II

Authors: Chaiwat Tantarangsee

Abstract:

The purposes of this study are 1) to study the frequent English writing errors of students registered in the course Reading and Writing English for Academic Purposes II, and 2) to find out the results of writing error correction by using coded indirect corrective feedback and writing error treatments. The sample comprises 28 second-year English major students, Faculty of Education, Suan Sunandha Rajabhat University. The tool for the experimental study is the lesson plan of the course Reading and Writing English for Academic Purposes II, and the tool for data collection comprises 4 writing tests of short texts. The research findings disclose that frequent English writing errors found in this course comprise 7 types of grammatical errors, namely fragment sentence, subject-verb agreement, wrong form of verb tense, singular or plural noun endings, run-on sentence, wrong form of verb pattern and lack of parallel structure. Moreover, the results of writing error correction by using coded indirect corrective feedback and error treatment reveal an overall reduction of the frequent English writing errors and an increase in students’ achievement in the writing of short texts, significant at the .05 level.

Keywords: Coded indirect corrective feedback, error correction, and error treatment.

Downloads 1045

391 Identity of Cultural Food: A Case Study of Traditional Mon Cuisine in Bangkok, Thailand

Authors: S. Nitiworakarn

Abstract:

This research aims to identify traditional Mon cuisines as well as gather and classify traditional cuisines of Mon communities in Bangkok. The research uses a quantitative methodology. A questionnaire was used to collect information from a sample of 450 persons, and the responses were analyzed via frequency, percentage and mean value. The results showed that the variety of traditional Mon cuisines of Bangkok could be split into 6 categories of meat diet with 54 items and 6 categories of desserts with 19 items.

Keywords: Cultural identity, traditional food, Mon cuisine, Thailand.

Downloads 3090

390 Manipulation of Ideological Items in the Audiovisual Translation of Voiced-Over Documentaries in the Arab World

Authors: S. Chabbak

Abstract:

In a widely globalized world, the influence of audiovisual translation on the culture and identity of audiences is unmistakable. However, in the Arab World, there is a noticeable disproportion between this growing influence and the research carried out in the field. As a matter of fact, the voiced-over documentary is one of the most abundantly translated genres in the Arab World that carries lots of ideological elements which are in many cases rendered by manipulation. However, voiced-over documentaries have hardly received any focused attention from researchers in the Arab World. This paper attempts to scrutinize the process of translation of voiced-over documentaries in the Arab World, from French into Arabic in the present case study, by sub-categorizing the ideological items subject to manipulation, identifying the techniques utilized in their translation and exploring the potential extra-linguistic factors that prompt translation agents to opt for manipulative translation. The investigation is based on a corpus of 94 episodes taken from a series entitled 360° GEO Reports, produced by the French German network ARTE in French, and acquired, translated and aired by Al Jazeera Documentary Channel for Arab audiences. The results yielded 124 cases of manipulation in four sub-categories of ideological items, and the use of 10 different oblique procedures in the process of manipulative translation. The study also revealed that manipulation is in most of the instances dictated by the editorial line of the broadcasting channel, in addition to the religious, geopolitical and socio-cultural peculiarities of the target culture.

Keywords: Audiovisual translation, ideological items, manipulation, voiced-over documentaries.

Downloads 962

389 Development and Psychometric Properties of the Relational Mobility Scale for the Indonesian Population

Authors: Sukaesi Marianti

Abstract:

This study aims to develop the Relational Mobility Scale for the Indonesian population and to investigate its psychometric properties. New items were created with the Indonesian population in mind, and the scale consists of two parallel forms (A and A’). This study uses 30 newly written items that take into account the characteristics of the targeted population. The scale was administered to 433 public high school students in Malang, Indonesia. Construct validity of its factor structure was demonstrated using exploratory factor analysis and confirmatory factor analysis. The results show that the model fits the data, and that the delayed alternate form method gives an acceptable result. The results indicate that 21 items of the three-dimensional Relational Mobility Scale are suitable for measuring relational mobility in high school students of the Indonesian population.

Keywords: Confirmatory factor analysis, exploratory factor analysis, delayed alternate form, Indonesian population, relational mobility scale.

Downloads 989

388 Validity and Reliability of Competency Assessment Implementation (CAI) Instrument Using Rasch Model

Authors: Nurfirdawati Muhamad Hanafi, Azmanirah Ab Rahman, Marina Ibrahim Mukhtar, Jamil Ahmad, Sarebah Warman

Abstract:

This study was conducted to generate empirical evidence on the validity and reliability of the items of the Competency Assessment Implementation (CAI) Instrument using the Rasch Model for polytomous data, aided by Winsteps software version 3.68. The construct validity was examined by analyzing the point-measure correlation index (PTMEA) and the infit and outfit MNSQ values, while the reliability was examined by analyzing the item reliability index. A survey technique was used as the major method, with the CAI instrument administered to 156 teachers from vocational schools. The results show that the reliability of the CAI Instrument items was between 0.80 and 0.98. The PTMEA correlation values are positive, which indicates that the items are able to distinguish between respondents of differing ability. The statistical data show that, out of 154 items, 12 items are suggested for omission from the instrument. It is hoped that this study could bring a new direction to the process of data analysis in educational research.

Keywords: Competency Assessment, Reliability, Validity, Item Analysis.

Downloads 2770

387 A Quantitative Model for Determining the Area of the “Core and Structural System Elements” of Tall Office Buildings

Authors: Görkem Arslan Kılınç

Abstract:

Due to the high construction, operation, and maintenance costs of tall buildings, quantification of the area in the plan layout which provides a financial return is an important design criterion. The area of the “core and structural system elements” does not provide a financial return but must exist in the plan layout. Some characteristic items of tall office buildings affect the size of these areas. From this point of view, 15 tall office buildings were systematically investigated. The typical office floor plans of these buildings were reproduced digitally. The area of the “core and structural system elements” in each building and the characteristic items of each building were calculated. These characteristic items are the size of the long and short plan edge, plan length/width ratio, size of the core long and short edge, core length/width ratio, core area, slenderness, building height, number of floors, and floor height. These items were analyzed by correlation and regression analyses. The results show that the characteristic items which affect the area of the “core and structural system elements” are plan long and short edge size, core short edge size, building height, and the number of floors. A one-unit increase in plan short side size increases the area of the “core and structural system elements” in the plan by 12,378 m². An increase in core short edge size increases the area of the “core and structural system elements” in the plan by 25,650 m². Subsequent studies can be conducted by expanding the sample of the study and considering the geographical location of the building.

Keywords: Core area, correlation analysis, floor area, regression analysis, space efficiency, tall office buildings.

Downloads 414

386 An EOQ Model for Non-Instantaneous Deteriorating Items with Power Demand, Time Dependent Holding Cost, Partial Backlogging and Permissible Delay in Payments

Authors: M. Palanivel, R. Uthayakumar

Abstract:

In this paper, an Economic Order Quantity (EOQ) based model for non-instantaneous Weibull distribution deteriorating items with a power demand pattern is presented. In this model, the holding cost per unit of the item per unit time is assumed to be an increasing linear function of time spent in storage. Here the retailer is allowed a trade-credit offer by the supplier to buy more items. Also in this model, shortages are allowed and partially backlogged. The backlogging rate is dependent on the waiting time for the next replenishment. This model aids in minimizing the total inventory cost by finding the optimal time interval and the optimal order quantity. The optimal solution of the model is illustrated with the help of numerical examples. Finally, sensitivity analysis and graphical representations are given to demonstrate the model.

Keywords: Power demand pattern, Partial backlogging, Time dependent holding cost, Trade credit, Weibull deterioration.

Downloads 3041

385 Development of an Attitude Scale Towards Social Networking Sites

Authors: Münevver Başman, Deniz Gülleroğlu

Abstract:

The purpose of this study is to develop a scale to determine attitudes towards social networking sites. 45 tryout items, prepared for this aim, were administered to 342 students studying at Marmara University, Faculty of Education. The reliability and the validity of the scale were examined using data from these students. As a result of exploratory factor analysis with Varimax rotation, a structure of 41 items grouped into three factors (interest, reality and negative effects) was obtained. The alpha reliability of the scale was .899, and the reliabilities of the factors were .899, .799 and .775, respectively.

Keywords: Attitude, reliability, social networking sites, validity.

Downloads 1615

384 A New Model for Discovering XML Association Rules from XML Documents

Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani

Abstract:

The inherent flexibilities of XML in both structure and semantics make mining from XML data a complex task with more challenges compared to traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of data in the final rules. The frequent subtrees, based on the user-provided support, are split into complement subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.

Keywords: XML, Data Mining, Association Rule Mining.

Downloads 1588

383 A Community Compromised Approach to Combinatorial Coalition Problem

Authors: Laor Boongasame, Veera Boonjing, Ho-fung Leung

Abstract:

A buyer coalition with a combination of items is a group of buyers joining together to purchase a combination of items with a larger discount. The primary aim of existing research on buyer coalitions with a combination of items is to generate a large total discount. However, the aim is hard to achieve because this research is based on the assumption that each buyer completely knows other buyers' information, or that at least one buyer knows other buyers' information in a coalition through exchange of information. These assumptions contrast with the real-world environment where buyers join a coalition with incomplete information, i.e., they are concerned only with their expected discounts. Therefore, this paper proposes a new buyer community coalition formation scheme with a combination of items, called the Community Compromised Combinatorial Coalition scheme, under such an environment of incomplete information. In order to generate a larger total discount, after buyers who want to join a coalition propose their minimum required saving, a coalition structure that gives a maximum total retail price is formed. Then, the total discount of the coalition is divided among buyers in the coalition depending on their minimum required saving, and the division is Pareto optimal. In mathematical analysis, we compare concepts of this scheme with concepts of the existing buyer coalition scheme. Our mathematical analysis results show that the total discount of the coalition in this scheme is larger than that in the existing buyer coalition scheme.

Keywords: group decision and negotiations, group buying, game theory, combinatorial coalition formation, Pareto optimality

Downloads 1491

382 Elimination of Redundant Links in Web Pages– Mathematical Approach

Authors: G. Poonkuzhali, K. Thiagarajan, K. Sarukesi

Abstract:

With the enormous growth of the web, users easily get lost in the rich hyper structure. Thus, developing user-friendly and automated tools that provide relevant information to users, without any redundant links, is the primary task for website owners. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise and irrelevant and redundant data. This paper proposes a new algorithm for mining web content by detecting the redundant links in web documents using set-theoretic (classical mathematics) operations such as subset, union and intersection. The redundant links are then removed from the original web content to obtain the information required by the user.

Keywords: Web documents, Web content mining, redundant link, outliers, set theory.

Downloads 1963

381 The Correlation between Peer Aggression and Peer Victimization: Are Aggressors Victims Too?

Authors: Glenn M. Calaguas

Abstract:

To investigate the possible correlation between peer aggression and peer victimization, 148 sixth-graders were asked to respond to the Reduced Aggression and Victimization Scales (RAVS). RAVS measures the frequency of reporting aggressive behaviors or of being victimized during the week prior to the survey. The scales are composed of six items each. Each point represents one instance of aggression or victimization. Specifically, the Pearson Product-Moment Correlation Coefficient (PMCC) was used to determine the correlations between the scores of the sixth-graders in the two scales, both in individual items and total scores. Positive correlations were established and correlations were significant at the 0.01 level.
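For readers unfamiliar with the statistic, the following is a minimal sketch of the Pearson product-moment correlation coefficient computed from two lists of scores. The score lists are invented for illustration and are not data from the study; the function name is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / (spread_x * spread_y)


# Hypothetical total scores for five respondents on the two scales.
aggression = [0, 2, 5, 1, 3]
victimization = [1, 2, 6, 0, 4]
r = pearson_r(aggression, victimization)
print(f"r = {r:.3f}")
```

A positive value of r means that respondents with higher aggression scores also tend to report higher victimization scores; testing significance requires an additional step that is not shown here.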

Keywords: correlation, peer aggression, peer victimization, sixth-graders.

Downloads 2401

380 Defect Cause Modeling with Decision Tree and Regression Analysis

Authors: B. Bakır, İ. Batmaz, F. A. Güntürkün, İ. A. İpekçi, G. Köksal, N. E. Özdemirel

Abstract:

The main aim of this study is to identify the most influential variables that cause defects in the items produced by a casting company located in Turkey. To this end, one of the items produced by the company with a high defective percentage rate is selected. Two approaches, regression analysis and decision trees, are used to model the relationship between process parameters and defect types. Although the logistic regression models failed, the decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.

Keywords: Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.

Downloads 2460

379 Awareness of Students and Teachers towards AIDS and AIDS Education

Authors: Anjan Saikia

Abstract:

600 school-going adolescents and 100 teachers from 16 schools of the Dhemaji and Lakhimpur districts of Assam, India were surveyed to assess and compare their awareness regarding AIDS and AIDS education. An awareness test containing 38 items for adolescents and 40 items for teachers was administered. Observations revealed that the majority of school-going adolescents have poor awareness of HIV/AIDS and AIDS education. The results show that the school-going adolescents of Dhemaji district have better HIV/AIDS and AIDS education awareness than the school-going adolescents of Lakhimpur district when compared on the gender, settlement, stream and district variables.

Keywords: Awareness, HIV, AIDS, AIDS education.

Downloads 2122

378 Shoplifting in Riyadh, Saudi Arabia

Authors: Saleh Dabil

Abstract:

The research was conducted using the self-reports of shoplifters who were apprehended in the supermarket while stealing. 943 shoplifters over three years were interviewed right after the act of stealing and before the police were called. The aim of the study is to identify the characteristics of shoplifting in Saudi Arabia, including the traits of shoplifters and the situation of the supermarkets where the stealing takes place. The analysis is based on the written information about each thief, following the documentary research method. Descriptive statistics as well as some inferential statistics were employed. The results show that there are differences between genders, age groups, occupations, time of the day, days of the week, months, way of stealing, individual or group of thieves and other supermarket situations in the type of items stolen, total price and the count of items. The results and the recommendations will serve as a guide for retailers on where, when and at whom to look to prevent shoplifting.

Keywords: Shoplifting, stealing, theft, supermarket.

Downloads 3356

377 GRI – Reporting Chemical Sector's Environmental Item Disclosures

Authors: M. Suutari

Abstract:

In this content analysis research note, the aim was to explore how sustainability and especially environmental issues are conveyed into environmental items in annual reports and disclosures. As the Global Reporting Initiative (GRI) is a global multi-stakeholder process, the enterprises voluntarily using the GRI framework are considered to be aware of sustainability and environmental concerns. The findings were that, although these enterprises were in an environmentally sensitive industry sector and had special capabilities to consider environmental issues, few GRI-reporting enterprises presented substantially detailed environmental items in audited financial statements. There were only slight differences between the publishing years 2008 and 2009, the beginning years of the economic turmoil. The environmental issues seemed not to be considered substantial enough for financial reporting as a basis for investment or voting decisions.

Keywords: Environmental, reporting, financial, GRI.

Downloads 1726

376 Comparing Academically Gifted and Non-Gifted Students' Supportive Environments in Jordan

Authors: Mustafa Qaseem Hielat, Ahmad Mohammad Al-Shabatat

Abstract:

Jordan has made many efforts to nurture its academically gifted students in special schools since 2001. During the nine years since these schools were launched, their learning and excellence environments have been believed to be distinguished compared with those of public schools. This study investigated the environments of gifted students compared with those of non-gifted students, using a survey instrument that measures the dimensions of family, peers, teachers, school support, society, and resources, dimensions rooted deeply in supporting gifted education, learning, and achievement. A total of 109 students were selected from excellence schools for academically gifted students, and 119 non-gifted students were selected from public schools. Around 8.3% of the non-gifted students reported that they “Never” received any support from their surrounding environments, 14.9% reported “Seldom” support, 23.7% reported “Often” support, 26.0% reported “Frequent” support, and 32.8% reported “Very frequent” support. The gifted students reported “Never” support more often than the non-gifted students did, with 11.3%, “Seldom” support with 15.4%, “Often” support with 26.6%, “Frequent” support with 29.0%, and “Very frequent” support less often than the non-gifted students, with 23.6%. Unexpectedly, statistical differences were found between the two groups favoring non-gifted students in perception of their surrounding environments in specific dimensions, namely school support, teachers, and society. No statistical differences were found in the other dimensions of the survey, namely family, peers, and resources. As the differences were found in teachers, school support, and society, the nurturing environments of the excellence schools need to be revised to adopt more creative teaching styles, a rich school atmosphere and infrastructure, interactive guidance for the students and their parents, promotion of the excellence environments, and the rebuilding of successful identification models. Thus, families, schools, and society should increase their cooperation, communication, and awareness of the gifted supportive environments. However, more studies to investigate other aspects of promoting academic giftedness and excellence are recommended.

Keywords: Academic giftedness, Supportive environment, Excellence schools, Gifted grouping, Gifted nurturing.

Downloads 1828

375 Malaysian Multi-Ethnic Discrimination Scale: Preliminary Factor and Psychometric Analysis

Authors: Chua Bee Seok, Shamsul Amri Baharuddin, Rosnah Ismail, Ferlis Bahari, Jasmine Adela Mutang, Lailawati Madlan, Asong Joseph

Abstract:

The aims of this study were to determine the factor structure and psychometric properties (i.e., reliability and convergent validity) of the Malaysian Multi-Ethnic Discrimination Scale (MMEDS). It consists of 71 items that measure the experience of, strategies used for, and consequences of ethnic discrimination. A sample of 649 university students from one of the higher education institutions in Malaysia was asked to complete the MMEDS, as well as Perceived Ethnic and Racial Discrimination. The exploratory factor analysis on ethnic discrimination experience extracted two factors labeled ‘unfair treatment’ (15 items) and ‘denial of the ethnic right’ (12 items), which accounted for 60.92% of the total variance. The two subscales demonstrated clear reliability, with internal consistency above .70. The convergent validity of the Scale was supported by an expected pattern of correlations (positive and significant correlation) between the scores of unfair treatment and denial of the ethnic right and the score of the Perceived Ethnic and Racial Discrimination by Peers Scale. The results suggest that the MMEDS is a reliable and valid measure. However, further studies need to be carried out with other sample groups to validate the Scale.

Keywords: Factor structure, psychometric properties, exploratory factor analysis.

Downloads 2445

374 Assessment of Health and Safety Item on Construction Sites in Ondo State

Authors: Ikumapayi Catherine Mayowa

Abstract:

The well-being of human beings on construction sites is very important. Much manpower has been lost through accidents which kill workers or make them physically unfit to carry out construction activities, and this in turn has multiple effects on the whole economy. Thus it is necessary to put all safety items and regulations in place before construction activities commence. This study was carried out in Ondo State, Nigeria, to establish and analyse the state of health and safety of construction workers in the state. The study was done using a first-hand observation method: 50 construction project sites were visited in 10 major towns of Ondo State, questionnaires were distributed and the results were analysed. The results show that construction workers are exposed to many construction site hazards due to the lack of adequate safety programmes and the non-provision of appropriate safety materials for workers on site. From the data obtained for each site visited and the statistical analysis, it can be concluded that the occurrence of accidents on construction sites depends significantly on the safety facilities available on the sites. The regression statistics show that the significance level of the dependence of accident occurrence on the availability of safety items on site is 0.0362, which is less than the 0.05 maximum significance level required. Therefore, a vital way of sustaining our building strategy is to give detailed attention to the provision of adequate health and safety items on construction sites, which will reduce the occurrence of accidents, loss of manpower and death of skilled workers, among others.

Keywords: Construction sites, health, safety, welfare.

Downloads 1665

373 Analysis Model for the Relationship of Users, Products, and Stores on Online Marketplace Based on Distributed Representation

Authors: Ke He, Wumaier Parezhati, Haruka Yamashita

Abstract:

Recently, online marketplaces in the e-commerce industry, such as Rakuten and Alibaba, have become some of the most popular online marketplaces in Asia. On these shopping websites, consumers can select and purchase products from a large number of stores. Additionally, consumers of the e-commerce site have to register their name, age, gender, and other information in advance to access their registered account. Therefore, establishing a method for analyzing consumer preferences from both the store and the product side is required. This study uses the Doc2Vec method, which has been studied in the field of natural language processing. Doc2Vec has been used in many cases to extract semantic relationships between documents (here representing consumers) and words (here representing products) in the field of document classification. This concept is applicable to representing the relationship between users and items; however, the problem is that one more factor (i.e., shops) needs to be considered in Doc2Vec. More precisely, a method for analyzing the relationship between consumers, stores, and products is required. The purpose of our study is to combine the analysis of the Doc2Vec model for users and shops, and for users and items, in the same feature space. This method enables the calculation of similar shops and items for each user. In this study, we analyze real data accumulated in an online marketplace and demonstrate the efficiency of the proposal.

Keywords: Doc2Vec, marketing, online marketplace, recommendation system.

Downloads 414

372 A Recommender System Fusing Collaborative Filtering and User’s Review Mining

Authors: Seulbi Choi, Hyunchul Ahn

Abstract:

The collaborative filtering (CF) algorithm has been widely used for recommender systems in both academic and practical applications. It basically generates recommendation results using users’ numeric ratings. However, the additional use of information other than user ratings may lead to better accuracy of CF. Considering that, with the advent of Web 2.0, many people are likely to share their honest opinions on the items they purchased recently, users’ reviews can be regarded as a new informative source for identifying users’ preferences accurately. Against this background, this study presents a hybrid recommender system that fuses CF and user review mining. Our system adopts conventional memory-based CF, but it is designed to use both a user’s numeric ratings and his/her text reviews of the items when calculating similarities between users.
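One way to picture a user similarity that draws on both ratings and review text is the sketch below. It is only an illustration under stated assumptions: the abstract does not say how the two signals are combined, so the linear blend, the `alpha` weight, the cosine measure over co-rated items and the word-overlap (Jaccard) measure over reviews are all choices made for this example rather than the authors' method.

```python
from math import sqrt


def rating_cosine(ratings_a, ratings_b):
    """Cosine similarity over items both users rated (dicts: item -> rating)."""
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def review_jaccard(text_a, text_b):
    """Word-overlap similarity between two review texts (assumed measure)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def hybrid_similarity(ratings_a, ratings_b, text_a, text_b, alpha=0.5):
    """Blend rating and review similarity; `alpha` is a hypothetical weight."""
    return (alpha * rating_cosine(ratings_a, ratings_b)
            + (1 - alpha) * review_jaccard(text_a, text_b))


# Hypothetical users: ratings keyed by item id, plus one review text each.
alice = ({"cam": 5, "lens": 3}, "sharp images and fast autofocus")
bob = ({"cam": 4, "bag": 2}, "fast autofocus but heavy body")
print(round(hybrid_similarity(alice[0], bob[0], alice[1], bob[1]), 3))
```

In a memory-based CF pipeline, a score like this would replace the rating-only similarity when selecting a user's nearest neighbours.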

Keywords: Recommender system, collaborative filtering, text mining, review mining.

Downloads 1532

371 Content Based Sampling over Transactional Data Streams

Authors: Mansour Tarafdar, Mohammad Saniee Abade

Abstract:

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS, a content-based sampling algorithm that works on a landmark window model of data streams and preserves a more informed sample in the sample space. The algorithm, which is based on closed frequent itemset mining, first initializes a concept lattice using initial data, then updates the lattice structure using an incremental mechanism. The incremental mechanism inserts, updates and deletes nodes in the concept lattice in batches. The presented algorithm extracts the final samples on user demand. Experimental results show the accuracy of CFISDS on synthetic and real datasets, although CFISDS is not faster than existing sampling algorithms such as Z and DSS.

Keywords: Sampling, data streams, closed frequent item set mining.

Downloads 1666

370 Machine Learning Based Approach for Measuring Promotion Effectiveness in Multiple Parallel Promotions’ Scenarios

Authors: Revoti Prasad Bora, Nikita Katyal

Abstract:

Promotion is a key element in the retail business. Thus, analysis of promotions to quantify their effectiveness in terms of Revenue and/or Margin is an essential activity in the retail industry. However, measuring the sales/revenue uplift relies on estimation, as the actual sales/revenue without the promotion cannot be observed. Further, the presence of Halo and Cannibalization in a multiple parallel promotions scenario complicates the problem. Calculating the Baseline by considering inter-brand/competitor items, or accounting for the impact of Halo and Cannibalization on Revenue while treating the Baseline as an interpretation of each item's unit sales in neighboring non-promotional weeks individually, may not capture the overall Revenue uplift in the case of multiple parallel promotions. Hence, this paper proposes a Machine Learning based method for calculating the Revenue uplift by considering the Halo and Cannibalization impact on the Baseline and the Revenue. In the first section of the proposed methodology, the Baseline of an item is calculated by incorporating the impact of the promotions on its related items. In the later section, the Revenue of an item is calculated by considering both Halo and Cannibalization impacts. Hence, this methodology enables correct calculation of the overall Revenue uplift due to a given promotion.
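To make the terms concrete, the sketch below shows the simple neighboring-week baseline that the abstract describes as insufficient on its own, together with an overall uplift that adds the effects on related items. It is not the paper's machine learning method. The function names, the `window` parameter, and the sign convention (positive for Halo, negative for Cannibalization) are assumptions for illustration.

```python
def baseline_from_neighbors(weekly_revenue, promo_weeks, week, window=2):
    """Average revenue of nearby non-promotional weeks (a naive baseline)."""
    neighbors = [
        weekly_revenue[w]
        for w in range(week - window, week + window + 1)
        if w != week and 0 <= w < len(weekly_revenue) and w not in promo_weeks
    ]
    if not neighbors:
        raise ValueError("no non-promotional neighboring week available")
    return sum(neighbors) / len(neighbors)


def item_uplift(weekly_revenue, promo_weeks, week, window=2):
    """Revenue in the promoted week minus the estimated baseline."""
    baseline = baseline_from_neighbors(weekly_revenue, promo_weeks, week, window)
    return weekly_revenue[week] - baseline


def overall_uplift(own_uplift, related_item_effects):
    """Add effects on related items: positive = Halo, negative = Cannibalization."""
    return own_uplift + sum(related_item_effects)
```

With weekly revenue `[100, 100, 180, 100, 100]` and a promotion in week 2, the item uplift is 80. If one related item gains 10 and another loses 25, the overall uplift drops to 65.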

Keywords: Halo, cannibalization, promotion, baseline, temporary price reduction, retail, elasticity, cross price elasticity, machine learning, random forest, linear regression.

Downloads 1231

369 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition, applying the technique to a data set consisting of the text of the Bible, split into verses.
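The weighted-distance idea can be sketched in a few lines. The abstract does not give the weight function, so the `1 / distance` default, the window semantics (token distance strictly below `window`), and the function name are assumptions for this example.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Build a cooccurrence graph with distance-dependent edge weights.

    Returns {(entity_a, entity_b): accumulated_weight} with sorted pair keys.
    """
    graph = defaultdict(float)
    # Keep only the positions where a named item occurs.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in entities]
    for a in range(len(positions)):
        i, first = positions[a]
        for b in range(a + 1, len(positions)):
            j, second = positions[b]
            distance = j - i
            if distance >= window:
                break  # later occurrences are even further away
            if first != second:
                graph[tuple(sorted((first, second)))] += weight(distance)
    return dict(graph)
```

Passing `weight=lambda d: 1.0` recovers the plain sliding-window count, which makes it easy to compare the two variants on the same text.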

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Downloads 635

368 Matching Current Search with Future Postings

Authors: Kim Nee Goh, Viknesh Kumar Naleyah

Abstract:

Online trading is an alternative to conventional shopping. People trade goods that are new or pre-owned. However, there are times when a user is not able to find the wanted items online. This is because the items may not have been posted yet, which ends the search. A conventional search mechanism only works by matching search criteria (requirements) against the data available in a particular database. This research aims to match current search requirements with future postings, which adds a time factor to the conventional search method. A Car Matching Alert System (CMAS) prototype was developed to test the matching algorithm. When a buyer's search returns no result, the system saves the search, and the buyer is alerted if a match is found among future postings. The algorithm developed is useful, as it can be applied in other search contexts.
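The save-and-alert behaviour described above could look roughly like the following sketch. The class names, the exact-match rule on every criterion, and the choice to drop a saved search after its first alert are all assumptions made for illustration; the abstract does not specify how CMAS matches or stores searches.

```python
from dataclasses import dataclass, field


@dataclass
class SavedSearch:
    buyer: str
    criteria: dict


@dataclass
class MatchingAlertSystem:
    postings: list = field(default_factory=list)
    saved_searches: list = field(default_factory=list)

    @staticmethod
    def matches(criteria, posting):
        """A posting matches when every criterion equals the posting's value."""
        return all(posting.get(key) == value for key, value in criteria.items())

    def search(self, buyer, criteria):
        """Search current postings; save the search if nothing is found."""
        results = [p for p in self.postings if self.matches(criteria, p)]
        if not results:
            self.saved_searches.append(SavedSearch(buyer, criteria))
        return results

    def post(self, posting):
        """Add a posting and return the buyers whose saved searches it satisfies."""
        self.postings.append(posting)
        alerted, remaining = [], []
        for saved in self.saved_searches:
            target = alerted if self.matches(saved.criteria, posting) else remaining
            target.append(saved)
        self.saved_searches = remaining
        return [saved.buyer for saved in alerted]
```

The time factor enters through `post`: a search that failed earlier is re-evaluated whenever a new posting arrives.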

Keywords: Matching algorithm, online trading, search, future postings, car matching.

Downloads 1372

367 Signed Approach for Mining Web Content Outliers

Authors: G. Poonkuzhali, K. Thiagarajan, K. Sarukesi, G. V. Uma

Abstract:

The emergence of the Internet has brought about a revolution in information storage and retrieval. As most of the data on the web is unstructured and contains a mix of text, video, audio, etc., there is a need to mine information that caters to the specific needs of users without losing important hidden information. Developing user-friendly, automated tools that provide relevant information quickly has thus become a major challenge in web mining research. Most existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones, which are likely to contain outlying data such as noise and irrelevant or redundant data. This paper focuses on the Signed approach and full word matching against an organized domain dictionary for mining web content outliers. The Signed approach yields the relevant web documents as well as the outlying web documents. As the dictionary is organized by the number of characters in a word, searching and retrieval of documents take less time and less space.
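The dictionary organization mentioned in the abstract can be illustrated as follows. This sketch covers only the length-based lookup with full word matching. The Signed approach itself is not described in the abstract and is not implemented here, and the function names and the relevance ratio are assumptions for this example.

```python
from collections import defaultdict


def build_length_index(words):
    """Group dictionary words by their number of characters."""
    index = defaultdict(set)
    for word in words:
        index[len(word)].add(word.lower())
    return index


def relevance_ratio(document, index):
    """Fraction of document tokens that fully match a dictionary word.

    Each token is compared only against dictionary words of the same length.
    """
    tokens = document.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in index.get(len(tok), ()))
    return hits / len(tokens)
```

A document with a low ratio against the domain dictionary would be a candidate outlier; where to set that threshold is left open here.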

Keywords: Outliers, Relevant document, Signed Approach, Web content mining, Web documents.

Downloads 2309

366 A CTL Specification of Serializability for Transactions Accessing Uniform Data

Authors: Rafat Alshorman, Walter Hussak

Abstract:

Existing work in temporal logic on representing the execution of infinitely many transactions uses linear-time temporal logic (LTL) and only models two-step transactions. In this paper, we use the comparatively efficient branching-time computational tree logic CTL and extend the transaction model to a class of multi-step transactions, by introducing distinguished propositional variables to represent the read and write steps of n multi-step transactions accessing m data items infinitely many times. We prove that the well-known correspondence between acyclicity of conflict graphs and serializability for finite schedules extends to infinite schedules. Furthermore, in the case of transactions accessing the same set of data items in (possibly) different orders, serializability corresponds to the absence of cycles of length two. This result is used to give an efficient encoding of the serializability condition into CTL.
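The finite-schedule correspondence that the paper starts from can be checked directly: build the conflict graph and test it for cycles. The sketch below covers only that classical finite case, not the infinite schedules or the CTL encoding of the paper. The schedule representation as `(transaction, operation, item)` triples and the function names are assumptions for this example.

```python
def conflict_graph(schedule):
    """Edges Ti -> Tj for conflicting steps where Ti's step comes first.

    schedule: list of (transaction, operation, item), operation 'r' or 'w'.
    Two steps conflict if they belong to different transactions, access the
    same item, and at least one of them is a write.
    """
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def is_conflict_serializable(schedule):
    """A finite schedule is conflict serializable iff its conflict graph is acyclic."""
    edges = conflict_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    adjacency = {n: [b for a, b in edges if a == n] for n in nodes}
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = in progress, 2 = done

    def has_cycle(node):
        state[node] = 1
        for nxt in adjacency[node]:
            if state[nxt] == 1 or (state[nxt] == 0 and has_cycle(nxt)):
                return True
        state[node] = 2
        return False

    return not any(state[n] == 0 and has_cycle(n) for n in nodes)
```

In the special case highlighted in the abstract, where all transactions access the same set of data items, the paper's result says it suffices to look for cycles of length two, i.e. pairs with edges in both directions.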

Keywords: computational tree logic, serializability, multi-step transactions.

Downloads 1128

365 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

Authors: G. M. Karthik, Ramachandra V. Pujeri

Abstract:

The Generalized Center String (GCS) problem is generalized from the Common Approximate Substring and Common Substring problems. GCS is known to be NP-hard, and the difficulty lies in the explosion of potential candidates. In any particular biological gene process, neither the longest center string nor which sequences may contain no motif is known in advance. GCS can be solved by frequent pattern mining techniques and is known to be fixed-parameter tractable for a fixed input sequence length and symbol set size. Efficient methods known as Bpriori algorithms can solve GCS with reasonable time/space complexity. The Bpriori 2 and Bpriori 3-2 algorithms have been proposed to find center strings of any length and the positions of all their instances in the input sequences. In this paper, we reduce the time/space complexity of the Bpriori algorithm with the Constraint Based Frequent Pattern (CBFP) mining technique, which integrates the ideas of Constraint Based Mining and FP-tree mining. The CBFP mining technique solves the GCS problem not only for center strings of any length but also for the positions of all their mutated copies in the input sequences. It constructs a trie-like FP-tree to represent the mutated copies of center strings of any length, along with constraints that restrain the growth of the consensus tree. The complexity of the CBFP mining technique and of the Bpriori algorithm is analyzed for the worst case and the average case. The correctness of the algorithm, compared with the Bpriori algorithm, is shown using artificial data.
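To clarify what "mutated copies" of a center string means, the sketch below locates approximate occurrences of a candidate string in each input sequence. It is neither Bpriori nor CBFP. It assumes mutations are measured by Hamming distance and that a center string must appear in at least `min_sequences` sequences, and all names and parameters are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mutated_copies(center, sequence, max_mismatches):
    """Start positions where `sequence` contains a copy of `center`
    with at most `max_mismatches` substitutions."""
    k = len(center)
    return [
        i
        for i in range(len(sequence) - k + 1)
        if hamming(center, sequence[i:i + k]) <= max_mismatches
    ]


def is_center_string(center, sequences, max_mismatches, min_sequences):
    """True if enough sequences contain at least one mutated copy."""
    hits = sum(1 for s in sequences if mutated_copies(center, s, max_mismatches))
    return hits >= min_sequences
```

Checking one candidate this way is cheap; the candidate explosion noted in the abstract comes from having to consider strings of every length over the whole symbol set.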

Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.

Downloads 1645

364 Goal Based Episodic Processing in Implicit Learning

Authors: Peter A. Bibby

Abstract:

Research has suggested that implicit learning tasks may rely on episodic processing to generate above chance performance on the standard classification tasks. The current research examines the invariant features task (McGeorge and Burton, 1990) and argues that such episodic processing is indeed important. The results of the experiment suggest that both rejection and similarity strategies are used by participants in this task to simultaneously reject unfamiliar items and to accept (falsely) familiar items. Primarily these decisions are based on the presence of low or high frequency goal based features of the stimuli presented in the incidental learning phase. It is proposed that a goal based analysis of the incidental learning task provides a simple step in understanding which features of the episodic processing are most important for explaining the match between incidental, implicit learning and test performance.

Keywords: Episodic processing, incidental learning, implicit learning, invariant learning.

Downloads 1393