Search results for: Data Mining Community
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8056

Search results for: Data Mining Community

7516 Assessment of Sustainability in the Wulo Abiye Watershed, Central Highlands of Ethiopia

Authors: Getabalew Derib, Arragaw Alemayehu

Abstract:

Assessing the sustainability of watersheds holds significant importance for regional natural resource management and to achieve sustainable development. This study investigated the sustainability of the Wulo Abiye watershed, central highlands of Ethiopia. The sustainability status of the watershed was evaluated by using 17 indicators representing the economic, social, and environmental dimensions of sustainable development goals (SDGs) based on the local and existing conditions of the watershed. The results indicated that environmental sustainability was at a ‘high’ level, while social and economic sustainability and the aggregate index were at ‘moderate’ levels. The overall level of community participation in the planning and evaluation phases of watershed management was at ‘low’ levels. The implementation phase was at ‘high’ level. Overall, the sustainability status of the watershed management practices and the level of community participation were at a moderate level. The study concluded that integrated support is needed to overcome the identified challenges to achieve sustainable development in watersheds.

Keywords: Wulo Abiye watershed, community participation, watershed management, sustainable development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 71
7515 The Socio-Technical Indicator Model: Socially-Sensitive CMC Technology, with an Implementation of Representative Moderation

Authors: Zach-Amaury Boufoy-Bastick, Lenandlar Singh

Abstract:

Computer-mediated communication technologies which provide for virtual communities have typically evolved in a cross-dichotomous manner, such that technical constructs of the technology have evolved independently from the social environment of the community. The present paper analyses some limitations of current implementations of computer-mediated communication technology that are implied by such a dichotomy, and discusses their inhibiting effects on possible developments of virtual communities. A Socio-Technical Indicator Model is introduced that utilizes integrated feedback to describe, simulate and operationalise increasing representativeness within a variety of structurally and parametrically diverse systems. In illustration, applications of the model are briefly described for financial markets and for eco-systems. A detailed application is then provided to resolve the aforementioned technical limitations of moderation on the evolution of virtual communities. The application parameterises virtual communities to function as self-transforming social-technical systems which are sensitive to emergent and shifting community values as products of on-going communications within the collective.

Keywords: Virtual community, e-democracy, feedback systems, moderation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1570
7514 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores

Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay

Abstract:

Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising of a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies  the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online-hard-negative-mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.

Keywords: Retail stores, Faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 583
7513 Using Data Mining Methodology to Build the Predictive Model of Gold Passbook Price

Authors: Chien-Hui Yang, Che-Yang Lin, Ya-Chen Hsu

Abstract:

Gold passbook is an investing tool that is especially suitable for investors to do small investment in the solid gold. The gold passbook has the lower risk than other ways investing in gold, but its price is still affected by gold price. However, there are many factors can cause influences on gold price. Therefore, building a model to predict the price of gold passbook can both reduce the risk of investment and increase the benefits. This study investigates the important factors that influence the gold passbook price, and utilize the Group Method of Data Handling (GMDH) to build the predictive model. This method can not only obtain the significant variables but also perform well in prediction. Finally, the significant variables of gold passbook price, which can be predicted by GMDH, are US dollar exchange rate, international petroleum price, unemployment rate, whole sale price index, rediscount rate, foreign exchange reserves, misery index, prosperity coincident index and industrial index.

Keywords: Gold price, Gold passbook price, Group Method ofData Handling (GMDH), Regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2285
7512 Annual Power Load Forecasting Using Support Vector Regression Machines: A Study on Guangdong Province of China 1985-2008

Authors: Zhiyong Li, Zhigang Chen, Chao Fu, Shipeng Zhang

Abstract:

Load forecasting has always been the essential part of an efficient power system operation and planning. A novel approach based on support vector machines is proposed in this paper for annual power load forecasting. Different kernel functions are selected to construct a combinatorial algorithm. The performance of the new model is evaluated with a real-world dataset, and compared with two neural networks and some traditional forecasting techniques. The results show that the proposed method exhibits superior performance.

Keywords: combinatorial algorithm, data mining, load forecasting, support vector machines

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1646
7511 The Relationship between the Architectural Style of the Area’s Residential Waterfront Communities of Bangnoi Floating Bangkhonthi Districts Samut Songkhram Province

Authors: Kunyaphat Thanakunwutthirot

Abstract:

Bangnoi Floating Market located at Bangkhonthi Districts Samut Songkhram Province is a valuable architectural market. The lifestyle of the community's relationship with the living space and the relationship between the architectural style of the area's residential waterfront communities of Bangnoi Floating Bangkhonthi Districts Samut Songkhram Province, which deserves to be preserved. Therefore, this research it helps to know the value of the architectural style of the area's residential waterfront communities of Bangnoi Floating Bangkhonthi Districts SamutSongkhram Province, which deserves to be preserved.

Keywords: Bangnoi Floating Market, floor plan of riverside community architecture, riverside architectural identity, style of riverside community architecture utility space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735
7510 Community Perceptions and Attitudes Regarding Wildlife Crime in South Africa

Authors: Louiza C. Duncker, Duarte Gonçalves

Abstract:

Wildlife crime is a complex problem with many interconnected facets, which are generally responded to in parts or fragments in efforts to “break down” the complexity into manageable components. However, fragmentation increases complexity as coherence and cooperation become diluted. A whole-of-society approach has been developed towards finding a common goal and integrated approach to preventing wildlife crime. As part of this development, research was conducted in rural communities adjacent to conservation areas in South Africa to define and comprehend the challenges faced by them, and to understand their perceptions of wildlife crime. The results of the research showed that the perceptions of community members varied - most were in favor of conservation and of protecting rhinos, only if they derive adequate benefit from it. Regardless of gender, income level, education level, or access to services, conservation was perceived to be good and bad by the same people. Even though people in the communities are poor, a willingness to stop rhino poaching does exist amongst them, but their perception of parks not caring about people triggered an attitude of not being willing to stop, prevent or report poaching. Understanding the nuances, the history, the interests and values of community members, and the drivers behind poaching mind-sets (intrinsic or driven by transnational organized crime) is imperative to create sustainable and resilient communities on multiple levels that make a substantial positive impact on people’s lives, but also conserve wildlife for posterity.

Keywords: Conservation, community perceptions, wildlife crime, rhino poaching, interest and value creation, whole-of-society approach.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1879
7509 Measurement and Evaluation of Outdoor Lighting Environment at Night in Residential Community in China: A Case Study of Hangzhou

Authors: Jiantao Weng, Yujie Zhao

Abstract:

With the improvement of living quality and demand for nighttime activities in China, the current situation of outdoor lighting environment at night needs to be assessed. Lighting environment at night plays an important role to guarantee night safety. Two typical residential communities in Hangzhou were selected. A comprehensive test method of outdoor lighting environment at night was established. The road, fitness area, landscape, playground and entrance were included. Field measurements and questionnaires were conducted in these two residential communities. The characteristics of residents’ habits and the subjective evaluation on different aspects of outdoor lighting environment at night were collected via questionnaire. A safety evaluation system on the outdoor lighting environment at night in the residential community was established. The results show that there is a big difference in illumination in different areas. The lighting uniformities of roads cannot meet the requirement of lighting standard in China. Residents pay more attention to the lighting environment of the fitness area and road than others. This study can provide guidance for the design and management of outdoor lighting environment at night.

Keywords: Residential community, lighting environment, night, field measurement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 656
7508 Towards an Understanding of Social Capital in an Online Community of Filipino Music Artists

Authors: Jerome V. Cleofas

Abstract:

Cyberspace has become a more viable arena for budding artists to share musical acts through digital forms. The increasing relevance of online communities has attracted scholars from various fields demonstrating its influence on social capital. This paper extends this understanding of social capital among Filipino music artists belonging to the SoundCloud Philippines Facebook Group. The study makes use of various qualitative data obtained from key-informant interviews and participant observation of online and physical encounters, analyzed using the case study approach. Soundcloud Philippines has over seven-hundred members and is composed of Filipino singers, instrumentalists, composers, arrangers, producers, multimedia artists and event managers. Group interactions are a mix of online encounters based on Facebook and SoundCloud and physical encounters through meet-ups and events. Benefits reaped from the community are informational, technical, instrumental, promotional, motivational and social support. Under the guidance of online group administrators, collaborative activities such as music productions, concerts and events transpire. Most conflicts and problems arising are resolved peacefully. Social capital in SoundCloud Philippines is mobilized through recognition, respect and reciprocity.

Keywords: Facebook, music artists, online communities, social capital.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1748
7507 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4876
7506 Developing a Sustainable Educational Portal for the D-Grid Community

Authors: Viktor Achter, Sebastian Breuers, Marc Seifert, Ulrich Lang, Joachim Götze, Bernd Reuther, Paul Müller

Abstract:

Within the last years, several technologies have been developed to help building e-learning portals. Most of them follow approaches that deliver a vast amount of functionalities, suitable for class-like learning. The SuGI project, as part of the D-Grid (funded by the BMBF), targets on delivering a highly scalable and sustainable learning solution to provide materials (e.g. learning modules, training systems, webcasts, tutorials, etc.) containing knowledge about Grid computing to the D-Grid community. In this article, the process of the development of an e-learning portal focused on the requirements of this special user group is described. Furthermore, it deals with the conceptual and technical design of an e-learning portal, addressing the special needs of heterogeneous target groups. The main focus lies on the quality management of the software development process, Web templates for uploading new contents, the rich search and filter functionalities which will be described from a conceptual as well as a technical point of view. Specifically, it points out best practices as well as concepts to provide a sustainable solution to a relatively unknown and highly heterogeneous community.

Keywords: D-Grid, e-learning, e-science, Grid computing, SuGI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1345
7505 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2612
7504 Indigenous Dayak People’s Perceptions of Wildlife Loss and Gain Related to Oil Palm Development

Authors: A. Sunkar, A. Saraswati, Y. Santosa

Abstract:

Controversies surrounding the impacts of oil palm plantations have resulted in some heated debates, especially concerning biodiversity loss and indigenous people well-being. The indigenous people of Dayak generally used wildlife to fulfill their daily needs thus were assumed to have experienced negative impacts due to oil palm developments within and surrounding their settlement areas. This study was conducted to identify the characteristics of the Dayak community settled around an oil palm plantation, to determine their perceptions of wildlife loss or gain as the results of the development of oil palm plantations, and to identify the determinant characteristic of the perceptions. The research was conducted on March 2018 in Nanga Tayap and Tajok Kayong Villages, which were located around the oil palm plantation of NTYE of Ketapang, West Kalimantan-Indonesia. Data were collected through in depth-structured interview, using closed and semi-open questionnaires and three-scale Likert statements. Interviews were conducted with 74 respondents using accidental sampling, and categorized into respondents who were dependent on oil palm for their livelihoods and those who were not. Data were analyzed using quantitative statistics method, Likert Scale, Chi-Square Test, Spearman Test, and Mann-Whitney Test. The research found that the indigenous Dayak people were aware of wildlife species loss and gain since the establishment of the plantation. Nevertheless, wildlife loss did not affect their social, economic, and cultural needs since they could find substitutions. It was found that prior to the plantation’s development, the local Dayak communities were already slowly experiencing some livelihood transitions through local village development. The only determinant characteristic of the community that influenced their perceptions of wildlife loss/gain was level of education.

Keywords: Wildlife, oil palm plantations, indigenous Dayak, biodiversity loss and gain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1366
7503 Eclectic Rule-Extraction from Support Vector Machines

Authors: Nahla Barakat, Joachim Diederich

Abstract:

Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors as well as the parameters associated with them. The approach includes three stages; training, propositional rule-extraction and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.

Keywords: Data mining, hybrid rule-extraction algorithms, medical diagnosis, SVMs

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1708
7502 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559
7501 Cohabiting in Multiethnic Community: Forms, Representations and Images of the Diversity

Authors: Gioacchino Lavanco, Cinzia Novara, Floriana Romano, Elisabetta Di Giovanni

Abstract:

Modern culture, based on disinhibition of cultural trends and on heterodirection, is promoting openmindedness attitudes towards ethnic diversity, but on the other hand also new forms of social representations of the foreigner. Social representation is situated between the psychic field and the social one; it is the representation of oneself and of the other one, hanging between social categories and individual inner world. We will produce the results of a research on the representation of the foreigner, built on the type of prejudice prevailing among middle-low or middle-high educational qualification subjects, in which prejudicial attitudes seem to descend from precise mental images of the foreigner.

Keywords: Community, diversity, integration, prejudice, representations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1251
7500 Spatial Clustering Model of Vessel Trajectory to Extract Sailing Routes Based on AIS Data

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

The automatic extraction of shipping routes is advantageous for intelligent traffic management systems to identify events and support decision-making in maritime surveillance. At present, there is a high demand for the extraction of maritime traffic networks that resemble the real traffic of vessels accurately, which is valuable for further analytical processing tasks for vessels trajectories (e.g., naval routing and voyage planning, anomaly detection, destination prediction, time of arrival estimation). With the help of big data and processing huge amounts of vessels’ trajectory data, it is possible to learn these shipping routes from the navigation history of past behaviour of other, similar ships that were travelling in a given area. In this paper, we propose a spatial clustering model of vessels’ trajectories (SPTCLUST) to extract spatial representations of sailing routes from historical Automatic Identification System (AIS) data. The whole model consists of three main parts: data preprocessing, path finding, and route extraction, which consists of clustering and representative trajectory extraction. The proposed clustering method provides techniques to overcome the problems of: (i) optimal input parameters selection; (ii) the high complexity of processing a huge volume of multidimensional data; (iii) and the spatial representation of complete representative trajectory detection in the context of trajectory clustering algorithms. The experimental evaluation showed the effectiveness of the proposed model by using a real-world AIS dataset from the Port of Halifax. The results contribute to further understanding of shipping route patterns. This could aid surveillance authorities in stable and sustainable vessel traffic management.

Keywords: Vessel trajectory clustering, trajectory mining, Spatial Clustering, marine intelligent navigation, maritime traffic network extraction, sdailing routes extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 455
7499 Modeling Language for Constructing Solvers in Machine Learning: Reductionist Perspectives

Authors: Tsuyoshi Okita

Abstract:

For a given specific problem an efficient algorithm has been the matter of study. However, an alternative approach orthogonal to this approach comes out, which is called a reduction. In general for a given specific problem this reduction approach studies how to convert an original problem into subproblems. This paper proposes a formal modeling language to support this reduction approach in order to make a solver quickly. We show three examples from the wide area of learning problems. The benefit is a fast prototyping of algorithms for a given new problem. It is noted that our formal modeling language is not intend for providing an efficient notation for data mining application, but for facilitating a designer who develops solvers in machine learning.

Keywords: Formal language, statistical inference problem, reduction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1328
7498 Using the Transtheoretical Model to Investigate Stages of Change in Regular Volunteer Service among Seniors in Community

Authors: Pei-Ti Hsu, I-Ju Chen, Jeu-Jung Chen, Cheng-Fen Chang, Shiu-Yan Yang

Abstract:

Background: Taiwan now is an aging society. Research on the elderly should not be confined to caring for seniors, but should also be focused on ways to improve health and the quality of life. Senior citizens who participate in volunteer services could become less lonely, have new growth opportunities, and regain a sense of accomplishment. Thus, the question of how to get the elderly to participate in volunteer service is worth exploring. Objective: Apply the Transtheoretical Model to understand stages of change in regular volunteer service and voluntary service behaviour among the seniors. Methods: 1525 adults over the age of 65 from the Renai district of Keelung City were interviewed. The research tool was a self-constructed questionnaire, and individual interviews were conducted to collect data. Then the data was processed and analyzed using the IBM SPSS Statistics 20 (Windows version) statistical software program. Results: In the past six months, research subjects averaged 9.92 days of volunteer services. A majority of these elderly individuals had no intention to change their regular volunteer services. We discovered that during the maintenance stage, the self-efficacy for volunteer services was higher than during all other stages, but self-perceived barriers were less during the preparation stage and action stage. Self-perceived benefits were found to have an important predictive power for those with regular volunteer service behaviors in the previous stage, and self-efficacy was found to have an important predictive power for those with regular volunteer service behaviors in later stages. Conclusions/Implications for Practice: The research results support the conclusion that community nursing staff should group elders based on their regular volunteer services change stages and design appropriate behavioral change strategies.

Keywords: Seniors, stages of change in regular volunteer services, volunteer service behavior, self-efficacy, self-perceived benefits.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1959
7497 Dynamic Features Selection for Heart Disease Classification

Authors: Walid MOUDANI

Abstract:

The healthcare environment is generally perceived as being information rich yet knowledge poor. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. In fact, valuable knowledge can be discovered from application of data mining techniques in healthcare system. In this study, a proficient methodology for the extraction of significant patterns from the Coronary Heart Disease warehouses for heart attack prediction, which unfortunately continues to be a leading cause of mortality in the whole world, has been presented. For this purpose, we propose to enumerate dynamically the optimal subsets of the reduced features of high interest by using rough sets technique associated to dynamic programming. Therefore, we propose to validate the classification using Random Forest (RF) decision tree to identify the risky heart disease cases. This work is based on a large amount of data collected from several clinical institutions based on the medical profile of patient. Moreover, the experts- knowledge in this field has been taken into consideration in order to define the disease, its risk factors, and to establish significant knowledge relationships among the medical factors. A computer-aided system is developed for this purpose based on a population of 525 adults. The performance of the proposed model is analyzed and evaluated based on set of benchmark techniques applied in this classification problem.

Keywords: Multi-Classifier Decisions Tree, Features Reduction, Dynamic Programming, Rough Sets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2532
7496 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 684
7495 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: Classifier ensemble, breast cancer survivability, data mining, SEER.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671
7494 Knowledge Discovery from Production Databases for Hierarchical Process Control

Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata

Abstract:

The paper gives the results of the project that was oriented on the usage of knowledge discoveries from production systems for needs of the hierarchical process control. One of the main project goals was the proposal of knowledge discovery model for process control. Specifics data mining methods and techniques was used for defined problems of the process control. The gained knowledge was used on the real production system thus the proposed solution has been verified. The paper documents how is possible to apply the new discovery knowledge to use in the real hierarchical process control. There are specified the opportunities for application of the proposed knowledge discovery model for hierarchical process control.

Keywords: Hierarchical process control, knowledge discovery from databases, neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1775
7493 Trust In Ad Media

Authors: Duygu Aydin

Abstract:

Advertising today has already become an integral part of human life as a building block of the consumer community. A component of the value chain of the media, advertising sector is struggling increasingly harder to find new methods to reach consumers. The tendency towards experimental marketing practices is increasing day by day, especially to divert consumers from the idea “They are selling something to me.” It is therefore considered a good idea to investigate the trust in ad media of consumers, who are today exposed to a great bulk of information from advertising sector. In this study, the current value of ad media for the young consumer will be investigated. Data on various ad media reliability will be comparatively analyzed and young consumers will be traced by including university students in the study. In this research, which will be performed on students studying at the Selçuk University (Turkey) by random sampling method, data will be obtained by survey technique and evaluated by a statistical analysis.

Keywords: Trust in advertising, ad medium, media.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1948
7492 Greek Tragedy on the American Stage until the First Half of 20th: Identities and Intersections between Greek, Italian and Jewish Community Theater

Authors: Papazafeiropoulou Olga

Abstract:

The purpose of this paper focuses on exploring the emergence of Greek tragedy on the American stage until the first half of 20th century through the intellectual processes and contributions of Greek, Italian and Jewish community theatre. Drawing on a wide range of sources, we trace Greek tragedy on the American stage, exploring the intricate processes of community’s theatre identities. The announcement aims to analyze the distinct yet related efforts of first Americans to intersect with Greek tragedy, searching simultaneously for the identities of immigrants. Eventually, the ancient drama became a vehicle for major developments in American theater as individual immigrant communities began their own theatrical endeavors. From 1903 the Greek actor Dionysios Taboularis arrived in America, while in the decade 1907-1917 Nikolaos Matsoukas and Petros Kotopoulis formed their own troupes. In 1930, the actress Marika Kotopoulis also arrived for a tour. Also, members of Vrysoula’s Pantopoulos formed the “Athenian Operetta”, with positive influence on Greek American theatre. The Italian immigrant community was located in the "Little Italies" housing throughout the city, and soon amateur theatrical clubs evolved. The earliest was the “Circolo Filodrammatico Italo-Americano” in 1880. Fausto Malzone’s artistic direction paved the way for the professional Italian immigrant theatre. Immigrant audiences heard the plays of their homeland, representing a major transition for this ethnic theatre. In 1900, the community had produced the major forces that created the professional theatre. By l905, the Italian American theatre had become firmly rooted in its professional phase. Yiddish Theater was both an import and a home-grown phenomenon. Since 1878, works began to be presented by Boris Tomashevsky. Between 1890 and 1940, many Yiddish theater companies appeared in America presenting adaptations of classical plays. American people first encounter with ancient texts was mostly academic. The tracing of tragedy as form and concept that follow the evolutionary course of domestic social, aesthetic and political ferments according to the international trends and currents, draws conclusion about the early Greek, Italian and Jewish immigrant’s theatre in relationship to the American scene until the first half of 20th century. Presumably, community theater acquired identity by intersecting with the spiritual reception of tragedy in America.

Keywords: American, Community, Greek, Italian, identities, intersection, Jewish, theatre, tragedy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24
7491 Platform Urbanism: Planning towards Hyper-Personalisation

Authors: Provides Ng

Abstract:

Platform economy is a peer-to-peer model of distributing resources facilitated by community-based digital platforms. In recent years, digital platforms are rapidly reconfiguring the public realm using hyper-personalisation techniques. This paper aims at investigating how urban planning can leapfrog into the digital age to help relieve the rising tension of the global issue of labour flow; it discusses the means to transfer techniques of hyper-personalisation into urban planning for plasticity using platform technologies. This research first denotes the limitations of the current system of urban residency, where the system maintains itself on the circulation of documents, which are data on paper. Then, this paper tabulates how some of the institutions around the world, both public and private, digitise data, and streamline communications between a network of systems and citizens using platform technologies. Subsequently, this paper proposes ways in which hyper-personalisation can be utilised to form a digital planning platform. Finally, this paper concludes by reviewing how the proposed strategy may help to open up new ways of thinking about how we affiliate ourselves with cities.

Keywords: Platform urbanism, hyper-personalisation, urban residency, digital data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 669
7490 The Effect of the Hourly Compensation on the Unemployment Rate: Comparative Analysis of United States, Canada and the United Kingdom Using Panel Data Regression Analysis

Authors: Ashiquer Rahman, Hares Mohammad, Ummey Salma

Abstract:

A country’s hourly compensation and unemployment rates are two of its most crucial components. They are not merely statistics but they have profound effects on individual, families, country, and the economy. They are inversely related to one another. The increased hourly compensation in the manufacturing sector can have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, in order to determine the effect of hourly compensation on unemployment rate, we use the panel data regression models and evaluate the expected link between hourly compensation and unemployment rate. We estimate the fixed effects model (FEM), evaluate the error components model (ECM), and determine which model (the FEM or ECM) is better through pooling all 60 observations. We then analyze and review the data by comparing countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of this extensive research on how the hourly compensation affects unemployment rate. Additionally, this paper offers relevant and useful guideline for the government and academic community to use an econometrics and social approach for the hourly compensation on unemployment rate to eliminate the problem.

Keywords: Hourly compensation, unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 68
7489 Destination Port Detection for Vessels: An Analytic Tool for Optimizing Port Authorities Resources

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

Port authorities have many challenges in congested ports to allocate their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages Automatic Identification System (AIS) messages to monitor vessel’s movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely, Reference Route of Trajectory (RRoT) to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring AIS messages. Our RRo method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using their recent movement. We evaluated five different similarity measures such as Discrete Frechet Distance (DFD), Dynamic Time ´ Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an f-measure of 99.08% using Dynamic Time Warping (DTW) similarity measure.

Keywords: Spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 696
7488 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, there have been a lot of efforts and studies are carried out in growing proficient tools for performing various tasks in big data. Recently big data have gotten a lot of publicity for their good reasons. Due to the large and complex collection of datasets it is difficult to process on traditional data processing applications. This concern turns to be further mandatory for producing various tools in big data. Moreover, the main aim of big data analytics is to utilize the advanced analytic techniques besides very huge, different datasets which contain diverse sizes from terabytes to zettabytes and diverse types such as structured or unstructured and batch or streaming. Big data is useful for data sets where their size or type is away from the capability of traditional relational databases for capturing, managing and processing the data with low-latency. Thus the out coming challenges tend to the occurrence of powerful big data tools. In this survey, a various collection of big data tools are illustrated and also compared with the salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3775
7487 Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

In the recent past Learning Classifier Systems have been successfully used for data mining. Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCSs models more or less, comprise four main components; a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm. The concatenate of the production rules in the LCS form the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier Systems. Other LCS that perform their GA at the rule level within a population are known as 'Mitchigan' Classifier Systems. The most predominant representation of the discovered knowledge is the standard production rules (PRs) in the form of IF P THEN D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski and Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: IF P THEN D UNLESS C, where Censor C is an exception to the rule. Such rules are employed in situations, in which conditional statement IF P THEN D holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the IF P THEN D part of CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper Pittsburgh style LCSs approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of fixed size set of CPRs. Suitable genetic operators are designed for the set of CPRs and individual CPRs and also appropriate fitness function is proposed that incorporates basic constraints on CPR. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.

Keywords: Censored Production Rule, Data Mining, GeneticAlgorithm, Learning Classifier System, Machine Learning, PittsburgApproach, , Reinforcement learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1530