Search results for: symbolic data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 40775

40685 Design Architecture Anti-Corruption Commission (KPK) According to KPK Law: Strong or Weak?

Authors: Moh Rizaldi, Ali Abdurachman, Indra Perwira

Abstract:

The largest demonstrations since the 1998 reforms, which took place in Indonesia over several days at the end of 2019, did not deter the People’s Representative Council (Dewan Perwakilan Rakyat or DPR) and the President from enacting Law No. 19 of 2019 (the KPK Law). The central issue is whether the change is intended to strengthen or to weaken the KPK. To address this question, the analysis focuses on two agency principles, the independence principle and the control principle, viewed through three lenses: legal substance, legal structure, and legal culture. The research method is normative, with conceptual, historical, and statute approaches. The argument of this paper is that the KPK Law has cut most of the KPK's authority; as a result, the KPK has become symbolic, or toothless, in combating corruption.

Keywords: control, independent, KPK, law no. 19 of 2019

40684 Pattern Recognition Using Feature Based Die-Map Clustering in the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

As big data analysis becomes increasingly important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of failure are closely related. The purpose of this study is to analyze patterns that affect the final test results using die-map-based clustering. Many studies have been conducted using die data from the semiconductor test process. However, such analysis is limited because the test data are less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. The study consists of three phases. In the first phase, a die map is created from the fail-bit data in each sub-area of the die. In the second phase, clustering is performed on the map data. In the third phase, patterns that affect the final test result are identified. Finally, the three phases are applied to actual industrial data, and the experimental results show the potential for field application.

Keywords: die-map clustering, feature extraction, pattern recognition, semiconductor manufacturing process

40683 The Economic Limitations of Defining Data Ownership Rights

Authors: Kacper Tomasz Kröber-Mulawa

Abstract:

This paper addresses the topic of data ownership from an economic perspective and provides examples of economic limitations of data property rights, identified using methods and approaches of the economic analysis of law. To build a background for the economic focus, the paper begins with a short overview of data and data ownership in the EU’s legal system. This includes a short introduction to its political and social importance and highlights relevant viewpoints. It stresses the importance of a Single Market for data, but also of far-reaching regulations of data governance and privacy (including the distinction between personal and non-personal data, and between data held by public bodies and private businesses). The main discussion builds upon this briefly outlined legal basis as well as methods and approaches of the economic analysis of law.

Keywords: antitrust, data, data ownership, digital economy, property rights

40682 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Omobayo Esan, Muienge Mbodila, Patrick Bowe

Abstract:

This paper focuses on a cost-effective storage architecture using a fog and cloud data storage gateway, and presents the design of a data privacy model and a data analytics framework for real-time analysis using machine learning methods. The paper begins with the system analysis, the system architecture and its component design, and the overall system operations. The results obtained for the data privacy model show that combining two or more data privacy models tends to give stronger privacy for the data. The results also show that a fog storage gateway has several advantages over traditional cloud storage: fog has reduced latency/delay, lower bandwidth consumption, and lower energy usage when compared with cloud storage, and fog storage will therefore help to lessen excessive cost. The paper dwells mainly on the system descriptions, with a focus on the research design and the framework design for the data privacy model, data storage, and real-time analytics. It also shows the major system components and their framework specifications. Lastly, the overall research system architecture is shown, together with its structure and interrelationships.

Keywords: IoT, fog, cloud, data analysis, data privacy

40681 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

The main goal of this paper is to introduce our design of a private cloud for storing large amounts of data, especially pictures, and to provide a good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for storing all data, and Hadoop to provide data analysis on unstructured data. High availability, virtual network management, logical separation of projects, and rapid deployment of physical servers to our environment were also required.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

40680 Distinguishing Substance from Spectacle in Violent Extremist Propaganda through Frame Analysis

Authors: John Hardy

Abstract:

Over the last decade, the world has witnessed an unprecedented rise in the quality and availability of violent extremist propaganda. This phenomenon has been fueled primarily by three interrelated trends: rapid adoption of online content mediums by creators of violent extremist propaganda, increasing sophistication of violent extremist content production, and greater coordination of content and action across violent extremist organizations. In particular, the self-styled ‘Islamic State’ attracted widespread attention from its supporters and detractors alike by mixing shocking video and imagery content in with substantive ideological and political content. Although this practice was widely condemned for its brutality, it proved to be effective at engaging with a variety of international audiences and encouraging potential supporters to seek further information. The reasons for the noteworthy success of this kind of shock-value propaganda content remain unclear, despite many governments’ attempts to produce counterpropaganda. This study examines violent extremist propaganda distributed by five terrorist organizations between 2010 and 2016, using material released by the Al Hayat Media Center of the Islamic State, Boko Haram, Al Qaeda, Al Qaeda in the Arabian Peninsula, and Al Qaeda in the Islamic Maghreb. The time period covers all issues of the infamous publications Inspire and Dabiq, as well as the most shocking video content released by the Islamic State and its affiliates. The study uses frame analysis to distinguish thematic from symbolic content in violent extremist propaganda by contrasting the ways that substantive ideological issues were framed against the use of symbols and violence to garner attention and to stylize propaganda.
The results demonstrate that thematic content focuses significantly on diagnostic frames, which explain violent extremist groups’ causes, and prognostic frames, which propose solutions to addressing or rectifying the cause shared by groups and their sympathizers. Conversely, symbolic violence is primarily stylistic and rarely linked to thematic issues or motivational framing. Frame analysis provides a useful preliminary tool in disentangling substantive ideological and political content from stylistic brutality in violent extremist propaganda. This provides governments and researchers a method for better understanding the framing and content used to design narratives and propaganda materials used to promote violent extremism around the world. Increased capacity to process and understand violent extremist narratives will further enable governments and non-governmental organizations to develop effective counternarratives which promote non-violent solutions to extremists’ grievances.

Keywords: countering violent extremism, counternarratives, frame analysis, propaganda, terrorism, violent extremism

40679 Spatial Structure of First-Order Voronoi for the Future of Roundabout Cairo Since 1867

Authors: Ali Essam El Shazly

Abstract:

The Haussmannization plan of Cairo in 1867 formed a regular network of roundabout spaces, though deteriorated at present. The method of identifying the spatial structure of roundabout Cairo for conservation matches the Voronoi diagram with the space syntax through their geometrical property of spatial convexity. In this initiative, the primary convex hull of first-order Voronoi adopts the integral and control measurements of space syntax on Cairo’s roundabout generators. The functional essence of royal palaces optimizes the roundabout structure in terms of spatial measurements and the symbolic Voronoi projection of 'Tahrir Roundabout' over the Giza Nile and Pyramids. Some roundabouts of major public and commercial landmarks surround the pole of 'Ezbekia Garden' with a higher control than integral measurements, which filter the new spatial structure from the adjacent traditional town. Nevertheless, the least integral and control measures correspond to the Voronoi contents of pollutant workshops and the plateau of old Cairo Citadel with the visual compensation of new royal landmarks on top. Meanwhile, the extended suburbs of infinite Voronoi polygons arrange high control generators of chateaux housing in 'garden city' environs. The point pattern of roundabouts determines the geometrical characteristics of Voronoi polygons. The measured lengths of Voronoi edges alternate between the zoned short range at the new poles of Cairo and the distributed structure of longer range. Nevertheless, the shortest range of generator-vertex geometry concentrates at 'Ezbekia Garden' where the crossways of vast Cairo intersect, which maximizes the variety of choice at different spatial resolutions. However, the symbolic 'Hippodrome', which is the largest public landmark, forms exclusive geometrical measurements, while structuring a most integrative roundabout to parallel the royal syntax.
An overview of the symbolic convex hull of Voronoi with space syntax interconnects Parisian Cairo with the spatial chronology of scattered monuments to conceive one universal Cairo structure. Accordingly, the approached methodology of 'Voronoi-syntax' prospects the future conservation of roundabout Cairo at the inferred city-level concept.

Keywords: roundabout Cairo, first-order Voronoi, space syntax, spatial structure

40678 Data Integration with Geographic Information System Tools for Rural Environmental Monitoring

Authors: Tamas Jancso, Andrea Podor, Eva Nagyne Hajnal, Peter Udvardy, Gabor Nagy, Attila Varga, Meng Qingyan

Abstract:

The paper deals with the conditions and circumstances of integrating remotely sensed data for rural environmental monitoring purposes. The main task is to make decisions during the integration process when the data sources differ in resolution, location, spectral channels, and dimension. In order to have exact knowledge of the integration and data fusion possibilities, it is necessary to know the properties (metadata) that characterize the data. The paper explains the joining of these data sources using their attribute data through a sample project. The resulting product will be used for rural environmental analysis.

Keywords: remote sensing, GIS, metadata, integration, environmental analysis

40677 Femicide in the News: Jewish and Arab Victims and Culprits in the Israeli Hebrew Media

Authors: Ina Filkobski, Eran Shor

Abstract:

This article explores how newspapers cover murder of women by family members and intimate partners. Three major Israeli newspapers were compared in order to analyse the coverage of Jewish and Arab victims and culprits and to examine whether and in what ways the media contribute to the construction of symbolic boundaries between minority and dominant social groups. A sample of some 459 articles that were published between 2013 and 2015 was studied using a systematic qualitative content analysis. Our findings suggest that the treatment of murder cases by the media varies according to the ethnicity of both victims and culprits. The murder of Jews by family members or intimate partners was framed as a shocking and unusual event, a result of the individual personality or pathology of the culprit. Conversely, when Arabs were the killers, murders were often explained by focusing on the culture of the ethnic group, described as traditional, violent, and patriarchal. In two-thirds of the cases in which Arabs were involved, so-called ‘honor killing’ or other cultural explanations were proposed as the motive for the murder. This was often the case even before a suspect was detected, while police investigation was at its very early stages, and often despite forceful denials from victims’ families. In case of Jewish culprits, more than half of the articles in our sample suggested mental disorder to explain the acts and cultural explanations were almost entirely absent. Beyond the emphasis on psychological vs. cultural explanations, newspaper articles also tend to provide much more detail about Jewish culprits than about Arabs. Such detailed examinations convey a desire to make sense of the event by understanding the supposedly unique and unorthodox nature of the killer. The detailed accounts were usually absent from the reports on Arab killers. 
Thus, even if reports do not explicitly offer cultural motivations for the murder, the fact that reports often remain laconic leaves people to draw their own conclusions, which would then be likely based on existing cognitive scripts and previous reports on family murders among Arabs. Such treatment contributes to the notion that Arab and Muslim cultures, religions, and nationalities are essentially misogynistic and adhere to norms of honor and shame that are radically different from those of modern societies, such as the Jewish-Israeli one. Murder within the family is one of the most dramatic occurrences in the social world, and in societies that see themselves as modern it is a taboo; an ultimate signifier of danger. We suggest that representations of murder provide a valuable prism for examining the construction of group boundaries. Our analysis, therefore, contributes to the scholarly effort to understand the creation and reinforcement of symbolic boundaries between ‘society’ and its ‘others’ by systematically tracing the media constructions of ‘otherness’. While our analysis focuses on Israel, studies on the United States, Canada, and various European countries with ethnically and racially heterogeneous populations, make it clear that the stigmatisation and exclusion of visible, religious, and language minorities are not unique to the Israeli case.

Keywords: comparative study of media coverage of minority and majority groups, construction of symbolic group boundaries, murder of women by family members and intimate partners, Israel, Jews, Arabs

40676 Analysis of Genomics Big Data in Cloud Computing Using Fuzzy Logic

Authors: Mohammad Vahed, Ana Sadeghitohidi, Majid Vahed, Hiroki Takahashi

Abstract:

In the genomics field, huge amounts of data have been produced by next-generation sequencers (NGS). Data volumes are growing very rapidly; it is postulated that more than one billion bases will be produced per year in 2020. The growth rate of the data produced is much faster than Moore's law in computer technology. This makes genomics data more difficult to deal with, for example when storing data, searching for information, and finding hidden information. An analysis platform for genomics big data therefore needs to be developed. Recently developed cloud computing enables us to deal with big data more efficiently. Hadoop is a distributed computing framework that lies at the core of Big Data as a Service (BDaaS). Although many services have adopted this technology, e.g., Amazon, there are only a few applications in the biology field. Here, we propose a new algorithm to deal more efficiently with genomics big data, e.g., sequencing data. Our algorithm consists of two parts. First, BDaaS is applied to handle the data more efficiently. Second, a hybrid method of MapReduce and fuzzy logic is applied for data processing; this step can be parallelized in implementation. Our algorithm has great potential in the computational analysis of genomics big data, e.g., de novo genome assembly and sequence similarity search. We will discuss our algorithm and its feasibility.

Keywords: big data, fuzzy logic, MapReduce, Hadoop, cloud computing

40675 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to extract useful information for prediction. It is a field of research that solves various types of problems. In data mining, classification is an important technique for classifying different kinds of data. Diabetes is a very common disease. This paper applies different classification techniques using the Waikato Environment for Knowledge Analysis (WEKA) to a diabetes dataset and identifies which algorithm is most suitable. The best classification algorithm on the diabetes data is Naïve Bayes, which achieves an accuracy of 76.31% and takes 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA
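
To illustrate the kind of classifier the abstract reports as best, the following is a minimal Gaussian Naïve Bayes sketch in plain Python. It is not WEKA and not the author's setup: the two features (glucose, BMI), the six data rows, and the function names `fit_gaussian_nb` and `predict` are all hypothetical, chosen only to show how class priors and per-feature Gaussian likelihoods combine.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: [glucose, BMI]; 1 = diabetic, 0 = not diabetic.
X = [[85, 22.0], [90, 24.5], [95, 23.0], [150, 33.0], [165, 35.5], [170, 31.0]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
```

Calling `predict(model, [88, 23.0])` classifies a new record; the accuracy and build time quoted in the abstract come from the author's WEKA run and cannot be reproduced from this toy data.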

40674 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis, especially at the aggregate level. Missing values can occur in a covariate, in the response variable, or in both at a given location. Many missing data techniques are available to estimate missing values, but not all of them can be applied to spatial data because the data are autocorrelated. Hence, there is a need for a method that estimates the missing values in both the response variable and the covariates in spatial data by taking the spatial autocorrelation into account. The present study aims to develop a model to estimate missing data points at the aggregate level in spatial data by accounting for (a) spatial autocorrelation of the response variable, (b) spatial autocorrelation of the covariates, and (c) correlation between the covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly accounts for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates, and between the response variable and the covariates. Precise estimation of the missing data points in spatial data will increase the precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

40673 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more often in real applications. Multi-source data clustering has emerged as an important issue in the data mining and machine learning community. Different data sources provide information about different data. Therefore, multi-source data linking is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large, which is considered a major challenge of multi-source data. An ensemble is a versatile machine learning model in which learning techniques can work in parallel on big data. Clustering ensembles have been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most traditional clustering ensemble approaches are based on a single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis, the fuzzy optimized multi-objective clustering ensemble method, called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of a multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on a standard sample data set. The experimental results demonstrate the superior performance of the FOMOCE method compared to existing clustering ensemble methods and multi-source clustering methods.

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

40672 Symbolic Computation via Grobner Basis

Authors: Haohao Wang

Abstract:

The purpose of this paper is to find elimination ideals via Grobner bases. We first introduce the concept of Grobner bases, and then provide computational algorithms with applications to curves and surfaces.

Keywords: curves, surfaces, Grobner basis, elimination
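
As a small worked illustration of elimination via a Grobner basis (a textbook example chosen here, not one taken from the paper), consider the curve parametrized by x = t^2, y = t^3. With the lexicographic order t > x > y, the reduced Grobner basis G of the ideal I contains exactly one polynomial free of t, and by the Elimination Theorem that polynomial generates the elimination ideal, giving the implicit equation of the curve.

```latex
I = \langle\, t^2 - x,\; t^3 - y \,\rangle \subset k[t, x, y],
\qquad \text{lex order } t > x > y

G = \{\, t^2 - x,\;\; t x - y,\;\; t y - x^2,\;\; x^3 - y^2 \,\}

I \cap k[x, y] = \langle\, G \cap k[x, y] \,\rangle = \langle\, x^3 - y^2 \,\rangle
```

The three non-trivial basis elements arise from reduction and S-polynomials: t^3 - y = t(t^2 - x) + (tx - y), then S(t^2 - x, tx - y) reduces to ty - x^2, and S(tx - y, ty - x^2) = x^3 - y^2.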

40671 Turkey in Minds: Cognitive and Social Representation of "East" and "West"

Authors: Feyzan Tuzkaya, Nihan S. Soylu, Caglar Solak, Mehmet Peker, Hilal Peker, Kemal Ozeralp, Ceren Mete, Ezgi Mehmetoglu, Mehmet Karasu, Cihan Elci, Ece Akca, Melek Goregenli

Abstract:

Perception, evaluation, and representation of the environment have been the subject of many disciplines, including psychology, geography, and architecture. In the environmental and social psychology literature, there is evidence suggesting that cognitive representations of a place consist not only of geographic items but also of social and cultural ones. Mental representations of a residential area or a country are influenced and determined by socio-demographics and by the physical and social context. Thus, all mental representations of a given place are also social representations. Cognitive maps are the main and most common instruments used to identify spatial images and the difference between physical and subjective environments. The aim of the current study is to investigate the mental and social representations of Turkey in university students’ minds. Data were collected from 249 university students from different departments (i.e., psychology, geography, history, and tourism) of Ege University. Participants were asked to draw sketch maps on paper reflecting Turkey as they pictured it in their minds. According to the results, the cognitive maps showed geographic aspects of Turkey as well as the symbolic, cultural, and political reality of Turkey. That is to say, these maps contained many symbolic and verbal items related to criticism of social and cultural problems, ongoing ethnic and political conflicts, and the current political agenda of Turkey. Additionally, one of the main differentiations in these representations appeared between the East and West of Turkey, and the representations of the East and West varied according to participants’ cultural background, their ethnic values, and where they were born. The results of the study are discussed from an environmental and social psychological perspective, considering the cultural and social values of Turkey and the current political circumstances of the country.

Keywords: cognitive maps, East, West, politics, social representations, Turkey

40670 Analysis of Cyber Activities of Potential Business Customers Using Neo4j Graph Databases

Authors: Suglo Tohari Luri

Abstract:

Data analysis is an important aspect of business performance. With the application of artificial intelligence within databases, selecting a suitable database engine for an application design is also crucial for business data analysis. The application of business intelligence (BI) software to graph databases such as Neo4j has proved highly effective for customer data analysis. Yet what remains of great concern is that not all business organizations have Neo4j business intelligence software applications to implement for customer data analysis. Further, those with the BI software lack personnel with the requisite expertise to use it effectively with the Neo4j database. The purpose of this research is to demonstrate how Neo4j program code alone can be applied to the analysis of e-commerce website customer visits. Because the Neo4j database engine is optimized for handling and managing data relationships, with the capability of building high-performance and scalable systems to handle connected data nodes, it will ensure that business owners who advertise their products on websites using Neo4j as a database are able to determine the number of visitors and so know which products are visited at routine intervals for the necessary decision making. It will also help in identifying the best customer segments in relation to specific goods, so as to place more emphasis on advertising those goods on the said websites.

Keywords: data, engine, intelligence, customer, neo4j, database

40669 Analysis and Forecasting of Bitcoin Price Using Exogenous Data

Authors: J-C. Leneveu, A. Chereau, L. Mansart, T. Mesbah, M. Wyka

Abstract:

Extracting and interpreting information from Big Data will remain a key issue for years to come in several sectors, such as finance. Currently, numerous methods (such as Technical Analysis) are used to try to understand and anticipate market behavior, with mixed results, because it still seems impossible to predict a financial trend exactly. The increase in the data available on the Internet, and their diversity, represents a great opportunity for the financial world. Indeed, it is possible, alongside standard financial data, to focus on exogenous data in order to take more macroeconomic factors into account. Coupling the interpretation of these data with standard methods could yield more precise trend predictions. In this paper, in order to observe the influence of exogenous data on price independently of the other usual effects occurring in classical markets, the behaviors of Bitcoin users are introduced into a model reconstituting the Bitcoin value, which is elaborated and tested for prediction purposes.

Keywords: big data, bitcoin, data mining, social network, financial trends, exogenous data, global economy, behavioral finance

40668 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis

Authors: Saleem Z. Ramadan

Abstract:

In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.

Keywords: masking, bathtub model, reliability, non-parametric analysis, useful life
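
The abstract does not spell out how the non-parametric and linear regression steps are combined. One common textbook combination, assumed here purely for illustration, is median rank regression: a non-parametric estimate of the failure probability (Bernard's approximation) is linearized through ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta) and fitted by least squares. The function name and the complete-data assumption (no censoring or masking) are choices made for this sketch, not the author's.

```python
import math

def weibull_median_rank_fit(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Assumes complete (uncensored) failure times; this is a simplification.
    """
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's approximation of the median rank
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))
    # Ordinary least squares for y = beta * x + intercept.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = y_bar - beta * x_bar
    # intercept = -beta * ln(eta), so eta = exp(-intercept / beta).
    eta = math.exp(-intercept / beta)
    return beta, eta

# Hypothetical failure times (hours) for a handful of units.
beta_hat, eta_hat = weibull_median_rank_fit([410.0, 620.0, 755.0, 890.0, 1120.0])
```

With only a few failures the estimates are rough, which is the scarce-data situation the abstract targets; a shape estimate near 1 would be consistent with the constant-hazard useful-life region of the bathtub model.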

40667 The Extent of Big Data Analysis by the External Auditors

Authors: Iyad Ismail, Fathilatul Abdul Hamid

Abstract:

This research investigates the extent of big data analysis by external auditors. The paper adopts grounded theory as a framework for conducting a series of semi-structured interviews with eighteen external auditors. The research findings cover the extent of the availability of big data and the use of big data analysis by external auditors in the Gaza Strip, Palestine. The study's outcomes point to a series of auditing procedures for improving external auditing techniques, leading to a high-quality audit process. This research is also important for auditing firms, as it gives insight into their mechanisms and helps identify the most important strategies for achieving competitive audit quality. The results aim to guide academic and professional auditing institutions in developing techniques that enable external auditors to carry out big data analysis. This paper provides appropriate information for the decision-making process and a source of future information affecting technological auditing.

Keywords: big data analysis, external auditors, audit reliance, internal audit function

40666 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

40665 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze cooperative learning behavior patterns based on data on students' movement. The study first reviewed cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used the clustering algorithm and mathematical statistics to analyze the activity rhythm of individual students and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the regularities of cooperative learning behavior. The results showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of the data analysis, the characteristics of students' behavior and their cooperative learning behavior patterns could be found.

Keywords: behavior pattern, cooperative learning, data analyze, k-means clustering algorithm

Procedia PDF Downloads 160
40664 What the Future Holds for Social Media Data Analysis

Authors: P. Wlodarczak, J. Soar, M. Ally

Abstract:

The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provides access to an unprecedented amount of user data. Users may post reviews of products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. Organisations are increasingly interested in analysing SM data to detect new trends, obtain user opinions on their products and services or find out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Indicators derived from historic SM data are often represented as time series and correlated with a variety of real-world phenomena such as the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.
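As a rough illustration of correlating an SM indicator with a real-world series, the sketch below computes a Pearson correlation with an optional lag. The function names and the weekly values are invented for the example; the abstract does not specify which correlation measure the surveyed studies use.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(indicator, target, lag):
    """Correlate indicator[t] with target[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson(indicator, target)
    return pearson(indicator[:-lag], target[lag:])


# Made-up weekly values: share of positive posts and box office revenue.
sentiment = [0.2, 0.4, 0.6, 0.5, 0.7, 0.9]
revenue = [10.0, 11.0, 15.0, 19.0, 17.0, 22.0]
r_same_week = lagged_correlation(sentiment, revenue, lag=0)
r_next_week = lagged_correlation(sentiment, revenue, lag=1)
```

A high correlation at a positive lag would suggest, but not prove, that the SM indicator leads the target series.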

Keywords: social media, text mining, knowledge discovery, predictive analysis, machine learning

Procedia PDF Downloads 393
40663 Big Data Analysis with Rhipe

Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim

Abstract:

Rhipe, which integrates R with the Hadoop environment, makes it possible to process and analyze massive amounts of data in a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe on actual data of various sizes. Experimental results comparing the performance of our Rhipe implementation with the stats and biglm packages available on bigmemory showed that Rhipe was faster than the other packages, owing to parallel processing in which the number of map tasks increases with the size of the data. We also compared the computing speeds of the pseudo-distributed and fully-distributed modes for configuring a Hadoop cluster. The results showed that the fully-distributed mode was faster than the pseudo-distributed mode, and that the fully-distributed mode became faster as the number of data nodes increased.
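One common way to fit a multiple regression in a map/reduce setting is to have each map task compute the sufficient statistics X'X and X'y for its chunk and have the reduce step add them up. The sketch below shows that idea in plain Python; it does not use the Rhipe API, and whether the authors followed this exact scheme is an assumption. All function names and the synthetic data are hypothetical.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: return (X'X, X'y) for one chunk of (features, response) rows."""
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def combine(a, b):
    """Reduce step: add the sufficient statistics of two chunks."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(matrix, rhs):
    """Solve matrix @ beta = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - tail) / aug[i][i]
    return beta


# Synthetic rows generated from y = 1 + 2*x1 + 3*x2, split into two chunks.
rows = [((x1, x2), 1 + 2 * x1 + 3 * x2) for x1 in range(4) for x2 in range(3)]
chunks = [rows[:6], rows[6:]]
xtx, xty = reduce(combine, map(map_chunk, chunks))
beta = solve(xtx, xty)
```

Because the per-chunk statistics are small (p by p) regardless of chunk size, adding more map tasks increases parallelism without increasing the work of the final solve.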

Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe

Procedia PDF Downloads 478
40662 From CBGB to F21: The Ramone's Band T-Shirt and Its Representations in the Mainstream Culture

Authors: Cláudia Pereira, Lívia Boeschenstein

Abstract:

This article aims to present an analysis of rock band t-shirts as an element that claims a certain identity in modern-contemporary culture. This work focuses on the study of t-shirts that display the name, related elements and the logo of punk band The Ramones, because of its strong presence in the collective mind over the last decades. As we shall see, it is possible to observe a phenomenon of symbolic transition from the original cultural place of that object. At first, it was a piece of clothing that was part of a specific subculture, and then it became just a generic item diluted by the mainstream. This symbolic transitional phenomenon is significant in many ways and will be discussed further. For the analysis, we begin with a brief introduction to the history of the band, followed by a study of vintage rock band T-shirts and their meanings. From there, we turn to a historical contextualization of band T-shirts as a subcultural item and to their redefinition after their appropriation by the mainstream. To guide this reasoning, we draw on theories of style, subcultures and youth culture, and of material culture from an anthropological perspective. In addition, we consider theories and concepts of social representations in order to understand the ways of using the Ramones’ T-shirt as a representative element of a fashionable style. This T-shirt, after being resignified by standardization and mass consumption, no longer symbolizes the punk movement, its behavioral motivations and original policies. It also has little to do with the rage of the working-class suburbs of London or New York. It seems to be a mute and vague sign of a restricted rebellion, foreseen and framed, establishing a stylistic contrast to the designer clothes and good behavior expected by the establishment.
It is an item that composes a specific style available on the market, but at the same time it is accepted by the mainstream and provides a subcultural association that has some prestige in society. Another perspective is that of a resignification loop. In the same way that punk resignified conventional goods according to its own social standards, fashion resignifies what was said to be an object of a subculture and absorbs it into its own mass culture standards. Therefore, outsiders to the punk phenomenon wearing Ramones’ T-shirts can be perceived negatively by subcultural members, but at the same time are well received by those who are partially unaware of or completely outside the subcultural context. For the general public, the Ramones’ logo is appreciated as a diffuse allusion to a punk style, since its original meaning has been entirely neutralized.

Keywords: social representations, subcultures, material culture, punk

Procedia PDF Downloads 360
40661 Origins of the Tattoo: Decoding the Ancient Meanings of Terrestrial Body Art to Establish a Connection between the Natural World and Humans Today

Authors: Sangeet Anand

Abstract:

Body art and tattooing have been practiced as forms of self-expression for centuries, and this study analyzes the pertinence of tattoo culture to our everyday lives and our ancient past. Individuals of different cultures represent the ideas, practices, and elements of their cultures through symbolic representation. These symbols come in all shapes and sizes and can be as simple as the makeup you put on every day or as permanent as a tattoo. In the long run, individuals who choose to display art on their bodies are seeking to express their individuality. In addition, these visuals are ultimately a reflection of what our own cultures deem beautiful, important, and powerful to the human eye. They make us known to the world and give us a plausible identity in an ever-changing world. We have lived through and seen a rise in hippie culture today. The bodily decoration associated with this fad has made it seem as though body art is a relatively new visual language. Quite to the contrary, it is not. Through cultural symbolic exploration, we can answer key questions that have been raised for centuries. Through careful, in-depth interviews, this study takes a broad subject matter, art and symbolism, and develops it into a deeper philosophical connection between the world and its past. The basic methodologies used in this sociocultural study include interview questionnaires and textual analysis, which involve a subject and an interviewer as well as source material. The major finding of this study is a distinct connection between cultural heritage and the day-to-day likings of an individual. The participant studied in this project demonstrated a clear passion for hobbies that were practiced even by her ancestors. We can conclude, through these findings, that there is a deeper cultural connection between modern-day humans, the first humans, and the surrounding environments.
Our symbols today are a direct reflection of the elements of nature that our human ancestors were exposed to, and, through cultural acceptance, we can adorn ourselves with these representations to help others identify our pasts. Body art embraces the different aspects of different cultures and holds significance, tells stories, and persists, even as the human population rapidly integrates. With this pattern, our human descendants will continue to represent their cultures and identities in the future. Body art is an integral element in understanding how and why people identify with certain aspects of life over others, and it broadens the scope for conducting more cross-cultural analysis.

Keywords: natural, symbolism, tattoo, terrestrial

Procedia PDF Downloads 84
40660 Detection Efficient Enterprises via Data Envelopment Analysis

Authors: S. Turkan

Abstract:

In this paper, 2014 data on Turkey’s Top 500 Industrial Enterprises were analyzed by data envelopment analysis. Data envelopment analysis is used to detect efficient decision-making units, such as universities, hospitals and schools, from their inputs and outputs. The decision-making units in this study are enterprises. To detect efficient enterprises, financial ratios related to the productivity of enterprises are used as inputs and outputs. The efficient enterprises with foreign-weighted capital ownership are detected via the super efficiency model. The results indicate that Mercedes-Benz is the most efficient enterprise with foreign-weighted capital ownership in Turkey.

Keywords: data envelopment analysis, super efficiency, logistic regression, financial ratios

Procedia PDF Downloads 307
40659 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are among the most heavily used places, shaped both by the natural balance and by a growing population. In coastal engineering, the most valuable data concern wave behavior. The amount of this data becomes very large because observations run for periods of hours, days and months. In this study, statistical methods such as wave spectrum analysis methods and standard statistical methods have been used. The goal of this study is to discover profiles of different coastal areas by using these statistical methods, and thus to obtain an instance-based data set from the big data for analysis with data mining algorithms. In the experimental studies, six sample data sets on wave behavior, obtained from 20 minutes of observations in Mersin Bay, Turkey, were converted to an instance-based form, and different clustering techniques from data mining were used to discover similar coastal places. Moreover, this study argues that this summarization approach can be used in other fields that collect big data, such as medicine.
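The summarization idea, collapsing a long observation window into one instance with a few statistical features, might look like the sketch below. The chosen features are an assumption for illustration (the abstract does not list the authors' features); significant wave height is computed here as the mean of the highest one-third of wave heights, and the site names and values are invented.

```python
import statistics


def summarize_waves(heights):
    """Collapse one observation window of wave heights (m) into a feature row."""
    ordered = sorted(heights, reverse=True)
    top_third = ordered[: max(1, len(ordered) // 3)]
    return {
        "mean_height": statistics.fmean(heights),
        "std_height": statistics.pstdev(heights),
        "max_height": ordered[0],
        # Significant wave height: mean of the highest one-third of waves.
        "significant_height": statistics.fmean(top_third),
    }


# Hypothetical windows for two sites; values are invented for illustration.
windows = {
    "site_a": [0.4, 0.6, 0.5, 0.9, 0.7, 0.3],
    "site_b": [1.2, 1.8, 1.5, 2.1, 1.1, 1.6],
}
instances = {site: summarize_waves(h) for site, h in windows.items()}
```

Each resulting row is a fixed-length instance, so a clustering algorithm can compare sites directly instead of working on the raw series.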

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 339
40658 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder

Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen

Abstract:

Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control data are similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP) both for single control as well as for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to the statistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not. We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.
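To make the borrowing idea concrete, the sketch below uses a simplified power prior for a Poisson mean with a conjugate gamma prior, where the historical likelihood is raised to a fixed weight delta. This is deliberately simpler than the MPP in the abstract, which treats the weight as a random parameter and also covers the negative binomial model. The function name, the default hyperparameters, and the counts are assumptions for illustration only.

```python
def power_prior_posterior(current, historical, delta, shape0=0.5, rate0=0.001):
    """Gamma posterior (shape, rate) for a Poisson mean under a fixed-weight power prior.

    The historical likelihood is raised to the power `delta` in [0, 1]:
    delta = 0 ignores the historical controls, delta = 1 pools them fully.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    shape = shape0 + delta * sum(historical) + sum(current)
    rate = rate0 + delta * len(historical) + len(current)
    return shape, rate


# Invented episode counts for current and historical control patients.
current_controls = [3, 5, 4, 6, 2]
historical_controls = [7, 8, 6, 9, 7, 8]

for delta in (0.0, 0.5, 1.0):
    shape, rate = power_prior_posterior(current_controls, historical_controls, delta)
    print(f"delta={delta:.1f}  posterior mean={shape / rate:.3f}")
```

As delta grows, the posterior mean is pulled toward the historical mean, which is the mechanism behind both the power gain for similar data and the type I error inflation when the arms differ.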

Keywords: count data, meta-analytic prior, negative binomial, poisson

Procedia PDF Downloads 91
40657 Data Quality Enhancement with String Length Distribution

Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda

Abstract:

Recently, the volume of collectable manufacturing data has been increasing rapidly. At the same time, mega recalls are becoming a serious social problem. Under such circumstances, there is an increasing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, using manufacturing data. However, traditional methods take too long to classify the strings in manufacturing data to meet the requirements of quick defect analysis. Therefore, we present the String Length Distribution Classification (SLDC) method to classify strings correctly in a short time. This method learns character features, especially the string length distribution, from Product IDs and Machine IDs in the BOM and asset list. By applying the proposed method to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, we estimate that the requirements of quick defect analysis can be fulfilled.
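The abstract does not describe SLDC in detail, so the following is only a guess at the core idea: learn a per-class distribution of string lengths and assign a new string to the class under which its length is most likely. Other character features are ignored here, and the function names, the smoothing scheme, and the sample IDs are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_length_model(labelled_strings):
    """Count string lengths per class from (string, label) training pairs."""
    counts = defaultdict(Counter)
    for text, label in labelled_strings:
        counts[label][len(text)] += 1
    return counts


def classify(text, counts, max_length=64):
    """Return the class whose smoothed length distribution best explains `text`."""
    def log_likelihood(label):
        total = sum(counts[label].values())
        # Add-one smoothing so unseen lengths keep a small non-zero probability.
        return math.log((counts[label][len(text)] + 1) / (total + max_length))
    return max(counts, key=log_likelihood)


# Hypothetical training pairs; real IDs would come from a BOM or asset list.
training = [
    ("PRD-000123", "product_id"), ("PRD-004567", "product_id"),
    ("PRD-000999", "product_id"), ("M-17", "machine_id"),
    ("M-204", "machine_id"), ("M-9", "machine_id"),
]
model = fit_length_model(training)
predicted = classify("PRD-112233", model)
```

A length lookup is far cheaper than character-level pattern matching, which is consistent with the speed-up the abstract reports, though the actual source of that speed-up is not stated.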

Keywords: string classification, data quality, feature selection, probability distribution, string length

Procedia PDF Downloads 289
40656 Microarray Data Visualization and Preprocessing Using R and Bioconductor

Authors: Ruchi Yadav, Shivani Pandey, Prachi Srivastava

Abstract:

Microarrays provide a rich source of data on the molecular workings of cells. Each microarray reports on the abundance of tens of thousands of mRNAs. Virtually every human disease is being studied using microarrays in the hope of finding the molecular mechanisms of disease. Bioinformatics analysis plays an important part in processing the information embedded in large-scale expression profiling studies and in laying the foundation for biological interpretation. A basic yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. One of the most popular platforms for microarray analysis is Bioconductor, an open source and open development software project based on the R programming language. This paper describes specific procedures for quality assessment, visualization and preprocessing of Affymetrix GeneChip data, details the different Bioconductor packages used to analyze Affymetrix microarray data, and describes the analysis and outcome of each plot.

Keywords: microarray analysis, R language, affymetrix visualization, bioconductor

Procedia PDF Downloads 453