Search results for: symbolic data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 42191

Search results for: symbolic data analysis

42071 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis especially at the aggregate level. Missing can either occur in covariate or in response variable or in both in a given location. Many missing data techniques are available to estimate the missing data values but not all of these methods can be applied on spatial data since the data are autocorrelated. Hence there is a need to develop a method that estimates the missing values in both response variable and covariates in spatial data by taking account of the spatial autocorrelation. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) Spatial autocorrelation of the response variable (b) Spatial autocorrelation of covariates and (c) Correlation between covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly account for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates and between a response variable and covariates. The precise estimation of the missing data points in spatial data will result in an increased precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 382
42070 Geometry, the language of Manifestation of Tabriz School’s Mystical Thoughts in Architecture (Case Study: Dome of Soltanieh)

Authors: Lida Balilan, Dariush Sattarzadeh, Rana Koorepaz

Abstract:

In the Ilkhanid era, the mystical school of Tabriz manifested itself as an art school in various aspects, including miniatures, architecture, urban planning and design, simultaneously with the expansion of the many sciences of its time. In this era, mysticism, both in form and in poetry and prose, as well as in works of art reached its peak. Mysticism, as an inner belief and thought, brought the audience to the artistic and aesthetical view using allegorical and symbolic expression of the religion and had a direct impact on the formation of the intellectual and cultural layers of the society. At the same time, Mystic school of Tabriz could create a symbolic and allegorical language to create magnificent works of architecture with the expansion of science in various fields and using various sciences such as mathematics, geometry, science of numbers and by Abjad letters. In this era, geometry is the middle link between mysticism and architecture and it is divided into two categories, including intellectual and sensory geometry and based on its function. Soltaniyeh dome is one of the prominent buildings of the Tabriz school with the shrine land use. In this article, information is collected using a historical-interpretive method and the results are analyzed using an analytical-comparative method. The results of the study suggest that the designers and builders of the Soltaniyeh dome have used shapes, colors, numbers, letters and words in the form of motifs, geometric patterns as well as lines and writings in levels and layers ranging from plans to decorations and arrays for architectural symbolization and encryption to express and transmit mystical ideas.

Keywords: geometry, Tabriz school, mystical thoughts, dome of Soltaniyeh

Procedia PDF Downloads 86
42069 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as a important issue in the data mining and machine learning community. Different data sources provide information about different data. Therefore, multi-source data linking is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large. This issue is considered a major challenge from multi-source data. Ensemble is a versatile machine learning model in which learning techniques can work in parallel, with big data. Clustering ensemble has been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most of the traditional clustering ensemble approaches are based on single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis. The fuzzy optimized multi-objective clustering ensemble method is called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on the standard sample data set. The experimental results demonstrate the superior performance of the FOMOCE method compared to the existing clustering ensemble methods and multi-source clustering methods.

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

Procedia PDF Downloads 189
42068 Analysis and Forecasting of Bitcoin Price Using Exogenous Data

Authors: J-C. Leneveu, A. Chereau, L. Mansart, T. Mesbah, M. Wyka

Abstract:

Extracting and interpreting information from Big Data represent a stake for years to come in several sectors such as finance. Currently, numerous methods are used (such as Technical Analysis) to try to understand and to anticipate market behavior, with mixed results because it still seems impossible to exactly predict a financial trend. The increase of available data on Internet and their diversity represent a great opportunity for the financial world. Indeed, it is possible, along with these standard financial data, to focus on exogenous data to take into account more macroeconomic factors. Coupling the interpretation of these data with standard methods could allow obtaining more precise trend predictions. In this paper, in order to observe the influence of exogenous data price independent of other usual effects occurring in classical markets, behaviors of Bitcoin users are introduced in a model reconstituting Bitcoin value, which is elaborated and tested for prediction purposes.

Keywords: big data, bitcoin, data mining, social network, financial trends, exogenous data, global economy, behavioral finance

Procedia PDF Downloads 355
42067 Analysis of Cyber Activities of Potential Business Customers Using Neo4j Graph Databases

Authors: Suglo Tohari Luri

Abstract:

Data analysis is an important aspect of business performance. With the application of artificial intelligence within databases, selecting a suitable database engine for an application design is also very crucial for business data analysis. The application of business intelligence (BI) software into some relational databases such as Neo4j has proved highly effective in terms of customer data analysis. Yet what remains of great concern is the fact that not all business organizations have the neo4j business intelligence software applications to implement for customer data analysis. Further, those with the BI software lack personnel with the requisite expertise to use it effectively with the neo4j database. The purpose of this research is to demonstrate how the Neo4j program code alone can be applied for the analysis of e-commerce website customer visits. As the neo4j database engine is optimized for handling and managing data relationships with the capability of building high performance and scalable systems to handle connected data nodes, it will ensure that business owners who advertise their products at websites using neo4j as a database are able to determine the number of visitors so as to know which products are visited at routine intervals for the necessary decision making. It will also help in knowing the best customer segments in relation to specific goods so as to place more emphasis on their advertisement on the said websites.

Keywords: data, engine, intelligence, customer, neo4j, database

Procedia PDF Downloads 193
42066 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis

Authors: Saleem Z. Ramadan

Abstract:

In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.

Keywords: masking, bathtub model, reliability, non-parametric analysis, useful life

Procedia PDF Downloads 562
42065 The Extent of Big Data Analysis by the External Auditors

Authors: Iyad Ismail, Fathilatul Abdul Hamid

Abstract:

This research was mainly investigated to recognize the extent of big data analysis by external auditors. This paper adopts grounded theory as a framework for conducting a series of semi-structured interviews with eighteen external auditors. The research findings comprised the availability extent of big data and big data analysis usage by the external auditors in Palestine, Gaza Strip. Considering the study's outcomes leads to a series of auditing procedures in order to improve the external auditing techniques, which leads to high-quality audit process. Also, this research is crucial for auditing firms by giving an insight into the mechanisms of auditing firms to identify the most important strategies that help in achieving competitive audit quality. These results are aims to instruct the auditing academic and professional institutions in developing techniques for external auditors in order to the big data analysis. This paper provides appropriate information for the decision-making process and a source of future information which affects technological auditing.

Keywords: big data analysis, external auditors, audit reliance, internal audit function

Procedia PDF Downloads 69
42064 „Real and Symbolic in Poetics of Multiplied Screens and Images“

Authors: Kristina Horvat Blazinovic

Abstract:

In the context of a work of art, one can talk about the idea-concept-term-intention expressed by the artist by using various forms of repetition (external, material, visible repetition). Such repetitions of elements (images in space or moving visual and sound images in time) suggest a "covert", "latent" ("dressed") repetition – i.e., "hidden", "latent" term-intention-idea. Repeating in this way reveals a "deeper truth" that the viewer needs to decode and which is hidden "under" the technical manifestation of the multiplied images. It is not only images, sounds, and screens that are repeated - something else is repeated through them as well, even if, in some cases, the very idea of repetition is repeated. This paper examines serial images and single-channel or multi-channel artwork in the field of video/film art and video installations, which in a way implies the concept of repetition and multiplication. Moving or static images and screens (as multi-screens) are repeated in time and space. The categories of the real and the symbolic partly refer to the Lacan registers of reality, i.e., the Imaginary - Symbolic – Real trinity that represents the orders within which human subjectivity is established. Authors such as Bruce Nauman, VALIE EXPORT, Ragnar Kjartansson, Wolf Vostell, Shirin Neshat, Paul Sharits, Harun Farocki, Dalibor Martinis, Andy Warhol, Douglas Gordon, Bill Viola, Frank Gillette, and Ira Schneider, and Marina Abramovic problematize, in different ways, the concept and procedures of multiplication - repetition, but not in the sense of "copying" and "repetition" of reality or the original, but of repeated repetitions of the simulacrum. Referential works of art are often connected by the theme of the traumatic. Repetitions of images and situations are a response to the traumatic (experience) - repetition itself is a symptom of trauma. On the other hand, repeating and multiplying traumatic images results in a new traumatic effect or cancels it. Reflections on repetition as a temporal and spatial phenomenon are in line with the chapters that link philosophical considerations of space and time and experience temporality with their manifestation in works of art. The observations about time and the relation of perception and memory are according to Henry Bergson and his conception of duration (durée) as "quality of quantity." The video works intended to be displayed as a video loop, express the idea of infinite duration ("pure time," according to Bergson). The Loop wants to be always present - to fixate in time. Wholeness is unrecognizable because the intention is to make the effect infinitely cyclic. Reflections on time and space end with considerations about the occurrence and effects of time and space intervals as places and moments "between" – the points of connection and separation, of continuity and stopping - by reference to the "interval theory" of Soviet filmmaker DzigaVertov. The scale of opportunities that can be explored in interval mode is wide. Intervals represent the perception of time and space in the form of pauses, interruptions, breaks (e.g., emotional, dramatic, or rhythmic) denote emptiness or silence, distance, proximity, interstitial space, or a gap between various states.

Keywords: video installation, performance, repetition, multi-screen, real and symbolic, loop, video art, interval, video time

Procedia PDF Downloads 173
42063 Stories of Women With Cervical Cancer in Taiwan: A Narrative Analysis Research

Authors: Pei-Yu Lee

Abstract:

This study investigates the life experience and self-interpretation of female cervical cancer patients under Taiwanese cultural context. Through a Narrative Analysis Research approach, the study took six cervical cancer female patients with an average age of 58 years (ranging from 55-66 years) for an average of twice, 60 minutes each time, in-depth recorded interviews under their consent. After converting the interview recordings into transcripts, the study applied the Riessman approach to analyze the contents. The results revealed two major theme, including 1. The symbolic meaning of the cervix, and 2. Women's perseverance and compliance. Because of the illness metaphor of cervical cancer and the designation of women being family caregivers under the Chinese patriarchal culture, females with cervical cancer are not only patients but also responsible for being family and partner roles, in which contradictions of intimate relationships exist. Show the strength of perseverance and compliance in the course of life. On the other hand, they have to identify and recognize their roles in life and strive to determine the situation of coexisting with the disease to picture their life. The results showed that female cervical cancer patients not only need to combat the disease but also stand against the stigma and the traditional responsibility given to women. The researchers recommend that nurses should include cultural implications in their care of female cervical cancer patients.

Keywords: female, cervical cancer, narrative analysis research, taiwan

Procedia PDF Downloads 98
42062 The Contributions of Internal Marketing to the Explanation of Organizational Commitment: Study Developed on Public Institutions

Authors: J. Santos, A. Gomes, G. Goncalves

Abstract:

Organizations have increased the debate on the importance of symbolic aspects need to humanize, based on trust. A strong connection with the cultural guidance is key to determine the success of any company since it guarantees its recognition and increased productivity. This way, the quality of an organization relies essentially on its collaborators; on the way, they feel the company as their own. The changes imposed on public institutions try to fit some management practices of the private sector, to the public organizations. Currently, all efforts are aimed to increase competitiveness and promoting a better organizational performance, which leads to an increased the importance of human assets in organizations. A particular interest is the internal marketing since it has a relevant role in the development of employees. This research aimed to describe and identify how internal marketing contributes to explain organizational commitment. A quantitative analysis was done with a sample of 600 workers from public organizations, collected through a questionnaire composed of two scales that allowed the analysis of each of the constructs. The results show explanatory contribution of internal marketing practices on affective and normative commitment, through written information. By the results, workers are committed to the organizations.

Keywords: internal marketing, organizational commitment, public institutions, Portuguese

Procedia PDF Downloads 244
42061 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

Procedia PDF Downloads 352
42060 A Theoretical Framework: The Influence of Luxury Companies' Corporate Social Activities on Consumer Purchase Intention

Authors: Kveta Olsanova, Gina Cook, Marija Zlatic

Abstract:

This paper discusses the theoretical framework suggesting the dependencies between luxury brands’ CSR (Corporate Social Responsibility) variables and the purchase intention of luxury shoppers. The framework is based on a literature review and in-depth individual interviews with a sample of luxury users and buyers. The measures of the model are based on existing research and the authors' qualitative research results. The model suggests that purchase intention in the luxury segment is dependent on the luxury values (symbolic, experiential, functional and social), individual sustainable dimension (composed of societal, environmental and economic variables) and awareness of the brand’s CSR, the last two relationships being potentially moderated by certain conditions such as demographics and general attitudes towards CSR and sustainability. The model’s output is in the formulation of several hypotheses, to be tested in an upcoming quantitative study. The qualitative phase indicated that the perceived symbolic, functional and experiential value dimensions of luxury brands were stronger drivers of purchase intention compared to the sustainable dimension. The contribution of the research consists of highlighting CSR’s impact on customer purchase intent as a potential implication for luxury brand management due to two aspects: (i) consumer awareness of the existing CSR activities of luxury brands is low, and this might be challenged by the demands of Gen Z entrants into the lux industry as they are known for their positive approach to CSR; (ii) the UN’s SDGs will bring CSR to the attention of all industries, including currently 'CSR silent' segments represented by luxury. Our research should contribute to incorporation of strategic CSR into the policies and strategies of the luxury segment by providing evidence that luxury customers do care.

Keywords: CSR, luxury shoppers, purchase intention, sustainability

Procedia PDF Downloads 148
42059 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.

Keywords: behavior pattern, cooperative learning, data analyze, k-means clustering algorithm

Procedia PDF Downloads 187
42058 Multimodal Integration of EEG, fMRI and Positron Emission Tomography Data Using Principal Component Analysis for Prognosis in Coma Patients

Authors: Denis Jordan, Daniel Golkowski, Mathias Lukas, Katharina Merz, Caroline Mlynarcik, Max Maurer, Valentin Riedl, Stefan Foerster, Eberhard F. Kochs, Andreas Bender, Ruediger Ilg

Abstract:

Introduction: So far, clinical assessments that rely on behavioral responses to differentiate coma states or even predict outcome in coma patients are unreliable, e.g. because of some patients’ motor disabilities. The present study was aimed to provide prognosis in coma patients using markers from electroencephalogram (EEG), blood oxygen level dependent (BOLD) functional magnetic resonance imaging (fMRI) and [18F]-fluorodeoxyglucose (FDG) positron emission tomography (PET). Unsuperwised principal component analysis (PCA) was used for multimodal integration of markers. Methods: Approved by the local ethics committee of the Technical University of Munich (Germany) 20 patients (aged 18-89) with severe brain damage were acquired through intensive care units at the Klinikum rechts der Isar in Munich and at the Therapiezentrum Burgau (Germany). At the day of EEG/fMRI/PET measurement (date I) patients (<3.5 month in coma) were grouped in the minimal conscious state (MCS) or vegetative state (VS) on the basis of their clinical presentation (coma recovery scale-revised, CRS-R). Follow-up assessment (date II) was also based on CRS-R in a period of 8 to 24 month after date I. At date I, 63 channel EEG (Brain Products, Gilching, Germany) was recorded outside the scanner, and subsequently simultaneous FDG-PET/fMRI was acquired on an integrated Siemens Biograph mMR 3T scanner (Siemens Healthineers, Erlangen Germany). Power spectral densities, permutation entropy (PE) and symbolic transfer entropy (STE) were calculated in/between frontal, temporal, parietal and occipital EEG channels. PE and STE are based on symbolic time series analysis and were already introduced as robust markers separating wakefulness from unconsciousness in EEG during general anesthesia. While PE quantifies the regularity structure of the neighboring order of signal values (a surrogate of cortical information processing), STE reflects information transfer between two signals (a surrogate of directed connectivity in cortical networks). fMRI was carried out using SPM12 (Wellcome Trust Center for Neuroimaging, University of London, UK). Functional images were realigned, segmented, normalized and smoothed. PET was acquired for 45 minutes in list-mode. For absolute quantification of brain’s glucose consumption rate in FDG-PET, kinetic modelling was performed with Patlak’s plot method. BOLD signal intensity in fMRI and glucose uptake in PET was calculated in 8 distinct cortical areas. PCA was performed over all markers from EEG/fMRI/PET. Prognosis (persistent VS and deceased patients vs. recovery to MCS/awake from date I to date II) was evaluated using the area under the curve (AUC) including bootstrap confidence intervals (CI, *: p<0.05). Results: Prognosis was reliably indicated by the first component of PCA (AUC=0.99*, CI=0.92-1.00) showing a higher AUC when compared to the best single markers (EEG: AUC<0.96*, fMRI: AUC<0.86*, PET: AUC<0.60). CRS-R did not show prediction (AUC=0.51, CI=0.29-0.78). Conclusion: In a multimodal analysis of EEG/fMRI/PET in coma patients, PCA lead to a reliable prognosis. The impact of this result is evident, as clinical estimates of prognosis are inapt at time and could be supported by quantitative biomarkers from EEG, fMRI and PET. Due to the small sample size, further investigations are required, in particular allowing superwised learning instead of the basic approach of unsuperwised PCA.

Keywords: coma states and prognosis, electroencephalogram, entropy, functional magnetic resonance imaging, machine learning, positron emission tomography, principal component analysis

Procedia PDF Downloads 339
42057 Big Data Analysis with Rhipe

Authors: Byung Ho Jung, Ji Eun Shin, Dong Hoon Lim

Abstract:

Rhipe that integrates R and Hadoop environment made it possible to process and analyze massive amounts of data using a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe with various data sizes of actual data. Experimental results for comparing the performance of our Rhipe with stats and biglm packages available on bigmemory, showed that our Rhipe was more fast than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases. We also compared the computing speeds of pseudo-distributed and fully-distributed modes for configuring Hadoop cluster. The results showed that fully-distributed mode was faster than pseudo-distributed mode, and computing speeds of fully-distributed mode were faster as the number of data nodes increases.

Keywords: big data, Hadoop, Parallel regression analysis, R, Rhipe

Procedia PDF Downloads 497
42056 What the Future Holds for Social Media Data Analysis

Authors: P. Wlodarczak, J. Soar, M. Ally

Abstract:

The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provide access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is a growing interest in the analysis of SM data from organisations for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often indicators of historic SM data are represented as time series and correlated with a variety of real world phenomena like the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.

Keywords: social media, text mining, knowledge discovery, predictive analysis, machine learning

Procedia PDF Downloads 423
42055 Brokerage and Value-Creation: Trading Practices in the English Market of 20th-Century Maps

Authors: Shaun Lim

Abstract:

This paper presents a 9-month ethnographic case study of the value creating strategies employed by an Oxford market-trader of 20th-century maps. Maps are usually valued and sold as either antique objets d’art or useful navigational tools, with 20th-century maps precariously lying between the boundary of the aesthetic and utilitarian value-regimes. Here, the brokerage practices involved in the framing of outdated, lowly valued maps into vintage commodities will be examined. Ethnographic material of the unstudied market of old maps is introduced and situated in the second-hand, antique and collectible spheres of exchange. The map-trader as a broker is the ethnographic and methodological starting point of this paper. Brokerage is understood through the activity of framing that defines and brackets the value-regimes of commodities with the aid of market and framing devices. The trader’s activities will be examined in three parts. (1) The post-sourcing industry: the altering, mounting and tagging of maps before putting them into market circulation. Mounts, frames and tags are seen as market devices that authenticates and frames maps with aesthetic and symbolic values along with the disentanglement of its use value. (2) The market-display: the constitution of space that encourages the relations of looking at maps as aesthetic objects, while the categorical arrangement of the display contributes to legitimising of the collectability of maps. (3) The salesmanship strategies of the trader: the match-making of customers with maps of meaningful value, and the mediating of knowledge through the verbal articulation of the map’s symbolic values. Ultimately, value is not created in an accumulative sense, but is layered and superimposed to cater to a wide spectrum of patrons. The trader creates demand for his goods by mediating and articulating value-regimes already coherent to potential patrons.

Keywords: art and material culture, brokerage, commodification, framing, markets, value

Procedia PDF Downloads 146
42054 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 361
42053 Detection Efficient Enterprises via Data Envelopment Analysis

Authors: S. Turkan

Abstract:

In this paper, the Turkey’s Top 500 Industrial Enterprises data in 2014 were analyzed by data envelopment analysis. Data envelopment analysis is used to detect efficient decision-making units such as universities, hospitals, schools etc. by using inputs and outputs. The decision-making units in this study are enterprises. To detect efficient enterprises, some financial ratios are determined as inputs and outputs. For this reason, financial indicators related to productivity of enterprises are considered. The efficient foreign weighted owned capital enterprises are detected via super efficiency model. According to the results, it is said that Mercedes-Benz is the most efficient foreign weighted owned capital enterprise in Turkey.

Keywords: data envelopment analysis, super efficiency, logistic regression, financial ratios

Procedia PDF Downloads 324
42052 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder

Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen

Abstract:

Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control dataare similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP) both for single control as well as for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to thestatistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not.We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.

Keywords: count data, meta-analytic prior, negative binomial, poisson

Procedia PDF Downloads 117
42051 Data Quality Enhancement with String Length Distribution

Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda

Abstract:

Recently, collectable manufacturing data are rapidly increasing. On the other hand, mega recall is getting serious as a social problem. Under such circumstances, there are increasing needs for preventing mega recalls by defect analysis such as root cause analysis and abnormal detection utilizing manufacturing data. However, the time to classify strings in manufacturing data by traditional method is too long to meet requirement of quick defect analysis. Therefore, we present String Length Distribution Classification method (SLDC) to correctly classify strings in a short time. This method learns character features, especially string length distribution from Product ID, Machine ID in BOM and asset list. By applying the proposal to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, it can be estimated that the requirement of quick defect analysis can be fulfilled.

Keywords: string classification, data quality, feature selection, probability distribution, string length

Procedia PDF Downloads 318
42050 Microarray Data Visualization and Preprocessing Using R and Bioconductor

Authors: Ruchi Yadav, Shivani Pandey, Prachi Srivastava

Abstract:

Microarrays provide a rich source of data on the molecular working of cells. Each microarray reports on the abundance of tens of thousands of mRNAs. Virtually every human disease is being studied using microarrays with the hope of finding the molecular mechanisms of disease. Bioinformatics analysis plays an important part of processing the information embedded in large-scale expression profiling studies and for laying the foundation for biological interpretation. A basic, yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. One of the most popular platforms for microarray analysis is Bioconductor, an open source and open development software project based on the R programming language. This paper describes specific procedures for conducting quality assessment, visualization and preprocessing of Affymetrix Gene Chip and also details the different bioconductor packages used to analyze affymetrix microarray data and describe the analysis and outcome of each plots.

Keywords: microarray analysis, R language, affymetrix visualization, bioconductor

Procedia PDF Downloads 480
42049 Development of New Technology Evaluation Model by Using Patent Information and Customers' Review Data

Authors: Kisik Song, Kyuwoong Kim, Sungjoo Lee

Abstract:

Many global firms and corporations derive new technology and opportunity by identifying vacant technology from patent analysis. However, previous studies failed to focus on technologies that promised continuous growth in industrial fields. Most studies that derive new technology opportunities do not test practical effectiveness. Since previous studies depended on expert judgment, it became costly and time-consuming to evaluate new technologies based on patent analysis. Therefore, research suggests a quantitative and systematic approach to technology evaluation indicators by using patent data to and from customer communities. The first step involves collecting two types of data. The data is used to construct evaluation indicators and apply these indicators to the evaluation of new technologies. This type of data mining allows a new method of technology evaluation and better predictor of how new technologies are adopted.

Keywords: data mining, evaluating new technology, technology opportunity, patent analysis

Procedia PDF Downloads 377
42048 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: big data, machine learning, ontology model, urban data model

Procedia PDF Downloads 418
42047 The Construct of Personal Choice within Individual Language Shift: A Phenomenological Qualitative Study

Authors: Kira Gulko Morse

Abstract:

Choosing one’s primary language may not be as common as choosing an additional foreign language to study or use during travel. In some instances, however, it becomes a matter of internal personal struggle, as language is tied not only to specific circumstances but also to human background and identity. This phenomenological qualitative study focuses on the factors affecting the decision of a person to undergo a language shift. Specifically, it considers how these factors relate to identity negotiation and expression. The data for the study include the analysis of published autobiographical narratives and personal interviews conducted using the Responsive Interviewing model. While research participants come from a variety of geographical locations and have used different reasons for undergoing their individual language shift, the study identifies a number of common features shared by all the participants. Specifically, while all the participants have been able to maintain their first language to varying degrees of proficiency, they have all completed the shift to establish a primary language different from their first. Additionally, the process of self-identification is found to be directly connected to the phenomenon of language choice for each of the participants. The findings of the study further tie the phenomenon of individual language shift to a more comprehensive issue of individual life choices – ethnic revival, immigration, and inter-cultural marriage among others. The study discusses varying language roles and the data indicate that language shift may occur whether it is a symbolic driving force or a secondary means in fulfilling a set life goal. The concept of language addition is suggested as an alternative to the arbitrariness of language shift. Thus, instead of focusing on subtractive bilingualism or language loss, the emphasis becomes the integration of languages within the individual. The study emphasizes the importance of the construct of personal choice in its connection to individual language shift. It places the focus from society onto an individual and the ability of an individual to make decisions in matters of linguistic identification.

Keywords: choice theory, identity negotiation, language shift, psycholinguistics

Procedia PDF Downloads 135
42046 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 520
42045 Efficiency of DMUs in Presence of New Inputs and Outputs in DEA

Authors: Esmat Noroozi, Elahe Sarfi, Farha Hosseinzadeh Lotfi

Abstract:

Examining the impacts of data modification is considered as sensitivity analysis. A lot of studies have considered the data modification of inputs and outputs in DEA. The issues which has not heretofore been considered in DEA sensitivity analysis is modification in the number of inputs and (or) outputs and determining the impacts of this modification in the status of efficiency of DMUs. This paper is going to present systems that show the impacts of adding one or multiple inputs or outputs on the status of efficiency of DMUs and furthermore a model is presented for recognizing the minimum number of inputs and (or) outputs from among specified inputs and outputs which can be added whereas an inefficient DMU will become efficient. Finally the presented systems and model have been utilized for a set of real data and the results have been reported.

Keywords: data envelopment analysis, efficiency, sensitivity analysis, input, out put

Procedia PDF Downloads 450
42044 Kelantan Malay Cultural Landscape: The Concept of Kota Bharu Islamic City

Authors: Mohammad Rusdi Mohd Nasir, Ismail Hafiz Salleh

Abstract:

Kota Bharu, as an Islamic City, represents a symbolic icon in the urban development of the Islamic state of Kelantan, Malaysia. This research seeks to provide a basis for new approaches to landscape planning that shows greater respect for the traditional vernacular landscape. In addition, this research also intends to distinguish the prospects for the future Kelantan Malay cultural landscape, building upon the multiple historical influences in the evolution of the cultural landscape using multiple methods including literature review, observation, document analysis and content analysis. The study of the Kelantan Malay cultural landscape is particularly important in view of its distinctive contribution to Malay heritage by identifying the elements, characteristics, history and their influences. As a result, this research recognizes the importance of incorporating the existing heritage alongside contemporary design as well as further research on the Kelantan Malay cultural landscape. Optimistically, there will be better landscape practices in the future to understand the past, the present and the future prospects of the vernacular tradition, in order to ensure that our architecture, landscape and urbanism practices express its values.

Keywords: Malay culture, Malay heritage, cultural landscape, Islamic concept

Procedia PDF Downloads 439
42043 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 574
42042 Men's Relationships in D. H. Lawrence's 'Sons and Lovers'

Authors: Chaich Hamza Walid

Abstract:

The primary goal of this paper is to question the situation of men’s place in D.H Lawrence’s Sons and Lovers. Our question is what is the role of each man in the novel? And how a mother’s possessiveness had changed the life of all men in the family? David Herbert Lawrence was an important and controversial English writer of the 20th century. He wrote many great works, one of his most popular novels, Sons and Loves, is an autobiographical account of his youth. This novel is about the life of the Morels. The author develops the story by portraying the relationships between many characters, especially the male ones we focus on. ‘Sons and Lovers’ seems to be written especially to women, all what Lawrence wrote is about women but when we go deeper, we see that Lawrence was also interested in men. This work will approach the question in two ways. The first chapter will deal with men’s place in D.H Lawrence’s Sons and Lovers, more exactly with Paul and his father Walter Morel, and with Baxter Dawes. We will focus on each man’s behavior with one another. In the second chapter, we will analyze possessiveness, that is to say, the desire of holding or having someone as one’s own or under one’s control. We will try to prove this view from the spiritual and symbolic possession of different relationships. Our study will be through an intensive psychological analysis of a wife’s possessiveness to her husband, and a mother’s possessiveness to her son’s; William and Paul. The conclusion will review all the important aspects of this analysis. It is very important to know about men’s relationships in D.H Lawrence’s Sons and Lovers this will give us another vision of the novel, and where we can situate Paul’s true relationships, that is to say, his relationships with his father and the other men in the novel.

Keywords: language, literature, English, civilisation

Procedia PDF Downloads 460