Search results for: Data Assimilation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7460

Search results for: Data Assimilation

7460 Information Seeking through Assimilation Process in Thai Organization

Authors: Pornprom Chomngam

Abstract:

The purpose of this study is to examine employee assessments of the usefulness/value of different types of information available to those employees during the process of organizational assimilation. Participants in the study were 247 “new" employees at Bangkok Bank. Bangkok Bank considers employees whose length of stay with the bank has been less than 18 months as new employees. Questionnaires were administered to all of the Bank-s new employees to obtain the data for this study. Repeated measures analysis was used to analyze the data. The data were summed and coded by using Statistical Package for Social Science. Newcomers indicate that social information is the most useful information, followed by job (technical, referent, and appraisal information), political, normative, and organizational information. Essentially, social, job, and political information are evaluated by newcomers as highly useful, while normative and organizational information are rated as moderately useful.

Keywords: Information seeking, organization assimilation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1661
7459 A Parallel Approach for 3D-Variational Data Assimilation on GPUs in Ocean Circulation Models

Authors: Rossella Arcucci, Luisa D’Amore, Simone Celestino, Giuseppe Scotti, Giuliano Laccetti

Abstract:

This work is the first dowel in a rather wide research activity in collaboration with Euro Mediterranean Center for Climate Changes, aimed at introducing scalable approaches in Ocean Circulation Models. We discuss designing and implementation of a parallel algorithm for solving the Variational Data Assimilation (DA) problem on Graphics Processing Units (GPUs). The algorithm is based on the fully scalable 3DVar DA model, previously proposed by the authors, which uses a Domain Decomposition approach (we refer to this model as the DD-DA model). We proceed with an incremental porting process consisting of 3 distinct stages: requirements and source code analysis, incremental development of CUDA kernels, testing and optimization. Experiments confirm the theoretic performance analysis based on the so-called scale up factor demonstrating that the DD-DA model can be suitably mapped on GPU architectures.

Keywords: Data Assimilation, Parallel Algorithm, GPU architectures, Ocean Models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2010
7458 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with ensemble Kalman filter (EnKF) for oil reservoir history matching problem. The use of large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional efforts.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2281
7457 On Adaptive Optimization of Filter Performance Based on Markov Representation for Output Prediction Error

Authors: Hong Son Hoang, Remy Baraille

Abstract:

This paper addresses the problem of how one can improve the performance of a non-optimal filter. First the theoretical question on dynamical representation for a given time correlated random process is studied. It will be demonstrated that for a wide class of random processes, having a canonical form, there exists a dynamical system equivalent in the sense that its output has the same covariance function. It is shown that the dynamical approach is more effective for simulating and estimating a Markov and non- Markovian random processes, computationally is less demanding, especially with increasing of the dimension of simulated processes. Numerical examples and estimation problems in low dimensional systems are given to illustrate the advantages of the approach. A very useful application of the proposed approach is shown for the problem of state estimation in very high dimensional systems. Here a modified filter for data assimilation in an oceanic numerical model is presented which is proved to be very efficient due to introducing a simple Markovian structure for the output prediction error process and adaptive tuning some parameters of the Markov equation.

Keywords: Statistical simulation, canonical form, dynamical system, Markov and non-Markovian processes, data assimilation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1298
7456 Collaborative Planning and Forecasting

Authors: Neha Asthana, Vishal Krishna Prasad

Abstract:

Collaborative Planning and Forecasting is an innovative and systematic approach towards productive integration and assimilation of data synergized into information. The changing and variable market dynamics have persuaded global business chains to incorporate Collaborative Planning and Forecasting as an imperative tool. Thus, it is essential for the supply chains to constantly improvise, update its nature, and mould as per changing global environment.

Keywords: Information transfer, Forecasting, Optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1905
7455 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential

Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag

Abstract:

Renewable energy sources are dependent on climatic variability, so for adequate energy planning, observations of the meteorological variables are required, preferably representing long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in the last decades, there is still a considerable lack of meteorological observations that form series of long periods. The reanalysis is a system of assimilation of data prepared using general atmospheric circulation models, based on the combination of data collected at surface stations, ocean buoys, satellites and radiosondes, allowing the production of long period data, for a wide gamma. The third generation of reanalysis data emerged in 2010, among them is the Climate Forecast System Reanalysis (CFSR) developed by the National Centers for Environmental Prediction (NCEP), these data have a spatial resolution of 0.50 x 0.50. In order to overcome these difficulties, it aims to evaluate the performance of solar radiation estimation through alternative data bases, such as data from Reanalysis and from meteorological satellites that satisfactorily meet the absence of observations of solar radiation at global and/or regional level. The results of the analysis of the solar radiation data indicated that the reanalysis data of the CFSR model presented a good performance in relation to the observed data, with determination coefficient around 0.90. Therefore, it is concluded that these data have the potential to be used as an alternative source in locations with no seasons or long series of solar radiation, important for the evaluation of solar energy potential.

Keywords: Climate, reanalysis, renewable energy, solar radiation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 906
7454 Four Decades of Greek Artistic Presence in Paris (1970-2010): Theory and Interpretation

Authors: Sapfo A. Mortaki

Abstract:

This article examines the presence of Greek immigrant artists (painters and sculptors) in Paris during 1970-2010. The aim is to highlight their presence in the French capital through archival research in the daily and periodical press as well as present the impact of their artistic activity on the French intellectual life and society. At the same time, their contribution to the development of cultural life in Greece becomes apparent. The integration of those migrant artists into an environment of cultural coexistence and the understanding of the social phenomenon of their migration, in the context of postmodernity, are being investigated. The cultural relations between the two countries are studied in the context of support mechanisms, such as the Greek community, cultural institutions, museums and galleries. The recognition of the Greek artists by the French society and the social dimension in the context of their activity in Paris, are discussed in terms of the assimilation theory. Since the 1970s, and especially since the fall of the dictatorship in Greece, in opposition to the prior situation, artists' contacts with their homeland have been significantly enhanced, with most of them now travelling to Paris, while others work in parallel in both countries. As a result, not only do the stages of the development of their work through their pursuits become visible, but, most importantly, the artistic world becomes informed about the multifaceted expression of art through the succession of various contemporary currents. Thus, the participation of Greek artists in the international cultural landscape is demonstrated.

Keywords: Artistic migration, cultural impact, Greek artists, postmodernity, theory of assimilation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 805
7453 The Secrecy Underlying Young Language Learners- Learning

Authors: Nima Shakouri Masouleh, Razieh Bahraminezhad Jooneghani

Abstract:

The study investigated the educational implications that can be derived from the work of a variety of celebrated figures such as Piaget, Vygotsky, and Bruner that will be helpful in the field of language learning. However, the writer believed these views were previously expressed not full–fledged by Comenius who has been described by Howatt (1984) as a genius–the one that the history of language teaching can claim. And we owe to him more than anyone.

Keywords: restructuring, assimilation, equiliberation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803
7452 Uvulars Alternation in Hasawi Arabic: A Harmonic Serialism Approach

Authors: Huda Ahmed Al Taisan

Abstract:

This paper investigates a phonological phenomenon, which exhibits variation ‘alternation’ in terms of the uvular consonants [q] and [ʁ] in Hasawi Arabic. This dialect is spoken in Alahsa city, which is located in the Eastern province of Saudi Arabia. To the best of our knowledge, no such research has systematically studied this phenomenon in Hasawi Arabic dialect. This paper is significant because it fills the gap in the literature about this alternation phenomenon in this understudied dialect. A large amount of the data is extracted from several interviews the author has conducted with 10 participants, native speakers of the dialect, and complemented by additional forms from social media. The latter method of collecting the data adds to the significance of the research. The analysis of the data is carried out in Harmonic Serialism Optimality Theory (HS-OT), a version of the Optimality Theoretic (OT) framework, which holds that linguistic forms are the outcome of the interaction among violable universal constraints, and in the recent development of OT into a model that accounts for linguistic variation in harmonic derivational steps. This alternation process is assumed to be phonologically unconditioned and in free variation in other varieties of Arabic dialects in the area. The goal of this paper is to investigate whether this phenomenon is in free variation or governed, what governs this alternation between [q] and [ʁ] and whether the alternation is phonological or other linguistic constraints are in action. The results show that the [q] and [ʁ] alternation is not free and it occurs due to different assimilation processes. Positional, segmental sequence and vowel adjacency factors are in action in Hasawi Arabic.

Keywords: Harmonic serialism, Hasawi, uvular, alternation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 652
7451 The Effects of Mobile Phones in Mitigating Cultural Shock Amongst Refugees: Case of South Africa

Authors: Sarah Vuningoma, Maria Rosa Lorini, Wallace Chigona

Abstract:

The potential of mobile phones is evident in their ability to address isolation and loneliness, support the improvement of interpersonal relations, and contribute to the facilitation of assimilation processes. Mobile phones can play a role in facilitating the integration of refugees into a new environment. This study aims to evaluate the impact of mobile phone use on helping refugees navigate the challenges posed by cultural differences in the host country. Semi-structured interviews were employed to collect data for the study, involving a sample size of 27 participants. Participants in the study were refugees based in South Africa, and thematic analysis was the chosen method for data analysis. The research highlights the numerous challenges faced by refugees in their host nation, including a lack of local cultural skills, the separation of family and friends from their countries of origin, hurdles in acquiring legal documentation, and the complexities of assimilating into the unfamiliar community. The use of mobile phones by refugees comes with several advantages, such as the advancement of language and cultural understanding, seamless integration into the host country, streamlined communication, and the exploration of diverse opportunities. Concurrently, mobile phones allow refugees in South Africa to manage the impact of culture shock.

Keywords: Mobile phones, culture shock, refugees, South Africa.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 192
7450 Cultural Integration as a Factor of Genesis of the Kazakh Nation in the Conditions of Multicultural Society

Authors: Kadyraliyeva Altynay Mustafayevna, Zholdubayeva Azhar Kuanyshbekovna, Alimzhanova Aliya Sharabekovna, Zhiyenbekova Ainur Abdurakhmanovna, Asanov Seylbek Sadikovich

Abstract:

The article analyses historical aspects of the formation of the Kazakh nation in the conditions of the multicultural society. The authors underline cultural integration as a significant stage of the cultural advancement of the Kazakh nation. The transition to the modern-style houses, the adoption and development of the secular education gave a rise to the development of the society and culture on the whole.

Keywords: Assimilation, culture genesis, cultural integration, multiculturalism

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1895
7449 Functioning of Turkic Elements in Modern Hindi

Authors: B. S. Bokuleva, R. A. Avakova, A. A. Sultangubieva, U. Schamiloglu

Abstract:

It is discussed about modern usage of adopted words and their vocabularies, Turkism usage fields, phonetic, grammatical and lexis-semantic assimilation of the typological-morphological structures of entering to different Hindi languages in comparative typological aspects in this scientific article. The lexis vocabulary is rich, the prevalence area is wide and it has researched the entering process of vocabulary into the great languages of Turkic elements from the speakers- numbers. The research work has worked on the base of Hindi vocabulary.

Keywords: Adopted words, language communications, Turkism, Turkic languages.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2166
7448 An Optimal Control Method for Reconstruction of Topography in Dam-Break Flows

Authors: Alia Alghosoun, Nabil El Moçayd, Mohammed Seaid

Abstract:

Modeling dam-break flows over non-flat beds requires an accurate representation of the topography which is the main source of uncertainty in the model. Therefore, developing robust and accurate techniques for reconstructing topography in this class of problems would reduce the uncertainty in the flow system. In many hydraulic applications, experimental techniques have been widely used to measure the bed topography. In practice, experimental work in hydraulics may be very demanding in both time and cost. Meanwhile, computational hydraulics have served as an alternative for laboratory and field experiments. Unlike the forward problem, the inverse problem is used to identify the bed parameters from the given experimental data. In this case, the shallow water equations used for modeling the hydraulics need to be rearranged in a way that the model parameters can be evaluated from measured data. However, this approach is not always possible and it suffers from stability restrictions. In the present work, we propose an adaptive optimal control technique to numerically identify the underlying bed topography from a given set of free-surface observation data. In this approach, a minimization function is defined to iteratively determine the model parameters. The proposed technique can be interpreted as a fractional-stage scheme. In the first stage, the forward problem is solved to determine the measurable parameters from known data. In the second stage, the adaptive control Ensemble Kalman Filter is implemented to combine the optimality of observation data in order to obtain the accurate estimation of the topography. The main features of this method are on one hand, the ability to solve for different complex geometries with no need for any rearrangements in the original model to rewrite it in an explicit form. On the other hand, its achievement of strong stability for simulations of flows in different regimes containing shocks or discontinuities over any geometry. Numerical results are presented for a dam-break flow problem over non-flat bed using different solvers for the shallow water equations. The robustness of the proposed method is investigated using different numbers of loops, sensitivity parameters, initial samples and location of observations. The obtained results demonstrate high reliability and accuracy of the proposed techniques.

Keywords: Optimal control, ensemble Kalman Filter, topography reconstruction, data assimilation, shallow water equations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 679
7447 Assessing Innovation Activity in Mexico and South Korea: An Econometric Approach

Authors: Mario Gómez, Won Ho Kim, Ángel Licona, José Carlos Rodríguez

Abstract:

This article analyzes innovation activity in Mexico and South Korea. It develops an econometric model to test for structural breaks in the number of patent applications filed by residents and nonresidents in these countries during the period of 1965 to 2012. These changes may suggest that firms’ innovative capabilities have changed because of implementing different science, technology and innovation (STI) policies in Mexico and South Korea. Two important features characterize this research from others already developed by these authors. First, the theoretical research framework in this research is the debate between the assimilation view of growth and the accumulation view of growth. This characteristic suggests that trade liberalization should be accompanied by an adequate STI policy to boost competitiveness among indigenous firms. Second, the analysis in this research stresses the importance of key actors (e.g. governments) to successfully develop innovation capabilities among indigenous firms. Therefore, the question conducting this research is how STI policies in Mexico and South Korea contributed to develop firms’ innovation capabilities in these countries during last decades? The results from this research suggests that STI policy in South Korea was more suitable to boost innovation firms to compete in markets. Data to develop this research was released by the World Intellectual Property Organization (WIPO).

Keywords: Econometric methods, innovation, Mexico, South Korea, STI Policy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 962
7446 Developing Islamic Module Project for Preschool Teachers Using Modified Delphi Technique

Authors: Mazeni Ismail, Nurul Aliah, Hasmadi Hassan

Abstract:

The purpose of this study is to gather the consensus of experts regarding the use of moral guidance amongst preschool teachers vis-a-vis the Islamic Project module (I-Project Module). This I-Project Module seeks to provide pertinent data on the assimilation of noble values in subject-matter teaching. To obtain consensus for the various components of the module, the Modified Delphi technique was used to develop the module. 12 subject experts from various educational fields of Islamic education, early childhood education, counselling and language fully participated in the development of this module. The Modified Delphi technique was administered in two mean cycles. The standard deviation value derived from questionnaires completed by the participating panel of experts provided the value of expert consensus reached. This was subsequently analyzed using SPSS version 22. Findings revealed that the panel of experts reached a discernible degree of agreement on five topics outlined in the module, viz; content (mean value 3.36), teaching strategy (mean value 3.28), programme duration (mean value 3.0), staff involved and attention-grabbing strategy of target group participating in the value program (mean value 3.5), and strategy to attract attention of target group to utilize i-project (mean value 3.0). With regard to the strategy to attract the attention of the target group, the experts proposed for creative activities to be added in order to enhance teachers’ creativity.

Keywords: Islamic project, modified Delphi technique, project approach, teacher moral guidance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 736
7445 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and more importantly the challenges it pose to the subject of privacy and data protection. First, the nature of big data will be briefly deliberated before presenting the potential of big data in the present days. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing this issue in big data. In conclusion, the paper will put forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3924
7444 An Innovation Capability Maturity Model – Development and Initial Application

Authors: H. Essmann, N. du Preez

Abstract:

The seemingly ambiguous title of this paper – use of the terms maturity and innovation in concord – signifies the imperative of every organisation within the competitive domain. Where organisational maturity and innovativeness were traditionally considered antonymous, the assimilation of these two seemingly contradictory notions is fundamental to the assurance of long-term organisational prosperity. Organisations are required, now more than ever, to grow and mature their innovation capability – rending consistent innovative outputs. This paper describes research conducted to consolidate the principles of innovation and identify the fundamental components that constitute organisational innovation capability. The process of developing an Innovation Capability Maturity Model is presented. A brief description is provided of the basic components of the model, followed by a description of the case studies that were conducted to evaluate the model. The paper concludes with a summary of the findings and potential future research.

Keywords: Capability maturity, innovation, innovation capability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4183
7443 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set but this is not happened. Thus, we present the most well know algorithms for each step of data pre-processing so that one achieves the best performance for their data set.

Keywords: Data mining, feature selection, data cleaning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6089
7442 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4873
7441 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2611
7440 The Effect of Pyridoxine and Different Levels of Nitrogen on Physiological Indices of Corn(Zea Mays L.var.sc704)

Authors: Gholamreza Farrokhi, Babak Paykarestan

Abstract:

One field experiment was conducted on corn (Zea mays L.Var. SC 704) to study the effect of three different basic levels of nitrogen (90, 140and 190 Kg/ha as urea) with 0.01% and 0.02% pyridoxine pre-sowing seed soaking for 8 hours. Water-soaked seeds were treated as controled. biomass production was recorded on 45, 70 and 95 days after sowing. Total dry material (TDM), leaf area index (LAI), crop growth rate (CGR), relative growth rate (RGR) and net assimilation rate (NAR) was calculated form 45until 95 days after sowing. Yield and its components such as kernel yield, grain weight, biologic yield, harvest index and protein percentage was measured at harvest. In general, 0.02% pyridoxine and 190 Kg pure nitrogen/ha was shown gave maximum value for growth and yield parameters. N190 + 0.02 % pyridoxine enhanced seed yield and biologic yield by 57.15% and 62.98% compared to 90kg N and water – soaked treatment.

Keywords: Corn, Growth Indices, Nitrogen Levels, Physiological Indices, Pyridoxine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725
7439 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1558
7438 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2480
7437 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, there have been a lot of efforts and studies are carried out in growing proficient tools for performing various tasks in big data. Recently big data have gotten a lot of publicity for their good reasons. Due to the large and complex collection of datasets it is difficult to process on traditional data processing applications. This concern turns to be further mandatory for producing various tools in big data. Moreover, the main aim of big data analytics is to utilize the advanced analytic techniques besides very huge, different datasets which contain diverse sizes from terabytes to zettabytes and diverse types such as structured or unstructured and batch or streaming. Big data is useful for data sets where their size or type is away from the capability of traditional relational databases for capturing, managing and processing the data with low-latency. Thus the out coming challenges tend to the occurrence of powerful big data tools. In this survey, a various collection of big data tools are illustrated and also compared with the salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3775
7436 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is efficient approach to organize data. When data is classified to multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data includes every label of the given set of labels within the range of the set. Desirable properties of the orders, data is also expressed by the higher set of labels and different sets of labels express different data, are discussed for the orders.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classificaiton, Orders of Sets of Labels

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1303
7435 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of ever-increasing use of distributed data in computer networks is obvious for all. One technique that is performed on the distributed data for increasing of efficiency and reliablity is data rplication. In this paper, after introducing this technique and its advantages, we will examine some dynamic data replication. We will examine their characteristies for some overus scenario and the we will propose some suggestion for their improvement.

Keywords: data replication, data hiding, consistency, dynamicdata replication strategy

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
7434 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2010
7433 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2143
7432 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2791
7431 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our Medicine-oriented research is based on a medical data set of real patients. It is a security problem to share patient private data with peoples other than clinician or hospital staff. We have to remove person identification information from medical data. The medical data without private data are available after a de-identification process for any research purposes. In this paper, we introduce an universal automatic rule-based de-identification application to do all this stuff on an heterogeneous medical data. A patient private identification is replaced by an unique identification number, even in burnedin annotation in pixel data. The identical identification is used for all patient medical data, so it keeps relationships in a data. Hospital can take an advantage of a research feedback based on results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642