Search results for: multi-source data fusion
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24507

Search results for: multi-source data fusion

24177 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises

Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto

Abstract:

The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler to an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and data can be effectively utilized and sovereignly governed. As this specific knowledge has been paid little attention to so far by academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights of various industry case studies and literature research.

Keywords: data management, digitization, industry 4.0, knowledge engineering, metamodel

Procedia PDF Downloads 329
24176 Heterotopic Ossification: DISH and Myositis Ossificans in Human Remains Identification

Authors: Patricia Shirley Almeida Prado, Liz Brito, Selma Paixão Argollo, Gracie Moreira, Leticia Matos Sobrinho

Abstract:

Diffuse idiopathic skeletal hyperostosis (DISH) is a degenerative bone disease also known as Forestier´s disease and ankylosing hyperostosis of the spine is characterized by a tendency toward ossification of half the anterior longitudinal spinal ligament without intervertebral disc disease. DISH is not considered to be osteoarthritis, although the two conditions commonly occur together. Diagnostic criteria include fusion of at least four vertebrae by bony bridges arising from the anterolateral aspect of the vertebral bodies. These vertebral bodies have a 'dripping candle wax' appearance, also can be seen periosteal new bone formation on the anterior surface of the vertebral bodies and there is no ankylosis at zygoapophyseal facet joint. Clinically, patients with DISH tend to be asymptomatic some patients mention moderate pain and stiffness in upper back. This disease is more common in man, uncommon in patients younger than 50 years and rare in patients under 40 years old. In modern populations, DISH is found in association with obesity, (type II) diabetes; abnormal vitamin A metabolism and also associated with higher levels of serum uric acid. There is also some association between the increase of risk of stroke or other cerebrovascular disease. The DISH condition can be confused with Heterotopic Ossification, what is the bone formation in the soft tissues as the result of trauma, wounding, surgery, burnings, prolonged immobility and some central nervous system disorder. All these conditions have been described extensively as myositis ossificans which can be confused with the fibrodysplasia (myositis) ossificans progressive. As in the DISH symptomatology it can be asymptomatic or extensive enough to impair joint function. A third confusion osteoarthritis disease that can bring confusion are the enthesopathies that occur in the entire skeleton being common on the ischial tuberosities, iliac crests, patellae, and calcaneus. Ankylosis of the sacroiliac joint by bony bridges may also be found. CASE 1: this case is skeletal remains presenting skull, some vertebrae and scapulae. This case remains unidentified and due to lack of bone remains. Sex, age and ancestry profile was compromised, however the DISH pathognomonic findings and diagnostic helps to estimate sex and age characteristics. Moreover to presenting DISH these skeletal remains also showed some bone alterations and non-metrics as fusion of the first vertebrae with occipital bone, maxillae and palatine torus and scapular foramen on the right scapulae. CASE 2: this skeleton remains shows an extensive bone heterotopic ossification on the great trochanter area of left femur, right fibula showed a healed fracture in its body however in its inteosseous crest there is an extensive bone growth, also in the Ilium at the region of inferior gluteal line can be observed some pronounced bone growth and the skull presented a pronounced mandibular, maxillary and palatine torus. Despite all these pronounced heterotopic ossification the whole skeleton presents moderate bone overgrowth that is not linked with aging, since the skeleton belongs to a young unidentified individual. The appropriate osteopathological diagnosis support the human identification process through medical reports and also assist with epidemiological data that can strengthen vulnerable anthropological estimates.

Keywords: bone disease, DISH, human identification, human remains

Procedia PDF Downloads 303
24175 Analysis and Forecasting of Bitcoin Price Using Exogenous Data

Authors: J-C. Leneveu, A. Chereau, L. Mansart, T. Mesbah, M. Wyka

Abstract:

Extracting and interpreting information from Big Data represent a stake for years to come in several sectors such as finance. Currently, numerous methods are used (such as Technical Analysis) to try to understand and to anticipate market behavior, with mixed results because it still seems impossible to exactly predict a financial trend. The increase of available data on Internet and their diversity represent a great opportunity for the financial world. Indeed, it is possible, along with these standard financial data, to focus on exogenous data to take into account more macroeconomic factors. Coupling the interpretation of these data with standard methods could allow obtaining more precise trend predictions. In this paper, in order to observe the influence of exogenous data price independent of other usual effects occurring in classical markets, behaviors of Bitcoin users are introduced in a model reconstituting Bitcoin value, which is elaborated and tested for prediction purposes.

Keywords: big data, bitcoin, data mining, social network, financial trends, exogenous data, global economy, behavioral finance

Procedia PDF Downloads 333
24174 Data and Model-based Metamodels for Prediction of Performance of Extended Hollo-Bolt Connections

Authors: M. Cabrera, W. Tizani, J. Ninic, F. Wang

Abstract:

Open section beam to concrete-filled tubular column structures has been increasingly utilized in construction over the past few decades due to their enhanced structural performance, as well as economic and architectural advantages. However, the use of this configuration in construction is limited due to the difficulties in connecting the structural members as there is no access to the inner part of the tube to install standard bolts. Blind-bolted systems are a relatively new approach to overcome this limitation as they only require access to one side of the tubular section to tighten the bolt. The performance of these connections in concrete-filled steel tubular sections remains uncharacterized due to the complex interactions between concrete, bolt, and steel section. Over the last years, research in structural performance has moved to a more sophisticated and efficient approach consisting of machine learning algorithms to generate metamodels. This method reduces the need for developing complex, and computationally expensive finite element models, optimizing the search for desirable design variables. Metamodels generated by a data fusion approach use numerical and experimental results by combining multiple models to capture the dependency between the simulation design variables and connection performance, learning the relations between different design parameters and predicting a given output. Fully characterizing this connection will transform high-rise and multistorey construction by means of the introduction of design guidance for moment-resisting blind-bolted connections, which is currently unavailable. This paper presents a review of the steps taken to develop metamodels generated by means of artificial neural network algorithms which predict the connection stress and stiffness based on the design parameters when using Extended Hollo-Bolt blind bolts. It also provides consideration of the failure modes and mechanisms that contribute to the deformability as well as the feasibility of achieving blind-bolted rigid connections when using the blind fastener.

Keywords: blind-bolted connections, concrete-filled tubular structures, finite element analysis, metamodeling

Procedia PDF Downloads 136
24173 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment: A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper, we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: mobile health, data integration, expert systems, disease-related malnutrition

Procedia PDF Downloads 457
24172 A Quantitative Plan for Drawing Down Emissions to Attenuate Climate Change

Authors: Terry Lucas

Abstract:

Calculations are performed to quantify the potential contribution of each greenhouse gas emission reduction strategy. This approach facilitates the visualisation of the relative benefits of each, and it provides a potential baseline for the development of a plan of action that is rooted in quantitative evaluation. Emissions reductions are converted to potential de-escalation of global average temperature. A comprehensive plan is then presented which shows the potential benefits all the way out to year 2100. A target temperature de-escalation of 2oC was selected, but the plan shows a benefit of only 1.225oC. This latter disappointing result is in spite of new and powerful technologies introduced into the equation. These include nuclear fusion and alternative nuclear fission processes. Current technologies such as wind, solar and electric vehicles show surprisingly small constributions to the whole.

Keywords: climate change, emissions, drawdown, energy

Procedia PDF Downloads 100
24171 The Prospects of Leveraging (Big) Data for Accelerating a Just Sustainable Transition around Different Contexts

Authors: Sombol Mokhles

Abstract:

This paper tries to show the prospects of utilising (big)data for enabling just the transition of diverse cities. Our key purpose is to offer a framework of applications and implications of utlising (big) data in comparing sustainability transitions across different cities. Relying on the cosmopolitan comparison, this paper explains the potential application of (big) data but also its limitations. The paper calls for adopting a data-driven and just perspective in including different cities around the world. Having a just and inclusive approach at the front and centre ensures a just transition with synergistic effects that leave nobody behind.

Keywords: big data, just sustainable transition, cosmopolitan city comparison, cities

Procedia PDF Downloads 72
24170 Strategic Workplace Security: The Role of Malware and the Threat of Internal Vulnerability

Authors: Modesta E. Ezema, Christopher C. Ezema, Christian C. Ugwu, Udoka F. Eze, Florence M. Babalola

Abstract:

Some employees knowingly or unknowingly contribute to loss of data and also expose data to threat in the process of getting their jobs done. Many organizations today are faced with the challenges of how to secure their data as cyber criminals constantly devise new ways of attacking the organization’s secret data. However, this paper enlists the latest strategies that must be put in place in order to protect these important data from being attacked in a collaborative work place. It also introduces us to Advanced Persistent Threats (APTs) and how it works. The empirical study was conducted to collect data from the employee in data centers on how data could be protected from malicious codes and cyber criminals and their responses are highly considered to help checkmate the activities of malicious code and cyber criminals in our work places.

Keywords: data, employee, malware, work place

Procedia PDF Downloads 359
24169 Acceptance of Big Data Technologies and Its Influence towards Employee’s Perception on Job Performance

Authors: Jia Yi Yap, Angela S. H. Lee

Abstract:

With the use of big data technologies, organization can get result that they are interested in. Big data technologies simply load all the data that is useful for the organizations and provide organizations a better way of analysing data. The purpose of this research is to get employees’ opinion from films in Malaysia to explore the use of big data technologies in their organization in order to provide how it may affect the perception of the employees on job performance. Therefore, in order to identify will accepting big data technologies in the organization affect the perception of the employee, questionnaire will be distributed to different employee from different Small and medium-sized enterprises (SME) organization listed in Malaysia. The conceptual model proposed will test with other variables in order to see the relationship between variables.

Keywords: big data technologies, employee, job performance, questionnaire

Procedia PDF Downloads 271
24168 Data Poisoning Attacks on Federated Learning and Preventive Measures

Authors: Beulah Rani Inbanathan

Abstract:

In the present era, it is vivid from the numerous outcomes that data privacy is being compromised in various ways. Machine learning is one technology that uses the centralized server, and then data is given as input which is being analyzed by the algorithms present on this mentioned server, and hence outputs are predicted. However, each time the data must be sent by the user as the algorithm will analyze the input data in order to predict the output, which is prone to threats. The solution to overcome this issue is federated learning, where the models alone get updated while the data resides on the local machine and does not get exchanged with the other local models. Nevertheless, even on these local models, there are chances of data poisoning, and it is crystal clear from various experiments done by many people. This paper delves into many ways where data poisoning occurs and the many methods through which it is prevalent that data poisoning still exists. It includes the poisoning attacks on IoT devices, Edge devices, Autoregressive model, and also, on Industrial IoT systems and also, few points on how these could be evadible in order to protect our data which is personal, or sensitive, or harmful when exposed.

Keywords: data poisoning, federated learning, Internet of Things, edge computing

Procedia PDF Downloads 62
24167 Simulation and Hardware Implementation of Data Communication Between CAN Controllers for Automotive Applications

Authors: R. M. Kalayappan, N. Kathiravan

Abstract:

In automobile industries, Controller Area Network (CAN) is widely used to reduce the system complexity and inter-task communication. Therefore, this paper proposes the hardware implementation of data frame communication between one controller to other. The CAN data frames and protocols will be explained deeply, here. The data frames are transferred without any collision or corruption. The simulation is made in the KEIL vision software to display the data transfer between transmitter and receiver in CAN. ARM7 micro-controller is used to transfer data’s between the controllers in real time. Data transfer is verified using the CRO.

Keywords: control area network (CAN), automotive electronic control unit, CAN 2.0, industry

Procedia PDF Downloads 372
24166 Analysis of Nonlinear and Non-Stationary Signal to Extract the Features Using Hilbert Huang Transform

Authors: A. N. Paithane, D. S. Bormane, S. D. Shirbahadurkar

Abstract:

It has been seen that emotion recognition is an important research topic in the field of Human and computer interface. A novel technique for Feature Extraction (FE) has been presented here, further a new method has been used for human emotion recognition which is based on HHT method. This method is feasible for analyzing the nonlinear and non-stationary signals. Each signal has been decomposed into the IMF using the EMD. These functions are used to extract the features using fission and fusion process. The decomposition technique which we adopt is a new technique for adaptively decomposing signals. In this perspective, we have reported here potential usefulness of EMD based techniques.We evaluated the algorithm on Augsburg University Database; the manually annotated database.

Keywords: intrinsic mode function (IMF), Hilbert-Huang transform (HHT), empirical mode decomposition (EMD), emotion detection, electrocardiogram (ECG)

Procedia PDF Downloads 545
24165 High Throughput LC-MS/MS Studies on Sperm Proteome of Malnad Gidda (Bos Indicus) Cattle

Authors: Kerekoppa Puttaiah Bhatta Ramesha, Uday Kannegundla, Praseeda Mol, Lathika Gopalakrishnan, Jagish Kour Reen, Gourav Dey, Manish Kumar, Sakthivel Jeyakumar, Arumugam Kumaresan, Kiran Kumar M., Thottethodi Subrahmanya Keshava Prasad

Abstract:

Spermatozoa are the highly specialized transcriptionally and translationally inactive haploid male gamete. The understanding of proteome of sperm is indispensable to explore the mechanism of sperm motility and fertility. Though there is a large number of human sperm proteomic studies, in-depth proteomic information on Bos indicus spermatozoa is not well established yet. Therefore, we illustrated the profile of sperm proteome in indigenous cattle, Malnad gidda (Bos Indicus), using high-resolution mass spectrometry. In the current study, two semen ejaculates from 3 breeding bulls were collected employing the artificial vaginal method. Using 45% percoll purification, spermatozoa cells were isolated. Protein was extracted using lysis buffer containing 2% Sodium Dodecyl Sulphate (SDS) and protein concentration was estimated. Fifty micrograms of protein from each individual were pooled for further downstream processing. Pooled sample was fractionated using SDS-Poly Acrylamide Gel Electrophoresis, which is followed by in-gel digestion. The peptides were subjected to C18 Stage Tip clean-up and analyzed in Orbitrap Fusion Tribrid mass spectrometer interfaced with Proxeon Easy-nano LC II system (Thermo Scientific, Bremen, Germany). We identified a total of 6773 peptides with 28426 peptide spectral matches, which belonged to 1081 proteins. Gene ontology analysis has been carried out to determine the biological processes, molecular functions and cellular components associated with sperm protein. The biological process chiefly represented our data is an oxidation-reduction process (5%), spermatogenesis (2.5%) and spermatid development (1.4%). The highlighted molecular functions are ATP, and GTP binding (14%) and the prominent cellular components most observed in our data were nuclear membrane (1.5%), acrosomal vesicle (1.4%), and motile cilium (1.3%). Seventeen percent of sperm proteins identified in this study were involved in metabolic pathways. To the best of our knowledge, this data represents the first total sperm proteome from indigenous cattle, Malnad Gidda. We believe that our preliminary findings could provide a strong base for the future understanding of bovine sperm proteomics.

Keywords: Bos indicus, Malnad Gidda, mass spectrometry, spermatozoa

Procedia PDF Downloads 173
24164 Improving the Statistics Nature in Research Information System

Authors: Rajbir Cheema

Abstract:

In order to introduce an integrated research information system, this will provide scientific institutions with the necessary information on research activities and research results in assured quality. Since data collection, duplication, missing values, incorrect formatting, inconsistencies, etc. can arise in the collection of research data in different research information systems, which can have a wide range of negative effects on data quality, the subject of data quality should be treated with better results. This paper examines the data quality problems in research information systems and presents the new techniques that enable organizations to improve their quality of research information.

Keywords: Research information systems (RIS), research information, heterogeneous sources, data quality, data cleansing, science system, standardization

Procedia PDF Downloads 128
24163 Data Mining Meets Educational Analysis: Opportunities and Challenges for Research

Authors: Carla Silva

Abstract:

Recent development of information and communication technology enables us to acquire, collect, analyse data in various fields of socioeconomic – technological systems. Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there has been a critical need for automated approaches to effective and efficient usage of massive amount of educational data, in order to support institutions to a strategic planning and investment decision-making. In this article, we will address data from several different perspectives and define the applied data to sciences. Many believe that 'big data' will transform business, government, and other aspects of the economy. We discuss how new data may impact educational policy and educational research. Large scale administrative data sets and proprietary private sector data can greatly improve the way we measure, track, and describe educational activity and educational impact. We also consider whether the big data predictive modeling tools that have emerged in statistics and computer science may prove useful in educational and furthermore in economics. Finally, we highlight a number of challenges and opportunities for future research.

Keywords: data mining, research analysis, investment decision-making, educational research

Procedia PDF Downloads 330
24162 Laser-Hole Boring into Overdense Targets: A Detailed Study on Laser and Target Properties

Authors: Florian Wagner, Christoph Schmidt, Vincent Bagnoud

Abstract:

Understanding the interaction of ultra-intense laser pulses with overcritical targets is of major interest for many applications such as laser-driven ion acceleration, fast ignition in the frame of inertial confinement fusion or high harmonic generation and the creation of attosecond pulses. One particular aspect of this interaction is the shift of the critical surface, where the laser pulse is stopped and the absorption is at maximum, due to the radiation pressure induced by the laser pulse, also referred to as laser hole boring. We investigate laser-hole boring experimentally by measuring the backscattered spectrum which is doppler-broadened because of the movement of the reflecting surface. Using the high-power, high-energy laser system PHELIX in Darmstadt, we gathered an extensive set of data for different laser intensities ranging from 10^18 W/cm2 to 10^21 W/cm2, two different levels of the nanosecond temporal contrast (10^6 vs. 10^11), elliptical and linear polarization and varying target configurations. In this contribution we discuss how the maximum velocity of the critical surface depends on these parameters. In particular we show that by increasing the temporal contrast the maximum hole boring velocity is decreased by more than a factor of three. Our experimental findings are backed by a basic analytical model based on momentum and mass conservation as well as particle in cell simulations. These results are of particular importance for fast ignition since they contribute to a better understanding of the transport of the ignitor pulse into the overdense region.

Keywords: laser-hole boring, interaction of ultra-intense lasers with overcritical targets, fast ignition, relativistic laser motter interaction

Procedia PDF Downloads 369
24161 A Method of Detecting the Difference in Two States of Brain Using Statistical Analysis of EEG Raw Data

Authors: Digvijaysingh S. Bana, Kiran R. Trivedi

Abstract:

This paper introduces various methods for the alpha wave to detect the difference between two states of brain. One healthy subject participated in the experiment. EEG was measured on the forehead above the eye (FP1 Position) with reference and ground electrode are on the ear clip. The data samples are obtained in the form of EEG raw data. The time duration of reading is of one minute. Various test are being performed on the alpha band EEG raw data.The readings are performed in different time duration of the entire day. The statistical analysis is being carried out on the EEG sample data in the form of various tests.

Keywords: electroencephalogram(EEG), biometrics, authentication, EEG raw data

Procedia PDF Downloads 440
24160 Cognition of Driving Context for Driving Assistance

Authors: Manolo Dulva Hina, Clement Thierry, Assia Soukane, Amar Ramdane-Cherif

Abstract:

In this paper, we presented our innovative way of determining the driving context for a driving assistance system. We invoke the fusion of all parameters that describe the context of the environment, the vehicle and the driver to obtain the driving context. We created a training set that stores driving situation patterns and from which the system consults to determine the driving situation. A machine-learning algorithm predicts the driving situation. The driving situation is an input to the fission process that yields the action that must be implemented when the driver needs to be informed or assisted from the given the driving situation. The action may be directed towards the driver, the vehicle or both. This is an ongoing work whose goal is to offer an alternative driving assistance system for safe driving, green driving and comfortable driving. Here, ontologies are used for knowledge representation.

Keywords: cognitive driving, intelligent transportation system, multimodal system, ontology, machine learning

Procedia PDF Downloads 337
24159 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 54
24158 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 68
24157 SARS-CoV-2: Prediction of Critical Charged Amino Acid Mutations

Authors: Atlal El-Assaad

Abstract:

Viruses change with time through mutations and result in new variants that may persist or disappear. A Mutation refers to an actual change in the virus genetic sequence, and a variant is a viral genome that may contain one or more mutations. Critical mutations may cause the virus to be more transmissible, with high disease severity, and more vulnerable to diagnostics, therapeutics, and vaccines. Thus, variants carrying such mutations may increase the risk to human health and are considered variants of concern (VOC). Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) - the contagious in humans, positive-sense single-stranded RNA virus that caused coronavirus disease 2019 (COVID-19) - has been studied thoroughly, and several variants were revealed across the world with their corresponding mutations. SARS-CoV-2 has four structural proteins, known as the S (spike), E (envelope), M (membrane), and N (nucleocapsid) proteins, but prior study and vaccines development focused on genetic mutations in the S protein due to its vital role in allowing the virus to attach and fuse with the membrane of a host cell. Specifically, subunit S1 catalyzes attachment, whereas subunit S2 mediates fusion. In this perspective, we studied all charged amino acid mutations of the SARS-CoV-2 viral spike protein S1 when bound to Antibody CC12.1 in a crystal structure and assessed the effect of different mutations. We generated all missense mutants of SARS-CoV-2 protein amino acids (AAs) within the SARS-CoV-2:CC12.1 complex model. To generate the family of mutants in each complex, we mutated every charged amino acid with all other charged amino acids (Lysine (K), Arginine (R), Glutamic Acid (E), and Aspartic Acid (D)) and studied the new binding of the complex after each mutation. We applied Poisson-Boltzmann electrostatic calculations feeding into free energy calculations to determine the effect of each mutation on binding. After analyzing our data, we identified charged amino acids keys for binding. Furthermore, we validated those findings against published experimental genetic data. Our results are the first to propose in silico potential life-threatening mutations of SARS-CoV-2 beyond the present mutations found in the five common variants found worldwide.

Keywords: SARS-CoV-2, variant, ionic amino acid, protein-protein interactions, missense mutation, AESOP

Procedia PDF Downloads 79
24156 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Ji Eun Shin, Dong Hoon Lim

Abstract:

Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.

Keywords: big data, combiner, K-means clustering, RHadoop

Procedia PDF Downloads 402
24155 Face Recognition Using Discrete Orthogonal Hahn Moments

Authors: Fatima Akhmedova, Simon Liao

Abstract:

One of the most critical decision points in the design of a face recognition system is the choice of an appropriate face representation. Effective feature descriptors are expected to convey sufficient, invariant and non-redundant facial information. In this work, we propose a set of Hahn moments as a new approach for feature description. Hahn moments have been widely used in image analysis due to their invariance, non-redundancy and the ability to extract features either globally and locally. To assess the applicability of Hahn moments to Face Recognition we conduct two experiments on the Olivetti Research Laboratory (ORL) database and University of Notre-Dame (UND) X1 biometric collection. Fusion of the global features along with the features from local facial regions are used as an input for the conventional k-NN classifier. The method reaches an accuracy of 93% of correctly recognized subjects for the ORL database and 94% for the UND database.

Keywords: face recognition, Hahn moments, recognition-by-parts, time-lapse

Procedia PDF Downloads 341
24154 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 135
24153 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 281
24152 Open Data for e-Governance: Case Study of Bangladesh

Authors: Sami Kabir, Sadek Hossain Khoka

Abstract:

Open Government Data (OGD) refers to all data produced by government which are accessible in reusable way by common people with access to Internet and at free of cost. In line with “Digital Bangladesh” vision of Bangladesh government, the concept of open data has been gaining momentum in the country. Opening all government data in digital and customizable format from single platform can enhance e-governance which will make government more transparent to the people. This paper presents a well-in-progress case study on OGD portal by Bangladesh Government in order to link decentralized data. The initiative is intended to facilitate e-service towards citizens through this one-stop web portal. The paper further discusses ways of collecting data in digital format from relevant agencies with a view to making it publicly available through this single point of access. Further, possible layout of this web portal is presented.

Keywords: e-governance, one-stop web portal, open government data, reusable data, web of data

Procedia PDF Downloads 327
24151 On a Theoretical Framework for Language Learning Apps Evaluation

Authors: Juan Manuel Real-Espinosa

Abstract:

This paper addresses the first step to evaluate language learning apps: what theoretical framework to adopt when designing the app evaluation framework. The answer is not just one since there are several options that could be proposed. However, the question to be clarified is to what extent the learning design of apps is based on a specific learning approach, or on the contrary, on a fusion of elements from several theoretical proposals and paradigms, such as m-learning, mobile assisted language learning, and a number of theories about language acquisition. The present study suggests that the reality is closer to the second assumption. This implies that the theoretical framework against which the learning design of the apps should be evaluated must also be a hybrid theoretical framework, which integrates evaluation criteria from the different theories involved in language learning through mobile applications.

Keywords: mobile-assisted language learning, action-oriented approach, apps evaluation, post-method pedagogy, second language acquisition

Procedia PDF Downloads 172
24150 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh

Abstract:

Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 128
24149 Data Mining Practices: Practical Studies on the Telecommunication Companies in Jordan

Authors: Dina Ahmad Alkhodary

Abstract:

This study aimed to investigate the practices of Data Mining on the telecommunication companies in Jordan, from the viewpoint of the respondents. In order to achieve the goal of the study, and test the validity of hypotheses, the researcher has designed a questionnaire to collect data from managers and staff members from main department in the researched companies. The results shows improvements stages of the telecommunications companies towered Data Mining.

Keywords: data, mining, development, business

Procedia PDF Downloads 469
24148 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance.Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: data quality, performance, system quality, Kingdom of Bahrain

Procedia PDF Downloads 461