Search results for: incomplete data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25305

25095 A Fuzzy Kernel K-Medoids Algorithm for Clustering Uncertain Data Objects

Authors: Behnam Tavakkol

Abstract:

Uncertain data mining algorithms use different ways to consider uncertainty in data such as by representing a data object as a sample of points or a probability distribution. Fuzzy methods have long been used for clustering traditional (certain) data objects. They are used to produce non-crisp cluster labels. For uncertain data, however, besides some uncertain fuzzy k-medoids algorithms, not many other fuzzy clustering methods have been developed. In this work, we develop a fuzzy kernel k-medoids algorithm for clustering uncertain data objects. The developed fuzzy kernel k-medoids algorithm is superior to existing fuzzy k-medoids algorithms in clustering data sets with non-linearly separable clusters.
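The abstract does not spell out the update rules, so the following is only a minimal sketch of how non-crisp labels can be computed with a kernel-induced distance. It assumes the standard fuzzy c-means membership formula, a Gaussian (RBF) kernel, and ordinary (certain) points rather than uncertain objects; the function names and the parameters `m` and `gamma` are illustrative and not taken from the paper.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length points (assumed kernel choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_distance_sq(x, y, gamma=1.0):
    """Squared distance in the kernel-induced feature space:
    k(x, x) - 2 k(x, y) + k(y, y)."""
    return rbf_kernel(x, x, gamma) - 2 * rbf_kernel(x, y, gamma) + rbf_kernel(y, y, gamma)

def fuzzy_memberships(point, medoids, m=2.0, gamma=1.0):
    """Return one membership per medoid; the memberships sum to 1."""
    d2 = [kernel_distance_sq(point, c, gamma) for c in medoids]
    # A point that coincides with a medoid belongs fully to that cluster.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    # Standard fuzzy c-means update with fuzzifier m > 1.
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in d2]
    total = sum(inverse)
    return [v / total for v in inverse]

medoids = [(0.0, 0.0), (1.0, 1.0)]
on_medoid = fuzzy_memberships((0.0, 0.0), medoids)
halfway = fuzzy_memberships((0.5, 0.5), medoids)
```

A point lying exactly on a medoid gets a crisp label, while a point equidistant from both medoids is split evenly between the two clusters.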

Keywords: clustering algorithm, fuzzy methods, kernel k-medoids, uncertain data

Procedia PDF Downloads 215
25094 Democracy Bytes: Interrogating the Exploitation of Data Democracy by Radical Terrorist Organizations

Authors: Nirmala Gopal, Sheetal Bhoola, Audecious Mugwagwa

Abstract:

This paper discusses the continued infringement and exploitation of data by non-state actors for destructive purposes, emphasizing radical terrorist organizations. It discusses how terrorist organizations access and use data to further their nefarious agendas. It further examines how cybersecurity, designed as a tool to curb data exploitation, is ineffective in addressing global citizens' concerns about how their data can be kept safe and used for the purpose for which it was acquired. The study interrogates several policies and data protection instruments, such as the Data Protection Act, Cyber Security Policies, Protection of Personal Information (PPI) and General Data Protection Regulations (GDPR), to understand data use and storage in democratic states. The study outcomes indicate that international cybersecurity and cybercrime legislation, policies, and conventions have not curbed violations of data access and use by radical terrorist groups. The study recommends ways to enhance cybersecurity and reduce cyber risks using democratic principles.

Keywords: cybersecurity, data exploitation, terrorist organizations, data democracy

Procedia PDF Downloads 204
25093 Healthcare Data Mining Innovations

Authors: Eugenia Jilinguirian

Abstract:

In the healthcare industry, data mining is essential because it transforms the field by extracting useful information from large datasets. Data mining is the process of applying advanced analytical methods to large patient records and medical histories in order to identify patterns, correlations, and trends. Healthcare professionals can improve diagnostic accuracy, uncover hidden linkages, and predict disease outcomes by carefully examining these data. Additionally, data mining supports personalized medicine by tailoring treatment to the unique attributes of each patient. This proactive strategy helps allocate resources more efficiently, enhances patient care, and streamlines operations. However, to apply data mining effectively and ensure the appropriate use of private healthcare information, issues such as data privacy and security must be carefully considered. Data mining continues to be vital in the search for more effective, efficient, and individualized healthcare solutions as technology evolves.

Keywords: data mining, healthcare, big data, individualised healthcare, healthcare solutions, database

Procedia PDF Downloads 66
25092 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are among the places most heavily used, both in terms of the natural balance and by a growing population. In coastal engineering, the most valuable data describe wave behavior. The volume of these data becomes very large because observations run for hours, days, and months. In this study, statistical methods such as wave spectrum analysis and standard statistical methods were used. The goal of the study is to discover profiles of different coastal areas using these statistical methods, and thus to obtain an instance-based data set from the big data for analysis with data mining algorithms. In the experimental studies, six sample data sets of wave behavior, obtained from 20 minutes of observations in Mersin Bay, Turkey, were converted to an instance-based form, and different clustering techniques from data mining were used to discover similar coastal places. Moreover, the study argues that this summarization approach can be used in other fields that collect big data, such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 361
25091 Fracture Dislocation of Upper Sacrum in an Adolescent: Case Report and Review of Literature

Authors: S. Alireza Mirghasemi, Narges Rahimi Gabaran

Abstract:

Sacral fractures in children are rare because pelvic fractures are uncommon in childhood, but they present a high risk of neurological damage. This kind of fracture is often missed because routine pelvic X-ray imaging rarely shows it. Its treatment is also controversial, ranging from fine reduction to conservative treatment with no attempt to reduce the dislocation. In this article, a case of fracture dislocation of S1 and S2 is presented, along with a suggested diagnostic test and treatment based on similar cases. The patient was a 14-year-old boy who was admitted to the hospital one week after a car accident that knocked him to the ground in a crawling position, after which a rack fell on his body. Pain and tenderness in the sacral region and a fracture in the left leg were notable, and we detected incomplete bilateral palsy of the L5, S1 and S2 roots. Radiographs of the spine showed the sacral fracture with fracture dislocation of S1. The treatment included skeletal traction with a halo over the patient’s head and two femoral pins. After one week, another surgery was performed to stabilize and reduce the fracture, using a posterior approach with CD and a pedicular screw. After two years of follow-up, the fracture has healed completely without any loss of reduction.

Keywords: adolescent, fracture in adolescent, fracture dislocation, sacrum

Procedia PDF Downloads 292
25090 Access to Health Data in Medical Records in Indonesia in Terms of Personal Data Protection Principles: The Limitation and Its Implication

Authors: Anny Retnowati, Elisabeth Sundari

Abstract:

This research aims to elaborate the meaning of personal data protection principles on patient access to health data in medical records in Indonesia and its implications. The method is normative legal research, examining health law in Indonesia regarding the patient's right to access their health data in medical records. The data will be analysed qualitatively using the interpretation method to elaborate on the limitation of the meaning of personal data protection principles on patients' access to their data in medical records. The results show that patients only have the right to obtain copies of their health data in medical records. There is no right to inspect the records directly at any time. Indonesian health law limits the principle of patients' right to broad access to their health data in medical records. This restriction has implications for the reduction of personal data protection as part of human rights. This research contributes by showing that a limitation of personal data protection may violate human rights.

Keywords: access, health data, medical records, personal data, protection

Procedia PDF Downloads 93
25089 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises

Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto

Abstract:

The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler to an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and data can be effectively utilized and sovereignly governed. As this specific knowledge has been paid little attention to so far by academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights of various industry case studies and literature research.

Keywords: data management, digitization, industry 4.0, knowledge engineering, metamodel

Procedia PDF Downloads 356
25088 Multi-Objective Multi-Mode Resource-Constrained Project Scheduling Problem by Preemptive Fuzzy Goal Programming

Authors: Busaba Phurksaphanrat

Abstract:

This research proposes a pre-emptive fuzzy goal programming model for the multi-objective multi-mode resource-constrained project scheduling problem. The objectives of the problem are minimization of the total time and the total cost of the project. The objective in a multi-mode resource-constrained project scheduling problem is often minimization of the makespan. However, both time and cost should be considered at the same time, with different priority levels. Moreover, not all elements of the cost functions in a project are included in the conventional cost objective function. An incomplete total project cost causes an error in finding the project scheduling time. In this research, pre-emptive fuzzy goal programming is presented to solve the multi-objective multi-mode resource-constrained project scheduling problem. It can find a compromise solution to the problem. Moreover, it can be flexibly adjusted to find a variety of alternative solutions.

Keywords: multi-mode resource constrained project scheduling problem, fuzzy set, goal programming, pre-emptive fuzzy goal programming

Procedia PDF Downloads 435
25087 Analysis and Forecasting of Bitcoin Price Using Exogenous Data

Authors: J-C. Leneveu, A. Chereau, L. Mansart, T. Mesbah, M. Wyka

Abstract:

Extracting and interpreting information from Big Data will be a major challenge for years to come in several sectors such as finance. Currently, numerous methods (such as Technical Analysis) are used to try to understand and anticipate market behavior, with mixed results, because it still seems impossible to predict a financial trend exactly. The increase in data available on the Internet, and their diversity, represents a great opportunity for the financial world. Indeed, it is possible, along with standard financial data, to focus on exogenous data in order to take more macroeconomic factors into account. Coupling the interpretation of these data with standard methods could yield more precise trend predictions. In this paper, in order to observe the influence of exogenous data on price, independent of the other usual effects occurring in classical markets, the behaviors of Bitcoin users are introduced into a model reconstituting Bitcoin value, which is elaborated and tested for prediction purposes.

Keywords: big data, bitcoin, data mining, social network, financial trends, exogenous data, global economy, behavioral finance

Procedia PDF Downloads 355
25086 Hybrid Method for Smart Suggestions in Conversations for Online Marketplaces

Authors: Yasamin Rahimi, Ali Kamandi, Abbas Hoseini, Hesam Haddad

Abstract:

Online/offline chat is a convenient approach in electronic markets for second-hand products, where potential customers want more information about products in order to close the information gap between buyers and sellers. Online peer-to-peer markets are trying to create artificial-intelligence-based systems that help customers ask more informative questions more easily. In this article, we introduce a method for the question/answer system that we have developed for Divar, the top-ranked electronic market in Iran. When it comes to second-hand products, incomplete product information in a purchase results in a loss to the buyer. One way to balance buyer and seller information about a product is to help the buyer ask more informative questions when purchasing. Shortening the time needed to start a conversation and reach the desired result was also one of our main goals, and A/B test results show that it was achieved. In this paper, we propose and evaluate a method for suggesting questions and answers in the messaging platform of the e-commerce website Divar. Such systems aim to help users gather knowledge about a product more easily and quickly, all from the Divar database. We collected a dataset of around 2 million messages in colloquial Persian, and for each product category we gathered 500K messages, of which only 2K were tagged, so semi-supervised methods were used. To be deployed to production, the proposed model must be fast enough to process 10 million messages daily on CPUs. To reach that speed, faster and simpler models are preferred over deep neural models in many subtasks.
The proposed method, which requires only a small amount of labeled data, is currently used in Divar production on CPUs; 15% of buyers' and sellers' messages in conversations are chosen directly from our model's output, and more than 27% of buyers have used the model's suggestions in at least one daily conversation.

Keywords: smart reply, spell checker, information retrieval, intent detection, question answering

Procedia PDF Downloads 187
25085 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment: A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper, we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: mobile health, data integration, expert systems, disease-related malnutrition

Procedia PDF Downloads 477
25084 The Prospects of Leveraging (Big) Data for Accelerating a Just Sustainable Transition around Different Contexts

Authors: Sombol Mokhles

Abstract:

This paper tries to show the prospects of utilising (big) data for enabling the just transition of diverse cities. Our key purpose is to offer a framework of applications and implications of utilising (big) data in comparing sustainability transitions across different cities. Relying on the cosmopolitan comparison, this paper explains the potential application of (big) data but also its limitations. The paper calls for adopting a data-driven and just perspective in including different cities around the world. Having a just and inclusive approach at the front and centre ensures a just transition with synergistic effects that leave nobody behind.

Keywords: big data, just sustainable transition, cosmopolitan city comparison, cities

Procedia PDF Downloads 99
25083 Strategic Workplace Security: The Role of Malware and the Threat of Internal Vulnerability

Authors: Modesta E. Ezema, Christopher C. Ezema, Christian C. Ugwu, Udoka F. Eze, Florence M. Babalola

Abstract:

Some employees knowingly or unknowingly contribute to the loss of data and expose data to threats in the course of doing their jobs. Many organizations today face the challenge of securing their data as cyber criminals constantly devise new ways of attacking an organization’s secret data. This paper lists the latest strategies that must be put in place to protect these important data from attack in a collaborative workplace. It also introduces Advanced Persistent Threats (APTs) and how they work. An empirical study was conducted to collect data from employees in data centers on how data could be protected from malicious code and cyber criminals, and their responses are taken into account to help check the activities of malicious code and cyber criminals in the workplace.

Keywords: data, employee, malware, work place

Procedia PDF Downloads 383
25082 Acceptance of Big Data Technologies and Its Influence towards Employee’s Perception on Job Performance

Authors: Jia Yi Yap, Angela S. H. Lee

Abstract:

With the use of big data technologies, organizations can obtain the results they are interested in. Big data technologies load all the data that are useful to an organization and provide a better way of analysing data. The purpose of this research is to gather the opinions of employees from firms in Malaysia on the use of big data technologies in their organizations, in order to show how it may affect employees’ perception of job performance. Therefore, to identify whether accepting big data technologies in the organization affects employees’ perception, a questionnaire will be distributed to employees of different small and medium-sized enterprises (SMEs) in Malaysia. The proposed conceptual model will be tested with other variables in order to examine the relationships between variables.

Keywords: big data technologies, employee, job performance, questionnaire

Procedia PDF Downloads 298
25081 Data Poisoning Attacks on Federated Learning and Preventive Measures

Authors: Beulah Rani Inbanathan

Abstract:

In the present era, numerous incidents make it evident that data privacy is being compromised in various ways. Machine learning is one technology that uses a centralized server: data is given as input, analyzed by the algorithms on that server, and outputs are predicted. However, the user must send the data each time, because the algorithm analyzes the input data to predict the output, and this is prone to threats. The solution to this issue is federated learning, where only the models are updated while the data resides on the local machine and is not exchanged with the other local models. Nevertheless, even these local models are susceptible to data poisoning, as various published experiments make clear. This paper examines the many ways in which data poisoning occurs and the evidence that it remains prevalent. It covers poisoning attacks on IoT devices, edge devices, autoregressive models, and industrial IoT systems, and offers a few points on how these attacks could be avoided in order to protect data that is personal, sensitive, or harmful when exposed.
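To make the aggregation step concrete, here is a small sketch of how a server might combine client model updates without seeing any client data. The unweighted mean is a simplified form of federated averaging; the coordinate-wise median is a commonly discussed robust alternative that is swapped in here for illustration and is not claimed to be one of the defenses the paper surveys. The weight vectors are made-up values.

```python
from statistics import median

def federated_average(updates):
    """Coordinate-wise mean of client weight vectors (unweighted federated averaging)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]

def coordinate_median(updates):
    """Coordinate-wise median, a simple aggregation rule that resists a few outliers."""
    return [median(column) for column in zip(*updates)]

# Hypothetical weight vectors from three honest clients and one poisoned client.
honest = [[1.0, 2.0], [3.0, 4.0], [2.0, 3.0]]
poisoned = honest + [[1000.0, -1000.0]]

mean_clean = federated_average(honest)
mean_poisoned = federated_average(poisoned)
median_poisoned = coordinate_median(poisoned)
```

With a single poisoned update the mean is dragged far away from the honest clients, while the median stays within their range.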

Keywords: data poisoning, federated learning, Internet of Things, edge computing

Procedia PDF Downloads 87
25080 Effect of Composition Fuel on Safety of Combustion Process

Authors: Lourdes I. Meriño, Viatcheslav Kafarov, Maria Gómez

Abstract:

Fuel gas used in the burner receives contributions of other gases from different processes, and this results in variability in its composition, which may cause incomplete combustion. The burners are designed to operate on a certain curve, with the calorific power dependent on the pressure and the gas burners. When the deviation of propane and C5+ is large, there is a large release of energy, which causes the burners to work outside their curves, because less pressure is required to bring the curve into operation. That increases the risk of explosion in an oven, in addition to a higher environmental impact. There should be flame detection systems and instrumentation, such as local pressure gauges located at the entrance of the gas burners, to permit verification by the operator. Additionally, distributed control systems must be configured with the different combustion instruments and their respective alarms, as well as their operational windows and integrity control windows, based on the design information of this equipment. Therefore, it is desirable to analyze when a plant is taken out of service and to carry out a sound operational analysis to determine the impact of changes in the contributing fuel gas streams on the calorific power. Hence, poor combustion is one of the causes of instability in the burner flame and has a great impact on process safety, the integrity of people and equipment, and the environment.

Keywords: combustion process, fuel composition, safety, fuel gas

Procedia PDF Downloads 490
25079 Generalization of Clustering Coefficient on Lattice Networks Applied to Criminal Networks

Authors: Christian H. Sanabria-Montaña, Rodrigo Huerta-Quintanilla

Abstract:

A lattice network is a special type of network in which all nodes have the same number of links and the boundary conditions are periodic. The most basic lattice network is the ring, a one-dimensional network with periodic boundary conditions. The Cartesian product of d rings, in turn, forms a d-dimensional lattice network. An analytical expression currently exists for the clustering coefficient in this type of network, but the theoretical value is valid only up to a certain connectivity value; in other words, the analytical expression is incomplete. Here we obtain analytically the clustering coefficient expression in d-dimensional lattice networks for any link density. Our analytical results show that the clustering coefficient of a lattice network whose link density tends to 1 approaches the clustering coefficient of a fully connected network. We developed a model in criminology in which the generalized clustering coefficient expression is applied. The model states that delinquents learn the know-how of the crime business by sharing knowledge, directly or indirectly, with their friends in the gang. This generalization sheds light on the network properties, which is important for developing new models in different fields where network structure plays an important role in the system dynamics, such as criminology, evolutionary game theory, and econophysics, among others.
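The limitation described above can be seen numerically on the one-dimensional case. The sketch below builds a ring in which every node links to `r` neighbours on each side and counts triangles directly; it then compares the result with the well-known ring-lattice expression C = 3(k - 2) / (4(k - 1)) for degree k = 2r, which holds only at low connectivity. This is not the paper's generalized expression, which the abstract does not give; function names are illustrative.

```python
from itertools import combinations

def ring_lattice(n, r):
    """Adjacency sets of a ring of n nodes, each linked to r neighbours per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, r + 1):
            j = (i + step) % n
            if j != i:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def clustering_coefficient(adj):
    """Average local clustering coefficient, counted directly from the links."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

def low_density_formula(k):
    """Classical ring-lattice value, valid only for sparse rings."""
    return 3 * (k - 2) / (4 * (k - 1))

sparse = clustering_coefficient(ring_lattice(20, 2))   # degree 4 on 20 nodes
dense = clustering_coefficient(ring_lattice(7, 3))     # degree 6 on 7 nodes: fully connected
```

For the sparse ring the direct count and the formula agree, whereas for the fully connected ring the direct count is 1 and the low-density formula is no longer valid.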

Keywords: clustering coefficient, criminology, generalized, regular network d-dimensional

Procedia PDF Downloads 411
25078 Simulation and Hardware Implementation of Data Communication Between CAN Controllers for Automotive Applications

Authors: R. M. Kalayappan, N. Kathiravan

Abstract:

In the automobile industry, Controller Area Network (CAN) is widely used to reduce system complexity and inter-task communication. Therefore, this paper proposes a hardware implementation of data frame communication from one controller to another. The CAN data frames and protocols are explained in depth. The data frames are transferred without any collision or corruption. The simulation is made in the KEIL vision software to display the data transfer between transmitter and receiver in CAN. An ARM7 microcontroller is used to transfer data between the controllers in real time. Data transfer is verified using the CRO.
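As a rough illustration of what a CAN data frame carries, the sketch below assembles the leading fields of a standard (11-bit identifier) CAN 2.0A data frame and appends the 15-bit CRC that a receiver uses to detect corruption. It is a simplification under stated assumptions: bit stuffing, the CRC delimiter, the ACK slot and the end-of-frame field are omitted, and the function names and the example identifier are hypothetical rather than taken from the paper.

```python
CAN_CRC15_POLY = 0x4599  # x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1

def int_to_bits(value, width):
    """Most-significant-bit-first list of bits for an unsigned integer."""
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]

def crc15(bits):
    """Bitwise CAN CRC-15 over a sequence of 0/1 values."""
    crc = 0
    for bit in bits:
        feedback = bit ^ ((crc >> 14) & 1)
        crc = (crc << 1) & 0x7FFF
        if feedback:
            crc ^= CAN_CRC15_POLY
    return crc

def build_frame_bits(identifier, data):
    """Start-of-frame through CRC of a standard data frame (no bit stuffing)."""
    if not 0 <= identifier < 2 ** 11:
        raise ValueError("standard identifiers are 11 bits")
    if len(data) > 8:
        raise ValueError("a classic CAN data frame carries at most 8 bytes")
    bits = [0]                            # start of frame (dominant)
    bits += int_to_bits(identifier, 11)   # arbitration field: identifier
    bits += [0, 0, 0]                     # RTR = 0 (data frame), IDE = 0, reserved bit
    bits += int_to_bits(len(data), 4)     # data length code
    for byte in data:
        bits += int_to_bits(byte, 8)      # data field
    bits += int_to_bits(crc15(bits), 15)  # CRC sequence over everything above
    return bits

frame = build_frame_bits(0x123, b"\x01")
```

A receiver can run the same `crc15` over the received bits including the CRC sequence; a result of zero indicates that no corruption was detected.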

Keywords: control area network (CAN), automotive electronic control unit, CAN 2.0, industry

Procedia PDF Downloads 398
25077 Improving the Statistics Nature in Research Information System

Authors: Rajbir Cheema

Abstract:

An integrated research information system is introduced in order to provide scientific institutions with the necessary information on research activities and research results in assured quality. Since duplicates, missing values, incorrect formatting, inconsistencies, etc. can arise when research data are collected in different research information systems, with a wide range of negative effects on data quality, the subject of data quality deserves closer treatment. This paper examines the data quality problems in research information systems and presents new techniques that enable organizations to improve the quality of their research information.

Keywords: Research information systems (RIS), research information, heterogeneous sources, data quality, data cleansing, science system, standardization

Procedia PDF Downloads 157
25076 Data Mining Meets Educational Analysis: Opportunities and Challenges for Research

Authors: Carla Silva

Abstract:

Recent developments in information and communication technology enable us to acquire, collect, and analyse data in various fields of socioeconomic and technological systems. Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there is a critical need for automated approaches to the effective and efficient use of massive amounts of educational data, in order to support institutions in strategic planning and investment decision-making. In this article, we address data from several different perspectives and define the data applied to the sciences. Many believe that 'big data' will transform business, government, and other aspects of the economy. We discuss how new data may impact educational policy and educational research. Large-scale administrative data sets and proprietary private-sector data can greatly improve the way we measure, track, and describe educational activity and educational impact. We also consider whether the big data predictive modeling tools that have emerged in statistics and computer science may prove useful in education and, furthermore, in economics. Finally, we highlight a number of challenges and opportunities for future research.

Keywords: data mining, research analysis, investment decision-making, educational research

Procedia PDF Downloads 358
25075 Development and Evaluation of Gastro Retentive Floating Tablets of Ayurvedic Vati Formulation

Authors: Imran Khan Pathan, Anil Bhandari, Peeyush K. Sharma, Rakesh K. Patel, Suresh Purohit

Abstract:

Floating tablets of Marichyadi Vati were developed with the aim of prolonging gastric residence time and increasing the bioavailability of the drug. Rapid gastrointestinal transit could result in incomplete drug release from the drug delivery system above the absorption zone, leading to diminished efficacy of the administered dose. The tablets were prepared by the wet granulation technique, using HPMC E50 LV as the matrixing agent, Carbopol as the floating enhancer, microcrystalline cellulose as the binder, and sodium bicarbonate as the effervescent agent, with other excipients. A simplex lattice design was used to select variables for the tablet formulation. The formulation was optimized on the basis of floating time and in vitro drug release. The floating lag time of the optimized formulation was 61 seconds, with about 97.32% of total drug release within 3 hours. The in vitro release profile of the drug from the formulation was best described by zero-order kinetics, with the highest linearity (r² = 0.9943). It was concluded that a gastroretentive drug delivery system can be developed for Marichyadi Vati containing piperine to increase the residence time of the drug in the stomach, thereby increasing bioavailability.
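A zero-order release profile means the cumulative amount released grows linearly with time, Q(t) = Q0 + k0·t, so the fit and its r² come from an ordinary least-squares line. The sketch below shows that calculation in general terms; the time points and release values are made up and perfectly linear for illustration only, and are not the study's measurements.

```python
def fit_zero_order(times_h, released_pct):
    """Least-squares line released = q0 + k0 * t; returns (k0, q0, r_squared)."""
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_q = sum(released_pct) / n
    s_tt = sum((t - mean_t) ** 2 for t in times_h)
    s_tq = sum((t - mean_t) * (q - mean_q) for t, q in zip(times_h, released_pct))
    k0 = s_tq / s_tt                      # zero-order rate constant (% per hour)
    q0 = mean_q - k0 * mean_t             # intercept at t = 0
    ss_res = sum((q - (q0 + k0 * t)) ** 2 for t, q in zip(times_h, released_pct))
    ss_tot = sum((q - mean_q) ** 2 for q in released_pct)
    return k0, q0, 1.0 - ss_res / ss_tot  # r_squared measures linearity

# Hypothetical, perfectly linear data (NOT from the paper).
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
release = [30.0 * t for t in times]
k0, q0, r2 = fit_zero_order(times, release)
```

Real dissolution data would scatter around the line, giving an r² below 1; comparing r² across candidate kinetic models is how a best-fitting model is usually chosen.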

Keywords: piperine, Marichyadi Vati, gastroretentive drug delivery, floating tablet

Procedia PDF Downloads 457
25074 A Method of Detecting the Difference in Two States of Brain Using Statistical Analysis of EEG Raw Data

Authors: Digvijaysingh S. Bana, Kiran R. Trivedi

Abstract:

This paper introduces various methods that use the alpha wave to detect the difference between two states of the brain. One healthy subject participated in the experiment. EEG was measured on the forehead above the eye (FP1 position), with the reference and ground electrodes on an ear clip. The data samples are obtained in the form of EEG raw data. Each reading lasts one minute. Various tests are performed on the alpha-band EEG raw data. The readings are taken at different times throughout the day. The statistical analysis is carried out on the EEG sample data in the form of various tests.

Keywords: electroencephalogram(EEG), biometrics, authentication, EEG raw data

Procedia PDF Downloads 464
25073 Predictors of Lost to Follow-Up among HIV Patients Attending Anti-Retroviral Therapy Treatment Centers in Nigeria

Authors: Oluwasina Folajinmi, Kate Ssamulla, Penninah Lutung, Daniel Reijer

Abstract:

Background: Despite the well-verified benefits of anti-retroviral therapy (ART) in prolonging life expectancy, loss to follow-up (LTFU) presents a challenge to the success of ART programs in resource-limited countries like Nigeria. In several studies of ART programs in developing countries, researchers have reported a high rate of LTFU among patients receiving care and treatment at ART treatment centers. This study seeks to determine the causes of LTFU among HIV clients. Method: A descriptive cross-sectional study focused on a population of 9,280 persons living with HIV/AIDS who were enrolled in nine treatment centers in Nigeria (both pre-ART and ART patients were included). Out of the total population, 1,752 (18.9%) were found to be LTFU. Of this group, we randomly selected 1,200 clients (68.5%); their patient information was generated through a database. Data on demographics, CD4 counts, and causes of LTFU were analyzed and summarized. Results: Out of the 1,200 LTFU clients selected, 462 (38.5%) were on ART; 341 clients (73.8%) had a CD4 level < 500 cells/µL, and 738 (61.5%) on pre-ART had a CD4 level > 500 cells/µL. In our records we found telephone numbers for 675 (56.1%) of these clients. 675 (56.1%) were owners of a phone. The majority of the clients, 731 (60.9%), lived no more than 25 km from the ART center. A majority were female (926, or 77.2%), while 274 (22.8%) were male. 675 (56.1%) clients were reported as traced via telephone and home address. The phone numbers of 326 (27.2%) clients were not reachable, and 173 (14.4%) telephone numbers were incomplete. 71 (5.9%) had relocated due to communal crises, and expert client trackers reported that some patients could not afford transportation to ART centers. Conclusion: This study shows that low health education levels, poverty, relocation, and lack of reliable phone contact were major predictors of LTFU.
Periodic updates of home addresses and telephone contacts, including at least two next of kin, together with phone text messages and home visits, may improve follow-up. Early and consistent tracking of missed appointments is crucial. More decentralized ART centres are needed to avoid long travel distances.

Keywords: anti-retroviral therapy, HIV/AIDS, predictors, lost to follow up

Procedia PDF Downloads 304
25072 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the frameworks and models in this field of study. An organization's decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets, which will consequently benefit society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 83
25071 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight existing developments in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the frameworks and models in this field of study. An organisation's decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets, which will consequently benefit society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 95
25070 Modeling of Carbon Monoxide Distribution under the Sky-Train Stations

Authors: Suranath Chomcheon, Nathnarong Khajohnsaksumeth, Benchawan Wiwatanapataphee

Abstract:

Carbon monoxide is a harmful gas that is colorless, odorless, and tasteless. Too much carbon monoxide taken into the human body reduces oxygen transportation within human body cells, leading to many symptoms including headache, nausea, vomiting, loss of consciousness, and death. Carbon monoxide is considered one of the air pollution indicators. It is mainly released as soot from the exhaust pipe owing to incomplete combustion in the vehicle engine. Nowadays, the increase in vehicle usage and the slow movement of vehicles stuck in traffic jams have created a large amount of carbon monoxide, which accumulates in the street canyon area. In this research, we study the effect on ventilation of parameters such as wind speed and the aspect ratio of the building height. We consider the model of the pollutant under the Bangkok Transit System (BTS) stations in a two-dimensional geometrical domain. The convection-diffusion equation and the Reynolds-averaged Navier-Stokes equations are used to describe the concentration and the turbulent flow of carbon monoxide. The finite element method is applied to obtain the numerical result. The result shows that our model can describe the dispersion patterns of carbon monoxide for different wind speeds.
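
For reference, the governing equations named above can be written in their standard textbook form. This is a sketch only: the abstract does not give the authors' exact formulation, so every symbol here is an assumption (c is the mean carbon monoxide concentration, \bar{u}_i the mean velocity components, \bar{p} the mean pressure, D the molecular diffusivity, \nu_t an eddy viscosity, Sc_t a turbulent Schmidt number, and S a source term), and the gradient-diffusion closure for the turbulent flux is one common choice rather than the paper's stated model.

```latex
% Reynolds-averaged continuity and momentum equations (incompressible flow)
\frac{\partial \bar{u}_i}{\partial x_i} = 0,
\qquad
\rho\left(\frac{\partial \bar{u}_i}{\partial t}
  + \bar{u}_j \frac{\partial \bar{u}_i}{\partial x_j}\right)
= -\frac{\partial \bar{p}}{\partial x_i}
  + \frac{\partial}{\partial x_j}\left[
      \mu\left(\frac{\partial \bar{u}_i}{\partial x_j}
             + \frac{\partial \bar{u}_j}{\partial x_i}\right)
      - \rho\,\overline{u_i' u_j'}\right]

% Convection-diffusion equation for the mean concentration,
% with the turbulent flux modelled by an eddy diffusivity \nu_t / Sc_t
\frac{\partial c}{\partial t}
  + \bar{u}_j \frac{\partial c}{\partial x_j}
= \frac{\partial}{\partial x_j}\left[
    \left(D + \frac{\nu_t}{Sc_t}\right)\frac{\partial c}{\partial x_j}\right]
  + S
```

In a two-dimensional domain such as the one described, the indices i and j run over 1 and 2 only.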

Keywords: air pollution, carbon monoxide, finite element, street canyon

Procedia PDF Downloads 126
25069 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Ji Eun Shin, Dong Hoon Lim

Abstract:

Data clustering is a common technique in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry, and marketing. K-means clustering is a well-known clustering algorithm that aims to partition a set of data points into a predefined number of clusters. In this paper, we implement the K-means algorithm on the MapReduce framework with RHadoop to make the clustering method applicable to large-scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner that operates on our map output to decrease the amount of data that the reducers need to process. The experimental results demonstrated that the K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with a combiner became faster than the regular algorithm without a combiner as the size of the data set increased.
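
The combiner idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' RHadoop code: the function names (`map_phase`, `combine`, `reduce_phase`, `kmeans_step`) are hypothetical, the input splits are ordinary lists standing in for Hadoop input splits, and one call performs a single K-means iteration. The map step emits a (cluster id, (point, 1)) pair per point, the combiner collapses each split's output into one partial (sum, count) per cluster, and the reducer merges the partials and divides to get the new centroids.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(split, centroids):
    """Map: emit (cluster_id, (point, 1)) for every point in one input split."""
    for point in split:
        yield nearest(point, centroids), (tuple(point), 1)


def combine(pairs):
    """Combiner: sum vectors and counts per cluster id.

    Summation is associative, so the same function can merge raw map
    output or partial sums produced by earlier combiner calls.
    """
    partial = {}
    for cid, (vec, n) in pairs:
        if cid in partial:
            acc, m = partial[cid]
            partial[cid] = (tuple(a + b for a, b in zip(acc, vec)), m + n)
        else:
            partial[cid] = (tuple(vec), n)
    return list(partial.items())


def reduce_phase(pairs, centroids):
    """Reduce: merge all partial sums and divide by the counts."""
    new_centroids = list(centroids)  # clusters with no points keep their centroid
    for cid, (acc, n) in combine(pairs):
        new_centroids[cid] = tuple(x / n for x in acc)
    return new_centroids


def kmeans_step(splits, centroids, use_combiner=True):
    """Run one iteration; also return how many records reach the reducer."""
    shuffled = []
    for split in splits:
        out = list(map_phase(split, centroids))
        if use_combiner:
            out = combine(out)  # at most one record per cluster per split
        shuffled.extend(out)
    return reduce_phase(shuffled, centroids), len(shuffled)


if __name__ == "__main__":
    splits = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0), (0.0, 1.0)]]
    start = [(0.0, 0.0), (10.0, 10.0)]
    print(kmeans_step(splits, start, use_combiner=True))
    print(kmeans_step(splits, start, use_combiner=False))
```

Both settings yield the same centroids; the difference is the number of records passed from the map side to the reducer, which with a combiner is bounded by the number of splits times the number of clusters rather than by the number of points.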

Keywords: big data, combiner, K-means clustering, RHadoop

Procedia PDF Downloads 438
25068 Cold Spray Deposition of SS316L Powders on Al5052 Substrates and Their Potential Using for Biomedical Applications

Authors: B. Dikici, I. Ozdemir, M. Topuz

Abstract:

The corrosion behaviour of 316L stainless steel coatings obtained by the cold spray method was investigated in this study. 316L powders were deposited onto Al5052 aluminum substrates. The coatings were produced using nitrogen (N2) as the process gas. In order to further improve the corrosion and mechanical properties of the coatings, heat treatment was applied at 250 and 750 °C. The corrosion performance of the coatings was compared using the potentiodynamic scanning (PDS) technique under in-vitro conditions (in Ringer’s solution at 37 °C). In addition, hardness and porosity tests were carried out on the coatings. Microstructural characterization of the coatings was carried out using scanning electron microscopy with an attached energy dispersive spectrometer (SEM-EDS) and the X-ray diffraction (XRD) technique. It was found that clean surfaces and good adhesion were achieved for particle/substrate bonding. The heat treatment process both eliminated the anisotropy in the coating and healed the incomplete interfaces between the deposited particles. It was found that the corrosion potential of the coatings annealed at 750 °C was higher than that of commercial 316L stainless steel. Moreover, the microstructural investigations after the corrosion tests revealed that corrosion preferentially starts at inter-splat boundaries.

Keywords: biomaterials, cold spray, 316L, corrosion, heat treatment

Procedia PDF Downloads 370
25067 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision-making on the rise, this study focuses on providing an alternative outlook on the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach to data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector is presented in which the framework is put into action to yield insights and leverage business intelligence. An interpretive approach has been used to analyze findings from both quantitative and qualitative data and to glean insights. Using traditional point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of the traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 162
25066 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to a growing collection of document data. The amount of document data has been increasing since the spread of the Internet. ITA was presented as one method of analyzing document data. ITA is a method for extracting independent topics from document data by using Independent Component Analysis (ICA), a technique from signal processing. However, it is difficult to apply ITA to a growing collection of document data, because ITA must use all of the document data, so its temporal and spatial costs are very high. Therefore, we present Incremental ITA, which extracts independent topics from a growing collection of document data. Incremental ITA is a method of updating the independent topics when document data is added after the independent topics have been extracted from the previous data. We also show the results of applying Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 309