Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 19392

Search results for: data labelling

19392 Radio Labeling and Characterization of Cysteine and Its Derivatives with Tc99m and Their Bio-Distribution

Authors: Rabia Ashfaq, Saeed Iqbal, Atiq ur Rehman, Irfanullah Khan

Abstract:

An extensive series of radiopharmaceuticals have been explored in order to discover a better brain tumour diagnostic agent. Tc99m labelling with cysteine and its derivatives in liposomes shows effective tagging of about 70% to 80 %. Due to microscopic size it successfully crossed the brain barrier in 2 minutes which gradually decreases in 5 to 15 minutes. HMPAO labelled with Tc99m is another important radiopharmaceutical used to study brain perfusion but it comes with a flaw that it’s only functional during epilepsy. 1, 1 ECD is purely used in Tc99m ECD formulation; because it not only tends to cross the blood brain barrier but it can be metabolized which can be easily entrapped in human brain. Radio labelling of Cysteine with Tc99m at room temperature was performed which yielded no good results. Hence cysteine derivatives with salicylaldehyde were prepared that produced about 75 % yield for ligand. In order to perform it’s radio labelling a suitable solvent DMSO was selected and physical parameters were performed. Elemental analyser produced remarkably similar results for ligand as reported in literature. IR spectra of Ligand in DMSO concluded in the absence of SH stretch and presence of N-H vibration. Thermal analysis of the ligand further suggested its decomposition pattern with no distinct curve for a melting point. Radio labelling of ligand was performed which produced excellent results giving up to 88% labelling at pH 5.0. Clinical trials using Rabbit were performed after validating the products reproducibility. The radiopharmaceutical prepared was injected into the rabbit. Dynamic as well as static study was performed under the SPECT. It showed considerable uptake in the kidneys and liver considering it suitable for the Hypatobilliary study.

Keywords: marcapto compounds, 99mTc - radiolabeling, salicylaldicysteine, thiozolidine

Procedia PDF Downloads 251
19391 Towards Law Data Labelling Using Topic Modelling

Authors: Daniel Pinheiro Da Silva Junior, Aline Paes, Daniel De Oliveira, Christiano Lacerda Ghuerren, Marcio Duran

Abstract:

The Courts of Accounts are institutions responsible for overseeing and point out irregularities of Public Administration expenses. They have a high demand for processes to be analyzed, whose decisions must be grounded on severity laws. Despite the existing large amount of processes, there are several cases reporting similar subjects. Thus, previous decisions on already analyzed processes can be a precedent for current processes that refer to similar topics. Identifying similar topics is an open, yet essential task for identifying similarities between several processes. Since the actual amount of topics is considerably large, it is tedious and error-prone to identify topics using a pure manual approach. This paper presents a tool based on Machine Learning and Natural Language Processing to assists in building a labeled dataset. The tool relies on Topic Modelling with Latent Dirichlet Allocation to find the topics underlying a document followed by Jensen Shannon distance metric to generate a probability of similarity between documents pairs. Furthermore, in a case study with a corpus of decisions of the Rio de Janeiro State Court of Accounts, it was noted that data pre-processing plays an essential role in modeling relevant topics. Also, the combination of topic modeling and a calculated distance metric over document represented among generated topics has been proved useful in helping to construct a labeled base of similar and non-similar document pairs.

Keywords: courts of accounts, data labelling, document similarity, topic modeling

Procedia PDF Downloads 64
19390 Evaluation of Stable Isotope in Life History and Mating Behaviour of Mediterranean Fruit Fly Ceratitis capitata (Diptera: Tephidae) in Laboratory Conditions

Authors: Hasan AL-Khshemawee, Manjree Agarwal, Xin Du, Yonglin Ren

Abstract:

The possibility use of stable isotopes to study Medfly mating and life history were investigated in these experiments. 13C6 glucose was incorporated in the diet of the Mediterranean fruit fly Ceratitis capitata (Diptera: Tephidae). Treatments included labelling and unlabelled of either the media or adult sugar water. The measured started from egg hatching till the adults have died. After mating, the adults were analysed for 13C6 glucose ratio using Liquid chromatography-mass spectrometry LC-MS in two periods of time immediately and after three days of mating. Results showed that stable isotopes were used successfully for labelling Medfly in laboratory conditions, and there were significant differences between labelled and unlabelled treatment in eggs hatching, larval development, pupae emergence, survival of adults and mating behaviour. Labelling during larval development and combined labelling of larvae and adults resulted in detectable values. The label glucose in larvae stage did not effect on mating behaviour, however, the label glucose in adults’ stage was affected by mating behaviour. We recommended that it is possible to label adults of Mediterranean fruit fly C. capitata and detected the label after mating. This method offers good tools to study mating behaviour in Medfly and other types of insects and could be providing useful tools in genetic studies, sterile insect technique (SIT) or agricultural pest management. Also, we recommended using this technique in the field.

Keywords: stable isotope, sterile insect technique (SIT), medfly, mating behaviour

Procedia PDF Downloads 182
19389 Quantification of Dispersion Effects in Arterial Spin Labelling Perfusion MRI

Authors: Rutej R. Mehta, Michael A. Chappell

Abstract:

Introduction: Arterial spin labelling (ASL) is an increasingly popular perfusion MRI technique, in which arterial blood water is magnetically labelled in the neck before flowing into the brain, providing a non-invasive measure of cerebral blood flow (CBF). The accuracy of ASL CBF measurements, however, is hampered by dispersion effects; the distortion of the ASL labelled bolus during its transit through the vasculature. In spite of this, the current recommended implementation of ASL – the white paper (Alsop et al., MRM, 73.1 (2015): 102-116) – does not account for dispersion, which leads to the introduction of errors in CBF. Given that the transport time from the labelling region to the tissue – the arterial transit time (ATT) – depends on the region of the brain and the condition of the patient, it is likely that these errors will also vary with the ATT. In this study, various dispersion models are assessed in comparison with the white paper (WP) formula for CBF quantification, enabling the errors introduced by the WP to be quantified. Additionally, this study examines the relationship between the errors associated with the WP and the ATT – and how this is influenced by dispersion. Methods: Data were simulated using the standard model for pseudo-continuous ASL, along with various dispersion models, and then quantified using the formula in the WP. The ATT was varied from 0.5s-1.3s, and the errors associated with noise artefacts were computed in order to define the concept of significant error. The instantaneous slope of the error was also computed as an indicator of the sensitivity of the error with fluctuations in ATT. Finally, a regression analysis was performed to obtain the mean error against ATT. Results: An error of 20.9% was found to be comparable to that introduced by typical measurement noise. The WP formula was shown to introduce errors exceeding 20.9% for ATTs beyond 1.25s even when dispersion effects were ignored. Using a Gaussian dispersion model, a mean error of 16% was introduced by using the WP, and a dispersion threshold of σ=0.6 was determined, beyond which the error was found to increase considerably with ATT. The mean error ranged from 44.5% to 73.5% when other physiologically plausible dispersion models were implemented, and the instantaneous slope varied from 35 to 75 as dispersion levels were varied. Conclusion: It has been shown that the WP quantification formula holds only within an ATT window of 0.5 to 1.25s, and that this window gets narrower as dispersion occurs. Provided that the dispersion levels fall below the threshold evaluated in this study, however, the WP can measure CBF with reasonable accuracy if dispersion is correctly modelled by the Gaussian model. However, substantial errors were observed with other common models for dispersion with dispersion levels similar to those that have been observed in literature.

Keywords: arterial spin labelling, dispersion, MRI, perfusion

Procedia PDF Downloads 310
19388 Using Satellite Images Datasets for Road Intersection Detection in Route Planning

Authors: Fatma El-Zahraa El-Taher, Ayman Taha, Jane Courtney, Susan Mckeever

Abstract:

Understanding road networks plays an important role in navigation applications such as self-driving vehicles and route planning for individual journeys. Intersections of roads are essential components of road networks. Understanding the features of an intersection, from a simple T-junction to larger multi-road junctions, is critical to decisions such as crossing roads or selecting the safest routes. The identification and profiling of intersections from satellite images is a challenging task. While deep learning approaches offer the state-of-the-art in image classification and detection, the availability of training datasets is a bottleneck in this approach. In this paper, a labelled satellite image dataset for the intersection recognition problem is presented. It consists of 14,692 satellite images of Washington DC, USA. To support other users of the dataset, an automated download and labelling script is provided for dataset replication. The challenges of construction and fine-grained feature labelling of a satellite image dataset is examined, including the issue of how to address features that are spread across multiple images. Finally, the accuracy of the detection of intersections in satellite images is evaluated.

Keywords: satellite images, remote sensing images, data acquisition, autonomous vehicles

Procedia PDF Downloads 32
19387 Multiscale Connected Component Labelling and Applications to Scientific Microscopy Image Processing

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

In this paper, a new method is proposed to extending the method of connected component labeling from processing binary images to multi-scale modeling of images. By using the adaptive threshold of multi-scale attributes, this approach minimizes the possibility of missing those important components with weak intensities. In addition, the computational cost of this approach remains similar to that of the typical approach of component labeling. Then, this methodology is applied to grain boundary detection and Drosophila Brain-bow neuron segmentation. These demonstrate the feasibility of the proposed approach in the analysis of challenging microscopy images for scientific discovery.

Keywords: microscopic image processing, scientific data mining, multi-scale modeling, data mining

Procedia PDF Downloads 362
19386 Ex-Offenders’ Labelling, Stigmatisation and Unsuccessful Re-Integration as Factors Leading into Recidivism: A South African Context

Authors: Tshimangadzo Oscar Magadze

Abstract:

For successful re-integration, the individual offender must adapt and transform, which requires that the offender should adopt and internalise socially approved norms, attitudes, values, and beliefs. However, the offender’s labelling and community stigmatisation decide the destination of the offender. Community involvement in ex-offenders’ re-integration is an important issue in efforts to reduce recidivism and to control overcrowding in our correctional facilities. Crime is a social problem that requires society to come together to fight against it. This study was conducted in the Limpopo Province in Vhembe District Municipality within four local municipalities, namely Musina, Makhado, Mutale, and Thulamela. A total number of 30 participants were interviewed, and all were members of the Community Corrections Forums. This was necessitated by the fact that Musina is a very small area, which compelled the Department of Correctional Services to combine the two (Musina and Makhado) into one social re-integration entity. This is a qualitative research study where participants were selected through the use of purposive sampling. Participants were selected based on the value they would add to this study in order to achieve the objectives. The data collection method of this study was the focus group, which comprised of three groups of 10 participants each. Thulamela and Mutale local municipalities formed a group with (10) participants each, whereas Musina (2) and Makhado (8) formed another. Results indicate that the current situation is not conducive for re-integration to be successful. Participants raised many factors that need serious redress, namely offenders’ discrimination, lack of forgiveness by members of the community, which is fuelled by lack of community awareness due to the failure of the Department of Correctional Services in educating communities on ex-offenders’ re-integration.

Keywords: ex-offender, labeling, re-integration, stigmatization

Procedia PDF Downloads 51
19385 Study of the Composition of Lipids in Different Kinds of Packaged Food Products

Authors: Zineb Taidirt, Fathia Sebahi, Mohamed Karim Guarchani, Anissa Berkane, Noureddine Smail, Ouahiba Hadjoudj

Abstract:

Cardiovascular diseases are one of the most important causes of death in Algeria. Several risk factors are responsible for this, including the consumption of foods containing saturated fat and trans fatty acids TFAs. This brief presents the results of a descriptive study of the lipid composition of 251 food products marketed in Algeria. The objective of the study is to describe the nature and composition of lipids and to verify the compliance of saturated and trans fatty acids intakes with the regulations. The study is based on data from the nutrition labelling of marketed food products. The results showed that the lipids in foodstuffs are diverse in nature and of varying amounts, but their nature is not specified on all products. In addition, the required content of saturated fatty acids is mentioned only in 29.48% of the products; 21.62% of them do not comply with the standard. Hydrogenation of fats, which produced Trans fatty acids, is common: 19.92% of products contain hydrogenated fats, and 74.89% may contain them according to the aspect of the lipid (solid fat). However, the trans fatty acid content is only mentioned in 5.18% of the products. The latter is above the limits set by Algerian regulations in 50% of the butter samples studied. The composition of lipids in mono- and polyunsaturated fatty acids essential for the body is insufficient: only 13.94% of the products inform their contents on their labels. It is necessary to adopt mandatory restriction of trans fatty acids, to ban the use of partially-hydrogenated oils, and to require required mandatory labeling of the TFAs and the other fatty acids on packaged foods, and to conduct more studies in order to appreciate the intake of TFAs and saturated fat and appreciate their effects on the Algerian population and to get more informed about the composition of the lipid in packaged foods.

Keywords: cardiovascular diseases, lipids, nutrition labelling, lipids, trans fatty acids

Procedia PDF Downloads 57
19384 Consumer Knowledge of Food Quality Assurance and Use of Food Labels in Trinidad, West Indies

Authors: Daryl Clement Knutt, Neela Badrie, Marsha Singh

Abstract:

Quality assurance and product labelling are vital in the food and drink industry, as a tactical tool in a competitive environment. The food label is a principal marketing tool which also serves as a regulatory mechanism in the safeguarding of consumer well –being. The objective of this study was to evaluate the level of consumers’ use and understanding of food labeling information and knowledge pertaining to food quality assurance systems. The study population consisted of Trinidadian adults, who were over the age of 18 (n=384). Data collection was conducted via a self-administered questionnaire, which contained 31 questions, comprising of four sections: I. socio demographic information; II. food quality and quality assurance; III. use of Labeling information; and IV. laws and regulations. Sampling was conducted at six supermarkets, in five major regions of the country over a period of three weeks in 2014. The demographic profile of the shoppers revealed that majority was female (63.6%). The gender factor and those who were concerned about the nutrient content of their food, were predictive indicators of those who read food labels. Most (93.1%) read food labels before purchase, 15.4% ‘always’; 32.5% ‘most times’ and 45.2% ‘sometimes’. Some (42%) were often satisfied with the information presented on food labels, whilst 35.7% of consumers were unsatisfied. When the respondents were questioned on their familiarity with terms ‘food quality’ and ‘food quality assurance’, 21.3% of consumers replied positively - ‘I have heard the terms and know a lot’ whilst 37% were only ‘somewhat familiar’. Consumers were mainly knowledgeable of the International Standard of Organization (ISO) (51.5%) and Good Agricultural Practices GAP (38%) as quality tools. Participants ranked ‘nutritional information’ as the number one labeling element that should be better presented, followed by ‘allergy notes’ and ‘best before date’. Females were more inclined to read labels being the household shoppers. The shoppers would like better presentation of the food labelling information so as to guide their decision to purchase a product.

Keywords: food labels, food quality, nutrition, marketing, Trinidad, Tobago

Procedia PDF Downloads 386
19383 Features for Measuring Credibility on Facebook Information

Authors: Kanda Runapongsa Saikaew, Chaluemwut Noyunsan

Abstract:

Nowadays social media information, such as news, links, images, or VDOs, is shared extensively. However, the effectiveness of disseminating information through social media lacks in quality: less fact checking, more biases, and several rumors. Many researchers have investigated about credibility on Twitter, but there is no the research report about credibility information on Facebook. This paper proposes features for measuring credibility on Facebook information. We developed the system for credibility on Facebook. First, we have developed FB credibility evaluator for measuring credibility of each post by manual human’s labelling. We then collected the training data for creating a model using Support Vector Machine (SVM). Secondly, we developed a chrome extension of FB credibility for Facebook users to evaluate the credibility of each post. Based on the usage analysis of our FB credibility chrome extension, about 81% of users’ responses agree with suggested credibility automatically computed by the proposed system.

Keywords: facebook, social media, credibility measurement, internet

Procedia PDF Downloads 278
19382 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 171
19381 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: big data, learning analytics, analytics, big data in education, Hadoop

Procedia PDF Downloads 282
19380 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

As per the user demand and growth trends of large free data the storage solutions are now becoming more challenge-able to protect, store and to retrieve data. The days are not so far when the storage companies and organizations are start saying 'no' to store our valuable data or they will start charging a huge amount for its storage and protection. On the other hand as per the environmental conditions it becomes challenge-able to maintain and establish new data warehouses and data centers to protect global warming threats. A challenge of small data is over now, the challenges are big that how to manage the exponential growth of data. In this paper we have analyzed the growth trend of big data and its future implications. We have also focused on the impact of the unstructured data on various concerns and we have also suggested some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 423
19379 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSQL), and gives 6 data cleaning methods based on these algorithms.

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 347
19378 A Study on the Different Components of a Typical Back-Scattered Chipless RFID Tag Reflection

Authors: Fatemeh Babaeian, Nemai Chandra Karmakar

Abstract:

Chipless RFID system is a wireless system for tracking and identification which use passive tags for encoding data. The advantage of using chipless RFID tag is having a planar tag which is printable on different low-cost materials like paper and plastic. The printed tag can be attached to different items in the labelling level. Since the price of chipless RFID tag can be as low as a fraction of a cent, this technology has the potential to compete with the conventional optical barcode labels. However, due to the passive structure of the tag, data processing of the reflection signal is a crucial challenge. The captured reflected signal from a tag attached to an item consists of different components which are the reflection from the reader antenna, the reflection from the item, the tag structural mode RCS component and the antenna mode RCS of the tag. All these components are summed up in both time and frequency domains. The effect of reflection from the item and the structural mode RCS component can distort/saturate the frequency domain signal and cause difficulties in extracting the desired component which is the antenna mode RCS. Therefore, it is required to study the reflection of the tag in both time and frequency domains to have a better understanding of the nature of the captured chipless RFID signal. The other benefits of this study can be to find an optimised encoding technique in tag design level and to find the best processing algorithm the chipless RFID signal in decoding level. In this paper, the reflection from a typical backscattered chipless RFID tag with six resonances is analysed, and different components of the signal are separated in both time and frequency domains. Moreover, the time domain signal corresponding to each resonator of the tag is studied. The data for this processing was captured from simulation in CST Microwave Studio 2017. The outcome of this study is understanding different components of a measured signal in a chipless RFID system and a discovering a research gap which is a need to find an optimum detection algorithm for tag ID extraction.

Keywords: antenna mode RCS, chipless RFID tag, resonance, structural mode RCS

Procedia PDF Downloads 98
19377 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 296
19376 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 292
19375 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays considering human involved in increasing data development some methods such as data mining to extract science are unavoidable. One of the discussions of data mining is inherent distribution of the data usually the bases creating or receiving such data belong to corporate or non-corporate persons and do not give their information freely to others. Yet there is no guarantee to enable someone to mine special data without entering in the owner’s privacy. Sending data and then gathering them by each vertical or horizontal software depends on the type of their preserving type and also executed to improve data privacy. In this study it was attempted to compare comprehensively preserving data methods; also general methods such as random data, coding and strong and weak points of each one are examined.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 391
19374 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018 which will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows for data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating transferring personal data between IT environments (e.g.: applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 356
19373 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 308
19372 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 424
19371 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 291
19370 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 59
19369 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, veracity and comes from a variety of sources is usually generated in all sectors including the government sector. Globally public administrations are pursuing (big) data as new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can be led to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of government (big) data ecosystem. The essential aspects of government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on government (big) data ecosystem and therefore focused our research on various relevant areas like humanitarian data, open government data, scientific research data, industry data, etc.

Keywords: applications of big data, big data, big data types. big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 57
19368 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, about 1% of generated data are ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which is a subset of data analytics. The developed framework is designed to assist data analysts with little experience, in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 68
19367 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data and we, need a specific device to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may loss the data due to the corruption of devices. To overcome all these issues, we implemented a cloud space for storing the data, and it provides more security to the data. We can access the data with just using the internet from anywhere in the world. We implemented all these with the java using Net beans IDE. Once user uploads the data, he does not have any rights to change the data. Users uploaded files are stored in the cloud with the file name as system time and the directory will be created with some random words. Cloud accepts the data only if the size of the file is less than 2MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 113
19366 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business Intelligence is a methodology that exploits the data to produce information and knowledge systematically, business intelligence can support the decision-making process. Some methods in business intelligence are data warehouse and data mining. A data warehouse can store historical data from transactional data. For data modelling in data warehouse, we apply dimensional modelling by Kimball. While data mining is used to extracting patterns from the data and get insight from the data. Data mining has many techniques, one of which is segmentation. For profiling of telecommunication customer, we use customer segmentation according to customer’s usage of services, customer invoice and customer payment. Customers can be grouped according to their characteristics and can be identified the profitable customers. We apply K-Means Clustering Algorithm for segmentation. The input variable for that algorithm we use RFM (Recency, Frequency and Monetary) model. All process in data mining, we use tools IBM SPSS modeller.

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 335
19365 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 429
19364 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, traffic monitoring and control and hence, play the vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensors data are significantly important to design an efficient big data framework. Current researches do not focus on aggregating and filtering data at multiple layers of sensor-based big data framework. Thus, this paper introduces (i) three layers data aggregation and framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at sensors. Simulation results show that the PDDA outperforms existing tree and cluster-based data aggregation scheme in terms of overall network energy consumptions and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 231
19363 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have within a short period of time. Consequently this fast increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and its current trends. This paper presents a complete discussion on state-of-the-art big data technologies based on group and stream data processing. Moreover, strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendor, several open research challenges and the chances brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, it community, industry, big data

Procedia PDF Downloads 48