Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24104

Search results for: non real-time data

24104 Recommender System Based on Mining Graph Databases for Data-Intensive Applications

Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi

Abstract:

In recent years, the rapid growth of social-application communities and data-intensive applications has produced a large number of digital documents on the web. The evolution of online multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper explains the key differences between graph and relational databases, their strengths and weaknesses, and why graph databases are the best technology for building a real-time recommendation system. The paper also discusses several similarity metric algorithms that can be used to compute a similarity score for pairs of nodes based on their neighbourhoods or their properties. Finally, the paper explores how NLP strategies promise to improve the accuracy and coverage of real-time recommendations by extracting information from stored unstructured knowledge, which makes up the bulk of the world’s data, and using it to enrich the graph database. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.
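As an illustration of a neighbourhood-based similarity score of the kind the abstract mentions, the following minimal sketch computes Jaccard similarity between node neighbour sets. The choice of Jaccard, the `likes` graph, and the function names are assumptions made for illustration; the abstract does not say which metrics the paper uses.

```python
def jaccard(neighbours_a, neighbours_b):
    """Jaccard similarity of two neighbour sets: |A & B| / |A | B|."""
    union = neighbours_a | neighbours_b
    if not union:
        return 0.0  # two isolated nodes share nothing
    return len(neighbours_a & neighbours_b) / len(union)


# Hypothetical bipartite graph: user node -> set of item nodes it is linked to.
likes = {
    "alice": {"item1", "item2", "item3"},
    "bob": {"item2", "item3", "item4"},
    "carol": {"item5"},
}


def most_similar(user, graph):
    """Return the other node whose neighbourhood overlaps most with `user`."""
    scores = {
        other: jaccard(graph[user], items)
        for other, items in graph.items()
        if other != user
    }
    return max(scores, key=scores.get)
```

In a graph database the neighbour sets would come from a traversal query rather than an in-memory dictionary; the scoring step stays the same.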

Keywords: graph databases, NLP, recommendation systems, similarity metrics

Procedia PDF Downloads 70

24103 150 kVA Multifunction Laboratory Test Unit Based on Power-Frequency Converter

Authors: Bartosz Kedra, Robert Malkowski

Abstract:

This paper describes a laboratory test unit built on a 150 kVA power frequency converter and the Simulink RealTime platform. The assumptions that determine which load and generator types can be simulated with the device are presented, together with the structure of the control algorithm. Because the laboratory setup contains a transformer with a thyristor-controlled tap changer, a wider scope of setup capabilities is presented. The communication interface, the data maintenance and storage solution, and the Simulink RealTime features used are described. A list and description of all measurements are provided, and the potential for modifying the laboratory setup is evaluated. For Rapid Control Prototyping, a dedicated environment, Simulink RealTime, was used. The Functional Unit Controller of the load model is therefore based on a PC with I/O cards and Simulink RealTime software. Simulink RealTime was used to create real-time applications directly from Simulink models. In the next step, the applications were loaded onto a target computer connected to physical devices, which made it possible to perform Hardware-in-the-Loop (HIL) tests as well as the Rapid Control Prototyping process mentioned above. With Simulink RealTime, Simulink models were extended with I/O card driver blocks, which made it possible to generate real-time applications automatically and to perform interactive or automated runs on a dedicated target computer equipped with a real-time kernel, a multicore CPU, and I/O cards. Results of the laboratory tests are presented, and different load configurations are described. These include simulation of under-frequency load shedding, frequency- and voltage-dependent characteristics of groups of load units, time characteristics of groups of different load units in a chosen area, and arbitrary active and reactive power regulation based on a defined schedule.

Keywords: MATLAB, power converter, Simulink Real-Time, thyristor-controlled tap changer

Procedia PDF Downloads 292

24102 Performance of the Abbott RealTime High Risk HPV Assay with SurePath Liquid Based Cytology Specimens from Women with Low Grade Cytological Abnormalities

Authors: Alexandra Sargent, Sarah Ferris, Ioannis Theofanous

Abstract:

The Abbott RealTime High Risk HPV test (RealTime HPV) is one of five assays clinically validated and approved by the English NHS Cervical Screening Programme (CSP) for HPV triage of low grade dyskaryosis and test-of-cure of treated Cervical Intraepithelial Neoplasia. The assay is a highly automated multiplex real-time PCR test for detecting 14 high risk (hr) HPV types, with simultaneous differentiation of HPV 16 and HPV 18 versus non-HPV 16/18 hrHPV. An endogenous internal control ensures sample cellularity and controls for extraction efficiency and PCR inhibition. The original cervical specimen collected in SurePath (SP) liquid-based cytology (LBC) medium (BD Diagnostics) and the SP post-gradient cell pellets (SPG) after cytological processing are both CE marked for testing with the RealTime HPV test. During the 2011 NHSCSP validation of new tests, only the original aliquot of SP LBC medium was investigated. The residual sample volume left after cytology slide preparation is low and may not always be sufficient for repeat HPV testing or for testing of other biomarkers that may be implemented in testing algorithms in the future. The SPG samples, however, have sufficient volume to carry out additional testing and the necessary laboratory validation procedures. This study investigates the correlation of RealTime HPV results between matched pairs of original SP LBC medium and SP post-gradient cell pellets (SPG) after cytology processing, using cervical specimens collected from women with low grade cytological abnormalities. Matched pairs of SP and SPG samples from 750 women with borderline (N = 392) and mild (N = 351) cytology were available for this study. Both specimen types were processed and tested in parallel for the presence of hrHPV with RealTime HPV according to the manufacturer’s instructions. HrHPV detection rates and concordance between test results from matched SP and SPG pairs were calculated.
A total of 743 matched pairs with valid test results on both sample types were available for analysis. An overall agreement of hrHPV test results of 97.5% (k: 0.95) was found with matched SP/SPG pairs, and slightly lower concordance (96.9%; k: 0.94) was observed on 392 pairs from women with borderline cytology compared to 351 pairs from women with mild cytology (98.0%; k: 0.95). Partial typing results were highly concordant in matched SP/SPG pairs for HPV 16 (99.1%), HPV 18 (99.7%) and non-HPV16/18 hrHPV (97.0%), respectively. 19 matched pairs were found with discrepant results: 9 from women with borderline cytology and 4 from women with mild cytology were negative on SPG and positive on SP; 3 from women with borderline cytology and 3 from women with mild cytology were negative on SP and positive on SPG. Excellent correlation of hrHPV DNA test results was found between matched pairs of SP original fluid and post-gradient cell pellets from women with low grade cytological abnormalities tested with the Abbott RealTime High-Risk HPV assay, demonstrating robust performance of the test with both specimen types and confirming the utility of the assay for cytology triage with both specimen types.
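The concordance figures above follow from simple paired-sample arithmetic. The sketch below shows how percent agreement can be recomputed from the discordant-pair counts given in the abstract, and how Cohen's kappa is obtained from a 2x2 table. The function names are illustrative, and because the abstract does not report the concordant positive and negative counts, the kappa function is not applied to the study data here.

```python
def percent_agreement(total_pairs, discordant_pairs):
    """Share of matched pairs with the same result on both specimen types."""
    return 100.0 * (total_pairs - discordant_pairs) / total_pairs


def cohen_kappa(both_pos, only_first_pos, only_second_pos, both_neg):
    """Cohen's kappa for two binary tests run on the same samples."""
    n = both_pos + only_first_pos + only_second_pos + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = (both_pos + only_first_pos) / n
    second_pos = (both_pos + only_second_pos) / n
    # Agreement expected by chance if the two tests were independent.
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    return (observed - expected) / (1 - expected)


# Discordant counts taken from the abstract: borderline 9 + 3, mild 4 + 3.
borderline_agreement = percent_agreement(392, 9 + 3)
mild_agreement = percent_agreement(351, 4 + 3)
```

Rounded to one decimal place, the two subgroup values match the 96.9% and 98.0% reported in the abstract.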

Keywords: Abbott realtime test, HPV, SurePath liquid based cytology, surepath post-gradient cell pellet

Procedia PDF Downloads 224

24101 Developing a Multiagent-Based Decision Support System for Realtime Multi-Risk Disaster Management

Authors: D. Moser, D. Pinto, A. Cipriano

Abstract:

A Disaster Management System (DMS) is very important for countries exposed to different kinds of disasters. Around the world, earthquakes, tsunamis, volcanic eruptions, fires and other natural or man-made disasters occur and affect the population. Two or more disasters may also arise at the same time, which means that multi-risk situations must be handled. A Decision Support System (DSS) based on multiagents is a suitable architecture for handling such situations. Most known DMSs deal with one particular disaster (or with two, in the case of an earthquake-tsunami combination). Nevertheless, a DSS helps to achieve a better real-time response. The proposal of our work is to analyze the existing systems in the literature and extend them to multi-risk disasters in order to construct a well-organized system. The work shown here is an approach to a multi-risk system, which needs an architecture and well-defined aims. At this stage, our study is a kind of case study that analyzes the path we have to follow to create the proposed system in the future.

Keywords: decision support system, disaster management system, multi-risk, multiagent system

Procedia PDF Downloads 388

24100 Visualization-Based Feature Extraction for Classification in Real-Time Interaction

Authors: Ágoston Nagy

Abstract:

This paper introduces a method of using unsupervised machine learning to visualize the feature space of a dataset in 2D, in order to find the most characteristic segments in the set. After dimension reduction, users can select clusters by manual drawing. Selected clusters are recorded into a data model that is used for later predictions based on real-time data. Predictions are made with supervised learning, using the Gesture Recognition Toolkit. The paper introduces two example applications: a semantic audio organizer for analyzing incoming sounds, and a gesture database organizer where gestural data (recorded by a Leap Motion) is visualized for further manipulation.

Keywords: gesture recognition, machine learning, real-time interaction, visualization

Procedia PDF Downloads 320

24099 A Building Structure Health Monitoring Device Based on Cost Effective 1-Axis Accelerometers

Authors: Chih Hsing Lin, Wen-Ching Chen, Ssu-Ying Chen, Chih-Chyau Yang, Chien-Ming Wu, Chun-Ming Huang

Abstract:

Critical structures such as buildings, bridges and dams require periodic inspections to ensure safe operation. Reliable inspection of structures can be achieved by combining a temperature sensor and accelerometers. In this work, we propose a building structure health monitoring device (BSHMD) that uses three 1-axis accelerometers, a gateway, an analog-to-digital converter (ADC), and a data logger to monitor the building structure. The proposed BSHMD achieves low cost by using three 1-axis accelerometers, with the data synchronization problem solved, and is easy to install and remove. Furthermore, we develop a packet acquisition program to receive the sensed data and then classify it by time and date. Compared with a 3-axis accelerometer, our proposed device based on 1-axis accelerometers achieves a 64.3% cost saving. Compared with a previous structural monitoring device, the BSHMD achieves an 89% area saving. Therefore, with the proposed device, a real-time diagnosis system for building damage monitoring can be operated effectively.

Keywords: building structure health monitoring, cost effective, 1-axis accelerometers, real-time diagnosis

Procedia PDF Downloads 330

24098 Annexing the Strength of Information and Communication Technology (ICT) for Real-time TB Reporting Using TB Situation Room (TSR) in Nigeria: Kano State Experience

Authors: Ibrahim Umar, Ashiru Rajab, Sumayya Chindo, Emmanuel Olashore

Abstract:

INTRODUCTION: Kano is the most populous state in Nigeria and one of the two states with the highest TB burden in the country. The state notifies an average of 8,000+ TB cases quarterly and had the highest yearly notification of all the states in Nigeria from 2020 to 2022. The contribution of the state TB program to the national TB notification varied from 9% to 10% quarterly between the first quarter of 2022 and the second quarter of 2023. The Kano State TB Situation Room is an innovative platform for timely data collection, collation and analysis for informed decision-making in the health system. During the second National TB Testing Week (NTBTW) of 2023, the Kano TB program aimed at early TB detection, prevention and treatment. The state TB Situation Room gave the state an avenue for coordination and surveillance through real-time data reporting, review, analysis and use during the NTBTW. OBJECTIVES: To assess the role of an innovative information and communication technology platform for real-time TB reporting during the second National TB Testing Week in Nigeria in 2023, and to showcase the NTBTW data cascade analysis using the TSR as an innovative ICT platform. METHODOLOGY: The state TB program deployed a real-time virtual dashboard for NTBTW reporting, analysis and feedback. A data room team was set up to receive real-time data through a Google link. The data received were analyzed using the Power BI analytics tool, with a statistical alpha level of significance of <0.05. RESULTS: At the end of the week-long activity, using the real-time dashboard with onsite mentorship of the field workers, the state TB program screened a total of 52,054 people for TB out of 72,112 individuals eligible for screening (72% screening rate). A total of 9,910 presumptive TB clients were identified and evaluated for TB, leading to the diagnosis of 445 patients with TB (5% yield from presumptives) and the placement of 435 TB patients on treatment (98% enrolment).
CONCLUSION: The TB Situation Room (TSR) has been a great asset to the Kano State TB Control Program in meeting the growing demand for timely data reporting in TB and other global health responses. The use of real-time surveillance data during the 2023 NTBTW has in no small measure improved the TB response and feedback in Kano State. Scaling up this intervention to other disease areas, states and nations is a step in the right direction towards global TB eradication.
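The cascade percentages in the results can be recomputed directly from the counts given above. The following is a minimal sketch with illustrative variable names; it is not the Power BI analysis the authors describe. Computed this way, the yield from presumptive cases comes to about 4.5%, which the abstract reports as 5%.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts as stated in the abstract.
eligible = 72_112
screened = 52_054
presumptive = 9_910
diagnosed = 445
enrolled = 435

screening_rate = percent(screened, eligible)    # share of eligible people screened
yield_rate = percent(diagnosed, presumptive)    # TB cases per presumptive client
enrolment_rate = percent(enrolled, diagnosed)   # diagnosed patients placed on treatment
```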

Keywords: tuberculosis (tb), national tb testing week (ntbtw), tb situation room (tsr), information communication technology (ict)

Procedia PDF Downloads 31

24097 Evaluation of Fetal Brain Using Magnetic Resonance Imaging

Authors: Mahdi Farajzadeh Ajirlou

Abstract:

Normal fetal brain development can be assessed by in vivo magnetic resonance imaging (MRI) from the 18th gestational week (GW) to term and relies primarily on T2-weighted and diffusion-weighted (DW) sequences. The most commonly suspected brain pathologies referred to fetal MRI for further evaluation are ventriculomegaly, absent corpus callosum, and anomalies of the posterior fossa. Brain segmentation is a crucial first step in neuroimage analysis. In the case of fetal MRI, it is especially challenging and important because of the arbitrary orientation of the fetus, the organs that surround the fetal head, and irregular fetal motion. Several promising methods have been proposed, but they are limited in their performance in challenging cases and in real-time segmentation. Fetal MRI is routinely performed on a 1.5-Tesla scanner without maternal or fetal sedation. The mother lies recumbent during the examination, which typically lasts 45 to 60 minutes. The availability and ongoing validation of normative fetal brain development trajectories will provide important tools for the early detection of impaired fetal brain development, on which the management of high-risk pregnancies can be based.

Keywords: brain, fetal, MRI, imaging

Procedia PDF Downloads 43

24096 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is an emerging technology that collects data from various sensors for use in many fields. Data retrieval is one of its major issues, because the exact data required must be extracted. In this paper, a large data set is processed using feature selection. Feature selection helps to choose the data that are actually needed to process and execute the task. The key value helps to locate the exact data available in the storage space. Here the available data is streamed, and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 304

24095 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and address them proactively. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: big data, learning analytics, analytics, big data in education, Hadoop

Procedia PDF Downloads 380

24094 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

Given user demand and the growth of large volumes of free data, storage solutions are finding it increasingly challenging to protect, store and retrieve data. The day is not far off when storage companies and organizations will start refusing to store our valuable data, or will start charging a huge amount for its storage and protection. At the same time, environmental conditions make it challenging to establish and maintain new data warehouses and data centers without adding to the threat of global warming. The challenge of small data is over; the challenge now is how to manage the exponential growth of data. In this paper, we analyze the growth trend of big data and its future implications. We also focus on the impact of unstructured data on various concerns and suggest some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 510

24093 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms, expressed as formal formulas, which can find the data that are inconsistent in any target column subject to a conditional attribute dependency, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives 6 data cleaning methods based on these algorithms.
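To make the idea of dependency-rule violations concrete, the sketch below checks a plain functional dependency: rows that agree on the condition attributes should agree on the target attribute. This is a generic check written for illustration, not one of the paper's algorithms, and the function name and sample records are hypothetical.

```python
from collections import defaultdict


def find_violations(rows, condition_attrs, target_attr):
    """Return condition-value groups that map to more than one target value."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row.get(attr) for attr in condition_attrs)
        groups[key].add(row.get(target_attr))
    # A dependency holds only if every group has exactly one target value.
    return {key: values for key, values in groups.items() if len(values) > 1}


# Hypothetical records; the rule under test is zip -> city.
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]
violations = find_violations(records, ["zip"], "city")
```

Because rows are read as dictionaries, the same check works on rows fetched from a SQL table or on documents from a NoSQL store.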

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 530

24092 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many research efforts are concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 364

24091 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications: A Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its own syntax and structure. The objective is to explore various techniques and methodologies for validating the comparison and integration of JSON data against XML data and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.
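One way such a check can be sketched is to parse both documents into the same plain structure and compare them. The example below uses only the Python standard library; the helper names and sample documents are hypothetical, and the mapping rules (repeated XML tags become lists, all scalars are compared as strings, attributes are ignored) are assumptions rather than the paper's method.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_data(element):
    """Convert an XML element into nested dicts, lists and strings."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    data = {}
    for child in children:
        value = xml_to_data(child)
        if child.tag in data:
            # A repeated tag is treated as a list of values.
            if not isinstance(data[child.tag], list):
                data[child.tag] = [data[child.tag]]
            data[child.tag].append(value)
        else:
            data[child.tag] = value
    return data


def stringify(value):
    """Render JSON scalars as strings, since XML text carries no types."""
    if isinstance(value, dict):
        return {key: stringify(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify(item) for item in value]
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return ""
    return str(value)


def json_matches_xml(json_text, xml_text):
    """True when the JSON document and the XML document carry the same data."""
    return stringify(json.loads(json_text)) == xml_to_data(ET.fromstring(xml_text))


JSON_DOC = '{"id": 7, "name": "Ada", "tags": ["a", "b"]}'
XML_DOC = "<user><id>7</id><name>Ada</name><tags>a</tags><tags>b</tags></user>"
```

A known limitation of this mapping is that a JSON list with a single element has no distinct XML form, so a real test suite would need a schema or an explicit rule for that case.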

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 83

24090 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

City governance involves various data, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by different systems differ in form, and their sources differ because they may come from different sectors. To reflect one or several facets of an event or rule, data from multiple sources need to be fused together. Data from different sources, collected in different ways, raise several issues that need to be resolved. The problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata must be consulted and read when an application needs to access, manipulate and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 347

24089 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays, given the human role in the ever-growing production of data, methods such as data mining for extracting knowledge are unavoidable. One topic in data mining is the inherent distribution of the data: the databases that create or receive such data usually belong to corporate or non-corporate persons who do not give their information freely to others. Yet there is no guarantee that someone can mine particular data without intruding on the owner’s privacy. Sending data and then gathering them, in either a vertical or a horizontal scheme, depends on the type of preservation used and is carried out to improve data privacy. This study attempts a comprehensive comparison of data preserving methods; general methods such as data randomization and coding are also examined, along with the strengths and weaknesses of each.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 487

24088 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018, creating a new legal framework for the protection of personal data in the European Union. Article 20 of the GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g. applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 439

24087 Computer Network Applications, Practical Implementations and Structural Control System Representations

Authors: El Miloudi Djelloul

Abstract:

Computer networks play an important role in the practical implementation of different systems. To implement a system in a network, it is first necessary to know all the configurations that will form part of the system and to provide adequate information and solutions in real time. If such a system is to be implemented, for example, in a school or a similar institution, the first step is to analyze the types of model that need to be configured; another important step is to organize the work in terms of the devices that are part of the overall system. Before configuration, it is often important to describe and document all the work in the respective process and then to organize it from a problem-solving perspective. As critical infrastructure, the computer network is very specific. On the one hand, the paper presents effective solutions from a structural point of view; on the other, it reflects the positive aspects of modeling and block schema presentations as a better alternative for solving specific problems caused by continual disturbances of the system from devices, programs and signals, or by collisions of packets moving from one computer node to other nodes.

Keywords: local area networks, LANs, block schema presentations, computer network system, computer node, critical infrastructure, packet collisions, structural control system representations, computer network, implementations, modeling structural representations, companies, computers, context, control systems, internet, software

Procedia PDF Downloads 327

24086 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 364

24085 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 608

24084 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to test various sensors easily and to collect sensor data directly using database tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data involves many difficulties, especially when the user is not a computer programmer or is using the board for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 342

24083 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 127

24082 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity and veracity and comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of a government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include its definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas such as humanitarian data, open government data, scientific research data, and industry data.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 179

24081 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited, since only about 1% of generated data is ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and of the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm for the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 136

24080 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data, and we need a specific device to store all of it. Generally, we store data on pen drives, hard drives, etc. Sometimes we may lose the data due to the corruption of these devices. To overcome these issues, we implemented a cloud space for storing the data that also provides more security for the data. The data can be accessed from anywhere in the world using just the internet. We implemented all of this in Java using the NetBeans IDE. Once a user uploads data, the user no longer has any rights to change it. Users' uploaded files are stored in the cloud with the system time as the file name, in a directory created from some random words. The cloud accepts the data only if the size of the file is less than 2MB.
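The upload rules described in this abstract (a size limit, a file name taken from the system time, and a directory named from random words) can be illustrated with a minimal sketch. It is written in Python for brevity, although the authors used Java, and it leaves out the AES encryption step entirely. The names `accept_upload`, `storage_path`, and `WORDS` are hypothetical, and reading "2MB" as 2 MiB is an assumption.

```python
import secrets
import time
from typing import Optional

# The abstract states a 2MB limit; interpreting it as 2 MiB is an assumption.
MAX_FILE_BYTES = 2 * 1024 * 1024

# Hypothetical word list; the abstract only says "some random words".
WORDS = ["amber", "birch", "cedar", "delta", "ember", "fjord"]


def accept_upload(data: bytes) -> bool:
    """Accept a file only if it is strictly smaller than the size limit."""
    return len(data) < MAX_FILE_BYTES


def storage_path(now_ms: Optional[int] = None) -> str:
    """Return '<random words>/<system time>' as the stored location."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # current system time in milliseconds
    directory = "-".join(secrets.choice(WORDS) for _ in range(2))
    return f"{directory}/{now_ms}"
```

Passing `now_ms` explicitly makes the naming rule easy to test; in normal use the function falls back to the current system time.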

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 172

24079 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business intelligence is a methodology that systematically exploits data to produce information and knowledge and can support the decision-making process. Two methods used in business intelligence are data warehousing and data mining. A data warehouse stores historical data derived from transactional data. For data modelling in the data warehouse, we apply Kimball's dimensional modelling. Data mining is used to extract patterns and gain insight from the data. Data mining has many techniques, one of which is segmentation. To profile telecommunication customers, we segment customers according to their usage of services, invoices, and payments. Customers can be grouped according to their characteristics, and the profitable customers can be identified. We apply the K-Means clustering algorithm for segmentation, using the RFM (Recency, Frequency, and Monetary) model as its input variables. All data mining processes are carried out with IBM SPSS Modeler.
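As an illustration of the segmentation step, the sketch below runs a plain K-Means (Lloyd's algorithm) on RFM triples. It is not the authors' IBM SPSS Modeler workflow: the customer values are made up, the min-max scaling step is an assumption, and seeding the centroids with the first k points is a simplification chosen to keep the run deterministic.

```python
from math import dist


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single RFM variable dominates distances."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Made-up customers: (recency in days, number of purchases, monetary value).
customers = [(5, 40, 900.0), (7, 35, 850.0), (200, 2, 30.0), (180, 3, 45.0)]
labels = k_means(min_max_scale(customers), k=2)
```

With this toy data, the two recent, frequent, high-spending customers end up in one segment and the two lapsed, low-spending customers in the other.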

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 442

24078 Current Epizootic Situation of Q Fever in Polish Cattle

Authors: Monika Szymańska-Czerwińska, Agnieszka Jodełko, Krzysztof Niemczuk

Abstract:

Q fever (coxiellosis) is an infectious disease of animals and humans caused by C. burnetii and widely distributed throughout the world. Cattle and small ruminants are commonly known as shedders of C. burnetii. The aims of this study were to evaluate the seroprevalence and shedding of C. burnetii in cattle. Genotypes of the pathogen present in the tested specimens were also identified using MLVA (Multiple Locus Variable-Number Tandem Repeat Analysis) and MST (multispacer sequence typing) methods. Sampling was conducted in different regions of Poland in 2018-2021. In total, 2180 bovine serum samples from 801 cattle herds were tested by ELISA (enzyme-linked immunosorbent assay). A total of 489 specimens from 157 cattle herds, comprising individual milk samples (n=407), bulk tank milk (n=58), vaginal swabs (n=20), placenta (n=3), and feces (n=1), were subjected to C. burnetii-specific qPCR. The qPCR (IS1111 transposon-like repetitive region) was performed using the Adiavet COX RealTime PCR kit. Genotypic characterization of the strains was conducted using MLVA and MST methods. MLVA was performed using 6 variable loci. The overall herd-level seroprevalence of C. burnetii infection was 36.74% (801/2180). Shedders were detected in 29.3% (46/157) of cattle herds in all tested regions. The ST 61 sequence type was identified in 10 out of 18 genotyped strains. Interestingly, one strain represents a sequence type that has never been recorded previously. The MLVA method identified three previously known genotypes: J was the most common, and I and BE were also recognized. Moreover, one genotype has never been described previously. Seroprevalence and shedding of C. burnetii in cattle are common, and the strains are genetically diverse.
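The figures in this abstract can be checked with a few lines of arithmetic. The sketch below only reproduces the ratios as printed; the helper name `percent` is hypothetical. Note that the 36.74% figure, as printed, divides the number of herds (801) by the number of serum samples (2180).

```python
# Specimen counts by type, taken from the abstract.
specimens = {
    "individual milk": 407,
    "bulk tank milk": 58,
    "vaginal swabs": 20,
    "placenta": 3,
    "feces": 1,
}
total_specimens = sum(specimens.values())  # should match the stated 489


def percent(numerator, denominator, digits):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


seroprevalence = percent(801, 2180, 2)  # ratio printed as 36.74%
shedding_herds = percent(46, 157, 1)    # ratio printed as 29.3%
```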

Keywords: Coxiella burnetii, cattle, MST, MLVA, Q fever

Procedia PDF Downloads 47

24077 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.
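The abstract does not describe the imputation technique itself, so the sketch below substitutes a generic baseline: per-gene mean imputation followed by a simple variance ranking as a stand-in for feature selection. It is not the authors' method. The function name `impute_gene_means`, the row-per-gene layout, and the expression values are all assumptions made for illustration.

```python
from statistics import fmean, pvariance


def impute_gene_means(matrix):
    """Replace None entries with the mean of the observed values in the same row (gene)."""
    completed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        fill = fmean(observed) if observed else 0.0  # fall back to 0.0 for an empty gene
        completed.append([fill if v is None else v for v in row])
    return completed


# Made-up expression matrix: one row per gene, one column per sample.
expression = [[2.0, None, 4.0], [1.0, 1.5, None]]
completed = impute_gene_means(expression)

# Rank genes by variance across samples, highest first (a crude selection score).
ranked = sorted(range(len(completed)), key=lambda i: pvariance(completed[i]), reverse=True)
```

Mean imputation shrinks the variance of genes with many missing values, which is one reason a dedicated technique that couples imputation with feature selection can be preferable.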

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 531

24076 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.
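The abstract does not give the details of PDDA, so the following is only a rough sketch of what priority-based aggregation at a sensor node might look like: high-priority readings are forwarded unchanged, while routine readings are collapsed into a single mean. The function `aggregate_readings`, the numeric priority scale, and the threshold are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def aggregate_readings(readings, priority_threshold):
    """Forward urgent readings as-is and collapse the remaining ones into one mean value."""
    urgent = [value for value, priority in readings if priority >= priority_threshold]
    routine = [value for value, priority in readings if priority < priority_threshold]
    packets = [("urgent", value) for value in urgent]
    if routine:
        packets.append(("aggregate", fmean(routine)))
    return packets


# Made-up (value, priority) pairs from one sensor node.
readings = [(21.0, 1), (21.2, 1), (58.4, 9), (20.8, 1)]
packets = aggregate_readings(readings, priority_threshold=5)
```

Here four readings become two packets, which is the kind of reduction in transmitted data that aggregation schemes rely on to save energy.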

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 300

24075 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have increased within a short period of time. Consequently, this fast-increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and their current trends. This paper presents a complete discussion of state-of-the-art big data technologies based on group and stream data processing. Moreover, the strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendors, several open research challenges, and the opportunities brought about by big data. The similarities and differences of these techniques and technologies, based on important limitations, are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, IT community, industry, big data

Procedia PDF Downloads 157