Search results for: Startup data analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7403

7343 Telehealth Ecosystem: Challenge and Opportunity

Authors: R. Poonsuph

Abstract:

Technological innovation plays a crucial role in virtual healthcare services. A growing number of telehealth platforms are concentrating on using digital tools to improve the quality and availability of care. As a result, telehealth represents an opportunity to redesign the way health services are delivered. The research objective is to discover a new business model for digital health services and for related industries that participate in telehealth solutions. The business opportunity is valuable for healthcare investors, for example as a startup company, to investigate further or to implement the telehealth platform. The paper presents a digital healthcare business model and business opportunities for related industries. These include digital healthcare services extending from a traditional business model and use cases of business opportunities for related industries. Although there are enormous business opportunities, telehealth remains challenging because of patient adaptation and the digital transformation process within a healthcare organization.

Keywords: telehealth, Internet hospital, HealthTech, InsurTech

Downloads: 986
7342 Business Intelligence for N=1 Analytics using Hybrid Intelligent System Approach

Authors: Rajendra M Sonar

Abstract:

The future of business intelligence (BI) is to integrate intelligence into operational systems that work in real time, analyzing small chunks of data on a continuous basis as requirements dictate. This moves away from the traditional approach of analyzing huge amounts of data ad hoc or sporadically, in a passive, offline mode. Various AI techniques, such as expert systems, case-based reasoning, and neural networks, play an important role in building business intelligence systems. Since BI involves various tasks and models various types of problems, hybrid intelligent techniques can be a better choice. Intelligent systems accessible through web services are easier to integrate into existing operational systems, adding intelligence to every business process. These can be built to be invoked in a modular and distributed way and to work in real time. The functionality of such systems can be extended to accept external inputs in formats such as RSS. In this paper, we describe a framework that uses effective combinations of these techniques, is accessible through web services, and works in real time. We have successfully developed various prototype systems and completed a few commercial deployments in the area of personalization and recommendation on mobile and web sites.

Keywords: Business Intelligence, Customer Relationship Management, Hybrid Intelligent Systems, Personalization and Recommendation (P&R), Recommender Systems.

Downloads: 2044
7341 Malware Detection in Mobile Devices by Analyzing Sequences of System Calls

Authors: Jorge Maestre Vidal, Ana Lucila Sandoval Orozco, Luis Javier García Villalba

Abstract:

With the increase in popularity of mobile devices, new and varied forms of malware have emerged. Consequently, cyberdefense organizations have echoed the need to deploy more effective defensive schemes adapted to the challenges posed by these recent monitoring environments. To contribute to their development, this paper presents a malware detection strategy for mobile devices based on sequence alignment algorithms. Unlike previous proposals, only the system calls performed during the startup of applications are studied. In this way, the sequences of system calls executed by applications just downloaded from app stores can be studied efficiently and in depth while the applications are initialized in a secure and isolated environment. As the experiments demonstrate, most of the analyzed malicious activities were successfully identified in their boot processes.

Keywords: Android, information security, intrusion detection systems, malware, mobile devices.

Downloads: 1260
7340 Automatic Adjustment of Thresholds via Closed-Loop Feedback Mechanism for Solder Paste Inspection

Authors: Chia-Chen Wei, Pack Hsieh, Jeffrey Chen

Abstract:

Surface Mount Technology (SMT) is widely used in electronic assembly, in which electronic components are mounted on the surface of the printed circuit board (PCB). Most defects in the SMT process are related to the quality of solder paste printing. These defects lead to considerable manufacturing costs in the electronics assembly industry. Therefore, the solder paste inspection (SPI) machine for controlling and monitoring the amount of solder paste printing has become an important part of the production process. So far, the SPI threshold has been set on the basis of statistical analysis and experts' experience. Because the production data are not normally distributed and the production processes vary in many ways, defects related to solder paste printing still occur. To solve this problem, this paper proposes an online machine learning algorithm, called the automatic threshold adjustment (ATA) algorithm, and a closed-loop architecture in the SMT process to determine the best threshold settings. Simulation experiments show that our proposed threshold settings improve the accuracy from 99.85% to 100%.

Keywords: Big data analytics, Industry 4.0, SPI threshold setting, surface mount technology.

Downloads: 759
7339 Optimal All-to-All Personalized Communication in All-Port Tori

Authors: Liu Gang, Gu Nai-jie, Bi Kun, Tu Kun, Dong Wan-li

Abstract:

All-to-all personalized communication, also known as complete exchange, is one of the densest communication patterns in parallel computing. In this paper, we propose new indirect algorithms for complete exchange on all-port rings and tori. The new algorithms fully utilize all communication links and transmit messages along shortest paths to achieve the theoretical lower bounds on message transmission, which have not been achieved by other existing indirect algorithms. For 2D r × c ( r % c ) all-port torus, the algorithm has optimal transmission cost and O(c) message startup cost. In addition, the proposed algorithms accommodate non-power-of-two tori, where the number of nodes in each dimension need not be a power of two or square. Finally, the algorithms are conceptually simple and symmetrical for every message and every node, so they can be easily implemented and achieve the optimum in practice.

Keywords: Complete exchange, collective communication, all-to-all personalized communication, parallel computing, wormhole routing, torus.

Downloads: 1470
7338 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data today. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it in big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Downloads: 3876
7337 Using Multi-Arm Bandits to Optimize Game Play Metrics and Effective Game Design

Authors: Kenny Raharjo, Ramon Lawrence

Abstract:

Game designers have the challenging task of building games that engage players to spend their time and money on the game. There are an infinite number of game variations and design choices, and it is hard to systematically determine game design choices that will have positive experiences for players. In this work, we demonstrate how multi-arm bandits can be used to automatically explore game design variations to achieve improved player metrics. The advantage of multi-arm bandits is that they allow for continuous experimentation and variation, intrinsically converge to the best solution, and require no special infrastructure to use beyond allowing minor game variations to be deployed to users for evaluation. A user study confirms that applying multi-arm bandits was successful in determining the preferred game variation with highest play time metrics and can be a useful technique in a game designer's toolkit.
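
The abstract above does not say which bandit strategy the authors used. As a minimal sketch only, the following simulates an epsilon-greedy bandit, one common choice, over three hypothetical game variations. The function name, the engagement rates, and the parameter values are assumptions made for illustration, not details from the paper.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over game variations.

    true_rates holds hypothetical probabilities that a player stays
    engaged with each variation; they are unknown to the algorithm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms        # how often each variation was served
    estimates = [0.0] * n_arms   # running mean reward per variation

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: serve a random variation.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: serve the variation with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated player response (1.0 = engaged, 0.0 = not engaged).
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0

        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


counts, estimates = epsilon_greedy([0.2, 0.5, 0.8])
most_served = max(range(len(counts)), key=lambda a: counts[a])
print("most served variation:", most_served)
```

In a live game the simulated reward would be replaced by a measured player metric such as play time, which is the kind of signal the abstract describes optimizing.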

Keywords: Game design, multi-arm bandit, design exploration and data mining, player metric optimization and analytics.

Downloads: 1490
7336 Vibration Analysis of Gas Turbine SIEMENS 162MW - V94.2 Related to Iran Power Plant Industry in Fars Province

Authors: Omid A. Zargar

Abstract:

Vibration analysis of the most critical equipment is considered one of the most challenging activities in preventive maintenance. Utilities are the heart of the process in big industrial plants such as petrochemical zones. Vibration analysis methods and condition monitoring systems for this kind of equipment have developed considerably in recent years. On the other hand, there are many operating factors, such as inlet and outlet pressures and temperatures, that should be monitored. In this paper, some of the most effective concepts and techniques related to gas turbine vibration analysis are discussed. In addition, a vibration case history of a SIEMENS 162MW - V94.2 gas turbine in the Iranian power industry in Fars province is explained. The vibration monitoring system and machinery technical specifications are introduced. Besides, absolute and relative vibration trends, turbine and compressor orbits, Fast Fourier transform (FFT) of absolute vibrations, vibration modal analysis, turbine and compressor startup and shutdown conditions, Bode diagrams for relative vibrations, Nyquist diagrams, and waterfall or three-dimensional FFT diagrams in startup and trip conditions are discussed with the related graphs. Furthermore, split resonance in gas turbines is discussed in detail. Moreover, some updated vibration monitoring systems, blade manufacturing techniques, and modern damping mechanisms are discussed in this paper.

Keywords: Gas turbine, turbine compressor, vibration data collector, utility, condition monitoring, non-contact probe, Relative Vibration, Absolute Vibration, Split Resonance, Time Wave Form (TWF), Fast Fourier transform (FFT).

Downloads: 3600
7335 Air Handling Units Power Consumption Using Generalized Additive Model for Anomaly Detection: A Case Study in a Singapore Campus

Authors: Ju Peng Poh, Jun Yu Charles Lee, Jonathan Chew Hoe Khoo

Abstract:

The emergence of digital twin technology, a digital replica of the physical world, has improved real-time access to sensor data about the performance of buildings. This digital transformation has opened up many opportunities to improve the management of a building by using the collected data to help monitor consumption patterns and energy leakages. One example is the integration of predictive models for anomaly detection. In this paper, we use the GAM (Generalised Additive Model) for anomaly detection in the power consumption pattern of Air Handling Units (AHU). There is ample research on the use of GAM for the prediction of power consumption at the office building and nationwide level. However, there is limited illustration of its anomaly detection capabilities, of prescriptive analytics case studies, and of its integration with the latest developments in digital twin technology. In this paper, we applied the general GAM modelling framework to historical data on AHU power consumption and building cooling load between Jan 2018 and Aug 2019 from an education campus in Singapore to train prediction models that, in turn, yield predicted values and ranges. The historical data are extracted directly from the digital twin for modelling purposes. We enhanced the utility of the GAM model by using it to power a real-time anomaly detection system based on the forward predicted ranges. The magnitude of deviation from the upper and lower bounds of the uncertainty intervals is used to inform and identify anomalous data points, all based on historical data, without explicit intervention from domain experts. Nevertheless, the domain expert fits in through an optional feedback loop through which iterative data cleansing is performed. After an anomalously high or low level of power consumption is detected, a set of rule-based conditions is evaluated in real time to help determine the next course of action for the facilities manager. The performance of GAM is then compared with other approaches to evaluate its effectiveness. Lastly, we discuss the successful deployment of this approach for the detection of anomalous power consumption patterns and illustrate it with real-world use cases.
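
The interval-based flagging step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the prediction bounds are taken as given (in the paper they would come from a fitted GAM), and the function names and the `tolerance` parameter are hypothetical.

```python
def interval_deviation(observed, lower, upper):
    """Signed distance of a reading from its predicted range.

    Returns 0.0 inside the range, a positive value above the upper
    bound, and a negative value below the lower bound.
    """
    if observed > upper:
        return observed - upper
    if observed < lower:
        return observed - lower
    return 0.0


def flag_anomalies(readings, bounds, tolerance=0.0):
    """Return (index, deviation) pairs for readings outside their range.

    bounds is a list of (lower, upper) pairs, one per reading; the
    tolerance is an assumed slack before a point counts as anomalous.
    """
    flagged = []
    for index, (observed, (lower, upper)) in enumerate(zip(readings, bounds)):
        deviation = interval_deviation(observed, lower, upper)
        if abs(deviation) > tolerance:
            flagged.append((index, deviation))
    return flagged


# Hypothetical AHU power readings (kW) against a fixed predicted range.
readings = [10.0, 15.5, 7.0]
bounds = [(9.0, 12.0), (9.0, 12.0), (9.0, 12.0)]
print(flag_anomalies(readings, bounds))  # [(1, 3.5), (2, -2.0)]
```

The sign of the deviation separates anomalously high from anomalously low consumption, which is the distinction the rule-based follow-up conditions in the abstract act on.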

Keywords: Anomaly detection, digital twin, Generalised Additive Model, Power Consumption Model.

Downloads: 444
7334 Monomial Form Approach to Rectangular Surface Modeling

Authors: Taweechai Nuntawisuttiwong, Natasha Dejdumrong

Abstract:

Geometric modeling plays an important role in the construction and manufacturing of curve, surface and solid models. Its algorithms are critically important not only in the automobile, ship and aircraft manufacturing business, but are also necessary in a wide variety of modern applications, e.g., robotics, optimization, computer vision, data analytics and visualization. The calculation and display of geometric objects can be accomplished by these six techniques: polynomial basis, recursive, iterative, coefficient matrix, polar form approach and pyramidal algorithms. In this research, the coefficient matrix (simply called the monomial form approach) is used to model polynomial rectangular patches, i.e., Said-Ball, Wang-Ball, DP, Dejdumrong and NB1 surfaces. Some examples of the monomial forms for these surface models are illustrated in many aspects, e.g., construction, derivatives, model transformation, degree elevation and degree reduction.

Keywords: Monomial form, rectangular surfaces, CAGD curves, monomial matrix applications.

Downloads: 659
7333 Hardware-in-the-Loop Test for Automatic Voltage Regulator of Synchronous Condenser

Authors: Ha Thi Nguyen, Guangya Yang, Arne Hejde Nielsen, Peter Højgaard Jensen

Abstract:

The automatic voltage regulator (AVR) plays an important role in volt/var control of the synchronous condenser (SC) in power systems. Testing AVR performance in steady-state and dynamic conditions in a real grid is expensive, inefficient, and hard to achieve. To address this issue, we implement a hardware-in-the-loop (HiL) test for the AVR of the SC to test the steady-state and dynamic performance of the AVR in different operating conditions. The startup procedure of the system and voltage set point changes are studied to evaluate the AVR hardware response. Overexcitation, underexcitation, and AVR set point loss are tested to compare the performance of the SC with the AVR hardware against that of simulation. The comparative results demonstrate how the AVR will work in a real system. The results show that the HiL test is an effective approach for testing devices before deployment and is able to parameterize the controller with lower cost, higher efficiency, and more flexibility.

Keywords: Automatic voltage regulator, hardware-in-the-loop, synchronous condenser, real time digital simulator.

Downloads: 1055
7332 Fermentation of Xylose and Glucose Mixture in Intensified Reactors by Scheffersomyces stipitis to Produce Ethanol

Authors: S. C. Santos, S. R. Dionísio, A. L. D. De Andrade, L. R. Roque, A. C. Da Costa, J. L. Ienczak

Abstract:

In this work, two fermentations at different temperatures (25 and 30 °C), with cell recycling, were carried out to produce ethanol, using a mix of commercial substrates, xylose (70%) and glucose (30%), as the organic source for Scheffersomyces stipitis. Five consecutive fermentations of 80 g L-1 (1st, 2nd and 3rd recycles), 96 g L-1 (4th recycle) and 120 g L-1 (5th recycle) reduced sugars led to a final maximum ethanol concentration of 17.2 and 34.5 g L-1, at 25 and 30 °C, respectively. Glucose was the preferred substrate; moreover, xylose startup degradation was initiated after a remaining glucose presence in the medium. Results showed that yeast acid treatment, performed before each cycle, provided improvements in cell viability, accompanied by ethanol productivity of 2.16 g L-1 h-1 at 30 °C. A maximum of 36% of xylose was retained in the fermentation medium, and after five-cycle fermentation an ethanol yield of 0.43 g ethanol/g sugars was observed. S. stipitis fermentation capacity and tolerance showed better results at 30 °C, with 83.4% of theoretical yield referenced on initial biomass.

Keywords: 5-carbon sugar, cell recycling fermenter, mixed sugars, xylose-fermenting yeast.

Downloads: 2719
7331 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data come first and foremost. If much irrelevant and redundant information is present, or the data are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set, but this does not happen. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for their data set.
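
Two of the steps named in the abstract above, data cleaning and normalization, can be sketched in plain Python. This is a minimal sketch, assuming mean imputation for missing values and min-max scaling; the abstract does not single out these particular algorithms, and the function names are hypothetical.

```python
def impute_mean(rows):
    """Replace None entries with the mean of the known values in that column."""
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        present = [row[col] for row in rows if row[col] is not None]
        means.append(sum(present) / len(present))
    return [
        [means[col] if value is None else value for col, value in enumerate(row)]
        for row in rows
    ]


def min_max_normalize(rows):
    """Scale every column to the range [0, 1]; constant columns map to 0.0."""
    n_cols = len(rows[0])
    lows = [min(row[col] for row in rows) for col in range(n_cols)]
    highs = [max(row[col] for row in rows) for col in range(n_cols)]
    return [
        [
            (value - lows[col]) / (highs[col] - lows[col])
            if highs[col] > lows[col]
            else 0.0
            for col, value in enumerate(row)
        ]
        for row in rows
    ]


# Toy data set with one missing value in the second column.
raw = [[1.0, None], [3.0, 4.0], [5.0, 8.0]]
training_set = min_max_normalize(impute_mean(raw))
print(training_set)  # [[0.0, 0.5], [0.5, 0.0], [1.0, 1.0]]
```

The order matters here: imputing before scaling keeps the filled-in value inside the observed range of its column.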

Keywords: Data mining, feature selection, data cleaning.

Downloads: 5960
7330 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation-data discovery algorithms expressed as formal formulas, which can find inconsistent data in all target columns that depend on condition attributes, whether the data is structured (SQL) or unstructured (NoSQL). It then gives six data cleaning methods based on these algorithms.
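
The abstract above does not spell out its discovery algorithms. As a rough illustration of the general idea only, the sketch below checks a simple functional dependency (condition attributes determine a target attribute) and reports groups of rows that disagree on the target value. The function name and the sample columns are assumptions, not the paper's formulas.

```python
from collections import defaultdict


def find_fd_violations(rows, determinant, dependent):
    """Find rows that violate the dependency determinant -> dependent.

    rows is a list of dicts; determinant is a list of condition
    attribute names; dependent is the target attribute name.
    Returns {determinant values: [indices of conflicting rows]}.
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[col] for col in determinant)
        groups[key].append(index)

    violations = {}
    for key, indices in groups.items():
        # More than one distinct target value for the same condition
        # values means the dependency is violated within this group.
        distinct_values = {rows[i][dependent] for i in indices}
        if len(distinct_values) > 1:
            violations[key] = indices
    return violations


# Hypothetical records where a postal code should determine the city.
records = [
    {"zip": "10115", "city": "Berlin"},
    {"zip": "10115", "city": "Berlin"},
    {"zip": "80331", "city": "Munich"},
    {"zip": "80331", "city": "Muenchen"},
]
print(find_fd_violations(records, ["zip"], "city"))  # {('80331',): [2, 3]}
```

Because the rows are plain dictionaries, the same check applies to rows fetched from a relational table or to documents from a document store, which mirrors the structured and unstructured cases mentioned in the abstract.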

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads: 2572
7329 Investigation into the Role of Leadership in the Management of Digital Transformation for Small and Medium Enterprises

Authors: Francesco Coraci, Abdul-Hadi G. Abulrub

Abstract:

Digital technology is transforming the landscape of the industrial sector at an unprecedented level by connecting people, processes, and machines in real time. It represents the means for a new pathway to achieve innovative, dynamic competitive advantages, deliver unique customer value, and sustain critical relationships. Thus, success in a constantly changing environment is governed by the ability of an organization to revolutionize its business models, deliver innovative solutions, and capture value from big data analytics and insights. Businesses need to re-strategize operations and develop extra capabilities to cope with the need for additional flexibility and agility. The traditional “command and control” leadership style is structurally and operationally incompatible with the digital era. In this paper, the authors discuss how transformational leaders can act as a glue in the social, organizational context, which is crucial to enabling the workforce and developing a psychological attachment to the digital vision.

Keywords: Internet of things, strategy, change leadership, dynamic competitive advantage, digital transformation.

Downloads: 596
7328 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations focus on retrieving data from a single data mart. An exception is the drill-across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations that coalesce data marts. In doing so, we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads: 1523
7327 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads: 2432
7326 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1269
7325 In Search of Excellence – Google vs Baidu

Authors: Linda, Sau-ling LAI

Abstract:

This paper compares the search engine marketing strategies adopted in China and the Western countries through two illustrative cases, namely, Google and Baidu. Marketers in the West use search engine optimization (SEO) to rank their sites higher for queries in Google. Baidu, however, offers paid search placement, or the selling of engine results for particular keywords to the highest bidders. Whereas Google has been providing innovative services ranging from Google Map to Google Blog, Baidu remains focused on search services, the one thing that it does best. The challenges and opportunities that the Chinese Internet market offers to global entrepreneurs are also discussed in the paper.

Keywords: Search Engine, Web analytics, Google, Baidu

Downloads: 2412
7324 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics for some overus scenario and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads: 1601
7323 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.
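
The two building blocks named in the abstract above can be written down compactly. The sketch below is not the paper's place regression model; it only shows the standard German tank estimator for a population size and the standard Fenton-Wilkinson moment-matching approximation for a sum of log-normal variables. Independence of the leg times and the function names are assumptions made for this illustration.

```python
import math


def german_tank_estimate(observed):
    """Estimate a population size N from distinct serial numbers drawn from 1..N.

    Uses the classic estimator m + m/k - 1, where m is the sample
    maximum and k is the sample size.
    """
    k = len(observed)
    m = max(observed)
    return m + m / k - 1


def fenton_wilkinson(mus, sigmas):
    """Approximate a sum of independent log-normals by one log-normal.

    Each term has log-scale parameters (mu_i, sigma_i). The result
    (mu, sigma) matches the mean and variance of the sum.
    """
    total_mean = sum(math.exp(mu + s * s / 2) for mu, s in zip(mus, sigmas))
    total_var = sum(
        (math.exp(s * s) - 1) * math.exp(2 * mu + s * s)
        for mu, s in zip(mus, sigmas)
    )
    sigma_sq = math.log(1 + total_var / total_mean ** 2)
    mu_sum = math.log(total_mean) - sigma_sq / 2
    return mu_sum, math.sqrt(sigma_sq)


# Four observed places out of an unknown number of teams.
print(german_tank_estimate([19, 40, 42, 60]))  # 74.0

# Changeover time after two legs with hypothetical log-scale parameters.
mu, sigma = fenton_wilkinson([4.0, 4.1], [0.2, 0.25])
print(mu, sigma)
```

In the setting of the abstract, the first function would stand in for estimating the total number of teams from a small training sample, and the second for approximating the distribution of a changeover time as the sum of the leg times run so far.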

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling.

Downloads: 770
7322 The Path to Web Intelligence Maturity

Authors: Zeljko Panian

Abstract:

Web intelligence, if made personal, can fuel the process of building communications around the interests and preferences of each individual customer or prospect, by providing specific behavioral insights about each individual. To become fully efficient, Web intelligence must reach a high level of maturity, passing through a process that involves five steps: (1) Web site analysis; (2) Web site and advertising optimization; (3) Segment targeting; (4) Interactive marketing (online only); and (5) Interactive marketing (online and offline). Discussing these steps in detail, the paper uncovers the real gold mine that is personal-level Web intelligence.

Keywords: Web intelligence, web analytics, information technology (IT), interactive marketing.

Downloads: 1598
7321 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and spread of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, collecting data with such a board is difficult, especially for users who are not computer programmers or who are using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 1969
7320 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver integrated (big) data related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads 2018
7319 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is compounded when the microarray data sets contain missing data, which occurs regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes for a microarray data set is feature selection, which finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.
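The abstract does not describe the imputation technique itself, so the sketch below uses two deliberately simple stand-ins to show how imputation and feature selection fit together: row-mean imputation followed by a variance-based ranking of genes. The function names, the toy expression matrix, and the choice of both methods are assumptions for illustration only, not the authors' technique.

```python
from statistics import mean, pvariance


def impute_row_mean(matrix):
    """Replace each missing value (None) with the mean of its gene's observed values."""
    imputed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        # A gene with no observed values at all falls back to 0.0 (an assumption).
        fill = mean(observed) if observed else 0.0
        imputed.append([fill if v is None else v for v in row])
    return imputed


def top_k_by_variance(matrix, k):
    """Return the indices of the k genes with the highest variance across samples."""
    ranked = sorted(range(len(matrix)),
                    key=lambda i: pvariance(matrix[i]),
                    reverse=True)
    return ranked[:k]


# Toy expression matrix: rows are genes, columns are samples, None is missing.
expression = [
    [1.0, None, 3.0],
    [5.0, 5.0, 5.0],
    [0.0, 10.0, None],
]
completed = impute_row_mean(expression)
selected = top_k_by_variance(completed, k=2)
```

Here the two steps run one after the other, whereas the paper describes imputing while performing feature selection, so the sketch only shows why missing values must be handled before genes can be ranked.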

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads 2745
7318 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing patients' private data with anyone other than clinicians or hospital staff is a security problem, so we have to remove personally identifying information from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The same identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can benefit from research feedback based on the results.
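The idea of replacing a patient's identification with one stable number across all records can be sketched as follows. This is a minimal illustration, not the authors' application: the class `Pseudonymizer`, the rule set `IDENTIFYING_FIELDS`, and the record field names are hypothetical, and handling DICOM pixel data or OCR is outside its scope.

```python
import itertools

# Hypothetical rule set: fields treated as directly identifying and dropped.
IDENTIFYING_FIELDS = {"name", "birth_number", "address"}


class Pseudonymizer:
    """Maps each real patient identifier to one stable surrogate number."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._mapping = {}

    def surrogate(self, patient_id):
        # The same input always yields the same surrogate, so records
        # belonging to one patient remain linked after de-identification.
        if patient_id not in self._mapping:
            self._mapping[patient_id] = next(self._counter)
        return self._mapping[patient_id]


def deidentify(record, pseudonymizer, id_field="patient_id"):
    """Drop identifying fields and replace the patient id with its surrogate."""
    clean = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    clean[id_field] = pseudonymizer.surrogate(record[id_field])
    return clean


p = Pseudonymizer()
lab = deidentify({"patient_id": "A-17", "name": "Jane Doe", "glucose": 5.4}, p)
scan = deidentify({"patient_id": "A-17", "address": "Main St 1", "modality": "MR"}, p)
other = deidentify({"patient_id": "B-02", "name": "John Roe", "glucose": 6.1}, p)
```

Both records for patient "A-17" receive the same surrogate number, which is what keeps the relationships within the data intact.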

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads 1605
7317 Using Scrum in an Online Smart Classroom Environment: A Case Study

Authors: Ye Wei, Sitalakshmi Venkatraman, Fahri Benli, Fiona Wahr

Abstract:

The present digital world poses many challenges to various stakeholders in the education sector. In particular, lecturers in higher education (HE) face the problem of ensuring that students achieve the required learning outcomes despite rapid changes taking place worldwide. Different strategies are adopted to retain student engagement and commitment in classrooms and to address the differences in learning habits, preferences, and styles of the current digital generation of students. Further, with the onset of the coronavirus disease (COVID-19) pandemic, the online classroom has become the most suitable alternative teaching environment for coping with lockdown restrictions. These changes have compounded the problems of learning engagement and short attention spans among HE students. New agile methodologies that have been successfully employed to manage projects in different fields are gaining prominence in the education domain. In this paper, we present the application of Scrum as an agile methodology to enhance student learning and engagement in an online smart classroom environment. We demonstrate our proposed approach with a case study on teaching key topics in information technology that require students to gain technical and business-related data analytics skills.

Keywords: Agile methodology, Scrum, online learning, smart classroom environment, student engagement, active learning.

Downloads 346
7316 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to data analysis. Data are usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, the data must be specified by giving a set of labels as a semantic range. Data are analyzed for particular purposes. This paper shows which multi-labeled data should be targeted for analysis for each of those purposes, and discusses the role of a label against a set of labels by investigating what changes when the label is added to the set. These discussions yield methods for the advanced analysis of multi-labeled data based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Downloads 1182
7315 Steganalysis of Data Hiding via Halftoning and Coordinate Projection

Authors: Woong Hee Kim, Ilhwan Park

Abstract:

Steganography is the art of hiding and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the data. Many steganography algorithms have been proposed recently, and many of them use digital image data as a carrier. In the data hiding scheme based on halftoning and coordinate projection, still image data are used as the carrier, and the carrier image data are modified to embed the hidden data. In this paper, we present three features for the analysis of data hiding via halftoning and coordinate projection. We also present a classifier that uses the three proposed features.

Keywords: Steganography, steganalysis, digital halftoning, data hiding.

Downloads 1563
7314 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays, scientific data are inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need an integrated view of remote and local heterogeneous data sources, together with advanced tools for data access, analysis, and visualization. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows that SOA can solve the problems facing the integration process and that biologists can access biological data more easily. There are several ways to implement SOA, but web services are the most popular. The Microsoft .NET Framework is used to implement the proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA, Web Services.

Downloads 2430