Search results for: missing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7419

Search results for: missing data

7359 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and more importantly the challenges it pose to the subject of privacy and data protection. First, the nature of big data will be briefly deliberated before presenting the potential of big data in the present days. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing this issue in big data. In conclusion, the paper will put forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3855
7358 Government Initiatives: The Missing Key for E-commerce Growth in KSA

Authors: R. AlGhamdi, S. Drew, S. Alkhalaf

Abstract:

This paper explores the issues that influence online retailing in Saudi Arabia. Retailers in Saudi Arabia have been reserved in their adoption of electronically delivered aspects of their business. Despite the fact that Saudi Arabia has the largest and fastest growth of ICT marketplaces in the Arab region, e-commerce activities are not progressing at the same speed. Only very few Saudi companies, mostly medium and large companies from the manufacturing sector, are involved in e-commerce implementation. Based on qualitative data collected by conducting interviews with 16 retailers and 16 potential customers in Saudi Arabia, several factors influencing online retailing diffusion in Saudi Arabia are identified. However, government support comes the highest and most influencing factor for online retailing growth as identified by both parties; retailers and potential customers in Saudi Arabia.

Keywords: government support, key factor, online retailing growth, Saudi Arabia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801
7357 Evaluation of Two Earliness Cotton Genotypes in Three Ecological Regions

Authors: Gholamhossein Hosseini

Abstract:

Two earliness cotton genotypes I and II, which had been developed by hybridization and backcross methods between sindise-80 as an early maturing gene parent and two other lines i.e. Red leaf and Bulgare-557 as a second parent, are subjected to different environmental conditions. The early maturing genotypes with coded names of I and II were compared with four native cotton cultivars in randomized complete block design (RCBD) with four replications in three ecological regions of Iran from 2016-2017. Two early maturing genotypes along with four native cultivars viz. Varamin, Oltan, Sahel and Arya were planted in Agricultural Research Station of Varamin, Moghan and Kashmar for evaluation. Earliness data were collected for six treatments during two years in the three regions except missing data for the second year of Kashmar. Therefore, missed data were estimated and imputed. For testing the homogeneity of error variances, each experiment at a given location or year is analyzed separately using Hartley and Bartlett’s Chi-square tests and both tests confirmed homogeneity of variance. Combined analysis of variance showed that genotypes I and II were superior in Varamin, Moghan and Kashmar regions. Earliness means and their interaction effects were compared with Duncan’s multiple range tests. Finally combined analysis of variance showed that genotypes I and II were superior in Varamin, Moghan and Kashmar regions. Earliness means and their interaction effects are compared with Duncan’s multiple range tests.

Keywords: Cotton, combined, analysis, earliness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 525
7356 The Use of Software and Internet Search Engines to Develop the Encoding and Decoding Skills of a Dyslexic Learner: A Case Study

Authors: Rabih Joseph Nabhan

Abstract:

This case study explores the impact of two major computer software programs Learn to Speak English and Learn English Spelling and Pronunciation, and some Internet search engines such as Google on mending the decoding and spelling deficiency of Simon X, a dyslexic student. The improvement in decoding and spelling may result in better reading comprehension and composition writing. Some computer programs and Internet materials can help regain the missing awareness and consequently restore his self-confidence and self-esteem. In addition, this study provides a systematic plan comprising a set of activities (four computer programs and Internet materials) which address the problem from the lowest to the highest levels of phoneme and phonological awareness. Four methods of data collection (accounts, observations, published tests, and interviews) create the triangulation to validly and reliably collect data before the plan, during the plan, and after the plan. The data collected are analyzed quantitatively and qualitatively. Sometimes the analysis is either quantitative or qualitative, and some other times a combination of both. Tables and figures are utilized to provide a clear and uncomplicated illustration of some data. The improvement in the decoding, spelling, reading comprehension, and composition writing skills that occurred is proved through the use of authentic materials performed by the student under study. Such materials are a comparison between two sample passages written by the learner before and after the plan, a genuine computer chat conversation, and the scores of the academic year that followed the execution of the plan. Based on these results, the researcher recommends further studies on other Lebanese dyslexic learners using the computer to mend their language problem in order to design and make a most reliable software program that can address this disability more efficiently and successfully.

Keywords: Analysis, awareness, dyslexic, software.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 597
7355 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set but this is not happened. Thus, we present the most well know algorithms for each step of data pre-processing so that one achieves the best performance for their data set.

Keywords: Data mining, feature selection, data cleaning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5925
7354 Structural Integrity Management for Fixed Offshore Platforms in Malaysia

Authors: Narayanan Sambu Potty , Mohammad Kabir B. Mohd Akram

Abstract:

Structural Integrity Management (SIM) is important for the protection of offshore crew, environment, business assets and company and industry reputation. API RP 2A contained guidelines for assessment of existing platforms mostly for the Gulf of Mexico (GOM). ISO 19902 SIM framework also does not specifically cater for Malaysia. There are about 200 platforms in Malaysia with 90 exceeding their design life. The Petronas Carigali Sdn Bhd (PCSB) uses the Asset Integrity Management System and the very subjective Risk based Inspection Program for these platforms. Petronas currently doesn-t have a standalone Petronas Technical Standard PTS-SIM. This study proposes a recommended practice for the SIM process for offshore structures in Malaysia, including studies by API and ISO and local elements such as the number of platforms, types of facilities, age and risk ranking. Case study on SMG-A platform in Sabah shows missing or scattered platform data and a gap in inspection history. It is to undergo a level 3 underwater inspection in year 2015.

Keywords: platform, assessment, integrity, risk based inspection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7198
7353 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4815
7352 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2561
7351 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515
7350 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2418
7349 A 3-Year Evaluation Study on Fine Needle Aspiration Cytology and Corresponding Histology

Authors: Amjad Al Shammari, Ashraf Ibrahim, Laila Seada

Abstract:

Background and Objectives: Incidence of thyroid carcinoma has been increasing world-wide. In the present study, we evaluated diagnostic accuracy of Fine needle aspiration (FNA) and its efficiency in early detecting neoplastic lesions of thyroid gland over a 3-year period. Methods: Data have been retrieved from pathology files in King Khalid Hospital. For each patient, age, gender, FNA, site & size of nodule and final histopathologic diagnosis were recorded. Results: Study included 490 cases where 419 of them were female and 71 male. Male to female ratio was 1:6. Mean age was 43 years for males and 38 for females. Cases with confirmed histopathology were 131. In 101/131 (77.1%), concordance was found between FNA and histology. In 30/131 (22.9%), there was discrepancy in diagnosis. Total malignant cases were 43, out of which 14 (32.5%) were true positive and 29 (67.44%) were false negative. No false positive cases could be found in our series. Conclusion: FNA could diagnose benign nodules in all cases, however, in malignant cases, ultrasound findings have to be taken into consideration to avoid missing of a microcarcinoma in the contralateral lobe.

Keywords: FNA, hail, histopathology, thyroid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1118
7348 A Convolutional Neural Network-Based Vehicle Theft Detection, Location, and Reporting System

Authors: Michael Moeti, Khuliso Sigama, Thapelo Samuel Matlala

Abstract:

One of the principal challenges that the world is confronted with is insecurity. The crime rate is increasing exponentially, and protecting our physical assets, especially in the motorist sector, is becoming impossible when applying our own strength. The need to develop technological solutions that detect and report theft without any human interference is inevitable. This is critical, especially for vehicle owners, to ensure theft detection and speedy identification towards recovery efforts in cases where a vehicle is missing or attempted theft is taking place. The vehicle theft detection system uses Convolutional Neural Network (CNN) to recognize the driver's face captured using an installed mobile phone device. The location identification function uses a Global Positioning System (GPS) to determine the real-time location of the vehicle. Upon identification of the location, Global System for Mobile Communications (GSM) technology is used to report or notify the vehicle owner about the whereabouts of the vehicle. The installed mobile app was implemented by making use of Python as it is undoubtedly the best choice in machine learning. It allows easy access to machine learning algorithms through its widely developed library ecosystem. The graphical user interface was developed by making use of JAVA as it is better suited for mobile development. Google's online database (Firebase) was used as a means of storage for the application. The system integration test was performed using a simple percentage analysis. 60 vehicle owners participated in this study as a sample, and questionnaires were used in order to establish the acceptability of the system developed. The result indicates the efficiency of the proposed system, and consequently, the paper proposes that the use of the system can effectively monitor the vehicle at any given place, even if it is driven outside its normal jurisdiction. More so, the system can be used as a database to detect, locate and report missing vehicles to different security agencies.

Keywords: Convolutional Neural Network, CNN, location identification, tracking, GPS, GSM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 322
7347 Life Experiences are Important Factors of Making Stronger SOC (Sense of Coherence) on the Workers in Tsukuba Research Park City (TRPC)

Authors: Shinichiro Sasahara, Yusuke Tomotsune, Yuichi Ohi, Shun Suzuki, Akihiro Seki, Junko Sakano, Yoshihiko Yamazaki, Ichiyo Matsuzaki

Abstract:

Via a large scale cross-sectional study among Japanese white color workers, the authors aimed to elucidate: (1) the distributions of Sense of Coherence (SOC), which reflect stress coping abilities, (2) the distributions of Life experience; (3) and the association between SOC and Life experience. Anonymous self-administered questionnaires were sent to 15,891 in 2001 and 21,922 in 2011 employees at educational and research institutions in Tsukuba Research Park City. A total of 5,868 (36.9%) and 9,528 (43.5%) respectively workers completed and returned the questionnaire; 5,715 and 9,515 respectively workers without missing data were analyzed. SOC scale scores differed by gender, age, and other demographic features in both study years. Among the life experiences, workers who have got over parenting or management position were higher SOC scale scores adjusted by gender and age. The life experiences that workers have got over could develop their stronger SOC in their life course.

Keywords: field study, life experience, mental health, SOC (sense of coherence)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1490
7346 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, there have been a lot of efforts and studies are carried out in growing proficient tools for performing various tasks in big data. Recently big data have gotten a lot of publicity for their good reasons. Due to the large and complex collection of datasets it is difficult to process on traditional data processing applications. This concern turns to be further mandatory for producing various tools in big data. Moreover, the main aim of big data analytics is to utilize the advanced analytic techniques besides very huge, different datasets which contain diverse sizes from terabytes to zettabytes and diverse types such as structured or unstructured and batch or streaming. Big data is useful for data sets where their size or type is away from the capability of traditional relational databases for capturing, managing and processing the data with low-latency. Thus the out coming challenges tend to the occurrence of powerful big data tools. In this survey, a various collection of big data tools are illustrated and also compared with the salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3727
7345 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is efficient approach to organize data. When data is classified to multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of expressed data includes every label of the given set of labels within the range of the set. Desirable properties of the orders, data is also expressed by the higher set of labels and different sets of labels express different data, are discussed for the orders.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classificaiton, Orders of Sets of Labels

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256
7344 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of ever-increasing use of distributed data in computer networks is obvious for all. One technique that is performed on the distributed data for increasing of efficiency and reliablity is data rplication. In this paper, after introducing this technique and its advantages, we will examine some dynamic data replication. We will examine their characteristies for some overus scenario and the we will propose some suggestion for their improvement.

Keywords: data replication, data hiding, consistency, dynamicdata replication strategy

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1593
7343 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1958
7342 Designing Creative Events with Deconstructivism Approach

Authors: Maryam Memarian, Mahmood Naghizadeh

Abstract:

Deconstruction is an approach that is entirely incompatible with the traditional prevalent architecture. Considering the fact that this approach attempts to put architecture in sharp contrast with its opposite events and transpires with attending to the neglected and missing aspects of architecture and deconstructing its stable structures. It also recklessly proceeds beyond the existing frameworks and intends to create a different and more efficient prospect for space. The aim of deconstruction architecture is to satisfy both the prospective and retrospective visions as well as takes into account all tastes of the present in order to transcend time. Likewise, it ventures to fragment the facts and symbols of the past and extract new concepts from within their heart, which coincide with today’s circumstances. Since this approach is an attempt to surpass the limits of the prevalent architecture, it can be employed to design places in which creative events occur and imagination and ambition flourish. Thought-provoking artistic events can grow and mature in such places and be represented in the best way possible to all people. The concept of event proposed in the plan grows out of the interaction between space and creation. In addition to triggering surprise and high impressions, it is also considered as a bold journey into the suspended realms of the traditional conflicts in architecture such as architecture-landscape, interior-exterior, center-margin, product-process, and stability-instability. In this project, at first, through interpretive-historical research method and examining the inputs and data collection, recognition and organizing takes place. After evaluating the obtained data using deductive reasoning, the data is eventually interpreted. Given the fact that the research topic is in its infancy and there is not a similar case in Iran with limited number of corresponding instances across the world, the selected topic helps to shed lights on the unrevealed and neglected parts in architecture. Similarly, criticizing, investigating and comparing specific and highly prized cases in other countries with the project under study can serve as an introduction into this architecture style.

Keywords: Creativity, deconstruction, event.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1848
7341 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986
7340 A Partially Accelerated Life Test Planning with Competing Risks and Linear Degradation Path under Tampered Failure Rate Model

Authors: Fariba Azizi, Firoozeh Haghighi, Viliam Makis

Abstract:

In this paper, we propose a method to model the relationship between failure time and degradation for a simple step stress test where underlying degradation path is linear and different causes of failure are possible. It is assumed that the intensity function depends only on the degradation value. No assumptions are made about the distribution of the failure times. A simple step-stress test is used to shorten failure time of products and a tampered failure rate (TFR) model is proposed to describe the effect of the changing stress on the intensities. We assume that some of the products that fail during the test have a cause of failure that is only known to belong to a certain subset of all possible failures. This case is known as masking. In the presence of masking, the maximum likelihood estimates (MLEs) of the model parameters are obtained through an expectation-maximization (EM) algorithm by treating the causes of failure as missing values. The effect of incomplete information on the estimation of parameters is studied through a Monte-Carlo simulation. Finally, a real example is analyzed to illustrate the application of the proposed methods.

Keywords: Expectation-maximization (EM) algorithm, cause of failure, intensity, linear degradation path, masked data, reliability function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1030
7339 Improved Computational Efficiency of Machine Learning Algorithms Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK

Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick

Abstract:

The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of infected people confirmed to acquire the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to figure out a predictive machine learning (ML) archetypal that could forecast the COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID-19 cases registered, new cases encountered on a daily basis, total death registered, and patients’ death per day due to Coronavirus is collected from World Health Organization (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data are split into 8:2 ratio for training and testing purposes to forecast future new COVID-19 cases. Support Vector Machine (SVM), Random Forest (RF), and linear regression (LR) algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. From the evaluation metrics such as r-squared value and mean squared error, the statistical performance of the model in predicting the new COVID-19 cases is evaluated. RF outperformed the other two ML algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n = 30. The mean square error obtained for RF is 4.05e11, which is lesser compared to the other predictive models used for this study. From the experimental analysis, RF algorithm can perform more effectively and efficiently in predicting the new COVID-19 cases, which could help the health sector to take relevant control measures for the spread of the virus.

Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 99
7338 Design of High Torque Elbow Joint for Above Elbow Prosthesis

Authors: Irfan Hussain, Adnan Masood, Javaid Iqbal, Umar S. Khan

Abstract:

Above Elbow Prosthesis is one of the most commonly amputated or missing limbs. The research is done for modelling techniques of upper limb prosthesis and design of high torque, light weight and compact in size elbow actuator. The purposed actuator consists of a DC motor, planetary gear set and a harmonic drive. The calculations show that the actuator is good enough to be used in real life powered prosthetic upper limb or rehabilitation exoskeleton.

Keywords: Above Elbow prosthesis, Harmonic drive, Planetarygear set, Sagittal Plane

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2691
7337 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our Medicine-oriented research is based on a medical data set of real patients. It is a security problem to share patient private data with peoples other than clinician or hospital staff. We have to remove person identification information from medical data. The medical data without private data are available after a de-identification process for any research purposes. In this paper, we introduce an universal automatic rule-based de-identification application to do all this stuff on an heterogeneous medical data. A patient private identification is replaced by an unique identification number, even in burnedin annotation in pixel data. The identical identification is used for all patient medical data, so it keeps relationships in a data. Hospital can take an advantage of a research feedback based on results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589
7336 Inverse Heat Conduction Analysis of Cooling on Run Out Tables

Authors: M. S. Gadala, Khaled Ahmed, Elasadig Mahdi

Abstract:

In this paper, we introduced a gradient-based inverse solver to obtain the missing boundary conditions based on the readings of internal thermocouples. The results show that the method is very sensitive to measurement errors, and becomes unstable when small time steps are used. The artificial neural networks are shown to be capable of capturing the whole thermal history on the run-out table, but are not very effective in restoring the detailed behavior of the boundary conditions. Also, they behave poorly in nonlinear cases and where the boundary condition profile is different. GA and PSO are more effective in finding a detailed representation of the time-varying boundary conditions, as well as in nonlinear cases. However, their convergence takes longer. A variation of the basic PSO, called CRPSO, showed the best performance among the three versions. Also, PSO proved to be effective in handling noisy data, especially when its performance parameters were tuned. An increase in the self-confidence parameter was also found to be effective, as it increased the global search capabilities of the algorithm. RPSO was the most effective variation in dealing with noise, closely followed by CRPSO. The latter variation is recommended for inverse heat conduction problems, as it combines the efficiency and effectiveness required by these problems.

Keywords: Inverse Analysis, Function Specification, Neural Net Works, Particle Swarm, Run Out Table.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1659
7335 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyze data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. There are some certain purposes to analyze data. This paper shows which multi-labeled data should be the target to be analyzed for those purposes, and discusses the role of a label against a set of labels by investigating the change when a label is added to the set of labels. These discussions give the methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1164
7334 Shape-Based Image Retrieval Using Shape Matrix

Authors: C. Sheng, Y. Xin

Abstract:

Retrieval image by shape similarity, given a template shape is particularly challenging, owning to the difficulty to derive a similarity measurement that closely conforms to the common perception of similarity by humans. In this paper, a new method for the representation and comparison of shapes is present which is based on the shape matrix and snake model. It is scaling, rotation, translation invariant. And it can retrieve the shape images with some missing or occluded parts. In the method, the deformation spent by the template to match the shape images and the matching degree is used to evaluate the similarity between them.

Keywords: shape representation, shape matching, shape matrix, deformation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1466
7333 Streamflow Modeling for a Small Watershed Using Limited Hydrological Data

Authors: S. Chuenchooklin

Abstract:

This research was conducted in the Pua Watershed whereas located in the Upper Nan River Basin in Nan province, Thailand. Nan River basin originated in Nan province that comprises of many tributary streams to produce as inflow to the Sirikit dam provided huge reservoir with the storage capacity of 9510 million cubic meters. The common problems of most watersheds were found i.e. shortage water supply for consumption and agriculture utilizations, deteriorate of water quality, flood and landslide including debris flow, and unstable of riverbank. The Pua Watershed is one of several small river basins that flow through the Nan River Basin. The watershed includes 404 km2 representing the Pua District, the Upper Nan Basin, or the whole Nan River Basin, of 61.5%, 18.2% or 1.2% respectively. The Pua River is a main stream producing all year streamflow supplying the Pua District and an inflow to the Upper Nan Basin. Its length approximately 56.3 kilometers with an average slope of the channel by 1.9% measured. A diversion weir namely Pua weir bound the plain and mountainous areas with a very steep slope of the riverbed to 2.9% and drainage area of 149 km2 as upstream watershed while a mild slope of the riverbed to 0.2% found in a river reach of 20.3 km downstream of this weir, which considered as a gauged basin. However, the major branch streams of the Pua River are ungauged catchments namely: Nam Kwang and Nam Koon with the drainage area of 86 and 35 km2 respectively. These upstream watersheds produce runoff through the 3-streams downstream of Pua weir, Jao weir, and Kang weir, with an averaged annual runoff of 578 million cubic meters. They were analyzed using both statistical data at Pua weir and simulated data resulted from the hydrologic modeling system (HEC–HMS) which applied for the remaining ungauged basins. Since the Kwang and Koon catchments were limited with lack of hydrological data included streamflow and rainfall. Therefore, the mathematical modeling: HEC-HMS with the Snyder-s hydrograph synthesized and transposed methods were applied for those areas using calibrated hydrological parameters from the upstream of Pua weir with continuously daily recorded of streamflow and rainfall data during 2008-2011. The results showed that the simulated daily streamflow and sum up as annual runoff in 2008, 2010, and 2011 were fitted with observed annual runoff at Pua weir using the simple linear regression with the satisfied correlation R2 of 0.64, 062, and 0.59, respectively. The sensitivity of simulation results were come from difficulty using calibrated parameters i.e. lag-time, coefficient of peak flow, initial losses, uniform loss rates, and missing some daily observed data. These calibrated parameters were used to apply for the other 2-ungauged catchments and downstream catchments simulated.

Keywords: Streamflow, hydrological model, ungauged catchments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1937
7332 Steganalysis of Data Hiding via Halftoning and Coordinate Projection

Authors: Woong Hee Kim, Ilhwan Park

Abstract:

Steganography is the art of hiding and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the data. A lot of steganography algorithms have been proposed recently. Many of them use the digital image data as a carrier. In data hiding scheme of halftoning and coordinate projection, still image data is used as a carrier, and the data of carrier image are modified for data embedding. In this paper, we present three features for analysis of data hiding via halftoning and coordinate projection. Also, we present a classifier using the proposed three features.

Keywords: Steganography, steganalysis, digital halftoning, data hiding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550
7331 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows SOA will solve the problems that facing integration process and if the biologist scientists can access the biological data in easier way. There are several methods to implement SOA but web service is the most popular method. The Microsoft .Net Framework used to implement proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2416
7330 STATISTICA Software: A State of the Art Review

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, P. Ranjetha

Abstract:

Data mining idea is mounting rapidly in admiration and also in their popularity. The foremost aspire of data mining method is to extract data from a huge data set into several forms that could be comprehended for additional use. The data mining is a technology that contains with rich potential resources which could be supportive for industries and businesses that pay attention to collect the necessary information of the data to discover their customer’s performances. For extracting data there are several methods are available such as Classification, Clustering, Association, Discovering, and Visualization… etc., which has its individual and diverse algorithms towards the effort to fit an appropriate model to the data. STATISTICA mostly deals with excessive groups of data that imposes vast rigorous computational constraints. These results trials challenge cause the emergence of powerful STATISTICA Data Mining technologies. In this survey an overview of the STATISTICA software is illustrated along with their significant features.

Keywords: Data Mining, STATISTICA Data Miner, Text Miner, Enterprise Server, Classification, Association, Clustering, Regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2557