Search results for: Geospatial data misuse.

7341 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data today. Next, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it in the context of big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Downloads: 3862

7340 Geospatial Assessment of State Lands in the Cape Coast Urban Area

Authors: E. B. Quarcoo, I. Yakubu, K. J. Appau

Abstract:

Current land use and land cover (LULC) dynamics in Ghana have revealed considerable changes in settlement spaces. As a result, this study is intended to merge the cellular automata and Markov chain models using remotely sensed data and Geographical Information System (GIS) approaches to monitor, map, and detect the spatio-temporal LULC change in state lands within Cape Coast Metropolis. Multi-temporal satellite images from 1986-2020 were pre-processed, geo-referenced, and then mapped using supervised maximum likelihood classification to investigate the state’s land cover history (1986-2020) with an overall mapping accuracy of approximately 85%. The study further observed the rate of change for the area to have favored the built-up area 9.8 (12.58 km²) to the detriment of vegetation 5.14 (12.68 km²), but on average, 0.37 km² (91.43 acres, or 37.00 ha) of the landscape was transformed yearly. Subsequently, the CA-Markov model was used to anticipate the potential LULC for the study area for 2030. According to the anticipated 2030 LULC map, the patterns of vegetation transitioning into built-up regions will continue over the following ten years as a result of urban growth.

Keywords: LULC, cellular automata, Markov Chain, state lands, urbanisation, public lands, cape coast metropolis.

Downloads: 55

7339 An Edit-Distance Algorithm to Detect Correlated Attacks in Distributed Systems

Authors: Sule Simsek

Abstract:

Intrusion detection systems (IDS) are crucial components of the security mechanisms of today's computer systems. Existing research on intrusion detection has focused on sequential intrusions. However, intrusions can also be formed by concurrent interactions of multiple processes. Some of the intrusions caused by these interactions cannot be detected using sequential intrusion detection methods. Therefore, there is a need for a mechanism that views the distributed system as a whole. L-BIDS (Lattice-Based Intrusion Detection System) is proposed to address this problem. In the L-BIDS framework, a library of intrusions and distributed traces are represented as lattices. These lattices are then compared in order to detect intrusions in the distributed traces.

Keywords: Attack graph, distributed, edit-distance, misuse detection.

Downloads: 1341
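
The abstract above does not spell out how lattices are compared. As a much simpler illustration of the edit-distance idea named in the title, the sketch below computes the classic Levenshtein distance between two flat event sequences. This is a swapped-in sequential technique, not the L-BIDS lattice comparison, and the event names and the `edit_distance` helper are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of events)."""
    # previous[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


# Hypothetical attack signature and observed trace (illustrative only).
signature = ["login", "read", "escalate", "write"]
observed = ["login", "read", "write"]
distance = edit_distance(signature, observed)
```

A small distance between an observed trace and a known signature would flag the trace for closer inspection; the threshold is a design choice that the abstract does not specify.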

7338 Image Classification and Accuracy Assessment Using the Confusion Matrix, Contingency Matrix, and Kappa Coefficient

Authors: F. F. Howard, C. B. Boye, I. Yakubu, J. S. Y. Kuma

Abstract:

Remote sensing can be used to produce land use and land cover maps through a procedure known as image classification. Several elements need to be taken into consideration, including the availability of high-quality Landsat imagery, secondary data and a precise classification process. The goal of this study was to classify and map the land use and land cover of the study area using remote sensing and Geospatial Information System (GIS) analysis. The classification was done using Landsat 8 satellite images acquired in December 2020 covering the study area. The Landsat image was downloaded from the USGS. The Landsat image with 30 m resolution was geo-referenced to the WGS_84 datum and Universal Transverse Mercator (UTM) Zone 30N coordinate projection system. A radiometric correction was applied to the image to reduce noise. This study consists of two sections: the Land Use/Land Cover (LULC) classification and the accuracy assessment using the confusion and contingency matrices and the kappa coefficient. The LULC classes were vegetation (agriculture) (67.87%), water bodies (0.01%), mining areas (5.24%), forest (26.02%), and settlement (0.88%). An overall accuracy of 97.87% and a kappa coefficient (K) of 97.3% were obtained for the confusion matrix, while an overall accuracy of 95.7% and a kappa coefficient of 0.947 were obtained for the contingency matrix. The kappa coefficients were rated as substantial; hence, the classified image is fit for further research.

Keywords: Confusion matrix, contingency matrix, kappa coefficient, land use/land cover, accuracy assessment.

Downloads: 152
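
To make the accuracy measures in the abstract above concrete, here is a minimal sketch of how overall accuracy and Cohen's kappa are usually derived from a confusion matrix. The function name and the 2x2 counts are hypothetical and are not taken from the paper; the formula is the standard kappa = (p_o - p_e) / (1 - p_e).

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i
    that were assigned to class j by the classifier.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class counts, for illustration only.
example = [[45, 5],
           [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Kappa is lower than overall accuracy here because it discounts the agreement that would be expected by chance alone.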

7337 Antibiotic Prescribing in the Acute Care in Iraq

Authors: Ola A. Nassr, Ali M. Abd Alridha, Rua A. Naser, Rasha S. Abbas

Abstract:

Background: Excessive and inappropriate use of antimicrobial agents among hospitalized patients remains an important patient safety and public health issue worldwide. Not only does this behavior incur unnecessary cost but it is also associated with increased morbidity and mortality. The objective of this study is to obtain an insight into the prescribing patterns of antibiotics in surgical and medical wards, to help identify a scope for improvement in service delivery. Method: A simple point prevalence survey included a convenience sample of 200 patients admitted to medical and surgical wards in a government teaching hospital in Baghdad between October 2017 and April 2018. Data were collected by a trained pharmacy intern using a standardized form. Patient’s demographics and details of the prescribed antibiotics, including dose, frequency of dosing and route of administration, were reported. Patients were included if they had been admitted at least 24 hours before the survey. Patients under 18 years of age, having a diagnosis of cancer or shock, or being admitted to the intensive care unit, were excluded. Data were checked and entered by the authors into Excel and were subjected to frequency analysis, which was carried out on anonymized data to protect patient confidentiality. Results: Overall, 88.5% of patients (n=177) received 293 antibiotics during their hospital admission, with a small variation between wards (80%-97%). The average number of antibiotics prescribed per patient was 1.65, ranging from 1.3 for medical patients to 1.95 for surgical patients. Parenteral third-generation cephalosporins were the most commonly prescribed at a rate of 54.3% (n=159) followed by nitroimidazole 29.4% (n=86), quinolones 7.5% (n=22) and macrolides 4.4% (n=13), while carbapenems and aminoglycosides were the least prescribed together accounting for only 4.4% (n=13). The intravenous route was the most common route of administration, used for 96.6% of patients (n=171). Indications were reported in only 63.8% of cases. Culture to identify pathogenic organisms was employed in only 0.5% of cases. Conclusion: Broad-spectrum antibiotics are prescribed at an alarming rate. This practice may provoke antibiotic resistance and adversely affect the patient outcome. Implementation of an antibiotic stewardship program is warranted to enhance the efficacy, safety and cost-effectiveness of antimicrobial agents.

Keywords: Acute care, antibiotic misuse, Iraq, prescribing.

Downloads: 919

7336 Spatial Distribution of Cd, Zn and Hg in Groundwater at Rayong Province, Thailand

Authors: T. Makkasap, T. Satapanajaru

Abstract:

The objective of this study was to evaluate the distribution patterns of Cd, Zn and Hg in groundwater by geospatial interpolation. The study was performed at Rayong province in the eastern part of Thailand, an area with high agricultural and industrial activity. Groundwater samples were collected twice a year from 31 tubewells around this area. An Inductively Coupled Plasma-Atomic Emission Spectrometer (ICP-AES) was used to measure the concentrations of Cd, Zn, and Hg in the groundwater samples. The results demonstrated that concentrations of Cd, Zn and Hg ranged from 0.000-0.297 mg/L (x = 0.021±0.033 mg/L), 0.022-33.236 mg/L (x = 4.214±4.766 mg/L) and 0.000-0.289 mg/L (x = 0.023±0.034 mg/L), respectively. Most of the heavy metal concentrations exceeded the groundwater quality standards specified by the Ministry of Natural Resources and Environment, Thailand. Heavy metal concentrations tended to be highest in the southeastern part of the area, which is especially vulnerable to heavy metals and other contaminants.

Keywords: Groundwater, Heavy metals, Kriging, Rayong, Spatial distribution.

Downloads: 1905

7335 Anti-Social Networking?

Authors: Jarrod Trevathan, Trina Myers

Abstract:

Social networking is one of the most successful and popular tools to emerge from the Web 2.0 era. However, the increased interconnectivity and access to people's personal lives and information has created a plethora of opportunities for the nefarious side of human nature to manifest. This paper categorizes and describes the major types of anti-social behavior and criminal activity that can arise through undisciplined use and/or misuse of social media. We specifically address identity theft, misrepresentation of information posted, cyber bullying, children and social networking, and social networking in the workplace. Recommendations are provided for how to reduce the risk of being the victim of a crime or engaging in embarrassing behavior that could irrevocably harm one's reputation either professionally or personally. We also discuss what responsibilities social networking companies have to protect their users and what law enforcement and policy makers can do to help alleviate the problems.

Keywords: Identity theft, misrepresentation, cyber bullying, online scams.

Downloads: 2077

7334 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data come first and foremost. If much irrelevant and redundant information is present, or the data are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set, but this does not happen. Thus, we present the most well-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.

Keywords: Data mining, feature selection, data cleaning.

Downloads: 5935
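
As a small illustration of one pre-processing step listed in the abstract above (normalization), the sketch below applies min-max scaling to a single feature. The function name and the sample values are hypothetical, and this is only one of many normalization schemes rather than the one the paper recommends.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant feature carries no information; map it to zeros
        # instead of dividing by zero.
        return [0.0 for _ in values]
    span = highest - lowest
    return [(value - lowest) / span for value in values]


# Hypothetical feature column, for illustration only.
feature = [2, 4, 6]
scaled = min_max_normalize(feature)
```

Scaling of this kind matters most for learners that depend on distances or gradient magnitudes, where a feature with a large numeric range would otherwise dominate.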

7333 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Downloads: 4818

7332 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and outlines several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms, expressed as formal formulas, which can find inconsistent data in all target columns under a conditional attribute dependency, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives six data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads: 2563

7331 Land Use/Land Cover Mapping Using Landsat 8 and Sentinel-2 in a Mediterranean Landscape

Authors: M. Vogiatzis, K. Perakis

Abstract:

Spatially explicit and up-to-date land use/land cover information is fundamental for spatial planning, land management, sustainable development, and sound decision-making. In the last decade, many satellite-derived land cover products at different spatial, spectral, and temporal resolutions have been developed, such as the European Copernicus Land Cover product. However, more efficient and detailed information for land use/land cover is required at the regional or local scale. A typical Mediterranean basin with a complex landscape comprised of various forest types, crops, artificial surfaces, and wetlands was selected to test and develop our approach. In this study, we investigate the improvement of the Copernicus Land Cover product (CLC2018) using Landsat 8 and Sentinel-2 pixel-based classification based on all available existing geospatial data (Forest Maps, LPIS, Natura2000 habitats, cadastral parcels, etc.). We examined and compared the performance of the Random Forest classifier for land use/land cover mapping. In total, 10 land use/land cover categories were recognized in Landsat 8 and 11 in Sentinel-2A. A comparison of the overall classification accuracies for 2018 shows that Landsat 8 classification accuracy was slightly higher than that of Sentinel-2A (82.99% vs. 80.30%). We concluded that the main land use/land cover types of CLC2018, even within a heterogeneous area, can be successfully mapped and updated according to CLC nomenclature. Future research should be oriented toward integrating spatiotemporal information from seasonal bands and spectral indexes in the classification process.

Keywords: land use/land cover, random forest, Landsat-8 OLI, Sentinel-2A MSI, Corine land cover

Downloads: 246

7330 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations have focused on retrieving data from a single data mart. An exception is the drill-across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations which coalesce data marts. In doing so, we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads: 1516

7329 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads: 2420

7328 Identify Features and Parameters to Devise an Accurate Intrusion Detection System Using Artificial Neural Network

Authors: Saman M. Abdulla, Najla B. Al-Dabagh, Omar Zakaria

Abstract:

The aim of this article is to explain how features of attacks can be extracted from packets. It also explains how vectors can be built and then applied to the input of any analysis stage. For the analysis, the work deploys a feedforward backpropagation neural network as a misuse intrusion detection system. It uses ten types of attacks as examples for training and testing the neural network. It explains how the packets are analyzed to extract features. The work shows how selecting the right features, building correct vectors, and correctly choosing the training method and the number of nodes in the hidden layer of a neural network affect the accuracy of the system. In addition, the work shows how to obtain optimal weight values and use them to initialize the artificial neural network.

Keywords: Artificial Neural Network, Attack Features, Misuse Intrusion Detection System, Training Parameters.

Downloads: 2247

7327 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, many efforts and studies have been carried out to develop proficient tools for performing various tasks in big data. Recently, big data has received a lot of publicity, for good reasons. Because the datasets involved are large and complex, they are difficult to process with traditional data processing applications. This concern makes the production of various big data tools all the more necessary. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets that range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for datasets whose size or type is beyond the capability of traditional relational databases to capture, manage and process the data with low latency. The resulting challenges have led to the emergence of powerful big data tools. In this survey, a varied collection of big data tools is illustrated and compared on their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads: 3728

7326 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: data is also expressed by the higher set of labels, and different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1261

7325 Privacy Issues in Pervasive Healthcare Monitoring System: A Review

Authors: Rusyaizila Ramli, Nasriah Zakaria, Putra Sumari

Abstract:

Privacy issues are commonly discussed among researchers, practitioners, and end-users in pervasive healthcare. Pervasive healthcare systems are applications that can support patients' needs anytime and anywhere. However, pervasive healthcare raises privacy concerns since it can lead to situations where patients may not be aware that their private information is being shared and becomes vulnerable to threats. We have systematically analyzed the privacy issues and present a summary in tabular form to show the relationship among the issues. The six issues identified are medical information misuse, prescription leakage, medical information eavesdropping, social implications for the patient, patient difficulties in managing privacy settings, and lack of support in designing privacy-sensitive applications. We narrow down the issues and choose to focus on the issue of 'lack of support in designing privacy-sensitive applications' by proposing a privacy-sensitive architecture specifically designed for pervasive healthcare monitoring systems.

Keywords: Human Factors, Pervasive Healthcare, Privacy Issues

Downloads: 2868

7324 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The need for ever-increasing use of distributed data in computer networks is obvious to all. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics in some scenarios and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads: 1595

7323 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, a wide range of data is being generated in many fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a greater variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using such boards to collect data, especially when the user is not a computer programmer or is using them for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 1960

7322 CAGE Questionnaire as a Screening Tool for Hazardous Drinking in an Acute Admissions Ward: Frequency of Application and Comparison with AUDIT-C Questionnaire

Authors: Ammar Ayad Issa Al-Rifaie, Zuhreya Muazu, Maysam Ali Abdulwahid, Dermot Gleeson

Abstract:

The aim of this audit was to examine the efficiency of alcohol history documentation and screening for hazardous drinkers at the Medical Admission Unit (MAU) of Northern General Hospital (NGH), Sheffield, to identify any potential for enhancing clinical practice. Data were collected from medical clerking sheets, the ICE system and directly from 82 patients by three junior medical doctors using both the CAGE questionnaire and the AUDIT-C tool for patients newly admitted to the MAU in NGH in the period between January and March 2015. Alcohol consumption was documented in around two-thirds of the patient sample, and this was documented fairly accurately by health care professionals. Some used subjective words such as 'social drinking' in the alcohol units section of the history. The CAGE questionnaire was applied to only four patients, and none of the patients had documented advice, education or referral to an alcohol liaison team. The AUDIT-C tool identified 30.4%, and CAGE 10.9%, of patients admitted to the NGH MAU as hazardous drinkers. The amount of alcohol the patient consumes correlated positively with the AUDIT-C score (Pearson correlation 0.83). A re-audit is planned after integrating the AUDIT-C tool as labels in the notes and presenting a brief teaching session to junior doctors. Alcohol misuse screening is not adequately undertaken and no appropriate action is being offered to hazardous drinkers. The CAGE questionnaire is poorly applied to patients and, even when satisfactorily and adequately used, has low sensitivity for detecting hazardous drinkers in comparison with the AUDIT-C tool. A re-audit of alcohol screening practice after introducing the AUDIT-C tool in clerking sheets (as labels) is required to compare the findings and conclude the audit cycle.

Keywords: Alcohol screening, AUDIT-C, CAGE, Hazardous drinking.

Downloads: 1864

7321 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity, and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads: 1995

7320 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of using statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads: 2731

7319 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing patients' private data with people other than clinicians or hospital staff is a security problem, so personal identification information has to be removed from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The same identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can take advantage of research feedback based on the results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads: 1593
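
The abstract above describes replacing a patient's identity with one unique number that is reused across all of that patient's records. The sketch below illustrates only that idea on plain dictionaries; it does not cover DICOM, DASTA, HL7 or OCR of burned-in annotations. The class, function, and field names (`Pseudonymizer`, `de_identify`, `name`, `birth_number`) are hypothetical and are not taken from the paper.

```python
import itertools


class Pseudonymizer:
    """Hand out one stable pseudonym per real patient key."""

    def __init__(self):
        self._ids = {}
        self._counter = itertools.count(1)

    def pseudonym(self, patient_key):
        # Reuse the existing pseudonym so that all records of one
        # patient stay linked after de-identification.
        if patient_key not in self._ids:
            self._ids[patient_key] = f"P{next(self._counter):06d}"
        return self._ids[patient_key]


def de_identify(record, pseudonymizer, private_fields=("name", "birth_number")):
    """Return a copy of record without private fields, plus a pseudonym."""
    cleaned = {key: value for key, value in record.items()
               if key not in private_fields}
    cleaned["patient_id"] = pseudonymizer.pseudonym(record["birth_number"])
    return cleaned


# Hypothetical records, for illustration only.
mapper = Pseudonymizer()
scan = de_identify({"name": "A. Patient", "birth_number": "123", "modality": "MR"}, mapper)
report = de_identify({"name": "A. Patient", "birth_number": "123", "finding": "normal"}, mapper)
other = de_identify({"name": "B. Patient", "birth_number": "456", "modality": "CT"}, mapper)
```

In a real deployment the key-to-pseudonym mapping would itself be sensitive and would need to be stored securely or discarded, which this sketch does not address.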

7318 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyzing data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. Data is analyzed for certain purposes. This paper shows which multi-labeled data should be the target of analysis for those purposes, and discusses the role of a label against a set of labels by investigating the change when a label is added to the set of labels. These discussions provide methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multi-labeled Data, Orders of Sets of Labels

Downloads: 1170

7317 Drug Abuse among Immigrant Youth in Canada

Authors: Qin Wei

Abstract:

There has been an increased number of immigrants arriving in Canada and a concurrent rise in the number of immigrant youth suffering from drug abuse. Immigrant youths’ drug abuse has become a significant social and public health concern for researchers. This paper explores the nature of immigrant youths’ drug abuse by examining the factors influencing the onset of substance misuse, the barriers that discourage youth from seeking out treatment, and how to resolve addictions among immigrant youth. Findings demonstrate that diminished parental supervision, acculturation challenges, peer conformity, discrimination, and ethnic marginalization are all significant factors influencing youth to use drugs as an outlet for their pain, while culturally incompetent care and fear of family and culture-based addiction stigma act as barriers discouraging youth from seeking out addiction support. To resolve addiction challenges among immigrant youth, future research should focus on promoting and implementing culturally sensitive practices and psychoeducational initiatives in immigrant communities and within public health policies.

Keywords: Approaches, barriers, drug abuse, Canada, immigrant youth.

Downloads 884

7316 Impact of Social Media on the Functioning of the Indian Government: A Critical Analysis

Authors: Priya Sepaha

Abstract:

Social media has emerged as the most effective tool in recent times to flag the causes, contents, opinions and direction of any social movement and has demonstrated that it will have a far-reaching effect on government as well. This study focuses on India, which has emerged as the fastest growing community on social media. Social movement activists, in particular, have extensively utilized the power of digital social media to increase the effectiveness of social protest on a particular issue through extensive, successful mass mobilizations. This research analyses the role and impact of social media as a power to catalyze social movements in India and further seeks to describe how certain social movements are resisted, subverted, co-opted and/or deployed by social media. The impact assessment has been made with the help of cases, policies and some social movements in which India has witnessed the assertion of numerous social issues perturbing the public, which eventually paved the way for remarkable judicial decisions. The paper concludes with the observation that, despite its pros and cons, the impact of social media on the functioning of the Indian Government has demonstrated that it has already become an indispensable tool in the hands of social media-savvy Indians who are committed to bringing about a desired change.

Keywords: Impact, Indian government, misuse, social media, social movement.

Downloads 922

7315 Steganalysis of Data Hiding via Halftoning and Coordinate Projection

Authors: Woong Hee Kim, Ilhwan Park

Abstract:

Steganography is the art of hiding and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the data. Many steganography algorithms have been proposed recently, and many of them use digital image data as a carrier. In the data hiding scheme based on halftoning and coordinate projection, still image data is used as a carrier, and the carrier image data are modified to embed the data. In this paper, we present three features for the analysis of data hiding via halftoning and coordinate projection. We also present a classifier that uses the three proposed features.

Keywords: Steganography, steganalysis, digital halftoning, data hiding.

Downloads 1553

7314 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need access to an integrated view of remote or local heterogeneous data sources, with advanced tools for accessing, analyzing, and visualizing data. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows how SOA can address the problems facing the integration process and whether biologists can access the biological data more easily. There are several methods to implement SOA, but web services are the most popular. The Microsoft .NET Framework is used to implement the proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.

Downloads 2418

7313 STATISTICA Software: A State of the Art Review

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, P. Ranjetha

Abstract:

Data mining is growing rapidly in both esteem and popularity. The foremost aim of data mining is to extract data from a huge data set into forms that can be understood for further use. Data mining is a technology with rich potential that can support industries and businesses that collect the necessary information from data to discover their customers’ performance. Several methods are available for extracting data, such as classification, clustering, association, discovery, and visualization, each with its own diverse algorithms that attempt to fit an appropriate model to the data. STATISTICA mostly deals with very large groups of data, which impose rigorous computational constraints. The resulting challenges have led to the emergence of powerful STATISTICA Data Mining technologies. This survey presents an overview of the STATISTICA software along with its significant features.

Keywords: Data Mining, STATISTICA Data Miner, Text Miner, Enterprise Server, Classification, Association, Clustering, Regression.

Downloads 2561

7312 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles, e.g., forests, hills, and urban areas, the data collection is carried out in several ways. The collection of data uses connections via wireless communication, LAN network, and GSM network, and in certain areas data are collected by using vehicles. In order to ensure the connection to the server, most of the probes have the ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select a suitable communication channel.

Keywords: Communication, computer network, data collection, probe.

Downloads 1747
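
The abstract above calls for algorithms that let a probe select a suitable communication channel but does not describe one. As one simple possibility, here is a minimal Python sketch of priority-ordered fallback. The `Channel` type, the priority values and the availability checks are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Channel:
    """A hypothetical description of one way a probe can reach the server."""
    name: str
    priority: int                      # lower value = preferred channel
    is_available: Callable[[], bool]   # assumed link check supplied by the probe


def select_channel(channels: Sequence[Channel]) -> Optional[Channel]:
    """Return the most preferred channel that currently reports availability.

    Returns None when no channel is usable, in which case a probe might
    buffer its data until it can be collected, for example by a vehicle.
    """
    for channel in sorted(channels, key=lambda c: c.priority):
        if channel.is_available():
            return channel
    return None


if __name__ == "__main__":
    # Assumed ordering: LAN first, then wireless, then GSM.
    probe_channels = [
        Channel("GSM", 3, lambda: True),
        Channel("LAN", 1, lambda: False),      # e.g. cable unplugged
        Channel("wireless", 2, lambda: True),
    ]
    chosen = select_channel(probe_channels)
    print(chosen.name if chosen else "no channel available")
```

A real selection algorithm would probably also weigh signal quality, cost and security requirements. This sketch only shows the fallback order among the channel types the abstract lists.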