Search results for: data security
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26558

24998 Multi Data Management Systems in a Cluster Randomized Trial in Poor Resource Setting: The Pneumococcal Vaccine Schedules Trial

Authors: Abdoullah Nyassi, Golam Sarwar, Sarra Baldeh, Mamadou S. K. Jallow, Bai Lamin Dondeh, Isaac Osei, Grant A. Mackenzie

Abstract:

A randomized controlled trial is the "gold standard" for evaluating the efficacy of an intervention. Large-scale, cluster-randomized trials are expensive and difficult to conduct, though. To guarantee the validity and generalizability of findings, high-quality, dependable, and accurate data management systems are necessary. Robust data management systems are crucial for optimizing and validating the quality, accuracy, and dependability of trial data. There is a scarcity of literature on the difficulties of data gathering in clinical trials in low-resource areas, which may raise concerns. Effective data management systems and implementation goals should be part of trial procedures. Publicizing the creative clinical data management techniques used in clinical trials should boost public confidence in the study's conclusions and encourage further replication. This report details the development and deployment of multi-data management systems and methodologies in the ongoing pneumococcal vaccine schedule study in rural Gambia. We implemented six different data management, synchronization, and reporting systems using Microsoft Access, REDCap, SQL, Visual Basic, Ruby, and ASP.NET. Additionally, data synchronization tools were developed to integrate data from these systems into the central server for reporting systems. Clinician, lab, and field data validation systems and methodologies are the main topics of this report. Our process development efforts across all domains were driven by the complexity of research project data collected in real time, online reporting, data synchronization, and methods for cleaning and verifying data. Consequently, we effectively used multi-data management systems, demonstrating the value of creative approaches in enhancing the consistency, accuracy, and reporting of trial data in a poor resource setting.

Keywords: data management, data collection, data cleaning, cluster-randomized trial

Procedia PDF Downloads 15
24997 The Role of Regional Economic Communities in Fighting Terrorism in Africa: The Case of Inter-Governmental Authority on Development (IGAD)

Authors: Memar Ayalew Demeke, Solomon Gebreyohans Gebru

Abstract:

In Africa, Regional Economic Communities (RECs) were initially established to tackle the economic challenges of the continent. Over time, however, they expanded their mandate to deal with security threats to the continent such as terrorism. In fact, the fight against terrorism has been internationalized following the September 11 (9/11) terrorist attacks in the U.S.A. Since then, RECs have been giving considerable attention to preventing and combating terrorism in their respective regions. Similarly, IGAD has been involved in preventing and combating terrorism. So far, however, little research has been done on what IGAD has accomplished in fighting terrorism. Therefore, this study was intended to describe and analyze the legal and practical activities carried out by IGAD in its fight against terrorism in the region in general and in Somalia in particular. Both descriptive and analytical methods were employed, and data were analyzed through a qualitative approach. Finally, based on the findings, the study argues that, instead of over-relying on hard power as a means of fighting terrorism, IGAD should invest more in the political and socio-economic problems of its member states so as to address the root causes.

Keywords: regional economic communities, IGAD, terrorism, treaties, conventions

Procedia PDF Downloads 419
24996 Adult Child Labour Migration and Elderly Parent Health: Recent Evidence from Indonesian Panel Data

Authors: Alfiah Hasanah, Silvia Mendolia, Oleg Yerokhin

Abstract:

This paper explores the impacts of adult child migration on the health of elderly parents left behind. Maternal and child health are a priority of health-related policy in most low- and middle-income countries, so there is a lack of evidence on the health of the older population, particularly in Indonesia. With increasing life expectancy and limited access to social security and social services for the elderly in this country, the consequences of the increasing out-migration of adult children for parent health are important to investigate. This study uses the Indonesia Family Life Survey (IFLS), the only large-scale continuing longitudinal socioeconomic and health survey, which is based on a sample of households representing about 83 percent of the Indonesian population in its first wave. Using four waves of the IFLS, including the recent wave of 2014, several indicators (self-rated health status, interviewer-rated health status and days of illness) are used to estimate the impact of labour out-migration of adult children on parent health status. To incorporate both individual fixed effects, which control for unobservable factors in migrant and non-migrant households, and the ordered response of self-rated health, this study applies the “Blow-up and Cluster” (BUC) estimator for the ordered logit model. The results show that labour out-migration of adult children significantly improves the self-rated health status of the elderly parents left behind. The findings of this study are consistent with the view that migration increases family resources and contributes to better health care and nutrition for the family left behind.

Keywords: aging, migration, panel data, self-rated health

Procedia PDF Downloads 348
24995 Finding Bicluster on Gene Expression Data of Lymphoma Based on Singular Value Decomposition and Hierarchical Clustering

Authors: Alhadi Bustaman, Soeganda Formalidin, Titin Siswantining

Abstract:

DNA microarray technology is used to analyze thousands of gene expression data simultaneously and is very important for drug development and testing, function annotation, and cancer diagnosis. Various clustering methods have been used for analyzing gene expression data. However, when analyzing very large and heterogeneous collections of gene expression data, conventional clustering methods often cannot produce a satisfactory solution. Biclustering algorithms have been used as an alternative approach to identifying structures in gene expression data. In this paper, we introduce a transform technique based on singular value decomposition to identify a normalized matrix of gene expression data, followed by the Mixed-Clustering algorithm and the Lift algorithm, inspired by the node-deletion and node-addition phases proposed by Cheng and Church and based on Agglomerative Hierarchical Clustering (AHC). An experimental study on standard datasets demonstrated the effectiveness of the algorithm on gene expression data.
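For readers unfamiliar with the Cheng and Church framework cited in this abstract, the following minimal Python sketch computes the mean squared residue, the coherence score that their node-deletion and node-addition phases are designed to reduce. The abstract does not state which score the authors use, so the function name `mean_squared_residue`, the toy matrix, and the choice of this score are illustrative assumptions rather than the paper's method.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix (rows, cols).

    A value near zero means the selected genes vary coherently
    (up to additive shifts) across the selected conditions.
    """
    n_r, n_c = len(rows), len(cols)
    # Mean of each selected row over the selected columns, and vice versa.
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_r for j in cols}
    all_mean = sum(row_mean[i] for i in rows) / n_r
    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue ** 2
    return total / (n_r * n_c)


# Hypothetical toy expression matrix (genes x conditions), not real data.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
]
# The first three conditions form an additively coherent bicluster.
coherent = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2])
# Adding the fourth condition breaks the pattern and raises the score.
noisy = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2, 3])
print(coherent, noisy)
```

In a node-deletion step, the row or column whose removal lowers this score the most would be dropped; that loop is omitted here to keep the sketch short.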

Keywords: agglomerative hierarchical clustering (AHC), biclustering, gene expression data, lymphoma, singular value decomposition (SVD)

Procedia PDF Downloads 273
24994 An Efficient Traceability Mechanism in the Audited Cloud Data Storage

Authors: Ramya P, Lino Abraham Varghese, S. Bose

Abstract:

With cloud storage services, data can be stored in the cloud and shared across multiple users. Unexpected hardware or software failures and human errors can easily cause data stored in the cloud to be lost or corrupted, which affects the integrity of cloud data. Some mechanisms have been designed to allow both data owners and public verifiers to efficiently audit cloud data integrity without retrieving the entire data from the cloud server. However, public auditing of the integrity of shared data with the existing mechanisms will unavoidably reveal confidential information, such as the identity of the person, to public verifiers. Here, a privacy-preserving mechanism is proposed to support public auditing of shared data stored in the cloud. It uses group signatures to compute the verification metadata needed to audit the correctness of shared data. The identity of the signer of each block in the shared data is kept confidential from public verifiers, who can still easily verify shared data integrity without retrieving the entire file. On demand, however, the signer of each block is revealed to the owner alone. In a static group, the group private key is generated once by the owner, whereas in a dynamic group, the group private key is changed when users are revoked from the group. When users leave the group, the blocks they have already signed are re-signed by the cloud service provider instead of the owner, which is handled efficiently by a proxy re-signature scheme.

Keywords: data integrity, dynamic group, group signature, public auditing

Procedia PDF Downloads 387
24993 Measuring Social Dimension of Sustainable Development in New Zealand Cities

Authors: Taimaz Larimian

Abstract:

During recent years, sustainable development has increasingly influenced urban policy, housing and planning in cities all over the world. Debates about sustainability no longer consider it solely as an environmental concern, but also incorporate social and economic dimensions. However, while a social dimension of sustainability is widely accepted, the exact definition of the concept is still vague and unclear. This study addresses this lack of specificity through a detailed exploration of social sustainability as the least studied pillar of sustainable development, and sheds light on the debate over its definition by developing a measurement model of the constitutive dimensions of the concept. With this aim, a conceptual framework is developed based on the existing literature, determining seven main dimensions of the social sustainability concept, namely: social interaction, safety and security, social equity, social participation, neighborhood satisfaction, housing satisfaction and sense of place. The validity and reliability of the model are then tested using exploratory and confirmatory factor analysis. To do so, five case study neighborhoods from Dunedin city with a range of urban forms and characters are investigated, to define the social sustainability concept and its constituent dimensions from people’s perspective. The findings of this study present a clear definition of social sustainability at neighborhood scale and highlight all the different dimensions of the concept in the context of New Zealand cities. According to the results, among the investigated dimensions, neighborhood satisfaction and safety and security had the most influence on people’s feeling of social sustainability in their neighborhood.

Keywords: social sustainability, factor analysis, neighborhood level, New Zealand cities

Procedia PDF Downloads 294
24992 A Short Survey of Integrating Urban Agriculture and Environmental Planning

Authors: Rayeheh Khatami, Toktam Hanaei, Mohammad Reza Mansouri Daneshvar

Abstract:

The growth of the agricultural sector is known as an essential way to achieve development goals in developing countries. Urban agriculture is a way to reduce the vulnerability of the world's urban populations to global environmental change. It is a sustainable and efficient system for responding to the environmental, social and economic needs of the city, which leads to urban sustainability. Today, many local and national governments are developing urban agriculture as an effective tool for responding to challenges such as poverty, food security, and environmental problems. In this study, we follow a perspective based on the urban agriculture literature to indicate the benefits of urban agriculture in environmental planning strategies in non-western countries like Iran. The methodology adopted is based on a qualitative approach and documentary studies. A total of 35 articles (mixed quantitative and qualitative methods studies), published in relevant journals that focus on this subject, were included in the final analysis. The studies show a wide range of positive effects of urban agriculture on food security, nutrition outcomes, health outcomes, environmental outcomes, and social capital. However, there was no definitive conclusion about the negative effects of urban agriculture. This paper provides a conceptual and theoretical basis for understanding urban agriculture and its roles in environmental planning, and summarizes the benefits of urban agriculture for researchers, practitioners, and policymakers who seek to create spaces in cities for implementing urban agriculture in the future.

Keywords: urban agriculture, environmental planning, urban planning, literature

Procedia PDF Downloads 137
24991 Rodriguez Diego, Del Valle Martin, Hargreaves Matias, Riveros Jose Luis

Authors: Nathainail Bashir, Neil Anderson

Abstract:

The objective of this study was to investigate the current state of practice with regard to karst detection methods and to recommend the best method and array pattern for acquiring the desired results. Proper site investigation in karst-prone regions is extremely valuable in determining the location of possible voids. Two geophysical techniques were employed: multichannel analysis of surface waves (MASW) and electric resistivity tomography (ERT). The MASW data were acquired at each test location using different array lengths and different array orientations (to increase the probability of getting interpretable data in karst terrain). The ERT data were acquired using a dipole-dipole array consisting of 168 electrodes. The MASW data were interpreted (re: estimated depth to physical top of rock) and used to constrain and verify the interpretation of the ERT data. The ERT data indicate that poorer quality MASW data were acquired in areas where there was significant local variation in the depth to top of rock.

Keywords: dipole-dipole, ERT, Karst terrains, MASW

Procedia PDF Downloads 314
24990 Data Science in Military Decision-Making: A Semi-Systematic Literature Review

Authors: H. W. Meerveld, R. H. A. Lindelauf

Abstract:

In contemporary warfare, data science is crucial for the military in achieving information superiority. Yet, to the authors’ knowledge, no extensive literature survey on data science in military decision-making has been conducted so far. In this study, 156 peer-reviewed articles were analysed through an integrative, semi-systematic literature review to gain an overview of the topic. The study examined to what extent literature is focussed on the opportunities or risks of data science in military decision-making, differentiated per level of war (i.e. strategic, operational, and tactical level). A relatively large focus on the risks of data science was observed in social science literature, implying that political and military policymakers are disproportionally influenced by a pessimistic view on the application of data science in the military domain. The perceived risks of data science are, however, hardly addressed in formal science literature. This means that the concerns on the military application of data science are not addressed to the audience that can actually develop and enhance data science models and algorithms. Cross-disciplinary research on both the opportunities and risks of military data science can address the observed research gaps. Considering the levels of war, relatively low attention for the operational level compared to the other two levels was observed, suggesting a research gap with reference to military operational data science. Opportunities for military data science mostly arise at the tactical level. On the contrary, studies examining strategic issues mostly emphasise the risks of military data science. Consequently, domain-specific requirements for military strategic data science applications are hardly expressed. Lacking such applications may ultimately lead to a suboptimal strategic decision in today’s warfare.

Keywords: data science, decision-making, information superiority, literature review, military

Procedia PDF Downloads 156
24989 Wavelets Contribution on Textual Data Analysis

Authors: Habiba Ben Abdessalem

Abstract:

The emergence of giant sets of textual data has encouraged researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to this type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the list of forms contained in the contingency table by keeping only the information-carrying forms. It may, however, lead to noisy contingency tables, hence the use of a wavelet denoising function. The validity of the proposed approach is tested on a text database covering economic and political events in Tunisia over a well-defined period.

Keywords: textual data, wavelet, denoising, contingency table

Procedia PDF Downloads 275
24988 Customer Churn Analysis in Telecommunication Industry Using Data Mining Approach

Authors: Burcu Oralhan, Zeki Oralhan, Nilsun Sariyer, Kumru Uyar

Abstract:

Data mining has become more and more important and has found a wide range of applications in recent years. Data mining is the process of finding hidden and unknown patterns in big data. One of the applied fields of data mining is Customer Relationship Management. Understanding the relationships between products and customers is crucial for every business. Customer Relationship Management is an approach that focuses on developing customer relationships, retaining customers, and increasing customer satisfaction. In this study, we applied data mining methods to customer relationship management in telecommunications. This study aims to determine the profile of customers who are likely to leave the system and to develop marketing strategies and customized campaigns for customers. Data are grouped by applying classification techniques in order to identify the churners. As a result of this study, we will obtain knowledge from the international telecommunication industry. We will contribute to the understanding and development of this subject in Customer Relationship Management.

Keywords: customer churn analysis, customer relationship management, data mining, telecommunication industry

Procedia PDF Downloads 311
24987 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or individual patient data (IPD). In practice, studies may be available at both IPD and AD level. In this situation, both the IPD and AD should be utilised in order to maximize the available information. The statistical advantages of combining studies from different levels have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analyses were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD only, AD only and the combination of IPD and AD (mixed data, MD), under different study scenarios. The percentage relative bias (PRB), root mean-square error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary-level data, as they would significantly increase the accuracy of the estimates. On the other hand, if more than 80% of the available data are at IPD level, including the AD does not provide significant differences in terms of accuracy of the estimates. Additionally, combining the IPD and AD has a moderating effect on the bias of the estimates of the treatment effects, as the IPD tend to overestimate the treatment effects while the AD tend to produce underestimated effect estimates. These results may provide some guidance in deciding whether significant benefit is gained by pooling the two levels of data when conducting a meta-analysis.
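The three performance measures named in this abstract can be sketched in a few lines of Python. The abstract gives no formulas, so the definitions below are the conventional ones for simulation studies and are an assumption about what the authors computed; the function names and the numbers are hypothetical.

```python
import math


def percentage_relative_bias(estimates, true_value):
    """PRB: 100 * (mean estimate - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value


def rmse(estimates, true_value):
    """Root mean-square error of the estimates around the true value."""
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def coverage_probability(intervals, true_value):
    """Fraction of confidence intervals that contain the true value."""
    hits = sum(1 for low, high in intervals if low <= true_value <= high)
    return hits / len(intervals)


# Hypothetical results from four simulated meta-analyses.
true_effect = 2.0
estimates = [1.8, 2.1, 2.3, 2.2]
intervals = [(1.5, 2.1), (1.7, 2.5), (2.05, 2.55), (1.9, 2.5)]

prb = percentage_relative_bias(estimates, true_effect)
error = rmse(estimates, true_effect)
coverage = coverage_probability(intervals, true_effect)
print(prb, error, coverage)
```

A positive PRB indicates overestimation of the treatment effect and a negative PRB indicates underestimation, which is the sense in which the abstract contrasts IPD and AD.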

Keywords: aggregate data, combined-level data, individual patient data, meta-analysis

Procedia PDF Downloads 371
24986 A Deep Learning Approach to Online Social Network Account Compromisation

Authors: Edward K. Boahen, Brunel E. Bouya-Moko, Changda Wang

Abstract:

The major threat to online social network (OSN) users is account compromisation. Spammers now spread malicious messages by exploiting the trust relationship established between account owners and their friends. The challenge in detecting a compromised account by service providers is validating the trusted relationship established between the account owners, their friends, and the spammers. Another challenge is the increase in required human interaction with the feature selection. Research available on supervised learning (machine learning) has limitations with the feature selection and accounts that cannot be profiled, like application programming interface (API). Therefore, this paper discusses the various behaviours of the OSN users and the current approaches in detecting a compromised OSN account, emphasizing their limitations and challenges. We propose a deep learning approach that addresses and resolves the constraints faced by the previous schemes. We detail our proposed optimized nonsymmetric deep auto-encoder (OPT_NDAE) for unsupervised feature learning, which reduces the required human interaction levels in the selection and extraction of features. We evaluated our proposed classifier using the NSL-KDD and KDDCUP'99 datasets in a graphical user interface enabled Weka application. The results obtained indicate that our proposed approach outperformed most of the traditional schemes in OSN compromised account detection with an accuracy rate of 99.86%.

Keywords: computer security, network security, online social network, account compromisation

Procedia PDF Downloads 112
24985 Analyzing On-Line Process Data for Industrial Production Quality Control

Authors: Hyun-Woo Cho

Abstract:

The monitoring of industrial production quality has to be implemented to give early warning of unusual operating conditions. Furthermore, identification of their assignable causes is necessary for quality control purposes. For such tasks, many multivariate statistical techniques have been applied and shown to be quite effective tools. This work presents a process data-based monitoring scheme for production processes. For more reliable results, some additional steps of noise filtering and preprocessing are considered. These may lead to enhanced performance by eliminating unwanted variation in the data. The performance evaluation is executed using data sets from test processes. The proposed method is shown to provide reliable quality control results, and thus is more effective in quality monitoring in the example. For practical implementation of the method, an on-line data system must be available to gather historical and on-line data. Since large amounts of data are now collected on-line in most processes, implementation of the current scheme is feasible and does not impose additional burdens on users.

Keywords: detection, filtering, monitoring, process data

Procedia PDF Downloads 552
24984 A Review of Travel Data Collection Methods

Authors: Muhammad Awais Shafique, Eiji Hato

Abstract:

Household trip data is of crucial importance for managing present transportation infrastructure as well as for planning and designing future facilities. It also provides a basis for new policies implemented under Transportation Demand Management. The methods used for household trip data collection have changed with the passage of time, starting with conventional face-to-face or paper-and-pencil interviews and reaching the recent approach of employing smartphones. This study summarizes the step-wise evolution of travel data collection methods. It provides a comprehensive review of the topic for readers interested in the changing trends in the data collection field.

Keywords: computer, smartphone, telephone, travel survey

Procedia PDF Downloads 307
24983 A Business-to-Business Collaboration System That Promotes Data Utilization While Encrypting Information on the Blockchain

Authors: Hiroaki Nasu, Ryota Miyamoto, Yuta Kodera, Yasuyuki Nogami

Abstract:

To promote initiatives such as Industry 4.0 and Society 5.0, it is important to connect and share data so that every member can trust it. Blockchain (BC) technology is currently attracting attention as the most advanced tool and has been used in fields such as finance. However, data collaboration using BC has not progressed sufficiently among companies in the manufacturing supply chain that handle sensitive data such as product quality and manufacturing conditions. There are two main reasons why data utilization is not sufficiently advanced in the industrial supply chain. The first reason is that manufacturing information is top secret and a source of profit for companies. It is difficult to disclose data even between companies with transactions in the supply chain. In a blockchain mechanism such as Bitcoin using PKI (Public Key Infrastructure), the plaintext must be shared between the companies in order to confirm the identity of the company that has sent the data. The second reason is that the merits (scenarios) of data collaboration between companies are not concretely specified in the industrial supply chain. To address these problems, this paper proposes a Business to Business (B2B) collaboration system using homomorphic encryption and BC techniques. Using the proposed system, each company in the supply chain can exchange confidential information as encrypted data and utilize the data for its own business. In addition, this paper considers a scenario focusing on quality data, which has been difficult to share because it is top secret. In this scenario, we show an implementation scheme and the benefit of concrete data collaboration by proposing a comparison protocol that can detect changes in quality while hiding the numerical values of the quality data.

Keywords: business to business data collaboration, industrial supply chain, blockchain, homomorphic encryption

Procedia PDF Downloads 132
24982 Multivariate Assessment of Mathematics Test Scores of Students in Qatar

Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski

Abstract:

Data on various aspects of education are collected at the institutional and government level regularly. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data collected internationally is via the PISA study which collects data from several countries when students are approximately 15 years of age and enables comparisons in the performance of science, mathematics and English between countries as well as ranking of countries based on performance in these standardised tests. As well as student and school outcomes based on the tests taken as part of the PISA study, there is a wealth of other data collected in the study including parental demographics data and data related to teaching strategies used by educators. Overall, an abundance of educational data is available which has the potential to be used to help improve educational attainment and teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics using data obtained from the PISA study.

Keywords: cluster analysis, education, mathematics, profiles

Procedia PDF Downloads 122
24981 Composite Approach to Extremism and Terrorism Web Content Classification

Authors: Kolade Olawande Owoeye, George Weir

Abstract:

Terrorism and extremism activities on the internet are becoming the most significant threats to national security because of their potential dangers. In response to this challenge, law enforcement and security authorities are actively implementing comprehensive measures to counter the use of the internet for terrorism. To achieve these measures, there is a need for intelligence gathering via the internet. This includes real-time monitoring of potential websites that are used for recruitment and information dissemination, among other operations, by extremist groups. However, with billions of active webpages, real-time monitoring of all webpages becomes almost impossible. To narrow down the search domain, there is a need for efficient webpage classification techniques. This research proposed a new approach, termed the SentiPosit-based method. The SentiPosit-based method combines features of the Posit-based method and the SentiStrength-based method for classification of terrorism and extremism webpages. The experiment was carried out on 7500 webpages obtained through TENE-webcrawler by the International Cyber Crime Research Centre (ICCRC). The webpages were manually grouped into three classes, ‘pro-extremist’, ‘anti-extremist’ and ‘neutral’, with 2500 webpages in each category. A supervised learning algorithm was then applied to the classified dataset in order to build the model. The results obtained were compared with existing classification methods in terms of prediction accuracy and runtime. It was observed that our proposed hybrid approach produced better classification accuracy than existing approaches within a reasonable runtime.

Keywords: sentiposit, classification, extremism, terrorism

Procedia PDF Downloads 274
24980 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Nowadays, poor data quality is considered one of the majority costs for a data project. The data project with data quality awareness almost as much time to data quality processes while data project without data quality awareness negatively impacts financial resources, efficiency, productivity, and credibility. One of the processes that take a long time is defining the expectations and measurements of data quality because the expectation is different up to the purpose of each data project. Especially, big data project that maybe involves with many datasets and stakeholders, that take a long time to discuss and define quality expectations and measurements. Therefore, this study aimed at developing meaningful indicators to describe overall data quality for each dataset to quick comparison and priority. The objectives of this study were to: (1) Develop a practical data quality indicators and measurements, (2) Develop data quality dimensions based on statistical characteristics and (3) Develop Composite Indicator that can describe overall data quality for each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After datasets were collected, there are five steps to develop the Dataset Quality Index (SDQI). First, we define standard data quality expectations. Second, we find any indicators that can measure directly to data within datasets. Thirdly, each indicator aggregates to dimension using factor analysis. Next, the indicators and dimensions were weighted by an effort for data preparing process and usability. Finally, the dimensions aggregate to Composite Indicator. The results of these analyses showed that: (1) The developed useful indicators and measurements contained ten indicators. (2) the developed data quality dimension based on statistical characteristics, we found that ten indicators can be reduced to 4 dimensions. 
(3) for the developed Composite Indicator, the SDQI can describe the overall quality of each dataset and can separate datasets into three levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets with a meaningful composition. The SDQI can be used to assess all data in a data project, to estimate effort, and to set priorities. The SDQI also works well with Agile methods when it is used for assessment in the first sprint. After the initial evaluation is passed, more specific data quality indicators can be added in the next sprint.
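As an illustration of the final aggregation step only, the following Python sketch combines per-dimension scores into one index and maps it to the three quality levels. The dimension names, the weights, and the cut-off values are hypothetical; the abstract does not report them, and this is not the authors' implementation.

```python
from typing import Dict

# Hypothetical cut-offs: the abstract does not say where the levels are separated.
GOOD_THRESHOLD = 0.8
ACCEPTABLE_THRESHOLD = 0.5


def composite_index(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of dimension scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def quality_level(index: float) -> str:
    """Map an index in [0, 1] to one of the three levels named in the abstract."""
    if index >= GOOD_THRESHOLD:
        return "Good Quality"
    if index >= ACCEPTABLE_THRESHOLD:
        return "Acceptable Quality"
    return "Poor Quality"


# Hypothetical dimension names and equal weights, for illustration only.
example_scores = {"completeness": 1.0, "validity": 0.5, "consistency": 0.5, "uniqueness": 1.0}
example_weights = {name: 1.0 for name in example_scores}
example_index = composite_index(example_scores, example_weights)
```

With equal weights the index reduces to a plain mean; unequal weights would reflect the effort and usability weighting that the abstract mentions.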

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 133
24979 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis when faced with a large amount of data. In fact, because of the volume, nature (semi-structured or unstructured), and variety of big data, it is impossible to analyze it efficiently with classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm Classification And Regression Trees (CART) in order to adapt it for big data analysis. The major changes to this algorithm are presented, and then a version of the extended algorithm is defined in order to make it applicable to a huge quantity of data.
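To make the parallelization point concrete, here is a minimal Python sketch of one way a CART-style split search can be distributed: each data partition computes class counts per candidate threshold, the counts are summed, and the Gini-based choice is made on the merged counts. This is a generic illustration under our own assumptions (a single numeric feature, a fixed list of candidate thresholds, hypothetical function names), not the extension proposed by the authors.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple

Row = Tuple[float, Hashable]                    # (feature value, class label)
SplitStats = Dict[float, Tuple[Counter, Counter]]


def gini(counts: Counter) -> float:
    """Gini impurity of a class-count table."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def partition_stats(rows: Sequence[Row], thresholds: Sequence[float]) -> SplitStats:
    """Per-threshold class counts for one partition: (left: x <= t, right: x > t)."""
    stats: SplitStats = {t: (Counter(), Counter()) for t in thresholds}
    for x, label in rows:
        for t in thresholds:
            side = 0 if x <= t else 1
            stats[t][side][label] += 1
    return stats


def merge_stats(a: SplitStats, b: SplitStats) -> SplitStats:
    """Counts are additive, so partial results from partitions can simply be summed."""
    return {t: (a[t][0] + b[t][0], a[t][1] + b[t][1]) for t in a}


def best_split(stats: SplitStats) -> float:
    """Threshold with the lowest weighted Gini impurity (assumes at least one row)."""
    def weighted_gini(t: float) -> float:
        left, right = stats[t]
        n_left, n_right = sum(left.values()), sum(right.values())
        return (n_left * gini(left) + n_right * gini(right)) / (n_left + n_right)
    return min(stats, key=weighted_gini)


# Two hypothetical partitions of the same dataset.
part_a = [(1.0, "a"), (2.0, "a")]
part_b = [(3.0, "b"), (4.0, "b")]
candidates = [1.5, 2.5, 3.5]
merged = merge_stats(partition_stats(part_a, candidates), partition_stats(part_b, candidates))
```

Because the merged counts equal the counts computed on the whole dataset, the chosen split is the same as in the sequential search over these candidates.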

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 136
24978 Canopy Temperature Acquired from Daytime and Nighttime Aerial Data as an Indicator of Trees’ Health Status

Authors: Agata Zakrzewska, Dominik Kopeć, Adrian Ochtyra

Abstract:

The growing number of new cameras, sensors, and research methods allows for a broader application of thermal data in remote sensing vegetation studies. The aim of this research was to check whether thermal infrared data in the 3.6-4.9 μm spectral range, obtained during the day and at night, can be used to assess the health condition of selected species of deciduous trees in an urban environment. For this purpose, research was carried out in the city center of Warsaw (Poland) in 2020. During the airborne data acquisition, thermal data, laser scanning, and orthophoto map images were collected. Synchronously with the airborne data, ground reference data were obtained for 617 trees of the studied species (Acer platanoides, Acer pseudoplatanus, Aesculus hippocastanum, Tilia cordata, and Tilia × euchlora) in different health condition states. The results were as follows: (i) healthy trees are cooler than trees in poor condition and dying trees in both the daytime and nighttime data; (ii) the mean difference in canopy temperature between healthy and dying trees was 1.06 °C in the nighttime data and 3.28 °C in the daytime data; (iii) condition classes differ significantly in both daytime and nighttime thermal data, but only in the daytime data did all condition classes differ statistically significantly from each other. In conclusion, aerial thermal data can be considered an alternative to hyperspectral data as a method of assessing the health condition of trees in an urban environment, especially data obtained during the day, which can differentiate condition classes better than data obtained at night. The method based on thermal infrared and laser scanning data fusion could be a quick and efficient solution for identifying trees in poor health that should be visually checked in the field.

Keywords: middle wave infrared, thermal imagery, tree discoloration, urban trees

Procedia PDF Downloads 110
24977 Gendered Water Insecurity: A Structural Equation Approach for Female-Headed Households in South Africa

Authors: Saul Ngarava, Leocadia Zhou, Nomakhaya Monde

Abstract:

Water crises have the fourth most significant societal impact after weapons of mass destruction, climate change, and extreme weather conditions, ahead of natural disasters. Intricacies between women and water are central to achieving the 2030 Sustainable Development Goals (SDGs). The majority of the 1.2 billion poor people worldwide, two-thirds of whom are women and most of whom live in Sub-Saharan Africa (SSA) and South Asia, do not have access to safe and reliable sources of water. There are gendered differences in water security based on the division of labour that associates women with water. Globally, women and girls are responsible for water collection in 80% of the households that have no water on their premises. Women spend 16 million hours a day collecting water, while men and children spend 6 million and 4 million hours per day, respectively, which is time foregone in the pursuit of other livelihood activities. Due to their proximity to and activities concerning water, women are vulnerable to water insecurity through exposure to water-borne diseases, fatigue from physically carrying water, and exposure to sexual and physical harassment, amongst others. Proximity to treated water and their wellbeing also have an effect on their sensitivity and adaptive capacity to water insecurity. The great distances, difficult terrain, and heavy lifting expose women to vulnerabilities of water insecurity. However, few studies have quantified the vulnerabilities and burdens on women, with a few taking a phenomenological qualitative approach. Vulnerability studies have also been scanty in the water security realm, with most studies taking linear forms of quantifying either exposures, sensitivities, or adaptive capacities in climate change studies. The current study argues that a water insecurity vulnerability assessment, especially for women, needs to be incorporated into research agendas as well as into policy interventions, monitoring, and evaluation.
The study sought to identify and provide pathways through which female-headed households were water insecure in South Africa, the 30th driest country in the world. This was done by linking the drinking water decision framework with the vulnerability framework. Secondary data collected during the 2016 General Household Survey (GHS) was utilised, with a sample of 5928 female-headed households. Principal Component Analysis and Structural Equation Modelling were used to analyse the data. The results show dynamic relationships between water characteristics and water treatment. There were also associations between water access and the wealth status of the female-headed households. Associations were also found between water access and water treatment as well as between wealth status and water treatment. The study concludes that there are dynamic relationships in water insecurity (exposure, sensitivity, and adaptive capacity) for female-headed households in South Africa. The study recommends a multi-pronged approach to tackling exposures, sensitivities, and adaptive capacities to water insecurity. This should include capacitating and empowering women for wealth generation, improving access to water treatment equipment, and prioritising the improvement of infrastructure that brings piped and safe water to female-headed households.

Keywords: gender, principal component analysis, structural equation modelling, vulnerability, water insecurity

Procedia PDF Downloads 117
24976 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is the process of grouping objects and data into clusters so that data objects from the same cluster are similar to each other. Clustering is one of the areas of data mining, and its algorithms can be classified into partitioning, hierarchical, density-based, and grid-based methods. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting state of the art of these algorithms will help in eliminating the current problems, as well as in deriving more robust and scalable algorithms for clustering.
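For readers unfamiliar with the hierarchical family, the following Python sketch shows the basic agglomerative scheme that such algorithms refine: start with one cluster per point and repeatedly merge the two closest clusters. It uses plain single linkage on 2-D points and is deliberately naive; it is not CURE, ROCK, CHAMELEON, or BIRCH, and the function names are our own.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def single_linkage(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Distance between two clusters: the smallest pairwise Euclidean distance."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerate(points: Sequence[Point], k: int) -> List[List[Point]]:
    """Merge the two closest clusters until only k clusters remain."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    clusters: List[List[Point]] = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, index i, index j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


example_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
example_clusters = agglomerate(example_points, 2)
```

The repeated all-pairs search is what makes this naive version expensive on large datasets, which is the scalability problem the surveyed algorithms address in different ways.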

Keywords: clustering, unsupervised learning, algorithms, hierarchical

Procedia PDF Downloads 881
24975 The Situation of Transgender Individuals Was Worsened During Covid-19

Authors: Kajal Attri

Abstract:

Introduction: Transgender people are recognized as a third gender in India, yet they still face identification issues and are alienated from society. Furthermore, they face several challenges, including discrimination in employment, resources, education, and property. As a result, most transgender people make a living through begging at traffic lights and on trains and buses; attending auspicious occasions such as childbirth and weddings; and engaging in sex work, which includes both home-based and street-based sex work. During COVID-19, social distancing exacerbated transgender people's circumstances and prevented them from accessing health care services, sex reassignment surgery, identity-based resources, government security, and financial stability. In addition, the pandemic raised unfavorable attitudes about transgender persons, such as unsupportive family members and trouble forming emotional relationships. This study focuses on how transgender people were overlooked during COVID-19 when it came to providing better facilities to cope with the situation, even though they are already the most vulnerable segment of society. Methodology: The research was conducted using secondary data from published literature and grey literature obtained from four databases: PubMed, PsycINFO, ScienceDirect, and Google Scholar. A total of 25 articles met the inclusion criteria for review. Result and Discussion: Transgender people, who are considered the most vulnerable sector of society, have faced several obstacles as a result of the outbreak. The analysis underscores the difficulties that transgender persons faced during COVID-19. For example, because they lacked the necessary identity cards, they had trouble accessing the government's social security programmes that provided rations and pensions during the lockdown.
The impact of COVID-19 leaves transgender people at heightened risk of poverty and ill health because they exist on the margins of society, with livelihoods based on sex work, begging, and participation in auspicious occasions. They had a significant risk of contracting SARS-CoV-2 because they lived in congested areas or did not have permanent shelter, and many were living with HIV, cancer, or other non-communicable illnesses. The pandemic raised unfavorable attitudes about transgender persons, such as unsupportive family members and trouble forming emotional relationships. Conclusion: Based on content analysis, the study offers useful suggestions and information for reducing the hardships of transgender people during any pandemic like COVID-19.

Keywords: COVID-19, transgender, lockdown, transwomen, stigmatization

Procedia PDF Downloads 73
24974 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases when entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets that need analyzing become ever larger. One type of symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, analyzing histograms is complicated. This paper proposes a method that allows two histogram-valued variables to be compared and therefore finds a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows histograms with different numbers of bins and different bin widths (so-called general histograms) to be compared. The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, with the concept of cluster compactness used to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 157
24973 WiFi Data Offloading: Bundling Method in a Canvas Business Model

Authors: Majid Mokhtarnia, Alireza Amini

Abstract:

Mobile operators face the increase in data traffic as a critical issue. As a result, a vital responsibility of operators is to deal with this trend in order to create added value. This paper addresses a bundling method in a Canvas business model within a WiFi Data Offloading (WDO) strategy, by which some elements of the model may be affected. In the proposed method, a number of data packages are sold to subscribers, some of which include a given volume of complimentary WiFi-offloaded data. The paper analyses this method in terms of attractiveness and profitability. The results demonstrate that the quality of implementation of the WDO strongly affects the final result, and they help the decision maker to make the best decision.

Keywords: bundling, canvas business model, telecommunication, WiFi data offloading

Procedia PDF Downloads 193
24972 Research on Autonomous Controllability of BeiDou Navigation Satellite System Based on Knowledge Transformation

Authors: Hang Ju, Changmin Zhu

Abstract:

As an important spatial information infrastructure, the BeiDou Navigation Satellite System (BDS) strongly reflects national defense strength through its level of development. BDS can be used not only for military purposes, such as intelligence gathering, nuclear explosion monitoring, and emergency communications, but also for location services, transportation, mapping, and precision agriculture. In order to ensure national defense security and the wide application of BDS in civil and military areas, BDS must be autonomous and controllable. Because BDS is a complex, knowledge-intensive system, knowledge transformation runs through the whole process of its research and development, production, operation, and maintenance. From the perspective of knowledge transformation, and on the basis of an analysis of the status quo and problems of the autonomy and control of BDS, this paper expounds on the meaning of socialization, externalization, combination, and internalization in knowledge transformation and on the coupling relationship between autonomy and control. An autonomous and controllable framework of BDS based on knowledge transformation is constructed from six dimensions: management capability, R&D capability, technical capability, manufacturing capability, service support capability, and application capability. It can provide support for the smooth implementation of information security policy, a reference for the autonomy and control of the upstream and downstream industrial chains of BeiDou, and a reference for autonomous and controllable research on aerospace components, military measurement and test equipment, and other related industries.

Keywords: knowledge transformation, BeiDou Navigation Satellite System, autonomy and control, framework

Procedia PDF Downloads 182
24971 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the concept of the Perceptually Important Point (PIP) identification process was first introduced in 2001. This process was originally designed for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of the time series by identifying its salient points. With the rise of Big Data, time series data contributes a major proportion, especially the data generated by sensors in the Internet of Things (IoT) environment. Given the nature of PIP identification and its successful applications, it is worth further exploring the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is always considered a limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches remove the bottleneck of running the PIP identification process on a standalone computer. An improvement in terms of speed is obtained by the distributed versions.
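As background, the sequential PIP identification process can be sketched in a few lines of Python: the first and last points are the initial PIPs, and the point with the largest vertical distance to the line joining its two neighbouring PIPs is added repeatedly. This is a naive standalone version using the vertical-distance variant, with function names of our own choosing; it is neither the SB-Tree implementation nor one of the distributed versions proposed in the paper.

```python
from typing import List, Sequence


def vertical_distance(series: Sequence[float], left: int, right: int, i: int) -> float:
    """Vertical distance from point i to the line joining points left and right."""
    slope = (series[right] - series[left]) / (right - left)
    return abs(series[left] + slope * (i - left) - series[i])


def identify_pips(series: Sequence[float], n_pips: int) -> List[int]:
    """Return the sorted indices of the first n_pips perceptually important points."""
    if len(series) < 2 or n_pips < 2:
        raise ValueError("need at least two points and n_pips >= 2")
    pips = [0, len(series) - 1]  # the end points are always the first two PIPs
    target = min(n_pips, len(series))
    while len(pips) < target:
        best_idx, best_dist = -1, -1.0
        # Scan every point lying strictly between two adjacent PIPs.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(series, left, right, i)
                if d > best_dist:
                    best_idx, best_dist = i, d
        pips.append(best_idx)
        pips.sort()
    return pips


example_series = [0.0, 0.0, 0.0, 0.0, 8.0, 0.0, 0.0]
```

Each added PIP triggers a rescan of the remaining points, which illustrates the performance limitation on long series that the abstract refers to.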

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 428
24970 Agroforestry Systems: A Sustainable Strategy of the Agricultural Systems of Cumaral (Meta), Colombia

Authors: Amanda Silva Parra, Dayra Yisel García Ramirez

Abstract:

In developing countries, agricultural "modernization" has led to a loss of biodiversity and to inefficiency of agricultural systems, manifested in increases in greenhouse gas (GHG) emissions and the carbon footprint. This makes agricultural systems susceptible to environmental problems, loss of biodiversity, depletion of natural resources, soil degradation and loss of nutrients, and a decrease in the supply of products, which affects food security for peoples and nations. Each year agriculture emits 10 to 12% (5.1 to 6.1 Gt CO2eq per year) of the total estimated GHG emissions (51 Gt CO2eq per year). The FAO recommends that countries that have not yet done so consider declaring sustainable agriculture an essential or strategic activity of public interest within the framework of green economies, to better face global climate change. The objective of this research was to estimate the GHG balance in agricultural systems of Cumaral, Meta (Colombia), to contribute to the recovery and sustainable operation of agricultural systems that guarantee food security and face climate-driven changes in a more intelligent way. To determine the GHG balances, the IPCC methodologies were applied at Tier 1 and Tier 2 levels. It was estimated that all the silvopastoral systems evaluated play an important role in this reconversion compared to conventional systems such as improved pastures and degraded pastures, owing to their ability to capture carbon both in soil and in biomass, generating positive GHG balances and guaranteeing greater sustainability of soil and air resources.

Keywords: climate change, carbon capture, environmental sustainability, GHG mitigation, silvopastoral systems

Procedia PDF Downloads 115
24969 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNNs) have demonstrated high performance in image analysis, but oftentimes only structured data is available for a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. By applying a single neural network to multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations in which the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The fused image is then analyzed using a CNN.
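The two preprocessing steps described above can be illustrated with a deliberately simple Python sketch: one record of structured data is min-max scaled into a square grayscale grid, and that grid is appended to an existing RGB image as a fourth channel. The grid encoding, the channel-wise fusion, and all function names are assumptions made for illustration; the abstract describes encodings based on colors and shapes and does not specify its fusion technique.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]


def features_to_grid(features: Sequence[float], mins: Sequence[float],
                     maxs: Sequence[float], side: int) -> Grid:
    """Min-max scale each feature to 0..255 and lay the values out row by row."""
    if len(features) > side * side:
        raise ValueError("grid is too small for the number of features")
    pixels: List[int] = []
    for value, lo, hi in zip(features, mins, maxs):
        scaled = 0.0 if hi == lo else (value - lo) / (hi - lo)
        scaled = min(1.0, max(0.0, scaled))   # clip values outside the known range
        pixels.append(round(255 * scaled))
    pixels += [0] * (side * side - len(pixels))  # pad unused cells with zeros
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def fuse_as_channel(image: Sequence[Sequence[Tuple[int, int, int]]],
                    grid: Grid) -> List[List[Tuple[int, int, int, int]]]:
    """Append the feature grid to an RGB image of the same size as a fourth channel."""
    if len(image) != len(grid) or any(len(r) != len(g) for r, g in zip(image, grid)):
        raise ValueError("image and grid must have the same height and width")
    return [[(*pixel, value) for pixel, value in zip(img_row, grid_row)]
            for img_row, grid_row in zip(image, grid)]


# Hypothetical record with three features whose known range is 0..10.
example_grid = features_to_grid([0.0, 4.0, 10.0], [0.0] * 3, [10.0] * 3, side=2)
example_image = [[(1, 2, 3), (4, 5, 6)], [(7, 8, 9), (10, 11, 12)]]
example_fused = fuse_as_channel(example_image, example_grid)
```

In practice the grid would be upscaled to the resolution of the existing image, and the resulting four-channel array would be passed to a CNN whose first layer accepts four input channels.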

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 119