Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 41074

Search results for: multivariate data analysis

40654 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNN in multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. This final image is finally analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 98

40653 Modelling the Education Supply Chain with Network Data Envelopment Analysis

Authors: Sourour Ramzi, Claudia Sarrico

Abstract:

Little has been done on network DEA in education, and nobody has attempted to model the whole education supply chain using network DEA. As such the contribution of the present paper is to propose a model for measuring the efficiency of education supply chains using network DEA. First, we use a general survey of data envelopment analysis (DEA) to establish the emergent themes for research in DEA, and focus on the theme of Network DEA. Second, we use a survey on two-stage DEA models, and Network DEA to write a state of the art on Network DEA, particularly applied to supply chain management. Third, we use a survey on DEA applications to establish the most influential papers on DEA education applications, in order to establish the state of the art on applications of DEA in education, in general, and applications of DEA to education using network DEA, in particular. Finally, we propose a model for measuring the performance of education supply chains of different education systems (countries or states within a country, for instance). We then use this model on some empirical data.

Keywords: supply chain, education, data envelopment analysis, network DEA

Procedia PDF Downloads 352

40652 Analysis and Prediction of Netflix Viewing History Using Netflixlatte as an Enriched Real Data Pool

Authors: Amir Mabhout, Toktam Ghafarian, Amirhossein Farzin, Zahra Makki, Sajjad Alizadeh, Amirhossein Ghavi

Abstract:

The high number of Netflix subscribers makes it attractive for data scientists to extract valuable knowledge from the viewers' behavioural analyses. This paper presents a set of statistical insights into viewers' viewing history. After that, a deep learning model is used to predict the future watching behaviour of the users based on previous watching history within the Netflixlatte data pool. Netflixlatte in an aggregated and anonymized data pool of 320 Netflix viewers with a length 250 000 data points recorded between 2008-2022. We observe insightful correlations between the distribution of viewing time and the COVID-19 pandemic outbreak. The presented deep learning model predicts future movie and TV series viewing habits with an average loss of 0.175.

Keywords: data analysis, deep learning, LSTM neural network, netflix

Procedia PDF Downloads 215

40651 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data

Authors: Haifa Ben Saber, Mourad Elloumi

Abstract:

In a number of domains, like in DNA microarray data analysis, we need to cluster simultaneously rows (genes) and columns (conditions) of a data matrix to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis. More effective biclustering algorithms are highly desirable and needed. We introduce a new algorithm called, Enumerative tree (EnumTree) for biclustering of binary microarray data. is an algorithm adopting the approach of enumerating biclusters. This algorithm extracts all biclusters consistent good quality. The main idea of EnumLat is the construction of a new tree structure to represent adequately different biclusters discovered during the process of enumeration. This algorithm adopts the strategy of all biclusters at a time. The performance of the proposed algorithm is assessed using both synthetic and real DNA micryarray data, our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevent biclusters.

Keywords: DNA microarray, biclustering, gene expression data, tree, datamining.

Procedia PDF Downloads 358

40650 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 62

40649 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 77

40648 Strategic Citizen Participation in Applied Planning Investigations: How Planners Use Etic and Emic Community Input Perspectives to Fill-in the Gaps in Their Analysis

Authors: John Gaber

Abstract:

Planners regularly use citizen input as empirical data to help them better understand community issues they know very little about. This type of community data is based on the lived experiences of local residents and is known as "emic" data. What is becoming more common practice for planners is their use of data from local experts and stakeholders (known as "etic" data or the outsider perspective) to help them fill in the gaps in their analysis of applied planning research projects. Utilizing international Health Impact Assessment (HIA) data, I look at who planners invite to their citizen input investigations. Research presented in this paper shows that planners access a wide range of emic and etic community perspectives in their search for the “community’s view.” The paper concludes with how planners can chart out a new empirical path in their execution of emic/etic citizen participation strategies in their applied planning research projects.

Keywords: citizen participation, emic data, etic data, Health Impact Assessment (HIA)

Procedia PDF Downloads 471

40647 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal will ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability using Diabetes Health Indicators Dataset from Kaggle’s data as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics like accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 54

40646 Impact of Ownership Structure on Financial Performance of Listed Industrial Goods Firms in Nigeria

Authors: Muhammad Shehu Garba

Abstract:

The financial statements of the firms between the periods of 2013 and 2022 were collected using the secondary method of data collection, and the study aims to investigate the effect of ownership structure on the financial performance of listed industrial goods companies in Nigeria. 10 firms were used as the study's sample size. The study used panel data variables of the study. The ownership structure is measured with managerial ownership, institutional ownership and foreign ownership, while financial performance is measured with return on asset and return on equity; the study made use of control variables leverage and firm size. The result shows a multivariate relationship that exists between variables of the study, which shows ROA has a positive correlation with ROE (0.4053), MO (0.2001), and FS (0.3048). It has a negative correlation with FO (-0.1933), IO (-0.0919), and LEV (-0.3367). ROE has a positive correlation with ROA (0.4053), MO (0.2001), and FS (0.2640). It has a negative correlation with FO (-0.1864), IO (-0.1847), and LEV (-0.0319). It is recommended that firms should focus on increasing their ROA. Firms should also consider increasing their MO, as this can help to align the interests of managers and shareholders. Firms should also be aware of the potential impact of FO and IO on their ROA.

Keywords: firm size, ownership structure, financial performance, leaverage

Procedia PDF Downloads 42

40645 Dataset Quality Index:Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Nowadays, poor data quality is considered one of the majority costs for a data project. The data project with data quality awareness almost as much time to data quality processes while data project without data quality awareness negatively impacts financial resources, efficiency, productivity, and credibility. One of the processes that take a long time is defining the expectations and measurements of data quality because the expectation is different up to the purpose of each data project. Especially, big data project that maybe involves with many datasets and stakeholders, that take a long time to discuss and define quality expectations and measurements. Therefore, this study aimed at developing meaningful indicators to describe overall data quality for each dataset to quick comparison and priority. The objectives of this study were to: (1) Develop a practical data quality indicators and measurements, (2) Develop data quality dimensions based on statistical characteristics and (3) Develop Composite Indicator that can describe overall data quality for each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After datasets were collected, there are five steps to develop the Dataset Quality Index (SDQI). First, we define standard data quality expectations. Second, we find any indicators that can measure directly to data within datasets. Thirdly, each indicator aggregates to dimension using factor analysis. Next, the indicators and dimensions were weighted by an effort for data preparing process and usability. Finally, the dimensions aggregate to Composite Indicator. The results of these analyses showed that: (1) The developed useful indicators and measurements contained ten indicators. (2) the developed data quality dimension based on statistical characteristics, we found that ten indicators can be reduced to 4 dimensions. (3) The developed Composite Indicator, we found that the SDQI can describe overall datasets quality of each dataset and can separate into 3 Level as Good Quality, Acceptable Quality, and Poor Quality. The conclusion, the SDQI provide an overall description of data quality within datasets and meaningful composition. We can use SQDI to assess for all data in the data project, effort estimation, and priority. The SDQI also work well with Agile Method by using SDQI to assessment in the first sprint. After passing the initial evaluation, we can add more specific data quality indicators into the next sprint.

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 119

40644 Evaluation of Groundwater Quality and Contamination Sources Using Geostatistical Methods and GIS in Miryang City, Korea

Authors: H. E. Elzain, S. Y. Chung, V. Senapathi, Kye-Hun Park

Abstract:

Groundwater is considered a significant source for drinking and irrigation purposes in Miryang city, and it is attributed to a limited number of a surface water reservoirs and high seasonal variations in precipitation. Population growth in addition to the expansion of agricultural land uses and industrial development may affect the quality and management of groundwater. This research utilized multidisciplinary approaches of geostatistics such as multivariate statistics, factor analysis, cluster analysis and kriging technique in order to identify the hydrogeochemical process and characterizing the control factors of the groundwater geochemistry distribution for developing risk maps, exploiting data obtained from chemical investigation of groundwater samples under the area of study. A total of 79 samples have been collected and analyzed using atomic absorption spectrometer (AAS) for major and trace elements. Chemical maps using 2-D spatial Geographic Information System (GIS) of groundwater provided a powerful tool for detecting the possible potential sites of groundwater that involve the threat of contamination. GIS computer based map exhibited that the higher rate of contamination observed in the central and southern area with relatively less extent in the northern and southwestern parts. It could be attributed to the effect of irrigation, residual saline water, municipal sewage and livestock wastes. At wells elevation over than 85m, the scatter diagram represents that the groundwater of the research area was mainly influenced by saline water and NO3. Level of pH measurement revealed low acidic condition due to dissolved atmospheric CO2 in the soil, while the saline water had a major impact on the higher values of TDS and EC. Based on the cluster analysis results, the groundwater has been categorized into three group includes the CaHCO3 type of the fresh water, NaHCO3 type slightly influenced by sea water and Ca-Cl, Na-Cl types which are heavily affected by saline water. The most predominant water type was CaHCO3 in the study area. Contamination sources and chemical characteristics were identified from factor analysis interrelationship and cluster analysis. The chemical elements that belong to factor 1 analysis were related to the effect of sea water while the elements of factor 2 associated with agricultural fertilizers. The degree level, distribution, and location of groundwater contamination have been generated by using Kriging methods. Thus, geostatistics model provided more accurate results for identifying the source of contamination and evaluating the groundwater quality. GIS was also a creative tool to visualize and analyze the issues affecting water quality in the Miryang city.

Keywords: groundwater characteristics, GIS chemical maps, factor analysis, cluster analysis, Kriging techniques

Procedia PDF Downloads 153

40643 Assessment of Incidence and Predictors of Mortality Among HIV Positive Children on Art in Public Hospitals of Harer Town Who Were Enrolled From 2011 to 2021

Authors: Getahun Nigusie Demise

Abstract:

Background; antiretroviral treatment reduce HIV-related morbidity, and prolonged survival of patients however, there is lack of up-to-date information concerning the treatment long term effect on the survival of HIV positive children especially in the study area. Objective: The aim of this study is to assess the incidence and predictors of mortality among HIV positive children on antiretroviral therapy (ART) in public hospitals of Harer town who were enrolled from 2011 to 2021. Methodology: Institution based retrospective cohort study was conducted among 429 HIV positive children enrolled in ART clinic from January 1st 2011 to December30th 2021. Data were collected from medical cards by using a data extraction form, Descriptive analyses were used to Summarized the results, and life table was used to estimate survival probability at specific point of time after introduction of ART. Kaplan Meier survival curve together with log rank test was used to compare survival between different categories of covariates, and Multivariate Cox-proportional hazard regression model was used to estimate adjusted Hazard rate. Variables with p-values ≤0.25 in bivariable analysis were candidates to the multivariable analysis. Finally, variables with p-values < 0.05 were considered as significant variables. Results: The study participants had followed for a total of 2549.6 child-years (30596 child months) with an overall mortality rate of 1.5 (95% CI: 1.1, 2.04) per 100 child-years. Their median survival time was 112 months (95% CI: 101–117). There were 38 children with unknown outcome, 39 deaths, and 55 children transfer out to different facility. The overall survival at 6, 12, 24, 48 months were 98%, 96%, 95%, 94% respectively. being in WHO clinical Stage four (AHR=4.55, 95% CI:1.36, 15.24), having anemia(AHR=2.56, 95% CI:1.11, 5.93), baseline low absolute CD4 count (AHR=2.95, 95% CI: 1.22, 7.12), stunting (AHR=4.1, 95% CI: 1.11, 15.42), wasting (AHR=4.93, 95% CI: 1.31, 18.76), poor adherence to treatment (AHR=3.37, 95% CI: 1.25, 9.11), having TB infection at enrollment (AHR=3.26, 95% CI: 1.25, 8.49),and no history of change their regimen(AHR=7.1, 95% CI: 2.74, 18.24), were independent predictors of death. Conclusion: more than half of death occurs within 2 years. Prevalent tuberculosis, anemia, wasting, and stunting nutritional status, socioeconomic factors, and baseline opportunistic infection were independent predictors of death. Increasing early screening and managing those predictors are required.

Keywords: human immunodeficiency virus-positive children, anti-retroviral therapy, survival, treatment, Ethiopia

Procedia PDF Downloads 20

40642 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga. Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI have offered a platform that is user-friendly and accessible to non-technical professionals to generate synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC and AUC curves for the data generated by adding the layer of Gaussian copula are much higher than the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 51

40641 An Automated Approach to Consolidate Galileo System Availability

Authors: Marie Bieber, Fabrice Cosson, Olivier Schmitt

Abstract:

Europe's Global Navigation Satellite System, Galileo, provides worldwide positioning and navigation services. The satellites in space are only one part of the Galileo system. An extensive ground infrastructure is essential to oversee the satellites and ensure accurate navigation signals. High reliability and availability of the entire Galileo system are crucial to continuously provide positioning information of high quality to users. Outages are tracked, and operational availability is regularly assessed. A highly flexible and adaptive tool has been developed to automate the Galileo system availability analysis. Not only does it enable a quick availability consolidation, but it also provides first steps towards improving the data quality of maintenance tickets used for the analysis. This includes data import and data preparation, with a focus on processing strings used for classification and identifying faulty data. Furthermore, the tool allows to handle a low amount of data, which is a major constraint when the aim is to provide accurate statistics.

Keywords: availability, data quality, system performance, Galileo, aerospace

Procedia PDF Downloads 143

40640 Floristic Diversity, Composition and Environmental Correlates on the Arid, Coralline Islands of the Farasan Archipelago, Red SEA, Saudi Arabia

Authors: Khalid Al Mutairi, Mashhor Mansor, Magdy El-Bana, Asyraf Mansor, Saud AL-Rowaily

Abstract:

Urban expansion and the associated increase in anthropogenic pressures have led to a great loss of the Red Sea’s biodiversity. Floristic composition, diversity, and environmental controls were investigated for 210 relive's on twenty coral islands of Farasan in the Red Sea, Saudi Arabia. Multivariate statistical analyses for classification (Cluster Analysis), ordination (Detrended Correspondence Analysis (DCA), and Redundancy Analysis (RDA) were employed to identify vegetation types and their relevance to the underlying environmental gradients. A total of 191 flowering plants belonging to 53 families and 129 genera were recorded. Geophytes and chamaephytes were the main life forms in the saline habitats, whereas therophytes and hemicryptophytes dominated the sandy formations and coral rocks. The cluster analysis and DCA ordination identified twelve vegetation groups that linked to five main habitats with definite floristic composition and environmental characteristics. The constrained RDA with Monte Carlo permutation tests revealed that elevation and soil salinity were the main environmental factors explaining the vegetation distributions. These results indicate that the flora of the study archipelago represents a phytogeographical linkage between Africa and Saharo-Arabian landscape functional elements. These findings should guide conservation and management efforts to maintain species diversity, which is threatened by anthropogenic activities and invasion by the exotic invasive tree Prosopis juliflora (Sw.) DC.

Keywords: biodiversity, classification, conservation, ordination, Red Sea

Procedia PDF Downloads 333

40639 Information Visualization Methods Applied to Nanostructured Biosensors

Authors: Osvaldo N. Oliveira Jr.

Abstract:

The control of molecular architecture inherent in some experimental methods to produce nanostructured films has had great impact on devices of various types, including sensors and biosensors. The self-assembly monolayers (SAMs) and the electrostatic layer-by-layer (LbL) techniques, for example, are now routinely used to produce tailored architectures for biosensing where biomolecules are immobilized with long-lasting preserved activity. Enzymes, antigens, antibodies, peptides and many other molecules serve as the molecular recognition elements for detecting an equally wide variety of analytes. The principles of detection are also varied, including electrochemical methods, fluorescence spectroscopy and impedance spectroscopy. In this presentation an overview will be provided of biosensors made with nanostructured films to detect antibodies associated with tropical diseases and HIV, in addition to detection of analytes of medical interest such as cholesterol and triglycerides. Because large amounts of data are generated in the biosensing experiments, use has been made of computational and statistical methods to optimize performance. Multidimensional projection techniques such as Sammon´s mapping have been shown more efficient than traditional multivariate statistical analysis in identifying small concentrations of anti-HIV antibodies and for distinguishing between blood serum samples of animals infected with two tropical diseases, namely Chagas´ disease and Leishmaniasis. Optimization of biosensing may include a combination of another information visualization method, the Parallel Coordinate technique, with artificial intelligence methods in order to identify the most suitable frequencies for reaching higher sensitivity using impedance spectroscopy. Also discussed will be the possible convergence of technologies, through which machine learning and other computational methods may be used to treat data from biosensors within an expert system for clinical diagnosis.

Keywords: clinical diagnosis, information visualization, nanostructured films, layer-by-layer technique

Procedia PDF Downloads 314

40638 Longevity of Soybean Seeds Submitted to Different Mechanized Harvesting Conditions

Authors: Rute Faria, Digo Moraes, Amanda Santos, Dione Morais, Maria Sartori

Abstract:

Seed vigor is a fundamental component for the good performance of the entire soybean production process. Seeds with mechanical damage at harvest time will be more susceptible to fungal and insect attack during storage, which will invariably reduce their vigor to the field, compromising uniformity and final stand performance. Harvesters, even the most modern ones, when not properly regulated or operated, can cause irreversible damages to the seeds, compromising even their commercialization. Therefore, the control of an efficient harvest is necessary in order to guarantee a good quality final product. In this work, the damage caused by two different harvesters (one rented, and another one) was evaluated, traveling in two speeds (4 and 8 km / h). The design was completely randomized in 2 x 2 factorial, with four replications. To evaluate the physiological quality seed germination and vigor tests were carried out over a period of six months. A multivariate analysis of Principal Components (PCA) and clustering allowed us to verify that the leased machine had better performance in the incidence of immediate damages in the seeds, but after a storage period of 6 months the vigor of these seeds reduced more than own machine evidencing that such a machine would bring more damages to the seeds.

Keywords: Glycine max (L.), cluster analysis, PCA, vigor

Procedia PDF Downloads 236

40637 The Effectiveness of Mindfulness Education on Emotional, Psychological, and Social Well-Being in 12th Grade Students in Tehran City

Authors: Fariba Dortaj, H. Bashir Nejad, Akram Dortaj,

Abstract:

Investigate the Effectiveness of Mindfulness Education on Emotional, Psychological, and Social Well-being in 12th grade students in Tehran city is the aim of present study. The research method is semi-experimental with pretest-posttest design with control group. The statistical population of the study includes all 12th grade students of the 12th district of Tehran city in the academic year of 2017 to 2018. From the mentioned population, 60 students had earned low scores in three dimensions of Subjective Well-Being Questionnaire of Keyes and Magyar-Moe (2003) by using random sampling method and they were selected and randomly assigned into 2 experimental and control groups. Then experimental groups were received a Mindfulness protocol in 8 sessions during 2 hours. After completion of the sessions, all subjects were re-evaluated. Data were analyzed by using multivariate analysis of covariance. The findings of this study showed that in the emotional well-being aspect with the components of positive emotional affection (P < 0.025, F = 17/80) and negative emotions (P <0.025, F = 5/41), in the psychological well-being of the components Self-esteem (P < 0.008, F = 25.26), life goal (P < 0.008, F = 38.19), environmental domination (P <0.008, F=82.82), relationships with others (P < 0.008, F = 19.12), personal development with (P < 0.008, F = 87.38), and in the social well-being aspect, the correlation coefficients with (P<0.01, F=12/21), admission and acceptability with (P <0.01, F =18.09) and realism with (P <0.01, F = 11.30), there was a significant difference between the experimental and control groups and it can be said that the education of mindfulness affects the improvement of components of psychological, social and emotional well-being in students.

Keywords: mindfulness, emotional well-being, psychological well-being, social well-being

Procedia PDF Downloads 151

40636 Joint Probability Distribution of Extreme Water Level with Rainfall and Temperature: Trend Analysis of Potential Impacts of Climate Change

Authors: Ali Razmi, Saeed Golian

Abstract:

Climate change is known to have the potential to impact adversely hydrologic patterns for variables such as rainfall, maximum and minimum temperature and sea level rise. Long-term average of these climate variables could possibly change over time due to climate change impacts. In this study, trend analysis was performed on rainfall, maximum and minimum temperature and water level data of a coastal area in Manhattan, New York City, Central Park and Battery Park stations to investigate if there is a significant change in the data mean. Partial Man-Kendall test was used for trend analysis. Frequency analysis was then performed on data using common probability distribution functions such as Generalized Extreme Value (GEV), normal, log-normal and log-Pearson. Goodness of fit tests such as Kolmogorov-Smirnov are used to determine the most appropriate distributions. In flood frequency analysis, rainfall and water level data are often separately investigated. However, in determining flood zones, simultaneous consideration of rainfall and water level in frequency analysis could have considerable effect on floodplain delineation (flood extent and depth). The present study aims to perform flood frequency analysis considering joint probability distribution for rainfall and storm surge. First, correlation between the considered variables was investigated. Joint probability distribution of extreme water level and temperature was also investigated to examine how global warming could affect sea level flooding impacts. Copula functions were fitted to data and joint probability of water level with rainfall and temperature for different recurrence intervals of 2, 5, 25, 50, 100, 200, 500, 600 and 1000 was determined and compared with the severity of individual events. Results for trend analysis showed increase in long-term average of data that could be attributed to climate change impacts. GEV distribution was found as the most appropriate function to be fitted to the extreme climate variables. The results for joint probability distribution analysis confirmed the necessity for incorporation of both rainfall and water level data in flood frequency analysis.

Keywords: climate change, climate variables, copula, joint probability

Procedia PDF Downloads 339

40635 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Ji Eun Shin, Dong Hoon Lim

Abstract:

Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.

Keywords: big data, combiner, K-means clustering, RHadoop

Procedia PDF Downloads 411

40634 Investigation of the Effect of Teaching Thinking and Research Lesson by Cooperative and Traditional Methods on Creativity of Sixth Grade Students

Authors: Faroogh Khakzad, Marzieh Dehghani, Elahe Hejazi

Abstract:

The present study investigates the effect of teaching a Thinking and Research lesson by cooperative and traditional methods on the creativity of sixth-grade students in Piranshahr province. The statistical society includes all the sixth-grade students of Piranshahr province. The sample of this studytable was selected by available sampling from among male elementary schools of Piranshahr. They were randomly assigned into two groups of cooperative teaching method and traditional teaching method. The design of the study is quasi-experimental with a control group. In this study, to assess students’ creativity, Abedi’s creativity questionnaire was used. Based on Cronbach’s alpha coefficient, the reliability of the factor flow was 0.74, innovation was 0.61, flexibility was 0.63, and expansion was 0.68. To analyze the data, t-test, univariate and multivariate covariance analysis were used for evaluation of the difference of means and the pretest and posttest scores. The findings of the research showed that cooperative teaching method does not significantly increase creativity (p > 0.05). Moreover, cooperative teaching method was found to have significant effect on flow factor (p < 0.05), but in innovation and expansion factors no significant effect was observed (p < 0.05).

Keywords: cooperative teaching method, traditional teaching method, creativity, flow, innovation, flexibility, expansion, thinking and research lesson

Procedia PDF Downloads 299

40633 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 318

40632 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the better fit of data. Sometimes, exogenous variables do not show a significant strength of their direct and indirect effect when the assumption of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main motive of this article is to investigate the efficacy of the copula-based regression approach over the classical regression approach and calculate the direct and indirect effects of variables when data violates the OLS assumption and variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is also presented to demonstrate the performance of the superiority of the copula approach.

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 54

40631 Sentiment Analysis: An Enhancement of Ontological-Based Features Extraction Techniques and Word Equations

Authors: Mohd Ridzwan Yaakub, Muhammad Iqbal Abu Latiffi

Abstract:

Online business has become popular recently due to the massive amount of information and medium available on the Internet. This has resulted in the huge number of reviews where the consumers share their opinion, criticisms, and satisfaction on the products they have purchased on the websites or the social media such as Facebook and Twitter. However, to analyze customer’s behavior has become very important for organizations to find new market trends and insights. The reviews from the websites or the social media are in structured and unstructured data that need a sentiment analysis approach in analyzing customer’s review. In this article, techniques used in will be defined. Definition of the ontology and description of its possible usage in sentiment analysis will be defined. It will lead to empirical research that related to mobile phones used in research and the ontology used in the experiment. The researcher also will explore the role of preprocessing data and feature selection methodology. As the result, ontology-based approach in sentiment analysis can help in achieving high accuracy for the classification task.

Keywords: feature selection, ontology, opinion, preprocessing data, sentiment analysis

Procedia PDF Downloads 183

40630 The Comparison of Emotional Regulation Strategies and Psychological Symptoms in Patients with Multiple Sclerosis and Normal Individuals

Authors: Amir Salamatzade, Marhamet HematPour

Abstract:

Due to the increasing importance of psychological factors in the incidence and exacerbation of chronic diseases such as multiple sclerosis, the aim of this study was to determine the difference between emotional regulation strategies and psychological symptoms in patients with multiple sclerosis and normal people. The research method was causal-comparative (post-event). The statistical population of this research included all patients with multiple sclerosis referred to the MS Association of Rasht in the first quarter of 2021, approximately 350 people. The study sample also included 120 people (60 patients with multiple sclerosis and 60 normal people) who were selected by the available sampling method and completed the emotional regulation and anxiety, depression, and stress Lavibund and Lavibund (1995) questionnaires. Data were analyzed using an independent t-test and multivariate variance analysis. The results showed that there was a significant difference between the mean of emotional regulation strategies and the components of emotional reassessment and emotional inhibition between the two groups of patients with multiple sclerosis and normal individuals (p < 0.01). There is a significant difference between the mean of psychological symptoms and the components of depression, anxiety, and stress in the two groups of patients with multiple sclerosis and normal individuals. (p < 0.01). Based on this, it can be concluded that patients with multiple sclerosis have lower levels of emotional regulation strategies and higher levels of psychological symptoms than normal individuals.

Keywords: emotional regulation strategies, psychological symptoms, multiple sclerosis, normal Individuals

Procedia PDF Downloads 196

40629 Social Network Analysis as a Research and Pedagogy Tool in Problem-Focused Undergraduate Social Innovation Courses

Authors: Sean McCarthy, Patrice M. Ludwig, Will Watson

Abstract:

This exploratory case study explores the deployment of Social Network Analysis (SNA) in mapping community assets in an interdisciplinary, undergraduate, team-taught course focused on income insecure populations in a rural area in the US. Specifically, it analyzes how students were taught to collect data on community assets and to visualize the connections between those assets using Kumu, an SNA data visualization tool. Further, the case study shows how social network data was also collected about student teams via their written communications in Slack, an enterprise messaging tool, which enabled instructors to manage and guide student research activity throughout the semester. The discussion presents how SNA methods can simultaneously inform both community-based research and social innovation pedagogy through the use of data visualization and collaboration-focused communication technologies.

Keywords: social innovation, social network analysis, pedagogy, problem-based learning, data visualization, information communication technologies

Procedia PDF Downloads 131

40628 Quantification of Lawsone and Adulterants in Commercial Henna Products

Authors: Ruchi B. Semwal, Deepak K. Semwal, Thobile A. N. Nkosi, Alvaro M. Viljoen

Abstract:

The use of Lawsonia inermis L. (Lythraeae), commonly known as henna, has many medicinal benefits and is used as a remedy for the treatment of diarrhoea, cancer, inflammation, headache, jaundice and skin diseases in folk medicine. Although widely used for hair dyeing and temporary tattooing, henna body art has popularized over the last 15 years and changed from being a traditional bridal and festival adornment to an exotic fashion accessory. The naphthoquinone, lawsone, is one of the main constituents of the plant and responsible for its dyeing property. Henna leaves typically contain 1.8–1.9% lawsone, which is used as a marker compound for the quality control of henna products. Adulteration of henna with various toxic chemicals such as p-phenylenediamine, p-methylaminophenol, p-aminobenzene and p-toluenodiamine to produce a variety of colours, is very common and has resulted in serious health problems, including allergic reactions. This study aims to assess the quality of henna products collected from different parts of the world by determining the lawsone content, as well as the concentrations of any adulterants present. Ultra high performance liquid chromatography-mass spectrometry (UPLC-MS) was used to determine the lawsone concentrations in 172 henna products. Separation of the chemical constituents was achieved on an Acquity UPLC BEH C18 column using gradient elution (0.1% formic acid and acetonitrile). The results from UPLC-MS revealed that of 172 henna products, 11 contained 1.0-1.8% lawsone, 110 contained 0.1-0.9% lawsone, whereas 51 samples did not contain detectable levels of lawsone. High performance thin layer chromatography was investigated as a cheaper, more rapid technique for the quality control of henna in relation to the lawsone content. The samples were applied using an automatic TLC Sampler 4 (CAMAG) to pre-coated silica plates, which were subsequently developed with acetic acid, acetone and toluene (0.5: 1.0: 8.5 v/v). A Reprostar 3 digital system allowed the images to be captured. The results obtained corresponded to those from UPLC-MS analysis. Vibrational spectroscopy analysis (MIR or NIR) of the powdered henna, followed by chemometric modelling of the data, indicates that this technique shows promise as an alternative quality control method. Principal component analysis (PCA) was used to investigate the data by observing clustering and identifying outliers. Partial least squares (PLS) multivariate calibration models were constructed for the quantification of lawsone. In conclusion, only a few of the samples analysed contain lawsone in high concentrations, indicating that they are of poor quality. Currently, the presence of adulterants that may have been added to enhance the dyeing properties of the products, is being investigated.

Keywords: Lawsonia inermis, paraphenylenediamine, temporary tattooing, lawsone

Procedia PDF Downloads 442

40627 Attributable Mortality of Nosocomial Infection: A Nested Case Control Study in Tunisia

Authors: S. Ben Fredj, H. Ghali, M. Ben Rejeb, S. Layouni, S. Khefacha, L. Dhidah, H. Said

Abstract:

Background: The Intensive Care Unit (ICU) provides continuous care and uses a high level of treatment technologies. Although developed country hospitals allocate only 5–10% of beds in critical care areas, approximately 20% of nosocomial infections (NI) occur among patients treated in ICUs. Whereas in the developing countries the situation is still less accurate. The aim of our study is to assess mortality rates in ICUs and to determine its predictive factors. Methods: We carried out a nested case-control study in a 630-beds public tertiary care hospital in Eastern Tunisia. We included in the study all patients hospitalized for more than two days in the surgical or medical ICU during the entire period of the surveillance. Cases were patients who died before ICU discharge, whereas controls were patients who survived to discharge. NIs were diagnosed according to the definitions of ‘Comité Technique des Infections Nosocomiales et les Infections Liées aux Soins’ (CTINLIS, France). Data collection was based on the protocol of Rea-RAISIN 2009 of the National Institute for Health Watch (InVS, France). Results: Overall, 301 patients were enrolled from medical and surgical ICUs. The mean age was 44.8 ± 21.3 years. The crude ICU mortality rate was 20.6% (62/301). It was 35.8% for patients who acquired at least one NI during their stay in ICU and 16.2% for those without any NI, yielding an overall crude excess mortality rate of 19.6% (OR= 2.9, 95% CI, 1.6 to 5.3). The population-attributable fraction due to ICU-NI in patients who died before ICU discharge was 23.46% (95% CI, 13.43%–29.04%). Overall, 62 case-patients were compared to 239 control patients for the final analysis. Case patients and control patients differed by age (p=0,003), simplified acute physiology score II (p < 10-3), NI (p < 10-3), nosocomial pneumonia (p=0.008), infection upon admission (p=0.002), immunosuppression (p=0.006), days of intubation (p < 10-3), tracheostomy (p=0.004), days with urinary catheterization (p < 10-3), days with CVC ( p=0.03), and length of stay in ICU (p=0.003). Multivariate analysis demonstrated 3 factors: age older than 65 years (OR, 5.78 [95% CI, 2.03-16.05] p=0.001), duration of intubation 1-10 days (OR, 6.82 [95% CI, [1.90-24.45] p=0.003), duration of intubation > 10 days (OR, 11.11 [95% CI, [2.85-43.28] p=0.001), duration of CVC 1-7 days (OR, 6.85[95% CI, [1.71-27.45] p=0.007) and duration of CVC > 7 days (OR, 5.55[95% CI, [1.70-18.04] p=0.004). Conclusion: While surveillance provides important baseline data, successful trials with more active intervention protocols, adopting multimodal approach for the prevention of nosocomial infection incited us to think about the feasibility of similar trial in our context. Therefore, the implementation of an efficient infection control strategy is a crucial step to improve the quality of care.

Keywords: intensive care unit, mortality, nosocomial infection, risk factors

Procedia PDF Downloads 390

40626 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases when entry specific details should remain hidden. Symbolic data mining is quickly gaining popularity as datasets in need of analyzing are becoming ever larger. One type of such symbolic data is a histogram, which enables to save huge amounts of information into a single variable with high-level of granularity. Other types of symbolic data can also be described in histograms, therefore making histogram a very important and general symbolic data type - a method developed for histograms - can also be applied to other types of symbolic data. Due to its complex structure, analyzing histograms is complicated. This paper proposes a method, which allows to compare two histogram-valued variables and therefore find a dissimilarity between two histograms. Proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows to compare histograms with different number of bins and bin widths (so called general histogram). Proposed dissimilarity measure is then used as a measure for clustering. Furthermore, linkage method based on weighted averages is proposed with the concept of cluster compactness to measure the quality of clustering. The method is then validated with application on real datasets. As a result, the proposed dissimilarity measure is found producing adequate and comparable results with general histograms without the loss of detail or need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 148

40625 Evaluation of Some Trace Elements in Biological Samples of Egyptian Viral Hepatitis Patients under Nutrition Therapy

Authors: Tarek Elnimr, Reda Morsy, Assem El Fert, Aziza Ismail

Abstract:

Hepatitis is an inflammation of the liver. The condition can be self-limiting or can progress to fibrosis, cirrhosis or liver cancer. Disease caused by the hepatitis virus, the virus can cause hepatitis infection, ranging in severity from a mild illness lasting a few weeks to a serious, lifelong illness. A growing body of evidence indicates that many trace elements play important roles in a number of carcinogenic processes that proceed with various mechanisms. To examine the status of trace elements during the development of hepatic carcinoma, we determined the iron, copper, zinc and selenium levels in some biological samples of patients at different stages of viral hepatic disease. We observed significant changes in the iron, copper, zinc and selenium levels in the biological samples of patients hepatocellular carcinoma, relative to those of healthy controls. The mean hair, nail, RBC, serum and whole blood copper levels in patients with hepatitis virus were significantly higher than that of the control group. In contrast the mean iron, zinc, and selenium levels in patients having hepatitis virus were significantly lower than those of the control group. On the basis of this study, we identified the impact of natural supplements to improve the treatment of viral liver damage, using the level of some trace elements such as, iron, copper, zinc and selenium, which might serve as biomarkers for increases survival and reduces disease progression. Most of the elements revealed diverse and random distribution in the samples of the donor groups. The correlation study pointed out significant disparities in the mutual relationships among the trace elements in the patients and controls. Principal component analysis and cluster analysis of the element data manifested diverse apportionment of the selected elements in the scalp hair, nail and blood components of the patients compared with the healthy counterparts.

Keywords: hepatitis, hair, nail, blood components, trace element, nutrition therapy, multivariate analysis, correlation, ICP-MS

Procedia PDF Downloads 389

‹
1
2
...
12
13
14
15
16
17
18
...
1369
1370
›