Search results for: cluster validation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2139

Search results for: cluster validation

2049 Translation, Cultural Adaptation and Validation of the Hungarian Version of Self- Determination Scale

Authors: E. E. Marschalko, K. Kalcza-Janosi, I. Kotta, B. Bibok

Abstract:

Cultural moderation aspects have been highlighted in the literature on self-determination behavior in some cultures, including in the Hungarian population. There is a lack of validated instruments in Hungarian for the assessment of self-determination related behaviors. In order to fill in this gap, the aim of this study was the translation, cultural adaptation and validation of Self Determination Scale (Sheldon, 1995) for the Hungarian population. A total of 4335 adults participated in the study. The mean age of the participants was 27.97 (SD=9.60). The sample consisted mostly from females, less than 20% were males. Exploratory and confirmatory factor analyses were performed for adequacy checking. Cronbach’s alpha was used to examine the reliability of the factors. Our results revealed that the Hungarian version of SDS has good psychometric properties and it is a reliable tool for psychologist who would like to study or assess self-determination in their clients. The final, adapted and validated SDS items are presented in this paper.

Keywords: self-determination scale, Hungarian, adaptation, validation, reliability

Procedia PDF Downloads 221
2048 Rural Development as a Strategy to Deter Migration in India - Re-Examining the Ideology of Cluster Development

Authors: Nandini Mohan, Thiruvengadam R. B.

Abstract:

Mahatma Gandhi advocated that the true indicator of modern India lay in the development of its villages. This has been proven with the recent outbreak of the Coronavirus pandemic and the surfacing predicament of our urban centers. Developed on the Industrialization model, the current state of the metropolis is of rampant overcrowding, high rates of unemployment, inadequate infrastructure, and resources to cater to the growing population. A majority of each city’s strength composes of the migrant population, demonstrated through the migrant crisis, a direct repercussion of COVID-19. This paper explores the ideology of how rural development can act as a tactic to counter the high rates of rural-urban migration. It establishes the need for a rural push, as India is predominantly an agrarian economy, with a vast disparity between the urban and rural centers due to its urban bias. It seeks to define development in holistic terms. It studies the models of ‘cluster’ as conceptualized by V.K.R.V. Rao, and detailed by Architect Charles Correa in his book, The New Landscape. The paper reexamines the theory of cluster development through existing models proposed by the government of India. Namely, PURA (Provision of Urban Amenities in Rural Areas), DRI (Deendayal Research Institute), and Rurban under Shyama Prasad Mukharjee Rurban Mission. It analyses the models, their strengths, weaknesses, and reasons for their failure and success to derive parameters for the ideation of an archetype model. A model of rural development that talks of the simultaneous development of existing adjacent villages, by the introduction of set unique functions, that may turn into self-sustaining clusters or agglomerations in the future, which could serve as the next step for Indian village development based on the cluster ideology.

Keywords: counter migration, models of rural development, cluster development theory, India

Procedia PDF Downloads 56
2047 Genetic Diversity and Discovery of Unique SNPs in Five Country Cultivars of Sesamum indicum by Next-Generation Sequencing

Authors: Nam-Kuk Kim, Jin Kim, Soomin Park, Changhee Lee, Mijin Chu, Seong-Hun Lee

Abstract:

In this study, we conducted whole genome re-sequencing of 10 cultivars originated from five countries including Korea, China, India, Pakistan and Ethiopia with Sesamum indicum (Zhongzho No. 13) genome as a reference. Almost 80% of the whole genome sequences of the reference genome could be covered by sequenced reads. Numerous SNP and InDel were detected by bioinformatic analysis. Among these variants, 266,051 SNPs were identified as unique to countries. Pakistan and Ethiopia had high densities of SNPs compared to other countries. Three main clusters (cluster 1: Korea, cluster 2: Pakistan and India, cluster 3: Ethiopia and China) were recovered by neighbor-joining analysis using all variants. Interestingly, some variants were detected in DGAT1 (diacylglycerol O-acyltransferase 1) and FADS (fatty acid desaturase) genes, which are known to be related with fatty acid synthesis and metabolism. These results can provide useful information to understand the regional characteristics and develop DNA markers for origin discrimination of sesame.

Keywords: Sesamum indicum, NGS, SNP, DNA marker

Procedia PDF Downloads 300
2046 An AI-Based Dynamical Resource Allocation Calculation Algorithm for Unmanned Aerial Vehicle

Authors: Zhou Luchen, Wu Yubing, Burra Venkata Durga Kumar

Abstract:

As the scale of the network becomes larger and more complex than before, the density of user devices is also increasing. The development of Unmanned Aerial Vehicle (UAV) networks is able to collect and transform data in an efficient way by using software-defined networks (SDN) technology. This paper proposed a three-layer distributed and dynamic cluster architecture to manage UAVs by using an AI-based resource allocation calculation algorithm to address the overloading network problem. Through separating services of each UAV, the UAV hierarchical cluster system performs the main function of reducing the network load and transferring user requests, with three sub-tasks including data collection, communication channel organization, and data relaying. In this cluster, a head node and a vice head node UAV are selected considering the Central Processing Unit (CPU), operational (RAM), and permanent (ROM) memory of devices, battery charge, and capacity. The vice head node acts as a backup that stores all the data in the head node. The k-means clustering algorithm is used in order to detect high load regions and form the UAV layered clusters. The whole process of detecting high load areas, forming and selecting UAV clusters, and moving the selected UAV cluster to that area is proposed as offloading traffic algorithm.

Keywords: k-means, resource allocation, SDN, UAV network, unmanned aerial vehicles

Procedia PDF Downloads 75
2045 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) in WLAN networks has been a static channel assignment process initiated by the user during the deployment process of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterward. However, the dramatically growing number of Wi-Fi APs and stations operating in the unlicensed band has led to dynamic, distributed, and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms which consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, due to the high overhead, can cause the eventually selected channel to no longer be optimal for operation due to the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind the cluster-based station reporting is the observation that STAs which are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel while causing high overhead, the AP divides STAs into clusters then assigns each STA in each cluster one channel to report feedback on. With the proper design of the cluster based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and, at times, better performance with a fraction of the overhead. We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.

Keywords: channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead

Procedia PDF Downloads 86
2044 The Development and Validation of the Awareness to Disaster Risk Reduction Questionnaire for Teachers

Authors: Ian Phil Canlas, Mageswary Karpudewan, Joyce Magtolis, Rosario Canlas

Abstract:

This study reported the development and validation of the Awareness to Disaster Risk Reduction Questionnaire for Teachers (ADRRQT). The questionnaire is a combination of Likert scale and open-ended questions that were grouped into two parts. The first part included questions relating to the general awareness on disaster risk reduction. Whereas, the second part comprised questions regarding the integration of disaster risk reduction in the teaching process. The entire process of developing and validating of the ADRRQT was described in this study. Statistical and qualitative findings revealed that the ADRRQT is significantly valid and reliable and has the potential of measuring awareness to disaster risk reduction of stakeholders in the field of teaching. Moreover, it also shows the potential to be adopted in other fields.

Keywords: awareness, development, disaster risk reduction, questionnaire, validation

Procedia PDF Downloads 190
2043 Analysis Customer Loyalty Characteristic and Segmentation Analysis in Mobile Phone Category in Indonesia

Authors: A. B. Robert, Adam Pramadia, Calvin Andika

Abstract:

The main purpose of this study is to explore consumer loyalty characteristic of mobile phone category in Indonesia. Second, this research attempts to identify consumer segment and to explore their profile in each segment as the basis of marketing strategy formulation. This study used some tools of multivariate analysis such as discriminant analysis and cluster analysis. Discriminate analysis used to discriminate consumer loyal and not loyal by using particular variables. Cluster analysis used to reveal various segment in mobile phone category. In addition to having better customer understanding in each segment, this study used descriptive analysis and cross tab analysis in each segment defined by cluster analysis. This study expected several findings. First, consumer can be divided into two large group of loyal versus not loyal by set of variables. Second, this study identifies customer segment in mobile phone category. Third, exploring customer profile in each segment that has been identified. This study answer a call for additional empirical research into different product categories. Therefore, a replication research is advisable. By knowing the customer loyalty characteristic, and deep analysis of their consumption behavior and profile for each segment, this study is very advisable for high impact marketing strategy development. This study contributes body of knowledge by adding empirical study of consumer loyalty, segmentation analysis in mobile phone category by multiple brand analysis.

Keywords: customer loyalty, segmentation, marketing strategy, discriminant analysis, cluster analysis, mobile phone

Procedia PDF Downloads 564
2042 A Segmentation Method for Grayscale Images Based on the Firefly Algorithm and the Gaussian Mixture Model

Authors: Donatella Giuliani

Abstract:

In this research, we propose an unsupervised grayscale image segmentation method based on a combination of the Firefly Algorithm and the Gaussian Mixture Model. Firstly, the Firefly Algorithm has been applied in a histogram-based research of cluster means. The Firefly Algorithm is a stochastic global optimization technique, centered on the flashing characteristics of fireflies. In this context it has been performed to determine the number of clusters and the related cluster means in a histogram-based segmentation approach. Successively these means are used in the initialization step for the parameter estimation of a Gaussian Mixture Model. The parametric probability density function of a Gaussian Mixture Model is represented as a weighted sum of Gaussian component densities, whose parameters are evaluated applying the iterative Expectation-Maximization technique. The coefficients of the linear super-position of Gaussians can be thought as prior probabilities of each component. Applying the Bayes rule, the posterior probabilities of the grayscale intensities have been evaluated, therefore their maxima are used to assign each pixel to the clusters, according to their gray-level values. The proposed approach appears fairly solid and reliable when applied even to complex grayscale images. The validation has been performed by using different standard measures, more precisely: the Root Mean Square Error (RMSE), the Structural Content (SC), the Normalized Correlation Coefficient (NK) and the Davies-Bouldin (DB) index. The achieved results have strongly confirmed the robustness of this gray scale segmentation method based on a metaheuristic algorithm. Another noteworthy advantage of this methodology is due to the use of maxima of responsibilities for the pixel assignment that implies a consistent reduction of the computational costs.

Keywords: clustering images, firefly algorithm, Gaussian mixture model, meta heuristic algorithm, image segmentation

Procedia PDF Downloads 189
2041 The Network Relative Model Accuracy (NeRMA) Score: A Method to Quantify the Accuracy of Prediction Models in a Concurrent External Validation

Authors: Carl van Walraven, Meltem Tuna

Abstract:

Background: Network meta-analysis (NMA) quantifies the relative efficacy of 3 or more interventions from studies containing a subgroup of interventions. This study applied the analytical approach of NMA to quantify the relative accuracy of prediction models with distinct inclusion criteria that are evaluated on a common population (‘concurrent external validation’). Methods: We simulated binary events in 5000 patients using a known risk function. We biased the risk function and modified its precision by pre-specified amounts to create 15 prediction models with varying accuracy and distinct patient applicability. Prediction model accuracy was measured using the Scaled Brier Score (SBS). Overall prediction model accuracy was measured using fixed-effects methods that accounted for model applicability patterns. Prediction model accuracy was summarized as the Network Relative Model Accuracy (NeRMA) Score which ranges from -∞ through 0 (accuracy of random guessing) to 1 (accuracy of most accurate model in concurrent external validation). Results: The unbiased prediction model had the highest SBS. The NeRMA score correctly ranked all simulated prediction models by the extent of bias from the known risk function. A SAS macro and R-function was created to implement the NeRMA Score. Conclusions: The NeRMA Score makes it possible to quantify the accuracy of binomial prediction models having distinct inclusion criteria in a concurrent external validation.

Keywords: prediction model accuracy, scaled brier score, fixed effects methods, concurrent external validation

Procedia PDF Downloads 180
2040 A Spatial Approach to Model Mortality Rates

Authors: Yin-Yee Leong, Jack C. Yue, Hsin-Chung Wang

Abstract:

Human longevity has been experiencing its largest increase since the end of World War II, and modeling the mortality rates is therefore often the focus of many studies. Among all mortality models, the Lee–Carter model is the most popular approach since it is fairly easy to use and has good accuracy in predicting mortality rates (e.g., for Japan and the USA). However, empirical studies from several countries have shown that the age parameters of the Lee–Carter model are not constant in time. Many modifications of the Lee–Carter model have been proposed to deal with this problem, including adding an extra cohort effect and adding another period effect. In this study, we propose a spatial modification and use clusters to explain why the age parameters of the Lee–Carter model are not constant. In spatial analysis, clusters are areas with unusually high or low mortality rates than their neighbors, where the “location” of mortality rates is measured by age and time, that is, a 2-dimensional coordinate. We use a popular cluster detection method—Spatial scan statistics, a local statistical test based on the likelihood ratio test to evaluate where there are locations with mortality rates that cannot be described well by the Lee–Carter model. We first use computer simulation to demonstrate that the cluster effect is a possible source causing the problem of the age parameters not being constant. Next, we show that adding the cluster effect can solve the non-constant problem. We also apply the proposed approach to mortality data from Japan, France, the USA, and Taiwan. The empirical results show that our approach has better-fitting results and smaller mean absolute percentage errors than the Lee–Carter model.

Keywords: mortality improvement, Lee–Carter model, spatial statistics, cluster detection

Procedia PDF Downloads 145
2039 Developing and Evaluating Clinical Risk Prediction Models for Coronary Artery Bypass Graft Surgery

Authors: Mohammadreza Mohebbi, Masoumeh Sanagou

Abstract:

The ability to predict clinical outcomes is of great importance to physicians and clinicians. A number of different methods have been used in an effort to accurately predict these outcomes. These methods include the development of scoring systems based on multivariate statistical modelling, and models involving the use of classification and regression trees. The process usually consists of two consecutive phases, namely model development and external validation. The model development phase consists of building a multivariate model and evaluating its predictive performance by examining calibration and discrimination, and internal validation. External validation tests the predictive performance of a model by assessing its calibration and discrimination in different but plausibly related patients. A motivate example focuses on prediction modeling using a sample of patients undergone coronary artery bypass graft (CABG) has been used for illustrative purpose and a set of primary considerations for evaluating prediction model studies using specific quality indicators as criteria to help stakeholders evaluate the quality of a prediction model study has been proposed.

Keywords: clinical prediction models, clinical decision rule, prognosis, external validation, model calibration, biostatistics

Procedia PDF Downloads 267
2038 Experimental Validation of Computational Fluid Dynamics Used for Pharyngeal Flow Patterns during Obstructive Sleep Apnea

Authors: Pragathi Gurumurthy, Christina Hagen, Patricia Ulloa, Martin A. Koch, Thorsten M. Buzug

Abstract:

Obstructive sleep apnea (OSA) is a sleep disorder where the patient suffers a disturbed airflow during sleep due to partial or complete occlusion of the pharyngeal airway. Recently, numerical simulations have been used to better understand the mechanism of pharyngeal collapse. However, to gain confidence in the solutions so obtained, an experimental validation is required. Therefore, in this study an experimental validation of computational fluid dynamics (CFD) used for the study of human pharyngeal flow patterns during OSA is performed. A stationary incompressible Navier-Stokes equation solved using the finite element method was used to numerically study the flow patterns in a computed tomography-based human pharynx model. The inlet flow rate was set to 250 ml/s and such that a flat profile was maintained at the inlet. The outlet pressure was set to 0 Pa. The experimental technique used for the validation of CFD of fluid flow patterns is phase contrast-MRI (PC-MRI). Using the same computed tomography data of the human pharynx as in the simulations, a phantom for the experiment was 3 D printed. Glycerol (55.27% weight) in water was used as a test fluid at 25°C. Inflow conditions similar to the CFD study were simulated using an MRI compatible flow pump (CardioFlow-5000MR, Shelley Medical Imaging Technologies). The entire experiment was done on a 3 T MR system (Ingenia, Philips) with 108 channel body coil using an RF-spoiled, gradient echo sequence. A comparison of the axial velocity obtained in the pharynx from the numerical simulations and PC-MRI shows good agreement. The region of jet impingement and recirculation also coincide, therefore validating the numerical simulations. Hence, the experimental validation proves the reliability and correctness of the numerical simulations.

Keywords: computational fluid dynamics, experimental validation, phase contrast-MRI, obstructive sleep apnea

Procedia PDF Downloads 286
2037 Validation of the X-Ray Densitometry Method for Radial Density Pattern Determination of Acacia seyal var. seyal Tree Species

Authors: Hanadi Mohamed Shawgi Gamal, Claus Thomas Bues

Abstract:

Wood density is a variable influencing many of the technological and quality properties of wood. Understanding the pattern of wood density radial variation is important for its end-use. The X-ray technique, traditionally applied to softwood species to assess the wood quality properties, due to its simple and relatively uniform wood structure. On the other hand, very limited information is available about the validation of using this technique for hardwood species. The suitability of using the X-ray technique for the determination of hardwood density has a special significance in countries like Sudan, where only a few timbers are well known. This will not only save the time consumed by using the traditional methods, but it will also enhance the investigations of the great number of the lesser known species, the thing which will fill the huge cap of lake information of hardwood species growing in Sudan. The current study aimed to evaluate the validation of using the X-ray densitometry technique to determine the radial variation of wood density of Acacia seyal var. seyal. To this, a total of thirty trees were collected randomly from four states in Sudan. The wood density radial trend was determined using the basic density as well as density obtained by the X-ray densitometry method in order to assess the validation of X-ray technique in wood density radial variation determination. The results showed that the pattern of radial trend of density obtained by X-ray technique is very similar to that achieved by basic density. These results confirmed the validation of using the X-ray technique for Acacia seyal var. seyal density radial trend determination. It also promotes the suitability of using this method in other hardwood species.

Keywords: x-ray densitometry, wood density, Acacia seyal var. seyal, radial variation

Procedia PDF Downloads 115
2036 Chemometric QSRR Evaluation of Behavior of s-Triazine Pesticides in Liquid Chromatography

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

This study considers the selection of the most suitable in silico molecular descriptors that could be used for s-triazine pesticides characterization. Suitable descriptors among topological, geometrical and physicochemical are used for quantitative structure-retention relationships (QSRR) model establishment. Established models were obtained using linear regression (LR) and multiple linear regression (MLR) analysis. In this paper, MLR models were established avoiding multicollinearity among the selected molecular descriptors. Statistical quality of established models was evaluated by standard and cross-validation statistical parameters. For detection of similarity or dissimilarity among investigated s-triazine pesticides and their classification, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used and gave similar grouping. This study is financially supported by COST action TD1305.

Keywords: chemometrics, classification analysis, molecular descriptors, pesticides, regression analysis

Procedia PDF Downloads 356
2035 Development of a Predictive Model to Prevent Financial Crisis

Authors: Tengqin Han

Abstract:

Delinquency has been a crucial factor in economics throughout the years. Commonly seen in credit card and mortgage, it played one of the crucial roles in causing the most recent financial crisis in 2008. In each case, a delinquency is a sign of the loaner being unable to pay off the debt, and thus may cause a lost of property in the end. Individually, one case of delinquency seems unimportant compared to the entire credit system. China, as an emerging economic entity, the national strength and economic strength has grown rapidly, and the gross domestic product (GDP) growth rate has remained as high as 8% in the past decades. However, potential risks exist behind the appearance of prosperity. Among the risks, the credit system is the most significant one. Due to long term and a large amount of balance of the mortgage, it is critical to monitor the risk during the performance period. In this project, about 300,000 mortgage account data are analyzed in order to develop a predictive model to predict the probability of delinquency. Through univariate analysis, the data is cleaned up, and through bivariate analysis, the variables with strong predictive power are detected. The project is divided into two parts. In the first part, the analysis data of 2005 are split into 2 parts, 60% for model development, and 40% for in-time model validation. The KS of model development is 31, and the KS for in-time validation is 31, indicating the model is stable. In addition, the model is further validation by out-of-time validation, which uses 40% of 2006 data, and KS is 33. This indicates the model is still stable and robust. In the second part, the model is improved by the addition of macroeconomic economic indexes, including GDP, consumer price index, unemployment rate, inflation rate, etc. The data of 2005 to 2010 is used for model development and validation. Compared with the base model (without microeconomic variables), KS is increased from 41 to 44, indicating that the macroeconomic variables can be used to improve the separation power of the model, and make the prediction more accurate.

Keywords: delinquency, mortgage, model development, model validation

Procedia PDF Downloads 198
2034 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images

Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat

Abstract:

The 2D image segmentation is a significant process in finding a suitable region in medical images such as MRI, PET, CT etc. In this study, we have focused on 2D MRI images for image segmentation process. We have designed a GUI (graphical user interface) written in MATLABTM for 2D MRI images. In this program, there are two different interfaces including data pre-processing and image clustering or segmentation. In the data pre-processing section, there are median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter that is designed by user in MATLAB). As for the image clustering, there are seven different image segmentations for 2D MR images. These image segmentation algorithms are as follows: PSO (particle swarm optimization), GA (genetic algorithm), Lloyds algorithm, k-means, the combination of Lloyds and k-means, mean shift clustering, and finally BBO (Biogeography Based Optimization). To find the suitable cluster number in 2D MRI, we have designed the histogram based cluster estimation method and then applied to these numbers to image segmentation algorithms to cluster an image automatically. Also, we have selected the best hybrid method for each 2D MR images thanks to this GUI software.

Keywords: image segmentation, clustering, GUI, 2D MRI

Procedia PDF Downloads 348
2033 Analysis Of Non-uniform Characteristics Of Small Underwater Targets Based On Clustering

Authors: Tianyang Xu

Abstract:

Small underwater targets generally have a non-centrosymmetric geometry, and the acoustic scattering field of the target has spatial inhomogeneity under active sonar detection conditions. In view of the above problems, this paper takes the hemispherical cylindrical shell as the research object, and considers the angle continuity implied in the echo characteristics, and proposes a cluster-driven research method for the non-uniform characteristics of target echo angle. First, the target echo features are extracted, and feature vectors are constructed. Secondly, the t-SNE algorithm is used to improve the internal connection of the feature vector in the low-dimensional feature space and to construct the visual feature space. Finally, the implicit angular relationship between echo features is extracted under unsupervised condition by cluster analysis. The reconstruction results of the local geometric structure of the target corresponding to different categories show that the method can effectively divide the angle interval of the local structure of the target according to the natural acoustic scattering characteristics of the target.

Keywords: underwater target;, non-uniform characteristics;, cluster-driven method;, acoustic scattering characteristics

Procedia PDF Downloads 90
2032 A Concept of Data Mining with XML Document

Authors: Akshay Agrawal, Anand K. Srivastava

Abstract:

The increasing amount of XML datasets available to casual users increases the necessity of investigating techniques to extract knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi structured data. In recent years due to the importance of managing these resources and extracting knowledge from them, lots of methods have been proposed in order to represent and cluster them in different ways.

Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering

Procedia PDF Downloads 347
2031 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach

Authors: Dongkwon Han, Sangho Kim, Sunil Kwon

Abstract:

Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, the key challenges in unconventional gas have been the requirement that applies to advanced approaches for production forecasting due to uncertainty and complexity of fluid flow. In this study, artificial neural network (ANN) model which integrates machine learning and data-driven approach was developed to predict productivity in shale gas. The database of 129 wells of Eagle Ford shale basin used for testing and training of the ANN model. The Input data related to hydraulic fracturing, well completion and productivity of shale gas were selected and the output data is a cumulative production. The performance of the ANN using all data sets, clustering and variables importance (VI) models were compared in the mean absolute percentage error (MAPE). ANN model using all data sets, clustering, and VI were obtained as 44.22%, 10.08% (cluster 1), 5.26% (cluster 2), 6.35%(cluster 3), and 32.23% (ANN VI), 23.19% (SVM VI), respectively. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.

Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance

Procedia PDF Downloads 172
2030 Implementation of Algorithm K-Means for Grouping District/City in Central Java Based on Macro Economic Indicators

Authors: Nur Aziza Luxfiati

Abstract:

Clustering is partitioning data sets into sub-sets or groups in such a way that elements certain properties have shared property settings with a high level of similarity within one group and a low level of similarity between groups. . The K-Means algorithm is one of thealgorithmsclustering as a grouping tool that is most widely used in scientific and industrial applications because the basic idea of the kalgorithm is-means very simple. In this research, applying the technique of clustering using the k-means algorithm as a method of solving the problem of national development imbalances between regions in Central Java Province based on macroeconomic indicators. The data sample used is secondary data obtained from the Central Java Provincial Statistics Agency regarding macroeconomic indicator data which is part of the publication of the 2019 National Socio-Economic Survey (Susenas) data. score and determine the number of clusters (k) using the elbow method. After the clustering process is carried out, the validation is tested using themethodsBetween-Class Variation (BCV) and Within-Class Variation (WCV). The results showed that detection outlier using z-score normalization showed no outliers. In addition, the results of the clustering test obtained a ratio value that was not high, namely 0.011%. There are two district/city clusters in Central Java Province which have economic similarities based on the variables used, namely the first cluster with a high economic level consisting of 13 districts/cities and theclustersecondwith a low economic level consisting of 22 districts/cities. And in the cluster second, namely, between low economies, the authors grouped districts/cities based on similarities to macroeconomic indicators such as 20 districts of Gross Regional Domestic Product, with a Poverty Depth Index of 19 districts, with 5 districts in Human Development, and as many as Open Unemployment Rate. 10 districts.

Keywords: clustering, K-Means algorithm, macroeconomic indicators, inequality, national development

Procedia PDF Downloads 131
2029 Development and Validation of a HPLC Method for Standardization of Methanolic Extract of Hypericum sinaicum Hochst

Authors: Taghreed A. Ibrahim, Atef A. El-Hela, Hala M. El-Hefnawy

Abstract:

The chromatographic profile of methanol extract of Hypericum sinaicum was determined using HPLC-DAD. Apigenin was used as an external standard in the development and validation of the HPLC method. The proposed method is simple, rapid and reliable and can be successfully applied for standardization of Hypericum sinaicum methanol extract.

Keywords: quality control, standardization, falvonoids, methanol extract

Procedia PDF Downloads 473
2028 Multi-Cluster Overlapping K-Means Extension Algorithm (MCOKE)

Authors: Said Baadel, Fadi Thabtah, Joan Lu

Abstract:

Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to one cluster. In this paper, we propose an overlapping algorithm MCOKE which allows objects to belong to one or more clusters. The algorithm is different from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm is also different from other overlapping algorithms that require a similarity threshold to be defined as a priority which can be difficult to determine by novice users.

Keywords: data mining, k-means, MCOKE, overlapping

Procedia PDF Downloads 536
2027 Development of a Decision-Making Method by Using Machine Learning Algorithms in the Early Stage of School Building Design

Authors: Pegah Eshraghi, Zahra Sadat Zomorodian, Mohammad Tahsildoost

Abstract:

Over the past decade, energy consumption in educational buildings has steadily increased. The purpose of this research is to provide a method to quickly predict the energy consumption of buildings using separate evaluation of zones and decomposing the building to eliminate the complexity of geometry at the early design stage. To produce this framework, machine learning algorithms such as Support vector regression (SVR) and Artificial neural network (ANN) are used to predict energy consumption and thermal comfort metrics in a school as a case. The database consists of more than 55000 samples in three climates of Iran. Cross-validation evaluation and unseen data have been used for validation. In a specific label, cooling energy, it can be said the accuracy of prediction is at least 84% and 89% in SVR and ANN, respectively. The results show that the SVR performed much better than the ANN.

Keywords: early stage of design, energy, thermal comfort, validation, machine learning

Procedia PDF Downloads 49
2026 Validation of Mapping Historical Linked Data to International Committee for Documentation (CIDOC) Conceptual Reference Model Using Shapes Constraint Language

Authors: Ghazal Faraj, András Micsik

Abstract:

Shapes Constraint Language (SHACL), a World Wide Web Consortium (W3C) language, provides well-defined shapes and RDF graphs, named "shape graphs". These shape graphs validate other resource description framework (RDF) graphs which are called "data graphs". The structural features of SHACL permit generating a variety of conditions to evaluate string matching patterns, value type, and other constraints. Moreover, the framework of SHACL supports high-level validation by expressing more complex conditions in languages such as SPARQL protocol and RDF Query Language (SPARQL). SHACL includes two parts: SHACL Core and SHACL-SPARQL. SHACL Core includes all shapes that cover the most frequent constraint components. While SHACL-SPARQL is an extension that allows SHACL to express more complex customized constraints. Validating the efficacy of dataset mapping is an essential component of reconciled data mechanisms, as the enhancement of different datasets linking is a sustainable process. The conventional validation methods are the semantic reasoner and SPARQL queries. The former checks formalization errors and data type inconsistency, while the latter validates the data contradiction. After executing SPARQL queries, the retrieved information needs to be checked manually by an expert. However, this methodology is time-consuming and inaccurate as it does not test the mapping model comprehensively. Therefore, there is a serious need to expose a new methodology that covers the entire validation aspects for linking and mapping diverse datasets. Our goal is to conduct a new approach to achieve optimal validation outcomes. The first step towards this goal is implementing SHACL to validate the mapping between the International Committee for Documentation (CIDOC) conceptual reference model (CRM) and one of its ontologies. To initiate this project successfully, a thorough understanding of both source and target ontologies was required. Subsequently, the proper environment to run SHACL and its shape graphs were determined. As a case study, we performed SHACL over a CIDOC-CRM dataset after running a Pellet reasoner via the Protégé program. The applied validation falls under multiple categories: a) data type validation which constrains whether the source data is mapped to the correct data type. For instance, checking whether a birthdate is assigned to xsd:datetime and linked to Person entity via crm:P82a_begin_of_the_begin property. b) Data integrity validation which detects inconsistent data. For instance, inspecting whether a person's birthdate occurred before any of the linked event creation dates. The expected results of our work are: 1) highlighting validation techniques and categories, 2) selecting the most suitable techniques for those various categories of validation tasks. The next plan is to establish a comprehensive validation model and generate SHACL shapes automatically.

Keywords: SHACL, CIDOC-CRM, SPARQL, validation of ontology mapping

Procedia PDF Downloads 227
2025 Routing Protocol in Ship Dynamic Positioning Based on WSN Clustering Data Fusion System

Authors: Zhou Mo, Dennis Chow

Abstract:

In the dynamic positioning system (DPS) for vessels, the reliable information transmission between each note basically relies on the wireless protocols. From the perspective of cluster-based routing protocols for wireless sensor networks, the data fusion technology based on the sleep scheduling mechanism and remaining energy in network layer is proposed, which applies the sleep scheduling mechanism to the routing protocols, considering the remaining energy of node and location information when selecting cluster-head. The problem of uneven distribution of nodes in each cluster is solved by the Equilibrium. At the same time, Classified Forwarding Mechanism as well as Redelivery Policy strategy is adopted to avoid congestion in the transmission of huge amount of data, reduce the delay in data delivery and enhance the real-time response. In this paper, a simulation test is conducted to improve the routing protocols, which turn out to reduce the energy consumption of nodes and increase the efficiency of data delivery.

Keywords: DPS for vessel, wireless sensor network, data fusion, routing protocols

Procedia PDF Downloads 495
2024 Synthesis, Characterization, Validation of Resistant Microbial Strains and Anti Microbrial Activity of Substitted Pyrazoles

Authors: Rama Devi Kyatham, D. Ashok, K. S. K. Rao Patnaik, Raju Bathula

Abstract:

We have shown the importance of pyrazoles as anti-microbial chemical entities. These compounds have generally been considered significant due to their wide range of pharmacological acivities and their discovery motivates new avenues of research.The proposed pyrazoles were synthesized and evaluated for their anti-microbial activities. The Synthesized compounds were analyzed by different spectroscopic methods.

Keywords: pyrazoles, validation, resistant microbial strains, anti-microbial activities

Procedia PDF Downloads 136
2023 Therapeutic Journey towards Self: Developing Positivity with Indications of Cluster B and C Personality Traits

Authors: Shweta Jha, Nandita Chaube

Abstract:

The concept of self has a major role to play in the study of personality which drives the current study in its present form. This is a case of Miss S, a 17-year-old Hindu, currently in eleventh standard, with no family history of mental illness but with a past history of inability to manage relationships, multiple emotional and sexual relationships, repeated self harming behaviour, and sexual abuse over a period of 2 months at the age of 10 years. She comes with a psychiatric history of one episode of dissociative fall followed by a stressful event which left the patient with many psychological disturbances matching the criterion of Cluster B and C traits. Current episode precipitated due to the relationship failure, predisposing factor is her personality traits, and poor social and family support. Considering the patient’s aspiration for positivity and demand of the therapy, ventilation sessions were carried out which made her capable of understanding and dealing with her negative emotions, also strengthened mother child bond, helped her maintain meaningful and healthy relationships, also helped her increase her problem solving ability and adaptive coping skills making her feel more positive and acceptable towards herself, family members and others.

Keywords: cluster B and C traits, personality, therapy, self

Procedia PDF Downloads 263
2022 Molecular Identification and Genotyping of Human Brucella Strains Isolated in Kuwait

Authors: Abu Salim Mustafa

Abstract:

Brucellosis is a zoonotic disease endemic in Kuwait. Human brucellosis can be caused by several Brucella species with Brucella melitensis causing the most severe and Brucella abortus the least severe disease. Furthermore, relapses are common after successful chemotherapy of patients. The classical biochemical methods of culture and serology for identification of Brucellae provide information about the species and serotypes only. However, to differentiate between relapse and reinfection/epidemiological investigations, the identification of genotypes using molecular methods is essential. In this study, four molecular methods [16S rRNA gene sequencing, real-time PCR, enterobacterial repetitive intergenic consensus (ERIC)-PCR and multilocus variable-number tandem-repeat analysis (MLVA)-16] were evaluated for the identification and typing of 75 strains of Brucella isolated in Kuwait. The 16S rRNA gene sequencing suggested that all the strains were B. melitensis and real-time PCR confirmed their species identity as B. melitensis. The ERIC-PCR band profiles produced a dendrogram of 75 branches suggesting each strain to be of a unique type. The cluster classification, based on ~ 80% similarity, divided all the ERIC genotypes into two clusters, A and B. Cluster A consisted of 9 ERIC genotypes (A1-A9) corresponding to 9 individual strains. Cluster B comprised of 13 ERIC genotypes (B1-B13) with B5 forming the largest cluster of 51 strains. MLVA-16 identified all isolates as B. melitensis and divided them into 71 MLVA-types. The cluster analysis of MLVA-16-types suggested that most of the strains in Kuwait originated from the East Mediterranean Region, a few from the African group and one new genotype closely matched with the West Mediterranean region. In conclusion, this work demonstrates that B. melitensis, the most pathogenic species of Brucella, is prevalent in Kuwait. Furthermore, MLVA-16 is the best molecular method, which can identify the Brucella species and genotypes as well as determine their origin in the global context. Supported by Kuwait University Research Sector grants MI04/15 and SRUL02/13.

Keywords: Brucella, ERIC-PCR, MLVA-16, RT-PCR, 16S rRNA gene sequencing

Procedia PDF Downloads 346
2021 Research on Routing Protocol in Ship Dynamic Positioning Based on WSN Clustering Data Fusion System

Authors: Zhou Mo, Dennis Chow

Abstract:

In the dynamic positioning system (DPS) for vessels, the reliable information transmission between each note basically relies on the wireless protocols. From the perspective of cluster-based routing pro-tocols for wireless sensor networks, the data fusion technology based on the sleep scheduling mechanism and remaining energy in network layer is proposed, which applies the sleep scheduling mechanism to the routing protocols, considering the remaining energy of node and location information when selecting cluster-head. The problem of uneven distribution of nodes in each cluster is solved by the Equilibrium. At the same time, Classified Forwarding Mechanism as well as Redelivery Policy strategy is adopted to avoid congestion in the transmission of huge amount of data, reduce the delay in data delivery and enhance the real-time response. In this paper, a simulation test is conducted to improve the routing protocols, which turns out to reduce the energy consumption of nodes and increase the efficiency of data delivery.

Keywords: DPS for vessel, wireless sensor network, data fusion, routing protocols

Procedia PDF Downloads 423
2020 Assessing Functional Structure in European Marine Ecosystems Using a Vector-Autoregressive Spatio-Temporal Model

Authors: Katyana A. Vert-Pre, James T. Thorson, Thomas Trancart, Eric Feunteun

Abstract:

In marine ecosystems, spatial and temporal species structure is an important component of ecosystems’ response to anthropological and environmental factors. Although spatial distribution patterns and fish temporal series of abundance have been studied in the past, little research has been allocated to the joint dynamic spatio-temporal functional patterns in marine ecosystems and their use in multispecies management and conservation. Each species represents a function to the ecosystem, and the distribution of these species might not be random. A heterogeneous functional distribution will lead to a more resilient ecosystem to external factors. Applying a Vector-Autoregressive Spatio-Temporal (VAST) model for count data, we estimate the spatio-temporal distribution, shift in time, and abundance of 140 species of the Eastern English Chanel, Bay of Biscay and Mediterranean Sea. From the model outputs, we determined spatio-temporal clusters, calculating p-values for hierarchical clustering via multiscale bootstrap resampling. Then, we designed a functional map given the defined cluster. We found that the species distribution within the ecosystem was not random. Indeed, species evolved in space and time in clusters. Moreover, these clusters remained similar over time deriving from the fact that species of a same cluster often shifted in sync, keeping the overall structure of the ecosystem similar overtime. Knowing the co-existing species within these clusters could help with predicting data-poor species distribution and abundance. Further analysis is being performed to assess the ecological functions represented in each cluster.

Keywords: cluster distribution shift, European marine ecosystems, functional distribution, spatio-temporal model

Procedia PDF Downloads 166