Search results for: Average Linkage Clustering (ALC)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5374

5224 Genetic Diversity in Capsicum Germplasm Based on Inter Simple Sequence Repeat Markers

Authors: Siwapech Silapaprayoon, Januluk Khanobdee, Sompid Samipak

Abstract:

Chili peppers are the fruits of Capsicum plants, well known for the fiery burning sensation they leave on the tongue. They are members of the Solanaceae, or nightshade, family, along with potato, tomato and eggplant. Thai cuisine has gained popularity for its distinct flavors, which come from its use of various spices and the heat added by chili pepper. Although only a small quantity is used in each dish, chili pepper holds a special place in Thai cuisine. There are many varieties of chili pepper in Thailand, and thirty accessions were collected at Rajamangala University of Technology Lanna, Lampang, Thailand. To manage any germplasm effectively, it is essential to know the diversity of and relationships among its members. Thirty-six Inter Simple Sequence Repeat (ISSR) DNA markers were used to analyze the germplasm. A total of 335 polymorphic bands were obtained, giving an average of 9.3 alleles per marker. Unweighted pair group method with arithmetic mean (UPGMA) clustering of the data using NTSYS-pc software indicated that the accessions showed varied levels of genetic similarity, with similarity coefficients ranging from 0.57 to 1.00, indicating significant levels of variation. At a simple matching (SM) coefficient of 0.81, the germplasm was separated into four groups. Phenotypic variation was discussed in the context of the phylogenetic tree clustering.
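
The UPGMA step described in this abstract can be illustrated with a minimal sketch. It assumes each accession is scored as a 0/1 band-presence profile and that similarity is the simple matching coefficient; the toy profiles and the function names simple_matching and upgma_groups are hypothetical and are not taken from the paper or from NTSYS-pc.

```python
from itertools import combinations

def simple_matching(a, b):
    """Fraction of marker positions where two 0/1 band profiles agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def upgma_groups(profiles, cutoff):
    """Average-linkage merging until no pair of clusters reaches `cutoff`."""
    clusters = [[i] for i in range(len(profiles))]
    sim = {(i, j): simple_matching(profiles[i], profiles[j])
           for i, j in combinations(range(len(profiles)), 2)}

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean similarity over all cross-cluster pairs
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        (a, b), best = max(
            (((a, b), avg_sim(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda t: t[1])
        if best < cutoff:
            break  # remaining clusters are less similar than the cutoff
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical band-presence data (1 = band present), not from the paper
toy = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
]
groups = upgma_groups(toy, cutoff=0.81)
print(groups)  # -> [[0, 1], [2, 3], [4]]
```

Cutting at a similarity threshold, rather than asking for a fixed number of clusters, mirrors how the abstract reports groups "at" a coefficient of 0.81.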

Keywords: diversity, germplasm, Chili pepper, ISSR

Procedia PDF Downloads 130
5223 Conjunctive Management of Surface and Groundwater Resources under Uncertainty: A Retrospective Optimization Approach

Authors: Julius M. Ndambuki, Gislar E. Kifanyi, Samuel N. Odai, Charles Gyamfi

Abstract:

Conjunctive management of surface and groundwater resources is a challenging task because of the spatial and temporal variability of the hydrology and hydrogeology of the water storage systems. Surface water-groundwater hydrogeology is highly uncertain; it is therefore imperative that this uncertainty be explicitly accounted for when managing water resources. Various methodologies have been developed and applied by researchers in an attempt to account for the uncertainty. For example, simulation-optimization models are often used for conjunctive water resources management. However, direct application of such an approach, in which all realizations are considered at each iteration of the optimization process, leads to a very expensive optimization in terms of computational time, particularly when the number of realizations is large. The aim of this paper, therefore, is to introduce and apply an efficient approach, referred to as Retrospective Optimization Approximation (ROA), that can be used to optimize the conjunctive use of surface water and groundwater over multiple hydrogeological model simulations. This work is based on a stochastic simulation-optimization framework using the recently emerged technique of sample average approximation (SAA), a sampling-based method implemented within the ROA approach. The ROA approach solves and evaluates a sequence of generated optimization sub-problems with an increasing number of realizations (sample size). A response matrix technique was used to link the simulation model with the optimization procedure. The k-means clustering sampling technique was used to map the realizations. The methodology is demonstrated through application to a hypothetical example. In the example, the generated optimization sub-problems were solved and analysed using the “Active-Set” core optimizer implemented in the MATLAB 2014a environment.
Through the k-means clustering sampling technique, the ROA – Active Set procedure was able to arrive at a (nearly) converged maximum expected total optimal conjunctive water use withdrawal rate within relatively few iterations (6 to 7). Results indicate that the ROA approach is a promising technique for optimizing conjunctive use of surface water and groundwater withdrawal rates under hydrogeological uncertainty.

Keywords: conjunctive water management, retrospective optimization approximation approach, sample average approximation, uncertainty

Procedia PDF Downloads 209
5222 Development of Border Trade of Thailand-Myanmar: Case Study of Ranong Province

Authors: Sakapas Saengchai

Abstract:

This research aims to study and analyze the expansion of Thai-Myanmar border trade linkages and ways of developing Thai-Myanmar border trade, with a view to competitive advantage in the ASEAN Community. Data were collected through observation, in-depth interviews, group discussions and exchanges of opinion with public agencies, entrepreneurs and local people. The study found that the main directions for developing border trade are: 1) cross-border services: land, telecommunication and sea infrastructure should be developed to support the cross-border trade economy; 2) international consumption services: services should be expanded with Myanmar and India to link entrepreneurs and international trade to Thailand; 3) business establishment for service provision: logistics cooperation via the Andaman coast of Thailand should be developed; and 4) mobility of personnel: personnel, including labor, should be exchanged to develop the potential of border trade and its competitive advantage.

Keywords: border trade, development, service, ASEAN

Procedia PDF Downloads 300
5221 A 5G Architecture Based to Dynamic Vehicular Clustering Enhancing VoD Services Over Vehicular Ad hoc Networks

Authors: Lamaa Sellami, Bechir Alaya

Abstract:

Nowadays, video-on-demand (VoD) applications are becoming one of the main trends among vehicular network users. In this paper, considering the unpredictable vehicle density, the unexpected acceleration or deceleration of the different cars included in the vehicular traffic load, and the limited radio range of the employed communication scheme, we introduce the “Dynamic Vehicular Clustering” (DVC) algorithm as a new scheme for video streaming systems over vehicular ad hoc networks (VANETs). The proposed algorithm takes advantage of the concept of small cells and the introduction of wireless backhauls, inspired by the features and performance of the Long Term Evolution (LTE)-Advanced network. The proposed clustering algorithm considers multiple characteristics, such as the vehicle’s position and acceleration, to reduce latency and packet loss. Each cluster is therefore treated as a small cell containing vehicular nodes and an access point that is elected according to particular specifications.

Keywords: video-on-demand, vehicular ad-hoc network, mobility, vehicular traffic load, small cell, wireless backhaul, LTE-advanced, latency, packet loss

Procedia PDF Downloads 115
5220 Influence of Iron Ore Mineralogy on Cluster Formation inside the Shaft Furnace

Authors: M. Bahgat, H. A. Hanafy, S. Lakdawala

Abstract:

Clustering of pellets is frequently observed in shaft processes operating at higher temperatures. Clustering results from the growth of fibrous iron precipitates (iron whiskers) that become hooked to each other and finally crystallize during the initial stages of metallization. If pellet clustering is pronounced, it sometimes leads to blockage inside the furnace and a forced shutdown. This work further clarifies the relation between metallic iron whisker growth and iron ore mineralogy. Pellets of various sizes (6 – 12.0 & +12.0 mm) from three different ores (A, B & C) were completely and partially reduced at 985 °C with an H2/CO gas mixture using a thermogravimetric technique. It was found that reducibility increases as the iron ore pellet size decreases. Ore A has higher reducibility than ores B and C. Increasing the iron ore pellet size increases the probability of metallic iron whisker formation. Ore A has a higher tendency for metallic iron whisker formation than ores B and C. The reduction reactions for all three ores are mainly controlled by a diffusion mechanism.

Keywords: shaft furnace, cluster, metallic iron whisker, mineralogy, ferrous metallurgy

Procedia PDF Downloads 446
5219 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by laboratory tests is expensive and sometimes intolerable for patients. Filling out a questionnaire and then applying a classification method can be the first step in the decision-making process, so that laboratory tests are used only in cases with a high probability of infertility. In this paper, we evaluated the performance of four classification methods, namely naive Bayesian, neural network, logistic regression and fuzzy c-means clustering used as a classifier, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, ROC curves are the most suitable method for the comparison. We also selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each method; in general, most of the methods performed better after the filter was applied. We showed that fuzzy c-means clustering used as a classifier performs well according to the ROC curves and that its performance is comparable to that of other classification methods such as logistic regression.
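
As an illustration of how fuzzy c-means can be used as a classifier, the following minimal sketch runs the standard membership and center updates and then assigns each sample to its highest-membership cluster. The toy points, the initial centers and the fuzzifier m = 2 are assumptions made for the example; the paper's questionnaire features and settings are not given in the abstract.

```python
import math

def memberships(points, centers, m):
    """u[i][k]: degree to which point i belongs to cluster k (rows sum to 1)."""
    u = []
    for p in points:
        d = [max(math.dist(p, c), 1e-12) for c in centers]  # avoid divide-by-zero
        inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    dims = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            w = [row[k] ** m for row in u]  # membership weights raised to m
            new_centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                for d in range(dims)))
        centers = new_centers
    return centers, memberships(points, centers, m)

# Hypothetical two-feature samples forming two well-separated groups
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, u = fuzzy_c_means(points, centers=[points[0], points[3]])

# Hard labels for classification: index of the largest membership
labels = [row.index(max(row)) for row in u]
print(labels)
```

Because the memberships are continuous scores, they can be thresholded at different levels, which is what makes an ROC comparison with the other classifiers possible.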

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 316
5218 Maximization of Lifetime for Wireless Sensor Networks Based on Energy Efficient Clustering Algorithm

Authors: Frodouard Minani

Abstract:

Over the last decade, wireless sensor networks (WSNs) have been used in many areas such as health care, agriculture, defense, military, and disaster-hit areas. A WSN consists of a base station (BS) and a large number of wireless sensors that monitor temperature, pressure and motion under different environmental conditions. The key parameter in designing a protocol for WSNs is energy efficiency, because energy is the scarcest resource of sensor nodes and determines their lifetime. Maximizing the lifetime of sensor nodes is an important issue in the design of applications and protocols for WSNs. Clustering sensor nodes is an effective topology control approach for achieving this goal. In this paper, the researcher presents an energy-efficient protocol that prolongs the network lifetime based on an energy-efficient clustering algorithm. The Low Energy Adaptive Clustering Hierarchy (LEACH) is a cluster-based routing protocol used to lower energy consumption and improve the lifetime of WSNs. Minimizing energy dissipation and maximizing network lifetime are important matters in the design of applications and protocols for WSNs. The proposed system maximizes the lifetime of the WSN by choosing the farthest cluster head (CH) instead of the closest CH and by forming clusters according to the following metrics: node density, residual energy and distance between clusters (inter-cluster distance). Comparisons between the proposed protocol and comparative protocols were carried out in different scenarios, and the simulation results showed that the proposed protocol outperforms the comparative protocols in various scenarios.

Keywords: base station, clustering algorithm, energy efficient, sensors, wireless sensor networks

Procedia PDF Downloads 118
5217 Modeling Average Paths Traveled by Ferry Vessels Using AIS Data

Authors: Devin Simmons

Abstract:

At the USDOT’s Bureau of Transportation Statistics, a biannual census of ferry operators in the U.S. is conducted, with results such as route mileage used to determine federal funding levels for operators. AIS data allows for the possibility of using GIS software and geographical methods to confirm operator-reported mileage for individual ferry routes. As part of the USDOT’s work on the ferry census, an algorithm was developed that uses AIS data for ferry vessels in conjunction with known ferry terminal locations to model the average route travelled for use as both a cartographic product and confirmation of operator-reported mileage. AIS data from each vessel is first analyzed to determine individual journeys based on the vessel’s velocity, and changes in velocity over time. These trips are then converted to geographic linestring objects. Using the terminal locations, the algorithm then determines whether the trip represented a known ferry route. Given a large enough dataset, routes will be represented by multiple trip linestrings, which are then filtered by DBSCAN spatial clustering to remove outliers. Finally, these remaining trips are ready to be averaged into one route. The algorithm interpolates the point on each trip linestring that represents the start point. From these start points, a centroid is calculated, and the first point of the average route is determined. Each trip is interpolated again to find the point that represents one percent of the journey’s completion, and the centroid of those points is used as the next point in the average route, and so on until 100 points have been calculated. Routes created using this algorithm have shown demonstrable improvement over previous methods, which included the implementation of a LOESS model. Additionally, the algorithm greatly reduces the amount of manual digitizing needed to visualize ferry activity.
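
The averaging step described in this abstract can be sketched as follows. This is a minimal illustration, not the Bureau's implementation: it assumes planar (already projected) coordinates, assumes outlier trips have already been removed, and spaces the sample fractions evenly from 0 to 1 so that both terminals are included. The function names point_at_fraction and average_route are hypothetical.

```python
import math

def point_at_fraction(line, frac):
    """Point located `frac` (0..1) of the way along a polyline, by length."""
    seg = [math.dist(a, b) for a, b in zip(line, line[1:])]
    target = frac * sum(seg)
    for (a, b), length in zip(zip(line, line[1:]), seg):
        if length > 0 and target <= length:
            t = target / length
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= length
    return line[-1]  # guards against floating-point overshoot at frac == 1

def average_route(trips, n_points=100):
    """Centroid of all trips at each of n_points evenly spaced fractions."""
    route = []
    for i in range(n_points):
        frac = i / (n_points - 1)
        pts = [point_at_fraction(trip, frac) for trip in trips]
        route.append((sum(p[0] for p in pts) / len(pts),
                      sum(p[1] for p in pts) / len(pts)))
    return route

# Two hypothetical trips between the same pair of terminals
trips = [
    [(0, 0), (10, 0)],
    [(0, 2), (5, 2), (10, 2)],
]
route = average_route(trips, n_points=11)
print(route[0], route[5], route[-1])
```

Sampling by fraction of journey completed, rather than by timestamp, lets trips recorded at different speeds and AIS reporting intervals be averaged point by point.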

Keywords: ferry vessels, transportation, modeling, AIS data

Procedia PDF Downloads 150
5216 Feature Evaluation Based on Random Subspace and Multiple-K Ensemble

Authors: Jaehong Yu, Seoung Bum Kim

Abstract:

Clustering analysis can facilitate the extraction of intrinsic patterns in a dataset and reveal its natural groupings without requiring class information. For effective clustering analysis in high-dimensional datasets, unsupervised dimensionality reduction is an important task. Unsupervised dimensionality reduction can generally be achieved by feature extraction or feature selection. In many situations, feature selection methods are more appropriate than feature extraction methods because of their clear interpretation with respect to the original features. Unsupervised feature selection can be categorized into feature subset selection and feature ranking methods, and we focused on unsupervised feature ranking methods, which evaluate the features based on their importance scores. Recently, several unsupervised feature ranking methods were developed based on ensemble approaches to achieve higher accuracy and stability. However, most of the ensemble-based feature ranking methods require the true number of clusters. Furthermore, these algorithms evaluate feature importance based on the ensemble clustering solution, and they produce undesirable evaluation results if the clustering solutions are inaccurate. To address these limitations, we proposed an ensemble-based feature ranking method with random subspace and multiple-k ensemble (FRRM). The proposed FRRM algorithm evaluates the importance of each feature with the random subspace ensemble, and all evaluation results are combined into the ensemble importance scores. Moreover, through the use of the multiple-k ensemble idea, FRRM does not require the true number of clusters to be determined in advance. Experiments on various benchmark datasets were conducted to examine the properties of the proposed FRRM algorithm and to compare its performance with that of existing feature ranking methods. The experimental results demonstrated that the proposed FRRM outperformed the competitors.

Keywords: clustering analysis, multiple-k ensemble, random subspace-based feature evaluation, unsupervised feature ranking

Procedia PDF Downloads 312
5215 Economic Cost of Malaria: A Threat to Household Income in Nigeria

Authors: Nsikan Affiah, Kayode Osungbade, Williams Uzoma

Abstract:

Malaria remains one of the major killers of humans worldwide, threatening the lives of more than one-third of the world’s population. Some refer to it as a disease of poverty because it contributes to national poverty through its impact on foreign direct investment, tourism, labour productivity, and trade. At the micro level, it may cause poverty through spending on health care, income losses, and premature deaths. Malaria affects both low-income and high-income households, but low-income households are at greater risk because a significant part of their available monthly income is devoted to various preventive and treatment measures. The objective of this study is to estimate the direct and indirect costs of malaria treatment in households in a section of the South-South Region (Akwa Ibom State) of Nigeria. A cross-sectional study of six hundred and forty (640) heads of households, or any adult representative of a household, in three local government areas of Akwa Ibom State, Nigeria, was conducted from May 1-31, 2015, using an interviewer-administered questionnaire adapted from the Nigerian Malaria Indicator Survey Report. The clustering technique was used to select the 640 households with the help of the Primary Health Care (PHC) house numbering system. Using an exchange rate of 197 Naira/USD, the results show that the direct cost of malaria treatment was 8,894.44 USD, while the indirect cost was 11,012.81 USD. The total cost of treatment was made up of 44.7% direct cost and 55.3% indirect cost, with the average direct cost of malaria treatment per household estimated at 20.6 USD and the average indirect cost per household estimated at 25.1 USD. The average total cost for each of the 888 episodes of malaria was estimated at 22.4 USD, while at the household level the average total cost was estimated at 45.5 USD.
From the average total cost, low-income households would spend 36% of monthly household income on treating malaria and the impact could be said to be catastrophic, compared to high-income households where only 1.2% of monthly household income is spent on malaria treatment. It could be concluded that the cost of malaria treatment is well beyond the means of households and given the reality of repeated bouts of malaria and its contribution to the impoverishment of households, there is a need for urgent action.
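
The cost shares and the per-episode average quoted in this abstract follow directly from the reported totals, as the short calculation below shows. It uses only figures stated in the abstract; the per-household averages are not recomputed because the abstract does not state which household count they were divided by.

```python
# Totals reported in the abstract (USD)
direct_usd = 8894.44
indirect_usd = 11012.81
episodes = 888

total_usd = direct_usd + indirect_usd

# Share of each component in the total cost of treatment
direct_share = 100 * direct_usd / total_usd
indirect_share = 100 * indirect_usd / total_usd

# Average total cost per malaria episode
cost_per_episode = total_usd / episodes

print(round(direct_share, 1), round(indirect_share, 1), round(cost_per_episode, 1))
# -> 44.7 55.3 22.4
```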

Keywords: direct cost, indirect cost, low income households, malaria

Procedia PDF Downloads 231
5214 Constraints and Opportunities of Wood Production Value Chain: Evidence from Southwest Ethiopia

Authors: Abduselam Faris, Rijalu Negash, Zera Kedir

Abstract:

This study was initiated to identify constraints and opportunities of the wood production value chain in Southwest Ethiopia. About 385 tree-growing farmers were randomly selected and interviewed. In addition, about 30 small-scale wood processors, 30 retailers, 15 local collectors and 5 wholesalers were purposively included in the study. The results indicated that 98.96% of the smallholder farming households engaged in growing trees for wood were male-headed, with an average age of 46.88 years. The main activity of the households was agriculture (crop and livestock), which accounted for about 61.56% of the sample respondents. Through value chain mapping, the major value chain participants and supporting actors were identified. On average, the tree-growing farmers generated a gross income of 9385.926 Ethiopian birr during the survey year. The critical constraints identified along the wood production value chain were limited supply of credit, poor dissemination of market information, high interference by brokers, and shortages of machines, working area and electricity. The availability of forest resources is the leading opportunity in the wood production value chain. Reinforcing the linkage among wood production value chain actors, providing skill training for small-scale processors, and developing a suitable policy for the wise use of wood trees are the key recommendations going forward.

Keywords: value chain analysis, wood production, southwest Ethiopia, constraints and opportunities

Procedia PDF Downloads 67
5213 The Linkage of Urban and Energy Planning for Sustainable Cities: The Case of Denmark and Germany

Authors: Jens-Phillip Petersen

Abstract:

The reduction of GHG emissions in buildings is a focus area of national energy policies in Europe, because buildings are responsible for a major share of final energy consumption. It is at the local scale that policies to increase the share of renewable energies and energy efficiency measures are implemented. Municipalities, as local authorities and the entities responsible for land-use planning, have a direct influence on urban patterns and energy use, which makes them key actors in the transition towards sustainable cities. Hence, synchronizing urban planning with energy planning offers great potential to increase society’s energy efficiency, which is highly significant for reaching GHG-reduction targets. In this paper, the actual linkage of urban planning and energy planning in Denmark and Germany was assessed; substantive barriers preventing their integration and driving factors that lead to successful transitions towards holistic urban energy planning procedures were identified.

Keywords: energy planning, urban planning, renewable energies, sustainable cities

Procedia PDF Downloads 322
5212 Uncertainty Quantification of Corrosion Anomaly Length of Oil and Gas Steel Pipelines Based on Inline Inspection and Field Data

Authors: Tammeen Siraj, Wenxing Zhou, Terry Huang, Mohammad Al-Amin

Abstract:

The high-resolution inline inspection (ILI) tool is used extensively in the pipeline industry to identify, locate, and measure metal-loss corrosion anomalies on buried oil and gas steel pipelines. Corrosion anomalies may occur singly (i.e. individual anomalies) or as clusters (i.e. a colony of corrosion anomalies). Although ILI technology has advanced immensely, there are measurement errors associated with the sizes of corrosion anomalies reported by ILI tools due to limitations of the tools and associated sizing algorithms, and to the detection threshold of the tools (i.e. the minimum detectable feature dimension). Quantifying the measurement error in the ILI data is crucial for corrosion management and for developing maintenance strategies that satisfy safety and economic constraints. Studies on the measurement error associated with the length of corrosion anomalies (in the longitudinal direction of the pipeline) have rarely been reported in the literature; this error is investigated in the present study. Limitations in the ILI tool and clustering process can sometimes cause clustering error, which is defined as the error introduced during the clustering process by including or excluding a single anomaly or a group of anomalies in or from a cluster. Clustering error has been found to be one of the biggest contributory factors to the relatively high uncertainties associated with ILI-reported anomaly length. As such, this study focuses on developing a consistent and comprehensive framework to quantify the measurement errors in the ILI-reported anomaly length by comparing the ILI data and corresponding field measurements for individual and clustered corrosion anomalies. The analysis carried out in this study is based on the ILI and field measurement data for a set of anomalies collected from two segments of a buried natural gas pipeline currently in service in Alberta, Canada.
Data analyses showed that the measurement error associated with the ILI-reported length of anomalies without clustering error, denoted as Type I anomalies, is markedly less than that for anomalies with clustering error, denoted as Type II anomalies. A methodology employing data mining techniques is further proposed to classify the Type I and Type II anomalies based on the ILI-reported corrosion anomaly information.

Keywords: clustered corrosion anomaly, corrosion anomaly assessment, corrosion anomaly length, individual corrosion anomaly, metal-loss corrosion, oil and gas steel pipeline

Procedia PDF Downloads 292
5211 Agglomerative Hierarchical Clustering Based on Morphmetric Parameters of the Populations of Labeo rohita

Authors: Fayyaz Rasool, Naureen Aziz Qureshi, Shakeela Parveen

Abstract:

Labeo rohita populations from five geographical locations in the hatchery and riverine systems of Punjab, Pakistan, were clustered on the basis of within-species similarities and differences in morphometric parameters. Agglomerative Hierarchical Clustering (AHC) was performed using the Pearson correlation coefficient as the similarity measure and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) as the agglomeration method in XLSTAT 2012 version 1.02. A dendrogram built from the morphometric data of the representative samples from each site divided the populations of Labeo rohita into five major clusters or classes. The variance decomposition for the optimal classification was 19.24% within-class variation and 80.76% between-class variation. The analysis also generated the representative central object of each class, the distances between the class centroids, and the distances between the central objects of the classes. The study indicated a measurable distinction between the classes of Labeo rohita populations, which points to the impact of the changing environment and other possible factors influencing the level of variation among populations of the same species.

Keywords: AHC, Labeo rohita, hatchery, riverine, morphometric

Procedia PDF Downloads 430
5210 Switched System Diagnosis Based on Intelligent State Filtering with Unknown Models

Authors: Nada Slimane, Foued Theljani, Faouzi Bouani

Abstract:

The paper addresses the problem of fault diagnosis for systems operating in several modes (normal or faulty) based on state assessment. For this purpose, we use a methodology consisting of three main processes: 1) sequential data clustering, 2) linear model regression and 3) state filtering. The Kalman Filter (KF) is an algorithm that estimates unknown states from a sequence of I/O measurements. Although it is an efficient technique for state estimation, it has two main weaknesses. First, it merely predicts states without being able to isolate or classify them according to their operating modes, whether normal or faulty. To address this, the KF is given an extra clustering step based entirely on a sequential version of the k-means algorithm. Second, to provide state estimation, the KF requires state space models, which can be unknown. A linear regularized regression is used to identify the required models. To demonstrate its effectiveness, the proposed approach is assessed on a simulated benchmark.
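
A sequential (online) k-means step of the kind mentioned in this abstract can be sketched as below. This is the textbook incremental update, in which each new sample moves its nearest centroid by a running-mean step; it is not necessarily the exact variant used by the authors. The sample stream, the initial centers and the convention of counting each initial center as one observation are assumptions made for the example.

```python
import math

def sequential_kmeans(stream, init_centers):
    """Assign each incoming sample to its nearest center, then update that center."""
    centers = [list(c) for c in init_centers]
    counts = [1] * len(centers)  # assumption: each initial center counts as one sample
    labels = []
    for x in stream:
        k = min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
        counts[k] += 1
        # Running-mean update: move the winning center toward x by 1/count
        centers[k] = [c + (xi - c) / counts[k] for c, xi in zip(centers[k], x)]
        labels.append(k)
    return centers, labels

# Hypothetical 1-D state estimates alternating between two operating modes
stream = [(0.1,), (0.0,), (2.1,), (-0.1,), (1.9,), (2.0,)]
centers, labels = sequential_kmeans(stream, init_centers=[(0.0,), (2.0,)])
print(labels)  # -> [0, 0, 1, 0, 1, 1]
```

Because each sample is processed once and then discarded, a step like this can run alongside a filter that produces state estimates one at a time.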

Keywords: clustering, diagnosis, Kalman Filtering, k-means, regularized regression

Procedia PDF Downloads 159
5209 Radar on Bike: Coarse Classification based on Multi-Level Clustering for Cyclist Safety Enhancement

Authors: Asma Omri, Noureddine Benothman, Sofiane Sayahi, Fethi Tlili, Hichem Besbes

Abstract:

Cycling, a popular mode of transportation, can also be perilous due to cyclists' vulnerability to collisions with vehicles and obstacles. This paper presents an innovative cyclist safety system based on radar technology designed to offer real-time collision risk warnings to cyclists. The system incorporates a low-power radar sensor affixed to the bicycle and connected to a microcontroller. It leverages radar point cloud detections, a clustering algorithm, and a supervised classifier. These algorithms are optimized for efficiency to run on the TI’s AWR 1843 BOOST radar, utilizing a coarse classification approach distinguishing between cars, trucks, two-wheeled vehicles, and other objects. To enhance the performance of clustering techniques, we propose a 2-Level clustering approach. This approach builds on the state-of-the-art Density-based spatial clustering of applications with noise (DBSCAN). The objective is to first cluster objects based on their velocity, then refine the analysis by clustering based on position. The initial level identifies groups of objects with similar velocities and movement patterns. The subsequent level refines the analysis by considering the spatial distribution of these objects. The clusters obtained from the first level serve as input for the second level of clustering. Our proposed technique surpasses the classical DBSCAN algorithm in terms of geometrical metrics, including homogeneity, completeness, and V-score. Relevant cluster features are extracted and utilized to classify objects using an SVM classifier. Potential obstacles are identified based on their velocity and proximity to the cyclist. To optimize the system, we used the View of Delft dataset for hyperparameter selection and SVM classifier training. The system's performance was assessed using our collected dataset of radar point clouds synchronized with a camera on an Nvidia Jetson Nano board. 
The radar-based cyclist safety system is a practical solution that can be easily installed on any bicycle and connected to smartphones or other devices, offering real-time feedback and navigation assistance to cyclists. We conducted experiments to validate the system's feasibility, achieving an impressive 85% accuracy in the classification task. This system has the potential to significantly reduce the number of accidents involving cyclists and enhance their safety on the road.
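
The two-level idea described in this abstract, clustering on velocity first and then on position within each velocity group, can be sketched with a small self-contained DBSCAN. This is an illustration under stated assumptions rather than the authors' implementation: the detection fields v, x, y and obj, the eps and min_pts values and the toy detections are all hypothetical.

```python
import math

def dbscan(items, eps, min_pts, key):
    """Minimal DBSCAN over key(item) feature tuples; label -1 means noise."""
    feats = [key(it) for it in items]
    labels = [None] * len(items)

    def neighbors(i):
        return [j for j in range(len(items))
                if math.dist(feats[i], feats[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: join, but do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # core point: keep growing the cluster
    return labels

def two_level_clusters(detections, eps_v, eps_xy, min_pts):
    """Level 1 groups by velocity; level 2 splits each group by position."""
    result = []
    v_labels = dbscan(detections, eps_v, min_pts, key=lambda d: (d["v"],))
    for v in sorted(set(v_labels) - {-1}):
        group = [d for d, l in zip(detections, v_labels) if l == v]
        xy_labels = dbscan(group, eps_xy, min_pts,
                           key=lambda d: (d["x"], d["y"]))
        for s in sorted(set(xy_labels) - {-1}):
            result.append([d for d, l in zip(group, xy_labels) if l == s])
    return result

# Hypothetical radar detections: A and C overlap in space but differ in velocity
detections = [
    {"obj": "A", "v": 5.0, "x": 0.0, "y": 10.0},
    {"obj": "A", "v": 5.1, "x": 0.2, "y": 10.1},
    {"obj": "A", "v": 4.9, "x": 0.1, "y": 9.9},
    {"obj": "B", "v": 5.0, "x": 8.0, "y": 20.0},
    {"obj": "B", "v": 5.05, "x": 8.1, "y": 20.2},
    {"obj": "B", "v": 4.95, "x": 7.9, "y": 20.1},
    {"obj": "C", "v": -3.0, "x": 0.1, "y": 10.0},
    {"obj": "C", "v": -3.1, "x": 0.0, "y": 10.2},
    {"obj": "C", "v": -2.9, "x": 0.2, "y": 9.8},
]
clusters = two_level_clusters(detections, eps_v=0.3, eps_xy=0.5, min_pts=3)
print([sorted({d["obj"] for d in c}) for c in clusters])
```

In the toy data, a single position-only pass would merge A and C because they occupy the same area; splitting on velocity first keeps them apart, and the second pass then separates A from B.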

Keywords: 2-level clustering, coarse classification, cyclist safety, warning system based on radar technology

Procedia PDF Downloads 62
5208 Performance Analysis of Deterministic Stable Election Protocol Using Fuzzy Logic in Wireless Sensor Network

Authors: Sumanpreet Kaur, Harjit Pal Singh, Vikas Khullar

Abstract:

In a Wireless Sensor Network (WSN), the sensor motes (nodes) are powered by batteries that deplete over time. To improve energy utilization, clustering is one of the prototypical approaches: it splits the sensor motes into a number of clusters in which one mote acts as the Cluster Head (CH). CH selection is one of the optimization techniques for increasing stability and network lifespan. The Deterministic Stable Election Protocol (DSEP) is an effective clustering protocol that uses three kinds of nodes with different residual energy for CH election. Fuzzy logic is used to improve the energy efficiency of the DSEP protocol by means of a fuzzy inference system. This paper presents DSEP using Fuzzy Logic (DSEP-FL) for CH selection, taking into account four linguistic variables: energy, concentration, centrality and distance to the base station. Simulation results show that the proposed method gives more effective results in terms of network lifespan and stability than other clustering protocols.

Keywords: DSEP, fuzzy logic, energy model, WSN

Procedia PDF Downloads 178
5207 A Concept of Data Mining with XML Document

Authors: Akshay Agrawal, Anand K. Srivastava

Abstract:

The increasing number of XML datasets available to casual users increases the need to investigate techniques for extracting knowledge from these data. Data mining is widely applied in the database research area to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi-structured data. In recent years, owing to the importance of managing these resources and extracting knowledge from them, many methods have been proposed to represent and cluster them in different ways.

Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering

Procedia PDF Downloads 353
5206 Progressive Multimedia Collection Structuring via Scene Linking

Authors: Aman Berhe, Camille Guinaudeau, Claude Barras

Abstract:

In order to facilitate information seeking in large collections of multimedia documents with long and progressive content (such as broadcast news or TV series), one can extract the semantic links that exist between semantically coherent parts of documents, i.e., scenes. The links can then create a coherent collection of scenes from which it is easier to perform content analysis, topic extraction, or information retrieval. In this paper, we focus on TV series structuring and propose two approaches for scene linking at different levels of granularity (episode and season): a fuzzy online clustering technique and a graph-based community detection algorithm. When evaluated on the first two seasons of the TV series Game of Thrones, the fuzzy online clustering approach performed better than graph-based community detection at the episode level, while graph-based approaches showed better performance at the season level.

Keywords: multimedia collection structuring, progressive content, scene linking, fuzzy clustering, community detection

Procedia PDF Downloads 75
5205 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach

Authors: Dongkwon Han, Sangho Kim, Sunil Kwon

Abstract:

Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, unconventional gas requires advanced approaches to production forecasting because of the uncertainty and complexity of fluid flow. In this study, an artificial neural network (ANN) model that integrates machine learning and a data-driven approach was developed to predict productivity in shale gas. A database of 129 wells in the Eagle Ford shale basin was used for training and testing the ANN model. Input data related to hydraulic fracturing, well completion, and shale gas productivity were selected, and the output is cumulative production. The performance of the ANN using all data sets, clustering, and variables importance (VI) models was compared in terms of the mean absolute percentage error (MAPE). The MAPE values were 44.22% for the ANN model using all data sets; 10.08% (cluster 1), 5.26% (cluster 2), and 6.35% (cluster 3) for clustering; and 32.23% (ANN VI) and 23.19% (SVM VI) for VI. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.
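The comparison above rests on the mean absolute percentage error. As a minimal sketch of how that metric is usually computed (the function name and the sample values are illustrative assumptions, not data from the study):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical cumulative-production values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
error = mape(actual, predicted)
```

A lower value means the predictions sit closer to the observed production in relative terms, which is why the per-cluster figures can be compared directly with the all-data figure.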

Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance

Procedia PDF Downloads 176
5204 Nonlinear Multivariable Analysis of CO2 Emissions in China

Authors: Hsiao-Tien Pao, Yi-Ying Li, Hsin-Chia Fu

Abstract:

This paper addresses the impacts of energy consumption, economic growth, financial development, and population size on environmental degradation in China using grey relational analysis (GRA), where foreign direct investment (FDI) inflows are the proxy variable for financial development. Recent historical data for the period 2004–2011 are used, because very old data may not be suitable for analyzing rapidly developing countries. The results of the GRA indicate that the linkage effects of energy consumption–emissions and GDP–emissions are ranked first and second, respectively. These reveal that energy consumption and economic growth are strongly correlated with emissions. Higher economic growth requires more energy consumption and increases environmental pollution. Likewise, more efficient energy use needs a higher level of economic development. Therefore, policies to improve energy efficiency and create a low-carbon economy can reduce emissions without hurting economic growth. The FDI–emissions linkage is ranked third. This indicates that China does not apply weak environmental regulations to attract inward FDI. Furthermore, China’s government should strengthen environmental policy when attracting inward FDI. The population–emissions linkage effect is ranked fourth, implying that population size does not directly affect CO2 emissions, even though China has the world’s largest population, and that Chinese people are very economical in their use of energy-related products. Overall, energy conservation, efficiency improvement, demand management, and financial development, which aim at curtailing energy waste and reducing both energy consumption and emissions without loss of the country’s competitiveness, can be adopted by developing economies. GRA is one of the best ways to build a dynamic analysis model from a small amount of data.
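The ranking of linkage effects comes from grey relational grades. The abstract does not state the exact variant used, so the following is only a sketch of the commonly cited formulation, assuming initial-value normalisation and a distinguishing coefficient of 0.5; the function name and the toy series are hypothetical.

```python
def grey_relational_grades(reference, comparisons, rho=0.5):
    """Return one grey relational grade per comparison series.

    reference   : list of numbers (e.g. yearly emissions)
    comparisons : dict mapping a name to a series of the same length
    rho         : distinguishing coefficient (0.5 is a conventional choice)
    """
    def normalise(series):
        # Initial-value normalisation; assumes the first value is non-zero.
        return [v / series[0] for v in series]

    ref = normalise(reference)
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, normalise(series))]
        for name, series in comparisons.items()
    }
    flat = [d for row in deltas.values() for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every series coincides with the reference
        return {name: 1.0 for name in deltas}
    return {
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for name, row in deltas.items()
    }


# Toy series, not the paper's data: "a" moves with the reference, "b" is flat.
grades = grey_relational_grades([1, 2, 3], {"a": [2, 4, 6], "b": [1, 1, 1]})
ranking = sorted(grades, key=grades.get, reverse=True)
```

Sorting the grades in descending order gives the kind of first-to-fourth ranking reported in the abstract; a grade close to 1 means the series tracks the reference closely.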

Keywords: China, CO₂ emissions, foreign direct investment, grey relational analysis

Procedia PDF Downloads 378
5203 Cluster Analysis and Benchmarking for Performance Optimization of a Pyrochlore Processing Unit

Authors: Ana C. R. P. Ferreira, Adriano H. P. Pereira

Abstract:

Given the frequent variation of mineral properties throughout the Araxá pyrochlore deposit, highly variable quality and performance are to be expected in operation, even when good homogenization work has been carried out before feeding the processing plants. These results could be improved and standardized if the blend composition parameters that most influence the processing route are determined and the types of raw materials are then grouped by them, yielding a reference with operational settings for each group. Associating the physical and chemical parameters of a unit operation through benchmarking, or even an optimal reference of metallurgical recovery and product quality, results in lower production costs, optimization of the mineral resource, and greater stability in the subsequent processes of the production chain that uses the mineral of interest. Conducting a comprehensive exploratory data analysis to identify which characteristics of the ore are most relevant to the process route, combined with Machine Learning algorithms for grouping the raw material (ore) and associating the groups with reference variables in the process benchmark, is a reasonable alternative for the standardization and improvement of mineral processing units. Clustering methods based on Decision Tree and K-Means were employed, together with algorithms based on the theory of benchmarking, with criteria defined by the process team in order to reference the best adjustments for processing the ore piles of each cluster. A clean user interface was created to obtain the outputs of the created algorithm. The results were measured through the average time of adjustment and stabilization of the process after a new pile of homogenized ore enters the plant, as well as the average time needed to achieve the best processing result. Direct gains from the metallurgical recovery of the process were also measured.
The results were promising, with a reduction in the adjustment and stabilization time when starting the processing of a new ore pile, as well as in the time needed to reach the benchmark. Also noteworthy are the gains in metallurgical recovery, which reflect a significant saving in ore consumption and a consequent reduction in production costs, hence a more rational use of the tailings dams and life optimization of the mineral deposit.

Keywords: mineral clustering, machine learning, process optimization, pyrochlore processing

Procedia PDF Downloads 127
5202 The Correlation between Education, Food Intake, Exercise, and Medication Obedience with the Average of Blood Sugar in Indonesia

Authors: Aisyah Rahmatul Laily

Abstract:

The Indonesian Ministry of Health is increasing its awareness of non-communicable diseases. Of the top ten causes of death, two are non-communicable diseases. Diabetes Mellitus is one of these two, and its number of patients increases from year to year. This research was therefore conducted to determine the correlation between education, food intake, exercise, and medication obedience and the average blood sugar level. The researchers used an observational, cross-sectional study design. The sample consisted of 50 patients at Puskesmas Gamping I Yogyakarta who had suffered from Diabetes Mellitus for a long period. The researchers conducted anamnesis using a questionnaire to collect the data, then analyzed the data with the Chi-Square test to determine the correlation between the variables. The dependent variable is the average blood sugar level, whereas the independent variables are education, food intake, exercise, and medication obedience. The results show a relation between education and average blood sugar level (p=0.029), between food intake and average blood sugar level (p=0.009), and between exercise and average blood sugar level (p=0.023). There is also a relation between medication obedience and average blood sugar level (p=0.002). The conclusion is that positive correlations exist between education and average blood sugar level, between food intake and average blood sugar level, and between medication obedience and average blood sugar level.

Keywords: average of blood sugar, education, exercise, food intake, medication obedience

Procedia PDF Downloads 255
5201 Implementation of Algorithm K-Means for Grouping District/City in Central Java Based on Macro Economic Indicators

Authors: Nur Aziza Luxfiati

Abstract:

Clustering partitions a data set into subsets or groups in such a way that elements sharing certain properties have a high level of similarity within one group and a low level of similarity between groups. The K-Means algorithm is one of the clustering algorithms most widely used as a grouping tool in scientific and industrial applications because the basic idea of the k-means algorithm is very simple. This research applies clustering with the k-means algorithm as a method of addressing the problem of national development imbalances between regions in Central Java Province based on macroeconomic indicators. The data sample is secondary data obtained from the Central Java Provincial Statistics Agency: macroeconomic indicator data that are part of the published 2019 National Socio-Economic Survey (Susenas) data. Outliers are detected using the z-score, and the number of clusters (k) is determined using the elbow method. After the clustering process is carried out, the result is validated using the Between-Class Variation (BCV) and Within-Class Variation (WCV) methods. The results showed that outlier detection using z-score normalization found no outliers. In addition, the clustering test yielded a ratio value that was not high, namely 0.011%. There are two district/city clusters in Central Java Province that have economic similarities based on the variables used: the first cluster, with a high economic level, consists of 13 districts/cities, and the second cluster, with a low economic level, consists of 22 districts/cities. Within the second (low-economy) cluster, the authors grouped districts/cities based on similarities in macroeconomic indicators: 20 districts by Gross Regional Domestic Product, 19 districts by Poverty Depth Index, 5 districts by Human Development, and 10 districts by Open Unemployment Rate.
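The pipeline described here (z-score standardisation, k-means, an elbow check on the within-cluster variation, then BCV and WCV) can be sketched in a few lines. This is an illustrative reading rather than the author's implementation: the starting centroids, the BCV definition (sum of pairwise centroid distances), and the toy data are all assumptions.

```python
import math
import statistics


def zscores(values):
    """Standardise one indicator column to zero mean and unit variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumes the column is not constant
    return [(v - mean) / sd for v in values]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Plain Lloyd iteration, started from the first k points (naive choice)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def wcv(points, labels, centroids):
    """Within-cluster variation: squared distances to the assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def bcv(centroids):
    """Between-cluster variation, assumed here as summed centroid distances."""
    return sum(math.dist(a, b)
               for i, a in enumerate(centroids) for b in centroids[i + 1:])


# Toy two-indicator data, not the Susenas figures.
raw = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
points = list(zip(*[zscores(col) for col in zip(*raw)]))
elbow = {k: wcv(points, *kmeans(points, k)) for k in (1, 2, 3)}
labels, centroids = kmeans(points, 2)
ratio = bcv(centroids) / wcv(points, labels, centroids)
```

Plotting `elbow` against k and looking for the bend is the elbow method; a real analysis would normally use several random restarts rather than the fixed start used here.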

Keywords: clustering, K-Means algorithm, macroeconomic indicators, inequality, national development

Procedia PDF Downloads 142
5200 An Empirical Study to Predict Myocardial Infarction Using K-Means and Hierarchical Clustering

Authors: Md. Minhazul Islam, Shah Ashisul Abed Nipun, Majharul Islam, Md. Abdur Rakib Rahat, Jonayet Miah, Salsavil Kayyum, Anwar Shadaab, Faiz Al Faisal

Abstract:

The target of this research is to predict Myocardial Infarction using unsupervised Machine Learning algorithms. Predicting Myocardial Infarction in the context of heart disease is a challenge faced by doctors and hospitals, and prediction accuracy plays a vital role. With this concern, the authors analyzed a myocardial dataset to predict myocardial infarction using two popular Machine Learning algorithms, K-Means and Hierarchical Clustering. This research includes the collection of data and the classification of data using Machine Learning algorithms. The authors collected 345 instances with 26 attributes from different hospitals in Bangladesh. These data were collected from patients suffering from myocardial infarction along with other symptoms. The model would be able to find and mine hidden facts from historical Myocardial Infarction cases. The aim of this study is to analyze the accuracy level of predicting Myocardial Infarction using Machine Learning techniques.
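The abstract does not say which linkage rule its hierarchical clustering uses. Purely as an illustration of the average linkage variant that this search page is about, a small agglomerative sketch could look like the following; the function name and the four sample points are hypothetical.

```python
import math


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters whose mean pairwise point distance
    is smallest, until n_clusters remain. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(c1, c2):
        total = sum(math.dist(points[a], points[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical two-attribute records, for illustration only.
groups = average_linkage([(0, 0), (0, 1), (5, 5), (5, 6)], 2)
```

This brute-force version recomputes every pairwise average at each merge, which is fine for a few hundred records such as the 345 instances mentioned above but would be slow on large datasets.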

Keywords: Machine Learning, K-means, Hierarchical Clustering, Myocardial Infarction, Heart Disease

Procedia PDF Downloads 183
5199 Optimal Design for SARMA(P,Q)L Process of EWMA Control Chart

Authors: Yupaporn Areepong

Abstract:

The main goal of this paper is to study Statistical Process Control (SPC) with the Exponentially Weighted Moving Average (EWMA) control chart when observations are serially correlated. The performance characteristic of a control chart is the Average Run Length (ARL), which is the average number of samples taken before an action signal is given. Ideally, the ARL of an in-control process, denoted ARL0, should be sufficiently large. Conversely, it should be small when the process is out of control; this is denoted ARL1 and is also called the Average Delay Time or the mean of true alarm. We find explicit formulas for the ARL of the EWMA control chart for Seasonal Autoregressive and Moving Average (SARMA) processes with exponential white noise. The ARL results obtained from the explicit formulas and from the integral equation are in good agreement. In particular, these formulas for evaluating ARL0 and ARL1 make it possible to obtain a set of optimal parameters, which depend on the smoothing parameter (λ) and the width of the control limit (H), for designing an EWMA chart with minimum ARL1.
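To make the roles of λ, H, and the ARL concrete, here is a small simulation sketch. It is not the paper's explicit formula: it assumes the standard EWMA recursion Z_t = (1 - λ)Z_{t-1} + λX_t with a one-sided upper limit H, and it replaces the SARMA process with independent exponential observations, which is a deliberate simplification. All function names and parameter values are illustrative.

```python
import random


def run_length(observations, lam, h, z0):
    """Index (1-based) of the first EWMA value above h, or None if no signal."""
    z = z0
    for t, x in enumerate(observations, start=1):
        z = (1 - lam) * z + lam * x  # EWMA recursion
        if z > h:
            return t
    return None


def estimate_arl(lam, h, mean, z0, trials=200, max_steps=20000, seed=0):
    """Monte Carlo ARL estimate for i.i.d. exponential observations.

    Runs that never signal are censored at max_steps.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        z = z0
        steps = max_steps
        for t in range(1, max_steps + 1):
            z = (1 - lam) * z + lam * rng.expovariate(1.0 / mean)
            if z > h:
                steps = t
                break
        total += steps
    return total / trials


first_signal = run_length([1, 1, 1, 5, 5, 5], lam=0.5, h=3, z0=1)
arl0 = estimate_arl(lam=0.1, h=1.5, mean=1.0, z0=1.0)  # in control
arl1 = estimate_arl(lam=0.1, h=1.5, mean=2.0, z0=1.0)  # mean shifted upward
```

Searching over (λ, H) pairs that hold the in-control estimate at a target value and then picking the pair with the smallest out-of-control estimate is the simulation analogue of the optimal design the abstract describes.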

Keywords: average run length, optimal parameters, exponentially weighted moving average (EWMA), control chart

Procedia PDF Downloads 538
5198 Identification of Quantitative Trait Loci Conferring Downy Mildew Resistance in Cucumis sativus

Authors: Pawinee Innark, Hudsaya Punyanitikul, Chanuluk Khanobdee, Chatchawan Jantasuriyarat, Sompid Samipak

Abstract:

One of the most devastating diseases in cucumber is downy mildew (DM), caused by the fungus Pseudoperonospora cubensis. To enable marker-assisted breeding of resistant cultivars, sixty-six microsatellite markers were used to map quantitative trait loci (QTLs) for DM resistance. A total of 315 F2 individuals from the cross between the DM-resistant inbred line CSL0067 and the susceptible line CSL0139 were evaluated for downy mildew resistance in the cotyledon and the first and second true leaves at 7, 10, and 14 days after inoculation. The QTL analysis revealed that downy mildew resistance was controlled by multiple recessive genes. From eight linkage groups (LG 1.1, 1.2, 2, 3, 4, 5.1, 5.2 and 6), fourteen QTL positions were detected on 4 linkage groups (LG 1.1, 2, 5.1 and 6), with logarithm of odds (LOD) scores ranging from 3.538 to 9.165. Among them, Cot7_5.1_2 and Cot10_5.1 were major-effect QTLs, with R2 values of 10.9% and 12.5%, respectively. The flanking markers were SSR19172 - SSR07531 for Cot7_5.1_2 and SSR03943 - SSR00772 for Cot10_5.1. Besides QTLs on chromosomes 1, 5 and 6 that were previously reported, this study also revealed a QTL for DM resistance on chromosome 2 that can be used as a new source in cucumber breeding programs.

Keywords: cucumber, DNA marker, downy mildew, QTL

Procedia PDF Downloads 230
5197 Automatic Detection of Traffic Stop Locations Using GPS Data

Authors: Areej Salaymeh, Loren Schwiebert, Stephen Remias, Jonathan Waddell

Abstract:

Extracting information from new data sources has emerged as a crucial task in many traffic planning processes, such as identifying traffic patterns, route planning, traffic forecasting, and locating infrastructure improvements. Given the advanced technologies used to collect Global Positioning System (GPS) data from dedicated GPS devices, GPS-equipped phones, and navigation tools, intelligent data analysis methodologies are necessary to mine this raw data. In this research, an automatic detection framework is proposed to help identify and classify the locations of stopped GPS waypoints into two main categories: signalized intersections or highway congestion. The Delaunay triangulation is used to perform this assessment in the clustering phase. While most existing clustering algorithms need assumptions about the data distribution, the Delaunay triangulation works by triangulating geographical data points without such assumptions. Our proposed method starts by cleaning noise from the data and normalizing it. Next, the framework identifies stoppage points by calculating the traveled distance. Clustering is then used to form groups of waypoints for signalized traffic and highway congestion. Finally, a binary classifier is applied to distinguish highway congestion from signalized stop points; it uses the length of the cluster to find congestion. The proposed framework identifies the stop positions and congestion points with high accuracy, in around 99.2% of trials. We show that it is possible, using limited GPS data, to distinguish the two categories with high accuracy.
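The stoppage-detection step (flagging waypoints by the distance traveled since the previous fix) can be sketched as below. The haversine formula is a standard great-circle approximation; the 5 m threshold, the function names, and the sample coordinates are assumptions made for illustration and are not taken from the paper. The Delaunay-based clustering phase is not reproduced here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stopped_indices(waypoints, max_move_m=5.0):
    """Indices of waypoints that moved at most max_move_m since the previous fix."""
    return [i for i in range(1, len(waypoints))
            if haversine_m(waypoints[i - 1], waypoints[i]) <= max_move_m]


# Hypothetical trace: roughly 111 m, 0 m, 1 m and 110 m between successive fixes.
trace = [(42.0, -83.0), (42.001, -83.0), (42.001, -83.0),
         (42.00101, -83.0), (42.002, -83.0)]
stops = stopped_indices(trace)
```

With evenly sampled GPS fixes, a distance threshold like this is equivalent to a speed threshold; unevenly sampled traces would need the timestamps as well.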

Keywords: Delaunay triangulation, clustering, intelligent transportation systems, GPS data

Procedia PDF Downloads 254
5196 Feature Selection of Personal Authentication Based on EEG Signal for K-Means Cluster Analysis Using Silhouettes Score

Authors: Jianfeng Hu

Abstract:

Personal authentication based on electroencephalography (EEG) signals is an important field of biometric technology. More and more researchers have used EEG signals as a data source for biometrics. However, biometrics based on EEG signals have some disadvantages. The proposed method employs entropy measures for feature extraction from EEG signals. Four types of entropy measures, sample entropy (SE), fuzzy entropy (FE), approximate entropy (AE), and spectral entropy (PE), were deployed as the feature set. In a silhouette calculation, the distances from each data point in a cluster to all other points within the same cluster and to all data points in the closest cluster are determined. Silhouettes thus provide a measure of how well a data point was classified when it was assigned to a cluster, and of the separation between clusters. This property makes silhouettes potentially well suited for assessing cluster quality in personal authentication methods. In this study, silhouette scores were used to assess the cluster quality of the k-means clustering algorithm, which is well suited for comparing the performance of each EEG dataset. The main goals of this study are: (1) to represent each target as a tuple of multiple feature sets, (2) to assign a suitable measure to each feature set, (3) to combine different feature sets, and (4) to determine the optimal feature weighting. Using precision/recall evaluations, the effectiveness of feature weighting in clustering was analyzed. EEG data from 22 subjects were collected. Results showed that: (1) it is possible to use fewer electrodes (3-4) for personal authentication; (2) there was a difference between electrodes for personal authentication (p<0.01); (3) there is no significant difference in authentication performance among feature sets (except feature PE).
Conclusion: The combination of the k-means clustering algorithm and the silhouette approach proved to be an accurate method for personal authentication based on EEG signals.
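The silhouette calculation described in the abstract follows the usual definition s = (b - a) / max(a, b), where a is the mean distance to the other points of the same cluster and b is the mean distance to the points of the nearest other cluster. A minimal sketch, assuming Euclidean distance and a score of 0 for single-point clusters (the feature vectors below are made up, not EEG entropy features):

```python
import math


def silhouette_scores(points, labels):
    """Per-point silhouette values for a given cluster labelling."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouettes need at least two clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # single-point cluster: score set to 0 by convention
            scores.append(0.0)
            continue
        a = mean_dist(i, own)
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


# Hypothetical two-dimensional feature vectors with a good and a poor labelling.
features = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_scores(features, [0, 0, 1, 1])
poor = silhouette_scores(features, [0, 1, 0, 1])
```

Averaging the per-point values gives a single score per clustering, which is what makes it usable for comparing feature sets or electrode subsets.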

Keywords: personal authentication, K-mean clustering, electroencephalogram, EEG, silhouettes

Procedia PDF Downloads 261
5195 Proposing an Algorithm to Cluster Ad Hoc Networks, Modulating Two Levels of Learning Automaton and Nodes Additive Weighting

Authors: Mohammad Rostami, Mohammad Reza Forghani, Elahe Neshat, Fatemeh Yaghoobi

Abstract:

An Ad Hoc network consists of wireless mobile equipment that connects to each other without any infrastructure, using connection equipment. The best way to form a hierarchical structure is clustering. Various clustering methods can form more stable clusters according to the nodes' mobility. In this research, we propose an algorithm that allocates a weight to each node based on factors such as link stability and power reduction rate. In the second phase, the cellular learning automaton uses the weights allocated in the previous phase to pick out the nodes that are candidates for cluster head. In the third phase, the learning automaton selects the cluster head nodes and member nodes and forms the clusters. Thus, the automaton learns from its environment and can form clusters that are optimized in terms of power consumption and link stability. To simulate the proposed algorithm, we used OMNeT++ 4.2.2. Simulation results indicate that the newly formed clusters have a longer lifetime than those of previous algorithms and strongly decrease network overload by reducing the update rate.

Keywords: mobile Ad Hoc networks, clustering, learning automaton, cellular automaton, battery power

Procedia PDF Downloads 392