Search results for: clustering algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2398

Search results for: clustering algorithms

508 Development of pm2.5 Forecasting System in Seoul, South Korea Using Chemical Transport Modeling and ConvLSTM-DNN

Authors: Ji-Seok Koo, Hee‑Yong Kwon, Hui-Young Yun, Kyung-Hui Wang, Youn-Seo Koo

Abstract:

This paper presents a forecasting system for PM2.5 levels in Seoul, South Korea, leveraging a combination of chemical transport modeling and ConvLSTM-DNN machine learning technology. Exposure to PM2.5 has known detrimental impacts on public health, making its prediction crucial for establishing preventive measures. Existing forecasting models, like the Community Multiscale Air Quality (CMAQ) and Weather Research and Forecasting (WRF), are hindered by their reliance on uncertain input data, such as anthropogenic emissions and meteorological patterns, as well as certain intrinsic model limitations. The system we've developed specifically addresses these issues by integrating machine learning and using carefully selected input features that account for local and distant sources of PM2.5. In South Korea, the PM2.5 concentration is greatly influenced by both local emissions and long-range transport from China, and our model effectively captures these spatial and temporal dynamics. Our PM2.5 prediction system combines the strengths of advanced hybrid machine learning algorithms, convLSTM and DNN, to improve upon the limitations of the traditional CMAQ model. Data used in the system include forecasted information from CMAQ and WRF models, along with actual PM2.5 concentration and weather variable data from monitoring stations in China and South Korea. The system was implemented specifically for Seoul's PM2.5 forecasting.

Keywords: PM2.5 forecast, machine learning, convLSTM, DNN

Procedia PDF Downloads 26
507 Hybrid GNN Based Machine Learning Forecasting Model For Industrial IoT Applications

Authors: Atish Bagchi, Siva Chandrasekaran

Abstract:

Background: According to World Bank national accounts data, the estimated global manufacturing value-added output in 2020 was 13.74 trillion USD. These manufacturing processes are monitored, modelled, and controlled by advanced, real-time, computer-based systems, e.g., Industrial IoT, PLC, SCADA, etc. These systems measure and manipulate a set of physical variables, e.g., temperature, pressure, etc. Despite the use of IoT, SCADA etc., in manufacturing, studies suggest that unplanned downtime leads to economic losses of approximately 864 billion USD each year. Therefore, real-time, accurate detection, classification and prediction of machine behaviour are needed to minimise financial losses. Although vast literature exists on time-series data processing using machine learning, the challenges faced by the industries that lead to unplanned downtimes are: The current algorithms do not efficiently handle the high-volume streaming data from industrial IoTsensors and were tested on static and simulated datasets. While the existing algorithms can detect significant 'point' outliers, most do not handle contextual outliers (e.g., values within normal range but happening at an unexpected time of day) or subtle changes in machine behaviour. Machines are revamped periodically as part of planned maintenance programmes, which change the assumptions on which original AI models were created and trained. Aim: This research study aims to deliver a Graph Neural Network(GNN)based hybrid forecasting model that interfaces with the real-time machine control systemand can detect, predict machine behaviour and behavioural changes (anomalies) in real-time. This research will help manufacturing industries and utilities, e.g., water, electricity etc., reduce unplanned downtimes and consequential financial losses. Method: The data stored within a process control system, e.g., Industrial-IoT, Data Historian, is generally sampled during data acquisition from the sensor (source) and whenpersistingin the Data Historian to optimise storage and query performance. The sampling may inadvertently discard values that might contain subtle aspects of behavioural changes in machines. This research proposed a hybrid forecasting and classification model which combines the expressive and extrapolation capability of GNN enhanced with the estimates of entropy and spectral changes in the sampled data and additional temporal contexts to reconstruct the likely temporal trajectory of machine behavioural changes. The proposed real-time model belongs to the Deep Learning category of machine learning and interfaces with the sensors directly or through 'Process Data Historian', SCADA etc., to perform forecasting and classification tasks. Results: The model was interfaced with a Data Historianholding time-series data from 4flow sensors within a water treatment plantfor45 days. The recorded sampling interval for a sensor varied from 10 sec to 30 min. Approximately 65% of the available data was used for training the model, 20% for validation, and the rest for testing. The model identified the anomalies within the water treatment plant and predicted the plant's performance. These results were compared with the data reported by the plant SCADA-Historian system and the official data reported by the plant authorities. The model's accuracy was much higher (20%) than that reported by the SCADA-Historian system and matched the validated results declared by the plant auditors. Conclusions: The research demonstrates that a hybrid GNN based approach enhanced with entropy calculation and spectral information can effectively detect and predict a machine's behavioural changes. The model can interface with a plant's 'process control system' in real-time to perform forecasting and classification tasks to aid the asset management engineers to operate their machines more efficiently and reduce unplanned downtimes. A series of trialsare planned for this model in the future in other manufacturing industries.

Keywords: GNN, Entropy, anomaly detection, industrial time-series, AI, IoT, Industry 4.0, Machine Learning

Procedia PDF Downloads 113
506 A Hierarchical Method for Multi-Class Probabilistic Classification Vector Machines

Authors: P. Byrnes, F. A. DiazDelaO

Abstract:

The Support Vector Machine (SVM) has become widely recognised as one of the leading algorithms in machine learning for both regression and binary classification. It expresses predictions in terms of a linear combination of kernel functions, referred to as support vectors. Despite its popularity amongst practitioners, SVM has some limitations, with the most significant being the generation of point prediction as opposed to predictive distributions. Stemming from this issue, a probabilistic model namely, Probabilistic Classification Vector Machines (PCVM), has been proposed which respects the original functional form of SVM whilst also providing a predictive distribution. As physical system designs become more complex, an increasing number of classification tasks involving industrial applications consist of more than two classes. Consequently, this research proposes a framework which allows for the extension of PCVM to a multi class setting. Additionally, the original PCVM framework relies on the use of type II maximum likelihood to provide estimates for both the kernel hyperparameters and model evidence. In a high dimensional multi class setting, however, this approach has been shown to be ineffective due to bad scaling as the number of classes increases. Accordingly, we propose the application of Markov Chain Monte Carlo (MCMC) based methods to provide a posterior distribution over both parameters and hyperparameters. The proposed framework will be validated against current multi class classifiers through synthetic and real life implementations.

Keywords: probabilistic classification vector machines, multi class classification, MCMC, support vector machines

Procedia PDF Downloads 200
505 Applying GIS Geographic Weighted Regression Analysis to Assess Local Factors Impeding Smallholder Farmers from Participating in Agribusiness Markets: A Case Study of Vihiga County, Western Kenya

Authors: Mwehe Mathenge, Ben G. J. S. Sonneveld, Jacqueline E. W. Broerse

Abstract:

Smallholder farmers are important drivers of agriculture productivity, food security, and poverty reduction in Sub-Saharan Africa. However, they are faced with myriad challenges in their efforts at participating in agribusiness markets. How the geographic explicit factors existing at the local level interact to impede smallholder farmers' decision to participates (or not) in agribusiness markets is not well understood. Deconstructing the spatial complexity of the local environment could provide a deeper insight into how geographically explicit determinants promote or impede resource-poor smallholder farmers from participating in agribusiness. This paper’s objective was to identify, map, and analyze local spatial autocorrelation in factors that impede poor smallholders from participating in agribusiness markets. Data were collected using geocoded researcher-administered survey questionnaires from 392 households in Western Kenya. Three spatial statistics methods in geographic information system (GIS) were used to analyze data -Global Moran’s I, Cluster and Outliers Analysis (Anselin Local Moran’s I), and geographically weighted regression. The results of Global Moran’s I reveal the presence of spatial patterns in the dataset that was not caused by spatial randomness of data. Subsequently, Anselin Local Moran’s I result identified spatially and statistically significant local spatial clustering (hot spots and cold spots) in factors hindering smallholder participation. Finally, the geographically weighted regression results unearthed those specific geographic explicit factors impeding market participation in the study area. The results confirm that geographically explicit factors are indispensable in influencing the smallholder farming decisions, and policymakers should take cognizance of them. Additionally, this research demonstrated how geospatial explicit analysis conducted at the local level, using geographically disaggregated data, could help in identifying households and localities where the most impoverished and resource-poor smallholder households reside. In designing spatially targeted interventions, policymakers could benefit from geospatial analysis methods in understanding complex geographic factors and processes that interact to influence smallholder farmers' decision-making processes and choices.

Keywords: agribusiness markets, GIS, smallholder farmers, spatial statistics, disaggregated spatial data

Procedia PDF Downloads 108
504 Bridge Health Monitoring: A Review

Authors: Mohammad Bakhshandeh

Abstract:

Structural Health Monitoring (SHM) is a crucial and necessary practice that plays a vital role in ensuring the safety and integrity of critical structures, and in particular, bridges. The continuous monitoring of bridges for signs of damage or degradation through Bridge Health Monitoring (BHM) enables early detection of potential problems, allowing for prompt corrective action to be taken before significant damage occurs. Although all monitoring techniques aim to provide accurate and decisive information regarding the remaining useful life, safety, integrity, and serviceability of bridges, understanding the development and propagation of damage is vital for maintaining uninterrupted bridge operation. Over the years, extensive research has been conducted on BHM methods, and experts in the field have increasingly adopted new methodologies. In this article, we provide a comprehensive exploration of the various BHM approaches, including sensor-based, non-destructive testing (NDT), model-based, and artificial intelligence (AI)-based methods. We also discuss the challenges associated with BHM, including sensor placement and data acquisition, data analysis and interpretation, cost and complexity, and environmental effects, through an extensive review of relevant literature and research studies. Additionally, we examine potential solutions to these challenges and propose future research ideas to address critical gaps in BHM.

Keywords: structural health monitoring (SHM), bridge health monitoring (BHM), sensor-based methods, machine-learning algorithms, and model-based techniques, sensor placement, data acquisition, data analysis

Procedia PDF Downloads 61
503 Design and Optimization of Open Loop Supply Chain Distribution Network Using Hybrid K-Means Cluster Based Heuristic Algorithm

Authors: P. Suresh, K. Gunasekaran, R. Thanigaivelan

Abstract:

Radio frequency identification (RFID) technology has been attracting considerable attention with the expectation of improved supply chain visibility for consumer goods, apparel, and pharmaceutical manufacturers, as well as retailers and government procurement agencies. It is also expected to improve the consumer shopping experience by making it more likely that the products they want to purchase are available. Recent announcements from some key retailers have brought interest in RFID to the forefront. A modified K- Means Cluster based Heuristic approach, Hybrid Genetic Algorithm (GA) - Simulated Annealing (SA) approach, Hybrid K-Means Cluster based Heuristic-GA and Hybrid K-Means Cluster based Heuristic-GA-SA for Open Loop Supply Chain Network problem are proposed. The study incorporated uniform crossover operator and combined crossover operator in GAs for solving open loop supply chain distribution network problem. The algorithms are tested on 50 randomly generated data set and compared with each other. The results of the numerical experiments show that the Hybrid K-means cluster based heuristic-GA-SA, when tested on 50 randomly generated data set, shows superior performance to the other methods for solving the open loop supply chain distribution network problem.

Keywords: RFID, supply chain distribution network, open loop supply chain, genetic algorithm, simulated annealing

Procedia PDF Downloads 122
502 Magnetic Navigation in Underwater Networks

Authors: Kumar Divyendra

Abstract:

Underwater Sensor Networks (UWSNs) have wide applications in areas such as water quality monitoring, marine wildlife management etc. A typical UWSN system consists of a set of sensors deployed randomly underwater which communicate with each other using acoustic links. RF communication doesn't work underwater, and GPS too isn't available underwater. Additionally Automated Underwater Vehicles (AUVs) are deployed to collect data from some special nodes called Cluster Heads (CHs). These CHs aggregate data from their neighboring nodes and forward them to the AUVs using optical links when an AUV is in range. This helps reduce the number of hops covered by data packets and helps conserve energy. We consider the three-dimensional model of the UWSN. Nodes are initially deployed randomly underwater. They attach themselves to the surface using a rod and can only move upwards or downwards using a pump and bladder mechanism. We use graph theory concepts to maximize the coverage volume while every node maintaining connectivity with at least one surface node. We treat the surface nodes as landmarks and each node finds out its hop distance from every surface node. We treat these hop-distances as coordinates and use them for AUV navigation. An AUV intending to move closer to a node with given coordinates moves hop by hop through nodes that are closest to it in terms of these coordinates. In absence of GPS, multiple different approaches like Inertial Navigation System (INS), Doppler Velocity Log (DVL), computer vision-based navigation, etc., have been proposed. These systems have their own drawbacks. INS accumulates error with time, vision techniques require prior information about the environment. We propose a method that makes use of the earth's magnetic field values for navigation and combines it with other methods that simultaneously increase the coverage volume under the UWSN. The AUVs are fitted with magnetometers that measure the magnetic intensity (I), horizontal inclination (H), and Declination (D). The International Geomagnetic Reference Field (IGRF) is a mathematical model of the earth's magnetic field, which provides the field values for the geographical coordinateson earth. Researchers have developed an inverse deep learning model that takes the magnetic field values and predicts the location coordinates. We make use of this model within our work. We combine this with with the hop-by-hop movement described earlier so that the AUVs move in such a sequence that the deep learning predictor gets trained as quickly and precisely as possible We run simulations in MATLAB to prove the effectiveness of our model with respect to other methods described in the literature.

Keywords: clustering, deep learning, network backbone, parallel computing

Procedia PDF Downloads 59
501 Predicting the Compressive Strength of Geopolymer Concrete Using Machine Learning Algorithms: Impact of Chemical Composition and Curing Conditions

Authors: Aya Belal, Ahmed Maher Eltair, Maggie Ahmed Mashaly

Abstract:

Geopolymer concrete is gaining recognition as a sustainable alternative to conventional Portland Cement concrete due to its environmentally friendly nature, which is a key goal for Smart City initiatives. It has demonstrated its potential as a reliable material for the design of structural elements. However, the production of Geopolymer concrete is hindered by batch-to-batch variations, which presents a significant challenge to the widespread adoption of Geopolymer concrete. To date, Machine learning has had a profound impact on various fields by enabling models to learn from large datasets and predict outputs accurately. This paper proposes an integration between the current drift to Artificial Intelligence and the composition of Geopolymer mixtures to predict their mechanical properties. This study employs Python software to develop machine learning model in specific Decision Trees. The research uses the percentage oxides and the chemical composition of the Alkali Solution along with the curing conditions as the input independent parameters, irrespective of the waste products used in the mixture yielding the compressive strength of the mix as the output parameter. The results showed 90 % agreement of the predicted values to the actual values having the ratio of the Sodium Silicate to the Sodium Hydroxide solution being the dominant parameter in the mixture.

Keywords: decision trees, geopolymer concrete, machine learning, smart cities, sustainability

Procedia PDF Downloads 50
500 Client Hacked Server

Authors: Bagul Abhijeet

Abstract:

Background: Client-Server model is the backbone of today’s internet communication. In which normal user can not have control over particular website or server? By using the same processing model one can have unauthorized access to particular server. In this paper, we discussed about application scenario of hacking for simple website or server consist of unauthorized way to access the server database. This application emerges to autonomously take direct access of simple website or server and retrieve all essential information maintain by administrator. In this system, IP address of server given as input to retrieve user-id and password of server. This leads to breaking administrative security of server and acquires the control of server database. Whereas virus helps to escape from server security by crashing the whole server. Objective: To control malicious attack and preventing all government website, and also find out illegal work to do hackers activity. Results: After implementing different hacking as well as non-hacking techniques, this system hacks simple web sites with normal security credentials. It provides access to server database and allow attacker to perform database operations from client machine. Above Figure shows the experimental result of this application upon different servers and provides satisfactory results as required. Conclusion: In this paper, we have presented a to view to hack the server which include some hacking as well as non-hacking methods. These algorithms and methods provide efficient way to hack server database. By breaking the network security allow to introduce new and better security framework. The terms “Hacking” not only consider for its illegal activities but also it should be use for strengthen our global network.

Keywords: Hacking, Vulnerabilities, Dummy request, Virus, Server monitoring

Procedia PDF Downloads 219
499 Improving Activity Recognition Classification of Repetitious Beginner Swimming Using a 2-Step Peak/Valley Segmentation Method with Smoothing and Resampling for Machine Learning

Authors: Larry Powell, Seth Polsley, Drew Casey, Tracy Hammond

Abstract:

Human activity recognition (HAR) systems have shown positive performance when recognizing repetitive activities like walking, running, and sleeping. Water-based activities are a reasonably new area for activity recognition. However, water-based activity recognition has largely focused on supporting the elite and competitive swimming population, which already has amazing coordination and proper form. Beginner swimmers are not perfect, and activity recognition needs to support the individual motions to help beginners. Activity recognition algorithms are traditionally built around short segments of timed sensor data. Using a time window input can cause performance issues in the machine learning model. The window’s size can be too small or large, requiring careful tuning and precise data segmentation. In this work, we present a method that uses a time window as the initial segmentation, then separates the data based on the change in the sensor value. Our system uses a multi-phase segmentation method that pulls all peaks and valleys for each axis of an accelerometer placed on the swimmer’s lower back. This results in high recognition performance using leave-one-subject-out validation on our study with 20 beginner swimmers, with our model optimized from our final dataset resulting in an F-Score of 0.95.

Keywords: time window, peak/valley segmentation, feature extraction, beginner swimming, activity recognition

Procedia PDF Downloads 83
498 Retrieving Iconometric Proportions of South Indian Sculptures Based on Statistical Analysis

Authors: M. Bagavandas

Abstract:

Introduction: South Indian stone sculptures are known for their elegance and history. They are available in large numbers in different monuments situated different parts of South India. These art pieces have been studied using iconography details, but this pioneering study introduces a novel method known as iconometry which is a quantitative study that deals with measurements of different parts of icons to find answers for important unanswered questions. The main aim of this paper is to compare iconometric measurements of the sculptures with canonical proportion to determine whether the sculptors of the past had followed any of the canonical proportions prescribed in the ancient text. If not, this study recovers the proportions used for carving sculptures which is not available to us now. Also, it will be interesting to see how these sculptural proportions of different monuments belonging to different dynasties differ from one another in terms these proportions. Methods and Materials: As Indian sculptures are depicted in different postures, one way of making measurements independent of size, is to decode on a suitable measurement and convert the other measurements as proportions with respect to the chosen measurement. Since in all canonical texts of Indian art, all different measurements are given in terms of face length, it is chosen as the required measurement for standardizing the measurements. In order to compare these facial measurements with measurements prescribed in Indian canons of Iconography, the ten facial measurements like face length, morphological face length, nose length, nose-to-chin length, eye length, lip length, face breadth, nose breadth, eye breadth and lip breadth were standardized using the face length and the number of measurements reduced to nine. Each measurement was divided by the corresponding face length and multiplied by twelve and given in angula unit used in the canonical texts. The reason for multiplying by twelve is that the face length is given as twelve angulas in the canonical texts for all figures. Clustering techniques were used to determine whether the sculptors of the past had followed any of the proportions prescribed in the canonical texts of the past to carve sculptures and also to compare the proportions of sculptures of different monuments. About one hundred twenty-seven stone sculptures from four monuments belonging to the Pallava, the Chola, the Pandya and the Vijayanagar dynasties were taken up for this study. These art pieces belong to a period ranging from the eighth to the sixteenth century A.D. and all of them adorning different monuments situated in different parts of Tamil Nadu State, South India. Anthropometric instruments were used for taking measurements and the author himself had measured all the sample pieces of this study. Result: Statistical analysis of sculptures of different centers of art from different dynasties shows a considerable difference in facial proportions and many of these proportions differ widely from the canonical proportions. The retrieved different facial proportions indicate that the definition of beauty has been changing from period to period and region to region.

Keywords: iconometry, proportions, sculptures, statistics

Procedia PDF Downloads 130
497 A Framework Based on Dempster-Shafer Theory of Evidence Algorithm for the Analysis of the TV-Viewers’ Behaviors

Authors: Hamdi Amroun, Yacine Benziani, Mehdi Ammi

Abstract:

In this paper, we propose an approach of detecting the behavior of the viewers of a TV program in a non-controlled environment. The experiment we propose is based on the use of three types of connected objects (smartphone, smart watch, and a connected remote control). 23 participants were observed while watching their TV programs during three phases: before, during and after watching a TV program. Their behaviors were detected using an approach based on The Dempster Shafer Theory (DST) in two phases. The first phase is to approximate dynamically the mass functions using an approach based on the correlation coefficient. The second phase is to calculate the approximate mass functions. To approximate the mass functions, two approaches have been tested: the first approach was to divide each features data space into cells; each one has a specific probability distribution over the behaviors. The probability distributions were computed statistically (estimated by empirical distribution). The second approach was to predict the TV-viewing behaviors through the use of classifiers algorithms and add uncertainty to the prediction based on the uncertainty of the model. Results showed that mixing the fusion rule with the computation of the initial approximate mass functions using a classifier led to an overall of 96%, 95% and 96% success rate for the first, second and third TV-viewing phase respectively. The results were also compared to those found in the literature. This study aims to anticipate certain actions in order to maintain the attention of TV viewers towards the proposed TV programs with usual connected objects, taking into account the various uncertainties that can be generated.

Keywords: Iot, TV-viewing behaviors identification, automatic classification, unconstrained environment

Procedia PDF Downloads 194
496 Detection and Classification of Mammogram Images Using Principle Component Analysis and Lazy Classifiers

Authors: Rajkumar Kolangarakandy

Abstract:

Feature extraction and selection is the primary part of any mammogram classification algorithms. The choice of feature, attribute or measurements have an important influence in any classification system. Discrete Wavelet Transformation (DWT) coefficients are one of the prominent features for representing images in frequency domain. The features obtained after the decomposition of the mammogram images using wavelet transformations have higher dimension. Even though the features are higher in dimension, they were highly correlated and redundant in nature. The dimensionality reduction techniques play an important role in selecting the optimum number of features from the higher dimension data, which are highly correlated. PCA is a mathematical tool that reduces the dimensionality of the data while retaining most of the variation in the dataset. In this paper, a multilevel classification of mammogram images using reduced discrete wavelet transformation coefficients and lazy classifiers is proposed. The classification is accomplished in two different levels. In the first level, mammogram ROIs extracted from the dataset is classified as normal and abnormal types. In the second level, all the abnormal mammogram ROIs is classified into benign and malignant too. A further classification is also accomplished based on the variation in structure and intensity distribution of the images in the dataset. The Lazy classifiers called Kstar, IBL and LWL are used for classification. The classification results obtained with the reduced feature set is highly promising and the result is also compared with the performance obtained without dimension reduction.

Keywords: PCA, wavelet transformation, lazy classifiers, Kstar, IBL, LWL

Procedia PDF Downloads 305
495 Clustering Locations of Textile and Garment Industries to Compare with the Future Industrial Cluster in Thailand

Authors: Kanogkan Leerojanaprapa

Abstract:

Textile and garment industry is used to a major exporting industry of Thailand. According to lacking of the nation's price-competitiveness by stopping the EU's GSP (Generalised Scheme of Preferences) and ‘Nationwide Minimum Wage Policy’ that Thailand’s employers must pay all employees at least 300 baht (about $10) a day, the supply chains of the Thai textile and garment industry is affected and need to be reformed. Therefore, either Thai textile or garment industry will be existed or not would be concerned. This is also challenged for the government to decide which industries should be promoted the future industries of Thailand. Recently Thai government launch The Cluster-based Special Economic Development Zones Policy for promoting business cluster (effect on September 16, 2015). They define a cluster as the concentration of interconnected businesses and related institutions that operate within the same geographic areas and textiles and garment is one of target industrial clusters and 9 provinces are targeted (Bangkok, Kanchanaburi, Nakhon Pathom, Ratchaburi, Samut Sakhon, Chonburi, Chachoengsao, Prachinburi, and Sa Kaeo). The cluster zone are defined to link west-east corridor connected to manufacturing source in Cambodia and Mynmar to Bangkok where are promoted to be design, sourcing, and trading hub. The Thai government will provide tax and non-tax incentives for targeted industries within the clusters and expects these businesses are scattered to where they can get the most benefit which will identify future industrial cluster. This research will show the difference between the current cluster and future cluster following the target provinces of the textile and garment. The current cluster is analysed from secondary data. The four characteristics of the numbers of plants in Spinning, weaving and finishing of textiles, Manufacture of made-up textile articles, except apparel, Manufacture of knitted and crocheted fabrics, and Manufacture of other textiles, not elsewhere classified in particular 77 provinces (in total) are clustered by K-means cluster analysis and Hierarchical Cluster Analysis. In addition, the cluster can be confirmed and showed which variables contribute the most to defined cluster solution with ANOVA test. The results of analysis can identify 22 provinces (which the textile or garment plants are located) into 3 clusters. Plants in cluster 1 tend to be large numbers of plants which is only Bangkok, Next plants in cluster 2 tend to be moderate numbers of plants which are Samut Prakan, Samut Sakhon and Nakhon Pathom. Finally plants in cluster 3 tend to be little numbers of plants which are other 18 provinces. The same methodology can be implemented in other industries for future study.

Keywords: ANOVA, hierarchical cluster analysis, industrial clusters, K -means cluster analysis, textile and garment industry

Procedia PDF Downloads 187
494 Spatial Economic Attributes of O. R. Tambo Airport, South Africa

Authors: Masilonyane Mokhele

Abstract:

Across the world, different planning models of the so-called airport-led developments are becoming bandwagons hailed as key to the future of cities. However, in the existing knowledge, there is paucity of empirically informed description and explanation of the economic fundamentals driving the forces of attraction of airports. This void is arguably a result of the absence of an appropriate theoretical framework to guide the analyses. Given this paucity, the aim of the paper is to contribute towards a theoretical framework that could be used to describe and explain forces that drive the location and mix of airport-centric developments. Towards achieving this aim, the objectives of the paper are: one, to establish the type of economic activities that are located on and around O.R. Tambo International Airport (ORTIA), and analyse the reasons for locating there; two, to establish changes that have occurred over time in the form of the airport-centric development of ORTIA; three, to identify the propulsive economic qualities of ORTIA; four, to analyse the spatial, economic and structural linkages within the airport-centric development of ORTIA, between the airport-centric development and the airport, as well as the airport-centric development’s linkages with their metropolitan area and other regional, national and international airport-centric developments and locations. To address the objectives above, the study adopted a case study approach, centred on ORTIA in South Africa: Africa’s busiest airport in terms of passengers and airfreight handled. Using a lens of location theory, a survey was adopted as a main research method, wherein telephonic interviews were conducted with a representative number of firms on and around ORTIA. Other data collection methods encompassed in-depth qualitative interviews (to augment the information obtained through the survey) and analysis of secondary information, particularly as regards establishing changes that have occurred in the form of ORTIA and surrounds. From the empirical findings, ORTIA was discovered to have propulsive economic qualities that act as significant forces of attraction in the clustering of firms. Together with its airport-centric development, ORTIA was discovered to have growth pole properties because of the linkages that occur within the study area, and the linkages that exist between the airport-centric firms and the airport. It was noted that the transport-oriented firms (typified by couriers and freight carriers) act as anchors in some fellow airport-centric firms making use of elements of urbanisation economies, particularly as regards the use of the airport for airfreight services. The empirical findings presented in the paper (in conjunction with results from other airport-centric development case studies) could be used as contribution towards extending theory that describes and explains forces that drive the location and mix of airport-centric developments.

Keywords: airports, airport-centric development, O. R. Tambo international airport, South Africa

Procedia PDF Downloads 232
493 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, in the case of manual categorization, not only can the accuracy of the categorization be not guaranteed but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the limitation of the requirement of a multi-categorized training set by traditional multi-categorization algorithms, we previously proposed a new methodology that can extend a category of a single-categorized document to multiple categorizes by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: big data analysis, document classification, multi-category, text mining, topic analysis

Procedia PDF Downloads 238
492 The Dynamics of Planktonic Crustacean Populations in an Open Access Lagoon, Bordered by Heavy Industry, Southwest, Nigeria

Authors: E. O. Clarke, O. J. Aderinola, O. A. Adeboyejo, M. A. Anetekhai

Abstract:

Aims: The study is aimed at establishing the influence of some physical and chemical parameters on the abundance, distribution pattern and seasonal variations of the planktonic crustacean populations. Place and Duration of Study: A premier investigation into the dynamics of planktonic crustacean populations in Ologe lagoon was carried out from January 2011 to December 2012. Study Design: The study covered identification, temporal abundance, spatial distribution and diversity of the planktonic crustacea. Methodology: Standard techniques were used to collect samples from eleven stations covering five proximal satellite towns (Idoluwo, Oto, Ibiye, Obele, and Gbanko) bordering the lagoon. Data obtained were statistically analyzed using linear regression and hierarchical clustering. Results:Thirteen (13) planktonic crustacean populations were identified. Total percentage abundance was highest for Bosmina species (20%) and lowest for Polyphemus species (0.8%). The Pearson’s correlation coefficient (“r” values) between total planktonic crustacean population and some physical and chemical parameters showed that positive correlations having low level of significance occurred with salinity (r = 0.042) (sig = 0.184) and with surface water dissolved oxygen (r = 0.299) (sig = 0.155). Linear regression plots indicated that, the total population of planktonic crustacea were mainly influenced and only increased with an increase in value of surface water temperature (Rsq = 0.791) and conductivity (Rsq = 0.589). The total population of planktonic crustacea had a near neutral (zero correlation) with the surface water dissolved oxygen and thus, does not significantly change with the level of the surface water dissolved oxygen. The correlations were positive with NO3-N (midstream) at Ibiye (Rsq =0.022) and (downstream) Gbanko (Rsq =0.013), PO4-P at Ibiye (Rsq =0.258), K at Idoluwo (Rsq =0.295) and SO4-S at Oto (Rsq = 0.094) and Gbanko (Rsq = 0.457). The Berger-Parker Dominance Index (BPDI) showed that the most dominant species was Bosmina species (BPDI = 1.000), followed by Calanus species (BPDI = 1.254). Clusters by squared Euclidan distances using average linkage between groups showed proximities, transcending the borders of genera. Conclusion: The results revealed that planktonic crustacean population in Ologe lagoon undergo seasonal perturbations, were highly influenced by nutrient, metal and organic matter inputs from river Owoh, Agbara industrial estate and surrounding farmlands and were patchy in spatial distribution.

Keywords: diversity, dominance, perturbations, richness, crustacea, lagoon

Procedia PDF Downloads 683
491 Using Monte Carlo Model for Simulation of Rented Housing in Mashhad, Iran

Authors: Mohammad Rahim Rahnama

Abstract:

The study employs Monte Carlo method for simulation of rented housing in Mashhad second largest city in Iran. A total number of 334 rental residential units in Mashhad, including both apartments and houses (villa), were randomly selected from advertisements placed in Khorasan Newspapers during the months of July and August of 2015. In order to simulate the monthly rent price, the rent index was calculated through combining the mortgage and the rent price. In the next step, the relation between the variables of the floor area and that of the number of bedrooms for each unit, in both apartments and houses(villa), was calculated through multivariate regression using SPSS and was coded in XML. The initial model was called using simulation button in SPSS and was simulated using triangular and binominal algorithms. The findings revealed that the average simulated rental index was 548.5$ per month. Calculating the sensitivity of rental index to a number of bedrooms we found that firstly, 97% of units have three bedrooms, and secondly as the number of bedrooms increases from one to three, for the rent price of less than 200$, the percentage of units having one bedroom decreases from 10% to 0. Contrariwise, for units with the rent price of more than 571.4$, the percentage of bedrooms increases from 37% to 48%. In the light of these findings, it becomes clear that planning to build rental residential units, overseeing the rent prices, and granting subsidies to rental residential units, for apartments with two bedrooms, present a felicitous policy for regulating residential units in Mashhad.

Keywords: Mashhad, Monte Carlo, simulation, rent price, residential unit

Procedia PDF Downloads 240
490 Measurement of Solids Concentration in Hydrocyclone Using ERT: Validation Against CFD

Authors: Vakamalla Teja Reddy, Narasimha Mangadoddy

Abstract:

Hydrocyclones are used to separate particles into different size fractions in the mineral processing, chemical and metallurgical industries. High speed video imaging, Laser Doppler Anemometry (LDA), X-ray and Gamma ray tomography are previously used to measure the two-phase flow characteristics in the cyclone. However, investigation of solids flow characteristics inside the cyclone is often impeded by the nature of the process due to slurry opaqueness and solid metal wall vessels. In this work, a dual-plane high speed Electrical resistance tomography (ERT) is used to measure hydrocyclone internal flow dynamics in situ. Experiments are carried out in 3 inch hydrocyclone for feed solid concentrations varying in the range of 0-50%. ERT data analysis through the optimized FEM mesh size and reconstruction algorithms on air-core and solid concentration tomograms is assessed. Results are presented in terms of the air-core diameter and solids volume fraction contours using Maxwell’s equation for various hydrocyclone operational parameters. It is confirmed by ERT that the air core occupied area and wall solids conductivity levels decreases with increasing the feed solids concentration. Algebraic slip mixture based multi-phase computational fluid dynamics (CFD) model is used to predict the air-core size and the solid concentrations in the hydrocyclone. Validation of air-core size and mean solid volume fractions by ERT measurements with the CFD simulations is attempted.

Keywords: air-core, electrical resistance tomography, hydrocyclone, multi-phase CFD

Procedia PDF Downloads 345
489 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 310
488 Monitoring the Drying and Grinding Process during Production of Celitement through a NIR-Spectroscopy Based Approach

Authors: Carolin Lutz, Jörg Matthes, Patrick Waibel, Ulrich Precht, Krassimir Garbev, Günter Beuchle, Uwe Schweike, Peter Stemmermann, Hubert B. Keller

Abstract:

Online measurement of the product quality is a challenging task in cement production, especially in the production of Celitement, a novel environmentally friendly hydraulic binder. The mineralogy and chemical composition of clinker in ordinary Portland cement production is measured by X-ray diffraction (XRD) and X ray fluorescence (XRF), where only crystalline constituents can be detected. But only a small part of the Celitement components can be measured via XRD, because most constituents have an amorphous structure. This paper describes the development of algorithms suitable for an on-line monitoring of the final processing step of Celitement based on NIR-data. For calibration intermediate products were dried at different temperatures and ground for variable durations. The products were analyzed using XRD and thermogravimetric analyses together with NIR-spectroscopy to investigate the dependency between the drying and the milling processes on one and the NIR-signal on the other side. As a result, different characteristic parameters have been defined. A short overview of the Celitement process and the challenging tasks of the online measurement and evaluation of the product quality will be presented. Subsequently, methods for systematic development of near-infrared calibration models and the determination of the final calibration model will be introduced. The application of the model on experimental data illustrates that NIR-spectroscopy allows for a quick and sufficiently exact determination of crucial process parameters.

Keywords: calibration model, celitement, cementitious material, NIR spectroscopy

Procedia PDF Downloads 462
487 Sustainable Business Model Archetypes – A Systematic Review and Application to the Plastic Industry

Authors: Felix Schumann, Giorgia Carratta, Tobias Dauth, Liv Jaeckel

Abstract:

In the last few decades, the rapid growth of the use and disposal of plastic items has led to their overaccumulation in the environment. As a result, plastic pollution has become a subject of global concern. Today plastics are used as raw materials in almost every industry. While the recognition of the ecological, social, and economic impact of plastics in academic research is on the rise, the potential role of the ‘plastic industry’ in dealing with such issues is still largely underestimated. Therefore, the literature on sustainable plastic management is still nascent and fragmented. Working towards sustainability requires a fundamental shift in the way companies employ plastics in their day-to-day business. For that reason, the applicability of the business model concept has recently gained momentum in environmental research. Business model innovation is increasingly recognized as an important driver to re-conceptualize the purpose of the firm and to readily integrate sustainability in their business. It can serve as a starting point to investigate whether and how sustainability can be realized under industry- and firm-specific circumstances. Yet, there is no comprehensive view in the plastic industry on how firms start refining their business models to embed sustainability in their operations. Our study addresses this gap, looking primarily at the industrial sectors responsible for the production of the largest amount of plastic waste today: plastic packaging, consumer goods, construction, textile, and transport. Relying on the archetypes of sustainable business models and applying them to the aforementioned sectors, we try to identify companies’ current strategies to make their business models more sustainable. Based on the thematic clustering, we can develop an integrative framework for the plastic industry. The findings are underpinned and illustrated by a variety of relevant plastic management solutions that the authors have identified through a systematic literature review and analysis of existing, empirically grounded research in this field. Using the archetypes, we can promote options for business model innovations for the most important sectors in which plastics are used. Moreover, by linking the proposed business model archetypes to the plastic industry, our research approach guides firms in exploring sustainable business opportunities. Likewise, researchers and policymakers can utilize our classification to identify best practices. The authors believe that the study advances the current knowledge on sustainable plastic management through its broad empirical industry analyses. Hence, the application of business model archetypes in the plastic industry will be useful for shaping companies’ transformation to create and deliver more sustainability and provides avenues for future research endeavors.

Keywords: business models, environmental economics, plastic management, plastic pollution, sustainability

Procedia PDF Downloads 65
486 A Machine Learning Based Framework for Education Levelling in Multicultural Countries: UAE as a Case Study

Authors: Shatha Ghareeb, Rawaa Al-Jumeily, Thar Baker

Abstract:

In Abu Dhabi, there are many different education curriculums where sector of private schools and quality assurance is supervising many private schools in Abu Dhabi for many nationalities. As there are many different education curriculums in Abu Dhabi to meet expats’ needs, there are different requirements for registration and success. In addition, there are different age groups for starting education in each curriculum. In fact, each curriculum has a different number of years, assessment techniques, reassessment rules, and exam boards. Currently, students that transfer curriculums are not being placed in the right year group due to different start and end dates of each academic year and their date of birth for each year group is different for each curriculum and as a result, we find students that are either younger or older for that year group which therefore creates gaps in their learning and performance. In addition, there is not a way of storing student data throughout their academic journey so that schools can track the student learning process. In this paper, we propose to develop a computational framework applicable in multicultural countries such as UAE in which multi-education systems are implemented. The ultimate goal is to use cloud and fog computing technology integrated with Artificial Intelligence techniques of Machine Learning to aid in a smooth transition when assigning students to their year groups, and provide leveling and differentiation information of students who relocate from a particular education curriculum to another, whilst also having the ability to store and access student data from anywhere throughout their academic journey.

Keywords: admissions, algorithms, cloud computing, differentiation, fog computing, levelling, machine learning

Procedia PDF Downloads 100
485 Review of Theories and Applications of Genetic Programing in Sediment Yield Modeling

Authors: Adesoji Tunbosun Jaiyeola, Josiah Adeyemo

Abstract:

Sediment yield can be considered to be the total sediment load that leaves a drainage basin. The knowledge of the quantity of sediments present in a river at a particular time can lead to better flood capacity in reservoirs and consequently help to control over-bane flooding. Furthermore, as sediment accumulates in the reservoir, it gradually loses its ability to store water for the purposes for which it was built. The development of hydrological models to forecast the quantity of sediment present in a reservoir helps planners and managers of water resources systems, to understand the system better in terms of its problems and alternative ways to address them. The application of artificial intelligence models and technique to such real-life situations have proven to be an effective approach of solving complex problems. This paper makes an extensive review of literature relevant to the theories and applications of evolutionary algorithms, and most especially genetic programming. The successful applications of genetic programming as a soft computing technique were reviewed in sediment modelling and other branches of knowledge. Some fundamental issues such as benchmark, generalization ability, bloat and over-fitting and other open issues relating to the working principles of GP, which needs to be addressed by the GP community were also highlighted. This review aim to give GP theoreticians, researchers and the general community of GP enough research direction, valuable guide and also keep all stakeholders abreast of the issues which need attention during the next decade for the advancement of GP.

Keywords: benchmark, bloat, generalization, genetic programming, over-fitting, sediment yield

Procedia PDF Downloads 403
484 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 266
483 The Relationship between Violence against Women and Levels of Self-Esteem in Urban Informal Settlements of Mumbai, India: A Cross-Sectional Study

Authors: A. Bentley, A. Prost, N. Daruwalla, D. Osrin

Abstract:

Background: This study aims to investigate the relationship between experiences of violence against women in the family, and levels of self-esteem in women residing in informal settlement (slum) areas of Mumbai, India. The authors hypothesise that violence against women in Indian households extends beyond that of intimate partner violence (IPV), to include other members of the family and that experiences of violence are associated with lower levels of self-esteem. Methods: Experiences of violence were assessed through a cross-sectional survey of 598 women, including questions about specific acts of emotional, economic, physical and sexual violence across different time points, and the main perpetrator of each. Self-esteem was assessed using the Rosenberg self-esteem questionnaire. A global score for self-esteem was calculated and the relationship between violence in the past year and Rosenberg self-esteem score was assessed using multivariable linear regression models, adjusted for years of education completed, and clustering using robust standard errors. Results: 482 (81%) women consented to interview. On average, they were 28.5 years old, had completed 6 years of education and had been married 9.5 years. 88% were Muslim and 46% lived in joint families. 44% of women had experienced at least one act of violence in their lifetime (33% emotional, 22% economic, 24% physical, 12% sexual). Of the women who experienced violence after marriage, 70% cited a perpetrator other than the husband for at least one of the acts. 5% had low self-esteem (Rosenberg score < 15). For women who experienced emotional violence in the past year, the Rosenberg score was 2.6 points lower (p < 0.001). It was 1.2 points lower (p = 0.03) for women who experienced economic violence. For physical or sexual violence in the past year, no statistically significant relationship with Rosenberg score was seen. However, for a one-unit increase in the number of different acts of each type of violence experienced in the past year, a decrease in Rosenberg score was seen (-0.62 for emotional, -0.76 for economic, -0.53 for physical and -0.47 for sexual; p < 0.05 for all). Discussion: The high prevalence of violence experiences across the lifetime was likely due to the detailed assessment of violence and the inclusion of perpetrators within the family other than the husband. Experiences of emotional or economic violence in the past year were associated with lower Rosenberg scores and therefore lower self-esteem, but no relationship was seen between experiences of physical or sexual violence and Rosenberg score overall. For all types of violence in the past year, a greater number of different acts were associated with a decrease in Rosenberg score. Emotional violence showed the strongest relationship with self-esteem, but for all types of violence the more complex the pattern of perpetration with different methods used, the lower the levels of self-esteem. Due to the cross-sectional nature of the study causal directionality cannot be attributed. Further work to investigate the relationship between severity of violence and self-esteem and whether self-esteem mediates relationships between violence and poorer mental health would be beneficial.

Keywords: family violence, India, informal settlements, Rosenberg self-esteem scale, self-esteem, violence against women

Procedia PDF Downloads 102
482 Machine Learning Models for the Prediction of Heating and Cooling Loads of a Residential Building

Authors: Aaditya U. Jhamb

Abstract:

Due to the current energy crisis that many countries are battling, energy-efficient buildings are the subject of extensive research in the modern technological era because of growing worries about energy consumption and its effects on the environment. The paper explores 8 factors that help determine energy efficiency for a building: (relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area, and glazing area distribution), with Tsanas and Xifara providing a dataset. The data set employed 768 different residential building models to anticipate heating and cooling loads with a low mean squared error. By optimizing these characteristics, machine learning algorithms may assess and properly forecast a building's heating and cooling loads, lowering energy usage while increasing the quality of people's lives. As a result, the paper studied the magnitude of the correlation between these input factors and the two output variables using various statistical methods of analysis after determining which input variable was most closely associated with the output loads. The most conclusive model was the Decision Tree Regressor, which had a mean squared error of 0.258, whilst the least definitive model was the Isotonic Regressor, which had a mean squared error of 21.68. This paper also investigated the KNN Regressor and the Linear Regression, which had to mean squared errors of 3.349 and 18.141, respectively. In conclusion, the model, given the 8 input variables, was able to predict the heating and cooling loads of a residential building accurately and precisely.

Keywords: energy efficient buildings, heating load, cooling load, machine learning models

Procedia PDF Downloads 66
481 Designing and Prototyping Permanent Magnet Generators for Wind Energy

Authors: T. Asefi, J. Faiz, M. A. Khan

Abstract:

This paper introduces dual rotor axial flux machines with surface mounted and spoke type ferrite permanent magnets with concentrated windings; they are introduced as alternatives to a generator with surface mounted Nd-Fe-B magnets. The output power, voltage, speed and air gap clearance for all the generators are identical. The machine designs are optimized for minimum mass using a population-based algorithm, assuming the same efficiency as the Nd-Fe-B machine. A finite element analysis (FEA) is applied to predict the performance, emf, developed torque, cogging torque, no load losses, leakage flux and efficiency of both ferrite generators and that of the Nd-Fe-B generator. To minimize cogging torque, different rotor pole topologies and different pole arc to pole pitch ratios are investigated by means of 3D FEA. It was found that the surface mounted ferrite generator topology is unable to develop the nominal electromagnetic torque, and has higher torque ripple and is heavier than the spoke type machine. Furthermore, it was shown that the spoke type ferrite permanent magnet generator has favorable performance and could be an alternative to rare-earth permanent magnet generators, particularly in wind energy applications. Finally, the analytical and numerical results are verified using experimental results.

Keywords: axial flux, permanent magnet generator, dual rotor, ferrite permanent magnet generator, finite element analysis, wind turbines, cogging torque, population-based algorithms

Procedia PDF Downloads 117
480 Sequential Pattern Mining from Data of Medical Record with Sequential Pattern Discovery Using Equivalent Classes (SPADE) Algorithm (A Case Study : Bolo Primary Health Care, Bima)

Authors: Rezky Rifaini, Raden Bagus Fajriya Hakim

Abstract:

This research was conducted at the Bolo primary health Care in Bima Regency. The purpose of the research is to find out the association pattern that is formed of medical record database from Bolo Primary health care’s patient. The data used is secondary data from medical records database PHC. Sequential pattern mining technique is the method that used to analysis. Transaction data generated from Patient_ID, Check_Date and diagnosis. Sequential Pattern Discovery Algorithms Using Equivalent Classes (SPADE) is one of the algorithm in sequential pattern mining, this algorithm find frequent sequences of data transaction, using vertical database and sequence join process. Results of the SPADE algorithm is frequent sequences that then used to form a rule. It technique is used to find the association pattern between items combination. Based on association rules sequential analysis with SPADE algorithm for minimum support 0,03 and minimum confidence 0,75 is gotten 3 association sequential pattern based on the sequence of patient_ID, check_Date and diagnosis data in the Bolo PHC.

Keywords: diagnosis, primary health care, medical record, data mining, sequential pattern mining, SPADE algorithm

Procedia PDF Downloads 365
479 Using Cyclic Structure to Improve Inference on Network Community Structure

Authors: Behnaz Moradijamei, Michael Higgins

Abstract:

Identifying community structure is a critical task in analyzing social media data sets often modeled by networks. Statistical models such as the stochastic block model have proven to explain the structure of communities in real-world network data. In this work, we develop a goodness-of-fit test to examine community structure's existence by using a distinguishing property in networks: cyclic structures are more prevalent within communities than across them. To better understand how communities are shaped by the cyclic structure of the network rather than just the number of edges, we introduce a novel method for deciding on the existence of communities. We utilize these structures by using renewal non-backtracking random walk (RNBRW) to the existing goodness-of-fit test. RNBRW is an important variant of random walk in which the walk is prohibited from returning back to a node in exactly two steps and terminates and restarts once it completes a cycle. We investigate the use of RNBRW to improve the performance of existing goodness-of-fit tests for community detection algorithms based on the spectral properties of the adjacency matrix. Our proposed test on community structure is based on the probability distribution of eigenvalues of the normalized retracing probability matrix derived by RNBRW. We attempt to make the best use of asymptotic results on such a distribution when there is no community structure, i.e., asymptotic distribution under the null hypothesis. Moreover, we provide a theoretical foundation for our statistic by obtaining the true mean and a tight lower bound for RNBRW edge weights variance.

Keywords: hypothesis testing, RNBRW, network inference, community structure

Procedia PDF Downloads 106