Search results for: grey clustering method
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 19075

Search results for: grey clustering method

18805 Evaluation of Security and Performance of Master Node Protocol in the Bitcoin Peer-To-Peer Network

Authors: Muntadher Sallal, Gareth Owenson, Mo Adda, Safa Shubbar

Abstract:

Bitcoin is a digital currency based on a peer-to-peer network to propagate and verify transactions. Bitcoin is gaining wider adoption than any previous crypto-currency. However, the mechanism of peers randomly choosing logical neighbors without any knowledge about underlying physical topology can cause a delay overhead in information propagation, which makes the system vulnerable to double-spend attacks. Aiming at alleviating the propagation delay problem, this paper introduces proximity-aware extensions to the current Bitcoin protocol, named Master Node Based Clustering (MNBC). The ultimate purpose of the proposed protocol, that are based on how clusters are formulated and how nodes can define their membership, is to improve the information propagation delay in the Bitcoin network. In MNBC protocol, physical internet connectivity increases, as well as the number of hops between nodes, decreases through assigning nodes to be responsible for maintaining clusters based on physical internet proximity. We show, through simulations, that the proposed protocol defines better clustering structures that optimize the performance of the transaction propagation over the Bitcoin protocol. The evaluation of partition attacks in the MNBC protocol, as well as the Bitcoin network, was done in this paper. Evaluation results prove that even though the Bitcoin network is more resistant against the partitioning attack than the MNBC protocol, more resources are needed to be spent to split the network in the MNBC protocol, especially with a higher number of nodes.

Keywords: Bitcoin network, propagation delay, clustering, scalability

Procedia PDF Downloads 99
18804 Recognition of Objects in a Maritime Environment Using a Combination of Pre- and Post-Processing of the Polynomial Fit Method

Authors: R. R. Hordijk, O. J. G. Somsen

Abstract:

Traditionally, radar systems are the eyes and ears of a ship. However, these systems have their drawbacks and nowadays they are extended with systems that work with video and photos. Processing of data from these videos and photos is however very labour-intensive and efforts are being made to automate this process. A major problem when trying to recognize objects in water is that the 'background' is not homogeneous so that traditional image recognition technics do not work well. Main question is, can a method be developed which automate this recognition process. There are a large number of parameters involved to facilitate the identification of objects on such images. One is varying the resolution. In this research, the resolution of some images has been reduced to the extreme value of 1% of the original to reduce clutter before the polynomial fit (pre-processing). It turned out that the searched object was clearly recognizable as its grey value was well above the average. Another approach is to take two images of the same scene shortly after each other and compare the result. Because the water (waves) fluctuates much faster than an object floating in the water one can expect that the object is the only stable item in the two images. Both these methods (pre-processing and comparing two images of the same scene) delivered useful results. Though it is too early to conclude that with these methods all image problems can be solved they are certainly worthwhile for further research.

Keywords: image processing, image recognition, polynomial fit, water

Procedia PDF Downloads 512
18803 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word into naturally occurring text. Part-of-speech tagging is the most fundamental and basic task almost in all natural language processing. In natural language processing, the problem of providing large amount of manually annotated data is a knowledge acquisition bottleneck. Since, Amharic is one of under-resourced language, the availability of tagged corpus is the bottleneck problem for natural language processing especially for POS tagging. A promising direction to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper explicates the development of unsupervised part-of-speech tagger using K-Means clustering for Amharic language since large amount of data is produced in day-to-day activities. In the development of the tagger, the following procedures are followed. First, the unlabeled data (raw text) is divided into 10 folds and tokenization phase takes place; at this level, the raw text is chunked at sentence level and then into words. The second phase is feature extraction which includes word frequency, syntactic and morphological features of a word. The third phase is clustering. Among different clustering algorithms, K-means is selected and implemented in this study that brings group of similar words together. The fourth phase is mapping, which deals with looking at each cluster carefully and the most common tag is assigned to a group. This study finds out two features that are capable of distinguishing one part-of-speech from others these are morphological feature and positional information and show that it is possible to use unsupervised learning for Amharic POS tagging. In order to increase performance of the unsupervised part-of-speech tagger, there is a need to incorporate other features that are not included in this study, such as semantic related information. Finally, based on experimental result, the performance of the system achieves a maximum of 81% accuracy.

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 423
18802 Approach Based on Fuzzy C-Means for Band Selection in Hyperspectral Images

Authors: Diego Saqui, José H. Saito, José R. Campos, Lúcio A. de C. Jorge

Abstract:

Hyperspectral images and remote sensing are important for many applications. A problem in the use of these images is the high volume of data to be processed, stored and transferred. Dimensionality reduction techniques can be used to reduce the volume of data. In this paper, an approach to band selection based on clustering algorithms is presented. This approach allows to reduce the volume of data. The proposed structure is based on Fuzzy C-Means (or K-Means) and NWHFC algorithms. New attributes in relation to other studies in the literature, such as kurtosis and low correlation, are also considered. A comparison of the results of the approach using the Fuzzy C-Means and K-Means with different attributes is performed. The use of both algorithms show similar good results but, particularly when used attributes variance and kurtosis in the clustering process, however applicable in hyperspectral images.

Keywords: band selection, fuzzy c-means, k-means, hyperspectral image

Procedia PDF Downloads 377
18801 Discriminating Between Energy Drinks and Sports Drinks Based on Their Chemical Properties Using Chemometric Methods

Authors: Robert Cazar, Nathaly Maza

Abstract:

Energy drinks and sports drinks are quite popular among young adults and teenagers worldwide. Some concerns regarding their health effects – particularly those of the energy drinks - have been raised based on scientific findings. Differentiating between these two types of drinks by means of their chemical properties seems to be an instructive task. Chemometrics provides the most appropriate strategy to do so. In this study, a discrimination analysis of the energy and sports drinks has been carried out applying chemometric methods. A set of eleven samples of available commercial brands of drinks – seven energy drinks and four sports drinks – were collected. Each sample was characterized by eight chemical variables (carbohydrates, energy, sugar, sodium, pH, degrees Brix, density, and citric acid). The data set was standardized and examined by exploratory chemometric techniques such as clustering and principal component analysis. As a preliminary step, a variable selection was carried out by inspecting the variable correlation matrix. It was detected that some variables are redundant, so they can be safely removed, leaving only five variables that are sufficient for this analysis. They are sugar, sodium, pH, density, and citric acid. Then, a hierarchical clustering `employing the average – linkage criterion and using the Euclidian distance metrics was performed. It perfectly separates the two types of drinks since the resultant dendogram, cut at the 25% similarity level, assorts the samples in two well defined groups, one of them containing the energy drinks and the other one the sports drinks. Further assurance of the complete discrimination is provided by the principal component analysis. The projection of the data set on the first two principal components – which retain the 71% of the data information – permits to visualize the distribution of the samples in the two groups identified in the clustering stage. Since the first principal component is the discriminating one, the inspection of its loadings consents to characterize such groups. The energy drinks group possesses medium to high values of density, citric acid, and sugar. The sports drinks group, on the other hand, exhibits low values of those variables. In conclusion, the application of chemometric methods on a data set that features some chemical properties of a number of energy and sports drinks provides an accurate, dependable way to discriminate between these two types of beverages.

Keywords: chemometrics, clustering, energy drinks, principal component analysis, sports drinks

Procedia PDF Downloads 87
18800 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study to handle the recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiters companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. It consists to apply segmentation on the extracted job postings offered by the different recruiters. The details of the job postings are associated to a set of relevant features that are extracted from the web and which are based on critical criterion in order to define consistent clusters. Thereafter, we assign the job searchers to the best cluster while providing a ranking according to the job postings of the selected cluster. The performance of the proposed model used is analyzed, based on a real case study, with the clustered job postings dataset and classified job searchers dataset by using some metrics.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 312
18799 Collection and Phenotypic Characterization of Some Nigerian Bambara Groundnut (Vigna subterranea (L.) Verdc.) Germplasm Using Seed Morphology

Authors: Abejide Dorcas Ropo, Falusi Olamide Ahmed, Daudu Oladipupo Abdulazeez Yusuf, Muhammad Liman Muhammad, Gado Aishatu Adamu

Abstract:

Bambara groundnut is an indigenous African legume with great potential to tackle the problem of food insecurity in Nigeria. A germplasm collection mission was carried out in collaboration with the Agricultural Developments Project (ADP) Extension officers of Nigeria between October and December 2014. Bambara groundnut seeds were collected from farmers in different States in Nigeria, such as Kaduna, Niger, Kogi, Benue, Plateau, Adamawa, Nasarawa, Jigawa, Enugu, and Federal Capital Territoy (FCT) Abuja. Some seeds were also collected from National Centre for Genetic Resources and Biotechnology (NACGRAB). The seeds were phenotyped using the descriptor list of Vigna subterranea produced by the International Plant Genetic Resource Institute. A total of 45 original seed lots were collected, which comprised of mixed seeds having different seed coat colours (15) and pure seeded accessions having the same seed coat and eye colour (30). After sorting, a total of 83 accessions were derived from the 45 original seed lots collected, and a total of 24 distinct seed morphotypes with varying seed coat colours and eye colours were identified from the collections. They include cream ( cream ash eye, cream plain eye, and cream black eye), cream purplish spots, cream brown spots/stripe, cream black stripe, cream dark brown patches, cream light grey spots, cream black patches, black, red, light red, dark red, brownish red, brown speckled with black, red speckled with black, brown, brown with brown pattern below hilum, brown with black pattern below hilum, cream black, grey brown, grey black and variegated red. The highest number of accessions were collected from NACGRAB (11), followed by Niger State (10), and the lowest from Benue, Jigawa, and Adamawa States (2). Niger State also had the highest number of mixed seeds. The different seed phenotypes observed in the study are important for the field production of true-to-type lines and can be exploited for the genetic improvement of the Bambara groundnut.

Keywords: Bambara groundnut, characterization, collection, germplasm, phenotypic

Procedia PDF Downloads 112
18798 Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images

Authors: Abder-Rahman Ali, Adélaïde Albouy-Kissi, Manuel Grand-Brochier, Viviane Ladan-Marcus, Christine Hoeffl, Claude Marcus, Antoine Vacavant, Jean-Yves Boire

Abstract:

In this paper, we present a new segmentation approach for focal liver lesions in contrast enhanced ultrasound imaging. This approach, based on a two-cluster Fuzzy C-Means methodology, considers type-II fuzzy sets to handle uncertainty due to the image modality (presence of speckle noise, low contrast, etc.), and to calculate the optimum inter-cluster threshold. Fine boundaries are detected by a local recursive merging of ambiguous pixels. The method has been tested on a representative database. Compared to both Otsu and type-I Fuzzy C-Means techniques, the proposed method significantly reduces the segmentation errors.

Keywords: defuzzification, fuzzy clustering, image segmentation, type-II fuzzy sets

Procedia PDF Downloads 461
18797 Analysis Of Non-uniform Characteristics Of Small Underwater Targets Based On Clustering

Authors: Tianyang Xu

Abstract:

Small underwater targets generally have a non-centrosymmetric geometry, and the acoustic scattering field of the target has spatial inhomogeneity under active sonar detection conditions. In view of the above problems, this paper takes the hemispherical cylindrical shell as the research object, and considers the angle continuity implied in the echo characteristics, and proposes a cluster-driven research method for the non-uniform characteristics of target echo angle. First, the target echo features are extracted, and feature vectors are constructed. Secondly, the t-SNE algorithm is used to improve the internal connection of the feature vector in the low-dimensional feature space and to construct the visual feature space. Finally, the implicit angular relationship between echo features is extracted under unsupervised condition by cluster analysis. The reconstruction results of the local geometric structure of the target corresponding to different categories show that the method can effectively divide the angle interval of the local structure of the target according to the natural acoustic scattering characteristics of the target.

Keywords: underwater target;, non-uniform characteristics;, cluster-driven method;, acoustic scattering characteristics

Procedia PDF Downloads 96
18796 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: clustering analysis, community of practice, data mining, higher education, new faculty challenges, social network, social influence, professional development

Procedia PDF Downloads 163
18795 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 321
18794 Identification of Watershed Landscape Character Types in Middle Yangtze River within Wuhan Metropolitan Area

Authors: Huijie Wang, Bin Zhang

Abstract:

In China, the middle reaches of the Yangtze River are well-developed, boasting a wealth of different types of watershed landscape. In this regard, landscape character assessment (LCA) can serve as a basis for protection, management and planning of trans-regional watershed landscape types. For this study, we chose the middle reaches of the Yangtze River in Wuhan metropolitan area as our study site, wherein the water system consists of rich variety in landscape types. We analyzed trans-regional data to cluster and identify types of landscape characteristics at two levels. 55 basins were analyzed as variables with topography, land cover and river system features in order to identify the watershed landscape character types. For watershed landscape, drainage density and degree of curvature were specified as special variables to directly reflect the regional differences of river system features. Then, we used the principal component analysis (PCA) method and hierarchical clustering algorithm based on the geographic information system (GIS) and statistical products and services solution (SPSS) to obtain results for clusters of watershed landscape which were divided into 8 characteristic groups. These groups highlighted watershed landscape characteristics of different river systems as well as key landscape characteristics that can serve as a basis for targeted protection of watershed landscape characteristics, thus helping to rationally develop multi-value landscape resources and promote coordinated development of trans-regions.

Keywords: GIS, hierarchical clustering, landscape character, landscape typology, principal component analysis, watershed

Procedia PDF Downloads 201
18793 Proposing a Boundary Coverage Algorithm ‎for Underwater Sensor Network

Authors: Seyed Mohsen Jameii

Abstract:

Wireless underwater sensor networks are a type of sensor networks that are located in underwater environments and linked together by acoustic waves. The application of these kinds of network includes monitoring of pollutants (chemical, biological, and nuclear), oil fields detection, prediction of the likelihood of a tsunami in coastal areas, the use of wireless sensor nodes to monitor the passing submarines, and determination of appropriate locations for anchoring ships. This paper proposes a boundary coverage algorithm for intrusion detection in underwater sensor networks. In the first phase of the proposed algorithm, optimal deployment of nodes is done in the water. In the second phase, after the employment of nodes at the proper depth, clustering is executed to reduce the exchanges of messages between the sensors. In the third phase, the algorithm of "divide and conquer" is used to save energy and increase network efficiency. The simulation results demonstrate the efficiency of the proposed algorithm.

Keywords: boundary coverage, clustering, divide and ‎conquer, underwater sensor nodes

Procedia PDF Downloads 315
18792 Power Aware Modified I-LEACH Protocol Using Fuzzy IF Then Rules

Authors: Gagandeep Singh, Navdeep Singh

Abstract:

Due to limited battery of sensor nodes, so energy efficiency found to be main constraint in WSN. Therefore the main focus of the present work is to find the ways to minimize the energy consumption problem and will results; enhancement in the network stability period and life time. Many researchers have proposed different kind of the protocols to enhance the network lifetime further. This paper has evaluated the issues which have been neglected in the field of the WSNs. WSNs are composed of multiple unattended ultra-small, limited-power sensor nodes. Sensor nodes are deployed randomly in the area of interest. Sensor nodes have limited processing, wireless communication and power resource capabilities Sensor nodes send sensed data to sink or Base Station (BS). I-LEACH gives adaptive clustering mechanism which very efficiently deals with energy conservations. This paper ends up with the shortcomings of various adaptive clustering based WSNs protocols.

Keywords: WSN, I-Leach, MATLAB, sensor

Procedia PDF Downloads 254
18791 Role of Grey Scale Ultrasound Including Elastography in Grading the Severity of Carpal Tunnel Syndrome - A Comparative Cross-sectional Study

Authors: Arjun Prakash, Vinutha H., Karthik N.

Abstract:

BACKGROUND: Carpal tunnel syndrome (CTS) is a common entrapment neuropathy with an estimated prevalence of 0.6 - 5.8% in the general adult population. It is caused by compression of the Median Nerve (MN) at the wrist as it passes through a narrow osteofibrous canal. Presently, the diagnosis is established by the clinical symptoms and physical examination and Nerve conduction study (NCS) is used to assess its severity. However, it is considered to be painful, time consuming and expensive, with a false-negative rate between 16 - 34%. Ultrasonography (USG) is now increasingly used as a diagnostic tool in CTS due to its non-invasive nature, increased accessibility and relatively low cost. Elastography is a newer modality in USG which helps to assess stiffness of tissues. However, there is limited available literature about its applications in peripheral nerves. OBJECTIVES: Our objectives were to measure the Cross-Sectional Area (CSA) and elasticity of MN at the carpal tunnel using Grey scale Ultrasonography (USG), Strain Elastography (SE) and Shear Wave Elastography (SWE). We also made an attempt to independently evaluate the role of Gray scale USG, SE and SWE in grading the severity of CTS, keeping NCS as the gold standard. MATERIALS AND METHODS: After approval from the Institutional Ethics Review Board, we conducted a comparative cross sectional study for a period of 18 months. The participants were divided into two groups. Group A consisted of 54 patients with clinically diagnosed CTS who underwent NCS, and Group B consisted of 50 controls without any clinical symptoms of CTS. All Ultrasound examinations were performed on SAMSUNG RS 80 EVO Ultrasound machine with 2 - 9 Mega Hertz linear probe. In both groups, CSA of the MN was measured on Grey scale USG, and its elasticity was measured at the carpal tunnel (in terms of Strain ratio and Shear Modulus). The variables were compared between both groups by using ‘Independent t test’, and subgroup analyses were performed using one-way analysis of variance. Receiver operating characteristic curves were used to evaluate the diagnostic performance of each variable. RESULTS: The mean CSA of the MN was 13.60 + 3.201 mm2 and 9.17 + 1.665 mm2 in Group A and Group B, respectively (p < 0.001). The mean SWE was 30.65 + 12.996 kPa and 17.33 + 2.919 kPa in Group A and Group B, respectively (p < 0.001), and the mean Strain ratio was 7.545 + 2.017 and 5.802 + 1.153 in Group A and Group B respectively (p < 0.001). CONCLUSION: The combined use of Gray scale USG, SE and SWE is extremely useful in grading the severity of CTS and can be used as a painless and cost-effective alternative to NCS. Early diagnosis and grading of CTS and effective treatment is essential to avoid permanent nerve damage and functional disability.

Keywords: carpal tunnel, ultrasound, elastography, nerve conduction study

Procedia PDF Downloads 74
18790 Machine Learning Analysis of Eating Disorders Risk, Physical Activity and Psychological Factors in Adolescents: A Community Sample Study

Authors: Marc Toutain, Pascale Leconte, Antoine Gauthier

Abstract:

Introduction: Eating Disorders (ED), such as anorexia, bulimia, and binge eating, are psychiatric illnesses that mostly affect young people. The main symptoms concern eating (restriction, excessive food intake) and weight control behaviors (laxatives, vomiting). Psychological comorbidities (depression, executive function disorders, etc.) and problematic behaviors toward physical activity (PA) are commonly associated with ED. Acquaintances on ED risk factors are still lacking, and more community sample studies are needed to improve prevention and early detection. To our knowledge, studies are needed to specifically investigate the link between ED risk level, PA, and psychological risk factors in a community sample of adolescents. The aim of this study is to assess the relation between ED risk level, exercise (type, frequency, and motivations for engaging in exercise), and psychological factors based on the Jacobi risk factors model. We suppose that a high risk of ED will be associated with the practice of high caloric cost PA, motivations oriented to weight and shape control, and psychological disturbances. Method: An online survey destined for students has been sent to several middle schools and colleges in northwest France. This survey combined several questionnaires, the Eating Attitude Test-26 assessing ED risk; the Exercise Motivation Inventory–2 assessing motivations toward PA; the Hospital Anxiety and Depression Scale assessing anxiety and depression, the Contour Drawing Rating Scale; and the Body Esteem Scale assessing body dissatisfaction, Rosenberg Self-esteem Scale assessing self-esteem, the Exercise Dependence Scale-Revised assessing PA dependence, the Multidimensional Assessment of Interoceptive Awareness assessing interoceptive awareness and the Frost Multidimensional Perfectionism Scale assessing perfectionism. Machine learning analysis will be performed in order to constitute groups with a tree-based model clustering method, extract risk profile(s) with a bootstrap method comparison, and predict ED risk with a prediction method based on a decision tree-based model. Expected results: 1044 complete records have already been collected, and the survey will be closed at the end of May 2022. Records will be analyzed with a clustering method and a bootstrap method in order to reveal risk profile(s). Furthermore, a predictive tree decision method will be done to extract an accurate predictive model of ED risk. This analysis will confirm typical main risk factors and will give more data on presumed strong risk factors such as exercise motivations and interoceptive deficit. Furthermore, it will enlighten particular risk profiles with a strong level of proof and greatly contribute to improving the early detection of ED and contribute to a better understanding of ED risk factors.

Keywords: eating disorders, risk factors, physical activity, machine learning

Procedia PDF Downloads 67
18789 The Use of Appeals in Green Printed Advertisements: A Case of Product Orientation and Organizational Image Orientation Ads

Authors: Chutima Ruanguttamanun

Abstract:

Despite the relatively large number of studies that have examined the use of appeals in advertisements, research on the use of appeals in green advertisements is still underdeveloped and needs to be investigated further, as it is definitely a tool for marketers to create illustrious ads. In this study, content analysis was employed to examine the nature of green advertising appeals and to match the appeals with the green advertisements. Two different types of green print advertisings, product orientation and organizational image orientation were used. Thirty highly educated participants with different backgrounds were asked individually to ascertain three appeals out of thirty-four given appeals found among forty real green advertisements. To analyze participant responses and to group them based on common appeals, two-step K-mean clustering is used. The clustering solution indicates that eye-catching graphics and imaginative appeals are highly notable in both types of green ads. Depressed, meaningful and sad appeals are found to be highly used in organizational image orientation ads, whereas, corporate image, informative and natural appeals are found to be essential for product orientation ads.

Keywords: advertising appeals, green marketing, green advertisement, printed advertisement

Procedia PDF Downloads 249
18788 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neigbhor, crime, data analysis, sistematic literature review

Procedia PDF Downloads 44
18787 Bioinformatic Approaches in Population Genetics and Phylogenetic Studies

Authors: Masoud Sheidai

Abstract:

Biologists with a special field of population genetics and phylogeny have different research tasks such as populations’ genetic variability and divergence, species relatedness, the evolution of genetic and morphological characters, and identification of DNA SNPs with adaptive potential. To tackle these problems and reach a concise conclusion, they must use the proper and efficient statistical and bioinformatic methods as well as suitable genetic and morphological characteristics. In recent years application of different bioinformatic and statistical methods, which are based on various well-documented assumptions, are the proper analytical tools in the hands of researchers. The species delineation is usually carried out with the use of different clustering methods like K-means clustering based on proper distance measures according to the studied features of organisms. A well-defined species are assumed to be separated from the other taxa by molecular barcodes. The species relationships are studied by using molecular markers, which are analyzed by different analytical methods like multidimensional scaling (MDS) and principal coordinate analysis (PCoA). The species population structuring and genetic divergence are usually investigated by PCoA and PCA methods and a network diagram. These are based on bootstrapping of data. The Association of different genes and DNA sequences to ecological and geographical variables is determined by LFMM (Latent factor mixed model) and redundancy analysis (RDA), which are based on Bayesian and distance methods. Molecular and morphological differentiating characters in the studied species may be identified by linear discriminant analysis (DA) and discriminant analysis of principal components (DAPC). We shall illustrate these methods and related conclusions by giving examples from different edible and medicinal plant species.

Keywords: GWAS analysis, K-Means clustering, LFMM, multidimensional scaling, redundancy analysis

Procedia PDF Downloads 101
18786 Predicting the Human Impact of Natural Onset Disasters Using Pattern Recognition Techniques and Rule Based Clustering

Authors: Sara Hasani

Abstract:

This research focuses on natural sudden onset disasters characterised as ‘occurring with little or no warning and often cause excessive injuries far surpassing the national response capacities’. Based on the panel analysis of the historic record of 4,252 natural onset disasters between 1980 to 2015, a predictive method was developed to predict the human impact of the disaster (fatality, injured, homeless) with less than 3% of errors. The geographical dispersion of the disasters includes every country where the data were available and cross-examined from various humanitarian sources. The records were then filtered into 4252 records of the disasters where the five predictive variables (disaster type, HDI, DRI, population, and population density) were clearly stated. The procedure was designed based on a combination of pattern recognition techniques and rule-based clustering for prediction and discrimination analysis to validate the results further. The result indicates that there is a relationship between the disaster human impact and the five socio-economic characteristics of the affected country mentioned above. As a result, a framework was put forward, which could predict the disaster’s human impact based on their severity rank in the early hours of disaster strike. The predictions in this model were outlined in two worst and best-case scenarios, which respectively inform the lower range and higher range of the prediction. A necessity to develop the predictive framework can be highlighted by noticing that despite the existing research in literature, a framework for predicting the human impact and estimating the needs at the time of the disaster is yet to be developed. This can further be used to allocate the resources at the response phase of the disaster where the data is scarce.

Keywords: disaster management, natural disaster, pattern recognition, prediction

Procedia PDF Downloads 136
18785 Machine Learning Approaches Based on Recency, Frequency, Monetary (RFM) and K-Means for Predicting Electrical Failures and Voltage Reliability in Smart Cities

Authors: Panaya Sudta, Wanchalerm Patanacharoenwong, Prachya Bumrungkun

Abstract:

As With the evolution of smart grids, ensuring the reliability and efficiency of electrical systems in smart cities has become crucial. This paper proposes a distinct approach that combines advanced machine learning techniques to accurately predict electrical failures and address voltage reliability issues. This approach aims to improve the accuracy and efficiency of reliability evaluations in smart cities. The aim of this research is to develop a comprehensive predictive model that accurately predicts electrical failures and voltage reliability in smart cities. This model integrates RFM analysis, K-means clustering, and LSTM networks to achieve this objective. The research utilizes RFM analysis, traditionally used in customer value assessment, to categorize and analyze electrical components based on their failure recency, frequency, and monetary impact. K-means clustering is employed to segment electrical components into distinct groups with similar characteristics and failure patterns. LSTM networks are used to capture the temporal dependencies and patterns in customer data. This integration of RFM, K-means, and LSTM results in a robust predictive tool for electrical failures and voltage reliability. The proposed model has been tested and validated on diverse electrical utility datasets. The results show a significant improvement in prediction accuracy and reliability compared to traditional methods, achieving an accuracy of 92.78% and an F1-score of 0.83. This research contributes to the proactive maintenance and optimization of electrical infrastructures in smart cities. It also enhances overall energy management and sustainability. The integration of advanced machine learning techniques in the predictive model demonstrates the potential for transforming the landscape of electrical system management within smart cities. The research utilizes diverse electrical utility datasets to develop and validate the predictive model. RFM analysis, K-means clustering, and LSTM networks are applied to these datasets to analyze and predict electrical failures and voltage reliability. The research addresses the question of how accurately electrical failures and voltage reliability can be predicted in smart cities. It also investigates the effectiveness of integrating RFM analysis, K-means clustering, and LSTM networks in achieving this goal. The proposed approach presents a distinct, efficient, and effective solution for predicting and mitigating electrical failures and voltage issues in smart cities. It significantly improves prediction accuracy and reliability compared to traditional methods. This advancement contributes to the proactive maintenance and optimization of electrical infrastructures, overall energy management, and sustainability in smart cities.

Keywords: electrical state prediction, smart grids, data-driven method, long short-term memory, RFM, k-means, machine learning

Procedia PDF Downloads 34
18784 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 341
18783 Detection and Identification of Chlamydophila psittaci in Asymptomatic and Symptomatic Parrots in Isfahan

Authors: Mehdi Moradi Sarmeidani, Peyman Keyhani, Hasan Momtaz

Abstract:

Chlamydophila psittaci is a avian pathogen that may cause respiratory disorders in humans. Conjunctival and cloacal swabs from 54 captive psittacine birds presented at veterinary clinics were collected to determine the prevalence of C. psittaci in domestic birds in Isfahan. Samples were collected during 2014 from a total of 10 different species of parrots, with African gray(33), Cockatiel lutino(3), Cockatiel gray(2), Cockatiel cinnamon(1), Pearl cockatiel(6), Timneh African grey(1), Ringneck parakeet(2), Melopsittacus undulatus(1), Alexander parakeet(2), Green Parakeet(3) being the most representative species sampled. C. psittaci was detected in 27 (50%) birds using molecular detection (PCR) method. The detection of this bacterium in captive psittacine birds shows that there is a potential risk for human whom has a direct contact and there is a possibility of infecting other birds.

Keywords: chlamydophila psittaci, psittacine birds, PCR, Isfahan

Procedia PDF Downloads 346
18782 An Integrated Label Propagation Network for Structural Condition Assessment

Authors: Qingsong Xiong, Cheng Yuan, Qingzhao Kong, Haibei Xiong

Abstract:

Deep-learning-driven approaches based on vibration responses have attracted larger attention in rapid structural condition assessment while obtaining sufficient measured training data with corresponding labels is relevantly costly and even inaccessible in practical engineering. This study proposes an integrated label propagation network for structural condition assessment, which is able to diffuse the labels from continuously-generating measurements by intact structure to those of missing labels of damage scenarios. The integrated network is embedded with damage-sensitive features extraction by deep autoencoder and pseudo-labels propagation by optimized fuzzy clustering, the architecture and mechanism which are elaborated. With a sophisticated network design and specified strategies for improving performance, the present network achieves to extends the superiority of self-supervised representation learning, unsupervised fuzzy clustering and supervised classification algorithms into an integration aiming at assessing damage conditions. Both numerical simulations and full-scale laboratory shaking table tests of a two-story building structure were conducted to validate its capability of detecting post-earthquake damage. The identifying accuracy of a present network was 0.95 in numerical validations and an average 0.86 in laboratory case studies, respectively. It should be noted that the whole training procedure of all involved models in the network stringently doesn’t rely upon any labeled data of damage scenarios but only several samples of intact structure, which indicates a significant superiority in model adaptability and feasible applicability in practice.

Keywords: autoencoder, condition assessment, fuzzy clustering, label propagation

Procedia PDF Downloads 78
18781 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm

Authors: Annalakshmi G., Sakthivel Murugan S.

Abstract:

This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied for the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities in class samples. In coral reef image classification, texture features are extracted using the proposed method called local directional encoded derivative binary pattern (LDEDBP). The proposed approach extracts the complete structural arrangement of the local region using local binary batten (LBP) and also extracts the edge information using local directional pattern (LDP) from the edge response available in a particular region, thereby achieving extra discriminative feature value. Typically the LDP extracts the edge details in all eight directions. The process of integrating edge responses along with the local binary pattern achieves a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) method with a meta-heuristic algorithm known as weighted distance grey wolf optimizer (GWO) to optimize the input weight and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated their better performance in terms of accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy of 94% compared to the other state of art methods.

Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization

Procedia PDF Downloads 142
18780 O-LEACH: The Problem of Orphan Nodes in the LEACH of Routing Protocol for Wireless Sensor Networks

Authors: Wassim Jerbi, Abderrahmen Guermazi, Hafedh Trabelsi

Abstract:

The optimum use of coverage in wireless sensor networks (WSNs) is very important. LEACH protocol called Low Energy Adaptive Clustering Hierarchy, presents a hierarchical clustering algorithm for wireless sensor networks. LEACH is a protocol that allows the formation of distributed cluster. In each cluster, LEACH randomly selects some sensor nodes called cluster heads (CHs). The selection of CHs is made with a probabilistic calculation. It is supposed that each non-CH node joins a cluster and becomes a cluster member. Nevertheless, some CHs can be concentrated in a specific part of the network. Thus, several sensor nodes cannot reach any CH. to solve this problem. We created an O-LEACH Orphan nodes protocol, its role is to reduce the sensor nodes which do not belong the cluster. The cluster member called Gateway receives messages from neighboring orphan nodes. The gateway informs CH having the neighboring nodes that not belong to any group. However, Gateway called (CH') attaches the orphaned nodes to the cluster and then collected the data. O-Leach enables the formation of a new method of cluster, leads to a long life and minimal energy consumption. Orphan nodes possess enough energy and seeks to be covered by the network. The principal novel contribution of the proposed work is O-LEACH protocol which provides coverage of the whole network with a minimum number of orphaned nodes and has a very high connectivity rates.As a result, the WSN application receives data from the entire network including orphan nodes. The proper functioning of the Application requires, therefore, management of intelligent resources present within each the network sensor. The simulation results show that O-LEACH performs better than LEACH in terms of coverage, connectivity rate, energy and scalability.

Keywords: WSNs; routing; LEACH; O-LEACH; Orphan nodes; sub-cluster; gateway; CH’

Procedia PDF Downloads 348
18779 Mugil cephalus Presents a Feasible Alternative To Lates calcarifer Farming in Brackishwater: Evidence From Grey Mullet Mugil Cephalus Farming in Bangladesh

Authors: Asif Hasan

Abstract:

Among the reported suitable mariculture species in Bangladesh, seabass and mullet are the two most popular candidates due to their high market values. Several field studies conducted on the culture of seabass in Bangladesh, it still remains a challenge to commercially grow this species due to its exclusive carnivorous nature. In contrast, the grey mullet (M. cephalus) is a fast-growing, omnivorous euryhaline fish that has shown excellent growth in many areas including South Asia. Choice of a sustainable aquaculture technique must consider the productivity and yield as well as their environmental suitability. This study was designed to elucidate the ecologically suitable culture technique of M. cephalus in brakishwater ponds by comparing the biotic and abiotic components of pond ecosystem. In addition to growth parameters (yield, ADG, SGR, weight gain, FCR), Physicochemical parameters (Temperature, DO, pH, salinity, TDS, transparency, ammonia, and Chlorophyll-a concentration) and biological community composition (phytoplankton, zooplankton and benthic macroinvertebrates) were investigated from ponds under Semi-intensive, Improve extensive and Traditional culture system. While temperature were similar in the three culture types, ponds under improve-extensive showed better environmental conditions with significantly higher mean DO and transparency, and lower TDS and Chlorophyll-a. The abundance of zooplankton, phytoplankton and benthic macroinvertebrates were apparently higher in semi-intensive ponds. The Analysis of Similarity (ANOSIM) suggested moderate difference in the planktonic community composition. While the fish growth parameters of M. cephalus and total yield did not differ significantly between three systems, M. cephalus yield (kg/decimal) was apparently higher in semi-intensive pond due to high stocking density and intensive feeding. The results suggested that the difference between the three systems were due to more efficient utilization of nutrients in improve extensive ponds which affected fish growth through trophic cascades. This study suggested that different culture system of M. cephalus is an alternative and more beneficial method owing to its ecological and economic benefits in brackishwater ponds.

Keywords: Mugil cephalus, pond ecosystem, mariculture, fisheries management

Procedia PDF Downloads 39
18778 A Comprehensive Study and Evaluation on Image Fashion Features Extraction

Authors: Yuanchao Sang, Zhihao Gong, Longsheng Chen, Long Chen

Abstract:

Clothing fashion represents a human’s aesthetic appreciation towards everyday outfits and appetite for fashion, and it reflects the development of status in society, humanity, and economics. However, modelling fashion by machine is extremely challenging because fashion is too abstract to be efficiently described by machines. Even human beings can hardly reach a consensus about fashion. In this paper, we are dedicated to answering a fundamental fashion-related problem: what image feature best describes clothing fashion? To address this issue, we have designed and evaluated various image features, ranging from traditional low-level hand-crafted features to mid-level style awareness features to various current popular deep neural network-based features, which have shown state-of-the-art performance in various vision tasks. In summary, we tested the following 9 feature representations: color, texture, shape, style, convolutional neural networks (CNNs), CNNs with distance metric learning (CNNs&DML), AutoEncoder, CNNs with multiple layer combination (CNNs&MLC) and CNNs with dynamic feature clustering (CNNs&DFC). Finally, we validated the performance of these features on two publicly available datasets. Quantitative and qualitative experimental results on both intra-domain and inter-domain fashion clothing image retrieval showed that deep learning based feature representations far outweigh traditional hand-crafted feature representation. Additionally, among all deep learning based methods, CNNs with explicit feature clustering performs best, which shows feature clustering is essential for discriminative fashion feature representation.

Keywords: convolutional neural network, feature representation, image processing, machine modelling

Procedia PDF Downloads 118
18777 Artificial Intelligent Methodology for Liquid Propellant Engine Design Optimization

Authors: Hassan Naseh, Javad Roozgard

Abstract:

This paper represents the methodology based on Artificial Intelligent (AI) applied to Liquid Propellant Engine (LPE) optimization. The AI methodology utilized from Adaptive neural Fuzzy Inference System (ANFIS). In this methodology, the optimum objective function means to achieve maximum performance (specific impulse). The independent design variables in ANFIS modeling are combustion chamber pressure and temperature and oxidizer to fuel ratio and output of this modeling are specific impulse that can be applied with other objective functions in LPE design optimization. To this end, the LPE’s parameter has been modeled in ANFIS methodology based on generating fuzzy inference system structure by using grid partitioning, subtractive clustering and Fuzzy C-Means (FCM) clustering for both inferences (Mamdani and Sugeno) and various types of membership functions. The final comparing optimization results shown accuracy and processing run time of the Gaussian ANFIS Methodology between all methods.

Keywords: ANFIS methodology, artificial intelligent, liquid propellant engine, optimization

Procedia PDF Downloads 552
18776 Clustering Using Cooperative Multihop Mini-Groups in Wireless Sensor Network: A Novel Approach

Authors: Virender Ranga, Mayank Dave, Anil Kumar Verma

Abstract:

Recently wireless sensor networks (WSNs) are used in many real life applications like environmental monitoring, habitat monitoring, health monitoring etc. Due to power constraint cheaper devices used in these applications, the energy consumption of each device should be kept as low as possible such that network operates for longer period of time. One of the techniques to prolong the network lifetime is an intelligent grouping of sensor nodes such that they can perform their operation in cooperative and energy efficient manner. With this motivation, we propose a novel approach by organize the sensor nodes in cooperative multihop mini-groups so that the total global energy consumption of the network can be reduced and network lifetime can be improved. Our proposed approach also reduces the number of transmitted messages inside the WSNs, which further minimizes the energy consumption of the whole network. The experimental simulations show that our proposed approach outperforms over the state-of-the-art approach in terms of stability period and aggregated data.

Keywords: clustering, cluster-head, mini-group, stability period

Procedia PDF Downloads 335