Search results for: grey clustering method
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18992

18752 Fusion Models for Cyber Threat Defense: Integrating Clustering, Random Forests, and Support Vector Machines to Against Windows Malware

Authors: Azita Ramezani, Atousa Ramezani

Abstract:

In the ever-escalating landscape of Windows malware, the necessity for pioneering defense strategies turns into undeniable. This study introduces an avant-garde approach fusing the capabilities of clustering, random forests, and support vector machines (SVM) to combat the intricate web of cyber threats. Our fusion model triumphs with a staggering accuracy of 98.67 and an equally formidable F1 score of 98.68, a testament to its effectiveness in the realm of Windows malware defense. By deciphering the intricate patterns within malicious code, our model not only raises the bar for detection precision but also redefines the paradigm of cybersecurity preparedness. This breakthrough underscores the potential embedded in the fusion of diverse analytical methodologies and signals a paradigm shift in fortifying against the relentless evolution of Windows malicious threats. As we traverse through the dynamic cybersecurity terrain, this research serves as a beacon illuminating the path toward a resilient future where innovative fusion models stand at the forefront of cyber threat defense.

Keywords: fusion models, cyber threat defense, windows malware, clustering, random forests, support vector machines (SVM), accuracy, f1-score, cybersecurity, malicious code detection

Procedia PDF Downloads 39
18751 Employing GIS to Analyze Areas Prone to Flooding: Case Study of Thailand

Authors: Sanpachai Huvanandana, Settapong Malisuwan, Soparwan Tongyuak, Prust Pannachet, Anong Phoepueak, Navneet Madan

Abstract:

Many regions of Thailand are prone to flooding due to its tropical climate. Generally increasing precipitation in the region raises the risk of flooding. Many measures have been implemented, such as drainage control systems, multiple dams, and irrigation canals. To decide where drainage systems, dams, and canals should be located, the flood-risk areas must first be determined. This paper aims to identify the appropriate features that can be used to classify flood-risk areas in Thailand. Several features have been analyzed and used to classify the area. Unsupervised clustering techniques have been used, and the results have been compared with the ten-year average of the actual flooded area.

Keywords: flood area clustering, geographical information system, flood features

Procedia PDF Downloads 261
18750 Detecting of Crime Hot Spots for Crime Mapping

Authors: Somayeh Nezami

Abstract:

Managing the financial and human resources of police in metropolitan areas requires a great deal of information and precise planning to reduce the crime rate and increase the safety of society. Geographical Information Systems play an important role in producing and analyzing crime maps. By using them to identify crime hot spots and present the results spatially, it is possible to allocate resources optimally while providing effective methods for decision making and prevention. In this paper, we explain and compare several methods of hot spot analysis: Mode, Fuzzy Mode, and Nearest Neighbour Hierarchical spatial clustering (NNH). The spots with the highest rates of drug smuggling are then obtained for one Iranian province bordering Afghanistan. We show that, among these three methods, NNH leads to the best result.

Keywords: GIS, Hot spots, nearest neighbor hierarchical spatial clustering, NNH, spatial analysis of crime

Procedia PDF Downloads 299
18749 Design an Algorithm for Software Development in CBSE Environment Using Feed Forward Neural Network

Authors: Amit Verma, Pardeep Kaur

Abstract:

In software development organizations, component-based software engineering (CBSE) is an emerging paradigm that has gained wide acceptance, as it often results in increased software product quality within development time and budget. In component reuse, the main challenge is identifying the right component in large repositories at the right time. The major objective of this work is to provide an efficient algorithm for storage and effective retrieval of components using a neural network and parameters based on user choice through clustering. This paper proposes an algorithm that provides an error-free, automatic process for retrieving components during reuse. In this algorithm, keywords (or components) are extracted from software documents and then clustered with the k-means algorithm. Weights are then assigned to the keywords based on their frequency, and an ANN predicts whether the correct weight has been assigned to each keyword (or component); if not, it back-propagates to the initial step and the weights are reassigned. Finally, all keywords are stored in repositories for effective retrieval. The proposed algorithm is very effective at error detection and correction, with user-based choice when selecting components for reuse and efficient retrieval.

Keywords: component based development, clustering, back propagation algorithm, keyword based retrieval

Procedia PDF Downloads 359
18748 Assessing Significance of Correlation with Binomial Distribution

Authors: Vijay Kumar Singh, Pooja Kushwaha, Prabhat Ranjan, Krishna Kumar Ojha, Jitendra Kumar

Abstract:

Present-day high-throughput genomic technologies, such as NGS and microarrays, are producing large volumes of data that require improved analysis methods to make sense of the data. The correlation between genes and samples has been regularly used to gain insight into many biological phenomena including, but not limited to, co-expression/co-regulation, gene regulatory networks, clustering and pattern identification. However, the presence of outliers and violations of the assumptions underlying Pearson correlation are frequent and may distort the actual correlation between genes and lead to spurious conclusions. Here, we report a method to measure the strength of association between genes. The method assumes that the expression values of a gene are Bernoulli random variables whose outcome depends on the sample being probed. The method considers two genes uncorrelated if the number of samples with the same outcome for both genes (Ns) is equal to a certain expected number (Es). The extent of correlation depends on how far Ns deviates from Es. The method does not assume normality of the parent population, is fairly unaffected by the presence of outliers, can be applied to qualitative data, and uses the binomial distribution to assess the significance of association. At this stage, we do not claim that the method is superior to existing correlation methods, but it could be another way of calculating correlation in addition to them. The method uses the binomial distribution, which has not been used until now, to assess the significance of association between two variables. We are evaluating the performance of our method on NGS/microarray data, which is noisy and affected by outliers, to see whether it can differentiate between spurious and actual correlation.
While working with the method, it has not escaped our notice that it could also be generalized to measure the association of more than two variables, which has proven difficult with existing methods.
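
The Ns/Es idea in this abstract can be illustrated with a minimal Python sketch. Several details are assumptions, since the abstract does not give them: expression values are binarized at the gene's median, Es is computed as the number of samples times the agreement probability under independence, and significance is an exact two-sided binomial test. The function names are hypothetical and this is not the authors' implementation.

```python
from math import comb
from statistics import median


def binarize(values):
    """Turn expression values into 0/1 outcomes (assumption: split at the median)."""
    m = median(values)
    return [1 if v > m else 0 for v in values]


def binomial_association(x, y):
    """Return (Ns, Es, p_value) for two equally long 0/1 outcome lists.

    Ns is the observed number of samples where both genes agree.
    Es is the number expected if the genes were independent (assumed formula).
    """
    n = len(x)
    ns = sum(1 for a, b in zip(x, y) if a == b)
    p1 = sum(x) / n
    p2 = sum(y) / n
    # Probability that two independent Bernoulli outcomes agree.
    p_same = p1 * p2 + (1 - p1) * (1 - p2)
    es = n * p_same
    # Exact two-sided binomial test: sum all outcomes no more likely than Ns.
    pmf = [comb(n, k) * p_same**k * (1 - p_same) ** (n - k) for k in range(n + 1)]
    p_value = sum(p for p in pmf if p <= pmf[ns] * (1 + 1e-9))
    return ns, es, min(1.0, p_value)
```

Calling `binomial_association(binarize(gene_a), binarize(gene_b))` gives the observed and expected agreement counts together with a p-value; a small p-value means Ns lies far from Es in either direction.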

Keywords: binomial distribution, correlation, microarray, outliers, transcriptome

Procedia PDF Downloads 383
18747 Energy Efficient Clustering with Reliable and Load-Balanced Multipath Routing for Wireless Sensor Networks

Authors: Alamgir Naushad, Ghulam Abbas, Shehzad Ali Shah, Ziaul Haq Abbas

Abstract:

Unlike conventional networks, it is particularly challenging to manage resources efficiently in Wireless Sensor Networks (WSNs) due to their inherent characteristics, such as dynamic network topology and limited bandwidth and battery power. To ensure energy efficiency, this paper presents a routing protocol for WSNs, namely, Enhanced Hybrid Multipath Routing (EHMR), which employs hierarchical clustering and proposes a next hop selection mechanism between nodes according to a maximum residual energy metric together with a minimum hop count. Load-balancing of data traffic over multiple paths is achieved for a better packet delivery ratio and low latency rate. Reliability is ensured in terms of higher data rate and lower end-to-end delay. EHMR also enhances the fast-failure recovery mechanism to recover a failed path. Simulation results demonstrate that EHMR achieves a higher packet delivery ratio, reduced energy consumption per-packet delivery, lower end-to-end latency, and reduced effect of data rate on packet delivery ratio when compared with eminent WSN routing protocols.
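
As a rough illustration of the next-hop rule described above, the following Python sketch picks the neighbor with the highest residual energy and breaks ties by the smallest hop count. How EHMR actually combines the two metrics is not stated in the abstract, so this ordering, the `Neighbor` type, and its field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    node_id: str            # hypothetical identifier
    residual_energy: float  # remaining battery, arbitrary units
    hops_to_sink: int       # hop count from this neighbor to the sink


def select_next_hop(neighbors):
    """Pick the neighbor with maximum residual energy; ties go to the fewest hops."""
    if not neighbors:
        return None
    return max(neighbors, key=lambda n: (n.residual_energy, -n.hops_to_sink))
```

A weighted score over both metrics would be an equally plausible reading; the lexicographic ordering is used here only because it is the simplest.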

Keywords: energy efficiency, load-balancing, hierarchical clustering, multipath routing, wireless sensor networks

Procedia PDF Downloads 51
18746 Remote Assessment and Change Detection of GreenLAI of Cotton Crop Using Different Vegetation Indices

Authors: Ganesh B. Shinde, Vijaya B. Musande

Abstract:

Cotton crop identification based on timely information has significant advantages for food, economic, and environmental applications. Because of these advantages, accurate detection of cotton crop regions using a supervised learning procedure is a challenging problem in remote sensing. Classifiers applied directly to the image have played a major role, but the results are not very satisfactory. To further improve effectiveness, a variety of vegetation indices have been proposed in the literature. The major challenge, however, is to find the best vegetation indices for cotton crop identification with the proposed methodology. Accordingly, fuzzy c-means clustering is combined with a neural network trained by the Levenberg-Marquardt algorithm for cotton crop classification. To evaluate the proposed method, five LISS-III satellite images were taken, and the experiments were carried out with six vegetation indices: Simple Ratio, Normalized Difference Vegetation Index, Enhanced Vegetation Index, Green Atmospherically Resistant Vegetation Index, Wide-Dynamic Range Vegetation Index, and Green Chlorophyll Index. Along with these indices, Green Leaf Area Index is also considered. In the results, Green Atmospherically Resistant Vegetation Index outperformed all other indices, reaching an average accuracy of 95.21%.

Keywords: Fuzzy C-Means clustering (FCM), neural network, Levenberg-Marquardt (LM) algorithm, vegetation indices

Procedia PDF Downloads 288
18745 Investigation Studies of WNbMoVTa and WNbMoVTaCr₀.₅Al Refractory High Entropy Alloys as Plasma-Facing Materials

Authors: Burçak Boztemur, Yue Xu, Laima Luo, M. Lütfi Öveçoğlu, Duygu Ağaoğulları

Abstract:

Tungsten (W) is used chiefly as a plasma-facing material. However, it has some problems, such as brittleness after plasma exposure. Refractory high-entropy alloys (RHEAs) offer a new opportunity to address this deficiency. In this study, the neutron shielding behavior of the WNbMoVTa and WNbMoVTaCr₀.₅Al compositions was examined against He⁺ irradiation. The effect of adding Al and Cr on the mechanical and irradiation properties of the WNbMoVTa base composition was investigated. Mechanical alloying (MA) was applied for 6 hours to obtain RHEA powders. According to X-ray diffraction (XRD), a body-centered cubic (BCC) phase and an NbTa phase, with a small amount of WC impurity originating from the vials and balls, were present after 6 h of MA. The RHEA powders were then consolidated by spark plasma sintering (SPS) (1500 ºC, 30 MPa, and 10 min). After SPS, XRD results showed (Nb,Ta)C and W₂C₀.₈₅ phases formed by the decomposition of WC and of the stearic acid added during MA. The BCC phase was also obtained for both samples. While an Al₂O₃ phase with a small intensity was seen for the WNbMoVTaCr₀.₅Al sample, a Ta₂VO₆ phase was found for the base sample. These phases were observed as three different regions by scanning electron microscopy (SEM). Measurements with an electron probe micro-analyzer (EPMA) coupled with a wavelength dispersive spectroscope (WDS) showed that all elements were distributed homogeneously in the white region. The grey region of the WNbMoVTa sample was rich in Ta, V, and O, whereas the amounts of Al and O were higher in the grey region of the WNbMoVTaCr₀.₅Al sample. High amounts of Nb, Ta, and C were determined for both samples. Archimedes’ densities measured in alcohol media were close to the theoretical densities of the RHEAs. These values were important for the microhardness and irradiation resistance of the compositions.
While the Vickers microhardness of the WNbMoVTa sample was measured as ~11 GPa, this value increased to nearly 13 GPa for the WNbMoVTaCr₀.₅Al sample. These values were consistent with the wear behavior. The wear volume loss decreased from 1.25×10⁻⁴ to 0.16×10⁻⁴ mm³ with the addition of Al and Cr to WNbMoVTa. He⁺ irradiation was conducted on the samples to observe surface damage. After irradiation, the XRD patterns shifted to the left because of defects and dislocations. He⁺ ions were infused under the surface and caused lattice expansion. The peak shift of the WNbMoVTaCr₀.₅Al sample was smaller than that of the WNbMoVTa base sample, thanks to less impact. A small amount of fuzz was observed for the base sample. With the addition of Cr and Al, this structure was removed and transformed into a wavy structure. Deformation hardening also occurred after irradiation. Based on the change in microhardness values, less hardening was obtained with the WNbMoVTaCr₀.₅Al sample. Surface deformation was reduced in the WNbMoVTaCr₀.₅Al sample.

Keywords: refractory high entropy alloy, microhardness, wear resistance, He⁺ irradiation

Procedia PDF Downloads 48
18744 Treatment of Grey Water from Different Restaurants in FUTA Using Fungi

Authors: F. A. Ogundolie, F. Okogue, D. V. Adegunloye

Abstract:

Greywater samples were obtained from three restaurants in the Federal University of Technology, Akure, coded SSR, MGR and GGR. Fungal isolates obtained include Rhizopus stolonifer, Aspergillus niger, Mucor mucedo, Aspergillus flavus, and Saccharomyces cerevisiae. Of these isolates, R. stolonifer, A. niger and A. flavus showed significant degradation ability on grey water and were used for this research. A simple bioreactor was constructed using a biodegradation process to purify the waste water samples. The waste water undergoes primary treatment; secondary treatment, which involves introducing the isolated organisms into the waste water sample; and tertiary treatment, which involves the use of a filter candle and sand bed filtration to achieve the end product without the use of chemicals. A. niger brought about a significant reduction in both the bacterial load and the fungal load of the greywater samples from the three restaurants, with reductions of 1.29 × 10⁸ to 1.57 × 10² cfu/ml, 1.04 × 10⁸ to 1.12 × 10² cfu/ml, and 1.72 × 10⁸ to 1.60 × 10² cfu/ml in bacterial load in SSR, MGR and GGR respectively, and reductions of 2.01 × 10⁴ to 1.2 × 10¹, 1.72 × 10⁴ to 1.1 × 10¹, and 2.50 × 10⁴ to 1.5 × 10¹ in fungal load in SSR, MGR and GGR respectively. The degradation results showed that A. niger was probably more potent in degrading organic matter; hence, A. niger could be used in the treatment of wastewater.

Keywords: Aspergillus niger, greywater, bacterial, fungi, microbial load, bioreactor, biodegradation, purification, organic matter and filtration

Procedia PDF Downloads 287
18743 Real-Time Classification of Marbles with Decision-Tree Method

Authors: K. S. Parlak, E. Turan

Abstract:

The separation of marbles by pattern quality is a process that relies on expert decisions. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed that classifies marbles quickly and with a high success rate. The system performs ten feature extractions by taking ten marble images from the camera. The marbles are classified by the decision tree method using the extracted features. The user forms the training set by training the system at the marble classification stage. The system improves itself with every marble image that is classified. The aim of the proposed system is to minimize the error introduced by the person performing the classification and to complete the classification quickly.

Keywords: decision tree, feature extraction, k-means clustering, marble classification

Procedia PDF Downloads 356
18742 A Systematic Review of Process Research in Software Engineering

Authors: Tulasi Rayasa, Phani Kumar Pullela

Abstract:

A systematic review is a research method that involves collecting and evaluating the information on a specific topic in order to provide a comprehensive and unbiased review. This type of review aims to improve the software development process by ensuring that the research is thorough and accurate. To ensure objectivity, it is important to follow systematic guidelines and consider multiple sources, such as literature reviews, interviews, and surveys. The evaluation process should also be streamlined by incorporating research from journals and other sources, such as grey literature. The main goal of a systematic review is to identify the consistency of current models in the field of computer application and software engineering.

Keywords: computer application, software engineering, process research, data science

Procedia PDF Downloads 70
18741 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering  

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the recent decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue origin of these DNA fragments from the plasma can result in faster and more accurate disease diagnosis and more precise treatment protocols. Open chromatin regions (OCRs) are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. There have been several studies in the area of cancer liquid biopsy that integrate distinct genomic and epigenomic features for early cancer detection along with tissue of origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which incurs substantial cost and time. To overcome these limitations, the idea of predicting OCRs from whole genome sequencing (WGS) data is of particular importance. In this regard, we proposed a computational approach to predict open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequencing data. To fulfill this objective, local sequencing depth is fed to our proposed algorithm, and the most probable open chromatin regions are predicted from whole genome sequencing data. Our method integrates signal processing with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering.
To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions related to human blood samples in the ATAC-DB database. The percentage of overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. As is evident, OCRs are mostly located at the transcription start sites (TSS) of genes. In this regard, we compared the concordance between the predicted OCRs and the human gene TSS regions obtained from refTSS, which showed concordance of around 52.04% and ~78% with all genes and with housekeeping genes, respectively. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector with some restrictions, such as the need for high-depth cfDNA WGS data, prior information about the distribution of OCRs, and consideration of multiple features. However, we implemented graph signal clustering based on a single depth feature in an unsupervised learning manner, which resulted in faster performance and decent accuracy. Overall, we tried to investigate the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 114
18740 Neural Network Based Path Loss Prediction for Global System for Mobile Communication in an Urban Environment

Authors: Danladi Ali

Abstract:

In this paper, we measured GSM signal strength in the city of Dnepropetrovsk in order to predict path loss in the study area using nonlinear autoregressive neural network prediction, and we also used neural network clustering to determine the average GSM signal strength received in the study area. The nonlinear autoregressive neural network predicted that the GSM signal is attenuated with a mean square error (MSE) of 2.6748 dB; this attenuation value is used to modify the COST 231 Hata and Okumura-Hata models. The neural network clustering revealed that -75 dB to -95 dB is received more frequently. This means that the signal received in the study area is mostly weak.

Keywords: one-dimensional multilevel wavelets, path loss, GSM signal strength, propagation, urban environment and model

Procedia PDF Downloads 356
18739 Hybrid Hierarchical Routing Protocol for WSN Lifetime Maximization

Authors: H. Aoudia, Y. Touati, E. H. Teguig, A. Ali Cherif

Abstract:

Designing and developing routing protocols for wireless sensor networks requires consideration of constraints such as network lifetime and energy consumption. In this paper, we propose a hybrid hierarchical routing protocol named HHRP, which combines a clustering mechanism with multipath optimization and takes into account residual energy and RSSI measurements. HHRP dynamically classifies nodes into clusters in which coordinator nodes with extra privileges can manipulate messages, aggregate data, and ensure transmission between nodes according to TDMA and CDMA schedules. The network is reconfigured dynamically based on a threshold value associated with the number of nodes belonging to the smallest cluster. To show the effectiveness of HHRP, a comparative study with the LEACH protocol is presented through simulations.

Keywords: routing protocol, optimization, clustering, WSN

Procedia PDF Downloads 435
18738 Genetic Diversity in Capsicum Germplasm Based on Inter Simple Sequence Repeat Markers

Authors: Siwapech Silapaprayoon, Januluk Khanobdee, Sompid Samipak

Abstract:

Chili peppers are the fruits of Capsicum pepper plants, well known for the fiery burning sensation they leave on the tongue after consumption. They are members of the Solanaceae, or common nightshade family, along with potato, tomato and eggplant. Thai cuisine has gained popularity for its distinct flavors, which come from the use of various spices and the heat added by chili pepper. Though used in small quantities in each dish, chili pepper holds a special place in Thai cuisine. There are many varieties of chili peppers in Thailand, and thirty accessions were collected at Rajamangala University of Technology Lanna, Lampang, Thailand. To manage any germplasm effectively, it is essential to know the diversity and relationships among its members. Thirty-six Inter Simple Sequence Repeat (ISSR) DNA markers were used to analyze the germplasm. A total of 335 polymorphic bands were obtained, giving an average of 9.3 alleles per marker. Clustering of the data by the unweighted pair-group method with arithmetic mean (UPGMA) using NTSYS-pc software indicated that the accessions showed varied levels of genetic similarity, with similarity coefficients ranging from 0.57 to 1.00, indicating significant levels of variation. At an SM coefficient of 0.81, the germplasm was separated into four groups. Phenotypic variation was discussed in the context of the phylogenetic tree clustering.
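
To make the clustering step concrete, here is a small Python sketch of grouping accessions by UPGMA on the simple matching (SM) coefficient and stopping at a similarity threshold. The study itself used NTSYS-pc; this standalone version, its function names, and the 0/1 band-profile input format are illustrative assumptions.

```python
from itertools import combinations


def simple_matching(a, b):
    """SM coefficient: share of marker bands where two 0/1 profiles agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def upgma_groups(profiles, threshold):
    """Merge clusters by average pairwise similarity (UPGMA) until the best
    available merge falls below `threshold`; return the remaining groups.

    `profiles` maps an accession name to its list of 0/1 band scores.
    """
    clusters = [[name] for name in profiles]

    def avg_sim(c1, c2):
        total = sum(simple_matching(profiles[i], profiles[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = combinations(range(len(clusters)), 2)
        (i, j), best = max(
            (((i, j), avg_sim(clusters[i], clusters[j])) for i, j in pairs),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)
```

With the thirty accessions scored over the polymorphic bands, `upgma_groups(profiles, 0.81)` would return the groups that remain at the 0.81 cut. The quadratic search over cluster pairs is fine at this scale but would be slow for large collections.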

Keywords: diversity, germplasm, Chili pepper, ISSR

Procedia PDF Downloads 128
18737 Probing Environmental Sustainability via Brownfield Remediation: A Framework to Manage Brownfields in Ethiopia Lesson to Africa

Authors: Mikiale Gebreslase Gebremariam, Chai Huaqi, Tesfay Gebretsdkan Gebremichael, Dawit Nega Bekele

Abstract:

In recent years, brownfield redevelopment projects (BRPs) have contributed to the overarching paradigm of the United Nations 2030 Agenda. At present, most developed nations have adopted BRPs as an efficacious urban policy tool. However, in developing countries and some advanced ones, BRPs are lacking because of limited awareness, policy tools, and financial capability for cleaning up brownfield sites. For example, the growth and development of Ethiopian cities were achieved at the cost of poor urban planning, including no community consultation and excessive urbanization for future growth. The demand for land resources is increasingly urgent as a result of population growth and migration to major cities and towns for socio-economic reasons. In the past, the sprawling development of major cities has stretched urbanization horizontally outwards in search of more land resources; while the outer cities grow, the inner cities suffer from environmental pollution. It is noteworthy that the rapid development of cities has not brought about an increase in people's happiness index. Thus, the proposed framework for managing brownfields in Ethiopia, as a lesson to developing nations facing similar challenges and growth, will add immense value in solving these problems and give insights into brownfield land utilization. Under the umbrella of the grey incidence decision-making model, and considering multiple stakeholders and tight environmental and economic constraints, the proposed management framework integrates criteria from economic, social, environmental, technical, and risk aspects into the model and gives useful guidance for managing brownfields in Ethiopia. Furthermore, it will contribute to the future development of the social economy and the missions of the 2030 UN sustainable development goals.

Keywords: Brownfields, environmental sustainability, Ethiopia, grey-incidence decision-making, sustainable urban development

Procedia PDF Downloads 59
18736 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has attracted significant interest. In this paper, a modified, efficient k-means-based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee convergence. Experiments on a real data set show its high performance compared with the original k-means version.
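
A minimal Python sketch of k-means-style clustering with the City Block (Manhattan) distance follows. The abstract does not describe the authors' modifications or their stop criterion, so the choices here are assumptions: centers start at the first k points, centers are updated with the per-dimension median (which minimizes the summed Manhattan distance), and iteration stops when assignments no longer change.

```python
from statistics import median


def manhattan(p, q):
    """City Block distance between two equal-length points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def cluster_city_block(points, k, max_iter=100):
    """Assign each point to the nearest center under Manhattan distance.

    Returns (labels, centers). Initial centers are the first k points (assumption).
    """
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: manhattan(p, centers[c])) for p in points
        ]
        # Assumed stop criterion: assignments are stable.
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # The per-dimension median minimizes total Manhattan distance.
                centers[c] = tuple(median(dim) for dim in zip(*members))
    return labels, centers
```

Manhattan distance avoids the squaring used by Euclidean distance, which is one plausible source of the reduced computational load the abstract mentions.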

Keywords: pattern recognition, global terrorism database, Manhattan distance, k-means clustering, terrorism data analysis

Procedia PDF Downloads 356
18735 Steel Bridge Coating Inspection Using Image Processing with Neural Network Approach

Authors: Ahmed Elbeheri, Tarek Zayed

Abstract:

Steel bridge deterioration has been a problem in North America in recent years. It is mainly attributed to harsh weather conditions. Steel bridges suffer from fatigue cracks and corrosion, which necessitate immediate inspection. Visual inspection is the most common technique for steel bridge inspection, but it depends on the inspector's experience, conditions, and work environment. Many Non-destructive Evaluation (NDE) models have therefore been developed that use non-destructive technologies to be more accurate, more reliable, and less dependent on human judgment. Non-destructive techniques such as the Eddy Current Method, the Radiographic Method (RT), the Ultrasonic Method (UT), infrared thermography, and laser technology have been used. Digital image processing will be used for corrosion detection as an alternative to visual inspection. Different models have used grey-level and color digital images for processing. However, color images proved to be better, as they use the color of the rust to distinguish it from different backgrounds. Detecting rust is an important step, as it is the first warning of corrosion and a sign of coating erosion. To decide which steel element should be repainted and how urgently, the percentage of rust should be calculated. In this paper, an image processing approach will be developed to detect corrosion and its severity. Two models were developed: the first to detect rust and the second to estimate the rust percentage.
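
The idea of using rust color to estimate a rust percentage can be sketched in Python as below. This is a plain color-threshold stand-in, not the neural network models the paper develops; the HSV thresholds, the function names, and the image format (rows of 8-bit RGB tuples) are all assumptions chosen for illustration.

```python
import colorsys


def is_rust(pixel, hue_max=40 / 360, min_sat=0.4, min_val=0.2):
    """Flag a pixel as rust-colored if it is a saturated red-orange-brown.

    `pixel` is an (R, G, B) tuple with 0-255 channels; thresholds are assumed.
    """
    r, g, b = (channel / 255 for channel in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h <= hue_max and s >= min_sat and v >= min_val


def rust_percentage(image):
    """Percentage of pixels flagged as rust in an image given as rows of RGB tuples."""
    pixels = [p for row in image for p in row]
    if not pixels:
        return 0.0
    return 100.0 * sum(is_rust(p) for p in pixels) / len(pixels)
```

Because grey pixels have near-zero saturation, a grey-level image gives this rule nothing to work with, which matches the abstract's point that color images separate rust from the background more easily.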

Keywords: steel bridge, bridge inspection, steel corrosion, image processing

Procedia PDF Downloads 278
18734 Assessment of Ultra-High Cycle Fatigue Behavior of EN-GJL-250 Cast Iron Using Ultrasonic Fatigue Testing Machine

Authors: Saeedeh Bakhtiari, Johannes Depessemier, Stijn Hertelé, Wim De Waele

Abstract:

High cycle fatigue comprising up to 10⁷ load cycles has been the subject of many studies, and the behavior of many materials has been recorded adequately in this regime. However, many applications involve larger numbers of load cycles during the lifetime of machine components. In this ultra-high cycle regime, other failure mechanisms come into play, and the concept of a fatigue endurance limit (assumed for materials such as steel) is often an oversimplification of reality. When machine component design demands high geometrical complexity, cast iron grades become interesting candidate materials. Grey cast iron is known for its low cost, high compressive strength, and good damping properties. However, the ultra-high cycle fatigue behavior of cast iron is poorly documented. The current work focuses on the ultra-high cycle fatigue behavior of EN-GJL-250 (GG25) grey cast iron by developing an ultrasonic (20 kHz) fatigue testing system. Moreover, the testing machine is instrumented to measure the temperature and displacement of the specimen and to control the temperature. The high resonance frequency made it possible to assess the behavior of the cast iron of interest within a matter of days for ultra-high numbers of cycles and to repeat the tests to quantify the natural scatter in fatigue resistance.

Keywords: GG25, cast iron, ultra-high cycle fatigue, ultrasonic test

Procedia PDF Downloads 137
18733 Genomic Prediction Reliability Using Haplotypes Defined by Different Methods

Authors: Sohyoung Won, Heebal Kim, Dajeong Lim

Abstract:

Genomic prediction is an effective way to measure the breeding ability of livestock based on genomic estimated breeding values, which are statistically predicted from genotype data using best linear unbiased prediction (BLUP). Using haplotypes (clusters of linked single nucleotide polymorphisms, SNPs) as markers instead of individual SNPs can improve the reliability of genomic prediction, since a quantitative trait locus is more likely to be in strong linkage disequilibrium (LD) with the markers. To use haplotypes efficiently in genomic prediction, optimal ways of defining haplotypes are needed. In this study, 770K SNP chip data were collected from a Hanwoo (Korean cattle) population consisting of 2506 cattle. Haplotypes were defined from the 770K SNP chip data in three different ways: based on 1) the length of the haplotypes (bp), 2) the number of SNPs, and 3) k-medoids clustering by LD. To compare the methods on equal terms, haplotypes defined by all methods were set to have comparable sizes; for each method, haplotypes with an average of 5, 10, 20, or 50 SNPs were tested. A modified GBLUP method using haplotype alleles as predictor variables was implemented to test the prediction reliability of each haplotype set. The conventional genomic BLUP (GBLUP) method, which uses individual SNPs, was also tested to evaluate the performance of the haplotype sets in genomic prediction. Carcass weight was used as the phenotype for testing. Haplotypes defined by all three methods showed increased reliability compared to conventional GBLUP. There were few differences in reliability between the haplotype-defining methods. The reliability of genomic prediction was highest when the average number of SNPs per haplotype was 20 in all three methods, implying that haplotypes of around 20 SNPs can be optimal as markers for genomic prediction.
When the number of alleles generated by each haplotype-defining method was compared, clustering by LD generated the fewest alleles. Using haplotype alleles for genomic prediction showed better performance, suggesting improved accuracy in genomic selection. The number of predictor variables decreased when the LD-based method was used, while all three haplotype-defining methods showed similar performance. This suggests that defining haplotypes based on LD can reduce computational costs and allow efficient prediction. Finding optimal ways to define haplotypes and using the haplotype alleles as markers can provide improved performance and efficiency in genomic prediction.
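
The second haplotype definition above (a fixed number of SNPs per haplotype) can be sketched as follows. This is a toy illustration, assuming phased chromosomes encoded as strings of 0/1 alleles; the function name `haplotype_alleles` and the data are hypothetical and do not reproduce the authors' pipeline. The total number of distinct alleles corresponds to the number of predictor variables that a haplotype-based model would need.

```python
from typing import List

def haplotype_alleles(phased: List[str], snps_per_block: int) -> List[List[str]]:
    """Cut every phased chromosome into consecutive blocks of `snps_per_block`
    SNPs and return, for each block, the sorted distinct haplotype alleles."""
    n_snps = len(phased[0])
    blocks = []
    for start in range(0, n_snps, snps_per_block):
        # Each distinct substring in this window is one haplotype allele
        alleles = sorted({chrom[start:start + snps_per_block] for chrom in phased})
        blocks.append(alleles)
    return blocks

# Four phased chromosomes, six SNPs each (toy data)
phased = ["010011", "010111", "110011", "010011"]
blocks = haplotype_alleles(phased, snps_per_block=3)
n_predictors = sum(len(alleles) for alleles in blocks)
print(blocks, n_predictors)
```

A definition that yields fewer distinct alleles per block gives fewer predictor variables, which is the computational saving the abstract attributes to LD-based clustering.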

Keywords: best linear unbiased predictor, genomic prediction, haplotype, linkage disequilibrium

Procedia PDF Downloads 114
18732 A Robust Spatial Feature Extraction Method for Facial Expression Recognition

Authors: H. G. C. P. Dinesh, G. Tharshini, M. P. B. Ekanayake, G. M. R. I. Godaliyadda

Abstract:

This paper presents a new spatial feature extraction method based on principal component analysis (PCA) and Fisher discriminant analysis (FDA) for facial expression recognition. It not only extracts reliable features for classification but also reduces the feature space dimensions of pattern samples. In this method, each grayscale image is first considered in its entirety as the measurement matrix. Then, the principal components (PCs) of the row vectors of this matrix and the variance of these row vectors along the PCs are estimated. This ensures that the spatial information of the facial image is preserved. Afterwards, by incorporating the spectral information of the eigen-filters derived from the PCs, a feature vector is constructed for a given image. Finally, FDA is used to define a set of basis vectors in a reduced-dimension subspace such that optimal clustering is achieved. FDA defines an inter-class scatter matrix and an intra-class scatter matrix to enhance the compactness of each cluster while maximizing the distance between cluster marginal points. To match the test image with the training set, a cosine similarity based Bayesian classification was used. The proposed method was tested on the Cohn-Kanade database and the JAFFE database. The proposed method, which incorporates spatial information to construct an optimal feature space, was observed to outperform the standard PCA and FDA based methods.
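
The matching step relies on cosine similarity between feature vectors. The sketch below shows only that similarity measure with a plain best-match rule; it deliberately replaces the Bayesian classification mentioned in the abstract with a simpler stand-in, and the feature vectors, labels, and function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(test_vector, training):
    """Return the label of the training vector most similar to `test_vector`.

    `training` is a list of (feature_vector, label) pairs.
    """
    return max(training, key=lambda item: cosine_similarity(test_vector, item[0]))[1]

# Toy reduced-dimension features (illustrative values only)
training = [([1.0, 0.1, 0.0], "happy"), ([0.0, 1.0, 0.2], "sad")]
print(best_match([0.9, 0.2, 0.0], training))
```

Cosine similarity ignores vector magnitude, so two images whose features differ only by an overall scale are treated as identical.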

Keywords: facial expression recognition, principal component analysis (PCA), Fisher discriminant analysis (FDA), eigen-filter, cosine similarity, Bayesian classifier, f-measure

Procedia PDF Downloads 405
18731 Development of a Robust Protein Classifier to Predict EMT Status of Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC) Tumors

Authors: Zhenlin Ju, Christopher P. Vellano, Rehan Akbani, Yiling Lu, Gordon B. Mills

Abstract:

The epithelial–mesenchymal transition (EMT) is a process by which epithelial cells acquire mesenchymal characteristics, such as profound disruption of cell-cell junctions, loss of apical-basolateral polarity, and extensive reorganization of the actin cytoskeleton to induce cell motility and invasion. A hallmark of EMT is its capacity to promote metastasis, which is due in part to activation of several transcription factors and subsequent downregulation of E-cadherin. Unfortunately, current approaches have yet to uncover robust protein marker sets that can classify tumors as possessing strong EMT signatures. In this study, we use reverse phase protein array (RPPA) data and consensus clustering methods to classify a subset of cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) tumors into an EMT protein signaling group (EMT group). The overall survival (OS) of patients in the EMT group is significantly worse than that of patients in the other groups (Hormone and PI3K/AKT signaling). In addition to a shrinkage and selection method for linear regression (LASSO), we applied training/test set and Monte Carlo resampling approaches to identify a set of protein markers that predicts the EMT status of CESC tumors. We fit a logistic model to these protein markers and developed a classifier, which was fixed in the training set and validated in the testing set. The classifier robustly predicted the EMT status of the testing set with an area under the curve (AUC) of 0.975 by receiver operating characteristic (ROC) analysis. This method not only identifies a core set of proteins underlying an EMT signature in cervical cancer patients but also provides a tool to examine protein predictors that drive molecular subtypes in other diseases.

Keywords: consensus clustering, TCGA CESC, Silhouette, Monte Carlo LASSO

Procedia PDF Downloads 438
18730 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits the class imbalance problem when one class has very few examples compared to the other class; this is also referred to as between-class imbalance. Traditional classifiers fail to classify minority class examples correctly because of their bias towards the majority class. Apart from between-class imbalance, within-class imbalance, where classes are composed of different numbers of sub-clusters and these sub-clusters contain different numbers of examples, also deteriorates the performance of the classifier. Many methods have previously been proposed for handling the imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithmic methods, cost-based methods, and ensembles of classifiers. Data preprocessing techniques have shown great potential because they attempt to improve the data distribution rather than the classifier. Data preprocessing handles class imbalance either by increasing the number of minority class examples or by decreasing the number of majority class examples. Decreasing the majority class examples leads to loss of information, and when the minority class is rare in absolute terms, removing majority class examples is generally not recommended. Existing methods for handling class imbalance do not address between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles both simultaneously for binary classification problems. Removing both kinds of imbalance simultaneously eliminates the bias of the classifier towards bigger sub-clusters by reducing the domination of the total error by bigger sub-clusters. The proposed method uses model-based clustering to find sub-clusters or sub-concepts in the dataset. The number of examples oversampled in each sub-cluster is determined by the complexity of the sub-cluster.
The method also takes into consideration the scatter of the data in the feature space and adaptively copes with unseen test data using the Lowner-John ellipsoid to increase the accuracy of the classifier. In this study, a neural network is used because it is a classifier that minimizes the total error, so removing between-class and within-class imbalance simultaneously helps the classifier give equal weight to all sub-clusters irrespective of class. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the Euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to the other methods in terms of various accuracy measures. Thus, the proposed method can serve as a good alternative for problem domains such as credit scoring, customer churn prediction, and financial distress, which typically involve imbalanced data sets.

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 391
18729 Hybrid Adaptive Modeling to Enhance Robustness of Real-Time Optimization

Authors: Hussain Syed Asad, Richard Kwok Kit Yuen, Gongsheng Huang

Abstract:

Real-time optimization has been considered an effective approach for improving the energy-efficient operation of heating, ventilation, and air-conditioning (HVAC) systems. In model-based real-time optimization, model mismatches cannot be avoided. When model mismatches are significant, the performance of the real-time optimization is impaired and the expected energy saving is reduced. In this paper, the effect of chiller plant model mismatches on real-time optimization is considered. In the real-time optimization of a chiller plant, a simplified semi-physical or grey-box model of the chiller is always used, which should be identified using available operation data. To overcome the model mismatches associated with the chiller model, a hybrid Genetic Algorithms (HGAs) method is used for online real-time training of the chiller model. HGAs combine the Genetic Algorithms (GAs) method (for global search) with a traditional optimization method (which is faster and more efficient for local search) to avoid the conventional trial-and-error process of GAs. The identification of model parameters is formulated as an optimization problem whose objective function is the least square error between the output of the model and the actual output of the chiller plant. A case study is used to illustrate the implementation of the proposed method. It is shown that the proposed approach provides reliability in decision making, enhances the robustness of the real-time optimization strategy, and improves energy performance.

Keywords: energy performance, hybrid adaptive modeling, hybrid genetic algorithms, real-time optimization, heating, ventilation, and air-conditioning

Procedia PDF Downloads 388
18728 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an unmanned aerial vehicle’s visual navigation system based on automatic landmark recognition has a significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which also results in poor precision. This work aims to develop an automatic landmark selection method that takes an image of the flight area and identifies the best landmarks to be recognized by the visual navigation landmark recognition system. The criterion for selecting a landmark is based on features detected by ORB or AKAZE and on edge information for each possible landmark. Results show that the arrangement of possible landmarks differs markedly from human perception.

Keywords: clustering, edges, feature points, landmark selection, X-means

Procedia PDF Downloads 251
18727 Clustering Based and Centralized Routing Table Topology of Control Protocol in Mobile Wireless Sensor Networks

Authors: Mbida Mohamed, Ezzati Abdellah

Abstract:

A major challenge in wireless sensor networks (WSNs) is to save energy and achieve a long network lifetime without a high rate of information loss. Topology control (TC) protocols are designed so that the network is divided and a standard system is used for exchanging packets between nodes. In this article, we propose a clustering-based and centralized routing table TC protocol (CBCRT), which delegates a leader node that encapsulates a single routing table for the nodes of each cluster. Hence, if a node wants to send packets to the sink, it requests the routing table information of the current cluster from the leader node in order to route the packet.

Keywords: mobile wireless sensor networks, routing, topology of control, protocols

Procedia PDF Downloads 243
18726 GBKMeans: A Genetic Based K-Means Applied to the Capacitated Planning of Reading Units

Authors: Anderson S. Fonseca, Italo F. S. Da Silva, Robert D. A. Santos, Mayara G. Da Silva, Pedro H. C. Vieira, Antonio M. S. Sobrinho, Victor H. B. Lemos, Petterson S. Diniz, Anselmo C. Paiva, Eliana M. G. Monteiro

Abstract:

In Brazil, the National Electric Energy Agency (ANEEL) establishes that electrical energy companies are responsible for measuring and billing their customers. Among these regulations, a company must bill its customers within 27-33 days. If a relocation or a change of period is required, the consumer must be notified in writing in advance of a billing period. To make it easier to organize a workday’s measurements, these companies create a reading plan. These plans consist of grouping customers into reading groups, which are visited by an employee responsible for measuring consumption and billing. Creating a plan efficiently and optimally is a capacitated clustering problem with constraints related to homogeneity and compactness, that is, the employee’s workload and the geographical position of the consuming units. This process is carried out manually by several experts who know the geography of the region; it takes many days to complete the final plan, and because it is a human activity, there is no guarantee of finding the best plan. In this paper, the GBKMeans method is presented, a technique based on K-Means and genetic algorithms for creating capacitated clusters that respect the established constraints in an efficient and balanced manner and that minimize the cost of relocating consumer units and the time required to create the final plan. The results obtained by the presented method are compared with the current plan of a real city, showing an improvement of 54.71% in the standard deviation of the workload and 11.97% in the compactness of the groups.
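
The workload-balance figure reported above can be read as a reduction in the standard deviation of per-group workload. The sketch below shows one way such a metric might be computed; the grouping data, the function names, and the definition of improvement as a relative reduction are assumptions for illustration, not details taken from the paper.

```python
from statistics import pstdev

def workload_std(groups):
    """Population standard deviation of the total workload per reading group.

    `groups` is a list of groups; each group is a list of per-unit workloads.
    """
    return pstdev([sum(group) for group in groups])

def improvement_percent(before, after):
    """Relative reduction of a cost-like metric, in percent (assumed definition)."""
    return 100.0 * (before - after) / before

# Same seven consumer units (total workload 25), grouped two different ways
current_plan = [[5, 5, 5], [1, 1], [4, 4]]      # group loads: 15, 2, 8
proposed_plan = [[5, 4], [5, 4], [5, 1, 1]]     # group loads: 9, 9, 7

before = workload_std(current_plan)
after = workload_std(proposed_plan)
print(f"{improvement_percent(before, after):.1f}% lower workload deviation")
```

A lower standard deviation means the reading groups carry more similar workloads, which is the homogeneity constraint described in the abstract.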

Keywords: capacitated clustering, k-means, genetic algorithm, districting problems

Procedia PDF Downloads 170
18725 New Variational Approach for Contrast Enhancement of Color Image

Authors: Wanhyun Cho, Seongchae Seo, Soonja Kang

Abstract:

In this work, we propose a variational technique for image contrast enhancement that uses global and local information around each pixel. The energy functional is defined as a weighted linear combination of three terms: a local contrast term, a global contrast term, and a dispersion term. The local contrast term improves the contrast of an input image by increasing the grey-level differences between each pixel and its neighbors, thereby using the contextual information around each pixel. The global contrast term enhances the contrast of the image by minimizing the difference between its empirical distribution function and a cumulative distribution function, so that the probability distribution of pixel values becomes symmetric about the median. The dispersion term controls the departure of the new pixel values from the pixel values of the original image while preserving the characteristics of the original image as well as possible. Next, we derive the Euler-Lagrange equation for the true image that minimizes the proposed functional by using the fundamental lemma of the calculus of variations. We then consider a procedure for solving this equation with a gradient descent method, which is one of the dynamic approximation techniques. Finally, through various experiments, we demonstrate that the proposed method enhances the contrast of color images better than existing techniques.

Keywords: color image, contrast enhancement technique, variational approach, Euler-Lagrange equation, dynamic approximation method, EME measure

Procedia PDF Downloads 423
18724 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced by various sources on different topics. The availability of this enormous amount of data requires effective computational tools to explore it. This has led to intense and growing interest in the research community in developing computational methods for processing text data. One line of study focuses on condensing text so that a higher level of understanding can be reached in a shorter time. The two important tasks for this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key words of a text, which conveys its general topic. In text summarization, we are interested in producing a short text that includes the important information in the document. The TextRank algorithm, an unsupervised learning method that extends PageRank (the base algorithm of the Google search engine for searching and ranking pages), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and return them as a result. However, it neglects the semantic similarity between the different parts. In this work, we improve the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used on its own or as part of summary generation to overcome coverage problems.
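
For orientation, the baseline TextRank idea for keyword extraction can be sketched in a few lines: build a word co-occurrence graph and run PageRank-style score updates on it. This is a simplified illustration of plain TextRank, not the semantic extension proposed in the abstract; the window size, damping factor, iteration count, and function name are assumed values.

```python
from collections import defaultdict

def textrank_keywords(words, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by PageRank-style scores on an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    # Link every pair of distinct words that appear within `window` positions
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != word:
                neighbors[word].add(words[j])
                neighbors[words[j]].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of the score of each neighbor
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

tokens = "graph ranking scores graph nodes graph edges".split()
print(textrank_keywords(tokens, top_k=1))
```

A semantic variant along the lines the abstract describes could weight each edge by a similarity score between the two words instead of treating all co-occurrences equally; how the authors actually do this is not stated here.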

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 46
18723 Authentication Based on Hand Movement by Low Dimensional Space Representation

Authors: Reut Lanyado, David Mendlovic

Abstract:

Most biometric methods for authentication require special equipment, and some of them are easy to fake. We propose a method for authentication based on hand movement while typing a sentence, captured with a regular camera. This technique uses the full video of the hand, which is harder to fake. In the first phase, we tracked the hand joints in each frame. Next, we represented a single frame for each individual using our Pose Agnostic Rotation and Movement (PARM) dimensional space. Then, we represented a full video of hand movement in a fixed low-dimensional space using Fixed Dimension Video by Interpolation Statistics (FDVIS). Finally, we identified each individual in the FDVIS representation using unsupervised clustering and supervised methods. Accuracy exceeds 96% for 80 individuals using supervised KNN.
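
The final identification step uses k-nearest neighbors (KNN) on fixed-length video representations. The sketch below shows a generic KNN vote on toy two-dimensional vectors; the vectors stand in for FDVIS representations, whose actual construction is not described here, and the names and the value of `k` are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the `k` training vectors closest to `query`.

    `train` is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy stand-ins for low-dimensional video representations of two users
train = [
    ((0.0, 0.1), "alice"), ((0.1, 0.0), "alice"), ((0.2, 0.1), "alice"),
    ((1.0, 1.1), "bob"), ((0.9, 1.0), "bob"), ((1.1, 0.9), "bob"),
]
print(knn_predict(train, (0.15, 0.05)))
```

Because every video is mapped to a vector of the same length, a distance-based classifier such as KNN can compare recordings of different durations directly.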

Keywords: authentication, feature extraction, hand recognition, security, signal processing

Procedia PDF Downloads 97