Search results for: genetic algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3376

2416 Using Econometric Methods to Explore Obesity Stigma and Avoidance of Breast and Cervical Cancer Screening

Authors: Stephanie A. Schauder, Gosia Sylwestrzak

Abstract:

Overweight and obese women report avoiding preventive care due to fear of weight-related bias from medical professionals. Gynecological exams, due to their sensitive and personally invasive nature, are especially susceptible to avoidance. This research investigates the association between body mass index (BMI) and screening rates for breast and cervical cancer using claims data from 1.3 million members of a large health insurance company. Because obesity is associated with increased cancer risk, screenings for these cancers should increase as BMI increases. However, this paper finds that the distribution of cancer screening rates by BMI takes an inverted U-shape, with underweight and obese members having the lowest screening rates. For cervical cancer screening, those in the target population with a BMI of 23 have the highest screening rate at 68%, while Obese Class III members have a screening rate of 50%. Those in the underweight category have a screening rate of 58%. This relationship persists even after controlling for health and demographic covariates in regression analysis. Interestingly, there is no association between BMI and BRCA (BReast CAncer gene) genetic testing. This is consistent with the narrative that stigma causes avoidance, because genetic testing does not involve any assessment of a person’s body. More work must be done to determine how to increase cancer screening rates in those who may feel stigmatized due to their weight.

Keywords: cancer screening, cervical cancer, breast cancer, weight stigma, avoidance of care

Procedia PDF Downloads 196
2415 Hybrid Deep Learning and FAST-BRISK 3D Object Detection Technique for Bin-Picking Application

Authors: Thanakrit Taweesoontorn, Sarucha Yanyong, Poom Konghuayrob

Abstract:

Robotic arms have gained popularity in various industries due to their accuracy and efficiency. This research proposes a method for bin-picking tasks using the Cobot, combining the YOLOv5 CNNs model for object detection and pose estimation with traditional feature detection (FAST), feature description (BRISK), and matching algorithms. By integrating these algorithms and utilizing a small-scale depth sensor camera for capturing depth and color images, the system achieves real-time object detection and accurate pose estimation, enabling the robotic arm to pick objects correctly in both position and orientation. Furthermore, the proposed method is implemented within the ROS framework to provide a seamless platform for robotic control and integration. This integration of robotics, cameras, and AI technology contributes to the development of industrial robotics, opening up new possibilities for automating challenging tasks and improving overall operational efficiency.

Keywords: robotic vision, image processing, applications of robotics, artificial intelligence

Procedia PDF Downloads 90
2414 Learning Algorithms for Fuzzy Inference Systems Composed of Double- and Single-Input Rule Modules

Authors: Hirofumi Miyajima, Kazuya Kishida, Noritaka Shigei, Hiromi Miyajima

Abstract:

Most self-tuning fuzzy systems, which are automatically constructed from learning data, are based on the steepest descent method (SDM). However, this approach often requires a long convergence time and gets stuck in shallow local minima. One solution is to use fuzzy rule modules with a small number of inputs, such as DIRMs (Double-Input Rule Modules) and SIRMs (Single-Input Rule Modules). In this paper, we consider a (generalized) DIRMs model composed of double- and single-input rule modules. Further, in order to reduce the redundant modules in the (generalized) DIRMs model, pruning and generative learning algorithms for the model are proposed. To show their effectiveness, numerical simulations for function approximation, Box-Jenkins, and obstacle avoidance problems are performed.

Keywords: Box-Jenkins's problem, double-input rule module, fuzzy inference model, obstacle avoidance, single-input rule module

Procedia PDF Downloads 350
2413 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, all of which contain useful information, irrelevant information, and noise. In the case of each signal being considered a feature, "Feature Selection" is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection are performed on the dataset, and useful signals were considered for further study. Afterward, common machine learning algorithms were employed to predict whether the signal yields pass or fail. The most relevant algorithm is selected for prediction based on the accuracy and loss of the ML model.

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 104
2412 Blind Super-Resolution Reconstruction Based on PSF Estimation

Authors: Osama A. Omer, Amal Hamed

Abstract:

Successful blind image super-resolution algorithms require the exact estimation of the point spread function (PSF). In the absence of any prior information about the imaging system and the true image, this estimation is normally done by trial-and-error experimentation until an acceptable restored image quality is obtained. Multi-frame blind super-resolution algorithms often suffer from slow convergence and sensitivity to complex noise. This paper presents a super-resolution image reconstruction algorithm based on estimating the PSF that yields the optimum restored image quality. The PSF is estimated by the knife-edge method, implemented by measuring the spreading of the edges in the reproduced HR image itself during the reconstruction process. The proposed image reconstruction approach uses L1 norm minimization and robust regularization based on a bilateral prior to deal with different data and noise models. A series of experimental results shows that the proposed method can outperform previous work robustly and efficiently.

Keywords: blind, PSF, super-resolution, knife-edge, blurring, bilateral, L1 norm

Procedia PDF Downloads 361
2411 Exploring Factors That May Contribute to the Underdiagnosis of Hereditary Transthyretin Amyloidosis in African American Patients

Authors: Kelsi Hagerty, Ami Rosen, Aaliyah Heyward, Nadia Ali, Emily Brown, Erin Demo, Yue Guan, Modele Ogunniyi, Brianna McDaniels, Alanna Morris, Kunal Bhatt

Abstract:

Hereditary transthyretin amyloidosis (hATTR) is a progressive, multi-systemic, and life-threatening disease caused by a disruption in the TTR protein that delivers thyroxine and retinol to the liver. This disruption causes the protein to misfold into amyloid fibrils, leading to the accumulation of the amyloid fibrils in the heart, nerves, and GI tract. Over 130 variants in the TTR gene are known to cause hATTR. The Val122Ile variant is the most common in the United States and is seen almost exclusively in people of African descent. TTR variants are inherited in an autosomal dominant fashion and have incomplete penetrance and variable expressivity. Individuals with hATTR may exhibit symptoms from as early as 30 years to as late as 80 years of age. hATTR is characterized by a wide range of clinical symptoms such as cardiomyopathy, neuropathy, carpal tunnel syndrome, and GI complications. Without treatment, hATTR leads to progressive disease and can ultimately lead to heart failure. hATTR disproportionately affects individuals of African descent; the estimated prevalence of hATTR among Black individuals in the US is 3.4%. Unfortunately, hATTR is often underdiagnosed and misdiagnosed because many symptoms of the disease overlap with other cardiac conditions. Due to the progressive nature of the disease, multi-systemic manifestations that can lead to a shortened lifespan, and the availability of free genetic testing and promising FDA-approved therapies that enhance treatability, early identification of individuals with a pathogenic hATTR variant is important, as this can significantly impact medical management for patients and their relatives. Furthermore, recent literature suggests that TTR genetic testing should be performed in all patients with suspicion of TTR-related cardiomyopathy, regardless of age, and that follow-up with genetic counseling services is recommended. 
Relatives of patients with hATTR benefit from genetic testing because testing can identify carriers early and allow relatives to receive regular screening and management. Despite the striking prevalence of hATTR among Black individuals, hATTR remains underdiagnosed in this patient population, and germline genetic testing for hATTR in Black individuals seems to be underrepresented, though the reasons for this have not yet been brought to light. Historically, Black patients have experienced a number of barriers to seeking healthcare that have been hypothesized to perpetuate the underdiagnosis of hATTR, such as lack of access and mistrust of healthcare professionals. Prior research has described a myriad of factors that shape an individual’s decision about whether to pursue presymptomatic genetic testing for a familial pathogenic variant, such as family closeness and communication, family dynamics, and a desire to inform other family members about potential health risks. This study explores these factors through 10 in-depth interviews with patients with hATTR about what may be contributing to the underdiagnosis of hATTR in the Black population. Participants were selected from the Emory University Amyloidosis clinic based on having a molecular diagnosis of hATTR. Interviews were recorded and transcribed verbatim, then coded using MAXQDA software. Thematic analysis was completed to draw commonalities between participants. Upon preliminary analysis, several themes have emerged. Barriers identified include i) misdiagnosis and a prolonged diagnostic odyssey, ii) family communication and dynamics surrounding health issues, iii) perceptions of healthcare and one’s own health risks, and iv) the need for more intimate provider-patient relationships and communication. Overall, this study gleaned valuable insight from members of the Black community about possible factors contributing to the underdiagnosis of hATTR, as well as potential solutions for resolving this issue.

Keywords: cardiac amyloidosis, heart failure, TTR, genetic testing

Procedia PDF Downloads 95
2410 Multi-Spectral Medical Images Enhancement Using a Weber’s law

Authors: Muna F. Al-Sammaraie

Abstract:

The aim of this research is to present multi-spectral image enhancement methods for achieving realistic digital images when the original image populates only a small portion of the available range of digital values. Also, a quantitative measure of image enhancement is presented. This measure is related to concepts of Weber's law of the human visual system. For decades, several image enhancement techniques have been proposed. Although most techniques require a profuse number of advanced and critical steps, the results for the perceived image are not satisfactory. This study involves changing the original values so that more of the available range is used, which increases the contrast between features and their backgrounds. It consists of reading the binary image pixel by pixel, byte-wise, and displaying it; calculating the statistics of the image; automatically enhancing the color of the image based on those statistics; and working with RGB color bands. Finally, the enhanced image is displayed along with the image histogram. A number of experimental results illustrate the performance of these algorithms. In particular, the quantitative measure has helped to select optimal processing parameters: the best parameters and transform.
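The idea of "changing the original values so that more of the available range is used" can be illustrated with a plain linear contrast stretch applied to each RGB band. This is only a minimal sketch under stated assumptions: the abstract does not say which transform the author uses, and the function names and pixel values below are hypothetical.

```python
def stretch_band(values, out_min=0, out_max=255):
    """Linearly rescale one colour band so that it spans [out_min, out_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # flat band: there is no contrast to stretch
        return list(values)
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (v - lo) * scale) for v in values]


def stretch_rgb(pixels):
    """Apply the stretch independently to the R, G and B bands."""
    bands = [stretch_band(band) for band in zip(*pixels)]
    return list(zip(*bands))


# Hypothetical low-contrast pixels that occupy only part of the 0-255 range.
pixels = [(100, 60, 90), (110, 70, 95), (120, 80, 100)]
stretched = stretch_rgb(pixels)
```

After the stretch, the darkest pixel in each band maps to 0 and the brightest to 255, so the histogram of every band covers the full range.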

Keywords: image enhancement, multi-spectral, RGB, histogram

Procedia PDF Downloads 325
2409 Non-Destructive Evaluation for Physical State Monitoring of an Angle Section Thin-Walled Curved Beam

Authors: Palash Dey, Sudip Talukdar

Abstract:

In this work, a cross-breed approach is presented for obtaining both the damage intensity and the location of damage existing in thin-walled members. This cross-breed approach is based on response surface methodology (RSM) and a genetic algorithm (GA). A theoretical finite element (FE) model of a cracked angle-section thin-walled curved beam has been linked to the developed approach to carry out trial experiments that generate response surface functions (RSFs) of free, forced and heterogeneous dynamic response data. Subsequently, the error between the computed response surface functions and the measured dynamic response data has been minimized using the GA to find the optimum damage parameters (damage intensity and location). A single crack of varying location and depth has been considered in this study. The presented approach has been found to predict crack parameters with good accuracy and to possess great potential in crack detection, as it requires only the current response of a cracked beam.
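The minimization step described above can be sketched with a small real-coded genetic algorithm. Everything specific here is an assumption made for illustration: the three polynomial response surfaces, the normalized parameter range [0, 1], and the GA settings are invented stand-ins, not the authors' fitted RSFs or their GA configuration.

```python
import random


def response_surfaces(location, depth):
    """Hypothetical response surface functions (stand-ins for fitted RSFs)."""
    return (
        10.0 - 3.0 * location - 2.0 * depth + location * depth,
        25.0 - location - 6.0 * depth ** 2,
        40.0 - 2.0 * location - 5.0 * location * depth,
    )


def error(candidate, measured):
    """Sum of squared differences between predicted and measured responses."""
    predicted = response_surfaces(*candidate)
    return sum((p - m) ** 2 for p, m in zip(predicted, measured))


def clip(x):
    """Keep a normalized parameter inside [0, 1]."""
    return min(1.0, max(0.0, x))


def genetic_search(measured, pop_size=40, generations=200, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: error(c, measured))
        parents = population[: pop_size // 2]  # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()  # arithmetic crossover weight
            children.append(tuple(
                clip(w * x + (1 - w) * y + rng.gauss(0.0, 0.02))  # Gaussian mutation
                for x, y in zip(a, b)
            ))
        population = parents + children
    return min(population, key=lambda c: error(c, measured))


true_damage = (0.35, 0.60)  # made-up normalized crack location and depth
measured = response_surfaces(*true_damage)
best = genetic_search(measured)
```

Because the "measured" responses are generated from the same hypothetical surfaces, the search should recover values close to `true_damage`; with real measurements the residual error would not vanish.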

Keywords: damage parameters, finite element, genetic algorithm, response surface methodology, thin walled curved beam

Procedia PDF Downloads 245
2408 All Types of Base Pair Substitutions Induced by γ-Rays in Haploid and Diploid Yeast Cells

Authors: Natalia Koltovaya, Nadezhda Zhuchkina, Ksenia Lyubimova

Abstract:

We study the biological effects induced by ionizing radiation in view of therapeutic exposure and the idea of space flights beyond Earth's magnetosphere. In particular, we examine the differences between base pair substitution induction by ionizing radiation in model haploid and diploid yeast Saccharomyces cerevisiae cells. Such mutations are difficult to study in higher eukaryotic systems. In our research, we have used a collection of six isogenic trp5-strains and 14 isogenic haploid and diploid cyc1-strains that are specific markers of all possible base-pair substitutions. These strains differ from each other only in single base substitutions within codon-50 of the trp5 gene or codon-22 of the cyc1 gene. We obtained different mutation spectra for the two haploid genetic assays (trp5 and cyc1), and different mutation spectra for the same cyc1 system in cells of different ploidy (haploid and diploid). The dose dependence was linear in haploid cells and exponential in diploid cells. We suggest that the differences between haploid yeast strains reflect the dependence on the sequence context, while the differences between haploid and diploid strains reflect the different molecular mechanisms of mutations.

Keywords: base pair substitutions, γ-rays, haploid and diploid cells, yeast Saccharomyces cerevisiae

Procedia PDF Downloads 153
2407 Improving the Efficiency of a High Pressure Turbine by Using Non-Axisymmetric Endwall: A Comparison of Two Optimization Algorithms

Authors: Abdul Rehman, Bo Liu

Abstract:

Axial flow turbines are commonly designed with high loads that generate strong secondary flows and result in high secondary losses. These losses account for roughly 30% to 50% of the total losses. Non-axisymmetric endwall profiling is one of the passive control techniques used to reduce the secondary flow loss. In this paper, the non-axisymmetric endwall profile construction and optimization for the stator endwalls are presented to improve the efficiency of a high pressure turbine. The commercial code NUMECA Fine/Design3D coupled with Fine/Turbo was used for the numerical investigation, design of experiments and the optimization. All the flow simulations were conducted using steady RANS with the Spalart-Allmaras turbulence model. The non-axisymmetric endwalls of the stator hub and shroud were created by using a perturbation law based on Bezier curves. Each cut, having multiple control points, was created along the virtual streamlines in the blade channel. For the design of experiments, each sample was arbitrarily generated based on values automatically chosen for the control points defined during parameterization. The optimization was carried out with two algorithms, i.e. a stochastic algorithm and a gradient-based algorithm. For the stochastic algorithm, a genetic algorithm based on an artificial neural network was used as the optimization method in order to achieve the global optimum. The evaluation of the successive design iterations was performed using the artificial neural network prior to the flow solver. For the second case, the conjugate gradient algorithm with a three-dimensional CFD flow solver was used to systematically vary a free-form parameterization of the endwall. This method is efficient and less time-consuming because it uses derivative information of the objective function. The objective was to maximize the isentropic efficiency of the turbine while keeping the mass flow rate constant. 
The performance was quantified by using a multi-objective function. In addition to these two classes of optimization method, there were four optimization cases: the hub only, the shroud only, the combination of hub and shroud, and a fourth case in which the shroud endwall was optimized using the optimized hub endwall geometry. The hub optimization resulted in an increase in efficiency due to more homogeneous inlet conditions for the rotor. The adverse pressure gradient was reduced, but the total pressure loss in the vicinity of the hub was increased. The shroud optimization resulted in an increase in efficiency, while total pressure loss and entropy were reduced. The combination of hub and shroud did not show the strong improvements that were achieved for the individual cases of the hub and the shroud. This may be caused by the fact that there were too many control variables. The fourth case of optimization showed the best result because the optimized hub was used as the initial geometry to optimize the shroud. The efficiency increased more than in the individual optimization cases, with a mass flow rate equal to that of the baseline turbine design. The results of the artificial neural network and conjugate gradient methods were compared.
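To make the Bezier-based perturbation law more concrete, the sketch below evaluates one endwall "cut" from its control points using De Casteljau's algorithm. The control-point coordinates and the choice of two free inner points are illustrative assumptions; the abstract does not give the actual parameterization used in Fine/Design3D.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] with De Casteljau's algorithm."""
    points = list(control_points)
    while len(points) > 1:
        # Repeatedly interpolate between neighbouring points.
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


# Hypothetical cut across the blade passage: (pitchwise position, radial offset).
# The end points are pinned to zero offset; the heights of the two inner
# control points are the free design variables an optimizer would adjust.
cut = [(0.0, 0.0), (0.3, 2.0), (0.7, -1.0), (1.0, 0.0)]
profile = [bezier_point(cut, i / 10) for i in range(11)]
```

An optimizer, whether genetic or gradient-based, would then vary only the inner control-point heights, which keeps the number of design variables per cut small.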

Keywords: artificial neural network, axial turbine, conjugate gradient method, non-axisymmetric endwall, optimization

Procedia PDF Downloads 221
2406 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems use artificial intelligence practices to filter obscure information and can predict whether a user will like a specified item. Kernel mapping recommender (KMR) systems, which are accurate, state-of-the-art algorithms, have been proposed to address recommender system design objectives such as long tail, cold-start, and sparsity. The aim of this research is to propose a hybrid framework that can efficiently integrate different versions of the KMR algorithm, namely item-based and user-based KMR. We have proposed various heuristic algorithms that integrate different versions of KMR into a unified framework, resulting in improved accuracy and elimination of problems associated with conventional recommender systems. We have tested our system on a publicly available movie dataset and benchmarked it against KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate, especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 332
2405 Machine Learning for Disease Prediction Using Symptoms and X-Ray Images

Authors: Ravija Gunawardana, Banuka Athuraliya

Abstract:

Machine learning has emerged as a powerful tool for disease diagnosis and prediction. The use of machine learning algorithms has the potential to improve the accuracy of disease prediction, thereby enabling medical professionals to provide more effective and personalized treatments. This study focuses on developing a machine-learning model for disease prediction using symptoms and X-ray images. The importance of this study lies in its potential to assist medical professionals in accurately diagnosing diseases, thereby improving patient outcomes. Respiratory diseases are a significant cause of morbidity and mortality worldwide, and chest X-rays are commonly used in the diagnosis of these diseases. However, accurately interpreting X-ray images requires significant expertise and can be time-consuming, making it difficult to diagnose respiratory diseases in a timely manner. By incorporating machine learning algorithms, we can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The study utilized the Mask R-CNN algorithm, which is a state-of-the-art method for object detection and segmentation in images, to process chest X-ray images. The model was trained and tested on a large dataset of patient information, which included both symptom data and X-ray images. The performance of the model was evaluated using a range of metrics, including accuracy, precision, recall, and F1-score. The results showed that the model achieved an accuracy rate of over 90%, indicating that it was able to accurately detect and segment regions of interest in the X-ray images. In addition to X-ray images, the study also incorporated symptoms as input data for disease prediction. The study used three different classifiers, namely Random Forest, K-Nearest Neighbor and Support Vector Machine, to predict diseases based on symptoms. These classifiers were trained and tested using the same dataset of patient information as the X-ray model. 
The results showed promising accuracy rates for predicting diseases using symptoms, with the ensemble learning techniques significantly improving the accuracy of disease prediction. The study's findings indicate that the use of machine learning algorithms can significantly enhance disease prediction accuracy, ultimately leading to better patient care. The model developed in this study has the potential to assist medical professionals in diagnosing respiratory diseases more accurately and efficiently. However, it is important to note that the accuracy of the model can be affected by several factors, including the quality of the X-ray images, the size of the dataset used for training, and the complexity of the disease being diagnosed. In conclusion, the study demonstrated the potential of machine learning algorithms for disease prediction using symptoms and X-ray images. The use of these algorithms can improve the accuracy of disease diagnosis, ultimately leading to better patient care. Further research is needed to validate the model's accuracy and effectiveness in a clinical setting and to expand its application to other diseases.
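As a rough illustration of how three symptom classifiers might be combined and scored, the sketch below applies a majority vote and computes accuracy, precision, recall and F1. Majority voting is an assumption, since the abstract only mentions "ensemble learning techniques", and the label lists are invented toy data, not the study's results.

```python
from collections import Counter


def majority_vote(*predictions):
    """Combine per-classifier label lists by taking the most common label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (1 = disease present); each "classifier" makes different mistakes.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
random_forest = [1, 1, 0, 0, 0, 1, 1, 0]
knn = [1, 0, 1, 0, 1, 0, 1, 0]
svm = [0, 1, 1, 0, 0, 0, 1, 1]

ensemble = majority_vote(random_forest, knn, svm)
ensemble_scores = binary_metrics(y_true, ensemble)
forest_scores = binary_metrics(y_true, random_forest)
```

In this toy case each classifier errs on different samples, so the vote corrects every individual mistake; real classifiers trained on the same data tend to make correlated errors, and the gain is usually smaller.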

Keywords: K-nearest neighbor, mask R-CNN, random forest, support vector machine

Procedia PDF Downloads 143
2404 Estimation of Heritability and Repeatability for Pre-Weaning Body Weights of Domestic Rabbits Raised in Derived Savanna Zone of Nigeria

Authors: Adewale I. Adeolu, Vivian U. Oleforuh-Okoleh, Sylvester N. Ibe

Abstract:

Heritability and repeatability estimates are needed for the genetic evaluation of livestock populations and, consequently, for upgrading or improvement. Pooled data on 604 progeny from three consecutive parities of purebred rabbit breeds (Chinchilla, Dutch and New Zealand White) raised in the Derived Savanna Zone of Nigeria were used to estimate heritability and repeatability for pre-weaning body weights between the 1st and 8th week of age. Traits studied include individual kit weight at birth (IKWB), 2nd week (IK2W), 4th week (IK4W), 6th week (IK6W) and 8th week (IK8W). Nested random effects analysis of (co)variances, as described by the Statistical Analysis System (SAS), was employed in the estimation. Respective heritability estimates from the sire component (h2s) and repeatability (R), as intra-class correlations of repeated measurements from the three parities, for IKWB, IK2W, IK4W, IK6W and IK8W are 0.59±0.24, 0.55±0.24, 0.93±0.31, 0.28±0.17, 0.64±0.26 and 0.12±0.14, 0.05±0.14, 0.58±0.02, 0.60±0.11, 0.20±0.14. Heritability and repeatability estimates (except R for IKWB and IK2W) are moderate to high. In conclusion, since pre-weaning body weights in the present study tended to be moderately to highly heritable and repeatable, improvement of rabbits raised in the derived savanna zone can be realized through genetic selection criteria.
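The two quantities can be sketched from variance components as follows. The formulas are the textbook ones for a nested sire/dam design and a balanced one-way ANOVA; the variance values and litter weights are made-up numbers, and the authors' SAS procedure may differ in detail (for example, in handling unbalanced data).

```python
def sire_heritability(var_sire, var_dam, var_within):
    """Heritability from the sire component of a nested (sire/dam) design."""
    return 4.0 * var_sire / (var_sire + var_dam + var_within)


def repeatability(groups):
    """Intra-class correlation from a balanced one-way ANOVA.

    groups: one list per animal, each holding the same number of repeated
    records (for example, one record per parity).
    """
    n, k = len(groups), len(groups[0])
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    ) / (n * (k - 1))
    var_between = (ms_between - ms_within) / k
    return var_between / (var_between + ms_within)


# Hypothetical kit weights (g) for four does across three parities.
records = [[50, 52, 51], [60, 61, 62], [55, 54, 56], [45, 47, 46]]
r = repeatability(records)
h2 = sire_heritability(var_sire=1.0, var_dam=1.0, var_within=6.0)
```

Here the does differ far more from each other than each doe varies across parities, so `r` comes out close to 1; the low R values reported for IKWB and IK2W correspond to the opposite situation.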

Keywords: heritability, nested design, parity, pooled data, repeatability

Procedia PDF Downloads 144
2403 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) in WLAN networks has been a static channel assignment process initiated by the user during the deployment process of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterward. However, the dramatically growing number of Wi-Fi APs and stations operating in the unlicensed band has led to dynamic, distributed, and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms which consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, due to the high overhead, can cause the eventually selected channel to no longer be optimal for operation due to the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind the cluster-based station reporting is the observation that STAs which are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel while causing high overhead, the AP divides STAs into clusters then assigns each STA in each cluster one channel to report feedback on. With the proper design of the cluster based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and, at times, better performance with a fraction of the overhead. 
We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.
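The reporting scheme can be sketched as a simple assignment step: once stations are grouped into clusters, each station in a cluster is asked to report on one candidate channel. The cluster labels are assumed to be given (the keywords suggest DBSCAN, which is not reproduced here), and the round-robin assignment and station names are illustrative assumptions rather than the authors' exact mechanism.

```python
from collections import defaultdict
from itertools import cycle


def assign_reports(sta_clusters, channels):
    """Give each STA one channel to report on, cycling through channels per cluster."""
    by_cluster = defaultdict(list)
    for sta, cluster in sta_clusters.items():
        by_cluster[cluster].append(sta)
    assignment = {}
    for members in by_cluster.values():
        # Nearby STAs see similar conditions, so each one covers a different channel.
        for sta, channel in zip(sorted(members), cycle(channels)):
            assignment[sta] = channel
    return assignment


# Hypothetical cluster labels for six stations and three candidate channels.
sta_clusters = {"sta1": 0, "sta2": 0, "sta3": 0, "sta4": 1, "sta5": 1, "sta6": 1}
channels = [1, 6, 11]
assignment = assign_reports(sta_clusters, channels)

reports_clustered = len(assignment)                # one report per STA
reports_full = len(sta_clusters) * len(channels)   # every STA reports every channel
```

In this toy setup the AP collects 6 reports instead of 18, while each cluster still covers all three channels. A cluster with fewer stations than channels would leave some channels unreported, which a real design would need to handle.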

Keywords: channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead

Procedia PDF Downloads 113
2402 Profit-Based Artificial Neural Network (ANN) Trained by Migrating Birds Optimization: A Case Study in Credit Card Fraud Detection

Authors: Ashkan Zakaryazad, Ekrem Duman

Abstract:

A typical classification technique ranks the instances in a data set according to the likelihood of belonging to one (positive) class. A credit card (CC) fraud detection model ranks the transactions in terms of their probability of being fraudulent. This approach is often criticized because firms do not care about fraud probability but about the profitability or costliness of detecting a fraudulent transaction. The key contribution of this study is to focus on profit maximization in the model building step. The artificial neural network proposed in this study works on the basis of profit maximization instead of minimizing the prediction error. Moreover, some studies have shown that the back propagation algorithm, like other gradient-based algorithms, usually gets trapped in local optima, and that swarm-based algorithms are more successful in this respect. In this study, we train our profit maximization ANN using Migrating Birds Optimization (MBO), which was recently introduced in the literature.
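The difference between minimizing error and maximizing profit can be seen on a handful of scored transactions. This sketch only illustrates the objective; it does not implement the ANN or MBO, and the profit model (a fixed review cost per alert, with the transaction amount saved when a fraud is caught) and all numbers are assumptions.

```python
def total_profit(scores, is_fraud, amounts, threshold, review_cost=5.0):
    """Profit from flagging every transaction whose score reaches the threshold."""
    profit = 0.0
    for score, fraud, amount in zip(scores, is_fraud, amounts):
        if score >= threshold:
            profit -= review_cost   # every alert costs an investigation
            if fraud:
                profit += amount    # a caught fraud saves the transaction amount
    return profit


def error_count(scores, is_fraud, threshold):
    """Number of misclassified transactions at the given threshold."""
    return sum((s >= threshold) != bool(f) for s, f in zip(scores, is_fraud))


# Toy data: the large fraud (900) has a fairly low model score.
scores = [0.9, 0.6, 0.5, 0.4, 0.2]
is_fraud = [1, 0, 0, 1, 0]
amounts = [20.0, 50.0, 30.0, 900.0, 10.0]

thresholds = [0.7, 0.55, 0.3]
best_by_error = min(thresholds, key=lambda t: error_count(scores, is_fraud, t))
best_by_profit = max(thresholds, key=lambda t: total_profit(scores, is_fraud, amounts, t))
```

With these numbers the error-minimizing threshold (0.7) misses the large fraud, while the profit-maximizing threshold (0.3) accepts two extra false alarms in order to catch it.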

Keywords: neural network, profit-based neural network, sum of squared errors (SSE), MBO, gradient descent

Procedia PDF Downloads 470
2401 From Genome to Field: Applying Genome Wide Association Study for Sustainable Ascochyta Blight Management in Faba Beans

Authors: Rabia Faridi, Rizwana Maqbool, Umara Sahar Rana, Zaheer Ahmad

Abstract:

Climate change impacts agriculture, notably in Germany, where spring faba beans predominate. However, improved winter hardiness aligns with milder winters, enabling autumn-sown varieties. Genetic resistance to Ascochyta blight is vital for crop integration. Traditional breeding faces challenges due to complex inheritance. This study assessed 224 homozygous faba bean lines for Ascochyta resistance traits. To achieve h²>70%, 12 replicates were required (realized h²=87%). Genetic variation and strong trait correlations were observed. Five lines outperformed 29H, while three were highly susceptible. A genome-wide association study (GWAS) with 188 inbred lines and 2058 markers, including 17 guide SNP markers, identified 12 markers associated with resistance traits, potentially indicating new resistance genes. One guide marker (Vf-Mt1g014230-001) on chromosome III validated a known QTL. The guided marker approach complemented GWAS, facilitating marker-assisted selection for Ascochyta resistance. The Göttingen Winter Bean Population offers promise for resistance breeding.

Keywords: genome wide association studies, marker assisted breeding, faba bean, ascochyta blight

Procedia PDF Downloads 54
2400 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques that are used in genome-wide gene expression and genome mutation analysis help scientists and physicians to understand pathophysiological mechanisms, make diagnoses and prognoses, and choose treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deal with gene expression data effectively. Existing algorithms such as Support Vector Machine (SVM), K-means and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. The performance of the existing algorithms is evaluated to determine the best approach. In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior and processing time, a hybrid clustering-based optimization approach has been proposed.

Keywords: microarray technology, gene expression data, clustering, gene selection

Procedia PDF Downloads 321
2399 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e., expressing the same concept in alternative ways by using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different news sources, since these are likely to contain paraphrases when they report the same event on the same day. Simple standard approaches (e.g., TF-IDF vector space, cosine similarity) and alignment techniques (e.g., Dynamic Time Warping (DTW)) for extracting paraphrases have been applied to English. However, the performance of these approaches may suffer when they are applied to another language, such as Arabic, because of phenomena that are not present in English, such as free word order, zero copula, and pro-dropping. Thus, if we can analyze how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.
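The TF-IDF and cosine-similarity baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' system: whitespace tokenization, the smoothed IDF formula, and the example sentences are assumptions, and real Arabic text would need proper tokenization and morphological processing.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [d.lower().split() for d in docs]  # naive whitespace tokenizer
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF, so a term found in every document keeps a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical sentences: the first two report the same event.
sentences = [
    "the minister opened the new airport on monday",
    "the new airport was opened by the minister on monday",
    "oil prices fell sharply in early trading",
]
vectors = tfidf_vectors(sentences)
sim_paraphrase = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
```

A candidate pair would then be kept as a paraphrase when its similarity exceeds a threshold that has to be tuned on the corpus.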

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 379
2398 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach

Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal

Abstract:

Network-based education has been growing rapidly in size and quality. Knowledge clustering is becoming more important in personalized information retrieval for web-based learning. A personalized learning service can be offered after the learners’ knowledge has been classified by clustering. Through automatic analysis of learners’ behaviors, groups of learners with similar knowledge levels and interests can be discovered, so as to provide learners with content that best matches their educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated into the online learning system in order to help the teacher carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the paths most used by students and, from this information, to recommend links to new students automatically while they browse the course. We have developed a specific authoring tool to help the teacher apply the whole data mining process. We report on several experiments with real data that indicate the quality of using clustering and sequential pattern mining algorithms together for building personalized e-learning systems.

Keywords: e-learning, cluster, personalization, sequence, pattern

Procedia PDF Downloads 424
2397 Harmonizing Cities: Integrating Land Use Diversity and Multimodal Transit for Social Equity

Authors: Zi-Yan Chao

Abstract:

With the rapid development of urbanization and increasing demand for efficient transportation systems, the interaction between land use diversity and transportation resource allocation has become a critical issue in urban planning. Achieving a balance of land use types, such as residential, commercial, and industrial areas, plays a crucial role in ensuring social equity and sustainable urban development. Simultaneously, optimizing multimodal transportation networks, including bus, subway, and car routes, is essential for minimizing total travel time and costs, while ensuring fairness for all social groups, particularly in meeting the transportation needs of low-income populations. This study develops a bilevel programming model to address these challenges, with land use diversity as the foundation for measuring equity. The upper-level model maximizes land use diversity for balanced land distribution across regions. The lower-level model optimizes multimodal transportation networks to minimize travel time and costs while maintaining user equilibrium. The model also incorporates constraints to ensure fair resource allocation, such as balancing transportation accessibility and cost differences across various social groups. A solution approach is developed to solve the bilevel optimization problem, ensuring efficient exploration of the solution space for land use and transportation resource allocation. This study maximizes social equity by maximizing land use diversity and achieving user equilibrium with optimal transportation resource distribution. The proposed method provides a robust framework for addressing urban planning challenges, contributing to sustainable and equitable urban development.

Keywords: bilevel programming model, genetic algorithms, land use diversity, multimodal transportation optimization, social equity

Procedia PDF Downloads 11
2396 Implications of Optimisation Algorithm on the Forecast Performance of Artificial Neural Network for Streamflow Modelling

Authors: Martins Y. Otache, John J. Musa, Abayomi I. Kuti, Mustapha Mohammed

Abstract:

The performance of an artificial neural network (ANN) is contingent on a host of factors, for instance, the network optimisation scheme. In view of this, the study examined the general implications of the ANN training optimisation algorithm on its forecast performance. To this end, the Bayesian regularisation (Br), Levenberg-Marquardt (LM), and adaptive-learning gradient descent with momentum (GDM) algorithms were employed under different ANN structural configurations: (1) single-hidden-layer and (2) double-hidden-layer feedforward back-propagation networks. The results revealed generally that the GDM optimisation algorithm, with its adaptive learning capability, used a relatively shorter time in both training and validation phases than the LM and Br algorithms, though learning may not be fully completed; this held in all instances, including the prediction of extreme flow conditions for 1-day and 5-day-ahead forecasts. In specific statistical terms, on average, model performance efficiency measured by the coefficient of efficiency (CE) statistic was Br: 98%, 94%; LM: 98%, 95%; and GDM: 96%, 96% for the training and validation phases, respectively. However, on the basis of relative error distribution statistics (MAE, MAPE, and MSRE), GDM performed better than the others overall. Based on the findings, the adoption of ANN for real-time forecasting should employ training algorithms that do not carry the computational overhead of LM, which requires computation of the Hessian matrix, takes protracted time, and is sensitive to initial conditions; to this end, Br and other forms of gradient descent with momentum should be adopted, considering overall time expenditure and quality of the forecast as well as mitigation of network overfitting.
On the whole, it is recommended that evaluation should consider implications of (i) data quality and quantity and (ii) transfer functions on the overall network forecast performance.
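The evaluation statistics named in the abstract (CE, MAE, MAPE, MSRE) can be computed as in the sketch below. The abstract does not give its formulas, so the definitions here are assumptions: CE is taken in its common Nash-Sutcliffe form, and the flow values are made up for illustration.

```python
def coefficient_of_efficiency(obs, pred):
    """CE = 1 - SSE / total variance of the observations (assumed Nash-Sutcliffe form)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - sse / sst


def mae(obs, pred):
    """Mean absolute error, in the units of the flow series."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error; observations must be non-zero."""
    return 100 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def msre(obs, pred):
    """Mean squared relative error; observations must be non-zero."""
    return sum(((o - p) / o) ** 2 for o, p in zip(obs, pred)) / len(obs)


# Hypothetical observed and forecast streamflow values.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 39.0]
scores = {
    "CE": coefficient_of_efficiency(observed, forecast),
    "MAE": mae(observed, forecast),
    "MAPE": mape(observed, forecast),
    "MSRE": msre(observed, forecast),
}
```

CE rewards fitting the overall variance, whereas the relative-error statistics weight low-flow errors more heavily, which is one way two algorithms can rank differently under the two families of measures.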

Keywords: streamflow, neural network, optimisation, algorithm

Procedia PDF Downloads 150
2395 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large sample sizes and high dimensionality reduce the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for extracting useful information from a variety of databases; they provide supervised learning in the form of classification to build models that describe important data classes, where the structure of the classifier is based on the class attribute. Classification efficiency and accuracy are often influenced to a great extent by noisy and undesirable features in real application data sets. The inherent nature of a data set greatly obscures analysis of its quality and leaves us with few practical approaches. To our knowledge for the first time, we present a new approach for investigating the structure and quality of data sets by providing a targeted analysis that localizes their noisy and irrelevant features. Machine learning relies heavily on feature selection as a pre-processing step, which selects a small subset of the available features by reducing the feature space according to a certain evaluation criterion. The primary objective of this study is to reduce the size of the given data sample by searching for a small set of important features that may result in good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is used, together with an external classifier for discriminative feature selection. A feature is selected based on its number of occurrences in the chosen chromosomes. A sample dataset is used to demonstrate the proposed idea. The proposed method improved the average accuracy across different datasets to about 95%. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
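The idea of selecting features by how often they occur in the chosen chromosomes can be sketched as follows. This is a loose illustration under stated assumptions, not the authors' heuristic: the GA operators (truncation selection, one-point crossover, bit-flip mutation), all parameter values, and `toy_fitness` are hypothetical. In a real wrapper, the fitness would be the accuracy of an external classifier (for example KNN) trained on the masked features.

```python
import random
from collections import Counter


def evolve(fitness, n_features, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Evolve bit-mask chromosomes; returns the final population, best first."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    pop.sort(key=fitness, reverse=True)
    return pop


def select_by_occurrence(chromosomes, min_count):
    """Keep the features that are switched on in at least min_count chromosomes."""
    counts = Counter(i for chrom in chromosomes for i, g in enumerate(chrom) if g)
    return sorted(i for i, c in counts.items() if c >= min_count)


# Hypothetical stand-in for classifier accuracy: rewards three "informative"
# features and slightly penalizes every selected feature.
INFORMATIVE = {0, 3, 5}


def toy_fitness(chrom):
    return sum(chrom[i] for i in INFORMATIVE) - 0.1 * sum(chrom)


final_population = evolve(toy_fitness, n_features=8)
selected = select_by_occurrence(final_population[:5], min_count=4)
```

Counting occurrences across several good chromosomes, rather than taking the single best one, makes the selected subset less sensitive to one lucky run.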

Keywords: data mining, genetic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 314
2394 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique for identifying patterns in large data sets. The extracted facts and patterns contribute to various domains such as marketing, forecasting, and medicine. Prior to mining, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, namely Min-max, Z-score, and decimal scaling, on swarm-based forecasting models. The recent swarm intelligence algorithms employed include the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are then developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with the Z-score normalization technique, while ABC produces better accuracy with Min-max. Nevertheless, GWO is superior to ABC, as its model generates the highest accuracy for both crude oil and gasoline prices. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.
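The three normalization techniques compared in the abstract can be written out as below. This is a sketch using their textbook definitions; the use of the population standard deviation in the Z-score and the sample prices are assumptions.

```python
import math


def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre on the mean and divide by the (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        raise ValueError("z-score is undefined for a constant series")
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer making every |value| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Hypothetical daily spot prices.
prices = [50.0, 75.0, 100.0]
scaled = {
    "min_max": min_max(prices),
    "z_score": z_score(prices),
    "decimal": decimal_scaling(prices),
}
```

Min-max and decimal scaling preserve the shape of the series within a bounded range, while the Z-score produces unbounded values centred on zero, which may interact differently with each optimizer's search range.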

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 473
2393 Genomic Diversity and Relationship among Arabian Peninsula Dromedary Camels Using Full Genome Sequencing Approach

Authors: H. Bahbahani, H. Musa, F. Al Mathen

Abstract:

The dromedary camels (Camelus dromedarius) are single-humped even-toed ungulates populating the African Sahara, Arabian Peninsula, and Southwest Asia. The genome of this desert-adapted species has been minimally investigated using autosomal microsatellite and mitochondrial DNA markers. In this study, the genomes of 33 dromedary camel samples from different parts of the Arabian Peninsula were sequenced using the Illumina Next Generation Sequencing (NGS) platform. These data were combined with Genotyping-by-Sequencing (GBS) data from African (Sudanese) dromedaries to investigate the genomic relationship between African and Arabian Peninsula dromedary camels. Principal Component Analysis (PCA) and average genome-wide admixture analysis were conducted on these data to address the objectives of the study. Both analyses revealed a phylogeographic distinction between the two camel populations. However, no breed-wise genetic classification was revealed among the African (Sudanese) camel breeds. The Arabian Peninsula camel populations also show higher heterozygosity than the Sudanese camels. The results of this study explain the evolutionary history and migration of African dromedary camels from their center of domestication in the southern Arabian Peninsula. These outputs help scientists to further understand the evolutionary history of dromedary camels, which might help in conserving the favorable genetics of this species.

Keywords: dromedary, genotyping-by-sequencing, Arabian Peninsula, Sudan

Procedia PDF Downloads 200
2392 A Case Study of Misinterpretation of Results in Forensic DNA Cases Due to Expression of Y- Chromosome in Females

Authors: Garima Chaudhary

Abstract:

The gender of an individual in forensic DNA analysis is normally assessed by using STR multiplexes with the incorporated gender-based marker amelogenin, in other words by the presence or absence of the Y chromosome, but this may not hold in all cases. We hereby report an interesting case of a phenotypic female carrying a male karyotype (46, XY). In the alleged murder case, a deceased female with an XY genotype was noticed. The expression of 18 Y-linked genes was studied to measure the extent of expression. Expression at 4 loci was observed, which might have caused the misinterpretation in forensic casework. The clinical condition of the deceased in this case was diagnosed as testicular feminization syndrome, which is characterized by a female phenotype with a male karyotype (46, XY). Most of these cases have SRY (testis determining factor). The genetic explanation of this phenomenon is not very clear. Here, we discuss the impact of such genetic discrepancies on the forensic interpretation of results. In the presented murder case of a phenotypic female, sexual assault was also suspected. For confirmation, vaginal swabs and micro slides were also sent to us for DNA examination. After DNA analysis using STR markers, the Y chromosome was detected in the samples, which supported the suspicion of sexual assault before murder. When the reference blood sample of the deceased was analyzed, it was found to be a case of testicular feminization syndrome. Interesting inferences were made from the results obtained.

Keywords: DNA profiling, forensic case study, Y chromosome, females

Procedia PDF Downloads 223
2391 Classifying and Analysis 8-Bit to 8-Bit S-Boxes Characteristic Using S-Box Evaluation Characteristic

Authors: Muhammad Luqman, Yusuf Kurniawan

Abstract:

The S-Box is one of the nonlinear parts of a cryptographic algorithm. The S-Box is needed to maintain the non-linearity of the algorithm. Nowadays, modern cryptographic algorithms use an S-Box as a part of the algorithm process. Although several cryptographic algorithms today reuse theoretically secure and carefully constructed S-Boxes, there are evaluation characteristics that can measure the security properties of S-Boxes and hence of the corresponding primitives. Analysis of an S-Box is usually done by manual mathematical calculation. Several S-Boxes are presented as a Truth Table without any underlying mathematical algorithm, and it is rather difficult to determine the strength of a Truth Table S-Box without one. A comprehensive analysis should be applied to the Truth Table S-Box to determine its characteristics. Several important characteristics should be possessed by S-Boxes: nonlinearity, balancedness, algebraic degree, LAT, DAT, differential delta uniformity, correlation immunity, and the global avalanche criterion. A comprehensive tool is presented to automatically calculate the characteristics of S-Boxes and determine their strength. The comprehensive analysis is carried out as a deterministic process that produces a sequence of S-Box characteristics and gives advice for better S-Box construction.
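Three of the characteristics listed in the abstract (balancedness, differential uniformity, and nonlinearity) can be computed directly from a truth table, as sketched below. This is not the authors' tool; the function names are assumptions, and the 4-bit PRESENT S-box is used only as a small, publicly known example input. The same functions accept a 256-entry table for the 8-bit case.

```python
def is_balanced(sbox):
    """An n-bit to n-bit S-box is balanced exactly when it is a permutation."""
    return sorted(sbox) == list(range(len(sbox)))


def differential_uniformity(sbox):
    """Largest entry of the difference distribution table over non-zero input differences."""
    size = len(sbox)
    worst = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        worst = max(worst, max(counts))
    return worst


def nonlinearity(sbox):
    """Minimum nonlinearity over all non-zero component functions b.S(x)."""
    size = len(sbox)
    max_abs = 0
    for b in range(1, size):  # b is the output mask selecting a component function
        # Component function in +1/-1 form: (-1)^(parity of b & S(x)).
        w = [1 - 2 * (bin(b & sbox[x]).count("1") & 1) for x in range(size)]
        # In-place fast Walsh-Hadamard transform.
        h = 1
        while h < size:
            for i in range(0, size, 2 * h):
                for j in range(i, i + h):
                    w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
            h *= 2
        max_abs = max(max_abs, max(abs(v) for v in w))
    return size // 2 - max_abs // 2


# The 4-bit S-box of the PRESENT block cipher, used here as sample input.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

report = {
    "balanced": is_balanced(PRESENT_SBOX),
    "differential_uniformity": differential_uniformity(PRESENT_SBOX),
    "nonlinearity": nonlinearity(PRESENT_SBOX),
}
```

A lower differential uniformity and a higher nonlinearity indicate better resistance to differential and linear cryptanalysis, respectively; a linear map such as the identity gives the worst value on both.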

Keywords: cryptographic properties, Truth Table S-Boxes, S-Boxes characteristic, deterministic process

Procedia PDF Downloads 361
2390 Molecular Characterization of Chicken B Cell Marker (ChB6) in Native Chicken of Poonch Region from International Borders of India and Pakistan

Authors: Mandeep Singh Azad, Dibyendu Chakraborty, Vikas Vohra

Abstract:

Introduction: Poonch is one of the remotest districts of Jammu and Kashmir (UT) and is situated on the international border. The native poultry population in this area is quite hardy and thrives in adverse climatic conditions. To date, no local breed from this area (Jammu Province) has been characterized; thus, the present study was undertaken with the main objective of molecular characterization of the ChB6 gene in the local native chicken of the Poonch region, located at the international border between India and Pakistan. The chicken B-cell marker (ChB6) gene has been proposed as a candidate gene in regulating B-cell development. Material and Method: RNA was isolated from whole blood samples using a Blood RNA Purification Kit (HiPura) and the Trizol method. Positive PCR products of size 1110 bp were selected for further purification, sequencing, and analysis. The amplified PCR product was sequenced by Sanger's dideoxy chain termination method. The obtained sequences of the ChB6 gene of Poonchi chicken were compared using MEGA X software. BioEdit software was used to construct the phylogenetic tree, and the Neighbor-Joining method was used to infer evolutionary history. The Maximum Composite Likelihood method was used to compute evolutionary distances. Results: The positively amplified samples of the ChB6 gene were subjected to Sanger sequencing with primer walking. The sequences were then analyzed using MEGA X and BioEdit software. The sequence results were compared with other reported sequences from different breeds of chicken and from other species obtained from the NCBI (National Center for Biotechnology Information). The ClustalW method in MEGA X was used for multiple sequence alignment. The ChB6 gene sequence of Poonchi chicken was compared with Centrocercus urophasianus, G. gallus mRNA for B6.1 protein, G. gallus mRNA for B6.2, G. gallus mRNA for B6.3, Gallus gallus B6.1, Halichoeres bivittatus, Miniopterus fuliginosus, Ferringtonia patagonica, and Tympanuchus phasianellus.
The genetic distances between the ChB6 gene sequence of Poonchi chicken and these sequences were 0.2720, 0.0000, 0.0245, 0.0212, 0.0147, 1.6461, 2.2394, 2.0070, and 0.2363, respectively. Sequencing results showed variation between species. The AT content was higher than the GC content for the ChB6 gene; the lower GC content suggests lower thermostability. No sequence difference was observed within the Poonchi population for the ChB6 gene. The high homology within the chicken population indicates conservation of the ChB6 gene. The maximum difference was observed with Miniopterus fuliginosus (Eastern bent-wing bat), followed by Ferringtonia patagonica and Halichoeres bivittatus. Conclusion: Genetic variation is the essential component for genetic improvement. The results for the immune-related gene ChB6 show between-population genetic variability. Therefore, further association studies of this gene with some prevalent diseases in a large population would be helpful to identify disease-resistant/susceptible genotypes in the indigenous chicken population.
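The AT/GC comparison and the notion of a pairwise genetic distance can be illustrated with the short sketch below. Note the substitution: the study reports Maximum Composite Likelihood distances computed in MEGA X, whereas this example uses the much simpler p-distance (the proportion of differing aligned sites). The sequences and function names are hypothetical.

```python
def base_composition(seq):
    """Return the AT and GC fractions of a nucleotide sequence (ignores gaps and N)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    gc = sum(b in "GC" for b in bases)
    return {"AT": (len(bases) - gc) / len(bases), "GC": gc / len(bases)}


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


# Hypothetical aligned fragments, not real ChB6 sequences.
reference = "ATGCATTAGC"
variant = "ATGAATT-GC"

composition = base_composition(reference)
distance = p_distance(reference, variant)
```

The p-distance underestimates divergence for distant species because it ignores multiple substitutions at the same site, which is one reason model-based distances can exceed 1, as the bat and fish comparisons above do.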

Keywords: ChB6, sequencing, ClustalW, genetic distance, poonchi chicken, SNP

Procedia PDF Downloads 62
2389 Breast Cancer and BRCA Gene: A Study on Genetic and Environmental Interaction

Authors: Abhishikta Ghosh Roy

Abstract:

Breast cancer is the most common malignancy among women globally, including in India. Human breast cancer results from the interaction of genetic and environmental factors. The present study attempts to understand the molecular heterogeneity of the BRCA1 and BRCA2 genes, as well as the association of various lifestyle and reproductive variables with breast cancer risk. The study was conducted among 110 patients and 128 controls, with total DNA sequencing of the flanking and coding regions of the BRCA1 and BRCA2 genes, which revealed ten single nucleotide polymorphisms (SNPs), six of them novel. The controls selected for the study were matched for age, sex, and ethnic group. Biological samples were collected from the subjects after written informed consent. Detailed molecular analysis revealed significant (p < 0.005) molecular heterogeneity in terms of SNPs in the BRCA1 (4 exonic and 1 intronic) and BRCA2 (2 exonic and 3 intronic) genes. The study also found a significant (p < 0.05) association of breast cancer risk with positive family history, early age at menarche, irregular menstrual periods, menopause, prolonged contraceptive use, nulliparity, history of abortions, consumption of alcohol, and smoking. To the best of the authors' knowledge, this study is the first of its kind; it is envisaged that the identification of the SNPs and modification of the lifestyle factors might help to minimize the risk among Bengalee Hindu females.

Keywords: breast cancer, BRCA, lifestyle, India

Procedia PDF Downloads 110
2388 Non-Population Search Algorithms for Capacitated Material Requirement Planning in Multi-Stage Assembly Flow Shop with Alternative Machines

Authors: Watcharapan Sukkerd, Teeradej Wuttipornpun

Abstract:

This paper presents non-population search algorithms, namely tabu search (TS), simulated annealing (SA), and variable neighborhood search (VNS), to minimize the total cost of the capacitated MRP problem in a multi-stage assembly flow shop with two alternative machines. The algorithm has three main steps. Firstly, an initial sequence of orders is constructed by a simple due date-based dispatching rule. Secondly, the sequence of orders is repeatedly improved to reduce the total cost by applying TS, SA, and VNS separately. Finally, the total cost is further reduced by optimizing the start time of each operation using a linear programming (LP) model. Parameters of the algorithm are tuned using real data from automotive companies. The results show that VNS significantly outperforms TS, SA, and the existing algorithm.
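The first two steps described in the abstract (a due-date-based initial sequence, then improvement by a non-population search) can be sketched with simulated annealing. This is a simplified illustration, not the authors' algorithm: the single-machine total-tardiness cost stands in for the paper's total-cost model, and the order data, swap move, and cooling parameters are assumptions.

```python
import math
import random

# Hypothetical orders: id -> (processing time, due date).
ORDERS = {"A": (3, 4), "B": (2, 9), "C": (4, 6), "D": (1, 3), "E": (5, 12)}


def total_tardiness(sequence):
    """Stand-in cost: sum of lateness beyond each order's due date on one machine."""
    clock, cost = 0, 0
    for order in sequence:
        processing, due = ORDERS[order]
        clock += processing
        cost += max(0, clock - due)
    return cost


def simulated_annealing(initial, cost, temperature=10.0, cooling=0.95, steps=500, seed=0):
    """Improve a sequence by random swaps, accepting worse moves with decaying probability."""
    rng = random.Random(seed)
    current, current_cost = list(initial), cost(initial)
    best, best_cost = list(current), current_cost
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        candidate = list(current)
        candidate[i], candidate[j] = candidate[j], candidate[i]  # swap two orders
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= cooling  # geometric cooling schedule
    return best, best_cost


# Step 1: initial sequence by earliest due date (a simple due-date dispatching rule).
initial_sequence = sorted(ORDERS, key=lambda o: ORDERS[o][1])
# Step 2: improve the sequence with simulated annealing.
best_sequence, best_cost = simulated_annealing(initial_sequence, total_tardiness)
```

Tabu search and variable neighborhood search would replace only the acceptance rule and the neighborhood structure; the initial sequence and cost function could stay the same.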

Keywords: capacitated MRP, tabu search, simulated annealing, variable neighborhood search, linear programming, assembly flow shop, application in industry

Procedia PDF Downloads 229
2387 Genetic Divergence and Morphogenic Analysis of Sugarcane Red Rot Pathogen Colletotrichum falcatum under South Gujarat Condition

Authors: Prittesh Patel, Ramar Krishnamurthy

Abstract:

In the present study, nine strains of C. falcatum obtained from different places and cultivars were characterized for sporulation, growth rate, and 18S rRNA gene sequence. All isolates had characteristic fast-growing, sparse, and fleecy aerial mycelia on potato dextrose agar, with sickle-shaped conidia (length x width varied from 20.0 x 3.89 to 25.52 x 5.34 μm) and blackish to orange acervuli with setae (length x width varied from 112.37 x 2.78 to 167.66 x 6.73 μm). They could be divided into two groups on the basis of morphology: P1, dense mycelia with concentric growth, and P2, sparse mycelia with uneven growth. Genomic DNA isolation followed by PCR amplification with the ITS1 and ITS4 primers produced ~550 bp amplicons for all isolates. The phylogeny generated from the 18S rRNA gene sequence confirmed the variation among isolates, which mainly grouped into two clusters: cluster 1 contained the CoC671 isolates (cfNAV and cfPAR) and the Co86002 isolate (cfTIM), while isolates cfMAD, cfKAM, and cfMAR were grouped into cluster 2. The remaining isolates did not fall into any cluster. Isolate cfGAN, collected from Co86032, was found to be the most divergent of the nine isolates. In a nutshell, we found considerable genetic divergence and morphological variation within C. falcatum accessions collected from different areas of south Gujarat, India, and these can be used for breeding programs.

Keywords: Colletotrichum falcatum, ITS, morphology, red rot, sugarcane

Procedia PDF Downloads 121