Search results for: deep neural models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9388

Search results for: deep neural models

9088 MarginDistillation: Distillation for Face Recognition Neural Networks with Margin-Based Softmax

Authors: Svitov David, Alyamkin Sergey

Abstract:

The usage of convolutional neural networks (CNNs) in conjunction with the margin-based softmax approach demonstrates the state-of-the-art performance for the face recognition problem. Recently, lightweight neural network models trained with the margin-based softmax have been introduced for the face identification task for edge devices. In this paper, we propose a distillation method for lightweight neural network architectures that outperforms other known methods for the face recognition task on LFW, AgeDB-30 and Megaface datasets. The idea of the proposed method is to use class centers from the teacher network for the student network. Then the student network is trained to get the same angles between the class centers and face embeddings predicted by the teacher network.

Keywords: ArcFace, distillation, face recognition, margin-based softmax

Procedia PDF Downloads 146
9087 Real-Time Pedestrian Detection Method Based on Improved YOLOv3

Authors: Jingting Luo, Yong Wang, Ying Wang

Abstract:

Pedestrian detection in image or video data is a very important and challenging task in security surveillance. The difficulty of this task is to locate and detect pedestrians of different scales in complex scenes accurately. To solve these problems, a deep neural network (RT-YOLOv3) is proposed to realize real-time pedestrian detection at different scales in security monitoring. RT-YOLOv3 improves the traditional YOLOv3 algorithm. Firstly, the deep residual network is added to extract vehicle features. Then six convolutional neural networks with different scales are designed and fused with the corresponding scale feature maps in the residual network to form the final feature pyramid to perform pedestrian detection tasks. This method can better characterize pedestrians. In order to further improve the accuracy and generalization ability of the model, a hybrid pedestrian data set training method is used to extract pedestrian data from the VOC data set and train with the INRIA pedestrian data set. Experiments show that the proposed RT-YOLOv3 method achieves 93.57% accuracy of mAP (mean average precision) and 46.52f/s (number of frames per second). In terms of accuracy, RT-YOLOv3 performs better than Fast R-CNN, Faster R-CNN, YOLO, SSD, YOLOv2, and YOLOv3. This method reduces the missed detection rate and false detection rate, improves the positioning accuracy, and meets the requirements of real-time detection of pedestrian objects.

Keywords: pedestrian detection, feature detection, convolutional neural network, real-time detection, YOLOv3

Procedia PDF Downloads 141
9086 Assisting Dating of Greek Papyri Images with Deep Learning

Authors: Asimina Paparrigopoulou, John Pavlopoulos, Maria Konstantinidou

Abstract:

Dating papyri accurately is crucial not only to editing their texts but also for our understanding of palaeography and the history of writing, ancient scholarship, material culture, networks in antiquity, etc. Most ancient manuscripts offer little evidence regarding the time of their production, forcing papyrologists to date them on palaeographical grounds, a method often criticized for its subjectivity. By experimenting with data obtained from the Collaborative Database of Dateable Greek Bookhands and the PapPal online collections of objectively dated Greek papyri, this study shows that deep learning dating models, pre-trained on generic images, can achieve accurate chronological estimates for a test subset (67,97% accuracy for book hands and 55,25% for documents). To compare the estimates of these models with those of humans, experts were asked to complete a questionnaire with samples of literary and documentary hands that had to be sorted chronologically by century. The same samples were dated by the models in question. The results are presented and analysed.

Keywords: image classification, papyri images, dating

Procedia PDF Downloads 78
9085 Modeling Biomass and Biodiversity across Environmental and Management Gradients in Temperate Grasslands with Deep Learning and Sentinel-1 and -2

Authors: Javier Muro, Anja Linstadter, Florian Manner, Lisa Schwarz, Stephan Wollauer, Paul Magdon, Gohar Ghazaryan, Olena Dubovyk

Abstract:

Monitoring the trade-off between biomass production and biodiversity in grasslands is critical to evaluate the effects of management practices across environmental gradients. New generations of remote sensing sensors and machine learning approaches can model grasslands’ characteristics with varying accuracies. However, studies often fail to cover a sufficiently broad range of environmental conditions, and evidence suggests that prediction models might be case specific. In this study, biomass production and biodiversity indices (species richness and Fishers’ α) are modeled in 150 grassland plots for three sites across Germany. These sites represent a North-South gradient and are characterized by distinct soil types, topographic properties, climatic conditions, and management intensities. Predictors used are derived from Sentinel-1 & 2 and a set of topoedaphic variables. The transferability of the models is tested by training and validating at different sites. The performance of feed-forward deep neural networks (DNN) is compared to a random forest algorithm. While biomass predictions across gradients and sites were acceptable (r2 0.5), predictions of biodiversity indices were poor (r2 0.14). DNN showed higher generalization capacity than random forest when predicting biomass across gradients and sites (relative root mean squared error of 0.5 for DNN vs. 0.85 for random forest). DNN also achieved high performance when using the Sentinel-2 surface reflectance data rather than different combinations of spectral indices, Sentinel-1 data, or topoedaphic variables, simplifying dimensionality. This study demonstrates the necessity of training biomass and biodiversity models using a broad range of environmental conditions and ensuring spatial independence to have realistic and transferable models where plot level information can be upscaled to landscape scale.

Keywords: ecosystem services, grassland management, machine learning, remote sensing

Procedia PDF Downloads 218
9084 Graph Clustering Unveiled: ClusterSyn - A Machine Learning Framework for Predicting Anti-Cancer Drug Synergy Scores

Authors: Babak Bahri, Fatemeh Yassaee Meybodi, Changiz Eslahchi

Abstract:

In the pursuit of effective cancer therapies, the exploration of combinatorial drug regimens is crucial to leverage synergistic interactions between drugs, thereby improving treatment efficacy and overcoming drug resistance. However, identifying synergistic drug pairs poses challenges due to the vast combinatorial space and limitations of experimental approaches. This study introduces ClusterSyn, a machine learning (ML)-powered framework for classifying anti-cancer drug synergy scores. ClusterSyn employs a two-step approach involving drug clustering and synergy score prediction using a fully connected deep neural network. For each cell line in the training dataset, a drug graph is constructed, with nodes representing drugs and edge weights denoting synergy scores between drug pairs. Drugs are clustered using the Markov clustering (MCL) algorithm, and vectors representing the similarity of drug pairs to each cluster are input into the deep neural network for synergy score prediction (synergy or antagonism). Clustering results demonstrate effective grouping of drugs based on synergy scores, aligning similar synergy profiles. Subsequently, neural network predictions and synergy scores of the two drugs on others within their clusters are used to predict the synergy score of the considered drug pair. This approach facilitates comparative analysis with clustering and regression-based methods, revealing the superior performance of ClusterSyn over state-of-the-art methods like DeepSynergy and DeepDDS on diverse datasets such as Oniel and Almanac. The results highlight the remarkable potential of ClusterSyn as a versatile tool for predicting anti-cancer drug synergy scores.

Keywords: drug synergy, clustering, prediction, machine learning., deep learning

Procedia PDF Downloads 76
9083 Convolutional Neural Network Based on Random Kernels for Analyzing Visual Imagery

Authors: Ja-Keoung Koo, Kensuke Nakamura, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Byung-Woo Hong

Abstract:

The machine learning techniques based on a convolutional neural network (CNN) have been actively developed and successfully applied to a variety of image analysis tasks including reconstruction, noise reduction, resolution enhancement, segmentation, motion estimation, object recognition. The classical visual information processing that ranges from low level tasks to high level ones has been widely developed in the deep learning framework. It is generally considered as a challenging problem to derive visual interpretation from high dimensional imagery data. A CNN is a class of feed-forward artificial neural network that usually consists of deep layers the connections of which are established by a series of non-linear operations. The CNN architecture is known to be shift invariant due to its shared weights and translation invariance characteristics. However, it is often computationally intractable to optimize the network in particular with a large number of convolution layers due to a large number of unknowns to be optimized with respect to the training set that is generally required to be large enough to effectively generalize the model under consideration. It is also necessary to limit the size of convolution kernels due to the computational expense despite of the recent development of effective parallel processing machinery, which leads to the use of the constantly small size of the convolution kernels throughout the deep CNN architecture. However, it is often desired to consider different scales in the analysis of visual features at different layers in the network. Thus, we propose a CNN model where different sizes of the convolution kernels are applied at each layer based on the random projection. We apply random filters with varying sizes and associate the filter responses with scalar weights that correspond to the standard deviation of the random filters. We are allowed to use large number of random filters with the cost of one scalar unknown for each filter. The computational cost in the back-propagation procedure does not increase with the larger size of the filters even though the additional computational cost is required in the computation of convolution in the feed-forward procedure. The use of random kernels with varying sizes allows to effectively analyze image features at multiple scales leading to a better generalization. The robustness and effectiveness of the proposed CNN based on random kernels are demonstrated by numerical experiments where the quantitative comparison of the well-known CNN architectures and our models that simply replace the convolution kernels with the random filters is performed. The experimental results indicate that our model achieves better performance with less number of unknown weights. The proposed algorithm has a high potential in the application of a variety of visual tasks based on the CNN framework. Acknowledgement—This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by IITP, and NRF-2014R1A2A1A11051941, NRF2017R1A2B4006023.

Keywords: deep learning, convolutional neural network, random kernel, random projection, dimensionality reduction, object recognition

Procedia PDF Downloads 285
9082 Selecting the Best RBF Neural Network Using PSO Algorithm for ECG Signal Prediction

Authors: Najmeh Mohsenifar, Narjes Mohsenifar, Abbas Kargar

Abstract:

In this paper, has been presented a stable method for predicting the ECG signals through the RBF neural networks, by the PSO algorithm. In spite of quasi-periodic ECG signal from a healthy person, there are distortions in electro cardiographic data for a patient. Therefore, there is no precise mathematical model for prediction. Here, we have exploited neural networks that are capable of complicated nonlinear mapping. Although the architecture and spread of RBF networks are usually selected through trial and error, the PSO algorithm has been used for choosing the best neural network. In this way, 2 second of a recorded ECG signal is employed to predict duration of 20 second in advance. Our simulations show that PSO algorithm can find the RBF neural network with minimum MSE and the accuracy of the predicted ECG signal is 97 %.

Keywords: electrocardiogram, RBF artificial neural network, PSO algorithm, predict, accuracy

Procedia PDF Downloads 624
9081 Decision Support System for Diagnosis of Breast Cancer

Authors: Oluwaponmile D. Alao

Abstract:

In this paper, two models have been developed to ascertain the best network needed for diagnosis of breast cancer. Breast cancer has been a disease that required the attention of the medical practitioner. Experience has shown that misdiagnose of the disease has been a major challenge in the medical field. Therefore, designing a system with adequate performance for will help in making diagnosis of the disease faster and accurate. In this paper, two models: backpropagation neural network and support vector machine has been developed. The performance obtained is also compared with other previously obtained algorithms to ascertain the best algorithms.

Keywords: breast cancer, data mining, neural network, support vector machine

Procedia PDF Downloads 345
9080 Investigating the Influence of Activation Functions on Image Classification Accuracy via Deep Convolutional Neural Network

Authors: Gulfam Haider, sana danish

Abstract:

Convolutional Neural Networks (CNNs) have emerged as powerful tools for image classification, and the choice of optimizers profoundly affects their performance. The study of optimizers and their adaptations remains a topic of significant importance in machine learning research. While numerous studies have explored and advocated for various optimizers, the efficacy of these optimization techniques is still subject to scrutiny. This work aims to address the challenges surrounding the effectiveness of optimizers by conducting a comprehensive analysis and evaluation. The primary focus of this investigation lies in examining the performance of different optimizers when employed in conjunction with the popular activation function, Rectified Linear Unit (ReLU). By incorporating ReLU, known for its favorable properties in prior research, the aim is to bolster the effectiveness of the optimizers under scrutiny. Specifically, we evaluate the adjustment of these optimizers with both the original Softmax activation function and the modified ReLU activation function, carefully assessing their impact on overall performance. To achieve this, a series of experiments are conducted using a well-established benchmark dataset for image classification tasks, namely the Canadian Institute for Advanced Research dataset (CIFAR-10). The selected optimizers for investigation encompass a range of prominent algorithms, including Adam, Root Mean Squared Propagation (RMSprop), Adaptive Learning Rate Method (Adadelta), Adaptive Gradient Algorithm (Adagrad), and Stochastic Gradient Descent (SGD). The performance analysis encompasses a comprehensive evaluation of the classification accuracy, convergence speed, and robustness of the CNN models trained with each optimizer. Through rigorous experimentation and meticulous assessment, we discern the strengths and weaknesses of the different optimization techniques, providing valuable insights into their suitability for image classification tasks. By conducting this in-depth study, we contribute to the existing body of knowledge surrounding optimizers in CNNs, shedding light on their performance characteristics for image classification. The findings gleaned from this research serve to guide researchers and practitioners in making informed decisions when selecting optimizers and activation functions, thus advancing the state-of-the-art in the field of image classification with convolutional neural networks.

Keywords: deep neural network, optimizers, RMsprop, ReLU, stochastic gradient descent

Procedia PDF Downloads 125
9079 Inversely Designed Chipless Radio Frequency Identification (RFID) Tags Using Deep Learning

Authors: Madhawa Basnayaka, Jouni Paltakari

Abstract:

Fully passive backscattering chipless RFID tags are an emerging wireless technology with low cost, higher reading distance, and fast automatic identification without human interference, unlike already available technologies like optical barcodes. The design optimization of chipless RFID tags is crucial as it requires replacing integrated chips found in conventional RFID tags with printed geometric designs. These designs enable data encoding and decoding through backscattered electromagnetic (EM) signatures. The applications of chipless RFID tags have been limited due to the constraints of data encoding capacity and the ability to design accurate yet efficient configurations. The traditional approach to accomplishing design parameters for a desired EM response involves iterative adjustment of design parameters and simulating until the desired EM spectrum is achieved. However, traditional numerical simulation methods encounter limitations in optimizing design parameters efficiently due to the speed and resource consumption. In this work, a deep learning neural network (DNN) is utilized to establish a correlation between the EM spectrum and the dimensional parameters of nested centric rings, specifically square and octagonal. The proposed bi-directional DNN has two simultaneously running neural networks, namely spectrum prediction and design parameters prediction. First, spectrum prediction DNN was trained to minimize mean square error (MSE). After the training process was completed, the spectrum prediction DNN was able to accurately predict the EM spectrum according to the input design parameters within a few seconds. Then, the trained spectrum prediction DNN was connected to the design parameters prediction DNN and trained two networks simultaneously. For the first time in chipless tag design, design parameters were predicted accurately after training bi-directional DNN for a desired EM spectrum. The model was evaluated using a randomly generated spectrum and the tag was manufactured using the predicted geometrical parameters. The manufactured tags were successfully tested in the laboratory. The amount of iterative computer simulations has been significantly decreased by this approach. Therefore, highly efficient but ultrafast bi-directional DNN models allow rapid and complicated chipless RFID tag designs.

Keywords: artificial intelligence, chipless RFID, deep learning, machine learning

Procedia PDF Downloads 49
9078 Real-Time Generative Architecture for Mesh and Texture

Authors: Xi Liu, Fan Yuan

Abstract:

In the evolving landscape of physics-based machine learning (PBML), particularly within fluid dynamics and its applications in electromechanical engineering, robot vision, and robot learning, achieving precision and alignment with researchers' specific needs presents a formidable challenge. In response, this work proposes a methodology that integrates neural transformation with a modified smoothed particle hydrodynamics model for generating transformed 3D fluid simulations. This approach is useful for nanoscale science, where the unique and complex behaviors of viscoelastic medium demand accurate neurally-transformed simulations for materials understanding and manipulation. In electromechanical engineering, the method enhances the design and functionality of fluid-operated systems, particularly microfluidic devices, contributing to advancements in nanomaterial design, drug delivery systems, and more. The proposed approach also aligns with the principles of PBML, offering advantages such as multi-fluid stylization and consistent particle attribute transfer. This capability is valuable in various fields where the interaction of multiple fluid components is significant. Moreover, the application of neurally-transformed hydrodynamical models extends to manufacturing processes, such as the production of microelectromechanical systems, enhancing efficiency and cost-effectiveness. The system's ability to perform neural transfer on 3D fluid scenes using a deep learning algorithm alongside physical models further adds a layer of flexibility, allowing researchers to tailor simulations to specific needs across scientific and engineering disciplines.

Keywords: physics-based machine learning, robot vision, robot learning, hydrodynamics

Procedia PDF Downloads 65
9077 Artificial Neural Networks in Environmental Psychology: Application in Architectural Projects

Authors: Diego De Almeida Pereira, Diana Borchenko

Abstract:

Artificial neural networks are used for many applications as they are able to learn complex nonlinear relationships between input and output data. As the number of neurons and layers in a neural network increases, it is possible to represent more complex behaviors. The present study proposes that artificial neural networks are a valuable tool for architecture and engineering professionals concerned with understanding how buildings influence human and social well-being based on theories of environmental psychology.

Keywords: environmental psychology, architecture, neural networks, human and social well-being

Procedia PDF Downloads 494
9076 Churn Prediction for Telecommunication Industry Using Artificial Neural Networks

Authors: Ulas Vural, M. Ergun Okay, E. Mesut Yildiz

Abstract:

Telecommunication service providers demand accurate and precise prediction of customer churn probabilities to increase the effectiveness of their customer relation services. The large amount of customer data owned by the service providers is suitable for analysis by machine learning methods. In this study, expenditure data of customers are analyzed by using an artificial neural network (ANN). The ANN model is applied to the data of customers with different billing duration. The proposed model successfully predicts the churn probabilities at 83% accuracy for only three months expenditure data and the prediction accuracy increases up to 89% when the nine month data is used. The experiments also show that the accuracy of ANN model increases on an extended feature set with information of the changes on the bill amounts.

Keywords: customer relationship management, churn prediction, telecom industry, deep learning, artificial neural networks

Procedia PDF Downloads 143
9075 Simulations in Structural Masonry Walls with Chases Horizontal Through Models in State Deformation Plan (2D)

Authors: Raquel Zydeck, Karina Azzolin, Luis Kosteski, Alisson Milani

Abstract:

This work presents numerical models in plane deformations (2D), using the Discrete Element Method formedbybars (LDEM) andtheFiniteElementMethod (FEM), in structuralmasonrywallswith horizontal chasesof 20%, 30%, and 50% deep, located in the central part and 1/3 oftheupperpartofthewall, withcenteredandeccentricloading. Differentcombinationsofboundaryconditionsandinteractionsbetweenthemethodswerestudied.

Keywords: chases in structural masonry walls, discrete element method formed by bars, finite element method, numerical models, boundary condition

Procedia PDF Downloads 167
9074 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis

Authors: Yakin Hajlaoui, Richard Labib, Jean-François Plante, Michel Gamache

Abstract:

This study introduces the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs' processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW's ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. it employ gradient descent and backpropagation to train ML-IDW, comparing its performance against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. the results highlight the efficacy of ML-IDW, particularly in handling complex spatial datasets, exhibiting lower mean square error in regression and higher F1 score in classification.

Keywords: deep learning, multi-layer neural networks, gradient descent, spatial interpolation, inverse distance weighting

Procedia PDF Downloads 52
9073 Object-Scene: Deep Convolutional Representation for Scene Classification

Authors: Yanjun Chen, Chuanping Hu, Jie Shao, Lin Mei, Chongyang Zhang

Abstract:

Traditional image classification is based on encoding scheme (e.g. Fisher Vector, Vector of Locally Aggregated Descriptor) with low-level image features (e.g. SIFT, HoG). Compared to these low-level local features, deep convolutional features obtained at the mid-level layer of convolutional neural networks (CNN) have richer information but lack of geometric invariance. For scene classification, there are scattered objects with different size, category, layout, number and so on. It is crucial to find the distinctive objects in scene as well as their co-occurrence relationship. In this paper, we propose a method to take advantage of both deep convolutional features and the traditional encoding scheme while taking object-centric and scene-centric information into consideration. First, to exploit the object-centric and scene-centric information, two CNNs that trained on ImageNet and Places dataset separately are used as the pre-trained models to extract deep convolutional features at multiple scales. This produces dense local activations. By analyzing the performance of different CNNs at multiple scales, it is found that each CNN works better in different scale ranges. A scale-wise CNN adaption is reasonable since objects in scene are at its own specific scale. Second, a fisher kernel is applied to aggregate a global representation at each scale and then to merge into a single vector by using a post-processing method called scale-wise normalization. The essence of Fisher Vector lies on the accumulation of the first and second order differences. Hence, the scale-wise normalization followed by average pooling would balance the influence of each scale since different amount of features are extracted. Third, the Fisher vector representation based on the deep convolutional features is followed by a linear Supported Vector Machine, which is a simple yet efficient way to classify the scene categories. Experimental results show that the scale-specific feature extraction and normalization with CNNs trained on object-centric and scene-centric datasets can boost the results from 74.03% up to 79.43% on MIT Indoor67 when only two scales are used (compared to results at single scale). The result is comparable to state-of-art performance which proves that the representation can be applied to other visual recognition tasks.

Keywords: deep convolutional features, Fisher Vector, multiple scales, scale-specific normalization

Procedia PDF Downloads 331
9072 JaCoText: A Pretrained Model for Java Code-Text Generation

Authors: Jessica Lopez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri

Abstract:

Pretrained transformer-based models have shown high performance in natural language generation tasks. However, a new wave of interest has surged: automatic programming language code generation. This task consists of translating natural language instructions to a source code. Despite the fact that well-known pre-trained models on language generation have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformer neural network. It aims to generate java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study some findings from state of the art and use them to (1) initialize our model from powerful pre-trained models, (2) explore additional pretraining on our java dataset, (3) lead experiments combining the unimodal and bimodal data in training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on CONCODE dataset show that JaCoText achieves new state-of-the-art results.

Keywords: java code generation, natural language processing, sequence-to-sequence models, transformer neural networks

Procedia PDF Downloads 283
9071 The Data-Driven Localized Wave Solution of the Fokas-Lenells Equation Using Physics-Informed Neural Network

Authors: Gautam Kumar Saharia, Sagardeep Talukdar, Riki Dutta, Sudipta Nandy

Abstract:

The physics-informed neural network (PINN) method opens up an approach for numerically solving nonlinear partial differential equations leveraging fast calculating speed and high precession of modern computing systems. We construct the PINN based on a strong universal approximation theorem and apply the initial-boundary value data and residual collocation points to weekly impose initial and boundary conditions to the neural network and choose the optimization algorithms adaptive moment estimation (ADAM) and Limited-memory Broyden-Fletcher-Golfard-Shanno (L-BFGS) algorithm to optimize learnable parameter of the neural network. Next, we improve the PINN with a weighted loss function to obtain both the bright and dark soliton solutions of the Fokas-Lenells equation (FLE). We find the proposed scheme of adjustable weight coefficients into PINN has a better convergence rate and generalizability than the basic PINN algorithm. We believe that the PINN approach to solve the partial differential equation appearing in nonlinear optics would be useful in studying various optical phenomena.

Keywords: deep learning, optical soliton, physics informed neural network, partial differential equation

Procedia PDF Downloads 69
9070 Robustness of the Deep Chroma Extractor and Locally-Normalized Quarter Tone Filters in Automatic Chord Estimation under Reverberant Conditions

Authors: Luis Alvarado, Victor Poblete, Isaac Gonzalez, Yetzabeth Gonzalez

Abstract:

In MIREX 2016 (http://www.music-ir.org/mirex), the deep neural network (DNN)-Deep Chroma Extractor, proposed by Korzeniowski and Wiedmer, reached the highest score in an audio chord recognition task. In the present paper, this tool is assessed under acoustic reverberant environments and distinct source-microphone distances. The evaluation dataset comprises The Beatles and Queen datasets. These datasets are sequentially re-recorded with a single microphone in a real reverberant chamber at four reverberation times (0 -anechoic-, 1, 2, and 3 s, approximately), as well as four source-microphone distances (32, 64, 128, and 256 cm). It is expected that the performance of the trained DNN will dramatically decrease under these acoustic conditions with signals degraded by room reverberation and distance to the source. Recently, the effect of the bio-inspired Locally-Normalized Cepstral Coefficients (LNCC), has been assessed in a text independent speaker verification task using speech signals degraded by additive noise at different signal-to-noise ratios with variations of recording distance, and it has also been assessed under reverberant conditions with variations of recording distance. LNCC showed a performance so high as the state-of-the-art Mel Frequency Cepstral Coefficient filters. Based on these results, this paper proposes a variation of locally-normalized triangular filters called Locally-Normalized Quarter Tone (LNQT) filters. By using the LNQT spectrogram, robustness improvements of the trained Deep Chroma Extractor are expected, compared with classical triangular filters, and thus compensating the music signal degradation improving the accuracy of the chord recognition system.

Keywords: chord recognition, deep neural networks, feature extraction, music information retrieval

Procedia PDF Downloads 231
9069 Multimodal Direct Neural Network Positron Emission Tomography Reconstruction

Authors: William Whiteley, Jens Gregor

Abstract:

In recent developments of direct neural network based positron emission tomography (PET) reconstruction, two prominent architectures have emerged for converting measurement data into images: 1) networks that contain fully-connected layers; and 2) networks that primarily use a convolutional encoder-decoder architecture. In this paper, we present a multi-modal direct PET reconstruction method called MDPET, which is a hybrid approach that combines the advantages of both types of networks. MDPET processes raw data in the form of sinograms and histo-images in concert with attenuation maps to produce high quality multi-slice PET images (e.g., 8x440x440). MDPET is trained on a large whole-body patient data set and evaluated both quantitatively and qualitatively against target images reconstructed with the standard PET reconstruction benchmark of iterative ordered subsets expectation maximization. The results show that MDPET outperforms the best previously published direct neural network methods in measures of bias, signal-to-noise ratio, mean absolute error, and structural similarity.

Keywords: deep learning, image reconstruction, machine learning, neural network, positron emission tomography

Procedia PDF Downloads 109
9068 Aromatic Medicinal Plant Classification Using Deep Learning

Authors: Tsega Asresa Mengistu, Getahun Tigistu

Abstract:

Computer vision is an artificial intelligence subfield that allows computers and systems to retrieve meaning from digital images. It is applied in various fields of study self-driving cars, video surveillance, agriculture, Quality control, Health care, construction, military, and everyday life. Aromatic and medicinal plants are botanical raw materials used in cosmetics, medicines, health foods, and other natural health products for therapeutic and Aromatic culinary purposes. Herbal industries depend on these special plants. These plants and their products not only serve as a valuable source of income for farmers and entrepreneurs, and going to export not only industrial raw materials but also valuable foreign exchange. There is a lack of technologies for the classification and identification of Aromatic and medicinal plants in Ethiopia. The manual identification system of plants is a tedious, time-consuming, labor, and lengthy process. For farmers, industry personnel, academics, and pharmacists, it is still difficult to identify parts and usage of plants before ingredient extraction. In order to solve this problem, the researcher uses a deep learning approach for the efficient identification of aromatic and medicinal plants by using a convolutional neural network. The objective of the proposed study is to identify the aromatic and medicinal plant Parts and usages using computer vision technology. Therefore, this research initiated a model for the automatic classification of aromatic and medicinal plants by exploring computer vision technology. Morphological characteristics are still the most important tools for the identification of plants. Leaves are the most widely used parts of plants besides the root, flower and fruit, latex, and barks. The study was conducted on aromatic and medicinal plants available in the Ethiopian Institute of Agricultural Research center. An experimental research design is proposed for this study. This is conducted in Convolutional neural networks and Transfer learning. The Researcher employs sigmoid Activation as the last layer and Rectifier liner unit in the hidden layers. Finally, the researcher got a classification accuracy of 66.4 in convolutional neural networks and 67.3 in mobile networks, and 64 in the Visual Geometry Group.

Keywords: aromatic and medicinal plants, computer vision, deep convolutional neural network

Procedia PDF Downloads 438
9067 Analysis and Prediction of Netflix Viewing History Using Netflixlatte as an Enriched Real Data Pool

Authors: Amir Mabhout, Toktam Ghafarian, Amirhossein Farzin, Zahra Makki, Sajjad Alizadeh, Amirhossein Ghavi

Abstract:

The high number of Netflix subscribers makes it attractive for data scientists to extract valuable knowledge from the viewers' behavioural analyses. This paper presents a set of statistical insights into viewers' viewing history. After that, a deep learning model is used to predict the future watching behaviour of the users based on previous watching history within the Netflixlatte data pool. Netflixlatte in an aggregated and anonymized data pool of 320 Netflix viewers with a length 250 000 data points recorded between 2008-2022. We observe insightful correlations between the distribution of viewing time and the COVID-19 pandemic outbreak. The presented deep learning model predicts future movie and TV series viewing habits with an average loss of 0.175.

Keywords: data analysis, deep learning, LSTM neural network, netflix

Procedia PDF Downloads 248
9066 Neural Adaptive Controller for a Class of Nonlinear Pendulum Dynamical System

Authors: Mohammad Reza Rahimi Khoygani, Reza Ghasemi

Abstract:

In this paper, designing direct adaptive neural controller is applied for a class of a nonlinear pendulum dynamic system. The radial basis function (RBF) is used for the Neural network (NN). The adaptive neural controller is robust in presence of external and internal uncertainties. Both the effectiveness of the controller and robustness against disturbances are the merits of this paper. The promising performance of the proposed controllers investigates in simulation results.

Keywords: adaptive control, pendulum dynamical system, nonlinear control, adaptive neural controller, nonlinear dynamical, neural network, RBF, driven pendulum, position control

Procedia PDF Downloads 669
9065 The Intersection of Artificial Intelligence and Mathematics

Authors: Mitat Uysal, Aynur Uysal

Abstract:

Artificial Intelligence (AI) is fundamentally driven by mathematics, with many of its core algorithms rooted in mathematical principles such as linear algebra, probability theory, calculus, and optimization techniques. This paper explores the deep connection between AI and mathematics, highlighting the role of mathematical concepts in key AI techniques like machine learning, neural networks, and optimization. To demonstrate this connection, a case study involving the implementation of a neural network using Python is presented. This practical example illustrates the essential role that mathematics plays in training a model and solving real-world problems.

Keywords: AI, mathematics, machine learning, optimization techniques, image processing

Procedia PDF Downloads 13
9064 Data Augmentation for Early-Stage Lung Nodules Using Deep Image Prior and Pix2pix

Authors: Qasim Munye, Juned Islam, Haseeb Qureshi, Syed Jung

Abstract:

Lung nodules are commonly identified in computed tomography (CT) scans by experienced radiologists at a relatively late stage. Early diagnosis can greatly increase survival. We propose using a pix2pix conditional generative adversarial network to generate realistic images simulating early-stage lung nodule growth. We have applied deep images prior to 2341 slices from 895 computed tomography (CT) scans from the Lung Image Database Consortium (LIDC) dataset to generate pseudo-healthy medical images. From these images, 819 were chosen to train a pix2pix network. We observed that for most of the images, the pix2pix network was able to generate images where the nodule increased in size and intensity across epochs. To evaluate the images, 400 generated images were chosen at random and shown to a medical student beside their corresponding original image. Of these 400 generated images, 384 were defined as satisfactory - meaning they resembled a nodule and were visually similar to the corresponding image. We believe that this generated dataset could be used as training data for neural networks to detect lung nodules at an early stage or to improve the accuracy of such networks. This is particularly significant as datasets containing the growth of early-stage nodules are scarce. This project shows that the combination of deep image prior and generative models could potentially open the door to creating larger datasets than currently possible and has the potential to increase the accuracy of medical classification tasks.

Keywords: medical technology, artificial intelligence, radiology, lung cancer

Procedia PDF Downloads 66
9063 Deep Reinforcement Learning Approach for Optimal Control of Industrial Smart Grids

Authors: Niklas Panten, Eberhard Abele

Abstract:

This paper presents a novel approach for real-time and near-optimal control of industrial smart grids by deep reinforcement learning (DRL). To achieve highly energy-efficient factory systems, the energetic linkage of machines, technical building equipment and the building itself is desirable. However, the increased complexity of the interacting sub-systems, multiple time-variant target values and stochastic influences by the production environment, weather and energy markets make it difficult to efficiently control the energy production, storage and consumption in the hybrid industrial smart grids. The studied deep reinforcement learning approach allows to explore the solution space for proper control policies which minimize a cost function. The deep neural network of the DRL agent is based on a multilayer perceptron (MLP), Long Short-Term Memory (LSTM) and convolutional layers. The agent is trained within multiple Modelica-based factory simulation environments by the Advantage Actor Critic algorithm (A2C). The DRL controller is evaluated by means of the simulation and then compared to a conventional, rule-based approach. Finally, the results indicate that the DRL approach is able to improve the control performance and significantly reduce energy respectively operating costs of industrial smart grids.

Keywords: industrial smart grids, energy efficiency, deep reinforcement learning, optimal control

Procedia PDF Downloads 193
9062 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.

Keywords: deep learning, data mining, gender predication, MOOCs

Procedia PDF Downloads 147
9061 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT is a monolingual model in our case and transformers-based neural networks. It uses a masked language model and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction was not used since the experiments showed that their contribution to the model performance was found insignificant even though Turkish is a highly agglutinative and inflective language. The results show that usage of deep learning methods with pre-trained models and fine-tuning achieve about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy while the MUSE model achieved around 88% and SVM did around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 146
9060 Exploring the Synergistic Effects of Aerobic Exercise and Cinnamon Extract on Metabolic Markers in Insulin-Resistant Rats through Advanced Machine Learning and Deep Learning Techniques

Authors: Masoomeh Alsadat Mirshafaei

Abstract:

The present study aims to explore the effect of an 8-week aerobic training regimen combined with cinnamon extract on serum irisin and leptin levels in insulin-resistant rats. Additionally, this research leverages various machine learning (ML) and deep learning (DL) algorithms to model the complex interdependencies between exercise, nutrition, and metabolic markers, offering a groundbreaking approach to obesity and diabetes research. Forty-eight Wistar rats were selected and randomly divided into four groups: control, training, cinnamon, and training cinnamon. The training protocol was conducted over 8 weeks, with sessions 5 days a week at 75-80% VO2 max. The cinnamon and training-cinnamon groups were injected with 200 ml/kg/day of cinnamon extract. Data analysis included serum data, dietary intake, exercise intensity, and metabolic response variables, with blood samples collected 72 hours after the final training session. The dataset was analyzed using one-way ANOVA (P<0.05) and fed into various ML and DL models, including Support Vector Machines (SVM), Random Forest (RF), and Convolutional Neural Networks (CNN). Traditional statistical methods indicated that aerobic training, with and without cinnamon extract, significantly increased serum irisin and decreased leptin levels. Among the algorithms, the CNN model provided superior performance in identifying specific interactions between cinnamon extract concentration and exercise intensity, optimizing the increase in irisin and the decrease in leptin. The CNN model achieved an accuracy of 92%, outperforming the SVM (85%) and RF (88%) models in predicting the optimal conditions for metabolic marker improvements. The study demonstrated that advanced ML and DL techniques could uncover nuanced relationships and potential cellular responses to exercise and dietary supplements, which is not evident through traditional methods. These findings advocate for the integration of advanced analytical techniques in nutritional science and exercise physiology, paving the way for personalized health interventions in managing obesity and diabetes.

Keywords: aerobic training, cinnamon extract, insulin resistance, irisin, leptin, convolutional neural networks, exercise physiology, support vector machines, random forest

Procedia PDF Downloads 36
9059 AI for Efficient Geothermal Exploration and Utilization

Authors: Velimir "monty" Vesselinov, Trais Kliplhuis, Hope Jasperson

Abstract:

Artificial intelligence (AI) is a powerful tool in the geothermal energy sector, aiding in both exploration and utilization. Identifying promising geothermal sites can be challenging due to limited surface indicators and the need for expensive drilling to confirm subsurface resources. Geothermal reservoirs can be located deep underground and exhibit complex geological structures, making traditional exploration methods time-consuming and imprecise. AI algorithms can analyze vast datasets of geological, geophysical, and remote sensing data, including satellite imagery, seismic surveys, geochemistry, geology, etc. Machine learning algorithms can identify subtle patterns and relationships within this data, potentially revealing hidden geothermal potential in areas previously overlooked. To address these challenges, a SIML (Science-Informed Machine Learning) technology has been developed. SIML methods are different from traditional ML techniques. In both cases, the ML models are trained to predict the spatial distribution of an output (e.g., pressure, temperature, heat flux) based on a series of inputs (e.g., permeability, porosity, etc.). The traditional ML (a) relies on deep and wide neural networks (NNs) based on simple algebraic mappings to represent complex processes. In contrast, the SIML neurons incorporate complex mappings (including constitutive relationships and physics/chemistry models). This results in ML models that have a physical meaning and satisfy physics laws and constraints. The prototype of the developed software, called GeoTGO, is accessible through the cloud. Our software prototype demonstrates how different data sources can be made available for processing, executed demonstrative SIML analyses, and presents the results in a table and graphic form.

Keywords: science-informed machine learning, artificial inteligence, exploration, utilization, hidden geothermal

Procedia PDF Downloads 52