Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 803

Search results for: imbalanced datasets

53 Contextual Toxicity Detection with Data Augmentation

Abstract:

Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context in human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: The contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear, racist, etc. 
words), and thus context is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labeling inconsistencies. To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation (e.g., Generative Adversarial Networks and steering techniques). On the contextual detection models, we posit that their poor performance is due to limitations of both the data they are trained on (the same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking our models against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements compared to baselines that are non-contextual or contextual but agnostic of the conversation structure.

Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing

Procedia PDF Downloads 171

52 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is a natural language processing (NLP) problem that aims to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it is worth understanding social media and taking action accordingly. OM comes to the fore here as the scale of the discussion about companies increases and it becomes unfeasible to gauge opinion at the individual level. Thus, companies opt to automate this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to bag-of-n-grams representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. The transfer learning paradigm, commonly used in computer vision (CV) problems, has lately started to shape NLP approaches and language models (LM). This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations obtained by training on large datasets using self-supervised learning objectives. PTMs are further fine-tuned on a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, Named-Entity Recognition (NER), Question Answering (QA), and so forth. In this study, traditional and modern NLP approaches have been evaluated for OM using a sizable corpus belonging to a large private company and containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, the multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). 
The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT model is monolingual in our case and is based on transformer neural networks. It uses masked language modeling and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction were not used, since the experiments showed that their contribution to model performance was insignificant even though Turkish is a highly agglutinative and inflective language. The results show that the use of deep learning methods with pre-trained models and fine-tuning achieves about an 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy, while the MUSE model achieved around 88% and SVM around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.
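
As an illustration of the bag-of-n-grams representation that the abstract pairs with SVM, here is a minimal Python sketch. The function name `bag_of_ngrams`, the whitespace tokenization, and the default `n_max=2` are assumptions made for this example; the abstract does not specify the authors' feature extraction.

```python
from collections import Counter


def bag_of_ngrams(text, n_max=2):
    """Count word n-grams (n = 1..n_max) in a lowercased, whitespace-split text.

    Hypothetical helper for illustration; tokenization here is deliberately naive.
    """
    tokens = text.lower().split()
    bag = Counter()
    for n in range(1, n_max + 1):
        # Slide a window of n tokens over the sentence.
        for i in range(len(tokens) - n + 1):
            bag[" ".join(tokens[i:i + n])] += 1
    return bag
```

The resulting counts could then be mapped to a fixed vocabulary and passed to any linear classifier; that step is omitted here.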

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 148

51 Impact Location From Instrumented Mouthguard Kinematic Data In Rugby

Authors: Jazim Sohail, Filipe Teixeira-Dias

Abstract:

Mild traumatic brain injury (mTBI) within non-helmeted contact sports is a growing concern due to the serious risk of potential injury. Extensive research is being conducted looking into head kinematics in non-helmeted contact sports utilizing instrumented mouthguards that allow researchers to record accelerations and velocities of the head during and after an impact. This does not, however, allow the location of the impact on the head, and its magnitude and orientation, to be determined. This research proposes and validates two methods to quantify impact locations from instrumented mouthguard kinematic data, one using rigid body dynamics, the other utilizing machine learning. The rigid body dynamics technique focuses on establishing and matching moments from Euler’s and torque equations in order to find the impact location on the head. The methodology is validated with impact data collected from a lab test with the dummy head fitted with an instrumented mouthguard. Additionally, a Hybrid III Dummy head finite element model was utilized to create synthetic kinematic data sets for impacts from varying locations to validate the impact location algorithm. The algorithm calculates accurate impact locations; however, it will require preprocessing of live data, which is currently being done by cross-referencing data timestamps to video footage. The machine learning technique focuses on eliminating the preprocessing aspect by establishing trends within time-series signals from instrumented mouthguards to determine the impact location on the head. An unsupervised learning technique is used to cluster together impacts within similar regions from an entire time-series signal. The kinematic signals established from mouthguards are converted to the frequency domain before using a clustering algorithm to cluster together similar signals within a time series that may span the length of a game. Impacts are clustered within predetermined location bins. 
The same Hybrid III Dummy finite element model is used to create impacts that closely replicate on-field impacts in order to create synthetic time-series datasets consisting of impacts in varying locations. These time-series data sets are used to validate the machine learning technique. The rigid body dynamics technique provides a good method to establish accurate impact location of impact signals that have already been labeled as true impacts and filtered out of the entire time series. However, the machine learning technique provides a method that can be implemented with long time series signal data but will provide impact location within predetermined regions on the head. Additionally, the machine learning technique can be used to eliminate false impacts captured by sensors saving additional time for data scientists using instrumented mouthguard kinematic data as validating true impacts with video footage would not be required.

Keywords: head impacts, impact location, instrumented mouthguard, machine learning, mTBI

Procedia PDF Downloads 217

50 Automatic Identification and Classification of Contaminated Biodegradable Plastics using Machine Learning Algorithms and Hyperspectral Imaging Technology

Authors: Nutcha Taneepanichskul, Helen C. Hailes, Mark Miodownik

Abstract:

Plastic waste has emerged as a critical global environmental challenge, primarily driven by the prevalent use of conventional plastics derived from petrochemical refining and manufacturing processes in modern packaging. While these plastics serve vital functions, their persistence in the environment post-disposal poses significant threats to ecosystems. Addressing this issue necessitates a range of approaches, one of which involves the development of biodegradable plastics designed to degrade under controlled conditions, such as in industrial composting facilities. It is imperative to note that compostable plastics are engineered for degradation within specific environments and are not suited for uncontrolled settings, including natural landscapes and aquatic ecosystems. The full benefits of compostable packaging are realized when it is subjected to industrial composting, preventing environmental contamination and waste stream pollution. Therefore, effective sorting technologies are essential to enhance composting rates for these materials and diminish the risk of contaminating recycling streams. This study leverages hyperspectral imaging (HSI) technology coupled with advanced machine learning algorithms to accurately identify various types of plastics, encompassing conventional variants such as polyethylene terephthalate (PET), polypropylene (PP), low-density polyethylene (LDPE), and high-density polyethylene (HDPE), and biodegradable alternatives such as polybutylene adipate terephthalate (PBAT), polylactic acid (PLA), and polyhydroxyalkanoates (PHA). The dataset is partitioned into three subsets: a training dataset comprising uncontaminated conventional and biodegradable plastics, a validation dataset encompassing contaminated plastics of both types, and a testing dataset featuring real-world packaging items in both pristine and contaminated states. 
Five distinct machine learning algorithms, namely Partial Least Squares Discriminant Analysis (PLS-DA), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Logistic Regression, and Decision Tree, were developed and evaluated for their classification performance. The Logistic Regression and CNN models exhibited the most promising outcomes, achieving an accuracy of 100% on the training and validation datasets and an accuracy exceeding 80% on the testing dataset. The successful implementation of this sorting technology within recycling and composting facilities holds the potential to significantly elevate recycling and composting rates. As a result, the envisioned circular economy for plastics can be established, thereby offering a viable solution to mitigate plastic pollution.

Keywords: biodegradable plastics, sorting technology, hyperspectral imaging technology, machine learning algorithms

Procedia PDF Downloads 82

49 Assessment of DNA Sequence Encoding Techniques for Machine Learning Algorithms Using a Universal Bacterial Marker

Authors: Diego Santibañez Oyarce, Fernanda Bravo Cornejo, Camilo Cerda Sarabia, Belén Díaz Díaz, Esteban Gómez Terán, Hugo Osses Prado, Raúl Caulier-Cisterna, Jorge Vergara-Quezada, Ana Moya-Beltrán

Abstract:

The advent of high-throughput sequencing technologies has revolutionized genomics, generating vast amounts of genetic data that challenge traditional bioinformatics methods. Machine learning addresses these challenges by leveraging computational power to identify patterns and extract information from large datasets. However, biological sequence data, being symbolic and non-numeric, must be converted into numerical formats for machine learning algorithms to process effectively. So far, some encoding methods, such as one-hot encoding or k-mers, have been explored. This work proposes additional approaches for encoding DNA sequences in order to compare them with existing techniques and determine whether they can provide improvements or whether current methods offer superior results. Data from the 16S rRNA gene, a universal marker, was used to analyze eight bacterial groups that are significant in the pulmonary environment and have clinical implications. The bacterial genera included in this analysis are Prevotella, Abiotrophia, Acidovorax, Streptococcus, Neisseria, Veillonella, Mycobacterium, and Megasphaera. These data were downloaded from the NCBI database in GenBank file format, followed by a syntactic analysis to selectively extract relevant information from each file. For data encoding, a sequence normalization process was carried out as the first step. From approximately 22,000 initial data points, a subset was generated for testing purposes. Specifically, 55 sequences from each bacterial group met the length criteria, resulting in an initial sample of approximately 440 sequences. The sequences were encoded using different methods, including one-hot encoding, k-mers, Fourier transform, and Wavelet transform. Various machine learning algorithms, such as support vector machines, random forests, and neural networks, were trained to evaluate these encoding methods. 
The performance of these models was assessed using multiple metrics, including the confusion matrix, ROC curve, and F1 Score, providing a comprehensive evaluation of their classification capabilities. The results show that accuracies between encoding methods vary by up to approximately 15%, with the Fourier transform obtaining the best results for the evaluated machine learning algorithms. These findings, supported by the detailed analysis using the confusion matrix, ROC curve, and F1 Score, provide valuable insights into the effectiveness of different encoding methods and machine learning algorithms for genomic data analysis, potentially improving the accuracy and efficiency of bacterial classification and related genomic studies.
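
To make two of the encodings named in this abstract concrete, the following Python sketch shows one-hot encoding and k-mer counting for a DNA string. The function names, the `ACGT` ordering, and the choice to map unknown bases to all-zero rows are assumptions for illustration only; the authors' normalization and encoding pipeline is not described in enough detail to reproduce.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumed base ordering for this sketch


def one_hot(seq):
    """Return one 4-element 0/1 row per base; other symbols (e.g. N) become all zeros."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq.upper()]


def kmer_counts(seq, k=3):
    """Count overlapping k-mers and return a fixed-length vector over all 4**k k-mers."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate k-mers in lexicographic order so every sequence maps to the same layout.
    vocabulary = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return [counts[kmer] for kmer in vocabulary]
```

One-hot encoding preserves position but yields variable-length output, while k-mer counts give a fixed-length vector regardless of sequence length, which is one reason the two can behave differently in downstream classifiers.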

Keywords: DNA encoding, machine learning, Fourier transform, Fourier transformation

Procedia PDF Downloads 28

48 Temporal Estimation of Hydrodynamic Parameter Variability in Constructed Wetlands

Authors: Mohammad Moezzibadi, Isabelle Charpentier, Adrien Wanko, Robert Mosé

Abstract:

The calibration of hydrodynamic parameters for subsurface constructed wetlands (CWs) is a sensitive process since highly non-linear equations are involved in unsaturated flow modeling. CWs are engineered systems designed to favour natural treatment processes involving wetland vegetation, soil, and their microbial flora. Their significant efficiency at reducing the ecological impact of urban runoff has recently been proved in the field. Numerical flow modeling in a vertical variably saturated CW is here carried out by implementing the Richards model by means of a mixed hybrid finite element method (MHFEM), particularly well adapted to the simulation of heterogeneous media, and the van Genuchten-Mualem parametrization. For validation purposes, MHFEM results were compared to those of HYDRUS (a software based on a finite element discretization). As van Genuchten-Mualem soil hydrodynamic parameters depend on water content, their estimation is the subject of considerable experimental and numerical studies. In particular, the sensitivity analysis performed with respect to the van Genuchten-Mualem parameters reveals a predominant influence of the shape parameters α and n and the saturated conductivity of the filter on the piezometric heads during saturation and desaturation. Modeling issues arise when the soil reaches oven-dry conditions. Particular attention should also be paid to boundary condition modeling (surface ponding or evaporation) to be able to tackle different sequences of rainfall-runoff events. For proper parameter identification, large field datasets would be needed. As these are usually not available, notably due to the randomness of storm events, we propose a simple, robust and low-cost numerical method for the inverse modeling of the soil hydrodynamic properties. Among the available methods, the variational data assimilation technique introduced by Le Dimet and Talagrand is applied. 
To that end, a variational data assimilation technique is implemented by applying automatic differentiation (AD) to augment computer codes with derivative computations. Note that very little effort is needed to obtain the differentiated code using the on-line Tapenade AD engine. Field data were collected over several months for a three-layered CW located in Strasbourg (Alsace, France) at the water edge of the urban stream Ostwaldergraben. Identification experiments are conducted by comparing measured and computed piezometric heads by means of a least-squares objective function. The temporal variability of the hydrodynamic parameters is then assessed and analyzed.
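
The van Genuchten-Mualem parametrization referred to in this abstract can be sketched in Python as follows. This uses the standard textbook form with m = 1 - 1/n and a pore-connectivity exponent of 0.5; the function and argument names are chosen for this example, and nothing here is taken from the authors' MHFEM code.

```python
def effective_saturation(h, alpha, n):
    """Effective saturation Se for pressure head h (negative when unsaturated)."""
    if h >= 0:
        return 1.0  # saturated conditions
    m = 1.0 - 1.0 / n
    return (1.0 + (alpha * abs(h)) ** n) ** (-m)


def water_content(h, theta_r, theta_s, alpha, n):
    """Volumetric water content between residual (theta_r) and saturated (theta_s)."""
    return theta_r + (theta_s - theta_r) * effective_saturation(h, alpha, n)


def hydraulic_conductivity(h, k_s, alpha, n, pore_connectivity=0.5):
    """Mualem conductivity: K = Ks * Se^l * (1 - (1 - Se^(1/m))^m)^2."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(h, alpha, n)
    return k_s * se ** pore_connectivity * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2
```

The shape parameters `alpha` and `n` and the saturated conductivity `k_s` are the quantities the abstract identifies as most influential on the piezometric heads, so they are natural candidates for the inverse problem.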

Keywords: automatic differentiation, constructed wetland, inverse method, mixed hybrid FEM, sensitivity analysis

Procedia PDF Downloads 164

47 Strategies for Synchronizing Chocolate Conching Data Using Dynamic Time Warping

Authors: Fernanda A. P. Peres, Thiago N. Peres, Flavio S. Fogliatto, Michel J. Anzanello

Abstract:

Batch processes are widely used in food industry and have an important role in the production of high added value products, such as chocolate. Process performance is usually described by variables that are monitored as the batch progresses. Data arising from these processes are likely to display a strong correlation-autocorrelation structure, and are usually monitored using control charts based on multiway principal components analysis (MPCA). Process control of a new batch is carried out comparing the trajectories of its relevant process variables with those in a reference set of batches that yielded products within specifications; it is clear that proper determination of the reference set is key for the success of a correct signalization of non-conforming batches in such quality control schemes. In chocolate manufacturing, misclassifications of non-conforming batches in the conching phase may lead to significant financial losses. In such context, the accuracy of process control grows in relevance. In addition to that, the main assumption in MPCA-based monitoring strategies is that all batches are synchronized in duration, both the new batch being monitored and those in the reference set. Such assumption is often not satisfied in chocolate manufacturing process. As a consequence, traditional techniques as MPCA-based charts are not suitable for process control and monitoring. To address that issue, the objective of this work is to compare the performance of three dynamic time warping (DTW) methods in the alignment and synchronization of chocolate conching process variables’ trajectories, aimed at properly determining the reference distribution for multivariate statistical process control. The power of classification of batches in two categories (conforming and non-conforming) was evaluated using the k-nearest neighbor (KNN) algorithm. 
Real data from a milk chocolate conching process was collected and the following variables were monitored over time: frequency of soybean lecithin dosage, rotation speed of the shovels, current of the main motor of the conche, and chocolate temperature. A set of 62 batches with durations between 495 and 1,170 minutes was considered; 53% of the batches were known to be conforming based on lab test results and experts’ evaluations. Results showed that all three DTW methods tested were able to align and synchronize the conching dataset. However, synchronized datasets obtained from these methods performed differently when inputted in the KNN classification algorithm. Kassidas, MacGregor and Taylor’s (named KMT) method was deemed the best DTW method for aligning and synchronizing a milk chocolate conching dataset, presenting 93.7% accuracy, 97.2% sensitivity and 90.3% specificity in batch classification, being considered the best option to determine the reference set for the milk chocolate dataset. Such method was recommended due to the lowest number of iterations required to achieve convergence and highest average accuracy in the testing portion using the KNN classification technique.
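
For readers unfamiliar with dynamic time warping, the Python sketch below computes the classic DTW distance between two univariate trajectories of different lengths. It is the basic textbook recurrence with an absolute-difference local cost, not the iterative KMT method evaluated in the abstract, and the function name is an assumption for this example.

```python
def dtw_distance(a, b):
    """Classic DTW cost between sequences a and b using |x - y| as local distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because the warping path may repeat samples, two batches that follow the same profile at different speeds obtain a small distance, which is what makes DTW useful for batches of unequal duration.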

Keywords: batch process monitoring, chocolate conching, dynamic time warping, reference set distribution, variable duration

Procedia PDF Downloads 168

46 Bioinformatic Prediction of Hub Genes by Analysis of Signaling Pathways, Transcriptional Regulatory Networks and DNA Methylation Pattern in Colon Cancer

Authors: Ankan Roy, Niharika, Samir Kumar Patra

Abstract:

An anomalous nexus of complex topological assemblies and spatiotemporal epigenetic choreography at chromosomal territories may form the most sophisticated regulatory layer of gene expression in cancer. Colon cancer is one of the leading malignant neoplasms of the lower gastrointestinal tract worldwide. There is still a paucity of information about the complex molecular mechanisms of colonic cancerogenesis. Bioinformatics prediction and analysis help to identify essential genes and significant pathways for monitoring and conquering this deadly disease. The present study investigates and explores potential hub genes as biomarkers and effective therapeutic targets for colon cancer treatment. Gene expression profile datasets from colon cancer patient samples, such as GSE44076, GSE20916, and GSE37364, were downloaded from the Gene Expression Omnibus (GEO) database and thoroughly screened using the GEO2R tool and FunRich software to find common differentially expressed genes (DEGs). Other approaches, including Gene Ontology (GO) and KEGG pathway analysis, Protein-Protein Interaction (PPI) network construction and hub gene investigation, Overall Survival (OS) analysis, gene correlation analysis, methylation pattern analysis, and hub gene-transcription factor regulatory network construction, were performed and validated using various bioinformatics tools. Initially, we identified 166 DEGs, including 68 up-regulated and 98 down-regulated genes. Up-regulated genes are mainly associated with cytokine-cytokine receptor interaction, the IL-17 signaling pathway, ECM-receptor interaction, focal adhesion, and the PI3K-Akt pathway. Down-regulated genes are enriched in metabolic pathways, retinol metabolism, steroid hormone biosynthesis, and bile secretion. From the protein-protein interaction network, thirty hub genes with high connectivity were selected using the MCODE and cytoHubba plugins. 
Survival analysis, expression validation, correlation analysis, and methylation pattern analysis were further verified using TCGA data. Finally, we predicted COL1A1, COL1A2, COL4A1, SPP1, SPARC, and THBS2 as potential master regulators in colonic cancerogenesis. Moreover, our experimental data highlight that disruption of lipid rafts and the RAS/MAPK signaling cascade affects this gene hub at the mRNA level. We identified COL1A1, COL1A2, COL4A1, SPP1, SPARC, and THBS2 as determinant hub genes in colon cancer progression. They can be considered as biomarkers for diagnosis and promising therapeutic targets in colon cancer treatment. Additionally, our experimental data indicate that the signaling pathway acts as a connecting link between the membrane hub and the gene hub.

Keywords: hub genes, colon cancer, DNA methylation, epigenetic engineering, bioinformatic predictions

Procedia PDF Downloads 132

45 An Evolutionary Approach for Automated Optimization and Design of Vivaldi Antennas

Authors: Sahithi Yarlagadda

Abstract:

The design of an antenna is constrained by mathematical and geometrical parameters. Although diverse antenna structures with a wide range of feeds exist, there are still many geometries to be tried, which cannot be fitted into predefined computational methods. Antenna design and optimization are well suited to an evolutionary algorithmic approach, since the weights of the antenna parameters depend directly on geometric characteristics. The evolutionary algorithm can be explained simply: given a quality function to be maximized, we can randomly create a set of candidate solutions, elements of the function's domain, and apply the quality function as an abstract fitness measure. Based on this fitness, some of the better candidates are chosen to seed the next generation by applying recombination and mutation to them. In the conventional approach, the quality function is unaltered across iterations. However, the antenna parameters and geometries are too varied to fit into a single function. So, the weight coefficients are obtained for all possible antenna electrical parameters and geometries; the variation is learnt by mining the data obtained for an optimized algorithm. The weight and covariant coefficients of the corresponding parameters are logged for learning and future use as datasets. This paper drafts an approach to obtain the requirements needed to study and systematize the evolutionary approach to automated antenna design, using our past work on the Vivaldi antenna as a test candidate. Antenna parameters such as gain and directivity are directly constrained by geometries, materials, and dimensions. The design equations are to be noted here and evaluated for all possible conditions to get the maxima and minima for a given frequency band. The boundary conditions are thus obtained prior to implementation, easing the optimization. The implementation mainly aims to study the practical computational, processing, and design complexities that arise during simulations. HFSS is chosen for simulations and results. 
MATLAB is used to generate the computations and combinations and to log data. MATLAB is also used to apply machine learning algorithms and to plot the data in order to design the algorithm. The number of combinations is too large to test manually, so the HFSS API is used to call HFSS functions from MATLAB itself. The MATLAB parallel processing toolbox is used to run multiple simulations in parallel. The aim is to develop an add-in to antenna design software such as HFSS or CST, or a standalone application, to optimize pre-identified common parameters of the wide range of antennas available. In this paper, we have used MATLAB to calculate Vivaldi antenna parameters such as slot line characteristic impedance, impedance of the stripline, slot line width, flare aperture size, and dielectric; K-means and a Hamming window are applied to obtain the best test parameters. The HFSS API is used to calculate the radiation, bandwidth, directivity, and efficiency, and data is logged for applying the evolutionary genetic algorithm in MATLAB. The paper demonstrates the computational weights and machine learning approach for automated optimization of the Vivaldi antenna.
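
The generic evolutionary loop described in this abstract (random candidates, fitness evaluation, selection of the better candidates, then variation) can be sketched in Python as below. This is a toy one-parameter version written for illustration: the function name `evolve`, the averaging recombination, the Gaussian mutation, and all default settings are assumptions, and it does not represent the authors' MATLAB/HFSS workflow.

```python
import random


def evolve(quality, bounds, pop_size=20, generations=50, mutation_scale=0.1, seed=0):
    """Maximize quality(x) over a scalar x in [lo, hi] with a simple evolutionary loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Rank candidates by fitness and keep the better half as parents.
        population.sort(key=quality, reverse=True)
        best_history.append(quality(population[0]))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = (p1 + p2) / 2.0  # recombination: average of two parents
            child += rng.gauss(0.0, mutation_scale * (hi - lo))  # small random mutation
            children.append(min(hi, max(lo, child)))  # clamp to the search bounds
        population = parents + children
    population.sort(key=quality, reverse=True)
    return population[0], best_history
```

Because the parents survive unchanged into the next generation, the best fitness never decreases; in a real antenna study, `quality` would wrap an expensive simulation call rather than a closed-form function.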

Keywords: machine learning, Vivaldi, evolutionary algorithm, genetic algorithm

Procedia PDF Downloads 111

44 The Impact of Shifting Trading Pattern from Long-Haul to Short-Sea to the Car Carriers’ Freight Revenues

Authors: Tianyu Wang, Nikita Karandikar

Abstract:

The uncertainty around cost, safety, and feasibility of the decarbonized shipping fuels has made it increasingly complex for the shipping companies to set pricing strategies and forecast their freight revenues going forward. The increase in the green fuel surcharges will ultimately influence the automobile’s consumer prices. The auto shipping demand (ton-miles) has been gradually shifting from long-haul to short-sea trade over the past years following the relocation of the original equipment manufacturer (OEM) manufacturing to regions such as South America and Southeast Asia. The objective of this paper is twofold: 1) to investigate the car-carriers freight revenue development over the years when the trade pattern is gradually shifting towards short-sea exports 2) to empirically identify the quantitative impact of such trade pattern shifting to mainly freight rate, but also vessel size, fleet size as well as Green House Gas (GHG) emission in Roll on-Roll Off (Ro-Ro) shipping. In this paper, a model of analyzing and forecasting ton-miles and freight revenues for the trade routes of AS-NA (Asia to North America), EU-NA (Europe to North America), and SA-NA (South America to North America) is established by deploying Automatic Identification System (AIS) data and the financial results of a selected car carrier company. More specifically, Wallenius Wilhelmsen Logistics (WALWIL), the Norwegian Ro-Ro carrier listed on Oslo Stock Exchange, is selected as the case study company in this paper. AIS-based ton-mile datasets of WALWIL vessels that are sailing into North America region from three different origins (Asia, Europe, and South America), together with WALWIL’s quarterly freight revenues as reported in trade segments, will be investigated and compared for the past five years (2018-2022). Furthermore, ordinary‐least‐square (OLS) regression is utilized to construct the ton-mile demand and freight revenue forecasting. 
The determinants of trade pattern shifting, such as import tariffs following the China-US trade war and fuel prices following the 0.1% Emission Control Areas (ECA) zone requirement after IMO2020 will be set as key variable inputs to the machine learning model. The model will be tested on another newly listed Norwegian Car Carrier, Hoegh Autoliner, to forecast its 2022 financial results and to validate the accuracy based on its actual results. GHG emissions on the three routes will be compared and discussed based on a constant emission per mile assumption and voyage distances. Our findings will provide important insights about 1) the trade-off evaluation between revenue reduction and energy saving with the new ton-mile pattern and 2) how the trade flow shifting would influence the future need for the vessel and fleet size.
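
As a minimal illustration of the ordinary least squares regression mentioned in this abstract, the Python sketch below fits a single-predictor line in closed form. The function name `ols_fit` and the single-regressor setup are assumptions for this example; the abstract's model uses several inputs (such as tariffs and fuel prices) whose exact specification is not given.

```python
def ols_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form solution: slope = cov(x, y) / var(x).
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

With quarterly ton-miles as `x` and quarterly freight revenue as `y`, the fitted slope would be read as revenue per additional ton-mile under the linear assumption.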

Keywords: AIS, automobile exports, maritime big data, trade flows

Procedia PDF Downloads 121

43 Influence of a High-Resolution Land Cover Classification on Air Quality Modelling

Authors: C. Silveira, A. Ascenso, J. Ferreira, A. I. Miranda, P. Tuccella, G. Curci

Abstract:

Poor air quality is one of the main environmental causes of premature deaths worldwide, and mainly in cities, where the majority of the population lives. It is a consequence of successive land cover (LC) and use changes, as a result of the intensification of human activities. Knowing these landscape modifications in a comprehensive spatiotemporal dimension is, therefore, essential for understanding variations in air pollutant concentrations. In this sense, the use of air quality models is very useful to simulate the physical and chemical processes that affect the dispersion and reaction of chemical species in the atmosphere. However, the modelling performance should always be evaluated since the resolution of the input datasets largely dictates the reliability of the air quality outcomes. Among these data, the updated LC is an important parameter to be considered in atmospheric models, since it takes into account the Earth’s surface changes due to natural and anthropic actions, and regulates the exchanges of fluxes (emissions, heat, moisture, etc.) between the soil and the air. This work aims to evaluate the performance of the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem), when different LC classifications are used as an input. The influence of two LC classifications was tested: i) the 24-classes USGS (United States Geological Survey) LC database included by default in the model, and ii) the CLC (Corine Land Cover) and specific high-resolution LC data for Portugal, reclassified according to the new USGS nomenclature (33-classes). Two distinct WRF-Chem simulations were carried out to assess the influence of the LC on air quality over Europe and Portugal, as a case study, for the year 2015, using the nesting technique over three simulation domains (25 km², 5 km² and 1 km² horizontal resolution). 
Based on the 33-classes LC approach, particular emphasis was attributed to Portugal, given its greater detail and higher LC spatial resolution (100 m x 100 m) compared with the CLC data (5000 m x 5000 m). As regards air quality, only the LC impacts on tropospheric ozone concentrations were evaluated, because ozone pollution episodes typically occur in Portugal, in particular during the spring/summer, and there are few research works relating this pollutant to LC changes. The WRF-Chem results were validated by season and station typology using background measurements from the Portuguese air quality monitoring network. As expected, a better model performance was achieved in rural stations, where higher average ozone concentrations were also estimated: moderate correlation (0.4 – 0.7), BIAS (10 – 21 µg.m^-3) and RMSE (20 – 30 µg.m^-3). Comparing both simulations, small differences grounded on the Leaf Area Index and air temperature values were found, although the high-resolution LC approach shows a slight enhancement in the model evaluation. This highlights the role of the LC on the exchange of atmospheric fluxes, and stresses the need to consider a high-resolution LC characterization combined with other detailed model inputs, such as the emission inventory, to improve air quality assessment.
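
The validation statistics quoted above (correlation, BIAS, RMSE) can be sketched as follows. This is a minimal illustration under the common assumption that bias is the mean of modelled minus observed values; the abstract does not give its exact definitions, and the ozone values below are placeholders rather than station data.

```python
import math
from typing import Sequence


def mean_bias(model: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of (model - observation); positive means overestimation."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """Root mean square error between model and observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def pearson_r(model: Sequence[float], obs: Sequence[float]) -> float:
    """Pearson correlation coefficient between model and observations."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)


# Placeholder ozone concentrations in µg.m^-3 (not measured data).
observed = [60.0, 80.0, 100.0, 120.0]
modelled = [70.0, 90.0, 110.0, 130.0]  # constant +10 offset

bias_value = mean_bias(modelled, observed)
rmse_value = rmse(modelled, observed)
r_value = pearson_r(modelled, observed)
```

The example shows why the three statistics are reported together: a constant offset leaves the correlation perfect while bias and RMSE are both non-zero.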

Keywords: land use, spatial resolution, WRF-Chem, air quality assessment

42 Assessing Moisture Adequacy over Semi-arid and Arid Indian Agricultural Farms using High-Resolution Thermography

Authors: Devansh Desai, Rahul Nigam

Abstract:

Crop water stress (W) at a given growth stage starts to set in as moisture availability (M) to roots falls below 75% of maximum. It has been found that the ratio of crop evapotranspiration (ET) to reference evapotranspiration (ET0) is an indicator of moisture adequacy and is strongly correlated with ‘M’ and ‘W’. Over an agricultural farm of 1-5 ha, the spatial variability of ET0 is generally lower than that of ET: ET depends on both surface and atmospheric conditions, whereas ET0 depends only on atmospheric conditions. Solutions from surface energy balance (SEB) and thermal infrared (TIR) remote sensing are now known to estimate latent heat flux of ET. In the present study, ET and moisture adequacy index (MAI) (=ET/ET0) have been estimated over two contrasting western India agricultural farms having rice-wheat system in semi-arid climate and arid grassland system, limited by moisture availability. High-resolution multi-band TIR sensing observations at 65m from ECOSTRESS (ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station) instrument on-board International Space Station (ISS) were used in an analytical SEB model, STIC (Surface Temperature Initiated Closure) to estimate ET and MAI. The ancillary variables used in the ET modeling and MAI estimation were land surface albedo, NDVI from close-by LANDSAT data at 30m spatial resolution, ET0 product at 4km spatial resolution from INSAT 3D, meteorological forcing variables from short-range weather forecast on air temperature and relative humidity from NWP model. Farm-scale ET estimates at 65m spatial resolution were found to show low RMSE of 16.6% to 17.5% with R² > 0.8 from 18 datasets as compared to reported errors (25 – 30%) from coarser-scale ET at 1 to 8 km spatial resolution when compared to in situ measurements from eddy covariance systems. The MAI was found to show lower (<0.25) and higher (>0.5) magnitudes in the contrasting agricultural farms. 
The study showed the potential need of high-resolution high-repeat spaceborne multi-band TIR payloads along with optical payload in estimating farm-scale ET and MAI for estimating consumptive water use and water stress. A set of future high-resolution multi-band TIR sensors is planned on-board the Indo-French TRISHNA, ESA’s LSTM, and NASA’s SBG space-borne missions to address sustainable irrigation water management at farm-scale to improve crop water productivity. These will provide precise and fundamental variables of surface energy balance such as LST (Land Surface Temperature), surface emissivity, albedo and NDVI. A synchronization among these missions is needed in terms of observations, algorithms, product definitions, calibration-validation experiments and downstream applications to maximize the potential benefits.
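
The moisture adequacy index defined in the abstract (MAI = ET/ET0) is a simple per-pixel ratio. The sketch below is only an illustration of that definition; the function name, the mm/day units, and the pixel values are assumptions, not outputs of the STIC model.

```python
from typing import List, Sequence


def moisture_adequacy_index(et: float, et0: float) -> float:
    """MAI = ET / ET0 for one pixel; ET0 must be positive."""
    if et0 <= 0:
        raise ValueError("reference evapotranspiration must be positive")
    return et / et0


def mai_map(et_pixels: Sequence[float], et0_pixels: Sequence[float]) -> List[float]:
    """Apply the ratio pixel by pixel over two equally sized sequences."""
    if len(et_pixels) != len(et0_pixels):
        raise ValueError("ET and ET0 must cover the same pixels")
    return [moisture_adequacy_index(e, r) for e, r in zip(et_pixels, et0_pixels)]


# Placeholder values in mm/day (assumed units), not observations.
et = [1.0, 3.0, 4.0]
et0 = [8.0, 6.0, 8.0]
mai = mai_map(et, et0)
```

In practice ET0 would first be resampled to the ET grid, since the abstract reports ET at 65 m and the ET0 product at 4 km.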

Keywords: thermal remote sensing, land surface temperature, crop water stress, evapotranspiration

41 Supplementing Aerial-Roving Surveys with Autonomous Optical Cameras: A High Temporal Resolution Approach to Monitoring and Estimating Effort within a Recreational Salmon Fishery in British Columbia, Canada

Authors: Ben Morrow, Patrick O'Hara, Natalie Ban, Tunai Marques, Molly Fraser, Christopher Bone

Abstract:

Relative to commercial fisheries, recreational fisheries are often poorly understood and pose various challenges for monitoring frameworks. In British Columbia (BC), Canada, Pacific salmon are heavily targeted by recreational fishers while also being a key source of nutrient flow and crucial prey for a variety of marine and terrestrial fauna, including endangered Southern Resident killer whales (Orcinus orca). Although commercial fisheries were historically responsible for the majority of salmon retention, recreational fishing now comprises both greater effort and retention. The current monitoring scheme for recreational salmon fisheries involves aerial-roving creel surveys. However, this method has been identified as costly and having low predictive power as it is often limited to sampling fragments of fluid and temporally dynamic fisheries. This study used imagery from two shore-based autonomous cameras in a highly active recreational fishery around Sooke, BC, and evaluated their efficacy in supplementing existing aerial-roving surveys for monitoring a recreational salmon fishery. This study involved continuous monitoring and high temporal resolution (over one million images analyzed in a single fishing season), using a deep learning-based vessel detection algorithm and a custom image annotation tool to efficiently thin datasets. This allowed for the quantification of peak-season effort from a busy harbour, species-specific retention estimates, high levels of detected fishing events at a nearby popular fishing location, as well as the proportion of the fishery management area represented by cameras. Then, this study demonstrated how it could substantially enhance the temporal resolution of a fishery through diel activity pattern analyses, scaled monthly to visualize clusters of activity. This work also highlighted considerable off-season fishing detection, currently unaccounted for in the existing monitoring framework. 
These results demonstrate several distinct applications of autonomous cameras for providing enhanced detail currently unavailable in the current monitoring framework, each of which has important considerations for the managerial allocation of resources. Further, the approach and methodology can benefit other studies that apply shore-based camera monitoring, supplement aerial-roving creel surveys to improve fine-scale temporal understanding, inform the optimal timing of creel surveys, and improve the predictive power of recreational stock assessments to preserve important and endangered fish species.
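
One way to picture the diel activity analysis described above is to bin vessel-detection timestamps by month and hour of day. The sketch below assumes detections are already available as timestamps; the function name and the dates are hypothetical placeholders, and the study's actual detection pipeline (deep learning-based vessel detection plus annotation) is not reproduced.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple


def diel_counts(timestamps: Iterable[datetime]) -> "Counter[Tuple[int, int]]":
    """Count detections per (month, hour-of-day) bin."""
    return Counter((t.month, t.hour) for t in timestamps)


# Hypothetical detection times, for illustration only.
detections = [
    datetime(2021, 7, 1, 6, 15),
    datetime(2021, 7, 1, 6, 40),
    datetime(2021, 7, 2, 18, 5),
    datetime(2021, 8, 3, 6, 0),
]
counts = diel_counts(detections)
```

Plotting these counts as an hour-by-month grid gives the kind of monthly-scaled activity clusters the abstract refers to.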

Keywords: cameras, monitoring, recreational fishing, stock assessment

40 Effect of Climate Change on the Genomics of Invasiveness of the Whitefly Bemisia tabaci Species Complex by Estimating the Effective Population Size via a Coalescent Method

Authors: Samia Elfekih, Wee Tek Tay, Karl Gordon, Paul De Barro

Abstract:

Invasive species represent an increasing threat to food biosecurity, causing significant economic losses in agricultural systems. An example is the sweet potato whitefly, Bemisia tabaci, which is a complex of morphologically indistinguishable species causing average annual global damage estimated at US$2.4 billion. The Bemisia complex represents an interesting model for evolutionary studies because of their extensive distribution and potential for invasiveness and population expansion. Within this complex, two species, Middle East-Asia Minor 1 (MEAM1) and Mediterranean (MED) have invaded well beyond their home ranges whereas others, such as Indian Ocean (IO) and Australia (AUS), have not. In order to understand why some Bemisia species have become invasive, genome-wide sequence scans were used to estimate population dynamics over time and relate these to climate. The Bayesian Skyline Plot (BSP) method as implemented in BEAST was used to infer the historical effective population size. In order to overcome sampling bias, the populations were combined based on geographical origin. The datasets used for this particular analysis are genome-wide SNPs (single nucleotide polymorphisms) called separately in each of the following groups: Sub-Saharan Africa (Burkina Faso), Europe (Spain, France, Greece and Croatia), USA (Arizona), Mediterranean-Middle East (Israel, Italy), Middle East-Central Asia (Turkmenistan, Iran) and Reunion Island. The non-invasive ‘AUS’ species endemic to Australia was used as an outgroup. The main findings of this study show that the BSP for the Sub-Saharan African MED population is different from that observed in MED populations from the Mediterranean Basin, suggesting evolution under a different set of environmental conditions. For MED, the effective size of the African (Burkina Faso) population showed a rapid expansion ≈250,000-310,000 years ago (YA), preceded by a period of slower growth. 
The European MED populations (i.e., Spain, France, Croatia, and Greece) showed a single burst of expansion at ≈160,000-200,000 YA. The MEAM1 populations from Israel and Italy and the ones from Iran and Turkmenistan are similar as they both show the earlier expansion at ≈250,000-300,000 YA. The single IO population lacked the latter expansion but had the earlier one. This pattern is shared with the Sub-Saharan African (Burkina Faso) MED, suggesting IO also faced a similar history of environmental change, which seems plausible given their relatively close geographical distributions. In conclusion, populations within the invasive species MED and MEAM1 exhibited signatures of population expansion lacking in non-invasive species (IO and AUS) during the Pleistocene, a geological epoch marked by repeated climatic oscillations with cycles of glacial and interglacial periods. These expansions strongly suggested the potential of some Bemisia species’ genomes to affect their adaptability and invasiveness.

Keywords: whitefly, RADseq, invasive species, SNP, climate change

39 Archaeoseismological Evidence for a Possible Destructive Earthquake in the 7th Century AD at the Ancient Sites of Bulla Regia and Chemtou (NW Tunisia): Seismotectonic and Structural Implications

Authors: Abdelkader Soumaya, Noureddine Ben Ayed, Ali Kadri, Said Maouche, Hayet Khayati Ammar, Ahmed Braham

Abstract:

The historic sites of Bulla Regia and Chemtou are among the most important archaeological monuments in northwestern Tunisia, which flourished as large, wealthy settlements during the Roman and Byzantine periods (2nd to 7th centuries AD). An archaeoseismological study provides the first indications about the impact of a possible ancient strong earthquake in the destruction of these cities. Based on previous archaeological excavation results, including numismatic evidence, pottery, economic meltdown and urban transformation, the abrupt ruin and destruction of the cities of Bulla Regia and Chemtou can be bracketed between 613 and 647 AD. In this study, we carried out the first attempt to use the analysis of earthquake archaeological effects (EAEs) that were observed during our field investigations in these two historic cities. The damage includes different types of EAEs: folds on regular pavements, displaced and deformed vaults, folded walls, tilted walls, collapsed keystones in arches, dipping broken corners, displaced-fallen columns, block extrusions in walls, penetrative fractures in brick-made walls and open fractures on regular pavements. These deformations are spread over 10 different sectors or buildings and include 56 measured EAEs. The structural analysis of the identified EAEs can indicate an ancient destructive earthquake that probably destroyed the Bulla Regia and Chemtou archaeological sites. We then analyzed these measurements using structural geological analysis to obtain the maximum horizontal strain of the ground (e.g., Sₕₘₐₓ) on each building-oriented damage. After the collection and analysis of these strain datasets, we proceed to plot the orientation of Sₕₘₐₓ trajectories on the map of the archaeological site (Bulla Regia). 
We concluded that the obtained Sₕₘₐₓ trajectories within this site could then be related to the mean direction of ground motion (oscillatory movement of the ground) triggered by a seismic event, as documented for some historical earthquakes across the world. These Sₕₘₐₓ orientations closely match the current active stress field, as highlighted by some instrumental events in northern Tunisia. In terms of the seismic source, we strongly suggest that the reactivation of a neotectonic strike-slip fault trending N50E must be responsible for this probable historic earthquake and the recent instrumental seismicity in this area. This fault segment, affecting the folded quaternary deposits south of Jebel Rebia, passes through the monument of Bulla Regia. Stress inversion of the observed and measured data along this fault shows an N150 - 160 trend of Sₕₘₐₓ under a transpressional tectonic regime, which is quite consistent with the GPS data and the state of the current stress field in this region.

Keywords: NW Tunisia, archaeoseismology, earthquake archaeological effect, Bulla Regia - Chemtou, seismotectonic, neotectonic fault

38 Automatic Adult Age Estimation Using Deep Learning of the ResNeXt Model Based on CT Reconstruction Images of the Costal Cartilage

Authors: Ting Lu, Ya-Ru Diao, Fei Fan, Ye Xue, Lei Shi, Xian-e Tang, Meng-jun Zhan, Zhen-hua Deng

Abstract:

Accurate adult age estimation (AAE) is a significant and challenging task in forensic and archeology fields. Attempts have been made to explore optimal adult age metrics, and the rib is considered a potential age marker. The traditional way is to extract age-related features designed by experts from macroscopic or radiological images followed by classification or regression analysis. Those results still have not met the high-level requirements for practice, and the limitation of using feature design and manual extraction methods is loss of information since the features are likely not designed explicitly for extracting information relevant to age. Deep learning (DL) has recently garnered much interest in imaging learning and computer vision. It enables learning features that are important without a prior bias or hypothesis and could be supportive of AAE. This study aimed to develop DL models for AAE based on CT images and compare their performance to the manual visual scoring method. Chest CT data were reconstructed using volume rendering (VR). Retrospective data of 2500 patients aged 20.00-69.99 years were obtained between December 2019 and September 2021. Five-fold cross-validation was performed, and datasets were randomly split into training and validation sets in a 4:1 ratio for each fold. Before feeding the inputs into networks, all images were augmented with random rotation and vertical flip, normalized, and resized to 224×224 pixels. ResNeXt was chosen as the DL baseline due to its advantages of higher efficiency and accuracy in image classification. Mean absolute error (MAE) was the primary parameter. Independent data from 100 patients acquired between March and April 2022 were used as a test set. The manual method completely followed the prior study, which reported the lowest MAEs (5.31 in males and 6.72 in females) among similar studies. CT data and VR images were used. 
The radiation density of the first costal cartilage was recorded using CT data on the workstation. The osseous and calcified projections of the first to seventh costal cartilages were scored based on VR images using an eight-stage staging technique. According to the results of the prior study, the optimal models were the decision tree regression model in males and the stepwise multiple linear regression equation in females. Predicted ages of the test set were calculated separately using different models by sex. A total of 2600 patients (training and validation sets, mean age=45.19 years±14.20 [SD]; test set, mean age=46.57±9.66) were evaluated in this study. In ResNeXt model training, MAEs of 3.95 in males and 3.65 in females were obtained. Based on the test set, DL achieved MAEs of 4.05 in males and 4.54 in females, which were far better than the MAEs of 8.90 and 6.42, respectively, for the manual method. These results showed that the ResNeXt DL model outperformed the manual method in AAE based on CT reconstruction of the costal cartilage, and the developed system may be a supportive tool for AAE.
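
The evaluation protocol in this abstract (five-fold cross-validation with a 4:1 training/validation split, scored by mean absolute error) can be sketched in a few lines. This is an assumption-laden illustration: the function names, the seed, and the example ages are hypothetical, and no ResNeXt training is performed here.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def mean_absolute_error(true_ages: Sequence[float],
                        predicted_ages: Sequence[float]) -> float:
    """Average absolute difference between true and predicted ages (years)."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def five_fold_splits(n_samples: int,
                     seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training, validation) index lists; each fold validates on 1/5."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::5] for i in range(5)]
    for k in range(5):
        validation = folds[k]
        training = [idx for j, fold in enumerate(folds) if j != k for idx in fold]
        yield training, validation


# 2500 matches the cohort size in the abstract; the ages below are made up.
splits = list(five_fold_splits(2500))
example_mae = mean_absolute_error([30.0, 40.0, 50.0, 60.0],
                                  [32.0, 37.0, 50.0, 61.0])
```

With 2500 samples, every fold trains on 2000 indices and validates on 500, which is the 4:1 ratio stated in the abstract.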

Keywords: forensic anthropology, age determination by the skeleton, costal cartilage, CT, deep learning

37 Quantification of Magnetic Resonance Elastography for Tissue Shear Modulus using U-Net Trained with Finite-Differential Time-Domain Simulation

Authors: Jiaying Zhang, Xin Mu, Chang Ni, Jeff L. Zhang

Abstract:

Magnetic resonance elastography (MRE) non-invasively assesses tissue elastic properties, such as shear modulus, by measuring tissue’s displacement in response to mechanical waves. The estimated metrics on tissue elasticity or stiffness have been shown to be valuable for monitoring physiologic or pathophysiologic status of tissue, such as a tumor or fatty liver. To quantify tissue shear modulus from MRE-acquired displacements (essentially an inverse problem), multiple approaches have been proposed, including Local Frequency Estimation (LFE) and Direct Inversion (DI). However, one common problem with these methods is that the estimates are severely noise-sensitive due to either the inverse-problem nature or noise propagation in the pixel-by-pixel process. With the advent of deep learning (DL) and its promise in solving inverse problems, a few groups in the field of MRE have explored the feasibility of using DL methods for quantifying shear modulus from MRE data. Most of the groups chose to use real MRE data for DL model training and to cut training images into smaller patches, which enriches feature characteristics of training data but inevitably increases computation time and results in outcomes with patched patterns. In this study, simulated wave images generated by Finite Differential Time Domain (FDTD) simulation are used for network training, and U-Net is used to extract features from each training image without cutting it into patches. The use of simulated data for model training has the flexibility of customizing training datasets to match specific applications. The proposed method aimed to estimate tissue shear modulus from MRE data with high robustness to noise and high model-training efficiency. Specifically, a set of 3000 maps of shear modulus (with a range of 1 kPa to 15 kPa) containing randomly positioned objects were simulated, and their corresponding wave images were generated. 
The two types of data were fed into the training of a U-Net model as its output and input, respectively. For an independently simulated set of 1000 images, the performance of the proposed method against DI and LFE was compared by the relative errors (root mean square error or RMSE divided by averaged shear modulus) between the true shear modulus map and the estimated ones. The results showed that the estimated shear modulus by the proposed method achieved a relative error of 4.91%±0.66%, substantially lower than 78.20%±1.11% by LFE. Using simulated data, the proposed method significantly outperformed LFE and DI in resilience to increasing noise levels and in resolving fine changes of shear modulus. The feasibility of the proposed method was also tested on MRE data acquired from phantoms and from human calf muscles, resulting in maps of shear modulus with low noise. In future work, the method’s performance on phantom and its repeatability on human data will be tested in a more quantitative manner. In conclusion, the proposed method showed much promise in quantifying tissue shear modulus from MRE with high robustness and efficiency.
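
The relative error used above is defined in the abstract as RMSE divided by the averaged shear modulus. A minimal sketch of that metric follows; it assumes the average is taken over the true map, and the 2×2 maps are placeholder values in kPa, not simulation outputs.

```python
import math
from typing import List, Sequence


def relative_error(true_map: Sequence[Sequence[float]],
                   estimated_map: Sequence[Sequence[float]]) -> float:
    """RMSE between the two maps divided by the mean of the true map."""
    true_values: List[float] = [v for row in true_map for v in row]
    est_values: List[float] = [v for row in estimated_map for v in row]
    if len(true_values) != len(est_values) or not true_values:
        raise ValueError("maps must be non-empty and of equal size")
    mse = sum((e - t) ** 2 for e, t in zip(est_values, true_values)) / len(true_values)
    mean_true = sum(true_values) / len(true_values)
    return math.sqrt(mse) / mean_true


# Placeholder shear modulus maps in kPa (illustrative only).
true_modulus = [[1.0, 3.0], [5.0, 7.0]]
estimated_modulus = [[3.0, 1.0], [7.0, 5.0]]  # every pixel off by 2 kPa
rel_err = relative_error(true_modulus, estimated_modulus)
```

Here the RMSE is 2 kPa and the mean true modulus is 4 kPa, so the relative error is 0.5, i.e. 50%.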

Keywords: deep learning, magnetic resonance elastography, magnetic resonance imaging, shear modulus estimation

36 Methodology to Achieve Non-Cooperative Target Identification Using High Resolution Range Profiles

Authors: Olga Hernán-Vega, Patricia López-Rodríguez, David Escot-Bocanegra, Raúl Fernández-Recio, Ignacio Bravo

Abstract:

Non-Cooperative Target Identification has become a key research domain in the Defense industry since it provides the ability to recognize targets at long distance and under any weather condition. High Resolution Range Profiles, one-dimensional radar images where the reflectivity of a target is projected onto the radar line of sight, are widely used for identification of flying targets. To address this problem, an approach to Non-Cooperative Target Identification based on applying Singular Value Decomposition to a matrix of range profiles is presented. Target Identification based on one-dimensional radar images compares a collection of profiles of a given target, namely the test set, with the profiles included in a pre-loaded database, namely the training set. The classification is improved by using Singular Value Decomposition since it allows each aircraft to be modelled as a subspace and recognition to be accomplished in a transformed domain where the main features are easier to extract, hence reducing unwanted information such as noise. Singular Value Decomposition makes it possible to define a signal subspace, which contains the highest percentage of the energy, and a noise subspace, which is discarded. This way, only the valuable information of each target is used in the recognition process. The identification algorithm is based on finding the target that minimizes the angle between subspaces and takes place in a transformed domain. Two metrics, F1 and F2, based on Singular Value Decomposition are used in the identification process. In the case of F2, the angle is weighted, since the top vectors set the importance in the contribution to the formation of a target signal; in contrast, F1 simply shows the evolution of the unweighted angle. 
In order to have a wide database of radar signatures and evaluate the performance, range profiles are obtained through numerical simulation of seven civil aircraft at defined trajectories taken from an actual measurement. Taking into account the nature of the datasets, the main drawback of using simulated profiles instead of actual measured profiles is that the former imply an ideal identification scenario, since measured profiles suffer from noise, clutter and other unwanted information and simulated profiles do not. In this case, the test and training samples have a similar nature and usually a similarly high signal-to-noise ratio, so, to assess the feasibility of the approach, noise is added before the creation of the test set. The identification results obtained with the unweighted and weighted metrics are analysed to demonstrate which algorithm provides the best robustness against noise in a realistic scenario. To confirm the validity of the methodology, identification experiments on profiles coming from electromagnetic simulations are conducted, revealing promising results. Considering the dissimilarities between the test and training sets when noise is added, the recognition performance improves when weighting is applied. Future experiments with larger sets are expected to be conducted with the aim of finally using actual profiles as test sets in a real hostile situation.
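
The general idea of subspace-based identification can be sketched with NumPy: keep the leading left singular vectors that hold most of the energy, then pick the database target whose subspace makes the smallest principal angle with the test subspace. This is a generic illustration only. The abstract does not define its F1 and F2 metrics, so they are not reproduced; the 95% energy threshold, the function names, and the toy 4-bin profiles are all assumptions.

```python
from typing import Dict

import numpy as np


def signal_subspace(profiles: np.ndarray, energy: float = 0.95) -> np.ndarray:
    """Leading left singular vectors holding `energy` of the total energy.

    `profiles` has one range profile per column (range bins x profiles).
    """
    u, s, _ = np.linalg.svd(profiles, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return u[:, :k]


def smallest_principal_angle(a: np.ndarray, b: np.ndarray) -> float:
    """Smallest principal angle (radians) between two orthonormal bases."""
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    return float(np.arccos(np.clip(cosines.max(), -1.0, 1.0)))


def identify(test_profiles: np.ndarray, database: Dict[str, np.ndarray],
             energy: float = 0.95) -> str:
    """Return the database key whose subspace is closest to the test set."""
    test_basis = signal_subspace(test_profiles, energy)
    return min(
        database,
        key=lambda name: smallest_principal_angle(
            test_basis, signal_subspace(database[name], energy)),
    )


# Toy data: two hypothetical targets with orthogonal signatures.
sig_a = np.array([1.0, 0.0, 0.0, 0.0])
sig_b = np.array([0.0, 1.0, 0.0, 0.0])
database = {
    "A": np.outer(sig_a, [1.0, 2.0, 3.0]),
    "B": np.outer(sig_b, [1.0, 2.0, 3.0]),
}
# Test profiles of target A with a small deterministic perturbation.
test_set = np.outer(sig_a, [2.0, 1.0, 2.0]) + 0.01 * np.ones((4, 3))

identified = identify(test_set, database)
angle_a = smallest_principal_angle(signal_subspace(test_set),
                                   signal_subspace(database["A"]))
angle_b = smallest_principal_angle(signal_subspace(test_set),
                                   signal_subspace(database["B"]))
```

A weighted variant in the spirit of the abstract's second metric would scale each principal direction by its singular value before comparing, but the exact weighting is not specified in the source.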

Keywords: HRRP, NCTI, simulated/synthetic database, SVD

35 Developing a Deep Understanding of the Immune Response in Hepatitis B Virus Infected Patients Using a Knowledge Driven Approach

Authors: Hanan Begali, Shahi Dost, Annett Ziegler, Markus Cornberg, Maria-Esther Vidal, Anke R. M. Kraft

Abstract:

Chronic hepatitis B virus (HBV) infection can be treated with, for example, nucleos(t)ide analogs (NAs), which inhibit HBV replication. However, they have hardly any influence on the functional cure of HBV, which is defined by hepatitis B surface antigen (HBsAg) loss. NAs need to be taken life-long, and such treatment is not available for all patients worldwide. Additionally, NA-treated patients are still at risk of developing cirrhosis, liver failure, or hepatocellular carcinoma (HCC). Although each patient has the same components of the immune system, immune responses vary between patients. Therefore, a deeper understanding of the immune response against HBV in different patients is necessary to understand the parameters leading to HBV cure and to use this knowledge to optimize HBV therapies. This requires seamless integration of an enormous amount of diverse and fine-grained data from viral markers, e.g., hepatitis B core-related antigen (HBcrAg) and hepatitis B surface antigen (HBsAg). The data integration system relies on the assumption that profiling human immune systems requires the analysis of various variables (e.g., demographic data, treatments, pre-existing conditions, immune cell response, or HLA-typing) rather than only one. However, the values of these variables are collected independently. They are presented in a myriad of formats, e.g., Excel files, textual descriptions, lab book notes, and images of flow cytometry dot plots. Additionally, patients can be identified differently in these analyses. This heterogeneity complicates the integration of variables, as data management techniques are needed to create a unified view in which individual formats and identifiers are transparent when profiling the human immune systems. 
The proposed study (HBsRE) aims at integrating heterogeneous data sets of 87 chronically HBV-infected patients, e.g., clinical data, immune cell response, and HLA-typing, with knowledge encoded in biomedical ontologies and open-source databases into a knowledge-driven framework. This new technique enables us to harmonize and standardize heterogeneous datasets in the defined modeling of the data integration system, which will be evaluated in the knowledge graph (KG). KGs are data structures that represent the knowledge and data as factual statements using a graph data model. Finally, the analytic data model will be applied on top of KG in order to develop a deeper understanding of the immune profiles among various patients and to evaluate factors playing a role in a holistic profile of patients with HBsAg level loss. Additionally, our objective is to utilize this unified approach to stratify patients for new effective treatments. This study is developed in the context of the project “Transforming big data into knowledge: for deep immune profiling in vaccination, infectious diseases, and transplantation (ImProVIT)”, which is a multidisciplinary team composed of computer scientists, infection biologists, and immunologists.
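
To make the phrase "factual statements using a graph data model" concrete, the sketch below stores a tiny knowledge graph as subject-predicate-object triples and filters it by pattern. Everything in it is hypothetical: the patient identifiers, predicate names, and values are invented for illustration and do not come from the HBsRE data or any real ontology.

```python
from typing import List, NamedTuple, Optional, Set


class Triple(NamedTuple):
    """One factual statement: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


def match(graph: Set[Triple], subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return triples matching every field that is not None, sorted."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# Hypothetical statements; identifiers and vocabulary are made up.
graph = {
    Triple("patient:001", "hasTreatment", "NA"),
    Triple("patient:001", "hasHBsAgStatus", "loss"),
    Triple("patient:002", "hasTreatment", "NA"),
}
treated = match(graph, predicate="hasTreatment", obj="NA")
treated_subjects = [t.subject for t in treated]
```

Mapping every source format onto one identifier scheme for subjects is what lets statements from different files refer to the same patient.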

Keywords: chronic hepatitis B infection, immune response, knowledge graphs, ontology

34 Comparison of Machine Learning-Based Models for Predicting Streptococcus pyogenes Virulence Factors and Antimicrobial Resistance

Authors: Fernanda Bravo Cornejo, Camilo Cerda Sarabia, Belén Díaz Díaz, Diego Santibañez Oyarce, Esteban Gómez Terán, Hugo Osses Prado, Raúl Caulier-Cisterna, Jorge Vergara-Quezada, Ana Moya-Beltrán

Abstract:

Streptococcus pyogenes is a gram-positive bacterium involved in a wide range of diseases and is a major human-specific bacterial pathogen. In Chile, this year the 'Ministerio de Salud' declared an alert due to the increase in strains throughout the year. This increase can be attributed to a multitude of factors, including antimicrobial resistance (AMR) and Virulence Factors (VF). Understanding these VF and AMR is crucial for developing effective strategies and improving public health responses. Moreover, experimental identification and characterization of these pathogenic mechanisms are labor-intensive and time-consuming. Therefore, new computational methods are required to provide robust techniques for accelerating this identification. Advances in Machine Learning (ML) algorithms represent an opportunity to refine and accelerate the discovery of VF associated with Streptococcus pyogenes. In this work, we evaluate the accuracy of various machine learning models in predicting the virulence factors and antimicrobial resistance of Streptococcus pyogenes, with the objective of providing new methods for identifying the pathogenic mechanisms of this organism. Our comprehensive approach involved the download of 32,798 GenBank files of S. pyogenes from the NCBI dataset, coupled with the incorporation of data from the Virulence Factor Database (VFDB) and the Comprehensive Antibiotic Resistance Database (CARD), which contains AMR gene sequences and resistance profiles. These datasets provided labeled examples of both virulent and non-virulent genes, enabling a robust foundation for feature extraction and model training. We employed preprocessing, characterization and feature extraction techniques on primary nucleotide/amino acid sequences and selected the optimal ones for model training. The feature set was constructed using sequence-based descriptors (e.g., k-mers and one-hot encoding), and functional annotations based on database prediction. 
The ML models compared are logistic regression, decision trees, support vector machines, and neural networks, among others. The results of this work show some differences in accuracy between the algorithms; these differences allow us to identify different aspects that represent unique opportunities for a more precise and efficient characterization and identification of VF and AMR. This comparative analysis underscores the value of integrating machine learning techniques in predicting S. pyogenes virulence and AMR, offering potential pathways for more effective diagnostic and therapeutic strategies. Future work will focus on incorporating additional omics data, such as transcriptomics, and exploring advanced deep learning models to further enhance predictive capabilities.
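
The two sequence-based descriptors named in the abstract, k-mers and one-hot encoding, can be illustrated on a short nucleotide string. This is a generic sketch, not the authors' pipeline: the function names, the choice of k = 2, and the toy sequence are assumptions, and symbols outside the alphabet are encoded as all-zero rows by design choice.

```python
from collections import Counter
from typing import List


def kmer_counts(sequence: str, k: int) -> "Counter[str]":
    """Count every overlapping substring of length k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def one_hot(sequence: str, alphabet: str = "ACGT") -> List[List[int]]:
    """One row per symbol; unknown symbols (e.g. 'N') become all zeros."""
    return [[1 if symbol == letter else 0 for letter in alphabet]
            for symbol in sequence]


# Toy nucleotide sequence, for illustration only.
toy_sequence = "ACGTAC"
dimer_counts = kmer_counts(toy_sequence, 2)
encoded = one_hot("ACN")
```

K-mer counts give a fixed-length feature vector regardless of sequence length, whereas one-hot encoding preserves position and so suits models that read the sequence in order.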

Keywords: antibiotic resistance, Streptococcus pyogenes, virulence factors, machine learning

Procedia PDF Downloads 37

33 The Senior Traveler Market as a Competitive Advantage for the Luxury Hotel Sector in the UK Post-Pandemic

Authors: Feyi Olorunshola

Abstract:

Over the last few years, the senior travel market has been noted for its potential in the wider tourism industry. The tourism sector comprises hotels and hospitality, travel, transportation, and several other subdivisions that together make it economically viable. In particular, hotels attract a substantial part of tourism expenditure because, when people plan to travel, suitable accommodation for relaxation, dining, entertainment and so on is paramount to their decision-making. The global retail value of the hotel sector as of 2018 was significant for tourism. Yet, despite the importance of hotels to the tourism industry at large, very few empirical studies establish how this sector can leverage the senior demographic to achieve competitive advantage. Studies on the mature market have predominantly focused on destination tourism, with limited investigation of the hotel sector, which makes a significant contribution to tourism. Also, several scholarly studies have demonstrated the importance of the senior travel market to hotels, yet very little empirical research has explored the driving factors that will become the accepted new normal for this niche segment post-pandemic. The hotel sector already operates in a highly saturated business environment, and on top of this pre-existing challenge, the ongoing global health outbreak has put the sector in an even more vulnerable position. Therefore, hotels, especially those in the full-service luxury category, must evolve rapidly to survive in the current business environment. Hotels can no longer rely on corporate travelers to generate higher revenue: since the onset of the pandemic in 2020, many organizations have moved to conducting their business online, so hotels need to anticipate a significant drop in business travelers.
However, the rooms and the rest of the facilities must be occupied to keep the business operating. The way forward for hotels lies in the leisure sector, but the question now is which traveler demographics to focus on: in this case, seniors, who have been repeatedly recognized as a lucrative market because of increased discretionary income, availability of time, and global population trends. To achieve the study objectives, a mixed-method approach will be utilized, drawing on both qualitative (netnography) and quantitative (survey) methods, cognitive and decision-making theories (means-end chain), and competitive theories to identify the salient drivers explaining senior hotel choice and their influence on decision-making. The target population is repeat senior travelers aged 65 years and over who are UK residents or come from the top tourist markets to the UK (USA, Germany, and France). Structural equation modelling will be employed to analyze the datasets. The theoretical implication is the development of new concepts using a robust research design, as well as the extension of existing frameworks to hotel studies. Practically, the study will provide hotel management with the latest information to design a competitive marketing strategy and activities to target the mature market post-pandemic and over the long term.

Keywords: competitive advantage, covid-19, full-service hotel, five-star, luxury hotels

Procedia PDF Downloads 123

32 MEIOSIS: Museum Specimens Shed Light in Biodiversity Shrinkage

Authors: Zografou Konstantina, Anagnostellis Konstantinos, Brokaki Marina, Kaltsouni Eleftheria, Dimaki Maria, Kati Vassiliki

Abstract:

Body size is crucial to ecology, influencing everything from individual reproductive success to the dynamics of communities and ecosystems. Understanding how temperature affects variations in body size is vital for both theoretical and practical purposes, as changes in size can modify trophic interactions by altering predator-prey size ratios and changing the distribution and transfer of biomass, which ultimately impacts food web stability and ecosystem functioning. Notably, a decrease in body size is frequently mentioned as the third "universal" response to climate warming, alongside shifts in distribution and changes in phenology. This trend is backed by ecological theories like the temperature-size rule (TSR) and Bergmann's rule, which have been observed in numerous species, indicating that many species are likely to shrink in size as temperatures rise. However, the thermal responses related to body size are still contradictory, and further exploration is needed. To tackle this challenge, we developed the MEIOSIS project, aimed at providing valuable insights into the relationship between the body size of species, species’ traits, environmental factors, and their response to climate change. We combined a digitized collection of butterflies from the Swiss Federal Institute of Technology in Zürich with our newly digitized butterfly collection from Goulandris Natural History Museum in Greece to analyse trends in time. For a total of 23868 images, the length of the right forewing was measured using ImageJ software. Each forewing was measured from the point at which the wing meets the thorax to the apex of the wing. The forewing length of museum specimens has been shown to have a strong correlation with wing surface area and has been utilized in prior studies as a proxy for overall body size. Temperature data corresponding to the years of collection were also incorporated into the datasets. 
A second dataset was generated when a custom computer vision tool was implemented for the automated morphological measurement of samples in the digitized Zürich collection. Using the second dataset, we corrected the manual ImageJ measurements, and a final dataset containing 31922 samples was used for analysis. Setting time as a smooth term, species identity as a random factor, and right forewing length (a proxy for body size) as the response variable, we ran a global model for a maximum period of 110 years (1900 – 2010). Then, we investigated functional variability between different terrestrial biomes in a second model. Both models confirmed our initial hypothesis and showed a decreasing trend in body size over the years. We expect that this first output can serve as basic data for the next challenge, i.e., to identify the ecological traits that influence species' temperature-size responses, enabling us to predict the direction and intensity of a species' reaction to rising temperatures more accurately.

Keywords: butterflies, shrinking body size, museum specimens, climate change

Procedia PDF Downloads 13

31 Food Consumption and Adaptation to Climate Change: Evidence from Ghana

Authors: Frank Adusah-Poku, John Bosco Dramani, Prince Boakye Frimpong

Abstract:

Climate change is considered a principal threat to human existence and livelihood. The persistence and intensity of droughts and floods in recent years have adversely affected food production systems and value chains, making it impossible to end global hunger by 2030. Thus, this study aims to examine the effect of climate change on food consumption for both farm and non-farm households in Ghana. An important focus of the analysis is to investigate how climate change affects alternative dimensions of food security, examine the extent to which these effects vary across heterogeneous groups, and explore the channels through which climate change affects food consumption. Finally, we conducted a pilot study to understand the significance of farm and non-farm diversification measures in reducing the harmful impact of climate change on farm households. The approach of this article is to use two secondary and one primary datasets. The first secondary dataset is the Ghana Socioeconomic Panel Survey (GSPS). The GSPS is a household panel dataset collected during the period 2009 to 2019. The second dataset is monthly district rainfall and temperature gridded data from the Ghana Meteorological Agency. This data was matched to the GSPS dataset at the district level. Finally, the primary data was obtained from a survey of farm and non-farm adaptation practices used by farmers in three regions in Northern Ghana. The study employed the household fixed effects model to estimate the effect of climate change (measured by temperature and rainfall) on food consumption in Ghana. Again, it used the spatial and temporal variation in temperature and rainfall across the districts in Ghana to estimate the household-level model. Evidence of potential mechanisms through which climate change affects food consumption was explored using two steps. First, the potential mechanism variables were regressed on temperature, rainfall, and the control variables. 
In the second and final step, the potential mechanism variables were included as extra covariates in the first model. The results revealed that extreme average temperature and drought had caused a decrease in food consumption as well as reduced the intake of important food nutrients such as carbohydrates, protein and vitamins. The results further indicated that low rainfall increased food insecurity among households with no education compared with those with primary and secondary education. Again, non-farm activity and silos have been revealed as the transmission pathways through which the effect of climate change on farm households can be moderated. Finally, the results indicated over 90% of the small-holder farmers interviewed had no farm diversification adaptation strategies for climate change, and a little over 50% of the farmers owned unskilled or manual non-farm economic ventures. This makes it very difficult for the majority of the farmers to withstand climate-related shocks. These findings suggest that achieving the Sustainable Development Goal of Zero Hunger by 2030 needs an integrated approach, such as reducing the over-reliance on rainfed agriculture, educating farmers, and implementing non-farm interventions to improve food consumption in Ghana.
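
The household fixed effects model and the two-step mechanism analysis described above can be written out as a sketch. The notation is assumed for illustration and is not taken from the paper: C is food consumption of household h in district d at time t, T and R are district temperature and rainfall, X is a vector of controls, M is a candidate mechanism variable (for example non-farm activity), and the household-specific intercepts are the fixed effects.

```latex
% Baseline household fixed effects model (symbols are illustrative assumptions)
C_{hdt} = \alpha_h + \beta_1 T_{dt} + \beta_2 R_{dt} + X_{hdt}'\gamma + \varepsilon_{hdt}

% Step 1: regress the mechanism variable on climate and controls
M_{hdt} = a_h + \theta_1 T_{dt} + \theta_2 R_{dt} + X_{hdt}'\delta + u_{hdt}

% Step 2: add the mechanism variable to the baseline model as an extra covariate
C_{hdt} = \alpha_h + \tilde{\beta}_1 T_{dt} + \tilde{\beta}_2 R_{dt} + \phi M_{hdt} + X_{hdt}'\gamma + \varepsilon_{hdt}
```

Under this reading, a variable is a plausible transmission pathway when the climate coefficients in step 1 are significant and the climate coefficients in step 2 move toward zero relative to the baseline once the mechanism variable is included.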

Keywords: climate change, food consumption, Ghana, non-farm activity

Procedia PDF Downloads 13

30 Multi-Objective Optimization of Assembly Manufacturing Factory Setups

Authors: Andreas Lind, Aitor Iriondo Pascual, Dan Hogberg, Lars Hanson

Abstract:

Factory setup lifecycles are most often described and prepared in CAD environments; the preparation is based on experience and inputs from several cross-disciplinary processes. Early in the factory setup preparation, a so-called block layout is created. The intention is to describe a high-level view of the intended factory setup and to claim area reservations and allocations. Factory areas are then blocked, i.e., targeted to be used for specific intended resources and processes, later redefined with detailed factory setup layouts. Each detailed layout is based on the block layout and inputs from cross-disciplinary preparation processes, such as manufacturing sequence, productivity, workers’ workplace requirements, and resource setup preparation. However, this activity is often not carried out with all variables considered simultaneously, which might entail a risk of sub-optimizing the detailed layout based on manual decisions. Therefore, this work aims to realize a digital method for assembly manufacturing layout planning where productivity, area utilization, and ergonomics can be considered simultaneously in a cross-disciplinary manner. The purpose of the digital method is to support engineers in finding optimized designs of detailed layouts for assembly manufacturing factories, thereby facilitating better decisions regarding setups of future factories. Input datasets are company-specific descriptions of required dimensions for specific area reservations, such as defined dimensions of a worker’s workplace, material façades, aisles, and the sequence to realize the product assembly manufacturing process. To test and iteratively develop the digital method, a demonstrator has been developed with an adaptation of existing software that simulates and proposes optimized designs of detailed layouts. 
Since the method is to consider productivity, ergonomics, area utilization, and constraints from the automatically generated block layout, a multi-objective optimization approach is utilized. In the demonstrator, the input data are sent to the simulation software industrial path solutions (IPS). Based on the input and Lua scripts, the IPS software generates a block layout in compliance with the company’s defined dimensions of area reservations. Communication is then established between the IPS and the software EPP (Ergonomics in Productivity Platform), including intended resource descriptions, assembly manufacturing process, and manikin (digital human) resources. Using multi-objective optimization approaches, the EPP software then calculates layout proposals that are sent iteratively and simulated and rendered in IPS, following the rules and regulations defined in the block layout as well as productivity and ergonomics constraints and objectives. The software demonstrator is promising. The software can handle several parameters to optimize the detailed layout simultaneously and can put forward several proposals. It can optimize multiple parameters or weight the parameters to fine-tune the optimal result of the detailed layout. The intention of the demonstrator is to make the preparation between cross-disciplinary silos transparent and achieve a common preparation of the assembly manufacturing factory setup, thereby facilitating better decisions.
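
The two ways of handling several objectives mentioned above, optimizing multiple parameters at once or weighting them, can be illustrated with a small sketch. It is not the EPP or IPS implementation; the layout names, the objective values, and the weights are hypothetical, and all three objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(layouts):
    """Keep only the layouts that no other layout dominates."""
    return {
        name: obj
        for name, obj in layouts.items()
        if not any(dominates(other, obj) for other in layouts.values())
    }


def weighted_score(objectives, weights):
    """Collapse several objectives into one number (lower is better)."""
    return sum(w * v for w, v in zip(weights, objectives))


# Hypothetical proposals: (cycle time in s, floor area in m^2, ergonomic risk score)
layouts = {
    "A": (10, 50, 3),
    "B": (12, 40, 2),
    "C": (13, 55, 4),  # worse than A on every objective
}

front = pareto_front(layouts)

# Hypothetical weights that emphasise ergonomics over area
weights = (1.0, 0.1, 2.0)
best = min(front, key=lambda name: weighted_score(front[name], weights))
```

The Pareto front keeps every trade-off that cannot be improved on one objective without worsening another, which corresponds to putting forward several proposals; the weighted score then picks a single proposal once the engineer states priorities.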

Keywords: factory setup, multi-objective, optimization, simulation

Procedia PDF Downloads 153

29 Deep Learning-Based Classification of 3D CT Scans with Real Clinical Data; Impact of Image Format

Authors: Maryam Fallahpoor, Biswajeet Pradhan

Abstract:

Background: Artificial intelligence (AI) serves as a valuable tool in mitigating the scarcity of human resources required for the evaluation and categorization of vast quantities of medical imaging data. When AI operates with optimal precision, it minimizes the demand for human interpretations and, thereby, reduces the burden on radiologists. Among various AI approaches, deep learning (DL) stands out as it obviates the need for feature extraction, a process that can impede classification, especially with intricate datasets. The advent of DL models has ushered in a new era in medical imaging, particularly in the context of COVID-19 detection. Traditional 2D imaging techniques exhibit limitations when applied to volumetric data, such as Computed Tomography (CT) scans. Medical images predominantly exist in one of two formats: neuroimaging informatics technology initiative (NIfTI) and digital imaging and communications in medicine (DICOM). Purpose: This study aims to employ DL for the classification of COVID-19-infected pulmonary patients and normal cases based on 3D CT scans while investigating the impact of image format. Material and Methods: The dataset used for model training and testing consisted of 1245 patients from IranMehr Hospital. All scans shared a matrix size of 512 × 512, although they exhibited varying slice numbers. Consequently, after loading the DICOM CT scans, image resampling and interpolation were performed to standardize the slice count. All images underwent cropping and resampling, resulting in uniform dimensions of 128 × 128 × 60. Resolution uniformity was achieved through resampling to 1 mm × 1 mm × 1 mm, and image intensities were confined to the range of (−1000, 400) Hounsfield units (HU). For classification purposes, positive pulmonary COVID-19 involvement was designated as 1, while normal images were assigned a value of 0. Subsequently, a U-net-based lung segmentation module was applied to obtain 3D segmented lung regions. 
The pre-processing stage included normalization, zero-centering, and shuffling. Four distinct 3D CNN models (ResNet152, ResNet50, DenseNet169, and DenseNet201) were employed in this study. Results: The findings revealed that the segmentation technique yielded superior results for DICOM images, which could be attributed to the potential loss of information during the conversion of original DICOM images to NIfTI format. Notably, ResNet152 and ResNet50 exhibited the highest accuracy at 90.0%, and the same models achieved the best F1 score at 87%. ResNet152 also secured the highest Area under the Curve (AUC) at 0.932. Regarding sensitivity and specificity, DenseNet201 achieved the highest values at 93% and 96%, respectively. Conclusion: This study underscores the capacity of deep learning to classify COVID-19 pulmonary involvement using real 3D hospital data. The results highlight the significance of employing DICOM format 3D CT images alongside appropriate pre-processing techniques when training DL models for COVID-19 detection. This approach enhances the accuracy and reliability of diagnostic systems for COVID-19 detection.
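
The intensity steps described in the methods (confining values to the range of −1000 to 400 HU, normalization, and zero-centering) can be sketched on a flat list of voxel values. This is an illustration only: the abstract does not say which normalization was used, so min-max scaling to [0, 1] is an assumption, as are the function names. A real pipeline would apply the same arithmetic to the 128 × 128 × 60 volume with an array library.

```python
HU_MIN, HU_MAX = -1000.0, 400.0  # intensity window stated in the abstract


def window_and_normalize(hu_values):
    """Clip Hounsfield units to the window, then scale them to [0, 1].

    Assumption: min-max scaling over the fixed window, not per-scan statistics.
    """
    clipped = [min(max(v, HU_MIN), HU_MAX) for v in hu_values]
    return [(v - HU_MIN) / (HU_MAX - HU_MIN) for v in clipped]


def zero_center(values):
    """Subtract the mean so that the intensities are centred on zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]
```

For example, `window_and_normalize([-2000, -300, 1500])` maps values below the window to 0.0, the window midpoint to 0.5, and values above the window to 1.0.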

Keywords: deep learning, COVID-19 detection, NIfTI format, DICOM format

Procedia PDF Downloads 89

28 CD97 and Its Role in Glioblastoma Stem Cell Self-Renewal

Authors: Niklas Ravn-Boess, Nainita Bhowmick, Takamitsu Hattori, Shohei Koide, Christopher Park, Dimitris Placantonakis

Abstract:

Background: Glioblastoma (GBM) is the most common and deadly primary brain malignancy in adults. Tumor propagation, brain invasion, and resistance to therapy critically depend on GBM stem-like cells (GSCs); however, the mechanisms that regulate GSC self-renewal are incompletely understood. Given the aggressiveness and poor prognosis of GBM, it is imperative to find biomarkers that could also translate into novel drug targets. Along these lines, we have identified a cell surface antigen, CD97 (ADGRE5), an adhesion G protein-coupled receptor (GPCR), that is expressed on GBM cells but is absent from non-neoplastic brain tissue. CD97 has been shown to promote invasiveness, angiogenesis, and migration in several human cancers, but its frequency of expression and functional role in regulating GBM growth and survival, and its potential as a therapeutic target has not been investigated. Design: We assessed CD97 mRNA and protein expression in patient derived GBM samples and cell lines using publicly available RNA-sequencing datasets and flow cytometry, respectively. To assess CD97 function, we generated shRNA lentiviral constructs that target a sequence in the CD97 extracellular domain (ECD). A scrambled shRNA (scr) with no predicted targets in the genome was used as a control. We evaluated CD97 shRNA lentivirally transduced GBM cells for Ki67, Annexin V, and DAPI. We also tested CD97 KD cells for their ability to self-renew using clonogenic tumorsphere formation assays. Further, we utilized synthetic Abs (sAbs) generated against the ECD of CD97 to test for potential antitumor effects using patient-derived GBM cell lines. Results: CD97 mRNA expression was expressed at high levels in all GBM samples available in the TCGA cohort. We found high levels of surface CD97 protein expression in 6/6 patient-derived GBM cell cultures, but not human neural stem cells. Flow cytometry confirmed downregulation of CD97 in CD97 shRNA lentivirally transduced cells. 
CD97 KD induced a significant reduction in cell growth in 3 independent GBM cell lines representing mesenchymal and proneural subtypes, which was accompanied by reduced (~20%) Ki67 staining and increased (~30%) apoptosis. Incubation of GBM cells with sAbs (20 µg/ml) against the ECD of CD97 for 3 days induced GSC differentiation, as determined by the expression of GFAP and Tubulin. Using three unique patient-derived GBM cultures, we found that CD97 KD attenuated the ability of GBM cells to initiate sphere formation by over 300-fold, consistent with an impairment in GSC self-renewal. Conclusion: Loss of CD97 expression in patient-derived GBM cells markedly decreases proliferation, induces cell death, and reduces tumorsphere formation. sAbs against the ECD of CD97 reduce tumorsphere formation, recapitulating the phenotype of CD97 KD, suggesting that sAbs that inhibit CD97 function exhibit anti-tumor activity. Collectively, these findings indicate that CD97 is necessary for the proliferation and survival of human GBM cells and identify CD97 as a promising therapeutically targetable vulnerability in GBM.

Keywords: adhesion GPCR, CD97, GBM stem cell, glioblastoma

Procedia PDF Downloads 138

27 Performance of CALPUFF Dispersion Model for Investigation the Dispersion of the Pollutants Emitted from an Industrial Complex, Daura Refinery, to an Urban Area in Baghdad

Authors: Ramiz M. Shubbar, Dong In Lee, Hatem A. Gzar, Arthur S. Rood

Abstract:

Air pollution is one of the biggest environmental problems in Baghdad, Iraq. The Daura refinery, located near the center of Baghdad, is the largest industrial area in the city and emits enormous amounts of pollutants. Studying its gaseous pollutants and particulate matter is therefore very important for the environment and for the health of the refinery workers and the people living in the surrounding areas. Some earlier studies investigated this area, but they relied on the basic Gaussian equation implemented in simple computer programs. That work was useful and important at the time, but during the last two decades large new production units have been added to the Daura refinery, such as PU_3 (Power unit_3 (Boiler 11&12)), CDU_1 (Crude Distillation unit_70000 barrel_1), and CDU_2 (Crude Distillation unit_70000 barrel_2). It is therefore necessary to use a new, advanced model to study air pollution in the region for recent years and to calculate the monthly emission rates of pollutants from the actual amounts of fuel consumed in the production units; this may lead to accurate pollutant concentration values and a better description of dispersion and transport in the study area. To the best of the authors' knowledge, this study is the first to use and examine the CALPUFF model in Iraq. CALPUFF, an advanced non-steady-state meteorological and air quality modeling system, was applied to investigate the concentrations of SO2, NO2, CO, and PM1-10μm in areas adjacent to the Daura refinery. The CALPUFF modeling system includes three main components: CALMET (a diagnostic 3-dimensional meteorological model), CALPUFF (an air quality dispersion model), and CALPOST (a post-processing package), together with an extensive set of preprocessing programs that interface the model to standard, routinely available meteorological and geophysical datasets.
The aims of this work are to model and simulate the four pollutants (SO2, NO2, CO, and PM1-10μm) emitted from the Daura refinery over one year. Emission rates of these pollutants were calculated for twelve units, comprising thirty plants and 35 stacks, using the monthly average fuel consumption of these production units. The performance of the CALPUFF model was assessed to determine whether it is appropriate and yields predictions of good accuracy compared with the available pollutant observations. The model was run for three stability classes (stable, neutral, and unstable) to show the dispersion of the pollutants under different meteorological conditions. The simulations showed that the dispersion of these pollutants in this region depends on the stability conditions and the environment of the study area; monthly and annual averages of the pollutants were used to display their dispersion in contour maps. High pollutant concentrations were found in this area; therefore, this study recommends further investigation and analysis of the pollutants, reducing their emission rates by using modern techniques and natural gas, increasing the stack height of units, and increasing the exit gas velocity from stacks.
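
For orientation, the basic Gaussian equation that the earlier studies relied on can be sketched as a steady-state plume with ground reflection. This is not CALPUFF, which is a non-steady-state puff model, and it is not the authors' code; the function name and all input values are hypothetical. The dispersion parameters are passed in directly, because their dependence on downwind distance and stability class is not given in the abstract.

```python
import math


def plume_concentration(q, u, sigma_y, sigma_z, y, z, stack_height):
    """Steady-state Gaussian plume concentration with ground reflection (g/m^3).

    q            emission rate (g/s)
    u            wind speed at release height (m/s)
    sigma_y/z    horizontal and vertical dispersion parameters (m) at the
                 downwind distance of interest, supplied by the caller
    y, z         crosswind offset and receptor height (m)
    stack_height effective release height (m)
    """
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    vertical = (
        math.exp(-(z - stack_height) ** 2 / (2 * sigma_z ** 2))
        + math.exp(-(z + stack_height) ** 2 / (2 * sigma_z ** 2))
    )
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


# Hypothetical ground-level, centreline receptor for two effective stack heights
low_stack = plume_concentration(100.0, 3.0, 80.0, 40.0, 0.0, 0.0, 30.0)
high_stack = plume_concentration(100.0, 3.0, 80.0, 40.0, 0.0, 0.0, 60.0)
```

With everything else fixed, the taller effective release height gives a lower ground-level concentration at the same receptor, which is the intuition behind the recommendation to increase stack height and exit gas velocity.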

Keywords: CALPUFF, Daura refinery, Iraq, pollutants

Procedia PDF Downloads 198

26 Balancing Biodiversity and Agriculture: A Broad-Scale Analysis of the Land Sparing/Land Sharing Trade-Off for South African Birds

Authors: Chevonne Reynolds, Res Altwegg, Andrew Balmford, Claire N. Spottiswoode

Abstract:

Modern agriculture has revolutionised the planet’s capacity to support humans, yet has simultaneously had a greater negative impact on biodiversity than any other human activity. Balancing the demand for food with the conservation of biodiversity is one of the most pressing issues of our time. Biodiversity-friendly farming (‘land sharing’), or alternatively, separation of conservation and production activities (‘land sparing’), are proposed as two strategies for mediating the trade-off between agriculture and biodiversity. However, there is much debate regarding the efficacy of each strategy, as this trade-off has typically been addressed by short-term studies at fine spatial scales. These studies ignore processes that are relevant to biodiversity at larger scales, such as meta-population dynamics and landscape connectivity. Therefore, to better understand species' responses to agricultural land-use and provide evidence to underpin the planning of better production landscapes, we need to determine the merits of each strategy at larger scales. In South Africa, a remarkable citizen science project, the South African Bird Atlas Project 2 (SABAP2), collates an extensive dataset describing the occurrence of birds at a 5-min by 5-min grid cell resolution. We use these data, along with fine-resolution data on agricultural land-use, to determine which strategy optimises the agriculture-biodiversity trade-off in a southern African context, and at a spatial scale never considered before. To empirically test this trade-off, we model bird species population density, derived for each 5-min grid cell by Royle-Nichols single-species occupancy modelling, against both the amount and configuration of different types of agricultural production in the same 5-min grid cell. In using both production amount and configuration, we can show not only how species population densities react to changes in yield, but also describe the production landscape patterns most conducive to conservation.
Furthermore, the extent of both the SABAP2 and land-cover datasets allows us to test this trade-off across multiple regions to determine if bird populations respond in a consistent way and whether results can be extrapolated to other landscapes. We tested the land sparing/sharing trade-off for 281 bird species across three different biomes in South Africa. Overall, a higher proportion of species are classified as losers, and would benefit from land sparing. However, this proportion of loser-sparers is not consistent and varies across biomes and the different types of agricultural production. This is most likely because of differences in the intensity of agricultural land-use and the interactions between the differing types of natural vegetation and agriculture. Interestingly, we observe a higher number of species that benefit from agriculture than anticipated, suggesting that agriculture is a legitimate resource for certain bird species. Our results support those seen at smaller scales and across vastly different agricultural systems, that land sparing benefits the most species. However, our analysis suggests that land sparing needs to be implemented at spatial scales much larger than previously considered. Species persistence in agricultural landscapes will require the conservation of large tracts of land, and is an important consideration in developing countries, which are undergoing rapid agricultural development.

Keywords: agriculture, birds, land sharing, land sparing

Procedia PDF Downloads 209

25 Application of Harris Hawks Optimization Metaheuristic Algorithm and Random Forest Machine Learning Method for Long-Term Production Scheduling Problem under Uncertainty in Open-Pit Mines

Authors: Kamyar Tolouei, Ehsan Moosavi

Abstract:

In open-pit mines, the long-term production scheduling optimization problem (LTPSOP) is a complicated problem that contains constraints, large datasets, and uncertainties. Uncertainty in the output is caused by several geological, economic, or technical factors. Due to its dimensions and NP-hard nature, it is usually difficult to find an ideal solution to the LTPSOP. The optimal schedule generally restricts the ore, metal, and waste tonnages, average grades, and cash flows of each period. Past decades have witnessed important measurements of long-term production scheduling and optimal algorithms since researchers have become highly cognizant of the issue. In fact, it is not possible to consider LTPSOP as a well-solved problem. Traditional production scheduling methods in open-pit mines apply an estimated orebody model to produce optimal schedules. The smoothing result of some geostatistical estimation procedures causes most of the mine schedules and production predictions to be unrealistic and imperfect. With the expansion of simulation procedures, the risks from grade uncertainty in ore reserves can be evaluated and organized through a set of equally probable orebody realizations. In this paper, to synthesize grade uncertainty into the strategic mine schedule, a stochastic integer programming framework is presented to LTPSOP. The objective function of the model is to maximize the net present value and minimize the risk of deviation from the production targets considering grade uncertainty simultaneously while satisfying all technical constraints and operational requirements. Instead of applying one estimated orebody model as input to optimize the production schedule, a set of equally probable orebody realizations are applied to synthesize grade uncertainty in the strategic mine schedule and to produce a more profitable and risk-based production schedule. 
A mixture of metaheuristic procedures and mathematical methods paves the way to achieve an appropriate solution. This paper introduces a hybrid model combining the augmented Lagrangian relaxation (ALR) method and a metaheuristic algorithm, Harris Hawks optimization (HHO), to solve the LTPSOP under grade uncertainty. In this study, the HHO is used to update the Lagrange coefficients. In addition, a machine learning method, Random Forest, is applied to estimate gold grade in a mineral deposit. The Monte Carlo method is used as the simulation method with 20 realizations. The results indicate that the proposed hybrid approach improves considerably on the traditional methods. The outcomes were also compared with the ALR-genetic algorithm and ALR-sub-gradient. To demonstrate the applicability of the model, a case study on an open-pit gold mining operation is implemented. The framework shows the capability to minimize risk and to improve the expected net present value and financial profitability for the LTPSOP. By considering grade uncertainty within the hybrid model, the framework could control geological risk more effectively than the traditional procedure.
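
The two quantities that the objective function trades off, expected net present value and deviation from production targets across equally probable orebody realizations, can be sketched for a fixed schedule. This is a simplified illustration, not the paper's stochastic integer program or its ALR-HHO solver; the data layout, the field names `cash_flows` and `ore_tonnes`, and the undiscounted absolute-deviation measure are assumptions.

```python
def npv(cash_flows, rate):
    """Discount per-period cash flows (periods 1, 2, ...) to present value."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


def evaluate_schedule(realizations, targets, rate):
    """Average NPV and average total target deviation over all realizations.

    realizations: list of dicts with per-period 'cash_flows' and 'ore_tonnes'
                  obtained by applying one schedule to each simulated orebody.
    targets:      per-period ore production targets.
    """
    n = len(realizations)
    expected_npv = sum(npv(r["cash_flows"], rate) for r in realizations) / n
    expected_deviation = sum(
        abs(ore - target)
        for r in realizations
        for ore, target in zip(r["ore_tonnes"], targets)
    ) / n
    return expected_npv, expected_deviation
```

A schedule optimized against a single estimated orebody model would be scored on one realization only; averaging over the full set, 20 Monte Carlo realizations in this study, is what exposes the risk of missing targets under grade uncertainty.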

Keywords: grade uncertainty, metaheuristic algorithms, open-pit mine, production scheduling optimization

Procedia PDF Downloads 106

24 Evaluating the Accuracy of Biologically Relevant Variables Generated by ClimateAP

Authors: Jing Jiang, Wenhuan Xu, Lei Zhang, Shiyi Zhang, Tongli Wang

Abstract:

Climate data quality significantly affects the reliability of ecological modeling. In the Asia Pacific (AP) region, low-quality climate data hinders ecological modeling. ClimateAP, a software developed in 2017, generates high-quality climate data for the AP region, benefiting researchers in forestry and agriculture. However, its adoption remains limited. This study aims to confirm the validity of biologically relevant variable data generated by ClimateAP during the normal climate period through comparison with the currently available gridded data. Climate data from 2,366 weather stations were used to evaluate the prediction accuracy of ClimateAP in comparison with the commonly used gridded data from WorldClim1.4. Univariate regressions were applied to 48 monthly biologically relevant variables, and the relationship between the observational data and the predictions made by ClimateAP and WorldClim was evaluated using Adjusted R-Squared and Root Mean Squared Error (RMSE). Locations were categorized into mountainous and flat landforms, considering elevation, slope, ruggedness, and Topographic Position Index. Univariate regressions were then applied to all biologically relevant variables for each landform category. Random Forest (RF) models were implemented for the climatic niche modeling of Cunninghamia lanceolata. A comparative analysis of the prediction accuracies of RF models constructed with distinct climate data sources was conducted to evaluate their relative effectiveness. Biologically relevant variables were obtained from three unpublished Chinese meteorological datasets. ClimateAPv3.0 and WorldClim predictions were obtained from weather station coordinates and WorldClim1.4 rasters, respectively, for the normal climate period of 1961-1990. Occurrence data for Cunninghamia lanceolata came from integrated biodiversity databases with 3,745 unique points. 
ClimateAP explains a minimum of 94.74%, 97.77%, 96.89%, and 94.40% of monthly maximum, minimum, average temperature, and precipitation variances, respectively. It outperforms WorldClim in 37 biologically relevant variables with lower RMSE values. ClimateAP achieves higher R-squared values for the 12 monthly minimum temperature variables and consistently higher Adjusted R-squared values across all landforms for precipitation. ClimateAP's temperature data yields lower Adjusted R-squared values than gridded data in high-elevation, rugged, and mountainous areas but achieves higher values in mid-slope drainages, plains, open slopes, and upper slopes. Using ClimateAP improves the prediction accuracy of tree occurrence from 77.90% to 82.77%. The biologically relevant climate data produced by ClimateAP is validated based on evaluations using observations from weather stations. The use of ClimateAP leads to an improvement in data quality, especially in non-mountainous regions. The results also suggest that using biologically relevant variables generated by ClimateAP can slightly enhance climatic niche modeling for tree species, offering a better understanding of tree species adaptation and resilience compared to using gridded data.
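
The two evaluation metrics used above, Adjusted R-squared from a univariate regression and RMSE, can be computed as in the sketch below. It is a minimal illustration that assumes observed station values are regressed on the predictions of a climate data source; the function names are not from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def adjusted_r2(observed, predicted, n_predictors=1):
    """Adjusted R-squared of a univariate regression of observed on predicted.

    For a regression with a single predictor, R-squared equals the squared
    Pearson correlation, which is what is computed here.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((p - mean_p) * (o - mean_o) for o, p in zip(observed, predicted))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    r2 = sxy ** 2 / (sxx * syy)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
```

The two metrics answer different questions: predictions that are perfectly correlated with the observations but consistently biased give an Adjusted R-squared of 1 and a non-zero RMSE, which is one reason to report both.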

Keywords: climate data validation, data quality, Asia Pacific climate, climatic niche modeling, random forest models, tree species

Procedia PDF Downloads 68