Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3743

Search results for: mixed dataset

3383 Static Analysis of Security Issues of the Python Packages Ecosystem

Abstract:

Python is considered the most popular programming language and offers its own ecosystem for archiving and maintaining open-source software packages: the Python Package Index (PyPI), the language's package repository. Unfortunately, one-third of these software packages have vulnerabilities that allow attackers to execute code automatically when a vulnerable or malicious package is installed. This paper contributes to large-scale empirical studies of security issues in the Python ecosystem by evaluating package vulnerabilities. The findings yield a series of implications that can help the security of software ecosystems by improving the process of discovering, fixing, and managing package vulnerabilities. The vulnerability dataset is generated using the National Vulnerability Database (NVD) and the Snyk vulnerability dataset. We evaluated 807 vulnerability reports in the NVD and 3900 publicly known security vulnerabilities in the Python package manager (pip) from the Snyk database, covering 2002 to 2022. Many Python vulnerabilities are rated high severity, followed by medium severity. The most problematic areas have been improper input validation and denial-of-service attacks. A hybrid scanning tool that combines three scanners (Bandit, Snyk, and Dlint) and provides a clear report of code vulnerabilities is also described.
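To make the idea of static analysis concrete, the following is a minimal sketch that inspects source code without running it, using Python's standard `ast` module. It is not the hybrid tool described in the abstract and does not call Bandit, Snyk, or Dlint; the rule set `RISKY_CALLS` and the function `find_risky_calls` are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical rule set for illustration only; real scanners such as
# Bandit ship far larger, curated rule sets.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct call to a risky builtin."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = "import os\nvalue = eval(input())\nprint(value)\n"
print(find_risky_calls(sample))  # [(2, 'eval')]
```

The check only matches direct calls by name, so aliased or dynamically constructed calls would be missed; this is one reason combining several scanners can be attractive.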

Keywords: Python vulnerabilities, bandit, Snyk, Dlint, Python package index, ecosystem, static analysis, malicious attacks

Procedia PDF Downloads 101

3382 Hard Disk Failure Predictions in Supercomputing System Based on CNN-LSTM and Oversampling Technique

Authors: Yingkun Huang, Li Guo, Zekang Lan, Kai Tian

Abstract:

Hard disk drive (HDD) failure in an exascale supercomputing system may interrupt service, invalidate previous calculations, and cause permanent data loss. Therefore, initiating corrective actions before hard drive failures materialize is critical to the continued operation of jobs. In this paper, a highly accurate analysis model based on CNN-LSTM and an oversampling technique is proposed, which can correctly predict the need for a disk replacement up to ten days in advance. Learning-based methods generally perform poorly on training datasets with a long-tail distribution, and fault prediction is a classic example because failure data are scarce. To overcome this problem, a new oversampling method is employed to augment the data, and an improved CNN-LSTM with a shortcut is built to learn more effective features. The shortcut transmits the results of the previous CNN layer, which are fused by weighting with the output of the next layer and used as the input of the LSTM model. Finally, a detailed empirical comparison of six prediction methods on a public dataset is presented and discussed. The experiments indicate that the proposed method predicts disk failure with 0.91 precision, 0.91 recall, 0.91 F-measure, and 0.90 MCC for a 10-day prediction horizon. Thus, the proposed algorithm is efficient for predicting HDD failure in supercomputing.
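The abstract does not specify its new oversampling method. As a stand-in, the sketch below shows plain random oversampling, the simplest way to rebalance a long-tailed training set by duplicating minority-class samples. The function name `random_oversample` and the toy data are assumptions made for illustration.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies (with replacement) only for under-represented classes.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: label 1 marks the rare "failure" class (values are made up).
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(y_bal.count(0), y_bal.count(1))  # 4 4
```

Oversampling of this kind is normally applied to the training split only, so that duplicated failures do not leak into the evaluation data.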

Keywords: HDD replacement, failure, CNN-LSTM, oversampling, prediction

Procedia PDF Downloads 54

3381 An Experimental Investigation on Banana and Pineapple Natural Fibers Reinforced with Polypropylene Composite by Impact Test and SEM Analysis

Authors: D. Karibasavaraja, Ramesh M.R., Sufiyan Ahmed, Noyonika M.R., Sameeksha A. V., Mamatha J., Samiksha S. Urs

Abstract:

This research paper gives an overview of the experimental analysis of natural fibers with polymer composite. The whole world is concerned about conserving the environment; hence, the demand for natural and decomposable materials is increasing. Natural fibers are widely used in aerospace for manufacturing aircraft bodies and in naval ship construction. According to the literature, researchers and scientists are replacing synthetic fibers with natural fibers. These fibers are selected mainly because they are lightweight, easily available, and economical, and because their physical, chemical, and other properties make them fine-quality fibers. Pineapple fiber has the desirable properties of good mechanical strength, high cellulose content, and fiber length. Hybrid composites were prepared using three proportions of pineapple and banana fiber: 90% polypropylene mixed with 5% banana fiber and 5% pineapple fiber; 85% polypropylene mixed with 7.5% banana fiber and 7.5% pineapple fiber; and 80% polypropylene mixed with 10% banana fiber and 10% pineapple fiber. From the impact tests, we concluded that the combination of 90% polypropylene, 5% banana fiber, and 5% pineapple fiber exhibits a higher toughness value with mechanical strength. We also conducted scanning electron microscopy (SEM) analysis, which showed better fiber orientation and bonding between the banana and pineapple fibers and the polypropylene composite. The main aim of the present research is to evaluate the properties of hybrid polypropylene composites reinforced with pineapple fiber and banana fiber.

Keywords: toughness, fracture, impact strength, banana fibers, pineapple fibers, tensile strength, SEM analysis

Procedia PDF Downloads 118

3380 Toxicity of Cry1ac Bacillus thuringiensis against Helicoverpa armigera (Hubner) on Artificial Diet under Laboratory Conditions

Authors: Tahammal Hussain, Khuram Zia, Mumammad Jalal Arif, Megha Parajulee, Abdul Hakeem

Abstract:

Bioassays of the Bacillus thuringiensis protein Cry1Ac were conducted on neonate, 2nd, and 3rd instar larvae of Helicoverpa armigera (Hubner). Cry1Ac was serially diluted with distilled water and then mixed into an artificial diet at an appropriate diet temperature. The toxin-incorporated diet was poured into Petri dishes. For controls, distilled water was mixed with the diet. Five toxin doses (0.25, 0.5, 1, 2, and 4 µg/ml) and one control were used for each instar of H. armigera. Twenty larvae were used in each replication, and each treatment was replicated four times. The LC50 values of Cry1Ac against neonate, 2nd, and 3rd instar larvae of H. armigera were 0.34, 0.81, and 1.46 µg/ml, respectively. Cry1Ac is therefore more effective against neonate larvae of H. armigera than against 2nd and 3rd instar larvae under laboratory conditions.

Keywords: B. thuringiensis, Cry1Ac, H. armigera, toxicity

Procedia PDF Downloads 382

3379 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between patient attributes and the disease, in this paper we investigate different classification methods for better diagnosis of BC in the early stages. Datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers were studied. We selected favorable features among the nine attributes provided in the clinical dataset by using a random forest algorithm. The dataset consists of both healthy controls and BC patients, and glucose, BMI, resistin, and age were found to be the most important features, in that order. We then analyzed these features with various ML-based classifiers, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM), along with an NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that, among the different techniques, the SVM and MLP classifiers have the highest accuracy, at 96% and 92%, respectively. These results indicate that the adopted procedure could be used effectively for the classification of cancer cells, and they encourage further experimental investigations with more collected data for other types of cancers.

Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network

Procedia PDF Downloads 101

3378 Modelling and Mapping Malnutrition Toddlers in Bojonegoro Regency with Mixed Geographically Weighted Regression Approach

Authors: Elvira Mustikawati P.H., Iis Dewi Ratih, Dita Amelia

Abstract:

Bojonegoro has proclaimed a policy of zero malnutrition. As an effort to address cases of child malnutrition in Bojonegoro, this study used the Mixed Geographically Weighted Regression (MGWR) approach to determine the factors that influence the percentage of malnourished children under five, where the factors can be divided into local factors that are influential in individual districts and global factors that are influential throughout all districts. Based on the goodness-of-fit test, the R2 and AIC values of the GWR model are better than those of the MGWR model: the R2 and AIC values are 84.37% and 14.28 for the MGWR model, and 91.04% and -62.04 for the GWR model. Based on the analysis with the GWR model, Sekar, Bubulan, Gondang, and Dander are districts in which three predictor variables (the percentage of vitamin A, the percentage of births assisted by health personnel, and the percentage of clean water) significantly influence the percentage of malnourished children under five.
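The local part of a geographically weighted regression can be sketched as a weighted least-squares fit repeated at each location, with nearby observations weighted more heavily. The example below assumes a single predictor and a Gaussian distance kernel; the kernel choice, the bandwidth, the function name `local_fit`, and the toy data are illustrative assumptions, not details taken from the study. A mixed model would additionally hold some coefficients fixed across all locations, which this sketch does not do.

```python
import math


def local_fit(points, target, bandwidth):
    """Weighted least-squares fit of y = a + b*x around `target`.

    points: list of (x_coord, y_coord, predictor, response) tuples.
    Weights follow a Gaussian kernel of the distance to `target`.
    """
    weights = []
    for px, py, _, _ in points:
        dist = math.hypot(px - target[0], py - target[1])
        weights.append(math.exp(-0.5 * (dist / bandwidth) ** 2))
    sw = sum(weights)
    mean_x = sum(w * p[2] for w, p in zip(weights, points)) / sw
    mean_y = sum(w * p[3] for w, p in zip(weights, points)) / sw
    sxx = sum(w * (p[2] - mean_x) ** 2 for w, p in zip(weights, points))
    sxy = sum(w * (p[2] - mean_x) * (p[3] - mean_y) for w, p in zip(weights, points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up observations in which the response is exactly 1 + 2 * predictor,
# so every local fit should recover intercept 1 and slope 2.
observations = [
    (0.0, 0.0, 1.0, 3.0),
    (1.0, 0.0, 2.0, 5.0),
    (0.0, 1.0, 3.0, 7.0),
    (2.0, 2.0, 4.0, 9.0),
]
intercept, slope = local_fit(observations, target=(0.0, 0.0), bandwidth=1.0)
print(round(intercept, 6), round(slope, 6))
```

With real data the fitted coefficients differ from one location to the next, and it is this spatial variation that the local model is meant to capture.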

Keywords: GWR, MGWR, R2, AIC

Procedia PDF Downloads 261

3377 D3Advert: Data-Driven Decision Making for Ad Personalization through Personality Analysis Using BiLSTM Network

Authors: Sandesh Achar

Abstract:

Personalized advertising holds greater potential for higher conversion rates compared to generic advertisements. However, its widespread application in the retail industry faces challenges due to complex implementation processes. These complexities impede the swift adoption of personalized advertisement on a large scale. Personalized advertisement, being a data-driven approach, necessitates consumer-related data, adding to its complexity. This paper introduces an innovative data-driven decision-making framework, D3Advert, which personalizes advertisements by analyzing personalities using a BiLSTM network. The framework utilizes the Myers–Briggs Type Indicator (MBTI) dataset for development. The employed BiLSTM network, specifically designed and optimized for D3Advert, classifies user personalities into one of the sixteen MBTI categories based on their social media posts. The classification accuracy is 86.42%, with precision, recall, and F1-Score values of 85.11%, 84.14%, and 83.89%, respectively. The D3Advert framework personalizes advertisements based on these personality classifications. Experimental implementation and performance analysis of D3Advert demonstrate a 40% improvement in impressions. D3Advert’s innovative and straightforward approach has the potential to transform personalized advertising and foster widespread personalized advertisement adoption in marketing.

Keywords: personalized advertisement, deep learning, MBTI dataset, BiLSTM network, NLP

Procedia PDF Downloads 18

3376 Field Saturation Flow Measurement Using Dynamic Passenger Car Unit under Mixed Traffic Condition

Authors: Ramesh Chandra Majhi

Abstract:

Saturation flow is a very important input variable for the design of signalized intersections. Saturation flow measurement is well established for homogeneous traffic. However, saturation flow measurement and modeling is a challenging task in heterogeneous traffic characterized by multiple vehicle types and non-lane-based movement. The present study proposes a field procedure for saturation flow measurement and examines the effect of typical mixed-traffic behavior at the signal, particularly non-lane-based movement. Data collected during peak and off-peak hours from five intersections with varying approach widths are used to validate the saturation flow model. The insights from the study can be used for modeling saturation flow and delay at signalized intersections in heterogeneous traffic conditions.
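As a rough illustration of how a field count is turned into a saturation flow, the sketch below converts the vehicles discharged during saturated green time into passenger car units (PCU) per hour of green. The PCU factors are hypothetical static values; the dynamic PCU values of the study are not given in the abstract, and the function name `saturation_flow` is likewise an assumption.

```python
# Hypothetical static PCU factors, chosen only for illustration; the
# study's dynamic PCU values are not given in the abstract.
PCU = {"car": 1.0, "two_wheeler": 0.5, "bus": 3.0}


def saturation_flow(counts, saturated_green_s):
    """Return saturation flow in PCU per hour of green."""
    pcu_total = sum(PCU[kind] * n for kind, n in counts.items())
    return pcu_total * 3600.0 / saturated_green_s


# Made-up count for one approach over 40 s of saturated green.
counts = {"car": 10, "two_wheeler": 12, "bus": 2}
print(saturation_flow(counts, saturated_green_s=40.0))  # 1980.0
```

A dynamic PCU approach would replace the fixed dictionary with factors that vary with the observed traffic composition or flow conditions.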

Keywords: optimization, passenger car unit, saturation flow, signalized intersection

Procedia PDF Downloads 301

3375 Assessment of Sustainability Initiatives at Applied Science University in Bahrain

Authors: Bayan Ahmed Alsaffar

Abstract:

The aim of this study is to assess the sustainability initiatives at Applied Sciences University (ASU) in Bahrain using a mixed-methods approach based on student, staff, and faculty perceptions. The study involves a literature review, interviews with faculty members and students, and a survey of ASU's level of sustainability in education, research, operations, administration, and finance, based on the Sustainability Tracking, Assessment & Rating System (STARS). STARS is a tool used to evaluate the sustainability performance of higher education institutions. The study concludes that a mixed-methods approach can provide a powerful tool for assessing sustainability initiatives at ASU and ultimately lead to insights that can inform effective strategies for improving sustainability efforts. The current study contributes to the field of sustainability in universities and highlights the importance of user engagement and awareness for achieving sustainability goals.

Keywords: environment, initiatives, society, sustainability, STARS, university

Procedia PDF Downloads 48

3374 Lead and Cadmium Residue Determination in Spices Available in Tripoli City Markets (Libya)

Authors: Mohamed Ziyaina, Ahlam Rajab, Khadija Alkhweldi, Wafia Algami, Omer Al. Toumi, Barbara Rasco

Abstract:

In recent years, there has been growing interest in monitoring heavy metal contamination in food products. Spices can improve the taste of food and can be a source of many bioactive compounds, but unfortunately they can also be contaminated with dangerous materials, potentially heavy metals. This study was conducted to investigate lead (Pb) and cadmium (Cd) contamination in selected spices commonly consumed in Libya, including Capsicum frutescens (chili pepper), Piper nigrum (black pepper), Curcuma longa (turmeric), and mixed spices (HRARAT), which consist of a combination of Alpinia officinarum, Zingiber officinale, and Cinnamomum zeylanicum. Spices were analyzed by atomic absorption spectroscopy after digestion with nitric acid/hydrogen peroxide. The highest levels of lead (Pb) were found in Curcuma longa and Capsicum frutescens from wholesale markets (1.05 ± 0.01 mg/kg and 0.96 ± 0.06 mg/kg, respectively). Cadmium (Cd) levels exceeded the FAO/WHO permissible limit. Curcuma longa and Piper nigrum sold in retail markets had high concentrations of Cd (0.36 ± 0.09 and 0.35 ± 0.07 mg/kg, respectively), followed by Capsicum frutescens (0.32 ± 0.04 mg/kg). Mixed spices purchased from wholesale markets also had high levels of Cd (0.31 ± 0.08 mg/kg). Curcuma longa and Capsicum frutescens may pose a food safety risk due to high levels of lead and cadmium. Cadmium levels exceeded FAO/WHO recommendations (0.2 ppm) for Piper nigrum, Curcuma longa, and mixed spices (HRARAT).

Keywords: heavy metals, lead, cadmium determination, spice

Procedia PDF Downloads 613

3373 Changes in Serum Neopterin in Workers Exposed to Different Mineral Dust

Authors: Gospodinka Prakova, Pavlina Gidikova, Gergana Sandeva, Kamelia Haracherova, Emil Slavov

Abstract:

Neopterin has been demonstrated to be a sensitive marker of cell-mediated immune reactions and plays a key role in monocyte/macrophage activation. The purpose of this work was to investigate changes in serum neopterin in workers exposed to mineral dust of different composition. Material and Methods: Serum neopterin was studied in 193 exposed workers, divided into three groups depending on the mineral dust and the quartz content of the respirable fraction: group I, coal dust containing less than 2% free crystalline silica (n=44); group II, coal dust containing over 2% free crystalline silica (n=94); and group III, mixed dust with corundum and carborundum (n=55). The control group was composed of 21 individuals without exposure to dust. Serum neopterin was measured in ng/ml by ELISA according to the manufacturer's instructions. Results and Discussion: The level of serum neopterin was significantly higher in workers exposed to mineral dust (2.10 ± 0.62 ng/ml) than in the control group (1.10 ± 0.85 ng/ml; p < 0.05). Neopterin levels in workers exposed to coal dust (1.87 ± 0.42 ng/ml in group I and 3.32 ± 0.77 ng/ml in group II) were significantly higher than in those exposed to mixed dust (1.31 ± 0.68 ng/ml in group III) and in the control group (p < 0.05). There was no significant difference in serum neopterin between workers exposed to mixed dust composed of corundum and carborundum (group III) and the control group. Conclusion: The results of this study indicate that exposure to mineral dust activates a cell-mediated immune response. The level of that activation depends mainly on the composition of the dust and is highest in workers exposed to coal dust.

Keywords: mineral dust, neopterin, occupational exposure, respirable crystalline silica

Procedia PDF Downloads 245

3372 A Transformer-Based Approach for Multi-Human 3D Pose Estimation Using Color and Depth Images

Authors: Qiang Wang, Hongyang Yu

Abstract:

Multi-human 3D pose estimation is a challenging task in computer vision, which aims to recover the 3D joint locations of multiple people from multi-view images. In contrast to traditional methods, which typically use only color (RGB) images as input, our approach utilizes both the color and depth (D) information contained in RGB-D images. We also employ a transformer-based model as the backbone of our approach, which is able to capture long-range dependencies and has been shown to perform well on various sequence modeling tasks. Our method is trained and tested on the Carnegie Mellon University (CMU) Panoptic dataset, which contains a diverse set of indoor and outdoor scenes with multiple people in varying poses and clothing. We evaluate the performance of our model using the standard 3D pose estimation metric, mean per-joint position error (MPJPE). Our results show that the transformer-based approach outperforms traditional methods and achieves competitive results on the CMU Panoptic dataset. We also perform an ablation study to understand the impact of different design choices on the overall performance of the model. In summary, our work demonstrates the effectiveness of using a transformer-based approach with RGB-D images for multi-human 3D pose estimation and has potential applications in real-world scenarios such as human-computer interaction, robotics, and augmented reality.
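For readers unfamiliar with the metric, MPJPE is the average Euclidean distance between each predicted joint and its ground-truth position. The sketch below computes it for a single pose; the function name `mpjpe` and the two-joint toy pose are assumptions for illustration, and evaluation protocols often add steps such as root alignment that are omitted here.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean Euclidean distance between matching 3D joints."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty joint lists of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Made-up pose with two joints; the first is off by a 3-4-5 triangle.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
truth = [(0.0, 3.0, 4.0), (1.0, 0.0, 0.0)]
print(mpjpe(pred, truth))  # (5 + 0) / 2 = 2.5
```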

Keywords: multi-human 3D pose estimation, RGB-D images, transformer, 3D joint locations

Procedia PDF Downloads 49

3371 Automated Digital Mammogram Segmentation Using Dispersed Region Growing and Pectoral Muscle Sliding Window Algorithm

Authors: Ayush Shrivastava, Arpit Chaudhary, Devang Kulshreshtha, Vibhav Prakash Singh, Rajeev Srivastava

Abstract:

Early diagnosis of breast cancer can improve the survival rate by detecting cancer at an early stage. Breast region segmentation is an essential step in the analysis of digital mammograms, and accurate image segmentation leads to better detection of cancer. It aims at separating the Region of Interest (ROI) from the rest of the image. The procedure begins with the removal of labels, annotations, and tags from the mammographic image using the morphological opening method. The Pectoral Muscle Sliding Window Algorithm (PMSWA) is then used to remove the pectoral muscle from mammograms, which is necessary because the intensity values of the pectoral muscle are similar to those of the ROI, making the two difficult to separate. After removing the pectoral muscle, the Dispersed Region Growing Algorithm (DRGA), which disperses seeds in different regions instead of a single bright region, is used to segment the mammogram. To demonstrate the validity of our segmentation method, 322 mammographic images from the Mammographic Image Analysis Society (MIAS) database are used. The dataset contains medio-lateral oblique (MLO) views of mammograms. Experimental results on the MIAS dataset show the effectiveness of our proposed method.
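The general idea of growing regions from several seeds can be sketched as below. This is generic seeded region growing on a toy intensity grid, not the DRGA of the paper, whose seed placement and growth criteria are not described in the abstract. The function name `region_grow`, the fixed tolerance, and the 4-connectivity are assumptions.

```python
from collections import deque


def region_grow(image, seeds, tolerance):
    """Grow regions from seeds over 4-connected pixels of similar intensity.

    A pixel joins a region when its intensity is within `tolerance`
    of the intensity of the seed that region started from.
    """
    rows, cols = len(image), len(image[0])
    region = set(seeds)
    queue = deque((r, c, image[r][c]) for r, c in seeds)
    while queue:
        r, c, ref = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and abs(image[nr][nc] - ref) <= tolerance:
                region.add((nr, nc))
                queue.append((nr, nc, ref))
    return region


# Toy 3x4 "image": a dark area, a mid-grey area, and a bright corner.
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 90],
    [11, 12, 90, 91],
]
segmented = region_grow(image, seeds=[(0, 0), (0, 2)], tolerance=3)
print(len(segmented))  # 9 pixels; the three bright pixels stay outside
```

Placing one seed in each of two areas lets both be captured, whereas a single seed would have reached only one of them.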

Keywords: CAD, dispersed region growing algorithm (DRGA), image segmentation, mammography, pectoral muscle sliding window algorithm (PMSWA)

Procedia PDF Downloads 288

3370 An Electrocardiography Deep Learning Model to Detect Atrial Fibrillation on Clinical Application

Authors: Jui-Chien Hsieh

Abstract:

Background: 12-lead electrocardiography (ECG) is one of the frequently used tools in clinical practice for detecting atrial fibrillation (AF), which may lead to life-threatening stroke. In this study, AF detection by the clinically used 12-lead ECG device had a positive predictive value (PPV) of only 0.73~0.77. Objective: There is great demand for a new algorithm that improves the precision of AF detection using 12-lead ECG. Building on progress in artificial intelligence (AI), we developed an ECG deep learning model that can recognize AF patterns and reduce false-positive errors. Methods: In this study, (1) 570 12-lead ECG reports whose computer interpretation by the ECG device was AF were collected as the training dataset. The ECG reports were interpreted by 2 senior cardiologists, who confirmed that the precision of AF detection by the ECG device was 0.73; (2) 88 12-lead ECG reports whose computer interpretation by the ECG device was AF were used as the test dataset. Cardiologist review confirmed that 68 of the 88 reports were AF and the others were not, so the precision of AF detection by the ECG device was about 0.77; (3) a parallel 4-layer one-dimensional convolutional neural network (CNN) was developed to identify AF based on limb-lead ECGs and chest-lead ECGs. Results: On the 88 test samples, the model performed better at AF detection than the traditional computer interpretation of the ECG device, with 0.94 PPV, 0.98 sensitivity, and 0.80 specificity. Conclusions: Compared with the clinical ECG device, this AI ECG model improves the precision of AF detection from 0.77 to 0.94 and can have an impact on clinical applications.
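The device precision quoted for the test set follows directly from the counts in the abstract: 88 reports were flagged as AF by the device and 68 of them were confirmed. The short sketch below reproduces that arithmetic; the function name `positive_predictive_value` is an assumption for illustration.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV (precision) = TP / (TP + FP)."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no positive calls to evaluate")
    return true_positives / flagged


# Counts from the abstract: 88 device-flagged test reports, 68 confirmed as AF.
confirmed_af = 68
flagged_total = 88
device_ppv = positive_predictive_value(confirmed_af, flagged_total - confirmed_af)
print(round(device_ppv, 2))  # 0.77
```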

Keywords: 12-lead ECG, atrial fibrillation, deep learning, convolutional neural network

Procedia PDF Downloads 91

3369 Empirical Roughness Progression Models of Heavy Duty Rural Pavements

Authors: Nahla H. Alaswadko, Rayya A. Hassan, Bayar N. Mohammed

Abstract:

Empirical deterministic models have been developed to predict roughness progression of heavy duty spray sealed pavements for a dataset representing rural arterial roads. The dataset provides a good representation of the relevant network and covers a wide range of operating and environmental conditions. A large sample of historical time series data for many pavement sections has been collected and prepared for use in multilevel regression analysis. The modelling parameters include road roughness as the performance parameter and traffic loading, time, initial pavement strength, reactivity level of subgrade soil, climate condition, and condition of the drainage system as predictor parameters. The purpose of this paper is to report the approaches adopted for model development and validation. The study presents multilevel models that can account for the correlation among time series data of the same section and capture the effect of unobserved variables. Study results show that the models fit the data very well. The contribution and significance of relevant influencing factors in predicting roughness progression are presented and explained. The paper concludes that the models developed with this analysis approach are accurate and reliable, as shown by their good fit to the validation data.

Keywords: roughness progression, empirical model, pavement performance, heavy duty pavement

Procedia PDF Downloads 140

3368 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analyzing diseases that integrate multiple factors increases the complexity of the problem, and the development of frameworks for such analysis is therefore a topic of intense research. Due to the inter-dependence of the various parameters, traditional methodologies have not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised learning algorithms are commonly used for prediction on previously unseen data, in fields ranging from image analysis to protein structure and function prediction. They are trained on a known dataset to produce a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex, since these experiments can involve hypotheses involving entire genomes. A well-known machine learning algorithm, the Support Vector Machine, is therefore applied to analyze the expression levels of thousands of genes simultaneously in a timely, automated, and cost-effective way. The objectives of the presented work are to develop a methodology for identifying genes relevant to Hepatocellular Carcinoma (HCC) from a gene expression dataset using supervised learning algorithms and statistical evaluations, and to develop a predictive framework that can perform classification tasks on new, unseen data.

Keywords: artificial intelligence, biomarker, gene expression datasets, hepatocellular carcinoma, machine learning, supervised learning algorithms, support vector machine

Procedia PDF Downloads 402

3367 Hounsfield-Based Automatic Evaluation of Volumetric Breast Density on Radiotherapy CT-Scans

Authors: E. M. D. Akuoko, Eliana Vasquez Osorio, Marcel Van Herk, Marianne Aznar

Abstract:

Radiotherapy is an integral part of treatment for many patients with breast cancer. However, side effects can occur, e.g., fibrosis or erythema. If patients at higher risk of radiation-induced side effects could be identified before treatment, they could be given more individual information about the risks and benefits of radiotherapy. We hypothesize that breast density is correlated with the risk of side effects and present a novel method for automatic evaluation based on radiotherapy planning CT scans. Methods: 799 supine CT scans of breast radiotherapy patients were available from the REQUITE dataset. The methodology was first established in a subset of 114 patients (cohort 1) before being applied to the whole dataset (cohort 2). All patients were scanned in the supine position, with arms up, and the treated breast (ipsilateral) was identified. Manual expert contours were available for 96 patients in cohort 1 for both the ipsilateral and contralateral breast. Breast tissue was segmented using atlas-based automatic contouring software, ADMIRE® v3.4 (Elekta AB, Sweden). Once validated, the automatic segmentation method was applied to cohort 2. Breast density was then investigated by thresholding voxels within the contours, using Otsu thresholding and pixel intensity ranges based on Hounsfield units (-200 to -100 for fatty tissue, and -99 to +100 for fibro-glandular tissue). Volumetric breast density (VBD) was defined as the volume of fibro-glandular tissue / (volume of fibro-glandular tissue + volume of fatty tissue). A sensitivity analysis was performed to verify whether the calculated VBD was affected by the choice of breast contour. In addition, we investigated the correlation between VBD and patient age and breast size. VBD values were compared between ipsilateral and contralateral breast contours. Results: Estimated VBD values were 0.40 (range 0.17-0.91) in cohort 1, and 0.43 (0.096-0.99) in cohort 2.
We observed ipsilateral breasts to be denser than contralateral breasts. Breast density was negatively associated with breast volume (Spearman: R=-0.5, p-value < 2.2e-16) and age (Spearman: R=-0.24, p-value = 4.6e-10). Conclusion: VBD estimates could be obtained automatically on a large CT dataset. Patients’ age or breast volume may not be the only variables that explain breast density. Future work will focus on assessing the usefulness of VBD as a predictive variable for radiation-induced side effects.
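The VBD definition given in the Methods can be sketched with the Hounsfield ranges stated in the abstract. The sketch assumes integer Hounsfield values and voxels of equal volume, so voxel counts stand in for tissue volumes; the Otsu thresholding step is omitted, and the function name and toy voxel list are assumptions for illustration.

```python
FAT_RANGE = (-200, -100)      # Hounsfield units treated as fatty tissue
GLANDULAR_RANGE = (-99, 100)  # Hounsfield units treated as fibro-glandular tissue


def volumetric_breast_density(hu_values):
    """VBD = fibro-glandular voxels / (fibro-glandular + fatty voxels)."""
    fat = sum(FAT_RANGE[0] <= v <= FAT_RANGE[1] for v in hu_values)
    glandular = sum(GLANDULAR_RANGE[0] <= v <= GLANDULAR_RANGE[1] for v in hu_values)
    if fat + glandular == 0:
        raise ValueError("no voxels fall inside the tissue ranges")
    return glandular / (fat + glandular)


# Made-up voxel intensities inside a breast contour; 300 HU lies outside
# both ranges and is ignored.
voxels = [-150, -120, -180, -50, 20, 300]
print(volumetric_breast_density(voxels))  # 2 / (2 + 3) = 0.4
```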

Keywords: breast cancer, automatic image segmentation, radiotherapy, big data, breast density, medical imaging

Procedia PDF Downloads 109

3366 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or parts of it. SSL models are divided into two categories: pretext task-based models and contrastive learning models. Pretext tasks are auxiliary tasks that learn pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of pretext task-based SSL is the need to define an appropriate pretext task for each image dataset across a variety of image modalities. Therefore, an appropriate pretext task must be designed automatically for each dataset and each downstream task. To the best of our knowledge, the automatic design of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on in-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description to optimize the pretext task design and its hyper-parameters using meta-learning models. The representations learned from the pretext tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities, in addition to its superior performance on previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 49

3365 A Selective and Fast Hydrogen Sensor Using Doped-LaCrO₃ as Sensing Electrode

Authors: He Zhang, Jianxin Yi

Abstract:

As a clean energy carrier, hydrogen offers many advantages, such as renewability, high heating value, and abundant sources, and may play an important role in future society. However, hydrogen is highly combustible owing to its low ignition energy (0.02 mJ) and wide explosive limit (4% to 74% in air). A leak that is not detected in time is very likely to cause a fire or an explosion. Mixed-potential type sensors have attracted much attention for monitoring and detecting hydrogen due to their high response, simple support electronics, and long-term stability. Typically, this kind of sensor consists of a sensing electrode (SE), a reference electrode (RE), and a solid electrolyte. The SE and RE materials usually display different electrocatalytic activities toward hydrogen, so hydrogen can be detected by measuring the change in electromotive force (EMF) between the two electrodes. Previous reports indicate that a high-performance sensing electrode is important for improving the sensing characteristics of the sensor. In this report, a planar mixed-potential hydrogen sensor using La₀.₈Sr₀.₂Cr₀.₅Mn₀.₅O₃₋δ (LSCM) as SE, Pt as RE, and yttria-stabilized zirconia (YSZ) as solid electrolyte was developed. LSCM was selected as the sensing electrode because it shows high electrocatalytic activity toward hydrogen in solid oxide fuel cells. The sensing performance of the fabricated LSCM/YSZ/Pt sensor was tested systematically. The experimental results show that the sensor displays a high response to hydrogen: the response values for 100 ppm and 1000 ppm hydrogen at 450 °C are -70 mV and -118 mV, respectively. Response time is an important parameter for evaluating a sensor. Here, the response time decreases with increasing hydrogen concentration and saturates above 500 ppm. The steady response time at 450 °C is as short as 4 s, indicating that the sensor shows great potential for practical hydrogen monitoring. Excellent response repeatability to 100 ppm hydrogen at 450 °C and good reproducibility among three sensors were also observed. Meanwhile, the sensor exhibits excellent selectivity to hydrogen compared with several interfering gases such as NO₂, CH₄, CO, C₃H₈, and NH₃. Polarization curves were measured to investigate the sensing mechanism, and the results indicate that the sensor follows the mixed-potential mechanism.
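
As a rough worked example based on the two reported responses: mixed-potential sensors are often described as responding approximately linearly to the logarithm of gas concentration. Assuming that log-linear form holds between 100 ppm and 1000 ppm (an assumption; the abstract does not state it), the implied sensitivity is about -48 mV per decade of concentration. The function name below is illustrative only.

```python
import math

def sensitivity_mv_per_decade(c1_ppm, v1_mv, c2_ppm, v2_mv):
    """Slope of response (mV) versus log10(concentration).

    Assumes a log-linear response between the two points, which is an
    assumption made for illustration and not a result from the abstract.
    """
    return (v2_mv - v1_mv) / (math.log10(c2_ppm) - math.log10(c1_ppm))

# Response values reported in the abstract at 450 °C.
slope = sensitivity_mv_per_decade(100, -70.0, 1000, -118.0)
print(f"{slope:.1f} mV/decade")
```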

Keywords: fire hazard, H₂ sensor, mixed-potential, perovskite

Procedia PDF Downloads 152

3364 A Comparative Analysis of (De)legitimation Strategies in Selected African Inaugural Speeches

Authors: Lily Chimuanya, Ehioghae Esther

Abstract:

Language, a versatile and sophisticated tool, is fundamentally sacrosanct to mankind, especially within the realm of politics. In this dynamic world, political leaders adroitly use language to engage in a strategic show aimed at manipulating or mechanising the opinion of discerning people. This nuanced synergy is marked by different rhetorical strategies, meticulously synced with contextual factors ranging from cultural and ideological to political, to achieve multifaceted persuasive objectives. This study investigates the (de)legitimation strategies inherent in African presidential inaugural speeches, as African leaders not only state their policy agenda through inaugural speeches but also subtly indulge in a dance of legitimation and delegitimation, performing a twofold objective of strengthening the credibility of their administration and, at times, undermining the performance of the past administration. Drawing insights from two different legitimation models and a dataset of 4 African presidential inaugural speeches obtained from authentic websites, the study describes the roles of authorisation, rationalisation, moral evaluation, altruism, and mythopoesis in unmasking the structure of political discourse. The analysis takes a mixed-method approach to unpack the (de)legitimation strategy embedded in the carefully chosen speeches. The focus extends beyond a superficial exploration and delves into the linguistic elements that form the basis of presidential discourse. In conclusion, this examination goes beyond the nuanced landscape of language as a potent tool in politics, with each strategy contributing to the overall rhetorical impact and shaping the narrative. From this perspective, the study argues that presidential inaugural speeches are not only linguistic exercises but also viable weapons that influence perceptions and legitimise authority.

Keywords: CDA, legitimation, inaugural speeches, delegitimation

Procedia PDF Downloads 23

3363 Collaborative Data Refinement for Enhanced Ionic Conductivity Prediction in Garnet-Type Materials

Authors: Zakaria Kharbouch, Mustapha Bouchaara, F. Elkouihen, A. Habbal, A. Ratnani, A. Faik

Abstract:

Solid-state lithium-ion batteries have garnered increasing interest in modern energy research due to their potential for safer, more efficient, and sustainable energy storage systems. Among the critical components of these batteries, the electrolyte plays a pivotal role, with LLZO garnet-based electrolytes showing significant promise. Garnet materials offer intrinsic advantages such as high Li-ion conductivity, wide electrochemical stability, and excellent compatibility with lithium metal anodes. However, optimizing ionic conductivity in garnet structures poses a complex challenge, primarily due to the multitude of potential dopants that can be incorporated into the LLZO crystal lattice. The complexity of material design, influenced by numerous dopant options, requires a systematic method to find the most effective combinations. This study highlights the utility of machine learning (ML) techniques in the materials discovery process to navigate the complex range of factors in garnet-based electrolytes. Collaborators from the materials science and ML fields worked with a comprehensive dataset previously employed in a similar study and collected from various literature sources. This dataset served as the foundation for an extensive data refinement phase, where meticulous error identification, correction, outlier removal, and garnet-specific feature engineering were conducted. This rigorous process substantially improved the dataset's quality, ensuring it accurately captured the underlying physical and chemical principles governing garnet ionic conductivity. The data refinement effort resulted in a significant improvement in the predictive performance of the machine learning model. Originally starting at an accuracy of 0.32, the model underwent substantial refinement, ultimately achieving an accuracy of 0.88. This enhancement highlights the effectiveness of the interdisciplinary approach and underscores the substantial potential of machine learning techniques in materials science research.

Keywords: lithium batteries, all-solid-state batteries, machine learning, solid state electrolytes

Procedia PDF Downloads 29

3362 Properties of Self-Compacting Concrete Mixed with Fly Ash

Authors: Abhinandan Singh Gill, Gurbir Kaur Jawanda

Abstract:

Since the introduction of self-consolidating concrete (SCC) in Japan during the late 1980s, acceptance and usage of this concrete in the construction industry have been steadily gaining momentum. In the United States, the usage of SCC has been spearheaded by the precast concrete industry. Good SCC must possess the following key fresh properties: filling ability, passing ability, and resistance to segregation. Self-compacting concrete is one of 'the most revolutionary developments' in concrete research; this concrete is able to flow and fill the most restricted places of the formwork without vibration. There are several methods for testing its properties in the fresh state; the most frequently used are the slump flow test, the L-box, and the V-funnel. This work presents the properties of self-compacting concrete mixed with fly ash. The test results for acceptance characteristics of self-compacting concrete, such as slump flow, V-funnel, and L-box, are presented. Further, the compressive strength at the ages of 7 and 28 days was also determined, and the results are included here.

Keywords: compressive strength, fly ash, self-compacting concrete, slump flow test, super plasticizer

Procedia PDF Downloads 382

3361 Neuro-Connectivity Analysis Using Abide Data in Autism Study

Authors: Dulal Bhaumik, Fei Jie, Runa Bhaumik, Bikas Sinha

Abstract:

The human brain is an amazingly complex network. Aberrant activities in this network can lead to various neurological disorders such as multiple sclerosis, Parkinson’s disease, Alzheimer’s disease, and autism. fMRI has emerged as an important tool to delineate the neural networks affected by such diseases, particularly autism. In this paper, we propose mixed-effects models together with an appropriate procedure for controlling false discoveries to detect disrupted connectivities in whole-brain studies. Results are illustrated with a large data set known as the Autism Brain Imaging Data Exchange (ABIDE), which includes 361 subjects from 8 medical centers. We believe that our findings adequately address the small-sample inference problem and are thus more reliable for identifying therapeutic targets for intervention. In addition, our results can be used for early detection of subjects who are at high risk of developing neurological disorders.
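
The abstract does not say which false-discovery procedure is used. As a minimal sketch of what such a step can look like, the Benjamini-Hochberg procedure (a widely used method, swapped in here purely for illustration) takes one p-value per connectivity test and flags those that survive at a chosen false discovery rate. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Standard Benjamini-Hochberg step-up rule, used here as an assumed
    stand-in; the paper's own procedure is not specified in the abstract.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Hypothetical p-values, one per region-pair connectivity test.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
flags = benjamini_hochberg(p_vals, alpha=0.05)
print(flags)
```

In a whole-brain study the list would hold one p-value per pair of regions, so the number of tests grows quadratically with the number of regions, which is why some form of multiplicity control is needed.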

Keywords: ABIDE, autism spectrum disorder, fMRI, mixed-effects model

Procedia PDF Downloads 256

3360 Low-Level Modeling for Optimal Train Routing and Scheduling in Busy Railway Stations

Authors: Quoc Khanh Dang, Thomas Bourdeaud’huy, Khaled Mesghouni, Armand Toguyéni

Abstract:

This paper studies a train routing and scheduling problem for busy railway stations. Our objective is to allow trains to be routed in dense areas that are reaching saturation. Unlike traditional methods, which allocate all resources needed to set up a route for a train and hold them until the route is freed, our work focuses on the use of resources as trains progress through the railway node. This technique allows a larger number of trains to be routed simultaneously in a railway node and thus reduces its current saturation. To deal with this problem, this study proposes an abstract model and a mixed-integer linear programming formulation to solve it. The applicability of our method is illustrated on a didactic example.

Keywords: busy railway stations, mixed-integer linear programming, offline railway station management, train platforming, train routing, train scheduling

Procedia PDF Downloads 228

3359 Exploring SSD Suitable Allocation Schemes in Compliance with Workload Patterns

Authors: Jae Young Park, Hwansu Jung, Jong Tae Kim

Abstract:

How well data is parallelized is an important factor in Solid-State Drive (SSD) performance. SSD parallelization is affected by the allocation scheme, which is therefore directly connected to SSD performance. The representative allocation schemes are dynamic allocation and static allocation. Dynamic allocation is more adaptive in exploiting write operation parallelism, while static allocation is better at read operation parallelism. Therefore, it is hard to select the appropriate allocation scheme when the workload mixes read and write operations. We simulated several mixed data patterns and analyzed the results to help make the right choice for better performance. The results show that static allocation is more suitable when the data arrival interval is long enough for prior operations to finish and the workload is continuous and read-intensive. Dynamic allocation performs best for write performance and random data patterns.
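
The difference between the two schemes on writes can be sketched with a toy model. It assumes a common textbook formulation in which static allocation fixes the channel from the logical page number by modulo, while dynamic allocation sends each write to the channel that frees up earliest. The paper's simulator is not described in the abstract, so every name and parameter below is hypothetical.

```python
def static_channel(lpn, num_channels):
    """Static allocation: the channel is fixed by the logical page number."""
    return lpn % num_channels

def dynamic_channel(busy_until):
    """Dynamic allocation: pick the channel that becomes free earliest."""
    return min(range(len(busy_until)), key=lambda ch: busy_until[ch])

def simulate_writes(lpns, num_channels, write_time, policy):
    """Issue all writes at time 0; return the time the last one completes."""
    busy_until = [0.0] * num_channels
    for lpn in lpns:
        if policy == "static":
            ch = static_channel(lpn, num_channels)
        else:
            ch = dynamic_channel(busy_until)
        # Each write queues behind earlier writes on the same channel.
        busy_until[ch] += write_time
    return max(busy_until)

# Page numbers that all map to channel 0 under the static policy.
lpns = [0, 4, 8, 12]
static_done = simulate_writes(lpns, 4, 1.0, "static")
dynamic_done = simulate_writes(lpns, 4, 1.0, "dynamic")
print(static_done, dynamic_done)
```

With this unfavorable write pattern the static policy serializes all four writes on one channel, while the dynamic policy spreads them across all four. The toy model covers writes only and says nothing about the read-side advantage of static allocation reported in the abstract.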

Keywords: dynamic allocation, NAND flash based SSD, SSD parallelism, static allocation

Procedia PDF Downloads 310

3358 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

The issue of imbalance and overlap in the class distribution has become important in various applications of data mining. An imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., the major class) heavily exceeds the number of observations of the other class (i.e., the minor class). An overlapped dataset is one in which many observations are shared between the two classes. Imbalanced and overlapped data can be frequently found in many real examples, including fraud and abuse in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is challenging because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (nonoverlapping, light overlapping, and severe overlapping) and applying a classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and to compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.
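
For readers unfamiliar with the Hausdorff distance mentioned above, the following minimal sketch computes the standard symmetric Hausdorff distance between two finite point sets. How the paper combines this distance with the SVM margin to define the three regions is not stated in the abstract, so the sketch covers only the distance itself; the sample points are made up.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical 2-D observations from a major and a minor class.
major = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
minor = [(3.0, 0.0), (4.0, 0.0)]
d = hausdorff(major, minor)
print(round(d, 4))
```

A small distance between the two classes suggests their observations lie close together, which is the kind of signal an overlap-aware procedure could use.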

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 279

3357 Using Autoencoder as Feature Extractor for Malware Detection

Authors: Umm-E-Hani, Faiza Babar, Hanif Durad

Abstract:

Malware-detecting approaches suffer many limitations, due to which all anti-malware solutions have failed to be reliable enough for detecting zero-day malware. Signature-based solutions depend upon the signatures that can be generated only when malware surfaces at least once in the cyber world. Another approach that works by detecting the anomalies caused in the environment can easily be defeated by diligently and intelligently written malware. Solutions that have been trained to observe the behavior for detecting malicious files have failed to cater to the malware capable of detecting the sandboxed or protected environment. Machine learning and deep learning-based approaches greatly suffer in training their models with either an imbalanced dataset or an inadequate number of samples. AI-based anti-malware solutions that have been trained with enough samples targeted a selected feature vector, thus ignoring the input of leftover features in the maliciousness of malware just to cope with the lack of underlying hardware processing power. Our research focuses on producing an anti-malware solution for detecting malicious PE files by circumventing the earlier-mentioned shortcomings. Our proposed framework, which is based on automated feature engineering through autoencoders, trains the model over a fairly large dataset. It focuses on the visual patterns of malware samples to automatically extract the meaningful part of the visual pattern. Our experiment has successfully produced a state-of-the-art accuracy of 99.54 % over test data.

Keywords: malware, auto encoders, automated feature engineering, classification

Procedia PDF Downloads 49

3356 The Identification of Combined Genomic Expressions as a Diagnostic Factor for Oral Squamous Cell Carcinoma

Authors: Ki-Yeo Kim

Abstract:

Trends in genetics are shifting toward identifying differential coexpression of correlated genes rather than individually significant genes. Moreover, it is known that a combined biomarker pattern improves the discrimination of a specific cancer. The identification of a combined biomarker is also necessary for the early detection of invasive oral squamous cell carcinoma (OSCC). To identify a combined biomarker that could improve the discrimination of OSCC, we explored the appropriate number of genes in a combined gene set in order to attain the highest level of accuracy. After detecting a significant gene set containing the pre-defined number of genes, a combined expression was identified using the weights of the genes in the gene set. We used Principal Component Analysis (PCA) for the weight calculation. In this process, we used three public microarray datasets. One dataset was used for identifying the combined biomarker, and the other two datasets were used for validation. The discrimination accuracy was measured by the out-of-bag (OOB) error. There was no relation between the significance and the discrimination accuracy of each individual gene. The identified gene set included both significant and insignificant genes. One of the most significant gene sets in the classification of normal and OSCC included MMP1, SOCS3 and ACOX1. Furthermore, in the case of oral dysplasia and OSCC discrimination, two combined biomarkers were identified. The combined genomic expression achieved better performance in the discrimination of different conditions than a single significant gene. Therefore, accurate cancer diagnosis may be possible with a combined biomarker.

Keywords: oral squamous cell carcinoma, combined biomarker, microarray dataset, correlated genes

Procedia PDF Downloads 394

3355 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems for African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify user responses as inputs for an interactive voice response system. A dataset of the Wolof words ‘yes’ and ‘no’ is collected as audio recordings. A two-stage data augmentation approach is adopted to enlarge the dataset to the size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. To perform voice response classification, the recordings are transformed into sound frequency feature spectra, and image classification methodology is then applied using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications on both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 89

3354 A Fast, Portable Computational Framework for Aerodynamic Simulations

Authors: Mehdi Ghommem, Daniel Garcia, Nathan Collier, Victor Calo

Abstract:

We develop a fast, user-friendly implementation of a potential flow solver based on the unsteady vortex lattice method (UVLM). The computational framework uses the Python programming language, which integrates easily with routines written in Fortran for computationally expensive operations. The mixed-language approach enables high performance in terms of solution time and high flexibility in terms of ease of adapting the code to different system configurations and applications. This computational tool is intended to predict the unsteady aerodynamic behavior of multiple moving bodies (e.g., flapping wings, rotating blades, suspension bridges) subject to an incoming airflow. We simulate different aerodynamic problems to validate and illustrate the usefulness and effectiveness of the developed computational tool.

Keywords: unsteady aerodynamics, numerical simulations, mixed-language approach, potential flow

Procedia PDF Downloads 270