Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2053

Search results for: forest cover-type dataset

343 Resisting Adversarial Assaults: A Model-Agnostic Autoencoder Solution

Authors: Massimo Miccoli, Luca Marangoni, Alberto Aniello Scaringi, Alessandro Marceddu, Alessandro Amicone

Abstract:

The susceptibility of deep neural networks (DNNs) to adversarial manipulations is a recognized challenge within the computer vision domain. Adversarial examples, crafted by adding subtle yet malicious alterations to benign images, exploit this vulnerability. Various defense strategies have been proposed to safeguard DNNs against such attacks, stemming from diverse research hypotheses. Building upon prior work, our approach involves the utilization of autoencoder models. Autoencoders, a type of neural network, are trained to learn representations of training data and reconstruct inputs from these representations, typically minimizing reconstruction errors like mean squared error (MSE). Our autoencoder was trained on a dataset of benign examples, learning features specific to them. Consequently, when presented with significantly perturbed adversarial examples, the autoencoder exhibited high reconstruction errors. The architecture of the autoencoder was tailored to the dimensions of the images under evaluation. We considered various image sizes, constructing models differently for 256x256 and 512x512 images. Moreover, the choice of the computer vision model is crucial, as most adversarial attacks are designed with specific AI structures in mind. To mitigate this, we proposed a method to replace image-specific dimensions with a structure independent of both dimensions and neural network models, thereby enhancing robustness. Our multi-modal autoencoder reconstructs the spectral representation of images across the red-green-blue (RGB) color channels. To validate our approach, we conducted experiments using diverse datasets and subjected them to adversarial attacks using models such as ResNet50 and ViT_L_16 from the torchvision library. The autoencoder extracted features used in a classification model, resulting in an MSE (RGB) of 0.014, a classification accuracy of 97.33%, and a precision of 99%.
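The reconstruction-error idea in this abstract can be illustrated with a minimal Python sketch. It assumes each image is given as flat lists of pixel values per channel, and it flags an input with a simple threshold on the mean RGB error. The threshold rule, the function names, and the data layout are illustrative assumptions; the abstract itself feeds autoencoder features into a classifier rather than thresholding.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences of numbers."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rgb_reconstruction_error(original, reconstructed):
    """Per-channel MSE plus the mean over the three channels.

    Both arguments are dicts with keys "R", "G", "B" mapping to flat
    lists of pixel values (a hypothetical layout chosen for this sketch).
    """
    errors = {ch: mse(original[ch], reconstructed[ch]) for ch in ("R", "G", "B")}
    errors["RGB"] = (errors["R"] + errors["G"] + errors["B"]) / 3
    return errors


def looks_adversarial(original, reconstructed, threshold):
    """Assumed decision rule: flag the input when the mean RGB error is high."""
    return rgb_reconstruction_error(original, reconstructed)["RGB"] > threshold
```

A benign image that the autoencoder reconstructs well would yield a small `"RGB"` error, while a strongly perturbed input would be expected to exceed the chosen threshold.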

Keywords: adversarial attacks, malicious images detector, binary classifier, multimodal transformer autoencoder

Procedia PDF Downloads 114

342 Climate Species Lists: A Combination of Methods for Urban Areas

Authors: Andrea Gion Saluz, Tal Hertig, Axel Heinrich, Stefan Stevanovic

Abstract:

Higher temperatures, seasonal changes in precipitation, and extreme weather events are increasingly affecting trees. To counteract the growing challenges facing urban trees, strategies are being sought to preserve existing tree populations on the one hand and to prepare for the coming years on the other. One such strategy lies in strategic climate tree species selection. The search is on for species or varieties that can cope with the new climatic conditions. Many efforts in German-speaking countries deal with this in detail, such as the tree lists of the German Conference of Garden Authorities (GALK), the project Stadtgrün 2021, or the instruments of the Climate Species Matrix by Prof. Dr. Roloff. In this context, different methods for a correct species selection are offered. One possibility is to select certain physiological attributes that indicate the climate resilience of a species. To calculate the dissimilarity of the present climate of different geographic regions in relation to the future climate of any city, a weighted (standardized) Euclidean distance (SED) for seasonal climate values is calculated for each region of the Earth. The calculation was performed in the QGIS geographic information system, using global raster datasets on monthly climate values in the 1981-2010 standard period. Data from a European forest inventory were used to identify tree species growing in the calculated analogue climate regions. The inventory used is a compilation of georeferenced point data at a 1 km grid resolution on the occurrence of tree species in 21 European countries. In this project, the results of the methodological application are shown for the city of Zurich for the year 2060. In the first step, analogue climate regions based on projected climate values for the measuring station Kirche Fluntern (ZH) were searched for. In a further step, the methods mentioned above were applied to generate tree species lists for the city of Zurich.
These lists were then qualitatively evaluated with respect to the suitability of the different tree species for the Zurich area to generate a cleaned and thus usable list of possible future tree species.
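The dissimilarity measure named in this abstract can be sketched in Python as follows. The abstract does not give the exact formula, so this uses a common form of the standardized Euclidean distance, in which each climate variable's difference is divided by its standard deviation before squaring. The optional weights, the variable names, and the example values are assumptions made for illustration only.

```python
import math


def standardized_euclidean_distance(future, present, std_devs, weights=None):
    """Weighted standardized Euclidean distance between two climate vectors.

    future   -- projected seasonal climate values for the target city
    present  -- present-day seasonal climate values for a candidate region
    std_devs -- standard deviation of each variable (must be non-zero)
    weights  -- optional per-variable weights (assumed; defaults to 1.0 each)
    """
    if weights is None:
        weights = [1.0] * len(future)
    if not (len(future) == len(present) == len(std_devs) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for f, p, s, w in zip(future, present, std_devs, weights):
        total += w * ((f - p) / s) ** 2
    return math.sqrt(total)
```

Regions with the smallest distance to the projected climate of the target city would be treated as its climate analogues.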

Keywords: climate change, climate region, climate tree, urban tree

Procedia PDF Downloads 109

341 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify subjectively biased language in text at the sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as an upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as a data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), a benchmark dataset compiled from Wikipedia edits that removed various biased instances from sentences, and we use the same corpus to compare our model with existing approaches. Experimental analysis indicates improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 131

340 A Comprehensive Framework for Fraud Prevention and Customer Feedback Classification in E-Commerce

Authors: Samhita Mummadi, Sree Divya Nagalli, Harshini Vemuri, Saketh Charan Nakka, Sumesh K. J.

Abstract:

One of the most significant challenges faced by people in today’s digital era is an alarming increase in fraudulent activities on online platforms. The appeal of online shopping, which avoids long queues in shopping malls and offers a wide variety of products and home delivery of goods, has led to a rapid increase in the number of online shopping platforms. This growth in online shopping and transactions has also created more opportunities for fraudulent users to commit fraud. For instance, consider a store that orders thousands of products all at once: the massive number of items purchased is suspicious, and when the transactions turn out to be fraudulent, the seller suffers a huge loss. Scenarios like these underscore the urgent need to introduce machine learning approaches to combat fraud in online shopping. By leveraging robust algorithms, namely KNN, Decision Trees, and Random Forest, which are highly effective in generating accurate results, this research endeavors to discern patterns indicative of fraudulent behavior within transactional data. The primary aim is to introduce a comprehensive solution that empowers e-commerce administrators to detect and prevent fraud in a timely manner. In addition, sentiment analysis is incorporated in the model so that the e-commerce administrator can respond to customers’ concerns, feedback, and comments and thereby improve the user experience. The ultimate objective of this study is to strengthen online shopping platforms against fraud and ensure a safer shopping experience. This paper reports a model accuracy of 84%. The findings and observations from our work lay the groundwork for future advancements in the development of more resilient and adaptive fraud detection systems, which will become crucial as technologies continue to evolve.

Keywords: behavior analysis, feature selection, fraudulent pattern recognition, imbalanced classification, transactional anomalies

Procedia PDF Downloads 31

339 Determination of Potential Agricultural Lands Using Landsat 8 OLI Images and GIS: Case Study of Gokceada (Imroz) Turkey

Authors: Rahmi Kafadar, Levent Genc

Abstract:

In the present study, the aim was to determine potential agricultural lands (PALs) on Gokceada (Imroz) Island in Canakkale province, Turkey. Seven-band Landsat 8 OLI images acquired on July 12 and August 13, 2013, and their 14-band combination image were used to identify current Land Use Land Cover (LULC) status. Principal Component Analysis (PCA) was applied to the three Landsat datasets in order to reduce the correlation between the bands. A total of six original and PCA images were classified using a supervised classification method to obtain LULC maps with 6 main classes (“Forest”, “Agriculture”, “Water Surface”, “Residential Area-Bare Soil”, “Reforestation” and “Other”). Accuracy assessment was performed by checking the accuracy of 120 randomized points for each LULC map. The best overall accuracy and Kappa statistic values (90.83% and 0.8791, respectively) were found for the PCA image generated from the 14-band combined image, called 3-B/JA. A Digital Elevation Model (DEM) with 15 m spatial resolution (ASTER) was used to consider topographical characteristics. Soil properties were obtained by digitizing 1:25000 scaled soil maps of the General Directorate of Rural Services. Potential Agricultural Lands (PALs) were determined using Geographic Information Systems (GIS). The procedure was applied on the assumption that the “Other” class of the LULC map may be used for agricultural purposes in the future. Overlay analysis was conducted using Slope (S), Land Use Capability Class (LUCC), Other Soil Properties (OSP) and Land Use Capability Sub-Class (SUBC) properties. A total of 901.62 ha within the “Other” class (15798.2 ha) of the LULC map was determined as PALs. These lands were ranked as “Very Suitable”, “Suitable”, “Moderate Suitable” and “Low Suitable”. It was determined that 8.03 ha were classified as “Very Suitable”, 18.59 ha as “Suitable” and 11.44 ha as “Moderate Suitable” for PALs. In addition, 756.56 ha were found to be “Low Suitable”.
The results obtained from this preliminary study can serve as basis for further studies.
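The overall accuracy and Kappa statistic used in this abstract's accuracy assessment can be computed from a confusion matrix, as in the Python sketch below. It uses the standard Cohen's kappa definition; the function name and the square list-of-lists layout (rows as reference classes, columns as mapped classes) are assumptions for illustration, and the abstract's own confusion matrix is not available.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of check points whose reference class is i
    and whose mapped class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of points on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

Kappa is a unitless coefficient with a maximum of 1, whereas overall accuracy is usually reported as a percentage.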

Keywords: digital elevation model (DEM), geographic information systems (GIS), Gokceada (Imroz), Landsat 8 OLI-TIRS, land use land cover (LULC)

Procedia PDF Downloads 354

338 Early Prediction of Diseases in a Cow for Cattle Industry

Authors: Ghufran Ahmed, Muhammad Osama Siddiqui, Shahbaz Siddiqui, Rauf Ahmad Shams Malick, Faisal Khan, Mubashir Khan

Abstract:

In this paper, a machine learning-based approach for early prediction of diseases in cows is proposed. Different ML algorithms are applied to extract useful patterns from the available dataset. Technology has changed today’s world in every aspect of life. Similarly, advanced technologies have been developed in livestock and dairy farming to monitor dairy cows in various aspects. Dairy cattle monitoring is crucial as it plays a significant role in milk production around the globe. Moreover, it has become necessary for farmers to adopt the latest early prediction technologies as food demand is increasing with population growth. This highlights the importance of state-of-the-art technologies in analyzing dairy cows’ activities. It is not easy to predict the activities of a large number of cows on a farm, so the system makes this very convenient for farmers, as it provides all the solutions under one roof. The cattle industry’s productivity is boosted because any disease on a cattle farm is diagnosed early and hence treated early. This is done on the basis of the machine learning output received. The learning models, which interpret the data collected in a centralized system, are already set up. We run different algorithms on the data set received to analyze milk quality and track cows’ health, location, and safety. The deep learning algorithm draws patterns from the data, which makes it easier for farmers to study any animal’s behavioral changes. With the emergence of machine learning algorithms and the Internet of Things, accurate tracking of animals is possible as the rate of error is minimized. As a result, milk productivity is increased. IoT with ML capability has opened a new phase for the cattle farming industry by increasing the yield in the most cost-effective and time-saving manner.

Keywords: IoT, machine learning, health care, dairy cows

Procedia PDF Downloads 73

337 Analysis of Biomarkers Intractable Epileptogenic Brain Networks with Independent Component Analysis and Deep Learning Algorithms: A Comprehensive Framework for Scalable Seizure Prediction with Unimodal Neuroimaging Data in Pediatric Patients

Authors: Bliss Singhal

Abstract:

Epilepsy is a prevalent neurological disorder affecting approximately 50 million individuals worldwide and 1.2 million Americans. There exist millions of pediatric patients with intractable epilepsy, a condition in which seizures fail to come under control. The occurrence of seizures can result in physical injury, disorientation, unconsciousness, and additional symptoms that could impede children's ability to participate in everyday tasks. Predicting seizures can help parents and healthcare providers take precautions, prevent risky situations, and mentally prepare children to minimize anxiety and nervousness associated with the uncertainty of a seizure. This research proposes a comprehensive framework to predict seizures in pediatric patients by evaluating machine learning algorithms on unimodal neuroimaging data consisting of electroencephalogram signals. Bandpass filtering and independent component analysis proved to be effective in reducing the noise and artifacts in the dataset. The performance of various machine learning algorithms is evaluated on important metrics such as accuracy, precision, specificity, sensitivity, F1 score and MCC. The results show that the deep learning algorithms are more successful in predicting seizures than logistic regression and k-nearest neighbors. The recurrent neural network (RNN) gave the highest precision and F1 score, long short-term memory (LSTM) outperformed RNN in accuracy, and the convolutional neural network (CNN) resulted in the highest specificity. This research has significant implications for healthcare providers in proactively managing seizure occurrence in pediatric patients, potentially transforming clinical practices, and improving pediatric care.
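The evaluation metrics listed in this abstract all follow from the four counts of a binary confusion matrix. The Python sketch below uses the standard definitions; the function names and the convention of returning 0.0 when a denominator is zero are assumptions made for this illustration.

```python
import math


def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, sensitivity, specificity, F1 and MCC.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives.
    """
    precision = _safe_div(tp, tp + fp)
    sensitivity = _safe_div(tp, tp + fn)  # also called recall
    specificity = _safe_div(tn, tn + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _safe_div(2 * precision * sensitivity, precision + sensitivity),
        "mcc": _safe_div(tp * tn - fp * fn, mcc_denominator),
    }
```

Because the metrics weight errors differently, different models can lead on different metrics, as the abstract reports for the RNN, LSTM and CNN.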

Keywords: intractable epilepsy, seizure, deep learning, prediction, electroencephalogram channels

Procedia PDF Downloads 85

336 Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data

Authors: Gayathri Nagarajan, L. D. Dhinesh Babu

Abstract:

Health care is one of the prominent industries that generate voluminous data, creating a need for machine learning techniques combined with big data solutions for efficient processing and prediction. Missing data, incomplete data, real-time streaming data, sensitive data, privacy, and heterogeneity are a few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are directly related to human life. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks have proved to be effective for different applications, largely depending on the characteristics of the datasets. In this paper, we present a framework that uses the ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on the Spark platform, which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metrics chosen for comparison are the misclassification error rate and the run time of the algorithms. The goals of this paper are to i) compare the performance of gradient boosted trees with other machine learning techniques on the Spark platform specifically for health care big data and ii) discuss the results from the experiments conducted on datasets of different characteristics, thereby drawing inferences and conclusions.
The experimental results show that, for the other machine learning techniques, accuracy is largely dependent on the characteristics of the datasets, whereas gradient boosted trees yield reasonably stable results in terms of accuracy without depending heavily on the dataset characteristics.

Keywords: big data analytics, ensemble machine learning, gradient boosted trees, Spark platform

Procedia PDF Downloads 241

335 Training a Neural Network to Segment, Detect and Recognize Numbers

Authors: Abhisek Dash

Abstract:

This study used three neural networks, one for number segmentation, one for number detection and one for number recognition, all of which are coupled to one another. All networks were convolutional and were trained on the MNIST dataset. It was assumed that the images had a lighter background and a darker foreground. The segmentation network took 28x28 images as input and had sixteen outputs. Segmentation training starts when a dark pixel is encountered. Taking a 7x7 window over that pixel as the focus, the eight-neighborhood of the focus was checked for further dark pixels. The segmentation network was then trained to move in those directions which had dark pixels. To this end, the segmentation network had 16 outputs. They were arranged as “go east”, “don’t go east”, “go south east”, “don’t go south east”, “go south”, “don’t go south” and so on with respect to the focus window. The focus window was resized into a 28x28 image and the network was trained to consider those neighborhoods which had dark pixels. The neighborhoods which had dark pixels were pushed into a queue in a particular order. The neighborhoods were then popped one at a time, stitched to the existing partial image of the number, and the network was trained on which neighborhoods to consider when the new partial image was presented. The above process was repeated until the image was fully covered by the 7x7 neighborhoods and there were no more uncovered dark pixels. During testing, the network scans the image and looks for the first dark pixel. From there on, the network predicts which neighborhoods to consider and segments the image. After this step, the group of neighborhoods is passed into the detection network. The detection network took 28x28 images as input and had two outputs denoting whether a number was detected or not. Since the ground truth of the bounds of a number was known during training, the detection network output “number not found” until the bounds were met and “number found” once they were.
The recognition network was a standard CNN that also took 28x28 images and had 10 outputs for recognition of numbers from 0 to 9. This network was activated only when the detection network voted in favor of a number being detected. The above methodology could segment connected and overlapping numbers. Additionally, the recognition unit was only invoked when a number was detected, which minimized false positives. It also eliminated the need for rules of thumb, as segmentation is learned. The strategy can also be extended to other characters.
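The sixteen-output labeling scheme described in this abstract can be sketched in Python for a single focus window. The sketch builds the "go" / "don't go" target pairs by checking whether each of the eight neighboring 7x7 windows contains a dark pixel. The abstract only names the first three directions (east, south east, south), so the clockwise order used for the remaining five is an assumption, as are the function names and the convention that a truthy pixel value means "dark".

```python
# (row offset, column offset) for E, SE, S, SW, W, NW, N, NE.
# Only the first three are named in the abstract; the rest are assumed clockwise.
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def window_has_dark(image, center_row, center_col, size=7):
    """Return True if the size x size window around the center has a dark pixel.

    Parts of the window that fall outside the image are ignored.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    for r in range(max(0, center_row - half), min(rows, center_row + half + 1)):
        for c in range(max(0, center_col - half), min(cols, center_col + half + 1)):
            if image[r][c]:
                return True
    return False


def direction_targets(image, focus_row, focus_col, size=7):
    """Build the 16 training targets: a (go, don't go) pair per direction."""
    targets = []
    for d_row, d_col in DIRECTIONS:
        neighbor_row = focus_row + d_row * size
        neighbor_col = focus_col + d_col * size
        dark = window_has_dark(image, neighbor_row, neighbor_col, size)
        targets.extend([1, 0] if dark else [0, 1])
    return targets
```

Neighboring windows whose "go" target is 1 would then be pushed into the queue and visited in turn, as the abstract describes.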

Keywords: convolutional neural networks, OCR, text detection, text segmentation

Procedia PDF Downloads 163

334 An Experimental Machine Learning Analysis on Adaptive Thermal Comfort and Energy Management in Hospitals

Authors: Ibrahim Khan, Waqas Khalid

Abstract:

The healthcare sector is known to account for a high proportion of total energy consumption in the HVAC market, owing to the extensive cooling and heating required to maintain human thermal comfort indoors for patients undergoing treatment in hospital wards, rooms, and intensive care units. The indoor thermal comfort conditions in selected hospitals of Islamabad, Pakistan, were measured on a real-time basis with the collection of first-hand experimental data using calibrated sensors measuring ambient temperature, wet bulb globe temperature, relative humidity, air velocity, light intensity and CO2 levels. The experimental data recorded were analyzed in conjunction with thermal comfort questionnaire surveys, in which the participants, including patients, doctors, nurses, and hospital staff, were assessed based on their thermal sensation, acceptability, preference, and comfort responses. The recorded dataset, including experimental and survey-based responses, was further analyzed to develop a correlation between operative temperature, operative relative humidity, and other measured operative parameters and the predicted mean vote and adaptive predicted mean vote, with the adaptive temperature and adaptive relative humidity estimated using the seasonal data set gathered for both summer – hot and dry, and hot and humid as well as winter – cold and dry, and cold and humid climate conditions. A logistic regression machine learning algorithm was trained on the operative experimental parameters to develop a correlation between patient sensations and the thermal environmental parameters, from which a new ML-based adaptive thermal comfort model was proposed and developed in our study. Finally, the accuracy of our model was determined using K-fold cross-validation.
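The K-fold cross-validation step mentioned at the end of this abstract can be sketched in plain Python as follows. The `fit` and `predict` callables are hypothetical stand-ins for whatever model is being assessed (for example, a logistic regression), and the contiguous, unshuffled folds are an assumption made to keep the sketch short.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns a list of (train_indices, test_indices) pairs.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = []
    start = 0
    for i in range(k):
        # The first n_samples % k folds get one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def cross_validate(features, labels, k, fit, predict):
    """Mean accuracy over k folds for a user-supplied fit/predict pair."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

Each sample is used for testing exactly once, so the averaged score reflects performance on data the model did not see during training.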

Keywords: predicted mean vote, thermal comfort, energy management, logistic regression, machine learning

Procedia PDF Downloads 63

333 Feature Engineering Based Detection of Buffer Overflow Vulnerability in Source Code Using Deep Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

One of the most important challenges in the field of software code audit is the presence of vulnerabilities in software source code. Every year, more and more software flaws are found, either internally in proprietary code or revealed publicly. These flaws are highly likely to be exploited and can lead to system compromise, data leakage, or denial of service. C and C++ open-source code is now available for creating a large-scale machine learning system for function-level vulnerability identification. We assembled a sizable dataset of millions of open-source functions that point to potential exploits. We developed an efficient and scalable vulnerability detection method based on deep neural network models that learn features extracted from the source code. The source code is first converted into a minimal intermediate representation to remove the pointless components and shorten the dependency. Moreover, we keep the semantic and syntactic information using state-of-the-art word embedding algorithms such as GloVe and fastText. The embedded vectors are subsequently fed into deep learning networks such as LSTM, BiLSTM, LSTM-Autoencoder, word2vec, BERT, and GPT-2 to classify the possible vulnerabilities. Furthermore, we proposed a neural network model which can overcome issues associated with traditional neural networks. Evaluation metrics such as F1 score, precision, recall, accuracy, and total execution time have been used to measure the performance. We made a comparative analysis between results derived from features containing a minimal text representation and features containing semantic and syntactic information. We found that all of the deep learning models provide comparatively higher accuracy when we use semantic and syntactic information as the features, but they require higher execution time, as the word embedding algorithm adds some complexity to the overall system.

Keywords: cyber security, vulnerability detection, neural networks, feature extraction

Procedia PDF Downloads 91

332 Investigate the Side Effects of Patients With Severe COVID-19 and Choose the Appropriate Medication Regimens to Deal With Them

Authors: Rasha Ahmadi

Abstract:

In December 2019, a coronavirus, currently identified as SARS-CoV-2, produced a series of acute atypical respiratory illnesses in Wuhan, Hubei Province, China. The sickness induced by this virus was named COVID-19. The virus is transmittable between humans and has caused a worldwide pandemic. The death toll continues to climb, and a huge number of countries have been obliged to impose social isolation and lockdowns. The lack of focused therapy continues to be a problem. Epidemiological research showed that senior patients were more susceptible to severe disease, whereas children tend to have milder symptoms. In this study, we focus on other possible side effects of COVID-19 and more detailed treatment strategies. Using bioinformatics analysis, we first isolated the gene expression profile of patients with severe COVID-19 from the GEO database. Patients' blood samples were used in the GSE183071 dataset. We then categorized the genes with high and low expression. In the next step, we uploaded the genes separately to the Enrichr database and evaluated our data for signs and symptoms as well as related medication regimens. The results showed that 138 genes with high expression and 108 genes with low expression were differentially expressed in the severe COVID-19 group vs. the control group. Symptoms and diseases such as embolism and thrombosis of the abdominal aorta, ankylosing spondylitis, suicidal ideation or attempt, and regional enteritis were associated with the high-expression genes, while acute and subacute forms of ischemic heart disease, CNS infection and poliomyelitis, and synovitis and tenosynovitis were associated with the low-expression genes. Following the detection of diseases and possible signs and symptoms, Carmustine, Bithionol, and Leflunomide were evaluated more significantly for high-expression genes and Chlorambucil, Ifosfamide, Hydroxyurea, and Bisphenol for low-expression genes.
In general, examining the different and invisible aspects of COVID-19 and identifying possible treatments can help us significantly in the emergency and hospitalization of patients.

Keywords: phenotypes, drug regimens, gene expression profiles, bioinformatics analysis, severe COVID-19

Procedia PDF Downloads 143

331 An Ecological Systems Approach to Risk and Protective Factors of Sibling Conflict for Children in the United Kingdom

Authors: C. A. Bradley, D. Patsios, D. Berridge

Abstract:

This paper presents evidence to better understand the risk and protective factors related to sibling conflict and the patterns of association between sibling conflict and negative adjustment outcomes by incorporating additional familial and societal factors within statistical models of risk and adjustment. It was conducted through the secondary analysis of a large representative cross-sectional dataset of children in the UK. The original study includes proxy interviews for young children and self-report interviews for adolescents. The study applies an ecological systems framework for the analyses. Hierarchical regression models assess risk and protective factors and adjustment outcomes associated with sibling conflict. Interactions reveal differential effects between contextual risk factors and the social context of influence. The general pattern of findings suggested that, although factors affecting the likelihood of experiencing sibling conflict were often determined by child age, some remained consistent across childhood. These factors were often conditional on each other, reinforcing the importance of an ecological framework. Across both age groups, sibling conflict was associated with siblings closer in age; male sibling groups; the most advantaged socio-economic group; and exposure to community violence, such as witnessing violent assault or robbery. The study develops the evidence base on the influence of ethnicity and socio-economic group on sibling conflict by exploring interactions between social contexts. It also identifies key new areas of influence, such as family structure, disability, and community violence, in exacerbating or reducing the risk of conflict.
The study found negative associations between sibling conflict and young children’s mental well-being and adolescents' mental well-being and anti-social behaviour, but also more context specific associations – such as sibling conflict moderating the negative impact of adversity and high risk experiences for young children such as parental violence toward the child.

Keywords: adjustment, conflict, ecological systems, family systems, risk and protective factors, sibling

Procedia PDF Downloads 108

330 Understanding the Fundamental Driver of Semiconductor Radiation Tolerance with Experiment and Theory

Authors: Julie V. Logan, Preston T. Webster, Kevin B. Woller, Christian P. Morath, Michael P. Short

Abstract:

Semiconductors, as the base of critical electronic systems, are exposed to damaging radiation while operating in space, nuclear reactors, and particle accelerator environments. What innate property allows some semiconductors to sustain little damage while others accumulate defects rapidly with dose is, at present, poorly understood. This limits the extent to which radiation tolerance can be implemented as a design criterion. To address this problem of determining the driver of semiconductor radiation tolerance, the first step is to generate a dataset of the relative radiation tolerance of a large range of semiconductors (exposed to the same radiation damage and characterized in the same way). To accomplish this, Rutherford backscatter channeling experiments are used to compare the displaced lattice atom buildup in InAs, InP, GaP, GaN, ZnO, MgO, and Si as a function of step-wise alpha particle dose. With this experimental information on radiation-induced incorporation of interstitial defects in hand, hybrid density functional theory electron densities (and their derived quantities) are calculated, and their gradient and Laplacian are evaluated to obtain key fundamental information about the interactions in each material. It is shown that simple, undifferentiated values (which are typically used to describe bond strength) are insufficient to predict radiation tolerance. Instead, the curvature of the electron density at bond critical points provides a measure of radiation tolerance consistent with the experimental results obtained. This curvature and the associated forces surrounding bond critical points disfavor localization of displaced lattice atoms at these points, favoring their diffusion toward perfect lattice positions. With this criterion to predict radiation tolerance, simple density functional theory simulations can be conducted on potential new materials to gain insight into how they may operate in demanding high radiation environments.

Keywords: density functional theory, GaN, GaP, InAs, InP, MgO, radiation tolerance, Rutherford backscatter channeling

Procedia PDF Downloads 174

329 Radar Fault Diagnosis Strategy Based on Deep Learning

Authors: Bin Feng, Zhulin Zong

Abstract:

Radar systems are critical in modern military, aviation, and maritime operations, and their proper functioning is essential for the success of these operations. However, due to the complexity and sensitivity of radar systems, they are susceptible to various faults that can significantly affect their performance. Traditional radar fault diagnosis strategies rely on expert knowledge and rule-based approaches, which are often limited in effectiveness and require a lot of time and resources. Deep learning has recently emerged as a promising approach for fault diagnosis due to its ability to automatically learn features and patterns from large amounts of data. In this paper, we propose a radar fault diagnosis strategy based on deep learning that can accurately identify and classify faults in radar systems. Our approach uses convolutional neural networks (CNNs) to extract features from radar signals and to classify faults based on those features. The proposed strategy is trained and validated on a dataset of measured radar signals with various types of faults. The results show that it achieves high accuracy in fault diagnosis. To further evaluate the effectiveness of the proposed strategy, we compare it with traditional rule-based approaches and other machine learning-based methods, including decision trees, support vector machines (SVMs), and random forests. The results demonstrate that our deep learning-based approach outperforms the traditional approaches in terms of accuracy and efficiency. Finally, we discuss the potential applications and limitations of the proposed strategy, as well as future research directions. Our study highlights the importance and potential of deep learning for radar fault diagnosis and suggests that deep learning can be a valuable tool for improving the performance and reliability of radar systems.
In summary, this paper presents a radar fault diagnosis strategy based on deep learning that achieves high accuracy and efficiency in identifying and classifying faults in radar systems. The proposed strategy has significant potential for practical applications and can pave the way for further research.

Keywords: radar system, fault diagnosis, deep learning, radar fault

Procedia PDF Downloads 92

328 Machine Learning Techniques to Predict Cyberbullying and Improve Social Work Interventions

Authors: Oscar E. Cariceo, Claudia V. Casal

Abstract:

Machine learning offers a set of techniques to promote social work interventions and can support practitioners' decisions by predicting new behaviors based on data produced by organizations, service agencies, users, clients, or individuals. Machine learning techniques include a set of generalizable algorithms that are data-driven, which means that rules and solutions are derived by examining data, based on the patterns that are present within any data set. In other words, the goal of machine learning is to teach computers through 'examples', using training data to test specific hypotheses, to predict a certain outcome based on a current scenario, and to improve with that experience. Machine learning can be classified into two general categories depending on the nature of the problem that the technique needs to tackle. First, supervised learning involves a dataset whose outputs are already known. Supervised learning problems are categorized into regression problems, which involve predicting a quantitative outcome using a continuous function, and classification problems, which seek to predict discrete, qualitative outcomes. For social work research, machine learning generates predictions as a key element in improving social interventions on complex social issues by providing better inference from data and establishing more precise estimated effects, for example in services that seek to improve their outcomes. This paper presents the results of a classification algorithm to predict cyberbullying among adolescents. Data were retrieved from the National Polyvictimization Survey conducted by the government of Chile in 2017. A logistic regression model was created to predict whether an adolescent would experience cyberbullying based on the interaction and behavior of gender, age, grade, type of school, and self-esteem sentiments.
The model can predict with an accuracy of 59.8% if an adolescent will suffer cyberbullying. These results can help to promote programs to avoid cyberbullying at schools and improve evidence based practice.
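
The classification step described in this abstract can be illustrated with a minimal sketch of logistic regression and the accuracy metric. This is not the authors' model: the single centered feature, the toy labels, the learning rate, and the function names `fit_logistic`, `predict`, and `accuracy` are all assumptions made for illustration, and the values are synthetic rather than taken from the National Polyvictimization Survey.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return class 1 when the predicted probability reaches the threshold."""
    return [
        int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
        for xi in X
    ]


def accuracy(y_true, y_pred):
    """Share of cases whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic, centered toy feature (hypothetical; NOT survey data).
X_toy = [[-2.0], [-1.0], [1.0], [2.0]]
y_toy = [0, 0, 1, 1]
w_toy, b_toy = fit_logistic(X_toy, y_toy)
print(accuracy(y_toy, predict(X_toy, w_toy, b_toy)))
```

An accuracy figure such as the one reported in the abstract is this same ratio of correct predictions to total cases, ideally computed on held-out data rather than on the training set as in this toy example.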

Keywords: cyberbullying, evidence based practice, machine learning, social work research

Procedia PDF Downloads 169

327 Computational Fluid Dynamics Simulations of Air Pollutant Dispersion: Validation of Fire Dynamics Simulator Against the CUTE Experiments of the COST ES1006 Action

Authors: Virginie Hergault, Siham Chebbah, Bertrand Frere

Abstract:

Following in-house objectives, the Central Laboratory of the Paris Police Prefecture (LCPP) conducted a general review of models and Computational Fluid Dynamics (CFD) codes used to simulate pollutant dispersion in the atmosphere. Starting from that review and considering the main features of Large Eddy Simulation, LCPP postulates that the Fire Dynamics Simulator (FDS) model, from the National Institute of Standards and Technology (NIST), should be well suited for air pollutant dispersion modeling. This paper focuses on the implementation and evaluation of FDS in the frame of the European COST ES1006 Action, which aimed at quantifying the performance of modeling approaches. In this paper, the CUTE dataset, collected in the city of Hamburg and on its mock-up, has been used. We have compared FDS results with wind tunnel measurements from the CUTE trials on the one hand and, on the other, with the results of the models involved in the COST Action. The most time-consuming part of creating input data for simulations is the transfer of obstacle geometry information to the format required by FDS. Thus, we have developed Python codes to automatically convert building and topographic data to the FDS input file. In order to evaluate the predictions of FDS against observations, statistical performance measures have been used. These metrics include the fractional bias (FB), the normalized mean square error (NMSE), and the fraction of predictions within a factor of two of observations (FAC2). As with the CFD models tested in the COST Action, FDS results demonstrate good agreement with measured concentrations. Furthermore, the metrics assessment indicates that FB and NMSE meet the acceptable tolerance.
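
The three statistical performance measures named in this abstract can be sketched as follows. The formulas are the commonly used definitions for dispersion-model evaluation; the sign convention for FB (observed minus predicted), the handling of zero observations in FAC2, and the function names are assumptions, since the abstract does not spell them out.

```python
def fractional_bias(obs, pred):
    """FB = (mean_obs - mean_pred) / (0.5 * (mean_obs + mean_pred))."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    return (mean_o - mean_p) / (0.5 * (mean_o + mean_p))


def nmse(obs, pred):
    """NMSE = mean((obs - pred)^2) / (mean_obs * mean_pred)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    mse = sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)
    return mse / (mean_o * mean_p)


def fac2(obs, pred):
    """Fraction of pairs with 0.5 <= pred / obs <= 2.0.

    Pairs with a zero observation are counted as misses here (assumption).
    """
    hits = sum(1 for o, p in zip(obs, pred) if o > 0 and 0.5 <= p / o <= 2.0)
    return hits / len(obs)
```

For a perfect model, FB and NMSE are 0 and FAC2 is 1; with this sign convention a negative FB indicates overprediction on average.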

Keywords: numerical simulations, atmospheric dispersion, COST ES1006 Action, CFD model, CUTE experiments, wind tunnel data, numerical results

Procedia PDF Downloads 136

326 Advancing Phenological Understanding of Plants/Trees Through Phenocam Digital Time-lapse Images

Authors: Siddhartha Khare, Suyash Khare

Abstract:

Phenology, a crucial discipline in ecology, offers insights into the seasonal dynamics of organisms within natural ecosystems and the underlying environmental triggers. Leveraging the potent capabilities of digital repeat photography, PhenoCams capture invaluable data on the phenology of crops, plants, and trees. These cameras yield digital imagery in Red Green Blue (RGB) color channels, and some advanced systems even incorporate Near Infrared (NIR) bands. This study presents compelling case studies employing PhenoCam technology to unravel the phenology of black spruce trees. Through the analysis of RGB color channels, a range of essential color metrics including red chromatic coordinate (RCC), green chromatic coordinate (GCC), blue chromatic coordinate (BCC), vegetation contrast index (VCI), and excess green index (ExGI) are derived. These metrics illuminate variations in canopy color across seasons, shedding light on bud and leaf development. This, in turn, facilitates a deeper understanding of phenological events and aids in delineating the growth periods of trees and plants. The initial phase of this study addresses critical questions surrounding the fidelity of continuous canopy greenness records in representing bud developmental phases. Additionally, it discerns which color-based index most accurately tracks the seasonal variations in tree phenology within evergreen forest ecosystems. The subsequent section of this study delves into the transition dates of black spruce (Picea mariana (Mill.) B.S.P.) phenology. This is achieved through a fortnightly comparative analysis of the MODIS normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI). By employing PhenoCam technology and leveraging advanced color metrics, this study significantly advances our comprehension of black spruce tree phenology, offering valuable insights for ecological research and management.
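
As an illustration of the color metrics mentioned in this abstract, the chromatic coordinates and the excess green index can be computed from mean red, green, and blue digital numbers of a region of interest. This is a minimal sketch using the widely used definitions; the function name, the dictionary keys, and the input convention (one mean value per channel) are assumptions, and the vegetation contrast index is left out because its definition is not given here.

```python
def chromatic_indices(r, g, b):
    """Return RCC, GCC, BCC and excess green for mean channel values.

    r, g, b are mean digital numbers of the region of interest (assumed input).
    """
    total = r + g + b
    if total == 0:
        raise ValueError("R + G + B must be non-zero")
    return {
        "RCC": r / total,        # red chromatic coordinate
        "GCC": g / total,        # green chromatic coordinate
        "BCC": b / total,        # blue chromatic coordinate
        "ExG": 2 * g - (r + b),  # excess green index
    }
```

Tracking GCC over a time series of images is one way to follow seasonal changes in canopy greenness.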

Keywords: phenology, remote sensing, phenocam, color metrics, NDVI, GCC

Procedia PDF Downloads 60

325 Exploring the Applications of Neural Networks in the Adaptive Learning Environment

Authors: Baladitya Swaika, Rahul Khatry

Abstract:

Computer Adaptive Tests (CATs) are one of the most efficient ways of testing the cognitive abilities of students. CATs are based on Item Response Theory (IRT), which relies on item selection and ability estimation using statistical methods: maximum information selection or selection from the posterior for item selection, and maximum-likelihood (ML) or maximum a posteriori (MAP) estimators for ability estimation. This study aims at combining classical and Bayesian approaches to IRT to create a dataset, which is then fed to a neural network that automates the process of ability estimation, and at comparing the result to traditional CAT models designed using IRT. This study uses Python as the base coding language, pymc for statistical modelling of the IRT, and scikit-learn for neural network implementations. On creation of the model and on comparison, it is found that the neural network-based model performs 7-10% worse than the IRT model for score estimation. Although it performs poorly compared to the IRT model, the neural network model can be beneficially used in back-ends to reduce time complexity, as the IRT model would have to re-calculate the ability every time it gets a request, whereas the prediction from a neural network can be done in a single step with an existing trained regressor. This study also proposes a new kind of framework whereby the neural network model could be used to incorporate feature sets other than the normal IRT feature set and use a neural network’s capacity for learning unknown functions to give rise to better CAT models. Categorical features like test type could be learnt and incorporated in IRT functions with the help of techniques like logistic regression, yielding learned functions and models that may not be trivial to express via equations. This kind of framework, when implemented, would be highly advantageous in psychometrics and cognitive assessments.
This study gives a brief overview as to how neural networks can be used in adaptive testing, not only by reducing time-complexity but also by being able to incorporate newer and better datasets which would eventually lead to higher quality testing.
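
The maximum information item selection mentioned in this abstract can be sketched under the assumption of a two-parameter logistic (2PL) IRT model; the abstract does not state which IRT model was used, so the model choice, the `(a, b)` item representation, and the function names are illustrative only.

```python
import math


def prob_correct(theta, a, b):
    """2PL item response function: P(correct | ability theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_item(theta, items, administered=()):
    """Pick the not-yet-administered item with maximum information at theta.

    items is a list of (discrimination a, difficulty b) tuples (assumed format).
    """
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))
```

In a traditional CAT loop, the ability estimate is updated after every response (for example by ML or MAP) and `select_item` is called again, which is the repeated computation that the abstract proposes to replace with a single forward pass of a trained regressor.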

Keywords: computer adaptive tests, item response theory, machine learning, neural networks

Procedia PDF Downloads 176

324 Ecosystem Services and Human Well-Being: Case Study of Tiriya Village, Bastar India

Authors: S. Vaibhav Kant Sahu, Surabhi Bipin Seth

Abstract:

Human well-being has multiple constituents, including the basic material for a good life, freedom and choice, health, good social relations, and security. Poverty is also multidimensional and has been defined as the pronounced deprivation of well-being. The Dhurwa tribe of Bastar (India) has a symbiotic relationship with nature, which provides provisioning services such as food, fuel, and fiber; regulating services such as climate regulation; and non-material benefits such as spiritual or aesthetic benefits. The tribe has managed its forest for ages. The demand for ecosystem services is now so great that trade-offs among services have become the rule. The aim of the study was to explore evidence for linkages between ecosystem services and the well-being of the indigenous community, how much these services help in poverty reduction, and the interaction between them. The objective of the study was to find drivers of change and evidence concerning the link between ecosystems, human development, and sustainability, and to examine whether decision making opts for multi-sectoral objectives. This means taking human well-being as the central focus for assessment, while recognizing that biodiversity and ecosystems also have intrinsic value. Ecosystem changes that may have little impact on human well-being over days or weeks may have pronounced impacts over years or decades, so assessments need to be conducted at spatial and temporal scales, and across social, political, and economic scales, to obtain high-resolution data. The researchers used the framework developed by the Millennium Ecosystem Assessment, since human actions now alter ecosystems, directly or unknowingly. An ethnographic study was used to obtain primary qualitative data, and secondary data were collected from the panchayat office. The responses were transcribed and translated into English, as the interviews were held in Hindi and the local indigenous language. Focus group discussions were held with a group of 10 women at Tiriya village.
The researchers concluded that well-being is affected not just by the gap in ecosystem service supply but also by increased vulnerability. Decisions can have consequences external to the decision framework; these consequences are called externalities because they are not part of the decision-making calculus.

Keywords: Bastar, Dhurwa tribe, ecosystem services, millennium ecosystem assessment, sustainability

Procedia PDF Downloads 302

323 Habitat Suitability, Genetic Diversity and Population Structure of Two Sympatric Fruit Bat Species Reveal the Need of an Urgent Conservation Action

Authors: Mohamed Thani Ibouroi, Ali Cheha, Claudine Montgelard, Veronique Arnal, Dawiyat Massoudi, Guillelme Astruc, Said Ali Ousseni Dhurham, Aurelien Besnard

Abstract:

The Livingstone's flying fox (Pteropus livingstonii) and the Comorian fruit bat (P. seychellensis comorensis) are two endemic fruit bat species among the most threatened animals of the Comoros archipelago. Despite their role, like all flying fox species, as important ecosystem service providers through pollination and seed dispersal, little is known about their ecology, population genetics, and population structure, which makes the development of evidence-based conservation strategies difficult. In this study, we assess the spatial distribution and ecological niche of both species using Species Distribution Modeling (SDM) based on the recent Ensemble of Small Models (ESMs) approach using presence-only data. Population structure and genetic diversity of the two species were assessed using both mitochondrial and microsatellite markers based on non-invasive genetic samples. Our ESMs highlight a clear niche partitioning of the two sympatric species. Livingstone’s flying fox has a very limited distribution, restricted to steep slopes of natural forests at high elevation. On the contrary, the Comorian fruit bat has a relatively large geographic range spread over low elevations in farmlands and villages. Our genetic analysis shows low genetic diversity for both fruit bat species. It also shows that the Livingstone’s flying fox populations of the two islands are genetically isolated, while no evidence of genetic differentiation was detected for the Comorian fruit bat between islands. Our results support the idea that natural habitat loss, especially natural forest loss and fragmentation, is an important factor impacting the distribution of the Livingstone’s flying fox by limiting its foraging area and reducing its potential roosting sites. On the contrary, the Comorian fruit bat seems to be favored by human activities, probably because its diet is less specialized.
From this study, we conclude that the Livingstone’s flying fox and its habitat are of high priority in terms of conservation at the scale of the Comoros archipelago.

Keywords: Comoros islands, ecological niche, habitat loss, population genetics, fruit bats, conservation biology

Procedia PDF Downloads 268

322 Healthcare-SignNet: Advanced Video Classification for Medical Sign Language Recognition Using CNN and RNN Models

Authors: Chithra A. V., Somoshree Datta, Sandeep Nithyanandan

Abstract:

Sign Language Recognition (SLR) is the process of interpreting and translating sign language into spoken or written language using technological systems. It involves recognizing hand gestures, facial expressions, and body movements that make up sign language communication. The primary goal of SLR is to facilitate communication between hearing- and speech-impaired communities and those who do not understand sign language. Due to increased awareness and greater recognition of the rights and needs of the hearing- and speech-impaired community, sign language recognition has gained significant importance over the past 10 years. Technological advancements in the fields of Artificial Intelligence and Machine Learning have made it more practical and feasible to create accurate SLR systems. This paper presents a distinct approach to SLR by framing it as a video classification problem using Deep Learning (DL), whereby a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) has been used. This research targets the integration of sign language recognition into healthcare settings, aiming to improve communication between medical professionals and patients with hearing impairments. The spatial features from each video frame are extracted using a CNN, which captures essential elements such as hand shapes, movements, and facial expressions. These features are then fed into an RNN that learns the temporal dependencies and patterns inherent in sign language sequences. The INCLUDE dataset has been enhanced with more videos from the healthcare domain, and the model is evaluated on this enhanced dataset. Our model achieves 91% accuracy, representing state-of-the-art performance in this domain. The results highlight the effectiveness of treating SLR as a video classification task with the CNN-RNN architecture.
This approach not only improves recognition accuracy but also offers a scalable solution for real-time SLR applications, significantly advancing the field of accessible communication technologies.

Keywords: sign language recognition, deep learning, convolution neural network, recurrent neural network

Procedia PDF Downloads 30

321 Robustness of the Deep Chroma Extractor and Locally-Normalized Quarter Tone Filters in Automatic Chord Estimation under Reverberant Conditions

Authors: Luis Alvarado, Victor Poblete, Isaac Gonzalez, Yetzabeth Gonzalez

Abstract:

In MIREX 2016 (http://www.music-ir.org/mirex), the deep neural network (DNN)-based Deep Chroma Extractor, proposed by Korzeniowski and Widmer, reached the highest score in an audio chord recognition task. In the present paper, this tool is assessed under acoustic reverberant environments and distinct source-microphone distances. The evaluation dataset comprises The Beatles and Queen datasets. These datasets are sequentially re-recorded with a single microphone in a real reverberant chamber at four reverberation times (approximately 0 (anechoic), 1, 2, and 3 s), as well as four source-microphone distances (32, 64, 128, and 256 cm). It is expected that the performance of the trained DNN will dramatically decrease under these acoustic conditions, with signals degraded by room reverberation and distance to the source. Recently, the effect of the bio-inspired Locally-Normalized Cepstral Coefficients (LNCC) has been assessed in a text-independent speaker verification task using speech signals degraded by additive noise at different signal-to-noise ratios with variations of recording distance, and it has also been assessed under reverberant conditions with variations of recording distance. LNCC showed a performance as high as that of the state-of-the-art Mel Frequency Cepstral Coefficient filters. Based on these results, this paper proposes a variation of locally-normalized triangular filters called Locally-Normalized Quarter Tone (LNQT) filters. By using the LNQT spectrogram, robustness improvements of the trained Deep Chroma Extractor are expected, compared with classical triangular filters, thus compensating for the music signal degradation and improving the accuracy of the chord recognition system.

Keywords: chord recognition, deep neural networks, feature extraction, music information retrieval

Procedia PDF Downloads 234

320 Utilizing Spatial Uncertainty of On-The-Go Measurements to Design Adaptive Sampling of Soil Electrical Conductivity in a Rice Field

Authors: Ismaila Olabisi Ogundiji, Hakeem Mayowa Olujide, Qasim Usamot

Abstract:

The main reasons for site-specific management of agricultural inputs are to increase the profitability of crop production, to protect the environment, and to improve product quality. Information about the variability of different soil attributes within a field is essential for the decision-making process. The lack of fast and accurate acquisition of soil characteristics remains one of the biggest limitations of precision agriculture, because such acquisition is expensive and time-consuming. Adaptive sampling has been proven to be an accurate and affordable sampling technique for planning site-specific management of agricultural inputs within a field. This study employed the spatial uncertainty of soil apparent electrical conductivity (ECa) estimates to identify adaptive re-survey areas in the field. The original dataset was grouped into validation and calibration groups, where the calibration group was sub-grouped into three sets of different measurement pass intervals. A conditional simulation was performed on the field ECa to evaluate the ECa spatial uncertainty estimates using a geostatistical technique. The grouping of high-uncertainty areas for each set was done using image segmentation in MATLAB, and areas of high and low values were then identified separately. Finally, an adaptive re-survey was carried out on the areas of high uncertainty. Adding adaptive re-surveying significantly reduced the time required for resampling the whole field and resulted in ECa with minimal error. For the most widely spaced transect, the root mean square error (RMSE) obtained from an initial crude sampling survey was reduced after an adaptive re-survey and came close to the value obtained with an all-field re-survey. The estimated sampling time for the adaptive re-survey was found to be 45% less than that of the all-field re-survey.
The results indicate that designing adaptive sampling through spatial uncertainty models significantly mitigates sampling cost, and there was still conformity in the accuracy of the observations.
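
The idea of turning conditional-simulation output into re-survey targets can be sketched as follows. This is not the study's workflow, which used geostatistical simulation and image segmentation in MATLAB; here a plain per-cell standard deviation across realizations and a fixed threshold are swapped in as a simplified stand-in, and the function names, the data layout, and the threshold are assumptions.

```python
from statistics import pstdev


def cell_uncertainty(realizations):
    """Per-cell standard deviation across simulated ECa realizations.

    realizations is a list of equally long lists, one per simulated field,
    with one ECa value per grid cell (assumed layout).
    """
    return [pstdev(values) for values in zip(*realizations)]


def resurvey_cells(realizations, threshold):
    """Indices of grid cells whose uncertainty exceeds the threshold."""
    uncertainty = cell_uncertainty(realizations)
    return [i for i, sd in enumerate(uncertainty) if sd > threshold]
```

Cells returned by `resurvey_cells` would be candidates for the adaptive re-survey, while cells with low spread across realizations would be left unsampled.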

Keywords: soil electrical conductivity, adaptive sampling, conditional simulation, spatial uncertainty, site-specific management

Procedia PDF Downloads 134

319 Enhancing Robustness in Federated Learning through Decentralized Oracle Consensus and Adaptive Evaluation

Authors: Peiming Li

Abstract:

This paper presents an innovative blockchain-based approach to enhance the reliability and efficiency of federated learning systems. By integrating a decentralized oracle consensus mechanism into the federated learning framework, we address key challenges of data and model integrity. Our approach utilizes a network of redundant oracles, functioning as independent validators within an epoch-based training system in the federated learning model. In federated learning, data is decentralized, residing on various participants' devices. This scenario often leads to concerns about data integrity and model quality. Our solution employs blockchain technology to establish a transparent and tamper-proof environment, ensuring secure data sharing and aggregation. The decentralized oracles, a concept borrowed from blockchain systems, act as unbiased validators. They assess the contributions of each participant using a Hidden Markov Model (HMM), which is crucial for evaluating the consistency of participant inputs and safeguarding against model poisoning and malicious activities. Our methodology's distinct feature is its epoch-based training. An epoch here refers to a specific training phase where data is updated and assessed for quality and relevance. The redundant oracles work in concert to validate data updates during these epochs, enhancing the system's resilience to security threats and data corruption. The effectiveness of this system was tested using the MNIST dataset, a standard benchmark in machine learning. Results demonstrate that our blockchain-oriented federated learning approach significantly boosts system resilience, addressing the common challenges of federated environments. This paper aims to make these advanced concepts accessible, even to those with a limited background in blockchain or federated learning.
We provide a foundational understanding of how blockchain technology can revolutionize data integrity in decentralized systems and explain the role of oracles in maintaining model accuracy and reliability.

Keywords: federated learning system, blockchain, decentralized oracles, hidden Markov model

Procedia PDF Downloads 63

318 Improving Cell Type Identification of Single Cell Data by Iterative Graph-Based Noise Filtering

Authors: Annika Stechemesser, Rachel Pounds, Emma Lucas, Chris Dawson, Julia Lipecki, Pavle Vrljicak, Jan Brosens, Sean Kehoe, Jason Yap, Lawrence Young, Sascha Ott

Abstract:

Advances in technology now make it possible to retrieve the genetic information of thousands of single cancerous cells. One of the key challenges in single cell analysis of cancerous tissue is to determine the number of different cell types and their characteristic genes within the sample to better understand the tumors and their reaction to different treatments. For this analysis to be possible, it is crucial to filter out background noise, as it can severely blur the downstream analysis and give misleading results. In-depth analysis of the state-of-the-art filtering methods for single cell data showed that, in some cases, they do not separate noisy and normal cells sufficiently. We introduced an algorithm that filters and clusters single cell data simultaneously without relying on certain genes or thresholds chosen by eye. It detects communities in a Shared Nearest Neighbor similarity network, which captures the similarities and dissimilarities of the cells, by optimizing the modularity, and then identifies and removes vertices with weak cluster membership. This strategy is based on the fact that noisy data instances are very likely to be similar to true cell types but do not match any of these well. Once the clustering is complete, we apply a set of evaluation metrics on the cluster level and accept or reject clusters based on the outcome. The performance of our algorithm was tested on three datasets and led to convincing results. We were able to replicate the results on a Peripheral Blood Mononuclear Cells dataset. Furthermore, we applied the algorithm to two samples of ovarian cancer from the same patient before and after chemotherapy. Comparing the standard approach to our algorithm, we found a hidden cell type in the ovarian post-chemotherapy data with interesting marker genes that are potentially relevant for medical research.
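
A Shared Nearest Neighbor similarity network of the kind this abstract refers to can be sketched as below. This covers only the graph construction, not the modularity optimization or the filtering of weakly assigned vertices; the Euclidean metric, the exclusion of a point from its own neighbor set, and the function names are assumptions, since several SNN variants exist.

```python
import math


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(points, k):
    """Matrix whose (i, j) entry counts the neighbors shared by i and j."""
    nn = knn_sets(points, k)
    n = len(points)
    return [
        [len(nn[i] & nn[j]) if i != j else 0 for j in range(n)]
        for i in range(n)
    ]
```

Cells in the same dense group share many neighbors and get a high similarity, while a noisy cell sitting between groups tends to share few neighbors with any of them, which is the property the filtering step exploits.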

Keywords: cancer research, graph theory, machine learning, single cell analysis

Procedia PDF Downloads 114

317 Walking in a Weather rather than a Climate: Critique on the Meta-Narrative of Buddhism in Early India

Authors: Yongjun Kim

Abstract:

Since the agreement on the historicity of the historical Buddha in eastern India, the beginning, heyday, and decline of Buddhism in Early India have been discussed in the context of urbanization, commercialism, and state formation, in short, within a Weberian socio-political frame. Recent scholarship, notably in archaeology and anthropology, has proposed a ‘re-materialization of Buddhism in Early India’ based on what Buddhists actually did rather than what they should have done according to canonical teachings or philosophies. But its historical narrations still remain within the domain of a socio-political meta-narrative, which tends to unjustifiably dismiss the naturally existing heterogeneity and often chaotic dynamics of diverse agencies, landscape perceptions, localized traditions, etc. The author argues for a multiplicity of theoretical standpoints for the reconstruction of Buddhism in Early India. To this end, the diverse agencies, localized traditions, and landscape patterns of Buddhist communities and monasteries in Trans-Himalayan regions, focusing on Zanskar Valley and Spiti Valley in India, are first illustrated based on the author’s fieldwork. The author then discusses how this anthropological landscape analysis fits with textual and archaeological evidence on the tension between urban monastic and forest Buddhism; the phenomena of sacred landscapes, cemeteries, gardens, and natural caves along with the socio-economic landscape; and the demographic heterogeneity in Early India. Finally, a comparison is attempted between the anthropological landscape of the present Trans-Himalayan region and the archaeological one of ancient Western India. The study of Buddhism in Early India has hardly been discussed through multivalent theoretical archaeology and anthropology of religion; thus traditional and recent scholarship have produced historical meta-narratives, though these are heterogeneous among themselves.
The multidisciplinary approaches of textual criticism, archaeology, and anthropology will surely help to deconstruct the grand and all-encompassing historical description of Buddhism in Early India and then to reconstruct localized, behavioral, and multivalent narratives. This paper expects to highlight the importance of lesser-studied Buddhist archaeological sites and dynamic views on the religious landscape in Early India with the help of critical anthropology of religion.

Keywords: analogy by living traditions, Buddhism in Early India, landscape analysis, meta-narrative

Procedia PDF Downloads 333

316 Contextual SenSe Model: Word Sense Disambiguation using Sense and Sense Value of Context Surrounding the Target

Authors: Vishal Raj, Noorhan Abbas

Abstract:

Ambiguity in NLP (Natural language processing) refers to the ability of a word, phrase, sentence, or text to have multiple meanings. This results in various kinds of ambiguities such as lexical, syntactic, semantic, anaphoric and referential ambiguities. This study is focused mainly on solving the issue of Lexical ambiguity. Word Sense Disambiguation (WSD) is an NLP technique that aims to resolve lexical ambiguity by determining the correct meaning of a word within a given context. Most WSD solutions rely on words for training and testing, but we have used lemma and Part of Speech (POS) tokens of words for training and testing. Lemma adds generality and POS adds properties of word into token. We have designed a novel method to create an affinity matrix to calculate the affinity between any pair of lemma_POS (a token where lemma and POS of word are joined by underscore) of given training set. Additionally, we have devised an algorithm to create the sense clusters of tokens using affinity matrix under hierarchy of POS of lemma. Furthermore, three different mechanisms to predict the sense of target word using the affinity/similarity value are devised. Each contextual token contributes to the sense of target word with some value and whichever sense gets higher value becomes the sense of target word. So, contextual tokens play a key role in creating sense clusters and predicting the sense of target word, hence, the model is named Contextual SenSe Model (CSM). CSM exhibits a noteworthy simplicity and explication lucidity in contrast to contemporary deep learning models characterized by intricacy, time-intensive processes, and challenging explication. CSM is trained on SemCor training data and evaluated on SemEval test dataset. The results indicate that despite the naivety of the method, it achieves promising results when compared to the Most Frequent Sense (MFS) model.
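
The two ideas stated in this abstract, lemma_POS tokens and a vote in which each contextual token contributes a value to candidate senses, can be sketched as follows. How the affinity values are actually computed is the authors' method and is not described here, so the affinity table, the sense labels, the numbers, and the function names below are purely hypothetical placeholders.

```python
def make_token(lemma, pos):
    """Join lemma and part of speech with an underscore, e.g. 'river_NOUN'."""
    return f"{lemma.lower()}_{pos.upper()}"


def predict_sense(context_tokens, sense_affinity):
    """Pick the sense with the highest summed contribution from the context.

    sense_affinity maps each candidate sense to a dict of token -> value
    (hypothetical values standing in for the affinity matrix).
    """
    scores = {
        sense: sum(affinity.get(token, 0.0) for token in context_tokens)
        for sense, affinity in sense_affinity.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical affinities for the target word 'bank' (illustrative only).
affinity_bank = {
    "bank_finance": {"money_NOUN": 0.9, "deposit_VERB": 0.7},
    "bank_river": {"river_NOUN": 0.8, "water_NOUN": 0.5},
}
context = [make_token("River", "noun"), make_token("water", "noun")]
best_sense, sense_scores = predict_sense(context, affinity_bank)
```

With the placeholder values above, the river-related context tokens contribute only to the `bank_river` sense, so that sense receives the higher total and is selected.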

Keywords: word sense disambiguation (wsd), contextual sense model (csm), most frequent sense (mfs), part of speech (pos), natural language processing (nlp), oov (out of vocabulary), lemma_pos (a token where lemma and pos of word are joined by underscore), information retrieval (ir), machine translation (mt)

Procedia PDF Downloads 109

315 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy technique used for material identification and analysis, with the advantages of in-situ analysis, elimination of intensive sample preparation, and only micro-destructive effects on the material tested. LIBS delivers short laser pulses onto the material in order to create a plasma by exciting the material beyond a certain threshold. The plasma characteristics, which consist of wavelength values and intensity amplitudes, depend on the material and the experimental environment. In the present work, the spectral profiles of medicine samples were obtained via LIBS. The datasets include two different concentrations of each of two paracetamol-based medicines, Aferin and Parafon. The spectral data of the samples were preprocessed by filling outliers based on quartiles, smoothing the spectra to eliminate noise, and normalizing both the wavelength and intensity axes. Statistical information was obtained, and principal component analysis (PCA) was applied to both the preprocessed and raw datasets. The machine learning models were built on two different train-test splits: 70% training and 30% test, and 80% training and 20% test. Cross-validation was used to guard the models against overfitting, since the number of samples is small. The machine learning results for the preprocessed and raw datasets were compared for both splits. This is the first time that this full range of supervised machine learning classification algorithms (decision trees, discriminant analysis, naïve Bayes, support vector machines (SVM), k-nearest neighbors (k-NN), ensemble learning, and neural networks) has been applied to LIBS data of paracetamol-based pharmaceutical samples and their different concentrations, on both preprocessed and raw datasets, in order to observe the effect of preprocessing.
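Two of the preprocessing steps and the two train-test splits mentioned in the abstract can be sketched as follows. The abstract does not give the exact outlier rule, so this sketch assumes the common 1.5 × IQR fences and replaces outliers with the median; the intensity values, function names, and random seed are hypothetical. Smoothing, PCA, and the classifiers themselves are not shown.

```python
import random
import statistics


def fill_outliers(values, k=1.5):
    """Replace values outside the quartile fences with the median.

    ASSUMPTION: fences are Q1 - k*IQR and Q3 + k*IQR with k = 1.5; the
    paper only states that outliers were filled "based on quartiles".
    """
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v if low <= v <= high else median for v in values]


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(items, train_fraction, seed=0):
    """Shuffle items reproducibly and cut them into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical intensity readings with one obvious spike.
intensities = [1.0, 1.2, 0.9, 1.1, 50.0, 1.0, 1.3, 0.8]
cleaned = fill_outliers(intensities)
normalized = min_max(cleaned)

# The two splits described in the abstract, on ten placeholder sample ids.
sample_ids = list(range(10))
train_70, test_30 = train_test_split(sample_ids, 0.7)
train_80, test_20 = train_test_split(sample_ids, 0.8)
print(len(train_70), len(test_30), len(train_80), len(test_20))
```

With a small number of samples, a single split can be unrepresentative, which is why the abstract pairs the splits with cross-validation.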

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 88

314 Effect of Biostimulants to Control the Phelipanche ramosa L. Pomel in Processing Tomato Crop

Authors: G. Disciglio, G. Gatta, F. Lops, A. Libutti, A. Tarantino, E. Tarantino

Abstract:

The experimental trial was carried out in an open field in the Foggia district (Apulia Region, Southern Italy) during the spring-summer season of 2014, in order to evaluate the effect of four biostimulant products (Radicon®, Viormon plus®, Lysodin® and Siapton® 10L), compared with a control (no biostimulant), on the infestation of a processing tomato crop (cv Dres) by the chlorophyll-lacking root parasite Phelipanche ramosa. Biostimulants comprise different categories of products (microbial inoculants, humic and fulvic acids, hydrolyzed proteins and amino acids, seaweed extracts) which play various roles in plant growth, including improving crop resistance and the qualitative and quantitative characteristics of yield. The trial was arranged in a randomized complete block design with five treatments, each replicated three times. The processing tomato seedlings were transplanted on 5 May 2014. Throughout the crop cycle, P. ramosa infestation was assessed by counting the number of emerged shoots (branched plants) in each plot at 66, 78 and 92 days after transplanting. The tomato fruits were harvested at full maturity on 8 August 2014. From each plot, the marketable yield was measured and the qualitative and quantitative yield parameters (mean weight, dry matter content, colour coordinates, colour index and soluble solids content of the fruits) were determined. The whole dataset was checked against the basic assumptions of the analysis of variance (ANOVA), and the differences between the means were determined using Tukey's test at the 5% probability level. The results showed that none of the applied biostimulants provided complete control of Phelipanche, although some positive effects were obtained from their application. In this respect, Radicon® appeared to be the most effective in reducing the infestation of this root parasite in the tomato crop. This treatment also gave the highest tomato yield.
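The ANOVA for a randomized complete block design with five treatments and three blocks can be sketched as below. This is only an illustration of the sum-of-squares decomposition, not the authors' analysis: the shoot counts are made-up placeholders rather than measurements from the trial, the treatment labels are generic, and the function name is hypothetical. The p-value lookup and Tukey's test are not shown, since both need distribution tables or a statistics library.

```python
def rcbd_anova(table):
    """Two-way ANOVA without interaction for a randomized complete block design.

    `table` maps each treatment label to a list of observations, one per
    block, with blocks in the same order for every treatment.
    """
    treatments = list(table)
    n_treat = len(treatments)
    n_blocks = len(table[treatments[0]])
    if any(len(table[t]) != n_blocks for t in treatments):
        raise ValueError("every treatment needs one observation per block")

    all_values = [x for t in treatments for x in table[t]]
    grand_mean = sum(all_values) / len(all_values)

    # Total variation around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    # Variation explained by treatments.
    treat_means = [sum(table[t]) / n_blocks for t in treatments]
    ss_treat = n_blocks * sum((m - grand_mean) ** 2 for m in treat_means)

    # Variation explained by blocks.
    block_means = [
        sum(table[t][j] for t in treatments) / n_treat for j in range(n_blocks)
    ]
    ss_block = n_treat * sum((m - grand_mean) ** 2 for m in block_means)

    # Residual variation and degrees of freedom.
    ss_error = ss_total - ss_treat - ss_block
    df_treat = n_treat - 1
    df_block = n_blocks - 1
    df_error = df_treat * df_block

    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return {
        "ss_treat": ss_treat, "ss_block": ss_block, "ss_error": ss_error,
        "df_treat": df_treat, "df_block": df_block, "df_error": df_error,
        "F": f_treat,
    }


# PLACEHOLDER shoot counts per plot (5 treatments x 3 blocks), invented
# purely to exercise the function.
placeholder_counts = {
    "control": [12, 15, 14],
    "T1": [8, 9, 11],
    "T2": [10, 13, 12],
    "T3": [11, 12, 15],
    "T4": [9, 14, 13],
}
result = rcbd_anova(placeholder_counts)
print(result["df_treat"], result["df_error"], result["F"])
```

With five treatments and three blocks, the treatment effect is tested on 4 and 8 degrees of freedom; the resulting F value would be compared against the critical value at the 5% level before any pairwise comparison of means.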

Keywords: biostimulant, control methods, Phelipanche ramosa, tomato crop

Procedia PDF Downloads 301