Search results for: string classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2283

1053 Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

Authors: Jindong Gu, Matthias Schubert, Volker Tresp

Abstract:

In many outlier detection tasks, only training data belonging to one class, i.e., the positive class, is available. The task is then to predict a new data point as belonging either to the positive class or to the negative class, in which case the data point is considered an outlier. For this task, we propose a novel corrupted Generative Adversarial Network (CorGAN). In the adversarial process of training CorGAN, the Generator generates outlier samples for the negative class, and the Discriminator is trained to distinguish the positive training data from the generated negative data. The proposed framework is evaluated using an image dataset and a real-world network intrusion dataset. Our outlier-detection method achieves state-of-the-art performance on both tasks.

Keywords: one-class classification, outlier detection, generative adversary networks, semi-supervised learning

Procedia PDF Downloads 151
1052 Fast and Robust Long-term Tracking with Effective Searching Model

Authors: Thang V. Kieu, Long P. Nguyen

Abstract:

Kernelized Correlation Filter (KCF) based trackers have gained a lot of attention recently because of their accuracy and fast calculation speed. However, the algorithm is not robust when the object is lost through a sudden change of direction, occlusion, or leaving the field of view. To improve KCF performance in long-term tracking, this paper proposes an anomaly detection method that warns of target loss by analyzing the response map of each frame, and a classification algorithm based on random ferns for reliably re-locating the target. Tested on the Visual Tracker Benchmark and Visual Object Tracking datasets, the proposed algorithm achieved precision and success rates 2.92 and 2.61 times higher, respectively, than those of the original KCF algorithm. Moreover, the proposed tracker handles occlusion better than many state-of-the-art long-term tracking methods while running at 60 frames per second.

Keywords: correlation filter, long-term tracking, random fern, real-time tracking

Procedia PDF Downloads 135
1051 Static vs. Stream Mining Trajectories Similarity Measures

Authors: Musaab Riyadh, Norwati Mustapha, Dina Riyadh

Abstract:

Trajectory similarity can be defined as the cost of transforming one trajectory into another according to a given similarity method. It is the core of numerous mining tasks such as clustering, classification, and indexing. Various approaches have been suggested to measure similarity based on the geometric and dynamic properties of the trajectory, the overlap between trajectory segments, and the area confined between entire trajectories. In this article, these approaches are evaluated in terms of computational cost, memory usage, accuracy, and the amount of data needed in advance, in order to determine their suitability for stream mining applications. The evaluation results show that, because of the high speed at which data are generated, stream mining applications favor similarity methods that have low computational cost and memory usage, require a single scan of the data, and are free of mathematical complexity.
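
As an illustration of a "cost of transforming one trajectory into another" measure, the following is a minimal sketch of dynamic time warping (DTW), chosen here only as a familiar example of a static measure; the abstract does not name the specific methods evaluated. The function name and the representation of a trajectory as a list of (x, y) points are assumptions.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two trajectories.

    Each trajectory is a list of (x, y) points (an assumed representation).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

The sketch fills a table over both complete trajectories, so it needs all points in advance and uses memory proportional to the product of their lengths, which is the kind of property the evaluation criteria above weigh against stream processing.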

Keywords: global distance measure, local distance measure, semantic trajectory, spatial dimension, stream data mining

Procedia PDF Downloads 392
1050 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize text extracted from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in size, orientation, style, and illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations cannot be appropriately defined using well-known geometric transformations such as translation, rotation, affine transformation and shearing, we use all of a character's black pixels as our feature vector. Classification is performed with a minimum distance classifier using the maximum likelihood criterion, which delivers a very promising Character Recognition Rate (CRR) of 89%. We achieve a considerably higher Word Recognition Rate (WRR) of 99% when using lower-level linguistic knowledge about product words during the recognition process.
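
The idea of classifying a character from its black pixels with a minimum distance rule can be sketched as follows. This is a simplified illustration, not the authors' classifier: it assumes binarized glyphs flattened to tuples of 0/1 and uses Hamming distance as a stand-in for the maximum likelihood criterion mentioned in the abstract. The names `hamming`, `classify`, and the tiny 3x3 templates are hypothetical.

```python
def hamming(u, v):
    """Number of pixel positions where two binary glyphs differ."""
    return sum(1 for p, q in zip(u, v) if p != q)


def classify(glyph, templates):
    """Return the label of the template closest to the glyph."""
    return min(templates, key=lambda label: hamming(glyph, templates[label]))


# Hypothetical 3x3 templates, flattened row by row (1 = black pixel).
TEMPLATES = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
}

# A noisy "I" with one extra black pixel in the bottom-right corner.
noisy = (0, 1, 0,
         0, 1, 0,
         0, 1, 1)
label = classify(noisy, TEMPLATES)
```

In practice the glyphs would first be normalized for scale and rotation, as the abstract describes, so that pixel positions are comparable across samples.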

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

Procedia PDF Downloads 405
1049 Stream Extraction from 1m-DTM Using ArcGIS

Authors: Jerald Ruta, Ricardo Villar, Jojemar Bantugan, Nycel Barbadillo, Jigg Pelayo

Abstract:

Streams are important in providing water supply for industrial, agricultural and human consumption; in short, where there are streams, there is life. Identifying streams is essential since many developed cities are situated in the vicinity of these bodies of water and, in flood management, streams serve as basins for surface runoff within the area. This study aims to process and generate features from a high-resolution digital terrain model (DTM) with 1-meter resolution using the Hydrology Tools of ArcGIS. The raster was filled, flow direction and flow accumulation were computed, a raster calculation was applied and stream order was assigned, and the result was converted to vector; undesirable features were then cleared using ancillary data or Google Earth. In field validation, streams were classified as perennial, intermittent or ephemeral. Results show that more than 90% of the extracted features were accurate when assessed through field validation.
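
The keywords mention the Strahler method for stream ordering. A minimal sketch of that ordering rule on a small stream network is given below; it is independent of ArcGIS and does not reproduce the tool's raster implementation. The dictionary layout (each segment mapped to the segments flowing into it) and the segment names are assumptions made for the example.

```python
def strahler(segment, upstream):
    """Strahler order of a stream segment.

    upstream maps a segment to the list of segments flowing into it
    (an assumed representation of the network).
    """
    tributaries = upstream.get(segment, [])
    if not tributaries:
        return 1  # headwater segment
    orders = sorted((strahler(t, upstream) for t in tributaries), reverse=True)
    # The order increases only when the two highest-order tributaries tie.
    if len(orders) > 1 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]


# Hypothetical network: two headwaters join into "a"; "a" meets headwater "b".
network = {"outlet": ["a", "b"], "a": ["c", "d"]}
outlet_order = strahler("outlet", network)
```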

Keywords: digital terrain models, hydrology tools, strahler method, stream classification

Procedia PDF Downloads 267
1048 Possibility of Prediction of Death in SARS-Cov-2 Patients Using Coagulogram Analysis

Authors: Omonov Jahongir Mahmatkulovic

Abstract:

Purpose: To study the significance of the coagulation parameters D-dimer (DD), prothrombin time (PT), activated partial thromboplastin time (APTT), thrombin time (TT), and fibrinogen (Fg) in predicting the course, severity and prognosis of COVID-19. Source and method of research: From September 15, 2021, to November 5, 2021, 93 patients aged 25 to 60 with suspected COVID-19, who were under inpatient treatment at the multidisciplinary clinic of the Tashkent Medical Academy, were retrospectively examined. DD, PT, APTT, and Fg were measured repeatedly and their changes over time were studied. Results: Coagulation disorders occurred in the early stages of COVID-19 infection, with an increase in DD in 54 (58%) patients and an increase in Fg in 93 (100%) patients. DD and Fg levels are associated with the clinical classification. Of the 33 patients who died, 21 had an increase in DD in the first laboratory study, 27 had an increase in DD in the second and third laboratory studies, and 15 had an increase in PT in the third test. The ROC analysis of mortality showed that the AUC of DD in the three successive laboratory studies was 0.721, 0.801, and 0.844, respectively, and that of PT was 0.703, 0.845, and 0.972 (P<0.01). Conclusion: Coagulation dysfunction is more common in patients with severe and critical conditions. DD and PT can be used as important predictors of mortality from COVID-19.
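
For readers unfamiliar with the AUC values reported above, the following sketch computes the area under the ROC curve as the probability that a randomly chosen positive case has a higher marker value than a randomly chosen negative case (ties count one half). The function name and the input values are synthetic illustrations, not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: marker values; labels: 1 for the event (e.g. death), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic marker values for four hypothetical patients.
auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to a marker with no discriminating power and 1.0 to perfect separation, which is how the rising values across the three laboratory studies can be read.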

Keywords: Covid19, DD, PT, Coagulogram analysis, APTT

Procedia PDF Downloads 105
1047 Neural Nets Based Approach for 2-Cells Power Converter Control

Authors: Kamel Laidi, Khelifa Benmansour, Ouahid Bouchhida

Abstract:

A neural network-based approach for the control of a 2-cells serial converter has been developed and implemented. The approach is based on a behavioural description of the different operating modes of the converter. Each operating mode represents a well-defined configuration, to which is matched an operating zone satisfying given invariance conditions that depend on the capacitors' voltages and the load current of the converter. Each mode is associated with a control vector whose components are the control signals to be applied to the converter switches. The problem is therefore reduced to a classification task over the different operating modes of the converter. An artificial neural network-based approach, which constitutes a powerful tool for this kind of task, has been adopted and implemented. Its application to a 2-cells chopper has ensured efficient and robust control of the load current and good balancing of the capacitor voltages.

Keywords: neural nets, control, multicellular converters, 2-cells chopper

Procedia PDF Downloads 833
1046 Artificial Intelligence Methods in Estimating the Minimum Miscibility Pressure Required for Gas Flooding

Authors: Emad A. Mohammed

Abstract:

Utilizing the capabilities of Data Mining and Artificial Intelligence, in the form of Fuzzy Logic models and Artificial Neural Network models, can help give accurate predictions of the minimum miscibility pressure (MMP) required for multi-contact miscible (MCM) displacement of reservoir petroleum by hydrocarbon gas flooding. The factors affecting the MMP, as established from the literature and from the dataset, are as follows: XC2-6: intermediate composition in the oil, containing C2-6, CO2 and H2S (mole %); XC1: amount of methane in the oil (%); T: temperature (°C); MwC7+: molecular weight of C7+ (g/mol); YC2+: mole percent of C2+ composition in the injected gas (%); MwC2+: molecular weight of C2+ in the injected gas. Fuzzy Logic and Neural Networks have been used widely in prediction and classification, with relatively high accuracy, in different fields of study. It is well known that a Fuzzy Inference System can handle uncertainty within the inputs, as in our case. The results of this work showed that our proposed models achieve higher performance indices than other empirical correlations.

Keywords: MMP, gas flooding, artificial intelligence, correlation

Procedia PDF Downloads 143
1045 Review of Concepts and Tools Applied to Assess Risks Associated with Food Imports

Authors: A. Falenski, A. Kaesbohrer, M. Filter

Abstract:

Introduction: Risk assessments can be performed in various ways and in different degrees of complexity. In order to assess risks associated with imported foods additional information needs to be taken into account compared to a risk assessment on regional products. The present review is an overview on currently available best practise approaches and data sources used for food import risk assessments (IRAs). Methods: A literature review has been performed. PubMed was searched for articles about food IRAs published in the years 2004 to 2014 (English and German texts only, search string “(English [la] OR German [la]) (2004:2014 [dp]) import [ti] risk”). Titles and abstracts were screened for import risks in the context of IRAs. The finally selected publications were analysed according to a predefined questionnaire extracting the following information: risk assessment guidelines followed, modelling methods used, data and software applied, existence of an analysis of uncertainty and variability. IRAs cited in these publications were also included in the analysis. Results: The PubMed search resulted in 49 publications, 17 of which contained information about import risks and risk assessments. Within these 19 cross references were identified to be of interest for the present study. These included original articles, reviews and guidelines. At least one of the guidelines of the World Organisation for Animal Health (OIE) and the Codex Alimentarius Commission were referenced in any of the IRAs, either for import of animals or for imports concerning foods, respectively. Interestingly, also a combination of both was used to assess the risk associated with the import of live animals serving as the source of food. Methods ranged from full quantitative IRAs using probabilistic models and dose-response models to qualitative IRA in which decision trees or severity tables were set up using parameter estimations based on expert opinions. Calculations were done using @Risk, R or Excel. 
Most heterogeneous was the type of data used, ranging from general information on imported goods (food, live animals) to pathogen prevalence in the country of origin. These data were either publicly available in databases or lists (e.g., OIE WAHID and Handystatus II, FAOSTAT, Eurostat, TRACES), accessible on a national level (e.g., herd information) or only open to a small group of people (flight passenger import data at national airport customs office). In the IRAs, an uncertainty analysis has been mentioned in some cases, but calculations have been performed only in a few cases. Conclusion: The current state-of-the-art in the assessment of risks of imported foods is characterized by a great heterogeneity in relation to general methodology and data used. Often information is gathered on a case-by-case basis and reformatted by hand in order to perform the IRA. This analysis therefore illustrates the need for a flexible, modular framework supporting the connection of existing data sources with data analysis and modelling tools. Such an infrastructure could pave the way to IRA workflows applicable ad-hoc, e.g. in case of a crisis situation.

Keywords: import risk assessment, review, tools, food import

Procedia PDF Downloads 301
1044 Detection of High Fructose Corn Syrup in Honey by Near Infrared Spectroscopy and Chemometrics

Authors: Mercedes Bertotto, Marcelo Bello, Hector Goicoechea, Veronica Fusca

Abstract:

The National Service of Agri-Food Health and Quality (SENASA), controls honey to detect contamination by synthetic or natural chemical substances and establishes and controls the traceability of the product. The utility of near-infrared spectroscopy for the detection of adulteration of honey with high fructose corn syrup (HFCS) was investigated. First of all, a mixture of different authentic artisanal Argentinian honey was prepared to cover as much heterogeneity as possible. Then, mixtures were prepared by adding different concentrations of high fructose corn syrup (HFCS) to samples of the honey pool. 237 samples were used, 108 of them were authentic honey and 129 samples corresponded to honey adulterated with HFCS between 1 and 10%. They were stored unrefrigerated from time of production until scanning and were not filtered after receipt in the laboratory. Immediately prior to spectral collection, honey was incubated at 40°C overnight to dissolve any crystalline material, manually stirred to achieve homogeneity and adjusted to a standard solids content (70° Brix) with distilled water. Adulterant solutions were also adjusted to 70° Brix. Samples were measured by NIR spectroscopy in the range of 650 to 7000 cm⁻¹. The technique of specular reflectance was used, with a lens aperture range of 150 mm. Pretreatment of the spectra was performed by Standard Normal Variate (SNV). The ant colony optimization genetic algorithm sample selection (ACOGASS) graphical interface was used, using MATLAB version 5.3, to select the variables with the greatest discriminating power. The data set was divided into a validation set and a calibration set, using the Kennard-Stone (KS) algorithm. A combined method of Potential Functions (PF) was chosen together with Partial Least Square Linear Discriminant Analysis (PLS-DA). 
Different estimators of the predictive capacity of the model were compared, which were obtained using a decreasing number of groups, which implies more demanding validation conditions. The optimal number of latent variables was selected as the number associated with the minimum error and the smallest number of unassigned samples. Once the optimal number of latent variables was defined, we proceeded to apply the model to the training samples. With the calibrated model for the training samples, we proceeded to study the validation samples. The calibrated model that combines the potential function methods and PLS-DA can be considered reliable and stable since its performance in future samples is expected to be comparable to that achieved for the training samples. By use of Potential Functions (PF) and Partial Least Square Linear Discriminant Analysis (PLS-DA) classification, authentic honey and honey adulterated with HFCS could be identified with a correct classification rate of 97.9%. The results showed that NIR in combination with the PF and PLS-DA methods can be a simple, fast and low-cost technique for the detection of HFCS in honey with high sensitivity and power of discrimination.
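
The Standard Normal Variate (SNV) pretreatment mentioned above can be sketched in a few lines: each spectrum is centered on its own mean and divided by its own standard deviation. The function name and the toy three-point spectrum are illustrative assumptions; the variable selection, Kennard-Stone split, and PF/PLS-DA steps are not shown.

```python
import statistics


def snv(spectrum):
    """Standard Normal Variate: scale one spectrum by its own mean and spread."""
    mean = statistics.fmean(spectrum)
    spread = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(value - mean) / spread for value in spectrum]


# Toy absorbance values for a single hypothetical spectrum.
corrected = snv([1.0, 2.0, 3.0])
```

Because the correction is applied spectrum by spectrum, it needs no reference to the other samples, which makes it convenient to apply before the data set is split into calibration and validation sets.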

Keywords: adulteration, multivariate analysis, potential functions, regression

Procedia PDF Downloads 124
1043 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.
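
One simple way to make the same input correspond to different labels across tasks is to permute the label assignment independently for each task. The sketch below illustrates that idea only; it is an assumption about how such tasks could be built, not the authors' generation or key data extraction procedure, and the function name, seed, and toy examples are hypothetical.

```python
import random


def make_mutex_tasks(examples, labels, n_tasks, seed=0):
    """Build tasks in which each input keeps its text but labels are permuted.

    examples: list of (text, label) pairs; labels: list of all label names.
    """
    rng = random.Random(seed)
    tasks = []
    for _ in range(n_tasks):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        mapping = dict(zip(labels, shuffled))  # per-task label permutation
        tasks.append([(text, mapping[label]) for text, label in examples])
    return tasks


# Hypothetical two-class text data.
data = [("great product", "pos"), ("arrived broken", "neg")]
tasks = make_mutex_tasks(data, ["pos", "neg"], n_tasks=3)
```

Since a single fixed function cannot fit all permuted tasks at once, a meta-learner is pushed to rely on each task's support data rather than memorizing the input-to-label mapping.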

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 91
1042 A Review on the Use of Salt in Building Construction

Authors: Vesna Pungercar, Florian Musso

Abstract:

Identifying materials that can substitute rare or expensive natural resources is one of the key challenges for improving resource efficiency in the building sector. With a growing world population and rising living standards, more and more salt is produced as waste through seawater desalination and potash mining processes. Unfortunately, most of the salt is directly disposed of into nature, where it causes environmental pollution. On the other hand, salt is affordable, is used therapeutically in various respiratory treatments, and can store humidity and heat. It was, therefore, necessary to determine salt materials already in use in building construction and their hygrothermal properties. This research aims to identify salt materials from different scientific branches and historically, to investigate their properties and prioritize the most promising salt materials for indoor applications in a thermal envelope. This was realized through literature review and classification of salt materials into three groups (raw salt materials, composite salt materials, and processed salt materials). The outcome of this research shows that salt has already been used as a building material for centuries and has a potential for future applications due to its hygrothermal properties in a thermal envelope.

Keywords: salt, building material, hygrothermal properties, environment

Procedia PDF Downloads 166
1041 Cervical Cell Classification Using Random Forests

Authors: Dalwinder Singh, Amandeep Verma, Manpreet Kaur, Birmohan Singh

Abstract:

The detection of pre-cancerous changes using a Pap smear test of cervical cells is an important step in the early diagnosis of cervical cancer. The Pap smear test consists of a sample of human cells taken from the cervix, which are analysed to detect cancerous and pre-cancerous stages in the given subject. The manual analysis of these cells is a labor-intensive and time-consuming process that relies on an expert cytotechnologist. In this paper, a computer-assisted system for the automated analysis of cervical cells is proposed. We propose a morphology-based approach to nucleus detection and segmentation of the cytoplasmic region of a given single cell or multiple overlapping cells. Further, various texture and region-based features are calculated from these cells to classify them as normal or abnormal. Experimental results on a publicly available dataset show that our system achieves a satisfactory success rate.

Keywords: cervical cancer, cervical tissue, mathematical morphology, texture features

Procedia PDF Downloads 525
1040 The Use of Thermal Infrared Wavelengths to Determine the Volcanic Soils

Authors: Levent Basayigit, Mert Dedeoglu, Fadime Ozogul

Abstract:

In this study, an application was carried out to identify volcanic soils using remote sensing. The study area was located on the Golcuk formation in Isparta, Turkey. The thermal bands of a Landsat 7 image were used for processing. A climate model based on the water index was implemented in ERDAS Imagine software together with pixel-based image classification. The Soil Moisture Index (SMI) was modeled using the surface temperature (Ts) obtained from the thermal bands and the vegetation index (NDVI) derived from Landsat 7. Surface moisture values were grouped and classified using a scoring system. Thematic layers were compared with the field studies. Consequently, the different moisture levels of volcanic soils served as indicators for their identification and separation. Thermal wavelengths are therefore preferable bands for separating volcanic soils using moisture and temperature models.
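
The NDVI input to the moisture model is the normalized difference of near-infrared and red reflectance, NDVI = (NIR - Red) / (NIR + Red); for Landsat 7 ETM+ these are bands 4 and 3. The sketch below computes it per pixel on small nested lists. The function names and reflectance values are illustrative, and the SMI model itself is not reproduced because the abstract does not give its formula.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return (nir - red) / total if total else 0.0  # guard against empty pixels


def ndvi_grid(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally sized 2-D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectance values for a 1 x 2 image.
grid = ndvi_grid([[0.4, 0.2]], [[0.1, 0.2]])
```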

Keywords: Landsat 7, soil moisture index, temperature models, volcanic soils

Procedia PDF Downloads 303
1039 Parameter Selection and Monitoring for Water-Powered Percussive Drilling in Green-Fields Mineral Exploration

Authors: S. J. Addinell, T. Richard, B. Evans

Abstract:

The Deep Exploration Technologies Cooperative Research Centre (DET CRC) is researching and developing a new coiled tubing based greenfields mineral exploration drilling system utilising downhole water powered percussive drill tooling. This new drilling system is aimed at significantly reducing the costs associated with identifying mineral resource deposits beneath deep, barren cover. This system has shown superior rates of penetration in water-rich hard rock formations at depths exceeding 500 meters. Several key challenges exist regarding the deployment and use of these bottom hole assemblies for mineral exploration, and this paper discusses some of the key technical challenges. This paper presents experimental results obtained from the research program during laboratory and field testing of the prototype drilling system. A study of the morphological aspects of the cuttings generated during the percussive drilling process is presented and shows a strong power law relationship for particle size distributions. Several percussive drilling parameters such as RPM, applied fluid pressure and weight on bit have been shown to influence the particle size distributions of the cuttings generated. This has direct influence on other drilling parameters such as flow loop performance, cuttings dewatering, and solids control. Real-time, accurate knowledge of percussive system operating parameters will assist the driller in maximising the efficiency of the drilling process. The applied fluid flow, fluid pressure, and rock properties are known to influence the natural oscillating frequency of the percussive hammer, but this paper also shows that drill bit design, drill bit wear and the applied weight on bit can also influence the oscillation frequency. Due to the changing drilling conditions and therefore changing operating parameters, real-time understanding of the natural operating frequency is paramount to achieving system optimisation.
Several techniques to understand the oscillating frequency have been investigated and presented. With a conventional top drive drilling rig, spectral analysis of applied fluid pressure, hydraulic feed force pressure, hold back pressure and drill string vibrations have shown the presence of the operating frequency of the bottom hole tooling. Unfortunately, however, with the implementation of a coiled tubing drilling rig, implementing a positive displacement downhole motor to provide drill bit rotation, these signals are not available for interrogation at the surface and therefore another method must be considered. The investigation and analysis of ground vibrations using geophone sensors, similar to seismic-while-drilling techniques have indicated the presence of the natural oscillating frequency of the percussive hammer. This method is shown to provide a robust technique for the determination of the downhole percussive oscillation frequency when used with a coiled tubing drill rig.
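
The spectral analysis step, finding the hammer's oscillation frequency in a sensor signal, can be sketched as picking the strongest bin of a discrete Fourier transform. The example below is a minimal pure-Python illustration on a synthetic signal; the function name, sampling rate, and the 25 Hz and 60 Hz components are invented for the example and do not come from the paper, whose actual processing chain is not described in the abstract.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the largest non-DC component of a real signal."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        # Direct DFT sum for bin k; O(n^2) overall, fine for short records.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate / n


# Synthetic one-second record: a strong 25 Hz tone plus a weaker 60 Hz tone.
RATE = 200
signal = [
    math.sin(2 * math.pi * 25 * t / RATE) + 0.3 * math.sin(2 * math.pi * 60 * t / RATE)
    for t in range(RATE)
]
peak_hz = dominant_frequency(signal, RATE)
```

A production implementation would normally use a fast Fourier transform with windowing and averaging over successive records so that the estimate can follow changing drilling conditions.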

Keywords: cuttings characterization, drilling optimization, oscillation frequency, percussive drilling, spectral analysis

Procedia PDF Downloads 229
1038 Intelligent Prediction of Breast Cancer Severity

Authors: Wahab Ali, Oyebade K. Oyedotun, Adnan Khashman

Abstract:

Breast cancer remains a threat to women worldwide in terms of survival rates, early diagnosis and mortality statistics. So far, research has shown that many survivors of breast cancer are among those diagnosed early. Breast cancer is usually categorized into stages, which indicate its severity and the corresponding survival rates for patients. Investigations show that the further the disease has progressed through the stages before diagnosis, the lower the chance of survival; hence, the early diagnosis of breast cancer becomes imperative, as does the application of novel technologies to achieve it. Over the years, mammograms have been used in the diagnosis of breast cancer, but the inconclusive deductions made from such scans lead either to false negatives, where cancer patients may be left untreated, or to false positives, where unnecessary biopsies are carried out. This paper presents the application of artificial neural networks to the prediction of the severity of breast tumours (whether benign or malignant) using mammography reports and other factors related to breast cancer.

Keywords: breast cancer, intelligent classification, neural networks, mammography

Procedia PDF Downloads 487
1037 ExactData Smart Tool For Marketing Analysis

Authors: Aleksandra Jonas, Aleksandra Gronowska, Maciej Ścigacz, Szymon Jadczak

Abstract:

Exact Data is a smart tool which helps with meaningful marketing content creation. It helps marketers achieve this by analyzing the text of an advertisement before and after its publication on social media sites like Facebook or Instagram. In our research we focus on four areas of natural language processing (NLP): grammar correction, sentiment analysis, irony detection and advertisement interpretation. Our research has identified a considerable lack of NLP tools for the Polish language that specifically aid online marketers. In light of this, our research team has set out to create a robust and versatile NLP tool for the Polish language. The primary objective of our research is to develop a tool that can perform a range of language processing tasks in this language, such as sentiment analysis, text classification, text correction and text interpretation. Our team has been working diligently to create a tool that is accurate, reliable, and adaptable to the specific linguistic features of Polish, and that can provide valuable insights for a wide range of marketers' needs. In addition to the Polish language version, we are also developing an English version of the tool, which will enable us to expand the reach and impact of our research to a wider audience. Another area of focus in our research involves tackling the challenge of the limited availability of linguistically diverse corpora for non-English languages, which presents a significant barrier in the development of NLP applications. One approach we have been pursuing is the translation of existing English corpora, which would enable us to use the wealth of linguistic resources available in English for other languages. Furthermore, we are looking into other methods, such as gathering language samples from social media platforms.
By analyzing the language used in social media posts, we can collect a wide range of data that reflects the unique linguistic characteristics of specific regions and communities, which can then be used to enhance the accuracy and performance of NLP algorithms for non-English languages. In doing so, we hope to broaden the scope and capabilities of NLP applications. Our research focuses on several key NLP techniques including sentiment analysis, text classification, text interpretation and text correction. To ensure that we can achieve the best possible performance for these techniques, we are evaluating and comparing different approaches and strategies for implementing them. We are exploring a range of different methods, including transformers and convolutional neural networks (CNNs), to determine which ones are most effective for different types of NLP tasks. By analyzing the strengths and weaknesses of each approach, we can identify the most effective techniques for specific use cases, and further enhance the performance of our tool. Our research aims to create a tool, which can provide a comprehensive analysis of advertising effectiveness, allowing marketers to identify areas for improvement and optimize their advertising strategies. The results of this study suggest that a smart tool for advertisement analysis can provide valuable insights for businesses seeking to create effective advertising campaigns.

Keywords: NLP, AI, IT, language, marketing, analysis

Procedia PDF Downloads 84
1036 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for collecting data and by the advantages of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework with a machine learning kernel based on tweet data analytics is presented. Influenza and three cities of the United Arab Emirates (Abu Dhabi, Al Ain and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
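
To show how a recall figure can be read off a confusion matrix, the sketch below computes per-class recall and a macro-averaged overall recall. How the paper aggregates recall across classes and folds is not stated, so the macro average here is an assumption, and the function names and the 2 x 2 counts are made up for illustration.

```python
def recall_per_class(matrix):
    """Recall of each class; matrix[i][j] counts class-i items predicted as j."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


def macro_recall(matrix):
    """Unweighted mean of the per-class recalls (an assumed aggregation)."""
    recalls = recall_per_class(matrix)
    return sum(recalls) / len(recalls)


# Hypothetical counts: rows are true classes (flu-related, unrelated tweets).
confusion = [[8, 2],
             [1, 9]]
overall = macro_recall(confusion)
```

With N-fold cross-validation, the per-fold matrices can be summed cell by cell before applying the same functions.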

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 115
1035 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large amount of invalid data and, in the worst case, lead to an exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification.

Procedia PDF Downloads 141
1034 A Weighted Approach to Unconstrained Iris Recognition

Authors: Yao-Hong Tsai

Abstract:

This paper presents a weighted approach to unconstrained iris recognition. Commercial systems are usually characterized by strong acquisition constraints that rely on the subject’s cooperation, which is not always achievable in real-life scenarios. Researchers have focused on reducing these constraints while maintaining system performance with new techniques. To cope with large variation in the environment, the proposed iris recognition system introduces two main improvements. First, to handle extremely uneven lighting conditions, statistics-based illumination normalization is applied to the eye region to increase the accuracy of the iris features. The detection of the iris image is based on the Adaboost algorithm. Second, the weighted approach is designed using Gaussian functions of the distance to the center of the iris. A local binary pattern (LBP) histogram is then applied to texture classification with these weights. Experiments showed that the proposed system gives users a more flexible and feasible way to interact with the verification system through iris recognition.
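As a rough illustration of the Gaussian-weighted LBP histogram described above, the following minimal sketch lets every pixel vote for its LBP code with a weight that depends on its distance to the iris center. The function names, the 8-neighbour LBP variant, the decaying form of the weight and the `sigma` parameter are assumptions made for illustration; the abstract does not specify them.

```python
import math

def lbp_code(img, r, c):
    """8-neighbour local binary pattern code for the pixel at (r, c)."""
    center = img[r][c]
    # Neighbours in clockwise order starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def weighted_lbp_histogram(img, center, sigma):
    """256-bin LBP histogram with Gaussian distance weighting.

    Assumption: the weight decays with the squared distance from
    `center` (row, col); the paper may use a different weighting.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    hist = [0.0] * 256
    cr, cc = center
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            d2 = (r - cr) ** 2 + (c - cc) ** 2
            weight = math.exp(-d2 / (2.0 * sigma ** 2))
            hist[lbp_code(img, r, c)] += weight
    total = sum(hist)
    # Normalise so that histograms of different regions are comparable.
    return [h / total for h in hist] if total > 0 else hist
```

Two such histograms could then be compared with any histogram distance (for example chi-square) to produce a match score.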

Keywords: authentication, iris recognition, adaboost, local binary pattern

Procedia PDF Downloads 223
1033 A Five-Year Experience of Intensity Modulated Radiotherapy in Nasopharyngeal Carcinomas in Tunisia

Authors: Omar Nouri, Wafa Mnejja, Fatma Dhouib, Syrine Zouari, Wicem Siala, Ilhem Charfeddine, Afef Khanfir, Leila Farhat, Nejla Fourati, Jamel Daoud

Abstract:

Purpose and Objective: Intensity modulated radiotherapy (IMRT), associated with induction chemotherapy (IC) and/or concomitant chemotherapy (CC), is currently the recommended treatment modality for nasopharyngeal carcinomas (NPC). The aim of this study was to evaluate the therapeutic results and the patterns of relapse with this treatment protocol. Material and methods: A retrospective monocentric study of 145 patients with NPC treated between June 2016 and July 2021. All patients received IMRT with simultaneous integrated boost (SIB) of 33 daily fractions at a dose of 69.96 Gy for high-risk volume, 60 Gy for intermediate-risk volume and 54 Gy for low-risk volume. The high-risk volume dose was 66.5 Gy in children. Survival analysis was performed according to the Kaplan-Meier method, and the Log-rank test was used to compare factors that may influence survival. Results: Median age was 48 years (11-80) with a sex ratio of 2.9. One hundred twenty tumors (82.7%) were classified as stages III-IV according to the 2017 UICC TNM classification. Ten patients (6.9%) were metastatic at diagnosis. One hundred thirty-five patients (93.1%) received IC, 104 of which (77%) were TPF-based (taxanes, cisplatin and 5-fluorouracil). One hundred thirty-eight patients (95.2%) received CC, mostly cisplatin in 134 cases (97%). After a median follow-up of 50 months [22-82], 46 patients (31.7%) had a relapse: 12 (8.2%) experienced local and/or regional relapse after a median of 18 months [6-43], 29 (20%) experienced distant relapse after a median of 9 months [2-24] and 5 patients (3.4%) had both. Thirty-five patients (24.1%) died, including 5 (3.4%) from a cause other than their cancer. Three-year overall survival (OS), cancer-specific survival, disease-free survival, metastasis-free survival and loco-regional-free survival were respectively 78.1%, 81.3%, 67.8%, 74.5% and 88.1%. Anatomo-clinical factors predicting OS were age > 50 years (88.7 vs. 70.5%; p=0.004), diabetes history (81.2 vs. 66.7%; p=0.027), UICC N classification (100 vs. 95 vs. 77.5 vs. 68.8% respectively for N0, N1, N2 and N3; p=0.008), the practice of a lymph node biopsy (84.2 vs. 57%; p=0.05), and UICC TNM stages III-IV (93.8 vs. 73.6% respectively for stage I-II vs. III-IV; p=0.044). Therapeutic factors predicting OS were the number of CC courses (less than 4 courses: 65.8 vs. 86%; p=0.03, less than 5 courses: 71.5 vs. 89%; p=0.041), a weight loss > 10% during treatment (84.1 vs. 60.9%; p=0.021) and a total cumulative cisplatin dose, including IC and CC, < 380 mg/m² (64.4 vs. 87.6%; p=0.003). Radiotherapy delay and total duration did not significantly affect OS. No grade 3-4 late side effects were noted in the 127 evaluable patients (87.6%). The most common toxicity was dry mouth, which was grade 2 in 47 cases (37%) and grade 1 in 55 cases (43.3%). Conclusion: IMRT for nasopharyngeal carcinoma achieved a high loco-regional control rate for patients during the last five years. However, distant relapses remain frequent and determine the prognosis. We identified many anatomo-clinical and therapeutic prognostic factors. Therefore, high-risk patients require a more aggressive therapeutic approach, such as radiotherapy dose escalation or adding adjuvant chemotherapy.
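For readers unfamiliar with the Kaplan-Meier method mentioned in the abstract above, the following minimal sketch shows how the product-limit survival estimate is computed from follow-up times and event indicators. The function name and the toy inputs are illustrative assumptions; this is not the authors' analysis code and uses none of the study's data.

```python
def kaplan_meier(durations, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    durations: follow-up time of each subject.
    events:    True if the event (e.g. death) was observed at that time,
               False if the subject was censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve
```

Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.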

Keywords: therapeutic results, prognostic factors, intensity-modulated radiotherapy, nasopharyngeal carcinoma

Procedia PDF Downloads 62
1032 Data-Centric Anomaly Detection with Diffusion Models

Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu

Abstract:

Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performance. With the addition of images generated via diffusion models (equivalent to 10% of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.

Keywords: diffusion models, anomaly detection, data-centric, generative AI

Procedia PDF Downloads 81
1031 Classification Earthquake Distribution in the Banda Sea Collision Zone with Point Process Approach

Authors: H. J. Wattimanela, U. S. Passaribu, N. T. Puspito, S. W. Indratno

Abstract:

The Banda Sea collision zone (BSCZ) is the result of the interaction and convergence of the Indo-Australian, Eurasian and Pacific plates. It is located in the eastern part of Indonesia and has very high seismic activity. In this research, we calculate the rate (λ) and the Mean Square Error (MSE). Based on these results, we identify the Poisson distribution of earthquakes in the BSCZ using the point process approach. The chi-square test and the Anscombe test are used in the process of identifying a Poisson distribution in each partition area. The data used are earthquakes with magnitude ≥ 6 SR in the period 1964-2013, sourced from BMKG Jakarta. This research is expected to help the Moluccas Province and surrounding local governments in preparing spatial planning documents related to disaster management.
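The rate estimate and chi-square check described above can be sketched as follows. This is only an illustration under stated assumptions: the input is a hypothetical list of event counts per time interval, the MSE is taken between observed and expected frequencies (the abstract does not say what the MSE is computed over), and the Anscombe test is not covered.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events for a Poisson rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_fit(counts):
    """Fit a Poisson rate to per-interval event counts.

    Returns (lam, mse, chi2), where lam is the maximum-likelihood rate,
    mse is the mean squared difference between observed and expected
    frequencies (an assumed definition), and chi2 is the Pearson
    chi-square statistic.
    """
    n = len(counts)
    lam = sum(counts) / n
    max_k = max(counts)
    # Observed number of intervals containing 0, 1, ..., max_k events.
    observed = [0] * (max_k + 1)
    for k in counts:
        observed[k] += 1
    # Expected frequencies; the last bin absorbs the tail P(X >= max_k).
    expected = [n * poisson_pmf(k, lam) for k in range(max_k)]
    expected.append(n - sum(expected))
    pairs = list(zip(observed, expected))
    mse = sum((o - e) ** 2 for o, e in pairs) / len(pairs)
    chi2 = sum((o - e) ** 2 / e for o, e in pairs if e > 0)
    return lam, mse, chi2
```

The statistic would then be compared against a chi-square critical value with (number of bins minus 2) degrees of freedom, since one parameter is estimated from the data.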

Keywords: molluca banda sea collision zone, earthquakes, mean square error, poisson distribution, chi-square test, anscombe test

Procedia PDF Downloads 299
1030 Errors in Selected Writings of EFL Students: A Study of Department of English, Taraba State University, Jalingo, Nigeria

Authors: Joy Aworookoroh

Abstract:

Writing is one of the active skills in language learning. Students of English as a foreign language are expected to write efficiently and proficiently in the language; however, there are usually challenges to optimal performance and competence in writing. In a foreign language learning situation, errors are more positive than negative, as they provide the basis for addressing the limitations of the students. This paper investigates the situation in the Department of English, Taraba State University, Jalingo. Students are administered a descriptive writing test across different levels of study. The target students are multilingual with an L1 of either Kuteb, Hausa or Junkun languages. The essays are assessed to identify the different kinds of errors in them alongside the classification of the order. Errors of correctness, clarity, engagement, and delivery were identified. However, the study found that the degree of errors decreases with the students' experience of and exposure to an EFL classroom.

Keywords: errors, writings, descriptive essay, multilingual

Procedia PDF Downloads 60
1029 Detection of COVID-19 Cases From X-Ray Images Using Capsule-Based Network

Authors: Donya Ashtiani Haghighi, Amirali Baniasadi

Abstract:

Coronavirus (COVID-19) disease has spread abruptly all over the world since the end of 2019. Computed tomography (CT) scans and X-ray images are used to detect this disease. Different Deep Neural Network (DNN)-based diagnosis solutions have been developed, mainly based on Convolutional Neural Networks (CNNs), to accelerate the identification of COVID-19 cases. However, CNNs lose important information in intermediate layers and require large datasets. In this paper, Capsule Network (CapsNet) is used. Capsule Network performs better than CNNs for small datasets. Accuracy of 0.9885, f1-score of 0.9883, precision of 0.9859, recall of 0.9908, and Area Under the Curve (AUC) of 0.9948 are achieved on the Capsule-based framework with hyperparameter tuning. Moreover, different dropout rates are investigated to decrease overfitting. Accordingly, a dropout rate of 0.1 shows the best results. Finally, we remove one convolution layer and decrease the number of trainable parameters to 146,752, which is a promising result.
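The metrics reported above all derive from the four cells of a binary confusion matrix. The following minimal sketch shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper. AUC is omitted because it requires ranked scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. Undefined ratios (zero denominator) are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For example, `binary_metrics(tp=8, fp=2, fn=2, tn=8)` yields 0.8 for all four metrics, up to floating-point rounding.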

Keywords: capsule network, dropout, hyperparameter tuning, classification

Procedia PDF Downloads 76
1028 An Automated Approach to Consolidate Galileo System Availability

Authors: Marie Bieber, Fabrice Cosson, Olivier Schmitt

Abstract:

Europe's Global Navigation Satellite System, Galileo, provides worldwide positioning and navigation services. The satellites in space are only one part of the Galileo system. An extensive ground infrastructure is essential to oversee the satellites and ensure accurate navigation signals. High reliability and availability of the entire Galileo system are crucial to continuously provide positioning information of high quality to users. Outages are tracked, and operational availability is regularly assessed. A highly flexible and adaptive tool has been developed to automate the Galileo system availability analysis. Not only does it enable a quick availability consolidation, but it also provides first steps towards improving the data quality of maintenance tickets used for the analysis. This includes data import and data preparation, with a focus on processing strings used for classification and identifying faulty data. Furthermore, the tool can handle small amounts of data, which is a major constraint when the aim is to provide accurate statistics.
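To make the idea of processing free-text strings for classification more concrete, here is a minimal sketch of normalizing a ticket description and assigning it a category by keyword. Everything in it is hypothetical: the category names, the keywords and the function names are illustrative assumptions, not details of the tool described above.

```python
import re

# Hypothetical categories and keywords, for illustration only.
CATEGORY_KEYWORDS = {
    "hardware": ("antenna", "receiver", "power supply"),
    "software": ("software", "firmware", "patch"),
}

def normalize(text):
    """Lower-case the text and collapse non-alphanumeric runs to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def classify_ticket(description):
    """Return a category name for a free-text ticket description.

    Descriptions with no usable characters are flagged as 'faulty';
    descriptions matching no keyword are left 'unclassified' so that
    they can be reviewed manually.
    """
    cleaned = normalize(description)
    if not cleaned:
        return "faulty"
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in cleaned for keyword in keywords):
            return category
    return "unclassified"
```

Normalizing before matching means that variants such as "Firmware-PATCH" and "firmware patch" fall into the same category.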

Keywords: availability, data quality, system performance, Galileo, aerospace

Procedia PDF Downloads 165
1027 Efficient Feature Fusion for Noise Iris in Unconstrained Environment

Authors: Yao-Hong Tsai

Abstract:

This paper presents an efficient fusion algorithm for iris images that generates stable features for recognition in unconstrained environments. Recent iris recognition systems focus on real-life scenarios without the subject’s cooperation. Under large variation in the environment, the objective of this paper is to combine information from multiple images of the same iris. The result of image fusion is a new image that is more stable for further iris recognition than each original noisy iris image. A wavelet-based approach for multi-resolution image fusion is applied in the fusion process. The detection of the iris image is based on the Adaboost algorithm, and a local binary pattern (LBP) histogram is then applied to texture classification with the weighting scheme. Experiments showed that the features generated by the proposed fusion algorithm can improve the performance of a verification system based on iris recognition.

Keywords: image fusion, iris recognition, local binary pattern, wavelet

Procedia PDF Downloads 366
1026 Empowering a New Frontier in Heart Disease Detection: Unleashing Quantum Machine Learning

Authors: Sadia Nasrin Tisha, Mushfika Sharmin Rahman, Javier Orduz

Abstract:

Machine learning is applied in a variety of fields throughout the world. The healthcare sector has benefited enormously from it. One of the most effective approaches for predicting human heart diseases is to use machine learning applications to classify data and predict the outcome as a classification. However, with the rapid advancement of quantum technology, quantum computing has emerged as a potential game-changer for many applications. Quantum algorithms have the potential to execute substantially faster than their classical equivalents, which can lead to significant improvements in computational performance and efficiency. In this study, we applied quantum machine learning concepts to predict coronary heart diseases from text data. We ran three experiments with three different features and three feature sets. The data set consisted of 100 data points. We aim to provide a comparative analysis of the two approaches, highlighting the potential benefits of quantum machine learning for predicting heart diseases.

Keywords: quantum machine learning, SVM, QSVM, matrix product state

Procedia PDF Downloads 92
1025 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that would enable predicting student success. These models could improve allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using a random sampling method. Several decision trees were created, and the two that were most successful in predicting student success were chosen, based on two criteria: Grade Point Average (GPA) and the time that a student needs to finish the undergraduate program (time-to-degree). Decision trees proved to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. These types of methods have great potential for use in decision support systems.

Keywords: data mining, knowledge discovery in databases, prediction models, student success

Procedia PDF Downloads 405
1024 Oil Pollution Analysis of the Ecuadorian Rainforest Using Remote Sensing Methods

Authors: Juan Heredia, Naci Dilekli

Abstract:

The Ecuadorian Rainforest has been polluted for almost 60 years with little to no regard to oversight, law, or regulations. The consequences have been vast environmental damage such as pollution and deforestation, as well as sickness and the death of many people and animals. The aim of this paper is to quantify and localize the polluted zones, which has not been done before and is the first step for remediation. To approach this problem, multi-spectral Remote Sensing imagery was utilized with a novel algorithm developed for this study, based on four normalized indices available in the literature. The algorithm classifies the pixels as polluted or healthy. The results of this study include a new algorithm for pixel classification and quantification of the polluted area in the selected image. Those results were finally validated by ground control points found in the literature. The main conclusion of this work is that using hyperspectral images, it is possible to identify polluted vegetation. The future work is environmental remediation, in-situ tests, and more extensive results that would inform new policymaking.
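The pixel classification described above can be illustrated with a single normalized difference index. This sketch uses NDVI computed from near-infrared and red reflectance purely as an example: the abstract does not name its four indices, and the threshold value and function names below are hypothetical assumptions rather than the authors' algorithm.

```python
def normalized_difference(a, b):
    """Generic normalized difference index (a - b) / (a + b)."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0

def classify_pixels(nir, red, threshold=0.4):
    """Label each pixel 'healthy' or 'polluted' from an NDVI threshold.

    nir, red: 2-D lists of reflectance values with the same shape.
    threshold: hypothetical cut-off; a real study would calibrate it
    against ground control points.
    """
    labels = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            ndvi = normalized_difference(n, r)
            row.append("healthy" if ndvi >= threshold else "polluted")
        labels.append(row)
    return labels

def polluted_fraction(labels):
    """Share of pixels labelled 'polluted', as a simple area estimate."""
    flat = [label for row in labels for label in row]
    return flat.count("polluted") / len(flat) if flat else 0.0
```

Multiplying the polluted fraction by the ground area covered by the image gives a first estimate of the polluted surface.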

Keywords: remote sensing, oil pollution quantification, amazon forest, hyperspectral remote sensing

Procedia PDF Downloads 160