Search results for: classification of filters
1358 A Comprehensive Framework for Fraud Prevention and Customer Feedback Classification in E-Commerce
Authors: Samhita Mummadi, Sree Divya Nagalli, Harshini Vemuri, Saketh Charan Nakka, Sumesh K. J.
Abstract:
One of the most significant challenges faced by people in today’s digital era is an alarming increase in fraudulent activities on online platforms. The fascination with online shopping to avoid long queues in shopping malls, the availability of a variety of products, and home delivery of goods have paved the way for a rapid increase in vast online shopping platforms. This has had a major impact on increasing fraudulent activities as well. This loop of online shopping and transactions has paved the way for fraudulent users to commit fraud. For instance, consider a store that orders thousands of products all at once, but what’s fishy about this is the massive number of items purchased and their transactions turning out to be fraud, leading to a huge loss for the seller. Considering scenarios like these underscores the urgent need to introduce machine learning approaches to combat fraud in online shopping. By leveraging robust algorithms, namely KNN, Decision Trees, and Random Forest, which are highly effective in generating accurate results, this research endeavors to discern patterns indicative of fraudulent behavior within transactional data. Introducing a comprehensive solution to this problem in order to empower e-commerce administrators in timely fraud detection and prevention is the primary motive and the main focus. In addition to that, sentiment analysis is harnessed in the model so that the e-commerce admin can tailor to the customer’s and consumer’s concerns, feedback, and comments, allowing the admin to improve the user’s experience. The ultimate objective of this study is to ramp up online shopping platforms against fraud and ensure a safer shopping experience. This paper underscores a model accuracy of 84%. All the findings and observations that were noted during our work lay the groundwork for future advancements in the development of more resilient and adaptive fraud detection systems, which will become crucial as technologies continue to evolve.Keywords: behavior analysis, feature selection, Fraudulent pattern recognition, imbalanced classification, transactional anomalies
Procedia PDF Downloads 271357 Spatial Patterns of Urban Expansion in Kuwait City between 1989 and 2001
Authors: Saad Algharib, Jay Lee
Abstract:
Urbanization is a complex phenomenon that occurs during the city’s development from one form to another. In other words, it is the process when the activities in the land use/land cover change from rural to urban. Since the oil exploration, Kuwait City has been growing rapidly due to its urbanization and population growth by both natural growth and inward immigration. The main objective of this study is to detect changes in urban land use/land cover and to examine the changing spatial patterns of urban growth in and around Kuwait City between 1989 and 2001. In addition, this study also evaluates the spatial patterns of the changes detected and how they can be related to the spatial configuration of the city. Recently, the use of remote sensing and geographic information systems became very useful and important tools in urban studies because of the integration of them can allow and provide the analysts and planners to detect, monitor and analyze the urban growth in a region effectively. Moreover, both planners and users can predict the trends of the growth in urban areas in the future with remotely sensed and GIS data because they can be effectively updated with required precision levels. In order to identify the new urban areas between 1989 and 2001, the study uses satellite images of the study area and remote sensing technology for classifying these images. Unsupervised classification method was applied to classify images to land use and land cover data layers. After finishing the unsupervised classification method, GIS overlay function was applied to the classified images for detecting the locations and patterns of the new urban areas that developed during the study period. GIS was also utilized to evaluate the distribution of the spatial patterns. For example, Moran’s index was applied for all data inputs to examine the urban growth distribution. Furthermore, this study assesses if the spatial patterns and process of these changes take place in a random fashion or with certain identifiable trends. During the study period, the result of this study indicates that the urban growth has occurred and expanded 10% from 32.4% in 1989 to 42.4% in 2001. Also, the results revealed that the largest increase of the urban area occurred between the major highways after the forth ring road from the center of Kuwait City. Moreover, the spatial distribution of urban growth occurred in cluster manners.Keywords: geographic information systems, remote sensing, urbanization, urban growth
Procedia PDF Downloads 1711356 Normalized Compression Distance Based Scene Alteration Analysis of a Video
Authors: Lakshay Kharbanda, Aabhas Chauhan
Abstract:
In this paper, an application of Normalized Compression Distance (NCD) to detect notable scene alterations occurring in videos is presented. Several research groups have been developing methods to perform image classification using NCD, a computable approximation to Normalized Information Distance (NID) by studying the degree of similarity in images. The timeframes where significant aberrations between the frames of a video have occurred have been identified by obtaining a threshold NCD value, using two compressors: LZMA and BZIP2 and defining scene alterations using Pixel Difference Percentage metrics.Keywords: image compression, Kolmogorov complexity, normalized compression distance, root mean square error
Procedia PDF Downloads 3401355 Recognition of Tifinagh Characters with Missing Parts Using Neural Network
Authors: El Mahdi Barrah, Said Safi, Abdessamad Malaoui
Abstract:
In this paper, we present an algorithm for reconstruction from incomplete 2D scans for tifinagh characters. This algorithm is based on using correlation between the lost block and its neighbors. This system proposed contains three main parts: pre-processing, features extraction and recognition. In the first step, we construct a database of tifinagh characters. In the second step, we will apply “shape analysis algorithm”. In classification part, we will use Neural Network. The simulation results demonstrate that the proposed method give good results.Keywords: Tifinagh character recognition, neural networks, local cost computation, ANN
Procedia PDF Downloads 3341354 Classification of Sturm-Liouville Problems at Infinity
Authors: Kishor J. shinde
Abstract:
We determine the values of k and p such that the Sturm-Liouville differential operator τu=-(d^2 u)/(dx^2) + kx^p u is in limit point case or limit circle case at infinity. In particular it is shown that τ is in the limit point case when (i) for p=2 and ∀k, (ii) for ∀p and k=0, (iii) for all p and k>0, (iv) for 0≤p≤2 and k<0, (v) for p<0 and k<0. τ is in the limit circle case when (i) for p>2 and k<0.Keywords: limit point case, limit circle case, Sturm-Liouville, infinity
Procedia PDF Downloads 3671353 Rice Area Determination Using Landsat-Based Indices and Land Surface Temperature Values
Authors: Burçin Saltık, Levent Genç
Abstract:
In this study, it was aimed to determine a route for identification of rice cultivation areas within Thrace and Marmara regions of Turkey using remote sensing and GIS. Landsat 8 (OLI-TIRS) imageries acquired in production season of 2013 with 181/32 Path/Row number were used. Four different seasonal images were generated utilizing original bands and different transformation techniques. All images were classified individually using supervised classification techniques and Land Use Land Cover Maps (LULC) were generated with 8 classes. Areas (ha, %) of each classes were calculated. In addition, district-based rice distribution maps were developed and results of these maps were compared with Turkish Statistical Institute (TurkSTAT; TSI)’s actual rice cultivation area records. Accuracy assessments were conducted, and most accurate map was selected depending on accuracy assessment and coherency with TSI results. Additionally, rice areas on over 4° slope values were considered as mis-classified pixels and they eliminated using slope map and GIS tools. Finally, randomized rice zones were selected to obtain maximum-minimum value ranges of each date (May, June, July, August, September images separately) NDVI, LSWI, and LST images to test whether they may be used for rice area determination via raster calculator tool of ArcGIS. The most accurate classification for rice determination was obtained from seasonal LSWI LULC map, and considering TSI data and accuracy assessment results and mis-classified pixels were eliminated from this map. According to results, 83151.5 ha of rice areas exist within study area. However, this result is higher than TSI records with an area of 12702.3 ha. Use of maximum-minimum range of rice area NDVI, LSWI, and LST was tested in Meric district. It was seen that using the value ranges obtained from July imagery, gave the closest results to TSI records, and the difference was only 206.4 ha. This difference is normal due to relatively low resolution of images. Thus, employment of images with higher spectral, spatial, temporal and radiometric resolutions may provide more reliable results.Keywords: landsat 8 (OLI-TIRS), LST, LSWI, LULC, NDVI, rice
Procedia PDF Downloads 2281352 Comprehensive Machine Learning-Based Glucose Sensing from Near-Infrared Spectra
Authors: Bitewulign Mekonnen
Abstract:
Context: This scientific paper focuses on the use of near-infrared (NIR) spectroscopy to determine glucose concentration in aqueous solutions accurately and rapidly. The study compares six different machine learning methods for predicting glucose concentration and also explores the development of a deep learning model for classifying NIR spectra. The objective is to optimize the detection model and improve the accuracy of glucose prediction. This research is important because it provides a comprehensive analysis of various machine-learning techniques for estimating aqueous glucose concentrations. Research Aim: The aim of this study is to compare and evaluate different machine-learning methods for predicting glucose concentration from NIR spectra. Additionally, the study aims to develop and assess a deep-learning model for classifying NIR spectra. Methodology: The research methodology involves the use of machine learning and deep learning techniques. Six machine learning regression models, including support vector machine regression, partial least squares regression, extra tree regression, random forest regression, extreme gradient boosting, and principal component analysis-neural network, are employed to predict glucose concentration. The NIR spectra data is randomly divided into train and test sets, and the process is repeated ten times to increase generalization ability. In addition, a convolutional neural network is developed for classifying NIR spectra. Findings: The study reveals that the SVMR, ETR, and PCA-NN models exhibit excellent performance in predicting glucose concentration, with correlation coefficients (R) > 0.99 and determination coefficients (R²)> 0.985. The deep learning model achieves high macro-averaging scores for precision, recall, and F1-measure. These findings demonstrate the effectiveness of machine learning and deep learning methods in optimizing the detection model and improving glucose prediction accuracy. Theoretical Importance: This research contributes to the field by providing a comprehensive analysis of various machine-learning techniques for estimating glucose concentrations from NIR spectra. It also explores the use of deep learning for the classification of indistinguishable NIR spectra. The findings highlight the potential of machine learning and deep learning in enhancing the prediction accuracy of glucose-relevant features. Data Collection and Analysis Procedures: The NIR spectra and corresponding references for glucose concentration are measured in increments of 20 mg/dl. The data is randomly divided into train and test sets, and the models are evaluated using regression analysis and classification metrics. The performance of each model is assessed based on correlation coefficients, determination coefficients, precision, recall, and F1-measure. Question Addressed: The study addresses the question of whether machine learning and deep learning methods can optimize the detection model and improve the accuracy of glucose prediction from NIR spectra. Conclusion: The research demonstrates that machine learning and deep learning methods can effectively predict glucose concentration from NIR spectra. The SVMR, ETR, and PCA-NN models exhibit superior performance, while the deep learning model achieves high classification scores. These findings suggest that machine learning and deep learning techniques can be used to improve the prediction accuracy of glucose-relevant features. Further research is needed to explore their clinical utility in analyzing complex matrices, such as blood glucose levels.Keywords: machine learning, signal processing, near-infrared spectroscopy, support vector machine, neural network
Procedia PDF Downloads 941351 Python Implementation for S1000D Applicability Depended Processing Model - SALERNO
Authors: Theresia El Khoury, Georges Badr, Amir Hajjam El Hassani, Stéphane N’Guyen Van Ky
Abstract:
The widespread adoption of machine learning and artificial intelligence across different domains can be attributed to the digitization of data over several decades, resulting in vast amounts of data, types, and structures. Thus, data processing and preparation turn out to be a crucial stage. However, applying these techniques to S1000D standard-based data poses a challenge due to its complexity and the need to preserve logical information. This paper describes SALERNO, an S1000d AppLicability dEpended pRocessiNg mOdel. This python-based model analyzes and converts the XML S1000D-based files into an easier data format that can be used in machine learning techniques while preserving the different logic and relationships in files. The model parses the files in the given folder, filters them, and extracts the required information to be saved in appropriate data frames and Excel sheets. Its main idea is to group the extracted information by applicability. In addition, it extracts the full text by replacing internal and external references while maintaining the relationships between files, as well as the necessary requirements. The resulting files can then be saved in databases and used in different models. Documents in both English and French languages were tested, and special characters were decoded. Updates on the technical manuals were taken into consideration as well. The model was tested on different versions of the S1000D, and the results demonstrated its ability to effectively handle the applicability, requirements, references, and relationships across all files and on different levels.Keywords: aeronautics, big data, data processing, machine learning, S1000D
Procedia PDF Downloads 1571350 Quasiperiodic Magnetic Chains as Spin Filters
Authors: Arunava Chakrabarti
Abstract:
A one-dimensional chain of magnetic atoms, representative of a quantum gas in an artificial quasi-periodic potential and modeled by the well-known Aubry-Andre function and its variants are studied in respect of its capability of working as a spin filter for arbitrary spins. The basic formulation is explained in terms of a perfectly periodic chain first, where it is shown that a definite correlation between the spin S of the incoming particles and the magnetic moment h of the substrate atoms can open up a gap in the energy spectrum. This is crucial for a spin filtering action. The simple one-dimensional chain is shown to be equivalent to a 2S+1 strand ladder network. This equivalence is exploited to work out the condition for the opening of gaps. The formulation is then applied for a one-dimensional chain with quasi-periodic variation in the site potentials, the magnetic moments and their orientations following an Aubry-Andre modulation and its variants. In addition, we show that a certain correlation between the system parameters can generate absolutely continuous bands in such systems populated by Bloch like extended wave functions only, signaling the possibility of a metal-insulator transition. This is a case of correlated disorder (a deterministic one), and the results provide a non-trivial variation to the famous Anderson localization problem. We have worked within a tight binding formalism and have presented explicit results for the spin half, spin one, three halves and spin five half particles incident on the magnetic chain to explain our scheme and the central results.Keywords: Aubry-Andre model, correlated disorder, localization, spin filter
Procedia PDF Downloads 3561349 Hybrid GNN Based Machine Learning Forecasting Model For Industrial IoT Applications
Authors: Atish Bagchi, Siva Chandrasekaran
Abstract:
Background: According to World Bank national accounts data, the estimated global manufacturing value-added output in 2020 was 13.74 trillion USD. These manufacturing processes are monitored, modelled, and controlled by advanced, real-time, computer-based systems, e.g., Industrial IoT, PLC, SCADA, etc. These systems measure and manipulate a set of physical variables, e.g., temperature, pressure, etc. Despite the use of IoT, SCADA etc., in manufacturing, studies suggest that unplanned downtime leads to economic losses of approximately 864 billion USD each year. Therefore, real-time, accurate detection, classification and prediction of machine behaviour are needed to minimise financial losses. Although vast literature exists on time-series data processing using machine learning, the challenges faced by the industries that lead to unplanned downtimes are: The current algorithms do not efficiently handle the high-volume streaming data from industrial IoTsensors and were tested on static and simulated datasets. While the existing algorithms can detect significant 'point' outliers, most do not handle contextual outliers (e.g., values within normal range but happening at an unexpected time of day) or subtle changes in machine behaviour. Machines are revamped periodically as part of planned maintenance programmes, which change the assumptions on which original AI models were created and trained. Aim: This research study aims to deliver a Graph Neural Network(GNN)based hybrid forecasting model that interfaces with the real-time machine control systemand can detect, predict machine behaviour and behavioural changes (anomalies) in real-time. This research will help manufacturing industries and utilities, e.g., water, electricity etc., reduce unplanned downtimes and consequential financial losses. Method: The data stored within a process control system, e.g., Industrial-IoT, Data Historian, is generally sampled during data acquisition from the sensor (source) and whenpersistingin the Data Historian to optimise storage and query performance. The sampling may inadvertently discard values that might contain subtle aspects of behavioural changes in machines. This research proposed a hybrid forecasting and classification model which combines the expressive and extrapolation capability of GNN enhanced with the estimates of entropy and spectral changes in the sampled data and additional temporal contexts to reconstruct the likely temporal trajectory of machine behavioural changes. The proposed real-time model belongs to the Deep Learning category of machine learning and interfaces with the sensors directly or through 'Process Data Historian', SCADA etc., to perform forecasting and classification tasks. Results: The model was interfaced with a Data Historianholding time-series data from 4flow sensors within a water treatment plantfor45 days. The recorded sampling interval for a sensor varied from 10 sec to 30 min. Approximately 65% of the available data was used for training the model, 20% for validation, and the rest for testing. The model identified the anomalies within the water treatment plant and predicted the plant's performance. These results were compared with the data reported by the plant SCADA-Historian system and the official data reported by the plant authorities. The model's accuracy was much higher (20%) than that reported by the SCADA-Historian system and matched the validated results declared by the plant auditors. Conclusions: The research demonstrates that a hybrid GNN based approach enhanced with entropy calculation and spectral information can effectively detect and predict a machine's behavioural changes. The model can interface with a plant's 'process control system' in real-time to perform forecasting and classification tasks to aid the asset management engineers to operate their machines more efficiently and reduce unplanned downtimes. A series of trialsare planned for this model in the future in other manufacturing industries.Keywords: GNN, Entropy, anomaly detection, industrial time-series, AI, IoT, Industry 4.0, Machine Learning
Procedia PDF Downloads 1501348 Regeneration of Geological Models Using Support Vector Machine Assisted by Principal Component Analysis
Authors: H. Jung, N. Kim, B. Kang, J. Choe
Abstract:
History matching is a crucial procedure for predicting reservoir performances and making future decisions. However, it is difficult due to uncertainties of initial reservoir models. Therefore, it is important to have reliable initial models for successful history matching of highly heterogeneous reservoirs such as channel reservoirs. In this paper, we proposed a novel scheme for regenerating geological models using support vector machine (SVM) and principal component analysis (PCA). First, we perform PCA for figuring out main geological characteristics of models. Through the procedure, permeability values of each model are transformed to new parameters by principal components, which have eigenvalues of large magnitude. Secondly, the parameters are projected into two-dimensional plane by multi-dimensional scaling (MDS) based on Euclidean distances. Finally, we train an SVM classifier using 20% models which show the most similar or dissimilar well oil production rates (WOPR) with the true values (10% for each). Then, the other 80% models are classified by trained SVM. We select models on side of low WOPR errors. One hundred channel reservoir models are initially generated by single normal equation simulation. By repeating the classification process, we can select models which have similar geological trend with the true reservoir model. The average field of the selected models is utilized as a probability map for regeneration. Newly generated models can preserve correct channel features and exclude wrong geological properties maintaining suitable uncertainty ranges. History matching with the initial models cannot provide trustworthy results. It fails to find out correct geological features of the true model. However, history matching with the regenerated ensemble offers reliable characterization results by figuring out proper channel trend. Furthermore, it gives dependable prediction of future performances with reduced uncertainties. We propose a novel classification scheme which integrates PCA, MDS, and SVM for regenerating reservoir models. The scheme can easily sort out reliable models which have similar channel trend with the reference in lowered dimension space.Keywords: history matching, principal component analysis, reservoir modelling, support vector machine
Procedia PDF Downloads 1601347 Microwave Dielectric Properties and Microstructures of Nd(Ti₀.₅W₀.₅)O₄ Ceramics for Application in Wireless Gas Sensors
Authors: Yih-Chien Chen, Yue-Xuan Du, Min-Zhe Weng
Abstract:
Carbon monoxide is a substance produced by the incomplete combustion. It is toxic even at concentrations of less than 100ppm. Since it is colorless and odorless, it is difficult to detect. CO sensors have been developed using a variety of physical mechanisms, including semiconductor oxides, solid electrolytes, and organic semiconductors. Many works have focused on using semiconducting sensors composed of sensitive layers such as ZnO, TiO₂, and NiO with high sensitivity for gases. However, these sensors working at high temperatures increased their power consumption. On the other hand, the dielectric resonator (DR) is attractive for gas detection due to its large surface area and sensitivity for external environments. Materials that are to be employed in sensing devices must have a high-quality factor. Numerous researches into the fergusonite-type structure and related ceramic systems have explored. Extensive research into RENbO₄ ceramics has explored their potential application in resonators, filters, and antennas in modern communication systems, which are operated at microwave frequencies. Nd(Ti₀.₅W₀.₅)O₄ ceramics were synthesized herein using the conventional mixed-oxide method. The Nd(Ti₀.₅W₀.₅)O₄ ceramics were prepared using the conventional solid-state method. Dielectric constants (εᵣ) of 15.4-19.4 and quality factor (Q×f) of 3,600-11,100 GHz were obtained at sintering temperatures in the range 1425-1525°C for 4 h. The dielectric properties of the Nd(Ti₀.₅W₀.₅)O₄ ceramics at microwave frequencies were found to vary with the sintering temperature. For a further understanding of these microwave dielectric properties, they were analyzed by densification, X-ray diffraction (XRD), and by making microstructural observations.Keywords: dielectric constant, dielectric resonators, sensors, quality factor
Procedia PDF Downloads 2601346 PM10 Chemical Characteristics in a Background Site at the Universidad Libre Bogotá
Authors: Laura X. Martinez, Andrés F. Rodríguez, Ruth A. Catacoli
Abstract:
One of the most important factors for air pollution is that the concentrations of PM10 maintain a constant trend, with the exception of some places where that frequently surpasses the allowed ranges established by Colombian legislation. The community that surrounds the Universidad Libre Bogotá is inhabited by a considerable number of students and workers, all of whom are possibly being exposed to PM10 for long periods of time while on campus. Thus, the chemical characterization of PM10 found in the ambient air at the Universidad Libre Bogotá was identified as a problem. A Hi-Vol sampler and EPA Test Method 5 were used to determine if the quality of air is adequate for the human respiratory system. Additionally, quartz fiber filters were utilized during sampling. Samples were taken three days a week during a dry period throughout the months of November and December 2015. The gravimetric analysis method was used to determine PM10 concentrations. The chemical characterization includes non-conventional carcinogenic pollutants. Atomic absorption spectrophotometry (AAS) was used for the determination of metals and VOCs were analyzed using the FTIR (Fourier transform infrared spectroscopy) method. In this way, concentrations of PM10, ranging from values of 13 µg/m3 to 66 µg/m3, were obtained; these values were below standard conditions. This evidence concludes that the PM10 concentrations during an exposure period of 24 hours are lower than the values established by Colombian law, Resolution 610 of 2010; however, when comparing these with the limits set by the World Health Organization (WHO), these concentrations could possibly exceed permissible levels.Keywords: air quality, atomic absorption spectrophotometry, gas chromatography, particulate matter
Procedia PDF Downloads 2561345 Real-Time Visualization Using GPU-Accelerated Filtering of LiDAR Data
Authors: Sašo Pečnik, Borut Žalik
Abstract:
This paper presents a real-time visualization technique and filtering of classified LiDAR point clouds. The visualization is capable of displaying filtered information organized in layers by the classification attribute saved within LiDAR data sets. We explain the used data structure and data management, which enables real-time presentation of layered LiDAR data. Real-time visualization is achieved with LOD optimization based on the distance from the observer without loss of quality. The filtering process is done in two steps and is entirely executed on the GPU and implemented using programmable shaders.Keywords: filtering, graphics, level-of-details, LiDAR, real-time visualization
Procedia PDF Downloads 3081344 Assessment of Conventional Drinking Water Treatment Plants as Removal Systems of Virulent Microsporidia
Authors: M. A. Gad, A. Z. Al-Herrawy
Abstract:
Microsporidia comprises various pathogenic species can infect humans by means of water. Moreover, chlorine disinfection of drinking-water has limitations against this protozoan pathogen. A total of 48 water samples were collected from two drinking water treatment plants having two different filtration systems (slow sand filter and rapid sand filter) during one year period. Samples were collected from inlet and outlet of each plant. Samples were separately filtrated through nitrocellulose membrane (142 mm, 0.45 µm), then eluted and centrifuged. The obtained pellet from each sample was subjected to DNA extraction, then, amplification using genus-specific primer for microsporidia. Each microsporidia-PCR positive sample was performed by two species specific primers for Enterocytozoon bieneusi and Encephalitozoon intestinalis. The results of the present study showed that the percentage of removal for microsporidia through different treatment processes reached its highest rate in the station using slow sand filters (100%), while the removal by rapid sand filter system was 81.8%. Statistically, the two different drinking water treatment plants (slow and rapid) had significant effect for removal of microsporidia. Molecular identification of microsporidia-PCR positive samples using two different primers for Enterocytozoon bieneusi and Encephalitozoon intestinalis showed the presence of the two pervious species in the inlet water of the two stations, while Encephalitozoon intestinalis was detected in the outlet water only. In conclusion, the appearance of virulent microsporidia in treated drinking water may cause potential health threat.Keywords: removal, efficacy, microsporidia, drinking water treatment plants, PCR
Procedia PDF Downloads 2121343 Active Features Determination: A Unified Framework
Authors: Meenal Badki
Abstract:
We address the issue of active feature determination, where the objective is to determine the set of examples on which additional data (such as lab tests) needs to be gathered, given a large number of examples with some features (such as demographics) and some examples with all the features (such as the complete Electronic Health Record). We note that certain features may be more costly, unique, or laborious to gather. Our proposal is a general active learning approach that is independent of classifiers and similarity metrics. It allows us to identify examples that differ from the full data set and obtain all the features for the examples that match. Our comprehensive evaluation shows the efficacy of this approach, which is driven by four authentic clinical tasks.Keywords: feature determination, classification, active learning, sample-efficiency
Procedia PDF Downloads 761342 Use of Fractal Geometry in Machine Learning
Authors: Fuad M. Alkoot
Abstract:
The main component of a machine learning system is the classifier. Classifiers are mathematical models that can perform classification tasks for a specific application area. Additionally, many classifiers are combined using any of the available methods to reduce the classifier error rate. The benefits gained from the combination of multiple classifier designs has motivated the development of diverse approaches to multiple classifiers. We aim to investigate using fractal geometry to develop an improved classifier combiner. Initially we experiment with measuring the fractal dimension of data and use the results in the development of a combiner strategy.Keywords: fractal geometry, machine learning, classifier, fractal dimension
Procedia PDF Downloads 2181341 Arabic Handwriting Recognition Using Local Approach
Authors: Mohammed Arif, Abdessalam Kifouche
Abstract:
Optical character recognition (OCR) has a main role in the present time. It's capable to solve many serious problems and simplify human activities. The OCR yields to 70's, since many solutions has been proposed, but unfortunately, it was supportive to nothing but Latin languages. This work proposes a system of recognition of an off-line Arabic handwriting. This system is based on a structural segmentation method and uses support vector machines (SVM) in the classification phase. We have presented a state of art of the characters segmentation methods, after that a view of the OCR area, also we will address the normalization problems we went through. After a comparison between the Arabic handwritten characters & the segmentation methods, we had introduced a contribution through a segmentation algorithm.Keywords: OCR, segmentation, Arabic characters, PAW, post-processing, SVM
Procedia PDF Downloads 711340 Hybrid Knowledge Approach for Determining Health Care Provider Specialty from Patient Diagnoses
Authors: Erin Lynne Plettenberg, Jeremy Vickery
Abstract:
In an access-control situation, the role of a user determines whether a data request is appropriate. This paper combines vetted web mining and logic modeling to build a lightweight system for determining the role of a health care provider based only on their prior authorized requests. The model identifies provider roles with 100% recall from very little data. This shows the value of vetted web mining in AI systems, and suggests the impact of the ICD classification on medical practice.Keywords: electronic medical records, information extraction, logic modeling, ontology, vetted web mining
Procedia PDF Downloads 1721339 Transformers in Gene Expression-Based Classification
Authors: Babak Forouraghi
Abstract:
A genetic circuit is a collection of interacting genes and proteins that enable individual cells to implement and perform vital biological functions such as cell division, growth, death, and signaling. In cell engineering, synthetic gene circuits are engineered networks of genes specifically designed to implement functionalities that are not evolved by nature. These engineered networks enable scientists to tackle complex problems such as engineering cells to produce therapeutics within the patient's body, altering T cells to target cancer-related antigens for treatment, improving antibody production using engineered cells, tissue engineering, and production of genetically modified plants and livestock. Construction of computational models to realize genetic circuits is an especially challenging task since it requires the discovery of flow of genetic information in complex biological systems. Building synthetic biological models is also a time-consuming process with relatively low prediction accuracy for highly complex genetic circuits. The primary goal of this study was to investigate the utility of a pre-trained bidirectional encoder transformer that can accurately predict gene expressions in genetic circuit designs. The main reason behind using transformers is their innate ability (attention mechanism) to take account of the semantic context present in long DNA chains that are heavily dependent on spatial representation of their constituent genes. Previous approaches to gene circuit design, such as CNN and RNN architectures, are unable to capture semantic dependencies in long contexts as required in most real-world applications of synthetic biology. For instance, RNN models (LSTM, GRU), although able to learn long-term dependencies, greatly suffer from vanishing gradient and low-efficiency problem when they sequentially process past states and compresses contextual information into a bottleneck with long input sequences. In other words, these architectures are not equipped with the necessary attention mechanisms to follow a long chain of genes with thousands of tokens. To address the above-mentioned limitations of previous approaches, a transformer model was built in this work as a variation to the existing DNA Bidirectional Encoder Representations from Transformers (DNABERT) model. It is shown that the proposed transformer is capable of capturing contextual information from long input sequences with attention mechanism. In a previous work on genetic circuit design, the traditional approaches to classification and regression, such as Random Forrest, Support Vector Machine, and Artificial Neural Networks, were able to achieve reasonably high R2 accuracy levels of 0.95 to 0.97. However, the transformer model utilized in this work with its attention-based mechanism, was able to achieve a perfect accuracy level of 100%. Further, it is demonstrated that the efficiency of the transformer-based gene expression classifier is not dependent on presence of large amounts of training examples, which may be difficult to compile in many real-world gene circuit designs.Keywords: transformers, generative ai, gene expression design, classification
Procedia PDF Downloads 591338 Software Architectural Design Ontology
Authors: Muhammad Irfan Marwat, Sadaqat Jan, Syed Zafar Ali Shah
Abstract:
Software architecture plays a key role in software development but absence of formal description of software architecture causes different impede in software development. To cope with these difficulties, ontology has been used as artifact. This paper proposes ontology for software architectural design based on IEEE model for architecture description and Kruchten 4+1 model for viewpoints classification. For categorization of style and views, ISO/IEC 42010 has been used. Corpus method has been used to evaluate ontology. The main aim of the proposed ontology is to classify and locate software architectural design information.Keywords: semantic-based software architecture, software architecture, ontology, software engineering
Procedia PDF Downloads 5491337 Automatic Differential Diagnosis of Melanocytic Skin Tumours Using Ultrasound and Spectrophotometric Data
Authors: Kristina Sakalauskiene, Renaldas Raisutis, Gintare Linkeviciute, Skaidra Valiukeviciene
Abstract:
Cutaneous melanoma is a melanocytic skin tumour, which has a very poor prognosis while is highly resistant to treatment and tends to metastasize. Thickness of melanoma is one of the most important biomarker for stage of disease, prognosis and surgery planning. In this study, we hypothesized that the automatic analysis of spectrophotometric images and high-frequency ultrasonic 2D data can improve differential diagnosis of cutaneous melanoma and provide additional information about tumour penetration depth. This paper presents the novel complex automatic system for non-invasive melanocytic skin tumour differential diagnosis and penetration depth evaluation. The system is composed of region of interest segmentation in spectrophotometric images and high-frequency ultrasound data, quantitative parameter evaluation, informative feature extraction and classification with linear regression classifier. The segmentation of melanocytic skin tumour region in ultrasound image is based on parametric integrated backscattering coefficient calculation. The segmentation of optical image is based on Otsu thresholding. In total 29 quantitative tissue characterization parameters were evaluated by using ultrasound data (11 acoustical, 4 shape and 15 textural parameters) and 55 quantitative features of dermatoscopic and spectrophotometric images (using total melanin, dermal melanin, blood and collagen SIAgraphs acquired using spectrophotometric imaging device SIAscope). In total 102 melanocytic skin lesions (including 43 cutaneous melanomas) were examined by using SIAscope and ultrasound system with 22 MHz center frequency single element transducer. The diagnosis and Breslow thickness (pT) of each MST were evaluated during routine histological examination after excision and used as a reference. The results of this study have shown that automatic analysis of spectrophotometric and high frequency ultrasound data can improve non-invasive classification accuracy of early-stage cutaneous melanoma and provide supplementary information about tumour penetration depth.Keywords: cutaneous melanoma, differential diagnosis, high-frequency ultrasound, melanocytic skin tumours, spectrophotometric imaging
Procedia PDF Downloads 2701336 The Development of User Behavior in Urban Regeneration Areas by Utilizing the Floating Population Data
Authors: Jung-Hun Cho, Tae-Heon Moon, Sun-Young Heo
Abstract:
A lot of urban problems, caused by urbanization and industrialization, have occurred around the world. In particular, the creation of satellite towns, which was attributed to the explicit expansion of the city, has led to the traffic problems and the hollowization of old towns, raising the necessity of urban regeneration in old towns along with the aging of existing urban infrastructure. To select urban regeneration priority regions for the strategic execution of urban regeneration in Korea, the number of population, the number of businesses, and deterioration degree were chosen as standards. Existing standards had a limit in coping with solving urban problems fundamentally and rapidly changing reality. Therefore, it was necessary to add new indicators that can reflect the decline in relevant cities and conditions. In this regard, this study selected Busan Metropolitan City, Korea as the target area as a leading city, where urban regeneration such as an international port city has been activated like Yokohama, Japan. Prior to setting the urban regeneration priority region, the conditions of reality should be reflected because uniform and uncharacterized projects have been implemented without a quantitative analysis about population behavior within the region. For this reason, this study conducted a characterization analysis and type classification, based on the user behaviors by using representative floating population of the big data, which is a hot issue all over the society in recent days. The target areas were analyzed in this study. While 23 regions were classified as three types in existing Busan Metropolitan City urban regeneration priority region, 23 regions were classified as four types in existing Busan Metropolitan City urban regeneration priority region in terms of the type classification on the basis of user behaviors. Four types were classified as follows; type (Ⅰ) of young people - morning type, Type (Ⅱ) of the old and middle-aged- general type with sharp floating population, type (Ⅲ) of the old and middle aged-24hour-type, and type (Ⅳ) of the old and middle aged with less floating population. Characteristics were shown in each region of four types, and the study results of user behaviors were different from those of existing urban regeneration priority region. According to the results, in type (Ⅰ) young people were the majority around the existing old built-up area, where floating population at dawn is four times more than in other areas. In Type (Ⅱ), there were many old and middle-aged people around the existing built-up area and general neighborhoods, where the average floating population was more than in other areas due to commuting, while in type (Ⅲ), there was no change in the floating population throughout 24 hours, although there were many old and middle aged people in population around the existing general neighborhoods. Type (Ⅳ) includes existing economy-based type, central built-up area type, and general neighborhood type, where old and middle aged people were the majority as a general type of commuting with less floating population. Unlike existing urban regeneration priority region, these types were sub-divided according to types, and in this study, approach methods and basic orientations of urban regeneration were set to reflect the reality to a certain degree including the indicators of effective floating population to identify the dynamic activity of urban areas and existing regeneration priority areas in connection with urban regeneration projects by regions. Therefore, it is possible to make effective urban plans through offering the substantial ground by utilizing scientific and quantitative data. To induce more realistic and effective regeneration projects, the regeneration projects tailored to the present local conditions should be developed by reflecting the present conditions on the formulation of urban regeneration strategic plans.Keywords: floating population, big data, urban regeneration, urban regeneration priority region, type classification
Procedia PDF Downloads 2131335 A Deep Learning Approach for the Predictive Quality of Directional Valves in the Hydraulic Final Test
Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter
Abstract:
The increasing use of deep learning applications in production is becoming a competitive advantage. Predictive quality enables the assurance of product quality by using data-driven forecasts via machine learning models as a basis for decisions on test results. The use of real Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the leakage of directional valves.Keywords: artificial neural networks, classification, hydraulics, predictive quality, deep learning
Procedia PDF Downloads 2441334 Riesz Mixture Model for Brain Tumor Detection
Authors: Mouna Zitouni, Mariem Tounsi
Abstract:
This research introduces an application of the Riesz mixture model for medical image segmentation for accurate diagnosis and treatment of brain tumors. We propose a pixel classification technique based on the Riesz distribution, derived from an extended Bartlett decomposition. To our knowledge, this is the first study addressing this approach. The Expectation-Maximization algorithm is implemented for parameter estimation. A comparative analysis, using both synthetic and real brain images, demonstrates the superiority of the Riesz model over a recent method based on the Wishart distribution.Keywords: EM algorithm, segmentation, Riesz probability distribution, Wishart probability distribution
Procedia PDF Downloads 181333 Case Studies in Three Domains of Learning: Cognitive, Affective, Psychomotor
Authors: Zeinabsadat Haghshenas
Abstract:
Bloom’s Taxonomy has been changed during the years. The idea of this writing is about the revision that has happened in both facts and terms. It also contains case studies of using cognitive Bloom’s taxonomy in teaching geometric solids to the secondary school students, affective objectives in a creative workshop for adults and psychomotor objectives in fixing a malfunctioned refrigerator lamp. There is also pointed to the important role of classification objectives in adult education as a way to prevent memory loss.Keywords: adult education, affective domain, cognitive domain, memory loss, psychomotor domain
Procedia PDF Downloads 4681332 Robust Heart Rate Estimation from Multiple Cardiovascular and Non-Cardiovascular Physiological Signals Using Signal Quality Indices and Kalman Filter
Authors: Shalini Rankawat, Mansi Rankawat, Rahul Dubey, Mazad Zaveri
Abstract:
Physiological signals such as electrocardiogram (ECG) and arterial blood pressure (ABP) in the intensive care unit (ICU) are often seriously corrupted by noise, artifacts, and missing data, which lead to errors in the estimation of heart rate (HR) and incidences of false alarm from ICU monitors. Clinical support in ICU requires most reliable heart rate estimation. Cardiac activity, because of its relatively high electrical energy, may introduce artifacts in Electroencephalogram (EEG), Electrooculogram (EOG), and Electromyogram (EMG) recordings. This paper presents a robust heart rate estimation method by detection of R-peaks of ECG artifacts in EEG, EMG & EOG signals, using energy-based function and a novel Signal Quality Index (SQI) assessment technique. SQIs of physiological signals (EEG, EMG, & EOG) were obtained by correlation of nonlinear energy operator (teager energy) of these signals with either ECG or ABP signal. HR is estimated from ECG, ABP, EEG, EMG, and EOG signals from separate Kalman filter based upon individual SQIs. Data fusion of each HR estimate was then performed by weighing each estimate by the Kalman filters’ SQI modified innovations. The fused signal HR estimate is more accurate and robust than any of the individual HR estimate. This method was evaluated on MIMIC II data base of PhysioNet from bedside monitors of ICU patients. The method provides an accurate HR estimate even in the presence of noise and artifacts.Keywords: ECG, ABP, EEG, EMG, EOG, ECG artifacts, Teager-Kaiser energy, heart rate, signal quality index, Kalman filter, data fusion
Procedia PDF Downloads 6961331 Strategic Management for Corporate Social Responsibility in Colombian Industries: A Typology of CSR
Authors: Iris Maria Velez Osorio
Abstract:
There has been in the last decade a concern about the environment, particularly about clean and enough water for human consumption but, some enterprises had some trouble to understand the limited resources in the environment. This research tries to understand how some industries are better oriented to the preservation of the environment through investment for strategic management of scarce resources and try in the best way possible, the contaminants. It was made an industry classification since four different group of theories for Corporate Social Responsibility agree with variables of: investment in environmental care, water protection, and residues treatment finding different levels of commitment with CSR.Keywords: corporate social responsibility, environment, strategic management, water
Procedia PDF Downloads 3761330 The Research on Diesel Bus Emissions in Ulaanbaatar City: Mongolia
Authors: Tsetsegmaa A., Bayarsuren B., Altantsetseg Ts.
Abstract:
To make the best decision on reducing harmful emissions from buses, we need to have a clear understanding of the current state of their actual emissions. The emissions from city buses running on high sulfur fuel, particularly particulate matter (PM) and nitrogen oxides (NOx) from the exhaust gases of conventional diesel engines, have been studied and measured with and without diesel particulate filter (DPF) in Ulaanbaatar city. The study was conducted by using the PEMS (Portable Emissions Measurement System) and gravimetric method in real traffic conditions. The obtained data were used to determine the actual emission rates and to evaluate the effectiveness of the selected particulate filters. Actual road and daily PM emissions from city buses were determined during the warm and cold seasons. A bus with an average daily mileage of 242 km was found to emit 166.155 g of PM into the city's atmosphere on average per day, with 141.3 g in summer and 175.8 g in winter. The actual PM of the city bus is 0.6866 g/km. The concentration of NOx in the exhaust gas averages 1410.94 ppm. The use of DPF reduced the exhaust gas opacity of 24 buses by an average of 97% and filtered a total of 340.4 kg of soot from these buses over a period of six months. Retrofitting an old conventional diesel engine with cassette-type silicon carbide (SiC) DPF, despite the laboriousness of cleaning, can significantly reduce particulate matter emissions. Innovation: First comprehensive road PM and NOx emission dataset and actual road emissions from public buses have been identified. PM and NOx mathematical model equations have been estimated as a function of the bus technical speed and engine revolution with and without DPF.Keywords: conventional diesel, silicon carbide, real-time onboard measurements, particulate matter, diesel retrofit, fuel sulphur
Procedia PDF Downloads 1651329 The Condition Testing of Damaged Plates Using Acoustic Features and Machine Learning
Authors: Kyle Saltmarsh
Abstract:
Acoustic testing possesses many benefits due to its non-destructive nature and practicality. There hence exists many scenarios in which using acoustic testing for condition testing shows powerful feasibility. A wealth of information is contained within the acoustic and vibration characteristics of structures, allowing the development meaningful features for the classification of their respective condition. In this paper, methods, results, and discussions are presented on the use of non-destructive acoustic testing coupled with acoustic feature extraction and machine learning techniques for the condition testing of manufactured circular steel plates subjected to varied levels of damage.Keywords: plates, deformation, acoustic features, machine learning
Procedia PDF Downloads 337