Search results for: malware classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2223

1293 Adolescent-Parent Relationship as the Most Important Factor in Preventing Mood Disorders in Adolescents: An Application of Artificial Intelligence to Social Studies

Authors: Elżbieta Turska

Abstract:

Introduction: One of the most difficult times in a person’s life is adolescence. The experiences in this period may shape the future life of this person to a large extent. This is the reason why many young people experience sadness, dejection, hopelessness, a sense of worthlessness, as well as a loss of interest in various activities and social relationships, all of which are often classified as mood disorders. As many as 15-40% of adolescents experience depressed moods, which for most of them resolve and are not carried into adulthood. However, 5-6% of those affected by mood disorders develop the depressive syndrome and as many as 1-3% develop full-blown clinical depression. Materials: A large questionnaire was given to 2508 students, aged 13–16 years old, and one of its parts was the Burns checklist, i.e. the standard test for identifying depressed mood. The questionnaire asked about many aspects of the student’s life; it included a total of 53 questions, most of which had subquestions. It is important to note that the data suffered from many problems, the most important of which were missing data and collinearity. Aim: In order to identify the correlates of mood disorders, we built predictive models which were then trained and validated. Our aim was not to be able to predict which students suffer from mood disorders but rather to explore the factors influencing mood disorders. Methods: The problems with the data described above practically excluded the use of all classical statistical methods. For this reason, we attempted to use the following Artificial Intelligence (AI) methods: classification trees with surrogate variables, random forests and xgboost. All analyses were carried out with the use of the mlr package for the R programming language. Results: The predictive model built by the classification trees algorithm outperformed the other algorithms by a large margin.
As a result, we were able to rank the variables (questions and subquestions from the questionnaire) from the most to least influential as far as protection against mood disorder is concerned. Thirteen out of twenty most important variables reflect the relationships with parents. This seems to be a really significant result both from the cognitive point of view and also from the practical point of view, i.e. as far as interventions to correct mood disorders are concerned.

Keywords: mood disorders, adolescents, family, artificial intelligence

Procedia PDF Downloads 101
1292 Detecting Covid-19 Fake News Using Deep Learning Technique

Authors: AnjalI A. Prasad

Abstract:

Nowadays, social media plays an important role in spreading misinformation or fake news. This study analyzes the fake news related to the COVID-19 pandemic spread on social media. This paper aims at evaluating and comparing different approaches that are used to mitigate this issue, including popular deep learning approaches, such as CNN, RNN, LSTM, and the BERT algorithm for classification. To evaluate the models’ performance, we used accuracy, precision, recall, and F1-score as the evaluation metrics. Finally, we compare which of the four algorithms shows the best result.
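The four evaluation metrics named in this abstract can be computed from the counts of a binary confusion matrix. The following is a minimal sketch, assuming fake news is the positive class (label 1); the function name `binary_metrics` and the sample labels are illustrative and do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = fake news, 0 = genuine news.
scores = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                        [1, 0, 1, 0, 1, 0, 1, 0])
```

Comparing several models then amounts to calling the same function on each model's predictions for a shared test set.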

Keywords: BERT, CNN, LSTM, RNN

Procedia PDF Downloads 206
1291 Wearable Antenna for Diagnosis of Parkinson’s Disease Using a Deep Learning Pipeline on Accelerated Hardware

Authors: Subham Ghosh, Banani Basu, Marami Das

Abstract:

Background: The development of compact, low-power antenna sensors has resulted in hardware restructuring, allowing for wireless ubiquitous sensing. The antenna sensors can create wireless body-area networks (WBAN) by linking various wireless nodes across the human body. WBAN and IoT applications, such as remote health and fitness monitoring and rehabilitation, are becoming increasingly important. In particular, Parkinson’s disease (PD), a common neurodegenerative disorder, presents clinical features that can be easily misdiagnosed. As a mobility disease, it may greatly benefit from the antenna’s nearfield approach with a variety of activities that can use WBAN and IoT technologies to increase diagnosis accuracy and patient monitoring. Methodology: This study investigates the feasibility of leveraging a single patch antenna mounted (using cloth) on the wrist dorsal to differentiate actual Parkinson's disease (PD) from false PD using a small hardware platform. The semi-flexible antenna operates at the 2.4 GHz ISM band and collects reflection coefficient (Γ) data from patients performing five exercises designed for the classification of PD and other disorders such as essential tremor (ET) or those physiological disorders caused by anxiety or stress. The obtained data is normalized and converted into 2-D representations using the Gabor wavelet transform (GWT). Data augmentation is then used to expand the dataset size. A lightweight deep-learning (DL) model is developed to run on the GPU-enabled NVIDIA Jetson Nano platform. The DL model processes the 2-D images for feature extraction and classification. Findings: The DL model was trained and tested on both the original and augmented datasets, thus doubling the dataset size. To ensure robustness, a 5-fold stratified cross-validation (5-FSCV) method was used. 
The proposed framework, utilizing a DL model with 1.356 million parameters on the NVIDIA Jetson Nano, achieved optimal performance in terms of accuracy of 88.64%, F1-score of 88.54, and recall of 90.46%, with a latency of 33 seconds per epoch.
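The 5-fold stratified cross-validation mentioned in the findings can be illustrated with a small index splitter. This is a sketch only, not the authors' pipeline: the function name `stratified_k_fold`, the round-robin assignment, and the class labels in the usage lines are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds keep class proportions
    roughly equal to those of the full dataset."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of each class round-robin across the k folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx for j, fold in enumerate(folds)
                           if j != i for idx in fold)
        yield train_idx, test_idx


# Hypothetical labels: 10 recordings of one class and 5 of another.
labels = ["PD"] * 10 + ["ET"] * 5
splits = list(stratified_k_fold(labels, k=5))
```

Each sample lands in exactly one test fold, so every recording is used for testing once and for training four times.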

Keywords: antenna, deep-learning, GPU-hardware, Parkinson’s disease

Procedia PDF Downloads 12
1290 Design of a Backlight Hyperspectral Imaging System for Enhancing Image Quality in Artificial Vision Food Packaging Online Inspections

Authors: Ferran Paulí Pla, Pere Palacín Farré, Albert Fornells Herrera, Pol Toldrà Fernández

Abstract:

Poor image acquisition is limiting the promising growth of industrial vision in food control. In recent years, the food industry has witnessed a significant increase in the implementation of automation in quality control through artificial vision, a trend that continues to grow. During the packaging process, some defects may appear, compromising the proper sealing of the products and diminishing their shelf life, sanitary conditions and overall properties. While failure to detect a defective product leads to major losses, food producers also aim to minimize over-rejection to avoid unnecessary waste. Thus, accuracy in the evaluation of the products is crucial, and, given the large production volumes, even small improvements have a significant impact. Recently, efforts have been focused on maximizing the performance of classification neural networks; nevertheless, their performance is limited by the quality of the input data. Monochrome linear backlight systems are most commonly used for online inspections of food packaging thermo-sealing zones. These simple acquisition systems fit the high cadence of the production lines imposed by the market demand. Nevertheless, they provide a limited amount of data, which negatively impacts classification algorithm training. A desired situation would be one where data quality is maximized in terms of obtaining the key information to detect defects while maintaining a fast working pace. This work presents a backlight hyperspectral imaging system designed and implemented replicating an industrial environment to better understand the relationship between visual data quality and spectral illumination range for a variety of packed food products. Furthermore, results led to the identification of advantageous spectral bands that significantly enhance image quality, providing clearer detection of defects.

Keywords: artificial vision, food packaging, hyperspectral imaging, image acquisition, quality control

Procedia PDF Downloads 23
1289 Assessing the Utility of Unmanned Aerial Vehicle-Borne Hyperspectral Image and Photogrammetry Derived 3D Data for Wetland Species Distribution Quick Mapping

Authors: Qiaosi Li, Frankie Kwan Kit Wong, Tung Fung

Abstract:

Lightweight unmanned aerial vehicles (UAVs) carrying novel sensors offer a low-cost approach to data acquisition in complex environments. This study established a framework for applying a UAV system to quick mapping in a complex environment and assessed the performance of a UAV-based hyperspectral image and a digital surface model (DSM) derived from photogrammetric point clouds for the classification of 13 species in the wetland area of the Mai Po Inner Deep Bay Ramsar Site, Hong Kong. The study area was part of a shallow bay with flat terrain, and the major species included reedbed and four mangroves: Kandelia obovata, Aegiceras corniculatum, Acrostichum aureum and Acanthus ilicifolius. Other species included various graminaceous plants, arbor, shrub and the invasive species Mikania micrantha. In particular, the invasive species climbed up to the mangrove canopy and caused damage and morphological change, which might increase the difficulty of distinguishing species. Hyperspectral images were acquired by a Headwall Nano sensor with a spectral range from 400 nm to 1000 nm and 0.06 m spatial resolution. A sequence of multi-view RGB images was captured with 0.02 m spatial resolution and 75% overlap. The hyperspectral image was corrected for radiative and geometric distortion, while the high-resolution RGB images were matched to generate maximally dense point clouds. Furthermore, a 5 cm grid digital surface model (DSM) was derived from the dense point clouds. Multiple feature reduction methods were compared to identify the most efficient method and to explore the significant spectral bands in distinguishing different species. The examined methods included stepwise discriminant analysis (DA), support vector machine (SVM) and minimum noise fraction (MNF) transformation.
Subsequently, spectral subsets composed of the 20 most important bands extracted by SVM, DA and MNF, and multi-source subsets adding the DSM to the 20 spectral bands, served as input to a maximum likelihood classifier (MLC) and an SVM classifier to compare the classification results. Classification results showed that the feature reduction methods, from best to worst, are MNF transformation, DA and SVM. The accuracy with MNF transformation was even higher than the result with all bands as input. Selected bands frequently lay along the green peak, red edge and near infrared. Additionally, DA found that the chlorophyll absorption red band and the yellow band were also important for species classification. In terms of 3D data, the DSM enhanced the discriminant capacity among low plants, arbor and mangrove. Meanwhile, the DSM largely reduced misclassification due to the shadow effect and inter-species morphological variation. With respect to the classifier, the nonparametric SVM outperformed MLC for high-dimensional and multi-source data in this study. The SVM classifier tended to produce higher overall accuracy and fewer scattered patches, although it takes more time than MLC. The best result was obtained by combining MNF components and the DSM in the SVM classifier. This study offered a precise species distribution survey solution for inaccessible wetland areas at a low cost in time and labour. In addition, the findings on the positive effect of the DSM and on spectral feature identification indicate that UAV-borne hyperspectral and photogrammetry-derived 3D data are promising for further research on wetland species, such as bio-parameter modelling and biological invasion monitoring.

Keywords: digital surface model (DSM), feature reduction, hyperspectral, photogrammetric point cloud, species mapping, unmanned aerial vehicle (UAV)

Procedia PDF Downloads 257
1288 Flood Hazard Assessment and Land Cover Dynamics of the Orai Khola Watershed, Bardiya, Nepal

Authors: Loonibha Manandhar, Rajendra Bhandari, Kumud Raj Kafle

Abstract:

Nepal’s Terai region is a part of the Ganges river basin, which is one of the most disaster-prone areas of the world, with recurrent monsoon flooding causing millions in damage and the death and displacement of hundreds of people and households every year. The vulnerability of human settlements to natural disasters such as floods is increasing, and mapping changes in land use practices and hydro-geological parameters is essential in developing resilient communities and strong disaster management policies. The objective of this study was to develop a flood hazard zonation map of the Orai Khola watershed and map the decadal land use/land cover dynamics of the watershed. The watershed area was delineated using SRTM DEM, and LANDSAT images were classified into five land use classes (forest, grassland, sediment and bare land, settlement area and cropland, and water body) using pixel-based semi-automated supervised maximum likelihood classification. Decadal changes in each class were then quantified using spatial modelling. Flood hazard mapping was performed by assigning weights to the factors slope, rainfall distribution, distance from the river and land use/land cover on the basis of their estimated influence in causing flood hazard and performing weighted overlay analysis to identify areas that are highly vulnerable. The forest and grassland coverage increased by 11.53 km² (3.8%) and 1.43 km² (0.47%) from 1996 to 2016. The sediment and bare land areas decreased by 12.45 km² (4.12%) from 1996 to 2016 whereas settlement and cropland areas showed a consistent increase to 14.22 km² (4.7%). Waterbody coverage also increased to 0.3 km² (0.09%) from 1996 to 2016. Of the total watershed area, 1.27% (3.65 km²) was categorized as very low hazard zone, 20.94% (60.31 km²) as low hazard zone, 37.59% (108.3 km²) as moderate hazard zone and 29.25% (84.27 km²) as high hazard zone, while 31 villages comprising 10.95% (31.55 km²) were categorized as very high hazard zone.
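The weighted overlay step described in this abstract combines per-cell factor scores into a single hazard index. The sketch below shows the idea for one raster cell. The weights, the 1-to-5 scoring scale, and the class breakpoints are all hypothetical placeholders, since the abstract does not report the values the authors used.

```python
# Hypothetical weights (summing to 1) for the four factors named in the abstract.
WEIGHTS = {
    "slope": 0.3,
    "rainfall": 0.2,
    "river_distance": 0.3,
    "land_cover": 0.2,
}

# Hypothetical upper bounds for each zone on a 1-to-5 hazard index.
ZONE_BREAKS = [(1.8, "very low"), (2.6, "low"), (3.4, "moderate"), (4.2, "high")]


def hazard_score(cell):
    """Weighted sum of factor scores (each assumed to be reclassified to 1..5)."""
    return sum(WEIGHTS[name] * cell[name] for name in WEIGHTS)


def hazard_zone(score):
    """Map a hazard index to a named zone using the assumed breakpoints."""
    for upper, name in ZONE_BREAKS:
        if score < upper:
            return name
    return "very high"


# A flat cell close to the river with heavy rainfall (illustrative scores).
cell = {"slope": 5, "rainfall": 4, "river_distance": 5, "land_cover": 3}
zone = hazard_zone(hazard_score(cell))
```

In a GIS workflow the same weighted sum would be applied to whole raster layers rather than to one dictionary per cell.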

Keywords: flood hazard, land use/land cover, Orai river, supervised maximum likelihood classification, weighted overlay analysis

Procedia PDF Downloads 355
1287 Characterization of Agroforestry Systems in Burkina Faso Using an Earth Observation Data Cube

Authors: Dan Kanmegne

Abstract:

Africa will become the most populated continent by the end of the century, with around 4 billion inhabitants. Food security and climate changes will become continental issues since agricultural practices depend on climate but also contribute to global emissions and land degradation. Agroforestry has been identified as a cost-efficient and reliable strategy to address these two issues. It is defined as the integrated management of trees and crops/animals in the same land unit. Agroforestry provides benefits in terms of goods (fruits, medicine, wood, etc.) and services (windbreaks, fertility, etc.), and is acknowledged to have a great potential for carbon sequestration; therefore it can be integrated into reduction mechanisms of carbon emissions. Particularly in sub-Saharan Africa, the constraint stands in the lack of information about both areas under agroforestry and the characterization (composition, structure, and management) of each agroforestry system at the country level. This study describes and quantifies “what is where?”, earliest to the quantification of carbon stock in different systems. Remote sensing (RS) is the most efficient approach to map such a dynamic technology as agroforestry since it gives relatively adequate and consistent information over a large area at nearly no cost. RS data fulfill the good practice guidelines of the Intergovernmental Panel On Climate Change (IPCC) that is to be used in carbon estimation. Satellite data are getting more and more accessible, and the archives are growing exponentially. To retrieve useful information to support decision-making out of this large amount of data, satellite data needs to be organized so to ensure fast processing, quick accessibility, and ease of use. A new solution is a data cube, which can be understood as a multi-dimensional stack (space, time, data type) of spatially aligned pixels and used for efficient access and analysis. 
A data cube for Burkina Faso has been set up from the cooperation project between the international service provider WASCAL and Germany, which provides an accessible exploitation architecture of multi-temporal satellite data. The aim of this study is to map and characterize agroforestry systems using the Burkina Faso earth observation data cube. The approach in its initial stage is based on an unsupervised image classification of a normalized difference vegetation index (NDVI) time series from 2010 to 2018, to stratify the country based on the vegetation. Fifteen strata were identified, and four samples per location were randomly assigned to define the sampling units. For safety reasons, the northern part will not be part of the fieldwork. A total of 52 locations will be visited by the end of the dry season in February-March 2020. The field campaigns will consist of identifying and describing different agroforestry systems and qualitative interviews. A multi-temporal supervised image classification will be done with a random forest algorithm, and the field data will be used for both training the algorithm and accuracy assessment. The expected outputs are (i) map(s) of agroforestry dynamics, (ii) characteristics of different systems (main species, management, area, etc.); (iii) assessment report of Burkina Faso data cube.
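The NDVI time series that drives the stratification is built from the standard per-pixel formula NDVI = (NIR - Red) / (NIR + Red). The following is a minimal sketch for one pixel, assuming reflectance pairs are already extracted from the data cube; the function names and sample values are illustrative and are not part of the Burkina Faso data cube's interface.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        # No signal in either band: return 0.0 rather than dividing by zero.
        return 0.0
    return (nir - red) / total


def ndvi_series(observations):
    """Turn a time-ordered list of (nir, red) reflectances into NDVI values."""
    return [ndvi(nir, red) for nir, red in observations]


# Hypothetical reflectances for one pixel at three dates (dry to wet season).
series = ndvi_series([(0.30, 0.20), (0.45, 0.12), (0.50, 0.10)])
```

An unsupervised classifier would then group pixels whose series have similar shapes into strata.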

Keywords: agroforestry systems, Burkina Faso, earth observation data cube, multi-temporal image classification

Procedia PDF Downloads 146
1286 Fault Diagnosis of Manufacturing Systems Using AntTreeStoch with Parameter Optimization by ACO

Authors: Ouahab Kadri, Leila Hayet Mouss

Abstract:

In this paper, we present three diagnostic modules for complex and dynamic systems. These modules are based on three ant colony algorithms: AntTreeStoch, Lumer & Faieta and Binary ant colony. We chose these algorithms for their simplicity and their wide application range. However, we cannot use these algorithms in their basic forms as they have several limitations. To use these algorithms in a diagnostic system, we have proposed three variants. We have tested these algorithms on datasets from two industrial systems: a clinkering system and a pasteurization system.

Keywords: ant colony algorithms, complex and dynamic systems, diagnosis, classification, optimization

Procedia PDF Downloads 300
1285 Vertical and Horizontal Distribution Patterns of Major and Trace Elements: Surface and Subsurface Sediments of Endorheic Lake Acigol Basin, Denizli Turkey

Authors: M. Budakoglu, M. Karaman

Abstract:

Although Lake Acıgöl is located in an area with limited influence from urban and industrial pollution sources, there is nevertheless a need to understand all potential lithological and anthropogenic sources of priority contaminants in this closed basin. This study discusses the vertical and horizontal distribution patterns of major and trace elements in recent lake sediments to better understand their current geochemical analog with lithological units in the Lake Acıgöl basin. This study also provides reliable background levels for the region from detailed data on the surface lithological units. The detailed results for surface, subsurface and shallow core sediments from this relatively unperturbed ecosystem highlight its importance as a conservation area, despite the large-scale industrial salt production activity. While the P2O5/TiO2 versus MgO/CaO classification diagram indicates a magmatic and sedimentary origin of the lake sediment, the Log(SiO2/Al2O3) versus Log(Na2O/K2O) classification diagrams express lithological assemblages of shale, iron-shale, wacke and arkose. The plots of TiO2 vs. SiO2 and P2O5/TiO2 vs. MgO/CaO also support the origin of the primary magma source. The average compositions of the 20 different lithological units were used as a proxy for the geochemical background in the study area. As expected from weathered rock materials, there is a large variation in the major element content for all analyzed lake samples. The A-CN-K and A-CNK-FM ternary diagrams were used to deduce weathering trends. Surface and subsurface sediments display an intense weathering history according to these ternary diagrams. Most of the sediment samples plot around UCC and TTG, suggesting a low to moderate weathering history for the provenance. The sediments plot in a region clearly suggesting contents of Al2O3, CaO, Na2O, and K2O relatively similar to those of the lithological samples.

Keywords: Lake Acıgöl, recent lake sediment, geochemical speciation of major and trace elements, heavy metals, Denizli, Turkey

Procedia PDF Downloads 411
1284 A Comprehensive Framework for Fraud Prevention and Customer Feedback Classification in E-Commerce

Authors: Samhita Mummadi, Sree Divya Nagalli, Harshini Vemuri, Saketh Charan Nakka, Sumesh K. J.

Abstract:

One of the most significant challenges faced by people in today’s digital era is an alarming increase in fraudulent activities on online platforms. The fascination with online shopping to avoid long queues in shopping malls, the availability of a variety of products, and home delivery of goods have paved the way for a rapid increase in vast online shopping platforms. This has had a major impact on increasing fraudulent activities as well. This loop of online shopping and transactions has paved the way for fraudulent users to commit fraud. For instance, consider a store that orders thousands of products all at once, but what’s fishy about this is the massive number of items purchased and their transactions turning out to be fraud, leading to a huge loss for the seller. Considering scenarios like these underscores the urgent need to introduce machine learning approaches to combat fraud in online shopping. By leveraging robust algorithms, namely KNN, Decision Trees, and Random Forest, which are highly effective in generating accurate results, this research endeavors to discern patterns indicative of fraudulent behavior within transactional data. Introducing a comprehensive solution to this problem in order to empower e-commerce administrators in timely fraud detection and prevention is the primary motive and the main focus. In addition to that, sentiment analysis is harnessed in the model so that the e-commerce admin can tailor to the customer’s and consumer’s concerns, feedback, and comments, allowing the admin to improve the user’s experience. The ultimate objective of this study is to ramp up online shopping platforms against fraud and ensure a safer shopping experience. This paper underscores a model accuracy of 84%. All the findings and observations that were noted during our work lay the groundwork for future advancements in the development of more resilient and adaptive fraud detection systems, which will become crucial as technologies continue to evolve.

Keywords: behavior analysis, feature selection, Fraudulent pattern recognition, imbalanced classification, transactional anomalies

Procedia PDF Downloads 32
1283 Spatial Patterns of Urban Expansion in Kuwait City between 1989 and 2001

Authors: Saad Algharib, Jay Lee

Abstract:

Urbanization is a complex phenomenon that occurs during the city’s development from one form to another. In other words, it is the process when the activities in the land use/land cover change from rural to urban. Since the oil exploration, Kuwait City has been growing rapidly due to its urbanization and population growth by both natural growth and inward immigration. The main objective of this study is to detect changes in urban land use/land cover and to examine the changing spatial patterns of urban growth in and around Kuwait City between 1989 and 2001. In addition, this study also evaluates the spatial patterns of the changes detected and how they can be related to the spatial configuration of the city. Recently, the use of remote sensing and geographic information systems became very useful and important tools in urban studies because of the integration of them can allow and provide the analysts and planners to detect, monitor and analyze the urban growth in a region effectively. Moreover, both planners and users can predict the trends of the growth in urban areas in the future with remotely sensed and GIS data because they can be effectively updated with required precision levels. In order to identify the new urban areas between 1989 and 2001, the study uses satellite images of the study area and remote sensing technology for classifying these images. Unsupervised classification method was applied to classify images to land use and land cover data layers. After finishing the unsupervised classification method, GIS overlay function was applied to the classified images for detecting the locations and patterns of the new urban areas that developed during the study period. GIS was also utilized to evaluate the distribution of the spatial patterns. For example, Moran’s index was applied for all data inputs to examine the urban growth distribution. 
Furthermore, this study assesses whether the spatial patterns and processes of these changes take place in a random fashion or with certain identifiable trends. The results of this study indicate that, during the study period, the urban area expanded by 10 percentage points, from 32.4% in 1989 to 42.4% in 2001. The results also revealed that the largest increase in urban area occurred between the major highways beyond the fourth ring road from the center of Kuwait City. Moreover, the spatial distribution of urban growth was clustered.
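Moran's index, which the study applies to examine the distribution of urban growth, measures whether similar values sit next to each other. The sketch below uses the standard global Moran's I formula on a toy set of zones. The binary adjacency weights and the sample values are assumptions for illustration, as the abstract does not describe the weighting scheme used.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values  -- one observation per spatial unit (e.g. share of new urban area)
    weights -- square matrix, weights[i][j] > 0 when units i and j are neighbours
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    weight_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)


# Four zones in a row; each zone neighbours the zones directly beside it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 0, 0], chain)    # growth concentrated at one end
alternating = morans_i([1, 0, 1, 0], chain)  # growth in every other zone
```

A positive value indicates clustering, a value near zero a random pattern, and a negative value a dispersed pattern, which matches the clustered result reported in the abstract.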

Keywords: geographic information systems, remote sensing, urbanization, urban growth

Procedia PDF Downloads 171
1282 Normalized Compression Distance Based Scene Alteration Analysis of a Video

Authors: Lakshay Kharbanda, Aabhas Chauhan

Abstract:

In this paper, an application of Normalized Compression Distance (NCD) to detect notable scene alterations occurring in videos is presented. Several research groups have been developing methods to perform image classification using NCD, a computable approximation to Normalized Information Distance (NID) by studying the degree of similarity in images. The timeframes where significant aberrations between the frames of a video have occurred have been identified by obtaining a threshold NCD value, using two compressors: LZMA and BZIP2 and defining scene alterations using Pixel Difference Percentage metrics.
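The normalized compression distance between two byte strings x and y is commonly computed as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C gives the compressed length. The following sketch applies it to consecutive frames with the two compressors named in the abstract. The threshold value, the `scene_changes` and `synthetic_frame` helpers, and the synthetic byte strings standing in for video frames are assumptions for illustration; the paper's Pixel Difference Percentage step is not reproduced here.

```python
import bz2
import hashlib
import lzma


def ncd(x: bytes, y: bytes, compress=lzma.compress) -> float:
    """Normalized compression distance using the given compressor."""
    cx = len(compress(x))
    cy = len(compress(y))
    cxy = len(compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


def scene_changes(frames, threshold, compress=lzma.compress):
    """Indices i where frame i differs strongly from frame i - 1."""
    return [i for i in range(1, len(frames))
            if ncd(frames[i - 1], frames[i], compress) > threshold]


def synthetic_frame(seed: int, size: int = 16384) -> bytes:
    """Deterministic pseudo-random bytes standing in for raw frame data."""
    blocks = (size + 31) // 32
    data = b"".join(hashlib.sha256(f"{seed}:{i}".encode()).digest()
                    for i in range(blocks))
    return data[:size]


scene_a = synthetic_frame(1)
scene_b = synthetic_frame(2)
# Two frames of one scene followed by two frames of another scene.
cuts = scene_changes([scene_a, scene_a, scene_b, scene_b], threshold=0.5)
# Pass compress=bz2.compress to use BZIP2 instead of LZMA.
```

In practice the threshold would be tuned per compressor, because different compressors yield different NCD values for the same pair of frames.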

Keywords: image compression, Kolmogorov complexity, normalized compression distance, root mean square error

Procedia PDF Downloads 340
1281 Recognition of Tifinagh Characters with Missing Parts Using Neural Network

Authors: El Mahdi Barrah, Said Safi, Abdessamad Malaoui

Abstract:

In this paper, we present an algorithm for reconstruction from incomplete 2D scans of Tifinagh characters. This algorithm is based on using the correlation between the lost block and its neighbors. The proposed system contains three main parts: pre-processing, feature extraction and recognition. In the first step, we construct a database of Tifinagh characters. In the second step, we apply a “shape analysis algorithm”. In the classification part, we use a neural network. The simulation results demonstrate that the proposed method gives good results.

Keywords: Tifinagh character recognition, neural networks, local cost computation, ANN

Procedia PDF Downloads 335
1280 Classification of Sturm-Liouville Problems at Infinity

Authors: Kishor J. Shinde

Abstract:

We determine the values of k and p for which the Sturm-Liouville differential operator τu = -(d^2 u)/(dx^2) + kx^p u is in the limit point case or the limit circle case at infinity. In particular, it is shown that τ is in the limit point case (i) for p=2 and all k, (ii) for all p and k=0, (iii) for all p and k>0, (iv) for 0≤p≤2 and k<0, and (v) for p<0 and k<0. τ is in the limit circle case for p>2 and k<0.
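For reference, the operator and the cases listed in this abstract can be written compactly. The grouping below is only a condensed restatement of cases (i) through (v) and the limit circle case, not an additional result.

```latex
\[
  \tau u = -\frac{d^{2}u}{dx^{2}} + k\,x^{p}\,u
\]
\[
  \tau \text{ at } \infty \text{ is in the }
  \begin{cases}
    \text{limit point case},  & k \ge 0 \ (\text{any } p), \ \text{or } k < 0 \text{ and } p \le 2,\\[2pt]
    \text{limit circle case}, & k < 0 \text{ and } p > 2.
  \end{cases}
\]
```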

Keywords: limit point case, limit circle case, Sturm-Liouville, infinity

Procedia PDF Downloads 367
1279 Rice Area Determination Using Landsat-Based Indices and Land Surface Temperature Values

Authors: Burçin Saltık, Levent Genç

Abstract:

In this study, the aim was to determine a procedure for identifying rice cultivation areas within the Thrace and Marmara regions of Turkey using remote sensing and GIS. Landsat 8 (OLI-TIRS) imagery acquired in the 2013 production season with Path/Row number 181/32 was used. Four different seasonal images were generated utilizing the original bands and different transformation techniques. All images were classified individually using supervised classification techniques, and Land Use Land Cover (LULC) maps were generated with 8 classes. The area (ha, %) of each class was calculated. In addition, district-based rice distribution maps were developed, and the results of these maps were compared with the Turkish Statistical Institute (TurkSTAT; TSI)’s actual rice cultivation area records. Accuracy assessments were conducted, and the most accurate map was selected depending on the accuracy assessment and coherency with TSI results. Additionally, rice areas on slopes over 4° were considered mis-classified pixels and were eliminated using a slope map and GIS tools. Finally, randomized rice zones were selected to obtain the maximum-minimum value ranges of the NDVI, LSWI, and LST images for each date (May, June, July, August, and September images separately) to test whether they may be used for rice area determination via the raster calculator tool of ArcGIS. The most accurate classification for rice determination, considering TSI data and accuracy assessment results, was obtained from the seasonal LSWI LULC map, and mis-classified pixels were eliminated from this map. According to the results, 83151.5 ha of rice areas exist within the study area. However, this result is higher than TSI records with an area of 12702.3 ha. The use of the maximum-minimum ranges of rice area NDVI, LSWI, and LST was tested in the Meric district. Using the value ranges obtained from the July imagery gave the closest results to TSI records, and the difference was only 206.4 ha.
This difference is normal due to relatively low resolution of images. Thus, employment of images with higher spectral, spatial, temporal and radiometric resolutions may provide more reliable results.
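The maximum-minimum range test described above can be sketched as a simple rule: derive the value range of each index from known rice pixels, then flag any pixel whose NDVI, LSWI, and LST all fall inside those ranges. The function names and every numeric value below are hypothetical; the abstract does not report the actual ranges, and the study performed this step with the ArcGIS raster calculator rather than Python.

```python
def value_ranges(samples):
    """Per-index (min, max) ranges from pixels sampled in known rice zones."""
    keys = samples[0].keys()
    return {key: (min(s[key] for s in samples), max(s[key] for s in samples))
            for key in keys}


def looks_like_rice(pixel, ranges):
    """True when every index value of the pixel lies within the sampled range."""
    return all(low <= pixel[key] <= high
               for key, (low, high) in ranges.items())


# Hypothetical July samples from rice zones (LST in degrees Celsius).
rice_samples = [
    {"NDVI": 0.60, "LSWI": 0.30, "LST": 28.0},
    {"NDVI": 0.80, "LSWI": 0.50, "LST": 31.0},
]
ranges = value_ranges(rice_samples)
candidate = {"NDVI": 0.70, "LSWI": 0.40, "LST": 29.5}
is_rice = looks_like_rice(candidate, ranges)
```

Because the rule requires all three indices to agree, a pixel with rice-like greenness but an out-of-range surface temperature is rejected.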

Keywords: Landsat 8 (OLI-TIRS), LST, LSWI, LULC, NDVI, rice

Procedia PDF Downloads 228
1278 Comprehensive Machine Learning-Based Glucose Sensing from Near-Infrared Spectra

Authors: Bitewulign Mekonnen

Abstract:

Context: This scientific paper focuses on the use of near-infrared (NIR) spectroscopy to determine glucose concentration in aqueous solutions accurately and rapidly. The study compares six different machine learning methods for predicting glucose concentration and also explores the development of a deep learning model for classifying NIR spectra. The objective is to optimize the detection model and improve the accuracy of glucose prediction. This research is important because it provides a comprehensive analysis of various machine-learning techniques for estimating aqueous glucose concentrations. Research Aim: The aim of this study is to compare and evaluate different machine-learning methods for predicting glucose concentration from NIR spectra. Additionally, the study aims to develop and assess a deep-learning model for classifying NIR spectra. Methodology: The research methodology involves the use of machine learning and deep learning techniques. Six machine learning regression models, including support vector machine regression, partial least squares regression, extra tree regression, random forest regression, extreme gradient boosting, and principal component analysis-neural network, are employed to predict glucose concentration. The NIR spectra data is randomly divided into train and test sets, and the process is repeated ten times to increase generalization ability. In addition, a convolutional neural network is developed for classifying NIR spectra. Findings: The study reveals that the SVMR, ETR, and PCA-NN models exhibit excellent performance in predicting glucose concentration, with correlation coefficients (R) > 0.99 and determination coefficients (R²)> 0.985. The deep learning model achieves high macro-averaging scores for precision, recall, and F1-measure. These findings demonstrate the effectiveness of machine learning and deep learning methods in optimizing the detection model and improving glucose prediction accuracy. 
Theoretical Importance: This research contributes to the field by providing a comprehensive analysis of various machine-learning techniques for estimating glucose concentrations from NIR spectra. It also explores the use of deep learning for the classification of indistinguishable NIR spectra. The findings highlight the potential of machine learning and deep learning in enhancing the prediction accuracy of glucose-relevant features. Data Collection and Analysis Procedures: The NIR spectra and corresponding references for glucose concentration are measured in increments of 20 mg/dl. The data is randomly divided into train and test sets, and the models are evaluated using regression analysis and classification metrics. The performance of each model is assessed based on correlation coefficients, determination coefficients, precision, recall, and F1-measure. Question Addressed: The study addresses the question of whether machine learning and deep learning methods can optimize the detection model and improve the accuracy of glucose prediction from NIR spectra. Conclusion: The research demonstrates that machine learning and deep learning methods can effectively predict glucose concentration from NIR spectra. The SVMR, ETR, and PCA-NN models exhibit superior performance, while the deep learning model achieves high classification scores. These findings suggest that machine learning and deep learning techniques can be used to improve the prediction accuracy of glucose-relevant features. Further research is needed to explore their clinical utility in analyzing complex matrices, such as blood glucose levels.
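The evaluation protocol described above (random train/test splits repeated ten times, scored by R and R²) can be sketched as follows. This is only an illustration under stated assumptions: a one-variable least-squares line stands in for the paper's regressors, the data are synthetic, and `test_fraction`, the noise level and all function names are hypothetical choices not taken from the abstract.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (stand-in for the paper's models)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def pearson_r(y_true, y_pred):
    """Correlation coefficient R between reference and predicted values."""
    mt, mp = statistics.fmean(y_true), statistics.fmean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = sum((t - mt) ** 2 for t in y_true) ** 0.5
    sp = sum((p - mp) ** 2 for p in y_pred) ** 0.5
    return cov / (st * sp)


def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mt = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mt) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def repeated_holdout(xs, ys, repeats=10, test_fraction=0.3, seed=0):
    """Reshuffle, split, fit and score `repeats` times; returns (R, R^2) pairs."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(indices)
        n_test = max(2, int(len(indices) * test_fraction))
        test, train = indices[:n_test], indices[n_test:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in test]
        y_pred = [a * xs[i] + b for i in test]
        scores.append((pearson_r(y_true, y_pred), r_squared(y_true, y_pred)))
    return scores


# Synthetic data: concentrations in 20 mg/dl steps and a made-up linear
# absorbance response with Gaussian noise (both are assumptions).
_rng = random.Random(42)
concentration = [20.0 * k for k in range(1, 41)]
absorbance = [0.002 * c + _rng.gauss(0.0, 0.005) for c in concentration]

scores = repeated_holdout(absorbance, concentration)
mean_r = statistics.fmean(r for r, _ in scores)
mean_r2 = statistics.fmean(r2 for _, r2 in scores)
```

Averaging the scores over the repeats, rather than reporting a single split, reduces the dependence of the result on one lucky or unlucky partition.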

Keywords: machine learning, signal processing, near-infrared spectroscopy, support vector machine, neural network

Procedia PDF Downloads 95
1277 Hybrid GNN Based Machine Learning Forecasting Model For Industrial IoT Applications

Authors: Atish Bagchi, Siva Chandrasekaran

Abstract:

Background: According to World Bank national accounts data, the estimated global manufacturing value-added output in 2020 was 13.74 trillion USD. These manufacturing processes are monitored, modelled, and controlled by advanced, real-time, computer-based systems, e.g., Industrial IoT, PLC, SCADA, etc. These systems measure and manipulate a set of physical variables, e.g., temperature, pressure, etc. Despite the use of IoT, SCADA, etc., in manufacturing, studies suggest that unplanned downtime leads to economic losses of approximately 864 billion USD each year. Therefore, real-time, accurate detection, classification and prediction of machine behaviour are needed to minimise financial losses. Although vast literature exists on time-series data processing using machine learning, industries still face the following challenges that lead to unplanned downtimes. First, the current algorithms do not efficiently handle the high-volume streaming data from industrial IoT sensors and were tested on static and simulated datasets. Second, while the existing algorithms can detect significant 'point' outliers, most do not handle contextual outliers (e.g., values within normal range but happening at an unexpected time of day) or subtle changes in machine behaviour. Third, machines are revamped periodically as part of planned maintenance programmes, which changes the assumptions on which the original AI models were created and trained. Aim: This research study aims to deliver a Graph Neural Network (GNN) based hybrid forecasting model that interfaces with the real-time machine control system and can detect and predict machine behaviour and behavioural changes (anomalies) in real-time. This research will help manufacturing industries and utilities, e.g., water, electricity, etc., reduce unplanned downtimes and consequential financial losses.
Method: The data stored within a process control system, e.g., Industrial-IoT, Data Historian, is generally sampled during data acquisition from the sensor (source) and when persisting in the Data Historian to optimise storage and query performance. The sampling may inadvertently discard values that might contain subtle aspects of behavioural changes in machines. This research proposed a hybrid forecasting and classification model which combines the expressive and extrapolation capability of GNN enhanced with the estimates of entropy and spectral changes in the sampled data and additional temporal contexts to reconstruct the likely temporal trajectory of machine behavioural changes. The proposed real-time model belongs to the Deep Learning category of machine learning and interfaces with the sensors directly or through 'Process Data Historian', SCADA, etc., to perform forecasting and classification tasks. Results: The model was interfaced with a Data Historian holding time-series data from 4 flow sensors within a water treatment plant for 45 days. The recorded sampling interval for a sensor varied from 10 sec to 30 min. Approximately 65% of the available data was used for training the model, 20% for validation, and the rest for testing. The model identified the anomalies within the water treatment plant and predicted the plant's performance. These results were compared with the data reported by the plant SCADA-Historian system and the official data reported by the plant authorities. The model's accuracy was much higher (20%) than that reported by the SCADA-Historian system and matched the validated results declared by the plant auditors. Conclusions: The research demonstrates that a hybrid GNN based approach enhanced with entropy calculation and spectral information can effectively detect and predict a machine's behavioural changes.
The model can interface with a plant's 'process control system' in real-time to perform forecasting and classification tasks to help asset management engineers operate their machines more efficiently and reduce unplanned downtimes. A series of trials are planned for this model in the future in other manufacturing industries.
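The abstract does not say how its entropy estimates are computed. One common choice, shown here purely as an assumption, is the Shannon entropy of quantised sensor values over non-overlapping windows. The function names, `bin_width` and the sample readings are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values, bin_width):
    """Entropy in bits of the values after quantising them into fixed-width bins."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def windowed_entropy(series, window, bin_width):
    """Entropy of each consecutive, non-overlapping window of the series."""
    return [
        shannon_entropy(series[start:start + window], bin_width)
        for start in range(0, len(series) - window + 1, window)
    ]


# Hypothetical flow readings: a steady stretch followed by an erratic one.
steady = [10.0] * 8
erratic = [10.0, 12.5, 9.0, 14.0, 11.0, 15.5, 8.0, 13.0]
entropies = windowed_entropy(steady + erratic, window=8, bin_width=1.0)
```

A jump in windowed entropy marks a stretch where readings spread over many more bins than before. Such a feature could accompany a forecasting model even when every individual value stays within its normal range.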

Keywords: GNN, Entropy, anomaly detection, industrial time-series, AI, IoT, Industry 4.0, Machine Learning

Procedia PDF Downloads 150
1276 Automatic Target Recognition in SAR Images Based on Sparse Representation Technique

Authors: Ahmet Karagoz, Irfan Karagoz

Abstract:

Synthetic Aperture Radar (SAR) is a radar mechanism that can be integrated into manned and unmanned aerial vehicles to create high-resolution images in all weather conditions, regardless of day and night. In this study, SAR images of military vehicles with different azimuth and descent angles are pre-processed at the first stage. The main purpose here is to reduce the high speckle noise found in SAR images. For this, the Wiener adaptive filter, the mean filter, and the median filter are used to reduce the amount of speckle noise in the images without causing loss of data. During the image segmentation phase, pixel values are ordered so that the target vehicle region is separated from other regions containing unnecessary information. The target image is binarized by setting the brightest 20% of pixels to 255 and all other pixels to 0. In addition, by using appropriate parameters of the statistical region merging algorithm, a segmentation comparison is performed. In the feature extraction step, the feature vectors belonging to the vehicles are obtained by using Gabor filters with different orientation, frequency and angle values. A number of Gabor filters are created by changing the orientation, frequency and angle parameters of the Gabor filters to extract important features of the images that form the distinctive parts. Finally, images are classified by the sparse representation method. In the study, l₁ norm analysis of sparse representation is used. The feature vectors generated from the target images of the military vehicle types are placed side by side to form a joint database, and this database is transformed into matrix form. In order to classify the vehicles in a similar way, the test images of each vehicle are converted to vector form and l₁ norm analysis of the sparse representation method is applied through the existing database matrix.
As a result, correct recognition has been performed by matching the target images of military vehicles with the test images by means of the sparse representation method. A classification success rate of 97% is obtained for SAR images of different military vehicle types.
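The segmentation rule mentioned in the abstract (brightest 20% of pixels set to 255, the rest to 0) can be sketched on a tiny made-up image. The function name and the tie-handling are assumptions: pixels equal to the cut-off value are all kept, so slightly more than 20% may be retained when values repeat.

```python
def binarize_brightest(image, fraction=0.2):
    """Set the brightest `fraction` of pixels to 255 and all others to 0."""
    # Order all pixel values from brightest to darkest.
    flat = sorted((p for row in image for p in row), reverse=True)
    keep = max(1, round(len(flat) * fraction))
    threshold = flat[keep - 1]  # value of the dimmest pixel that is still kept
    return [[255 if p >= threshold else 0 for p in row] for row in image]


# Hypothetical 2 x 5 grayscale patch with a bright target in the middle.
patch = [
    [12, 40, 200, 35, 18],
    [22, 250, 230, 30, 15],
]
mask = binarize_brightest(patch)
```

With ten pixels and a 20% fraction, only the two brightest values (250 and 230) survive in `mask`.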

Keywords: automatic target recognition, sparse representation, image classification, SAR images

Procedia PDF Downloads 367
1275 Regeneration of Geological Models Using Support Vector Machine Assisted by Principal Component Analysis

Authors: H. Jung, N. Kim, B. Kang, J. Choe

Abstract:

History matching is a crucial procedure for predicting reservoir performances and making future decisions. However, it is difficult due to uncertainties of initial reservoir models. Therefore, it is important to have reliable initial models for successful history matching of highly heterogeneous reservoirs such as channel reservoirs. In this paper, we proposed a novel scheme for regenerating geological models using support vector machine (SVM) and principal component analysis (PCA). First, we perform PCA for figuring out main geological characteristics of models. Through the procedure, permeability values of each model are transformed to new parameters by principal components, which have eigenvalues of large magnitude. Secondly, the parameters are projected into two-dimensional plane by multi-dimensional scaling (MDS) based on Euclidean distances. Finally, we train an SVM classifier using 20% models which show the most similar or dissimilar well oil production rates (WOPR) with the true values (10% for each). Then, the other 80% models are classified by trained SVM. We select models on side of low WOPR errors. One hundred channel reservoir models are initially generated by single normal equation simulation. By repeating the classification process, we can select models which have similar geological trend with the true reservoir model. The average field of the selected models is utilized as a probability map for regeneration. Newly generated models can preserve correct channel features and exclude wrong geological properties maintaining suitable uncertainty ranges. History matching with the initial models cannot provide trustworthy results. It fails to find out correct geological features of the true model. However, history matching with the regenerated ensemble offers reliable characterization results by figuring out proper channel trend. Furthermore, it gives dependable prediction of future performances with reduced uncertainties. 
We propose a novel classification scheme which integrates PCA, MDS, and SVM for regenerating reservoir models. The scheme can easily sort out reliable models which have similar channel trend with the reference in lowered dimension space.
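The way the SVM training set is assembled (the 10% of models closest to the true WOPR, the 10% farthest from it, and the remaining 80% left for classification) can be illustrated as below. This is a sketch of the selection step only, not of PCA, MDS or the SVM itself. The error values, the label convention (1 for similar, 0 for dissimilar) and the function name are hypothetical.

```python
def select_training_models(wopr_errors, fraction=0.1):
    """Label the best and worst `fraction` of models; return the rest unlabeled."""
    # Model indices ordered from lowest to highest WOPR error.
    ranked = sorted(range(len(wopr_errors)), key=lambda i: wopr_errors[i])
    k = max(1, int(len(ranked) * fraction))
    labels = {i: 1 for i in ranked[:k]}          # most similar to the true rates
    labels.update({i: 0 for i in ranked[-k:]})   # most dissimilar
    unlabeled = ranked[k:-k]                     # to be classified by the trained SVM
    return labels, unlabeled


# Hypothetical, distinct error values for an ensemble of 100 models.
errors = [(i * 37) % 101 for i in range(100)]
labels, unlabeled = select_training_models(errors)
```

Training only on the two extremes gives the classifier clearly separated examples, at the cost of ignoring the ambiguous middle of the ensemble during training.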

Keywords: history matching, principal component analysis, reservoir modelling, support vector machine

Procedia PDF Downloads 160
1274 Real-Time Visualization Using GPU-Accelerated Filtering of LiDAR Data

Authors: Sašo Pečnik, Borut Žalik

Abstract:

This paper presents a technique for real-time visualization and filtering of classified LiDAR point clouds. The visualization is capable of displaying filtered information organized in layers by the classification attribute saved within LiDAR data sets. We explain the data structure and data management used, which enable real-time presentation of layered LiDAR data. Real-time visualization is achieved with LOD optimization based on the distance from the observer without loss of quality. The filtering process is done in two steps, is executed entirely on the GPU, and is implemented using programmable shaders.

Keywords: filtering, graphics, level-of-details, LiDAR, real-time visualization

Procedia PDF Downloads 312
1273 Active Features Determination: A Unified Framework

Authors: Meenal Badki

Abstract:

We address the issue of active feature determination, where the objective is to determine the set of examples on which additional data (such as lab tests) needs to be gathered, given a large number of examples with some features (such as demographics) and some examples with all the features (such as the complete Electronic Health Record). We note that certain features may be more costly, unique, or laborious to gather. Our proposal is a general active learning approach that is independent of classifiers and similarity metrics. It allows us to identify examples that differ from the full data set and obtain all the features for the examples that match. Our comprehensive evaluation shows the efficacy of this approach, which is driven by four authentic clinical tasks.

Keywords: feature determination, classification, active learning, sample-efficiency

Procedia PDF Downloads 77
1272 Use of Fractal Geometry in Machine Learning

Authors: Fuad M. Alkoot

Abstract:

The main component of a machine learning system is the classifier. Classifiers are mathematical models that can perform classification tasks for a specific application area. Additionally, many classifiers are combined using any of the available methods to reduce the classifier error rate. The benefits gained from the combination of multiple classifier designs have motivated the development of diverse approaches to multiple classifiers. We aim to investigate using fractal geometry to develop an improved classifier combiner. Initially, we experiment with measuring the fractal dimension of data and use the results in the development of a combiner strategy.
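The abstract does not state which fractal dimension estimator is used. A widely used one, taken here as an assumption, is the box-counting dimension: count the boxes N(s) of side s that contain at least one data point, then fit the slope of log N(s) against log(1/s). The function names and the test point sets are illustrative.

```python
import math


def box_count(points, box_size):
    """Number of grid boxes of side `box_size` containing at least one point."""
    return len({(math.floor(x / box_size), math.floor(y / box_size)) for x, y in points})


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


sizes = [1 / 4, 1 / 8, 1 / 16, 1 / 32]
# Points along a diagonal segment should behave like a 1-D set,
# and a densely filled unit square like a 2-D set.
segment = [(i / 1000, i / 1000) for i in range(1000)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]
dim_segment = box_counting_dimension(segment, sizes)
dim_square = box_counting_dimension(square, sizes)
```

For real feature data the estimate depends on the range of box sizes chosen, so the sizes should span scales where the data are neither a single blob nor isolated points.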

Keywords: fractal geometry, machine learning, classifier, fractal dimension

Procedia PDF Downloads 219
1271 Arabic Handwriting Recognition Using Local Approach

Authors: Mohammed Arif, Abdessalam Kifouche

Abstract:

Optical character recognition (OCR) plays a major role today. It can solve many serious problems and simplify human activities. OCR dates back to the 1970s, and many solutions have been proposed since then, but unfortunately they supported nothing but Latin languages. This work proposes a system for the off-line recognition of Arabic handwriting. The system is based on a structural segmentation method and uses support vector machines (SVM) in the classification phase. We present the state of the art of character segmentation methods, followed by an overview of the OCR field, and we also address the normalization problems we encountered. After a comparison between Arabic handwritten characters and the segmentation methods, we introduce our contribution in the form of a segmentation algorithm.

Keywords: OCR, segmentation, Arabic characters, PAW, post-processing, SVM

Procedia PDF Downloads 74
1270 Hybrid Knowledge Approach for Determining Health Care Provider Specialty from Patient Diagnoses

Authors: Erin Lynne Plettenberg, Jeremy Vickery

Abstract:

In an access-control situation, the role of a user determines whether a data request is appropriate. This paper combines vetted web mining and logic modeling to build a lightweight system for determining the role of a health care provider based only on their prior authorized requests. The model identifies provider roles with 100% recall from very little data. This shows the value of vetted web mining in AI systems, and suggests the impact of the ICD classification on medical practice.

Keywords: electronic medical records, information extraction, logic modeling, ontology, vetted web mining

Procedia PDF Downloads 174
1269 Transformers in Gene Expression-Based Classification

Authors: Babak Forouraghi

Abstract:

A genetic circuit is a collection of interacting genes and proteins that enable individual cells to implement and perform vital biological functions such as cell division, growth, death, and signaling. In cell engineering, synthetic gene circuits are engineered networks of genes specifically designed to implement functionalities that are not evolved by nature. These engineered networks enable scientists to tackle complex problems such as engineering cells to produce therapeutics within the patient's body, altering T cells to target cancer-related antigens for treatment, improving antibody production using engineered cells, tissue engineering, and production of genetically modified plants and livestock. Construction of computational models to realize genetic circuits is an especially challenging task since it requires the discovery of the flow of genetic information in complex biological systems. Building synthetic biological models is also a time-consuming process with relatively low prediction accuracy for highly complex genetic circuits. The primary goal of this study was to investigate the utility of a pre-trained bidirectional encoder transformer that can accurately predict gene expressions in genetic circuit designs. The main reason behind using transformers is their innate ability (attention mechanism) to take account of the semantic context present in long DNA chains that are heavily dependent on the spatial representation of their constituent genes. Previous approaches to gene circuit design, such as CNN and RNN architectures, are unable to capture semantic dependencies in long contexts as required in most real-world applications of synthetic biology. For instance, RNN models (LSTM, GRU), although able to learn long-term dependencies, greatly suffer from vanishing gradient and low-efficiency problems when they sequentially process past states and compress contextual information into a bottleneck with long input sequences.
In other words, these architectures are not equipped with the necessary attention mechanisms to follow a long chain of genes with thousands of tokens. To address the above-mentioned limitations of previous approaches, a transformer model was built in this work as a variation of the existing DNA Bidirectional Encoder Representations from Transformers (DNABERT) model. It is shown that the proposed transformer is capable of capturing contextual information from long input sequences with its attention mechanism. In a previous work on genetic circuit design, the traditional approaches to classification and regression, such as Random Forest, Support Vector Machine, and Artificial Neural Networks, were able to achieve reasonably high R² accuracy levels of 0.95 to 0.97. However, the transformer model utilized in this work, with its attention-based mechanism, was able to achieve a perfect accuracy level of 100%. Further, it is demonstrated that the efficiency of the transformer-based gene expression classifier is not dependent on the presence of large amounts of training examples, which may be difficult to compile in many real-world gene circuit designs.

Keywords: transformers, generative ai, gene expression design, classification

Procedia PDF Downloads 61
1268 Software Architectural Design Ontology

Authors: Muhammad Irfan Marwat, Sadaqat Jan, Syed Zafar Ali Shah

Abstract:

Software architecture plays a key role in software development, but the absence of a formal description of software architecture causes various impediments in software development. To cope with these difficulties, ontology has been used as an artifact. This paper proposes an ontology for software architectural design based on the IEEE model for architecture description and the Kruchten 4+1 model for viewpoint classification. For categorization of styles and views, ISO/IEC 42010 has been used. The corpus method has been used to evaluate the ontology. The main aim of the proposed ontology is to classify and locate software architectural design information.

Keywords: semantic-based software architecture, software architecture, ontology, software engineering

Procedia PDF Downloads 550
1267 Automatic Differential Diagnosis of Melanocytic Skin Tumours Using Ultrasound and Spectrophotometric Data

Authors: Kristina Sakalauskiene, Renaldas Raisutis, Gintare Linkeviciute, Skaidra Valiukeviciene

Abstract:

Cutaneous melanoma is a melanocytic skin tumour that has a very poor prognosis, is highly resistant to treatment and tends to metastasize. Thickness of melanoma is one of the most important biomarkers for stage of disease, prognosis and surgery planning. In this study, we hypothesized that the automatic analysis of spectrophotometric images and high-frequency ultrasonic 2D data can improve differential diagnosis of cutaneous melanoma and provide additional information about tumour penetration depth. This paper presents a novel complex automatic system for non-invasive melanocytic skin tumour differential diagnosis and penetration depth evaluation. The system is composed of region of interest segmentation in spectrophotometric images and high-frequency ultrasound data, quantitative parameter evaluation, informative feature extraction and classification with a linear regression classifier. The segmentation of the melanocytic skin tumour region in the ultrasound image is based on parametric integrated backscattering coefficient calculation. The segmentation of the optical image is based on Otsu thresholding. In total 29 quantitative tissue characterization parameters were evaluated by using ultrasound data (11 acoustical, 4 shape and 15 textural parameters) and 55 quantitative features of dermatoscopic and spectrophotometric images (using total melanin, dermal melanin, blood and collagen SIAgraphs acquired using spectrophotometric imaging device SIAscope). In total 102 melanocytic skin lesions (including 43 cutaneous melanomas) were examined by using SIAscope and an ultrasound system with a 22 MHz center frequency single element transducer. The diagnosis and Breslow thickness (pT) of each MST were evaluated during routine histological examination after excision and used as a reference.
The results of this study have shown that automatic analysis of spectrophotometric and high frequency ultrasound data can improve non-invasive classification accuracy of early-stage cutaneous melanoma and provide supplementary information about tumour penetration depth.
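Otsu thresholding, which the abstract names for segmenting the optical image, picks the gray level that maximises the between-class variance of the two resulting pixel groups. A minimal sketch on a flat list of 8-bit values follows. The function name, the convention that pixels at or below the threshold form the first class, and the sample pixel values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Gray level t maximising between-class variance (class 0 is pixels <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0 so far
    sum0 = 0  # sum of gray levels in class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical bimodal image: dark lesion pixels and bright surrounding skin.
pixels = [10] * 50 + [12] * 30 + [200] * 40 + [210] * 20
t = otsu_threshold(pixels)
lesion_mask = [p <= t for p in pixels]
```

Because the strict comparison keeps the first maximiser, the returned threshold sits at the upper edge of the dark mode. Any value between the two modes would separate this sample equally well.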

Keywords: cutaneous melanoma, differential diagnosis, high-frequency ultrasound, melanocytic skin tumours, spectrophotometric imaging

Procedia PDF Downloads 270
1266 The Development of User Behavior in Urban Regeneration Areas by Utilizing the Floating Population Data

Authors: Jung-Hun Cho, Tae-Heon Moon, Sun-Young Heo

Abstract:

A lot of urban problems, caused by urbanization and industrialization, have occurred around the world. In particular, the creation of satellite towns, which was attributed to the explicit expansion of the city, has led to the traffic problems and the hollowization of old towns, raising the necessity of urban regeneration in old towns along with the aging of existing urban infrastructure. To select urban regeneration priority regions for the strategic execution of urban regeneration in Korea, the number of population, the number of businesses, and deterioration degree were chosen as standards. Existing standards had a limit in coping with solving urban problems fundamentally and rapidly changing reality. Therefore, it was necessary to add new indicators that can reflect the decline in relevant cities and conditions. In this regard, this study selected Busan Metropolitan City, Korea as the target area as a leading city, where urban regeneration such as an international port city has been activated like Yokohama, Japan. Prior to setting the urban regeneration priority region, the conditions of reality should be reflected because uniform and uncharacterized projects have been implemented without a quantitative analysis about population behavior within the region. For this reason, this study conducted a characterization analysis and type classification, based on the user behaviors by using representative floating population of the big data, which is a hot issue all over the society in recent days. The target areas were analyzed in this study. While 23 regions were classified as three types in existing Busan Metropolitan City urban regeneration priority region, 23 regions were classified as four types in existing Busan Metropolitan City urban regeneration priority region in terms of the type classification on the basis of user behaviors. 
Four types were classified as follows; type (Ⅰ) of young people - morning type, Type (Ⅱ) of the old and middle-aged- general type with sharp floating population, type (Ⅲ) of the old and middle aged-24hour-type, and type (Ⅳ) of the old and middle aged with less floating population. Characteristics were shown in each region of four types, and the study results of user behaviors were different from those of existing urban regeneration priority region. According to the results, in type (Ⅰ) young people were the majority around the existing old built-up area, where floating population at dawn is four times more than in other areas. In Type (Ⅱ), there were many old and middle-aged people around the existing built-up area and general neighborhoods, where the average floating population was more than in other areas due to commuting, while in type (Ⅲ), there was no change in the floating population throughout 24 hours, although there were many old and middle aged people in population around the existing general neighborhoods. Type (Ⅳ) includes existing economy-based type, central built-up area type, and general neighborhood type, where old and middle aged people were the majority as a general type of commuting with less floating population. Unlike existing urban regeneration priority region, these types were sub-divided according to types, and in this study, approach methods and basic orientations of urban regeneration were set to reflect the reality to a certain degree including the indicators of effective floating population to identify the dynamic activity of urban areas and existing regeneration priority areas in connection with urban regeneration projects by regions. Therefore, it is possible to make effective urban plans through offering the substantial ground by utilizing scientific and quantitative data. 
To induce more realistic and effective regeneration projects, the regeneration projects tailored to the present local conditions should be developed by reflecting the present conditions on the formulation of urban regeneration strategic plans.

Keywords: floating population, big data, urban regeneration, urban regeneration priority region, type classification

Procedia PDF Downloads 214
1265 A Deep Learning Approach for the Predictive Quality of Directional Valves in the Hydraulic Final Test

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

The increasing use of deep learning applications in production is becoming a competitive advantage. Predictive quality enables the assurance of product quality by using data-driven forecasts via machine learning models as a basis for decisions on test results. The use of real Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the leakage of directional valves.

Keywords: artificial neural networks, classification, hydraulics, predictive quality, deep learning

Procedia PDF Downloads 248
1264 Riesz Mixture Model for Brain Tumor Detection

Authors: Mouna Zitouni, Mariem Tounsi

Abstract:

This research introduces an application of the Riesz mixture model for medical image segmentation for accurate diagnosis and treatment of brain tumors. We propose a pixel classification technique based on the Riesz distribution, derived from an extended Bartlett decomposition. To our knowledge, this is the first study addressing this approach. The Expectation-Maximization algorithm is implemented for parameter estimation. A comparative analysis, using both synthetic and real brain images, demonstrates the superiority of the Riesz model over a recent method based on the Wishart distribution.

Keywords: EM algorithm, segmentation, Riesz probability distribution, Wishart probability distribution

Procedia PDF Downloads 21