Search results for: probabilistic classification vector machines
3292 Improved Computational Efficiency of Machine Learning Algorithm Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK
Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick
Abstract:
The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of infected people confirmed to acquire the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to figure out a predictive machine learning archetypal that could forecast COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID cases registered, new cases encountered on a daily basis, total death registered, and patients’ death per day due to Coronavirus is collected from World Health Organisation (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data is split into 8:2 ratio for training and testing purposes to forecast future new COVID cases. Support Vector Machines (SVM), Random Forests, and linear regression algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. From the evaluation metrics such as r-squared value and mean squared error, the statistical performance of the model in predicting the new COVID cases is evaluated. Random Forest outperformed the other two Machine Learning algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n=30. The mean square error obtained for Random Forest is 4.05e11, which is lesser compared to the other predictive models used for this study. From the experimental analysis Random Forest algorithm can perform more effectively and efficiently in predicting the new COVID cases, which could help the health sector to take relevant control measures for the spread of the virus.Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest
Procedia PDF Downloads 1213291 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification
Authors: Zhaoxin Luo, Michael Zhu
Abstract:
In natural languages, there are always complex semantic hierarchies. Obtaining the feature representation based on these complex semantic hierarchies becomes the key to the success of the model. Several RNN models have recently been proposed to use latent indicators to obtain the hierarchical structure of documents. However, the model that only uses a single-layer latent indicator cannot achieve the true hierarchical structure of the language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. After using EM and training it hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many RNN layers. Our deep hierarchical model not only achieves comparable results to large pre-trained models on the Chinese short text classification problem but also achieves state of art results on the Chinese long text classification problem.Keywords: nature language processing, recurrent neural network, hierarchical structure, document classification, Chinese
Procedia PDF Downloads 683290 Productivity and Structural Design of Manufacturing Systems
Authors: Ryspek Usubamatov, Tan San Chin, Sarken Kapaeva
Abstract:
Productivity of the manufacturing systems depends on technological processes, a technical data of machines and a structure of systems. Technology is presented by the machining mode and data, a technical data presents reliability parameters and auxiliary time for discrete production processes. The term structure of manufacturing systems includes the number of serial and parallel production machines and links between them. Structures of manufacturing systems depend on the complexity of technological processes. Mathematical models of productivity rate for manufacturing systems are important attributes that enable to define best structure by criterion of a productivity rate. These models are important tool in evaluation of the economical efficiency for production systems.Keywords: productivity, structure, manufacturing systems, structural design
Procedia PDF Downloads 5833289 A Novel PSO Based Decision Tree Classification
Authors: Ali Farzan
Abstract:
Classification of data objects or patterns is a major part in most of Decision making systems. One of the popular and commonly used classification methods is Decision Tree (DT). It is a hierarchical decision making system by which a binary tree is constructed and starting from root, at each node some of the classes is rejected until reaching the leaf nods. Each leaf node is a representative of one specific class. Finding the splitting criteria in each node for constructing or training the tree is a major problem. Particle Swarm Optimization (PSO) has been adopted as a metaheuristic searching method for finding the best splitting criteria. Result of evaluating the proposed method over benchmark datasets indicates the higher accuracy of the new PSO based decision tree.Keywords: decision tree, particle swarm optimization, splitting criteria, metaheuristic
Procedia PDF Downloads 4063288 A Neural Approach for the Offline Recognition of the Arabic Handwritten Words of the Algerian Departments
Authors: Salim Ouchtati, Jean Sequeira, Mouldi Bedda
Abstract:
In this work we present an off line system for the recognition of the Arabic handwritten words of the Algerian departments. The study is based mainly on the evaluation of neural network performances, trained with the gradient back propagation algorithm. The used parameters to form the input vector of the neural network are extracted on the binary images of the handwritten word by several methods: the parameters of distribution, the moments centered of the different projections and the Barr features. It should be noted that these methods are applied on segments gotten after the division of the binary image of the word in six segments. The classification is achieved by a multi layers perceptron. Detailed experiments are carried and satisfactory recognition results are reported.Keywords: handwritten word recognition, neural networks, image processing, pattern recognition, features extraction
Procedia PDF Downloads 5133287 Climatic and Environmental Variables Do Not Affect the Diversity of Possible Phytoplasmic Vector Insects Associated with Quercus humboltii Oak Trees in Bogota, Colombia
Authors: J. Lamilla-Monje, C. Solano-Puerto, L. Franco-Lara
Abstract:
Trees play an essential role in cities due to their ability to provide multiple ecosystem goods and services. Bogota trees are threatened by factors such as pests, pathogens, contamination, among others. Among the pathogens, phytoplasmas are a potential risk for urban trees, generating symptoms that affect the ecosystem services that these trees provide in Bogota, an example of this is the affectation of Q. humboldtii by phytoplasmas, these bacteria are transmitted for insects of the order Hemiptera, this is why the objective of this work was to know if the climatic variables (humidity, precipitation, and temperature) and environmental variables (PM10 and PM2.5) could be related to the distribution of the Oak Quercus entomofauna and specifically with the phytoplasma vector insects in Bogota. For this study, the sampling points were distributed in areas of the city with contrasting variables in two types of locations: parks and streets. A total of 68 trees were sampled in which the associated insects were collected using two methodologies: jameo and agitation traps. The results show that insects of the order Hemiptera were the most abundant, including a total of 1682 individuals represented by 29 morphotypes, within this order individuals from eight families were collected (Aphidae, Aradidae, Berytidae, Cicadellidae, Issidae, Membracidae, Miridae, and Psyllidae), finding as possible vectors the families Cicadellidae, Membracidae, and Psyllidae with 959, 8 and 14 individuals respectively. Within the Cicadellidae family, 21 morphotypes were found, being reported as vectors in the literature: Amplicephalus, Exitianus atratus, Haldorus sp., Xestocephalus desertorum, Idiocerinae sp., Scaphytopius sp., the Membracidae family was represented by two morphotypes and the Psyllidae by one. Results that suggest that there is no correlation between climatic and environmental variables with the diversity of insects associated with oak. Knowing the vector insects of phytoplasmas in oak trees will complete the pathosystem and generate effective vector control.Keywords: vector insects, diversity, phytoplasmas, Cicadellidae
Procedia PDF Downloads 1503286 Stochastic Model Predictive Control for Linear Discrete-Time Systems with Random Dither Quantization
Authors: Tomoaki Hashimoto
Abstract:
Recently, feedback control systems using random dither quantizers have been proposed for linear discrete-time systems. However, the constraints imposed on state and control variables have not yet been taken into account for the design of feedback control systems with random dither quantization. Model predictive control is a kind of optimal feedback control in which control performance over a finite future is optimized with a performance index that has a moving initial and terminal time. An important advantage of model predictive control is its ability to handle constraints imposed on state and control variables. Based on the model predictive control approach, the objective of this paper is to present a control method that satisfies probabilistic state constraints for linear discrete-time feedback control systems with random dither quantization. In other words, this paper provides a method for solving the optimal control problems subject to probabilistic state constraints for linear discrete-time feedback control systems with random dither quantization.Keywords: optimal control, stochastic systems, random dither, quantization
Procedia PDF Downloads 4443285 Enhanced Image Representation for Deep Belief Network Classification of Hyperspectral Images
Authors: Khitem Amiri, Mohamed Farah
Abstract:
Image classification is a challenging task and is gaining lots of interest since it helps us to understand the content of images. Recently Deep Learning (DL) based methods gave very interesting results on several benchmarks. For Hyperspectral images (HSI), the application of DL techniques is still challenging due to the scarcity of labeled data and to the curse of dimensionality. Among other approaches, Deep Belief Network (DBN) based approaches gave a fair classification accuracy. In this paper, we address the problem of the curse of dimensionality by reducing the number of bands and replacing the HSI channels by the channels representing radiometric indices. Therefore, instead of using all the HSI bands, we compute the radiometric indices such as NDVI (Normalized Difference Vegetation Index), NDWI (Normalized Difference Water Index), etc, and we use the combination of these indices as input for the Deep Belief Network (DBN) based classification model. Thus, we keep almost all the pertinent spectral information while reducing considerably the size of the image. In order to test our image representation, we applied our method on several HSI datasets including the Indian pines dataset, Jasper Ridge data and it gave comparable results to the state of the art methods while reducing considerably the time of training and testing.Keywords: hyperspectral images, deep belief network, radiometric indices, image classification
Procedia PDF Downloads 2803284 Experimental Implementation of Model Predictive Control for Permanent Magnet Synchronous Motor
Authors: Abdelsalam A. Ahmed
Abstract:
Fast speed drives for Permanent Magnet Synchronous Motor (PMSM) is a crucial performance for the electric traction systems. In this paper, PMSM is drived with a Model-based Predictive Control (MPC) technique. Fast speed tracking is achieved through optimization of the DC source utilization using MPC. The technique is based on predicting the optimum voltage vector applied to the driver. Control technique is investigated by comparing to the cascaded PI control based on Space Vector Pulse Width Modulation (SVPWM). MPC and SVPWM-based FOC are implemented with the TMS320F2812 DSP and its power driver circuits. The designed MPC for a PMSM drive is experimentally validated on a laboratory test bench. The performances are compared with those obtained by a conventional PI-based system in order to highlight the improvements, especially regarding speed tracking response.Keywords: permanent magnet synchronous motor, model-based predictive control, DC source utilization, cascaded PI control, space vector pulse width modulation, TMS320F2812 DSP
Procedia PDF Downloads 6443283 Utilizing Google Earth for Internet GIS
Authors: Alireza Derambakhsh
Abstract:
The objective of this examination is to explore the capability of utilizing Google Earth for Internet GIS applications. The study particularly analyzes the utilization of vector and characteristic information and the capability of showing and preparing this information in new ways utilizing the Google Earth stage. It has progressively been perceived that future improvements in GIS will fixate on Internet GIS, and in three noteworthy territories: GIS information access, spatial data scattering and GIS displaying/preparing. Google Earth is one of the group of geobrowsers that offer a free and simple to utilize administration that empower information with a spatial part to be overlain on top of a 3-D model of the Earth. This examination makes a methodological structure to accomplish its objective that comprises of three noteworthy parts: A database level, an application level and a customer level. As verification of idea a web model has been produced, which incorporates a differing scope of datasets and lets clients direst inquiries and make perceptions of this custom information. The outcomes uncovered that both vector and property information can be successfully spoken to and imagined utilizing Google Earth. In addition, the usefulness to question custom information and envision results has been added to the Google Earth stage.Keywords: Google earth, internet GIS, vector, characteristic information
Procedia PDF Downloads 3083282 Process of Research, Development and Application of New Pelletizer
Authors: Ľubomír Šooš, Peter Križan, Juraj Beniak, Miloš Matúš
Abstract:
The success of introducing a new product on the market is the new principle of production, or progressive design, improved efficiency or high quality of manufactured products. Proportionally with the growth of interest in press-biofuels - pellets or briquettes, is also growing interest in the new design better, more efficiently machines produce pellets, briquettes or granules completely new shapes. Our department has for years dedicated to the development of new highly productive designs pressing machines and new optimized press-biofuels. In this field, we have more than 40 national and international patents. The aim of paper is description of the introduction of a new principle pelleting mill and the description of his process of research, development, manufacturing and testing to deployment into production.Keywords: compacting process, pellets mill, design, new conception, press-biofuels, patent, waste
Procedia PDF Downloads 3833281 Evaluation of Random Forest and Support Vector Machine Classification Performance for the Prediction of Early Multiple Sclerosis from Resting State FMRI Connectivity Data
Authors: V. Saccà, A. Sarica, F. Novellino, S. Barone, T. Tallarico, E. Filippelli, A. Granata, P. Valentino, A. Quattrone
Abstract:
The work aim was to evaluate how well Random Forest (RF) and Support Vector Machine (SVM) algorithms could support the early diagnosis of Multiple Sclerosis (MS) from resting-state functional connectivity data. In particular, we wanted to explore the ability in distinguishing between controls and patients of mean signals extracted from ICA components corresponding to 15 well-known networks. Eighteen patients with early-MS (mean-age 37.42±8.11, 9 females) were recruited according to McDonald and Polman, and matched for demographic variables with 19 healthy controls (mean-age 37.55±14.76, 10 females). MRI was acquired by a 3T scanner with 8-channel head coil: (a)whole-brain T1-weighted; (b)conventional T2-weighted; (c)resting-state functional MRI (rsFMRI), 200 volumes. Estimated total lesion load (ml) and number of lesions were calculated using LST-toolbox from the corrected T1 and FLAIR. All rsFMRIs were pre-processed using tools from the FMRIB's Software Library as follows: (1) discarding of the first 5 volumes to remove T1 equilibrium effects, (2) skull-stripping of images, (3) motion and slice-time correction, (4) denoising with high-pass temporal filter (128s), (5) spatial smoothing with a Gaussian kernel of FWHM 8mm. No statistical significant differences (t-test, p < 0.05) were found between the two groups in the mean Euclidian distance and the mean Euler angle. WM and CSF signal together with 6 motion parameters were regressed out from the time series. We applied an independent component analysis (ICA) with the GIFT-toolbox using the Infomax approach with number of components=21. Fifteen mean components were visually identified by two experts. The resulting z-score maps were thresholded and binarized to extract the mean signal of the 15 networks for each subject. Statistical and machine learning analysis were then conducted on this dataset composed of 37 rows (subjects) and 15 features (mean signal in the network) with R language. The dataset was randomly splitted into training (75%) and test sets and two different classifiers were trained: RF and RBF-SVM. We used the intrinsic feature selection of RF, based on the Gini index, and recursive feature elimination (rfe) for the SVM, to obtain a rank of the most predictive variables. Thus, we built two new classifiers only on the most important features and we evaluated the accuracies (with and without feature selection) on test-set. The classifiers, trained on all the features, showed very poor accuracies on training (RF:58.62%, SVM:65.52%) and test sets (RF:62.5%, SVM:50%). Interestingly, when feature selection by RF and rfe-SVM were performed, the most important variable was the sensori-motor network I in both cases. Indeed, with only this network, RF and SVM classifiers reached an accuracy of 87.5% on test-set. More interestingly, the only misclassified patient resulted to have the lowest value of lesion volume. We showed that, with two different classification algorithms and feature selection approaches, the best discriminant network between controls and early MS, was the sensori-motor I. Similar importance values were obtained for the sensori-motor II, cerebellum and working memory networks. These findings, in according to the early manifestation of motor/sensorial deficits in MS, could represent an encouraging step toward the translation to the clinical diagnosis and prognosis.Keywords: feature selection, machine learning, multiple sclerosis, random forest, support vector machine
Procedia PDF Downloads 2403280 Uncertainty and Optimization Analysis Using PETREL RE
Authors: Ankur Sachan
Abstract:
The ability to make quick yet intelligent and value-added decisions to develop new fields has always been of great significance. In situations where the capital expenses and subsurface risk are high, carefully analyzing the inherent uncertainties in the reservoir and how they impact the predicted hydrocarbon accumulation and production becomes a daunting task. The problem is compounded in offshore environments, especially in the presence of heavy oils and disconnected sands where the margin for error is small. Uncertainty refers to the degree to which the data set may be in error or stray from the predicted values. To understand and quantify the uncertainties in reservoir model is important when estimating the reserves. Uncertainty parameters can be geophysical, geological, petrophysical etc. Identification of these parameters is necessary to carry out the uncertainty analysis. With so many uncertainties working at different scales, it becomes essential to have a consistent and efficient way of incorporating them into our analysis. Ranking the uncertainties based on their impact on reserves helps to prioritize/ guide future data gathering and uncertainty reduction efforts. Assigning probabilistic ranges to key uncertainties also enables the computation of probabilistic reserves. With this in mind, this paper, with the help the uncertainty and optimization process in petrel RE shows how the most influential uncertainties can be determined efficiently and how much impact so they have on the reservoir model thus helping in determining a cost effective and accurate model of the reservoir.Keywords: uncertainty, reservoir model, parameters, optimization analysis
Procedia PDF Downloads 6513279 Statistical Classification, Downscaling and Uncertainty Assessment for Global Climate Model Outputs
Authors: Queen Suraajini Rajendran, Sai Hung Cheung
Abstract:
Statistical down scaling models are required to connect the global climate model outputs and the local weather variables for climate change impact prediction. For reliable climate change impact studies, the uncertainty associated with the model including natural variability, uncertainty in the climate model(s), down scaling model, model inadequacy and in the predicted results should be quantified appropriately. In this work, a new approach is developed by the authors for statistical classification, statistical down scaling and uncertainty assessment and is applied to Singapore rainfall. It is a robust Bayesian uncertainty analysis methodology and tools based on coupling dependent modeling error with classification and statistical down scaling models in a way that the dependency among modeling errors will impact the results of both classification and statistical down scaling model calibration and uncertainty analysis for future prediction. Singapore data are considered here and the uncertainty and prediction results are obtained. From the results obtained, directions of research for improvement are briefly presented.Keywords: statistical downscaling, global climate model, climate change, uncertainty
Procedia PDF Downloads 3683278 Data Hiding by Vector Quantization in Color Image
Authors: Yung Gi Wu
Abstract:
With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.Keywords: data hiding, vector quantization, watermark, color image
Procedia PDF Downloads 3643277 The Impact of Information and Communication Technology on the Performance of Office Technology Managers
Authors: Sunusi Tijjani
Abstract:
Information and communication technology is an indispensable tool in the performance of office technology managers. Today's offices are automated and equipped with modern office machines that enhances and improve the work of office managers. However, today's office technology managers can process, evaluate, manage and communicate all forms of information using technological devices. Information and Communication Technology is viewed as the process of processing, storing ad dissemination information while office technology managers are trained professional who can effectively operate modern office machines, perform administrative duties and attend meetings to take dawn minute of meetings. This paper examines the importance of information and communication technology toward enhancing the work of office managers. It also stresses the importance of information and communication technology toward proper and accurate record management.Keywords: communication, information, technology, managers
Procedia PDF Downloads 4853276 Signal Processing Techniques for Adaptive Beamforming with Robustness
Authors: Ju-Hong Lee, Ching-Wei Liao
Abstract:
Adaptive beamforming using antenna array of sensors is useful in the process of adaptively detecting and preserving the presence of the desired signal while suppressing the interference and the background noise. For conventional adaptive array beamforming, we require a prior information of either the impinging direction or the waveform of the desired signal to adapt the weights. The adaptive weights of an antenna array beamformer under a steered-beam constraint are calculated by minimizing the output power of the beamformer subject to the constraint that forces the beamformer to make a constant response in the steering direction. Hence, the performance of the beamformer is very sensitive to the accuracy of the steering operation. In the literature, it is well known that the performance of an adaptive beamformer will be deteriorated by any steering angle error encountered in many practical applications, e.g., the wireless communication systems with massive antennas deployed at the base station and user equipment. Hence, developing effective signal processing techniques to deal with the problem due to steering angle error for array beamforming systems has become an important research work. In this paper, we present an effective signal processing technique for constructing an adaptive beamformer against the steering angle error. The proposed array beamformer adaptively estimates the actual direction of the desired signal by using the presumed steering vector and the received array data snapshots. Based on the presumed steering vector and a preset angle range for steering mismatch tolerance, we first create a matrix related to the direction vector of signal sources. Two projection matrices are generated from the matrix. The projection matrix associated with the desired signal information and the received array data are utilized to iteratively estimate the actual direction vector of the desired signal. The estimated direction vector of the desired signal is then used for appropriately finding the quiescent weight vector. The other projection matrix is set to be the signal blocking matrix required for performing adaptive beamforming. Accordingly, the proposed beamformer consists of adaptive quiescent weights and partially adaptive weights. Several computer simulation examples are provided for evaluating and comparing the proposed technique with the existing robust techniques.Keywords: adaptive beamforming, robustness, signal blocking, steering angle error
Procedia PDF Downloads 1243275 Reliability Analysis of Glass Epoxy Composite Plate under Low Velocity
Authors: Shivdayal Patel, Suhail Ahmad
Abstract:
Safety assurance and failure prediction of composite material component of an offshore structure due to low velocity impact is essential for associated risk assessment. It is important to incorporate uncertainties associated with material properties and load due to an impact. Likelihood of this hazard causing a chain of failure events plays an important role in risk assessment. The material properties of composites mostly exhibit a scatter due to their in-homogeneity and anisotropic characteristics, brittleness of the matrix and fiber and manufacturing defects. In fact, the probability of occurrence of such a scenario is due to large uncertainties arising in the system. Probabilistic finite element analysis of composite plates due to low-velocity impact is carried out considering uncertainties of material properties and initial impact velocity. Impact-induced damage of composite plate is a probabilistic phenomenon due to a wide range of uncertainties arising in material and loading behavior. A typical failure crack initiates and propagates further into the interface causing de-lamination between dissimilar plies. Since individual crack in the ply is difficult to track. The progressive damage model is implemented in the FE code by a user-defined material subroutine (VUMAT) to overcome these problems. The limit state function is accordingly established while the stresses in the lamina are such that the limit state function (g(x)>0). The Gaussian process response surface method is presently adopted to determine the probability of failure. A comparative study is also carried out for different combination of impactor masses and velocities. The sensitivity based probabilistic design optimization procedure is investigated to achieve better strength and lighter weight of composite structures. Chain of failure events due to different modes of failure is considered to estimate the consequences of failure scenario. Frequencies of occurrence of specific impact hazards yield the expected risk due to economic loss.Keywords: composites, damage propagation, low velocity impact, probability of failure, uncertainty modeling
Procedia PDF Downloads 2793274 Active Islanding Detection Method Using Intelligent Controller
Authors: Kuang-Hsiung Tan, Chih-Chan Hu, Chien-Wu Lan, Shih-Sung Lin, Te-Jen Chang
Abstract:
An active islanding detection method using disturbance signal injection with intelligent controller is proposed in this study. First, a DC\AC power inverter is emulated in the distributed generator (DG) system to implement the tracking control of active power, reactive power outputs and the islanding detection. The proposed active islanding detection method is based on injecting a disturbance signal into the power inverter system through the d-axis current which leads to a frequency deviation at the terminal of the RLC load when the utility power is disconnected. Moreover, in order to improve the transient and steady-state responses of the active power and reactive power outputs of the power inverter, and to further improve the performance of the islanding detection method, two probabilistic fuzzy neural networks (PFNN) are adopted to replace the traditional proportional-integral (PI) controllers for the tracking control and the islanding detection. Furthermore, the network structure and the online learning algorithm of the PFNN are introduced in detail. Finally, the feasibility and effectiveness of the tracking control and the proposed active islanding detection method are verified with experimental results.Keywords: distributed generators, probabilistic fuzzy neural network, islanding detection, non-detection zone
Procedia PDF Downloads 3893273 Applying Neural Networks for Solving Record Linkage Problem via Fuzzy Description Logics
Authors: Mikheil Kalmakhelidze
Abstract:
Record linkage (RL) problem has become more and more important in recent years due to the growing interest towards big data analysis. The problem can be formulated in a very simple way: Given two entries a and b of a database, decide whether they represent the same object or not. There are two classical deterministic and probabilistic ways of solving the RL problem. Using simple Bayes classifier in many cases produces useful results but sometimes they show to be poor. In recent years several successful approaches have been made towards solving specific RL problems by neural network algorithms including single layer perception, multilayer back propagation network etc. In our work, we model the RL problem for specific dataset of student applications in fuzzy description logic (FDL) where linkage of specific pair (a,b) depends on the truth value of corresponding formula A(a,b) in a canonical FDL model. As a main result, we build neural network for deciding truth value of FDL formulas in a canonical model and thus link RL problem to machine learning. We apply the approach to dataset with 10000 entries and also compare to classical RL solving approaches. The results show to be more accurate than standard probabilistic approach.Keywords: description logic, fuzzy logic, neural networks, record linkage
Procedia PDF Downloads 2723272 Using Gene Expression Programming in Learning Process of Rough Neural Networks
Authors: Sanaa Rashed Abdallah, Yasser F. Hassan
Abstract:
The paper will introduce an approach where a rough sets, gene expression programming and rough neural networks are used cooperatively for learning and classification support. The Objective of gene expression programming rough neural networks (GEP-RNN) approach is to obtain new classified data with minimum error in training and testing process. Starting point of gene expression programming rough neural networks (GEP-RNN) approach is an information system and the output from this approach is a structure of rough neural networks which is including the weights and thresholds with minimum classification error.Keywords: rough sets, gene expression programming, rough neural networks, classification
Procedia PDF Downloads 3833271 A Statistical Approach to Classification of Agricultural Regions
Authors: Hasan Vural
Abstract:
Turkey is a favorable country to produce a great variety of agricultural products because of her different geographic and climatic conditions which have been used to divide the country into four main and seven sub regions. This classification into seven regions traditionally has been used in order to data collection and publication especially related with agricultural production. Afterwards, nine agricultural regions were considered. Recently, the governmental body which is responsible of data collection and dissemination (Turkish Institute of Statistics-TIS) has used 12 classes which include 11 sub regions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops in a ten years time period (1996-2005). The panel data grouped in 11 subregions has been evaluated by cluster and multivariate statistical methods. It was concluded that from the agricultural production point of view, it will be rather meaningful to consider three main and eight sub-agricultural regions throughout the country.Keywords: agricultural region, factorial analysis, cluster analysis,
Procedia PDF Downloads 4153270 The Change of Urban Land Use/Cover Using Object Based Approach for Southern Bali
Authors: I. Gusti A. A. Rai Asmiwyati, Robert J. Corner, Ashraf M. Dewan
Abstract:
Change on land use/cover (LULC) dominantly affects spatial structure and function. It can have such impacts by disrupting social culture practice and disturbing physical elements. Thus, it has become essential to understand of the dynamics in time and space of LULC as it can be used as a critical input for developing sustainable LULC. This study was an attempt to map and monitor the LULC change in Bali Indonesia from 2003 to 2013. Using object based classification to improve the accuracy, and change detection, multi temporal land use/cover data were extracted from a set of ASTER satellite image. The overall accuracies of the classification maps of 2003 and 2013 were 86.99% and 80.36%, respectively. Built up area and paddy field were the dominant type of land use/cover in both years. Patch increase dominantly in 2003 illustrated the rapid paddy field fragmentation and the huge occurring transformation. This approach is new for the case of diverse urban features of Bali that has been growing fast and increased the classification accuracy than the manual pixel based classification.Keywords: land use/cover, urban, Bali, ASTER
Procedia PDF Downloads 5403269 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor
Authors: Tayyaba Azim, Bibi Amina
Abstract:
The increase in the popularity of opinion mining with the rapid growth in the availability of social networks has attracted a lot of opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and explore the effectiveness of published facts. The main theme of this research is to account the public opinion on the most crucial and extensively discussed development projects, China Pakistan Economic Corridor (CPEC), considered as a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through the ML approach. This research aims to demonstrate the use of ML approaches to spontaneously analyze the public sentiment on Twitter tweets particularly about CPEC. Support Vector Machine SVM is used for classification task classifying tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, a comparison of the trained model on manually labelled tweets and automatically generated lexicon is performed. The contributions of this work are: Development of a sentiment analysis system for public tweets on CPEC subject, construction of an automatic generation of the lexicon of public tweets on CPEC, different themes are identified among tweets and sentiments are assigned to each theme. It is worth noting that the applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not been explored and practised in Pakistan region on CPEC yet.Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec
Procedia PDF Downloads 1483268 Harnessing Artificial Intelligence and Machine Learning for Advanced Fraud Detection and Prevention
Authors: Avinash Malladhi
Abstract:
Forensic accounting is a specialized field that involves the application of accounting principles, investigative skills, and legal knowledge to detect and prevent fraud. With the rise of big data and technological advancements, artificial intelligence (AI) and machine learning (ML) algorithms have emerged as powerful tools for forensic accountants to enhance their fraud detection capabilities. In this paper, we review and analyze various AI/ML algorithms that are commonly used in forensic accounting, including supervised and unsupervised learning, deep learning, natural language processing Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Support Vector Machines (SVMs), Decision Trees, and Random Forests. We discuss their underlying principles, strengths, and limitations and provide empirical evidence from existing research studies demonstrating their effectiveness in detecting financial fraud. We also highlight potential ethical considerations and challenges associated with using AI/ML in forensic accounting. Furthermore, we highlight the benefits of these technologies in improving fraud detection and prevention in forensic accounting.Keywords: AI, machine learning, forensic accounting & fraud detection, anti money laundering, Benford's law, fraud triangle theory
Procedia PDF Downloads 933267 Land Cover Classification System for the Estimation of Carbon Storage in Terrestrial Ecosystems
Authors: Lei Zhang
Abstract:
The carbon cycle greatly influences global change, and the land cover changes contribute to the status and rate of the carbon budget in ecosystems. This paper proposes a land cover classification system for mapping land cover, the national ecological environment assessment, and estimating carbon storage in ecosystems. The classification system consists of basic land cover classes at levels Ⅰ and Ⅱ and auxiliary features at level III. The basic 38 classes characterizing land cover features are derived from 19 criteria referring to composition, structure, pattern, phenology, etc. The basic classes reflect the status of carbon storage in ecosystems. The auxiliary classes at level III complement the attributes of higher levels by 9 criteria. The 5 environmental criteria of temperature, moisture, landform, aspect and slope mainly reflect the potential and intensity of carbon storage in ecosystems. The disturbance of vegetation succession caused by land use type influences the vegetation carbon budget. The other 3 vegetation cover criteria, growth period, and species characteristics further refine the vegetation types. The hierarchical structure of the land cover map (the classes of levels Ⅰ and Ⅱ) is independent of the products of level III, which is helpful for land cover product management and applications. The classification system has been adopted in the Chinese national land cover database for the carbon budget in ecosystems at a 30 m scale.Keywords: classification system, land cover, ecosystem, carbon storage, object based
Procedia PDF Downloads 703266 From Type-I to Type-II Fuzzy System Modeling for Diagnosis of Hepatitis
Authors: Shahabeddin Sotudian, M. H. Fazel Zarandi, I. B. Turksen
Abstract:
Hepatitis is one of the most common and dangerous diseases that affects humankind, and exposes millions of people to serious health risks every year. Diagnosis of Hepatitis has always been a challenge for physicians. This paper presents an effective method for diagnosis of hepatitis based on interval Type-II fuzzy. This proposed system includes three steps: pre-processing (feature selection), Type-I and Type-II fuzzy classification, and system evaluation. KNN-FD feature selection is used as the preprocessing step in order to exclude irrelevant features and to improve classification performance and efficiency in generating the classification model. In the fuzzy classification step, an “indirect approach” is used for fuzzy system modeling by implementing the exponential compactness and separation index for determining the number of rules in the fuzzy clustering approach. Therefore, we first proposed a Type-I fuzzy system that had an accuracy of approximately 90.9%. In the proposed system, the process of diagnosis faces vagueness and uncertainty in the final decision. Thus, the imprecise knowledge was managed by using interval Type-II fuzzy logic. The results that were obtained show that interval Type-II fuzzy has the ability to diagnose hepatitis with an average accuracy of 93.94%. The classification accuracy obtained is the highest one reached thus far. The aforementioned rate of accuracy demonstrates that the Type-II fuzzy system has a better performance in comparison to Type-I and indicates a higher capability of Type-II fuzzy system for modeling uncertainty.Keywords: hepatitis disease, medical diagnosis, type-I fuzzy logic, type-II fuzzy logic, feature selection
Procedia PDF Downloads 3063265 DeClEx-Processing Pipeline for Tumor Classification
Authors: Gaurav Shinde, Sai Charan Gongiguntla, Prajwal Shirur, Ahmed Hambaba
Abstract:
Health issues are significantly increasing, putting a substantial strain on healthcare services. This has accelerated the integration of machine learning in healthcare, particularly following the COVID-19 pandemic. The utilization of machine learning in healthcare has grown significantly. We introduce DeClEx, a pipeline that ensures that data mirrors real-world settings by incorporating Gaussian noise and blur and employing autoencoders to learn intermediate feature representations. Subsequently, our convolutional neural network, paired with spatial attention, provides comparable accuracy to state-of-the-art pre-trained models while achieving a threefold improvement in training speed. Furthermore, we provide interpretable results using explainable AI techniques. We integrate denoising and deblurring, classification, and explainability in a single pipeline called DeClEx.Keywords: machine learning, healthcare, classification, explainability
Procedia PDF Downloads 553264 Two Points Crossover Genetic Algorithm for Loop Layout Design Problem
Authors: Xu LiYun, Briand Florent, Fan GuoLiang
Abstract:
The loop-layout design problem (LLDP) aims at optimizing the sequence of positioning of the machines around the cyclic production line. Traffic congestion is the usual criteria to minimize in this type of problem, i.e. the number of additional cycles spent by each part in the network until the completion of its required routing sequence of machines. This paper aims at applying several improvements mechanisms such as a positioned-based crossover operator for the Genetic Algorithm (GA) called a Two Points Crossover (TPC) and an offspring selection process. The performance of the improved GA is measured using well-known examples from literature and compared to other evolutionary algorithms. Good results show that GA can still be competitive for this type of problem against more recent evolutionary algorithms.Keywords: crossover, genetic algorithm, layout design problem, loop-layout, manufacturing optimization
Procedia PDF Downloads 2793263 A Survey of Skin Cancer Detection and Classification from Skin Lesion Images Using Deep Learning
Authors: Joseph George, Anne Kotteswara Roa
Abstract:
Skin disease is one of the most common and popular kinds of health issues faced by people nowadays. Skin cancer (SC) is one among them, and its detection relies on the skin biopsy outputs and the expertise of the doctors, but it consumes more time and some inaccurate results. At the early stage, skin cancer detection is a challenging task, and it easily spreads to the whole body and leads to an increase in the mortality rate. Skin cancer is curable when it is detected at an early stage. In order to classify correct and accurate skin cancer, the critical task is skin cancer identification and classification, and it is more based on the cancer disease features such as shape, size, color, symmetry and etc. More similar characteristics are present in many skin diseases; hence it makes it a challenging issue to select important features from a skin cancer dataset images. Hence, the skin cancer diagnostic accuracy is improved by requiring an automated skin cancer detection and classification framework; thereby, the human expert’s scarcity is handled. Recently, the deep learning techniques like Convolutional neural network (CNN), Deep belief neural network (DBN), Artificial neural network (ANN), Recurrent neural network (RNN), and Long and short term memory (LSTM) have been widely used for the identification and classification of skin cancers. This survey reviews different DL techniques for skin cancer identification and classification. The performance metrics such as precision, recall, accuracy, sensitivity, specificity, and F-measures are used to evaluate the effectiveness of SC identification using DL techniques. By using these DL techniques, the classification accuracy increases along with the mitigation of computational complexities and time consumption.Keywords: skin cancer, deep learning, performance measures, accuracy, datasets
Procedia PDF Downloads 128