Search results for: features engineering methods for forecasting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 20901

Search results for: features engineering methods for forecasting

20511 Comparing Community Detection Algorithms in Bipartite Networks

Authors: Ehsan Khademi, Mahdi Jalili

Abstract:

Despite the special features of bipartite networks, they are common in many systems. Real-world bipartite networks may show community structure, similar to what one can find in one-mode networks. However, the interpretation of the community structure in bipartite networks is different as compared to one-mode networks. In this manuscript, we compare a number of available methods that are frequently used to discover community structure of bipartite networks. These networks are categorized into two broad classes. One class is the methods that, first, transfer the network into a one-mode network, and then apply community detection algorithms. The other class is the algorithms that have been developed specifically for bipartite networks. These algorithms are applied on a model network with prescribed community structure.

Keywords: community detection, bipartite networks, co-clustering, modularity, network projection, complex networks

Procedia PDF Downloads 625
20510 A Network of Nouns and Their Features :A Neurocomputational Study

Authors: Skiker Kaoutar, Mounir Maouene

Abstract:

Neuroimaging studies indicate that a large fronto-parieto-temporal network support nouns and their features, with some areas store semantic knowledge (visual, auditory, olfactory, gustatory,…), other areas store lexical representation and other areas are implicated in general semantic processing. However, it is not well understood how this fronto-parieto-temporal network can be modulated by different semantic tasks and different semantic relations between nouns. In this study, we combine a behavioral semantic network, functional MRI studies involving object’s related nouns and brain network studies to explain how different semantic tasks and different semantic relations between nouns can modulate the activity within the brain network of nouns and their features. We first describe how nouns and their features form a large scale brain network. For this end, we examine the connectivities between areas recruited during the processing of nouns to know which configurations of interaction areas are possible. We can thus identify if, for example, brain areas that store semantic knowledge communicate via functional/structural links with areas that store lexical representations. Second, we examine how this network is modulated by different semantic tasks involving nouns and finally, we examine how category specific activation may result from the semantic relations among nouns. The results indicate that brain network of nouns and their features is highly modulated and flexible by different semantic tasks and semantic relations. At the end, this study can be used as a guide to help neurosientifics to interpret the pattern of fMRI activations detected in the semantic processing of nouns. Specifically; this study can help to interpret the category specific activations observed extensively in a large number of neuroimaging studies and clinical studies.

Keywords: nouns, features, network, category specificity

Procedia PDF Downloads 521
20509 A Study of Permission-Based Malware Detection Using Machine Learning

Authors: Ratun Rahman, Rafid Islam, Akin Ahmed, Kamrul Hasan, Hasan Mahmud

Abstract:

Malware is becoming more prevalent, and several threat categories have risen dramatically in recent years. This paper provides a bird's-eye view of the world of malware analysis. The efficiency of five different machine learning methods (Naive Bayes, K-Nearest Neighbor, Decision Tree, Random Forest, and TensorFlow Decision Forest) combined with features picked from the retrieval of Android permissions to categorize applications as harmful or benign is investigated in this study. The test set consists of 1,168 samples (among these android applications, 602 are malware and 566 are benign applications), each consisting of 948 features (permissions). Using the permission-based dataset, the machine learning algorithms then produce accuracy rates above 80%, except the Naive Bayes Algorithm with 65% accuracy. Of the considered algorithms TensorFlow Decision Forest performed the best with an accuracy of 90%.

Keywords: android malware detection, machine learning, malware, malware analysis

Procedia PDF Downloads 167
20508 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 279
20507 Employing GIS to Analyze Areas Prone to Flooding: Case Study of Thailand

Authors: Sanpachai Huvanandana, Settapong Malisuwan, Soparwan Tongyuak, Prust Pannachet, Anong Phoepueak, Navneet Madan

Abstract:

Many regions of Thailand are prone to flooding due to tropical climate. A commonly increasing precipitation in this continent results in risk of flooding. Many efforts have been implemented such as drainage control system, multiple dams, and irrigation canals. In order to decide where the drainages, dams, and canal should be appropriately located, the flooding risk area should be determined. This paper is aimed to identify the appropriate features that can be used to classify the flooding risk area in Thailand. Several features have been analyzed and used to classify the area. Non-supervised clustering techniques have been used and the results have been compared with ten years average actual flooding area.

Keywords: flood area clustering, geographical information system, flood features

Procedia PDF Downloads 295
20506 Preparation of Papers - Developing a Leukemia Diagnostic System Based on Hybrid Deep Learning Architectures in Actual Clinical Environments

Authors: Skyler Kim

Abstract:

An early diagnosis of leukemia has always been a challenge to doctors and hematologists. On a worldwide basis, it was reported that there were approximately 350,000 new cases in 2012, and diagnosing leukemia was time-consuming and inefficient because of an endemic shortage of flow cytometry equipment in current clinical practice. As the number of medical diagnosis tools increased and a large volume of high-quality data was produced, there was an urgent need for more advanced data analysis methods. One of these methods was the AI approach. This approach has become a major trend in recent years, and several research groups have been working on developing these diagnostic models. However, designing and implementing a leukemia diagnostic system in real clinical environments based on a deep learning approach with larger sets remains complex. Leukemia is a major hematological malignancy that results in mortality and morbidity throughout different ages. We decided to select acute lymphocytic leukemia to develop our diagnostic system since acute lymphocytic leukemia is the most common type of leukemia, accounting for 74% of all children diagnosed with leukemia. The results from this development work can be applied to all other types of leukemia. To develop our model, the Kaggle dataset was used, which consists of 15135 total images, 8491 of these are images of abnormal cells, and 5398 images are normal. In this paper, we design and implement a leukemia diagnostic system in a real clinical environment based on deep learning approaches with larger sets. The proposed diagnostic system has the function of detecting and classifying leukemia. Different from other AI approaches, we explore hybrid architectures to improve the current performance. First, we developed two independent convolutional neural network models: VGG19 and ResNet50. Then, using both VGG19 and ResNet50, we developed a hybrid deep learning architecture employing transfer learning techniques to extract features from each input image. In our approach, fusing the features from specific abstraction layers can be deemed as auxiliary features and lead to further improvement of the classification accuracy. In this approach, features extracted from the lower levels are combined into higher dimension feature maps to help improve the discriminative capability of intermediate features and also overcome the problem of network gradient vanishing or exploding. By comparing VGG19 and ResNet50 and the proposed hybrid model, we concluded that the hybrid model had a significant advantage in accuracy. The detailed results of each model’s performance and their pros and cons will be presented in the conference.

Keywords: acute lymphoblastic leukemia, hybrid model, leukemia diagnostic system, machine learning

Procedia PDF Downloads 187
20505 Bioinformatic Approaches in Population Genetics and Phylogenetic Studies

Authors: Masoud Sheidai

Abstract:

Biologists with a special field of population genetics and phylogeny have different research tasks such as populations’ genetic variability and divergence, species relatedness, the evolution of genetic and morphological characters, and identification of DNA SNPs with adaptive potential. To tackle these problems and reach a concise conclusion, they must use the proper and efficient statistical and bioinformatic methods as well as suitable genetic and morphological characteristics. In recent years application of different bioinformatic and statistical methods, which are based on various well-documented assumptions, are the proper analytical tools in the hands of researchers. The species delineation is usually carried out with the use of different clustering methods like K-means clustering based on proper distance measures according to the studied features of organisms. A well-defined species are assumed to be separated from the other taxa by molecular barcodes. The species relationships are studied by using molecular markers, which are analyzed by different analytical methods like multidimensional scaling (MDS) and principal coordinate analysis (PCoA). The species population structuring and genetic divergence are usually investigated by PCoA and PCA methods and a network diagram. These are based on bootstrapping of data. The Association of different genes and DNA sequences to ecological and geographical variables is determined by LFMM (Latent factor mixed model) and redundancy analysis (RDA), which are based on Bayesian and distance methods. Molecular and morphological differentiating characters in the studied species may be identified by linear discriminant analysis (DA) and discriminant analysis of principal components (DAPC). We shall illustrate these methods and related conclusions by giving examples from different edible and medicinal plant species.

Keywords: GWAS analysis, K-Means clustering, LFMM, multidimensional scaling, redundancy analysis

Procedia PDF Downloads 124
20504 Classification of Hyperspectral Image Using Mathematical Morphological Operator-Based Distance Metric

Authors: Geetika Barman, B. S. Daya Sagar

Abstract:

In this article, we proposed a pixel-wise classification of hyperspectral images using a mathematical morphology operator-based distance metric called “dilation distance” and “erosion distance”. This method involves measuring the spatial distance between the spectral features of a hyperspectral image across the bands. The key concept of the proposed approach is that the “dilation distance” is the maximum distance a pixel can be moved without changing its classification, whereas the “erosion distance” is the maximum distance that a pixel can be moved before changing its classification. The spectral signature of the hyperspectral image carries unique class information and shape for each class. This article demonstrates how easily the dilation and erosion distance can measure spatial distance compared to other approaches. This property is used to calculate the spatial distance between hyperspectral image feature vectors across the bands. The dissimilarity matrix is then constructed using both measures extracted from the feature spaces. The measured distance metric is used to distinguish between the spectral features of various classes and precisely distinguish between each class. This is illustrated using both toy data and real datasets. Furthermore, we investigated the role of flat vs. non-flat structuring elements in capturing the spatial features of each class in the hyperspectral image. In order to validate, we compared the proposed approach to other existing methods and demonstrated empirically that mathematical operator-based distance metric classification provided competitive results and outperformed some of them.

Keywords: dilation distance, erosion distance, hyperspectral image classification, mathematical morphology

Procedia PDF Downloads 86
20503 Critical Pedagogy in the Philippine K-12 Grade 8 Values Education Curriculum and Textbook

Authors: Raymon Maac, Michael Arthus Muega, Joyce Ann Calingasan, Elva Maureen Gorospe

Abstract:

Critical pedagogy is known for its advocacy of humanistic and liberating education. Its far-reaching approach helps students to understand and analyze their own situations and the realities happening in their society. However, this pedagogy together with its promising features is not well-known in the Philippines. This paper determines the place of critical pedagogy in the new values education curriculum and analyzes its features in the K-12 Values Education curriculum and textbook. The study examines the position of critical pedagogy in the Philippine K-12 Values Education curriculum by closely studying and comparing their features; and scrutinizes the Grade 8 Values Education textbook specifically modules 4, 8, 10 and 13 which comprises 25% of the total 16 modules. The said modules are concerned with the role of the family in the preservation of social justice, which is one of the objectives of critical pedagogy. The findings in this research were based on the pieces of evidence gathered from the curriculum and textbook itself. Based on the evaluation done, the study found out that the ideas of critical pedagogy were the same with that of the objectives of K-12 Values Education Curriculum. Due to this, values education teachers can utilize critical pedagogy in their subject. In addition, the K-12 Values Education curriculum exhibits some of the features of critical pedagogy such as authentic student empowerment and critical thinking. Lastly, some features of critical pedagogy are also evident in some of the general parts and recommended activities in the K-12 Values Education textbook while other activities need to be fully developed by both teacher and students to reflect the genuine critical pedagogy.

Keywords: authentic student empowerment, critical pedagogy, critical thinking, liberating education

Procedia PDF Downloads 349
20502 Standalone Docking Station with Combined Charging Methods for Agricultural Mobile Robots

Authors: Leonor Varandas, Pedro D. Gaspar, Martim L. Aguiar

Abstract:

One of the biggest concerns in the field of agriculture is around the energy efficiency of robots that will perform agriculture’s activity and their charging methods. In this paper, two different charging methods for agricultural standalone docking stations are shown that will take into account various variants as field size and its irregularities, work’s nature to which the robot will perform, deadlines that have to be respected, among others. Its features also are dependent on the orchard, season, battery type and its technical specifications and cost. First charging base method focuses on wireless charging, presenting more benefits for small field. The second charging base method relies on battery replacement being more suitable for large fields, thus avoiding the robot stop for recharge. Existing many methods to charge a battery, the CC CV was considered the most appropriate for either simplicity or effectiveness. The choice of the battery for agricultural purposes is if most importance. While the most common battery used is Li-ion battery, this study also discusses the use of graphene-based new type of batteries with 45% over capacity to the Li-ion one. A Battery Management Systems (BMS) is applied for battery balancing. All these approaches combined showed to be a promising method to improve a lot of technical agricultural work, not just in terms of plantation and harvesting but also about every technique to prevent harmful events like plagues and weeds or even to reduce crop time and cost.

Keywords: agricultural mobile robot, charging methods, battery replacement method, wireless charging method

Procedia PDF Downloads 148
20501 Rhetorical Features of Research Article Abstracts of Non-Native English-Speaking Novice Student Researchers

Authors: Rita Darmayanti

Abstract:

This study aims at investigating the discourse pattern and structure of research article abstracts. The characteristics of the language used in abstracts written by non-native English-speaking (NNES) novice researchers are mainly examined in terms of rhetorical moves and the degree of variability of the rhetorical features as indicated by the structure of clauses and the linguistic features of the text. To this end, 20 abstracts written by undergraduate students of the accounting department at the State Polytechnic of Malang in 2018-2019 were employed as the data of this study. Findings showed that the most frequently used pattern of the rhetorical move is I(Introduction)-P(Purpose)-M(Method)-Pr(Product or Result)-C(Conclusion) with the significant use of active sentence and present and past tense. The findings of the study are projected to be utilized for evaluating the quality of students’ abstracts and generating a pedagogical proposal of ESP writing course or at least providing a critical review of current practices in ESP program intended for non-native English students at tertiary level.

Keywords: rhetorical features, rhetorical moves, non-native English-speaking novice researchers, research abstract

Procedia PDF Downloads 131
20500 Load Forecast of the Peak Demand Based on Both the Peak Demand and Its Location

Authors: Qais H. Alsafasfeh

Abstract:

The aim of this paper is to provide a forecast of the peak demand for the next 15 years for electrical distribution companies. The proposed methodology provides both the peak demand and its location for the next 15 years. This paper describes the Spatial Load Forecasting model used, the information provided by electrical distribution company in Jordan, the workflow followed, the parameters used and the assumptions made to run the model. The aim of this paper is to provide a forecast of the peak demand for the next 15 years for electrical distribution companies. The proposed methodology provides both the peak demand and its location for the next 15 years. This paper describes the Spatial Load Forecasting model used, the information provided by electrical distribution company in Jordan, the workflow followed, the parameters used and the assumptions made to run the model.

Keywords: load forecast, peak demand, spatial load, electrical distribution

Procedia PDF Downloads 495
20499 Transformative Digital Trends in Supply Chain Management: The Role of Artificial Intelligence

Authors: Srinivas Vangari

Abstract:

With the technological advancements around the globe, artificial intelligence (AI) has boosted supply chain management (SCM) by improving efficiency, sensitivity, and promptness. Artificial intelligence-based SCM provides comprehensive perceptions of consumer behavior in dynamic market situations and trends, foreseeing the accurate demand. It reduces overproduction and stockouts while optimizing production planning and streamlining operations. Consequently, the AI-driven SCM produces a customer-centric supply with resilient and robust operations. Intending to delve into the transformative significance of AI in SCM, this study focuses on improving efficiency in SCM with the integration of AI, understanding the production demand, accurate forecasting, and particular production planning. The study employs a mixed-method approach and expert survey insights to explore the challenges and benefits of AI applications in SCM. Further, a case analysis is incorporated to identify the best practices and potential challenges with the critical success features in AI-driven SCM. Key findings of the study indicate the significant advantages of the AI-integrated SCM, including optimized inventory management, improved transportation and logistics management, cost optimization, and advanced decision-making, positioning AI as a pivotal force in the future of supply chain management.

Keywords: artificial intelligence, supply chain management, accurate forecast, accurate planning of production, understanding demand

Procedia PDF Downloads 21
20498 The Usage of Artificial Intelligence in Instagram

Authors: Alanod Alqasim, Yasmine Iskandarani, Sita Algethami, Jawaher alzughaiby

Abstract:

This study focuses on the usage of AI (Artificial Intelligence) systems and features on the Instagram application and how it influences user experience and satisfaction. The aim is to evaluate the techniques and current capabilities, restrictions, and potential future directions of AI in an Instagram application. Following a concise explanation of the core concepts underlying AI usage on Instagram. To answer this question, 19 randomly selected users were asked to complete a 9-question survey on their experience and satisfaction with the app's features (Filters, user preferences, translation tool) and authenticity. The results revealed that there were three prevalent allegations. These declarations include that Instagram has an extremely attractive user interface; secondly, Instagram creates a strong sense of community; and lastly, Instagram has an important influence on mental health.

Keywords: AI (Artificial Intelligence), instagram, features, satisfaction, experience

Procedia PDF Downloads 82
20497 Modeling Factors Affecting Fertility Transition in Africa: Case of Kenya

Authors: Dennis Okora Amima Ondieki

Abstract:

Fertility transition has been identified to be affected by numerous factors. This research aimed to investigate the most real factors affecting fertility transition in Kenya. These factors were firstly extracted from the literature convened into demographic features, social, and economic features, social-cultural features, reproductive features and modernization features. All these factors had 23 factors identified for this study. The data for this study was from the Kenya Demographic and Health Surveys (KDHS) conducted in 1999-2003 and 2003-2008/9. The data was continuous, and it involved the mean birth order for the ten periods. Principal component analysis (PCA) was utilized using 23 factors. Principal component analysis conveyed religion, region, education and marital status as the real factors. PC scores were calculated for every point. The identified principal components were utilized as forecasters in the multiple regression model, with the fertility level as the response variable. The four components were found to be affecting fertility transition differently. It was found that fertility is affected positively by factors of region and marital and negatively by factors of religion and education. These four factors can be considered in the planning policy in Kenya and Africa at large.

Keywords: fertility transition, principal component analysis, Kenya demographic health survey, birth order

Procedia PDF Downloads 99
20496 Tensor Deep Stacking Neural Networks and Bilinear Mapping Based Speech Emotion Classification Using Facial Electromyography

Authors: P. S. Jagadeesh Kumar, Yang Yung, Wenli Hu

Abstract:

Speech emotion classification is a dominant research field in finding a sturdy and profligate classifier appropriate for different real-life applications. This effort accentuates on classifying different emotions from speech signal quarried from the features related to pitch, formants, energy contours, jitter, shimmer, spectral, perceptual and temporal features. Tensor deep stacking neural networks were supported to examine the factors that influence the classification success rate. Facial electromyography signals were composed of several forms of focuses in a controlled atmosphere by means of audio-visual stimuli. Proficient facial electromyography signals were pre-processed using moving average filter, and a set of arithmetical features were excavated. Extracted features were mapped into consistent emotions using bilinear mapping. With facial electromyography signals, a database comprising diverse emotions will be exposed with a suitable fine-tuning of features and training data. A success rate of 92% can be attained deprived of increasing the system connivance and the computation time for sorting diverse emotional states.

Keywords: speech emotion classification, tensor deep stacking neural networks, facial electromyography, bilinear mapping, audio-visual stimuli

Procedia PDF Downloads 254
20495 Estimation of Forces Applied to Forearm Using EMG Signal Features to Control of Powered Human Arm Prostheses

Authors: Faruk Ortes, Derya Karabulut, Yunus Ziya Arslan

Abstract:

Myoelectric features gathering from musculature environment are considered on a preferential basis to perceive muscle activation and control human arm prostheses according to recent experimental researches. EMG (electromyography) signal based human arm prostheses have shown a promising performance in terms of providing basic functional requirements of motions for the amputated people in recent years. However, these assistive devices for neurorehabilitation still have important limitations in enabling amputated people to perform rather sophisticated or functional movements. Surface electromyogram (EMG) is used as the control signal to command such devices. This kind of control consists of activating a motion in prosthetic arm using muscle activation for the same particular motion. Extraction of clear and certain neural information from EMG signals plays a major role especially in fine control of hand prosthesis movements. Many signal processing methods have been utilized for feature extraction from EMG signals. The specific objective of this study was to compare widely used time domain features of EMG signal including integrated EMG(IEMG), root mean square (RMS) and waveform length(WL) for prediction of externally applied forces to human hands. Obtained features were classified using artificial neural networks (ANN) to predict the forces. EMG signals supplied to process were recorded during only type of muscle contraction which is isometric and isotonic one. Experiments were performed by three healthy subjects who are right-handed and in a range of 25-35 year-old aging. EMG signals were collected from muscles of the proximal part of the upper body consisting of: biceps brachii, triceps brachii, pectorialis major and trapezius. The force prediction results obtained from the ANN were statistically analyzed and merits and pitfalls of the extracted features were discussed with detail. The obtained results are anticipated to contribute classification process of EMG signal and motion control of powered human arm prosthetics control.

Keywords: assistive devices for neurorehabilitation, electromyography, feature extraction, force estimation, human arm prosthesis

Procedia PDF Downloads 367
20494 Features of Soil Formation in the North of Western Siberia in Cryogenic Conditions

Authors: Tatiana V. Raudina, Sergey P. Kulizhskiy

Abstract:

A large part of Russia is located in permafrost areas. These areas are widely used because there are concentrated valuable natural resources. Therefore to explore of cryosols it is important due to the significant increase of anthropogenic stress as well as the problem of global climate change. In the north of Western Siberia permafrost phenomena is widespread. Permafrost as a factor of soil formation and cryogenesis as a process have a great impact on the soil formation of these areas. Based on the research results of permafrost-affected soils tundra landscapes formed in the central part of the Tazovskiy Peninsula in cryogenic conditions, data were obtained which characterize the morphological features of soils. The specificity of soil cover distribution and manifestation of soil-forming processes within the study area are noted. Permafrost features such as frost cracking, cryoturbation, thixotropy, movement of humus are formed. The formation of these features is increased with the development of the territory. As a consequence, there is a change in the components of the environment and the destruction of the soil cover.

Keywords: gleyed and nongleyed soils, permafrost, soil cryogenesis (pedocryogenesis), soil-forming macroprocesses

Procedia PDF Downloads 350
20493 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 361
20492 Bag of Words Representation Based on Fusing Two Color Local Descriptors and Building Multiple Dictionaries

Authors: Fatma Abdedayem

Abstract:

We propose an extension to the famous method called Bag of words (BOW) which proved a successful role in the field of image categorization. Practically, this method based on representing image with visual words. In this work, firstly, we extract features from images using Spatial Pyramid Representation (SPR) and two dissimilar color descriptors which are opponent-SIFT and transformed-color-SIFT. Secondly, we fuse color local features by joining the two histograms coming from these descriptors. Thirdly, after collecting of all features, we generate multi-dictionaries coming from n random feature subsets that obtained by dividing all features into n random groups. Then, by using these dictionaries separately each image can be represented by n histograms which are lately concatenated horizontally and form the final histogram, that allows to combine Multiple Dictionaries (MDBoW). In the final step, in order to classify image we have applied Support Vector Machine (SVM) on the generated histograms. Experimentally, we have used two dissimilar image datasets in order to test our proposition: Caltech 256 and PASCAL VOC 2007.

Keywords: bag of words (BOW), color descriptors, multi-dictionaries, MDBoW

Procedia PDF Downloads 297
20491 A Data-Driven Compartmental Model for Dengue Forecasting and Covariate Inference

Authors: Yichao Liu, Peter Fransson, Julian Heidecke, Jonas Wallin, Joacim Rockloev

Abstract:

Dengue, a mosquito-borne viral disease, poses a significant public health challenge in endemic tropical or subtropical countries, including Sri Lanka. To reveal insights into the complexity of the dynamics of this disease and study the drivers, a comprehensive model capable of both robust forecasting and insightful inference of drivers while capturing the co-circulating of several virus strains is essential. However, existing studies mostly focus on only one aspect at a time and do not integrate and carry insights across the siloed approach. While mechanistic models are developed to capture immunity dynamics, they are often oversimplified and lack integration of all the diverse drivers of disease transmission. On the other hand, purely data-driven methods lack constraints imposed by immuno-epidemiological processes, making them prone to overfitting and inference bias. This research presents a hybrid model that combines machine learning techniques with mechanistic modelling to overcome the limitations of existing approaches. Leveraging eight years of newly reported dengue case data, along with socioeconomic factors, such as human mobility, weekly climate data from 2011 to 2018, genetic data detecting the introduction and presence of new strains, and estimates of seropositivity for different districts in Sri Lanka, we derive a data-driven vector (SEI) to human (SEIR) model across 16 regions in Sri Lanka at the weekly time scale. By conducting ablation studies, the lag effects allowing delays up to 12 weeks of time-varying climate factors were determined. The model demonstrates superior predictive performance over a pure machine learning approach when considering lead times of 5 and 10 weeks on data withheld from model fitting. It further reveals several interesting interpretable findings of drivers while adjusting for the dynamics and influences of immunity and introduction of a new strain. The study uncovers strong influences of socioeconomic variables: population density, mobility, household income and rural vs. urban population. The study reveals substantial sensitivity to the diurnal temperature range and precipitation, while mean temperature and humidity appear less important in the study location. Additionally, the model indicated sensitivity to vegetation index, both max and average. Predictions on testing data reveal high model accuracy. Overall, this study advances the knowledge of dengue transmission in Sri Lanka and demonstrates the importance of incorporating hybrid modelling techniques to use biologically informed model structures with flexible data-driven estimates of model parameters. The findings show the potential to both inference of drivers in situations of complex disease dynamics and robust forecasting models.

Keywords: compartmental model, climate, dengue, machine learning, social-economic

Procedia PDF Downloads 84
20490 Music Genre Classification Based on Non-Negative Matrix Factorization Features

Authors: Soyon Kim, Edward Kim

Abstract:

In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres and 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided the increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required dimensionality of 460, but NMF-LSM and NMF-BFV required dimensionalities of 10 and 10, respectively. Combining the basic features, NMF-LSM and NMF-BFV together with the SVM with a radial basis function (RBF) kernel produced the significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.

Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)

Procedia PDF Downloads 303
20489 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. The electronic grammatical dictionaries are used as a tool of automated morphological analysis for texts processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), alternative variants of basic word/lemma. In this paper, we present the system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a few number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add words to the extended lexicons, which are similar to those, which are already in the grammatical dictionary. Our dictionaries are corpora-based, and for the compiling, we introduce the method for lemmatization of unknown words, i.e., words of which neither full form nor lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 145
20488 Digital Twins in the Built Environment: A Systematic Literature Review

Authors: Bagireanu Astrid, Bros-Williamson Julio, Duncheva Mila, Currie John

Abstract:

Digital Twins (DT) are an innovative concept of cyber-physical integration of data between an asset and its virtual replica. They have originated in established industries such as manufacturing and aviation and have garnered increasing attention as a potentially transformative technology within the built environment. With the potential to support decision-making, real-time simulations, forecasting abilities and managing operations, DT do not fall under a singular scope. This makes defining and leveraging the potential uses of DT a potential missed opportunity. Despite its recognised potential in established industries, literature on DT in the built environment remains limited. Inadequate attention has been given to the implementation of DT in construction projects, as opposed to its operational stage applications. Additionally, the absence of a standardised definition has resulted in inconsistent interpretations of DT in both industry and academia. There is a need to consolidate research to foster a unified understanding of the DT. Such consolidation is indispensable to ensure that future research is undertaken with a solid foundation. This paper aims to present a comprehensive systematic literature review on the role of DT in the built environment. To accomplish this objective, a review and thematic analysis was conducted, encompassing relevant papers from the last five years. The identified papers are categorised based on their specific areas of focus, and the content of these papers was translated into a through classification of DT. In characterising DT and the associated data processes identified, this systematic literature review has identified 6 DT opportunities specifically relevant to the built environment: Facilitating collaborative procurement methods, Supporting net-zero and decarbonization goals, Supporting Modern Methods of Construction (MMC) and off-site manufacturing (OSM), Providing increased transparency and stakeholders collaboration, Supporting complex decision making (real-time simulations and forecasting abilities) and Seamless integration with Internet of Things (IoT), data analytics and other DT. Finally, a discussion of each area of research is provided. A table of definitions of DT across the reviewed literature is provided, seeking to delineate the current state of DT implementation in the built environment context. Gaps in knowledge are identified, as well as research challenges and opportunities for further advancements in the implementation of DT within the built environment. This paper critically assesses the existing literature to identify the potential of DT applications, aiming to harness the transformative capabilities of data in the built environment. By fostering a unified comprehension of DT, this paper contributes to advancing the effective adoption and utilisation of this technology, accelerating progress towards the realisation of smart cities, decarbonisation, and other envisioned roles for DT in the construction domain.

Keywords: built environment, design, digital twins, literature review

Procedia PDF Downloads 81
20487 A New Method Separating Relevant Features from Irrelevant Ones Using Fuzzy and OWA Operator Techniques

Authors: Imed Feki, Faouzi Msahli

Abstract:

Selection of relevant parameters from a high dimensional process operation setting space is a problem frequently encountered in industrial process modelling. This paper presents a method for selecting the most relevant fabric physical parameters for each sensory quality feature. The proposed relevancy criterion has been developed using two approaches. The first utilizes a fuzzy sensitivity criterion by exploiting from experimental data the relationship between physical parameters and all the sensory quality features for each evaluator. Next an OWA aggregation procedure is applied to aggregate the ranking lists provided by different evaluators. In the second approach, another panel of experts provides their ranking lists of physical features according to their professional knowledge. Also by applying OWA and a fuzzy aggregation model, the data sensitivity-based ranking list and the knowledge-based ranking list are combined using our proposed percolation technique, to determine the final ranking list. The key issue of the proposed percolation technique is to filter automatically and objectively the relevant features by creating a gap between scores of relevant and irrelevant parameters. It permits to automatically generate threshold that can effectively reduce human subjectivity and arbitrariness when manually choosing thresholds. For a specific sensory descriptor, the threshold is defined systematically by iteratively aggregating (n times) the ranking lists generated by OWA and fuzzy models, according to a specific algorithm. Having applied the percolation technique on a real example, of a well known finished textile product especially the stonewashed denims, usually considered as the most important quality criteria in jeans’ evaluation, we separate the relevant physical features from irrelevant ones for each sensory descriptor. The originality and performance of the proposed relevant feature selection method can be shown by the variability in the number of physical features in the set of selected relevant parameters. Instead of selecting identical numbers of features with a predefined threshold, the proposed method can be adapted to the specific natures of the complex relations between sensory descriptors and physical features, in order to propose lists of relevant features of different sizes for different descriptors. In order to obtain more reliable results for selection of relevant physical features, the percolation technique has been applied for combining the fuzzy global relevancy and OWA global relevancy criteria in order to clearly distinguish scores of the relevant physical features from those of irrelevant ones.

Keywords: data sensitivity, feature selection, fuzzy logic, OWA operators, percolation technique

Procedia PDF Downloads 605
20486 Oil Demand Forecasting in China: A Structural Time Series Analysis

Authors: Tehreem Fatima, Enjun Xia

Abstract:

The research investigates the relationship between total oil consumption and transport oil consumption, GDP, oil price, and oil reserve in order to forecast future oil demand in China. Annual time series data is used over the period of 1980 to 2015, and for this purpose, an oil demand function is estimated by applying structural time series model (STSM). The technique also uncovers the Underline energy demand trend (UEDT) for China oil demand and GDP, oil reserve, oil price and UEDT are considering important drivers of China oil demand. The long-run elasticity of total oil consumption with respect to GDP and price are (0.5, -0.04) respectively while GDP, oil reserve, and price remain (0.17; 0.23; -0.05) respectively. Moreover, the Estimated results of long-run elasticity of transport oil consumption with respect to GDP and price are (0.5, -0.00) respectively long-run estimates remain (0.28; 37.76;-37.8) for GDP, oil reserve, and price respectively. For both model estimated underline energy demand trend (UEDT) remains nonlinear and stochastic and with an increasing trend of (UEDT) and based on estimated equations, it is predicted that China total oil demand somewhere will be 9.9 thousand barrel per day by 2025 as compare to 9.4 thousand barrel per day in 2015, while transport oil demand predicting value is 9.0 thousand barrel per day by 2020 as compare to 8.8 thousand barrel per day in 2015.

Keywords: china, forecasting, oil, structural time series model (STSM), underline energy demand trend (UEDT)

Procedia PDF Downloads 283
20485 Time Series Modelling for Forecasting Wheat Production and Consumption of South Africa in Time of War

Authors: Yiseyon Hosu, Joseph Akande

Abstract:

Wheat is one of the most important staple food grains of human for centuries and is largely consumed in South Africa. It has a special place in the South African economy because of its significance in food security, trade, and industry. This paper modelled and forecast the production and consumption of wheat in South Africa in the time covid-19 and the ongoing Russia-Ukraine war by using annual time series data from 1940–2021 based on the ARIMA models. Both the averaging forecast and selected models forecast indicate that there is the possibility of an increase with respect to production. The minimum and maximum growth in production is projected to be between 3million and 10 million tons, respectively. However, the model also forecast a possibility of depression with respect to consumption in South Africa. Although Covid-19 and the war between Ukraine and Russia, two major producers and exporters of global wheat, are having an effect on the volatility of the prices currently, the wheat production in South African is expected to increase and meat the consumption demand and provided an opportunity for increase export with respect to domestic consumption. The forecasting of production and consumption behaviours of major crops play an important role towards food and nutrition security, these findings can assist policymakers and will provide them with insights into the production and pricing policy of wheat in South Africa.

Keywords: ARIMA, food security, price volatility, staple food, South Africa

Procedia PDF Downloads 102
20484 Face Recognition Using Discrete Orthogonal Hahn Moments

Authors: Fatima Akhmedova, Simon Liao

Abstract:

One of the most critical decision points in the design of a face recognition system is the choice of an appropriate face representation. Effective feature descriptors are expected to convey sufficient, invariant and non-redundant facial information. In this work, we propose a set of Hahn moments as a new approach for feature description. Hahn moments have been widely used in image analysis due to their invariance, non-redundancy and the ability to extract features either globally and locally. To assess the applicability of Hahn moments to Face Recognition we conduct two experiments on the Olivetti Research Laboratory (ORL) database and University of Notre-Dame (UND) X1 biometric collection. Fusion of the global features along with the features from local facial regions are used as an input for the conventional k-NN classifier. The method reaches an accuracy of 93% of correctly recognized subjects for the ORL database and 94% for the UND database.

Keywords: face recognition, Hahn moments, recognition-by-parts, time-lapse

Procedia PDF Downloads 375
20483 Comparative Study Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBC) are the most common types of blood cells and are the most intensively studied in cell biology. The lack of RBCs is a condition in which the amount of hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs will affect the exchange of oxygen. This paper presents a comparative study for various techniques for classifying the RBCs as normal, or abnormal (anemic) using WEKA. WEKA is an open source consists of different machine learning algorithms for data mining applications. The algorithm tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cells images. The first set, exclusively consist of geometrical features, was used to identify whether the tested blood cell has a spherical shape or non-spherical cells. While the second set, consist mainly of textural features was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBCs image dataset which were obtained from Serdang Hospital-alaysia, and measuring the accuracy of test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively.

Keywords: K-nearest neighbors algorithm, radial basis function neural network, red blood cells, support vector machine

Procedia PDF Downloads 409
20482 Random Subspace Ensemble of CMAC Classifiers

Authors: Somaiyeh Dehghan, Mohammad Reza Kheirkhahan Haghighi

Abstract:

The rapid growth of domains that have data with a large number of features, while the number of samples is limited has caused difficulty in constructing strong classifiers. To reduce the dimensionality of the feature space becomes an essential step in classification task. Random subspace method (or attribute bagging) is an ensemble classifier that consists of several classifiers that each base learner in ensemble has subset of features. In the present paper, we introduce Random Subspace Ensemble of CMAC neural network (RSE-CMAC), each of which has training with subset of features. Then we use this model for classification task. For evaluation performance of our model, we compare it with bagging algorithm on 36 UCI datasets. The results reveal that the new model has better performance.

Keywords: classification, random subspace, ensemble, CMAC neural network

Procedia PDF Downloads 329