Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1065

Search results for: vector

765 Improved Classification Procedure for Imbalanced and Overlapped Situations

Abstract:

Class imbalance and class overlap are important issues in many data mining applications. An imbalanced dataset is a special case in classification problems in which the number of observations in one class (the major class) heavily exceeds the number of observations in the other class (the minor class). An overlapped dataset is one in which many observations from the two classes share the same region of the data space. Imbalanced and overlapped data are frequently found in real examples, including fraud and abuse detection in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The combination of class imbalance and overlap is challenging because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (non-overlapping, lightly overlapping, and severely overlapping) and applying a classification algorithm in each part. The three parts are determined based on the Hausdorff distance and the margin of a modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and to compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.
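As an illustration of the Hausdorff distance mentioned in the abstract, the following is a minimal sketch for two finite point sets. It is not the authors' procedure: the function names and the toy `major` and `minor` sets are assumptions made for the example, and how the distance is combined with the SVM margin is not described in the abstract.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical toy data: three major-class points and two minor-class points.
major = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
minor = [(3.0, 0.0), (4.0, 0.0)]
print(hausdorff(major, minor))
```

A small distance between the two class sets would suggest that the classes lie close together, which is one plausible signal of overlap.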

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 275

764 Pyrethroid and Organophosphate Susceptibility Status of Aedes aegypti (Linnaeus), Aedes albopictus (Skuse) and Culex quinquefasciatus (Say) in Penang, Malaysia

Authors: Hadura Abu Hasan, Zairi Jaal, P. J. McCall

Abstract:

Dengue is a serious problem in Malaysia, particularly in high-density urban communities with lower socio-economic levels. This study evaluated the susceptibility of local populations of Aedes aegypti (Linnaeus), Aedes albopictus (Skuse) and Culex quinquefasciatus (Say) from the traditional community of Bagan Dalam, Penang, Malaysia to lambda-cyhalothrin and pirimiphos-methyl using the standard World Health Organization (WHO) adult bioassay test. Unfed female mosquitoes aged 3-5 days were exposed to WHO-recommended dosages of insecticides over fixed time periods, with results presented as knock-down time (KT50) for each strain. The insecticide-susceptible VCRU laboratory strain was used as a control. All three species were highly resistant to lambda-cyhalothrin, with less than 10% mortality at 24 hours after treatment. In contrast, Ae. aegypti and Ae. albopictus were susceptible to pirimiphos-methyl, showing 100% mortality 24 hours after treatment. Cx. quinquefasciatus was classed as ‘suspected resistant’ to pirimiphos-methyl, as mortality recorded 24 hours after treatment was 94-96%. The results indicate that organophosphates such as pirimiphos-methyl might be used as an alternative to pyrethroids for dengue vector control in this dengue-prone area.

Keywords: vector control, aedes aegypti, aedes albopictus, dengue, culex quinquefasciatus, residuals insecticides, pyrethroid, organophosphate, resistant, mosquito

Procedia PDF Downloads 235

763 The Application of Video Segmentation Methods for the Purpose of Action Detection in Videos

Authors: Nassima Noufail, Sara Bouhali

Abstract:

In this work, we develop a semi-supervised solution for action detection in videos and propose an efficient algorithm for video segmentation. The approach is divided into video segmentation, feature extraction, and classification. In the first part, a video is segmented into clips using the K-means algorithm; our goal is to find groups based on similarity in the video. Applying K-means clustering to all the frames is time-consuming; therefore, we first identified transition frames, where the scene in the video changes significantly, and then applied K-means clustering to these transition frames. We used two image filters, the Gaussian filter and the Laplacian of Gaussian. Each filter extracts a set of features from the frames. The Gaussian filter blurs the image and suppresses the higher frequencies, and the Laplacian of Gaussian detects regions of rapid intensity change; we then used this vector of filter responses as input to our K-means algorithm. The output is a set of cluster centers. Each video frame pixel is then mapped to the nearest cluster center and painted with a corresponding color to form a visual map, in which similar pixels are grouped. We then computed a cluster score indicating how near the clusters are to each other and plotted a signal representing frame number vs. clustering score. Our hypothesis was that the signal would not change while semantically related events were happening in the scene. We marked the breakpoints at which the root mean square level of the signal changes significantly; each breakpoint indicates the beginning of a new video segment. In the second part, for each segment from part 1, we randomly selected a 16-frame clip and extracted spatiotemporal features for every 16 frames using a pre-trained convolutional 3D network (C3D). The final C3D output is a 512-dimensional feature vector; hence we used principal component analysis (PCA) for dimensionality reduction. The final part is the classification. The C3D feature vectors are used as input to train a multi-class linear support vector machine (SVM), which is then used to detect the action. We evaluated our approach on the UCF101 dataset, which consists of 101 human action categories, and achieved an accuracy that outperforms the state of the art by 1.2%.
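The step in which each pixel is mapped to its nearest cluster center can be sketched as follows. This is only an illustrative sketch: the function names, the two-element feature vectors (standing in for the Gaussian and Laplacian-of-Gaussian responses), and the example centers are assumptions, not values from the paper.

```python
import math


def nearest_center(feature, centers):
    """Index of the cluster center closest (Euclidean distance) to `feature`."""
    return min(range(len(centers)), key=lambda k: math.dist(feature, centers[k]))


def visual_map(pixel_features, centers):
    """Label every pixel feature vector with the index of its nearest center."""
    return [nearest_center(f, centers) for f in pixel_features]


# Hypothetical filter responses for three pixels and two learned centers.
centers = [(0.0, 0.0), (10.0, 10.0)]
pixels = [(1.0, 1.0), (9.0, 8.0), (4.0, 4.0)]
print(visual_map(pixels, centers))
```

Painting each label with a distinct color would then give the kind of visual map the abstract describes.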

Keywords: video segmentation, action detection, classification, Kmeans, C3D

Procedia PDF Downloads 47

762 Parallel Pipelined Conjugate Gradient Algorithm on Heterogeneous Platforms

Authors: Sergey Kopysov, Nikita Nedozhogin, Leonid Tonkov

Abstract:

The article presents a parallel iterative solver for large sparse linear systems that can be used on a heterogeneous platform. Traditionally, the solution of linear systems does not scale well on multi-CPU/multi-GPU clusters. For example, most attempts to implement the classical conjugate gradient method at best kept the solution time constant as the problem was enlarged. The paper proposes a pipelined variant of the conjugate gradient method (PCG), a formulation that is potentially better suited for hybrid CPU/GPU computing since it requires only one synchronization point per iteration instead of two for standard CG. Both the standard and pipelined CG methods need the vector entries generated by the current GPU and by other GPUs for matrix-vector products, so communication between GPUs becomes a major performance bottleneck on a multi-GPU cluster. The article presents an approach to minimize communication between the parallel parts of the algorithm. Additionally, computation and communication can be overlapped to reduce the impact of data exchange. Scalability is achieved by combining the pipelined version of the CG method with one synchronization point, asynchronous calculations and communications, and load balancing between the CPU and GPU when solving large linear systems. The algorithm is implemented with the combined use of MPI, OpenMP, and CUDA. We show that almost optimal speedup may be reached on 8-CPU/2-GPU (relative to single-GPU execution). The parallelized solver achieves a speedup of up to 5.49 times on 16 NVIDIA Tesla GPUs, as compared to one GPU.
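For orientation, here is a minimal serial sketch of the standard (non-pipelined) conjugate gradient method for a symmetric positive definite system. It is not the authors' MPI/OpenMP/CUDA solver; the dense list-of-lists matrix and the function names are assumptions chosen for readability. The comments mark the two dot products that would each require a global reduction in a distributed setting, which are the synchronization points that the pipelined variant reduces to one.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product (a sparse format would be used in practice)."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with standard CG."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)          # reduction 1 (synchronization point)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)               # reduction 2 (synchronization point)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Small example system whose exact solution is x = [1/11, 7/11].
print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```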

Keywords: conjugate gradient, GPU, parallel programming, pipelined algorithm

Procedia PDF Downloads 127

761 A Machine Learning Approach for Earthquake Prediction in Various Zones Based on Solar Activity

Authors: Viacheslav Shkuratskyy, Aminu Bello Usman, Michael O’Dea, Saifur Rahman Sabuj

Abstract:

This paper examines relationships between solar activity and earthquakes; it applied machine learning techniques: K-nearest neighbour, support vector regression, random forest regression, and long short-term memory network. Data from the SILSO World Data Center, the NOAA National Center, the GOES satellite, NASA OMNIWeb, and the United States Geological Survey were used for the experiment. The 23rd and 24th solar cycles, daily sunspot number, solar wind velocity, proton density, and proton temperature were all included in the dataset. The study also examined sunspots, solar wind, and solar flares, which all reflect solar activity and earthquake frequency distribution by magnitude and depth. The findings showed that the long short-term memory network model predicts earthquakes more correctly than the other models applied in the study, and solar activity is more likely to affect earthquakes of lower magnitude and shallow depth than earthquakes of magnitude 5.5 or larger with intermediate depth and deep depth.

Keywords: k-nearest neighbour, support vector regression, random forest regression, long short-term memory network, earthquakes, solar activity, sunspot number, solar wind, solar flares

Procedia PDF Downloads 35

760 An Application of Vector Error Correction Model to Assess Financial Innovation Impact on Economic Growth of Bangladesh

Authors: Md. Qamruzzaman, Wei Jianguo

Abstract:

Over the past decade, it has been observed that financial development, through financial innovation, has not only accelerated the development of an efficient and effective financial system but has also acted as a catalyst in the economic development process. In this study, we explore how financial innovation causes economic growth in Bangladesh by using a Vector Error Correction Model (VECM) for the period 1990-2014. The cointegration test confirms the existence of a long-run association between financial innovation and economic growth. To investigate directional causality, we apply the Granger causality test; the estimates show that long-run growth is affected by capital flow from non-bank financial institutions and by inflation in the economy, but changes in the growth rate have no impact on capital flow or on the level of inflation in the long run. Growth and market capitalization, as well as market capitalization and capital flow, confirm the feedback hypothesis. Variance decomposition suggests that any innovation in the financial sector can cause GDP to fluctuate in both the long run and the short run. Financial innovation promotes efficiency and lowers cost in financial transactions, which can boost the economic development process. The study proposes two policy recommendations for further development. First, innovation-friendly financial policy should be formulated to encourage the adoption and diffusion of financial innovation in the financial system. Second, the operation of the financial market and capital market should be regulated through the implementation of rules and regulations to create a conducive environment.

Keywords: financial innovation, economic growth, GDP, financial institution, VECM

Procedia PDF Downloads 223

759 Forecasting Regional Data Using Spatial VARs

Authors: Taisiia Gorshkova

Abstract:

Since the 1980s, spatial correlation models have been used increasingly often to model regional indicators. An increasingly popular approach to studying regional indicators is modeling that takes into account spatial relationships between objects that are part of the same economic zone. In the 2000s, a new class of models, spatial vector autoregressions, was developed. The main difference between standard and spatial vector autoregressions is that in the spatial VAR (SpVAR), the values of indicators at time t may depend on the values of explanatory variables at the same time t in neighboring regions and on the values of explanatory variables at time t-k in neighboring regions. Thus, VAR is a special case of SpVAR in the absence of spatial lags, and the spatial panel data model is a special case of spatial VAR in the absence of time lags. Two specifications of SpVAR were applied to Russian regional data for 2000-2017. The values of GRP and regional CPI are used as endogenous variables. The lags of GRP, CPI and the unemployment rate were used as explanatory variables. For comparison purposes, the standard VAR without spatial correlation was used as a “naïve” model. In the first specification of SpVAR, the unemployment rate and the values of the dependent variables, GRP and CPI, in neighboring regions at the same moment of time t were included in the equations for GRP and CPI, respectively. To account for the values of indicators in neighboring regions, an adjacency weight matrix is used, in which regions with a common sea or land border are assigned a value of 1, and the rest 0. In the second specification, the values of the dependent variables in neighboring regions at time t were replaced by their values at the previous time t-1. According to the results obtained, when inflation and GRP of neighbors are added to the model, both inflation and GRP are significantly affected by their previous values; inflation is also positively affected by an increase in unemployment in the previous period and negatively affected by an increase in GRP in the previous period, which corresponds to economic theory. GRP is not affected by either the inflation lag or the unemployment lag. When the model takes into account lagged values of GRP and inflation in neighboring regions, the results of inflation modeling are practically unchanged: all indicators except the unemployment lag are significant at a 5% significance level. For GRP, in turn, GRP lags in neighboring regions also become significant at a 5% significance level. The RMSE was calculated for both the spatial and the “naïve” VARs. The minimum RMSE is obtained via SpVAR with lagged explanatory variables. Thus, according to the results of the study, it can be concluded that SpVARs can accurately model both the actual values of macro indicators (particularly CPI and GRP) and the general situation in the regions.
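The binary adjacency weight matrix described in the abstract, and the spatial lag it produces, can be sketched as below. The region names, the border list, and the indicator values are hypothetical, and the sketch leaves the weights unnormalized because the abstract does not say whether the rows were scaled.

```python
def adjacency_matrix(regions, borders):
    """Binary weights: 1 if two regions share a border, 0 otherwise."""
    index = {name: i for i, name in enumerate(regions)}
    n = len(regions)
    W = [[0] * n for _ in range(n)]
    for a, b in borders:
        i, j = index[a], index[b]
        W[i][j] = W[j][i] = 1      # sharing a border is symmetric
    return W


def spatial_lag(W, values):
    """Sum of an indicator over each region's neighbors (W times the vector)."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]


# Hypothetical example: region B borders both A and C.
regions = ["A", "B", "C"]
W = adjacency_matrix(regions, [("A", "B"), ("B", "C")])
print(W)
print(spatial_lag(W, [1.0, 2.0, 4.0]))
```

In the first specification the lag would be built from neighbors' values at time t, and in the second from their values at t-1.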

Keywords: forecasting, regional data, spatial econometrics, vector autoregression

Procedia PDF Downloads 104

758 Determinants of Economic Growth in Pakistan: A Structural Vector Auto Regression Approach

Authors: Muhammad Ajmair

Abstract:

This empirical study followed the structural vector auto regression (SVAR) approach, using the so-called AB-model of Amisano and Giannini (1997), to check the impact of relevant macroeconomic determinants on economic growth in Pakistan. Before that, the auto regressive distributive lag (ARDL) bound testing technique and a time-varying parametric approach, along with the general-to-specific approach, were employed to identify the relevant significant determinants of economic growth. To the best of our knowledge, no previous study in the empirical literature has combined ARDL bound testing and the time-varying parametric approach with the general-to-specific approach; the current study bridges this gap. Annual data for the period 1976-2014 were taken from the World Development Indicators (2014). The widely used Schwarz information criterion and Akaike information criterion were considered for the lag length in each estimated equation. The main findings of the study are that remittances received, gross national expenditures and inflation are the most relevant positive and significant determinants of economic growth. Based on these empirical findings, we conclude that the government should focus on overall economic growth-augmenting factors while formulating any policy relevant to the concerned sector.

Keywords: economic growth, gross national expenditures, inflation, remittances

Procedia PDF Downloads 170

757 The Effect of Extensive Mosquito Migration on Dengue Control as Revealed by Phylogeny of Dengue Vector Aedes aegypti

Authors: M. D. Nirmani, K. L. N. Perera, G. H. Galhena

Abstract:

Dengue has become one of the most important arboviral diseases in all tropical and subtropical regions of the world. Aedes aegypti, the principal vector of the virus, varies in both epidemiological and behavioral characteristics, which can be finely measured through DNA sequence comparison at the population level. Such knowledge of population differences can assist in the implementation of effective vector control strategies by allowing estimates of gene flow and adaptive genomic changes, which are important predictors of the spread of Wolbachia infection or insecticide resistance. As such, this study was undertaken to investigate the phylogenetic relationships of Ae. aegypti from Galle and Colombo, Sri Lanka, based on the ribosomal protein region that spans between two exons, in order to understand the geographical distribution of genetically distinct mosquito clades and its impact on mosquito control measures. A 320 bp DNA region spanning 681-930 bp, corresponding to the ribosomal protein, was sequenced in 62 Ae. aegypti larvae collected from Galle (N=30) and Colombo (N=32), Sri Lanka. The sequences were aligned using ClustalW, and the haplotypes were determined with DnaSP 5.10. Phylogenetic relationships among haplotypes were constructed using the maximum likelihood method under the Tamura 3-parameter model in MEGA 7.0.14, including three previously reported sequences of Australian (N=2) and Brazilian (N=1) Ae. aegypti. The bootstrap support was calculated using 1000 replicates, and the tree was rooted using Aedes notoscriptus (GenBank accession No. KJ194101). Among all sequences, nineteen different haplotypes were found, of which five were shared by 80% of the mosquitoes in the two populations. Seven haplotypes were unique to each population. The phylogenetic tree revealed two basal clades and a single derived clade. All observed haplotypes of the two Ae. aegypti populations were distributed across all three clades, indicating a lack of genetic differentiation between populations. The Brazilian Ae. aegypti haplotype and one of the Australian haplotypes were grouped together with the Sri Lankan basal haplotype in the same basal clade, whereas the other Australian haplotype was found in the derived clade. The phylogram showed that the Galle and Colombo Ae. aegypti populations are highly related to each other despite the large geographic distance (129 km), indicating substantial genetic similarity between them. This has probably arisen from passive migration assisted by human travel and trade over both land and water, as the two areas are bordered by the sea. In addition, the studied Sri Lankan mosquito populations were closely related to the Australian and Brazilian samples. This was probably caused by shipping between the three countries, as all of them are fully or partially enclosed by sea. For example, illegal fishing boats travelling to Australia by sea are perhaps a good means of transport for all life stages of mosquitoes from Sri Lanka. These findings indicate that extensive mosquito migration occurs not only between populations within the country but also between countries, which might be a main barrier to successful vector control measures.

Keywords: Aedes aegypti, dengue control, extensive mosquito migration, haplotypes, phylogeny, ribosomal protein

Procedia PDF Downloads 153

756 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark

Authors: B. Elshafei, X. Mao

Abstract:

The demand for renewable energy is increasing significantly, and major investments are being made in the wind power generation industry as a leading source of clean energy. The wind energy sector is entirely dependent on and driven by the prediction of wind speed, which by the nature of wind is highly stochastic. This study employs deep multi-fidelity Gaussian process regression to predict wind speeds for medium-term time horizons. Data from the RUNE experiment on the west coast of Denmark were provided by the Technical University of Denmark; they represent the wind speed across the study area for the period between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using the empirical wavelet transform (EWT) and of using the vector components of wind speed to increase the number of input data layers for data fusion with deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE). The results demonstrated a significant increase in prediction accuracy: using the vector components of the wind speed as additional predictors yields more accurate predictions than strategies that ignore them, reflecting the importance of including all sub-data and pre-processing signals in wind speed forecasting models.
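The two quantities named in the abstract, the vector components of wind speed and the RMSE used for comparison, can be sketched as follows. The abstract does not state which direction convention was used; the sketch assumes the common meteorological convention (direction is where the wind blows from, in degrees clockwise from north), and the function names are illustrative.

```python
import math


def wind_components(speed, direction_deg):
    """Split wind speed into eastward (u) and northward (v) components.

    Assumes the meteorological convention: `direction_deg` is the direction
    the wind blows FROM, measured clockwise from north.
    """
    theta = math.radians(direction_deg)
    u = -speed * math.sin(theta)
    v = -speed * math.cos(theta)
    return u, v


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# A 10 m/s westerly wind (from 270 degrees) blows toward the east.
print(wind_components(10.0, 270.0))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```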

Keywords: data fusion, Gaussian process regression, signal denoise, temporal extrapolation

Procedia PDF Downloads 107

755 Effectiveness Assessment of a Brazilian Larvicide on Aedes Control

Authors: Josiane N. Muller, Allan K. R. Galardo, Tatiane A. Barbosa, Evan P. Ferro, Wellington M. Dos Santos, Ana Paula S. A. Correa, Edinaldo C. Rego, Jose B. P. Lima

Abstract:

The susceptibility status of an insect population to any larvicide depends on several factors, including genetic constitution, environmental conditions and others. The mosquito Aedes aegypti is the primary vector of three important viral diseases: Zika, dengue, and chikungunya. The frequent outbreaks of these diseases in different parts of Brazil demonstrate the importance of testing the susceptibility of vectors in different environments. Since control of this mosquito leads to control of the diseases, vector control alternatives that take the different Brazilian environmental conditions into account are needed for effective action. The aim of this study was to evaluate a new commercial formulation of Bacillus thuringiensis israelensis (DengueTech: Brazilian innovative technology) in the Brazilian Legal Amazon, considering the climate conditions. Semi-field tests were conducted at the Institute of Scientific and Technological Research of the State of Amapa in two different environments, one in a shaded area and the other exposed to sunlight. The mosquito larvae were exposed to the larvicide concentration and a control; each group was tested in three containers of 40 liters each. To assess persistence, 50 third-instar larvae of an Aedes aegypti laboratory lineage (Rockefeller) and 50 larvae of Aedes aegypti collected in the municipality of Macapa, in Brazil’s Amapa state, were added weekly, and mortality was assessed after 24 hours. In total, 16 tests were performed, of which 12 were done with replacement of water (1/5 of the volume, three times per week). The product was considered effective at a mortality of ≥ 80%, as recommended by the World Health Organization. The results demonstrated that high water temperatures (26-35 °C) in the containers influenced the residual time of the product: the maximum effect achieved was 21 days in the shaded area, and the 60-day effectiveness expected according to the larvicide company was not found in any of the tests. The tests with and without water replacement did not show significant differences in the mortality rate. Considering the different environments and climate, these results highlight the need to test the larvicide and its effectiveness in specific environmental settings in order to identify the parameters required for better results. Thus, we see the importance of semi-field research that considers the local climate conditions for successful control of Aedes aegypti.

Keywords: Aedes aegypti, bioassay, larvicida, vector control

Procedia PDF Downloads 100

754 Sentiment Analysis of Chinese Microblog Comments: Comparison between Support Vector Machine and Long Short-Term Memory

Authors: Xu Jiaqiao

Abstract:

Text sentiment analysis is an important branch of natural language processing. This technology is widely used in public opinion analysis and web surfing recommendations. At present, mainstream sentiment analysis methods fall into three categories: those based on a sentiment dictionary, those based on traditional machine learning, and those based on deep learning. This paper mainly analyzes and compares the advantages and disadvantages of the SVM method of traditional machine learning and the Long Short-Term Memory (LSTM) method of deep learning in the field of Chinese sentiment analysis, using Chinese comments on Sina Microblog as the data set. First, this paper classifies and labels the original comment dataset obtained by a web crawler, and then uses Jieba word segmentation to segment the original dataset and remove stop words. After that, this paper extracts text feature vectors and builds document word vectors to facilitate the training of the model. Finally, the SVM and LSTM models are trained separately. The resulting accuracy of the LSTM model is 85.80%, while the accuracy of the SVM is 91.07%. At the same time, the LSTM needs only 2.57 seconds to run, while the SVM model needs 6.06 seconds. Therefore, this paper concludes that, compared with the SVM model, the LSTM model is less accurate but faster.

Keywords: sentiment analysis, support vector machine, long short-term memory, Chinese microblog comments

Procedia PDF Downloads 58

753 Insecticide Resistance Detection on Filarial Vector, Simulium (Simulium) nobile (Diptera: Simuliidae) in Malaysia

Authors: Chee Dhang Chen, Hiroyuki Takaoka, Koon Weng Lau, Poh Ruey Tan, Ai Chdon Chin, Van Lun Low, Abdul Aziz Azidah, Mohd Sofian-Azirun

Abstract:

The susceptibility status of Simulium (Simulium) nobile (Diptera: Simuliidae) adults obtained from Pahang, Malaysia was evaluated against 11 adulticides representing four major insecticide classes: organochlorines (DDT, dieldrin), organophosphates (malathion, fenitrothion), carbamates (bendiocarb, propoxur) and pyrethroids (etofenprox, deltamethrin, lambdacyhalothrin, permethrin, cyfluthrin). The adult bioassay was conducted according to the WHO standard protocol to determine insecticide susceptibility. Mortality at 24 h post-treatment was used as the indicator of susceptibility status. The results revealed that S. nobile was susceptible to propoxur, cyfluthrin and bendiocarb, with 100% mortality. S. nobile was resistant or showed some tolerance to lambdacyhalothrin and deltamethrin, with mortality ≥ 90% but < 98%. S. nobile populations in Pahang exhibited different levels of resistance to the 11 adulticides, with mortality ranging from 60.00 ± 10.00 to 100.00 ± 0.00. In conclusion, S. nobile populations in Pahang were susceptible to propoxur, cyfluthrin and bendiocarb. The susceptibility of S. nobile in descending order was propoxur, cyfluthrin > bendiocarb > deltamethrin > lambdacyhalothrin > permethrin > etofenprox > DDT > malathion > fenitrothion > dieldrin. Regular surveys should be conducted to monitor the susceptibility status of this insect vector in order to prevent further development of resistance.
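The mortality bands used in adult bioassays of this kind can be written as a small classification rule. The sketch below assumes the commonly cited WHO cut-offs (98% or more: susceptible; 90% to below 98%: suspected resistance requiring confirmation; below 90%: resistant), which match the 90% and 98% bounds quoted in the abstract; the function name and label strings are illustrative, not taken from the paper.

```python
def susceptibility_status(mortality_percent):
    """Classify a 24 h bioassay mortality using assumed WHO-style cut-offs."""
    if not 0.0 <= mortality_percent <= 100.0:
        raise ValueError("mortality must be a percentage between 0 and 100")
    if mortality_percent >= 98.0:
        return "susceptible"
    if mortality_percent >= 90.0:
        return "suspected resistance"   # needs confirmation by further tests
    return "resistant"


# Values spanning the mortality range reported in the abstract.
for mortality in (100.0, 95.0, 60.0):
    print(mortality, susceptibility_status(mortality))
```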

Keywords: black fly, adult bioassay, insecticide resistance, Malaysia

Procedia PDF Downloads 246

752 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

Systems that record patient care information, known as Electronic Medical Records (EMRs), and those that monitor patients' vital signs, such as heart rate, body temperature, and blood pressure, have been extremely valuable for the effectiveness of patient treatment. Several studies have used data from EMRs and patients' vital signs to predict illnesses. Among them, we highlight those that aim to predict, classify, or at least identify patterns of sepsis in patients under vital sign monitoring. Sepsis is an organ dysfunction caused by a patient's dysregulated response to an infection, and it affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Previous works usually combined medical, statistical, mathematical and computational models to develop detection methods for early prediction, aiming for higher accuracy with the smallest number of variables. Among other techniques, there are studies using survival analysis, expert systems, machine learning and deep learning that reached strong results. In our research, patients are modeled as points moving each hour in an n-dimensional space, where n is the number of vital signs (variables). These points can reach a sepsis target point after some time. For now, the sepsis target point is calculated using the median of all patients’ variables at sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector) and the second derivative (acceleration vector) of the variables to evaluate their behavior. We then construct a prediction model based on a Long Short-Term Memory (LSTM) network, including these derivatives as explanatory variables. The accuracy of the prediction 6 hours before the time of sepsis, considering only the vital signs, reached 83.24%; by including the position, velocity, and acceleration vectors, we obtained 94.96%. The data are collected from the Medical Information Mart for Intensive Care (MIMIC) database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60,000 patients.
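The velocity and acceleration vectors described in the abstract can be approximated from hourly readings with finite differences. The sketch below is one simple way to do this and is not necessarily the authors' calculation; the function names and the two-variable example (heart rate and temperature) are assumptions.

```python
def differences(series):
    """Hour-to-hour differences of a list of equal-length vectors."""
    return [
        [cur - prev for prev, cur in zip(before, after)]
        for before, after in zip(series, series[1:])
    ]


def kinematic_features(positions):
    """Approximate velocity (first difference) and acceleration (second)."""
    velocity = differences(positions)
    acceleration = differences(velocity)
    return velocity, acceleration


# Hypothetical hourly readings: [heart rate, body temperature].
positions = [[80.0, 36.0], [84.0, 36.5], [90.0, 37.5]]
velocity, acceleration = kinematic_features(positions)
print(velocity)
print(acceleration)
```

With hourly sampling the time step is 1, so the differences are already per-hour rates; each extra order of differencing shortens the series by one hour.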

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 95

751 DNA Barcoding for Identification of Dengue Vectors from Assam and Arunachal Pradesh: North-Eastern States in India

Authors: Monika Soni, Shovonlal Bhowmick, Chandra Bhattacharya, Jitendra Sharma, Prafulla Dutta, Jagadish Mahanta

Abstract:

Aedes aegypti and Aedes albopictus are considered the two major vectors of dengue virus. In North-east India, two states, viz. Assam and Arunachal Pradesh, are known to be highly endemic zones for dengue and Chikungunya viral infection. The taxonomic classification of medically important vectors is important for mapping actual evolutionary trends and for epidemiological studies. However, misidentification of mosquito species in field-collected specimens could negatively affect vector-borne disease control policy. DNA barcoding is a prominent method for recording available species and for detecting new additions and changes in population structure. In this study, a combined approach of morphological examination and the molecular technique of DNA barcoding was adopted to explore sequence variation in the mitochondrial cytochrome c oxidase subunit I (COI) gene within dengue vectors. The study has mapped the distribution of the dengue vectors in two states, i.e., Assam and Arunachal Pradesh, India. Approximately five hundred mosquito specimens were collected from different parts of the two states, and their morphological features were compared with the taxonomic keys. The detailed taxonomic study identified two species, Aedes aegypti and Aedes albopictus. Aedes aegypti comprised 66.6% of the specimens and was the dominant dengue vector species. The sequences obtained through the standard DNA barcoding protocol were compared with public databases, viz. GenBank and BOLD. All Aedes albopictus sequences showed 100% similarity, whereas Aedes aegypti sequences showed 99.77-100% similarity of the COI gene with those of the same species from different geographical locations, based on a BOLD database search.
Fifty-nine sequences of the same and related taxa from different dengue-prevalent geographical regions were retrieved from the NCBI and BOLD databases to determine the evolutionary distance model based on phylogenetic analysis. Neighbor-Joining (NJ) and Maximum Likelihood (ML) phylogenetic trees were constructed in MEGA6.06 software with 1000 bootstrap replicates using the Kimura 2-parameter model. Data were analyzed for sequence divergence; intraspecific divergence ranged from 0.0 to 2.0% and interspecific divergence ranged from 11.0 to 12.0%. The transitional and transversional substitutions were tested individually. The sequences were deposited in the NCBI GenBank database. This is claimed to be the first DNA barcoding analysis of Aedes mosquitoes from the North-eastern states of India, and it also confirmed the range expansion of two important mosquito species. Overall, this study provides insight into the molecular ecology of the dengue vectors of North-eastern India, which will enhance understanding and help improve the existing entomological surveillance and vector incrimination program.
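
The divergence figures in this abstract rest on the Kimura 2-parameter distance, which treats transitions (A<->G, C<->T) and transversions separately. The following is a minimal sketch of that distance for two aligned sequences, not the MEGA implementation; the function name `k2p_distance` and the choice to skip sites containing gaps or ambiguous bases are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper: sites where either sequence has a gap or an
    ambiguous base are skipped (an assumption, not taken from the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); math.log raises ValueError
    # when the sequences are too divergent for the model to apply.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Under this sketch, an intraspecific divergence of 2.0% corresponds to a returned value of about 0.02.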

Keywords: COI, dengue vectors, DNA barcoding, molecular identification, North-east India, phylogenetics

Procedia PDF Downloads 264

750 Classifier for Liver Ultrasound Images

Authors: Soumya Sajjan

Abstract:

Liver cancer is the most common cancer disease worldwide in men and women, and is one of the few cancers still on the rise. Liver disease is the 4th leading cause of death. According to new NHS (National Health Service) figures, deaths from liver diseases have reached record levels, rising by 25% in less than a decade; heavy drinking, obesity, and hepatitis are believed to be behind the rise. In this study, we focus on the development of a diagnostic classifier for ultrasound liver lesions. Ultrasound (US) sonography is an easy-to-use and widely popular imaging modality because of its ability to visualize many human soft tissues and organs without any harmful effect. This paper provides an overview of the underlying concepts, along with algorithms for processing liver ultrasound images. Ultrasound liver lesion images naturally contain considerable speckle noise, so developing a classifier for them is a challenging task. We take a fully automatic machine learning approach to developing this classifier. First, we segment the liver image by calculating textural features from the co-occurrence matrix and the run-length method. For classification, a Support Vector Machine (SVM) is used, based on the risk bounds of statistical learning theory. The textural features from the different feature methods are given as input to the SVM individually. Performance analysis on the training and test datasets is carried out separately using the SVM model. Whenever an ultrasound liver lesion image is given to the SVM classifier system, the features are calculated and the image is classified as a normal or diseased liver. We hope the result will help physicians identify liver cancer with a non-invasive method.
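
The co-occurrence matrix mentioned in this abstract can be illustrated with a short sketch. It counts how often a gray level i is followed by a gray level j at a fixed pixel offset, then derives one common texture feature (contrast). The function names, the default horizontal offset, and the choice of contrast are illustrative assumptions; the abstract does not state which offsets or features the author used.

```python
def cooccurrence_matrix(image, levels, d_row=0, d_col=1):
    """Count gray-level pairs (i, j) separated by the offset (d_row, d_col).

    `image` is a list of rows of integer gray levels in [0, levels).
    The default offset (one pixel to the right) is an assumption.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: sum of (i - j)^2 weighted by normalized pair counts."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total
```

Features such as this one would then form the input vector for a classifier like the SVM described above.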

Keywords: segmentation, Support Vector Machine, ultrasound liver lesion, co-occurrence matrix

Procedia PDF Downloads 379

749 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment

Authors: Arindam Chaudhuri

Abstract:

Classification of data has been actively used as one of the most effective and efficient means of conveying knowledge and information to users. The primary focus has always been on techniques for extracting useful knowledge from data such that returns are maximized. With the emergence of huge datasets, existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding the characteristics of massive datasets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in a cloud environment. The classification is performed by PFRSVM using a hyperbolic tangent kernel. The fuzzy rough set model accounts for sensitivity to noisy samples and handles imprecision in training samples, bringing robustness to the results. The membership function is a function of the center and radius of each class in feature space and is represented with a kernel. It plays an important role in sampling the decision surface. The success of PFRSVM is governed by the choice of appropriate parameter values. The training samples are either linearly or nonlinearly separable. Different input points make unique contributions to the decision surface. The algorithm is parallelized with a view to reducing training times. The system is built on a support vector machine library using the Hadoop implementation of MapReduce. The algorithm is tested on large datasets to check its feasibility and convergence. The performance of the classifier is also assessed in terms of the number of support vectors. The challenges encountered in implementing big data classification in machine learning frameworks are also discussed. The experiments are done on the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels.
The effect of variability on the prediction and generalization of PFRSVM is examined with respect to values of the parameter C. PFRSVM effectively resolves the effects of outliers, imbalance, and overlapping class problems, generalizes to unseen data, and relaxes the dependency between features and labels. The average classification accuracy of PFRSVM is better than that of other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real datasets clearly demonstrate the superiority of the proposed technique.

Keywords: FRSVM, Hadoop, MapReduce, PFRSVM

Procedia PDF Downloads 458

748 The Effect Analysis of Monetary Instruments through Islamic Banking Financing Channel toward Economic Growth in Indonesia, Period January 2008-December 2015

Authors: Sobar M. Johari, Ida Putri Anjarsari

Abstract:

In the transmission of monetary instruments towards the real sector of the economy, Bank Indonesia, as the monetary authority, has developed the Islamic Bank Indonesia Certificate (abbreviated as SBIS) as an instrument in Islamic open market operations. One of the monetary transmission channels is the financing channel, in which the funds are used as a source of banking financing. This study aims to analyse the impact of Islamic monetary instruments on output, or economic growth. The data used in this research are taken from Bank Indonesia and the Central Board of Statistics for the period from January 2008 to December 2015. The study employs the Granger Causality Test, the Vector Error Correction Model (VECM), the Impulse Response Function (IRF) technique, and Forecast Error Variance Decomposition (FEVD) as its analytical methods. The results show that, first, the transmission mechanism of the banking financing channel is not linked to output. Second, the VECM estimation results show that SBIS, PUAS, and FIN have a significant long-term impact on output. When there is a monetary shock, output or economic growth can recover and stabilize in the short term. FEVD results show that Islamic banking financing contributes 1.33 percent to the increase in economic growth.

Keywords: Islamic monetary instrument, Islamic banking financing channel, economic growth, Vector Error Correction Model (VECM)

Procedia PDF Downloads 243

747 Government Final Consumption Expenditure and Household Consumption Expenditure NPISHS in Nigeria

Authors: Usman A. Usman

Abstract:

Undeniably, unlike the Classical side, the Keynesian perspective of the aggregate demand side indeed has a significant position in the policy, growth, and welfare of Nigeria due to government involvement and ineffective demand of the population living with poor per capita income. This study seeks to investigate the effect of Government Final Consumption Expenditure and Financial Deepening on Households and NPISHs Final Consumption Expenditure using data on Nigeria from 1981 to 2019. This study employed the ADF stationarity test, the Johansen cointegration test, and the Vector Error Correction Model. The results of the study revealed that the coefficient of government final consumption expenditure has a positive effect on household consumption expenditure in the long run. There is a long-run and short-run relationship between gross fixed capital formation and household consumption expenditure. The coefficients of cpsgdp (a proxy for financial deepening) and gross fixed capital formation indicate a negative impact on household final consumption expenditure. The coefficient of money supply, lm2gdp, which is another proxy for financial deepening, and the coefficient of FDI have a positive effect on household final consumption expenditure in the long run. Therefore, this study recommends that gross fixed capital formation stimulates household consumption expenditure; a legal framework to support investment is a panacea for increasing household income and consumption and reducing poverty in Nigeria. Therefore, this should be a key central component of policy.

Keywords: government final consumption expenditure, household consumption expenditure, vector error correction model, cointegration

Procedia PDF Downloads 19

746 The Moment of the Optimal Average Length of the Multivariate Exponentially Weighted Moving Average Control Chart for Equally Correlated Variables

Authors: Edokpa Idemudia Waziri, Salisu S. Umar

Abstract:

Hotelling’s T^2 is a well-known statistic for detecting a shift in the mean vector of a multivariate normal distribution. Control charts based on T^2 have been widely used in statistical process control for monitoring a multivariate process. Although it is a powerful tool, the T^2 statistic is deficient when the shift to be detected in the mean vector of a multivariate process is small and consistent. The Multivariate Exponentially Weighted Moving Average (MEWMA) control chart is one of the control statistics used to overcome this drawback of Hotelling’s T^2 statistic. In this paper, the probability distribution of the Average Run Length (ARL) of the MEWMA control chart, when the quality characteristics exhibit substantial cross-correlation and when the process is in-control and out-of-control, was derived using the Markov Chain algorithm. The probability functions and the moments of the run length distribution were also derived, and they were consistent with some existing results for the in-control and out-of-control situations. By simulation, the procedure identified a class of ARL for the MEWMA control chart when the process is in-control and out-of-control. From our study, it was observed that the MEWMA scheme is quite adequate for detecting a small shift and is a good way to improve the quality of goods and services in a multivariate situation. It was also observed that as the in-control average run length ARL0 or the number of variables (p) increases, the optimum value ARLopt increases asymptotically, and as the magnitude of the shift σ increases, the optimal ARLopt decreases. Finally, we use an example from the literature to illustrate our method and demonstrate its efficiency.
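
For readers unfamiliar with the chart, the MEWMA statistic itself can be sketched in a few lines. This is the standard textbook form of the statistic, not the authors' Markov chain derivation of the run-length distribution. It assumes an in-control mean of zero, unit variances, and a common correlation `rho` between every pair of variables (the equally correlated case in the title); the function name and arguments are hypothetical.

```python
def mewma_statistics(observations, lam, rho):
    """Return the MEWMA statistic T2_t for each observation vector.

    Assumptions (for illustration only): in-control mean zero, unit
    variances, and a common correlation `rho`, so that
    Sigma = (1 - rho) * I + rho * J.
    """
    p = len(observations[0])
    if not (0 < lam <= 1):
        raise ValueError("smoothing parameter must satisfy 0 < lam <= 1")
    if p > 1 and not (-1 / (p - 1) < rho < 1):
        raise ValueError("rho must keep Sigma positive definite")

    z = [0.0] * p  # Z_0 = 0
    statistics = []
    for t, x in enumerate(observations, start=1):
        # Z_t = lam * X_t + (1 - lam) * Z_{t-1}
        z = [lam * xi + (1 - lam) * zi for xi, zi in zip(x, z)]
        # Exact covariance of Z_t: scale * Sigma
        scale = lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        # Closed-form quadratic form z' Sigma^{-1} z for the
        # equicorrelation matrix (Sherman-Morrison identity)
        sum_sq = sum(v * v for v in z)
        total = sum(z)
        quad = (sum_sq - rho / (1 + (p - 1) * rho) * total ** 2) / (1 - rho)
        statistics.append(quad / scale)
    return statistics
```

The chart signals when a value exceeds a control limit chosen to give the desired in-control ARL; with `lam = 1` the statistic reduces to Hotelling's T^2 for each observation.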

Keywords: average run length, markov chain, multivariate exponentially weighted moving average, optimal smoothing parameter

Procedia PDF Downloads 389

745 Evaluation of Machine Learning Algorithms and Ensemble Methods for Prediction of Students’ Graduation

Authors: Soha A. Bahanshal, Vaibhav Verdhan, Bayong Kim

Abstract:

Graduation rates at six-year colleges are becoming a more essential indicator for incoming students and for university rankings. Predicting student graduation is extremely beneficial to schools and has huge potential for targeted intervention. It is important for educational institutions since it enables the development of strategic plans that will assist or improve students' performance in achieving their degrees, that is, in graduating on time (GOT). Machine learning techniques offer a first step and a helping hand in extracting useful information from these data and gaining insights into the prediction of students' progress and performance. Data analysis and visualization techniques are applied to understand and interpret the data. The data used for the analysis cover students in science majors who graduated within 6 years in the academic year 2017-2018. This analysis can be used to predict the graduation of students in the next academic year. Different predictive models, such as logistic regression, decision trees, support vector machines, Random Forest, Naïve Bayes, and KNeighborsClassifier, are applied to predict whether a student will graduate. These classifiers were evaluated with 5-fold cross-validation. The performance of these classifiers was compared based on accuracy. The results indicated that the ensemble classifier achieves the best accuracy, about 91.12%. This GOT prediction model would hopefully be useful to university administration and academics in developing measures for assisting and boosting students' academic performance and ensuring they graduate on time.
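
The evaluation with five folds mentioned in this abstract can be sketched as a plain index split. This is a minimal illustration of k-fold partitioning only, assuming unshuffled, contiguous folds; the helper name `k_fold_indices` is hypothetical, and the abstract does not say whether the authors shuffled or stratified their data.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k (train, test) index pairs.

    Folds are contiguous and unshuffled (an assumption for illustration);
    the first n_samples % k folds receive one extra sample.
    """
    if not (2 <= k <= n_samples):
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds
```

Each classifier would be trained on the `train` indices and scored on the `test` indices of every fold, with the five accuracies averaged.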

Keywords: prediction, decision trees, machine learning, support vector machine, ensemble model, student graduation, GOT graduate on time

Procedia PDF Downloads 46

744 Government Final Consumption Expenditure Financial Deepening and Household Consumption Expenditure NPISHs in Nigeria

Authors: Usman A. Usman

Abstract:

Undeniably, unlike the Classical side, the Keynesian perspective of the aggregate demand side indeed has a significant position in the policy, growth, and welfare of Nigeria due to government involvement and ineffective demand of the population living with poor per capita income. This study seeks to investigate the effect of Government Final Consumption Expenditure and Financial Deepening on Households and NPISHs Final Consumption Expenditure using data on Nigeria from 1981 to 2019. This study employed the ADF stationarity test, the Johansen cointegration test, and the Vector Error Correction Model. The results of the study revealed that the coefficient of government final consumption expenditure has a positive effect on household consumption expenditure in the long run. There is a long-run and short-run relationship between gross fixed capital formation and household consumption expenditure. The coefficients of cpsgdp (a proxy for financial deepening) and gross fixed capital formation indicate a negative impact on household final consumption expenditure. The coefficient of money supply, lm2gdp, which is another proxy for financial deepening, and the coefficient of FDI have a positive effect on household final consumption expenditure in the long run. Therefore, this study recommends that gross fixed capital formation stimulates household consumption expenditure; a legal framework to support investment is a panacea for increasing household income and consumption and reducing poverty in Nigeria. Therefore, this should be a key central component of policy.

Keywords: household, government expenditures, vector error correction model, johansen test

Procedia PDF Downloads 25

743 Using Geo-Statistical Techniques and Machine Learning Algorithms to Model the Spatiotemporal Heterogeneity of Land Surface Temperature and its Relationship with Land Use Land Cover

Authors: Javed Mallick

Abstract:

In metropolitan areas, rapid changes in land use and land cover (LULC) have ecological and environmental consequences. Saudi Arabia's cities have experienced tremendous urban growth since the 1990s, resulting in urban heat islands, groundwater depletion, air pollution, loss of ecosystem services, and so on. This study examines the variance and heterogeneity in land surface temperature (LST) caused by LULC changes in Abha-Khamis Mushyet, Saudi Arabia, from 1990 to 2020. LULC was mapped using the support vector machine (SVM). The mono-window algorithm was used to calculate the LST. To identify LST clusters, the local indicator of spatial associations (LISA) model was applied to spatiotemporal LST maps. In addition, the parallel coordinate plot (PCP) method was used to investigate the relationship between LST clusters and urban biophysical variables as a proxy for LULC. According to LULC maps, urban areas increased by more than 330% between 1990 and 2018. Between 1990 and 2018, built-up areas had an 83.6% transitional probability. Furthermore, between 1990 and 2020, vegetation and agricultural land were converted into built-up areas at rates of 17.9% and 21.8%, respectively. Uneven LULC changes in built-up areas result in more LST hotspots. LST hotspots were associated with high NDBI but not with NDWI or NDVI. This study could assist policymakers in developing mitigation strategies for urban heat islands.
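
The biophysical indices named in this abstract (NDBI, NDWI, NDVI) are all normalized band differences. The sketch below shows their usual definitions for single pixel reflectance values. The NDWI shown is the green/near-infrared form, one of two common definitions; the abstract does not say which the author used, and the function names and zero-denominator convention are assumptions.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b), or 0.0 when both bands are zero.

    Returning 0.0 for a zero denominator is a convention chosen here
    for illustration, not taken from the abstract.
    """
    denominator = band_a + band_b
    if denominator == 0:
        return 0.0
    return (band_a - band_b) / denominator


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: high over green vegetation."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Normalized Difference Built-up Index: high over built-up surfaces."""
    return normalized_difference(swir, nir)


def ndwi(green, nir):
    """Normalized Difference Water Index (green/NIR form): high over water."""
    return normalized_difference(green, nir)
```

A pixel with high shortwave-infrared and low near-infrared reflectance gives a positive NDBI, which is the kind of surface the abstract associates with LST hotspots.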

Keywords: land use land cover mapping, land surface temperature, support vector machine, LISA model, parallel coordinate plot

Procedia PDF Downloads 44

742 Assessing Functional Structure in European Marine Ecosystems Using a Vector-Autoregressive Spatio-Temporal Model

Authors: Katyana A. Vert-Pre, James T. Thorson, Thomas Trancart, Eric Feunteun

Abstract:

In marine ecosystems, spatial and temporal species structure is an important component of ecosystems’ response to anthropogenic and environmental factors. Although spatial distribution patterns and time series of fish abundance have been studied in the past, little research has been devoted to the joint dynamic spatio-temporal functional patterns in marine ecosystems and their use in multispecies management and conservation. Each species represents a function in the ecosystem, and the distribution of these species might not be random. A heterogeneous functional distribution will make an ecosystem more resilient to external factors. Applying a Vector-Autoregressive Spatio-Temporal (VAST) model for count data, we estimate the spatio-temporal distribution, shift in time, and abundance of 140 species of the Eastern English Channel, Bay of Biscay, and Mediterranean Sea. From the model outputs, we determined spatio-temporal clusters, calculating p-values for hierarchical clustering via multiscale bootstrap resampling. Then, we designed a functional map given the defined clusters. We found that the species distribution within the ecosystem was not random. Indeed, species evolved in space and time in clusters. Moreover, these clusters remained similar over time because species of the same cluster often shifted in sync, keeping the overall structure of the ecosystem similar over time. Knowing the co-existing species within these clusters could help with predicting the distribution and abundance of data-poor species. Further analysis is being performed to assess the ecological functions represented in each cluster.

Keywords: cluster distribution shift, European marine ecosystems, functional distribution, spatio-temporal model

Procedia PDF Downloads 164

741 Human Development Outcomes and Macroeconomic Indicators Nexus in Nigeria: An Empirical Investigation

Authors: Risikat Oladoyin S. Dauda, Onyebuchi Iwegbu

Abstract:

This study investigates the response of human development outcomes to selected macroeconomic indicators in Nigeria. Human development outcomes are measured by the human development index, while the selected macroeconomic variables are inflation rate, real interest rate, government capital expenditure, real exchange rate, current account balance, and savings. The Structural Vector Autoregression (SVAR) technique is employed to examine the response of the human development index to macroeconomic shocks. The results from the forecast error variance decomposition and impulse-response analysis reveal that the fiscal policy (government capital expenditure) shock is the greatest determinant of human development outcomes. This result reiterates the role which the government plays in improving the welfare of the citizenry. The fiscal policy tool is pivotal in human development, which comes in the form of investment in education, health, housing, and infrastructure. A further conclusion drawn from this study is that the human development outcome responds positively and significantly to shocks from the real interest rate, a monetary policy transmission variable, and that this response is felt most in the short run. The policy implication of this study is that if capital budget implementation falls below expectations, human development will be endangered. Hence, efforts should be made to ensure that full implementation and appraisal of government capital expenditure is treated as sacrosanct, as any shock to such a plan endangers the human development outcome.

Keywords: human development outcome, macroeconomic outcomes, structural vector autoregression, SVAR

Procedia PDF Downloads 131

740 Direct CP Violation in Baryonic B-Hadron Decays

Authors: C. Q. Geng, Y. K. Hsiao

Abstract:

We study direct CP-violating asymmetries (CPAs) in the baryonic B decays of B- -> p\bar{p}M and the Λb decays of Λb -> pM and Λb -> J/ΨpM with M = π-, K-, ρ-, K*-, based on the generalized factorization method in the standard model (SM). In particular, we show that the CPAs in the vector modes of B- -> p\bar{p}K* and Λb -> pK*- can be as large as 20%. We also discuss the simplest purely baryonic decays of Λb -> p\bar{p}n, p\bar{p}Λ, Λ\bar{p}Λ, and Λ\bar{Λ}Λ. We point out that some of the CPAs are promising to be measured by current as well as future B facilities.

Keywords: CP violation, B decays, baryonic decays, Λb decays

Procedia PDF Downloads 230

739 Efficiency of Robust Heuristic Gradient Based Enumerative and Tunneling Algorithms for Constrained Integer Programming Problems

Authors: Vijaya K. Srivastava, Davide Spinello

Abstract:

This paper presents the performance of two robust gradient-based heuristic optimization procedures, based on 3ⁿ enumeration and on a tunneling approach, that seek the global optimum of constrained integer problems. Both procedures consist of two distinct phases for locating the global optimum of integer problems with a linear or non-linear objective function subject to linear or non-linear constraints. In both procedures, the first phase finds a local minimum of the function using the gradient approach, coupled with hemstitching moves when a constraint is violated in order to return the search to the feasible region. In the second phase, the first procedure examines 3ⁿ integer combinations on the boundary of and within the hypercube volume neighboring the result from the first phase, while the second procedure constructs a tunneling function at the local minimum of the first phase so as to find another point on the other side of the barrier where the function value is approximately the same. In the next cycle, the search for the global optimum commences again in both procedures, using this newly found point as the starting vector. The search continues and is repeated for various step sizes along the function gradient, as well as along the vector normal to the violated constraints, until no improvement in the optimum value is found. The results from both proposed optimization methods are presented and compared with those provided by the popular MS Excel Solver, which is included in the MS Office suite, and with other published results.
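
The 3ⁿ enumeration step described in this abstract can be sketched as follows. The sketch assumes the hypercube spans one integer step (-1, 0, +1) in each of the n coordinates around the phase-one result, which gives exactly 3ⁿ candidates; the abstract does not state the hypercube size, and the function name, objective, and feasibility test are hypothetical placeholders.

```python
from itertools import product


def best_integer_neighbor(center, objective, is_feasible):
    """Examine the 3^n integer points within one step of `center`.

    `center` is a tuple of integers (e.g. the rounded phase-one minimum).
    Returns (point, value) for the best feasible candidate, or
    (None, inf) if no candidate satisfies the constraints.
    """
    best_point, best_value = None, float("inf")
    for offsets in product((-1, 0, 1), repeat=len(center)):
        candidate = tuple(c + o for c, o in zip(center, offsets))
        if not is_feasible(candidate):
            continue  # skip points that violate a constraint
        value = objective(candidate)
        if value < best_value:
            best_point, best_value = candidate, value
    return best_point, best_value
```

A possible use, with a made-up quadratic objective and one linear constraint:

```python
point, value = best_integer_neighbor(
    (2, 0),
    objective=lambda v: (v[0] - 3) ** 2 + (v[1] + 1) ** 2,
    is_feasible=lambda v: v[0] + v[1] <= 1,
)
```

Because the candidate count grows as 3ⁿ, this exhaustive step is practical only for problems with a modest number of variables.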

Keywords: constrained integer problems, enumerative search algorithm, Heuristic algorithm, Tunneling algorithm

Procedia PDF Downloads 302

738 Analyzing the Results of Buildings Energy Audit by Using Grey Set Theory

Authors: Tooraj Karimi, Mohammadreza Sadeghi Moghadam

Abstract:

Grey set theory has the advantage of using fewer data to analyze many factors, and it is therefore more appropriate for system studies than traditional statistical regression, which requires massive data, normally distributed data, and few variant factors. In this paper, grey clustering and the entropy of the coefficient vector of grey evaluations are used to analyze energy consumption in buildings of the Oil Ministry in Tehran. This article analyzes the results of energy audit reports and identifies the most favorable characteristics of the system, which is the energy consumption of buildings, and the most favorable factors affecting these characteristics, in order to modify and improve them. According to the results of the model, ‘the real Building Load Coefficient’ has been selected as the most important system characteristic, and ‘uncontrolled area of the building’ has been diagnosed as the most favorable factor, having the greatest effect on the energy consumption of a building. Grey clustering has been used in this study for two purposes. First, all the building variables related to the energy audit are clustered into two main groups of indicators, and the number of variables is reduced. Second, grey clustering with variable weights has been used to classify all buildings into three categories named ‘no standard deviation’, ‘low standard deviation’ and ‘non-standard’. The entropy of the coefficient vector of grey evaluations is calculated to investigate the greyness of the results. It shows that among the 38 buildings surveyed in terms of energy consumption, 3 cases are in the standard group, 24 cases are in the ‘low standard deviation’ group, and 11 buildings are completely non-standard. In addition, the clustering greyness of 13 buildings is less than 0.5, and the average uncertainty of the clustering results is 66%.

Keywords: energy audit, grey set theory, grey incidence matrixes, grey clustering, Iran oil ministry

Procedia PDF Downloads 341

737 Evaluation of Ensemble Classifiers for Intrusion Detection

Authors: M. Govindarajan

Abstract:

One of the major developments in machine learning in the past decade is the ensemble method, which finds a highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed, with a homogeneous ensemble classifier using bagging and a heterogeneous ensemble classifier using arcing, and their performance is analyzed in terms of accuracy. A classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by means of standard datasets for intrusion detection. The proposed approach consists of three main parts: a preprocessing phase, a classification phase, and a combining phase. A wide range of comparative experiments is conducted on standard datasets for intrusion detection. The performance of the proposed homogeneous and heterogeneous ensemble classifiers is compared to the performance of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include error-correcting output codes (ECOC) and Dagging, and the heterogeneous ensemble methods include majority voting and stacking. The proposed ensemble methods provide a significant improvement in accuracy compared to individual classifiers; the proposed bagged RBF and SVM perform significantly better than ECOC and Dagging, and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Heterogeneous models also exhibit better results than homogeneous models on standard datasets for intrusion detection.

Keywords: data mining, ensemble, radial basis function, support vector machine, accuracy

Procedia PDF Downloads 220

736 A Review of Research on Pre-training Technology for Natural Language Processing

Authors: Moquan Gong

Abstract:

In recent years, with the rapid development of deep learning, pre-training technology for natural language processing has made great progress. The field of natural language processing long used word vector methods such as Word2Vec to encode text. These word vector methods can also be regarded as static pre-training techniques. However, this context-free text representation brings very limited improvement to subsequent natural language processing tasks and cannot solve the problem of word polysemy. ELMo proposes a context-sensitive text representation method that can effectively handle polysemy. Since then, pre-trained language models such as GPT and BERT have been proposed one after another. Among them, the BERT model has significantly improved performance on many typical downstream tasks, greatly promoting technological development in the field, and natural language processing has since entered the era of dynamic pre-training technology. Since then, a large number of pre-trained language models based on BERT and XLNet have continued to emerge, and pre-training has become an indispensable mainstream technology in the field of natural language processing. This article first gives an overview of pre-training technology and its development history and introduces in detail the classic pre-training technology in the field of natural language processing, including early static pre-training technology and classic dynamic pre-training technology. It then briefly sorts out a series of enlightening pre-training technologies, including improved models based on BERT and XLNet. On this basis, it analyzes the problems faced by current pre-training technology research and, finally, looks forward to the future development trend of pre-training technology.

Keywords: natural language processing, pre-training, language model, word vectors

Procedia PDF Downloads 17