Search results for: Bayesian filtering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 628

118 Improving 99mTc-tetrofosmin Myocardial Perfusion Images by Time Subtraction Technique

Authors: Yasuyuki Takahashi, Hayato Ishimura, Masao Miyagawa, Teruhito Mochizuki

Abstract:

Quantitative measurement of myocardial perfusion is possible with single photon emission computed tomography (SPECT) using a semiconductor detector. However, accumulation of 99mTc-tetrofosmin in the liver may make it difficult to assess perfusion accurately in the inferior myocardium. Our idea is to reduce the high accumulation in the liver by using dynamic SPECT imaging and a technique called time subtraction. We evaluated the performance of a new SPECT system with a cadmium-zinc-telluride solid-state semiconductor detector (Discovery NM 530c; GE Healthcare). Our system acquired list-mode raw data over 10 minutes for a typical patient. From the data, ten SPECT images were reconstructed, one for every minute of acquired data. Reconstruction with the semiconductor detector was based on an implementation of a 3-D iterative Bayesian reconstruction algorithm. We studied 20 patients with coronary artery disease (mean age 75.4 ± 12.1 years; range 42-86; 16 males and 4 females). In each subject, 259 MBq of 99mTc-tetrofosmin was injected intravenously. We performed both a phantom and a clinical study using dynamic SPECT. An approximation to a liver-only image is obtained by reconstructing an image from the early projections, during which the liver accumulation dominates (0.5~2.5 minute SPECT image minus 5~10 minute SPECT image). The extracted liver-only image is then subtracted from a later SPECT image that shows both the liver and the myocardial uptake (5~10 minute SPECT image minus liver-only image). Time subtraction of the liver was possible in both the phantom and the clinical study, and the visualization of the inferior myocardium was improved. In past reports, apparently higher accumulation in the myocardium caused by overlap with the liver could not be diagnosed. Using our time subtraction method, the image quality of the 99mTc-tetrofosmin myocardial SPECT image is considerably improved.
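The two subtraction steps described in this abstract can be sketched on toy data. This is only an illustration under stated assumptions: images are flat lists of voxel counts, the function names are hypothetical, and clipping negative differences to zero is an assumption that the abstract does not mention.

```python
def subtract_images(a, b):
    """Voxel-wise a - b for two equally sized flat images.

    Negative differences are clipped to zero (an assumption, not from the abstract).
    """
    if len(a) != len(b):
        raise ValueError("images must have the same number of voxels")
    return [max(x - y, 0.0) for x, y in zip(a, b)]


def time_subtraction(early_image, late_image):
    """Two-step time subtraction as outlined in the abstract.

    Step 1: liver-only estimate = early image - late image.
    Step 2: corrected image     = late image - liver-only estimate.
    """
    liver_only = subtract_images(early_image, late_image)
    return subtract_images(late_image, liver_only)


# Toy voxels: [liver, inferior myocardium, background]
early = [100.0, 10.0, 1.0]   # liver uptake dominates early
late = [60.0, 50.0, 1.0]     # liver and myocardium both visible
corrected = time_subtraction(early, late)
```

In this toy case the liver voxel drops from 60 to 20 counts while the myocardial voxel keeps its 50 counts, which is the qualitative effect the abstract reports.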

Keywords: 99mTc-tetrofosmin, dynamic SPECT, time subtraction, semiconductor detector

Procedia PDF Downloads 306
117 Clouds Influence on Atmospheric Ozone from GOME-2 Satellite Measurements

Authors: S. M. Samkeyat Shohan

Abstract:

This study is mainly focused on the determination and analysis of the photolysis rate of atmospheric, specifically tropospheric, ozone as a function of cloud properties throughout the year 2007. The observational basis for ozone concentrations and cloud properties is the measurement data set of the Global Ozone Monitoring Experiment-2 (GOME-2) sensor on board the polar orbiting Metop-A satellite. Two different spectral ranges are used: ozone total columns are calculated from the wavelength window 325 – 335 nm, while cloud properties, such as cloud top height (CTH) and cloud optical thickness (COT), are derived from the absorption band of molecular oxygen centered at 761 nm. Cloud fraction (CF) is derived from measurements in the ultraviolet, visible and near-infrared range of GOME-2. First, ozone concentrations above clouds are derived from ozone total columns, subtracting the contribution of stratospheric ozone and filtering those satellite measurements which have thin and low clouds. Then, the values of ozone photolysis derived from observations are compared with theoretical modeled results, in the latitudinal belts 5˚N-5˚S and 20˚N-20˚S, as a function of CF and COT. In general, good agreement is found between the data and the model, supporting both the quality of the space-borne ozone and cloud properties and the modeling theory of the ozone photolysis rate. The discrepancies found can, however, amount to approximately 15%. Latitudinal seasonal changes of the photolysis rate of ozone are found to be negatively correlated with changes in upper-tropospheric ozone concentrations only in the autumn and summer months within the northern and southern tropical belts, respectively. This fact points to the entangled roles of temperature and nitrogen oxides in ozone production, which are superimposed on its sole photolysis induced by thick and high clouds in the tropics.

Keywords: cloud properties, photolysis rate, stratospheric ozone, tropospheric ozone

Procedia PDF Downloads 188
116 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost of obtaining protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, helped drive the development of computational methods to do so. One such approach is the prediction of three-dimensional structure from the residue chain; however, this has been proven to be an NP-hard problem, due to the complexity of the process, as explained by the Levinthal paradox. An alternative solution is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial Intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), and support vector machines (SVM), among others, have been used to predict protein secondary structure. Due to their good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recently published methods that use this technique have, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, support vector machine prediction methods have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, is not a trivial problem, since different training sets, validation techniques, and other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physicochemical Features to Predict Protein Secondary Structure (2013). The developed ANN method uses the same training and testing process that Huang used to validate his method, which comprises the CB513 protein data set and three-fold cross-validation, so that the comparative analysis can be made by directly comparing the statistical results of each method.
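The Q3 measure and the three-fold protocol mentioned above can be illustrated with a short sketch. It assumes the usual definition of Q3 as the percentage of residues whose three-state label (H, E or C) is predicted correctly; the function names and the way the data are split are hypothetical and are not taken from the study.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose H/E/C state is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def three_fold_splits(items):
    """Yield (train, test) pairs for three-fold cross-validation.

    Items are dealt round-robin into three folds; each fold is the
    test set exactly once (a simple split chosen for illustration).
    """
    folds = [items[i::3] for i in range(3)]
    for k in range(3):
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, folds[k]


# One mismatch in eight residues gives a Q3 of 87.5%.
example_q3 = q3_accuracy("HHHEECCC", "HHHEEECC")
splits = list(three_fold_splits(list(range(6))))
```

Reporting Q3 per test fold and then averaging over the three folds is one common way to obtain a single figure that can be compared between methods trained on the same data set.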

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 589
115 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant

Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani

Abstract:

Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes. Their ability to learn from large amounts of data makes them particularly effective models. The primary obstacle in improving the performance of these models is the careful choice of suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using hybrid Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) to obtain an optimised deep neural network (DNN) model. The plant utilizes an Upflow Anaerobic Sludge Blanket (UASB) digester that treats industrial wastewater from soft drinks and breweries. The digester has a working volume of 1574 m³ and a total volume of 1914 m³. Its internal diameter and height are 19 m and 7.14 m, respectively. The data preprocessing was conducted with meticulous attention to preserving data quality while avoiding data reduction. Three normalization techniques were applied to the pre-processed data (MinMaxScaler, RobustScaler and StandardScaler) and compared with the non-normalized data. The RobustScaler approach has strong predictive ability for estimating the volume of biogas produced. The highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.
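For readers unfamiliar with the quantities reported here, the sketch below shows median/interquartile-range scaling in the spirit of RobustScaler together with the three evaluation metrics. It is a plain standard-library illustration with hypothetical function names, not the pipeline used in the study, and the quartile method is an assumption.

```python
import math
import statistics


def robust_scale(values):
    """Centre on the median and divide by the interquartile range.

    Quartiles use the 'inclusive' method (an assumption); the data
    must have a non-zero interquartile range.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    return [(v - median) / (q3 - q1) for v in values]


def regression_metrics(y_true, y_pred):
    """Return R², MAE and RMSE for paired observations and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
    }


scaled = robust_scale([1.0, 2.0, 3.0, 4.0, 5.0])
metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Because MAE and RMSE carry the units of the target, the values quoted in the abstract are presumably in Nm³/d, while R² is dimensionless.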

Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning

Procedia PDF Downloads 16
114 An Amended Method for Assessment of Hypertrophic Scars Viscoelastic Parameters

Authors: Iveta Bryjova

Abstract:

Recording of viscoelastic strain-vs-time curves with the aid of the suction method and a follow-up analysis, resulting in the evaluation of standard viscoelastic parameters, is a significant technique for non-invasive contact diagnostics of the mechanical properties of skin and assessment of its condition, particularly in acute burns, hypertrophic scarring (the most common complication of burn trauma) and reconstructive surgery. To eliminate the contribution of skin thickness, usable viscoelastic parameters deduced from the strain-vs-time curves are restricted to the relative ones (i.e. those expressed as a ratio of two dimensional parameters), like gross elasticity, net elasticity, biological elasticity or Qu’s area parameters, conventionally referred to in literature and practice as R2, R5, R6, R7, Q1, Q2, and Q3. With the exception of parameters R2 and Q1, the remaining ones substantially depend on the position of the inflection point separating the elastic linear and viscoelastic segments of the strain-vs-time curve. The standard algorithm implemented in commercially available devices relies heavily on the experimental fact that the inflection time comes about 0.1 sec after the suction switch-on/off, which undermines the credibility of the parameters thus obtained. Although Qu’s US patent 7,556,605 suggests a method of improving the precision of the inflection determination, there is still room for non-negligible improvement. In this contribution, a novel method of inflection point determination utilizing the advantageous properties of Savitzky–Golay filtering is presented. The method allows computation of derivatives of the smoothed strain-vs-time curve, more exact location of the inflection and consequently more reliable values of the aforementioned viscoelastic parameters. The improved applicability of the five inflection-dependent relative viscoelastic parameters is demonstrated by recasting a former study under the new method, and by comparing its results with those provided by the methods that have been used so far.
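As a rough illustration of how Savitzky–Golay filtering yields derivatives of a sampled curve, the sketch below uses the standard 5-point second-derivative coefficients (2, -1, -2, -1, 2)/7 for a local quadratic fit and reports the first zero crossing of the second derivative. The window length, the polynomial order and the zero-crossing criterion are assumptions; the abstract does not state which settings or criterion the author used.

```python
def sg_second_derivative(y, dt):
    """Second derivative from a 5-point quadratic Savitzky-Golay fit.

    Returns values for interior samples only (indices 2 .. len(y) - 3).
    """
    coeffs = (2.0, -1.0, -2.0, -1.0, 2.0)
    return [
        sum(c * y[i + k - 2] for k, c in enumerate(coeffs)) / (7.0 * dt * dt)
        for i in range(2, len(y) - 2)
    ]


def inflection_time(t, y):
    """Time of the first sign change of the second derivative, or None.

    Assumes uniformly spaced samples; the crossing is located by
    linear interpolation between neighbouring samples.
    """
    dt = t[1] - t[0]
    d2 = sg_second_derivative(y, dt)
    for j in range(len(d2) - 1):
        a, b = d2[j], d2[j + 1]
        if a == 0.0:
            return t[j + 2]
        if a * b < 0.0:
            return t[j + 2] + dt * a / (a - b)
    return None


# A cubic has its inflection at t = 0, which the filter recovers.
times = [-2.0 + 0.25 * i for i in range(17)]
curve = [x ** 3 for x in times]
t_inflection = inflection_time(times, curve)
```

On measured strain curves a wider window would normally be chosen to suppress noise before differentiating, at the cost of some temporal resolution around the inflection.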

Keywords: Savitzky–Golay filter, scarring, skin, viscoelasticity

Procedia PDF Downloads 277
113 Theoretical Analysis of Mechanical Vibration for Offshore Platform Structures

Authors: Saeed Asiri, Yousuf Z. AL-Zahrani

Abstract:

A new class of support structures, called periodic structures, is introduced in this paper as a viable means for isolating the vibration transmitted from the sea waves to offshore platform structures through their legs. A passive approach to reduce transmitted vibration generated by waves is presented. The approach utilizes the property of periodic structural components that creates stop and pass bands. The stop band regions can be tailored to correspond to regions of the frequency spectra that contain harmonics of the wave frequency, attenuating the response in those regions. A periodic structural component is comprised of a repeating array of cells, which are themselves an assembly of elements. The elements may have differing material properties as well as geometric variations. For the purpose of this research, only geometric and material variations are considered and each cell is assumed to be identical. A periodic leg is designed in order to reduce the transmitted vibration of sea waves. The effectiveness of the periodicity on the vibration levels of the platform will be demonstrated theoretically. The theory governing the operation of this class of periodic structures is introduced using the transfer matrix method. The unique filtering characteristics of periodic structures are demonstrated as functions of their design parameters for structures with geometrical and material discontinuities. The propagation factor is determined using spectral finite element analysis, and the effectiveness of the leg design is demonstrated by varying the ratio of step length and the interface area between the materials in order to find the propagation factor and frequency response.
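The stop-band idea can be sketched with the transfer matrix method for the simplest case. The example assumes longitudinal vibration of a stepped rod with state vector (displacement, axial force), which is a simplification chosen for illustration and not necessarily the model of the paper; all names and numbers are hypothetical. For a lossless cell with transfer matrix T, a frequency lies in a stop band when |trace(T)/2| > 1, since the propagation factor μ satisfies cosh(μ) = trace(T)/2.

```python
import math


def rod_transfer_matrix(omega, length, area, youngs, density):
    """Transfer matrix of a uniform rod segment for (displacement, force).

    Requires omega > 0, otherwise the impedance-like term z is zero.
    """
    k = omega * math.sqrt(density / youngs)   # wavenumber
    z = youngs * area * k                     # EA*k, links force and displacement
    c, s = math.cos(k * length), math.sin(k * length)
    return ((c, s / z), (-z * s, c))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def cell_half_trace(omega, segments):
    """Half the trace of the cell matrix; segments are (length, area, E, rho)."""
    t = ((1.0, 0.0), (0.0, 1.0))
    for segment in segments:
        t = matmul2(rod_transfer_matrix(omega, *segment), t)
    return 0.5 * (t[0][0] + t[1][1])


def in_stop_band(omega, segments):
    """True when waves at this frequency are attenuated by the periodic cell."""
    return abs(cell_half_trace(omega, segments)) > 1.0


E, RHO = 2.0e11, 7850.0                       # steel-like values, for illustration
stepped_cell = [(1.0, 4.0e-3, E, RHO), (1.0, 1.0e-3, E, RHO)]
uniform_cell = [(1.0, 1.0e-3, E, RHO), (1.0, 1.0e-3, E, RHO)]
omega_quarter = 0.5 * math.pi * math.sqrt(E / RHO)   # k*L = pi/2 in each segment
```

With a 4:1 area step the half trace at `omega_quarter` is about -2.125, so that frequency is blocked, whereas a uniform rod never leaves the pass band. Sweeping `omega` and changing the step length ratio or area ratio shows how the stop bands move, which is the kind of design study the abstract describes.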

Keywords: vibrations, periodic structures, offshore, platforms, transfer matrix method

Procedia PDF Downloads 267
112 Reducing CO2 Emission Using EDA and Weighted Sum Model in Smart Parking System

Authors: Rahman Ali, Muhammad Sajjad, Farkhund Iqbal, Muhammad Sadiq Hassan Zada, Mohammed Hussain

Abstract:

Emission of Carbon Dioxide (CO2) has adversely affected the environment. One of the major sources of CO2 emission is transportation. In the last few decades, the increase in mobility of people using vehicles has enormously increased the emission of CO2 in the environment. To reduce CO2 emission, a sustainable transportation system is required, in which smart parking is one of the important measures that need to be established. To contribute to reducing the amount of CO2 emission, this research proposes a smart parking system. A cloud-based solution is provided to drivers which automatically searches for and recommends the most preferred parking slots. To determine preferences among the parking areas, this methodology exploits a number of unique parking features, which ultimately results in the selection of a parking area that leads to the minimum level of CO2 emission from the current position of the vehicle. To realize the methodology, a scenario-based implementation is considered. During the implementation, a mobile application with GPS signals, vehicles with a number of vehicle features and a list of parking areas with parking features are processed by sorting, multi-level filtering, exploratory data analysis (EDA), the Analytical Hierarchy Process (AHP) and the weighted sum model (WSM) to rank the parking areas and recommend the top-k most preferred parking areas to drivers. In the EDA process, “2020testcar-2020-03-03”, a freely available dataset, is used to estimate the CO2 emission of a particular vehicle. To evaluate the system, results of the proposed system are compared with the conventional approach, which reveals that the proposed methodology outperforms the conventional one in reducing the emission of CO2 into the atmosphere.
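The weighted sum model used for the final ranking can be illustrated as follows. The criteria, weights and scores are invented for the example; the abstract does not list the actual parking features or how AHP weights them.

```python
def weighted_sum_rank(alternatives, weights, k=3):
    """Rank alternatives by the weighted sum of their criterion scores.

    alternatives: {name: {criterion: score}}, scores normalised to [0, 1]
                  with higher meaning better (hypothetical convention).
    weights:      {criterion: weight}, e.g. derived from AHP, summing to 1.
    Returns the top-k (name, score) pairs, best first.
    """
    scores = {
        name: sum(weights[c] * values[c] for c in weights)
        for name, values in alternatives.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical criteria and parking areas.
weights = {"low_co2": 0.5, "availability": 0.3, "price": 0.2}
parking_areas = {
    "A": {"low_co2": 0.9, "availability": 0.5, "price": 0.4},
    "B": {"low_co2": 0.6, "availability": 0.9, "price": 0.8},
    "C": {"low_co2": 0.2, "availability": 1.0, "price": 1.0},
}
top_k = weighted_sum_rank(parking_areas, weights, k=3)
```

Here area B ranks first (0.73), followed by A (0.68) and C (0.60). Raising the weight on `low_co2` would move A to the top, which shows how the weighting step steers the recommendation towards lower emissions.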

Keywords: car parking, CO2, CO2 reduction, IoT, merge sort, number plate recognition, smart car parking

Procedia PDF Downloads 123
111 A Structured Mechanism for Identifying Political Influencers on Social Media Platforms: Top 10 Saudi Political Twitter Users

Authors: Ahmad Alsolami, Darren Mundy, Manuel Hernandez-Perez

Abstract:

Social media networks, such as Twitter, offer the perfect opportunity to either positively or negatively affect the political attitudes of large audiences. The existence of influential users who have developed a reputation for their knowledge and experience of specific topics is a major factor contributing to this impact. Therefore, knowledge of the mechanisms to identify influential users on social media is vital for understanding their effect on their audience. The concept of the influential user is related to the concept of 'opinion leaders', which holds that ideas first flow from mass media to opinion leaders and then to the rest of the population. Hence, the objective of this research was to provide reliable and accurate structural mechanisms to identify influential users, which could be applied to different platforms, places, and subjects. Twitter was selected as the platform of interest, and Saudi Arabia as the context for the investigation. These were selected because Saudi Arabia has a large number of Twitter users, some of whom are considerably active in setting agendas and disseminating ideas. The study considered the scientific methods that have been used to identify public opinion leaders before, utilizing metrics software on Twitter. The key findings propose multiple novel metrics to compare Twitter influencers, including the number of followers, social authority and the use of political hashtags, and four secondary filtering measures. Thus, using ratio and percentage calculations to classify the most influential users, Twitter accounts were filtered, analyzed and included. The structured approach is used as a mechanism to explore the top ten influencers on Twitter from the political domain in Saudi Arabia.

Keywords: Twitter, influencers, structured mechanism, Saudi Arabia

Procedia PDF Downloads 98
110 Raman Scattering Broadband Spectrum Generation in Compact Yb-Doped Fiber Laser

Authors: Yanrong Song, Zikai Dong, Runqin Xu, Jinrong Tian, Kexuan Li

Abstract:

The nonlinear polarization rotation (NPR) technique has become one of the main techniques for achieving mode-locked fiber lasers because of its compactness, simple implementation, and low cost. In this paper, we demonstrate a compact mode-locked Yb-doped fiber laser based on the NPR technique in the all normal dispersion (ANDi) regime. There is no physical filter or polarization controller in the laser cavity. A mode-locked pulse train is achieved in the ANDi regime based on the NPR technique. The filtering effect induced by fiber birefringence is the main reason for mode-locking. After that, an extra 20 m long single-mode fiber is inserted at two different positions, and dissipative soliton operation and noise-like pulse operation are achieved correspondingly. The nonlinear effect is clearly enhanced in the noise-like pulse regime, and a broadband spectrum is generated owing to the enhanced stimulated Raman scattering effect. When the pump power is 210 mW, the central wavelength is 1030 nm, and the corresponding 1st order Raman scattering Stokes wave is generated and located at 1075 nm. When the pump power is 370 mW, the 1st and 2nd order Raman scattering Stokes waves are generated and located at 1080 nm and 1126 nm, respectively. When the pump power is 600 mW, the Raman continuum is generated with cascaded multi-order Stokes waves, and the spectrum extends to 1188 nm. The total flat spectrum is from 1000 nm to 1200 nm. The maximum output average power and pulse energy are 18.0 W and 14.75 nJ, respectively.
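The reported Stokes wavelengths can be put in context with a back-of-the-envelope calculation. The sketch assumes a Raman shift of about 440 cm⁻¹ (roughly 13.2 THz), the commonly quoted gain peak of silica fiber; this value is not given in the abstract, and because the Raman gain spectrum is broad, measured Stokes peaks need not sit exactly at the computed positions.

```python
def stokes_wavelength_nm(pump_nm, order=1, shift_cm1=440.0):
    """Wavelength of the n-th order Stokes wave for a given pump wavelength.

    shift_cm1 is the assumed Raman shift in wavenumbers (cm^-1);
    440 cm^-1 is a typical figure for silica, not a value from the abstract.
    """
    pump_wavenumber = 1.0e7 / pump_nm          # nm -> cm^-1
    return 1.0e7 / (pump_wavenumber - order * shift_cm1)


first_stokes = stokes_wavelength_nm(1030.0, order=1)
second_stokes = stokes_wavelength_nm(1030.0, order=2)
```

With a 1030 nm pump this gives roughly 1079 nm and 1133 nm, in the same range as the 1075 to 1080 nm and 1126 nm peaks reported in the abstract.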

Keywords: fiber laser, mode-locking, nonlinear polarization rotation, Raman scattering

Procedia PDF Downloads 199
109 Roof and Road Network Detection through Object Oriented SVM Approach Using Low Density LiDAR and Optical Imagery in Misamis Oriental, Philippines

Authors: Jigg L. Pelayo, Ricardo G. Villar, Einstine M. Opiso

Abstract:

Advances in aerial laser scanning in the Philippines have opened up entire fields of research in remote sensing and machine vision that aspire to provide accurate, timely information for the government and the public. Rapid mapping of polygonal roads and roof boundaries is one such use, with applications in disaster risk reduction, mitigation and development. The study uses low density LiDAR data and high resolution aerial imagery through an object-oriented approach, considering the theoretical concept of data analysis subjected to a machine learning algorithm to minimize the constraints of feature extraction. To separate one class from another in distinct regions of a multi-dimensional feature space, non-trivial computations for fitting distributions were implemented to formulate the learned ideal hyperplane. Customized hybrid features were generated and then used to improve the classifier findings. Supplemental algorithms for filtering and reshaping object features were developed in the rule set to enhance the final product. Several advantages in terms of simplicity, applicability, and process transferability are noticeable in the methodology. The algorithm was tested in different random locations of Misamis Oriental province in the Philippines, demonstrating robust performance with an overall accuracy greater than 89% and potential for semi-automation. The extracted results will become a vital requirement for decision makers, urban planners and even the commercial sector in various assessment processes.

Keywords: feature extraction, machine learning, OBIA, remote sensing

Procedia PDF Downloads 338
108 Design and Implementation of a Software Platform Based on Artificial Intelligence for Product Recommendation

Authors: Giuseppina Settanni, Antonio Panarese, Raffaele Vaira, Maurizio Galiano

Abstract:

Nowadays, artificial intelligence is used successfully in academia and industry for its ability to learn from a large amount of data. In particular, in recent years the use of machine learning algorithms in the field of e-commerce has spread worldwide. In this research study, a prototype software platform was designed and implemented in order to suggest to users the most suitable products for their needs. The platform includes a chatbot and a recommender system based on artificial intelligence algorithms that provide suggestions and decision support to the customer. Recommendation systems perform the important function of automatically filtering and personalizing information, thus making it possible to manage the IT overload to which the user is exposed on a daily basis. Recently, international research has experimented with the use of machine learning technologies with the aim of increasing the potential of traditional recommendation systems. Specifically, support vector machine algorithms have been implemented in combination with natural language processing techniques that allow users to interact with the system, express their requests and receive suggestions. The interested user can access the web platform on the internet using a computer, tablet or mobile phone, register, provide the necessary information and view the products that the system deems most appropriate for them. The platform also integrates a dashboard that allows the various functions of the platform to be used in an intuitive and simple way. Artificial intelligence algorithms have been implemented and trained on historical data collected from user browsing. Finally, the testing phase made it possible to validate the implemented model, which will be further tested by letting customers use it.

Keywords: machine learning, recommender system, software platform, support vector machine

Procedia PDF Downloads 111
107 C-eXpress: A Web-Based Analysis Platform for Comparative Functional Genomics and Proteomics in Human Cancer Cell Line, NCI-60 as an Example

Authors: Chi-Ching Lee, Po-Jung Huang, Kuo-Yang Huang, Petrus Tang

Abstract:

Background: Recent advances in high-throughput research technologies such as new-generation sequencing and multi-dimensional liquid chromatography make it possible to dissect the complete transcriptome and proteome in a single run for the first time. However, it is almost impossible for many laboratories to handle and analyze these “BIG” data without the support of a bioinformatics team. We aimed to provide a web-based analysis platform for users with only limited knowledge of bio-computing to study functional genomics and proteomics. Method: We use NCI-60 as an example dataset to demonstrate the power of the web-based analysis platform and data delivery system: C-eXpress takes a simple text file that contains the standard NCBI gene or protein ID and expression levels (rpkm or fold) as the input file to generate a distribution map of gene/protein expression levels in a heatmap diagram organized by color gradients. The diagram is hyper-linked to a dynamic html table that allows users to filter the datasets based on various gene features. A dynamic summary chart is generated automatically after each filtering process. Results: We implemented an integrated database that contains pre-defined annotations such as gene/protein properties (ID, name, length, MW, pI); pathways based on KEGG and GO biological process; subcellular localization based on GO cellular component; functional classification based on GO molecular function, kinase, peptidase and transporter. Multiple ways of sorting columns and rows are also provided for comparative analysis and visualization of multiple samples.

Keywords: cancer, visualization, database, functional annotation

Procedia PDF Downloads 591
106 Real-Time Radar Tracking Based on Nonlinear Kalman Filter

Authors: Milca F. Coelho, K. Bousson, Kawser Ahmed

Abstract:

Accurately tracking an aerospace vehicle in a time-critical situation and in a highly nonlinear environment is one of the strongest interests within the aerospace community. The tracking is achieved by accurately estimating the state of a moving target, which is composed of a set of variables that can provide a complete status of the system at a given time. One of the main ingredients for good estimation performance is the use of efficient estimation algorithms. A well-known framework is that of the Kalman filtering methods, designed for prediction and estimation problems. The success of the Kalman Filter (KF) in engineering applications is mostly due to the Extended Kalman Filter (EKF), which is based on local linearization. Despite its popularity, the EKF presents several limitations. To address these limitations and as a possible solution to tracking problems, this paper proposes the use of the Ensemble Kalman Filter (EnKF). Although the EnKF is used extensively in the context of weather forecasting and is recognized for producing accurate and computationally effective estimation on systems with a very high dimension, it is almost unknown to the tracking community. The EnKF was initially proposed as an attempt to improve the error covariance calculation, which in the classic Kalman Filter is difficult to implement. Also, in the EnKF method the prediction and analysis error covariances have ensemble representations. These ensembles have sizes which limit the number of degrees of freedom, in such a way that the filter error covariance calculations are a lot more practical for modest ensemble sizes. In this paper, a realistic simulation of radar tracking was performed, in which the EnKF was applied and compared with the Extended Kalman Filter. The results suggest that the EnKF is a promising tool for tracking applications, offering more advantages in terms of performance.
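The ensemble representation of the error covariance can be illustrated with a deliberately small example: a scalar state observed directly, updated with the perturbed-observation form of the EnKF. This is a textbook-style sketch with hypothetical names, not the radar tracking model of the paper.

```python
import random
import statistics


def enkf_forecast(ensemble, model, process_std, rng):
    """Propagate every member through the (possibly nonlinear) model plus noise."""
    return [model(x) + rng.gauss(0.0, process_std) for x in ensemble]


def enkf_analysis(ensemble, observation, obs_var, rng):
    """Perturbed-observation EnKF update for a directly observed scalar state.

    The forecast error variance is estimated from the ensemble itself,
    so no linearisation of the model is required.
    """
    prior_var = statistics.variance(ensemble)          # sample variance
    gain = prior_var / (prior_var + obs_var)           # scalar Kalman gain
    obs_std = obs_var ** 0.5
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in ensemble
    ]


rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(500)]      # uncertain initial guess
forecast = enkf_forecast(prior, lambda x: x, 0.0, rng) # identity model, no noise
posterior = enkf_analysis(forecast, 5.0, 1.0, rng)     # measurement of 5 with variance 1
```

After the update the ensemble mean moves most of the way towards the measurement and the ensemble spread shrinks. In a tracking problem the same two steps would be repeated at every radar scan with a vector state and a nonlinear measurement function.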

Keywords: Kalman filter, nonlinear state estimation, optimal tracking, stochastic environment

Procedia PDF Downloads 115
105 Application of Single Tuned Passive Filters in Distribution Networks at the Point of Common Coupling

Authors: M. Almutairi, S. Hadjiloucas

Abstract:

The harmonic distortion of voltage is important in relation to power quality due to the interaction between the large diffusion of non-linear and time-varying single-phase and three-phase loads and power supply systems. However, harmonic distortion levels can be reduced by improving the design of polluting loads or by applying arrangements and adding filters. The application of passive filters is an effective solution for harmonic mitigation, mainly because such filters are highly efficient, simple, and economical. Additionally, different frequency response characteristics can be used to achieve certain required harmonic filtering targets. With these ideas in mind, the objective of this paper is to determine what size of single tuned passive filter works best in distribution networks in order to economically limit violations caused at a given point of common coupling (PCC). This article suggests that a single tuned passive filter could be employed in typical industrial power systems. Furthermore, constrained optimization can be used to find the optimal sizing of the passive filter in order to reduce both harmonic voltages and harmonic currents in the power system to an acceptable level and, thus, improve the load power factor. The optimization technique works to minimize voltage total harmonic distortion (VTHD) and current total harmonic distortion (ITHD) while maintaining the power factor within a specified range. According to IEEE Standard 519, both indices are viewed as constraints for the optimal passive filter design problem. The performance of this technique will be discussed using numerical examples taken from previous publications.
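Two quantities underlie this design problem: the total harmonic distortion, defined as the RMS of the harmonic components divided by the fundamental, and the series-resonant frequency of the LC branch of a single tuned filter, f = 1/(2π√(LC)). The sketch below computes both; the component values are hypothetical, and the constrained optimisation described in the abstract is not reproduced.

```python
import math


def total_harmonic_distortion(fundamental, harmonics):
    """THD as a fraction: sqrt(sum of squared harmonic magnitudes) / fundamental."""
    return math.sqrt(sum(h * h for h in harmonics)) / fundamental


def tuned_frequency_hz(inductance_h, capacitance_f):
    """Series resonance of the filter's L-C branch."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def inductance_for_tuning(capacitance_f, harmonic_order, fundamental_hz=50.0):
    """Inductance that tunes a given capacitor to the chosen harmonic order."""
    omega = 2.0 * math.pi * harmonic_order * fundamental_hz
    return 1.0 / (omega * omega * capacitance_f)


# Hypothetical example: tune a 100 uF capacitor to the 5th harmonic of 50 Hz.
capacitance = 100e-6
inductance = inductance_for_tuning(capacitance, harmonic_order=5)
resonance = tuned_frequency_hz(inductance, capacitance)   # about 250 Hz
vthd = total_harmonic_distortion(100.0, [3.0, 4.0])        # 5 % of the fundamental
```

In practice such filters are often tuned slightly below the target harmonic to allow for component tolerances, and the capacitor size also sets the reactive power the filter supplies at the fundamental, which is why filter sizing interacts with the power factor constraint.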

Keywords: harmonics, passive filter, power factor, power quality

Procedia PDF Downloads 285
104 Socio-Demographic Factors and Testing Practices Are Associated with Spatial Patterns of Clostridium difficile Infection in the Australian Capital Territory, 2004-2014

Authors: Aparna Lal, Ashwin Swaminathan, Teisa Holani

Abstract:

Background: Clostridium difficile infections (CDIs) have been on the rise globally. In Australia, rates of CDI in all States and Territories have increased significantly since mid-2011. Identifying risk factors for CDI in the community can help inform targeted interventions to reduce infection. Methods: We examine the role of neighbourhood socio-economic status, demography, testing practices and the number of residential aged care facilities on spatial patterns in CDI incidence in the Australian Capital Territory. Data on all tests conducted for CDI were obtained from ACT Pathology by postcode for the period 1st January 2004 through 31 December 2014. Distribution of age groups and the neighbourhood Index of Relative Socio-economic Advantage Disadvantage (IRSAD) were obtained from the Australian Bureau of Statistics 2011 National Census data. A Bayesian spatial conditional autoregressive model was fitted at the postcode level to quantify the relationship between CDI and socio-demographic factors. To identify CDI hotspots, exceedance probabilities were set at a threshold of twice the estimated relative risk. Results: CDI showed a positive spatial association with the number of tests (RR=1.01, 95% CI 1.00, 1.02) and the resident population over 65 years (RR=1.00, 95% CI 1.00, 1.01). The standardized index of relative socio-economic advantage disadvantage (IRSAD) was significantly negatively associated with CDI (RR=0.74, 95% CI 0.56, 0.94). We identified three postcodes with high probability (0.8-1.0) of excess risk. Conclusions: Here, we demonstrate geographic variations in CDI in the ACT with a positive association of CDI with socioeconomic disadvantage and identify areas with a high probability of elevated risk compared with surrounding communities. These findings highlight community-based risk factors for CDI.
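An exceedance probability of the kind used here to flag hotspots is usually computed from posterior samples as the fraction of draws in which an area's relative risk exceeds the chosen threshold. The sketch below shows only that final step, with invented sample values; fitting the conditional autoregressive model is outside its scope, and the 0.8 cut-off simply mirrors the 0.8-1.0 range quoted in the results.

```python
def exceedance_probability(rr_samples, threshold):
    """Fraction of posterior relative-risk draws above the threshold."""
    if not rr_samples:
        raise ValueError("at least one posterior sample is required")
    return sum(sample > threshold for sample in rr_samples) / len(rr_samples)


def is_hotspot(rr_samples, threshold, min_probability=0.8):
    """Flag an area whose exceedance probability reaches the cut-off."""
    return exceedance_probability(rr_samples, threshold) >= min_probability


# Invented posterior draws for one postcode and a hypothetical threshold of 2.0.
draws = [1.5, 2.5, 3.0, 2.2, 2.8]
probability = exceedance_probability(draws, threshold=2.0)
```

For these five draws the probability is 0.8, so the postcode would just qualify as a hotspot under the assumed cut-off.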

Keywords: spatial, socio-demographic, infection, Clostridium difficile

Procedia PDF Downloads 296
103 A Structured Mechanism for Identifying Political Influencers on Social Media Platforms Top 10 Saudi Political Twitter Users

Authors: Ahmad Alsolami, Darren Mundy, Manuel Hernandez-Perez

Abstract:

Social media networks, such as Twitter, offer the perfect opportunity to either positively or negatively affect the political attitudes of large audiences. A major factor contributing to this effect is the existence of influential users, who have developed a reputation for their awareness and experience of specific subjects. Therefore, knowledge of the mechanisms to identify influential users on social media is vital for understanding their effect on their audience. The concept of the influential user is based on the pioneering work of Katz and Lazarsfeld (1959), who created the concept of 'opinion leaders' to indicate that ideas first flow from mass media to opinion leaders and then to the rest of the population. Hence, the objective of this research was to provide reliable and accurate structural mechanisms to identify influential users, which could be applied to different platforms, places, and subjects. Twitter was selected as the platform of interest, and Saudi Arabia as the context for the investigation. These were selected because Saudi Arabia has a large number of Twitter users, some of whom are considerably active in setting agendas and disseminating ideas. The study considered the scientific methods that have been used to identify public opinion leaders before, utilizing metrics software on Twitter. The key findings propose multiple novel metrics to compare Twitter influencers, including the number of followers, social authority and the use of political hashtags, and four secondary filtering measures. Thus, using ratio and percentage calculations to classify the most influential users, Twitter accounts were filtered, analyzed and included. The structured approach is used as a mechanism to explore the top ten influencers on Twitter from the political domain in Saudi Arabia.

Keywords: twitter, influencers, structured mechanism, Saudi Arabia

Procedia PDF Downloads 109
102 Phylogenetic Analysis Based On the Internal Transcribed Spacer-2 (ITS2) Sequences of Diadegma semiclausum (Hymenoptera: Ichneumonidae) Populations Reveals Significant Adaptive Evolution

Authors: Ebraheem Al-Jouri, Youssef Abu-Ahmad, Ramasamy Srinivasan

Abstract:

The parasitoid Diadegma semiclausum (Hymenoptera: Ichneumonidae) is one of the most effective exotic parasitoids of the diamondback moth (DBM), Plutella xylostella, in the lowland areas of Homs, Syria. Molecular evolution studies are useful tools for shedding light on the molecular bases of insect geographical spread and adaptation to new hosts and environments, and for designing better control strategies. In this study, molecular evolution analysis was performed on 42 nuclear internal transcribed spacer-2 (ITS2) sequences representing D. semiclausum and eight other Diadegma spp. from Syria and worldwide. Possible recombination events were identified with the RDP4 program. Four potential recombinants of the American D. insulare and D. fenestrale (Jeju) were detected. After detecting and removing recombinant sequences, the ratio of non-synonymous (dN) to synonymous (dS) substitutions per site (dN/dS = ω) was used to identify codon positions involved in adaptive processes. Bayesian techniques were applied to detect selective pressures at the codon level using five different approaches: fixed effects likelihood (FEL), internal fixed effects likelihood (IFEL), random effects likelihood (REL), mixed effects model of evolution (MEME) and phylogenetic analysis by maximum likelihood (PAML). Among the 40 positively selected amino acids (aa) that differed significantly between clades of Diadegma species, three aa under positive selection were identified only in D. semiclausum. Additionally, all D. semiclausum branches of the tree were found to be under episodic diversifying selection (EDS) at p≤0.05. Our study provides evidence that both recombination and positive selection have contributed to the molecular diversity of Diadegma spp. and highlights the significant contribution of D. semiclausum to adaptive evolution and its influence on fitness in the DBM parasitoid.

Keywords: diadegma sp, DBM, ITS2, phylogeny, recombination, dN/dS, evolution, positive selection

Procedia PDF Downloads 395
101 IOT Based Automated Production and Control System for Clean Water Filtration Through Solar Energy Operated by Submersible Water Pump

Authors: Musse Mohamud Ahmed, Tina Linda Achilles, Mohammad Kamrul Hasan

Abstract:

The deterioration of nature is evident these days, with a clear danger of human catastrophe arising from greenhouse gases (GHG) as CO2 emissions to the environment increase. PV technology can help to reduce the dependency on fossil fuel, decreasing air pollution and slowing down the rate of global warming. The objective of this paper is to propose, design and develop the production of a clean water supply for rural communities using an appropriate technology, the Internet of Things (IOT), that does not create any CO2 emissions. Additionally, maximizing solar power output while minimizing the effect of the natural intermittency of the solar source when sunlight is scarce is another goal of this work. The paper presents the development of a critical automated control system for solar power output optimization using several new techniques. A water pumping system is developed to supply clean water with the application of IOT and renewable energy. This system is effective in providing a clean water supply to remote and off-grid areas using Photovoltaics (PV) technology that collects energy generated from sunlight. The focus of this work is to design and develop a submersible solar water pumping system that applies an IOT implementation. This system has been implemented and programmed using the Arduino Software (IDE), Proteus, MATLAB and the C++ programming language. The system pumps water from a water reservoir using a pump powered by solar energy, and clean water production is incorporated through a filtration system fed by the submersible solar water pump. The filtering system is an additional application platform intended to provide a clean water supply to households in Sarawak State, Malaysia.

Keywords: IOT, automated production and control system, water filtration, automated submersible water pump, solar energy

Procedia PDF Downloads 59
100 Heliport Remote Safeguard System Based on Real-Time Stereovision 3D Reconstruction Algorithm

Authors: Ł. Morawiński, C. Jasiński, M. Jurkiewicz, S. Bou Habib, M. Bondyra

Abstract:

With the development of optics, electronics, and computers, vision systems are increasingly used in various areas of life, science, and industry. Vision systems have a huge number of applications. They can be used in quality control, object detection, data reading (e.g., QR codes), etc. A large part of them is used for measurement purposes. Some of them make it possible to obtain a 3D reconstruction of the tested objects or measurement areas. 3D reconstruction algorithms are mostly based on creating depth maps from data that can be acquired by active or passive methods. In the specific application of airfield technology, only passive methods are applicable, because other existing systems working on the site could be blinded on most spectral levels. Furthermore, reconstruction is required to work over long distances, ranging from hundreds of meters to tens of kilometers, with low loss of accuracy even in harsh conditions such as fog, rain, or snow. In response to those requirements, HRESS (Heliport REmote Safeguard System) was developed, whose main part is a rotational head with a two-camera stereovision rig gathering images around the head in 360 degrees, along with stereovision 3D reconstruction and point cloud combination. The sub-pixel analysis introduced in the HRESS system makes it possible to obtain an increased distance measurement resolution and an accuracy of about 3% for distances over one kilometer. Ultimately, this leads to more accurate and reliable measurement data in the form of a point cloud. Moreover, the program algorithm introduces operations enabling the filtering of erroneously collected data in the point cloud. All activities on the programming, mechanical and optical sides are aimed at obtaining the most accurate 3D reconstruction of the environment in the measurement area.
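The link between sub-pixel disparity analysis and distance resolution can be illustrated with the standard pinhole stereo relation Z = f·B/d. The sketch below is not the HRESS implementation; the function names, the focal length and the baseline are hypothetical values chosen only to show why a finer disparity step matters at ranges around one kilometer.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance Z = f * B / d for a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_resolution(focal_px, baseline_m, depth_m, disparity_step_px):
    """Approximate depth change caused by one disparity step at a given depth.

    Follows from differentiating Z = f * B / d: |dZ| = Z**2 / (f * B) * |dd|.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_step_px


# Hypothetical rig: 20000 px focal length, 1 m baseline (assumed, not from the abstract).
FOCAL_PX = 20000.0
BASELINE_M = 1.0

distance = depth_from_disparity(FOCAL_PX, BASELINE_M, 20.0)
whole_pixel = depth_resolution(FOCAL_PX, BASELINE_M, distance, 1.0)
tenth_pixel = depth_resolution(FOCAL_PX, BASELINE_M, distance, 0.1)
print(distance, whole_pixel, tenth_pixel)
```

Because the depth step grows with the square of the distance, refining the disparity estimate below one pixel reduces the step proportionally, which is the general motivation for sub-pixel matching in long-range stereovision.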

Keywords: airfield monitoring, artificial intelligence, stereovision, 3D reconstruction

Procedia PDF Downloads 98
99 An Evolutionary Perspective on the Role of Extrinsic Noise in Filtering Transcript Variability in Small RNA Regulation in Bacteria

Authors: Rinat Arbel-Goren, Joel Stavans

Abstract:

Cell-to-cell variations in transcript or protein abundance, called noise, may give rise to phenotypic variability between isogenic cells, enhancing the probability of survival under stress conditions. These variations may be introduced by post-transcriptional regulatory processes such as the stoichiometric degradation of target transcripts by small non-coding RNAs in bacteria. We study the iron homeostasis network in Escherichia coli, in which the RyhB small RNA regulates the expression of various targets, as a model system. Using fluorescence reporter genes to detect protein levels and single-molecule fluorescence in situ hybridization to monitor transcript levels in individual cells allows us to compare noise at both transcript and protein levels. The experimental results and computer simulations show that extrinsic noise buffers, through a feed-forward loop configuration, the increase in variability introduced at the transcript level by iron deprivation, illuminating the important role that extrinsic noise plays during stress. Surprisingly, extrinsic noise also decouples the fluctuations of two different targets, in spite of RyhB being a common upstream factor degrading both. Thus, phenotypic variability increases under stress conditions by the decoupling of target fluctuations in the same cell rather than by increasing the noise of each. We also present preliminary results on the adaptation of cells to prolonged iron deprivation in order to shed light on the evolutionary role of post-transcriptional downregulation by small RNAs.

Keywords: cell-to-cell variability, Escherichia coli, noise, single-molecule fluorescence in situ hybridization (smFISH), transcript

Procedia PDF Downloads 141
98 Introducing Two Species of Parastagonospora (Phaeosphaeriaceae) on Grasses from Italy and Russia, Based on Morphology and Phylogeny

Authors: Ishani D. Goonasekara, Erio Camporesi, Timur Bulgakov, Rungtiwa Phookamsak, Kevin D. Hyde

Abstract:

Phaeosphaeriaceae comprises a large number of species occurring mainly on grasses and cereal crops as endophytes, saprobes and especially pathogens. Parastagonospora is an important genus in Phaeosphaeriaceae that includes pathogens causing leaf and glume blotch on cereal crops. Currently, there are fifteen Parastagonospora species described, including both pathogens and saprobes. In this study, one sexual morph species and one asexual morph species, occurring as saprobes on members of Poaceae, are introduced based on morphology and a combined molecular analysis of the LSU, SSU, ITS, and RPB2 gene sequence data. The sexual morph species Parastagonospora elymi was isolated from a Russian sample of Elymus repens, a grass commonly known as couch grass, which is important for grazing animals, occurs as a weed, and is used in traditional Austrian medicine. P. elymi is similar to the sexual morph of P. avenae in having cylindrical asci, bearing 8, overlapping biseriate, fusiform ascospores but can be distinguished by its subglobose to conical shaped, wider ascomata. In addition, no sheath was observed surrounding the ascospores. The asexual morph species was isolated from a specimen from Italy, on Dactylis glomerata, a commonly found grass distributed in temperate regions. It is introduced as Parastagonospora macrouniseptata, a coelomycete, and bears a close resemblance to P. allouniseptata and P. uniseptata in having globose to subglobose, pycnidial conidiomata and hyaline, cylindrical, 1-septate conidia. However, the new species can be distinguished by its much larger conidiomata. In the phylogenetic analysis, which consisted of maximum likelihood and Bayesian analyses, P. elymi showed low bootstrap support but was well segregated from other strains within the Parastagonospora clade. P. neoallouniseptata formed a sister clade with P. allouniseptata with high statistical support.

Keywords: dothideomycetes, multi-gene analysis, Poaceae, saprobes, taxonomy

Procedia PDF Downloads 98
97 Development of a Multi-Locus DNA Metabarcoding Method for Endangered Animal Species Identification

Authors: Meimei Shi

Abstract:

Objectives: The identification of endangered species, especially the simultaneous detection of multiple species in complex samples, plays a critical role in investigating alleged wildlife crime incidents and preventing illegal trade. This study aimed to develop a multi-locus DNA metabarcoding method for endangered animal species identification. Methods: Several pairs of universal primers were designed according to conserved mitochondrial gene regions. Experimental mixtures were artificially prepared by mixing well-defined species, including endangered species, e.g., forest musk deer, bear, tiger, pangolin, and sika deer. The artificial samples were prepared with 1-16 well-characterized species at 1% to 100% DNA concentrations. After multiplex-PCR amplification and parameter modification, the amplified products were analyzed by capillary electrophoresis and used for NGS library preparation. The DNA metabarcoding was carried out based on Illumina MiSeq amplicon sequencing. The data were processed with quality trimming, read filtering, and OTU clustering; representative sequences were searched using BLASTn. Results: According to the parameter modification and multiplex-PCR amplification results, five primer sets targeting COI, Cytb, 12S, and 16S were selected as the NGS library amplification primer panel. High-throughput sequencing data analysis showed that the established multi-locus DNA metabarcoding method was sensitive and could accurately identify all species in artificial mixtures, including the endangered animal species Moschus berezovskii, Ursus thibetanus, Panthera tigris, Manis pentadactyla, and Cervus nippon at 1% DNA concentration. In conclusion, the established species identification method provides technical support for customs and forensic scientists to prevent the illegal trade of endangered animals and their products.

Keywords: DNA metabarcoding, endangered animal species, mitochondria nucleic acid, multi-locus

Procedia PDF Downloads 105
96 Use of SUDOKU Design to Assess the Implications of the Block Size and Testing Order on Efficiency and Precision of Dulce De Leche Preference Estimation

Authors: Jéssica Ferreira Rodrigues, Júlio Silvio De Sousa Bueno Filho, Vanessa Rios De Souza, Ana Carla Marques Pinheiro

Abstract:

This study aimed to evaluate the implications of the block size and testing order on the efficiency and precision of preference estimation for Dulce de leche samples. Efficiency was defined as the inverse of the average variance of pairwise comparisons among treatments. Precision was defined as the inverse of the variance of treatment means (or effects) estimates. The experiment was originally designed to test 16 treatments as a series of 8 Sudoku 16x16 designs, with 4 randomized independently and the other 4 in the reverse order, to yield balance in testing order. Linear mixed models were fitted to the whole experiment with 112 testers and all their grades, as well as to their partially balanced subgroups, namely: a) experiment with the four initial EU; b) experiment with EU 5 to 8; c) experiment with EU 9 to 12; and d) experiment with EU 13 to 16. To record responses we used a nine-point hedonic scale, and we assumed a mixed linear model analysis with random tester and treatment effects and a fixed test order effect. Analysis of a cumulative random effects probit link model was very similar, with essentially no different conclusions, so for simplicity we present the results using the Gaussian assumption. The R-CRAN library lme4 and its function lmer (Fit Linear Mixed-Effects Models) were used for the mixed models, and the libraries Bayesthresh (default Gaussian threshold function) and ordinal, with its function clmm (Cumulative Link Mixed Model), were used to check the Bayesian analysis of threshold models and cumulative link probit models. It was noted that the number of samples tested in the same session can influence the acceptance level, underestimating the acceptance. However, providing a large number of samples can help to improve the discrimination between samples.

Keywords: acceptance, block size, mixed linear model, testing order

Procedia PDF Downloads 301
95 Single Tuned Shunt Passive Filter Based Current Harmonic Elimination of Three Phase AC-DC Converters

Authors: Mansoor Soomro

Abstract:

The evolution of power electronic equipment has been pivotal in making industrial processes productive, efficient and safe. Despite its attractive features, such equipment acts as a nonlinear load, which makes the system vulnerable to power quality problems. Harmonics are one such power quality problem, in which the harmonic frequency is an integral multiple of the supply frequency. As a consequence, the supply voltage and supply frequency do not remain within their tolerable limits, and distorted current and voltage waveforms may appear. Poor power quality means that an electrical device or piece of equipment is likely to malfunction, fail prematurely or be unable to operate under all applied conditions. The electrical power system is designed for delivering power reliably, namely maximizing power availability to customers. However, power quality events are largely untracked and, as a result, can take out a process as many as 20 to 30 times a year, costing utilities, customers and suppliers of load equipment millions of dollars. Current harmonics reduce system efficiency, cause overheating of connected equipment, and increase electrical power and air conditioning costs. The rapid growth of power electronic converters over time has highlighted the damage caused by current harmonics in the electrical power system. Therefore, it has become essential to address the adverse influence of current harmonics while planning any changes to electrical installations. In this paper, an effort has been made to mitigate the effects of the dominant 3rd order current harmonics. A passive filtering technique with a six-pulse multiplication converter has been employed to mitigate them. Since power quality standards require the supply voltage and supply current to be maintained within prescribed limits, the obtained results are validated against the specifications of the IEEE 519-1992 and IEEE 519-2014 performance standards.
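As a rough illustration of how a single tuned shunt filter is sized, the sketch below applies the textbook series R-L-C relations so that the branch resonates at the h-th harmonic of the supply frequency. It is not the authors' design procedure; the function names, the 50 Hz supply frequency, the 400 V and 10 kvar ratings and the quality factor are all assumptions chosen for illustration.

```python
import math


def design_single_tuned_filter(v_line, q_var, h, f_supply=50.0, quality=30.0):
    """Return (L, C, R) of a series R-L-C branch tuned to harmonic order h.

    v_line  : line voltage in volts (assumed rating)
    q_var   : reactive power the branch supplies at the fundamental, in var
    h       : harmonic order to trap (the abstract targets the 3rd)
    quality : filter quality factor X_n / R (assumed value)
    """
    omega = 2.0 * math.pi * f_supply
    # Capacitive reactance at the fundamental, corrected for the series inductor.
    x_c = (h ** 2 / (h ** 2 - 1.0)) * v_line ** 2 / q_var
    # Inductive reactance chosen so that h * X_L equals X_C / h (resonance at h).
    x_l = x_c / h ** 2
    capacitance = 1.0 / (omega * x_c)
    inductance = x_l / omega
    # Characteristic reactance at the tuned frequency divided by the quality factor.
    resistance = (h * x_l) / quality
    return inductance, capacitance, resistance


def resonant_frequency(inductance, capacitance):
    """Series resonance f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


L, C, R = design_single_tuned_filter(v_line=400.0, q_var=10e3, h=3)
print(L, C, R, resonant_frequency(L, C))
```

At the tuned frequency the inductive and capacitive reactances cancel, so the branch presents only the small resistance R and diverts the targeted harmonic current away from the supply.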

Keywords: current harmonics, power quality, passive filters, power electronic converters

Procedia PDF Downloads 279
94 Exploring the Applications of Neural Networks in the Adaptive Learning Environment

Authors: Baladitya Swaika, Rahul Khatry

Abstract:

Computer Adaptive Tests (CATs) are one of the most efficient ways of testing the cognitive abilities of students. CATs are based on Item Response Theory (IRT), which relies on item selection and ability estimation using statistical methods: maximum information selection or selection from the posterior for the former, and maximum-likelihood (ML) or maximum a posteriori (MAP) estimators for the latter. This study aims at combining both classical and Bayesian approaches to IRT to create a dataset, which is then fed to a neural network that automates the process of ability estimation, and at comparing it to traditional CAT models designed using IRT. This study uses Python as the base coding language, pymc for statistical modelling of the IRT and scikit-learn for neural network implementations. On creation of the model and on comparison, it is found that the neural network based model performs 7-10% worse than the IRT model for score estimation. Although it performs poorly compared to the IRT model, the neural network model can be beneficially used in back-ends for reducing time complexity, as the IRT model would have to re-calculate the ability every time it gets a request, whereas the prediction from a neural network can be done in a single step for an existing trained regressor. This study also proposes a new kind of framework whereby the neural network model could be used to incorporate feature sets other than the normal IRT feature set, using a neural network’s capacity for learning unknown functions to give rise to better CAT models. Categorical features such as test type could be learnt and incorporated into IRT functions with the help of techniques like logistic regression, and can be used to learn functions, expressed as models, that may not be trivial to express via equations. This kind of framework, when implemented, would be highly advantageous in psychometrics and cognitive assessments. This study gives a brief overview of how neural networks can be used in adaptive testing, not only by reducing time complexity but also by being able to incorporate newer and better datasets, which would eventually lead to higher quality testing.
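The two IRT steps named above, maximum information item selection and MAP ability estimation, can be sketched in a few lines. This is a minimal illustration assuming a two-parameter logistic (2PL) item model, a standard normal prior and a simple grid search; the abstract does not state which IRT model or estimator settings were used, and it reports using pymc rather than the hand-written code shown here. All function names and the item bank are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct answer (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_item(theta, items, administered):
    """Maximum information selection among items not yet administered."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


def log_posterior(theta, responses, items, prior_sd=1.0):
    """Log of a normal prior times the Bernoulli likelihood of the responses."""
    value = -0.5 * (theta / prior_sd) ** 2
    for index, correct in responses:
        p = p_correct(theta, *items[index])
        value += math.log(p if correct else 1.0 - p)
    return value


def map_estimate(responses, items):
    """MAP ability estimate by grid search over theta in [-4, 4], step 0.01."""
    grid = [k / 100.0 for k in range(-400, 401)]
    return max(grid, key=lambda t: log_posterior(t, responses, items))


# Hypothetical item bank: (discrimination, difficulty) pairs.
bank = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
first = select_item(0.0, bank, administered=set())
theta_hat = map_estimate([(first, True)], bank)
print(first, theta_hat)
```

Each new response requires the posterior to be re-maximized, which is the recurring cost that the abstract contrasts with a single forward pass through a trained regressor.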

Keywords: computer adaptive tests, item response theory, machine learning, neural networks

Procedia PDF Downloads 157
93 Modelling Volatility Spillovers and Cross Hedging among Major Agricultural Commodity Futures

Authors: Roengchai Tansuchat, Woraphon Yamaka, Paravee Maneejuk

Abstract:

In the recent past, the global financial crisis, economic instability, and large fluctuations in agricultural commodity prices have led to increased concerns about volatility transmission among them. The problem is further exacerbated by commodity volatility caused by other commodity price fluctuations, so that decisions on hedging strategy have become both costly and ineffective. Thus, this paper analyzes the volatility spillover effect among major agricultural commodities, including corn, soybeans, wheat and rice, to help commodity suppliers hedge their portfolios and manage their risk and co-volatility. We provide a switching regime approach to analyzing the issue of volatility spillovers in different economic conditions, namely economic upturn and downturn. In particular, we investigate relationships and volatility transmissions between these commodities in different economic conditions. We propose a Copula-based multivariate Markov Switching GARCH model with two regimes that depend on economic conditions, and we perform a simulation study to check the accuracy of our proposed model. In this study, the correlation term in the cross-hedge ratio is obtained from six copula families: two elliptical copulas (Gaussian and Student-t) and four Archimedean copulas (Clayton, Gumbel, Frank, and Joe). We use one-step maximum likelihood estimation techniques to estimate our models and compare the performance of these copulas using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). In the application study of agricultural commodities, weekly data from 4 January 2005 to 1 September 2016 are used, covering 612 observations. The empirical results indicate that the volatility spillover effects among cereal futures differ in response to different economic conditions. In addition, the hedge effectiveness results suggest the optimal cross-hedge strategies in different economic conditions, especially economic upturn and downturn.
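The copula comparison step relies on two standard formulas, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximized log-likelihood. The sketch below shows how candidate models can be ranked with them. The sample size of 612 comes from the abstract; the log-likelihoods and parameter counts are made-up placeholders, not results from the paper, and the function names are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def rank_models(fits, n_obs):
    """Return (name, AIC, BIC) tuples sorted by AIC, best model first.

    fits maps a model name to (maximized log-likelihood, number of parameters).
    """
    rows = [(name, aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()]
    return sorted(rows, key=lambda row: row[1])


# Placeholder fits for illustration only (NOT values reported in the paper).
example_fits = {
    "Gaussian": (-1510.0, 12),
    "Student-t": (-1495.0, 13),
    "Clayton": (-1520.0, 12),
}
for name, aic_value, bic_value in rank_models(example_fits, n_obs=612):
    print(f"{name:10s} AIC={aic_value:.1f} BIC={bic_value:.1f}")
```

BIC penalizes each extra parameter by ln n rather than 2, so with 612 observations it favors more parsimonious copula specifications than AIC does when the two criteria disagree.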

Keywords: agricultural commodity futures, cereal, cross-hedge, spillover effect, switching regime approach

Procedia PDF Downloads 177
92 Context-Aware Recommender Systems Using User's Emotional State

Authors: Hoyeon Park, Kyoung-jae Kim

Abstract:

Product recommendation is a field of research that has received much attention amid the recent information overload phenomenon. With the proliferation of the mobile environment and social media, recommendation results inevitably depend on how the factors of the user's situation are reflected in the recommendation process. Recently, research attention has turned to the context-aware recommender system, which reflects the user's contextual information in the recommendation process. However, until now, most context-aware recommender system research has been limited in that it reflects only the passive context of users. It is expected that users will be able to express their contextual information through their active behavior, and that the importance of context-aware recommender systems reflecting this information will increase. The purpose of this study is to propose a context-aware recommender system that can reflect the user's emotional state as active context information in the recommendation process. A context-aware recommender system can make more sophisticated recommendations by utilizing the user's contextual information, and has the advantage over existing recommender systems that the user's emotional factors can be considered. In this study, we propose a method to infer the user's emotional state, which is one element of the user's context information, from the user's facial expression data and to reflect it in the recommendation process. This study collects the facial expression data of a user who is looking at a specific product, together with the user's product preference score. Then, we classify the facial expression data into several categories according to previous research and construct a model that can predict them. Next, the predicted results are applied to existing collaborative filtering with contextual information. The results of the study show that the context-aware recommender system including facial expression information achieves improved recommendation performance. Based on the results of this study, it is expected that future research will be conducted on recommender systems reflecting various kinds of contextual information.

Keywords: context-aware, emotional state, recommender systems, business analytics

Procedia PDF Downloads 202
91 Machine Learning in Agriculture: A Brief Review

Authors: Aishi Kundu, Elhan Raza

Abstract:

"Necessity is the mother of invention" - Rapid increase in the global human population has directed the agricultural domain toward machine learning. The basic need of human beings is considered to be food which can be satisfied through farming. Farming is one of the major revenue generators for the Indian economy. Agriculture is not only considered a source of employment but also fulfils humans’ basic needs. So, agriculture is considered to be the source of employment and a pillar of the economy in developing countries like India. This paper provides a brief review of the progress made in implementing Machine Learning in the agricultural sector. Accurate predictions are necessary at the right time to boost production and to aid the timely and systematic distribution of agricultural commodities to make their availability in the market faster and more effective. This paper includes a thorough analysis of various machine learning algorithms applied in different aspects of agriculture (crop management, soil management, water management, yield tracking, livestock management, etc.).Due to climate changes, crop production is affected. Machine learning can analyse the changing patterns and come up with a suitable approach to minimize loss and maximize yield. Machine Learning algorithms/ models (regression, support vector machines, bayesian models, artificial neural networks, decision trees, etc.) are used in smart agriculture to analyze and predict specific outcomes which can be vital in increasing the productivity of the Agricultural Food Industry. It is to demonstrate vividly agricultural works under machine learning to sensor data. Machine Learning is the ongoing technology benefitting farmers to improve gains in agriculture and minimize losses. This paper discusses how the irrigation and farming management systems evolve in real-time efficiently. 
Artificial Intelligence (AI)-enabled programs are emerging that give farmers rich insight drawn from extensive analysis of data.

Keywords: machine Learning, artificial intelligence, crop management, precision farming, smart farming, pre-harvesting, harvesting, post-harvesting

Procedia PDF Downloads 80
90 Remote Vital Signs Monitoring in Neonatal Intensive Care Unit Using a Digital Camera

Authors: Fatema-Tuz-Zohra Khanam, Ali Al-Naji, Asanka G. Perera, Kim Gibson, Javaan Chahl

Abstract:

Conventional contact-based vital signs monitoring sensors such as pulse oximeters or electrocardiogram (ECG) may cause discomfort, skin damage, and infections, particularly in neonates with fragile, sensitive skin. Therefore, remote monitoring of vital signs is desired in both clinical and non-clinical settings to overcome these issues. Camera-based vital signs monitoring is a recent technology for these applications with many positive attributes. However, there are still limited camera-based studies on neonates in a clinical setting. In this study, the heart rate (HR) and respiratory rate (RR) of eight infants at the Neonatal Intensive Care Unit (NICU) in Flinders Medical Centre were remotely monitored using a digital camera, applying color and motion-based computational methods. The region-of-interest (ROI) was efficiently selected by incorporating an image decomposition method. Furthermore, spatial averaging, spectral analysis, band-pass filtering, and peak detection were also used to extract both HR and RR. The experimental results were validated against ground truth data obtained from an ECG monitor and showed a strong correlation, with Pearson correlation coefficients (PCC) of 0.9794 and 0.9412 for HR and RR, respectively. The RMSE between camera-based data and ECG data for HR and RR was 2.84 beats/min and 2.91 breaths/min, respectively. A Bland-Altman analysis of the data also showed close agreement between both data sets, with a mean bias of 0.60 beats/min and 1 breath/min, and lower and upper limits of agreement of -4.9 to +6.1 beats/min and -4.4 to +6.4 breaths/min for HR and RR, respectively. Therefore, video camera imaging may replace conventional contact-based monitoring in the NICU and has potential applications in other contexts such as home health monitoring.
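The three agreement statistics reported above (Pearson correlation, RMSE, and Bland-Altman bias with limits of agreement) can be computed as in the following sketch. The limits of agreement use the conventional bias ± 1.96 standard deviations of the paired differences, which is assumed here since the abstract does not state the multiplier. The readings are invented sample values, not data from the study, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    """Root mean square error between paired measurements."""
    return math.sqrt(mean([(a - b) ** 2 for a, b in zip(x, y)]))


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement for x versus y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Invented heart-rate readings in beats/min (illustration only).
camera_hr = [120.0, 125.0, 130.0, 135.0]
ecg_hr = [121.0, 124.0, 131.0, 134.0]

print(pearson(camera_hr, ecg_hr))
print(rmse(camera_hr, ecg_hr))
print(bland_altman(camera_hr, ecg_hr))
```

Correlation alone cannot reveal a constant offset between two devices, which is why a Bland-Altman analysis of the paired differences is usually reported alongside it.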

Keywords: neonates, NICU, digital camera, heart rate, respiratory rate, image decomposition

Procedia PDF Downloads 89
89 Improving Search Engine Performance by Removing Indexes to Malicious URLs

Authors: Durga Toshniwal, Lokesh Agrawal

Abstract:

As the web continues to play an increasing role in information exchange and in conducting daily activities, computer users have become the target of miscreants who infect hosts with malware or adware for financial gain. Unfortunately, even a single visit to a compromised web site enables the attacker to detect vulnerabilities in the user’s applications and force the download of a multitude of malware binaries. We provide an approach to effectively scan for so-called drive-by downloads on the Internet. Drive-by downloads are the result of URLs that attempt to exploit their visitors and cause malware to be installed and run automatically. To scan the web for malicious pages, the first step is to use a crawler to collect URLs that live on the Internet, and then to apply fast prefiltering techniques to reduce the number of pages that need to be examined by precise, but slower, analysis tools (such as honey clients or antivirus programs). Although the technique is effective, it requires a substantial amount of resources. A main reason is that the crawler encounters many pages on the web that are legitimate and need to be filtered out. In this paper, to characterize the nature of this rising threat, we present an implementation of a web crawler in Python: an approach that searches the web more efficiently for pages that are likely to be malicious, filtering out benign pages and passing the remaining pages to an antivirus program for malware detection. Our approach starts from an initial seed of known malicious web pages. Using these seeds, our system generates search engine queries to identify other malicious pages that are similar to the ones in the initial seed. By doing so, it leverages the crawling infrastructure of search engines to retrieve URLs that are much more likely to be malicious than a random page on the web. The results show that this guided approach is able to identify malicious web pages more efficiently than random crawling-based approaches.

Keywords: web crawler, malwares, seeds, drive-by-downloads, security

Procedia PDF Downloads 213