Search results for: patch-based similarity metric
807 An Analytical Metric and Process for Critical Infrastructure Architecture System Availability Determination in Distributed Computing Environments under Infrastructure Attack
Authors: Vincent Andrew Cappellano
Abstract:
In the early phases of critical infrastructure system design, translating distributed computing requirements to an architecture has risk given the multitude of approaches (e.g., cloud, edge, fog). In many systems, a single requirement for system uptime / availability is used to encompass the system’s intended operations. However, when architected systems may perform to those availability requirements only during normal operations and not during component failure, or during outages caused by adversary attacks on critical infrastructure (e.g., physical, cyber). System designers lack a structured method to evaluate availability requirements against candidate system architectures through deep degradation scenarios (i.e., normal ops all the way down to significant damage of communications or physical nodes). This increases risk of poor selection of a candidate architecture due to the absence of insight into true performance for systems that must operate as a piece of critical infrastructure. This research effort proposes a process to analyze critical infrastructure system availability requirements and a candidate set of systems architectures, producing a metric assessing these architectures over a spectrum of degradations to aid in selecting appropriate resilient architectures. To accomplish this effort, a set of simulation and evaluation efforts are undertaken that will process, in an automated way, a set of sample requirements into a set of potential architectures where system functions and capabilities are distributed across nodes. Nodes and links will have specific characteristics and based on sampled requirements, contribute to the overall system functionality, such that as they are impacted/degraded, the impacted functional availability of a system can be determined. A machine learning reinforcement-based agent will structurally impact the nodes, links, and characteristics (e.g., bandwidth, latency) of a given architecture to provide an assessment of system functional uptime/availability under these scenarios. By varying the intensity of the attack and related aspects, we can create a structured method of evaluating the performance of candidate architectures against each other to create a metric rating its resilience to these attack types/strategies. Through multiple simulation iterations, sufficient data will exist to compare this availability metric, and an architectural recommendation against the baseline requirements, in comparison to existing multi-factor computing architectural selection processes. It is intended that this additional data will create an improvement in the matching of resilient critical infrastructure system requirements to the correct architectures and implementations that will support improved operation during times of system degradation due to failures and infrastructure attacks.Keywords: architecture, resiliency, availability, cyber-attack
Procedia PDF Downloads 110806 Effect of Joule Heating on Chemically Reacting Micropolar Fluid Flow over Truncated Cone with Convective Boundary Condition Using Spectral Quasilinearization Method
Authors: Pradeepa Teegala, Ramreddy Chetteti
Abstract:
This work emphasizes the effects of heat generation/absorption and Joule heating on chemically reacting micropolar fluid flow over a truncated cone with convective boundary condition. For this complex fluid flow problem, the similarity solution does not exist and hence using non-similarity transformations, the governing fluid flow equations along with related boundary conditions are transformed into a set of non-dimensional partial differential equations. Several authors have applied the spectral quasi-linearization method to solve the ordinary differential equations, but here the resulting nonlinear partial differential equations are solved for non-similarity solution by using a recently developed method called the spectral quasi-linearization method (SQLM). Comparison with previously published work on special cases of the problem is performed and found to be in excellent agreement. The influence of pertinent parameters namely Biot number, Joule heating, heat generation/absorption, chemical reaction, micropolar and magnetic field on physical quantities of the flow are displayed through graphs and the salient features are explored in detail. Further, the results are analyzed by comparing with two special cases, namely, vertical plate and full cone wherever possible.Keywords: chemical reaction, convective boundary condition, joule heating, micropolar fluid, spectral quasilinearization method
Procedia PDF Downloads 348805 Cosmetic Recommendation Approach Using Machine Learning
Authors: Shakila N. Senarath, Dinesh Asanka, Janaka Wijayanayake
Abstract:
The necessity of cosmetic products is arising to fulfill consumer needs of personality appearance and hygiene. A cosmetic product consists of various chemical ingredients which may help to keep the skin healthy or may lead to damages. Every chemical ingredient in a cosmetic product does not perform on every human. The most appropriate way to select a healthy cosmetic product is to identify the texture of the body first and select the most suitable product with safe ingredients. Therefore, the selection process of cosmetic products is complicated. Consumer surveys have shown most of the time, the selection process of cosmetic products is done in an improper way by consumers. From this study, a content-based system is suggested that recommends cosmetic products for the human factors. To such an extent, the skin type, gender and price range will be considered as human factors. The proposed system will be implemented by using Machine Learning. Consumer skin type, gender and price range will be taken as inputs to the system. The skin type of consumer will be derived by using the Baumann Skin Type Questionnaire, which is a value-based approach that includes several numbers of questions to derive the user’s skin type to one of the 16 skin types according to the Bauman Skin Type indicator (BSTI). Two datasets are collected for further research proceedings. The user data set was collected using a questionnaire given to the public. Those are the user dataset and the cosmetic dataset. Product details are included in the cosmetic dataset, which belongs to 5 different kinds of product categories (Moisturizer, Cleanser, Sun protector, Face Mask, Eye Cream). An alternate approach of TF-IDF (Term Frequency – Inverse Document Frequency) is applied to vectorize cosmetic ingredients in the generic cosmetic products dataset and user-preferred dataset. Using the IF-IPF vectors, each user-preferred products dataset and generic cosmetic products dataset can be represented as sparse vectors. The similarity between each user-preferred product and generic cosmetic product will be calculated using the cosine similarity method. For the recommendation process, a similarity matrix can be used. Higher the similarity, higher the match for consumer. Sorting a user column from similarity matrix in a descending order, the recommended products can be retrieved in ascending order. Even though results return a list of similar products, and since the user information has been gathered, such as gender and the price ranges for product purchasing, further optimization can be done by considering and giving weights for those parameters once after a set of recommended products for a user has been retrieved.Keywords: content-based filtering, cosmetics, machine learning, recommendation system
Procedia PDF Downloads 135804 A Numerical Solution Based on Operational Matrix of Differentiation of Shifted Second Kind Chebyshev Wavelets for a Stefan Problem
Authors: Rajeev, N. K. Raigar
Abstract:
In this study, one dimensional phase change problem (a Stefan problem) is considered and a numerical solution of this problem is discussed. First, we use similarity transformation to convert the governing equations into ordinary differential equations with its boundary conditions. The solutions of ordinary differential equation with the associated boundary conditions and interface condition (Stefan condition) are obtained by using a numerical approach based on operational matrix of differentiation of shifted second kind Chebyshev wavelets. The obtained results are compared with existing exact solution which is sufficiently accurate.Keywords: operational matrix of differentiation, similarity transformation, shifted second kind chebyshev wavelets, stefan problem
Procedia PDF Downloads 404803 Graph Planning Based Composition for Adaptable Semantic Web Services
Authors: Rihab Ben Lamine, Raoudha Ben Jemaa, Ikram Amous Ben Amor
Abstract:
This paper proposes a graph planning technique for semantic adaptable Web Services composition. First, we use an ontology based context model for extending Web Services descriptions with information about the most suitable context for its use. Then, we transform the composition problem into a semantic context aware graph planning problem to build the optimal service composition based on user's context. The construction of the planning graph is based on semantic context aware Web Service discovery that allows for each step to add most suitable Web Services in terms of semantic compatibility between the services parameters and their context similarity with the user's context. In the backward search step, semantic and contextual similarity scores are used to find best composed Web Services list. Finally, in the ranking step, a score is calculated for each best solution and a set of ranked solutions is returned to the user.Keywords: semantic web service, web service composition, adaptation, context, graph planning
Procedia PDF Downloads 521802 A Model Based Metaheuristic for Hybrid Hierarchical Community Structure in Social Networks
Authors: Radhia Toujani, Jalel Akaichi
Abstract:
In recent years, the study of community detection in social networks has received great attention. The hierarchical structure of the network leads to the emergence of the convergence to a locally optimal community structure. In this paper, we aim to avoid this local optimum in the introduced hybrid hierarchical method. To achieve this purpose, we present an objective function where we incorporate the value of structural and semantic similarity based modularity and a metaheuristic namely bees colonies algorithm to optimize our objective function on both hierarchical level divisive and agglomerative. In order to assess the efficiency and the accuracy of the introduced hybrid bee colony model, we perform an extensive experimental evaluation on both synthetic and real networks.Keywords: social network, community detection, agglomerative hierarchical clustering, divisive hierarchical clustering, similarity, modularity, metaheuristic, bee colony
Procedia PDF Downloads 380801 Homeostatic Analysis of the Integrated Insulin and Glucagon Signaling Network: Demonstration of Bistable Response in Catabolic and Anabolic States
Authors: Pramod Somvanshi, Manu Tomar, K. V. Venkatesh
Abstract:
Insulin and glucagon are responsible for homeostasis of key plasma metabolites like glucose, amino acids and fatty acids in the blood plasma. These hormones act antagonistically to each other during the secretion and signaling stages. In the present work, we analyze the effect of macronutrients on the response from integrated insulin and glucagon signaling pathways. The insulin and glucagon pathways are connected by DAG (a calcium signaling component which is part of the glucagon signaling module) which activates PKC and inhibits IRS (insulin signaling component) constituting a crosstalk. AKT (insulin signaling component) inhibits cAMP (glucagon signaling component) through PDE3 forming the other crosstalk between the two signaling pathways. Physiological level of anabolism and catabolism is captured through a metric quantified by the activity levels of AKT and PKA in their phosphorylated states, which represent the insulin and glucagon signaling endpoints, respectively. Under resting and starving conditions, the phosphorylation metric represents homeostasis indicating a balance between the anabolic and catabolic activities in the tissues. The steady state analysis of the integrated network demonstrates the presence of a bistable response in the phosphorylation metric with respect to input plasma glucose levels. This indicates that two steady state conditions (one in the homeostatic zone and other in the anabolic zone) are possible for a given glucose concentration depending on the ON or OFF path. When glucose levels rise above normal, during post-meal conditions, the bistability is observed in the anabolic space denoting the dominance of the glycogenesis in liver. For glucose concentrations lower than the physiological levels, while exercising, metabolic response lies in the catabolic space denoting the prevalence of glycogenolysis in liver. The non-linear positive feedback of AKT on IRS in insulin signaling module of the network is the main cause of the bistable response. The span of bistability in the phosphorylation metric increases as plasma fatty acid and amino acid levels rise and eventually the response turns monostable and catabolic representing diabetic conditions. In the case of high fat or protein diet, fatty acids and amino acids have an inhibitory effect on the insulin signaling pathway by increasing the serine phosphorylation of IRS protein via the activation of PKC and S6K, respectively. Similar analysis was also performed with respect to input amino acid and fatty acid levels. This emergent property of bistability in the integrated network helps us understand why it becomes extremely difficult to treat obesity and diabetes when blood glucose level rises beyond a certain value.Keywords: bistability, diabetes, feedback and crosstalk, obesity
Procedia PDF Downloads 277800 Particle and Photon Trajectories near the Black Hole Immersed in the Nonstatic Cosmological Background
Authors: Elena M. Kopteva, Pavlina Jaluvkova, Zdenek Stuchlik
Abstract:
The question of constructing a consistent model of the cosmological black hole remains to be unsolved and still attracts the interest of cosmologists as far as it is important in a wide set of research problems including the problem of the black hole horizon dynamics, the problem of interplay between cosmological expansion and local gravity, the problem of structure formation in the early universe etc. In this work, the model of the cosmological black hole is built on the basis of the exact solution of the Einstein equations for the spherically symmetric inhomogeneous dust distribution in the approach of the mass function use. Possible trajectories for massive particles and photons near the black hole immersed in the nonstatic dust cosmological background are investigated in frame of the obtained model. The reference system of distant galaxy comoving to cosmological expansion combined with curvature coordinates is used, so that the resulting metric becomes nondiagonal and involves both proper ‘cosmological’ time and curvature spatial coordinates. For this metric the geodesic equations are analyzed for the test particles and photons, and the respective trajectories are built.Keywords: exact solutions for Einstein equations, Lemaitre-Tolman-Bondi solution, cosmological black holes, particle and photon trajectories
Procedia PDF Downloads 340799 Evidence Theory Enabled Quickest Change Detection Using Big Time-Series Data from Internet of Things
Authors: Hossein Jafari, Xiangfang Li, Lijun Qian, Alexander Aved, Timothy Kroecker
Abstract:
Traditionally in sensor networks and recently in the Internet of Things, numerous heterogeneous sensors are deployed in distributed manner to monitor a phenomenon that often can be model by an underlying stochastic process. The big time-series data collected by the sensors must be analyzed to detect change in the stochastic process as quickly as possible with tolerable false alarm rate. However, sensors may have different accuracy and sensitivity range, and they decay along time. As a result, the big time-series data collected by the sensors will contain uncertainties and sometimes they are conflicting. In this study, we present a framework to take advantage of Evidence Theory (a.k.a. Dempster-Shafer and Dezert-Smarandache Theories) capabilities of representing and managing uncertainty and conflict to fast change detection and effectively deal with complementary hypotheses. Specifically, Kullback-Leibler divergence is used as the similarity metric to calculate the distances between the estimated current distribution with the pre- and post-change distributions. Then mass functions are calculated and related combination rules are applied to combine the mass values among all sensors. Furthermore, we applied the method to estimate the minimum number of sensors needed to combine, so computational efficiency could be improved. Cumulative sum test is then applied on the ratio of pignistic probability to detect and declare the change for decision making purpose. Simulation results using both synthetic data and real data from experimental setup demonstrate the effectiveness of the presented schemes.Keywords: CUSUM, evidence theory, kl divergence, quickest change detection, time series data
Procedia PDF Downloads 335798 Impact Factor Analysis for Spatially Varying Aerosol Optical Depth in Wuhan Agglomeration
Authors: Wenting Zhang, Shishi Liu, Peihong Fu
Abstract:
As an indicator of air quality and directly related to concentration of ground PM2.5, the spatial-temporal variation and impact factor analysis of Aerosol Optical Depth (AOD) have been a hot spot in air pollution. This paper concerns the non-stationarity and the autocorrelation (with Moran’s I index of 0.75) of the AOD in Wuhan agglomeration (WHA), in central China, uses the geographically weighted regression (GRW) to identify the spatial relationship of AOD and its impact factors. The 3 km AOD product of Moderate Resolution Imaging Spectrometer (MODIS) is used in this study. Beyond the economic-social factor, land use density factors, vegetable cover, and elevation, the landscape metric is also considered as one factor. The results suggest that the GWR model is capable of dealing with spatial varying relationship, with R square, corrected Akaike Information Criterion (AICc) and standard residual better than that of ordinary least square (OLS) model. The results of GWR suggest that the urban developing, forest, landscape metric, and elevation are the major driving factors of AOD. Generally, the higher AOD trends to located in the place with higher urban developing, less forest, and flat area.Keywords: aerosol optical depth, geographically weighted regression, land use change, Wuhan agglomeration
Procedia PDF Downloads 357797 The Influence of Audio on Perceived Quality of Segmentation
Authors: Silvio Ricardo Rodrigues Sanches, Bianca Cogo Barbosa, Beatriz Regina Brum, Cléber Gimenez Corrêa
Abstract:
To evaluate the quality of a segmentation algorithm, the authors use subjective or objective metrics. Although subjective metrics are more accurate than objective ones, objective metrics do not require user feedback to test an algorithm. Objective metrics require subjective experiments only during their development. Subjective experiments typically display to users some videos (generated from frames with segmentation errors) that simulate the environment of an application domain. This user feedback is crucial information for metric definition. In the subjective experiments applied to develop some state-of-the-art metrics used to test segmentation algorithms, the videos displayed during the experiments did not contain audio. Audio is an essential component in applications such as videoconference and augmented reality. If the audio influences the user’s perception, using only videos without audio in subjective experiments can compromise the efficiency of an objective metric generated using data from these experiments. This work aims to identify if the audio influences the user’s perception of segmentation quality in background substitution applications with audio. The proposed approach used a subjective method based on formal video quality assessment methods. The results showed that audio influences the quality of segmentation perceived by a user.Keywords: background substitution, influence of audio, segmentation evaluation, segmentation quality
Procedia PDF Downloads 118796 Enhancement of Genetic Diversity through Cross Breeding of Two Catfish (Heteropneustes fossilis and Clarias batrachus) in Bangladesh
Authors: M. F. Miah, A. Chakrabarty
Abstract:
Two popular and highly valued fish, Stinging catfish (Heteropneustes fossilis) and Asian catfish (Clarias batrachus) are considered for observing genetic enhancement. Cross breeding was performed considering wild and farmed fish through inducing agent. Five RAPD markers were used to assess genetic diversity among parents and offspring of these two catfish for evaluating genetic enhancement in F1 generation. Considering different genetic data such as banding pattern of DNA, polymorphic loci, polymorphic information content (PIC), inter individual pair wise similarity, Nei genetic similarity, genetic distance, phylogenetic relationships, allele frequency, genotype frequency, intra locus gene diversity and average gene diversity of parents and offspring of these two fish were analyzed and finally in both cases higher genetic diversity was found in F1 generation than the parents.Keywords: Heteropneustes fossilis, Clarias batrachus, cross breeding, genetic enhancement
Procedia PDF Downloads 252795 Conformal Invariance and F(R,T) Gravity
Authors: P. Y. Tsyba, O. V. Razina, E. Güdekli, R. Myrzakulov
Abstract:
In this paper, we consider the equation of motion for the F(R,T) gravity on their property of conformal invariance. It is shown that in the general case such a theory is not conformally invariant. Special cases for the functions v and u, in which the properties of the theory can appear, were studied.Keywords: conformal invariance, gravity, space-time, metric
Procedia PDF Downloads 664794 Coastal Hydraulic Modelling to Ascertain Stability of Rubble Mound Breakwater
Authors: Safari Mat Desa, Othman A. Karim, Mohd Kamarulhuda Samion, Saiful Bahri Hamzah
Abstract:
Rubble mound breakwater was one of the most popular designs in Malaysia, constructed at the river mouth to dissipate the incoming wave energy from the seaward. Geometrically characteristics in trapezoid, crest width, and bottom width will determine the hypotonus stability, whilst structural height was designed for wave overtopping consideration. Physical hydraulic modelling in two-dimensional facilities was instigated in the flume to test the stability as well as the overtopping rate complied with the method of similarity, namely kinematic, dynamic, and geometric. Scaling effects of wave characteristics were carried out in order to acquire significant interaction of wave height, wave period, and water depth. Results showed two-dimensional physical modelling has proven reliable capability to ascertain breakwater stability significantly.Keywords: breakwater, geometrical characteristic, wave overtopping, physical hydraulic modelling, method of similarity, wave characteristic
Procedia PDF Downloads 118793 Optimization of Spatial Light Modulator to Generate Aberration Free Optical Traps
Authors: Deepak K. Gupta, T. R. Ravindran
Abstract:
Holographic Optical Tweezers (HOTs) in general use iterative algorithms such as weighted Gerchberg-Saxton (WGS) to generate multiple traps, which produce traps with 99% uniformity theoretically. But in experiments, it is the phase response of the spatial light modulator (SLM) which ultimately determines the efficiency, uniformity, and quality of the trap spots. In general, SLMs show a nonlinear phase response behavior, and they may even have asymmetric phase modulation depth before and after π. This affects the resolution with which the gray levels are addressed before and after π, leading to a degraded trap performance. We present a method to optimize the SLM for a linear phase response behavior along with a symmetric phase modulation depth around π. Further, we optimize the SLM for its varying phase response over different spatial regions by optimizing the brightness/contrast and gamma of the hologram in different subsections. We show the effect of the optimization on an array of trap spots resulting in improved efficiency and uniformity. We also calculate the spot sharpness metric and trap performance metric and show a tightly focused spot with reduced aberration. The trap performance is compared by calculating the trap stiffness of a trapped particle in a given trap spot before and after aberration correction. The trap stiffness is found to improve by 200% after the optimization.Keywords: spatial light modulator, optical trapping, aberration, phase modulation
Procedia PDF Downloads 190792 Computational Analysis of Potential Inhibitors Selected Based on Structural Similarity for the Src SH2 Domain
Authors: W. P. Hu, J. V. Kumar, Jeffrey J. P. Tsai
Abstract:
The inhibition of SH2 domain regulated protein-protein interactions is an attractive target for developing an effective chemotherapeutic approach in the treatment of disease. Molecular simulation is a useful tool for developing new drugs and for studying molecular recognition. In this study, we searched potential drug compounds for the inhibition of SH2 domain by performing structural similarity search in PubChem Compound Database. A total of 37 compounds were screened from the database, and then we used the LibDock docking program to evaluate the inhibition effect. The best three compounds (AP22408, CID 71463546 and CID 9917321) were chosen for MD simulations after the LibDock docking. Our results show that the compound CID 9917321 can produce a more stable protein-ligand complex compared to other two currently known inhibitors of Src SH2 domain. The compound CID 9917321 may be useful for the inhibition of SH2 domain based on these computational results. Subsequently experiments are needed to verify the effect of compound CID 9917321 on the SH2 domain in the future studies.Keywords: nonpeptide inhibitor, Src SH2 domain, LibDock, molecular dynamics simulation
Procedia PDF Downloads 270791 Garment Industry Development in South East Asia and Competitiveness
Authors: P. Nayak, Shakeel Shaikh
Abstract:
In this paper, we analyse the apparel export performance of Southeast Asian Nations (ASEAN) in the world market. The study covers the 2003-2012 period at the sector as well as product levels (6 digit HS) and analysis is based HS 2002 nomenclature. We measure export similarity among Southeast Asian nations for the apparel sector (two digit HS-61 & 62), besides analysing the products performance in the world through Revealed Comparative Advantage (RCA) technique. Coupled with RCA, the price as a factor of competitiveness was examined from the available Unit Value Realizations (UVR). Further to this, the resource availability or outsourced from the region was considered as an extension to the analysis of competitiveness between the nations. With the help of these methodologies, we examine the degree of competition between the exports of southeast nations in the world market. Our results show that Cambodia, Indonesia, Thailand, and Vietnam are well performing states within ASEAN. The paper further delves into sustainability of the export performing countries within ASEAN.Keywords: export competitiveness, export similarity index, revealed comparative advantage, unit value realisation
Procedia PDF Downloads 286790 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers
Authors: Yogendra Sisodia
Abstract:
Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task: MiniLM and Roberta, and further fine-tuned them on Legal Contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity
Procedia PDF Downloads 109789 Influence of Convective Boundary Condition on Chemically Reacting Micropolar Fluid Flow over a Truncated Cone Embedded in Porous Medium
Authors: Pradeepa Teegala, Ramreddy Chitteti
Abstract:
This article analyzes the mixed convection flow of chemically reacting micropolar fluid over a truncated cone embedded in non-Darcy porous medium with convective boundary condition. In addition, heat generation/absorption and Joule heating effects are taken into consideration. The similarity solution does not exist for this complex fluid flow problem, and hence non-similarity transformations are used to convert the governing fluid flow equations along with related boundary conditions into a set of nondimensional partial differential equations. Many authors have been applied the spectral quasi-linearization method to solve the ordinary differential equations, but here the resulting nonlinear partial differential equations are solved for non-similarity solution by using a recently developed method called the spectral quasi-linearization method (SQLM). Comparison with previously published work on special cases of the problem is performed and found to be in excellent agreement. The effect of pertinent parameters namely, Biot number, mixed convection parameter, heat generation/absorption, Joule heating, Forchheimer number, chemical reaction, micropolar and magnetic field on physical quantities of the flow are displayed through graphs and the salient features are explored in detail. Further, the results are analyzed by comparing with two special cases, namely, vertical plate and full cone wherever possible.Keywords: chemical reaction, convective boundary condition, joule heating, micropolar fluid, mixed convection, spectral quasi-linearization method
Procedia PDF Downloads 277788 Hierarchical Piecewise Linear Representation of Time Series Data
Authors: Vineetha Bettaiah, Heggere S. Ranganath
Abstract:
This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and general shape of time series. The representation permits course-fine processing at different levels of details, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation
Procedia PDF Downloads 276787 A Deep Learning Based Approach for Dynamically Selecting Pre-processing Technique for Images
Authors: Revoti Prasad Bora, Nikita Katyal, Saurabh Yadav
Abstract:
Pre-processing plays an important role in various image processing applications. Most of the time due to the similar nature of images, a particular pre-processing or a set of pre-processing steps are sufficient to produce the desired results. However, in the education domain, there is a wide variety of images in various aspects like images with line-based diagrams, chemical formulas, mathematical equations, etc. Hence a single pre-processing or a set of pre-processing steps may not yield good results. Therefore, a Deep Learning based approach for dynamically selecting a relevant pre-processing technique for each image is proposed. The proposed method works as a classifier to detect hidden patterns in the images and predicts the relevant pre-processing technique needed for the image. This approach experimented for an image similarity matching problem but it can be adapted to other use cases too. Experimental results showed significant improvement in average similarity ranking with the proposed method as opposed to static pre-processing techniques.Keywords: deep-learning, classification, pre-processing, computer vision, image processing, educational data mining
Procedia PDF Downloads 166786 Machine Learning Driven Analysis of Kepler Objects of Interest to Identify Exoplanets
Authors: Akshat Kumar, Vidushi
Abstract:
This paper identifies 27 KOIs, 26 of which are currently classified as candidates and one as false positives that have a high probability of being confirmed. For this purpose, 11 machine learning algorithms were implemented on the cumulative kepler dataset sourced from the NASA exoplanet archive; it was observed that the best-performing model was HistGradientBoosting and XGBoost with a test accuracy of 93.5%, and the lowest-performing model was Gaussian NB with a test accuracy of 54%, to test model performance F1, cross-validation score and RUC curve was calculated. Based on the learned models, the significant characteristics for confirm exoplanets were identified, putting emphasis on the object’s transit and stellar properties; these characteristics were namely koi_count, koi_prad, koi_period, koi_dor, koi_ror, and koi_smass, which were later considered to filter out the potential KOIs. The paper also calculates the Earth similarity index based on the planetary radius and equilibrium temperature for each KOI identified to aid in their classification.Keywords: Kepler objects of interest, exoplanets, space exploration, machine learning, earth similarity index, transit photometry
Procedia PDF Downloads 76785 A Metric to Evaluate Conventional and Electrified Vehicles in Terms of Customer-Oriented Driving Dynamics
Authors: Stephan Schiffer, Andreas Kain, Philipp Wilde, Maximilian Helbing, Bernard Bäker
Abstract:
Automobile manufacturers progressively focus on a downsizing strategy to meet the EU's CO2 requirements concerning type-approval consumption cycles. The reduction in naturally aspirated engine power is compensated by increased levels of turbocharging. By downsizing conventional engines, CO2 emissions are reduced. However, it also implicates major challenges regarding longitudinal dynamic characteristics. An example of this circumstance is the delayed turbocharger-induced torque reaction which leads to a partially poor response behavior of the vehicle during acceleration operations. That is why it is important to focus conventional drive train design on real customer driving again. The currently considered dynamic maneuvers like the acceleration time 0-100 km/h discussed by journals and car manufacturers describe longitudinal dynamics experienced by a driver inadequately. For that reason we present the realization and evaluation of a comprehensive proband study. Subjects are provided with different vehicle concepts (electrified vehicles, vehicles with naturally aspired engines and vehicles with different concepts of turbochargers etc.) in order to find out which dynamic criteria are decisive for a subjectively strong acceleration and response behavior of a vehicle. Subsequently, realistic acceleration criteria are derived. By weighing the criteria an evaluation metric is developed to objectify customer-oriented transient dynamics. Fully-electrified vehicles are the benchmark in terms of customer-oriented longitudinal dynamics. The electric machine provides the desired torque almost without delay. This advantage compared to combustion engines is especially noticeable at low engine speeds. In conclusion, we will show the degree to which extent customer-relevant longitudinal dynamics of conventional vehicles can be approximated to electrified vehicle concepts. Therefore, various technical measures (turbocharger concepts, 48V electrical chargers etc.) and drive train designs (e.g. varying the final drive) are presented and evaluated in order to strengthen the vehicle’s customer-relevant transient dynamics. As a rating size the newly developed evaluation metric will be used.Keywords: 48V, customer-oriented driving dynamics, electric charger, electrified vehicles, vehicle concepts
Procedia PDF Downloads 407784 Detecting Paraphrases in Arabic Text
Authors: Amal Alshahrani, Allan Ramsay
Abstract:
Paraphrasing is one of the important tasks in natural language processing; i.e. alternative ways to express the same concept by using different words or phrases. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases we create a system that automatically extracts paraphrases from a corpus, which is built from different sources of news article since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment technique (e.g. Dynamic Time Warping (DTW)) for extracting paraphrase which have been applied to the English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic language, due to the presence of phenomena which are not present in English, such as Free Word Order, Zero copula, and Pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analysis how the existing algorithms for English fail for Arabic then we can find a solution for Arabic. The results are promising.Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)
Procedia PDF Downloads 388783 The Gender Dialectic in Mothers and Daughters’ Relationships
Authors: Ronit Even Zahav
Abstract:
Objectives: Mother-daughter relationships are often portrayed as one of the most constitutive ties that shape women's identities throughout their lives. Yet, to the best of author’s knowledge, only few studies examine mother-daughter relationships in adulthood in the context of cross-cultural transition. Most of them focus on the mother-daughter relationship among one origin group. Hence, the existing knowledge about these relationships in adulthood, in the context of intercultural transition and encounters between different cultures, remain limited. Based on a critical feminist approach critical and cultural perspectives the current study focuses on a cross-cultural comparison of adult mother-daughter relationships among three groups of origin: Ethiopia, Russia, and Israel. The study aimed to: Explore the voices of women participating in a mother-daughter discourse in the context of gender and ethnicity; examine the differences in the mother-daughter relationship through number of factors (e.g. expectations of similarity and difference, perceptions of gender roles, gender identity, emotional closeness, sharing and stress) and finally, to develop a gender informed tool for understanding the gender dialectic in mother-daughter relationship in the context of cross cultural transitions. Method: 37 dyads of mothers and adult daughters participated in a qualitative study. A semi-structured interview was conducted that included questions about socio-demographic characteristics, language proficiency, social distance, closeness, emotional stress, and expectations of similarity and difference in mother-daughter relationships. Results: Analysis of the findings yielded three relationship patterns of gender dialectic and expectations of similarity and difference that characterize the groups of origin. Ethiopian mothers reported more sharing their daughters, fewer expectations of similarity, and felt more stress in the relationship compered to women from the two other origin groups. Conclusions: The study highlighted the impact of intercultural transition and social exclusion on mother-daughter relationships in adulthood in the context of the gender dialectic and women’s status in society. The presentation will explore the findings that were brought up by participants. The discussion will focus on the practices related to gender dialectic and intersecting inequalities regarding diverse groups and discuss gender development reducing inequalities and promoting empowerment to transform oppressive conditions.Keywords: gender informed perspectives, gender dialectic, mother-daughter relationships, multiculturalism
Procedia PDF Downloads 67782 Clustering Ethno-Informatics of Naming Village in Java Island Using Data Mining
Authors: Atje Setiawan Abdullah, Budi Nurani Ruchjana, I. Gede Nyoman Mindra Jaya, Eddy Hermawan
Abstract:
Ethnoscience is used to see the culture with a scientific perspective, which may help to understand how people develop various forms of knowledge and belief, initially focusing on the ecology and history of the contributions that have been there. One of the areas studied in ethnoscience is etno-informatics, is the application of informatics in the culture. In this study the science of informatics used is data mining, a process to automatically extract knowledge from large databases, to obtain interesting patterns in order to obtain a knowledge. While the application of culture described by naming database village on the island of Java were obtained from Geographic Indonesia Information Agency (BIG), 2014. The purpose of this study is; first, to classify the naming of the village on the island of Java based on the structure of the word naming the village, including the prefix of the word, syllable contained, and complete word. Second to classify the meaning of naming the village based on specific categories, as well as its role in the community behavioral characteristics. Third, how to visualize the naming of the village to a map location, to see the similarity of naming villages in each province. In this research we have developed two theorems, i.e theorems area as a result of research studies have collected intersection naming villages in each province on the island of Java, and the composition of the wedge theorem sets the provinces in Java is used to view the peculiarities of a location study. The methodology in this study base on the method of Knowledge Discovery in Database (KDD) on data mining, the process includes preprocessing, data mining and post processing. The results showed that the Java community prioritizes merit in running his life, always working hard to achieve a more prosperous life, and love as well as water and environmental sustainment. Naming villages in each location adjacent province has a high degree of similarity, and influence each other. Cultural similarities in the province of Central Java, East Java and West Java-Banten have a high similarity, whereas in Jakarta-Yogyakarta has a low similarity. This research resulted in the cultural character of communities within the meaning of the naming of the village on the island of Java, this character is expected to serve as a guide in the behavior of people's daily life on the island of Java.Keywords: ethnoscience, ethno-informatics, data mining, clustering, Java island culture
Procedia PDF Downloads 283781 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks
Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar
Abstract:
DNA Barcode, a short mitochondrial DNA fragment, made up of three subunits; a phosphate group, sugar and nucleic bases (A, T, C, and G). They provide good sources of information needed to classify living species. Such intuition has been confirmed by many experimental results. Species classification with DNA Barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their Barcode. This task has to be supported with reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use similarity sequence methods. A large set of sequences can be simultaneously compared using Multiple Sequence Alignment which is known to be NP-complete. To make this type of analysis feasible, heuristics, like progressive alignment, have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This method permits to avoid the complex problem of form and structure in different classes of organisms. On empirical data and their classification performances are compared with other methods. Our system consists of three phases. The first is called transformation, which is composed of three steps; Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA Barcodes, Fourier Transform and Power Spectrum Signal Processing. The second is called approximation, which is empowered by the use of Multi Llibrary Wavelet Neural Networks (MLWNN).The third is called the classification of DNA Barcodes, which is realized by applying the algorithm of hierarchical classification.Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)
Procedia PDF Downloads 319780 Data Clustering Algorithm Based on Multi-Objective Periodic Bacterial Foraging Optimization with Two Learning Archives
Authors: Chen Guo, Heng Tang, Ben Niu
Abstract:
Clustering splits objects into different groups based on similarity, making the objects have higher similarity in the same group and lower similarity in different groups. Thus, clustering can be treated as an optimization problem to maximize the intra-cluster similarity or inter-cluster dissimilarity. In real-world applications, the datasets often have some complex characteristics: sparse, overlap, high dimensionality, etc. When facing these datasets, simultaneously optimizing two or more objectives can obtain better clustering results than optimizing one objective. However, except for the objectives weighting methods, traditional clustering approaches have difficulty in solving multi-objective data clustering problems. Due to this, evolutionary multi-objective optimization algorithms are investigated by researchers to optimize multiple clustering objectives. In this paper, the Data Clustering algorithm based on Multi-objective Periodic Bacterial Foraging Optimization with two Learning Archives (DC-MPBFOLA) is proposed. Specifically, first, to reduce the high computing complexity of the original BFO, periodic BFO is employed as the basic algorithmic framework. Then transfer the periodic BFO into a multi-objective type. Second, two learning strategies are proposed based on the two learning archives to guide the bacterial swarm to move in a better direction. On the one hand, the global best is selected from the global learning archive according to the convergence index and diversity index. On the other hand, the personal best is selected from the personal learning archive according to the sum of weighted objectives. According to the aforementioned learning strategies, a chemotaxis operation is designed. Third, an elite learning strategy is designed to provide fresh power to the objects in two learning archives. When the objects in these two archives do not change for two consecutive times, randomly initializing one dimension of objects can prevent the proposed algorithm from falling into local optima. Fourth, to validate the performance of the proposed algorithm, DC-MPBFOLA is compared with four state-of-art evolutionary multi-objective optimization algorithms and one classical clustering algorithm on evaluation indexes of datasets. To further verify the effectiveness and feasibility of designed strategies in DC-MPBFOLA, variants of DC-MPBFOLA are also proposed. Experimental results demonstrate that DC-MPBFOLA outperforms its competitors regarding all evaluation indexes and clustering partitions. These results also indicate that the designed strategies positively influence the performance improvement of the original BFO.Keywords: data clustering, multi-objective optimization, bacterial foraging optimization, learning archives
Procedia PDF Downloads 141779 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality
Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye
Abstract:
When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is chosen either ad-hoc or is calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language, in genomics, especially in genome sequence processing, unlike in natural language processing, there is no notion of a “word,” but rather, there are sequence substrings of length k called k-mers. K-mers sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using the matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and their embedding dimension. This is completed by studying the scaling capability of three embedding algorithms, namely Latent Semantic analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurate computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.Keywords: word embeddings, k-mer embedding, dimensionality reduction
Procedia PDF Downloads 140778 Complex Network Analysis of Seismicity and Applications to Short-Term Earthquake Forecasting
Authors: Kahlil Fredrick Cui, Marissa Pastor
Abstract:
Earthquakes are complex phenomena, exhibiting complex correlations in space, time, and magnitude. Recently, the concept of complex networks has been used to shed light on the statistical and dynamical characteristics of regional seismicity. In this work, we study the relationships and interactions of seismic regions in Chile, Japan, and the Philippines through weighted and directed complex network analysis. Geographical areas are digitized into cells of fixed dimensions which in turn become the nodes of the network when an earthquake has occurred therein. Nodes are linked if a correlation exists between them as determined and measured by a correlation metric. The networks are found to be scale-free, exhibiting power-law behavior in the distributions of their different centrality measures: the in- and out-degree and the in- and out-strength. The evidence is also found of preferential interaction between seismically active regions through their degree-degree correlations suggesting that seismicity is dictated by the activity of a few active regions. The importance of a seismic region to the overall seismicity is measured using a generalized centrality metric taken to be an indicator of its activity or passivity. The spatial distribution of earthquake activity indicates the areas where strong earthquakes have occurred in the past while the passivity distribution points toward the likely locations an earthquake would occur whenever another one happens elsewhere. Finally, we propose a method that would project the location of the next possible earthquake using the generalized centralities coupled with correlations calculated between the latest earthquakes and a geographical point in the future.Keywords: complex networks, correlations, earthquake, hazard assessment
Procedia PDF Downloads 214