Search results for: cosine similarity
661 Optimized and Secured Digital Watermarking Using Entropy, Chaotic Grid Map and Its Performance Analysis
Authors: R. Rama Kishore, Sunesh
Abstract:
This paper presents an optimized, robust, and secured watermarking technique. The methodology used in this work is the combination of entropy and chaotic grid map. The proposed methodology applies the Discrete Cosine Transform (DCT) to the host image. To improve the imperceptibility of the method, the host image DCT blocks where the watermark is to be embedded are further optimized by considering the entropy of the blocks. A chaotic grid is used as a key to reorder the DCT blocks, which further increases security in selecting the watermark embedding locations and their sequence. Without the key, one cannot reveal the exact watermark from the watermarked image. The proposed method is implemented on four different images. The proposed method gives better results in terms of imperceptibility, measured through PSNR, which is found to be above 50. To demonstrate the effectiveness of the method, a performance analysis is carried out after applying different attacks to the watermarked images. The methodology is found to be very strong against the JPEG compression attack, even with the quality parameter up to 15. The experimental results confirm that the combination of entropy and chaotic grid map is strong and secure against different image processing attacks.
Keywords: digital watermarking, discrete cosine transform, chaotic grid map, entropy
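The abstract above reports imperceptibility as a PSNR value. As a minimal sketch of how PSNR is commonly computed (assuming 8-bit grayscale images held as nested lists; the function name `psnr` and the toy images are illustrative and not taken from the paper):

```python
import math


def psnr(original, modified, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images."""
    # Squared pixel differences over the whole image.
    squared_errors = [
        (a - b) ** 2
        for row_a, row_b in zip(original, modified)
        for a, b in zip(row_a, row_b)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel of the "watermarked" image differs by 1 grey level.
host = [[100] * 8 for _ in range(8)]
marked = [[101] * 8 for _ in range(8)]
print(round(psnr(host, marked), 2))  # 48.13
```

A higher PSNR means the watermarked image is closer to the host image, which is why it serves as an imperceptibility measure.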
Procedia PDF Downloads 252
660 Conductivity-Depth Inversion of Large Loop Transient Electromagnetic Sounding Data over Layered Earth Models
Authors: Ravi Ande, Mousumi Hazari
Abstract:
One of the common geophysical techniques for mapping subsurface geo-electrical structures, extensive hydro-geological research, and engineering and environmental geophysics applications is the use of time domain electromagnetic (TDEM)/transient electromagnetic (TEM) soundings. A large transmitter loop for energising the ground and a small receiver loop or magnetometer for recording the transient voltage or magnetic field in the air or on the surface of the earth, with the receiver at the center of the loop or at any random point inside or outside the source loop, make up a large loop TEM system. In general, one can acquire data using one of the configurations with a large loop source, namely, with the receiver at the center point of the loop (central loop method), at an arbitrary in-loop point (in-loop method), coincident with the transmitter loop (coincidence-loop method), and at an arbitrary offset loop point (offset-loop method), respectively. Because of the mathematical simplicity associated with the expressions of EM fields, as compared to the in-loop and offset-loop systems, the central loop system (for ground surveys) and coincident loop system (for ground as well as airborne surveys) have been developed and used extensively for the exploration of mineral and geothermal resources, for mapping contaminated groundwater caused by hazardous waste and thickness of permafrost layer. Because a proper analytical expression for the TEM response over the layered earth model for the large loop TEM system does not exist, the forward problem used in this inversion scheme is first formulated in the frequency domain and then it is transformed in the time domain using Fourier cosine or sine transforms. Using the EMLCLLER algorithm, the forward computation is initially carried out in the frequency domain. 
As a result, the EMLCLLER modified the forward calculation scheme in NLSTCI to compute frequency domain answers before converting them to the time domain using Fourier cosine and/or sine transforms.
Keywords: time domain electromagnetic (TDEM), TEM system, geoelectrical sounding structure, Fourier cosine
Procedia PDF Downloads 91
659 Positive-Negative Asymmetry in the Evaluations of Political Candidates: The Mediating Role of Affect in the Relationship between Cognitive Evaluation and Voting Intention
Authors: Magdalena Jablonska, Andrzej Falkowski
Abstract:
The negativity effect is one of the most intriguing and well-studied psychological phenomena that can be observed in many areas of human life. The aim of the following study is to investigate how valence framing and positive and negative information about political candidates affect judgments about similarity to an ideal and bad politician. Based on the theoretical framework of features of similarity, it is hypothesized that negative features have a stronger effect on similarity judgments than positive features of comparable value. Furthermore, the mediating role of affect is tested. Method: One hundred sixty-one people took part in an experimental study. Participants were divided into 6 research conditions that differed in the reference point (positive vs negative framing) and the number of favourable and unfavourable information items about political candidates (a positive, neutral and negative candidate profile). In positive framing condition, the concept of an ideal politician was primed; in the negative condition, participants were to think about a bad politician. The effect of independent variables on similarity judgments, affective evaluation, and voting intention was tested. Results: In the positive condition, the analysis showed that the negative effect of additional unfavourable features was greater than the positive effect of additional favourable features in judgements about similarity to the ideal candidate. In negative framing condition, ANOVA was insignificant, showing that neither the addition of positive features nor additional negative information had a significant impact on the similarity to a bad political candidate. To explain this asymmetry, two mediational analyses were conducted that tested the mediating role of affect in the relationship between similarity judgments and voting intention. In both situations the mediating effect was significant, but the comparison of two models showed that the mediation was stronger for a negative framing. 
Discussion: The research supports the negativity effect and attempts to explain the psychological mechanism behind the positive-negative asymmetry. The results of mediation analyses point to a stronger mediating role of affect in the relationship between cognitive evaluation and voting intention. Such a result suggests that negative comparisons, leading to the activation of negative features, give rise to stronger emotions than positive features of comparable strength. The findings are in line with positive-negative asymmetry; however, by adopting Tversky’s framework of features of similarity, the study integrates the cognitive mechanism of the negativity effect delineated in the contrast model of similarity with its emotional component resulting from the asymmetrical effect of positive and negative emotions on decision-making.
Keywords: affect, framing, negativity effect, positive-negative asymmetry, similarity judgements
Procedia PDF Downloads 195
658 Alignment in Earnings Management Research: Italy Looking towards US
Authors: Giulia Leoni, Cristina Florio
Abstract:
The paper aims to investigate the factors driving the increasing alignment of Italian earnings management (EM) research to US research in the same field. After characterizing the progressive similarity of Italian EM research to US research by means of a historical comparison, the paper relies on a subsequent secondary source analysis to detect the possible causes of said alignment. Having identified that the alignment increased over three successive periods, the paper analyses and discusses this incremental similarity according to new institutional sociology (NIS) and highlights the presence of different combinations of isomorphic pressures that help explain this incremental similarity. The paper contributes to the institutional literature by providing evidence of isomorphism in academic research; it also contributes to accounting research by indicating the forces that are able to drive change and development in accounting research at national and international level. The paper also enlarges the explanatory value of NIS in alternative contexts, like academic accounting research.
Keywords: accounting research, earnings management, international comparison, Italy, new institutional sociology, US
Procedia PDF Downloads 572
657 Prediction of Bubbly Plume Characteristics Using the Self-Similarity Model
Authors: Li Chen, Alex Skvortsov, Chris Norwood
Abstract:
Gas release into water can be found in many industrial situations. This process results in the formation of bubbles and acoustic emission, which depends upon the bubble characteristics. If the bubble creation rates (bubble volume flow rate) are of interest, an inverse method has to be used based on the measurement of acoustic emission. However, there will be sound attenuation through the bubbly plume, which will influence the measurement and should be taken into consideration in the model. The sound transmission through the bubbly plume depends on the characteristics of the bubbly plume, such as the shape and the bubble distributions. In this study, the bubbly plume shape is modelled using a self-similarity model, which has normally been applied to a single-phase buoyant plume. The prediction is compared with the experimental data. It has been found that the model can be applied to a buoyant plume of gas-liquid mixture. The influence of the gas flow rate and discharge nozzle size is studied.
Keywords: bubbly plume, buoyant plume, bubble acoustics, self-similarity model
Procedia PDF Downloads 285
656 Evaluation and Compression of Different Language Transformer Models for Semantic Textual Similarity Binary Task Using Minority Language Resources
Authors: Ma. Gracia Corazon Cayanan, Kai Yuen Cheong, Li Sha
Abstract:
Training a language model for a minority language has been a challenging task. The lack of available corpora to train and fine-tune state-of-the-art language models is still a challenge in the area of Natural Language Processing (NLP). Moreover, the need for high computational resources and bulk data limits the attainment of this task. In this paper, we present the following contributions: (1) we introduce and use a translation pair set of Tagalog and English (TL-EN) in pre-training a language model for a minority language resource; (2) we fine-tune and evaluate top-ranking pre-trained semantic textual similarity binary task (STSB) models on both TL-EN and STS dataset pairs; (3) we then reduce the size of the model to offset the need for high computational resources. Based on our results, the models that were pre-trained on translation pairs and STS pairs can perform well on the STSB task. Also, reducing the model to a smaller dimension has no negative effect on the performance but rather yields a notable increase in the similarity scores. Moreover, pre-training on a similar dataset has a tremendous effect on the model’s performance scores.
Keywords: semantic matching, semantic textual similarity binary task, low resource minority language, fine-tuning, dimension reduction, transformer models
Procedia PDF Downloads 209
655 Analysis of Interpolation Factor in Pulse Shaping Filter on MRC for CDMA 2000 Systems
Authors: Pankaj Verma, Gagandeep Singh Walia, Padma Devi, H. P. Singh
Abstract:
Code Division Multiple Access 2000 operates on various RF channel bandwidths, at 1.2288 or 3.6864 Mcps. CDMA offers high bandwidth and wireless broadband services, but its efficiency decreases because of many interfering factors like fading, interference, scattering, diffraction, refraction, reflection, etc. Reducing the spectral bandwidth is one of the major concerns in modern technology, and this is achieved by a pulse shaping filter. This paper investigates the effect of diversity (MRC) and interpolation factor in the Root Raised Cosine (RRC) filter for the QPSK and BPSK modulation schemes. With a proper pulse shaping technique, it is possible to send information with minimum inter-symbol interference and within limited bandwidth. Bit error rate (BER) performance is analyzed by applying the diversity technique and varying the interpolation factor for Binary Phase Shift Keying (BPSK) and Quadrature Phase Shift Keying (QPSK). The interpolation factor raises the original sampling rate of a sequence and reduces interference, while diversity reduces fading.
Keywords: CDMA2000, root raised cosine, roll off factor, ISI, diversity, interference, fading
Procedia PDF Downloads 473
654 Study on the Self-Location Estimate by the Evolutional Triangle Similarity Matching Using Artificial Bee Colony Algorithm
Authors: Yuji Kageyama, Shin Nagata, Tatsuya Takino, Izuru Nomura, Hiroyuki Kamata
Abstract:
In a previous study, a technique to estimate self-location by using a lunar image was proposed. In this paper, we consider improving the conventional method with FPGA implementation in mind. Specifically, we introduce the Artificial Bee Colony algorithm to reduce search time. In addition, we use fixed-point arithmetic to enable high-speed operation on FPGA.
Keywords: SLIM, Artificial Bee Colony Algorithm, location estimate, evolutional triangle similarity
Procedia PDF Downloads 516
653 Destination Port Detection For Vessels: An Analytic Tool For Optimizing Port Authorities Resources
Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin
Abstract:
Port authorities face many challenges in congested ports when allocating their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages AIS messages to monitor vessels’ movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely, Reference Route of Trajectory (RRoT), to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring Automatic Identification System (AIS) messages. Our RRoT method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using its recent movement. We evaluated five different similarity measures: Discrete Fréchet Distance (DFD), Dynamic Time Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an F-measure of 99.08% using the Dynamic Time Warping (DTW) similarity measure.
Keywords: spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization
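The abstract above names Dynamic Time Warping as the best-performing trajectory similarity measure. The following is a minimal sketch of textbook DTW applied to destination matching, not the authors' RRoT implementation: the names `dtw_distance`, `predict_destination` and `reference_routes`, and the toy coordinates, are assumptions made for illustration, and the recent track is simply compared against each whole reference route.

```python
import math


def dtw_distance(track_a, track_b):
    """Classic dynamic time warping cost between two sequences of (x, y) points."""
    n, m = len(track_a), len(track_b)
    # cost[i][j] = best alignment cost of the first i points of a and first j of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(track_a[i - 1], track_b[j - 1])  # Euclidean distance
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in track_a only
                cost[i][j - 1],      # advance in track_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def predict_destination(recent_track, reference_routes):
    """Return the port whose (hypothetical) reference route is closest under DTW."""
    return min(
        reference_routes,
        key=lambda port: dtw_distance(recent_track, reference_routes[port]),
    )


# Toy reference routes keyed by destination port (made-up coordinates).
reference_routes = {
    "A": [(0, 0), (1, 1), (2, 2)],
    "B": [(0, 0), (1, -1), (2, -2)],
}
print(predict_destination([(0, 0), (1, 0.9)], reference_routes))  # A
```

DTW tolerates differences in sampling rate and speed between tracks, which is one plausible reason it suits irregularly reported AIS positions.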
Procedia PDF Downloads 120
652 Flow and Heat Transfer of a Nanofluid over a Shrinking Sheet
Authors: N. Bachok, N. L. Aleng, N. M. Arifin, A. Ishak, N. Senu
Abstract:
The problem of laminar fluid flow which results from the shrinking of a permeable surface in a nanofluid has been investigated numerically. The model used for the nanofluid incorporates the effects of Brownian motion and thermophoresis. A similarity solution is presented which depends on the mass suction parameter S, Prandtl number Pr, Lewis number Le, Brownian motion number Nb and thermophoresis number Nt. It was found that the reduced Nusselt number is a decreasing function of each dimensionless number.
Keywords: boundary layer, nanofluid, shrinking sheet, Brownian motion, thermophoresis, similarity solution
Procedia PDF Downloads 413
651 A Comparison between Different Segmentation Techniques Used in Medical Imaging
Authors: Ibtihal D. Mustafa, Mawia A. Hassan
Abstract:
Tumor segmentation from MRI images is an important part of the work of medical imaging experts. It is a particularly challenging task because of the highly varied appearance of tumor tissue among different patients. MRI is an advanced medical imaging modality because it gives richer information about human soft tissue. There are different segmentation techniques for detecting brain tumors in MRI. In this paper, different segmentation methods are used to segment brain tumors, and the segmentation results are compared using correlation and the structural similarity index (SSIM) to identify the best technique to apply to MRI images.
Keywords: MRI, segmentation, correlation, structural similarity
Procedia PDF Downloads 406
650 3D Object Retrieval Based on Similarity Calculation in 3D Computer Aided Design Systems
Authors: Ahmed Fradi
Abstract:
Recent technological advances in the acquisition, modeling, and processing of three-dimensional (3D) object data have led to the creation of models stored in huge databases, which are used in various domains such as computer vision, augmented reality, the game industry, medicine, CAD (Computer-aided design), 3D printing, etc. In addition, industry is currently benefiting from powerful modeling tools that enable designers to easily and quickly produce 3D models. The great ease of acquisition and modeling of 3D objects makes it possible to create large 3D model databases, which then become difficult to navigate. Therefore, the indexing of 3D objects appears as a necessary and promising solution to manage this type of data, to extract model information, retrieve an existing model or calculate similarity between 3D objects. The objective of the proposed research is to develop a framework allowing easy and fast access to 3D objects in a CAD model database with a specific indexing algorithm to find objects similar to a reference model. Our main objectives are to study existing methods of similarity calculation of 3D objects (essentially shape-based methods) by specifying the characteristics of each method as well as the differences between them, and then to propose a new approach for indexing and comparing 3D models, which is suitable for our case study and which is based on some previously studied methods. Our proposed approach is finally illustrated by an implementation and evaluated in a professional context.
Keywords: CAD, 3D object retrieval, shape based retrieval, similarity calculation
Procedia PDF Downloads 261
649 Hybrid Approximate Structural-Semantic Frequent Subgraph Mining
Authors: Montaceur Zaghdoud, Mohamed Moussaoui, Jalel Akaichi
Abstract:
Frequent subgraph mining usually refers to graph matching, and it is widely used when analyzing big data with large graphs. Many research works have dealt with structural exact or inexact graph matching, but little attention has been paid to semantic matching when graph vertices and/or edges are attributed and typed. Therefore, it seems very interesting to integrate background knowledge into the analysis, so that extracted frequent subgraphs become more pruned by applying a new semantic filter instead of using only structural similarity in the graph matching process. Consequently, this paper focuses on developing a new hybrid approximate structural-semantic graph matching to discover a set of frequent subgraphs. It simultaneously uses an approximate structural similarity function based on a graph edit distance function and a possibilistic vertex similarity function based on an affinity function. Both structural and semantic filters contribute together to pruning the extracted frequent set. Indeed, the new hybrid structural-semantic frequent subgraph mining approach is suitable for several applications, such as community detection in social networks.
Keywords: approximate graph matching, hybrid frequent subgraph mining, graph mining, possibility theory
Procedia PDF Downloads 401
648 Computing the Similarity and the Diversity in the Species Based on Cronobacter Genome
Authors: E. Al Daoud
Abstract:
The purpose of computing the similarity and the diversity in the species is to trace the process of evolution, to find the relationship between the species, and to discover the unique, the special, the common and the universal proteins. The proteins of the whole genome of 40 species are compared with the Cronobacter genome, which is used as the reference genome. More than 3 billion pairwise alignments are performed using blastp. Several findings are introduced in this study; for example, we found 172 proteins in the Cronobacter genome which have insignificant hits in other species, 116 significant proteins in all the tested species with very high score values, and 129 common proteins in the plants which have insignificant hits in mammals, birds, fishes, and insects.
Keywords: genome, species, blastp, conserved genes, Cronobacter
Procedia PDF Downloads 494
647 Personalized Social Resource Recommender Systems on Interest-Based Social Networks
Authors: C. L. Huang, J. J. Sia
Abstract:
Interest-based social networks, also known as social bookmark sharing systems, are useful platforms for people to conveniently read and collect internet resources. These platforms also provide social network functions, and users can share and explore internet resources through the social networks. Providing personalized internet resources to users is an important issue on these platforms. This study uses two types of relationship on the social networks, following and follower, and proposes a collaborative recommender system consisting of two main steps. First, this study calculates the relationship strength between the target user and the target user's followings and followers to find the top-N similar neighbors. Second, from the top-N similar neighbors, the articles (internet resources) that may interest the target user are recommended to the target user. In this system, users can efficiently obtain recent, related and diverse internet resources (knowledge) from the interest-based social network. This study collected the experimental dataset from Diigo, which is a famous bookmark sharing system. The experimental results show that the proposed recommendation model is more accurate than two traditional baseline recommendation models but slightly lower than the cosine model in accuracy. However, in the metrics of diversity and execution time, our proposed model outperforms the cosine model.
Keywords: recommender systems, social networks, tagging, bookmark sharing systems, collaborative recommender systems, knowledge management
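The abstract above compares its model against a "cosine model" without describing it. A common form of such a baseline, sketched here under that assumption, ranks neighbors by the cosine similarity of user vectors and keeps the top N; the function names, the vector encoding (for example, tag or bookmark counts) and the toy data are all illustrative, not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: an all-zero vector matches nothing
    return dot / (norm_u * norm_v)


def top_n_neighbors(target, others, n):
    """Names of the n users whose vectors are most cosine-similar to `target`."""
    scored = [(cosine_similarity(target, vec), name) for name, vec in others.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:n]]


# Toy user vectors, e.g. how often each user bookmarked three topics.
others = {"a": [1, 1, 0], "b": [0, 0, 1], "c": [1, 0, 0]}
print(top_n_neighbors([1, 1, 0], others, 2))  # ['a', 'c']
```

Because every candidate must be scored against the target, a baseline of this shape scales with the number of users, which may be one reason the abstract reports a difference in execution time.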
Procedia PDF Downloads 172
646 [Keynote Speaker]: Some Similarity Considerations for Design of Experiments for Hybrid Buoyant Aerial Vehicle
Authors: A. U. Haque, W. Asrar, A. A Omar, E. Sulaeman, J. S. M. Ali
Abstract:
The buoyancy force applied on deformable symmetric bodies can be estimated by using Archimedes' Principle. Such bodies, like ellipsoidal bodies, have a high volume-to-surface ratio and are isometrically scaled for mass, length, area and volume to follow the square-cube law. For scaling up such bodies, it is worthwhile to find the scaling relationship between the other physical quantities that represent thermodynamic, structural and inertial response, etc. Dimensionless similarities to find an allometric scale can be developed by using the Buckingham π theorem, which utilizes the physical dimensions of important parameters. Based on this fact, the physical dependencies of the buoyancy system are reviewed to find the set of physical variables for deformable bodies of revolution filled with an expandable gas like helium. Due to changes in atmospheric conditions, this gas changes its volume, and this change can affect the stability of elongated bodies on the ground as well as in the air. Special emphasis is given to the existing similarity parameters which can be used in the design of experiments for such bodies, whose shape is affected by external forces like drag, surface tension and kinetic loads acting on the surface. All these similarity criteria are based on non-dimensionalization, which also needs to be considered for scaling up such bodies.
Keywords: Buckingham pi theorem, similitude, scaling, buoyancy
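As a worked illustration of the square-cube law and Archimedes' principle mentioned in the abstract above (a sketch only: the scale factor $\lambda$ and the symbols $L$, $A$, $V$, $F_b$, $\rho_{\mathrm{air}}$ are notation assumed here, not taken from the paper, and the air density is assumed unchanged between scales), isometric scaling of a body by $\lambda$ gives:

```latex
\begin{align}
L' &= \lambda L, \qquad A' = \lambda^{2} A, \qquad V' = \lambda^{3} V, \\
F_b &= \rho_{\mathrm{air}}\, g\, V
  \;\;\Longrightarrow\;\; F_b' = \rho_{\mathrm{air}}\, g\, V' = \lambda^{3} F_b, \\
\frac{V'}{A'} &= \lambda\, \frac{V}{A}.
\end{align}
```

Buoyancy therefore grows with the cube of the scale factor while surface area grows only with its square, so the volume-to-surface ratio increases linearly with $\lambda$.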
Procedia PDF Downloads 375
645 Genetic Diversity Based Population Study of Freshwater Mud Eel (Monopterus cuchia) in Bangladesh
Authors: M. F. Miah, K. M. A. Zinnah, M. J. Raihan, H. Ali, M. N. Naser
Abstract:
As genetic diversity is most important for existing, breeding and production of any fish; this study was undertaken for investigating genetic diversity of freshwater mud eel, Monopterus cuchia at population level where three ecological populations such as flooded area of Sylhet (P1), open water of Moulvibazar (P2) and open water of Sunamganj (P3) districts of Bangladesh were considered. Four arbitrary RAPD primers (OPB-12, C0-4, B-03 and OPB-08) were screened and RAPD banding patterns were analyzed among the populations considering 15 individuals of each population. In total 174, 138 and 149 bands were detected in the populations of P1, P2 and P3 respectively; however, each primer revealed less number of bands in each population. 100% polymorphic loci were recorded in P2 and P3 whereas only one monomorphic locus was observed in P1, recorded 97.5% polymorphism. Different genetic parameters such as inter-individual pairwise similarity, genetic distance, Nei genetic similarity, linkage distances, cluster analysis and allelic information, etc. were considered for measuring genetic diversity. The average inter-individual pairwise similarity was recorded 2.98, 1.47 and 1.35 in P1, P2 and P3 respectively. Considering genetic distance analysis, the highest distance 1 was recorded in P2 and P3 and the lowest genetic distance 0.444 was found in P2. The average Nei genetic similarity was observed 0.19, 0.16 and 0.13 in P1, P2 and P3, respectively; however, the average linkage distance was recorded 24.92, 17.14 and 15.28 in P1, P3 and P2 respectively. Based on linkage distance, genetic clusters were generated in three populations where 6 clades and 7 clusters were found in P1, 3 clades and 5 clusters were observed in P2 and 4 clades and 7 clusters were detected in P3. In addition, allelic information was observed where the frequency of p and q alleles were observed 0.093 and 0.907 in P1, 0.076 and 0.924 in P2, 0.074 and 0.926 in P3 respectively. 
The average gene diversity was highest in P2 (0.132), followed by P3 (0.131) and P1 (0.121).
Keywords: genetic diversity, Monopterus cuchia, population, RAPD, Bangladesh
Procedia PDF Downloads 504
644 Similarity Based Retrieval in Case Based Reasoning for Analysis of Medical Images
Authors: M. Dasgupta, S. Banerjee
Abstract:
Content Based Image Retrieval (CBIR) coupled with Case Based Reasoning (CBR) is a paradigm that is becoming increasingly popular in the diagnosis and therapy planning of medical ailments utilizing the digital content of medical images. This paper presents a survey of some of the promising approaches used in the detection of abnormalities in retina images as well as in mammographic screening and detection of regions of interest in MRI scans of the brain. We also describe our proposed algorithm to detect hard exudates in fundus images of the retina of Diabetic Retinopathy patients.
Keywords: case based reasoning, exudates, retina image, similarity based retrieval
Procedia PDF Downloads 347
643 Comparative Analysis of Dissimilarity Detection between Binary Images Based on Equivalency and Non-Equivalency of Image Inversion
Authors: Adnan A. Y. Mustafa
Abstract:
Image matching is a fundamental problem that arises frequently in many aspects of robot and computer vision. It can become a time-consuming process when matching images to a database consisting of hundreds of images, especially if the images are big. One approach to reducing the time complexity of the matching process is to reduce the search space in a pre-matching stage, by simply removing dissimilar images quickly. The Probabilistic Matching Model for Binary Images (PMMBI) showed that dissimilarity detection between binary images can be accomplished quickly by random pixel mapping and is size invariant. The model is based on the gamma binary similarity distance that recognizes an image and its inverse as containing the same scene and hence considers them to be the same image. However, in many applications, an image and its inverse are not treated as being the same but rather dissimilar. In this paper, we present a comparative analysis of dissimilarity detection between PMMBI based on the gamma binary similarity distance and a modified PMMBI model based on a similarity distance that does distinguish between an image and its inverse as being dissimilar.
Keywords: binary image, dissimilarity detection, probabilistic matching model for binary images, image mapping
Procedia PDF Downloads 151
642 Recommender System Based on Mining Graph Databases for Data-Intensive Applications
Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi
Abstract:
In recent years, many digital documents on the web have been created due to the rapid growth of 'social applications' communities or 'data-intensive applications'. The evolution of online-based multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper will explain the key differences between graph and relational databases, their strengths and weaknesses, and why using graph databases is the best technology for building a real-time recommendation system. The paper will also discuss several similarity metric algorithms that can be used to compute a similarity score for pairs of nodes based on their neighbourhoods or their properties. Finally, the paper will explore how NLP strategies offer the premise to improve the accuracy and coverage of real-time recommendations by extracting information from stored unstructured knowledge, which makes up the bulk of the world’s data, to enrich the graph database with this information. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.
Keywords: graph databases, NLP, recommendation systems, similarity metrics
Procedia PDF Downloads 103
641 Self-Supervised Learning for Hate-Speech Identification
Authors: Shrabani Ghosh
Abstract:
Automatic offensive language detection in social media has become a stirring task in today's NLP. Manual Offensive language detection is tedious and laborious work where automatic methods based on machine learning are only alternatives. Previous works have done sentiment analysis over social media in different ways such as supervised, semi-supervised, and unsupervised manner. Domain adaptation in a semi-supervised way has also been explored in NLP, where the source domain and the target domain are different. In domain adaptation, the source domain usually has a large amount of labeled data, while only a limited amount of labeled data is available in the target domain. Pretrained transformers like BERT, RoBERTa models are fine-tuned to perform text classification in an unsupervised manner to perform further pre-train masked language modeling (MLM) tasks. In previous work, hate speech detection has been explored in Gab.ai, which is a free speech platform described as a platform of extremist in varying degrees in online social media. In domain adaptation process, Twitter data is used as the source domain, and Gab data is used as the target domain. The performance of domain adaptation also depends on the cross-domain similarity. Different distance measure methods such as L2 distance, cosine distance, Maximum Mean Discrepancy (MMD), Fisher Linear Discriminant (FLD), and CORAL have been used to estimate domain similarity. Certainly, in-domain distances are small, and between-domain distances are expected to be large. The previous work finding shows that pretrain masked language model (MLM) fine-tuned with a mixture of posts of source and target domain gives higher accuracy. However, in-domain performance of the hate classifier on Twitter data accuracy is 71.78%, and out-of-domain performance of the hate classifier on Gab data goes down to 56.53%. Recently self-supervised learning got a lot of attention as it is more applicable when labeled data are scarce. 
A few works have already explored applying self-supervised learning to NLP tasks such as sentiment classification. The self-supervised language representation model ALBERT focuses on modeling inter-sentence coherence and helps downstream tasks with multi-sentence inputs. The self-supervised attention learning approach shows better performance, as it exploits extracted context words in the training process. In this work, a self-supervised attention mechanism is proposed to detect hate speech on Gab.ai. This framework initially classifies the Gab dataset in an attention-based self-supervised manner. In the next step, a semi-supervised classifier is trained on the combination of labeled data from the first step and unlabeled data. The performance of the proposed framework will be compared with the results described earlier and also with optimized outcomes obtained from different optimization techniques.
Keywords: attention learning, language model, offensive language detection, self-supervised learning
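The abstract above names several distance measures for estimating how far apart two domains are. As a rough illustration only, the sketch below computes three of them (L2 distance, cosine distance, and MMD with a linear kernel, which reduces to the squared distance between domain means) on made-up feature vectors. The function names and the toy `twitter` and `gab` data are assumptions for this example and do not come from the paper.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def l2_distance(a, b):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def linear_mmd(source, target):
    """MMD with a linear kernel: squared L2 distance between domain means."""
    return l2_distance(mean_vector(source), mean_vector(target)) ** 2


# Hypothetical 2-d sentence features for the two domains (invented values).
twitter = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]]
gab = [[0.2, 0.9], [0.1, 0.8], [0.0, 1.0]]

src_mean, tgt_mean = mean_vector(twitter), mean_vector(gab)
print("L2 between domain means:    ", l2_distance(src_mean, tgt_mean))
print("cosine distance of means:   ", cosine_distance(src_mean, tgt_mean))
print("linear-kernel MMD:          ", linear_mmd(twitter, gab))
print("in-domain MMD (sanity check):", linear_mmd(twitter, twitter))
```

In practice the feature vectors would come from a language model encoder, and a kernel other than the linear one is often used for MMD; both choices are left open here.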
Procedia PDF Downloads 103
640 Top-K Shortest Distance as a Similarity Measure
Authors: Andrey Lebedev, Ilya Dmitrenok, JooYoung Lee, Leonard Johard
Abstract:
The top-k shortest path routing problem is an extension of finding the shortest path in a given network. The shortest path is one of the most essential measures, as it reveals the relations between two nodes in a network. However, in many real-world networks, whose diameters are small, the top-k shortest path is more interesting, as it contains more information about the network topology. Many variations for computing top-k shortest paths have been studied. In this paper, we apply an efficient top-k shortest distance routing algorithm to the link prediction problem and test its efficacy. We compare the results with other baseline and state-of-the-art methods as well as with the shortest path. We also propose a top-k distance-based graph matching algorithm.
Keywords: graph matching, link prediction, shortest path, similarity
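To make the idea of a top-k shortest distance concrete, here is a minimal sketch that is not the authors' routing algorithm. It uses a textbook Dijkstra variant in which every node may be settled up to k times, so it returns the lengths of the k shortest walks, which may revisit nodes. The function name, the adjacency-list format, and the toy graph are assumptions for illustration.

```python
import heapq


def top_k_distances(graph, source, target, k):
    """Lengths of the k shortest walks from source to target.

    graph maps each node to a list of (neighbor, weight) pairs; every
    neighbor must also appear as a key. Weights are assumed non-negative.
    """
    settled = {node: 0 for node in graph}  # how often each node was popped
    heap = [(0, source)]
    found = []
    while heap and len(found) < k:
        dist, node = heapq.heappop(heap)
        if settled[node] >= k:
            continue  # this node already has its k best distances
        settled[node] += 1
        if node == target:
            found.append(dist)
        for neighbor, weight in graph[node]:
            heapq.heappush(heap, (dist + weight, neighbor))
    return found


# Hypothetical directed toy network.
graph = {
    "a": [("b", 1), ("c", 2), ("d", 5)],
    "b": [("d", 2)],
    "c": [("d", 1)],
    "d": [],
}

print(top_k_distances(graph, "a", "d", 3))  # [3, 3, 5]
```

A node pair with several short alternative routes gets a list of small values, which is the extra topological information a single shortest-path length cannot express. How the list is turned into a link prediction score is left open here.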
Procedia PDF Downloads 356
639 The Effects of Different Types of Herbicides Used for Lawn Maintenance on the Dynamics of Weeds in an Urban Environment
Authors: Yetunde I. Bulu, Moses B. Adewole, Julius O. Faluyi
Abstract:
This study investigates the effect of aggressive application of herbicide on weed succession in an urban environment in Ile-Ife, Osun State. An inspection of the communities was carried out to identify sites maintained by herbicides (test plots) and those without herbicide history (control plots). Four different experimental plots located at Olasode, Eleweran, Ife City and Parakin within Ile-Ife town were monitored during the study. Comprehensive enumeration and identification of plant populations to species level was carried out on each of the plots and at every visit to determine the direction of succession. A similarity index was used to determine the relationship in plant species composition between plots treated with herbicide and the untreated plots. A trend of increasing plant species was observed in all the study plots. A low similarity index between the treated plots and the control vegetation was observed at all visits. Low similarity was also observed between the above-ground vegetation and the seed bank in all the plots. The study concluded that the weed population observed in the experimental plots showed an increase in species richness and diversity when the plots were left to recover, compared to the control plots.
Keywords: herbicide, index of similarity, population, soil seed bank, succession
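The abstract does not say which similarity index was used. Two common choices for comparing species lists are the Sørensen and Jaccard indices, sketched below under that assumption; the species labels and plot contents are invented for the example.

```python
def sorensen_index(plot_a, plot_b):
    """Sørensen index: 2 * shared species / (species in A + species in B)."""
    a, b = set(plot_a), set(plot_b)
    total = len(a) + len(b)
    # Returning 0.0 for two empty plots is a convention chosen for this sketch.
    return 2 * len(a & b) / total if total else 0.0


def jaccard_index(plot_a, plot_b):
    """Jaccard index: shared species / all distinct species in either plot."""
    a, b = set(plot_a), set(plot_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical species lists for a treated plot and a control plot.
treated = {"species_a", "species_b", "species_c"}
control = {"species_b", "species_c", "species_d"}

print("Sørensen:", sorensen_index(treated, control))
print("Jaccard: ", jaccard_index(treated, control))
```

Both indices range from 0 (no shared species) to 1 (identical species lists), so a low value indicates that the treated and control plots have diverged in composition.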
Procedia PDF Downloads 160
638 Case-Based Reasoning Approach for Process Planning of Internal Thread Cold Extrusion
Authors: D. Zhang, H. Y. Du, G. W. Li, J. Zeng, D. W. Zuo, Y. P. You
Abstract:
To address the difficulty of process selection, case-based reasoning is applied to a computer-aided process planning system for cold form tapping of internal threads on the basis of similarity in the process. A model is established based on the analysis of process planning. A case representation and a similarity computing method are given. A confidence degree is used to evaluate each case. A rule-based reuse strategy is presented. The scheme is illustrated and verified by practical application. The case shows that the design results obtained with the proposed method are effective.
Keywords: case-based reasoning, internal thread, cold extrusion, process planning
Procedia PDF Downloads 507
637 Algorithms for Fast Computation of Pan Matrix Profiles of Time Series Under Unnormalized Euclidean Distances
Authors: Jing Zhang, Daniel Nikovski
Abstract:
We propose an approximation algorithm called LINKUMP to compute the Pan Matrix Profile (PMP) under the unnormalized l∞ distance (useful for value-based similarity search) using a double-ended queue and linear interpolation. The algorithm has time/space complexities comparable to those of the state-of-the-art algorithm for typical PMP computation under the normalized l₂ distance (useful for shape-based similarity search). We validate its efficiency and effectiveness through extensive numerical experiments and a real-world anomaly detection application.
Keywords: pan matrix profile, unnormalized euclidean distance, double-ended queue, discord discovery, anomaly detection
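For readers unfamiliar with the object being computed, the sketch below is a brute-force reference, not LINKUMP. For each subsequence it finds the unnormalized l∞ (Chebyshev) distance to its nearest non-overlapping neighbor, and then repeats this for several subsequence lengths to form a pan matrix profile. The function names, the exclusion-zone default of `m // 2`, and the toy series are assumptions for illustration.

```python
def chebyshev(a, b):
    """Unnormalized l-infinity distance between two equal-length windows."""
    return max(abs(x - y) for x, y in zip(a, b))


def matrix_profile_linf(series, m, exclusion=None):
    """Brute-force matrix profile for window length m (quadratic in len(series))."""
    if exclusion is None:
        exclusion = m // 2  # assumed default to skip trivial self-matches
    n = len(series) - m + 1
    windows = [series[i:i + m] for i in range(n)]
    profile = []
    for i in range(n):
        best = float("inf")
        for j in range(n):
            if abs(i - j) <= exclusion:
                continue  # neighbor overlaps too much with window i
            best = min(best, chebyshev(windows[i], windows[j]))
        profile.append(best)
    return profile


def pan_matrix_profile_linf(series, lengths):
    """One matrix profile per subsequence length."""
    return {m: matrix_profile_linf(series, m) for m in lengths}


# Toy series with one obvious spike at index 5.
series = [0, 1, 0, 1, 0, 9, 0, 1, 0, 1]

print(matrix_profile_linf(series, 2))  # [0, 0, 0, 0, 8, 8, 0, 0, 0]
print(sorted(pan_matrix_profile_linf(series, [2, 3])))
```

The largest profile values mark discords, that is, subsequences whose nearest neighbor is still far away; here they are the two windows covering the spike.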
Procedia PDF Downloads 245
636 Flow Behavior and Performances of Centrifugal Compressor Stage Vaneless Diffusers
Authors: Y. Galerkin, O. Solovieva
Abstract:
Flow parameters are calculated in vaneless diffusers with a relative width of 0.014 to 0.10, constant along the radius. Inlet flow angles and similarity criteria were varied. Information about the flow structure is presented: meridian streamline configuration, full flow development, and flow separation. Polytropic efficiency and loss and recovery coefficients are used to compare the diffusers' effectiveness. An example of narrow diffuser optimization by applying conical walls is presented. Three tapered variants of a wide diffuser are also compared. The work was carried out in the R&D laboratory “Gas dynamics of turbo machines” of the TU SPb.
Keywords: vaneless diffuser, relative width, flow angle, flow separation, loss coefficient, similarity criteria
Procedia PDF Downloads 489
635 Isolation and Identification of Diacylglycerol Acyltransferase Type-2 (GAT2) Genes from Three Egyptian Olive Cultivars
Authors: Yahia I. Mohamed, Ahmed I. Marzouk, Mohamed A. Yacout
Abstract:
The aim of this work was to study the genetic basis for oil accumulation in olive fruit by tracking the DGAT2 (diacylglycerol acyltransferase type-2) gene in three olive cultivars of Egyptian origin, namely Toffahi, Hamed and Maraki, using molecular marker techniques and bioinformatics tools. The results show, first, that a specific genomic band of the Maraki cultivar was identified as DGAT2 and was identical to this gene in Olea europaea, with 100% similarity. Second, a differential genomic band of the Maraki cultivar, produced by the RAPD fingerprinting technique, yielded a distinct predicted sequence that was identified as DGAT2 in Fragaria vesca subsp. vesca, with 76% sequence similarity. Third, a specific genomic band of the Hamed cultivar was identified as two fragments: (1) Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds, with two matching regions at 99%, or (2) PREDICTED: Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050), mRNA, with 86% similarity.
Keywords: Olea europaea, fingerprinting, diacylglycerol acyltransferase type-2 (DGAT2), Egypt
Procedia PDF Downloads 501
634 Bird Diversity along Boat Touring Routes in Tha Ka Sub-District, Amphawa District, Samut Songkram Province, Thailand
Authors: N. Charoenpokaraj, P. Chitman
Abstract:
This research aims to study the species, abundance, and status of birds, and the similarities and activity characteristics of birds that benefit from the research area along boat touring routes in Tha Ka sub-district, Amphawa District, Samut Songkram Province, Thailand, from October 2012 to September 2013. The data were analyzed to find the abundance and similarity index of the birds. The survey of birds on all three routes found 33 families and 63 species. Route 3 (traditional coconut sugar making kiln to resort) had the most species, with 56. There were 18 species of commonly found birds with an abundance level of 5, which amounts to 28.57% of all bird species. In August, 46 species were found, the greatest number of bird species benefiting from this route. As for the status of the birds, there are 51 resident birds, 7 resident and migratory birds, and 5 migratory birds. Between Route 2 and Route 3, the similarity index value is 0.881. The birds are classified by their activity characteristics, i.e. insectivore, piscivore, granivore, nectarivore and aquatic invertebrate feeder birds. Some birds also use the area for nesting.
Keywords: bird diversity, boat touring routes, Samut Songkram, similarity index
Procedia PDF Downloads 331
633 An Optimization Algorithm Based on Dynamic Schema with Dissimilarities and Similarities of Chromosomes
Authors: Radhwan Yousif Sedik Al-Jawadi
Abstract:
Optimization is necessary for finding appropriate solutions to a range of real-life problems. In particular, genetic (or more generally, evolutionary) algorithms have proved very useful in solving many problems for which analytical solutions are not available. In this paper, we present an optimization algorithm called Dynamic Schema with Dissimilarity and Similarity of Chromosomes (DSDSC), which is a variant of the classical genetic algorithm. This approach constructs new chromosomes from a schema and pairs of existing ones by exploring their dissimilarities and similarities. To show the effectiveness of the algorithm, it is tested and compared with the classical GA on 15 two-dimensional optimization problems taken from the literature. We have found that, in most cases, our method is better than the classical genetic algorithm.
Keywords: chromosome injection, dynamic schema, genetic algorithm, similarity and dissimilarity
Procedia PDF Downloads 343
632 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity
Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang
Abstract:
Word discovery is a key problem in text information retrieval technology. Methods for new word discovery tend to be closely tied to words, because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How to detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate the word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) to detect network words effectively from the corpus. This framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by means of semantic replacement between sentences. The experiment verifies that the framework not only completes the rapid discovery of network words but also recovers the standard word meaning of the discovered network words, which reflects the effectiveness of our work.
Keywords: text information retrieval, natural language processing, new word discovery, information extraction
Procedia PDF Downloads 91
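The framework in the entry above relies on the similarity of sentence semantic vectors. The sketch below shows one simple way such a comparison could look, assuming sentence vectors are built by averaging word embeddings and compared with cosine similarity. The abstract does not specify either choice, and the tiny embedding table, tokens, and function names are hypothetical.

```python
import math


def sentence_vector(tokens, embeddings):
    """Average the embeddings of the tokens that are in the vocabulary."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token of the sentence has an embedding")
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical 3-d word embeddings; "gr8" stands in for a network word.
embeddings = {
    "this": [0.1, 0.0, 0.2],
    "movie": [0.7, 0.1, 0.0],
    "is": [0.0, 0.1, 0.1],
    "great": [0.2, 0.9, 0.1],
    "gr8": [0.25, 0.85, 0.1],
    "boring": [0.1, -0.8, 0.3],
}

s1 = sentence_vector(["this", "movie", "is", "great"], embeddings)
s2 = sentence_vector(["this", "movie", "is", "gr8"], embeddings)
s3 = sentence_vector(["this", "movie", "is", "boring"], embeddings)

print("great vs gr8:   ", cosine_similarity(s1, s2))
print("great vs boring:", cosine_similarity(s1, s3))
```

Two sentences that differ only in one token but remain highly similar suggest that the differing tokens are semantic replacements of each other, which is the intuition behind mapping a network word to a standard word.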