Search results for: document similarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1368

1278 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity

Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang

Abstract:

Word discovery is a key problem in text information retrieval. New word discovery methods tend to be closely tied to words because they generally obtain new words by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated a wide variety of network texts for the convenience of online life, including network words that are far from standard Chinese expression. How to detect network words is one of the important goals in text information retrieval today. In this paper, we integrate a word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) that detects network words effectively from a corpus. The framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of the sentence semantic vectors to determine the semantic relationship between sentences, and finally discovers network words by means of semantic replacement between sentences. The experiment verifies that the framework not only discovers network words rapidly but also recovers the standard-word meaning of the discovered network words, which reflects the effectiveness of our work.
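
The comparison of sentence semantic vectors described in this abstract can be illustrated with a minimal sketch. It assumes that a sentence vector is the average of its word vectors and that similarity is cosine similarity; the abstract does not state either choice. The embedding values, the function names, and the example sentences are hypothetical.

```python
import math

def sentence_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding (an assumed pooling choice)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "this": [0.1, 0.0, 0.2],
    "is": [0.0, 0.1, 0.1],
    "great": [0.9, 0.8, 0.1],
    "awesome": [0.85, 0.75, 0.15],  # stands in for a candidate network word
    "terrible": [-0.9, -0.7, 0.1],
}

s1 = sentence_vector(["this", "is", "great"], toy_embeddings)
s2 = sentence_vector(["this", "is", "awesome"], toy_embeddings)
s3 = sentence_vector(["this", "is", "terrible"], toy_embeddings)

sim_close = cosine_similarity(s1, s2)  # sentences that differ by a near-synonym
sim_far = cosine_similarity(s1, s3)    # sentences that differ by an antonym
```

Under these assumptions, two sentences that differ only in one interchangeable word score higher than two that differ in meaning, which is the kind of signal a semantic-replacement step could build on.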

Keywords: text information retrieval, natural language processing, new word discovery, information extraction

Procedia PDF Downloads 56
1277 Application of Signature Verification Models for Document Recognition

Authors: Boris M. Fedorov, Liudmila P. Goncharenko, Sergey A. Sybachin, Natalia A. Mamedova, Ekaterina V. Makarenkova, Saule Rakhimova

Abstract:

In modern economic conditions, the question of the possibility of correct recognition of a signature on digital documents in order to verify the expression of will or confirm a certain operation is relevant. The additional complexity of processing lies in the dynamic variability of the signature for each individual, as well as in the way information is processed because the signature refers to biometric data. The article discusses the issues of using artificial intelligence models in order to improve the quality of signature confirmation in document recognition. The analysis of several possible options for using the model is carried out. The results of the study are given, in which it is possible to correctly determine the authenticity of the signature on small samples.

Keywords: signature recognition, biometric data, artificial intelligence, neural networks

Procedia PDF Downloads 108
1276 System of Quality Automation for Documents (SQAD)

Authors: R. Babi Saraswathi, K. Divya, A. Habeebur Rahman, D. B. Hari Prakash, S. Jayanth, T. Kumar, N. Vijayarangan

Abstract:

Document automation is the design of systems and workflows that assemble repetitive documents to meet specific business needs. In any organization or institution, documenting employees’ information is very important for both employees and management, because it shows an individual’s progress to the management. Many employee documents are on paper, so they are difficult to organize, and retrieving the exact document for future reference takes considerable time. It is also tedious to generate reports as needed. The process becomes even more difficult when approvals are required, and it lacks security. This project overcomes the above-stated issues. By storing the details in a database and maintaining e-documents, the automation system reduces manual work to a large extent. The approval process for important documents can then be carried out securely by using digital signature and encryption techniques. Details are maintained in the database, e-documents are stored in specific folders, and various kinds of reports can be generated. Moreover, an efficient search method is implemented in the database. Automation supports document maintenance in many respects: it minimizes data entry, reduces the time spent on proofreading, avoids duplication, and reduces the risks associated with manual error.

Keywords: e-documents, automation, digital signature, encryption

Procedia PDF Downloads 354
1275 Nazca: A Context-Based Matching Method for Searching Heterogeneous Structures

Authors: Karine B. de Oliveira, Carina F. Dorneles

Abstract:

Structure-level matching is the problem of combining elements of a structure, which can be represented as entities, classes, XML elements, web forms, and so on. This is a challenge due to the large number of distinct representations of semantically similar structures. This paper describes a structure-based matching method applied to search for different representations in data sources, considering the similarity between elements of two structures and the data source context. Using real data sources, we have conducted an experimental study comparing our approach with our baseline implementation and with another important schema matching approach. We demonstrate that our proposal reaches higher precision than the baseline.

Keywords: context, data source, index, matching, search, similarity, structure

Procedia PDF Downloads 323
1274 3D Model Completion Based on Similarity Search with Slim-Tree

Authors: Alexis Aldo Mendoza Villarroel, Ademir Clemente Villena Zevallos, Cristian Jose Lopez Del Alamo

Abstract:

With the advancement of technology, it is now possible to scan entire objects and obtain their digital representation by using point clouds or polygon meshes. However, some objects may be broken or have missing parts; thus, several methods focused on this problem have been proposed based on Geometric Deep Learning, such as GCNN, ACNN, and PointNet, among others. In this article, an approach from a different paradigm is proposed: metric data structures are used to index global descriptors in the spectral domain and allow the retrieval of a set of similar models in polynomial time; the Iterative Closest Point algorithm is then used to recover the parts of the incomplete model using the geometry and topology of the model with the smallest Hausdorff distance.

Keywords: 3D reconstruction method, point cloud completion, shape completion, similarity search

Procedia PDF Downloads 90
1273 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. NoSQL has therefore emerged to handle the problem of unstructured data. Association rules mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies is well suited to MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study is given to evaluate the performance of our algorithm against the classical Apriori algorithm.

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 123
1272 A Nonlocal Means Algorithm for Poisson Denoising Based on Information Geometry

Authors: Dongxu Chen, Yipeng Li

Abstract:

This paper presents an information geometry Nonlocal Means (NLM) algorithm for Poisson denoising. NLM estimates a noise-free pixel as a weighted average of image pixels, where each pixel is weighted according to the similarity between image patches in Euclidean space. In this work, every pixel is a Poisson distribution locally estimated by Maximum Likelihood (ML), and all the distributions together constitute a statistical manifold. An NLM denoising algorithm is conducted on the statistical manifold, where the Fisher information matrix can be used to compute the geodesics between distributions, which serve as the similarity between patches. This approach was demonstrated to be competitive with related state-of-the-art methods.

Keywords: image denoising, Poisson noise, information geometry, nonlocal-means

Procedia PDF Downloads 255
1271 Product Development Process to Obtain Community Standard Product Certificate: A Case of Bangkhonthi, Samut Songkhram, Thailand

Authors: Supattra Pranee

Abstract:

The objectives of this research were to study the product development process to obtain a community standard product certificate and to set a guideline for the product development process to obtain the community product certificate. Focus group discussion was conducted with many experts in the field, local government officials, and representatives from local producers in Bangkontee district. The findings revealed that there were eight important processes to obtain the community product certificate: 1) prepare document, 2) submit the document, 3) set up an appointment for onsite inspection, 4) onsite inspection and sample collections, 5) evaluate samples, 6) obtain test result, and 7) obtain certificate.

Keywords: perceived values, tourist destination, visiting, product development

Procedia PDF Downloads 409
1270 The Legality of the Individual Education Plan from the Teachers’ Perspective in Saudi Arabia

Authors: Sohil I. Alqazlan

Abstract:

Introduction and Objectives: The individual education plan (IEP) is the cornerstone of education for students with special education needs (SEN). The Saudi government supports students’ right to have an IEP, and their education is one of the primary goals of the Ministry of Education (MoE). However, the outcomes do not reflect the huge government investment. For example, some SEN students do not have an IEP, and poor communication was found between IEP teams and students’ families. As a result, this study investigated perspectives and understandings of the IEP from the views of SEN teachers in the Saudi context. Methods: This study utilised a qualitative approach, in which in-depth semi-structured interviews were conducted with 8 SEN teachers in schools in Riyadh (the capital city of Saudi Arabia). The researcher analysed the interview findings using a thematic analysis approach. Results and Conclusion: The legality of the IEP and its consideration as a legal document in Saudi Arabia were the main areas on which study participants were questioned. As interpreted from the responses of the SEN teachers, the IEP lacks the required legality with respect to its implementation in Saudi Arabia. All teachers were in agreement that the IEP is not considered to be a legal document in the Kingdom of Saudi Arabia. As a result, they did not use it for all their students with SEN. Such findings might have affected teaching quality and school outcomes, as all SEN students must be supported individually depending on their needs.

Keywords: individual education plan, special education, IEP, teachers

Procedia PDF Downloads 138
1269 Phishing Detection: Comparison between Uniform Resource Locator and Content-Based Detection

Authors: Nuur Ezaini Akmar Ismail, Norbazilah Rahim, Norul Huda Md Rasdi, Maslina Daud

Abstract:

Web applications are the most frequent target of attackers because they are accessible to end users. They have become even more attractive to attackers because not all end users are aware of what kind of sensitive data they have already leaked through the Internet, especially via social networks for the sake of ‘sharing’. An attacker can use information such as personal details, favourite artists, favourite actors or actresses, music, politics, and medical records to customize a phishing attack and thus trick the user into clicking on malware-laced attachments. Phishing is one of the most popular social engineering attacks against web applications. There are several methods to detect phishing websites, such as blacklist/whitelist-based, heuristic-based, and visual similarity-based detection. This paper presents a comparison between the heuristic-based technique, which uses features of a uniform resource locator (URL), and the visual similarity-based technique, which compares the content of a suspected phishing page with the legitimate one in order to detect new phishing sites, based on papers reviewed from the past few years. The comparison focuses on three indicators: false positives and negatives, accuracy of the method, and time consumed to detect a phishing website.

Keywords: heuristic-based technique, phishing detection, social engineering and visual similarity-based technique

Procedia PDF Downloads 144
1268 Popularization of the Communist Manifesto in 19th Century Europe

Authors: Xuanyu Bai

Abstract:

“The Communist Manifesto”, written by Karl Marx and Friedrich Engels, is one of the most significant documents in history, spanning fields including economics, politics, sociology, and philosophy. Instead of discussing the Communist ideas presented in the Communist Manifesto, the essay focuses on exploring the reasons that contributed to the popularization of the document and its influence on political revolutions in 19th century Europe by concentrating on the document itself along with other primary and secondary sources and contemporary artwork. Drawing on details from the Communist Manifesto and other documents, the essay argues that Marx’s writing style and word choice, his convincing notions about a new society dominated by the proletariat, and the revolutionary idea of class destruction led to the popularization of the Communist Manifesto and influenced later political revolutions.

Keywords: communist manifesto, Marx, Engels, capitalism

Procedia PDF Downloads 101
1267 Mediation in Turkey

Authors: Ibrahim Ercan, Mustafa Arikan

Abstract:

In recent years, alternative dispute resolution (ADR) methods have attracted the attention of legislators in many countries. Rather than resolving disputes by litigation, having the parties themselves put an end to a dispute is more important for the preservation of social peace. Therefore, ADR methods have been discussed more intensively in Turkey as well as around the world. After these discussions, the Mediation Act was adopted on 07.06.2012 and entered into force on 21.06.2013. According to the Mediation Act, it is only possible to mediate issues arising from private law. Also, mediation is not compulsory in Turkish law; it is optional. Therefore, the parties are completely free to choose mediation for dispute resolution. Mediators must be lawyers with five years of experience, so it is not possible for a non-lawyer to be a mediator. Beyond the five years of experience, training and success in exams, especially on body language and psychology, are also very important for becoming a mediator. If the parties reach a compromise as a result of mediation, a document is issued. This document is also enforceable under certain circumstances. Thus, the parties will not need to apply to the court again. On the contrary, they will have the opportunity to execute this document, so they can recover their debts. However, although the Mediation Act has been in force for nearly two years, the interest in mediation is not at the expected level. Therefore, making mediation mandatory for some disputes has been discussed recently. If mediation becomes mandatory and good results follow, this institution will be able to attract serious interest in Turkey. Otherwise, if the results are not satisfactory, the mediation method will be removed.

Keywords: alternative dispute resolution methods, mediation act, mediation, mediator, mediation in Turkey

Procedia PDF Downloads 337
1266 Genetic Characterization of Barley Genotypes via Inter-Simple Sequence Repeat

Authors: Mustafa Yorgancılar, Emine Atalay, Necdet Akgün, Ali Topal

Abstract:

In this study, the polymerase chain reaction-based inter-simple sequence repeat (ISSR) method, one of the DNA fingerprinting techniques, was used to investigate the genetic relationships among barley crossbreed genotypes in Turkey. Selection on a genetic basis via ISSR is important in breeding programs in terms of breeding time. 14 ISSR primers generated a total of 97 bands, of which 81 (83.35%) were polymorphic. The highest total resolution power (RP) values were obtained from the F2 (0.53) and M16 (0.51) primers. According to the ISSR results, the genetic similarity index ranged between 0.64 and 0.95; the Line 3 and Line 6 genotypes were the closest, while Line 36 was the most distant. The ISSR markers were found to be promising for assessing genetic diversity in barley crossbreed genotypes.

Keywords: barley, crossbreed, genetic similarity, ISSR

Procedia PDF Downloads 305
1265 An Integrated Fuzzy Inference System and Technique for Order of Preference by Similarity to Ideal Solution Approach for Evaluation of Lean Healthcare Systems

Authors: Aydin M. Torkabadi, Ehsan Pourjavad

Abstract:

A decade after the introduction of Lean in Saskatchewan’s public healthcare system, its effectiveness remains a controversial subject among health researchers, workers, managers, and politicians. Therefore, developing a framework to quantitatively assess Lean achievements is significant. This study investigates the success of initiatives across Saskatchewan health regions by recognizing the Lean healthcare criteria, measuring the success levels, comparing the regions, and identifying the areas for improvement. This study proposes an integrated intelligent computing approach by applying a Fuzzy Inference System (FIS) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). FIS is used as an efficient approach to assess the Lean healthcare criteria, and TOPSIS is applied for ranking the values with regard to the level of leanness. Due to the innate uncertainty in decision makers’ judgments on criteria, principles of fuzzy theory are applied. Finally, FIS-TOPSIS was established as an efficient technique for determining the lean merit of healthcare systems.
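
The TOPSIS ranking step named in this abstract can be sketched as follows. This is a generic, textbook-style TOPSIS with vector normalisation and crisp scores, not the authors' FIS-TOPSIS pipeline; the criteria, weights, and region scores are hypothetical, and the fuzzy inference stage is omitted entirely.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness coefficient in [0, 1] for each alternative (row).

    matrix  : list of rows, one per alternative, one column per criterion
    weights : criterion weights (assumed to sum to 1)
    benefit : benefit[j] is True if larger values of criterion j are better
    Assumes no criterion column is all zeros.
    """
    n_crit = len(weights)
    # Step 1: vector-normalise each column and apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 2: ideal and anti-ideal solutions, criterion by criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    # Step 3: Euclidean distance to both, then relative closeness to the ideal.
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores

# Hypothetical data: three regions scored on satisfaction (benefit) and wait time (cost).
regions = [[8.0, 30.0], [6.0, 45.0], [9.0, 20.0]]
scores = topsis(regions, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(regions)), key=lambda i: scores[i], reverse=True)
```

In this toy data the third region is best on both criteria and the second is worst on both, so they receive the extreme closeness values and the first region falls in between.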

Keywords: lean healthcare, intelligent computing, fuzzy inference system, healthcare evaluation, technique for order of preference by similarity to ideal solution, multi-criteria decision making, MCDM

Procedia PDF Downloads 128
1264 Semantic Search Engine Based on Query Expansion with Google Ranking and Similarity Measures

Authors: Ahmad Shahin, Fadi Chakik, Walid Moudani

Abstract:

Our study elaborates a potential solution for a search engine that uses semantic technology to retrieve information and display it meaningfully. Semantic search engines are not widely used on the web, as the majority are still in the beta stage or under construction. Current applications in semantic search face many problems. The major problem is analyzing and calculating the meaning of a query in order to retrieve relevant information. Another problem is the ontology-based index and its updates. Ranking results according to concept meaning and its relation to the query is another challenge. In this paper, we offer a light meta-engine (QESM) that uses Google search, and therefore Google’s index, and adapts the returned results by adding multi-query expansion. The aim was to find a reliable ranking algorithm that involves semantics and uses concepts and meanings to rank results. First, the engine finds synonyms of each query term entered by the user based on a lexical database. Then, query expansion is applied to generate different semantically analogous sentences. These are generated randomly by combining the found synonyms and the original query terms. Our model suggests the use of semantic similarity measures between two sentences. In practice, we used this method to calculate the semantic similarity between each query and the description of each page’s content generated by Google. The generated sentences are sent to the Google engine one by one, and the results are ranked again all together with the adapted ranking method (QESM). Finally, our system places the Google pages with higher similarities at the top of the results. We conducted experiments with 6 different queries. We observed that most results ranked by QESM were reordered relative to Google’s original ranking. For the queries in our experiments, QESM frequently achieves better accuracy than Google. In the worst cases, it behaves like Google.
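
The expand-then-rerank flow described in this abstract can be sketched roughly as below. The synonym table is a hypothetical stand-in for the lexical database, the sketch enumerates every synonym combination rather than sampling randomly as the abstract describes, and token-overlap (Jaccard) similarity is swapped in for the semantic similarity measures the authors use. No search engine is called; the snippets are invented.

```python
from itertools import product

def expand_query(terms, synonyms):
    """Build every query obtained by keeping each term or swapping in one of its synonyms."""
    options = [[term] + synonyms.get(term, []) for term in terms]
    return [" ".join(choice) for choice in product(*options)]

def jaccard(a, b):
    """Token-overlap similarity between two sentences (a simple stand-in measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def rerank(queries, snippets):
    """Order result snippets by their best similarity to any expanded query."""
    return sorted(snippets, key=lambda s: max(jaccard(q, s) for q in queries), reverse=True)

# Hypothetical synonym table and result snippets, for illustration only.
toy_synonyms = {"cheap": ["inexpensive"], "car": ["automobile", "vehicle"]}
queries = expand_query(["cheap", "car"], toy_synonyms)
snippets = ["weather forecast today", "inexpensive automobile deals"]
reranked = rerank(queries, snippets)
```

Here a snippet that never mentions the original terms still rises to the top because it matches one of the expanded queries.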

Keywords: semantic search engine, Google indexing, query expansion, similarity measures

Procedia PDF Downloads 396
1263 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Authors: Loai AbdAllah, Mahmoud Kaiyal

Abstract:

Missing values in real-world datasets are a common problem. Many algorithms have been developed to deal with this problem; most of them replace the missing values with a fixed value computed from the observed values. In our work, we used a distance function based on the Bhattacharyya distance, which measures the similarity of two probability distributions, to measure the distance between objects with missing values. The proposed distance distinguishes between known and unknown values. The distance between two known values is the Mahalanobis distance. When, on the other hand, one of them is missing, the distance is computed based on the distribution of the known values for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve the prevention of chronic diseases such as diabetes and cancer. For Wikaya’s recommendation system to work, the distance between users needs to be measured. Since there are missing values in the collected data, there is a need for a distance function that measures distances between incomplete user profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects when some of them contain missing values, we integrated it within the framework of the k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that the kNN classifier using our proposed distance function outperforms kNN using other existing methods.
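
For reference, the standard Bhattacharyya distance between two discrete probability distributions can be written in a few lines. This is only the textbook quantity, D_B = -ln(sum_i sqrt(p_i * q_i)); it is not the authors' distance for incomplete data, which also involves the Mahalanobis distance and the handling of missing coordinates. The example distributions are made up.

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions over the same bins.

    p and q are sequences of non-negative probabilities, each assumed to sum to 1.
    Returns math.inf when the distributions do not overlap at all.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # Bhattacharyya coefficient: 1 for identical distributions, 0 for disjoint ones.
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0.0:
        return math.inf
    # Clamp to 1 so rounding error cannot produce a slightly negative distance.
    return max(0.0, -math.log(min(coefficient, 1.0)))

same = bhattacharyya_distance([0.5, 0.5], [0.5, 0.5])
different = bhattacharyya_distance([0.5, 0.5], [0.9, 0.1])
disjoint = bhattacharyya_distance([1.0, 0.0], [0.0, 1.0])
```

The distance is zero for identical distributions, grows as they diverge, and is symmetric in its two arguments.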

Keywords: missing values, incomplete data, distance, incomplete diabetes data

Procedia PDF Downloads 183
1262 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani

Abstract:

The development of methods to annotate unknown gene functions is an important task in bioinformatics. One approach to annotation is the identification of the metabolic pathway in which genes are involved. Gene expression data have been utilized for this identification, since they reflect various intracellular phenomena. However, it has been difficult to estimate gene function with high accuracy. The low accuracy of the estimation is thought to be caused by the difficulty of accurately measuring gene expression. Even when measured under the same conditions, gene expression usually varies. In this study, we proposed a feature extraction method focusing on the variability of gene expression to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which is a method for calculating the similarity between distributions. Finally, we utilized the similarity vectors as feature vectors and trained a multiclass SVM to identify the genes' metabolic pathway. To evaluate our method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. As a result, the accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method combined with KL divergence is useful for identifying the genes' metabolic pathway.
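
The two steps "estimate a distribution from replicates, then compare distributions by KL divergence" can be sketched as follows. The sketch assumes each gene's expression is modelled as a univariate Gaussian, which the abstract does not state, and uses the closed-form KL divergence between two Gaussians. The replicate values and gene names are invented. Note that a larger KL divergence means the distributions are less alike, and that the divergence is not symmetric.

```python
import math
from statistics import mean, variance

def fit_gaussian(replicates):
    """Estimate (mean, sample variance) from replicate measurements (Gaussian model assumed)."""
    return mean(replicates), variance(replicates)

def kl_gaussian(p, q):
    """KL(P || Q) for univariate Gaussians given as (mean, variance) pairs."""
    mu_p, var_p = p
    mu_q, var_q = q
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

# Hypothetical replicate expression levels for two genes.
gene_a = fit_gaussian([1.0, 1.2, 0.8])
gene_b = fit_gaussian([2.0, 2.5, 1.5])

kl_ab = kl_gaussian(gene_a, gene_b)
kl_ba = kl_gaussian(gene_b, gene_a)
kl_aa = kl_gaussian(gene_a, gene_a)
```

Computing such a value for every gene pair would give each gene a vector of divergences to all other genes, which is one plausible reading of the "similarity vectors" used as SVM features.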

Keywords: metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning

Procedia PDF Downloads 369
1261 Recruitment Model (FSRM) for Faculty Selection Based on Fuzzy Soft

Authors: G. S. Thakur

Abstract:

This paper presents a Fuzzy Soft Recruitment Model (FSRM) for faculty selection in MHRD technical institutions. The selection criteria are based on the 4-tier flexible structure in the institutions. The Advisory Committee on Faculty Recruitment (ACoFAR) suggested nine criteria for faculty in the proposed FSRM. The fuzzy soft model is proposed in consultation with ACoFAR, based on the selection criteria. Fuzzy soft distance similarity measures are applied to find the best faculty from the applicant pool.

Keywords: fuzzy soft set, fuzzy sets, fuzzy soft distance, fuzzy soft similarity measures, ACoFAR

Procedia PDF Downloads 309
1260 Enhancement of Indexing Model for Heterogeneous Multimedia Documents: User Profile Based Approach

Authors: Aicha Aggoune, Abdelkrim Bouramoul, Mohamed Khiereddine Kholladi

Abstract:

Recent research shows that the user profile, as an important element, can improve heterogeneous information retrieval through its content. In this context, we present our indexing model for heterogeneous multimedia documents. This model is based on integrating the user profile into the indexing process. The general idea of our proposal is to exploit the concepts shared between the representation of a document and the definition of a user through his or her profile. These two elements are added as additional indexing entities to enrich the indexes of the heterogeneous corpus documents. We have developed the IRONTO domain ontology, which allows the annotation of documents. We also present the tool developed to validate the proposed model.

Keywords: indexing model, user profile, multimedia document, heterogeneous of sources, ontology

Procedia PDF Downloads 315
1259 Decoding Gender Disparities in AI: An Experimental Exploration Within the Realm of AI and Trust Building

Authors: Alexander Scott English, Yilin Ma, Xiaoying Liu

Abstract:

The widespread use of artificial intelligence in everyday life has triggered a fervent discussion covering a wide range of areas. However, to date, research on the influence of gender in various segments and factors from a social science perspective is still limited. This study aims to explore whether there are gender differences in human trust in AI for its application in basic everyday life and how trust correlates with perceived similarity, perceived emotions (including competence and warmth), and attractiveness. We conducted a study involving 321 participants using a 2 (masculinized vs. feminized voice of the AI) × 2 (pitch level of the AI's voice) between-subjects experimental design. Four contexts were created for the study and randomly assigned. The results of the study showed significant gender differences in perceived similarity, trust, and perceived emotion of the AIs, with females rating them significantly higher than males. Trust was higher in relation to AIs presenting the same gender (e.g., human female to female AI, human male to male AI). Mediation modeling tests indicated that emotion perception and similarity played a mediating role in trust. Notably, although trust in AIs was strongly correlated with human gender, there was no significant effect of the gender of the AI. In addition, the study discusses the effects of subjects' age, job search experience, and job type on the findings.

Keywords: artificial intelligence, gender differences, human-robot trust, mediation modeling

Procedia PDF Downloads 13
1258 Plagiarism Detection for Flowchart and Figures in Texts

Authors: Ahmadu Maidorawa, Idrissa Djibo, Muhammad Tella

Abstract:

This paper presents a method for detecting flowchart and figure plagiarism based on shape-based image processing and multimedia retrieval. The method managed to retrieve flowcharts with ranked similarity according to different matching sets. Plagiarism is a well-known phenomenon in the academic arena. Copying other people's work is considered a serious offense that needs to be checked. Many plagiarism detection systems, such as Turnitin, have been developed to provide these checks. Most, if not all, discard the figures and charts before checking for plagiarism. Discarding the figures and charts results in loopholes that people can take advantage of. That means people can plagiarize figures and charts easily without the current plagiarism systems detecting it. Very few papers discuss flowchart plagiarism detection. Therefore, there is a need to develop a system that will detect plagiarism in figures and charts.

Keywords: flowchart, multimedia retrieval, figures similarity, image comparison, figure retrieval

Procedia PDF Downloads 427
1257 Cost of Outpatient Procedures for Ostomized Patients Treated in the Public Health Network in Brazil and Its Impact on the Budget of the Unified Health System

Authors: Karina Guimaraes, Lilian Santos

Abstract:

This study aims to plan and institute monitoring actions as a way of understanding the scenario of assistance to patients with a stoma treated in the public health network in Brazil from January to November 2016, through the elaboration of a technical document containing a survey of the number of procedures offered and the value of the ostomy services accredited in the Unified Health System (SUS). The purpose of this document is to improve the quality of these services through the efficient management of available financial resources, making it indispensable for the creation of strategies for the implementation of care services for people with stomas as a strategic tool in promotion, prevention, qualification, and efficiency in health care.

Keywords: health economic, management, ostomy, unified health system

Procedia PDF Downloads 276
1256 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonating legitimate pages with phishing pages to steal sensitive data, or redirecting victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become increasingly challenging to detect proxy services due to their dynamic updates and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of the bipartite graphs to discover the social-behavior similarity of web proxy users. Based on the similarity matrices of end users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent behavior clusters of web proxy users. A web proxy URL may vary over time, but the users' inherent interests do not. Based on this intuition, and using our private tools implemented with WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experimental results based on real datasets show that the behavior clusters not only reduce the number of URLs to analyze but also provide an effective way to detect web proxies, especially unknown ones.
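
The one-mode projection step in this abstract can be illustrated with a small sketch. It assumes the bipartite graph links hosts to the destinations they contact and that the projection weight between two hosts is the number of destinations they share; the abstract does not specify the weighting. Host and destination names are hypothetical, and the spectral clustering stage is not shown.

```python
from collections import defaultdict
from itertools import combinations

def one_mode_projection(edges):
    """Project a host-destination bipartite graph onto the host side.

    edges : iterable of (host, destination) pairs
    Returns a dict mapping each sorted host pair (a, b) to the number of
    destinations both hosts contacted (an assumed similarity weight).
    """
    hosts_by_destination = defaultdict(set)
    for host, destination in edges:
        hosts_by_destination[destination].add(host)
    weights = defaultdict(int)
    for hosts in hosts_by_destination.values():
        # Every pair of hosts that shares this destination gains one unit of weight.
        for a, b in combinations(sorted(hosts), 2):
            weights[(a, b)] += 1
    return dict(weights)

# Hypothetical traffic: h1 and h2 visit the same two destinations, h3 overlaps on one.
toy_edges = [
    ("h1", "d1"), ("h1", "d2"),
    ("h2", "d1"), ("h2", "d2"),
    ("h3", "d1"), ("h3", "d3"),
]
projection = one_mode_projection(toy_edges)
```

The resulting pair weights can be arranged into a host-by-host similarity matrix, which is the kind of input a spectral clustering algorithm expects.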

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 216
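The one-mode projection step described in entry 1256 above can be illustrated with a minimal sketch. It is not the authors' implementation: the host names, destination names, and the choice of Jaccard similarity as the edge weight are assumptions made for illustration, and the spectral clustering stage is left out.

```python
from itertools import combinations


def one_mode_projection(host_to_destinations):
    """Project a host-destination bipartite graph onto the host side.

    Returns a dict mapping each unordered host pair to the Jaccard
    similarity of their destination sets (an assumed weighting).
    Pairs with no shared destination get no edge in the projection.
    """
    edges = {}
    for a, b in combinations(sorted(host_to_destinations), 2):
        dests_a = set(host_to_destinations[a])
        dests_b = set(host_to_destinations[b])
        shared = dests_a & dests_b
        if shared:
            edges[(a, b)] = len(shared) / len(dests_a | dests_b)
    return edges


# Hypothetical toy traffic: host -> set of destinations it contacted.
traffic = {
    "host_a": {"proxy1.example", "news.example"},
    "host_b": {"proxy1.example", "news.example", "mail.example"},
    "host_c": {"mail.example", "video.example"},
}

projection = one_mode_projection(traffic)
for pair, weight in sorted(projection.items()):
    print(pair, round(weight, 3))
```

The resulting weights form a host-by-host similarity matrix of the kind a clustering algorithm could then consume.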
1255 Timescape-Based Panoramic View for Historic Landmarks

Authors: H. Ali, A. Whitehead

Abstract:

Providing a panoramic view of famous landmarks around the world offers artistic and historic value for historians, tourists, and researchers. Exploring the history of famous landmarks by presenting a comprehensive view of a temporal panorama merged with geographical and historical information presents a unique challenge of dealing with images that span a long period, from the 1800s up to the present. This work presents the concept of temporal panorama through a timeline display of aligned historic and modern images for many famous landmarks. Building this panorama requires a collection of hundreds of thousands of landmark images from the Internet, comprising historic images and modern images of the digital age. These images have to be classified for subset selection to keep the most suitable images that chronologically document a landmark’s history. Processing historic images captured using older analog technology under various capturing conditions represents a big challenge when they have to be used with modern digital images. Successful processing of historic images to prepare them for the next steps of temporal panorama creation represents an active contribution to cultural heritage preservation through the fulfillment of one of UNESCO’s goals in preserving and displaying famous worldwide landmarks.

Keywords: cultural heritage, image registration, image subset selection, registered image similarity, temporal panorama, timescapes

Procedia PDF Downloads 127
1254 Experimental Study Analyzing the Similarity Theory Formulations for the Effect of Aerodynamic Roughness Length on Turbulence Length Scales in the Atmospheric Surface Layer

Authors: Matthew J. Emes, Azadeh Jafari, Maziar Arjomandi

Abstract:

Velocity fluctuations of shear-generated turbulence are largest in the atmospheric surface layer (ASL) of nominal 100 m depth, which can lead to dynamic effects such as galloping and flutter on small physical structures on the ground when the turbulence length scales and characteristic length of the physical structure are the same order of magnitude. Turbulence length scales are a measure of the average sizes of the energy-containing eddies that are widely estimated using two-point cross-correlation analysis to convert the temporal lag to a separation distance using Taylor’s hypothesis that the convection velocity is equal to the mean velocity at the corresponding height. Profiles of turbulence length scales in the neutrally-stratified ASL, as predicted by Monin-Obukhov similarity theory in Engineering Sciences Data Unit (ESDU) 85020 for single-point data and ESDU 86010 for two-point correlations, are largely dependent on the aerodynamic roughness length. Field measurements have shown that longitudinal turbulence length scales show significant regional variation, whereas length scales of the vertical component show consistent Obukhov scaling from site to site because of the absence of low-frequency components. Hence, the objective of this experimental study is to compare the similarity theory relationships between the turbulence length scales and aerodynamic roughness length with those calculated using the autocorrelations and cross-correlations of field measurement velocity data at two sites: the Surface Layer Turbulence and Environmental Science Test (SLTEST) facility in a desert ASL in Dugway, Utah, USA and the Commonwealth Scientific and Industrial Research Organisation (CSIRO) wind tower in a rural ASL in Jemalong, NSW, Australia. The results indicate that the longitudinal turbulence length scales increase with increasing aerodynamic roughness length, as opposed to the relationships derived by similarity theory correlations in ESDU models. 
However, the ratio of the turbulence length scales in the lateral and vertical directions to the longitudinal length scales is relatively independent of surface roughness, showing consistent inner-scaling between the two sites and the ESDU correlations. Further, the diurnal variation of wind velocity due to changes in atmospheric stability conditions has a significant effect on the turbulence structure of the energy-containing eddies in the lower ASL.

Keywords: aerodynamic roughness length, atmospheric surface layer, similarity theory, turbulence length scales

Procedia PDF Downloads 96
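For entry 1254 above, the conversion from a temporal correlation to a length scale via Taylor's hypothesis can be sketched as follows. This is a simplified single-point illustration, not the study's procedure: the function name, the integration of the autocorrelation up to its first zero crossing, and the synthetic velocity signal are all assumptions.

```python
import math


def integral_length_scale(velocity, dt):
    """Estimate a longitudinal integral length scale from one velocity record.

    The autocorrelation of the fluctuations is integrated (trapezoid rule)
    up to its first non-positive value to get an integral time scale, which
    Taylor's hypothesis converts to a length using the mean velocity.
    """
    n = len(velocity)
    mean_u = sum(velocity) / n
    fluct = [u - mean_u for u in velocity]
    variance = sum(f * f for f in fluct) / n
    if variance == 0.0:
        raise ValueError("velocity record has no fluctuations")

    time_scale = 0.0
    prev_rho = 1.0  # autocorrelation coefficient at zero lag
    for lag in range(1, n):
        cov = sum(fluct[i] * fluct[i + lag] for i in range(n - lag)) / (n - lag)
        rho = cov / variance
        if rho <= 0.0:
            break
        time_scale += 0.5 * (prev_rho + rho) * dt
        prev_rho = rho
    return mean_u * time_scale


# Synthetic record (assumed): 10 m/s mean flow plus a 20 s periodic
# fluctuation, sampled once per second for 200 s.
dt = 1.0
record = [10.0 + 0.5 * math.sin(2.0 * math.pi * i / 20.0) for i in range(200)]
length_scale = integral_length_scale(record, dt)
print(f"integral length scale: {length_scale:.1f} m")
```

Real field data would need detrending and a check of the stationarity assumption before an estimate like this is meaningful.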
1253 Generation of Photo-Mosaic Images through Block Matching and Color Adjustment

Authors: Hae-Yeoun Lee

Abstract:

Mosaic refers to a technique that makes an image by gathering many small materials in various colours. This paper presents an automatic algorithm that makes a photomosaic image from photos. The algorithm is composed of four steps: partition and feature extraction, block matching, redundancy removal and colour adjustment. The input image is partitioned into small blocks to extract features. Each block is matched to a similar photo in the database by comparing similarity using the Euclidean distance between blocks. The intensity of the block is adjusted to enhance the similarity of the image by replacing the light and dark values with those of the relevant block. Further, the quality of the image is improved by minimizing the redundancy of tiles in adjacent blocks. Experimental results show that the proposed algorithm is excellent in both quantitative and qualitative analysis.

Keywords: photomosaic, Euclidean distance, block matching, intensity adjustment

Procedia PDF Downloads 247
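The block matching step of entry 1253 above can be sketched as a nearest-neighbour search under Euclidean distance. The abstract does not say which feature is extracted, so this sketch assumes the mean RGB colour of a block; the function names and sample values are hypothetical.

```python
import math


def mean_color(block):
    """Average RGB colour of a block given as a list of (r, g, b) pixels."""
    n = len(block)
    return tuple(sum(pixel[c] for pixel in block) / n for c in range(3))


def best_tile(block, tile_features):
    """Index of the tile whose feature is closest to the block's feature."""
    target = mean_color(block)
    return min(
        range(len(tile_features)),
        key=lambda i: math.dist(target, tile_features[i]),
    )


# Hypothetical database: one precomputed mean colour per candidate photo.
tiles = [(250.0, 10.0, 10.0), (10.0, 250.0, 10.0), (10.0, 10.0, 250.0)]
# A mostly red 2x2 block from the input image.
block = [(200, 30, 20), (220, 40, 30), (210, 20, 10), (230, 30, 20)]

match = best_tile(block, tiles)
print("best matching tile index:", match)
```

Redundancy removal could be layered on top by skipping tiles already used in adjacent blocks, at the cost of a slightly larger colour distance.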
1252 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen big advances over recent years because of the spread of the internet, which generates a tremendous volume of data every day, and because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines to which group each data instance belongs within a given dataset. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning based. Each of these types of techniques has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. Results of the proposed approach exceeded those of most common classification techniques, with an f-measure exceeding 97% on the IRIS dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 421
1251 Effect of Joule Heating on Chemically Reacting Micropolar Fluid Flow over Truncated Cone with Convective Boundary Condition Using Spectral Quasilinearization Method

Authors: Pradeepa Teegala, Ramreddy Chetteti

Abstract:

This work emphasizes the effects of heat generation/absorption and Joule heating on chemically reacting micropolar fluid flow over a truncated cone with convective boundary condition. For this complex fluid flow problem, the similarity solution does not exist and hence using non-similarity transformations, the governing fluid flow equations along with related boundary conditions are transformed into a set of non-dimensional partial differential equations. Several authors have applied the spectral quasi-linearization method to solve the ordinary differential equations, but here the resulting nonlinear partial differential equations are solved for non-similarity solution by using a recently developed method called the spectral quasi-linearization method (SQLM). Comparison with previously published work on special cases of the problem is performed and found to be in excellent agreement. The influence of pertinent parameters namely Biot number, Joule heating, heat generation/absorption, chemical reaction, micropolar and magnetic field on physical quantities of the flow are displayed through graphs and the salient features are explored in detail. Further, the results are analyzed by comparing with two special cases, namely, vertical plate and full cone wherever possible.

Keywords: chemical reaction, convective boundary condition, Joule heating, micropolar fluid, spectral quasilinearization method

Procedia PDF Downloads 314
1250 Saliency Detection Using a Background Probability Model

Authors: Junling Li, Fang Meng, Yichun Zhang

Abstract:

Image saliency detection has long been studied, yet several challenging problems remain unsolved, such as inaccurate saliency detection in complex scenes and the suppression of salient objects at the image borders. In this paper, we propose a new saliency detection algorithm to solve these problems. We represent the image as a graph with superpixels as nodes. By considering appearance similarity between the boundary and the background, the proposed method chooses non-salient boundary nodes as background priors to construct the background probability model. The probability that each node belongs to the model is computed, which measures its similarity to the background. We can thus calculate saliency using the transformed probability as a metric. We compare our algorithm with ten state-of-the-art saliency detection methods on the public database. Experimental results show that our simple and effective approach can address the challenging problems that have long hindered image saliency detection.

Keywords: visual saliency, background probability, boundary knowledge, background priors

Procedia PDF Downloads 388
1249 A Numerical Solution Based on Operational Matrix of Differentiation of Shifted Second Kind Chebyshev Wavelets for a Stefan Problem

Authors: Rajeev, N. K. Raigar

Abstract:

In this study, a one-dimensional phase change problem (a Stefan problem) is considered, and a numerical solution of this problem is discussed. First, we use a similarity transformation to convert the governing equations and their boundary conditions into ordinary differential equations. The solutions of the ordinary differential equations with the associated boundary conditions and interface condition (Stefan condition) are obtained using a numerical approach based on the operational matrix of differentiation of shifted second kind Chebyshev wavelets. The obtained results are compared with the existing exact solution and are found to be sufficiently accurate.

Keywords: operational matrix of differentiation, similarity transformation, shifted second kind Chebyshev wavelets, Stefan problem

Procedia PDF Downloads 376
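To illustrate the similarity transformation mentioned in entry 1249 above, the classical one-phase Stefan problem is worked below. This is the standard textbook case and may differ from the formulation treated in the paper; the symbols (temperature u, diffusivity α, conductivity k, density ρ, latent heat L, interface position s(t), constant λ) are assumptions introduced here.

```latex
% Heat equation in the melted region and the Stefan (interface) condition
\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2},
\quad 0 < x < s(t), \qquad
-k \left.\frac{\partial u}{\partial x}\right|_{x = s(t)} = \rho L \frac{ds}{dt}.

% Similarity variable and ansatz
\eta = \frac{x}{2\sqrt{\alpha t}}, \qquad u(x,t) = f(\eta), \qquad
s(t) = 2\lambda\sqrt{\alpha t}.

% Using u_t = -\frac{\eta}{2t} f'(\eta) and u_{xx} = \frac{1}{4\alpha t} f''(\eta),
% the PDE reduces to an ODE and the interface condition to an algebraic relation
f''(\eta) + 2\eta f'(\eta) = 0, \qquad
-k f'(\lambda) = 2 \rho L \alpha \lambda.
```

The reduced ordinary differential equation is the kind of problem to which a wavelet-based operational matrix method could then be applied.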