Search results for: co-occurrence
4 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text
Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni
Abstract:
The problem of entity relation discovery in unstructured data, a well-covered topic in the literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary or a specific collection of named items. In many cases, machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations: a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper we generalise this technique into a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition by applying the technique to a data set consisting of the text of the Bible, split into verses.
Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance
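The weighted-distance idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state the weighting function or the window size, so the `1 / distance` weight, the `max_distance` parameter, and the function name `weighted_cooccurrences` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, max_distance=4,
                           weight=lambda d: 1.0 / d):
    """Build a distance-weighted cooccurrence graph for one text unit.

    `weight` and `max_distance` are illustrative assumptions; the source
    only says that closer pairs should count more.
    """
    graph = defaultdict(float)
    # Keep only the positions of tokens that are named items.
    hits = [(i, t) for i, t in enumerate(tokens) if t in entities]
    for a, (i, u) in enumerate(hits):
        for j, v in hits[a + 1:]:
            d = j - i
            if d > max_distance:
                break  # hits are ordered by position, so the rest are farther
            if u != v:
                # Undirected edge: store each pair under a sorted key.
                graph[tuple(sorted((u, v)))] += weight(d)
    return dict(graph)


verse = "abraham begat isaac and isaac begat jacob".split()
names = {"abraham", "isaac", "jacob"}
print(weighted_cooccurrences(verse, names))
```

Setting `weight=lambda d: 1.0` reduces this to the plain sliding-window count that the abstract describes as prior work. For a corpus split into verses, the function would be called once per verse and the edge weights summed.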
Procedia PDF Downloads 154
3 The Impact of Environmental Social and Governance (ESG) on Corporate Financial Performance (CFP): Evidence from New Zealand Companies
Authors: Muhammad Akhtaruzzaman
Abstract:
The impact of corporate environmental, social, and governance (ESG) performance on financial performance is often difficult to quantify, even though ESG-related theories predict that ESG performance improves the financial performance of a company. This research examines the link between corporate ESG performance and the financial performance of companies listed on the NZX (New Zealand Stock Exchange). For this purpose, this research uses a mixed-methods approach to examine and understand this link. While the quantitative results found no robust evidence of such a link, the qualitative analysis of content data suggests that a strong cooccurrence exists between ESG performance and financial performance. The findings of this research have important implications for policymakers seeking to support higher ESG-performing companies and for management practitioners developing ESG-related strategies.
Keywords: ESG, financial performance, New Zealand firms, thematic analysis, mixed methods
Procedia PDF Downloads 66
2 Exploring the Landscape of Information Visualization through a Mark Lombardi Lens
Authors: Alon Friedman, Antonio Sanchez Chinchon
Abstract:
This bibliometric study takes an artistic and storytelling approach to explore the term “information visualization.” We analyze over 1008 titles collected from databases that specialize in data visualization research to report on the characteristics and development trends in the field. Employing a qualitative methodology, we extract the leading terms from these titles and explore the cooccurrence of these terms to gain deeper insights. By systematically analyzing the leading terms and their relationships within the titles, we shed light on the prevailing themes that shape the landscape of “information visualization,” and we employ the artist Mark Lombardi’s techniques to visualize our findings. In doing so, this study provides valuable insights into bibliometric visualization while also opening new avenues for leveraging art and storytelling to enhance data representation.
Keywords: bibliometrics analysis, Mark Lombardi design, information visualization, qualitative methodology
Procedia PDF Downloads 90
1 A Feature Clustering-Based Sequential Selection Approach for Color Texture Classification
Authors: Mohamed Alimoussa, Alice Porebski, Nicolas Vandenbroucke, Rachid Oulad Haj Thami, Sana El Fkihi
Abstract:
Color and texture are highly discriminant visual cues that provide essential information in many types of images. Color texture representation and classification is therefore one of the most challenging problems in computer vision and image processing applications. Color textures can be represented in different color spaces by using multiple image descriptors, which generate a high-dimensional set of texture features. In order to reduce the dimensionality of the feature set, feature selection techniques can be used. The goal of feature selection is to find a relevant subset of the original feature space that can improve the accuracy and efficiency of a classification algorithm. Traditionally, feature selection has focused on removing irrelevant features, neglecting the possible redundancy between relevant ones. This is why some feature selection approaches use feature clustering analysis to aid and guide the search. These techniques can be divided into two categories. i) Feature clustering-based ranking algorithms use feature clustering as an analysis that comes before feature ranking: after dividing the feature set into groups, these approaches perform a feature ranking in order to select the most discriminant feature of each group. ii) Feature clustering-based subset search algorithms can use feature clustering following one of three strategies: as an initial step that comes before the search, bound to and combined with the search, or as an alternative that replaces the search. In this paper, we propose a new feature clustering-based sequential selection approach for the purpose of color texture representation and classification. Our approach is a three-step algorithm. First, irrelevant features are removed from the feature set using a class-correlation measure. Then, using a new automatic feature clustering algorithm, the feature set is divided into several feature clusters. Finally, a sequential search algorithm, based on a filter model and a separability measure, builds a relevant and non-redundant feature subset: at each step, a feature is selected, and the features of the same cluster are removed and thus not considered thereafter. This significantly speeds up the selection process, since a large number of redundant features are eliminated at each step. The proposed algorithm uses the clustering algorithm bound to and combined with the search. Experiments using a combination of two well-known texture descriptors, namely Haralick features extracted from Reduced Size Chromatic Co-occurrence Matrices (RSCCMs) and features extracted from Local Binary Patterns (LBP) image histograms, on five color texture data sets (Outex, NewBarktex, Parquet, Stex and USPtex) demonstrate the efficiency of our method compared to seven state-of-the-art methods in terms of accuracy and computation time.
Keywords: feature selection, color texture classification, feature clustering, color LBP, chromatic cooccurrence matrix
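The three-step structure described in the abstract above (filter irrelevant features, cluster the rest, then pick features sequentially while discarding each pick's cluster) can be sketched as a toy example. The abstract does not specify the class-correlation measure, the clustering algorithm, or the separability measure, so this sketch substitutes absolute Pearson correlation for all three; the thresholds `relevance_min` and `redundancy_min` and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(columns, labels, relevance_min=0.3, redundancy_min=0.9):
    # Step 1 (stand-in): drop features weakly correlated with the class label.
    relevance = {f: abs(pearson(col, labels)) for f, col in columns.items()}
    relevant = [f for f in columns if relevance[f] >= relevance_min]

    # Step 2 (stand-in): greedy clustering; a feature joins the first cluster
    # whose first member it is strongly correlated with.
    clusters = []
    for f in relevant:
        for cluster in clusters:
            if abs(pearson(columns[f], columns[cluster[0]])) >= redundancy_min:
                cluster.append(f)
                break
        else:
            clusters.append([f])

    # Step 3 (stand-in): sequential pick; select the best remaining feature,
    # then remove its whole cluster from further consideration.
    selected, remaining = [], set(relevant)
    while remaining:
        best = max(sorted(remaining), key=lambda f: relevance[f])
        selected.append(best)
        remaining -= set(next(c for c in clusters if best in c))
    return selected


features = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8.1],   # nearly a rescaled copy of "a" (redundant)
    "c": [1, -1, 1, -1],   # uncorrelated with the labels (irrelevant)
    "d": [5, 3, 1, 2],
}
print(select_features(features, labels=[0, 0, 1, 1]))
```

Because each selection removes an entire cluster, the loop runs at most once per cluster rather than once per feature, which is the source of the speed-up the abstract mentions.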
Procedia PDF Downloads 138