Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 9

Search results for: pagerank

9 Prioritization of Mutation Test Generation with Centrality Measure

Authors: Supachai Supmak, Yachai Limpiyakorn

Abstract:

Mutation testing can be used to assess the quality of test cases. Prioritizing mutation test generation is a critical element of industry practice because it contributes to the evaluation of test cases. The industry generally delivers products under time-to-market pressure and thus inevitably sacrifices software testing tasks, even though many test cases are required for software verification. This paper presents an approach that applies a social network centrality measure, PageRank, to prioritize mutation test generation. Source code modules with the highest PageRank values are addressed first when developing test cases, as defects or anomalies in these modules may cause consequent defects in many other associated modules. Moreover, the approach helps identify reducible test cases in the test suite while still meeting the same criteria as the original set of test cases.
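
The ranking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the module names, the dependency graph, and the convention that an edge A -> B means "A depends on B" are all assumptions made for illustration, and PageRank is computed here by plain power iteration.

```python
def pagerank(graph, damping=0.85, iterations=100, tol=1e-9):
    """Power-iteration PageRank.

    graph maps each node to the list of nodes it links to.
    """
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = graph.get(v, [])
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        converged = sum(abs(new[v] - rank[v]) for v in nodes) < tol
        rank = new
        if converged:
            break
    return rank


# Hypothetical dependency graph: an edge A -> B means module A depends on B,
# so a defect in B can propagate to A.
calls = {
    "ui": ["service", "utils"],
    "service": ["repository", "utils"],
    "repository": ["utils"],
    "utils": [],
}

ranks = pagerank(calls)
# Modules to target first when generating mutation tests, highest rank first.
priority = sorted(ranks, key=ranks.get, reverse=True)
```

Under this edge convention, modules that many others depend on accumulate rank, which matches the abstract's rationale that their defects affect many associated modules.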

Keywords: software testing, mutation test, network centrality measure, test case prioritization

Procedia PDF Downloads 112

8 Persistent Homology of Convection Cycles in Network Flows

Authors: Minh Quang Le, Dane Taylor

Abstract:

Convection is a well-studied topic in fluid dynamics, yet it is less understood in the context of network flows. Here, we incorporate techniques from topological data analysis (namely, persistent homology) to automate the detection and characterization of convective/cyclic/chiral flows over networks, particularly those that arise for irreversible Markov chains (MCs). As two applications, we study convection cycles arising under the PageRank algorithm, and we investigate chiral edge flows for a stochastic model of a bi-monomer's configuration dynamics. Our experiments highlight how system parameters (e.g., the teleportation rate for PageRank and the transition rates of external and internal state changes for a monomer) can act as homology regularizers of convection, which we summarize with persistence barcodes and homological bifurcation diagrams. Our approach establishes a new connection between the study of convection cycles and homology, the branch of mathematics that formally studies cycles, which has diverse potential applications throughout the sciences and engineering.

Keywords: homology, persistent homology, Markov chains, convection cycles, filtration

Procedia PDF Downloads 136

7 Trend Detection Using Community Rank and Hawkes Process

Authors: Shashank Bhatnagar, W. Wilfred Godfrey

Abstract:

In this paper, we develop an approach for finding trending topics that considers not only user-topic interaction but also the community to which each user belongs. The method extends the previous user-topic interaction approach to user-community-topic interaction, with a speed-up in the range of [1.1-3]. We assume that trend detection in a social network depends on two things. The first is the broadcast of messages in the social network, governed by a self-exciting point process, namely the Hawkes process; the second is the community rank. The influencer node links to others in the community and determines the community rank based on its PageRank and the number of users linked to that community. The community rank determines the influence of one community over another. Hence, the Hawkes process with a user-community-topic kernel determines the trending topic disseminated in the social network.

Keywords: community detection, community rank, Hawkes process, influencer node, pagerank, trend detection

Procedia PDF Downloads 383

6 Interbank Networks and the Benefits of Using Multilayer Structures

Authors: Danielle Sandler dos Passos, Helder Coelho, Flávia Mori Sarti

Abstract:

Complexity science seeks to understand systems by adopting diverse theories from various areas. Network analysis has been gaining ground and credibility, notably for biological, social and economic systems. A significant part of the literature focuses only on monolayer representations of connections among agents, considering one level of their relationships and excluding other levels of interaction, which leads to simplistic results in network analysis. Therefore, this work aims to demonstrate the advantages of using multilayer networks for the representation and analysis of networks. To this end, we analyzed an interbank network composed of 42 banks, comparing the centrality measures of the agents (degree and PageRank) resulting from each method (monolayer vs. multilayer). Multilayer analysis proved to be the most reliable and efficient for studying current networks and highlighted JP Morgan and Deutsche Bank as the most important banks in the analyzed network.

Keywords: complexity, interbank networks, multilayer networks, network analysis

Procedia PDF Downloads 282

5 A Complex Network Approach to Structural Inequality of Educational Deprivation

Authors: Harvey Sanchez-Restrepo, Jorge Louca

Abstract:

Equity and education are a major focus of government policies around the world due to their relevance for addressing the sustainable development goals launched by UNESCO. In this research, we developed a primary analysis of a data set of more than one hundred educational and non-educational factors associated with learning, coming from a census-based large-scale assessment carried out in Ecuador for 1,038,328 students, their families, teachers, and school directors, throughout 2014-2018. Each participating student was assessed by a standardized computer-based test. Learning outcomes were calibrated through item response theory with a two-parameter logistic model to obtain raw scores that were re-scaled and synthesized into a learning index (LI). Our objective was to develop a network for modelling educational deprivation and to analyze the structure of inequality gaps, as well as their relationship with socioeconomic status, school financing, and student's ethnicity. Results from the model show that 348,270 students did not develop the minimum skills (prevalence rate = 0.215) and that Afro-Ecuadorian, Montuvio and Indigenous students exhibited the highest prevalence, with 0.312, 0.278 and 0.226, respectively. Regarding the socioeconomic status of students (SES), the modularity class clearly shows that the system is out of equilibrium: the first decile (the poorest) exhibits a prevalence rate of 0.386, while the rate for decile ten (the richest) is 0.080, showing an intense negative relationship between learning and SES given by R = -0.58 (p < 0.001). Another interesting and unexpected result is the average weighted degree (426.9) for both private and public schools serving Afro-Ecuadorian students, groups that obtained the highest PageRank (0.426), indicating that they suffer the highest educational deprivation due to discrimination, even when belonging to the richest decile.
The model also identified the factors that explain deprivation through the highest PageRank and the greatest degree of connectivity for the first decile: financial bonus for attending school, computer access, internet access, number of children, living with at least one parent, books access, read books, phone access, time for homework, teachers arriving late, paid work, positive expectations about schooling, and mother education. These results provide accurate and clear knowledge about the variables affecting the poorest students and the inequalities they produce, from which needs profiles can be defined, as well as actions on the factors that can be influenced. Finally, these results confirm that network analysis is fundamental for educational policy, especially when linking reliable microdata with social macro-parameters, because it allows us to infer how gaps in educational achievement are driven by students’ context at the time of assigning resources.

Keywords: complex network, educational deprivation, evidence-based policy, large-scale assessments, policy informatics

Procedia PDF Downloads 122

4 Social Network Roles in Organizations: Influencers, Bridges, and Soloists

Authors: Sofia Dokuka, Liz Lockhart, Alex Furman

Abstract:

Organizational hierarchy, traditionally composed of individual contributors, middle management, and executives, is enhanced by the understanding of informal social roles. These roles, identified with organizational network analysis (ONA), might have an important effect on organizational functioning. In this paper, we identify three social roles – influencers, bridges, and soloists, and provide empirical analysis based on real-world organizational networks. Influencers are employees with broad networks and whose contacts also have rich networks. Influence is calculated using PageRank, initially proposed for measuring website importance, but now applied in various network settings, including social networks. Influencers, having high PageRank, become key players in shaping opinions and behaviors within an organization. Bridges serve as links between loosely connected groups within the organization. Bridges are identified using betweenness and Burt’s constraint. Betweenness quantifies a node's control over information flows by evaluating its role in the control over the shortest paths within the network. Burt's constraint measures the extent of interconnection among an individual's contacts. A high constraint value suggests fewer structural holes and lesser control over information flows, whereas a low value suggests the contrary. Soloists are individuals with fewer than 5 stable social contacts, potentially facing challenges due to reduced social interaction and hypothetical lack of feedback and communication. We considered social roles in the analysis of real-world organizations (N=1,060). Based on data from digital traces (Slack, corporate email and calendar) we reconstructed an organizational communication network and identified influencers, bridges and soloists. We also collected employee engagement data through an online survey. Among the top-5% of influencers, 10% are members of the Executive Team. 
56% of the Executive Team members are part of the top influencers group. The same proportion of top influencers (10%) is individual contributors, accounting for just 0.6% of all individual contributors in the company. The majority of influencers (80%) are at the middle management level. Out of all middle managers, 19% hold the role of influencers. However, individual contributors represent a small proportion of influencers, and having information about these individuals who hold influential roles can be crucial for management in identifying high-potential talents. Among the bridges, 4% are members of the Executive Team, 16% are individual contributors, and 80% are middle management. Predominantly middle management acts as a bridge. Bridge positions of some members of the executive team might indicate potential micromanagement on the leader's part. Recognizing the individuals serving as bridges in an organization uncovers potential communication problems. The majority of soloists are individual contributors (96%), and 4% of soloists are from middle management. These managers might face communication difficulties. We found an association between being an influencer and attitude toward a company's direction. There is a statistically significant 20% higher perception that the company is headed in the right direction among influencers compared to non-influencers (p < 0.05, Mann-Whitney test). Taken together, we demonstrate that considering social roles in the company might indicate both positive and negative aspects of organizational functioning that should be considered in data-driven decision-making.

Keywords: organizational network analysis, social roles, influencer, bridge, soloist

Procedia PDF Downloads 104

3 Normalizing Scientometric Indicators of Individual Publications Using Local Cluster Detection Methods on Citation Networks

Authors: Levente Varga, Dávid Deritei, Mária Ercsey-Ravasz, Răzvan Florian, Zsolt I. Lázár, István Papp, Ferenc Járai-Szabó

Abstract:

One of the major shortcomings of widely used scientometric indicators is that different disciplines cannot be compared with each other. The issue of cross-disciplinary normalization has long been discussed, but even the classification of publications into scientific domains poses problems. Structural properties of citation networks offer new possibilities; however, the large size and constant growth of these networks call for caution. Here we present a new tool that relies on the structural properties of citation networks to perform cross-field normalization of scientometric indicators of individual publications. Due to the large size of the networks, a systematic procedure for identifying scientific domains based on a local community detection algorithm is proposed. The algorithm is tested with different benchmark and real-world networks. Then, using this algorithm, the mechanism of the scientometric indicator normalization process is shown for a few indicators such as the citation number, P-index and a local version of the PageRank indicator. The fat-tail trend of the article indicator distribution enables us to successfully perform the indicator normalization process.

Keywords: citation networks, cross-field normalization, local cluster detection, scientometric indicators

Procedia PDF Downloads 203

2 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources on different topics. The availability of this enormous amount of data requires us to adopt effective computational tools to explore it. This has led to intense and growing interest in the research community in developing computational methods focused on processing text data. One line of study focuses on condensing text so that we can reach a higher level of understanding in a shorter time. The two important tasks for this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key words of a text, which makes us familiar with its general topic. In text summarization, we are interested in producing a short text that includes the important information in the document. The TextRank algorithm, an unsupervised learning method that is an extension of PageRank (the base algorithm of the Google search engine for searching and ranking pages), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and return them as a result. However, it neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used on its own or as part of generating the summary to overcome coverage problems.
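
As a rough illustration of the TextRank idea mentioned in this abstract, the sketch below ranks sentences by running a weighted PageRank over a sentence-similarity graph. It is not the authors' method: the similarity function is a purely lexical word-overlap measure, and the `similarity` parameter is a hypothetical hook showing where a semantic similarity (for example, one based on embeddings) could be substituted. The example sentences are invented.

```python
import math
import re


def overlap_similarity(a, b):
    """Lexical overlap between two token sets, normalized by their log sizes."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank(sentences, similarity=overlap_similarity, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over a similarity graph."""
    tokens = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    totals = [sum(row) for row in weights]  # total edge weight per sentence
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / totals[j] * scores[j]
                for j in range(n)
                if totals[j] > 0
            )
            for i in range(n)
        ]
    return scores


sentences = [
    "PageRank ranks web pages by their link structure.",
    "TextRank applies PageRank to sentences in a text.",
    "Sentences that share words with many other sentences rank higher.",
    "The weather was pleasant yesterday.",
]
scores = textrank(sentences)
best = max(range(len(sentences)), key=scores.__getitem__)
```

Here the second sentence shares a word with both the first and the third, so it receives the highest score, while the unrelated last sentence keeps only the baseline score `1 - damping`. Because the overlap measure only sees exact word matches, "ranks" and "rank" do not count as related, which is the kind of gap a semantic similarity is meant to close.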

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 70

1 Citation Analysis of New Zealand Court Decisions

Authors: Tobias Milz, L. Macpherson, Varvara Vetrova

Abstract:

The law is a fundamental pillar of human societies as it shapes, controls and governs how humans conduct business, behave and interact with each other. Recent advances in computer-assisted technologies such as NLP, data science and AI are creating opportunities to support the practice, research and study of this pervasive domain. It is therefore not surprising that there has been an increase in investments into supporting technologies for the legal industry (also known as “legal tech” or “law tech”) over the last decade. A sub-discipline of particular appeal is concerned with assisted legal research. Supporting law researchers and practitioners in retrieving information from the vast amount of ever-growing legal documentation is of natural interest to the legal research community. One tool that has been in use for this purpose since the early nineteenth century is legal citation indexing. Among other use cases, it provided an effective means to discover new precedent cases. Nowadays, computer-assisted network analysis tools allow for new and more efficient ways to reveal the “hidden” information that is conveyed through citation behavior. Unfortunately, access to openly available legal data is still lacking in New Zealand, and access to such networks is only commercially available via providers such as LexisNexis. Consequently, there is a need to create, analyze and provide a legal citation network with sufficient data to support legal research tasks. This paper describes the development and analysis of a legal citation network for New Zealand containing over 300,000 decisions from 125 different courts of all areas of law and jurisdiction. Using Python, the authors assembled web crawlers, scrapers and an OCR pipeline to collect and convert court decisions from openly available sources such as NZLII into uniform and machine-readable text. This facilitated the use of regular expressions to identify references to other court decisions from within the decision text.
The data was then imported into a graph-based database (Neo4j) with the courts and their respective cases represented as nodes and the extracted citations as links. Furthermore, additional links between courts of connected cases were added to indicate an indirect citation between the courts. Neo4j, as a graph-based database, allows efficient querying and use of network algorithms such as PageRank to reveal the most influential/most cited courts and court decisions over time. This paper shows that the in-degree distribution of the New Zealand legal citation network resembles a power-law distribution, which indicates a possible scale-free behavior of the network. This is in line with findings of the respective citation networks of the U.S. Supreme Court, Austria and Germany. The authors of this paper provide the database as an openly available data source to support further legal research. The decision texts can be exported from the database to be used for NLP-related legal research, while the network can be used for in-depth analysis. For example, users of the database can specify the network algorithms and metrics to only include specific courts to filter the results to the area of law of interest.
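
The citation-extraction and in-degree steps described in this abstract can be sketched as follows. This is an assumption-laden illustration, not the authors' pipeline: the regular expression targets only neutral citations of the form `[year] NZxx number`, the decision texts are invented, and the Neo4j import is omitted in favor of a plain edge list.

```python
import re
from collections import Counter

# Assumed pattern for neutral citations such as "[2015] NZSC 1".
# The real pipeline would likely need many more citation formats.
CITATION = re.compile(r"\[(\d{4})\]\s+(NZ[A-Z]+)\s+(\d+)")


def extract_citations(text):
    """Return the set of normalized citations found in a decision text."""
    return {f"[{year}] {court} {num}" for year, court, num in CITATION.findall(text)}


# Hypothetical decisions keyed by their own citation.
decisions = {
    "[2020] NZHC 10": "The court follows [2015] NZSC 1 and [2018] NZCA 7.",
    "[2019] NZCA 3": "See [2015] NZSC 1, applied in [2015]  NZSC 1 at para 4.",
    "[2018] NZCA 7": "Distinguishing [2015] NZSC 1.",
}

# Directed edges (citing decision -> cited decision), ignoring self-references.
edges = [
    (src, dst)
    for src, text in decisions.items()
    for dst in sorted(extract_citations(text))
    if dst != src
]

# In-degree: how often each decision is cited.
in_degree = Counter(dst for _, dst in edges)
# Number of decisions per in-degree value, the basis for a degree-distribution plot.
degree_histogram = Counter(in_degree.values())
```

Plotting `degree_histogram` on log-log axes is a common first check for the power-law-like in-degree distribution the abstract reports, although a visual check alone does not establish scale-free behavior.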

Keywords: case citation network, citation analysis, network analysis, Neo4j

Procedia PDF Downloads 107