February 1, 2019
Networks of disparate phenomena-- be it the global ecology, human social institutions, within the human brain, or in micro-scale protein interactions-- exhibit broadly consistent architectural features. To explain this, we propose a new theory where link probability is modelled by a log-normal node fitness (surface) factor and a latent Euclidean space-embedded node similarity (depth) factor. Modelling based on this theory considerably outperforms popular power-law fitness and...
May 18, 2017
Networks are a useful representation for data on connections between units of interests, but the observed connections are often noisy and/or include missing values. One common approach to network analysis is to treat the network as a realization from a random graph model, and estimate the underlying edge probability matrix, which is sometimes referred to as network denoising. Here we propose a generalized linear model with low rank effects to model network edges. This model c...
November 8, 2010
Given a set of alternatives to be ranked, and some pairwise comparison data, ranking is a least squares computation on a graph. The vertices are the alternatives, and the edge values comprise the comparison data. The basic idea is very simple and old: come up with values on vertices such that their differences match the given edge data. Since an exact match will usually be impossible, one settles for matching in a least squares sense. This formulation was first described by L...
June 26, 2009
Graphs and networks provide a canonical representation of relational data, with massive network data sets becoming increasingly prevalent across a variety of scientific fields. Although tools from mathematics and computer science have been eagerly adopted by practitioners in the service of network inference, they do not yet comprise a unified and coherent framework for the statistical analysis of large-scale network data. This paper serves as both an introduction to the topic...
August 18, 2003
Using each node's degree as a proxy for its importance, the topological hierarchy of a complex network is introduced and quantified. We propose a simple dynamical process used to construct networks which are either maximally or minimally hierarchical. Comparison with these extremal cases as well as with random scale-free networks allows us to better understand hierarchical versus modular features in several real-life complex networks. For random scale-free topologies the exte...
September 21, 2012
In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranki...
October 22, 2021
Force-directed layout algorithms are ubiquitously-used tools for network visualisation across a multitude of scientific disciplines. However, they lack theoretical grounding which allows to interpret their outcomes rigorously and can guide the choice of specific algorithms for certain data sets. We propose an approach building on latent space models, which assume that the probability of nodes forming a tie depends on their distance in an unobserved latent space. From such lat...
September 3, 2015
PageRank is arguably the most popular ranking algorithm which is being applied in real systems ranging from information to biological and infrastructure networks. Despite its outstanding popularity and broad use in different areas of science, the relation between the algorithm's efficacy and properties of the network on which it acts has not yet been fully understood. We study here PageRank's performance on a network model supported by real data, and show that realistic tempo...
June 8, 2015
Distances in a network capture relations between nodes and are the basis of centrality, similarity, and influence measures. Often, however, the relevance of a node $u$ to a node $v$ is more precisely measured not by the magnitude of the distance, but by the number of nodes that are closer to $v$ than $u$. That is, by the {\em rank} of $u$ in an ordering of nodes by increasing distance from $v$. We identify and address fundamental challenges in rank-based graph mining. We fi...
January 13, 2024
Graphical models for cardinal paired comparison data with and without covariates are rigorously analyzed. Novel, graph--based, necessary and sufficient conditions which guarantee strong consistency, asymptotic normality and the exponential convergence of the estimated ranks are emphasized. A complete theory for models with covariates is laid out. In particular conditions under which covariates can be safely omitted from the model are provided. The methodology is employed in t...