July 14, 2005
Similar papers 4
July 19, 2012
Many real-world networks are prohibitively large for data retrieval, storage and analysis of all of its nodes and links. Understanding the structure and dynamics of these networks entails creating a smaller representative sample of the full graph while preserving its relevant topological properties. In this report, we show that graph sampling algorithms currently proposed in the literature are not able to preserve network properties even with sample sizes containing as many a...
February 19, 2003
Understanding the subgraph distribution in random networks is important for modelling complex systems. In classic Erdos networks, which exhibit a Poissonian degree distribution, the number of appearances of a subgraph G with n nodes and g edges scales with network size as \mean{G} ~ N^{n-g}. However, many natural networks have a non-Poissonian degree distribution. Here we present approximate equations for the average number of subgraphs in an ensemble of random sparse directe...
July 24, 2007
We introduce and study a class of exchangeable random graph ensembles. They can be used as statistical null models for empirical networks, and as a tool for theoretical investigations. We provide general theorems that carachterize the degree distribution of the ensemble graphs, together with some features that are important for applications, such as subgraph distributions and kernel of the adjacency matrix. These results are used to compare to other models of simple and compl...
September 11, 2024
As networks continue to increase in size, current methods must be capable of handling large numbers of nodes and edges in order to be practically relevant. Instead of working directly with the entire (large) network, analyzing sub-networks has become a popular approach. Due to a network's inherent inter-connectedness, sub-sampling is not a trivial task. While this problem has gained attention in recent years, it has not received sufficient attention from the statistics commun...
February 15, 2010
Uniform sampling from graphical realizations of a given degree sequence is a fundamental component in simulation-based measurements of network observables, with applications ranging from epidemics, through social networks to Internet modeling. Existing graph sampling methods are either link-swap based (Markov-Chain Monte Carlo algorithms) or stub-matching based (the Configuration Model). Both types are ill-controlled, with typically unknown mixing times for link-swap methods ...
December 22, 2009
We investigate exponential families of random graph distributions as a framework for systematic quantification of structure in networks. In this paper we restrict ourselves to undirected unlabeled graphs. For these graphs, the counts of subgraphs with no more than k links are a sufficient statistics for the exponential families of graphs with interactions between at most k links. In this framework we investigate the dependencies between several observables commonly used to qu...
November 14, 2012
Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network samp...
May 27, 2015
Represented as graphs, real networks are intricate combinations of order and disorder. Fixing some of the structural properties of network models to their values observed in real networks, many other properties appear as statistical consequences of these fixed observables, plus randomness in other respects. Here we employ the $dk$-series, a complete set of basic characteristics of the network structure, to study the statistical dependencies between different network propertie...
November 14, 2014
We define and study the statistical models in exponential family form whose sufficient statistics are the degree distributions and the bi-degree distributions of undirected labelled simple graphs. Graphs that are constrained by the joint degree distributions are called $dK$-graphs in the computer science literature and this paper attempts to provide the first statistically grounded analysis of this type of models. In addition to formalizing these models, we provide some preli...
September 15, 2005
We study the statistical properties of large random networks with specified degree distributions. New techniques are presented for analyzing the structure of social networks. Specifically, we address the question of how many nodes exist at a distance from a given node. We also explore the degree distribution of for nodes at some distance from a given node. Implications for network sampling and diffusion on social networks are described.