July 17, 2018
Similar papers 2
April 21, 2020
Inferring topological characteristics of complex networks from observed data is critical to understanding the dynamical behavior of networked systems, ranging from the Internet and the World Wide Web to biological networks and social networks. Prior studies usually focus on structure-based estimation of network sizes, degree distributions, average degrees, and more. Little effort has been made to estimate the specific degree of each vertex from a sampled induced graph, whi...
October 2, 2018
The focus of this work is on estimating the in-degree distribution in directed networks from sampled network nodes or edges. A number of sampling schemes are considered, including random sampling with and without replacement, and several approaches based on random walks with possible jumps. When nodes are sampled, it is assumed that only the out-edges of each sampled node are visible, that is, the in-degree of that node is not observed. The suggested estimation of the in-degree dist...
March 6, 2018
Most empirical studies of networks assume that the network data we are given represent a complete and accurate picture of the nodes and edges in the system of interest, but in real-world situations this is rarely the case. More often the data only specify the network structure imperfectly -- like data in essentially every other area of empirical science, network data are prone to measurement error and noise. At the same time, the data may be richer than simple network measure...
June 1, 2011
We use mathematical methods from the theory of tailored random graphs to study systematically the effects of sampling on topological features of large biological signalling networks. Our aim in doing so is to increase our quantitative understanding of the relation between true biological networks and the imperfect and often biased samples of these networks that are reported in public data repositories and used by biomedical scientists. We derive exact explicit formulae for de...
December 13, 2022
This work introduces a method for fitting to the degree distributions of complex network datasets, such that the most appropriate distribution from a set of candidate distributions is chosen while maximizing the portion of the distribution to which the model is fit. Current methods for fitting to degree distributions in the literature are inconsistent and often assume a priori what distribution the data are drawn from. Much focus is given to fitting to the tail of the distrib...
July 24, 2022
Often, because a network is prohibitively large or because data collection APIs impose limits, it is not possible to work with a complete network dataset and sampling is required. One type of sampling that is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for the recovery of two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the cou...
July 3, 2005
The degree distribution of many biological and technological networks has been described as a power-law distribution. While the degree distribution does not capture all aspects of a network, it has often been suggested that its functional form contains important clues as to underlying evolutionary processes that have shaped the network. Generally, the functional form for the degree distribution has been determined in an ad-hoc fashion, with clear power-law like behaviour ofte...
June 26, 2009
Graphs and networks provide a canonical representation of relational data, with massive network data sets becoming increasingly prevalent across a variety of scientific fields. Although tools from mathematics and computer science have been eagerly adopted by practitioners in the service of network inference, they do not yet comprise a unified and coherent framework for the statistical analysis of large-scale network data. This paper serves as both an introduction to the topic...
January 25, 2017
The need to produce accurate estimates of vertex degree in a large network, based on observation of a subnetwork, arises in a number of practical settings. We study a formalized version of this problem, wherein the goal is, given a randomly sampled subnetwork from a large parent network, to estimate the actual degree of the sampled nodes. Depending on the sampling scheme, trivial method-of-moments estimators (MMEs) can be used. However, the MME is not expected, in general, to...
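As a rough illustration of the kind of trivial method-of-moments estimator mentioned in the abstract above, the sketch below assumes one particular scheme that the abstract does not specify: each node is kept independently with a known probability `p`, and the induced subgraph on the kept nodes is observed. Under that assumption, a sampled node with true degree d has an observed degree distributed as Binomial(d, p), so dividing the observed degree by `p` gives an unbiased estimate. The function names and the adjacency-dictionary representation are hypothetical choices made for this example, not taken from the paper.

```python
import random


def sample_induced_degrees(adj, p, rng):
    """Keep each node independently with probability p (assumed scheme).

    adj maps each node to a list of its neighbours in the full graph.
    Returns the degree of every kept node inside the induced subgraph.
    """
    kept = {v for v in adj if rng.random() < p}
    return {v: sum(1 for u in adj[v] if u in kept) for v in kept}


def mme_degree(observed_degree, p):
    """Method-of-moments estimate: E[observed] = p * d, so d_hat = observed / p."""
    return observed_degree / p


# Hypothetical toy graph: a star with centre 0 and three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
observed = sample_induced_degrees(star, 0.5, random.Random(0))
estimates = {v: mme_degree(k, 0.5) for v, k in observed.items()}
```

The estimate is unbiased under the stated assumption but can be noisy for low-degree nodes and is not constrained to be an integer, which is one reason more refined estimators may be preferable.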
July 31, 2020
When a network is inferred from data, two types of errors can occur: false positive and false negative conclusions about the presence of links. We focus on the influence of local network characteristics on the probability $\alpha$ of type I (false positive) conclusions and on the probability $\beta$ of type II (false negative) conclusions, in the case of networks of coupled oscillators. We demonstrate that false conclusion probabilities are influenced by local connectivity m...