We analyze about two hundred naturally occurring networks with distinct dynamical origins to formally test whether the commonly assumed hypothesis of an underlying scale-free structure is generally viable. This hypothesis has recently been questioned on the basis of statistical tests of whether power-law distributions describe the degrees of real networks. Specifically, we apply finite-size scaling analysis to the datasets of real networks to check whether purported de...
June 18, 2013
Complex networks are now being studied in a wide range of disciplines across science and technology. In this paper we propose a method by which one can probe the properties of experimentally obtained network data. Rather than just measuring properties of a network inferred from data, we ask how typical that network is. What properties of the observed network are typical of all such scale-free networks, and which are peculiar? To do this we propose a series of methods t...
November 5, 2018
We bring rigor to the vibrant activity of detecting power laws in empirical degree distributions in real-world networks. We first provide a rigorous definition of power-law distributions, equivalent to the definition of regularly varying distributions that are widely used in statistics and other fields. This definition allows the distribution to deviate from a pure power law arbitrarily but without affecting the power-law tail exponent. We then identify three estimators of th...
February 7, 2014
The amount of large-scale real data around us is growing very quickly, and so is the need to reduce its size by obtaining a representative sample. Such a sample allows us to use a great variety of analytical methods whose direct application to the original data would be infeasible. There are many sampling methods, used for different purposes and with different results. In this paper we outline a simple and straightforward approach based on analyzing the nearest neighbors (NN) th...
August 1, 2019
We develop a new sampling method to estimate eigenvector centrality on incomplete networks. Our goal is to estimate this global centrality measure with only a limited amount of data at our disposal. This is the case in many real-world scenarios where data collection is expensive, the network is too large for the available storage capacity, or only partial information is available. The sampling algorithm is grounded in results derived from spectral approximation theory. We studied ...
October 2, 2018
The focus of this work is on estimation of the in-degree distribution in directed networks from sampling network nodes or edges. A number of sampling schemes are considered, including random sampling with and without replacement, and several approaches based on random walks with possible jumps. When sampling nodes, it is assumed that only the out-edges of that node are visible, that is, the in-degree of that node is not observed. The suggested estimation of the in-degree dist...
March 10, 2020
We perform an extensive analysis of how sampling impacts the estimate of several relevant network measures. In particular, we focus on how a sampling strategy optimized to recover a particular spectral centrality measure impacts other topological quantities. Our goal is, on the one hand, to extend the analysis of the behavior of TCEC [Ruggeri2019], a theoretically grounded sampling method for eigenvector centrality estimation, and, on the other hand, to demonstrate more broadly how a...
March 1, 2023
We consider the problem of graph generation guided by network statistics, i.e., the generation of graphs which have given values of various numerical measures that characterize networks, such as the clustering coefficient and the number of cycles of given lengths. Algorithms for the generation of synthetic graphs are often based on graph growth models, i.e., rules of adding (and sometimes removing) nodes and edges to a graph that mimic the processes present in real-world netw...
September 20, 2006
Complex networks, modeled as large graphs, have received much attention in recent years. However, data on such networks is only available through intricate measurement procedures. Until recently, most studies assumed that these procedures eventually lead to samples large enough to be representative of the whole, at least concerning some key properties. This has a crucial impact on network modeling and simulation, which rely on these properties. Recent contributions proved ...
January 6, 2012
For many real-world networks only a small "sampled" version of the original network may be investigated; those results are then used to draw conclusions about the actual system. Variants of breadth-first search (BFS) sampling, which are based on epidemic processes, are widely used. Although it is well established that BFS sampling fails, in most cases, to capture the IN-component(s) of directed networks, a description of the effects of BFS sampling on other topological proper...
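As an illustration of the IN-component remark in the last abstract, the following is a minimal sketch of plain BFS sampling on a directed graph. It is not the procedure of any of the listed papers; the function name `bfs_sample`, the adjacency-dictionary representation, and the toy graph are all assumptions made for this example. Because the crawl follows out-edges only, a node that points into the seed's neighborhood but cannot be reached from it never enters the sample.

```python
from collections import deque


def bfs_sample(out_edges, seed, max_nodes):
    """Return the set of nodes reached by BFS from `seed`.

    `out_edges` maps each node to the list of nodes it points to
    (hypothetical representation chosen for this sketch). Only
    out-edges are followed, and the crawl stops once `max_nodes`
    nodes have been collected.
    """
    visited = {seed}
    queue = deque([seed])
    while queue and len(visited) < max_nodes:
        node = queue.popleft()
        for nbr in out_edges.get(node, []):
            if nbr not in visited and len(visited) < max_nodes:
                visited.add(nbr)
                queue.append(nbr)
    return visited


# Toy directed graph (illustrative only): "a" points into the cycle
# b -> c -> d -> b, so "a" belongs to the IN-component of that cycle.
toy_graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["b"]}

# Seeding inside the cycle never discovers "a".
sample = bfs_sample(toy_graph, seed="b", max_nodes=10)
```

In this toy case the sample seeded at "b" contains only the cycle nodes, while seeding at "a" would reach everything. This is one simple way to see why the choice of seed matters for directed BFS crawls.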