September 18, 2011
Similar papers
May 21, 2023
Analysis of social networks with limited data access is challenging for third parties. To address this challenge, a number of studies have developed algorithms that estimate properties of social networks via a simple random walk. However, most existing algorithms do not account for private nodes, which in empirical social networks do not publish their neighbors' data when queried. Here we propose a practical framework for estimating properties via random walk-based sampling...
October 22, 2020
Network analysis provides powerful tools to learn about a variety of social systems. However, most analyses implicitly assume that the considered relational data is error-free, reliable and accurately reflects the system to be analysed. Especially if the network consists of multiple groups, this assumption conflicts with a range of systematic biases, measurement errors and other inaccuracies that are well documented in the literature. To investigate the effects of such errors...
June 19, 2023
Social networks have been widely studied over the last century from multiple disciplines to understand societal issues such as inequality in employment rates, managerial performance, and epidemic spread. Today, these and many more issues can be studied at global scale thanks to the digital footprints that we generate when browsing the Web or using social media platforms. Unfortunately, scientists often struggle to access such data primarily because it is proprietary, and e...
October 22, 2020
Studying real-world networks such as social networks or web networks is a challenge. These networks often combine a complex, highly connected structure with a large size. We propose a new approach for large-scale networks that is able to automatically sample user-defined relevant parts of a network. Starting from a few selected places in the network and a reduced set of expansion rules, the method adopts a filtered breadth-first search approach that expands through ...
August 30, 2017
Random walk-based sampling methods are gaining popularity and importance in characterizing large networks. While powerful, they suffer from the slow mixing problem when the graph is loosely connected, which results in poor estimation accuracy. Random walk with jumps (RWwJ) can address the slow mixing problem but it is inapplicable if the graph does not support uniform vertex sampling (UNI). In this work, we develop methods that can efficiently sample a graph without the neces...
June 22, 2013
Exploring statistics of locally connected subgraph patterns (also known as network motifs) has helped researchers better understand the structure and function of biological and online social networks (OSNs). Nowadays the massive size of some critical networks -- often stored in already overloaded relational databases -- effectively limits the rate at which nodes and edges can be explored, making it a challenge to accurately discover subgraph statistics. In this work, we propo...
November 30, 2015
In a study related to this one, I set up a temporal network simulation environment for evaluating network intervention strategies. A network intervention strategy consists of a sampling design to select nodes in the network. An intervention is applied to nodes in the sample for the purpose of changing the wider network in some desired way. The network intervention strategies can represent natural agents such as viruses that spread in the network, programs to prevent or reduce ...
May 27, 2022
In general, to draw robust conclusions from a dataset, the entire analyzed population must be represented in said dataset. Having a dataset that does not fulfill this condition normally leads to selection bias. Additionally, graphs have been used to model a wide variety of problems. Although synthetic graphs can be used to augment available real graph datasets to overcome selection bias, the generation of unbiased synthetic datasets is complex with current tools. In this work, w...
April 21, 2009
Complex networks are at the core of an intense research activity. However, in most cases, intricate and costly measurement procedures are needed to explore their structure. In some cases, these measurements rely on link queries: given two nodes, it is possible to test the existence of a link between them. These tests may be costly, and thus minimizing their number while maximizing the number of discovered links is a key issue. This paper studies this problem: we observe that ...
November 13, 2013
Characterizing large online social networks (OSNs) through node querying is a challenging task. OSNs often impose severe constraints on the query rate, hence limiting the sample size to a small fraction of the total network. Various ad-hoc subgraph sampling methods have been proposed, but many of them give biased estimates and offer no theoretical basis for their accuracy. In this work, we focus on developing sampling methods for OSNs where querying a node also reveals partial structu...