ID: 1109.3911

Benefits of Bias: Towards Better Characterization of Network Sampling

September 18, 2011

Similar papers 5

Estimating the Size of a Large Network and its Communities from a Random Sample

October 26, 2016

87% Match
Lin Chen, Amin Karbasi, Forrest W. Crawford
Machine Learning
Social and Information Netwo...
Physics and Society

Most real-world networks are too large to be measured or studied directly, and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from...

Sampling on networks: estimating eigenvector centrality on incomplete graphs

August 1, 2019

87% Match
Nicolò Ruggeri, Caterina De Bacco
Social and Information Netwo...
Data Analysis, Statistics an...
Physics and Society

We develop a new sampling method to estimate eigenvector centrality on incomplete networks. Our goal is to estimate this global centrality measure with only a limited amount of data at our disposal. This is the case in many real-world scenarios where data collection is expensive, the network is too big for the available data storage capacity, or only partial information is available. The sampling algorithm is theoretically grounded by results derived from spectral approximation theory. We studied ...

Graph Sample and Hold: A Framework for Big-Graph Analytics

March 16, 2014

87% Match
Nesreen K. Ahmed, Nick Duffield, ... , Ramana Kompella
Social and Information Netwo...
Databases
Physics and Society
Applications

Sampling is a standard approach in big-graph analytics; the goal is to efficiently estimate the graph properties by consulting a sample of the whole population. A perfect sample is assumed to mirror every property of the whole population. Unfortunately, such a perfect sample is hard to collect in complex populations such as graphs (e.g., web graphs, social networks, etc.), where an underlying network connects the units of the population. Therefore, a good sample will be represen...

A Community-Based Sampling Method Using DPL for Online Social Network

September 6, 2011

87% Match
Seok-Ho Yoon, Ki-Nam Kim, ... , Sunju Park
Social and Information Netwo...
Physics and Society

In this paper, we propose a new graph sampling method for online social networks that achieves the following. First, a sample graph should reflect the ratio between the number of nodes and the number of edges of the original graph. Second, a sample graph should reflect the topology of the original graph. Third, sample graphs should be consistent with each other when they are sampled from the same original graph. The proposed method employs two techniques: hierarchical communi...

ComPAS: Community Preserving Sampling for Streaming Graphs

February 5, 2018

87% Match
Sandipan Sikdar, Tanmoy Chakraborty, Soumya Sarkar, ... , Animesh Mukherjee
Social and Information Netwo...
Physics and Society

In the era of big data, graph sampling is indispensable in many settings. Existing sampling methods are mostly designed for static graphs, and aim to preserve basic structural properties of the original graph (such as degree distribution, clustering coefficient, etc.) in the sample. We argue that it is impossible for any sampling method to produce a universal representative sample that preserves all the properties of the original graph; rather sampling should be applicati...

Reliability of rank order in sampled networks

February 17, 2007

87% Match
Pan-Jun Kim, Hawoong Jeong
Physics and Society
Statistical Mechanics
Data Analysis, Statistics an...
Applications

In complex scale-free networks, ranking the individual nodes based upon their importance has useful applications, such as the identification of hubs for epidemic control, or bottlenecks for controlling traffic congestion. However, in most real situations, only limited sub-structures of entire networks are available, and therefore the reliability of the order relationships in sampled networks requires investigation. With a set of randomly sampled nodes from the underlying orig...

Efficient Sampling for Better OSN Data Provisioning

December 14, 2016

87% Match
Nick Duffield, Balachander Krishnamurthy
Data Structures and Algorith...
Social and Information Netwo...

Data concerning the users and usage of Online Social Networks (OSNs) has become available externally, from public resources (e.g., user profiles), participation in OSNs (e.g., establishing relationships and recording transactions such as user updates) and APIs of the OSN provider (such as the Twitter API). APIs let OSN providers monetize the release of data while helping control measurement load, e.g. by providing samples with different cost-granularity tradeoffs. To date, th...

A Review: Random Walk in Graph Sampling

September 27, 2022

87% Match
Xiao Qi
Social and Information Netwo...
Methodology

Graph sampling is a technique to pick a subset of vertices and/or edges from the original graph. Among various graph sampling approaches, Traversal Based Sampling (TBS) is widely used due to its low cost and feasibility in many cases, and Simple Random Walk (SRW) and its variants account for a large proportion of TBS. We illustrate the foundations of SRW and present its problems. Based on the problems, we provide a taxonomy of different Random Walk (RW) based graph sampling met...

Little Ball of Fur: A Python Library for Graph Sampling

June 8, 2020

87% Match
Benedek Rozemberczki, Oliver Kiss, Rik Sarkar
Social and Information Netwo...
Machine Learning

Sampling graphs is an important task in data mining. In this paper, we describe Little Ball of Fur, a Python library that includes more than twenty graph sampling algorithms. Our goal is to make node, edge, and exploration-based network sampling techniques accessible to a large number of professionals, researchers, and students in a single streamlined framework. We created this framework with a focus on a coherent application public interface which has a convenient design, gen...

Real Time Enhanced Random Sampling of Online Social Networks

November 30, 2012

87% Match
Giannis Haralabopoulos, Ioannis Anagnostopoulos
Social and Information Netwo...
Information Retrieval
Physics and Society

Social graphs can be easily extracted from Online Social Networks. However, these networks are getting larger from day to day. Sampling methods used to evaluate graph information cannot accurately extract graph properties. Furthermore, Social Networks are limiting access to their data, making the crawling process even harder. A novel approach to Random Sampling is proposed, considering both limitation and resources. We evaluate this proposal with 4 different settings on 5 d...
