The Graphics Card as a Streaming Computer

Exploring the Limits of GPUs With Parallel Graph Algorithms

February 24, 2010

Frank Dehne, Kumanan Yogaratnam

Distributed, Parallel, and C...

Computational Complexity

Data Structures and Algorith...

In this paper, we explore the limits of graphics processors (GPUs) for general-purpose parallel computing by studying problems that require highly irregular data access patterns: parallel graph algorithms for list ranking and connected components. Such graph problems represent a worst-case scenario for coalescing parallel memory accesses on GPUs, which is critical for good GPU performance. Our experimental study indicates that PRAM algorithms are a good starting point for deve...

A Short Note on Gaussian Process Modeling for Large Datasets using Graphics Processing Units

March 6, 2012

Mark Franey, Pritam Ranjan, Hugh Chipman

Computation

Machine Learning

The graphics processing unit (GPU) has emerged as a powerful and cost-effective processor for general performance computing. GPUs are capable of an order of magnitude more floating-point operations per second than modern central processing units (CPUs), and thus provide a great deal of promise for computationally intensive statistical applications. Fitting complex statistical models with a large number of parameters and/or for large datasets is often very computatio...

MPI Streams for HPC Applications

August 3, 2017

Ivy Bo Peng, Stefano Markidis, Roberto Gioiosa, ... , Erwin Laure

Distributed, Parallel, and C...

Data streams are a sequence of data flowing between source and destination processes. Streaming is widely used for signal, image and video processing for its efficiency in pipelining and effectiveness in reducing demand for memory. The goal of this work is to extend the use of data streams to support both conventional scientific applications and emerging data analytic applications running on HPC platforms. We introduce an extension called MPIStream to the de-facto programming...

Data-Parallel Hashing Techniques for GPU Architectures

July 11, 2018

Brenton Lessley

Distributed, Parallel, and C...

Data Structures and Algorith...

Hash tables are one of the most fundamental data structures for effectively storing and accessing sparse data, with widespread usage in domains ranging from computer graphics to machine learning. This study surveys the state-of-the-art research on data-parallel hashing techniques for emerging massively-parallel, many-core GPU architectures. Key factors affecting the performance of different hashing schemes are discovered and used to suggest best practices and pinpoint areas f...

A Study of Performance Programming of CPU, GPU accelerated Computers and SIMD Architecture

September 16, 2024

Xinyao Yi

Distributed, Parallel, and C...

Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2) incorporating powerful parallel computing devices such as GPUs, FPGAs, and other accelerators; and 3) utilizing special parallel architectures like Single Instruction/Multiple Data (SIMD). Many researchers have made efforts using different para...

Using graphics processing units to generate random numbers

January 10, 2011

S. Hissoiny, P. Després, B. Ozell

Distributed, Parallel, and C...

The future of high-performance computing is aligning itself towards the efficient use of highly parallel computing environments. One application where the use of massive parallelism comes instinctively is Monte Carlo simulations, where a large number of independent events have to be simulated. At the core of the Monte Carlo simulation lies the Random Number Generator (RNG). In this paper, the massively parallel implementation of a collection of pseudo-random number generators...

Augmenting Operating Systems With the GPU

May 15, 2013

Weibin Sun, Robert Ricci

Operating Systems

The most popular heterogeneous many-core platform, the CPU+GPU combination, has received relatively little attention in operating systems research. This platform is already widely deployed: GPUs can be found, in some form, in most desktop and laptop PCs. Used for more than just graphics processing, modern GPUs have proved themselves versatile enough to be adapted to other applications as well. Though GPUs have strengths that can be exploited in systems software, this remains ...

A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps

February 3, 2016

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Saugata Ghose, Abhishek Bhowmick, Rachata Ausavarangnirun, Chita Das, Mahmut Kandemir, ... , Onur Mutlu

Hardware Architecture

Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks during execution and heterogeneous application requirements create imbalances in utilization of resources in the cores. For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive. This work descri...

BigGraphVis: Leveraging Streaming Algorithms and GPU Acceleration for Visualizing Big Graphs

August 1, 2021

Ehsan Moradi, Debajyoti Mondal

Distributed, Parallel, and C...

Computational Geometry

Graphics

Information Retrieval

Graph layouts are key to exploring massive graphs. An enormous number of nodes and edges prevents network analysis software from producing meaningful visualizations of pervasive networks. Long computation times and memory and display limitations restrict the software's ability to explore massive graphs. This paper introduces BigGraphVis, a new parallel graph visualization method that uses GPU parallel processing and a community detection algorithm to visualize graph communities....

N-Body Simulations on GPUs

June 20, 2007

Erich Elsen, V. Vishal, Mike Houston, Vijay Pande, ... , Eric Darve

Computational Engineering, F...

Distributed, Parallel, and C...

Commercial graphics processors (GPUs) have high compute capacity at very low cost, which makes them attractive for general purpose scientific computing. In this paper we show how graphics processors can be used for N-body simulations to obtain improvements in performance over current generation CPUs. We have developed a highly optimized algorithm for performing the O(N^2) force calculations that constitute the major part of stellar and molecular dynamics simulations. In some ...
