Fusing Convolution Kernels through Tiling (ARRAY 2015 - ARRAY Research Papers)

Sat 13 - Wed 17 June 2015 Portland, Oregon, United States

Who

Mahesh Ravishankar, Paulius Micikevicius, Vinod Grover

Track

ARRAY 2015

Time Zone

The program is currently displayed in (GMT-07:00) Tijuana, Baja California.

Use conference time zone: (GMT-07:00) Tijuana, Baja CaliforniaSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Sat 13 Jun 2015 16:00 - 16:30 at C122 - Paper Session 4 Chair(s): David Grove

Abstract

Image processing pipelines are continuously being developed to deduce more information about objects captured in images. To facilitate the development of such pipelines several Domain Specific Languages (DSLs) have been developed that provide constructs for easy specification of such computations. It is then upto the DSL compiler to generate code to efficiently execute the pipeline on multiple hardware architectures. While such compilers are getting ever more sophisticated, to achieve large scale adoption these DSLs have to beat or at least match the performance that can be achieved by a skilled programmer.

Many of these pipelines use a sequence of convolution kernels that are memory bandwidth bound. One way to address this bottleneck is through use of tiling. In this paper we describe an approach to tiling within the context of a DSL called Forma. Using the high-level specification of the pipeline in this DSL, we describe a code generation algorithm that fuses multiple stages of the pipeline through the use of tiling to reduce the memory bandwidth requirements on both GPU and CPU. Using this technique improves the performance of pipelines like Canny Edge Detection by 58% on NVIDIA GPUs, and of the Harris Corner Detection pipeline by 71%.

Mahesh Ravishankar

Paulius Micikevicius

Vinod Grover