BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20111114T163000Z
DTEND:20111115T010000Z
LOCATION:
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: This tutorial, suitable for attendees with intermediate-level experience in parallel programming with MPI and GPU programming in CUDA or OpenCL, will provide a comprehensive overview of the optimization techniques used to port, analyze, and accelerate applications on scalable heterogeneous computing systems. We will focus on methods, tools, and techniques to migrate existing applications to large-scale GPU clusters using MPI and OpenCL/CUDA. First, we will review our methodology for identifying and selecting portions of applications to accelerate with a GPU, motivated by several application case studies. Second, we will present an overview of several performance and correctness tools, which provide performance measurement, profiling, and tracing information about applications running on these systems. Third, we will present a set of best practices for optimizing these applications: GPU and NUMA optimization techniques, and optimizing interactions between the MPI and GPU programming models. A hands-on session will be conducted on the NSF Keeneland Initial Delivery System after each part to give participants the opportunity to investigate techniques and performance optimizations on such a system. Existing tutorial codes and benchmark suites will be provided to facilitate individual discovery. Additionally, participants may bring and work on their own applications.
SUMMARY:M12: Scalable Heterogeneous Computing on GPU Clusters
PRIORITY:3
END:VEVENT
END:VCALENDAR