When you view the Technical Program schedule, the far right-hand side
shows a column labeled "PLANNER." Use this planner to build your own
schedule. To add an event to your personal schedule, select it and
click the calendar icon of your choice (Outlook, iCal, or Google
Calendar); the event will be stored there. As you add events in this
manner, you build your own schedule to guide you through the week.
You can also create your personal schedule on the SC11 app (Boopsie) on your smartphone. Simply select a session you want to attend and "add" it to your plan. Continue in this manner until you have created your own personal schedule. All your events will appear under "My Event Planner" on your smartphone.
AUTHOR(S): Guangming Tan, Linchuan Li, Sean Triechler, Everett Phillips, Yungang Bao, Ninghui Sun
ROOM: TCC 303
ABSTRACT: The GPU offers more than an order of magnitude higher peak floating-point performance than conventional processors. In this paper we present our experience tuning double-precision matrix-matrix multiplication (DGEMM) on the Fermi GPU architecture. We choose an optimal algorithm with blocking in both shared memory and registers to satisfy the constraints of the Fermi memory hierarchy. Our optimization strategy is further guided by a performance model based on micro-architecture benchmarks. Our optimizations include software pipelining, use of vector memory operations, and instruction scheduling. Our best CUDA algorithm achieves performance comparable to the latest vendor-supplied library, CUBLAS 3.2. We further improve upon this with an implementation in the native machine language, leading to a 20% increase in performance over CUBLAS. That is, the peak performance achieved (efficiency) improves from 302 Gflop/s (58%) to 362 Gflop/s (70%).
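The two-level blocking mentioned in the abstract can be illustrated with a minimal CPU-side sketch in Python. This is not the authors' CUDA or machine-language kernel. The function names, the `tile` and `reg` parameters, and their values are all hypothetical, chosen only to show the loop structure: an outer tile stands in for a block staged in shared memory, and an inner accumulator stands in for a block held in registers.

```python
def blocked_dgemm(A, B, n, tile=4, reg=2):
    """Return C = A * B for n x n matrices given as lists of rows.

    tile: outer block edge (stands in for a shared-memory tile).
    reg:  inner block edge (stands in for a per-thread register block).
    Both sizes are illustrative assumptions, not values from the paper.
    """
    assert n % tile == 0 and tile % reg == 0
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # Stage one tile of A and one tile of B (the "shared memory" copy).
                a_tile = [row[k0:k0 + tile] for row in A[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in B[k0:k0 + tile]]
                for i1 in range(0, tile, reg):
                    for j1 in range(0, tile, reg):
                        # reg x reg accumulator (the "register" block).
                        acc = [[0.0] * reg for _ in range(reg)]
                        for k in range(tile):
                            # Load one column fragment of A and one row fragment of B,
                            # then apply a rank-1 update to the accumulator.
                            a_frag = [a_tile[i1 + r][k] for r in range(reg)]
                            b_frag = b_tile[k][j1:j1 + reg]
                            for r in range(reg):
                                for c in range(reg):
                                    acc[r][c] += a_frag[r] * b_frag[c]
                        # Write the accumulated block back into C.
                        for r in range(reg):
                            for c in range(reg):
                                C[i0 + i1 + r][j0 + j1 + c] += acc[r][c]
    return C


def naive_dgemm(A, B, n):
    """Reference triple-loop product used to check the blocked version."""
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

The point of the structure is data reuse: each staged tile element is read `tile` times from the fast copy, and each loaded fragment element feeds `reg` multiply-adds, which is the general motivation for blocking at both levels of a memory hierarchy.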