BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20111118T003000Z
DTEND:20111118T010000Z
LOCATION:TCC 305
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: Recent results have shown that topology aware mapping reduces network contention in communication-intensive kernels on massively parallel machines. We demonstrate that on mesh interconnects, topology aware mapping allows for utilization of highly efficient topology aware collectives. We map novel 2.5D dense linear algebra algorithms to cuboid partitions allocated by a Blue Gene/P supercomputer. Our mappings allow the algorithms to exploit optimized line multicasts and reductions. Commonly used 2D algorithms cannot be mapped in this fashion. On 65,536 cores of Blue Gene/P, 2.5D algorithms with rectangular collectives are 2.6x and 2.7x faster for matrix multiply and LU factorization, respectively. For LU, communication time drops by up to 92%. We derive a novel performance model based on the LogP model for rectangular broadcasts and reductions. We model performance on a hypothetical exascale architecture. Our study evaluates the benefits of topology aware collectives for high performance algorithms.
SUMMARY:Improving Communication Performance in Dense Linear Algebra via Topology Aware Collectives
PRIORITY:3
END:VEVENT
END:VCALENDAR