SC is the International Conference for High Performance Computing, Networking, Storage and Analysis



SCHEDULE: NOV 12-18, 2011


A Long-Distance InfiniBand Interconnection Between Two Clusters in Production Use

SESSION: State of the Practice - AICS/Security/Net

EVENT TYPE: State of the Practice

TIME: 2:30PM - 3:00PM

SESSION CHAIR: David Martin

AUTHOR(S): Sabine Richling, Steffen Hau, Heinz Kredel, Hans-Günther Kruse

ROOM: TCC 202

ABSTRACT:
We discuss operational and organizational issues of an InfiniBand interconnection between two clusters over a distance of 28 km in day-to-day production use. We describe the setup of hardware and networking components, and the solution of technical integration problems. Then we present solutions for a federated authorization system for the cluster within our two participating universities and other organizational integration problems. Performance measurements for MPI communication and file access to Lustre storage systems are presented. The results and a simple performance model show that MPI performance is intrinsically poor across the long-distance interconnection with limited bandwidth. However, file access and MPI communication among nodes on each side are barely affected by the limitations of the interconnection even at high load. Our organizational and technical setup allows the operation of the two clusters as a single system with lower administration costs and a better load balance than in a disconnected setup.
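
The abstract refers to a simple performance model without spelling it out. As a rough illustration only, and not the authors' model, the sketch below uses a generic latency-plus-bandwidth estimate to show why a 28 km link penalizes MPI messages. The signal speed in fiber (about two-thirds of the speed of light), the local latency, and the link bandwidth are all assumed values chosen for illustration; only the 28 km distance comes from the abstract. Under the assumed fiber speed, the one-way propagation delay alone is about 140 microseconds.

```python
# Illustrative latency-plus-bandwidth model. This is NOT the model from the
# paper; every constant except the 28 km distance is an assumption.

SPEED_IN_FIBER_M_PER_S = 2.0e8  # assumption: roughly two-thirds of c in glass fiber


def propagation_delay_s(distance_m):
    """One-way signal propagation time over a fiber of the given length."""
    return distance_m / SPEED_IN_FIBER_M_PER_S


def transfer_time_s(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed latency plus serialization time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def effective_bandwidth_bytes_per_s(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Achieved throughput for a single message of the given size."""
    return message_bytes / transfer_time_s(message_bytes, latency_s, bandwidth_bytes_per_s)


if __name__ == "__main__":
    distance_m = 28_000            # 28 km, from the abstract
    local_latency_s = 2.0e-6       # hypothetical latency inside one cluster
    link_bandwidth = 1.0e9         # hypothetical link bandwidth in bytes/s

    remote_latency_s = local_latency_s + propagation_delay_s(distance_m)

    for size in (1_024, 1_048_576):
        local = effective_bandwidth_bytes_per_s(size, local_latency_s, link_bandwidth)
        remote = effective_bandwidth_bytes_per_s(size, remote_latency_s, link_bandwidth)
        print(f"{size:>9} bytes: local {local:.3e} B/s, remote {remote:.3e} B/s")
```

In this sketch the added propagation delay dominates small messages, so latency-sensitive MPI traffic across the link suffers most, while large transfers are limited mainly by the available bandwidth.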

Chair/Author Details:

David Martin (Chair) - Argonne National Lab

Sabine Richling - University of Heidelberg

Steffen Hau - University of Mannheim

Heinz Kredel - University of Mannheim

Hans-Günther Kruse - University of Mannheim


The full paper can be found in the ACM Digital Library

Sponsors: ACM, IEEE