SC is the International Conference for High Performance Computing, Networking, Storage and Analysis



SCHEDULE: NOV 12-18, 2011

When viewing the Technical Program schedule, look for the column labeled "PLANNER" on the far right-hand side. Use this planner to build your own schedule. To add an event to your personal schedule, click the calendar icon of your choice (Outlook, iCal, or Google Calendar) and the event will be stored in that calendar. As you add events this way, you build a personal schedule to guide you through the week.

You can also create your personal schedule in the SC11 app (Boopsie) on your smartphone. Select a session you want to attend and "add" it to your plan, then repeat for each session you want. All your events will appear under "My Event Planner" on your smartphone.

M01: An Introduction to Data Intensive Computing

SESSION: M01: An Introduction to Data Intensive Computing

EVENT TYPE: Tutorial

TIME: 8:30AM - 12:00PM

Presenter(s): Robert Grossman, Collin Bennett

ROOM:

ABSTRACT:
Datasets grow larger each year. This tutorial introduces some of the tools and techniques that can be used for managing, analyzing, and transporting large datasets. The focus is on utility clouds, such as those provided by Amazon, and data clouds, such as those provided by Hadoop. 1) We will give an introduction to managing scientific datasets using distributed file systems, such as Hadoop, and NoSQL databases, such as HBase. 2) We will give an introduction to parallel programming frameworks, such as MapReduce, Hadoop streams, and related techniques. 3) We will give an introduction to some of the specialized tools used for transporting large datasets, such as GridFTP and UDT. We will illustrate these technologies and techniques using several case studies, including the management and analysis of the large datasets produced by next-generation sequencing devices and the analysis of the high-volume data streams and large datasets that arise with NetFlow data.

Chair/Presenter Details:

Robert Grossman - University of Chicago

Collin Bennett - Open Data Group


Sponsors: ACM, IEEE