When viewing the Technical Program schedule, you will see a column
labeled "PLANNER" on the far right-hand side. Use this planner to build
your own schedule. To add an event to your personal schedule, select it
and click the calendar icon of your choice (Outlook, iCal, or Google
Calendar); the event will be stored there. As you add events in this
manner, you will build your own schedule to guide you through the week.
You can also create your personal schedule on the SC11 app (Boopsie) on your smartphone. Simply select a session you want to attend and "add" it to your plan. Continue in this manner until you have created your own personal schedule. All your events will appear under "My Event Planner" on your smartphone.
SESSION: M01: An Introduction to Data Intensive Computing
EVENT TYPE: Tutorial
TIME: 8:30AM - 12:00PM
Presenter(s): Robert Grossman, Collin Bennett
ABSTRACT: Datasets are growing larger each year. The goal of this tutorial is to introduce some of the tools and techniques that can be used for managing, analyzing, and transporting large datasets. The tutorial focuses on utility clouds, such as those provided by Amazon, and data clouds, such as those provided by Hadoop.
1) We will give an introduction to managing scientific datasets using distributed file systems, such as the Hadoop Distributed File System (HDFS), and NoSQL databases, such as HBase.
2) We will give an introduction to parallel programming frameworks, such as MapReduce, Hadoop Streaming, and related techniques.
3) We will give an introduction to some of the specialized tools used for transporting large datasets, such as GridFTP and UDT.
We will illustrate these technologies and techniques using several case studies, including the management and analysis of the large datasets produced by next-generation sequencing devices and the analysis of the high-volume data streams and large datasets that arise with NetFlow data.