BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20111113T233000Z
DTEND:20111114T010000Z
LOCATION:WSCC 2B
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: Biological data, both naturally derived and synthetically generated, generally suit graph representations well. Among other uses, graph-based representations can reveal networks within data that are tied together by shared characteristics such as homology or function. Consequently, clustering formulations are prevalent in a number of biological applications, including determining protein-protein interactions and discovering protein families from metagenomics data. Performing these operations at a large scale, however, remains technically challenging.=0A=0AIn this session, we will: (i) formulate metagenomics protein family characterization as a graph clustering problem; (ii) describe an efficient graph clustering algorithm called pClust; (iii) conduct hands-on experiments to cluster several real-world data sets and visualize the results. The primary intended outcome is to help undergraduate instructors identify lesson plans suitable for their majors (Computer Science/Mathematics/Biology), and thereby facilitate integration of these cutting-edge research advances into classrooms.=0A=0AAssumed background: (i) high school mathematics; (ii) basic genomics background (Molecular Biology 101).=0A=0ASuggested background: (i) an interest in computational biology and/or combinatorial problem solving; (ii) (optional) introductory knowledge of any programming language and basic Unix/Mac command line usage.
SUMMARY:
PRIORITY:3
END:VEVENT
END:VCALENDAR