ABSTRACT: Current supercomputer systems have architectures with many thousands of cores, and application developers must design for parallel scalability at ever higher levels. Using such platforms effectively requires developers to achieve that scalability through application design and optimization driven by performance analysis. This tutorial provides developers with the background in system features, program design, and performance analysis needed to produce efficient programs on modern supercomputers.
SUMMARY: S10: Scaling to Petascale and Beyond: Performance Analysis and Optimization of Applications