Learn how to write portable parallel programs that run on multicore CPUs and on accelerators such as GPUs. You will apply incremental parallelization strategies with the OpenACC programming model to accelerate an example application that simulates heat distribution across a two-dimensional metal plate, then use what you have learned to accelerate a mini-application.
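To give a flavour of the heat-distribution example, here is a minimal sketch in C, assuming a simple Jacobi relaxation in which each interior cell of the plate becomes the average of its four neighbours. The function name `jacobi_step` and the row-major array layout are illustrative assumptions, not the bootcamp's actual lab code. The same source runs serially when compiled without OpenACC support, because the pragma is then ignored.

```c
#include <assert.h>

/* Hypothetical helper: one Jacobi relaxation step on a rows x cols plate
 * stored row-major. Each interior cell of `out` becomes the average of its
 * four neighbours in `in`. Returns the largest absolute change of any cell,
 * which can serve as a convergence measure. */
double jacobi_step(const double *restrict in, double *restrict out,
                   int rows, int cols)
{
    assert(rows >= 3 && cols >= 3);
    double max_change = 0.0;

    /* An OpenACC compiler may offload this loop nest; other compilers
     * ignore the pragma and run the loops serially. */
    #pragma acc parallel loop collapse(2) reduction(max:max_change) \
        copyin(in[0:rows*cols]) copy(out[0:rows*cols])
    for (int i = 1; i < rows - 1; ++i) {
        for (int j = 1; j < cols - 1; ++j) {
            double v = 0.25 * (in[(i - 1) * cols + j] + in[(i + 1) * cols + j]
                             + in[i * cols + j - 1] + in[i * cols + j + 1]);
            double d = v - in[i * cols + j];
            if (d < 0.0) d = -d;
            max_change = (d > max_change) ? d : max_change;
            out[i * cols + j] = v;
        }
    }
    return max_change;
}
```

Calling `jacobi_step` repeatedly, swapping the two arrays between calls, until the returned change falls below a tolerance gives a basic steady-state solver.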
The GPU Bootcamp will be hosted online in the Central European Summer Time zone. All communication will be done through Zoom, Slack and email.
Prerequisites: basic experience with C/C++ or Fortran. No GPU programming knowledge is required.
This event has limited capacity, so please make sure you meet the prerequisites before applying. Accepted applicants will receive an email with details on how to participate by September 2, 2021.
Attendees will be given access to a GPU cluster for the duration of the GPU Bootcamp.
Day 1: September 9, 2021 (9:00 AM to 5:00 PM CEST)
- Welcome: 9:00 AM
- Connecting to a cluster: 9:00 AM – 9:15 AM
- Introduction to GPU programming: 9:15 AM – 9:30 AM (Lecture)
- What is a GPU and Why Should You Care?
- What is GPU Programming?
- Available Libraries, Programming Models, Platforms.
- Introduction to OpenACC: 9:30 AM – 10:00 AM (Lecture + Lab)
- What is OpenACC and Why Should You Care?
- Profile-driven Development.
- First Steps with OpenACC.
- Lab 1.
- OpenACC Data Management: 10:00 AM – 10:45 AM (Lecture + Lab)
- CPU and GPU Memories.
- CUDA Unified (Managed) Memory.
- OpenACC Data Management.
- Lab 2.
- Break 11:00-11:15
- Gangs, Workers, and Vectors Demystified: 11:30 AM – 12:15 PM (Lecture + Lab)
- GPU Profiling.
- Loop Optimizations.
- Lab 3.
- Mini-application challenge (12:15-12:30)
- Overview of the mini-application
- Review steps to acceleration
- Application challenge (13:00-17:30)
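The Day 1 topics on data management and on gangs and vectors can be pictured with a short sketch in C. It assumes a hypothetical helper, `relax_plate`, that smooths a row-major plate for a fixed number of sweeps; the name, the signature, and the choice of clauses are illustrative assumptions, not the lab material. The `data` region keeps both arrays on the device across all sweeps, and the `gang` and `vector` clauses suggest how the outer and inner loops might be mapped to the hardware.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: average each interior cell with its four neighbours,
 * `sweeps` times, leaving the boundary cells fixed.
 * Returns 0 on success and -1 if the scratch buffer cannot be allocated. */
int relax_plate(double *plate, int rows, int cols, int sweeps)
{
    assert(rows >= 3 && cols >= 3 && sweeps >= 0);
    size_t n = (size_t)rows * (size_t)cols;
    double *next = malloc(n * sizeof *next);
    if (!next) return -1;

    /* plate is copied to the device once and back once; next is scratch
     * space that only needs to exist on the device. */
    #pragma acc data copy(plate[0:n]) create(next[0:n])
    {
        for (int s = 0; s < sweeps; ++s) {
            /* Outer loop spread across gangs, inner loop across vector lanes. */
            #pragma acc parallel loop gang present(plate[0:n], next[0:n])
            for (int i = 1; i < rows - 1; ++i) {
                #pragma acc loop vector
                for (int j = 1; j < cols - 1; ++j) {
                    next[i * cols + j] = 0.25 * (plate[(i - 1) * cols + j]
                                               + plate[(i + 1) * cols + j]
                                               + plate[i * cols + j - 1]
                                               + plate[i * cols + j + 1]);
                }
            }
            /* Copy the new interior values back into plate for the next sweep. */
            #pragma acc parallel loop gang present(plate[0:n], next[0:n])
            for (int i = 1; i < rows - 1; ++i) {
                #pragma acc loop vector
                for (int j = 1; j < cols - 1; ++j) {
                    plate[i * cols + j] = next[i * cols + j];
                }
            }
        }
    }
    free(next);
    return 0;
}
```

Without the enclosing `data` region, a compiler would typically move the arrays between host and device around every kernel, which is the kind of unnecessary traffic the data management session is concerned with.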
Day 2: September 10, 2021 (9:00 AM to 12:30 PM CEST)
Welcome (Moderator): 9:00
Mini-application Solution Walk-through (9:15-9:30)
Introduction to NVIDIA® Nsight™ Tools (9:30-10:00)
- Overview of Nsight Tools
- How to profile a serial application with NVIDIA Tools Extension (NVTX)
- Overview of optimization cycle with Nsight Systems
Profiling mini-application (10:00-12:30)
- Profile a sequential weather modeling application (integrated with NVTX APIs) with NVIDIA Nsight Systems to capture and trace CPU events and time ranges
- Understand how to use the NVIDIA Nsight Systems profiler’s report to detect hotspots, and apply OpenACC compute constructs to parallelise the serial application on the GPU
- Learn how to use Nsight Systems to identify issues such as an underutilized GPU and unnecessary data movement, and apply optimization strategies step by step to expose more parallelism and utilize the computer’s CPU and GPU
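As a rough illustration of what "integrated with NVTX APIs" can look like, the sketch below wraps two phases of a toy program in named NVTX ranges so that they would appear as labelled time spans in an Nsight Systems timeline. The build flag `USE_NVTX`, the macros `RANGE_PUSH` and `RANGE_POP`, and the functions `init_field` and `sum_field` are assumptions made for this example; only `nvtxRangePushA` and `nvtxRangePop` come from NVTX itself. When the flag is not defined, the macros expand to nothing, so the code also builds on machines without the CUDA toolkit.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_NVTX                        /* hypothetical build flag for this sketch */
#include <nvtx3/nvToolsExt.h>
#define RANGE_PUSH(name) nvtxRangePushA(name)   /* open a named range */
#define RANGE_POP()      nvtxRangePop()         /* close the innermost range */
#else
#define RANGE_PUSH(name) ((void)0)
#define RANGE_POP()      ((void)0)
#endif

/* Fill the field with 0, 1, 2, ... inside a range named "init". */
void init_field(double *f, size_t n)
{
    RANGE_PUSH("init");
    for (size_t i = 0; i < n; ++i) {
        f[i] = (double)i;
    }
    RANGE_POP();
}

/* Sum the field inside a range named "sum". */
double sum_field(const double *f, size_t n)
{
    RANGE_PUSH("sum");
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        total += f[i];
    }
    RANGE_POP();
    return total;
}
```

Ranges like these make it easier to see which phase of a serial run dominates before deciding where to apply OpenACC directives.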