3 days of code, coffee, & GPUs
The first DiRAC hackathon began on the 9th of September 2018 at Swansea University and ran for three days. The event gave DiRAC users an opportunity to explore the potential of GPUs to push their science to the next level of parallel computing. Several teams of explorers set off on their GPU adventure, aided by GPU trainer Wayne Gaudin from PGI Compilers and tools (Dev Tech) sponsored by Nvidia, and supported by Dr Ed Bennett from Swansea University and Dr Jeffrey Salmond from Cambridge University.
The event was hosted by Swansea and facilitated by the DiRAC GPU system hosted by Cambridge, which is free to use subject to scientific project approval. The system comprises 11 nodes, each containing 4 NVIDIA Tesla P100 GPUs. The NVIDIA Pascal architecture enables the Tesla P100 to deliver superior performance for HPC and hyperscale workloads. With more than 21 teraFLOPS of 16-bit floating-point (FP16) performance, Pascal is optimized to drive exciting new possibilities in deep learning applications. Pascal also delivers over 5 and 10 teraFLOPS of double- and single-precision performance, respectively, for HPC workloads.
Hardware was not the only thing offered by DiRAC and Swansea. Expertise was also in abundance, in the form of RSEs Dr Mark Dawson, Dr Michele Mesiti, Dr Jarno Rantaharju and Dr Chennakesava from Swansea, and Dr Jeffrey Salmond and Matt Archer from DiRAC Cambridge, along with Matthias Wagner from Nvidia.
Hackathons are a place where ideas mix and in September there was a great mix with the AREPO, CURSE, FARGO, GRID, TROVE and The Mighty Atom codes to work on. The users came together to share ideas, experience, and develop good practice, but most of all to see if GPUs would be a good fit for their research. There were teams from all across the UK, including Swansea University, University of Edinburgh, University of Cambridge and the UK Atomic Energy Authority.
With all these different areas of expertise, the teams came together on a cool morning in September.
How the day went
With all local participants staying on site, the event started bright and early on Sunday the 9th of September. Gathering in the Wallace Building, introductions were made and objectives set for the 3-day GPU Hackathon journey.
Armed with a base knowledge of CUDA and OpenACC from two free Nvidia courses, and with that knowledge reinforced by Wayne, all the teams took their first steps into the world of GPUs.
After a productive day with most groups successfully running their code on the GPU systems, everyone relaxed and discussed the day’s trials and triumphs over a pizza at Brewstone.
Monday was another 9am start, and the GPU experience continued, fuelled by fruit, cakes and coffee.
In the evening the weary travellers relaxed in the award-winning, luxurious boutique-style hotel Morgans, where exotic ideas for GPUs mixed with the spices and aromas of beautifully prepared food in a relaxed and unstuffy environment. With the aid of sweet desserts and a small amount of alcohol, thoughts of what had been achieved and what was still to do dominated the conversations.
On the last day, with heads down and gritted determination, everyone focused on the final push to achieve each team’s goal. This was not the focus of the lone worker, but the focus of a well trained team, working together, looking out for each other and supporting each other. This support was also given between teams and was not just apparent on the last day, but was a continuous theme of the whole 3-day hackathon.
In the afternoon of the last day all the teams gathered to present what they had achieved over the 3 days, highlighting the problems they met, the solutions they found, and how they expect this work to advance their research and prepare them for the machines of the future.
On the last evening there was a celebration of what was achieved: not a big fanfare, but a relaxed, quiet reflection on a job well done. DiRAC’s Director, Dr Mark Wilkinson, was there to welcome the teams, gauge their reaction to this first DiRAC Hackathon, and assess interest in GPUs possibly playing a bigger part in the upcoming DiRAC3 systems.
Neurons and Cores
This hackathon was not just for the ‘knowledgeable’ ones, but also for the ‘I’ve done a bit’ ones, and the ‘would like to know more’ ones. They all came with open minds and a very basic knowledge about GPUs. All participants stated that they would recommend the prerequisite online training provided by Nvidia. Some of their comments:
“The online material was great introduction, that gave us an idea of some of the key issues”
“Very useful, everything well explained & liked interactive aspect”
By the end of the 3 days all the teams had made great strides; the results can be seen on the team pages. All teams agreed that the 3-day GPU event will have a positive impact on their research.
Looking to the Future
The hackathon was a great success, and apparently the teams agreed, with all participants stating that they would expect to use what they learnt in the future, and everyone reporting the hackathon was a good or very good event, with comments like:
“Great experience, learned a lot”, “it was good fun”, and “GREAT FUN! GOOD EXPERIENCE!”
“We attended DiRAC’s ‘Nvidia GPU Hackathon’ with limited knowledge, and almost no practical experience, of accelerating codes using GPUs. The training material provided by Nvidia gave us a broad view of using CUDA and OpenACC to achieve speed-ups using GPU hardware, but we really got to grips with it at the hackathon by getting stuck in and modifying code ourselves. Our goals were to gain some experience, and to try to speed up a ray-tracing module used in the cosmological hydrodynamics code AREPO. Using OpenACC directives and PGI’s compiler, we eventually managed to gain a ~10x acceleration with a single GPU, compared to both single- and multi-CPU-only runs. Alongside this acceleration we also gained an appreciation for the algorithmic regimes where GPUs are useful, and the important technical considerations associated with GPU programming. Since the hackathon we have put our new knowledge to use, modifying other codes to take advantage of GPU acceleration!”
Lewis H. Weinberger
After the success of the GPU hackathon, DiRAC will hold more of these events. DiRAC is here for the advancement of science, and to help you get your research to the next level. In the future there will be hackathons on different topics around the country.
Nvidia Hackathon Testimonial
We are a team of graduate students from the University of Cambridge, working in the numerical simulations group at the Institute of Astronomy. When we learned that there would be a three-day GPU hackathon before DiRAC day, we were all excited to apply.
None of us had any experience of using GPUs, but we had all heard about the significant accelerations they could provide. Alongside this, two of our members had been frustrated with the speed of some code they used to analyse their hydrodynamical simulations. With the dual prospects of learning about GPU programming and accelerating this analysis code, we decided to submit a team application to the hackathon.
Prior to the event we were given some interactive training material to start learning how to use GPUs. The hackathon was sponsored by Nvidia, and so they provided tutorials for learning about CUDA and OpenACC as two possible ways to write GPU-aware code. CUDA is Nvidia’s API for interacting with a GPU; it’s the more low-level approach of the two, in which you control memory allocation, data transfer and kernel definition explicitly. OpenACC is a programming standard, similar to OpenMP, which allows a more high-level approach to GPU programming. Using directives you indicate to the compiler which regions you want to parallelise on the GPU, and then let the compiler optimise the low-level details.
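As a rough illustration of the directive-based approach described above, here is a minimal sketch in C. It is not taken from the training material or from the hackathon code: the function name saxpy and the choice of data clauses are illustrative assumptions. The single directive asks the compiler to offload the loop, and the data clauses say which arrays must be copied to and from the GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative example (not from the hackathon): y[i] = a * x[i] + y[i]. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* Validate arguments on the host before any offload happens. */
    assert(n >= 0);
    assert(x != NULL && y != NULL);

    /* copyin: x is only read on the device, so it is copied in once.
     * copy:   y is read and updated, so it is copied in and back out.
     * A compiler without OpenACC support ignores the pragma and runs
     * the loop serially on the CPU, giving the same result. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Built with an OpenACC-capable compiler (for example PGI's pgcc with the -acc flag), the loop runs on the GPU. The equivalent CUDA version would need explicit device allocation, memory copies and a hand-written kernel, which is the trade-off between the two approaches.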
We arrived on Sunday in Swansea ready to get to work. On the first day we had to decide on our acceleration strategy (CUDA, OpenACC, both?) as well as figuring out how best to modify the existing code. The code we were working on is a module of the simulation code AREPO that creates visualisations. AREPO’s main purpose is to run cosmological hydrodynamical simulations, for example modelling galaxy evolution. In order to visualise the outputs from AREPO, it can perform ray-tracing to create projected views of the simulation volume.
Our task was to transfer the simulation volume onto the GPU, so that it could perform the additive ray-tracing step in a massively parallel fashion. We chose to use the OpenACC approach, and so we spent the rest of that day (and the next) carefully choosing our compiler directives. Our main challenge was to ensure the simulation data was transferred correctly to the GPU. It felt like we were battling the PGI compiler for the whole two days, until finally we had a breakthrough late on Monday. The code finally compiled, and we immediately tried a test run. Compared with a reference time that we calculated for a CPU-only run, the new GPU code ran ~10 times faster!
On the final day we did some further testing and profiling, and confirmed that the ray-tracing step was indeed massively accelerated by running on the GPU. Furthermore, the code now scaled very well with the desired image resolution. This speed-up allows us to now make rapid projections, useful not just for creating images but also movies of the simulation volume. We’d like to thank the organisers at DiRAC, our hosts at Swansea University, and the sponsor Nvidia for running a really fun hackathon!
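To give a flavour of what an additive projection with OpenACC can look like, here is a heavily simplified sketch. It is not the AREPO module: it assumes a regular nx by ny by nz grid of densities with rays running parallel to the z axis, and the names project_z, rho, image and dz are hypothetical. The data clauses show the kind of host-to-device transfer that has to be specified correctly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical additive projection along z on a regular grid:
 * image[j*nx + i] = dz * sum over k of rho[(k*ny + j)*nx + i]. */
void project_z(int nx, int ny, int nz, double dz,
               const double *rho, double *image)
{
    assert(nx > 0 && ny > 0 && nz > 0);
    assert(rho != NULL && image != NULL);

    /* copyin:  the whole volume is sent to the device once.
     * copyout: only the finished 2D image is brought back.
     * collapse(2) treats every (j, i) pixel as one independent ray. */
    #pragma acc parallel loop collapse(2) \
        copyin(rho[0:nx*ny*nz]) copyout(image[0:nx*ny])
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            double sum = 0.0;
            /* Each ray accumulates sequentially along its own line of sight. */
            #pragma acc loop seq
            for (int k = 0; k < nz; ++k) {
                sum += rho[((size_t)k * ny + j) * nx + i];
            }
            image[(size_t)j * nx + i] = sum * dz;
        }
    }
}
```

Because each pixel is an independent ray, the number of parallel work items grows with the image resolution, which is one plausible reason a projection like this maps well onto a GPU.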