CUDA C++ is an extension of the ISO C++ language that lets you use familiar C++ tools to write parallel programs that run on GPUs. However, one essential C++ tool has been missing from device-side CUDA C++: the C++ standard library. But not any longer! Introduced in the CUDA 10.2 toolkit, libcu++ is an opt-in, heterogeneous CUDA C++ standard library; you can get the latest version today on GitHub: https://github.com/NVIDIA/libcudacxx. One of its marquee features is C++ atomics for CUDA, a more correct, efficient, and powerful replacement for the legacy CUDA `atomic*` functions. In this example-oriented talk, we'll explain how and when to start using libcu++, and show how it can be used to build complex concurrent data structures and enable new classes of applications on modern NVIDIA GPUs. We'll also give you a sneak preview of our future roadmap for libcu++.