Managing On-Premises AI Clusters with Base Command Manager

Technical Training Content Developer, NVIDIA
The growth of AI is driving demand for substantial compute infrastructure in data centers to train and deploy models. The right cluster management tools are critical for managing this infrastructure at scale and keeping it well utilized. This lab introduces NVIDIA Base Command Manager software and describes best practices for managing AI infrastructure. You'll gain hands-on experience provisioning cluster nodes, managing software images, creating users and groups, deploying Kubernetes, running a containerized workload, building a custom monitoring script, and configuring nodes with GPUs. The lab covers both Base Command Manager, which is included in the NVIDIA DGX software stack, and Base Command Manager Essentials, which is included in NVIDIA AI Enterprise.
Prerequisite(s):

Basic system administration skills



Event: GTC 24
Date: March 2024
Industry: All Industries
NVIDIA Technology: Base Command, LaunchPad
Level: Beginner Technical
Topic: Data Center / Cloud Infrastructure
Language: English
Location: