Harnessing Grace Hopper's Capabilities to Accelerate Vector Database Search
Principal Developer Technology Engineer, NVIDIA
We'll explore ways to substantially relax the database-size constraint, a well-known limitation of GPU graph-based approximate nearest neighbor search (Graph ANNS) methods such as CAGRA. We do this by exploiting the high data-transfer bandwidth between the CPU and GPU in Grace Hopper.
The conventional remedy for databases that do not fit in GPU memory is quantization or compression. However, Graph ANNS requires a graph index in addition to the database, and that index cannot be shrunk by compression. For oversized databases, the graph index is therefore placed in host memory. This severely degrades performance on x86+H100 systems. In contrast, Grace Hopper sees only a minor decline, so a single GPU can handle a database up to 5-10x larger while maintaining nearly the same performance.
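To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption chosen for illustration rather than a figure from the session: a fixed-degree graph that stores 32-bit neighbor IDs, a hypothetical database of one billion 96-dimensional vectors, a graph degree of 64, and 8-bit quantization as the compression scheme. The function name `footprint_gib` is likewise made up for this example.

```python
def footprint_gib(num_vectors, dim, bytes_per_component,
                  graph_degree, bytes_per_neighbor_id=4):
    """Estimate (dataset_gib, graph_gib) for a fixed-degree graph index.

    Assumes every node stores exactly `graph_degree` neighbor IDs of
    `bytes_per_neighbor_id` bytes each (an illustrative simplification).
    """
    gib = 1024 ** 3
    dataset_bytes = num_vectors * dim * bytes_per_component
    graph_bytes = num_vectors * graph_degree * bytes_per_neighbor_id
    return dataset_bytes / gib, graph_bytes / gib


# Hypothetical workload: 1 billion vectors, 96 dimensions, degree-64 graph.
n, dim, degree = 1_000_000_000, 96, 64

raw_dataset, graph = footprint_gib(n, dim, 4, degree)         # float32 vectors
quantized_dataset, graph_q = footprint_gib(n, dim, 1, degree)  # 8-bit vectors

print(f"float32 dataset: {raw_dataset:7.1f} GiB")
print(f"8-bit dataset:   {quantized_dataset:7.1f} GiB")
print(f"graph index:     {graph:7.1f} GiB (unchanged: {graph == graph_q})")
```

Under these assumptions, quantization cuts the dataset to a quarter of its size while the graph index stays exactly as large as before, which is why the index is the part that ends up in host memory once the database outgrows the GPU.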