Accelerated Intelligent App Development and Deployment with OctoML and Triton
CTO and Co-founder, OctoML
Developers who want to build full-stack intelligent applications powered by AI are creatively constrained by deficiencies in today’s MLOps workflows. The problem is twofold: (1) there is no fast path for integrating the latest deep learning innovations, such as transformer models, into existing application stacks; and (2) deploying on the latest NVIDIA accelerated computing in the cloud imposes an infrastructure burden that creates friction even for experienced operations teams. OctoML provides a platform that automates away this operational complexity so that the latest AI technology can be integrated swiftly into your current DevOps workflows. This session will showcase our new model packaging workflow, which enables seamless integration of models into a full-stack application by leveraging NVIDIA’s Triton Inference Server. Our technology provides a consistent experience across environments, from local development to production on NVIDIA devices in the cloud. We'll include a demo of these new accelerated workflows using OctoML technology and NVIDIA software and hardware.