Posted December 21, 2023

Over the past six months, we've been working with NVIDIA to get the most out of their new TensorRT-LLM library. TensorRT-LLM provides an easy-to-use Python interface that integrates with a web server for fast, efficient LLM inference. In this post, we highlight some key areas where our collaboration with NVIDIA has been particularly important.