
Configure TGI Integration: FEC VM Connectivity to LLM-Hosting API

Text Generation Inference (TGI) [1] is a toolkit for deploying and serving Large Language Models (LLMs) in production environments. With TGI, an LLM can be operated as a service and consumed by many clients concurrently.
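As a minimal sketch of what "operating an LLM as a service" looks like from the client side, the snippet below builds a request for TGI's `/generate` endpoint and sends it over plain HTTP. The endpoint URL is a placeholder assumption; the actual LLM-hosting API address would come from the hosting team.

```python
import json
import urllib.request

# Placeholder endpoint -- replace with the actual LLM-hosting API address.
TGI_URL = "http://localhost:8080/generate"

def build_generate_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON body expected by TGI's /generate endpoint."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def generate(prompt: str) -> str:
    """Send a prompt to a running TGI server and return the generated text."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        TGI_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

Any HTTP client works here; TGI also exposes an OpenAPI schema, so generated clients are an option as well.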

Diagram: llm_hosting.drawio (LLM-hosting architecture)

Tasks:

  • Prepare the selected LLM model(s) in a format compatible with TGI's requirements for deployment.
  • Coordinate with the LLM-hosting team to add the wrapped, TGI-compatible LLM model to their list of available models.
  • Create a simple pipeline to export the model in AI-Builder and test it.
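For the first task, a quick sanity check can verify that a prepared model directory follows the Hugging Face layout that TGI serves (a `config.json` plus weight files, preferably safetensors). This is an illustrative heuristic, not TGI's own validation logic:

```python
from pathlib import Path

def looks_tgi_compatible(model_dir: str) -> bool:
    """Heuristically check that a local model directory is in a
    Hugging Face layout TGI can serve: a config.json plus
    safetensors (or legacy PyTorch .bin) weight files."""
    d = Path(model_dir)
    has_config = (d / "config.json").is_file()
    has_weights = any(d.glob("*.safetensors")) or any(d.glob("*.bin"))
    return has_config and has_weights
```

A check like this could run as the first stage of the export-and-test pipeline, failing fast before the model is handed to the LLM-hosting team.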

References

  1. https://github.com/huggingface/text-generation-inference
Edited by Swetha Lakshmana Murthy