Configure TGI Integration: FEC VM Connectivity to LLM-Hosting API
Text Generation Inference (TGI) [1] is a toolkit for deploying and serving Large Language Models (LLMs) in production environments. With TGI, we can operate an LLM as a service, making it available to many clients.
Tasks:
- Prepare the selected LLM model(s) in a format compatible with TGI's requirements for deployment.
- Coordinate with the LLM-hosting team to add the wrapped LLM model (TGI compatible) to their list of available models.
- Create a simple pipeline to export the model in AI-Builder and test it.
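The testing step above could be smoke-tested with a small client against the deployed endpoint. The sketch below assumes TGI's standard `/generate` route and uses a hypothetical local URL; the real address would come from the LLM-hosting team.

```python
import json
import urllib.request

# Hypothetical endpoint; the actual URL is provided by the LLM-hosting team.
TGI_URL = "http://localhost:8080/generate"

def build_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON body expected by TGI's /generate route."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def query_tgi(prompt: str) -> str:
    """POST a generate request to the TGI server and return the generated text."""
    req = urllib.request.Request(
        TGI_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["generated_text"]

# Offline check of the request shape (no server needed):
payload = build_payload("Hello")
print(sorted(payload))  # -> ['inputs', 'parameters']
```

Calling `query_tgi("Hello")` against a running TGI instance would return the model's completion; the payload check alone verifies the request format without network access.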
References