
Configure TGI Integration: FEC VM Connectivity to LLM-Hosting API

Text Generation Inference (TGI) [1] is a toolkit for deploying and serving Large Language Models (LLMs) in production environments. With TGI, an LLM can be operated as a service and consumed by many clients concurrently.
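As a minimal sketch of what "operating an LLM as a service" looks like from the client side, the snippet below builds a request for TGI's `/generate` endpoint and sends it over plain HTTP. The endpoint URL is a placeholder assumption; the actual LLM-hosting API address would come from the hosting team.

```python
import json
import urllib.request

# Placeholder endpoint -- replace with the actual LLM-hosting API address.
TGI_URL = "http://localhost:8080/generate"

def build_generate_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON body expected by TGI's /generate endpoint."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def generate(prompt: str) -> str:
    """Send a prompt to a running TGI server and return the generated text."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        TGI_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

Any HTTP client works here; TGI also exposes an OpenAPI schema, so generated clients are an option as well.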

Diagram: llm_hosting.drawio (LLM-hosting architecture)

Tasks:

  • Prepare the selected LLM model(s) in a format compatible with TGI's requirements for deployment.
  • Coordinate with the LLM-hosting team to add the wrapped, TGI-compatible LLM model to their list of available models.
  • Create a simple pipeline to export the model in AI-Builder and test it.
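For the first task, a quick sanity check can verify that a prepared model directory follows the Hugging Face layout that TGI serves (a `config.json` plus weight files, preferably safetensors). This is an illustrative heuristic, not TGI's own validation logic:

```python
from pathlib import Path

def looks_tgi_compatible(model_dir: str) -> bool:
    """Heuristically check that a local model directory is in a
    Hugging Face layout TGI can serve: a config.json plus
    safetensors (or legacy PyTorch .bin) weight files."""
    d = Path(model_dir)
    has_config = (d / "config.json").is_file()
    has_weights = any(d.glob("*.safetensors")) or any(d.glob("*.bin"))
    return has_config and has_weights
```

A check like this could run as the first stage of the export-and-test pipeline, failing fast before the model is handed to the LLM-hosting team.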

References

  1. https://github.com/huggingface/text-generation-inference
Edited by Swetha Lakshmana Murthy