Improvements to the Grounding LLM pipeline
This issue looks at ways to improve the Grounding LLM pipeline. It explores current research areas such as prompt engineering, vector embedding + knowledge graph, and graph encoders.
Identified Areas of Improvement
- Scenario 1:
  - question: 'How does Pounits taste?'
  - use case document: 'Pounits taste more savory than sweet'
  - KG created: Pounits -> are -> more savory than sweet
  - Cypher generated: MATCH (p:node {name: 'pounits'})-[:taste]->(t) RETURN t
  - LLM output: Cannot be found
- Scenario 2:
  - question: 'How does Pounits taste?'
  - use case document: 'Pounits are more savory than sweet'
  - KG created: Pounits -> are -> more savory than sweet
  - Cypher generated: MATCH (p:node {name: 'pounits'})-[:taste]->(t) RETURN t
  - LLM output: Cannot be found
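In both scenarios the generated Cypher asks for a `taste` relationship while the KG stores `are`, so the pattern cannot match. The snippet below is a minimal sketch of that mismatch. It uses an in-memory list of triples as a stand-in for the graph; the `triples` list and the `match` helper are illustrative assumptions, not part of the pipeline.

```python
# Toy stand-in for the KG: one (subject, relation, object) triple,
# copied from the "KG created" lines above.
triples = [("Pounits", "are", "more savory than sweet")]


def match(subject, relation):
    """Mimic MATCH (p {name: subject})-[:relation]->(t) RETURN t.

    Matching is exact and case-sensitive, as Cypher string equality
    and relationship types are.
    """
    return [o for s, r, o in triples if s == subject and r == relation]


# What the generated Cypher asks for: relation 'taste', name 'pounits'.
strict = match("pounits", "taste")

# What the KG actually contains: relation 'are', name 'Pounits'.
stored = match("Pounits", "are")

print(strict)
print(stored)
```

If the node name is stored as 'Pounits', the lowercase 'pounits' in the query would also prevent a match, independently of the relation name.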
Tasks
- Prompt engineering for the Parse model (work in progress):
  In each triple (subject, relation, object), the relation must be a verb or follow a fixed set of rules so that the KG created remains consistent. The generated Cypher query can then match the KG more reliably.
  - Tried different GPT models with LangChain and the modified KG
  - Setting up the 2 different prompts
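One way to keep relations consistent is to give the Parse model a closed relation vocabulary and to reject anything outside it. The sketch below assumes a hypothetical vocabulary (`ALLOWED_RELATIONS`), a hypothetical prompt wording, and a `subject | relation | object` output format; none of these come from the actual prompts used in this issue.

```python
# Hypothetical relation vocabulary; the real one would come from the use case.
ALLOWED_RELATIONS = ["TASTES", "IS", "HAS", "CONTAINS"]


def build_parse_prompt(text, relations=ALLOWED_RELATIONS):
    """Build a prompt that restricts the relation to a fixed list."""
    return (
        "Extract (subject, relation, object) triples from the text.\n"
        f"The relation must be one of: {', '.join(relations)}.\n"
        "Return one triple per line as: subject | relation | object\n\n"
        f"Text: {text}"
    )


def parse_triples(response, relations=ALLOWED_RELATIONS):
    """Keep only well-formed triples whose relation is in the vocabulary."""
    triples = []
    for line in response.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # skip lines that are not 'subject | relation | object'
        subject, relation, obj = parts
        if relation.upper() in relations:
            triples.append((subject, relation.upper(), obj))
    return triples


prompt = build_parse_prompt("Pounits taste more savory than sweet")
# A made-up model response, used only to exercise the parser.
sample_response = (
    "Pounits | tastes | more savory than sweet\n"
    "not a triple\n"
    "Pounits | glorps | something"
)
parsed = parse_triples(sample_response)
print(parsed)
```

With a shared vocabulary, the Cypher generation prompt can be told to use the same relation names, which addresses the `are` versus `taste` mismatch in the scenarios above.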
- To do: self-healing mechanism with LangChain (work in progress)
- Python script (eliminating LangChain):
  - Created a Python script.
  - It includes a self-healing mechanism
  - Tried out the Python script with the modified KG
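The issue does not describe how the self-healing mechanism works. A common reading is a retry loop that feeds the failure back to the model, and the following is a minimal sketch under that assumption. `generate_cypher` and `run_query` are hypothetical callables standing in for the LLM call and the graph database call, and the stubs exist only to make the example runnable.

```python
def self_healing_query(question, generate_cypher, run_query, max_attempts=3):
    """Retry Cypher generation, passing the previous failure back as feedback.

    generate_cypher(question, feedback) -> str   (hypothetical LLM wrapper)
    run_query(cypher) -> list                    (hypothetical DB wrapper)
    """
    feedback = None
    for _ in range(max_attempts):
        cypher = generate_cypher(question, feedback)
        try:
            rows = run_query(cypher)
        except Exception as exc:  # e.g. a syntax error reported by the database
            feedback = f"Query {cypher!r} failed with error: {exc}"
            continue
        if rows:
            return cypher, rows
        feedback = (
            f"Query {cypher!r} returned no rows; "
            "check relation names against the schema."
        )
    return None, []


# Stubs for illustration only: the first attempt uses 'taste',
# and any attempt made after feedback uses 'are'.
def fake_generate(question, feedback):
    relation = "taste" if feedback is None else "are"
    return f"MATCH (p:node {{name: 'Pounits'}})-[:{relation}]->(t) RETURN t"


def fake_run(cypher):
    return ["more savory than sweet"] if "[:are]" in cypher else []


cypher, rows = self_healing_query("How does Pounits taste?", fake_generate, fake_run)
print(cypher)
print(rows)
```

Capping the attempts keeps a persistently failing question from looping indefinitely. The empty result on exhaustion lets the caller fall back to a "Cannot be found" answer.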
- Modified graph addressing the 2 scenarios above:
  - Worked on the KG structure through prompt engineering
  - Complete node properties and relationships are included in the schema
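As an illustration of passing the complete node properties and relationships as a schema, the sketch below renders them into a text block that could be placed in the Cypher generation prompt. The `format_schema` helper, its input shapes, and the output layout are assumptions made for this example. The sample values reuse the `node` label, `name` property, and `are` relationship from the scenarios above.

```python
def format_schema(node_props, relationships):
    """Render node properties and relationship patterns as prompt text.

    node_props: dict mapping a node label to a list of property names
    relationships: list of (start_label, relationship_type, end_label)
    """
    lines = ["Node properties:"]
    for label, props in node_props.items():
        lines.append(f"- {label}: {', '.join(props)}")
    lines.append("Relationships:")
    for start, rel, end in relationships:
        lines.append(f"- (:{start})-[:{rel}]->(:{end})")
    return "\n".join(lines)


schema_text = format_schema(
    {"node": ["name"]},
    [("node", "are", "node")],
)
print(schema_text)
```

Listing the relationship types explicitly gives the model the actual names to choose from, so it has less reason to invent one such as `taste`.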
- Vector embedding + Knowledge Graph
  Updates:
  - Only using prompt:
    Unlike with SPARQL, prompt engineering alone doesn't work: the graph schema needs to be passed.
    1. Generic questions: works well
    2. Specific questions: partially works
  - Proposed KG structure through prompt engineering
References
- https://towardsdatascience.com/how-i-won-singapores-gpt-4-prompt-engineering-competition-34c195a93d41
- https://platform.openai.com/docs/guides/prompt-engineering/strategy-test-changes-systematically
- https://neo4j.com/labs/genai-ecosystem/_example_projects
- https://arxiv.org/abs/2310.04560 - Talk like a Graph: Encoding Graphs for Large Language Models
- https://medium.com/neo4j/generating-cypher-queries-with-chatgpt-4-on-any-graph-schema-a57d7082a7e7
- https://neo4j.com/labs/genai-ecosystem/vector-search/
- https://medium.com/@bukowski.daniel/the-practical-benefits-to-grounding-an-llm-in-a-knowledge-graph-919918eb493
Edited by Sangamithra Panneer Selvam