
Improvements on GroundingLLM pipeline

This issue looks at ways to improve the Grounding LLM pipeline. It explores current research areas such as prompt engineering, vector embeddings combined with a knowledge graph (KG), and graph encoders.

Identified Areas of Improvement

  • Scenario 1:
  1. question: 'How does Pounits taste?'
  2. use-case document: 'Pounits taste more savory than sweet'
  3. KG created: Pounits -> are -> more savory than sweet
  4. Cypher generated: MATCH (p:node {name: 'pounits'})-[:taste]->(t) RETURN t
  5. LLM output: Cannot be found
  • Scenario 2:
  1. question: 'How does Pounits taste?'
  2. use-case document: 'Pounits are more savory than sweet'
  3. KG created: Pounits -> are -> more savory than sweet
  4. Cypher generated: MATCH (p:node {name: 'pounits'})-[:taste]->(t) RETURN t
  5. LLM output: Cannot be found
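
Both scenarios appear to fail for the same reason: the generated Cypher asks for a `taste` relationship, while the stored triple uses `are`. The sketch below reproduces that mismatch with an in-memory list of triples instead of Neo4j. The `match` helper is hypothetical and only mimics an exact lookup on node name and relationship type. Note also that the query uses `'pounits'` while the KG entry is written `Pounits`; Cypher string comparison is case-sensitive, so if the name is stored capitalized, that alone would also give an empty result.

```python
# Toy triple store standing in for the KG.
# Assumption: names and relations are stored exactly as written in the scenarios.
triples = [("Pounits", "are", "more savory than sweet")]


def match(name, relation, store):
    """Mimic an exact-match lookup: (n {name})-[:relation]->(t)."""
    return [obj for subj, rel, obj in store if subj == name and rel == relation]


# The generated query: lowercase name, relationship type `taste`.
strict_result = match("pounits", "taste", triples)

# The same lookup with the name and relation that actually exist in the KG.
aligned_result = match("Pounits", "are", triples)

print(strict_result)   # empty list: neither the name nor the relation matches
print(aligned_result)
```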

Tasks

  • Prompt Engineering for Parse model: ( work_in_progress )

In the triples (subject, relation, object), the relation must be a verb or follow a fixed set of rules so that the KG created remains consistent. With a consistent KG, the generated Cypher query is more likely to match the relationships that are actually stored.
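
One possible way to enforce this outside the prompt is to validate the parse model's triples against a fixed relation vocabulary. The following is a minimal sketch under that assumption; the `CANONICAL_RELATIONS` table and the `normalize_triple` helper are hypothetical and not part of the existing pipeline.

```python
# Hypothetical controlled vocabulary: surface verb -> canonical relation type.
CANONICAL_RELATIONS = {
    "is": "IS",
    "are": "IS",
    "taste": "TASTES",
    "tastes": "TASTES",
}


def normalize_triple(subject, relation, obj):
    """Map a raw triple onto the controlled vocabulary; reject unknown relations."""
    key = relation.strip().lower()
    if key not in CANONICAL_RELATIONS:
        raise ValueError(f"relation {relation!r} is not in the allowed vocabulary")
    return (subject.strip(), CANONICAL_RELATIONS[key], obj.strip())


print(normalize_triple("Pounits", "taste", "more savory than sweet"))
```

The same vocabulary could also be listed in the Cypher-generation prompt so that both models draw on one set of relation names.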

  • Tried different GPT models with LangChain and the modified KG
  • Setting up the 2 different prompts
    • To do: self-healing mechanism with LangChain ( work_in_progress )
  • Python script (eliminating LangChain):
    • Created a Python script.
    • Includes a self-healing mechanism
    • Tried out the Python script with the modified KG
  • Modified Graph addressing the 2 scenarios above
    • Worked on the KG structure through prompt engineering
    • The complete node properties and relationships are passed in the schema
  • Vector embedding + Knowledge Graph
    • Neo4j provides the Graph Data Science (GDS) library for graph embeddings.

    • Initial concept:

      image.png
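
The issue does not describe how the self-healing mechanism in the tasks above works. One common reading is a retry loop that feeds errors or empty results back to the model. The sketch below assumes that reading; `generate_cypher` and `run_query` are hypothetical callables standing in for the LLM call and the Neo4j call, and the stub functions exist only to make the example runnable.

```python
def self_healing_query(question, generate_cypher, run_query, max_attempts=3):
    """Generate Cypher, run it, and feed failures back to the generator.

    generate_cypher(question, feedback) -> str   (hypothetical LLM call)
    run_query(cypher) -> list                    (hypothetical database call)
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        cypher = generate_cypher(question, feedback)
        try:
            rows = run_query(cypher)
        except Exception as exc:  # e.g. a syntax error reported by the database
            feedback = f"Query {cypher!r} failed with error: {exc}"
            continue
        if rows:
            return {"cypher": cypher, "rows": rows, "attempts": attempt}
        feedback = f"Query {cypher!r} returned no rows; check the relationship types."
    return {"cypher": None, "rows": [], "attempts": max_attempts}


# Stubs for illustration only: the first guess uses `taste`, the retry uses `are`.
def fake_generate(question, feedback):
    rel = "taste" if feedback is None else "are"
    return f"MATCH (p {{name: 'Pounits'}})-[:{rel}]->(t) RETURN t"


def fake_run(cypher):
    return ["more savory than sweet"] if "[:are]" in cypher else []


result = self_healing_query("How does Pounits taste?", fake_generate, fake_run)
print(result["attempts"], result["rows"])
```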

Updates:

  1. Only using Prompt

Unlike with SPARQL, prompt engineering alone doesn't work: the graph schema needs to be passed to the model as well.

1. Generic questions: work well

image.png

2. Specific questions: partially work

image image
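
As noted under "Only using Prompt", the graph schema has to be passed along with the question. The following is a minimal sketch of how a schema could be rendered into the prompt text; the function names, the prompt wording, and the example schema are assumptions for illustration, not the prompt used in this issue.

```python
def format_schema(node_properties, relationships):
    """Render node properties and relationship patterns as plain text."""
    lines = ["Node properties:"]
    for label, props in node_properties.items():
        lines.append(f"  {label}: {', '.join(props)}")
    lines.append("Relationships:")
    for start, rel_type, end in relationships:
        lines.append(f"  (:{start})-[:{rel_type}]->(:{end})")
    return "\n".join(lines)


def build_cypher_prompt(question, node_properties, relationships):
    """Combine instructions, schema, and question into one prompt string."""
    schema = format_schema(node_properties, relationships)
    return (
        "Generate a Cypher query for the question below.\n"
        "Use only the labels, properties, and relationship types in this schema.\n\n"
        f"{schema}\n\n"
        f"Question: {question}\n"
    )


# Example schema (assumed): a single `node` label with a `name` property.
prompt = build_cypher_prompt(
    "How does Pounits taste?",
    node_properties={"node": ["name"]},
    relationships=[("node", "are", "node")],
)
print(prompt)
```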

  2. Proposed KG structure through prompt engineering

image image

image

  3. Comparative study of LangChain vs. Python script

    image.png

  4. Prompt and testing with different GPT models:

    image.png

References

  1. https://towardsdatascience.com/how-i-won-singapores-gpt-4-prompt-engineering-competition-34c195a93d41
  2. https://platform.openai.com/docs/guides/prompt-engineering/strategy-test-changes-systematically
  3. https://neo4j.com/labs/genai-ecosystem/_example_projects
  4. https://arxiv.org/abs/2310.04560 - Talk like a Graph: Encoding Graphs for Large Language Models
  5. https://medium.com/neo4j/generating-cypher-queries-with-chatgpt-4-on-any-graph-schema-a57d7082a7e7
  6. https://neo4j.com/labs/genai-ecosystem/vector-search/
  7. https://medium.com/@bukowski.daniel/the-practical-benefits-to-grounding-an-llm-in-a-knowledge-graph-919918eb493
Edited by Sangamithra Panneer Selvam