Draft: Add first sketch of experiment for comparing prompts for intent classification
Fixes #12
This PR tracks my work on a structure for running experiments that compare prompts and models for a given task, using the LangFuse client and its support for datasets and experiments.
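
A minimal sketch of the kind of structure meant here, not the code in this PR: it scores every prompt and model combination against a small labelled dataset. Everything in it is a hypothetical placeholder (`Example`, `run_experiments`, `keyword_stub`, the prompt templates, the model name). It deliberately makes no LangFuse calls; in the real version the dataset items and the per-run results would presumably go through the LangFuse client instead of in-memory lists and dicts.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Example:
    """One labelled dataset item (hypothetical shape)."""
    text: str
    expected_intent: str


# Hypothetical model interface: (model_name, rendered_prompt) -> predicted intent.
ModelFn = Callable[[str, str], str]


def run_experiments(
    dataset: List[Example],
    prompts: Dict[str, str],
    models: List[str],
    call_model: ModelFn,
) -> Dict[Tuple[str, str], float]:
    """Return accuracy for every (prompt name, model name) combination."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    results = {}
    for (prompt_name, template), model in product(prompts.items(), models):
        correct = 0
        for example in dataset:
            rendered = template.format(text=example.text)
            prediction = call_model(model, rendered).strip().lower()
            correct += prediction == example.expected_intent
        results[(prompt_name, model)] = correct / len(dataset)
    return results


def keyword_stub(model: str, prompt: str) -> str:
    """Toy classifier so the sketch runs offline; a real run would call an LLM."""
    return "refund" if "refund" in prompt.lower() else "other"


dataset = [
    Example("I want a refund for my order", "refund"),
    Example("What are your opening hours?", "other"),
]
prompts = {
    "terse": "Intent of: {text}",
    "verbose": "Classify the intent of this message as refund or other: {text}",
}
results = run_experiments(dataset, prompts, ["stub-model"], keyword_stub)
for (prompt_name, model), accuracy in sorted(results.items()):
    print(f"{prompt_name:8s} {model:12s} accuracy={accuracy:.2f}")
```

Keeping the model call behind a plain function argument means the same loop can be exercised with a stub in tests and with a real client in an actual run.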