The core building block of your LLM data flywheel is very simple: just pass your data to unify.log, and that’s (kind of) it.

This lets you store any kind of data in your console for later visualization, grouping, sorting, and plotting.
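To make the idea concrete, here is a minimal sketch of what "log arbitrary data, then group and sort it" looks like. It deliberately does not call the Unify client: `log`, `LOGS`, and `mean_score_by` are hypothetical in-memory stand-ins written for this illustration, and the field names (`question`, `response`, `score`, `model`) are assumptions too.

```python
from collections import defaultdict

# In-memory stand-in for a logging backend. This is NOT the Unify client,
# just a hypothetical helper that mimics "pass your data to a log call".
LOGS = []


def log(**fields):
    """Store one log entry made of arbitrary key-value pairs."""
    entry = dict(fields)
    LOGS.append(entry)
    return entry


# Log whatever is useful for later analysis (field names are illustrative).
log(question="What is 2 + 2?", response="4", score=1.0, model="model-a")
log(question="Capital of France?", response="Lyon", score=0.0, model="model-a")
log(question="Capital of France?", response="Paris", score=1.0, model="model-b")


def mean_score_by(key):
    """Group the logged entries by `key` and average their scores."""
    groups = defaultdict(list)
    for entry in LOGS:
        groups[entry[key]].append(entry["score"])
    return {name: sum(scores) / len(scores) for name, scores in groups.items()}


# Grouping: average score per model.
scores_by_model = mean_score_by("model")

# Sorting: lowest-scoring entries first, so failures surface at the top.
worst_first = sorted(LOGS, key=lambda entry: entry["score"])
```

Keeping each entry as a flat set of key-value pairs is what makes the later grouping and sorting cheap: any logged field can serve as a grouping key.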

Rather than covering each concept in the logging and visualization APIs one by one, we’ll walk through several examples that show how to use these simple primitives to get your data flywheel spinning for any LLM-powered application 🎡

In general, the process for building high-quality LLM apps is simple:

  • Add unit tests (evals) ❓
  • See if they pass ✅
  • Iterate on the system message, in-context examples, available tools, etc. until they do 🔁
  • Beta test with users and find failure modes from production traffic 🚦
  • Convert these into unit tests 🗂️
  • Repeat! 🔝
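The loop above can be sketched in a few lines of plain Python. Everything here is a toy written for illustration: `run_app` is a hypothetical stand-in for a real LLM call (a lookup table whose coverage depends on the system message), and the evals, candidate prompts, and "production failure" are made up.

```python
def run_app(system_message, question):
    """Hypothetical stand-in for an LLM app: a lookup table, not a real model."""
    knowledge = {"2 + 2": "4"}
    # Pretend that mentioning geography in the prompt unlocks geography answers.
    if "geography" in system_message:
        knowledge["capital of France"] = "Paris"
    return knowledge.get(question, "I don't know")


def run_evals(system_message, evals):
    """Return the (question, expected) pairs that the app gets wrong."""
    return [
        (question, expected)
        for question, expected in evals
        if run_app(system_message, question) != expected
    ]


# 1. Add unit tests (evals).
evals = [("2 + 2", "4")]
candidates = [
    "You are a maths tutor.",
    "You are a maths and geography tutor.",
]

# 2. See if they pass with the current system message.
system_message = candidates[0]
failures_v1 = run_evals(system_message, evals)

# 3. A failure shows up in production traffic, so convert it into a unit test.
evals.append(("capital of France", "Paris"))
failures_v2 = run_evals(system_message, evals)

# 4. Iterate on the system message until every eval passes, then repeat.
for candidate in candidates:
    if not run_evals(candidate, evals):
        system_message = candidate
        break
```

In a real application the iteration in step 4 is done by a person reading the failures rather than by a loop over fixed candidates; the loop only stands in for that manual step.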

The hardest part of this “flywheel” is the iteration step, where spotting patterns can be very difficult. Other great work, such as DSPy, looks to abstract this step away. These auto-optimization tools work really well in some contexts, but in our experience the best LLM apps are built with a human very much “in the loop” for iteration and prompt engineering 🧑‍💻

We’ve tried to make this iteration process as simple as possible, keeping you in the driving seat of your LLM application 🏎️