🔍 Context Alignment
Let’s take a look and see if we can work out why the agent might be failing on some examples 🕵️ Take Example 132 as a case in point. For this particular question there are 6 sub-questions (a.i, a.ii, b.i, b.ii, b.iii, c), and we’re asking the LLM to do a lot in a single shot:
- understand all 16 points in the general marking guidelines
- understand all 6 of the sub-questions
- understand all 6 of the student’s answers to the sub-questions
- understand all 6 of the markscheme’s reasoning for said sub-questions
🔀 Better Align Context
So, let’s go ahead and improve the system prompt so that related pieces of information sit closer together, and see if it helps 🤞 First, let’s abstract the per-sub-question content into a "{questions_markscheme_and_answers}" placeholder:
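Something along these lines, as a minimal sketch (only the {questions_markscheme_and_answers} placeholder is from the walkthrough; the surrounding wording and the {general_guidelines} variable are illustrative):

```python
# A sketch of the refactored system message template. Grouping each
# sub-question with its answer and markscheme reasoning keeps related
# information close together in the prompt.
system_message_template = """
You are marking a student's exam answers against the markscheme.

General marking guidelines:
{general_guidelines}

Each entry below groups a sub-question together with the student's
answer and the markscheme's reasoning for that sub-question:

{questions_markscheme_and_answers}

Award marks for each sub-question, justifying every decision.
""".strip()
```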
Next, let’s update the `call_agent` and `evaluate` functions, so that we pass the `sub_questions` into the `call_agent` function:
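A rough sketch of that change, under simplified signatures (`call_agent`, `evaluate`, and `sub_questions` are from the walkthrough; everything else, including the data layout, is assumed for illustration):

```python
GENERAL_GUIDELINES = "..."  # stand-in for the 16 general marking guideline points


def format_sub_questions(sub_questions: dict) -> str:
    """Interleave each sub-question with its answer and markscheme reasoning."""
    blocks = []
    for key, sq in sub_questions.items():
        blocks.append(
            f"Sub-question ({key}):\n"
            f"Question: {sq['question']}\n"
            f"Student answer: {sq['answer']}\n"
            f"Markscheme reasoning: {sq['markscheme']}"
        )
    return "\n\n".join(blocks)


def call_agent(sub_questions: dict):
    # Populate the placeholders so related information sits together,
    # then hand the finished system message to the LLM (call elided).
    system_message = system_message_template.format(
        general_guidelines=GENERAL_GUIDELINES,
        questions_markscheme_and_answers=format_sub_questions(sub_questions),
    )
    ...


def evaluate(example: dict):
    # The key change: forward the sub_questions into call_agent so the
    # template can be populated per example (scoring logic elided).
    response = call_agent(example["sub_questions"])
    ...
```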
🧪 Rerun Tests
Re-running the tests, the score comes out at 0.3.
Let’s take a look at the traces, to ensure that the system message template has been implemented correctly, and that each LLM call has the template variables in its system message populated correctly. It seems as though everything was implemented correctly, and the per-LLM system messages look good ✅
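If you'd rather not rely on eyeballing the traces alone, a small helper like this (hypothetical, not part of the original walkthrough) can flag any template variable that was left unpopulated:

```python
import re


def assert_fully_populated(message: str) -> None:
    """Raise if any {placeholder} survived template population."""
    leftovers = re.findall(r"\{[a-z_]+\}", message)
    assert not leftovers, f"unpopulated template variables: {leftovers}"


# e.g. run over the final system message of each traced LLM call:
assert_fully_populated("All guidelines and sub-questions filled in.")
```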
Again, let’s explore what’s going wrong in the next iteration 🔁