Case Study

Cause & Effect Question Set Evaluation for LLMs

Challenge

Identify potential problems, and their causes, in our client’s existing LLM, in areas such as causal reasoning and model performance. Additionally, we needed to verify performance and strengthen the cause/effect relationships in the model’s outputs by generating benchmark “Cause and Effect” question sets.

Industry

Prompt Engineering

Data Type

Text

Project Duration

2 Months

Ongoing?

Yes

Solution

To execute this, our prompt writers began by choosing a word from a client-provided list to use in a sentence. Operating within a set of client parameters, our skilled writers then created two statements, each presenting a plausible cause or effect related to the selected word. Finally, the writer chose one of the two as the single most accurate response to the client’s “Cause” or “Effect” question, thus providing a usable future benchmark.
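One way to picture a single benchmark item from the workflow above is as a small record. The sketch below is illustrative only: the class name, field names, and sample sentences are assumptions made for this example, not the client’s actual schema or data.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class CauseEffectItem:
    """Hypothetical record for one benchmark item (field names are assumed)."""

    seed_word: str                       # word picked from the client-provided list
    premise: str                         # sentence written around the seed word
    question: Literal["cause", "effect"]  # which relationship is being asked about
    candidates: tuple[str, str]          # the two plausible statements
    answer_index: int                    # index of the statement the writer chose

    def __post_init__(self) -> None:
        # Validate the fields so malformed items fail early.
        if self.question not in ("cause", "effect"):
            raise ValueError("question must be 'cause' or 'effect'")
        if len(self.candidates) != 2:
            raise ValueError("exactly two candidate statements are required")
        if self.answer_index not in (0, 1):
            raise ValueError("answer_index must be 0 or 1")

    @property
    def answer(self) -> str:
        """Return the statement marked as the single most accurate response."""
        return self.candidates[self.answer_index]


# Invented sample item, for illustration only.
example = CauseEffectItem(
    seed_word="umbrella",
    premise="The woman opened her umbrella.",
    question="cause",
    candidates=("It started to rain.", "She finished her coffee."),
    answer_index=0,
)
```

Keeping both candidate statements alongside the chosen one means the same record can later be posed to a model as a two-way choice and scored against the writer’s selection.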

Those responses were then cycled through an evaluation phase, in which reviewers answered predetermined questions about the relevance of each response’s content. The post-evaluation responses were then used to teach the model to generate relevant responses in potentially ambiguous situations.

This type of evaluation cannot be done without Human-in-the-Loop (HitL) judgement and expertise, which we supplied by drawing on the domain experience and knowledge of our prompt writers.

Outcome

We created hundreds of prompts and validated responses (in the form of answer statements) for given “Cause” and “Effect” question prompts. The prompt and answer pairs were then exported as a JSON file and fed into a training pipeline for the client’s LLM. Lastly, the client used the outcome statements as human-generated benchmarks for training an LLM to accurately determine “Cause and Effect” relationships.
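The export step can be sketched roughly as follows. This is a minimal illustration that assumes a flat list of objects with `prompt` and `answer` keys; the actual file layout, key names, and training pipeline used in the project are not described here, and the function names and sample pair are hypothetical.

```python
import json
from pathlib import Path


def export_pairs(pairs, path):
    """Write (prompt, answer) pairs to a JSON file and return the records written.

    The {"prompt": ..., "answer": ...} layout is an assumed format.
    """
    records = [{"prompt": prompt, "answer": answer} for prompt, answer in pairs]
    Path(path).write_text(
        json.dumps(records, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return records


def load_pairs(path):
    """Read the exported records back, e.g. at the start of a training job."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Invented sample pair, for illustration only.
SAMPLE_PAIRS = [
    (
        "The woman opened her umbrella. What was the cause?",
        "It started to rain.",
    ),
]
```

Calling `export_pairs(SAMPLE_PAIRS, "pairs.json")` would write a single-record file that `load_pairs("pairs.json")` reads back unchanged.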

By using these updated benchmark statements, our client successfully pinpointed potential issues and their underlying causes. As a result, they proactively addressed these issues before any problems could arise.
