
Chatbot testing with "LLM as Judge"
Chatbot testing isn't easy! An LLM can answer the same prompt in an effectively unbounded number of ways, so the surface you need to cover with tests is huge. Traditional testing approaches, TDD included, rely on fixed assertions: given this input, expect exactly that output. So what do you do when the output is non-deterministic?
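
To make the problem concrete, here is a minimal sketch of a fixed assertion next to a judge-style check. Everything in it is hypothetical: `chatbot` is a stand-in that picks one of three canned replies, and `judge` is a crude keyword check used only so the example runs offline. In a real setup the judge would be a call to an LLM that grades the reply against the criterion.

```python
import random

# Hypothetical stand-in for a chatbot: same meaning, different wording per call.
REPLIES = [
    "Our store opens at 9 am.",
    "We open at 9 am every day.",
    "Doors open at 9 am!",
]


def chatbot(question: str, rng: random.Random) -> str:
    """Return one of several equivalent phrasings, chosen at random."""
    return rng.choice(REPLIES)


def exact_match(reply: str) -> bool:
    """Traditional fixed assertion: only one exact phrasing passes."""
    return reply == "Our store opens at 9 am."


def judge(reply: str, criterion: str) -> bool:
    """Placeholder for an LLM judge (assumption, not a real model call).

    A real judge would send the reply and the criterion to a model and
    parse its verdict. Here every word of the criterion must appear in
    the reply, which is enough to illustrate the idea.
    """
    text = reply.lower()
    return all(word in text for word in criterion.lower().split())


rng = random.Random(0)  # seeded so repeated runs behave the same
replies = [chatbot("When do you open?", rng) for _ in range(20)]

exact_passes = sum(exact_match(r) for r in replies)
judge_passes = sum(judge(r, "open 9 am") for r in replies)

print(f"exact match passed {exact_passes}/20")
print(f"judge passed {judge_passes}/20")
```

The exact-match check only passes when the chatbot happens to pick one particular wording, while the judge-style check passes for every reply that carries the right information.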





