Martijn Jansen works as a Data Scientist at Rabobank. As a member of the bank's largest GenAI team, he has considerable knowledge of Large Language Models and how to evaluate them. Martijn initially focused on implementing Retrieval Augmented Generation and has since shifted his attention to evaluating chatbots. He studied Computer Science at Utrecht University, with a focus on algorithms, optimization and machine learning. Outside of work, Martijn enjoys playing volleyball and going to the cinema.
Evaluating chatbots, and generative AI in general, is a major challenge. Ideally, a chatbot's answers would be compared against ‘good’ or correct reference answers to determine their quality, but such ground truths are time-consuming to write and, even once written, difficult to compare against. At the same time, a proper evaluation framework is the […]