Nederlandse Testdag 2026

Friday 30th of October 2026

The 29th edition of the Dutch Testing Day (Nederlandse Testdag) will take place in [Location to be determined]. For years, the Nederlandse Testdag has been the main event where science, education and the business world share new ideas and insights in the field of testing.

Our Sponsors

Location

[To be determined]

Organisation

The 29th Nederlandse Testdag organization members:

René van Veldhuijzen

LinkedIn

Dr. Petra van den Bos

Assistent Professor at Universiteit Twente

The Nederlandse Testdag Stuurgroep:

Prof. dr. Tanja Vos

Hoogleraar Software Engineering Open Universiteit

LinkedIn

Dr. Petra van den Bos

Assistent Professor at Universiteit Twente

Dr. Machiel van der Bijl

CEO and founder Axini

LinkedIn

Bart Knaak

IT Consultant at ABN AMRO Bank N.V.via Professional Testing

LinkedIn

Program

[To be determined]

KEY NOTE SPEAKER

Michael Bolton

Test Consultant / Quality Coach / Speaker at Rapid Software Testing

Michael Bolton is a consulting software tester and testing teacher who helps people to solve testing problems that they didn’t realize they could solve. In 2006, he became co-creator (with creator James Bach) of Rapid Software Testing (RST), a methodology and mindset for testing software expertly and credibly in uncertain conditions and under extreme time pressure. Since then, he has flown over a million miles to teach RST in 35 countries on six continents.

Michael has over 30 years of experience testing, developing, managing, and writing about software. For over 20 years, he has led DevelopSense, a Toronto-based testing and development consultancy. Prior to that, he was with Quarterdeck Corporation for eight years, during which he managed the company’s flagship products and directed project and testing teams in-house and around the world.

Known and unknown unknowns

When I’m learning something new I always struggle with my unknown unknowns: the things I don’t know that I don’t know. This is also a big topic in software testing. Because when you find no bugs, is it because they aren’t there or is it because you don’t know that you’re looking at it the wrong way?

In this talk I will explain my troubled past with unknown unknowns, and how I deal with them. One spoiler: Ask a lot of questions. Questions about things everyone already seems to know. Also: ask things you already think you know.

You, the audience, also get to participate! We’ll do a round of seemingly easy tasks to do together, where we’ll find that it doesn’t work at all! For a few brave souls I also have an assignment to do on the stage with me.

After the talk you’ll be prepared to challenge things you think you know, and to face your unknown unknowns without fear.

Key learnings:

Unknown unknowns always exist and you’ll never know you have them.
Looking at learning with an open mind uncovers new unknowns
Assume there is learning in a topic of you, even if you are versed

Linda van de Vooren

Test Consultant / Quality Coach / Speaker at Bartosz ICT

In daily life I am an amateur (baritone!) saxophonist, and an experienced software tester. Living in the center of Netherlands, you can find me exploring nature, visiting at a concert or the theater. I enjoy working in complex environments, and do not shy away from a challenge, wether it be complexity through technical difficulties or because of a political environment. In any free time that is left, I am an avid gamer (Nintendo!) and enjoy reading (mostly fantasy, currently).

From Design to Disaster: why great UX Designs still fail in practice

Even though we see an increase in the investments in UX designs, there is still little attention on the usability of the actual systems being delivered. This gap between design and end result can have a disastrous impact on the business results, despite the good intentions of focusing on well designed UX designs. In this presentation we’ll explore the validity of the reasons for an increased focus on UX/Usability and question the outcome of the investments associated with UX/Usability, using both statistical data and our combined experience, spanning several business segments and various software systems.

Key take aways

The importance of UX/Usability in relation to other key factors like price, functionality and performance
Quality measures that can attribute to good UX/Usability
Bridging the gap between UX Designs and the actual system

Freddy de Weerd

Test Architect at Polteq

Working for Polteq for the last 7 years. He has gained in depth experience in the field of Quality Assurance and Testing in his 25+ years working in IT. Besides acting as a test manager, coordinator and engineer, he is also a trainer, coach and assessor.

Concurrency Testing as a Sampling Problem

Concurrent and distributed systems are prone to concurrency bugs due to the nondeterminism in the interleavings of concurrent events. Detecting and diagnosing concurrency bugs in such systems is critical since unforeseen interleavings of concurrent events can result in unexpected, erroneous system behavior. However, these bugs are hard to detect as they are triggered only in some subtle interleavings of the events.

In this talk, I’ll give an outline of my research on testing concurrent and distributed systems from a perspective of probabilistic sampling. While naïve random stress testing is unlikely to discover difficult bugs, our recent testing algorithms offer effective testing methods by carefully sampling test executions.

Burcu Külahçıoğlu Özkan Assistant Professor at TU Delft

Fellow in the Software Engineering Research Group at TU Delft. She received her PhD from Koç University in Istanbul, Turkey, followed by postdoctoral research at the Max Planck Institute for Software Systems (MPI-SWS) in Kaiserslautern, Germany. Her research focuses on formal methods, model checking, software testing, and debugging of concurrent programs and distributed systems. She is a recipient of the academic research awards and grants from Amazon Research and Stellar Development Foundation.

Applying Behaviour-Driven Development for Game Creation

How to test computer games? In computer games, a player interacts with advanced (AI) agents, and deals with extensive game worlds. While computer games can be immensely complex, and bugs show up in well-known games, testing has not been picked up as much in the game software engineering community, as it has in traditional software engineering. In this talk I will show how Behavior-Driven Development, which is a popular technique for specification and testing in traditional software engineering, can be applied in game software engineering as well. Specifically, I will present the highlights of (i) a framework to help express game behaviors as BDD scenarios, (ii) a method to apply BDD in game development, and (iii) tooling to apply BDD in Unity 3D, a major game development platform.

Petra van den Bos

Assistent Professor at Universiteit Twente

I am an assistant professor in the Formal Methods and Tools group of the University of Twente. My current research focusses on software correctness and software quality in general, and on model-based testing specifically. I like working on theory (formal methods), that can be applied in practice as well. Previously, I had a postdoc position in the Formal Methods and Tools group of the University of Twente. Before that, I had a PhD position in the Software Science group of the Radboud University, where I completed my thesis “Coverage and Games in Model-Based Testing”.

VU-BugZoo: Bug Hunting Games to Educate Gen Z Software Testers

In these disruptive times, when code and tests can be created in a few minutes by generative AI, students are less than ever motivated to learn. In an attempt to alleviate the situation, we developed VU-BugZoo, an innovative teaching platform that instead of asking students to create test plans on paper, for hypothetical code, it engages them in exciting bug detective games in executable, standalone- and embedded- code. While grading, we praise an elegant test strategy above just finding the bugs. The results show that the approach succeeds in increasing the excitement and engagement in today’s software testing classrooms.

PDF of Slides

Natalia Silvis-Cividjian

Assistant professor at the Vrije Universiteit in Amsterdam

She has been teaching software testing for more than 10 years. She is also the author of a textbook titled Pervasive computing: Engineering Smart Systems (2017). In 2020, she initiated the VU-BugZoo project that resulted in two products: DBugIT, an online bug-hunting tool for standalone software, and a miniature, tangible VU-Smart Home that can host corrupted embedded software. The innovative teaching approach and its results have been promoted in a few scientific publications and disseminated to other courses, like programming and other institutions, like the HvA.

The Inconvenient Truth of Software Testing: What is Our Environmental Impact?

We push the button and let our tests run. But what really happens? What machinery is put into motion when we test and what are the environmental consequences of our quality assurance process? In this talk, I will take you through some of the insights that we have made while closely examining the potential environmental impact of building and testing 204 Java open source projects. The outlook is that testing is perhaps not as green as we might have thought.

Slides on slideshare

Andy Zaidman

Full professor software engineering at the TU Delft

His mission is to make software easier to evolve, and easier to test. For his work on software testing, he was the laureate of a Vidi mid-career grant in 2013, while in 2019 he received the most prestigious Vici career grant from the Dutch Science Foundation NWO. In 2024, Andy Zaidman became the head of department for the Department of Software Technology, which brings together around 200 researchers, educators and support staff in the area of the design, engineering, and analysis of complex, distributed, and data-intensive software and computer systems.

Het gebruik van model-based testen in het test proces voor medische software

TBD

Ronald van Doorn

Project & program manager | Quality manager at Bilihome

Ronald van Doorn is the Chief Program and Quality Officer at Bilihome, with over 25 years of experience in industrial automation, software engineering, and the development of medical devices. Throughout his career, he has specialized in bringing complex, safety-critical technologies to market, with a strong focus on innovative solutions, functional safety and regulatory compliance. Within Bilihome, Ronald van Doorn leads both the technical innovations and the quality management strategy behind the development of a wearable medical device designed to treat newborns diagnosed with jaundice. The device enables newborns to be treated safely and effectively with phototherapy at home, combining clinical reliability with user-centered design. Throughout our design verification and validation, we used an automatic testing philosophy while applying model based testing (MBT) techniques from Axini.

REST API bug detection with WuppieFuzz

Today’s world depends on many digital services and the communication between them. To facilitate this communication between applications, standardised and well-specified application programming interfaces (APIs) are often used.

In particular, the use of well-defined representational state transfer (REST) architectural constraints for APIs is popular. As an entry point to many applications, these APIs provide an interesting attack surface for malicious actors. Furthermore, since APIs often control access to business logic, a security lapse can have high-impact undesirable consequences.

Thorough testing of these APIs is therefore essential to ensure business continuity. Manual testing cannot keep up, so automated solutions are needed. In this talk, we introduce and demonstrate WuppieFuzz, an open-source, automated testing tool that makes use of fuzzing techniques and code coverage measurements to find bugs, errors and/or vulnerabilities in REST APIs.

PDF of Slides

Thomas Rooijakkers

cyber security researcher at TNO

Thomas Rooijakkers is a dedicated cyber security researcher at TNO deeply committed to enhancing software quality and software security. He believes that high-quality software is the foundation of secure systems and has a strong drive to bring innovative solutions to industry.

In his current role as Lead Scientist for the Early Research Programme on Cyber-secure Systems by Design, Thomas is transforming product and system development by integrating cybersecurity at every stage of the engineering process. This ensures secure, reliable, and resilient cyber-physical systems, enabling the digital transition to advance further.

Additionally, Thomas is serving as the Interim Portfolio Manager for cybersecure products and systems, where he oversees the strategic direction and development of TNO’s cybersecurity portfolio.

Automated Validation of RAG-pipelines

Take away

This presentation explores how automated white-box validation following the LLM-as-a-Judge paradigm can be used to assess and improve RAG pipelines, offering practical insights into maintaining the trustworthiness and effectiveness of RAG-based applications over time.

Abstract

With the advent of foundation models and generative AI, especially the rapid rise in Large Language Models (LLMs), we see our students and companies around us build a popular type of AI-enabled systems: private document chatbots based on Retrieval-Augmented Generation (RAG). From these systems, we learned that it is not hard to build something that works from a technical point of view, but that the true challenge lies in building solutions that are thrustworthy and capable of delivering consistently high-quality results.

Validating RAG pipelines in practice is challenging for several reasons: 1) the document retrieval step can be configured in many ways and is sensitive to the structure and quality of input documents; 2) inference of high-quality answers depends on the relevance of the retrieved documents; 3) assessing answer quality is subjective and often require domain expertise; 4) improving RAG pipelines involves experimentation with new techniques and configurations, which demands structured and consistent evaluation methods.

Most existing validation frameworks treat RAG systems as black boxes, evaluating only the final output without insight into how that output was produced. In contrast, our approach takes a white-box perspective: by leveraging knowledge about the inner components of the RAG architecture and evaluating the outputs of intermediate stages, we gain a deeper understanding of system behavior and failure points.

To support this, we are developing an automated validation framework, that combines both traditional, calculable metrics with more nuanced, LLM-based evaluation techniques following the LLM-as-a-Judge paradigm. This provides more granular insights, supports targeted improvements, and allows for more reliable quality assurance.

In this presentation, we share key insights from our work and show how automated white-box validation can be applied in practice to assess and improve RAG pipelines. We explore the benefits and challenges of this approach and illustrate how it enables ongoing quality monitoring to help RAG-based applications remain trustworthy and effective over time.

PDF of Slides

Leon Schrijvers

Lecturer in Software Engineering and Artificial Intelligence at Fontys ICT

Leon Schrijvers has been active in the IT industry since 2005, starting as a software engineer before moving into roles as senior software architect and technical director. Across all these positions, he maintained a strong focus on software quality assurance and security.

In 2018, he joined Fontys ICT as a Lecturer in Software Engineering and Artificial Intelligence, to educate the next generation of software engineers. He also contributes to educational development initiatives aimed at shaping the future of the field.

Since 2023, he has been researching LLM engineering, with a focus on building trustworthy LLM systems. In his research, he enjoys combining theory with a hands-on approach, delivering practical results that educate and inspire others. Leon holds a Master’s degree in Computer Science from Eindhoven University of Technology.

Automated Validation of RAG-pipelines

Take away

This presentation explores how automated white-box validation following the LLM-as-a-Judge paradigm can be used to assess and improve RAG pipelines, offering practical insights into maintaining the trustworthiness and effectiveness of RAG-based applications over time.

Abstract

With the advent of foundation models and generative AI, especially the rapid rise in Large Language Models (LLMs), we see our students and companies around us build a popular type of AI-enabled systems: private document chatbots based on Retrieval-Augmented Generation (RAG). From these systems, we learned that it is not hard to build something that works from a technical point of view, but that the true challenge lies in building solutions that are thrustworthy and capable of delivering consistently high-quality results.

Validating RAG pipelines in practice is challenging for several reasons: 1) the document retrieval step can be configured in many ways and is sensitive to the structure and quality of input documents; 2) inference of high-quality answers depends on the relevance of the retrieved documents; 3) assessing answer quality is subjective and often require domain expertise; 4) improving RAG pipelines involves experimentation with new techniques and configurations, which demands structured and consistent evaluation methods.

Most existing validation frameworks treat RAG systems as black boxes, evaluating only the final output without insight into how that output was produced. In contrast, our approach takes a white-box perspective: by leveraging knowledge about the inner components of the RAG architecture and evaluating the outputs of intermediate stages, we gain a deeper understanding of system behavior and failure points.

To support this, we are developing an automated validation framework, that combines both traditional, calculable metrics with more nuanced, LLM-based evaluation techniques following the LLM-as-a-Judge paradigm. This provides more granular insights, supports targeted improvements, and allows for more reliable quality assurance.

In this presentation, we share key insights from our work and show how automated white-box validation can be applied in practice to assess and improve RAG pipelines. We explore the benefits and challenges of this approach and illustrate how it enables ongoing quality monitoring to help RAG-based applications remain trustworthy and effective over time.

PDF of Slides

Petra Heck

Associate Professor Software & AI Engineering at Fontys ICT

Petra Heck is an accomplished academic and researcher with extensive experience in software engineering, AI development, and educational innovation. She began her career in 2012 as a Lecturer in Software Engineering at Fontys University of Applied Sciences, where she later advanced to Senior Researcher in the AI & Big Data research group. Her postdoctoral research (2019–2021) focused on production-ready machine learning systems, contributing to the emerging field of AI engineering.

Currently serving as Associate Lector AI Engineering, Petra leads applied research projects, mentors students and colleagues, and facilitates knowledge transfer between academia and industry. She is also active in educational development and innovation.

Her academic journey includes a Master’s in Computer Science from Eindhoven University of Technology and a PhD from TU Delft, with a dissertation on agile requirements quality. Petra’s work is characterized by a transdisciplinary approach, bridging technical expertise with pedagogical innovation.

AI Based Testing versus Model-Based Testing

What is AI based testing?

And what is Model Based Testing?

Both techniques have been around for a while, but have been very hype sensitive.

With the current AI hype and the history of MBT, I see much confusion on what is what, and would like to share my vision on what they mean and how they differ but also compliment each other. It is always hard to predict what the future holds, but these two techniques will help evolve the test field.

Kyra Hameleers

Test Engineer at Foreside

Kyra Hameleers is a senior Test Engineer, with a background in software development. She has a specialization in Model Based Testing (MBT) but any test related technology has her interest. Kyra has been working with multiple parties in investigating AI and how it can advance the test field further.

In her free time Kyra is a stereotypical nerd; playing games, reading books, writing stories. She also loves technology and spends time researching it.

Nederlandse Testdag 2026

Our Sponsors

Location

Organisation

The 29th Nederlandse Testdag organization members:

René van Veldhuijzen

Dr. Petra van den Bos

The Nederlandse Testdag Stuurgroep:

Prof. dr. Tanja Vos

Dr. Petra van den Bos

Dr. Machiel van der Bijl

Bart Knaak

Program

Speakers

Previous editions