How It's Tested
30 MIN

Ep. #17, How BDD Changed Software Testing with Steve Tooke

about the episode

In episode 17 of How It’s Tested, Eden sits down with Steve Tooke, Director of Product at Kosli and former co-founder of Cucumber, to explore the impact of Behavior-Driven Development (BDD) on software testing. They discuss how Cucumber evolved from an open-source tool into an industry staple, the challenges of test automation, and how AI is reshaping software quality.

Steve Tooke is the Director of Product at Kosli and a former co-founder of Cucumber, a pioneering tool for Behavior-Driven Development (BDD). With a background in software engineering, testing, and product development, he has played a key role in shaping modern testing practices.

transcript

Eden Full Goh: Today, I'm speaking with Steve Tooke. He's the former co-founder of Cucumber and currently a director of product at Kosli.

Hey Steve, thank you so much for joining me on the How It's Tested podcast.

Steve Tooke: Hi, Eden. No worries at all. Thank you. Looking forward to it.

Eden: Yeah, so I've been really excited about getting the chance to talk to you.

I think you've played a really critical role in sort of the way that the testing industry has evolved, sort of like best practices and how they've evolved in the last five to 10 years with the role that you played at Cucumber and some of the other companies that you've been involved with as well.

It'd be great for you to maybe introduce yourself, kind of your backstory, how you got into all of this as an engineering and product professional.

And I'm sure our audience would love to kind of hear about what inspired you to get started with founding Cucumber and everything that came before that too.

Steve: Yeah, so hi everyone. I'm Steve. I'm the director of product at Kosli. And my background is really in software engineering.

So in 2015, I guess I joined Cucumber as a co-founder, so I'd been part of the open source community there for a little while.

I'd been really involved in, effectively, what we looked at as a rewrite of the original Cucumber, the Ruby version of Cucumber. I worked with Matt Wynne on that.

And it was through that, and through helping companies with training in behavior-driven development and using Cucumber in practice, that I got involved in the organization. I joined Cucumber at that point to help kick off the development of the commercial product we were trying to build to support the open source work.

And the whole idea of having a Cucumber company was to kind of protect and grow the open source tooling.

So that's how I got involved in the Cucumber company, but maybe more interesting is how I got involved in BDD and got interested in the idea of Cucumber.

So for those who don't know, BDD stands for behavior-driven development. At its essence, it's about driving the development of software using executable examples as specifications for how the software should behave.

The canonical example is how an ATM should work: when you put your card in, type your PIN, and select the amount of money you want to withdraw, the result should be that the cash is dispensed and your account is debited.

And both of those two things are really important, right? You need your money and the bank needs to record that they've given you the money, right?
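Written out in Gherkin, Cucumber's specification language, that canonical ATM example might look something like this (an illustrative scenario, not one taken from a real project):

```gherkin
Feature: Cash withdrawal

  Scenario: Successful withdrawal from an account in credit
    Given my account has a balance of $100
    And I have inserted my card and entered my PIN
    When I request $20 in cash
    Then $20 should be dispensed
    And my account balance should be $80
```

Both outcomes Steve calls out, the cash and the debit, appear as Then steps, which is what makes the example a specification rather than just a script.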

And so that's just a really small example that lets you then dig into that behavior. The thing that got me interested was that I came from an extreme programming background, so I'd been really interested in test-driven development.

I've been doing test-driven development for a while. For me, the idea was this helps me think about what those tests are that I'm going to write.

These are automated tests that I'm going to write first, and then I'm going to drive the development from them.

And I'd already been doing something like this for a number of years, a similar practice that people called acceptance test-driven development.

So we've got acceptance, test-driven development, test-driven development, behavior-driven development.

They're really all different ways of saying the same thing where you say how you want the software to behave first.

You see that fail as a test, you write the code to make it pass, and then you have working software.

And I'd been doing that for a number of years with a tool called Fit, which let you write these examples of the behavior of the software in Word documents or Excel spreadsheets and write code to automate them.

But the same idea was running through it: how do you say what you want the software to do, write something which will prove that it does it, and then have it work?

Eden: I know Cucumber has been around as an open source tool for a while and it's based on, you know, the original Gherkin language and behavior-driven development.

But then it sounds like you started using Cucumber as an open source tool as a professional, and then there came about this opportunity to create a company and a consultancy around it and really build up the profile of Cucumber.

But I'm curious, as you were using it as a professional, when it was still just an open source tool, what did you observe about where it could go? What was the vision you started to have around what you wanted it to become, or the limitations you were experiencing using it as a professional?

Steve: Yeah, so I was using it, and Gherkin in particular, in roles where, particularly in the consulting company I'd been part of, we'd go in and help people build the software that they want.

So they've got the ideas, but they don't have the in-house team or skills themselves to build the software.

So it was a great communication layer in that level where what we could do is we could start to say, well, let's sketch out what we want that behavior to be and we'll write that out in a document and then you can give us some feedback on that.

And that process is exactly where the sweet spot for me was for Cucumber, it meant you could lift the idea of describing the behavior of the software up out of code into a kind of document format that you could go back and forth with a non-technical person with. And you could keep that description in the problem domain.

And what we saw with Cucumber was more and more organizations starting to use it. And there were two things going on.

One, there was a lot of people who were using it the way that we all imagined it being used, the way we were using it, and they were getting some good success from it.

But then what they were finding is these documents, which were really business documents, right? Describing what the behavior of the software was.

What would happen is the developers would get hold of those and they'd lock them up in their Git repositories, right?

That's where they have to live because they're tests of the software. Not only are they specifications, but they're also tests of the software.

So they need to be versioned along with the software so they need to be in our Git repositories.

So suddenly, we've got these documents, which were intended to be shared artifacts between non-technical and technical people, but the technical people are keeping them well under lock and key.

And that was the idea for our commercial product that we were building. We called it Cucumber Jam.

The idea was that we'd be able to create a space for technical and non-technical people to collaborate on the specification of the software that would result in executable things. So that's kind of one side of what happened and where we thought building a product company alongside the open source tooling would be valuable.

The other thing that happened though was when Aslak originally created Cucumber and the Gherkin language, he created an example file that he called web steps.

So this was a set of Gherkin steps, things that you could reuse, and they said things like, "when I click this button," or "given I have this element, then this element should appear on the page."

It kind of laid out an example of how you could use Cucumber to drive Selenium, I think it probably was in those days, or maybe even Watir or one of those other browser automation tools.

And the thing about that web steps file was that it was included as part of Cucumber, and all of the examples that ended up on the web were about using it like this.

And it was wonderful because it really did show you very simply how you could use a little bit of text and some pattern matching and then say, "I want that to result in this piece of automation."
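The mechanism being described, a little bit of text plus pattern matching mapped onto automation code, can be sketched in a few lines of Python. This is a toy illustration of the idea, not Cucumber's actual implementation:

```python
import re

# Registry of (pattern, handler) pairs, like Cucumber's step definitions.
step_definitions = []

def step(pattern):
    """Register a handler for Gherkin step text matching `pattern`."""
    def register(fn):
        step_definitions.append((re.compile(pattern), fn))
        return fn
    return register

@step(r'I click the "(.+)" button')
def click_button(name):
    # A real step definition would drive a browser here.
    return f"clicked {name}"

def run_step(text):
    """Find the first pattern matching the step text and invoke its handler."""
    for pattern, fn in step_definitions:
        match = pattern.search(text)
        if match:
            return fn(*match.groups())
    raise LookupError(f"no step definition for: {text}")

print(run_step('When I click the "Login" button'))  # prints: clicked Login
```

The captured groups from the regular expression become the handler's arguments, which is essentially how a line of Gherkin gets wired to a piece of automation.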

Of course, the downside of that is it was there and really easy to use and all the examples show people how to use it.

And so what you ended up with was people using Gherkin as a scripting language for automating browsers.

So instead of nice descriptive examples of how the software should behave, what we ended up with was a whole bunch of statements like "given I have created a user," or maybe not even that.

"Given I have filled in the username as Steve and the password is this and clicked create, when I enter the username Steve and the password XYZ and press log in, then I should see the homepage of Steve, or I should see Steve is logged in," or something like that.
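Laid side by side, the two styles Steve is contrasting might look something like this (hypothetical scenarios, sketched for illustration):

```gherkin
# Gherkin as a browser script, the style web steps encouraged:
Scenario: Log in
  Given I have filled in "username" with "Steve"
  And I have filled in "password" with "xyz"
  And I have pressed "Log in"
  Then I should see "Steve is logged in"

# Gherkin as a description of behavior, kept in the problem domain:
Scenario: Registered user logs in
  Given Steve has an account
  When Steve logs in
  Then he should see that he is logged in
```

The first version scripts the user interface; the second describes the behavior and leaves the clicking and typing to the step definitions.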

It meant you could put automation in the hands of people who didn't really have a lot of programming experience. Which is great, right? I mean, we're seeing the drive for that right now in the AI revolution: you want to give the power of computing to more and more people. The problem is that test automation is a software engineering activity, and you have to come at it from a software engineering perspective to create well-engineered tests which really tell you whether the behavior you're looking for has happened.

That testing idea of asking those "what if" questions matters, but you need the engineering rigor to make the tests really, really valuable.

And I think that was the big problem that we saw in Cucumber: it had been, I don't want to say misused, but this small example had created a situation where you didn't have people with the engineering experience managing how those tests were created.

Eden: When that happens... 'cause the format of Gherkin, or just the Given-When-Then language format, the syntax, is really accessible. Anyone can just sit down and all of a sudden write a Cucumber test. This framework is available to them that they didn't previously have access to.

But because the format is so intuitive, it also introduces this opportunity for tests to be written in a more convoluted way, or, you know, in a way not directly aligned with BDD or TDD principles.

When you built Cucumber Jam, or sort of the product side of what you did at Cucumber, did you have to build in any guardrails to start protecting against that?

Or is there even a need to protect against that? Like how do you sort of think about that?

Steve: So we didn't build anything into the product as guardrails.

I mean, one of the things we did with Cucumber Open Source was we removed the web steps from it just because we didn't think it was particularly helpful, even if you wanted to use it for driving browser tests.

We just felt you could probably do almost as well with just like Ruby or something rather than Cucumber.

I think the interesting thing as well is that Given-When-Then as a format was first described by Chris Matts and Dan North, and they were really coming at it from a business analysis angle.

I think Chris Matts is not a software engineer at all. And his idea was, "Well, let's talk about the outcome that we want with a Then, and then work backwards from that."

So it's just a way of discussing with people what you want the software to do.

So yeah, I think it's really fascinating that like you say, it's something that's intuitive and then you get a little bit of a piece of software which lets you do something really powerful that lots and lots of people want to do. And the drive to test automation was huge.

So when Aslak wrote Cucumber back in, I want to say, 2005, maybe 2009, his whole goal was just to be able to automate these Gherkin files as tests, right?

But that was also the time when everyone was really starting to discover the power of, well, if we automate more of our tests, we get to shift that left.

You know, the whole shift left thing was starting then and everybody was trying to up their automation, DevOps was kind of getting there.

How do we automate our build process? So it all happened at that time, when people were like, "How do we get all of these manual test scripts that we've got and automate them as quickly as possible?"

So we can take advantage of, you know, DevOps, pipelines, automated builds, regular releases, all of that good stuff that we think of as almost everyday these days. It was just that confluence of things at the same time, I think, that really meant Cucumber use exploded.

Eden: And so, you know, you started Cucumber, and Cucumber as an open source tool was originally intended for software professionals.

And then now, more and more quality assurance professionals, analysts, testers are using it. How has that even evolved in the last couple of years since you've moved on to other roles?

And I'm curious what you think with, kind of like you touched on the AI revolution, you know, these large language models.

I'm curious if anything has evolved notably in the industry where you see that Cucumber helped lay the foundation, and how it's still being used today.

Curious kind of like how you've seen that play out in the last few years?

Steve: It's a great question and I'm sort of hesitant to answer, just because there are so many people talking about so many great things and so many great ideas in that area that I probably don't have a great handle on all of it.

But I think where Cucumber, and BDD more generally, has been a great success is in helping people understand that testing isn't about testing quality in at the end. It's about moving that conversation to earlier in the process, right?

It's not just about checking the work of the developers. It's about understanding, like the folks talking about the holistic testing idea, that there's testing happening all the way through the pipeline, all the way through that value stream.

I think that's what's really important. And I see that even now, like working much more in the product side of this.

Like I really want to know: the assumptions that I'm making about my users and what will be good for them, how do I make sure those are well understood before we even start building code?

The solution that we're coming up with, is it the right solution? Like is it actually going to give our users the outcomes they want and are we going to be able to deliver those outcomes to them in a way that gives us the business outcomes that we want?

And I don't know if that's my personal journey or the journey of the industry over the years, but that's how I feel things have gone on.

I like to think that that extreme programming community, the behavior-driven development community, the agile community, have kind of added to that corpus of discourse around all of these ideas.

And that's why, as an industry, testing is central to everything we do. I'd even say it's central to what we're doing at Kosli.

Really briefly: we're trying to help organizations automate the gathering of evidence for compliance purposes.

But if you think about compliance and governance, really, what that's about is testing that your processes and procedures are helping you manage the risk the way you want.

And like all of this is a test of some sort. And that's fascinating to me how similar all of these things are just in different parts of our value streams.

Eden: Yeah, I can totally see how there's a lot of parallels between, you know, the work that you used to do in the testing industry, building different testing products at Cucumber and at SmartBear and how that actually all ties in to your role, building product at Kosli as well.

I'm curious like, when you started working in the industry a decade ago, there was still a very predominant focus on web applications, web automation, a lot of tooling and conversation and dialogue in the testing industry was still focused on that.

As we kind of move into a world where there's more mobile, there's more hardware products, do you feel like these best practices around BDD, TDD, how do they stand the test of time?

Realistically for the next like five, 10 years, there's still a lot of screens involved, there's still a lot of web apps involved. So I think there's a lot of adoption of kind of those best practices.

But I am curious, like as we kind of move away from the world of screens, does all of this generalize and does it stand the test of time?

Steve: I think the fundamentals are going to remain the same, right? In that there is some outcome that I'm expecting to get.

And that outcome is going to happen because it's triggered by something else. And that action is going to happen within some context.

I think you can test for those kinds of things no matter what the sort of medium is. And I think it's really important to know the level that you're testing and how sure you can be of the answer.

I'm a bit hesitant because I think LLMs bring a whole new world because you can't be sure of what the outcome's necessarily going to be.

So we have to think: we're not now testing specifics. There's a kind of quality of the answer that we have to test somehow.

And I haven't really started thinking about how we do that, but I think the fundamentals are still the same. You just have to think differently about what the outcome is that you're testing.

It's not a specific outcome like this thing is on the screen or this value is in the API response or this, you know, IoT device has changed state.

It's now, I've got an answer. But does that answer satisfy the questioner?

And so I guess that's one of the things that's going to change, and it's really similar to what we've seen in the observability world, right?

We do a lot more looking at the quality of our systems, not just from whether we've tested them on the way out, but by testing them while they're running.

Are they performing with the right kind of characteristics that we want? So some of that testing is going to move into runtime rather than build time.

I guess that's the shift, but the idea is you still want to be able to specify what the characteristics are that you want ahead of time so you can build towards that.

Eden: Yeah, I think that makes a lot of sense and I'm very keen to see how the industry continues to evolve.

I'm also going to be curious to see how it starts to blend into the world of compliance and more established enterprises.

'Cause right now, I think, you know, in larger companies, established industries, regulated industries, there's still a lot of emphasis on the known deterministic paths, products, platforms. A lot of it is still web and computer-focused, you know, on a desktop or something.

And so it will be really fascinating to see how this all kind of scales out over the next five to 10 years and how larger enterprises, larger regulated industries will engage with these kind of new form factors and products.

Steve: Well, I think some of those larger enterprises really see the value. I was talking to somebody at a very large enterprise.

They were developing a chatbot for internal use, for IT support basically. And what they had calculated was that every correct answer this chatbot gave a member of staff for their IT support issue saved the organization in the region of $20, right?

And there was something like, I'm trying to remember. I'm just making the numbers up because I can't remember.

But in the order of 10,000 questions a month, that's a pretty significant saving for the organization. Not only is it a saving, but it means that people are getting unblocked quicker, people can move forward.

They get the answers to their queries immediately, they're not in a queue. So I think these big organizations might well be some of the driving points for the use of some of these technologies is my guess.

'Cause they're really looking at how they can invest in them, 'cause they've got a lot of savings to make.

Eden: Yeah, you brought up correct answers that the chatbot gives. And so, to go down that rabbit hole a little bit, how do we know if the chatbot's answer is correct?

And like, I also think that it makes the quality industry and the testing industry a really exciting place to be right now, because there need to be protections and there needs to be a definition of what this product is supposed to do.

How is it supposed to serve the user? And I think there's a lot of opportunity there.

Steve: I agree. I mean, just thinking about that example, right?

You wouldn't want to release it to your organization until you'd run a pilot long enough that you were happy it was giving useful answers, I guess at least 80% of the time or something.

And then you'd also want to release it in a way that let users report really quickly and easily whether the answer was helping them or not.

I mean maybe even better if you could then build in systems so that you could track whether the user was getting the outcome that they wanted.

And I think again, in this sort of new world where we're not necessarily in control of what is happening, the only way we can do it is to move to kind of qualitative testing after the fact.

Eden: Cucumber and sort of BDD and TDD tools play a role in the industry and sort of like shifting left as you mentioned.

But there's also the role that exploratory testing, more qualitative testing like you mentioned, accessibility testing, and usability testing play.

When you were at Cucumber, or even at SmartBear, or even now as a working professional, have you seen that bucketed and addressed differently with other tools, or is there some overlap with the tooling that you were helping to build?

Steve: So I think Cucumber was always very focused on the kind of regression testing, the testing that helps you when you know what the test should be.

You build that, you can then automate it and you can have it in your pipeline and you get the answers, right?

Again, it's just like the brave new AI world we're in. We can be fearful that it's going to take jobs away, or maybe it's just going to open up the space for more work where people can really add value, as opposed to the things that we can automate.

That was the idea then. I think things like accessibility and so on, there were definitely lots of tools coming up to look at, well, how can we visually inspect the software that we build, the output?

How can we check that, you know, it would be usable by people with accessibility issues and so on.

I think all of that is super important, but it wasn't ever an area that I was particularly focused on.

Eden: Yeah, as the industry continues to evolve, there's web, there's mobile, there's automated, there's manual testing, there's different kinds of testing, there's, you know, everything going on with the AI revolution.

But what I really enjoyed with this conversation was having a chance to reflect on the history of how we even got here. I think when I first started Mobot a few years ago, Cucumber was a tool that I'd heard a lot about.

But having the chance to actually meet someone who was a formative part of really scaling Cucumber and making it mainstream, and to hear about all the projects that you've worked on since at SmartBear and now at Kosli...

It's been really cool to reflect on that, and I'm sure in a few years' time I'm going to have a conversation with someone else about how they built an AI tool. The industry trends are going to keep evolving, but I think there's always something that we can learn from best practices and historical context, and it was really fascinating to get to dive into that.

Is there anything else that, any final thoughts or perspectives that you'd like to convey to our audience?

Steve: Maybe just one little thought that I've been having recently. As we're talking about AI, we've been talking a lot more about AI than I'd expected us to, but I guess that is the way things are these days.

And there's a part of TDD where the canonical description of it is Red, Green, Refactor, right? You have a failing test, and you make it pass in as terrible a way as you possibly can.

And then you improve the code, right? You get it to pass and then you improve the code, 'cause you've got these tests to give you the safety net.
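As a minimal sketch of that loop in Python (a hypothetical example, not one from the episode): the test is written first and fails, a crude version makes it pass, and the refactored version keeps it passing.

```python
# Red: the test comes first. With no total_price defined yet,
# running this file fails, which is the "failing test" stage.
def test_total_price():
    assert total_price([2.00, 3.50]) == 5.50
    assert total_price([]) == 0.0

# Green: the quickest thing that makes the test pass, however ugly.
# def total_price(prices):
#     result = 0.0
#     for p in prices:
#         result = result + p
#     return result

# Refactor: same behavior, cleaner code; the test is the safety net
# that proves the behavior didn't change.
def total_price(prices):
    return float(sum(prices))

test_total_price()
print("green")  # reached only if the assertions pass
```

The point of the loop is that the refactor step is safe precisely because the test already passed once; any regression turns it red again.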

And one of the thoughts I've been having recently is that with AI coding assistants and so on, maybe we can really quickly get to that green stage.

We can say what we want the software to do, we can maybe automate creating the test, I don't know, or we can write the test ourselves, get to green as quickly as possible, and then we can drive the improvement of the code towards the design we want.

So we can either explain to the AI coding assistant how we want to evolve the design or we can get into it ourselves.

So I kind of feel like for all those developers out there, good design sense and refactoring skills are still going to be super important and I'm pretty excited to see where that goes.

Eden: Yeah, you can have the same analogy with the way that AI is being applied to generating music, generating emails, drafts, written content.

AI's good for the first draft, but it's no replacement for instincts, best practices, the human intuition piece, the refinement piece: does this actually do the job, and is it not just a bunch of placeholder, generic, canned, cookie-cutter content, or whatever is being coded or created?

And so I hope that's the way professionals really start to engage with those tools. I think we're all at a point where, like, I remember when ChatGPT came out a couple of years ago, there was that initial honeymoon buzz where everyone was like, "It'll do everything for you."

And then we all kind of realized, oh wait, no, it hallucinates things that aren't there. It's, you know, making things up, representing things as factual when they're not.

You know, I've had that experience where, when we try to use AI tools, one will hallucinate and try to automate a test like, "Tap on this button," but that button's not on the screen. What are you talking about?

And I think there's work that still needs to be done; there's literal refinement of these AI tools needed. But I do think the human role is going to be a bigger one.

Just a different role, not a reduced role. And I hope that that is really the direction we continue to head in as a society and as an industry.

Steve: 100%. And if it just means more people are able to make use of the amazing power of these computers that we all have in our homes all the time, then how much better is the world going to be?

Eden: Thank you so much, Steve, for taking the time to share your experiences at Kosli, at SmartBear, at Cucumber.

You know, I've really enjoyed kind of like getting to know the evolution of your career and just like you've played such a pivotal role.

You know, thanks again for taking the time. I'm sure our audience enjoyed this conversation as well.

Steve: Fantastic. Thanks so much, Eden.