Ep. #50, Identifying Weak Spots with Benjamin Wilms of Steadybit
In episode 50 of o11ycast, Charity Majors and Jessica Kerr are joined by Benjamin Wilms of Steadybit. This conversation examines chaos engineering before production, identifying weak spots in your distributed system, and the health of the tech industry at large.
Benjamin Wilms is Co-Founder & CEO of Steadybit. He was previously senior software engineer at Codecentric AG.
Transcript
Benjamin Wilms: Because we would like to make chaos engineering accessible for everyone, you don't need to be an expert to know what's going on in our core engine.
It's more like get started with chaos engineering and do it in a safe and easy way, without knowing anything like an expert for chaos engineering.
Jessica Kerr: But do you have to be an expert in your own systems?
Benjamin: Not really, because also we are using-- So we are collecting data about your system, we are analyzing your system and we are able to identify so-called weak spots: spots in your system, areas in your distributed system where you should maybe get started with a specific, already configured experiment.
You would like to do something new, but you're not an expert, and you don't know where to get started in your system. "I'm not quite able to understand my own system, and now I should do chaos engineering without knowing anything about it?"
It's a big challenge, so we will guide you to specific points in your system and we will present you some already configured experiments where you can get started in a very easy and safe way, and not kill your system on the first try. Chaos engineering is still risky, and if you do it wrong once, you're not allowed to do it anymore.
That's something we are protecting you from-
Charity Majors: This seems a little bit interesting to me, because I've always felt like in order to do chaos engineering you should be this tall to ride this ride, you should have gotten the basics out of the way. If you're still in a space where you're getting paged all the time or there's a lot of stuff you need to fix, you probably don't need to go injecting even more chaos into your system. Is that wrong?
Benjamin: If you are at that point, no, it's not good. But we would like to enable people to do it at an earlier stage, not in production, because you are hunted by production. You are sitting in front of a very complex, very unstable system.
Charity: You're talking about doing chaos engineering not in production?
Benjamin: Yes, yes.
Charity: What's the point of that?
Benjamin: Not everybody is on the level like Netflix, like AWS, like very big companies that have done chaos engineering for the last 10 years or 8 years, whatever.
Charity: But what can you actually learn if it's not in production?
Benjamin: You can be prepared, you can be proactive to survive production, and you are training your organization, you are training your dev teams, your product teams, your SRE and Ops teams.
Charity: Give me an example maybe, maybe that'll help me understand.
Benjamin: Your SRE team has identified a very big incident last night, they have done a post mortem analysis and now they would like to protect your organization, your team, your customers from running into the same incident again and again and again.
So that's maybe a good point where you could describe a playbook from this incident, you can train your team, you can train your stack not to run into this issue again.
And you can do it at an earlier stage which is quite close to production; it needs to be as close to production as it can be. But it doesn't need to be production.
Charity: Well, see, that's the thing. Nobody's staging environments actually look like production, so what you can learn from them is fairly limited. Can you give me a specific example of the kind of experiment that someone might run for one of these things?
Benjamin: Let's imagine we are a developer and we are focused on our application.
We understand how our application is working, we know what dependencies our application needs to survive, to get started and so on.
So maybe let's get started from this tiny point and let's drop off one of these dependencies and see how my application is reacting within this small blast radius.
And you can do it with all your applications, you can improve your applications at a very small scale.
Charity: Like a library?
Benjamin: Not like a library, like a remote dependency. Maybe you are pushing data into Kafka or you are consuming data via REST, whatever. So something which your service needs to work.
Jessica: So if I'm a checkout service and I depend on inventory service, maybe inventory service becomes unresponsive?
Benjamin: Yeah, exactly. Or maybe it's responding slowly, or maybe it's responding with some exceptions, and maybe you have implemented a retry pattern and you would like to verify if the retry pattern is working or not.
Or maybe you have implemented a fallback strategy and you would like to see if the fallback strategy is triggered as designed. Hopefully it works. You can check and validate if it's happening as expected.
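As a rough illustration of the kind of pattern such an experiment verifies, here is a minimal sketch of a bounded retry with a fallback. CheckoutService, InventoryClient, and the retry timings are hypothetical stand-ins, not part of Steadybit or anything discussed in the episode.

```java
// A minimal sketch: bounded retry plus a static fallback, so a slow or dead
// dependency degrades the checkout instead of crashing it under a chaos experiment.
public class CheckoutService {

    private final InventoryClient inventoryClient; // the remote dependency under test

    public CheckoutService(InventoryClient inventoryClient) {
        this.inventoryClient = inventoryClient;
    }

    public int availableStock(String sku) {
        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                return inventoryClient.stockFor(sku); // may be slow, throw, or time out under chaos
            } catch (RuntimeException e) {
                sleepQuietly(200L * attempt); // back off a little longer on each attempt
            }
        }
        return 0; // fallback: report "out of stock" rather than failing the whole checkout
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
}

interface InventoryClient {
    int stockFor(String sku);
}
```

An experiment that makes the inventory dependency unresponsive is then a direct check that the retry stays bounded and the fallback actually triggers.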
Charity: I associate this sort of thing with the sort of thing you implement using tests. That's how I've always seen it done.
Benjamin: It's very close to tests, but it's a little bit more than just testing. Nowadays testing is more like happy path testing: "Okay, everything will work. Okay, yeah. Nice."
But here you are injecting some latency into a remote call, or you are injecting some exceptions, or requests where the response doesn't come back as designed.
Charity: So basically this is a way to take some of the complexity out of testing and put it into more of a manual testing?
Benjamin: Not manual, no.
Jessica: It's not exactly production, but you're simulating production and so you're able to do testing at a broader scale, you can include performance in that?
Benjamin: Yes. For example, one customer is doing it like this: they are using Steadybit once QA is deployed.
Or maybe let's pick up another stage, let's get started with the developer stage. Something is pushed into Git and now the system is deployed, but after the deployment Steadybit is checking, "Okay, is the system able to survive an outage of some specific container, some specific services? Is this system still working?"
So Steadybit is triggering your load test, and during the execution of the load test we are injecting, let's call it latency, we are injecting-- Some parts are now dying, and we are shutting down a specific Kubernetes node.
Is the system still able to survive? Is the system still responding? And can the load test be executed? And are the results okay? That's something you can do.
Charity: Can you do this in an automated way after every run of your test suite, or whatever?
Benjamin: Exactly, exactly. You can integrate it into CI/CD pipelines.
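To make that concrete, here is a sketch of what such an automated "chaos gate" in a pipeline could look like: inject a failure, drive load, and fail the build if the service does not hold its SLO. startExperiment(), runLoadTest(), and the thresholds are hypothetical stand-ins for whatever experiment runner and load tool you use; this is not the Steadybit API.

```java
// A sketch of a pipeline step that only passes if the system survives an injected failure.
public class ChaosGate {

    public static void main(String[] args) {
        // 1. Inject the failure, e.g. add latency between checkout and inventory.
        Experiment experiment = startExperiment("latency: checkout -> inventory, 2s");
        try {
            // 2. Drive realistic traffic while the failure is active.
            LoadTestResult result = runLoadTest("checkout-flow", 120);

            // 3. Fail the pipeline if the system did not survive the experiment.
            if (result.errorRate() > 0.01 || result.p99LatencyMillis() > 1500) {
                System.err.println("System did not survive the experiment: " + result);
                System.exit(1);
            }
        } finally {
            // 4. Always roll the injected failure back, even if the load test blows up.
            experiment.stop();
        }
    }

    // --- hypothetical stand-ins below ---
    interface Experiment { void stop(); }
    record LoadTestResult(double errorRate, long p99LatencyMillis) {}

    static Experiment startExperiment(String description) {
        // call your experiment runner here
        return () -> { /* roll back the injected failure */ };
    }

    static LoadTestResult runLoadTest(String scenario, int durationSeconds) {
        // call your load testing tool here and collect its results
        return new LoadTestResult(0.0, 100);
    }
}
```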
Charity: So it's funny because I don't associate this with chaos engineering at all. It's more in the heritage of, well, unit testing, integration testing and everything.
To me, chaos engineering means in production and it means in a more-- I know manual is kind of a bad word in the industry, but to me it's a balance between this is a really disruptive thing, and we're doing it in production because it's the only way that we can find these learnings.
And we do it during the daytime because people are around and they can see if anything goes wrong, and that sort of thing. It sounds to me like this is a blurring of those lines.
Benjamin: It's the foundation. It's just where-
Charity: Is this the buzzwordification of integration testing?
Benjamin: That was the starting point for chaos engineering, of course.
Production is the place where you need to train if you would like to improve production, because your customers are in production. That's a very special place to be.
Jessica: This is like integration tests, but the not happy path. You have to have integration tests and load tests set up to be automated.
Benjamin: Yeah. You need traffic in the system, yeah. You need some traffic in the system otherwise you are not able to learn.
Charity: Even if I ran this, I would still be only like 10% confident that it would actually behave the same way in production.
Jessica: Sure, but it was worse before.
Benjamin: Yeah, exactly.
Jessica: The first time you run the chaos experiment that is inventory service becomes unreachable, you find out that the checkout service definitely retried infinitely until it crashed. So you'll find some things, yeah, like you said. Maybe there's a 10% better chance.
Benjamin: You can train yourself up and you can get better. But production is like the final boss; you should do it later on in production, of course, yeah.
But don't get started, because if you have never done it before you will fail and you are not allowed to do it any more.
Jessica: I imagine that when Steadybit suggests experiments in a system that people aren't 100% familiar with because maybe they haven't been on this team for 20 years and the system has evolved during that time. I imagine that teaches people about their own systems.
Benjamin: Yes. They can learn something, they can prove themselves, they are able to skill themselves up.
I'm a developer by heart and I would like to be better every day. I would like to get up and bring something into production as fast as possible, that's something which is driving me because I would like to deliver value for my users.
If they are not able to use my product, if they are not able to see how my product, my application is working because of an outage, that's a bad experience and I would like to be aware of any other kind of outages.
Jessica: This sounds like a good time to introduce yourself.
Benjamin: Yeah, I am Benjamin Wilms, one of the founders of Steadybit. In total we are three founders.
With my colleagues Dennis and Johannes, we founded Steadybit in 2019. It's a very nice place to be, it's a freaking rollercoaster, it's my first startup so everything is new for me.
I started with chaos engineering six or seven years ago as a consultant in a company called Codecentric, and from there it evolved into a company called Steadybit. Anything else you would like to know about me?
Charity: There's this note in our show notes here, or our agenda, that says, "Are we hunted by our own systems?" What does that mean?
Benjamin: It's that moment when you are called by one of your colleagues, "I'm sitting in front of a very complex incident, but I'm not able to get production up and running again. So I am under pressure, I am kicked by production and we need to fix it right now."
But you are in a very stressed environment, you are not able to get production up and running in the long run.
It's more like a quick fix, you are just fixing this issue, next issue is coming up but I'm still under pressure, under stress.
How can I get production up and running? That's behind this sentence.
Jessica: Okay. So observability will help you figure out that what's going on in inventory service is not responding to checkout service, and then with a Steadybit experiment you could be like, "Okay, we're going to set this up in test, we're going to duplicate this condition, we're going to see it fail.
Then we're going to fix it, then we're going to know it's fixed, then we're going to set up experiments for every other dependency"?
Benjamin: Exactly. And also you can replay it any time you would like.
Jessica: Ooh, and then you could run Honeycomb on your load test, or send events from load test to Honeycomb so that you can see, "Yes, this was the same thing."
Then you make the changes, and then you run the test again and you look at the traces and you say, "Ha, this is better. It did not fail."
And then, what's more, you can actually demo that to the business and be like, "See, we did some work this sprint. Look how different this is."
Benjamin: Yeah, perfect. And maybe we can use the data which is collected by Honeycomb to replay this specific situation with Steadybit.
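One lightweight way to make that kind of replay and comparison possible is to tag the spans emitted during a chaos run so you can filter them later in Honeycomb (or any OpenTelemetry backend). Here is a minimal OpenTelemetry sketch; the attribute names and the experiment id are illustrative assumptions, not a Steadybit or Honeycomb convention.

```java
// A minimal sketch of marking traffic generated while a chaos experiment is running,
// so a query grouped by chaos.experiment_id separates chaos traffic from normal traffic.
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class CheckoutHandler {

    private static final Tracer TRACER = GlobalOpenTelemetry.getTracer("checkout-service");

    public void handleCheckout(String cartId, String experimentId) {
        Span span = TRACER.spanBuilder("checkout").startSpan();
        try (Scope ignored = span.makeCurrent()) {
            if (experimentId != null) {
                // Hypothetical attribute marking requests served during an experiment.
                span.setAttribute("chaos.experiment_id", experimentId);
            }
            span.setAttribute("cart.id", cartId);
            // ... call inventory, payment, and so on ...
        } finally {
            span.end();
        }
    }
}
```

With that attribute in place, "before the fix" and "after the fix" runs of the same experiment become two filterable sets of traces rather than a guess.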
Charity: I've always felt like chaos engineering and observability are kind of like peanut butter and jelly, in much the same way that if you don't have your shit together on a basic level, you probably shouldn't be injecting more chaos in the system.
If you can't see what the fuck you're doing, you probably also have no business injecting chaos into your system. And here is a place where aggregates are not good enough.
If all you have is monitoring graphs, if all you have is these aggregate graphs that tell you you have some errors, but you can't actually see which requests are erroring and why, or what they were timing out talking to, et cetera, then you're always going to be guessing what impact your chaos engineering experiment had on the service.
And that, to me, is just a scary place to be in. I've heard of people who ran chaos engineering experiments in their system and then two or three weeks later realized they had some lingering artifact of that experiment still causing chaos in their system.
Like some version of a build was out there in the wild that wasn't where it was supposed to be, and returning bad requests and corrupting their data and shit.
We're joking about this, but I really do think that if you're going to be doing chaos engineering you owe it to yourself to have the kind of tooling where you can inspect it at the raw request level, where you can say these errors are outliers because of X, Y, Z and you can trace it back to the experiment that you did.
Benjamin: A nice story from an early customer: he had started with chaos engineering, he had picked up Steadybit and he told us, "Okay, our monitoring is based on open source stuff, everything is in place, we are ready to go."
So they have created the first experiment suites, they have executed 10 experiments, and after a week he reached out to me again, "Benjamin, we need to postpone our Steadybit engagement because our monitoring is not working."
"It took about five to ten minutes before we could see any kind of errors. We applied the attacks, but we were not able to see the impact." So they were quite happy to see it that early, before production.
So yeah, now they have picked up some paid versions.
Charity: The latency between when something happens and when you're able to see it in your instrumentation is a real thing. It's a thing that we never think about, because with Honeycomb it's like it's a matter of a couple of seconds.
We get alerted if it's more than 10 seconds, it's instant. As soon as you've done the thing, you can query the thing, and I keep forgetting that that's not true for most tools.
If what you're using is metrics and aggregates, in fact there's a window of 15 seconds to a minute or whatever where it aggregates before it even sends it.
I remember using fucking Datadog and it would be like five, six minutes before the metrics even showed up, and I was just like, "I'm in the middle of doing shit here. I can't just sit here and wait for five or six minutes after every single thing I typed to see if it's fixed it."
Jessica: Yeah. And then you have the balancing act of, "Oh, do we react quickly when these numbers drop off, or do we wait because maybe it's just that they haven't arrived yet?"
And you have a trade off between reacting or waiting for your aggregates to start doing what they're supposed to do.
Charity: Right. Have we fixed it or not? Are we going to make it worse or better? We don't know, we're flying blind.
Jessica: Oh, the alarm went off. I logged in, it's fine now.
Charity: Exactly.
Benjamin: Absolutely.
Charity: Once you've experienced real time observability, it's really hard to go back. You just realize how blind you've been flying all along.
Benjamin: Yeah. Like you mentioned, in those five minutes, how many customers are affected before you can see it?
So, "Wow. Okay, why am I losing so many customers? Why are they talking so badly about me on social media? Hmm, I don't know. Okay, now you should investigate."
Charity: Right. How healthy do you consider the tech industry to be?
Benjamin: That's a hard and tough question. It's sometimes quite hard.
It depends on your organization, it depends on your culture and how you can work together. If you are in an organization where there is so much pressure because we need to get out as fast as possible with this new feature, but the team is telling you, "That's not stable, we are not able, we are not quite confident. Can we postpone this a little bit?"
But as I mentioned, it's pushing you, it's a very unhealthy situation.
If people are coming into the company and there's no time to get familiar with the tech stack, with the culture, with the tiny, little details how you can get your things up and running in production, that's not a nice place to be.
Charity: There's a lot of pathologies, which is why I'm curious.
This is such a broad question, I'm smiling at it because there are so many little microclimates in tech. Some of them it's like, "It's great, it's amazing. We get paid a lot of money, there's some annoying shit once in a while but it's more or less fine, we feel more or less empowered."
There are some places where people are furiously miserable and burning out, and there are also some questions here about changing those people or the tech stack.
When you came up with this question, what did you want us to ask it for? What are you driving towards?
Benjamin: Yeah. Something behind this question for me is: am I allowed to fail, and how will my team react if I have injected a failure because I pushed something bad into production and now the system's not running?
I am allowed to fail. Is it part of our culture as a team? Is there something we can learn from it? Or is it something where I get fired because I have done something wrong? So if you are able to fail, if you are allowed and you are covered by your team, by your organization, by your management, it's a nice place to be and then we are healthy.
Charity: It is interesting how I think founders-- This might just be pop psychology, armchair psychologizing here, the fact that your mind goes immediately to, "Can we fail in a safe way?"
If somebody asked me about healthy tech culture I'd be like, "Well, is it transparent?"
And it's funny how it emanates from our definition of what a healthy culture looks like, and then we created companies out of those things, right?
It's like observability, can you see what the fuck's going on? Do you have access to information?
And for you, it seems to be very much about can you fail safely.
Jessica: There's a pattern here, there's an inflection point between two different self-reinforcing feedback loops.
If you get a lot of pressure, that leads in various ways to problems: through the unfamiliarity, through the rush.
So pressure leads to more problems, which leads to more pressure, which leads to more problems.
Or else you can be in the reinforcing feedback loop of learning, space and safety for learning leads to more success, which leads to more space and safety for learning.
You're in one or the other, the middle is not stable.
Benjamin: Yeah. Also something which is quite important for me at Steadybit was that we can build a culture where people can learn from each other.
If there is a failure, okay, we need to fix it, but it's something we can have fun about.
We, as a team, can grow from it, without finger-pointing, "You have done something wrong."
Charity: Ops has always had the best humor, I think, in terms of tech culture, just because of the gallows humor and the fact that we know stability is ephemeral and probably false, and everything's failing all the time, and so if we can't laugh about it, where in the hell are we?
Benjamin: Yeah, exactly. You will not survive if you are not able to laugh about it, yeah.
Charity: Exactly. Something else I've been thinking about recently, though, and I wrote that long post about it:
How do you know if the company you are interviewing at is a hellhole on the inside or not?
Benjamin: Yeah, nice title.
Charity: It's just the fact that good intentions only take you so far. We live in this capitalist world, it's very eat or be eaten, et cetera.
Something I think that people don't pay enough attention to when they're looking for jobs is the business succeeding or not, at a fundamental level do people want what they're selling? How are the business fundamentals? Are they growing?
Because even if you have founders and a leadership team that has the best of intentions and really wants to do the best, the right things, if things aren't going well they're going to have some really tough decisions to make that are not going to make people happy.
Back when Honeycomb was in this position, back in 2017, we had to do layoffs and we ended up having to lay off pretty much most of the engineers who weren't straight, white guys.
It's what we had to do to survive because these were the senior people, we could only keep like nine people and we were just trying to get to the next position.
I am exaggerating slightly. I felt terrible about that, and I can only imagine being on the other end of that and being like, "Well, she doesn't have these ideals, she doesn't live up to them, all this stuff."
Conversely, versus right now, I don't know if our listeners all know this but we recently put an employee on our board as a voting member.
We're able to do that because we're in a very different position now, we have a position of strength with regards to our investors.
Multiple people were trying to give us money, and when you're succeeding as a business you really get to steer your own ship.
You get a blank check for any kind of radical social experiment, whether it's a libertarian social experiment like Netflix has done, or whether it's kind of a socialist-commie experiment like Honeycomb likes to do.
If you're succeeding, then your investors, they don't want to mess with what's working.
They're like, "Okay, you know what to do, it's working. We're going to be hands off." You get to be masters of your own destiny, not completely, but in a much more meaningful way. If it's not working, like if business is not healthy and not growing, it's not in their hands.
There are so many people who are going to have control over their destiny, and many of which you'll never know their names or see their faces, right?
Jessica: The capitalist system pushes you into that scarcity.
Charity: Well, yeah. Because it is literally live or die, right?
And ideals are wonderful, and we should have them, and there are some ideals which you should never sacrifice just to stay alive and stay safe, but there are some ideals which-- We live in a compromised world, we are compromised beings.
There are some idealists who would say that this is false, but I believe that if you want to enact change you have to work through imperfect organizations and imperfect vehicles.
Jessica: Sometimes you have to settle for chaos engineering in your test environment.
Charity: There you go.
Benjamin: But you need to stay authentic, you need to stay transparent to your organization, to your people.
Charity: I think that's the best you can really aim for, is always being honest. "Hey, this isn't who we want to be, but this is who we have to be today," and just apologize to people instead of acting like you're not.
Benjamin: Exactly. There needs to be a place where people can understand how you ended up this way, and not some other way.
Jessica: Yeah, that's true.
Charity: Exactly. In my experience, people are remarkably understanding if you're honest with them and upfront.
And honestly, they shouldn't be because that's an extremely low bar, but most people are not accustomed to getting honesty from their leaders.
And so they're generally pretty understanding when you're just forthright with them.
Benjamin: A hard way to stay in balance, yeah.
Charity: Yeah, it is.
Jessica: Yeah. Speaking of balance, there is this balance between safety and reducing tech debt and keeping your system super reliable and spending time fixing all of the possible things that you can figure out could possibly go wrong, versus delivering features that you can sell. How do you find that?
Benjamin: Developers are measured on speed, they would like to get new features fast and up and running, they would like to get a new library up and running because maybe I can be faster.
I am optimizing my outcome, my speed, and in my local environment everything is running like it should be. But in a distributed system, everything is coming together into one big system, and components are meeting each other for the first time.
Wow, okay. How should we care about reliability, and how should we be aware of cascading errors?
That's something where we need to get in balance, so not every feature needs to get in production as fast as possible. Maybe it's also a good time to improve the system on reliability.
Charity: I think of it like, I don't know if you ever watched car racing where they're going round and around and around the track, but they have to stop every few laps, switch out their tires, and do the maintenance or they're just going to explode, right? Yeah, in terms of software, speed is safety.
The more quickly that you can be shipping smaller changes, that's what gives you a rhythm, it keeps you safe. The faster you can go, the better it is.
Jessica: Yeah. What if you could just drive through the pit every lap, and instead of taking your tires off they just sprayed some extra rubber on?
Benjamin: Yes, nice picture, nice picture.
Jessica: You have to stop and refactor every now and then. You have to make investments in your infrastructure in order to be able to keep going fast.
Benjamin: Or maybe like the technical debts, okay, they are still there. Yeah, we can fix them later, we can fix them later. And the list is growing and growing and growing, but nobody cares.
Jessica: One thing I love about Honeycomb is that as a developer I suddenly have a way of noticing when I have delivered value, of proving to myself by looking at the data of how many people are using this thing, and what's more, exactly who is using this thing. Individual customers, I can find the people who are actually getting value from what I do, and that turns out to be so much more meaningful than beautiful code.
Benjamin: Yes. You need to identify with your customers, you need to understand your customers, their problems, because they are telling you, "Okay, that's a problem. Please fix it for me. But I'm not an expert, you can fix it."
Jessica: Yeah. But they never tell you that in person, mostly because your organization won't let you talk to the customer support people, but observability can.
Your system can tell you; the system notices, "Well, I have this request that's five seconds old."
Charity: This is the beautiful thing about observability, is that it really aligns engineers with their users. It's not just about how is the system performing in general? It's about what is the experience of every single, individual user? You can put yourself in their shoes and you can retrace it, you can see what they're seeing and it has a way of aligning you with them and seeing their pain and identifying with it in a way that I think has been very abstract to us in the past.
Jessica: Yeah, and seeing their pain and noticing that it's because the inventory service went down, and then you can say, "Look, fixing this connection between checkout and inventory is real value that we can provide to customers."
And when you have something like Steadybit, you don't just write one test that makes the inventory service go down because that just gets you the one thing.
That doesn't work in distributed systems, where that's not going to be the problem tomorrow. It's going to be something else.
You need to be able to set up those experiments for a whole category of problems.
Benjamin: And also you need to find a way how you could share this specific part, or maybe this specific experiment, with other organizations, with other teammates, with other people that are also sitting in front of a very distributed system.
Still, every system is a tiny snowflake and every system is reacting under conditions a little bit different.
But there are some parts where people can work together and share the experience, like, "Okay, if you are doing Kubernetes with this configuration, it's not a good way. Here's an experiment, you can verify how your Kubernetes is reacting."
Jessica: Nice. Not, "Try to figure out if you have this particular setting anywhere in your 60 deployed applications." But, "Run this experiment and see if this is a problem for you"?
Benjamin: Exactly.
Charity: Right. I was going to say that it's always so hard to figure out when to stop and switch the wheels, when to pause from shipping features and do the refactor, or whether to do it along the way, right?
Making everything else take twice as long, but then you amortize it over a longer period of time. Something that I've become very attuned to is what is our horizon? How far out are we planning?
Because for the first couple of years of the company our horizon was somewhere between a month and six months.
We knew how long until we had to raise money again, there were times when it's like, "Okay, if this can't pay off in the next three months, we can't do it. We can't spend any time doing it because we'll be dead in three months if we..."
So everything had to have a quick turnaround: if it could pay off within a week, great, let's do it. But we were able to spend no time investing in our long-term viability.
Then after we'd raise money we'd be like, "Okay, now we're planning for the next 18 months or the next year or whatever."
Knowing the horizon on which you're expected to make your system succeed is a really big ingredient in deciding how you should be spending your time.
Benjamin: Yeah, very good.
Charity: How does your product organization work? Do you have a product manager? Honeycomb didn't have a product manager for the first four years.
Benjamin: That's a very nice topic. We can only win on product, so the product needs to be loved by our customers.
From day one, that's been the biggest challenge. We have a dedicated product team and this means, okay, there is a product manager, he's the expert for problems, he's searching for problems the whole day and he's picking up interviews with customers, "Okay, there is something. We can work on it."
And then we have a UX designer and she needs to understand, "Okay, there's a problem identified by our product manager. How we can solve it? I will talk to the engineering team."
And it's so important that the engineering team and the design team work so closely together, because if you are building something upfront in your product team, maybe just two or three heads, and you are throwing it over the wall to the engineering team, "Please build this." "No, that's not working."
Engineers fight against that solution because they would like to be part of this process, they would like to jump into calls, into interviews, they would like to understand the customer.
Charity: How large is your company, by the way, Benjamin?
Benjamin: Actually, as of yesterday, we are 12 people in total. 12.
Charity: 12 people. How many engineers? How many product people?
Benjamin: Two product people and six engineers.
Charity: Any designers?
Benjamin: One.
Charity: We practice what's called, I guess, the Triad Model here at Honeycomb, where you've got a product manager, an engineer and a designer who do the product planning process together. Is that what you guys practice too?
Benjamin: If I am talking about the product team, there are multiple roles involved.
They are doing research, we are also doing some guessing, we have some ideas we would like to get out into the heads of customers to then do an interview afterwards to get feedback immediately. That's where we're actually working at this moment.
Charity: How do you decide what work gets done?
Benjamin: It's done by, okay, there is a core vision, we would like to work on-- That's the core vision of Steadybit, and there are some goals, and the goals are picked up in the founder team.
They are challenged against our engineering team, against our core team, and then we find a way how we can achieve these goals, and we are doing it for, let's say, "In the next three months, in the next six months we would like to achieve these four goals. How can we do it?"
And every goal needs to be measurable at any point in time. It's not like after two months we will stop and analyze whether we are still on the right track. No, every goal needs to be measurable at any point in time.
The solution is owned by the team, so they need to understand the problems, they need to understand the goals and the vision behind them, and then they can challenge every solution idea against these goals. Is it paying in? Or is it something maybe for later, which we put on the desk?
Charity: What do you see as being the biggest challenge of the next two years for your company?
Benjamin: The biggest challenge is happy customers and getting more customers into the product, but without becoming unable to deliver.
For example, if you are doubling the team, there is so much going on inside of your organization, you need to take care of your organization, that they are still able to work, that they are still able to deliver and they are still able to-
Charity: I mean from a product perspective. What is the biggest thing you're trying to build or accomplish?
Benjamin: In the next months we will open the product, so actually at this point in time we are working with design partners and now we are opening the product more and more.
We started the early access program yesterday so people can get started with Steadybit without having to go through any sales process.
Charity: That's exciting.
Benjamin: Yeah.
Because I'm a developer by heart, and something I don't really like is to talk to salespeople before I can try out the product. No, please, come on. I would like to get my hands into the product, I would like to get started as fast as possible, and if I am loving your product I will contact you anyway, but give me something first.
And that's something we are preparing for the next month so that people can get started with Steadybit as easy as possible.
Charity: Cool. Congratulations. It's been really nice having you.
Benjamin: Thanks a lot. Thanks a lot for having me.