The Kubelist Podcast
50 MIN

Ep. #45, Live from KubeCon 2024

GuestsOle Lensmar, Sándor Guba, Andreea Munteanu, Terence Lee, Jesse Brown, Mauro Morales, Paige Patton, Avi Press
about the episode

In this special episode of The Kubelist Podcast, recorded live at KubeCon 2024 in Salt Lake City, hosts Marc Campbell and Benjie De Groot bring you seven engaging interviews with developers and contributors from the Kubernetes ecosystem. From CNCF projects like Testkube, Logging Operator, and Kubeflow to innovative tools like Buildpacks, Kairos, Krkn, and Scarf, this episode dives into the latest updates and insights.

transcript

Marc Campbell: So, Benjie and I just spent the week at KubeCon in Salt Lake City.

It was really fun and a busy few days, but we brought our microphones and managed to sit down with some of the CNCF projects, and a few other companies that have really cool open source projects, and talk to them. We got a bunch of great updates and learned about some projects we hadn't really heard of yet.

Benjie De Groot: Yeah, it was great. I had a really nice time. My one critique is I wish it was like two weeks later 'cause they got a bunch of snow and I would've loved to have an excuse to go up to the mountains, but it was a really good time.

It was really well organized, I would say. They were really checking tickets, too. This is the first year that they checked my badge at everything. It was just really well organized.

Unfortunately, the food was, I don't know how it's always that way, but the food is the food at KubeCon. There was a great cafe across the street. I'm forgetting the name of it, but I really enjoyed that cafe. They did a great job. I ate there most mornings and afternoons.

For me, I think one of the highlights was Scott and Kelsey. So Scott from Docker and Kelsey Hightower sat down, I think it was the first day, and they did a session at, like, the platform con thing, and they just talked about the history of Docker and kind of the ecosystem, and that was really insightful, I think, some gems in there.

Speaking of that, I ran into and spent some quality time with Solomon, he's a fixture at KubeCon. Got some more crazy demos from the Dagger folks. Maybe we'll do an update with them in 2025. But Dagger's always really cool.

Spent some time at the Replicated booth. I really liked the key chain this year that you guys did, Marc. It was very subtle.

Marc: It was the Wasatch Mountains. Yeah.

Benjie: Yes, it was the mountains. It was a lovely key chain. There was some good swag there. What was my best swag, oh, I will say this, special call-out to Microsoft.

They had a fidget spinner with like the little dots that you push in and out. And I gave that to my nephew and it's like apparently his favorite toy he's ever had. But I mean, overall, I had a really nice time.

I would say that, you know, there were some friends that were missing, but it does feel like KubeCon's back. It feels like all the people that I want to see are there. Yeah. What did you think, Marc?

Marc: Yeah, it was really good. I think, you know, and a lot of the time that we spent, you know, with microphones and kind of going around, we were in the little CNCF project pavilion talking to some sandbox incubating and graduated projects.

The ones that we have on this episode aren't projects that we've actually had a full episode on before. Some of them we probably should dive into and like spend some time and have bigger conversations.

We also did catch up with some of the graduated projects like Helm and Argo, and we're planning to get some time on the calendar here really soon to actually talk about some of the work that's happening, because there was just too much going on in some of those to have a quick 10-minute conversation on the show floor for a catch-up.

Benjie: Yeah, no, I think that we planned out some exciting episodes for 2025, made some good connections and excited to get some more people on.

If I had to say my highlight, I mean honestly the highlight is walking around the floor with Marc and, you know, shooting the stuff, signing up for giveaways that I never win.

Marc, have you ever won any of those? You don't even sign up. It's just me, right? Marc doesn't have that disappointment that I do. Oh, the cookies, there were fresh cookies.

Benjie: Portworx?

Marc: Portworx.

Benjie: Portworx was giving out fresh cookies. So special call-out to them because I definitely gained about 10 pounds from those cookies.

Yeah, I mean I think that the interviews will speak for themselves. I definitely learned a good amount. Some of these projects were really cool. I really like Krkn. That one was really cool. I mean, Buildpacks, Kairos.

All of them were great and I definitely learned about a lot of them. So I'm excited to share this episode and get some feedback on stuff. I will say this, for our legions of fans, our legions and legions of fans, feel free to reach out if you have any projects that you want us to talk to in London.

Marc's going to be there, I think I'm going to be there. Marc's definitely going to be there. It's a very free-flow type of thing when we go to KubeCon where we just go around and we talk to all kinds of different teams and even companies.

So if there's anyone you think we should talk to or anything we should explore or dig into for 2025, please feel free to reach out.

Marc: Yeah, and I think on this one, this episode here, we have about seven different projects that we talk to. And actually five of them are CNCF projects if I'm not mistaken.

And a couple are companies, so it's open source, or at least open core, some variation of open source, and then they're building a commercial entity around it.

But like they're still really cool to talk to and understand like how you can use them and how they're contributing back to open source.

So these seven here, we're just going to string 'em together. They're all about 10 minutes or so long each. So this will be one episode and like Benjie said, we're looking forward to seeing everyone at the beginning of April in London.

Benjie: Yeah, I'm excited. Oh, by the way, Marc, I think that was the official first time a podcast did not talk about AI in 2024. So I just had to drop that in there just to say the word.

Marc: Well now we did.

Benjie: Now I just wanted to say AI for everyone, just so everyone thinks, just to prove that it is 2024 that we went to KubeCon but no, this was about DevOps. This was about SRE and platform.

Marc: And Benjie, do you want to talk about eBPF for a minute, too?

Benjie: eBPF is still hot, sort of, and "Kube Cuddle," obviously. Alright guys, have a great holiday season and we look forward to catching you all in 2025.

Ole Lensmar Discusses Testkube

Cool, sitting here with-

Ole Lensmar: Ole Lensmar.

Benjie: Ole Lensmar from Testkube.

Ole: Yes.

Benjie: And what is Testkube?

Ole: Testkube is a test orchestration execution framework for Kubernetes. So Testkube takes all the tests that you have today, your functional tests, your load tests, your unit tests, your acceptance tests, and runs them inside your Kubernetes clusters.

And then it integrates with your CI/CD tool. So you could trigger those tests from CI on a schedule from Kubernetes events from Argo CD, et cetera.

Marc: So what would be the advantage of running them in my Kubernetes cluster versus, like, letting a GitHub Actions runner actually run all the tests?

Ole: So there's a couple aspects to that. So the advantage of running them in your cluster could be security related. You don't have to open up your cluster for a testing tool running outside your cluster.

You might want to run tests inside your cluster. You might want to run infrastructure tests that kind of test low-level infrastructure provisioning, you know, make sure your GPUs or whatever else are available. So that could be one advantage.

The bigger advantage we see with our customers is that they use multiple CI/CD tools. So they use GitHub Actions, they have a legacy Jenkins, and then maybe they're going to Argo CD and GitOps, and today they have to manage the scripting for the execution of their tests in all of those different CI/CD tools, and it just adds a lot of overhead to their ops team.

Marc: So the way that I would do it is, if I was using GitHub Actions or CircleCI, whatever the CI tool is, that's just the orchestration, but the actual logic to execute the tests, I'm writing in Testkube.

Ole: Exactly. So we've built our own kind of workflow engine that's purpose-built for test execution that would then run your tests inside your clusters.

And those test execution definitions are stored as CRDs in your cluster, so you can use GitOps approaches to sync them from your Git repo, et cetera. But to your point, you then trigger those from your CI/CD pipelines, and we have integrations with all the big CI/CD tools, and there's a CLI and all that.
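To make the "tests as CRDs" idea concrete, here is a rough, illustrative sketch of what such a custom resource could look like. The apiVersion, kind, and field names below are assumptions based on this description, not taken from the conversation, so check the Testkube docs for the real schema:

```yaml
# Hypothetical sketch of a Testkube test custom resource.
# Field names and apiVersion are illustrative only.
apiVersion: tests.testkube.io/v3
kind: Test
metadata:
  name: checkout-api-smoke
  namespace: testkube
spec:
  type: postman/collection        # the testing tool this test wraps
  content:
    type: git-file                # synced from Git, GitOps-style
    repository:
      type: git
      uri: https://github.com/example-org/checkout.git
      branch: main
      path: tests/smoke.postman_collection.json
```

Because the definition lives in the cluster as a resource, a GitOps tool like Argo CD can sync it like any other manifest, and the CI pipeline only has to trigger it rather than carry the test-execution scripting itself.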

The next big advantage then is that since all your tests are being run from one place, you have one place to troubleshoot your failed tests, to get reports over time for your test executions, to look at artifacts generated by your tests.

And this is where we've kind of purpose-built functionality to aggregate all the data that we collect. So instead of your QA team having to go to Jenkins or to GitHub Actions or wherever else to look at logs or try to find the artifacts or whatever else they do, they can just go to the Testkube dashboard and they have everything in one place.

And then of course we do test reporting, which none of the CI/CD tools really do over time, right? They're very point-in-time. They give you the JUnit report for a specific test, but they won't show you, like, I want to see pass-fail ratios for all my tests for this team or for that component, et cetera, in one place.

So we're basically replacing test execution in your CI/CD. We're not replacing CI/CD, we love CI/CD, and we want you to continue using Jenkins, GitHub Actions, Argo. But we think there's a big advantage to moving the execution of your tests into Testkube, which will then run them inside your clusters.

Benjie: Right, and there's probably like a potential money savings too if I'm not paying for some massive GitHub action instance or Circle CI, whatever. So now this is an open source project, is that correct?

Ole: So there's an open source part; the engine that runs our tests is open source, and then we have a control plane on top of that if you want to manage multiple engines, which is commercial. And that control plane can either be run in the cloud in our SaaS offering, or you can host it yourself on-prem.

Marc: But if I just had like one cluster that I wanted to run, I could just use the open source?

Ole: You can just use the open source. It won't have the nice dashboard but it has a CLI and REST API and there's plenty of examples on our website how to use that.

Benjie: Super cool. What is your website?

Ole: Testkube.io.

Benjie: Okay, cool. And you're not CNCF right now?

Ole: So we are a member of the CNCF, and Testkube is on the CNCF landscape, in the testing category. But we have not donated it, it's not a sandbox project as of yet.

Benjie: And then what license is it?

Ole: It's an MIT-licensed product.

Benjie: Super cool. What's the next six months to a year? Like, what's the quick roadmap in 30 seconds? Anything big coming out?

Ole: A couple of things. We are getting a lot of adoption in big enterprise customers who have the problems I mentioned earlier, multiple CI/CD systems, very kind of fragmented infrastructure for running tests, and they need one place for their QA teams to look at, you know, to get an overview of what tests are we running, what are the results, et cetera.

So a lot of functionality in that direction. Integrating with test management tools, with defect tracking, you know, JIRA, that kind of stuff. We're also looking, like everyone else, leveraging AI.

In our case it's about helping you troubleshoot failed tests, so Testkube is in a unique position to collect logs and metrics and any kind of metadata from your system under test.

Right, so not just from the testing tool but also from the microservices you might be testing because Testkube knows when your test is running.

Benjie: Yeah, so it seems to me that the big advantage of Testkube is that especially for like end-to-end tests and from a security perspective I can run against like basically behind pretty secure walls.

I can test whatever I want. So that's a big advantage. There's a cost savings advantage. So, like, you're not doing unit tests probably, maybe you are, but it is more of a functional, integration, and end-to-end test type of thing.

And so you kind of alleviate a lot of the overhead of getting to these specific environments or orchestrating these environments in the particular CI, yeah.

Ole: And then really make it easier to troubleshoot and report on your tests 'cause we have all the results in one place. Right? So.

But just going back to the unit test, a lot of people use unit testing frameworks to do integration tests or end-to-end tests.

So Pytest or JUnit, we see a lot where people use that as a framework for doing, like, integration tests. So we run those tests as well, of course.

Marc: I want to talk a little bit more about troubleshooting tests and maybe like this is actually worth like a whole deeper episode on the podcast at some point. But like a problem I've been dealing with a lot recently is troubleshooting like a failed test running in a GitHub action.

I've got to think, like, the architecture that you have, where the test is running in a Kubernetes cluster I have, gives me more ability to leave artifacts behind that I can, like, go poke at, debug, like take manual steps.

Ole: So exactly. So we can, automatically today, what happens under the hood is that our engine creates a Kubernetes job which then runs the pods that run the actual tests. So we of course collect all the logs and everything from there.

You can also then set up our workflows to aggregate logs from your system under test, right? Because even if you see that your API test failed, you might not know which of your microservices was actually the source of that failure.

And you're going to start digging through the logs of each of those to find a stack trace or something like that. And Testkube can, once again, aggregate all of that and put it into one view for the developer or QA person that has to figure out why the test failed.

Marc: And it sounds like that's the real value, right? Because GitHub has like, so I know there's a lot of different CI providers out there and that's a lot of value which you're talking about.

But GitHub has an actions runner controller that lets me run that in my own Kubernetes infrastructure. But I don't get, like, any of the additional troubleshooting steps that you just mentioned.

It's just where the test runs. Testkube isn't just that, Testkube is that plus the ability to debug, the ability to troubleshoot.

Ole: Exactly, so it's really purpose-built for testing, right? So at a higher level you could say, sure, GitHub Actions is a workflow engine, and so is Tekton, and so is Argo Workflows. But they're not purpose-built for testing.

They're generic and they're great at what they do, but we've really tried to focus on the jobs to be done and the value adds for testing, or testers, right, and the workflows and jobs that they have to do.

Benjie: I think that's great. I'm going to check you out a bunch and like Marc said, maybe we're going to have to do a deeper episode on this one.

Ole: I'd love that.

Benjie: Thanks for the time, Ole.

Ole: Oh, thank you for having me.

Sandor Guba Discusses Logging Operator

Benjie: Cool, we're here with Sandor Guba from Axoflow, but we're here to talk about the Logging Operator. Sandor, tell us about the logging operator. What is it?

Sandor Guba: Yeah, so in a few words, Logging Operator helps you automate log collection in Kubernetes. So instead of manually configuring the daemonsets or statefulsets that have to grab the logs from the different nodes, from the different pods, it uses custom resources, and it very much simplifies how you collect those.

Marc: So I just deploy this as an agent into the cluster and how do I identify what logs and how does that work?

Sandor: So it's an operator. And there are different custom resources, and you can define in the custom resources which logs you want to collect and where you want to transport them.

So, for example, you decide that I want the logs from the pods that have the label app: nginx, then it'll grab all the Nginx logs and transport them to Loki, Elasticsearch, whatever you configure as the destination.

Marc: Does it do anything with the logs in the middle or does it just grab them and stream them over?

Sandor: It can. So you can do transformations in these log flows, we call them flows. And the good thing is that because you know which logs you grabbed, you can, you know, apply: it's an Nginx log, I will apply the Nginx transformation. So it makes it much easier to deal with the soup of logs that's coming from Kubernetes.
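As a rough sketch of the Nginx example Sandor gives, a flow that selects labeled pods, applies a parser, and routes to a destination might be declared like this. The exact apiVersion and field names are best-effort assumptions and may differ between operator versions, so treat this as illustrative:

```yaml
# Hedged sketch of a Logging Operator flow: grab logs from pods
# labeled app=nginx, apply Nginx parsing, and ship them to Loki.
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: nginx-logs
spec:
  match:
    - select:
        labels:
          app: nginx              # only pods with this label
  filters:
    - parser:
        parse:
          type: nginx             # the Nginx transformation mentioned above
  localOutputRefs:
    - loki-output                 # route to the Output below
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: loki-output
spec:
  loki:
    url: http://loki.logging:3100  # illustrative destination address
```

The operator then renders the underlying Fluent Bit/Fluentd configuration from these resources, so you never hand-edit agent config.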

Marc: Okay, and what types of transformations do you see people out there using?

Sandor: So, Logging Operator uses Fluent Bit and Fluentd under the hood, so whatever they support. And there is a new operator called Telemetry Controller, which uses OpenTelemetry under the hood, and whatever the OpenTelemetry Collector is able to do with logs, Telemetry Controller can do as well.

Marc: And tell us about the beginning of the project. What company created this?

Sandor: So back in the day, I was a co-founder of Banzai Cloud. We started the project there, and, like, over two years ago I founded a new company called Axoflow, and we're continuing the open source project there as well.

Marc: All right, and Logging Operator right now is a Sandbox project?

Sandor: Yes, it's a CNCF sandbox project, and we started the Telemetry Controller under the Logging Operator as well. So we want to, you know, at the point when we hit feature parity, change over to Telemetry Controller.

Marc: When did it become a sandbox project?

Sandor: One and a half years ago.

Marc: Oh wow, okay. How have you thought about the roadmap out of sandbox and into incubating?

Sandor: That's actually why we started Telemetry Controller, because there were a lot of requests that we just couldn't fit into the Logging Operator architecture.

So we sat back and started to, you know, schedule this one, and we have a proper roadmap: supporting multiple logging architectures and, after that, going for metrics and traces.

So we plan Telemetry Controller to be the ultimate operator for observability signals.

Marc: And do you have a timeline or a roadmap as to how that's coming and when that'll be done?

Sandor: I mean, we have a timeline but with no ETAs, as you know, it's open source.

Marc: Do you have a community meeting that people can join and hear about the latest?

Sandor: Yeah, absolutely. We have a biweekly meeting online on Tuesdays, and, you know, we keep discussing, you know, what the next steps are.

Marc: And what are you looking for a lot from the community right now? Are you looking for people to use it, come up with new use cases or contributions?

Sandor: Absolutely, yeah. So our Logging Operator was very successful, we got a lot of new use cases, and that's why we started the new multitenant-capable Telemetry Controller.

And I hope that people will start to use it and give feedback, and we'll improve it, because I think it will be a very, very cool project.

I mean, my goal is, like with cert-manager: if you need certificates, you just install that. If you want to handle logging and monitoring, you install Telemetry Controller.

Benjie: Super cool. Are you on the CNCF Slack? Is that-

Sandor: Yeah.

Benjie: Can we find you there? Okay, great. This is super cool. I'm going to check it out when we get back. Appreciate the time and looking forward to tracking this project. Thanks so much.

Sandor: Yeah, it was a pleasure to meet you.

Andreea Munteanu Discusses Kubeflow

Benjie: All right, so I'm here with Andreea from Kubeflow. Tell us about Kubeflow.

Andreea Munteanu: Sure. Thank you for having me. I'm Andreea, I am with Canonical and I'm here to talk about Kubeflow. When it comes to Kubeflow, it's an open source project, part of CNCF, that's why we are at KubeCon indeed.

Benjie: True.

Andreea: It's an incubation project. It's a project that has been on the market for around 6 years, actually. It's a data science and ML platform or an MLOPS platform for those who are more advanced.

It's a suite of tools, open source tools that aim to enable ML engineers, data scientists, to develop, optimize, and deploy models. It has leading open source components. Kubeflow Notebooks are very similar to Jupyter Notebooks that people are very familiar with.

You have Katib for model optimization or hyperparameter tuning, and you have KServe for model inference, for example, as well.

It's a collection of applications. It's important to mention that it runs on Kubernetes.

Benjie: So what is Kubeflow exactly, like, from a tech perspective? So it's a project, and it helps so that I can load all these different modules in, or?

Andreea: It's a platform. Once you deploy it, you get access to all these tools that I just mentioned.

Benjie: Okay, so it's kind of like packaging ML tools.

Andreea: Yes.

Benjie: And then it takes care of, like, the CRDs or the operators or the daemonsets. Okay, so really it's just kind of best practice for ML tools.

Andreea: Yes. And one important thing that we didn't mention: it's designed to automate ML workloads, because if you think of how to run models in production, you need to build pipelines to automate your workloads, and that's why Kubeflow really was innovative five years ago. Nowadays a lot of companies and other projects use it.

Benjie: Right, okay. And so I'm curious GPU wise, does it do some cool stuff to enable GPU, stuff like that?

Andreea: Yes, I mean, when you look at the compute resources in general, they're very much related to Kubernetes. You need to have the GPU operator, you can have schedulers, and so on.

But then also Kubeflow itself integrates, for example, with PaddlePaddle for distributed training to optimize the resources underneath. You have user management capabilities, profiles, to ensure that the resources are split depending on how the administrator of the platform wants. And it's suitable for larger data science or ML teams.

Benjie: Okay, so can you tell me about some cool use cases that you know about of people using Kubeflow?

Andreea: Yes, now I'm biased because I know stories from our customers-

Benjie: Sure.

Andreea: At Canonical. For example, we have one retailer in the US, it's one of the biggest ones. And they use Kubeflow to build all their system recommenders on one hand for the e-commerce part for the online store.

But then they also build promotion analysis and almost like system recommenders for the in-person stores for the brick and mortar stores.

Benjie: Are they doing it on the edge or do they do that?

Andreea: No, so they do initially the training on-prem, having Kubeflow deployed on a Kubernetes distribution that we can find around here.

Benjie: So they have it on a cloud somewhere? Or in a data center. So either for training-

Andreea: And for inference as well. 'cause then they deploy the model using Kserve on smaller edges closer to the store itself.

Benjie: So, and the Kubeflow piece of that puzzle is that it made it very easy for them to deploy the model, but also for the training. So what are they using?

Andreea: It's an end-to-end solution for the whole ML lifecycle.

Benjie: Okay, so it's an end-to-end solution.

Andreea: It's like, if you oversimplify, that's what it is. Another cool use case is in the public sector where I live. One of the leading organizations, they built an AI serving cloud, and they use Kubeflow on top of it to build all their use cases: to identify any accident in the city, to optimize the street light colors, to do traffic decongestion based on models that are trained on Kubeflow. And then they're deployed with KServe, which is part of Kubeflow.

Benjie: And so the real thing here is you can use Kubeflow to install. So do you know what tools in particular, in one of those examples, they use Kubeflow-

Andreea: So they use, all of them use Kubeflow notebooks for training models.

Benjie: So Kubeflow notebooks.

Andreea: And then they use Kubeflow pipelines to automate the workloads. You have, it's quite nice 'cause the pipelines themselves are, you can split them into steps, you can schedule them, you can trigger them when the data changes, you can-

Benjie: So I can use any, like, Pandas or TensorFlow or PyTorch or whatever.

Andreea: Yeah. Those ones you use in the notebooks, and then they use Katib for hyperparameter tuning.

Benjie: Sorry, which one?

Andreea: Katib.

Benjie: Katib, okay.

Andreea: For hyperparameter tuning, optimizing the model, because just building a model is not enough.

Benjie: Right.

Andreea: You need to optimize it to get better performance.

Benjie: And so Kubeflow makes it super easy to install that tooling, and there's some very opinionated Kubeflow Notebooks stuff. Okay, cool. I'm going to check it out. So you said you're incubating right now. What's the next step for the project?

Andreea: Graduation.

Benjie: Yeah, of course. Do you have a timeline on that? Have you done the security stuff yet, or?

Andreea: It is interesting. We've been working on it, and it's important to mention that we really welcome contributors, as well as users, to share their feedback with us.

We have a security working group, we have scanners enabled on all the, well, on all the components, and we are advancing on the security side as well. It's one of the priorities that we have, and then some of the official distributions are really helping out on that front.

And then I think the timeline, it really depends on how we advance with a couple of things. We have the steering committee in place. We have moved towards the CNCF channels. The security part is on track.

Benjie: So if I want to find you and I want to help contribute on anything?

Andreea: Kubeflow.org or go on Slack and look for the Kubeflow channel and you're going to find it in there.

Benjie: Okay. So in the CNCF Slack, you guys are in there. And last question, do you have meetings monthly, weekly?

Andreea: Yeah, yes. So we have the, thanks for asking this. We have the weekly community call. It's every Tuesday. I will not say the time because it really depends.

Benjie: Time zones are a pain.

Andreea: Yes. For those who are in Europe, it's somewhere in the afternoon.

Benjie: Somewhere in the afternoon for those in Europe. Okay.

Andreea: So it's America friendly.

Benjie: But it's every Tuesday, you said?

Andreea: Yes. That's the community call. And then each working group has their own calls because some people are more passionate about notebooks, some people are more passionate about training operator.

So you can always join that. But I think the very good starting point is the Kubeflow community call. Go on Kubeflow.org, get familiar with the project, maybe try it out and-

Benjie: Right. And find you and everybody in the CNCF Slack for Kubeflow. Okay, great. Well, thanks for talking, Andreea. What's your last name, Andreea? Oh my god.

Andreea: Munteanu.

Benjie: Munteanu. Alright, well thank you so much Andreea.

Terence Lee and Jesse Brown Discuss Buildpacks

Benjie: Alright, sitting here with Terence and Jesse from Buildpacks.io, Terence Lee and Jesse, what's your last name?

Jesse Brown: Brown.

Benjie: Jesse Brown. And we're going to talk a little bit about Buildpacks. Terence, give us a quick 30-second. What is Buildpacks.io?

Terence Lee: Buildpacks simply takes your application source code and transforms it into an OCI container image without the need for Dockerfiles. And what you get is a well-constructed image with layers that map logically to your application.

Benjie: So, okay, so is this related to the, I remember Buildpacks from the Pivotal days and stuff. Like, is that related to that or is it a different project?

Terence: It is an evolution of kind of the early kind of Heroku and Pivotal efforts from there.

And so we were actually kind of two separate companies that were doing Buildpack things and then we kind of came together and were like, we want a unified ecosystem and we want to not produce proprietary artifacts and kind of produce stuff in this container ecosystem.

Marc: Heroku is the company behind Buildpacks right now, though. They were, like, the ones that created buildpacks.

Terence: Yeah, the originators of kind of the buildpack stuff from way back in the day.

Benjie: Okay, so are you guys a sandbox project?

Terence: We're an incubation project.

Jesse: We're in incubation.

Terence: We joined sandbox, I want to say, in like 2018. But we got to incubation in 2020.

Benjie: Okay, so you're incubation. What's next on the map for Buildpacks?

Terence: We're looking at graduation, we actually just finished our security audit this summer, I want to say. And then I guess like KubeCon snuck up and we kind of just never got around to finishing up the rest of the graduation stuff.

Benjie: I want to talk more about Buildpacks in a second, but I think it would be really interesting. Can you tell us about what that security audit was like?

Like what did you have to do? So you got to incubation and then the next step is the security audit. Just tell us a little bit about that process and what that looks like.

Terence: Yeah, I mean, the CNCF is great. You kind of reach out to them and they hook you up with a third-party company. We worked with OSTIF, and basically you point them at kind of all your repos and things, and they find a company that's, like, specialized for what you have and what you're looking for.

And basically they go through and review all your stuff and you kind of do a bunch of back and forths of like they find some things, you talk through some stuff and then you kind of like get to a point where you agree on what the problems are and kind of what the fixes are and they give you, you know, like some amount of time before they post it publicly.

Benjie: Cool, and so how long does that take? How long does that process take?

Terence: I would say that probably took us four or five months to kind of go through that.

Marc: And you felt it was really good, it like helps with the project over-

Terence: Yeah.

I mean it's really nice to have someone else, like a third party person, like actually validate that what you're doing is secure and not secure and like someone who actually specializes in that.

Benjie: Alright, so let's go back to Buildpacks for a second. Does it generate a Dockerfile, or what does it do? What is the, sorry, and it generates an OCI image?

Terence: Yes.

Benjie: Does it generate a file, a Dockerfile, outside?

Terence: We don't use Dockerfiles at all. So there's no Dockerfile that's created as part of that. So one of the benefits of that, from our perspective, is that we're also not constrained by how Dockerfile caching and other things work.

So Buildpacks execute and then we have an exporter that basically takes the file system that's on disk and converts it into an OCI image at the end with metadata and things like that.

Marc: So it sounds great. I don't have to go build a bunch of scripts, I don't have to write Dockerfiles, I don't have to do this. And you said it creates an optimal OCI image that, like, has the right layers and stuff.

How does it do it? Like how technical can you actually get in like the quick five minute interview right here? Can you explain it?

Terence: Yeah, basically Buildpacks can be written in any language that runs in what I'd call the build image, the image you're going to execute the Buildpacks in and build on top of.

And usually a Buildpack is going to encapsulate a certain technology, like Java, Ruby, Node, Python, PHP, et cetera, right? And what the Buildpack gets is tight control over what gets put in which layers you want to create.

So the subject matter expert for that particular technology or ecosystem can decide: okay, let's say I want to do Node. I'm going to put in the Node runtime, and I'll probably read the version from the engines field in your package.json, because that's the standard in that ecosystem, right?

I'm going to run either npm or yarn, et cetera, to install the node modules. And then the nice thing is you get an OCI image at the end. Dockerfiles are pretty similar, you can get something like that, right?

The big benefit then is like the buildpack author has the ability to do advanced caching and so they get to decide when to bust the cache. It's not decided just on file system changes.

So that means, let's say I add a single node module to my package.json: I'm going to keep my entire cache, I'm not going to toss it. There's no reason to, I'm just adding one module to my node_modules, right?

I'll run npm install, it adds it, like local development. Now let's say I upgrade my Node version: the ABI version has changed, and it now makes sense to blow that entire cache away. The Buildpack has the flexibility to do that because we aren't constrained by how Dockerfile caching works.
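The cache-keying idea Terence describes can be sketched in a few lines. This is a toy illustration, not actual Buildpacks code; the point is just that the buildpack author chooses what the cache is keyed on (here, the Node runtime declared in package.json's `engines` field) rather than raw file-system changes:

```python
# Sketch only: illustrates keying a layer cache on the declared runtime
# version instead of on file-system changes, as a Node buildpack might.
import hashlib

def cache_key(package_json: dict) -> str:
    # Only the declared Node runtime participates in the key, so adding a
    # dependency keeps the cache while a runtime upgrade busts it.
    engine = package_json.get("engines", {}).get("node", "unknown")
    return hashlib.sha256(engine.encode()).hexdigest()

before = {"engines": {"node": "20.x"}, "dependencies": {"left-pad": "^1"}}
added_dep = {"engines": {"node": "20.x"},
             "dependencies": {"left-pad": "^1", "lodash": "^4"}}
upgraded = {"engines": {"node": "22.x"}, "dependencies": {"left-pad": "^1"}}

assert cache_key(before) == cache_key(added_dep)   # new module: keep the cache
assert cache_key(before) != cache_key(upgraded)    # new runtime: bust the cache
```

A Dockerfile, by contrast, invalidates a layer whenever the copied files change at all, which is exactly the constraint being avoided here.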

Benjie: So to be clear, Buildpacks is the project, but each language kind of has its own plugin, is that right?

Terence: A buildpack.

Benjie: A buildpack, okay.

Terence: Yeah.

Benjie: So it has its own buildpack. So if I wanted to do a FORTRAN buildpack, which I don't think is a great idea, but if I want to do a FORTRAN buildpack, I could contribute that and go to Buildpacks.io. I could contribute the FORTRAN one.

Now what about me as an end user that has a node project, that's not, I would never have a node project. I'm more of a Go, Python guy.

But imagine I have this Python project and I want to tweak something. I don't want to use exactly what the Buildpack is for Python. I want to bust Poetry's cache for some reason or whatever. Is there a capability to do that?

Or how do I customize these images that are generated as the end user, not the author of the Buildpack but as the person using the Buildpack?

Terence: Sure, you have a few options. At the end of the day, if you build an image with a Buildpack, it is an OCI image.

So if you know how to do Dockerfiles and things, you could just inherit with a FROM from that image and edit it that way.

Using tools you know. Or you could build your own Buildpack that runs after the existing one-

Benjie: Like fork the Python buildpack?

Terence: You can either fork it or like just add another Buildpack that you wrote.

Marc: Like chaining.

Terence: You can chain them together.

Benjie: Oh, you can chain. Okay, that's cool.

Terence: Yeah, there's also inline Buildpacks. So if you don't want to write a full-blown Buildpack of your own, you could use an inline Buildpack through our project descriptor to have it hop in after, say, the Python buildpack has executed.
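For readers unfamiliar with the project descriptor Terence mentions, an inline buildpack lives in a `project.toml` at the root of the app. The sketch below is illustrative only: the buildpack ID is hypothetical and the exact field names may differ between descriptor schema versions, so check buildpacks.io for the current spec.

```toml
[_]
schema-version = "0.2"

# Run the regular Python buildpack first (hypothetical ID).
[[io.buildpacks.group]]
id = "example/python"

# Then an inline buildpack that tweaks the result afterward.
[[io.buildpacks.group]]
[io.buildpacks.group.script]
api = "0.10"
inline = """
echo "custom post-processing step after the Python buildpack"
"""
```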

Benjie: Alright, well if I want to participate in the community, check you guys out, where do I go to participate?

Terence: On GitHub, we're in the Buildpacks org with an S at the end, plural, and then we're in the CNCF Slack in the Buildpacks channel as well. Super simple.

Benjie: Do you guys have a community meeting weekly, monthly?

Terence: We have weekly meetings on Thursdays. If you go to buildpacks.io/community, you can see kind of the times for that.

Benjie: Super cool. And you're on the Slack, you're on the CNCF Slack?

Terence: Yep.

Benjie: Okay, great. Buildpacks is the channel?

Terence: Yep.

Benjie: Alright, wonderful. Thank you so much Terence Lee and Jesse Brown. Yeah, we look forward to seeing how this project graduates.

Terence: Awesome, thank you so much.

Mauro Morales Discusses Kairos

Benjie: Alright, so I'm here with the Kairos folks.

Mauro Morales: Yeah.

Benjie: Mauro?

Mauro: Mauro.

Benjie: Mauro, tell us about Kairos. First off, how do you spell it and how do you say it and then tell us a little bit about what it is?

Mauro: So English is not my mother tongue, so sorry for any mistake, but it's K-A-I-R-O-S, Kairos, that's how we pronounce it. We have a Greek person in the team. So it turns out it is not the proper way to pronounce it in Greek, but who cares, right?

And yeah, the project comes out of the necessity of moving outside of the cloud into the edge space. The idea is that in a data center, there are many problems that you don't have that you do have at the edge.

For example, security is a lot more problematic at the edge. If you are running a system, let's say in a gas station or I don't know, in a grocery store, it is a lot more exposed to people to come and try to mess with the device or to steal the data, of course.

Benjie: Right, there's much more physical access to the edge.

Mauro: That's correct. So that drives most of the decisions for the operating system in this case. And some of those features are, for example, that systems are completely immutable, with the idea that nobody can come and try to install a new package or a rootkit, for example.

Benjie: Yeah. So wait, so backing up a second. So Kairos is a Linux distro, is that correct?

Mauro: Yeah, I mean we sell it as a meta distribution because Kairos is distro-agnostic. We want to play well with as many distributions as possible.

So as long as it has either systemd or openrc as their init system, then it can run Kairos. That means Alpine, Ubuntu, openSUSE, all of them run Kairos. We call them Kairos flavors.

Benjie: Okay, so there's these Kairos flavors and Kairos basically locks down the OS, essentially? Give me five highlights of how it locks down the OS real quick.

Mauro: Yeah, so we have immutability, for example, which means you have fewer snowflakes than you normally would. With other configuration management systems, even though the idea is that you're creating a fleet, it's still possible to log into a system and change something. With Kairos, that's not an option, because it's immutable.

Benjie: So fully immutable.

Mauro: Yeah.

Benjie: Okay. What is the second best thing?

Mauro: Yeah, for example, we have A/B upgrades. That means that instead of doing an upgrade that could end up in a mixed state, you don't have that: either you upgrade or you don't. And if for some reason you're not able to upgrade, it rolls back automatically to the passive image.

Benjie: So that's really kind of like the embedded model for OS's where there's like basically two memory banks. Very limited. And if it doesn't work, it just flips back over.

Mauro: That's correct.

Benjie: Okay. Tell me the third best thing about Kairos.

Mauro: In terms of security, we have a feature called Trusted Boot, which means that once you have decided which operating system is going to be booted in your machine, no other operating system can be booted there.

This is done using TPM chips, so it's hardware measurements; it cannot be changed. If someone wants to run another operating system, it has to be signed with your keys.

Benjie: Wow, okay. That's super cool. So what's the typical install? Is it a Raspberry Pi? It's not going to be an ESP32, that doesn't have enough juice.

Mauro: Yeah.

Benjie: But what's a few of the cool target piece of hardware that you've seen Kairos on so far?

Mauro: We have a bit of everything, to be honest. Homelabbers definitely target the Raspberry Pi, for example, but it runs on a bit of everything.

For example, there are cases of drones picking up fruit that have Kairos underneath. There are cases where it's used for teaching networking at schools.

There are cases where it's on a traditional NUC, a simple system, nothing too expensive, where you can stack three of those and create a Kubernetes cluster, something like that.

Benjie: Cool. Is it a sandbox project? It's in the CNCF?

Mauro: That's correct. It's part of the CNCF. We're a sandbox project. We were just admitted in April. And-

Benjie: So April, 2024 you got admitted, congratulations.

Mauro: Thank you very much.

Benjie: What is the next phase for Kairos? What are the things that you're working on in the next six months that are the priorities?

Mauro: Yeah, the main two priorities at the moment are CAPI, Cluster API, so that you can manage an entire fleet through Kubernetes. And the other one is remote attestation.

So you get the same level of security I was explaining before with the TPM chips, but remotely. For some reason you might say, okay, that node has been stolen or whatever, and you can make sure that it never calls home, that it shuts down as properly as you can trust it to.

Benjie: Right, okay. So is that kind of like a MDM vibe kind of thing, if you're familiar with?

Mauro: Good question. I'm not familiar with it.

Benjie: That's the, luckily I'm here with Marc. Marc, what does MDM stand for? It's the?

Marc: It's like device management.

Benjie: Device management. So it's like device management for Kairos. Is that kind of a way to?

Mauro: We don't have that part, that we-

Benjie: When you say attestation, that's kind of what you mean by that, or?

Mauro: Attestation for us is we want you to be sure that there is no other OS spinning up in the machine than the one that you have designated to run for two reasons.

One, it means that you will not have snowflakes, and two, it means that if someone tries to put in a rootkit or anything like that, they simply can't, and they cannot access your encrypted data.

Benjie: Super cool. So if I wanted to be a part of Kairos, contribute to Kairos, where's the best place to find you guys? When do you have your community meetings? How do I find you?

Mauro: Yeah, so since we're part of the CNCF, you can find us in the Slack server under the Kairos channel. We also have a website called kairos.io. There is all the information that you might need. And on GitHub, we are kairos-io.

Benjie: Cool, are there any community meetings or anything like that?

Mauro: Yes. Every Monday at 5:30, if I'm not mistaken, Central European Time. You can find us there.

Benjie: I have no idea what that time is anywhere else, but 5:30 Central European Time, that's great. Thank you so much for your time, Mauro.

Mauro: Thank you guys. Happy to be here.

Paige Patton Discusses Krkn

Benjie: Okay, so on the last day at KubeCon, my voice is not doing amazing, but luckily we have Paige Patton from Red Hat and she's going to talk to us about Krkn?

Paige Patton: Krkn, yes.

Benjie: Spelled K-R-K-N.

Paige: Yes. No vowels.

Benjie: No vowels. Paige, what is Krkn?

Paige: So Krkn is a chaos and resiliency tool that's pretty new to the CNCF. It's a sandbox project. Pretty much our goal is to inject chaos, test resiliency, and test performance metrics on a cluster of any sort.

It runs completely outside the cluster, so you don't have to install anything on the cluster; that way, if the cluster goes down in a highly chaotic scenario, things will come back.

We can still get our logs and see what happened, and then at the end of the chaos we can attach to Prometheus and run a bunch of different SLO checks, different calls, to see what our metrics were: whether our latency came back, whether it's too slow now. And you can attach to different services, or you can do just base Kubernetes testing.

Benjie: Super cool. Okay, so where does Krkn run then? Do I install it on its own separate cluster, or is it just an agent that runs?

Paige: Yeah, just an agent. So you can clone the project and run it in Python, or we have a kind of newer binary that you can download, krknctl, and just run from the command line using Podman.
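For context, when run from a clone, Krkn is driven by a config file that lists which chaos scenarios to inject against the target cluster. The fragment below is an illustrative sketch, not the exact schema; the key names and paths here are assumptions, so consult the Krkn documentation for the real format.

```yaml
# Illustrative Krkn-style config sketch (field names are assumptions)
kraken:
  kubeconfig_path: ~/.kube/config   # cluster to target; Krkn runs outside it
  chaos_scenarios:
    - pod_scenarios:
        - scenarios/pod.yml         # kill pods and watch recovery
    - node_scenarios:
        - scenarios/node.yml        # stop a node via the cloud provider
```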

Benjie: Krkn "cuddle." I prefer cuddle.

Paige: Yes.

Benjie: And so what it does is it just looks at like, oh, here's a CRD, I'm going to corrupt it.

Paige: Yes.

Benjie: Here's a pod. I'm going to-

Paige: Yes. And so we have all sorts of different scenarios. We have pod, we have network, we have completely taking down a node based on your specific cloud provider. You can connect to AWS, GCP, all of those, based on their Python cloud provider libraries.

Benjie: Okay, so it would actually take down a node, for example. So it would integrate. I'd give it some permissions to a particular cluster, and it would actually go into my cloud provider and turn off the node.

Paige: Exactly.

Benjie: So it's not faking it. It's actually-

Paige: Yep. You can go into the cloud provider and see that the node is completely stopped, if that's your scenario, then restart it after a certain amount of time and see if your components came back or if anything slowed down, stuff like that.

Benjie: Oh, that's super cool. And for the pod shutting down or a sidecar thing, it sounds like you guys have scenarios that you ship, and then I assume there's, not a marketplace, but people contribute other scenarios?

Paige: Yes, definitely. We're kind of newer, as we're just sandbox, but we're hoping for new scenarios, new use cases, people who know their applications being able to apply all those things.

Benjie: Super cool. So, okay, so you're sandboxed.

Paige: Yeah.

Benjie: When did this project start?

Paige: We started I want to say maybe five years ago.

Benjie: Okay.

Paige: I started at Red Hat about five years ago and kind of got into it from the get-go. It's really grown. We used to have only pod scenarios, and now we have, I think, 15 scenarios across different components.

So it's really grown, but it's all from our own minds, the kinds of scenarios we have. So getting broader industry chaos community support is really what we're looking for as well.

Benjie: Okay, super cool. So if I want to find you and contribute or help, where do I find you guys?

Paige: We are on GitHub and we also have a website, which is probably our main place, which is krkn-chaos.dev.

Benjie: That does not have vowels either.

Paige: It does not have vowels.

Benjie: So we'll spell it real quick. K-R-K-N - C-H-A-O-S.dev?

Paige: Yes.

Benjie: Okay. Now are you on the CNCF slack as well?

Paige: I'm not sure if we're on the CNCF Slack, but we're on Kubernetes Slack.

Benjie: You're in the Kubernetes Slack?

Paige: Yes. Same, K-R-K-N.

Benjie: Yeah. Okay, great. And then also, do you have community meetings?

Paige: We don't have community meetings yet. We're trying to set up kind of an overall chaos community with meetings, but that's still in the future. I think there was a community with some other tools at some point that kind of died off, so we're hoping to bring it back.

Benjie: Okay, so the best place to contribute and to see what's going on is GitHub, and then also the website, and there's some contact information.

Paige: Yeah, so like GitHub is linked in the website, so the website is really kind of a catch-all that you can get to everything.

Benjie: Perfect. Thanks, Paige. Great to learn about Krkn. It's exciting, I'm going to check it out.

Paige: Perfect, thank you.

Avi Press Discusses Scarf

Benjie: Alright, we're sitting here with Avi Press from Scarf. Great name, I got to say. Tell us what Scarf is, Avi.

Avi Press: Scarf is a platform for usage analytics for open source software and we provide sales and marketing intelligence to companies that commercialize open source.

Benjie: All right, so give me an example. What does that look like? How does that help?

Avi: Yeah, so let's take projects in the CNCF, where, say, you have a set of Docker containers that get downloaded from whatever registry you publish them to.

We build a container registry gateway. You can think of it like a link shortener that you can Docker pull through it, you can pip install through it, you can curl a binary through it, et cetera. So we're kind of like a middle layer between the end user and the artifact registry.

We analyze all the traffic that comes through and generate insights about how your open source is being used.

Benjie: So you're kind of like a Bitly for my container registry for example. Is it mostly for Docker containers or is it also for repos and-

Avi: Anything, anything.

Benjie: If I'm doing a git clone, I would do git clone and then the Scarf, what is it, Scarf.io?

Avi: Scarf.sh.

Benjie: So I'd be doing scarf.sh/linux.

Avi: Right, right. And we let you pick your own domain; you can point a custom domain at us, so it's whatever domain you want. But that's just one way we collect data.

We also make cookie-less tracking pixels that can go on open source documentation, your website, et cetera. And then we also just support regular telemetry if you want to send us events from your code.

And so the idea is from users discovering the project to downloading it and trying it out to using it, deploying it, et cetera, we're giving a way to collect all that data in a privacy-conscious way and then get useful insights out of it.

Benjie: Interesting. So this is for open source projects?

Avi: Yes, it's for open source projects.

Benjie: So just to be a little commercial about it, you're kind of like top of funnel: if I'm an open source project with a paid model and I know this one company or person is using the project a lot, that whole life cycle, then I know, hey, I want to reach out to that person.

Now you said you're privacy-focused. How do I reach out to that person if I don't know who that person is?

Avi: Right, so yeah, so Scarf is GDPR compliant by default. So we're not going to keep an IP address that we collect. We will enrich it with all the metadata that we can collect about whatever, you know, anonymous traffic that we are seeing.

But from there, we can start to do lead generation based on who your ideal customer profile is, where the traffic is coming from, what they're doing, how we score that lead, et cetera.

And basically we give teams a platform where they can operationalize that data: their marketing teams can focus on the top of the funnel, who's kind of interested, while sales might be nurturing the production deployments, whatever way is most appropriate for that project and company.

Marc: So a lot of projects, and I'm guessing Scarf is the same way, have a lot of open source components and a lot of proprietary, hosted, closed source components. What part of it is open source?

Avi: Yeah, so for Scarf, our gateway, the thing that sits in front and you know, proxies or redirects that traffic, that whole thing is open source. Our SDKs are open source. The analytics pipeline is the thing that we keep proprietary.

Yeah, our gateway, it's a very novel thing. It can kind of redirect arbitrary traffic. You can almost think of it like just a really flexible reverse proxy, but it can also redirect and then you get all the great analytics out of it.

Marc: So I could like run that, just the open source component and it would be able to capture the traffic that was going through that gateway locally on that machine. It wouldn't have access to the full Scarf analytics and everything.

Avi: Right, like you'd be able to do all the redirection and you'd be able to like switch registries on a dime and do all that stuff. But you wouldn't have all of our like intelligent analytics behind it, basically.

Benjie: Right, right. But I could like print logs and-

Avi: Yeah, yeah, yeah. Exactly. And we hope you do.

Benjie: Okay. Super cool. I do think it's actually important for the open source world to figure out how to monetize responsibly. That is what this is all built on, and it sounds like that's kind of what you guys are doing. How far along is Scarf? How far along are you as a company?

Avi: Yeah, so we've been at this for about five years now. Yeah, we're about a 15 person team. We work with about 500 open source organizations. We work with the Linux Foundation, the Apache Software Foundation, CNCF.

So you know, we're still a small startup but we've come a long way. We have most of the Fortune 500 downloading artifacts from us every month.

Benjie: And then the last piece, my obvious question, I think I know the answer, but as an open source project, I need to seek out Scarf and use the SDKs, give the links to the proxy stuff.

So with Scarf, though, the open source projects have to implement some component of Scarf. You're not mirroring a bunch of stuff where everyone just uses the Scarf version of it, or?

Avi: That is correct. The answer here is I guess a little bit nuanced. So yeah, if you want to collect data specifically about your project, you'd have to instrument some kind of tracking of this.

However, increasingly, Scarf is collecting so much data from so many different projects that we're able to provide useful, really high-level aggregated reports, which we also work with some companies on.

But yeah, if you want data about your project in particular, you do have to set up some kind of tracking.

Benjie: Okay, so if I'm an open source project, how do I use you? What do I do?

Avi: So what you do, you'd sign up for a Scarf account, you'd make an organization on our site, you would create any one of a couple different things. You'd basically make like what we call a package entry.

So you know, you may have some package that you publish; you're going to make a corresponding entry in Scarf. You tell us where the artifacts are, and we spin up an endpoint that people can pull those artifacts from. You update your docs.

Benjie: Sure, and you put the pixel in.

Avi: You put the pixel on the docs.
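Concretely, the two lightweight integrations described here look roughly like the sketch below. The pixel ID and the download domain are placeholders, not real endpoints; Scarf issues the actual pixel ID when you create one in their dashboard.

```markdown
<!-- Cookie-less tracking pixel dropped into docs or a README
     (x-pxid value is a placeholder) -->
![](https://static.scarf.sh/a.png?x-pxid=YOUR-PIXEL-ID)
```

A download routed through the gateway is then just a normal pull against your custom domain, e.g. `docker pull downloads.example.com/myorg/myimage:latest`, with Scarf redirecting to the real registry behind the scenes.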

Benjie: Okay, so it's pretty straightforward, pretty minimal.

Avi: Yeah.

Benjie: For our listeners that want to instrument their open source project seems pretty simple. And there's a cost, I'm guessing?

Avi: So it's entirely free to use. Scarf charges the companies: if you want to see the commercial data, which companies are using this, that's where we charge.

Benjie: Okay, great.

Avi: But also, if you are a CNCF project, there's no seat license at all. Through our partnership with the Linux Foundation, it's free to use for any LF or CNCF project; your whole group, all the maintainers, can get in entirely for free. You get unlimited data retention for free as well.

Benjie: Alright, that's super cool. Well, really appreciate you sitting down with us, Avi. I'd love to help our listeners figure out how to monetize and track responsibly. Cool. Thanks, Avi.