Ep. #29, CloudQuery with Yevgeny Pats
In episode 29 of The Kubelist Podcast, Marc and Benjie speak with Yevgeny Pats, CEO & Co-Founder of CloudQuery. They unpack Yevgeny’s career journey, the security and compliance space, and the open source CloudQuery project and its use cases. This talk offers evergreen lessons on entrepreneurship, information technology security, documentation practices for open source projects, and much more.
Yevgeny Pats is Co-Founder & CEO of CloudQuery. He was previously Founder & CEO of Fuzzit (acquired by GitLab). Yevgeny’s journey into software engineering began when he was R&D Senior Software Engineer for the Israel Defense Forces.
transcript
Marc Campbell: Hi, welcome back to another episode of The Kubelist Podcast. As you've just heard, we have Yevgeny Pats, CEO and Co-Founder at CloudQuery, ready to talk about the CloudQuery Open Source project. Of course Benjie is here too, how's everything going?
Benjie De Groot: Everything's good. Turbulent times, but doing pretty well.
Marc: Great. So let's dive right in. Yevgeny, we'd love just to start off hearing a little bit about your background, will you tell us a little bit about your career and what led you to creating CloudQuery?
Yevgeny Pats: Yeah, sure. First thanks, Marc and Benjie, for hosting me. I'm really excited to be here, and yeah, I'll be happy to tell a little bit about myself. The last 12 years I've focused a lot on entrepreneurship and cybersecurity. I started my career at the Cybersecurity Intelligence Unit 12 years ago, was there for 4.5 years, and after that started my startup journey. I first joined a small enterprise security startup back in 2013.
It was pretty quickly acquired by Check Point, and then I figured I wanted to start my own startup and also went into enterprise security for a few years, but pretty quickly I understood that I'm too young to run a company without a demo button so I forked, and for the last five years I was focused solely on the developer space. I had product-led growth, a lot of Open Source projects, and before CloudQuery I ran a continuous fuzzing as a service project called Fuzzit, which was partially Open Source.
But it's mostly focused on our first clients, who were Open Source projects and it was acquired by GitLab in 2020. I've been there for a while, learned a lot from the GitLab culture and then started CloudQuery as an Open Source project which I can dig deeper into why I started it and so on. But I think that's a high level overview. Maybe in one sentence before I dig deeper into CloudQuery, CloudQuery is an Open Source project.
It started as an Open Source cloud asset inventory. You can think about it as Terraform, but just the other way around. We connect to all the cloud infrastructure APIs, AWS, GCP, Azure, extract all the configuration, transform it and load it into structured tables in a PostgreSQL database. Then devops engineers, security engineers, site reliability teams use this for a variety of use cases starting from visibility to security, compliance, monitoring.
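The extract, transform and load flow described here can be illustrated with a minimal sketch. This is not CloudQuery's implementation: the API response is hard-coded and hypothetical, the `instances` table and its columns are invented for illustration, and SQLite from the Python standard library stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Hypothetical, hard-coded stand-in for a cloud API response (not real provider output).
FAKE_API_RESPONSE = [
    {"InstanceId": "i-001", "Placement": {"Region": "us-east-1"}, "State": {"Name": "running"}},
    {"InstanceId": "i-002", "Placement": {"Region": "eu-west-1"}, "State": {"Name": "stopped"}},
]


def extract():
    """Extract: a real tool would call the cloud provider's API here."""
    return FAKE_API_RESPONSE


def transform(raw):
    """Transform: flatten nested API objects into flat (id, region, state) rows."""
    return [(r["InstanceId"], r["Placement"]["Region"], r["State"]["Name"]) for r in raw]


def load(conn, rows):
    """Load: write the rows into a structured table (SQLite stands in for PostgreSQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS instances (id TEXT PRIMARY KEY, region TEXT, state TEXT)"
    )
    # INSERT OR REPLACE keeps repeated syncs idempotent for the same instance id.
    conn.executemany("INSERT OR REPLACE INTO instances VALUES (?, ?, ?)", rows)
    conn.commit()


def sync(conn):
    """Run one extract -> transform -> load pass."""
    load(conn, transform(extract()))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sync(conn)
    # Once the data is in tables, any team member can ask questions in plain SQL.
    for row in conn.execute("SELECT id, region FROM instances WHERE state = 'running'"):
        print(row)
```

Once the configuration sits in tables, the visibility, security and compliance questions mentioned above become ordinary SQL queries.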
Benjie: That's great. There's about 35 things you just said that we're going to probably want to hear more about. But I just want to go back to the beginning for a second, you mentioned early on in your career you were in the cybersecurity space. Not sure if you're allowed to talk about exactly what you were doing, or where you were doing it. But can you tell us a little bit more about your early career and what you were doing?
Yevgeny: Yeah, sure. So I guess my passion is really computer, so I started in high school probably or school-
Benjie: Sorry, where did you grow up, Yevgeny?
Yevgeny: I grew up in Israel, in Haifa, in the north. My parents actually came from Russia back in the 90s, I also emigrated but I was six months old. So yeah, I'm kind of like an Israeli, I just have a bit of a Russian accent for some reason--
Benjie: It's weird to keep an accent from six months, but it makes sense.
Yevgeny: Yeah, I couldn't fix it. But yeah.
Benjie: Look, there's obviously a trend, especially in CNCF but in general, about some really amazing Israeli companies coming out and I think it's interesting to talk about how that education system over there led to this. Then obviously there's the mandatory military service that it seems like that played a role in your career and your trajectory as well. I think it would be really interesting to understand whatever you can share about that with us.
Yevgeny: Yeah. So I think also probably if you asked like five entrepreneurs from cybersecurity then everyone will have a different opinion on how it came out to have this nice ecosystem, but probably it's because of different variables. But I think one thing is the thing when you go to the army in Israel, it's mandatory service and then some of the units like the best units or the best unit out there, like the technical one, they have the ability to choose.
So they're trying to choose the smartest people that they can, but of course there are other places with smart people. Not saying that. But essentially they can choose, so basically they bring everyone together so it's like a great place to learn from a lot of other smart engineers and also get the ability to work on interesting things early on in your career.
So I think this really, yeah, provided a great place for everyone to learn and then to go out there and build interesting stuff. Other than that, I think there is also macroeconomics, a lot of funding as well in Israel, so combining those two together, I think, brought some interesting or good outcomes.
Benjie: Well, one thing I think is really interesting about the whole Israeli tech ecosystem that you get these kids that are obviously very talented, but you didn't have a lot of experience when you were 18, at least from a professional perspective, I would guess. I think it really creates this very creative environment, and I can't remember who told me this, but an Israeli friend told me that you guys go into these very elite units and you guys don't know what you can't do so you guys just do pretty close to impossible stuff. It seems like it creates and fosters this... It's kind of like a startup mentality, is that a good way to look at that or what do you think?
Yevgeny: Yeah, I think it's a good way to look at it, and I think there is also some truth to that. But I also think that, I don't want to ruin the marketing for the Israeli entrepreneur, but there is the other side, like the chicken and the egg thing, that I think even if there weren't a Cybersecurity Intelligence Unit, potentially the outcome would be similar because you take smart people in and then you get smart people out. There is a lot of good stuff happening in between, but I think smart engineers would be also fine without it, so it's not just because of that. It definitely helps and creates an interesting ecosystem and bonds, and so on.
Marc: So let's move on and let's talk about the product a little bit. You defined it as a cloud asset inventory, so the way you described it, it sounds to me like I have Terraform or Pulumi or some other tool, maybe I'm just using the AWS console and not following any kind of modern best practices. But I'm deploying, I have all my cloud infrastructure and then you have a service that's querying what's available in all the infrastructure that I've created in the cloud, inserting it into a structured, schema'd PostgreSQL database that I can then have anybody on the team query to know what's out there in the cloud. Is that what CloudQuery does?
Yevgeny: Yeah, exactly. We also connect not only to your cloud infrastructure but also to other things like Okta, Cloudflare, anything else in your infrastructure that you want to slice and dice, or get visibility into across clouds and so on. But yeah, that's exactly it. I don't know if your next question is in regards to Terraform, right? It's if everything is in Terraform, why do I need to go again to get the information if I already have it in Terraform?
Marc: The question has gone through my mind.
Yevgeny: So I think definitely that's the best case scenario, yeah. If we got to a place where everything is in Terraform, basically a graph probably would do fine for any kind of visibility questions. But the issue is in medium companies, big companies, not everything is in Terraform so you actually want to have this kind of view of what you have and maybe also compare it with what is defined in Terraform. I can also talk a little bit about how we are doing that, but yeah, basically that's the best case scenario but that's usually not the real world scenario.
Marc: Yeah, and I think that it's also fair just keeping in mind that Terraform is a declarative system that defines what's going to run in the cloud provider, but there are things like autoscaling groups and managed node groups for EKS, things like this that actually mean that you don't know, you're defining the criteria for the cloud provider to run. I'm assuming in that scenario, CloudQuery is actually able to tell me what's actually running at this moment in the cloud.
Yevgeny: Yeah, exactly. That gives you the current snapshot.
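Comparing what is declared in infrastructure-as-code with the current snapshot comes down to a set difference. The sketch below is an illustration under assumptions, not how CloudQuery does it: `diff_inventory` is a hypothetical helper, and both inputs are plain lists of resource IDs that a real setup would read from Terraform state and from the inventory database.

```python
def diff_inventory(declared, actual):
    """Compare declared resource IDs with the IDs actually found in the cloud.

    Both arguments are iterables of resource ID strings (hypothetical inputs).
    """
    declared, actual = set(declared), set(actual)
    return {
        # Running in the cloud but not declared anywhere (console clicks, autoscaling).
        "unmanaged": sorted(actual - declared),
        # Declared but not found in the current snapshot.
        "missing": sorted(declared - actual),
    }


if __name__ == "__main__":
    declared_ids = ["i-001", "i-002", "bucket-logs"]
    snapshot_ids = ["i-001", "bucket-logs", "i-asg-7f3"]
    print(diff_inventory(declared_ids, snapshot_ids))
```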
Benjie: Great. And errant clicks are a reality for all of us in the infrastructure space in the AWS console. So let me ask you this question, you take all this information and you give it a unification, if you will, because obviously Okta is maybe not covered under your Terraform or your Pulumi, so that's really cool. But then you decided to put it into SQL, to make it SQL accessible. Can you tell us a little bit about that decision and why a modern enterprise would want to enable their internal teams for SQL?
Yevgeny: Yeah, so we started with SQL but essentially the biggest problem here, you can look at this problem in general more as a data problem.
The issue usually is to get all the information from those APIs, normalize it and put it somewhere in a structured way. So yeah, we started with vanilla PostgreSQL, but something that we saw users do, and we also plan to support other databases, is that they take this structured information and then load it into any other place if they want to do a different set of analyses or connect it to their current monitoring or visualization platforms like Grafana or Superset or whatever they have. So yeah, whether it's PostgreSQL or SQL, it's just because it's the standard way to go, the first way to go.
Marc: These different connectors, actually I'm at your website, it looks like you call them providers for AWS, Azure, for Okta, et cetera. How many of these do you have and how active is the community around creating new ones? Or another way to look at that is, if I want to run CloudQuery but I have something that's not supported, can I just run it or do I need to make a contribution upstream to get that provider in?
Yevgeny: Yeah. Actually, some of our users make contributions and we approve them pretty quickly, but you can also compile your own provider and run it so you don't necessarily need to wait for it to be upstream. Though we are relatively fast, at least for now. That's the whole idea, because the main problem or the main challenge here, let's say even if we go and look into things like AWS Config, which is like the enterprise or the commercial competitor from the cloud provider, or Google Cloud Asset Inventory.
You will see that there are a couple of disadvantages there. One, they usually use a custom query language so you don't have access to the raw data. Second, even though they are native to the cloud provider, they don't support all the AWS APIs or GCP APIs.
Actually CloudQuery already supports more resources and APIs than AWS Config, just because they are dealing with the same problem of how to support the enormous and seemingly infinite number of APIs that is just growing, handling all the nuances and then transforming and normalizing it into a database. These are some of the challenges that Terraform also has in dealing with providing this abstraction layer on top of cloud APIs.
Marc: Right. So that makes sense, and I can think of all these different use cases. Let's go all the way back to the beginning though, getting data out of the cloud provider, making it available so that I can use a PostgreSQL query, doing things like this, super powerful. What was the use case you originally were building for?
Yevgeny: Yeah, so the first use case that I was looking at was especially around security and specifically I was looking at enterprise products like CSPM, cloud security posture management, from either big companies or startups, companies like Palo Alto Networks, Check Point or other next gen enterprise startups. What was mind boggling for me is, one, why there are so many of those, and two, they are actually all building or implementing in house, for probably at least 80% of their development time, this kind of closed source ETL engine.
Then just probably 20% of the time building the rules or the security value that they sell, so actually it's a data problem. What we started to build in CloudQuery is trying to rebuild the whole stack, either for security use cases or compliance or cost use cases, rebuilding it on top of CloudQuery as a data app and building CloudQuery as a data ingestion platform. Being Open Source, also giving the ability for users or developers writing integrations, commit and contribute to our official integrations, and not being blocked by a vendor that can't really maintain an infinite amount of APIs and integrations.
Benjie: So if I'm an enterprise customer or I'm a CSO or maybe I'm someone under a CSO and I want to know from a compliance standpoint, "What are all the assets that I have? Where are they? What are they doing? What is the configuration around those?" CloudQuery is really where-
Yevgeny: Yeah, so you can run just like you can run all your compliance policies with SQL or you can create views, generic views with all your cloud assets, query them across regions, query them over time, creating alerts when something is not according to your company policy, and you can use a standard query language to create any rule you want.
Benjie: Have you seen customers or folks using CloudQuery to query the data and then using an external policy engine like Kyverno or Open Policy Agent, or something like this? Then CloudQuery is critical to provide the data as the context for the external policy engines to make their decisions?
Yevgeny: Yeah, it's a good question. What we saw with regards to OPA and SQL, people who use CloudQuery, they use SQL to write their rules, join across tables and create their security or compliance or cost rules, and then they create standard alerts, their workflow is either Grafana or anything else. With regards to OPA, OPA is a really good language for in memory processing for things like Terraform or Kubernetes, but actually running OPA on top of all your cloud infrastructure, it won't be a great fit because it will probably just run out of memory or you will need a very big machine.
So I think this is exactly where you actually need a database. Yeah, probably there is no better solution than some kind of database, either PostgreSQL or something else, and we saw companies load tens of thousands of accounts and millions of resources into PostgreSQL. So yeah, there is no way around having a database in that case.
Marc: Sure. Let's talk more about policies then. CloudQuery, it completely makes sense to me that you have this service, we'll talk about where it runs, how it runs and everything. We'll get into some technical details. But I end up having a PostgreSQL database that has everything that's in my AWS account or my Cloudflare account or Okta. What is your best recommendation, your best practice to define policies that can run either regularly or continuously and that I can use as evidence for potential compliance auditors or customer reports and things like this?
Yevgeny: Yeah. Our suggested way is, one, to look at our premade policies, we have a bunch of them on our website. We have something called CloudQuery Hub where we list providers and policies. Yeah, we made those policies for CIS benchmarks and all kinds of industry standard compliance and security policies.
The policies are actually just lists of SQL queries in an HCL file, and CloudQuery knows how to parse them and just run them one by one and produce a result, so you can run them in CI or in a GitHub Action on a daily basis or whatever schedule you want, then create your alerts on top of it. So yeah, we have a very simple HCL file format which basically just groups together a bunch of SQL queries which define your policy. Yeah, you can put them in Git so you have all this compliance and security as code thing.
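The idea of a policy as a named list of SQL queries, run one by one, can be sketched as follows. This is an illustration under assumptions, not CloudQuery's policy format or runner: the policy is a plain Python list rather than an HCL file, the `buckets` and `instances` tables and the check names are hypothetical, and SQLite stands in for PostgreSQL. Each query is written to return the resources that violate the rule.

```python
import sqlite3
import sys

# A hypothetical policy: (check name, SQL query returning violating resources).
POLICY = [
    ("buckets-must-not-be-public", "SELECT name FROM buckets WHERE public = 1"),
    ("instances-must-have-owner", "SELECT id FROM instances WHERE owner IS NULL"),
]


def seed(conn):
    """Create and fill illustrative inventory tables (stand-ins for synced cloud data)."""
    conn.execute("CREATE TABLE buckets (name TEXT, public INTEGER)")
    conn.execute("CREATE TABLE instances (id TEXT, owner TEXT)")
    conn.executemany("INSERT INTO buckets VALUES (?, ?)", [("logs", 0), ("assets", 1)])
    conn.executemany("INSERT INTO instances VALUES (?, ?)", [("i-001", "team-a"), ("i-002", None)])


def run_policy(conn, policy):
    """Run each check in order and map its name to the list of violating resources."""
    return {name: [row[0] for row in conn.execute(query)] for name, query in policy}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    seed(conn)
    results = run_policy(conn, POLICY)
    for name, violations in results.items():
        print(name, "FAIL" if violations else "PASS", violations)
    # A non-zero exit code lets a scheduled CI job fail when any check has violations.
    sys.exit(1 if any(results.values()) else 0)
```

Because the policy is just data plus SQL, it can live in version control next to the rest of the infrastructure code, which is the "compliance as code" point made above.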
Marc: Let's switch for a second here, I'm curious. CloudQuery is an Open Source project, but there's a company behind it, there's a commercial entity. Do you have a commercial offering on top of it? Do you have plans to? Can you help me understand a little bit about how you're able to continue just to work full time on an Open Source project?
Yevgeny: Yeah, so currently we're working from VC funding but the idea eventually is to have a managed version. Right now our customers deploy CloudQuery with Terraform and Helm charts, so they deploy it and it runs periodically on their infrastructure. They maintain also the database, so yeah, it's self hosted right now but the idea eventually is to have a managed version where people won't need to maintain their own infrastructure. That will be the first step.
I don't know what features exactly, or if we'll have features that will exist only in the commercial product and not in the open source one, because it's a bit farther down the road. But yeah, the plan right now is really just to continue the focus on the open source adoption. We had great adoption in the last six months to a year from big companies, from Fastly to Autodesk, to Tempus and we want to continue growing this list of great users and supporting the current ones until we are sure we hit the critical mass where it makes sense to start also putting money into developing the managed version.
Because if you have too few users and you fork too early to a managed version, then you will end up with very few users on your managed version and they will turn out to be very expensive users.
Benjie: Yeah, it's an interesting topic though because if I am using CloudQuery, I probably don't want to have that data not on my servers. There's kind of a weird dichotomy there a little bit, right? I don't mean to be critical, I'm just saying if I'm using the managed version, that means that my data is going to live on your servers basically, that tells the whole compliance story for my organization?
Yevgeny: Yeah, so it's fair feedback and a good question. I don't know all the answers to that, but we do know from the customers that we talk to, yeah, some of them probably will never buy our managed version. But some of them actually said that, yeah, their whole policy is to use managed wherever possible just because infrastructure burden is real and they try to avoid it.
The results are somewhere in between, and this we will be able to understand only when we are there: how good or how bad the conversion rate is, and whether we need to come up with smarter solutions to make people more comfortable migrating to our managed one. Either with things like maybe hosting a database somewhere in their cloud, or not; maybe it won't be a problem, because we know it's not a problem for some of our customers, but we don't know the conversion yet.
Benjie: Well, to be the Devil's Advocate to my own point, I also think that there's a third party audit component that is really interesting, where CloudQuery is SOC2 compliant and all the ISO27,000,000, whatever it is.
Marc: We're only in thousands right now.
Benjie: Sorry, we're only in thousands right now. But yeah, having that and then of course there's the obvious... I can say it because it's not my company, there's options to help you with on prem versions of Replicated. Sorry, Marc, but I can shill for you every once in a while, right? No, this all makes a whole lot of sense, it's just a lot to think about, and honestly, when we get into the compliance world, we're all going through it and it's very interesting, and it's a really, really, really interesting topic.
Marc: But I think I hear one of the big takeaways is that there is going to be a commercial offering, but you don't want to cannibalize or hurt the Open Source tool right now so you're not in any rush to figure that out. When it makes sense, you're going to have it, but the Open Source version is here to stay and you're committed to supporting that right now.
Yevgeny: Yeah, so I think this is exactly it and I'm actually speaking from a painful lesson. When we just started it, I open sourced it first, even before the seed was raised. It took off relatively quickly, we started to get good traction from big companies and decided to double down on that. Our first thought was, "Okay, great. Let's also start the managed version, let's try to monetize quickly."
Then it turned out to be four or five months of very expensive experimenting by half of the team. Then we understood that it's not feasible to do both at this early stage because it's a complex project and you need to work a lot on the engineering to solve a lot of hard engineering problems and also to work a lot with the community on the adoption to understand what you need to develop. So the project really has to be mature enough before you release the managed version, and yeah, it will take time.
Now from a startup and money perspective, you should be as lean as possible until you get the project to this level because it will take a lot of time to monetize. Apart from maybe some support agreements, you want to be, or what we're trying to be, is as lean as possible until we are sure we've got it. Then we want to split and put more effort into the managed version, which we know will be very expensive.
Benjie: I think it's really interesting and it's a good lesson, in that we have seen a lot of projects fall on their face, falling into that trap. We've seen some other ones go the path that you're talking about and be unbelievably successful, so it's a good lesson. Again, our audience is very interested in the whole Open Source ecosystem, so it's a great way to think about how to build a company around an Open Source project and all the different gives and takes of it. I have a question, not to actually not talk about compliance or VC stuff, but what is CloudQuery implemented in? Tell us about how development works and all that stuff.
Yevgeny: Yeah, so it's written in Go. The first reason for that, apart from there being a lot of new projects written in Go, is really the distribution thing. It's compiled to one binary, so that really eases up distribution. Yeah, other than that it's a really good language, it has downsides, especially for some things. But I think in general for something like an ETL, a pluggable ETL framework, it's worked pretty well for us.
A little bit about the architecture, the architecture is actually really similar to Terraform because we used a lot of their underlying libraries like go-plugin because we wanted to create a pluggable architecture, so we have CloudQuery which is one binary and then we have our official plugins for AWS or GCP which are separate binaries and are hosted on our CloudQuery Hub.
CloudQuery downloads the plugins you want to use and then runs them via gRPC, via this pretty widely used HashiCorp library, go-plugin. I think they did a good job there and I think it's probably the best project out there around loading plugins at runtime.
Marc: Yeah, it's great to be able to find that and model the architecture after it because years ago that was a hard problem to solve in Go and being able to stand on the shoulders, if you will, and just be able to say, "Great, this is a provable pattern, it's got some maturity behind it now, we can actually write code in Go which is what we want to write code in for all the benefits that we like." It's just the Open Source community, right? And being able to reuse that library and packaging and follow that pattern.
Yevgeny: Yeah, that helped us a lot because it was a really tough problem and got much easier with this library. Even this library, when you use it, it's quite complex. The very good thing is it's battle tested and it works, you just need to read the documentation very well and understand how it works when you touch those pieces of code. But yeah, it saved us months of development, if not years.
Benjie: Yeah, and probably some pain down the line too. You're probably saving yourself a whole lot of pain for some weird, edge case that you guys didn't have, as they say. Does anyone know if Kubectl uses it for their plugin stuff? Does it use that? Do we know?
Marc: I actually don't know the implementation of that, but Kubectl plugins are just separate binaries that have to be on the path for the Krew-style plugins.
Benjie: Right, but they're extending their whole plugins, everyone's using plugins now for Kubectl, I think. Which is my most important question, Yevgeny, and we'll get back to other stuff. But Kubectl or KubeControl, what is your answer?
Yevgeny: I think KubeControl for me, actually I didn't know the other one.
Benjie: Oh my god, all right. Marc, the interview is over. All right, we're done. We're good. I'm a big Kubectl guy, I'm a big believer in Kubectl so I tend to ask everyone that question.
Yevgeny: I get it. Okay, I wasn't aware that this is how you pronounce it, okay?
Benjie: I'm on a mission, I don't actually think there is a right answer per se. It's an Emacs versus Vim type world, or tabs versus spaces. But okay, going back to the actual implementation, what is the hardest problem that you guys have tackled so far? Was it the plugin architecture or is there something else that's really challenging that you guys have dealt with?
Yevgeny: Yeah, that's a great question. I think the hardest problem is around how to support and maintain the huge amount of APIs and I think also in the data world, projects like Airbyte that integrate with a big amount of APIs, and also Terraform, they have similar challenges and different solutions to that. I think this is the biggest challenge and also the biggest engineering challenge, and I can talk about this a little bit.
Probably maybe it's also on Hacker News maybe a few months ago, on Terraform, that they stopped accepting community contributions for a while. I don't know if it's back, and if you go through the issues, the number of issues is huge. Either it's because of an underlying API or Terraform API, so this is the biggest challenge because for CloudQuery to be very useful, it has to integrate with a lot of, a lot of APIs both inside one cloud and also across clouds and other SaaS services.
You want to do that also fast because new APIs are introduced, and also you want to do that with a small team because otherwise you will have a very large cost, and we're not making any money yet so we want to be as lean as possible also when we will be making money. So that's the engineering challenge and to talk a little bit more about just the engineering challenge at scale here, is how do you maintain that also? And how do you test it?
A lot of our solutions were around code generation and trying to write tools that autogenerate that, both some of our integrations and also some of the tests. This is where we invest a lot so we can scale more easily, catch more bugs in development and stay as small as possible while supporting as many as possible APIs. I can talk about some of the code generation issues that we tackled.
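One way to picture the code generation approach is a generator that turns a declarative resource description into the table definition, instead of writing each one by hand. The sketch below is a hypothetical illustration, not CloudQuery's tooling: the `SPEC` format, the table name `aws_s3_buckets`, and the column names are all assumptions made for the example.

```python
# Hypothetical resource spec: in a real generator this could be derived from an API schema.
SPEC = {
    "table": "aws_s3_buckets",
    "columns": {
        "name": "TEXT",
        "region": "TEXT",
        "creation_date": "TIMESTAMP",
    },
}


def generate_create_table(spec):
    """Render a CREATE TABLE statement from a resource spec."""
    columns = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in spec["columns"].items()
    )
    return f"CREATE TABLE {spec['table']} (\n{columns}\n);"


def generate_mock_row(spec):
    """Render a placeholder row for mock tests, one value per column."""
    return {name: f"test-{name}" for name in spec["columns"]}


if __name__ == "__main__":
    print(generate_create_table(SPEC))
    print(generate_mock_row(SPEC))
```

Generating both the schema and the test fixtures from the same spec means a newly supported API needs one new description rather than hand-written code and tests, which is what keeps a small team able to cover many APIs.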
Marc: Yeah, I'd love to hear some stories there around that. Adding one or two in, probably not the hardest problem in the world, to go figure out how to query EC2 data and write it in. But when you actually get into both many of them and keeping them updated, that's when you're really, really relying on automation to make the product reliable and up to date for everybody. I'd love to hear more about how you did that.
Yevgeny: Yeah, exactly. One, I think that's a great point. When you just do it for EC2 people are like, "Okay, just running that." But when you start supporting hundreds or thousands of AWS APIs, there's the amount of issues from people that use it, like, "Oh, this API is not returning what I think it should." So how do you catch as many bugs as possible early on? Otherwise you will be bombarded with support requests.
Maybe before I talk about the code generation actually, in this specific world of providing abstraction on top of cloud APIs, it's a well known problem in the Terraform, Pulumi world, data world, and actually AWS are also aware of that, that their API is not unified and it's hard to generate code for it. So they actually introduced the AWS Cloud Control API about six or maybe seven months ago, which is an API specifically for tools like cloud security posture management, Terraform, Pulumi, so they can autogenerate their whole integration on top of this unified API endpoint.
It's still very early on and it doesn't support all their APIs and so on, but I think it's a step in the right direction from the AWS side. Yeah, from our end, we provide a lot of tooling around generating our plugins from AWS's non-unified APIs, so that's one part. The second part is also generating the tests, we have a lot of mock tests and we also have end to end tests with real infrastructure so we try really to test as much as possible because otherwise you will really be bombarded with support and small, annoying bug fixes. "Oh, this small field is not populated into the database."
Benjie: Right. And there are a lot of changes that happen on all these, there's a lot of surface area on the things that you guys integrate with so I can imagine quite a few headaches. That brings up a good point though, if I wanted to add something, I don't have your hub in front of me right now. What do I want to add? I want to add Linode, I like Linode, I used to like Linode, I still like Linode. I want to do this and I want to do the community thing, would you guys add actual Linode infrastructure for the end to end test? And how can people contribute? How can people take part in this I guess is my question?
Yevgeny: Yeah. So there are actually some community providers, you upload it to our CloudQuery Hub, but we just mark it as a community provider and then it's up to you to manage and maintain it. We maintain as much for our users as we can, we plan to write more when some of our tooling improves so we will be able to support more. But yeah, for example, the Yandex provider is a community provider, they maintain it, so the third party cloud provider maintains it, but also we saw other people write their own provider for, let's say, Datadog because they wanted to extract more configuration, so they maintain it.
Marc: How big is the team right now? How many people are working full time on CloudQuery?
Yevgeny: We are around 12 people, mostly engineering. I would say around eight people working on the Open Source project and three more people working on all our web facing assets, so CloudQuery Hub, all our documentation, both the CloudQuery docs and also the provider docs which is all the schemas, all the tables, our policies. It's a lot of work so we have a team just for that, and then also product and security engineer for customer support and educating our customers, writing content, tutorials. We've found it to be super helpful for current users and new users.
Marc: It's great, I'm looking at the Hub, I'm looking at your docs. You can tell there's a lot of effort put into this. You're solving a hard problem with a ton of different data, and documentation is just so, so important. I love the way that you actually define the schema so I can look at the Okta provider right now. I'm thinking, "Oh, I wonder if I can actually do this?" I can see the data you're pulling in and be able to know whether you're going to support the use case I have or not, or whether I need to consider creating an issue or try to extend that provider.
Yevgeny: Yeah, exactly. This is all up to date, so every time, for example, the Okta provider gets updated, or the AWS one, the docs are autopopulated. A bit similar to Terraform Registry in a sense, I think, and maybe also Docker Hub. But yeah, I think Terraform Registry... it's a good example because they are dealing with a very similar problem from the provisioning side. In some sense it's a very similar problem and maybe we have a bit easier life because we have only the read problem and not the write problem, which is harder.
Benjie: Sure. I want to talk about Open Source in general. You have this Open Source project. It looks like really good traction, a good number of stars, adoption of it, you've obviously been able to raise some money and create a company around it and focus on growing the team. There are a lot of Open Source projects out there that are also great, but maybe don't have that much visibility so what advice can you tell us or what stories can you share? How did you get to this many stars? Was it a big reveal where you worked behind the scenes and then Open Sourced the repo? Or was there a moment on Hacker News or Product Hunt or something else that really helped accelerate traction? Do you have any tips?
Yevgeny: Yeah, sure.
I think in general stars is an awful metric for things. Yeah, we got the initial star spike, it was from Hacker News. I had to post it like five times, to relaunch it like five times until it got to the front page. I wasn't able to do that. It's kind of like a hit or miss there.
The first spike was just a visibility issue, and then I just actually did a lot of small things, writing a lot about it. It didn't solve all my problems right away when I was still in the exploration phase. It was before we raised money, I was still trying to understand and talk to every CloudQuery user to understand do we have long term users here? Or do we have just one Hacker News spike? So yeah, if you have one Hacker News spike, please don't go raise money, even if you can. It's not a good idea.
Yeah, so for four months I was posting a lot of content and tutorials on Reddit, Hacker News and also just waiting to see if someone starts using it, because if I put it on the Hacker News, even if it's interesting for someone, maybe it's not his top priority right now. Maybe it will be high priority in a month. It was kind of like a drip of users from relatively medium to large to very large enterprises.
I was talking to all of them, trying to understand the use cases, trying to understand that they are going to use this on a day to day basis and only then I was ready to double down on that and raise the seed funding and then really continuing doing the same thing. We still do that, we write a lot of content, we bring new users to the platform, we talk to all of them, we have monthly community hours. So yeah, there is no one thing you really have to do all the time.
Benjie: Yeah, success by 1,000 cuts, as I like to say. This is a little bit off, or we're going back to something you said but you mentioned something that I just wanted to ask you about. You were acquired by GitLab, you mentioned, and I know that they are pretty famous for their remote setup. A lot of us were forced into a remote setup type world a while ago, and now we're all adjusting to maybe a hybrid, maybe some people are staying remote. I just want to say, what are some of the big learnings and things that you learned over there?
Yevgeny: One I learned, I guess, it seems to work, but it's not something crazy. That's the first thing, that nothing completely breaks if you just do remote, and I think some people are maybe aware or are scared of that if they will go remote, that's it. But before that I actually worked remote for a lot of years, so it wasn't completely new to me. But it still was nice to see it work in a large enterprise.
I learned that communication is super important, written communication, Slack, Notion, those pull requests, trying to be as communicative as possible, this really helps. Timezones are also quite important in remote, so if you have the luxury of being more or less in the same timezone, this will be helpful. But it really all depends on what you can achieve, what's your budget where you can hire, sometimes you have to be flexible.
Benjie: Sure, yeah. Were there any particular little things, little details that they did that helped bring the team together, keep people on the same page. Anything specific that was just like, "We did a meeting every 13 days and that was super helpful"? I don't know, anything interesting there?
Yevgeny: Yeah. I took probably a few things, but I don't remember all of them, maybe some of them I take for granted now. But one thing that I do remember that we also have in our company that I like to do, in GitLab you could schedule coffee chats whenever you want with any other people on the calendar. I was doing that quite often, just with anyone in the company, you wanted to learn something, you wanted to meet someone new, just put it there and people will take the meeting or will just say, "It doesn't work for me on this day, let's do it another day."
So they were saying it was okay to do, so it wasn't strange for people so I think it really helped. I personally really liked it, I met a lot of new people this way, learned new things. It was a great opportunity. You have an opportunity to talk to any of 1,000 people in GitLab for free, right?
Marc: Totally cool. So I want to come back and think about going back to Open Source and the ecosystem in general. You obviously work with cloud providers, not specifically Kubernetes. How close are you to the CNCF? Have you considered working more closely? Even potentially making a contribution of a sandbox project? Then the last question in this space, there is Cloud Custodian in the sandbox right now, right? Which is an Open Source project that solves, at least on the surface, reading the docs, it sounds like a similar problem and I'd love to hear your take on where CloudQuery shines versus Cloud Custodian and some of the differences between the two projects?
Yevgeny: Great question. On the Kubernetes, we have a Kubernetes provider. It is still early on and, yeah, we want to invest more because we know we have a few users that use it and we've been asked for a lot of improvements around that, and we know also it's a huge pain point. So there is definitely more work around this integration on our roadmap. For the CNCF sandbox, I think probably when we will get to it, I'll be more knowledgeable if it will make sense for us.
For now, I hear a couple of different suggestions and ideas from people that had more experience around that and I'll be happy also to hear maybe your experience. A few folks told us basically it's good when, I don't know, your users ask for that. But other than that, the overhead is too high, some folks said that it's worth it and it helped them. So I hear a few different opinions about that, so I don't have my own just because it's a bit too far from it.
With regards to Cloud Custodian, actually it's a project I looked into even before we started CloudQuery because I was looking into the security space. I was looking into is there anything Open Source alternative to all those enterprise companies? So I saw Cloud Custodian. But one of the big issues there when I tried to use it is that they focused on the end use case so you have to learn this DSL language for security, and they combine their ETL layer with the query layer.
Which makes the project only serviceable to this use case, while the number of use cases is actually infinite. So in my opinion the idea was quite good, but the architecture I think wasn't the best because actually what you want to do is to separate those two completely different problems. They're hard but different problems.
The one problem is to get all the data to a database, which is the first hard problem, and then actually the easier problem is then just to write rules on that and also I think writing the rules in query language from a maintainer perspective, and also from a user perspective, is much easier when you use a standard one. You don't need to implement a new one, right? So it's a hard enough problem to solve the first one, the data engineering problem, that you probably don't want to also implement a query language if it's a solved problem.
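As a rough illustration of the separation Yevgeny describes, the sketch below keeps the loading step and the rule step apart and writes the rule as plain SQL. It is only a sketch under stated assumptions: the `storage_buckets` table, its columns, and the sample rows are hypothetical and are not CloudQuery's actual schema, and SQLite stands in for PostgreSQL so the example runs with nothing but the Python standard library.

```python
import sqlite3

# Hypothetical table, columns, and rows for illustration only.
# SQLite is a stand-in here for the PostgreSQL database discussed in the episode.

def load_inventory(conn, buckets):
    """Problem 1 (data engineering): get fetched configuration into a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS storage_buckets "
        "(name TEXT, region TEXT, public INTEGER)"
    )
    conn.executemany("INSERT INTO storage_buckets VALUES (?, ?, ?)", buckets)

# Problem 2 (rules): a policy is just a query in a standard language.
PUBLIC_BUCKET_RULE = (
    "SELECT name FROM storage_buckets WHERE public = 1 ORDER BY name"
)

def run_rule(conn, rule_sql):
    """Return the names of resources that violate the rule."""
    return [row[0] for row in conn.execute(rule_sql)]

conn = sqlite3.connect(":memory:")
load_inventory(
    conn,
    [
        ("logs", "us-east-1", 0),
        ("assets", "eu-west-1", 1),
        ("backups", "us-east-1", 1),
    ],
)
violations = run_rule(conn, PUBLIC_BUCKET_RULE)
print(violations)  # ['assets', 'backups']
```

Because the rule is an ordinary query string, adding a new use case means writing another query against the same tables rather than extending a purpose-built rule language.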
Benjie: Yeah, that makes sense. Thinking more about those, the compliance and the policies and stuff like this, I'm wondering, the use case very much makes sense. I totally understand it. We've recently gone through SOC2 type 2 compliance where we have to generate reports for auditors and some of that's... There may be a startup out there that's run for a while, has a lot of cloud infrastructure that they might not be able to inventory really well, so CloudQuery is amazing in that scenario, it's also good just to start off for compliance.
But I wonder now, like when we first started, about the macroeconomic climate stuff is changing in tech and one of the side effects is startups have to really become efficient and you have to really think about managing costs more than you used to. Have you found use cases or has there been any transitions or surprises where you see folks using CloudQuery to get a handle on cloud cost more and more these days? Or has that not really changed at all?
Yevgeny: Yeah. I think the users that use CloudQuery even before that, I think were aware of that so I think it was a top priority for them also before that.
Something that we saw, even before the climate change, is quite a few big companies moving or migrating from enterprise solutions to Open Source stack just because it became so immensely complex and expensive in the last couple of years, that they understood that actually it isn't worth it any more and they can extend it to their own use cases. Then it becomes more work and more money than they initially thought, so they were migrating to open source stack, to CloudQuery or to considering other projects in this space. I think this is something that I think and probably I also hope that we will see more in the future.
Marc: Yeah, it goes back to the other conversation too. I think you're doing a great job and it sounds like that move from proprietary software to Open Source solutions like CloudQuery makes perfect sense. I think that speaks to the next step, is where the foundations like CNCF, Apache Foundation, things like this start to become the next level of continuity and longevity of the project, knowing that while it's Open Source and there's a way that...
Hopefully this never happens, but if you as an organization no longer are supporting the project, great, it's Open Source, I can throw some engineers on it and we can do what we need to do. But with that foundation ownership of the trademark, there's more of a like an open governance model, there's other folks that have expressed that they're also counting on it and it gives those enterprises an even more proven way to ensure that that project is going to last.
Yevgeny: Yeah. I think that's probably the best point in favor of that, I think. But also I think if you go to, and again this is something that probably I would also dig deeper and try to get more data points, and we will get to that... Probably also, for example, companies like Terraform, they are not in the CNCF even though they have success. So you have Confluent and you have Apache Projects, so you have successful examples on both sides, right?
Marc: Yeah. I'm here, I think Benjie and I, neither of us are going to advocate for or against it, I think it's just a really interesting thing to talk about and understand why some people have taken projects really, really early and put them into a foundation. And some people, like to your point, Terraform is an Open Source project and it looks like it's still owned by HashiCorp. The difference is it's interesting to have those conversations and understand the motivations behind it.
Benjie: I think that the interesting thing that we're all talking around is that it used to be you never get fired for buying IBM, and now it's kind of like you're probably not going to get fired for buying or using a pretty robust Open Source ecosystem and community. It's really weird how that's shifted. Obviously now IBM bought Red Hat so I guess this is all... This analogy has just gone sideways.
But it's a really interesting shift in enterprise purchasing, where Open Source is now something that is considered reliable and predictable. Then obviously you layer on these different foundations and governance and you get some more stability and confidence there. It just speaks, because it was not like that, I would even go, six years ago, seven years ago. Marc, what would you say?
Marc: Yeah. I think you still looked for that support contract, right? And now you're like, "No, I need to take the future into my own hands or at least ensure that there's a path for me to be able to do it."
Benjie: Exactly, yeah. So speaking of a path, anything interesting to share on the roadmap?
Yevgeny: Yeah, I can talk a little bit about some of the things that are coming. One will be looking into also adding more database support. I don't know exactly when, because right now our focus is really around tooling, automation and testing, and the data problem. But the most important thing is that the data that you extract, it's coherent, right, you want to know that what you have in the database is your current configuration, you don't have missing things or incorrect things.
So this is something we put a lot of effort into. But in terms of features, yeah, we are thinking about supporting more databases because it's a data problem and some customers have different stacks, they build their data analysis platform in different ways, it's not always PostgreSQL. Sometimes it's data lakes, data warehouses, things like BigQuery, Snowflake, so that's something that we'll be looking into.
Another thing that is not a feature per se, but more of a tutorial that we actually just released, and people asked for. Actually they even built it in their own CloudQuery stack, but they asked to contribute it to our tutorials and we thought it would be interesting to share with the community. So actually building a GraphQL layer on top of this data, so having a GraphQL API.
We did it with a project called PostGraphile, if you know it. It automatically generates a GraphQL endpoint from your schema. This was pretty useful for some of the users for a search use case for their developers. The SRE team, for example, installed or deployed CloudQuery and now they wanted to give all the data to the organization, they had a GraphQL UI to search for instances or IPs when they debug or develop something and they don't have access to all the accounts or they don't even know where this IP or instance is located. So just using a GraphQL UI is pretty neat, so this is something that we did recently.
Marc: Yeah, we'll include a link to that in the shownotes here too.
Benjie: Anything else on the far roadmap?
Yevgeny: Yeah, so another thing that we were looking into is a problem that we heard from security teams or DevOps teams. When they build their security guardrails, both for the infrastructure as deployed and for infrastructure as code, they end up writing the same rules twice. They write them for their infrastructure, let's say, if they have a CSPM or whatever enterprise product, in one query language, and then they write the same rules in things like OPA for Terraform or for CloudFormation in a different language.
Then they have to maintain it as they go, so something that we were looking into is giving teams the ability to write and maintain all the rules with one query language. The idea around that was to integrate with things like Terraform or CloudFormation, read the Terraform state and convert it, transform it to a CloudQuery schema so this will give security teams the ability to write and to run the CloudQuery policies both on the infrastructure as deployed and on the infrastructure as code. Not only that, it will also open the door for things like detecting drift and doing all that without writing code, just by solving a data problem.
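To make the "one query language" idea concrete, here is a minimal sketch that runs the same rule against a table of deployed resources and a table of declared resources, then finds drift with a join. Everything in it is an assumption made for illustration: the `deployed_instances` and `declared_instances` tables, their columns, and the sample rows are hypothetical, nothing here reads real Terraform state, and SQLite again stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: one filled from the cloud API ("as deployed"),
# one filled from infrastructure-as-code state ("as declared").
for table in ("deployed_instances", "declared_instances"):
    conn.execute(
        f"CREATE TABLE {table} "
        "(id TEXT PRIMARY KEY, instance_type TEXT, public_ip INTEGER)"
    )

conn.executemany(
    "INSERT INTO deployed_instances VALUES (?, ?, ?)",
    [("web-1", "t3.small", 1), ("db-1", "m5.large", 0), ("tmp-9", "t3.micro", 1)],
)
conn.executemany(
    "INSERT INTO declared_instances VALUES (?, ?, ?)",
    [("web-1", "t3.small", 1), ("db-1", "m5.xlarge", 0)],
)

def public_ip_rule(table):
    """One rule, written once, parameterized only by which table it reads."""
    return f"SELECT id FROM {table} WHERE public_ip = 1 ORDER BY id"

deployed_violations = [r[0] for r in conn.execute(public_ip_rule("deployed_instances"))]
declared_violations = [r[0] for r in conn.execute(public_ip_rule("declared_instances"))]

# Drift: deployed resources that are undeclared, or declared with a different type.
DRIFT_SQL = """
SELECT d.id
FROM deployed_instances AS d
LEFT JOIN declared_instances AS c ON c.id = d.id
WHERE c.id IS NULL OR c.instance_type <> d.instance_type
ORDER BY d.id
"""
drift = [r[0] for r in conn.execute(DRIFT_SQL)]

print(deployed_violations)  # ['tmp-9', 'web-1']
print(declared_violations)  # ['web-1']
print(drift)                # ['db-1', 'tmp-9']
```

Once both sources share a schema, the policy and the drift check are both just queries, which is the sense in which this becomes a data problem rather than a coding one.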
Benjie: Wow. That's super cool. Well, Yevgeny, I think we're at time here but I really just wanted to thank you for coming on. We'll make sure to put a bunch of links in the shownotes so they can find you. Really appreciate it, thanks for coming on. Really excited to see where CloudQuery ends up going.
Yevgeny: Awesome. Thanks, Benjie. Thanks, Marc. Really enjoyed it here, it was great chatting with you.