Ep. #2, The Early Days of GitHub with Tom Preston-Werner
In episode 2 of EnterpriseReady, Grant chats with Tom Preston-Werner about how the open source company he co-founded, GitHub, rose up to become an essential coding resource for developers everywhere.
Tom Preston-Werner is co-founder and former CEO/CTO of GitHub. More recently he co-founded Chatterbug, a language learning platform. Tom is also the author of the wildly popular static site generator Jekyll.
transcript
Grant Miller: Hey, Tom. Thanks for joining.
Tom Preston-Werner: Hey Grant, how are you doing?
Grant: I'm good. I'm excited to have you on the podcast. You've been one of the most amazing guides for Replicated and EnterpriseReady. Most listeners won't know this, but the conversations that we've had over the last three or four years helped inspire everything that we built at EnterpriseReady. You helped us synthesize the thesis around what it takes to enable software companies to distribute to the enterprise.
Tom: I'm glad to have been a part of it. Getting software ready for the enterprise is something that still is extremely painful, very complicated, and I hope that nobody has to go through it the way that we did at GitHub. It was painful, it was a learning experience.
Grant: A lot of that learning experience that you passed on to me, we used and we tried to build into products and writing and everything else. Hopefully what we can do here today is expose some of that same knowledge for the rest of the world, so that everybody can hear the things you learned along the way and the challenges that you faced, and then hopefully make their products better and easier to integrate with different enterprise systems.
I mentioned your background earlier, but I would love to just hear it in your words. How did you get into enterprise software? What was the career trajectory that took you into it, and how did you first figure out you were building enterprise software?
Tom: It was by accident, in that all of GitHub was by accident. GitHub is the first enterprise software that I've worked on, and it didn't start out as enterprise software. It started out as a SaaS model. You go to the website, you'd sign up, you'd have your user and you'd share some repositories. It started small, it was just a side project when it got started.
The turning point was a couple of years in. We went to a PHP conference and we had an exhibiting booth there, it was ZendCon or something. It was down in San Jose. We drove down there, the four of us founders, and we set up a booth. We got some booth stuff. Did you know that you have to pay for carpet when you have an exhibition booth? You have to pay for carpet. It's amazing. And electricity.
Grant: I did not know that.
Tom: This is something that you'll learn if you're going to go down there and you want to advertise to the enterprise in an exhibit booth: you pay for everything. Keep that in mind. But we went down there and we set up a booth and people came by, and we're like, "This is GitHub. It's version control, it's the next generation of version control. Easy branching, easy merging. Distributed. Offline, etc." And they're like, "Wow this looks amazing. I could use this at my company. It sounds way better than what we're using. How do I do it?"
I'm like, "OK. You go to the website, GitHub.com, and you click on sign up, and then you sign up." And they're like, "Woah. Stop right there. Because I'm an enterprise, I work for an enterprise, and we don't do that stuff. We need something that's on-prem or it's not going to work." So we're like, "OK. Keep us in mind for the future."
And this happened over and over throughout the day, or the couple of days that we went down there. We heard it a lot. We heard it constantly. It was probably a third or a half of people that would come by, and we ended up just saying, "Thanks for coming by. There's nothing that we can do for you."
On the car ride back up to San Francisco I remember talking about it with the guys and we were like, "Should we be doing something in the enterprise? Because this seems like a lot of people that we can't have as customers. They can't use our product. They need it on-prem. They need something installable. Should we work on it?" This was only a couple of years into GitHub, we were still very small. Probably six people, seven people at the time.
Grant: How many users do you think you had at that point?
Tom: We had thousands of users at the time, but paying users would have been-- I honestly don't remember. It wasn't a huge number, but it felt like we had something that people were using. They loved it. They were paying for it. Getting people to pay for GitHub was never a problem. It just--
Grant: You don't think that any of those early customers that were using it, and signing up online, or seeing it on different-- You guys were pretty involved in the Ruby community. None of that ever surfaced this idea that you needed to have a private instance that could be deployed into an enterprise data center?
Tom: No. It was the first time that we ever saw requests for it, was at this conference where enterprise people were. The thing is, we were big in the Ruby community, like you said. Enterprises weren't using Ruby back then. It was still fresh enough. It was like, Twitter was the biggest thing that was using it, and Twitter wasn't even that big at the time. That was our crowd.
That was the people that we hung out with, those were the people that were noticing what we were doing and coming to sign up. It wasn't the enterprise types that were ending up on GitHub trying to do something, and then being like, "I can't do this. Let me e-mail someone about it."
Either they just weren't there or they didn't want to go through the hassle of requesting it. Now, we did eventually have people that worked in larger companies use GitHub, but they just used it the normal way, through SaaS. They would do it under the radar, or maybe their division would approve it, and they could use it in the SaaS model for just their division.
Grant: Shadow IT, as the industry has started to call it.
Tom: Yeah, exactly. People want to get things done.
Developers want to use the best tools, and they'll find a way.
It's like life, it'll always find a way.
Grant: Thanks, Jeff Goldblum.
Tom: But that made it easy to then transition into the enterprise as more and more developers started using it personally, and in shadow IT, that they could come to us later on when we started adding more large organization features. We went a long time before we added organizations, and the way that you would deal with teams was to just, one person would have an account under their own name and then they would add collaborators to a repository, and then other people could work on it.
It wasn't a great way to manage those users. We eventually added an organization plan that had different pricing and features to accommodate larger teams, then we helped people transition to that new plan. We grandfathered everyone in that would want to do it and then people could upgrade to the team plan, to the organization plan, and then they would pay more to get those extra features.
Grant: Right. Creating a little bit of product assortment there, in terms of using that team-type functionality as a way to differentiate between individual contributors working on personal projects and companies, where they had large teams of software developers who were all going to be collaborating on multiple different projects.
Tom: Yeah, exactly. And these people would also want a separate billing contact. Here's something we found out early on. When you've got a company that wants to buy a plan, they want to have someone who can pay for it that's not a developer. The developers often don't have access to a company credit card to pay for it.
Then they're like, "Let me get my billing person on here." And it's like, "OK. Now the organization plan has to also have a billing address and contact that is separate from any of the developers, and that's attached to the account. What permissions do they have? What can they do?" We added a lot of those things that we discovered just through people abusing the system and finding a way to solve their problems with the existing toolset that we had.
We always would try to figure out what people were doing and then solve the problems that they had, and that was the name of the game throughout.
As we approached all of the enterprise problems, we took that same approach: we were learning what enterprises needed at the time. Like I said before, I hadn't ever made software for enterprises. I had no idea what features they wanted.
Grant: And you weren't pulling in product managers from CA or Oracle that had built enterprise software before, or even spending time with folks that had done it. You were finding the problem, like "We need roles and permissioning," and then trying to solve it from scratch. Is that right?
Tom: Yeah, pretty much. We didn't hire executives in the normal way. We didn't do any executive searches until after we took Series-A money. The Andreessen Horowitz money. After that we hired our CFO and that became our first proper executive hire.
We were trying to solve problems in a different way, we were trying to do management a different way. We were trying to do everything in a different way.
We did have the advantage that we were building a product for ourselves. We were all developers, most of the company. We hired people that were not developers, but everyone in the company learned how to use GitHub and used it to collaborate, even if it was on blog posts or even things like accounting.
Some of those things would go through GitHub. All the conversations would happen on GitHub as much as possible. We were building a product for ourselves and we knew what features we needed as developers, and we thought, "If we can build what we need, then this will apply to people throughout the industry no matter what language they're writing and whether they're enterprise or not, as far as it comes to the core products."
Then as we had people that became interested in using GitHub in the enterprise, then they would say, "OK. You have an on-prem solution," which originally was called GitHub FI, which stood for "firewall install." I don't know. I don't know why we didn't just call it GitHub Enterprise from the very beginning, but we thought GitHub FI sounded cool. That was designed to be installed behind your firewall, which is why it was called that. It was hard.
It was hard to develop the way that we initially started developing it, which we can talk about. I have some advice, whether you should have two separate code bases that you stream code from the SaaS model into the enterprise model, and such. But we were just working from first principles. That's how we built the company, from first principles.
Everything, we questioned everything. We questioned how you built enterprise software. We started with a company called BitRock. They had something that allowed you to package up your stuff and make an installer. We used it for a little while. That was the very first way that you could install GitHub in an enterprise. It obfuscated code and it had some features that were nice, but they just weren't flexible enough.
We couldn't get done what we wanted to get done quickly enough, and because we were driven so much by first principles we wanted to do things differently. Some of the decisions that they made were not decisions that we wanted to make. We eventually went off of that solution and we wrote our own. We wrote a lot of our own solutions at the time. So much software didn't exist 10 years ago that is commonplace now, that we ended up writing a lot of our own solutions.
Grant: Sure. When you think about that first principles approach, it's an interesting perspective to take because I don't think anyone had done that for enterprise software prior to GitHub. And ultimately Replicated and our customers, and anyone that's read EnterpriseReady, have benefited a lot from the first principles thinking that you did at that point, because it helped inspire a lot of the patterns that became adopted within the industry. Now some of the things we can look back on and debate if it's the right choice.
I always think about the GitHub user-centric model, versus org-centric model where your GitHub handle lives with you no matter where you go. Why did you guys do that to begin with? Is that just how it started and you decided to keep it that way? Versus, you have a different user account for every org that you're part of for your work or for open source, or something else?
Tom: It just felt like the right decision. Are you talking about specifically, that you have the URL structure? Where it's like, GitHub.com/username/project as opposed to /projects/something? And there's users that belong--?
Grant: No, what I think about it as, I have maybe three different Google accounts. My personal Gmail, I used to have a side-project Gmail, and then I would have a work Gmail. These are all things that I would sign into, now called G-suite, and I would have a separate account for each one of those different organizations or projects. I could be logged into all of them at the same time and toggle between them, but they were different accounts.
Dropbox now has this concept where you can have a personal account and a work account, though they still only do two. But GitHub has always been user-centric, other than your GitHub Enterprise user account, which would be totally separate. If you use the SaaS product, that user joins an org or joins a company and then gets access to everything. It's user-centric versus organization-centric. I don't log in to GitHub.com as Grant@Replicated.com, I log in with GrantLMiller@gmail.com.
Tom: Part of it is that we just started in a way that was much more user-centric, and this was a very specific decision which was counter to how, very specifically, SourceForge did it. Before GitHub, SourceForge was a common place to put stuff. If you remember SourceForge had projects as the primary bit in the namespace.
So you'd create an open source piece of software on there and that would take up SourceForge.com/whatever that name was. This is probably going to be a bit of history for many people listening to this, and new developers today, but on SourceForge when you created a project you couldn't just create a project. You had to propose and request the creation of a project, after which SourceForge administrators would then approve said project before you could do anything with it.
Grant: Wow, I didn't know that.
Tom: This just felt like complete madness to us. It was like, "Why do they get to decide what software you can put up there?" Part of the problem was that they had a scarce resource which was these top-level project names. So we were like, "If we make the user be the primary then that's a natural namespace, and you can put all your stuff underneath there, and everyone gets their own stuff.
A hundred people can have a repository named the same thing, because it's all scoped by their username, and then you don't have to request permission anymore." This was life changing for people. They could go to GitHub, and they'd be like, "I can just create as many repos as I want. I don't have to have permission to create a repository. This is life changing."
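To make the namespace point concrete, here is a minimal sketch contrasting a flat, project-first namespace with a user-scoped one. It is an illustration only, not how SourceForge or GitHub were actually implemented; the class names `GlobalRegistry`, `UserScopedRegistry`, and `NamespaceTaken` are hypothetical.

```python
class NamespaceTaken(Exception):
    """Raised when a name is already claimed in the relevant namespace."""


class GlobalRegistry:
    """Project-first model (hypothetical): one flat namespace shared by everyone."""

    def __init__(self):
        self._projects = {}

    def create(self, owner, name):
        # The name is a scarce, site-wide resource: the first claimant wins.
        if name in self._projects:
            raise NamespaceTaken(name)
        self._projects[name] = owner
        return f"/{name}"


class UserScopedRegistry:
    """User-first model (hypothetical): names only need to be unique per owner."""

    def __init__(self):
        self._repos = set()

    def create(self, owner, name):
        # Uniqueness is checked on the (owner, name) pair, so two users
        # never compete for the same repository name.
        key = (owner, name)
        if key in self._repos:
            raise NamespaceTaken(f"{owner}/{name}")
        self._repos.add(key)
        return f"/{owner}/{name}"
```

In the flat registry a second user asking for an existing name is rejected, which is the scarcity that motivates an approval step. In the scoped registry both users get their own path and no gatekeeper is needed.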
Grant: Yeah. I wouldn't have realized that was such a problem, but you're right. Looking back on it, I get a notification for every new repo that we create, and there's two or three a week just within the 15 developers we have. I imagine going and trying to get permission for those things would just be a nightmare.
Tom: It was. It brought open source software to a level of-- It made it too important. It was like, "Your project has to be the real deal, it has to be a serious project to be on SourceForge because it's going to take up a whole namespace, and you got to get approval, and it's got to be legit." After GitHub, then people were like, "We can share anything. It could be stuff that people care about or not, it's just code. It's all just code."
So we tried to make it a little bit more irreverent that people didn't see bits of code as so holy and so important, that they could be small or unimportant things. That reliance on users as the primary factor, it just felt natural. It was like, "OK. You have a user on GitHub and you can put code up and you can collaborate on code."
You always do it through your username, because guess what, there's only one of you. So why do you need different personas? You're just a person working on code, and that code might be in different places, and you may or may not have access to that code. So why would we make you create separate accounts to do it?
If you look at your own behavioral patterns, when you go to use some Google property and you try to go to it directly or you click on something from an e-mail or something-- I use a program called Mailplane. When I click on a link to a Google document or something it takes me over to my browser, and let me tell you, the number of times that it's already on the correct Google user is almost zero.
I almost always struggle with Google's login thing. It's always chosen the wrong person. I'm always having the wrong permissions, and it's infuriating. Places like Slack do this as well. Go and try to log into your Slack account on a specific one. It's like, "Oh my God. I have like, 12. And I have to remember that my 1Password is just littered with Slack logins." It's like, "I'm just one person. Either I have access to a room, or I don't."
It doesn't have to be that complicated. Every service should be, you have one user and you have access control. I don't know why you need to have a separate user for every use case.
It makes me crazy.
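The "one user plus access control" model can be sketched roughly as follows. This is a simplified illustration under assumed rules, not GitHub's actual data model; the `User`, `Repo`, and `can_read` names and the specific permission checks are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A single identity, regardless of how many organizations the person works with."""
    login: str
    memberships: set = field(default_factory=set)  # names of orgs the user belongs to


@dataclass
class Repo:
    owner: str            # either a user login or an org name
    name: str
    private: bool = False
    collaborators: set = field(default_factory=set)  # logins granted direct access


def can_read(user, repo):
    """Decide access from permissions, not from which account is logged in."""
    if not repo.private:
        return True
    return (
        repo.owner == user.login
        or repo.owner in user.memberships
        or user.login in repo.collaborators
    )
```

The same `User` object is used for personal, work, and open source repositories. Joining or leaving an organization changes `memberships`; it never requires a second account.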
Grant: I love this topic because I think it's interesting, and it's a foundational decision that you make when you're building a company. This is the data structure around your user table. To change this inside of something that's around for a few years becomes impossible. Those early decisions you make set the course, and I love hearing your perspective that you would love to see everyone take a user-centric perspective, and then use permissioning in order to grant or deny access.
Tom: That's true. Now, it's not all rosy. There is one downside to that approach, to the GitHub approach. It makes it difficult when you're talking about notifications, e-mail notifications. Right now they're generally going to go to one address, so it's like, "Do you choose your personal address, and now you've got a bunch of work stuff going to your personal address? How do you divvy those up?" That to me is the one downside.
If you want to solve that well then you come up with a system where you're saying, "OK. This repository now notifies this e-mail address," and stuff. It's a sticky solution. Now that I think about it a little bit more, it's not so clear cut, and the notifications problem is the biggest one that would drive you to say, "Use a different e-mail address for each thing, and then we'll know where to send notifications for that thing." But at the same time, I'm still one person.
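One way to picture the "this repository notifies this e-mail address" idea is a per-owner routing table with a fallback to the user's default address. This is a hedged sketch of that idea only; the function name, the routing-by-owner rule, and the example addresses are all assumptions made for illustration.

```python
def notification_address(default_email, routes, repo_owner):
    """Pick where to send a notification for a repository.

    `routes` maps a repository owner (user or org name) to the e-mail
    address that should receive notifications for that owner's repos.
    Anything without an explicit route falls back to `default_email`.
    """
    return routes.get(repo_owner, default_email)


# Hypothetical settings for a single user with one account.
routes = {"acme": "grant@acme.example"}
work = notification_address("grant@home.example", routes, "acme")
personal = notification_address("grant@home.example", routes, "some-oss-project")
```

The person keeps a single identity, and only the delivery address varies by where the repository lives.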
Grant: It feels like one of those problems that someone could solve from first principles again, because you're right: if you break down how GitHub does it, how Google does it, how Dropbox or Slack does it, none of these are the perfect implementation. This is exactly where that first principles thinking comes in, going back to, "What are we trying to accomplish here, and what are the challenges?" Now that you have so many examples and points from which you can draw, someone could build a very elegant version of this.
Tom: There's a solution, maybe it's just not commonly practiced. I don't know what the perfect solution is. All I know is that I have a lot less trouble getting to what I want to get to on GitHub than I do on anything else.
Grant: Sure. When you build software, when you think about these features, do you draw inspiration from other products? Do you look at what other people have done and say, "That's beautiful. I love what they've done there and the implementation that they've done to this mundane problem."
Tom: We didn't do a ton of that at GitHub. Like I said, we were focused on first principles, so we tried to think through everything fresh. We weren't using a lot of reference material. Software is way better today. I think back to 10 years ago at the quality of tooling that was available, and it's light years ahead today compared to what it was before.
I think there are a lot of good examples today of patterns that people are using for all kinds of different things. We just had this idea that we could do it differently, that we could do it better if we thought about things from scratch. In a lot of ways we did do better. We wanted to build something great for ourselves, and we said, "What should it look like? If we build an issue tracking system, what should it look like?"
It ended up looking like several different things throughout the history of GitHub. Now that I'm thinking about it, the first version of Issues did look like Gmail, in a way. It had an interface that was reminiscent of an e-mail client, because that was a nice way to go through things, and you could classify issues and stuff. There were some patterns around e-mail, and Gmail has had an amazing interface for many years, and they continually improve.
I just love the work that they do on that product, which is shocking for me to say about a Google-designed product because I'm not usually a fan of their UI. But I think they do a great job with e-mail, at least for me, it works really well. So yeah, we did take some inspiration. Gmail is my shining light of inspiration in some of the early GitHub days, but the interface has changed a lot since then. We realized that it wasn't necessarily the right model for issues.
Grant: Sure. If you're taking a first principles approach to a lot of these different features, most of your perspective is to build for yourselves. But when it comes to features that you weren't building for yourselves. Let's think about the on-prem installer, or role-based access control, or reporting.
Stuff that you wouldn't necessarily want in a product, but your customers were asking for, these enterprises were asking for.
What was the process around discovery and initial implementation and feedback? Who was involved, how were you doing customer interviews and things like that? Or, what was that process overall?
Tom: Again, it was like, "OK. You need auditing." Auditing was something that we heard a lot from enterprise clients, like, "We need to know when someone gets access to a repository, when they lose access to it, who's downloaded, who's cloned stuff." This was valuable from a forensics perspective. So we'd listen.
We talked to the customers. We'd say, "OK. You want to use GitHub Enterprise but you need assurances that you can trace certain things." We'd get a list, and it's like, "OK. Tell me what things you care about. What is it that's driving that need? What do you need to do?" And then they say, "OK. We need to do forensics on what happened. If we learned that some code was leaked, we need to be able to trace back and see exactly who had access to that code and when that code moved across certain boundaries."
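The forensic question described here, "who had access to this code at a given time," can be answered by replaying an append-only log of access events. The sketch below is an illustration under assumed event names, not GitHub Enterprise's actual audit log format; `AuditEvent`, `who_had_access`, and the action strings are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEvent:
    at: datetime   # when the event happened
    user: str      # the user whose access changed
    action: str    # assumed values: "access_granted" or "access_revoked"
    repo: str      # repository the event applies to


def who_had_access(events, repo, at):
    """Replay grant/revoke events in time order to find who had access at `at`."""
    holders = set()
    for event in sorted(events, key=lambda e: e.at):
        if event.repo != repo or event.at > at:
            continue
        if event.action == "access_granted":
            holders.add(event.user)
        elif event.action == "access_revoked":
            holders.discard(event.user)
    return holders
```

Because events are only ever appended, the same log can answer the question for any point in the past. A fuller log would also record clones and downloads, which the customers in this conversation asked for.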
So we'd say, "OK. What else?" And they'd say, "We also need the ability to easily remove someone from all of the organizations throughout the whole site when they quit, or we fire them," and you say, "OK." But they want it integrated with their LDAP, or their Active Directory or something. So you're like, "OK. All right. You want something that will integrate with a single sign-on thing. OK. What else?"
We'd just keep asking, "What do you need to solve the problems that you have?" Instead of trying to make them spec out the solution for us, it was more, "Tell us your needs. What's lacking for you? What do you need in order to be comfortable with the software, and to actually use it?" What we usually find is once we start asking that, it ends up being a pretty small set of things that most organizations would ask for, straight away.
They were usually comfortable without having everything that they wanted as long as there was a plan for us to have it eventually.
That we weren't going to say, "We are antithetical to that need. We'll never do it." If they heard that then they might say, "OK. This product isn't going to work for us because this is something that we need eventually." But if it was on a roadmap and we weren't going to say we weren't ever going to do it, then we could make them pretty happy.
So that's something for people who are building enterprise software that's important to know, is even when an enterprise sends you some ridiculous requirements list that's like, "Here's a 200-page document and you need to satisfy all of these check boxes for us to buy your product." That's something that they'll send to every vendor, and then you go through it with your team, and you're like, "Wow. We don't do any of this. I don't even know what this one means."
Grant: Neither do they, sometimes.
Tom: That's the thing with these documents, is they collect over time and they are usually defensive. These big organizations have been around for so long that they've had all of these problems at some point. Every time they have a problem they add it to their list of things that they need to defend against. Even things that only happen extremely rarely, that almost never happen, it's on the list.
You don't know the relative importance of a thing that they need all the time, or a thing that they almost never will need. You have no idea which ones are which. What we would do is we would get in a meeting with whoever was deciding. It was like, "OK. Here's this list. We'll look at it, but it's pointless for us to just say what we do and what we don't do. Because we know that's not what you're after."
We tried to get meetings with people that were decision makers, someone on the technical or the security side that was driving these requirements, and we'd sit down and we'd say, "OK. We do some of these. Here's the things that we do. These are fine. You've got all kinds of stuff in here that we don't do. What do you need? What are you scared about?"
And that would then bring up the conversation of, "Yeah we need this. This is super important. We need traceability." That was generally the biggest one. Things will go wrong and enterprises need to know who to blame. They need to trace it, and that's totally reasonable. They need explanations. They need certainty. There's a lot on the line. When you're a big enterprise you have a lot to lose, and they need tools to mitigate those risks. We'd just ask them about it.
We'd have a conversation, and sometimes it would take several conversations, and sometimes it would go back and forth. The enterprise sales cycle is long and they're used to it, but it's not written in stone. It's not just, "You must check all these boxes or we can't do this." It's more, "What do you need now, and what can you wait for?" And then you go on from there, and it worked quite often. Most people could use it. Usually it was the earlier enterprises, places like Zynga or someone, or Drop. These places that were startups not so long ago.
Grant: Modern enterprises. They have evolved to become multi-billion dollar companies, but only in the last few years did they have a fairly modern approach to software.
Tom: Exactly.
Life is easier with them because they remember being startups.
They're like, "I remember the good old days of just using any tool we wanted on the web and we didn't care, because we didn't have much to lose." They still remember the beauty and simplicity and ease of that. But now they have requirements, but at the same time those were the people who were adopting GitHub in the Enterprise.
It wasn't these hundred-year-old companies knocking down the door to try to get GitHub. It was going to take them longer, and that was fine. You take the low hanging fruit and you solve the problems that they have, which are not going to be as stringent as the problems of bigger enterprises. So you can start small. You start with the smaller ones that are willing to take more risk and you keep building the features.
Or, this is what we did. Now when someone like you guys comes along, then it's like, "Wow, if we just integrate a few of these things we can already meet most of these requirements without having to build everything," and that's a beautiful thing. That shortcutting of that process, to tick those boxes and just be like, "Yeah, Replicated does that. You want auditing? Replicated does that. You want LDAP or Active Directory? We can integrate that easily, and now we can do that." It took us years to build that stuff.
Grant: We've always tried to think about, "What are the common services that enterprise software companies need? And then, how can we help build a great version of those and make it easy for both the vendor and the enterprise IT admin to consume and integrate?" Again, a lot of that was inspired by GitHub. We looked at GitHub Enterprise and we said, "Wow. This is an incredible experience for installing a piece of what would be considered on-prem software, which to the rest of the world had been dead and legacy for 15 years."
And we were like, "No. This is a great way to give this unique experience to someone who's often ignored." You could just tell that there was actual attention to detail put into the implementation that you guys did, so we wanted to make it so that any piece of software could be delivered that way. Because ultimately, the thing that you're talking about, which is to go to these early modern enterprises.
We think that more and more companies of all sizes have requirements around security, and single sign-on is no longer just done through Active Directory, but there are services like Okta and Ping that are using SAML. SSO makes it easy to manage how your users log in, even at an employee size of 50.
Managing separate user tables and shared passwords is a terrible security practice across the board, and sometimes people are using the same password, so implementing something like SAML so that people can access your application is a huge boon for the security of the application.
I think a lot of times, with the early web applications and early SaaS products that were available, it seems like we look back with rose-colored glasses and say, "It was so easy," but realistically that was also what was causing routine and simple data leaks and breaches: your vendors sometimes weren't nearly as secure as they should have been. There were early GitHub competitors that had breaches and lost data and other issues too.
Tom: These are real needs, and we wanted to serve these enterprise customers, these developers. We'd always think about the developers that we could serve, that we could get using GitHub. That we could give access to this tool that we thought was so amazing. If only we could make it so that the enterprises could purchase it.
That was what we wanted at the end of the day, was for more engineers, more developers, more everyone to be using GitHub if they wanted to use it. In order to do that we had to meet the needs of the enterprises, but at the same time that's its own product. We thought of everything as a product, of everything as an engineering task.
The way that we changed the website, or if we needed to do something with ops to change around the settings on machines, or whatever, it was always imagined as an engineering product. It was always written up and it was always understood, and it was always meant to be easy to use.
Which is where a lot of our ChatOps stuff came from. It was like, "We love chat, we love being a distributed company, how can we use these tools we're already using to make deployment easier, to make troubleshooting easier?" When there was an outage of any sort, you could go into our chat room and you could type a few commands, and you'd get all the graphs from all of the analytics that we kept on all of the servers.
It was just this mentality of productizing everything, of making everything a good experience, and that was the same for all of the enterprise installation stuff. It was designed, it was easy to do and it was easy to upgrade from any version to any future version, which is another thing that we thought would become a nightmare. When you have enterprises and everyone has potentially a different version, how do you get them to upgrade?
Do they have to upgrade individually through every version, and then every installer is only a single version upgrade? And then they have to do like 50 upgrades if they're a little behind? We had something that was originally based on Vagrant and it would know how to go from any version to any version, because it was just a sequence of operations each on top of the last.
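The "sequence of operations each on top of the last" idea can be sketched roughly as follows. This is only an illustration in Python, not GitHub's actual installer: the version list, the step functions, and the `upgrade` helper are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List

State = Dict[str, object]

# Hypothetical ordered list of releases (assumption for illustration).
VERSIONS: List[str] = ["1.0", "1.1", "2.0", "2.1"]


def to_1_1(state: State) -> State:
    return {**state, "audit_log": True}


def to_2_0(state: State) -> State:
    return {**state, "schema": 2}


def to_2_1(state: State) -> State:
    return {**state, "search_index": "rebuilt"}


# Each step upgrades from the immediately preceding release to its key.
STEPS: Dict[str, Callable[[State], State]] = {
    "1.1": to_1_1,
    "2.0": to_2_0,
    "2.1": to_2_1,
}


def upgrade(state: State, current: str, target: str) -> State:
    """Apply every step after `current` up to and including `target`, in order."""
    start = VERSIONS.index(current)
    end = VERSIONS.index(target)
    if end < start:
        raise ValueError("downgrades are not supported in this sketch")
    for version in VERSIONS[start + 1 : end + 1]:
        state = STEPS[version](state)
    return {**state, "version": target}
```

With this shape, a customer who is several releases behind runs one upgrade, and the installer replays the intermediate steps for them, for example `upgrade({"version": "1.0"}, "1.0", "2.1")`.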
We tried to think, again from first principles, about how would you design an installation experience that was enjoyable and that was usable? I don't know that existed at the time, and certainly nobody was thinking about, "How do we delight the installers of enterprise software?" Most enterprise software wasn't even designed for the end user, it was designed to be sold to the purchaser and that was it. It was, "Meet the needs of the--"
Grant: Shelf-ware.
Tom: Right. "Meet the needs of middle managers who don't know what they're doing." It's a pessimistic way to think about large organizations, but it doesn't lend itself to the people who touch the product. Which are the tech, the ops people who are installing it, and the ops people who are maintaining it.
A lot of those features aren't just installation, it's stuff that happens afterwards. Like, how do you get logs from the enterprise installation, to the company, to GitHub? People would install GitHub behind their firewall. It would be completely locked off, that was requirement number one. No bits can be transferred between these two entities unless they're auditable. So, something would go wrong and we'd be like, "OK. Can we SSH into your server and check it out?" And they'd be like, "No way. It's not happening."
We had to come up with a way to get knowledge of what was happening on the server so we could troubleshoot it, without physically going to their building and signing NDAs and sitting in a room with one of their ops people at a terminal, to run commands while someone is looking over your shoulder.
Which we did do some of that, but that's not a real efficient way to do things. So we created this log bundling tool that they could bundle up and then they could review it, and then they could send that over. This is a product for ops people; they're customers too. They're the real ones who have to keep this thing running in their environment. We always imagined everything as a product.
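A log bundling flow of the kind described here (bundle, review, then send) might look something like the sketch below. It is a guess at the general shape, not GitHub's tool: `build_bundle`, `list_bundle`, and the choice of `*.log` files in a gzip tarball are assumptions made for the example.

```python
import tarfile
from pathlib import Path
from typing import List


def build_bundle(log_dir: Path, bundle_path: Path) -> List[str]:
    """Pack every *.log file under log_dir into a gzip tarball.

    Returns the sorted member names so an operator can review exactly
    what is in the bundle before deciding to send it.
    """
    names = sorted(
        p.relative_to(log_dir).as_posix()
        for p in log_dir.rglob("*.log")
        if p.is_file()
    )
    with tarfile.open(bundle_path, "w:gz") as tar:
        for name in names:
            tar.add(log_dir / name, arcname=name)
    return names


def list_bundle(bundle_path: Path) -> List[str]:
    """List the contents of an existing bundle for an independent review."""
    with tarfile.open(bundle_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

The point of the design is that nothing leaves the customer's network implicitly: the bundle is an ordinary file the customer's ops team can inspect and then transfer through whatever auditable channel they already use.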
Grant: I love that. I also love something you said a little bit earlier which is, you wanted to get that GitHub experience to more developers. If you held your line and said, "No. We're only going to do SaaS," then you would have been withholding this amazing product and amazing experience from millions of developers.
I love the idea like, "Look, OK. I know some of these requests are a bit onerous. We may not understand everything." While some of them might be super rational around security or secrecy, your instinct is to say, "Just trust us. We're going to do a great job, just send us all of your source code," or, "That's a stupid process. Do it some other way."
And the fact that you one, had the empathy and said, "No, we want the developers inside these organizations to be able to use GitHub." And then two, you said, "OK. We want that so much that any of these new requests and problems that these enterprises are having, we'll just listen and we'll try to understand and not just put up and tell them, 'No. You're wrong. That's the wrong way to think about software,'" you'll use that information and say, "OK. What can we do to solve that problem in an amazing way that's solved by product and solved by this great experience that we can then engineer?"
Tom: Yeah, that's exactly right. We could have tried to force people to change, and organizations have changed some, and they'll often allow more SaaS products. But it might honestly be tipping the other way in today's environment of security breaches.
It's getting harder and harder to maintain security of everything, and if you have real security needs, extreme security needs, it can still make sense. Where you just can't rely on some third party's ability to maintain security for their databases, for their stuff. People's code is, for many companies, very heavily guarded information. I could argue that most people are too paranoid about it.
Grant: The reason that they're probably guarded about the code is not necessarily that they think-- Some people think the IP is super valuable. But generally it's about the potential vulnerabilities, so if someone has the source code and they don't disclose a vulnerability that they find, then they have this interesting access into your systems that maybe somebody else wouldn't have had without that source code.
They call it security in depth. Adding that extra, making it be a black box for them to be able to attack, versus having all the source code to be able to attack, makes it a little bit easier. Obviously with open source code you get the advantage that the community is responsible for trying to maintain the security of the code, and finding any of those patches and issues before bad actors do.
Tom: I agree 100%. Some organizations, they probably put credentials in their repositories. Everyone has different ways of working, and security is huge. I don't mean to downplay that. It's just depending on the size of an organization, levels of paranoia can and should be different.
Also depending on which piece of software it is, some could stand to have more levels of paranoia, but security is always a tradeoff between convenience and absolute security. Which way do you turn that knob? Do you want things to be easier for people to do? Or, things can be more secure at the expense of being a little bit harder to manage.
Grant: You guys did a great job of making it both secure, and easy. I think that's an approach that a lot of times needs more first principles thinking, for product teams and engineers to make sure that they're trying to solve both the security and these abuse problems.
Tom: At the end of the day though, organizations weren't going to change their policies just for us, and we didn't want to wait decades for that to happen, if it was ever going to happen. Like I was saying today, it's more important than ever, it's more difficult than ever to properly secure your application.
It makes absolute sense to me that companies want to install software behind their own, in their own infrastructure, with their own security, and then they still get to use those products.
That's why we deliver that as a product. We certainly didn't have to. We could have had GitHub be SaaS forever and it would have been profitable, and it would have been great, but it would have excluded a large number of developers and that was not acceptable to us.
Grant: Someone released some numbers, maybe a year or two ago, when GitHub was at over 100 million in run rate, and the enterprise product was contributing to more than half of the revenues. It also reaches a lot of developers and contributes a lot of the non-dilutive funding for building even better software for the next generation.
Tom: It's a big part of the business. I'm glad that we did it. It was painful, a learning process, but we delivered a good product to the customers. At the end of the day that's what we've always wanted to do.
Grant: Beyond just the on-prem stuff, we hit a lot of important pieces around how you did things like audit logging, or some of the permissioning and role based access control. One thing that I'm always interested in, just in general, is how do you think about introducing a new feature?
We talked about discovery a little bit, you said you weren't always involved in that. But if we think about introducing and rolling out a feature, were you doing alpha and beta and GA? How would you get the first users to try out whatever new enterprise feature you were launching? And then how would you communicate that, even through your product, to engineering, to sales, to marketing? The whole chain of people would have to be aware of the new products releasing.
Tom: Are you talking about enterprise features specifically? Stuff like audit logging, or log management, stuff like that?
Grant: You can go through any feature release. I'm assuming the process is probably fairly similar in terms of new things we're rolling out, but whatever comes to mind.
Tom: Any non-enterprise-specific feature was easy because it would go through the SaaS product first. We'd build them, we would use them internally. We always had feature flags where we could say, "Staff sees this." So when we'd start building new features, even something as extreme as a new version of the issues system.
As soon as it was usable at all, that we could use it ourselves daily, we switched everyone in the company to start using it so that we would have to experience it ourselves and find any bugs that way, and we would know immediately what worked and what didn't with that new design. Those were easy, we did those for ourselves, we beta tested them ourselves because we were the customer and there was a lot of us.
We didn't do, we never did alphas or betas or anything for main product features. We just experienced them ourselves until we thought that they were good enough for general population, and then we would release them to everyone.
Grant: I would refer to that, since you're feature flagging it and using it internally, I define that as an alpha, personally. That's when your company is using that feature and testing it out and using it in some type of workflow. Generally I would think about a beta as you feature flagging it on for a few early customers, but it sounds like you went from alpha to GA oftentimes in the SaaS product.
Tom: We did. That's the way we did it for many years. We did eventually become more sophisticated and we would start rolling out some features to small portions of the population. We had some code in there, a tool that would allow us to select a certain percentage of users and then give them a new feature and see how it went. Usually, this was important for scalability concerns. New code, the one thing that we couldn't test by ourselves was how this would perform when used by the entire population of GitHub users.
So we would start rolling it out to small percentages. 3%, 5%, 10% and keep an eye on the database, and see what was happening. Make sure there weren't any slow queries and stuff like that.
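The two mechanisms Tom describes, a "staff sees this" flag and a gradual percentage rollout, could be combined along these lines. This is a minimal sketch under stated assumptions, not GitHub's code: the function names, the SHA-256 bucketing, and the `percent` parameter are all hypothetical.

```python
import hashlib


def rollout_bucket(feature: str, user_id: int) -> int:
    """Map a (feature, user) pair to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature: str, user_id: int, is_staff: bool, percent: int) -> bool:
    """Staff always see the feature; other users see it if their bucket
    falls below the current rollout percentage."""
    if is_staff:
        return True
    return rollout_bucket(feature, user_id) < percent
```

Because the bucket is derived from a hash rather than a random draw, a user who got the feature at 3% keeps it at 5% and 10%, so raising the percentage only ever adds users while you watch the database for slow queries.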
These are things that matter when you're talking about a SaaS product at scale, things that tend to matter less when you're talking about an enterprise installation where you usually have fewer than millions of people trying to use something.
Grant: Sure. The scale of your SaaS product is always larger than the scale of any enterprise instance.
Tom: Right, but that's a nice way to build something. You know that whatever you deliver to the enterprise is going to be performant enough to solve their needs because it's performant enough to solve the needs of a much larger user base, depending on what installation and what hardware that company is running it on. But at least they know that they could service as many users as they want in their organization if they add a sufficient number of servers to do it.
That was how we did it for the SaaS model, and then we would wait some amount of time and those features would all get bundled together and become a release of enterprise. Then everything had already been alpha tested and tested in reality by the entire GitHub user base before it would make it into an enterprise installation. There were usually three or four months between when we would roll a new installation and make that available for people, for enterprise customers to download and install in their own enterprises.
Grant: That's cool, so you were using that SaaS group as a way to validate and test the features and new things you were releasing. Make sure they were production ready and then distribute them to the enterprise.
Tom: It's a great way to do it. It battle tests new features so that when they make it to the enterprise, you're pretty sure that you're not going to completely screw up their day. Because this is another big concern of enterprises: one, they want to know when stuff is going to change. If you go and you completely change how issues work, which we did several times, then they don't want to be subject to your timeline where you're like, "All right. Issues is shipping tomorrow. Good luck everyone."
They want to be able to say, "Look. We have important processes built around this within our company. We cannot tolerate those changing at all unless we say they change." They would often want more time to accommodate a situation like that. Which is not to say that they weren't willing to use new stuff or that they didn't want new stuff, it was more that they just needed that to be able to happen on their timeline.
Something that you can't offer nearly as easily to your SaaS customers because it's the software. It's going to roll out and unless you're willing to build in both the old version working and the new version working, which generally we were not, because we valued moving quickly over full backwards forever compatibility for most features. In general they were, you could still solve the same exact problems.
But every once in a while we'd have to remove something. We removed being able to thumbs up issues or something, and it caused people a lot of consternation because that was something that some people used. But we looked at the numbers and it was like 2% of people were using it, and it just wasn't sufficient to warrant leaving that feature in at the expense of making the product more complicated. We came up with better ways to do it in the future anyway, but that's another story.
Grant: I love this concept you're talking about now, because it addresses something that is often ignored, which is change management. This is a huge concern for enterprises, because like you mentioned, they have workflows and processes built around certain features and if you change something it could, one, break that process or that workflow.
And two, if you change something, oftentimes they need to train the people who are using the software on the latest updates. They have an extensive process for bringing people on board with how something works, and if you're going to roll out something different-- Humans don't love change in general. It's good to provide them with the runway and the tooling and the collateral to propagate that change through their organization.
Tom: That's an excellent point and completely true. The needs of enterprises are drastically different from those of a smaller organization where if something changes in an app, it's like, OK. We've got two people that deal with that and they'll just come to grips with it. But when you've got 10,000 people across all manner of locations and trainings and whatever, this is a different story. This is something that has to be managed actively.
Grant: Your training materials for your new employees that are coming on next week have all of the old processes in it, and you just rolled out a new change. You need to go through and update all the materials and everything else that you're delivering for training. The thing that people don't often recognize, and the reason that change management is so important, is communication within enterprises.
Why there's a hierarchy oftentimes is because it's just so hard to communicate something to everyone, so you rely on this top-down bottom-up communication flow where people are meeting regularly and delivering, "Here's the newest thing," because when you're trying to tell something to 20,000 people that's just a hard thing to do.
Tom: It's about predictability. All of their supply chains, everything that they do has to line up properly in these large organizations. And if you throw a wrench into their system one thing is going to get slower, and now that thing is not going to be done when the other thing is done. It is exactly that. It's this communication issue, the synchronization of all the pieces needs to run properly.
If you go changing stuff underneath them they don't have time to prepare, so they can't predict. They need predictability. You give them a new version, you say, "Here's the features in this version and you can have them whenever you want. It's up to you."
Grant: This is also really interesting to bring up. You created the concept of semver. Semver can be seen, the versioning concept can be seen as a change management thing. It communicates the different changes that are happening in software through a consistent pattern. One, how did you come up with semver? What was the problem you were facing? Maybe even talk a little about semver for those who aren't super familiar with it.
Tom: Semver is short for semantic versioning. It's a way of codifying a pattern that people were already using, which was major-version, minor-version, patch-version: 1.2.4, whatever a version might be for a piece of software. When I was working at a company called PowerSet we created this super repository of every piece of software that we used, and we made our own packaging system that would allow any package, even across different languages--
You could have a Ruby package that depended on a C library, and a Python library that depended on a specific Ruby version. It was cool but it had a weakness where if you over specified your dependencies, then anytime you wanted to upgrade something, then you'd have to deal with that all the way downstream. Every piece of software that relied on that, now it might be requiring a previous version. Then what happens if you require two libraries?
One which requires version 3.4 of something only, and another one that requires absolutely 3.5, and you're completely screwed. It can't be done. I remember it being a huge problem there. Later on I don't even remember specifically why I sat down and wrote it out, but it just seemed silly to me that a bunch of different software programming languages were using different ways of specifying versions.
It seemed like, "This is all the same stuff. Libraries of software. Why is there not a consistent scheme behind it?" There was just no consensus, so I sat down and I wrote it out because I was frustrated by it not existing, and saying, "Look. This stuff already exists, let's just give it a name that's not too specific, and try to convey meaning with these version numbers in a way that everyone can agree on." That way you can say, "I'm happy to use version 3.4 up to version 4 of something."
Because I know that a minor version is only going to add features. It's not going to break backwards compatibility. If we can all agree on that then you can create software dependencies.
When you say, "What versions do you want?" you can be comfortable saying, "I'm OK up to version 4.0. 3.5, 3.6, 3.7, whatever. I don't care because I know it's all backwards compatible." Until you all agree that that's OK, and that's what those numbers mean, then all you can ever do is say, "I want specifically version 3.4 and nothing else." That would be the only rational choice. It was my attempt to name something and define it, which is what I've done several times, and it's turned out to be valuable.
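The "3.4 up to version 4" rule can be written down in a few lines. The sketch below is a simplified illustration in Python, not the semver specification itself: it handles only plain `major.minor.patch` strings, ignores pre-release tags and the special rules for 0.x versions, and the helper names `parse`, `compatible`, and `pick` are made up for this example.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int, int]


def parse(text: str) -> Version:
    """Turn 'major.minor.patch' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def compatible(required_min: str, candidate: str) -> bool:
    """True if candidate is at least required_min and has the same major version."""
    low, got = parse(required_min), parse(candidate)
    return got[0] == low[0] and got >= low


def pick(candidates: Iterable[str], *minimums: str) -> Optional[str]:
    """Return the newest candidate that satisfies every minimum, or None."""
    ok = [c for c in candidates if all(compatible(m, c) for m in minimums)]
    return max(ok, key=parse) if ok else None
```

If one library asks for at least 3.4.0 and another for at least 3.5.0, `pick(["3.4.0", "3.5.1", "3.6.0", "4.0.0"], "3.4.0", "3.5.0")` can settle on a single 3.x release that serves both. With exact pins of 3.4 and 3.5 there is no version that satisfies both, which is the deadlock described above.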
Often we don't have agreement on things and I've been lucky enough that this has worked. The danger of doing this, and people always joke about it is, "You used to have 10 versions of how to do something, or 10 ways to do something, and now you've just created an 11th." So, I don't know. I love doing it though.
Grant: When the 11th is the most obvious and best way, and you help standardize and define it, I think it's a valuable contribution. You're right, there's always a risk, that you don't want to just be the 11th.
Tom: I don't think there is even a problem, honestly, with being 11 or 100 or 1,000. Because at the end of the day you're exploring the space. Everyone who's working on something is exploring the space of possibilities, and down the road, 999 of those won't exist anymore and people will have chosen the one that worked the best. You see this over and over in languages, and you're seeing it a lot in JavaScript right now for instance.
There's a million ways to do everything. That's because we're on the bleeding edge of a JavaScript renaissance. You see that anytime a new language becomes popular, it happened in Ruby, there was a time early when Rails came about. It was like, "Shit. Ruby's amazing. Now we need libraries for everything," and there was a million libraries to do everything and so many options.
It was like, "How do you do any of this stuff? There's so much choice." This is like next level in JavaScript, but you can see in node, node uses semver. It's built into NPM. That's the way that you do things, and it's the only way that NPM has become as successful as it has. Because people can finally rely on what versions mean.
And it's not perfect, obviously there's ways to introduce breaking changes in minor versions, and it's going to happen from time to time. But at least when people do their best to accommodate the rules, then you can do things that you were never able to do before. You can have the flexibility that allows you to rely on a set of packages in a much cleaner and a much more flexible way.
Grant: Yeah, I love that. I also just love the idea of freeing people from the risk of being the 11th. It's like, "Look. It's an early market. People will create their opinion and share their perspective on what it should be doing, and you should do the same thing."
If you're right, over time it will win. And if you're wrong, you still contributed something to the global effort towards this. Because somebody else might see the thing that you developed and you didn't release, and that might influence the correct decisions.
It's a net gain for society if we all contribute these things more freely and with a little less concern for, "I don't want to be the 11th." I love that perspective.
Tom: We wouldn't have a lot of things that we have today if someone hadn't made that 11th thing. Think of something like the iPhone, how many cell phones were there before the iPhone came out? It doesn't matter where you are on the timeline of something, if you have a better idea, put it out there and see what happens. Why would you ever tell someone to not do something? I don't know. It just doesn't make sense to me.
Grant: Yeah, that's the first-principles thinking.
Tom: One other thing I wanted to bring up before we get off of change management in general is something I hinted at before, which is how to organize your code base for doing a SaaS version and an enterprise version. This is an interesting story. Early on at GitHub when we started doing GitHub enterprise, or GitHub FI at the time, we were faced with a choice.
That choice is one that everyone's faced with which is, "Do we create a fork of our repository and build enterprise features there? Because we don't need them in the SaaS products, so let's not complicate the SaaS product with them. We'll build those features in the enterprise model only, and then we'll port all of the new stuff from the SaaS model into the enterprise codebase. Because Git can do that. It's good at merging stuff." So that's what we did, we had a separate repository.
We forked off GitHub.com SaaS model repository, and new enterprise features went in there. We hired a person specifically to do this merge process, which was a thankless horrible task, and we did it for a while and it was just extremely slow. Merging is not amazing, ever. If you have conflicts it's a nightmare. It's a never-ending nightmare.
Whenever anyone asks me now I say, "Please. For the love of all that is good in programming, do not make a second repository and try to merge upstream stuff into it. If you have a product of any complexity, it's a nightmare. And not only that, it removes the need for your SaaS developers to think about what enterprise customers need, because they don't see it." This was a problem as well, when you didn't have the enterprise stuff.
There could be things in the product itself, in the normal parts of the product that weren't just ops things. There could be flags in there, even stuff as simple as the header in enterprise was black, so you would know that this was not GitHub.com if you were using both. Because a lot of people that are in the enterprise are using an enterprise installation, and they're using GitHub.com for open source, whatever. They need to know very specifically which one they're on, so the header was always black in GitHub enterprise, so they could know which one they were on.
Grant: It was wearing a suit, right?
Tom: Yeah, exactly. Business. Business time. The problem with that was though, that the logic for that was in enterprise only. If someone went in and they needed to make a change to the header in GitHub.com, they'd do that and then the person who had to merge that into GitHub enterprise would then be like, "Fantastic. Someone made a change in the header. Let me go read that code. I'll spend a day doing that. Figure out what it affects. Figure out the 17 files that I need to now fix conflicts in and make this stuff not all break." It just doesn't work. We changed to a single repository, which has the benefit of number one, you don't have to merge stuff anymore and you don't have conflicts anymore.
Fixing conflicts is the biggest waste of time. If you spend almost any time fixing conflicts in merges, you're doing something very wrong.
It's just the least effective way to spend time, and it's extremely error-prone. You have to do it very carefully. It's a nightmare. Every developer probably knows this, if they've ever had to deal with a conflict in code before. We wanted to optimize to remove dealing with conflicts. That was one big win.
The other big win was all of the conditional code, and there was a lot of it, that was there in its full glory for every developer to see. This, while on its face seems frustrating, where it's like, "I'm a SaaS developer. Why do I got to deal with all this enterprise conditional stuff?" That attitude is wrong, and toxic. It's important, and I tried to make this important at GitHub, that enterprise customers were valuable. That these are also developers, they just happen to be in the context of an enterprise. We are building products for both of these people.
This product needs to be considered by all developers. It can't be just SaaS developers and just enterprise developers. Everyone has to be both.
We tried to do that, where some people would work on enterprise features but SaaS people were also working on enterprise. Because they'd go in there and if they needed to change something in the header, they saw the conditional. They saw where it forked off to make it enterprise-specific, and now they had to contemplate that when they made those changes. Which made the product better for everyone. They just thought from a larger perspective. The perspective of the whole company, not just a portion of the company.
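The header example shows what a conditional in a single shared codebase might look like. The sketch below is purely illustrative and written in Python for consistency with the other examples; the `Deployment` flag, the CSS class names, and both functions are assumptions, not GitHub's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    # Hypothetical flag: True for a behind-the-firewall enterprise install.
    enterprise: bool


def header_css_class(deployment: Deployment) -> str:
    """Pick the header style; enterprise installs get a dark header so
    users who use both sites can tell which one they are on."""
    if deployment.enterprise:
        return "header header-dark"
    return "header"


def render_header(deployment: Deployment, title: str) -> str:
    """Render the header markup for either deployment from one code path."""
    return f'<div class="{header_css_class(deployment)}">{title}</div>'
```

Anyone changing `render_header` sees the enterprise branch right there in the same file, so there is no separate fork to reconcile later and no merge conflict to resolve.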
Grant: That's super important. It's even become more important in today's age when we think about not just unifying the code base, but unifying the ops model. How do you orchestrate and schedule your application both in the cloud and on-prem? A lot of the stuff that is happening in the world of Kubernetes is enabling that to happen more and more. It's still a challenge and it's still super important.
Tom: I agree.
Grant: Tom, after GitHub you started a new company that is really exciting, and I've been able to watch it grow as you've built. But it's in a totally different space, and I'm just looking to understand a little bit more about what it is, what it's doing, and why you're building it? I'd love to hear your perspective on what Chatterbug is and what's exciting about it.
Tom: The company is called Chatterbug, and Chatterbug is a better way to learn foreign languages. As you said, it's in a completely different space than GitHub. It has completely different customers. But this is something that I love. I love building new products for new people, and learning what it takes to do that. This is a broader customer base. This is a consumer product, versus something that's for a more niche set of people or just developers.
Chatterbug is different in a couple of ways. Let's say you want to learn German, you want to learn Spanish. We have both of those curricula available on Chatterbug now. What's different from other offerings is that it's a comprehensive way to learn a new language. We've found that the best way to learn a new language is by practicing the things that you're going to be doing. Seems maybe obvious, but most people don't do this.
With Chatterbug there are two parts. One part is self-study that you do on a computer. You do things like flashcards, you do things like readings, reading passages of text in the language that you're learning. Maybe a little bit of journaling, watching videos, listening to some audio. We keep track of everything that you do, everything that you do ends up in the system, and starts to become customized to how you are learning.
The amount of repetition that we do is all handled intelligently behind the system. You don't have to do anything, you just show up, you do the work, and we handle the rest. In alternation with the self-study, we offer what are called, "Live lessons," and those are 45-minute long live video chats with someone who is native or fluent in the language that you're learning. The cadence goes, do some self-study, learn some words, learn some grammar.
All arranged around specific topics as you go through the curriculum. When you've done a little bit of that then it's time for a live lesson. So you schedule one, and you get on a call with a native speaker and you start doing exercises based around what you already know from the self-study. In those live lessons, we call the people that you're doing them with "tutors" as opposed to "teachers," because they're helping you do it.
They're not teaching you, so much as they are helping you understand the language more naturally, through natural interaction and asking questions. But there's always a visual aid that you're looking at when you're in a live lesson, and you're working on something together, so you might see a map of a city. You might have to figure out how to go from point A to point B, or ask someone "Where is the closest supermarket?" These would be the tasks that you would do with your tutor, and you would be in a dialogue with that person throughout the course of this live lesson.
You find that you get good at the things that you practice. If you practice speaking a language then you become good at speaking a language. This is something that most mobile apps, mobile language learning apps don't do well right now. You could spend thousands of hours learning vocabulary, but if you never practice speaking and you show up to a country where they speak that language that you're learning, you might find yourself completely incapable of speaking that language or understanding that language because you haven't practiced those.
So you get into this natural cadence back and forth between doing some self-study, and we have a mobile app. You can do it on your phone, you can do it on your computer, on your iPad, whatever you have. You can do it wherever you are. Then let's say two times a week you want to do a live lesson, then you just show up at the scheduled time and we have a tutor waiting for you. Part of the magic is we use multiple tutors.
You're not locked into any specific single tutor. This makes it flexible scheduling-wise. If you only have one tutor that you're working with, when they go on vacation, you're not doing any lessons. Over time you can build up a pool of tutors that you like, and you have the ability to favorite tutors or block tutors if you don't find them suitable for your learning style. Then every time that you show up for a live lesson you know that you're going to get someone, and you're going to get someone that you probably enjoy doing this work with.
It also gives you much better exposure to different accents, people from different regions, people that use slightly different vocabulary. It's all just better from a preparation perspective. On the flip side, Chatterbug is finally an answer to a question that I've asked for many years and never had a suitable answer to. And that is, "How can you make money on the internet as a person who is unskilled in any particular specific thing?"
I've had many people ask me this question. They're like, "You're good at computers. You know about the internet and stuff. It seems like with the internet you have access to a global economy, millions and billions of people. It seems like there should be a way for someone who's in maybe a remote location, or otherwise doesn't have a lot of opportunities, how can they make money on the internet?"
And I've always had a hard time answering that question. I just can't think of much to tell them, which to me is absurd. It's absurd that I can't answer that question well. Chatterbug can finally be that, because what's one valuable skill that everyone already has and has tens of thousands of hours of practice doing? That's to speak their native language.
There's a lot of people around the world that want to speak that language. So, let's help connect them by building a platform where anyone can become an effective language tutor if we provide the curriculum, and we provide the tooling, and we provide the support, and we provide some training to understand how to use the system. Someone with no expertise in language instruction can become an incredibly effective tutor on Chatterbug.
This is also a way to answer that question. Anyone, living anywhere, that knows the language that someone else wants to learn, can make money on the internet using Chatterbug.
That's a little summary of what I've been up to.
Grant: That's amazing. I really do love that mission too. That perspective. That was your first principles on this, to solve that problem.
Tom: Yeah. It's what made me join, is that it had that aspect, it had this answer to something that I was mulling over while I was thinking about what I wanted to work on next.
Grant: That's cool. I'm excited for the future of it. I hope that you keep pushing on it and keep going forward, because I'm sure that the world needs it and they need you to make it a reality.
Tom: We're cranking on it.
Grant: OK, Tom. Thanks so much for your time today. I appreciate it. It's been amazing chatting with you. I'm sure that everyone listening is going to appreciate your thoughts and feedback on how to think about enterprise software, and how to deliver that.
Tom: Thank you. I had a good time chatting and I hope this information has been useful to your listeners.