Ep. #5, WebTorrent: Bringing BitTorrent to the Web
In the latest episode of Demuxed, Matt, Steve and Phil are joined by Feross Aboukhadijeh and John Hiesey from WebTorrent.
Feross explains that the idea behind WebTorrent is to allow any web browser to connect to a peer-to-peer swarm, fetch content, verify it, and display it to the user with minimum use of servers. They also discuss how they use the WebRTC protocol and outline browser limitations that resulted in the creation of a WebTorrent desktop application.
Feross Aboukhadijeh is a programmer and designer. He is currently building WebTorrent, a streaming BitTorrent client for the browser powered by WebRTC. He previously built PeerCDN, a peer-to-peer content delivery network that makes websites faster and cheaper, which was acquired by Yahoo in December 2013.
John Hiesey is a software engineer and core contributor to WebTorrent. John is a self-described "mad scientist" who is working on a language that could replace JavaScript in the browser. He has also worked as a software engineer on the Yahoo video player team.
Transcript
Matt McClure: Hey everybody, welcome to the 5th episode of Demuxed. Today we've got Feross and John from the popular WebTorrent project. But before we get started, we wanted to talk a little bit about Demuxed 2017. If you've come to the meetup in the last few months, you know that I will mention it forever. Basically every time you see me until October, and then we'll start talking about 2018 two months later.
So the Call for Papers is now open at Demuxed.com, there's a button at the top, you can go submit a talk. We're looking for anything and everything that's interesting to engineers working with video. So if you can think of anything in that realm, you need to submit it. October 5th, Broadway Studios, it's going to be great.
In San Francisco. I guess let's get into it, but in the meantime after this podcast, you should go submit a talk. So today, as I said, we have John and Feross from WebTorrent, you might know these guys from things like PeerCDN, WebTorrent itself, Standard JS. Am I missing anything here from your illustrious open source careers?
John Hiesey: There is the stream-http module, it's used by Browserify and a bunch of other stuff to implement the node HTTP client interface in the browser. So that's something else I've worked on.
Matt: Cool, so I guess you guys could probably give your own introductions better than I can. Feross, why don't you kick us off.
Feross Aboukhadijeh: Sure, people might know me from my open source work. I worked on WebTorrent and Standard, like you said, and a bunch of other npm packages. I helped maintain Browserify in my not-that-much free time. And I made the buffer module for that, the browser implementation of Node's Buffer. Like John said, he did the HTTP module for Browserify as well. And various other npm packages and things like that.
John: Hi, I'm John, I am one of the PeerCDN people. And relevant to this, we worked on the Yahoo video player and then more recently I've been working on other things. I'm designing my own programming language and some even less related to video things. For right now I'm still maintaining Videostream, which we'll be talking about, but that's mostly on the back burner these days.
Feross: You should mention the non-video stuff you're doing, you should just give a quick overview.
Matt: Yeah, programming language? You said you're working on your own things that have not been programming languages.
John: I'm doing that, it's still untitled, but I'm trying to make something that can replace JavaScript in the browser for now.
Steve Heffernan: Why would you ever want to do that?
John: I want to adapt that to internet of things platforms eventually, too. And I'm also trying to figure out how to cold vacuum-fry bananas to make banana chips that taste like candy.
Matt: I get the feeling that Feross had a very specific intent with that question. Okay, we'll get back to the bananas, because I think that might be a cornerstone of the podcast. So let's kick off with "what is WebTorrent." You guys want to give us a little background?
Feross: Yeah sure,
The idea behind WebTorrent is to make BitTorrent work on the web in the browser.
The idea is any web browser should be able to connect to a peer-to-peer swarm and fetch content, and verify that it's correct, and then display it to the user. And to do that entirely without, well as much as possible without the use of servers, so just relying entirely on the browsers themselves. So a network made up entirely of people's browsers. And unlike other attempts at this in the past, including PeerCDN which is something we did as a start-up before WebTorrent, the goal of WebTorrent is to maintain compatibility with BitTorrent as much as possible.
So it speaks the same BitTorrent protocol, it just has a different transport; instead of using TCP and UDP, which don't work on the web currently, it uses WebRTC, which does work on the web and is a way to connect browsers directly to each other. So it's really the only option we have here if we don't want to involve servers, right? We don't want to do something with WebSockets and deal with a middleman server, we want to really make a true peer-to-peer network on the web.
And that's what it is. So it works today, you can install it from npm, and you can put in a magnet link or a link to a torrent file and say "go and get that for me", and it'll give you back the stream of bytes right from the torrent. You can do whatever you want with it, including streaming it into a video tag, or just accessing the data yourself to do whatever kind of thing you want your application to do.
And one of the coolest things you can do is actually combining it with existing HTTP URLs, where you have some content that exists already on a server. You can actually do a hybrid between peers and the server. So if there are no peers, the very first peer can just go and get it from the server and become the first peer. And so there's really interesting applications you can do with that sort of hybrid approach.
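For reference, here's a minimal sketch of that flow using the WebTorrent browser API; the magnet URI and element selector are placeholders, and the hybrid fallback is only hinted at in a comment:

```js
// Minimal sketch: fetch a torrent in the browser and stream a video file into the page.
const WebTorrent = require('webtorrent')

const client = new WebTorrent()
const magnetURI = 'magnet:?xt=urn:btih:...' // placeholder magnet link

client.add(magnetURI, torrent => {
  // The torrent metadata can also list HTTP "web seed" URLs (BEP 19), which is how
  // the hybrid peer + server approach works: the first peer falls back to HTTP.
  const file = torrent.files.find(f => f.name.endsWith('.mp4'))

  // Renders a <video> inside #player and streams pieces in as they arrive and verify.
  file.appendTo('#player')
})
```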
Steve: That's really cool.
Matt: I have noticed that a few times on your website, it's a big demo. And occasionally there aren't enough peers, so I send links to people and I'm like, "check this out, isn't this awesome? These guys are going to be on the podcast." And they're like, "nothing's happening."
Feross: Oh you know, I think the server ran out of disk space about a week ago, so that might have been what happened. All the files were getting truncated halfway through, like the JS bundle was half of the bundle, so it didn't work.
John: There are normally a few peers, at least, online at any given time.
Matt: Yeah every other time I've gone, there's the beautiful little spiderweb that pops up in the corner.
Steve: Yeah, I'm looking at it now, there are about eight people on there.
John: Feross, do you want to talk about WebTorrent desktop too?
Feross: Oh yeah, sure.
Part of the challenge with WebTorrent is that because it's speaking WebRTC, you can't actually talk to existing BitTorrent clients that are out there.
And that's actually a huge install-base of users, there's like hundreds of millions of people with torrent apps on their computer, and we can't talk to them. So that means that the strength of the network isn't as good as it could be.
Steve: Why is that? What's the complication there?
Feross: Browsers can't send UDP packets or open up TCP connections. It's just like considered a thing that we don't want browsers to do for security reasons. You wouldn't want a webpage with an ad in it to be able to send some packets to your printer, and you know, print some stuff or something like that. That just sounds terrible.
Matt: Somebody wants that.
Feross: So they created this whole new protocol called WebRTC, that facilitated peer-to-peer communication in the browser, and so that's what we built WebTorrent on top of. But the cool part is, since almost everything is the same, and we just made this new transport, we can actually very easily support the TCP and the UDP and the old style of connecting to the torrent apps in the same module.
It's just that that won't work in the browser.
So to bridge the two worlds, we made a desktop app that can actually talk both styles, it can do WebRTC and it can talk to existing torrent apps. Because it's on the desktop, right? And we did it with Electron, which made it really, really easy. If you're familiar with Electron, it's a way to make desktop apps using JavaScript, and you have access to all of the APIs from Chrome and from Node in the same process.
So what WebTorrent does when you run it in Electron, is it'll say "oh, I can require the net module and the dgram module, so I can do TCP and UDP, oh, but there's also a WebRTC object, so I'll use that as well." And so it just finds peers from both worlds. And so anybody who's downloading anything in there is sharing it with browsers.
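A rough illustration of that environment check; the module names are Node's real built-ins, but this detection function is just a sketch, not WebTorrent's actual code:

```js
// Illustrative only: figure out which transports the current environment offers.
function detectTransports () {
  const transports = []

  // WebRTC data channels exist in browsers and in Electron's Chromium side.
  if (typeof RTCPeerConnection !== 'undefined') transports.push('webrtc')

  // TCP and UDP come from Node's built-ins, available in Node and in Electron.
  try {
    require('net')   // TCP sockets
    require('dgram') // UDP sockets
    transports.push('tcp', 'udp')
  } catch (err) {
    // Plain browser: Node built-ins are not available, so it's WebRTC-only.
  }

  return transports
}

console.log(detectTransports()) // e.g. ['webrtc', 'tcp', 'udp'] inside Electron
```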
Steve: Yeah that's great. That was kind of my first question looking at the site. I saw the desktop version, my first question was "why do you need a desktop version if it's WebTorrent?" But that makes a lot of sense. Ultimately you hope people who have other BitTorrent streaming applications would switch over to your desktop version, because then it would open up the possibilities of what you can do with that.
Feross: Right.
John: Or you could alternately build WebRTC and the WebTorrent transport into other clients. Are you still working on that?
Feross: So one mainstream client added WebTorrent support, so that's called Vuze, it used to be known as Azureus. I think a lot of people probably used them back when they were called Azureus. It's not the most elegant integration but it works. And then there's WebTorrent desktop and then there's, I think even like some sketchier apps like Popcorn Time added WebTorrent.
There are definitely other people using it, but we thought it would be a good idea to jumpstart it by just making our own client. Plus, there's so much low-hanging fruit.
If you've used a BitTorrent client at any time in the last few years, you'll see that they're almost all terrible. So it wasn't hard to make something better.
uTorrent at one point actually shipped a Bitcoin miner with their app and said it was an accident. So that's sort of what we're competing against. And so we were thinking, you know, that would get us some users, and that'll make the web network strong. And that's sort of what's happened since we've released it.
Matt: That's awesome.
Feross: Yeah, there are actually about 500,000 people, I think, last time I checked, that had downloaded it. So it's pretty good.
Steve: Wow, that's great!
Matt: Yeah.
Phil Cluff: It's WWDC this week, big exciting one, Safari, WebRTC, you know. Everyone's super excited. That's going to be big for you guys, right?
John: It should. You know I haven't really looked into what they've announced. I don't know if they're adding data channels support.
Feross: They are.
John: Oh good.
Feross: So I've heard that it's going to be in iOS 11, and they're also going to put it in Safari on the last two versions of macOS and then of course the next one coming out, High Sierra.
John: Oh nice.
Feross: So it should be in Safari, and as far as the data channel goes, it is in Safari, but I've heard rumors that it requires a webcam permission before it will work, which would sort of cause problems for us because we want to just make data connections without asking to use their webcam.
Matt: So even if you just use the data channels, it'll ask you to use the webcam?
Feross: I think it'll just deny it if you do that before you've gotten webcam permission. That's what I've heard, just on Twitter; I haven't tried it myself.
Matt: Okay.
Feross: We should try it. This is just a rumor.
John: Yeah I gotta try out the beta and see.
Feross: Yeah, I'm glad they're finally on board. They were dragging their feet for so many years.
John: Now we just gotta get data channel into Edge, and then we'll be good.
Phil: So like more in general, are you guys really focused on video? Is online video the key to what you guys want to do? Is that like the core of what you want this tech to be about?
Feross:
I think video is where it's most useful, because the thing about peer-to-peer is if you're not combining it with HTTP in a hybrid approach, then peer-to-peer alone has higher latency up-front, because you have to find peers.
So it doesn't make sense to use it for smaller things in general, usually you want to use it for big things, big files. Video makes a lot of sense. If you just do peer-to-peer with video, you'll have maybe five seconds of waiting up-front to find peers and stuff, which I know probably sounds horrible to you guys.
Because you're doing video player stuff with CDNs and all this stuff, but for a peer-to-peer thing where no one's paying for the bandwidth, it's totally acceptable. And then you have potentially long content, so it's fine. But for images or for small data sets, it makes a lot less sense. So big files like video make sense.
John: And plus if you don't want that latency you can always get the first few seconds of the video from a server, and then go peer-to-peer for the rest.
Matt: Is that a big limitation of the system that you have to have a copy on the server for it to be available everywhere, all the time? Do you run into that at all, that that's kind of like the blocker for this really taking off? As soon as somebody closes the browser that file could be no longer available?
John: Potentially, yeah. It needs to be seeded from somewhere and at least at this point, since WebRTC isn't in every browser, you still need HTTP fall-back.
Feross: Yeah, if you visit from Safari on iOS today, then if you don't have what's called a web seed, which is the BitTorrent parlance for an HTTP URL that has the content on it, it just won't play. So for robustness, and especially for content that isn't that popular, you'd want to have that, in case all the peers go away and you can't access the content anymore.
But you know what? There's actually a lot of use cases where you don't really care about the content living on forever. Like if I just want to send you a file for example, I don't necessarily want a copy of my file on some random hosting site. Have you ever used one of those sites where you can upload like a gigabyte or something, and then you can send your friend a link?
Steve: Yeah, totally legit.
Feross: Right, I mean it would be better if you could just use a direct connection to their computer, and if you could do it through a simple little website: you both go to the same URL, one of you drops the file on, and the other person types a little code in or something. WebTorrent can be used for that. And as for encrypting the data when you first drop it on the browser, WebRTC itself is encrypted, the connections are encrypted.
So you could either rely on that or you could do another layer of encryption, add a key to the link that you send them, so they have to decrypt it as well, if you wanted to be extra safe. And yeah, then in that case it doesn't matter.
Matt: And are there any file size limitations to that? I would guess maybe not, if that's the case.
Feross: I think the biggest limitation that you run into in practice is the RAM on the user's computer, if you're just keeping everything in memory on the receiving end, but you can use IndexedDB to store it as you get it, and then the limits are just whatever the IndexedDB limits are. Which I think is effectively unlimited, because you can just prompt the user for as much space as you need.
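WebTorrent exposes a `store` option for swapping out where pieces are kept; a sketch of the IndexedDB approach, assuming an abstract-chunk-store-compatible IndexedDB implementation such as the idb-chunk-store package:

```js
// Sketch: keep downloaded pieces in IndexedDB instead of holding everything in RAM.
const WebTorrent = require('webtorrent')
const IdbChunkStore = require('idb-chunk-store') // assumed abstract-chunk-store implementation

const client = new WebTorrent()
const magnetURI = 'magnet:?xt=urn:btih:...' // placeholder

client.add(magnetURI, { store: IdbChunkStore }, torrent => {
  torrent.on('done', () => console.log('All pieces stored in IndexedDB'))
})
```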
Matt: That's great,
I remember 10 years ago there were probably 50 different start-ups whose only job was sending a big file from one person to another. So you can now do that in a browser without touching any other service.
Feross: It's basically the hello world of WebTorrent. It's literally one person, you do drag and drop, drop the file onto the page, and that creates a torrent from it. And then the magnet link is what you send to the other person, and they download it on the receiving end.
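A sketch of that hello world, both ends of it; the drag-and-drop and copy-paste wiring are left out:

```js
const WebTorrent = require('webtorrent')
const client = new WebTorrent()

// Sender: seed the dropped files and hand the magnet URI to the other person.
function onFilesDropped (files) {
  client.seed(files, torrent => {
    console.log('Send this link to your friend:', torrent.magnetURI)
  })
}

// Receiver: paste the magnet URI and download straight from the sender's browser.
function onMagnetPasted (magnetURI) {
  client.add(magnetURI, torrent => {
    torrent.files.forEach(file => {
      file.getBlobURL((err, url) => {
        if (err) throw err
        console.log('Downloaded', file.name, url) // e.g. point a download link at url
      })
    })
  })
}
```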
Matt: Correct me if I'm wrong, was that Instant.io?
Feross: Yeah, that's the demo site. You can go to Instant.io and try that exact thing out right now.
Phil: I'm doing it right now, right now.
Feross: Don't do that, it'll take out your connection again.
Matt: Yeah, don't do it Phil.
Matt: So I was going to skip this one, but I am super curious. How often do you guys get confused with Popcorn Time? Like do people assume that you are the creators of it? Like I have to assume that it's at least come up before.
Feross: Actually, it's never come up for me.
Matt: Really? Because I feel like when I've told people about it, they've been like, "oh are those the Popcorn Time guys?"
John: Maybe people just don't want to ask, but.
Matt: Or maybe I'm being crude.
John: No, I don't know. I could see why someone might think that, but I mean I certainly don't want to be involved with that kind of sketchy stuff.
Matt: Cool. So we talked about this a little bit in the introductions, but both of you guys were involved with PeerCDN, which I believe sold to Yahoo in 2013, right?
Feross: Yep.
Matt: I remember that's when I first met you, at NodeConf EU, probably 2012? No.
Feross: 2013.
Matt: It would have been 2013, but I don't think you'd been acquired just yet.
Feross: Right, right. It was October, so it was right before it.
Matt: Yeah, anyway, for everybody that's uninitiated, do you want to give a little bit of background about what that was?
Feross: Sure, I can do that. So it had similar goals, kind of, to WebTorrent; it was to play around with peer-to-peer and see what we could do to make browsers talk directly to each other and what opportunities were there. And so we were hoping people would use PeerCDN to augment their existing CDNs, so they could opportunistically use peer-to-peer whenever possible, falling back to HTTP in all the other situations.
So the idea was, you'd use it only if you could improve performance or if you wanted to save bandwidth. So if you've buffered a bunch of the video and you have plenty of runway, then maybe you say, "you know what, for the rest of the bytes in this video, I'm just going to get them from peer-to-peer, I don't even care how slow it is, because we're way ahead, we've got plenty of time to spare, we're not at any risk of buffering."
And so it's this thing you'd layer on top of your normal CDN. It was a script in your page that would literally sit there and intercept HTTP requests, and it would first try to get them from peer-to-peer if it could. You'd basically have a strategy that would decide when to use peer-to-peer and when to use HTTP.
Steve: I'm assuming you weren't using WebRTC then.
Feross: No, no, it was.
Steve: Oh it was?
Feross: That's the whole thing that enabled it.
Steve: Okay.
Feross: Yeah, because before that, the only way to do this would be to force the user to install a plugin or something, and so what's the upside for them to install a plugin that's just going to use their bandwidth, right? It has to be kind of built in to the web and automatic for it to make any sense.
Steve: Okay, so this would have been super early days in WebRTC.
Feross: Yeah.
John: It was pretty early.
Feross: Yeah.
John: There were a lot of browser bugs.
Feross: I remember one time Chrome crashed and I had to look at the core dump and then look up the actual Chrome C++ source code to figure out what the bug in my JavaScript was. Which is so weird, right?
Steve: Yes, but not uncommon, I mean I had similar experiences with the video element when it first came out. Like half the browsers actually supported it well, and in some of the browsers you'd click play and nothing would actually happen. It's evolution, right? It just has to mature over time.
Feross: Yeah, a lot of the video stuff seems similar to the early days still, like the way that you don't get real error messages. The way that you have to like give the MediaSource API the data in the exact format or it will cry, and it won't tell you why it doesn't like it.
John: Yes.
Steve: On that note, do you guys get involved in the standard side of WebRTC? Like are you involved in that process at all, like helping them make the choices of what errors they should be showing and things like that?
Feross: No, I haven't.
John: I haven't either.
Steve: Okay. Yeah I don't know what's actually available out there. We're regularly involved in a group called FOMS, Foundations of Open Media. Have you guys been to that?
Feross: Yeah I went to one of those events.
Steve: Okay cool. So that's always a nice in-road to some of the standards conversations, but yeah, I'd be interested to know where that process is with the standards around WebRTC, like do they consider it a completely finished spec, or is it still in process? Details like that, do you know?
Feross: I think WebRTC 1.0 is basically done, and they're doing the finishing touches. I think Google said they want to have it finished up by the end of the year.
John: What about Object RTC?
Feross: I think that's just been folded in. I'm honestly not that up-to-date.
Steve: And what's that?
John: That's an API that's implemented in Microsoft Edge, that is trying to simplify a lot of things; instead of manipulating SDP messages with strings and stuff, you have an object-oriented interface. So it's trying to make it a lot more user-friendly to set up a connection and customize things. You know, if you're doing regular video stuff, customize the bit rate and all that.
Steve: And that's built on top of WebRTC, is this an API on top of that?
John: More or less. It's an API that's intended to replace the original API.
Steve: Oh, interesting.
John: So last I checked, Edge had implemented that and not the regular WebRTC API. So I'm not sure where unification of that is going.
Feross: So I know that Edge said they're going to implement the WebRTC API in addition to ORTC. So they're going to have both, but apparently you can shim them back and forth, they're equivalent basically. So as long as you're not doing anything super fancy, you can just use an adapter, and then you'll just write one app and it'll work in all the browsers.
Steve: Nice. Now that Safari will support WebRTC, are there any other major hold-backs on the browser landscape for WebRTC? Maybe on the mobile side or?
Feross: Well one thing I think is that Edge still doesn't support data channels, so it can only do video and audio. And I also think that Edge only supports some really weird, I can't remember the details now. There's something special about the version of H264 that they support, where it doesn't interoperate with Chrome and Firefox.
But I think they're going to fix that, I think they just wanted to help the Skype team get out their Skype for web release, and so they just used whatever Skype needed at the time, and it's not interoperable.
John: Alright. Just to clarify, so
WebRTC was originally kind of billed as a system for sending video and audio streams back and forth, so it has all of the logic to do adaptive bit rate and such, built into it. What WebTorrent is doing, is simply using the data channel and not using any of that machinery.
So it's a lot more modular, just built on top of sending messages across.
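A bare sketch of that data-channel-only pattern: no getUserMedia, no audio or video tracks, just an RTCDataChannel carrying arbitrary bytes. Signaling (getting the offer, answer, and ICE candidates to the other side) still has to happen out of band, over a tracker or a WebSocket:

```js
// Sketch: use WebRTC purely as a data transport, the way WebTorrent does.
const pc = new RTCPeerConnection()
const channel = pc.createDataChannel('blocks')

channel.binaryType = 'arraybuffer'
channel.onopen = () => channel.send(new Uint8Array([1, 2, 3])) // e.g. a piece of a torrent
channel.onmessage = event => console.log('received', event.data.byteLength, 'bytes')

pc.createOffer()
  .then(offer => pc.setLocalDescription(offer))
  // ...then ship pc.localDescription to the remote peer via your signaling channel.
```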
Feross: Yeah, and that's true with any of the different companies that are doing peer-to-peer streaming right now, like Peer5 and Streamroot and groups like that. None of them are actually using the video capabilities of WebRTC, it's all the data channels. You might expect that they're using the video stuff, but they're not. It's all about the data and sharing these blocks of data, as opposed to video channels.
John: You get way more flexibility that way, in terms of what you can do. Otherwise you're pretty boxed in by the browsers.
Steve: Yeah, if you're doing like an actual chat with somebody, then it does make sense to use that machinery, because the browser's doing a bunch of stuff that would be really hard or impossible to do yourself from JavaScript land.
John: Rewriting all of the adaptive algorithm.
Steve: Exactly, and doing echo-cancellation, and making sure that if your CPU is pegged, for whatever reason by the browser, by a different app on the computer, then it will lower the bit rate automatically, it can detect that condition, that kind of stuff.
John: Yeah, that would be a lot of work.
Steve: Yeah.
Phil: So you guys do all this stuff with PeerCDN, it's pretty great. How much did that influence you with decisions you make going into WebTorrent, and are you planning to productionalize in the same way? Is it going to be the same sort of offering going forward or what does that look like?
Feross: Yeah, so after we did the PeerCDN acquisition, I remember that I was giving a conference talk; one of the first conferences I ever spoke at was RealtimeConf, in November 2013, and I was explaining WebRTC. And explaining it to an audience of people that mostly hadn't heard of it. And I was trying to think of cool use cases. And I was thinking, "okay well BitTorrent over WebRTC is probably something that they would understand."
And so the last slide in the deck was actually just "what if we could have this thing, let's say we called it WebTorrent, and it did this." And I said at the very end, I think I said, "I'd like to build this, if you're interested in building this with me, come talk to me afterwards." And actually one of the people there, some guy with lots of followers, Aral Balkan, he actually tweeted out a link.
I actually set up a repo for it, right before the talk, so that way people can star it even before it's done. And so then he tweeted a link out to it and said check out WebTorrent, and there was nothing in there. So actually he got a bunch of messages from people who said like "where is the code?" And so he messaged me, kind of not really angry, but sort of like kind of angry, he was like "What the heck man, where's the code?"
And I was like, there's no code, you should have looked at it first before you tweeted it, dude. So that's kind of how it started. And then people reacted really well to it, and said that it was a good idea, and so then I wanted to go and work on it more. But it has kind of a different mission than PeerCDN, right? Because it's trying to be interoperable with BitTorrent. With PeerCDN we were free to make whatever decisions we wanted, and come up with whatever protocols we wanted.
For signaling and for actual data transfer, with this we had to name everything exactly the same, even when the decisions didn't really make that much sense anymore, because BitTorrent's an old protocol. It's from the late '90s. So they didn't even have JSON back then, so they used this other weird thing called bencoding, and we had to keep that, because we wanted to be interoperable.
So there are different trade-offs.
PeerCDN was trying to go for speed as much as possible, whereas with WebTorrent we were going for decentralization, so we don't want to have any central servers that need to be trusted in the process.
So when you're downloading files from other people, you can't trust the files they're going to send you, they're random people on the internet. So you need to have a hash of the content in advance, that you know is correct. And then you hash the content that the peers give you, and compare it to the hash that you know to be representative of what you actually want and if it matches, then you'll actually show it to the user.
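A sketch of that verify-before-trusting step using the Web Crypto API; the expected hash would come from the torrent metadata, which is itself pinned by the info-hash in the magnet link:

```js
// Sketch: verify a piece received from an untrusted peer before using it.
// BitTorrent (v1) pieces are hashed with SHA-1; expectedHashHex is assumed to
// come from trusted metadata.
async function pieceIsValid (pieceBuffer, expectedHashHex) {
  const digest = await crypto.subtle.digest('SHA-1', pieceBuffer)
  const actualHashHex = Array.from(new Uint8Array(digest))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('')
  return actualHashHex === expectedHashHex
}

// Usage: only hand verified pieces to storage or the video element.
// if (await pieceIsValid(piece, expected)) { /* append it */ } else { /* drop the peer */ }
```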
So, who's going to give you that hash in a system like PeerCDN? See in WebTorrent you have magnet links, which contain a hash, but in PeerCDN you drop the script into your website and it's just supposed to automagically make all your videos and everything load over peer-to-peer. So when you find a peer who says they have the content that you want, how do you verify it?
You need to ask some trusted person to give you a hash that you can trust. So what we did was we had a server that just sat there and would download any URL that anyone asked it to download, and then return a hash of it. And that was part of the CDN infrastructure, that was just trusted by all the browsers. And so that's something that we didn't want to do in a WebTorrent system. So there are different decisions like that.
Steve: That's a really interesting detail that the hash is built into the magnet link there, so you don't need any of that system. That's great.
Feross: Yeah, thanks!
Steve: So, in general, like this whole idea of playing back a streaming torrent in a browser, like that feels like some deep, deep dark magic. So I'm curious if we can talk a little bit more about the specifics there, so Videostream is the package that I think, as far as I can tell, handles most of the heavy lifting there.
John: Yep, that's correct.
Steve: Tell us a little bit more.
John: Well, so this Videostream started out when Feross was doing a Hackathon, working on WebTorrent, and he called me in the middle of the night and asked "what do we need to do to get it so you can actually stream video and be able to seek around", because before this, you could stream data into the video element, the HTML5 video element, but you couldn't seek, because you always were just dumping the bytes in, in order.
Feross: Well, I didn't know what I was doing, so I literally would just append bytes over and over again to the media source.
John: And that more or less worked.
Feross: Yeah, but I didn't even know what I was doing, so that's why I called you, and I was like why doesn't seeking work.
John: So I basically gave him the bad news, that I couldn't get that done. That wasn't going to happen during the hackathon. But eventually back in 2015,
I started this project called Videostream. The basic idea is you give it an HTML5 video element, and you tell it you want to play back at a certain position, which is wherever the media element is, and then it tells you what byte range to get, and then you feed it back in.
Pretty much the same underlying mechanics that the browser normally uses when it fetches the video over HTTP. If you just set the source to your video file, it'll automatically make the right range requests. But unfortunately there, at least at the time, was not any way to get that information out of the browser, so originally I used MP4Box.js to do the remuxing, and that worked moderately okay.
There were some things that weren't compatible, so I made my own fork and patched some things up. And I was planning to upstream those changes, but after a while I realized that the architecture made it very hard to evict data that I had fetched, so that it wasn't taking up RAM, and it still had quite a few bugs. So I decided to go out and write my own remuxer, mostly from scratch, so that was in early 2016 I think. So, I remember I went around and looked at different things that were out there, I'm not sure why I didn't use mux.js.
Feross: Oh, you saw that note, yeah.
John: I don't remember what my reasoning was at the time, but I ended up using an npm package called mp4-stream made by Mathias Buus, who's mafintosh on GitHub and Twitter. And I ended up contributing some to that. It's a streaming, basically, part of muxing and demuxing, just the really low-level handling of the ISO base media/MP4 container format.
And then, Videostream essentially does all of the actual remuxing itself, and it was designed to be a very minimal implementation, so it's only a few thousand lines of JavaScript for the whole thing.
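Roughly how Videostream is driven, going by its documented interface: you hand it a file-like object that can serve arbitrary byte ranges, plus an HTML5 video element. The range-fetching function here is a placeholder; a WebTorrent file object already exposes createReadStream, which is why the two slot together:

```js
// Sketch: anything exposing createReadStream({ start, end }) can back the <video>,
// whether the bytes come from a torrent, HTTP range requests, or somewhere else.
const VideoStream = require('videostream')

const file = {
  // Called by Videostream whenever it needs a specific byte range of the MP4.
  createReadStream (opts = {}) {
    const { start, end } = opts
    return getRangeAsNodeStream(start, end) // placeholder: return a readable stream of those bytes
  }
}

new VideoStream(file, document.querySelector('video'))
```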
Steve: And can you talk a little bit about like what is the necessity around remuxing and that process.
John: So, the issue is that if you just use the video element with source equals a URL, the browser knows how to fetch all of the top-level metadata, which is the moov box, and then knows what bytes go where.
Feross: Yeah it uses range requests.
John: Yes, it uses range requests. The browser will automatically figure out what byte range it needs to fetch. However, the MediaSource Extensions API works totally differently. There you have to feed the browser the data it needs at any given time. It would really have been much simpler if I could just ask the browser what byte ranges it needs, but there isn't any good way to do that that works cross-browser.
Steve: Why does the browser need to tell you which byte ranges you need?
John: Well if I seek around in a video, then I don't want to just have to wait for all the data to come in. So what I need to do is, if I seek like to the middle of a video, I need to figure out what bytes do I need, and then I have to feed it to the browser in fragmented MP4, not the traditional MP4, where you have all the metadata up front.
So essentially you have to convert all of the information about the individual frames or samples from one big chunk of metadata right at the beginning or end of a file, and repackage that so each fragment contains the relevant metadata.
And then you can feed one fragment at a time to the browser. So basically Videostream will first start reading from the beginning of the file, it'll find the metadata, download that, and then it knows what byte ranges to request for what, and then it can dynamically piece together a complete fragment to feed back into the browser.
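A bare-bones sketch of the MediaSource side of that: open a MediaSource, add a SourceBuffer for fragmented MP4, then append the init segment followed by one fragment at a time. The codec string and the functions producing segments are placeholders:

```js
// Sketch: feeding remuxed fragmented MP4 into a <video> via Media Source Extensions.
const video = document.querySelector('video')
const mediaSource = new MediaSource()
video.src = URL.createObjectURL(mediaSource)

mediaSource.addEventListener('sourceopen', () => {
  // Placeholder codec string; it has to match the tracks actually in the file.
  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f, mp4a.40.2"')

  // First the init segment (the rewritten moov), then moof+mdat fragments in order,
  // waiting for each append to finish before starting the next.
  sourceBuffer.appendBuffer(getInitSegment()) // placeholder producer
  sourceBuffer.addEventListener('updateend', () => {
    const fragment = getNextFragment()        // placeholder: built from the requested byte ranges
    if (fragment) sourceBuffer.appendBuffer(fragment)
    else mediaSource.endOfStream()
  })
})
```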
Steve: Yeah, it makes sense. So the browsers have, I don't know what the status of this API is, but there's been work in the W3C to actually define streams; like from the new fetch API actually being able to pass the data through a stream, which is essentially like a pipe in the browser. How much is what you're doing connected to some of that?
John: Well Videostream internally uses the node stream API, which was an earlier attempt at building streams. I don't know what the status is of integrating streams with MediaSource extensions. Do you know?
Steve: I don't. I mean, I should call out what the benefit of that would be: today, as you're making XHR requests to pull down a segment, you have to actually get the whole segment of data before you can operate on it.
So you're actually storing the video data in multiple steps along the way, whereas if streams were built into the browser, you could just basically pipe the data through a process, and you wouldn't have these memory-inefficient steps along the way. I haven't seen anybody actually using that, but I haven't looked into it for a while.
Feross: Doesn't fetch have streams?
John: It does.
Feross: You can stream things from a server instead of using XHR, right?
John: Yeah, so my stream-http browser module actually does support using the fetch API's HTTP stream. It essentially transforms the browser stream into a Node-style stream, which is what things using the Node API expect.
So you can actually connect stream-http straight into Videostream, and have it download over HTTP.
That's one of the demos in the repo. So if you basically want to emulate what the browser could do just with a straight video tag, but using a lot more machinery, that's out there. So you can do it; essentially, at some level, the MediaSource SourceBuffer API is a way to sort of quote-unquote stream data into the browser. But it doesn't follow the actual Streams API that is being developed by the W3C.
Steve: Yeah, and that would be the main blocker: is there a stream input to SourceBuffer? I don't know that there is today.
John: Not that I know of, but I haven't been keeping up on this super close.
Feross: It's too bad that the video tag couldn't just expose more information to us. I feel like the MediaSource API is cool, but without having gone too deep into it, it just seems like if the video tag could have told us a little bit more information, like when someone seeks just tell me what bytes to give you, like just tell me which bytes you need at any given time, and I'll go get them for you, so if you want the metadata in the MP4, just tell me which bytes you think they're at.
And then I'll go get them and give them to you. And it won't necessarily be over HTTP, it might be me getting it over WebTorrent, but when I give them to the video tag, it should just do the right thing. If it could just do that, then my life would have been easier.
John: Well, potentially you could do something like this with Service Workers, which essentially behave like a proxy, that a page inserts in the browser to intercept that page's request. And that could, in principle, let you do this, if you can stream data in. And last time I looked, there was no way to actually provide a stream, as a response to a request. But that may be fixed by now.
Feross: The idea John's talking about there is really cool. Did you guys catch it? I feel like it was subtle. He was basically saying that you could make a video tag with a source that is like some fake URL, and then in your Service Worker, sit there and watch requests to that URL and intercept those, and then send a message back to the main window where WebTorrent is living, and say like "oh the browser tried to get this range."
And then you could go get it over WebTorrent. And then pass it back to the Service Worker, and have the Service Worker say "oh I got it for you", which is kind of unbelievable to me that that's even possible.
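A rough sketch of that Service Worker trick: the page sets the video src to a made-up URL it controls, and the worker intercepts requests for it and answers with bytes fetched elsewhere. The relay back to the page where WebTorrent lives is only hinted at here:

```js
// sw.js, illustrative only: answer requests for a fake video URL with bytes we
// obtained some other way (e.g. over WebTorrent in the page, relayed to this worker).
self.addEventListener('fetch', event => {
  const url = new URL(event.request.url)
  if (url.pathname !== '/fake-video.mp4') return // let every other request pass through

  const range = event.request.headers.get('range') // e.g. "bytes=1048576-"
  event.respondWith(
    getBytesForRange(range).then(({ body, contentRange }) => // placeholder resolver
      new Response(body, {
        status: 206, // partial content, like a server answering a range request
        headers: {
          'Content-Type': 'video/mp4',
          'Content-Range': contentRange
        }
      })
    )
  )
})
```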
John: It's kind of a big hack too.
Steve: You're bypassing the whole MediaSource extension, you're just making it look like to the video element, it just looks like another URL, and so it has all the infrastructure built for everything else you need it to do, you're just kind of tweaking it right at the network level.
John: So I actually haven't looked at it super recently, but the blocker to doing this; well one is, it's not supported in a lot of browsers. The second was, as I said, that you can't, or last I looked you couldn't provide a stream as a response. But the spec definitely says you should be able to, so that's coming at some point.
Steve: Yeah, well that'd be pretty interesting.
Feross: If it's not there already.
Steve: I guess the reason why the MediaSource API doesn't do the exact use case that we need, is because the whole point of it was adaptive streaming, right? So doing the whole, like just tell me the bytes to get, doesn't make sense in that context, because you want to do a bunch of your custom logic about what bit rate to pick and all that stuff.
John: Sure, right. This is not what the MediaSource API was originally designed for, but it does work.
Steve: Yeah, the original intention of the MediaSource API was, what's the bare minimum we can provide, what's the lowest level door we can open up to make all of these other things possible. But as I understand it, it's still kind of evolving to a degree, so this might be, for instance, an interesting conversation to bring up at next year's FOMS. I know there was discussion last year around the Media Capabilities API.
Phil: Hey, I'm working on that.
Steve: Hey! Maybe there's a similar conversation we could have around just a playback API and what is available to these applications that are trying to interact with MediaSource Extensions. What else could be valuable there? That'd be interesting.
Matt: Hey Phil, real quick, because I didn't go to that session at FOMS last year. Could you explain a little bit about what the Media Capabilities API is?
Phil: Yeah, Media Capabilities is a way to get, preemptively, not only very simple information about whether a browser can play something, but much more about what the experience will be like.
Are you going to send something that I can't display meaningfully on the displays I have attached? Are the attached displays going to be able to do the DRM that's required, or the color profile that I'm going to give it? What sort of impact are we going to have on battery life and CPU performance by streaming media of a particular type, or a particular bit rate and profile?
So it'll give you a kind of experience score, as well as a battery impact score; it's actually a pretty cool thing. And hopefully, and this is huge, it's going to be able to tell you if you've got HDCP-capable monitors attached as well, which is also a big thing because that doesn't really exist right now; MediaSource Extensions kind of just makes a bit of a mess for you. So it's quite exciting. It's at the W3C incubator stage; if you go onto GitHub you can work on the spec if you're interested.
John: I'll take a look.
Feross: Yeah, being able to know whether you have a hardware decoder or software decoder seems like a really big one.
Steve: Yeah, a common complaint that I've heard from people. Complaint isn't the right word, but we were talking about this today. For HLS everywhere, for example, to play HLS everywhere you use something like hls.js or contrib-hls, like we were talking about. And under the hood that uses mux.js, and one of the problems that we have is that if you're on an iOS device, for example, you kind of hand off all control to the browser. You lose everything, right?
Which means if you have custom ABR logic or something like that, you kind of lose it all. So people are talking like, oh, it would be great once we have that in iOS, and it's just like, do we want to transmux to MP4s in your iOS Safari window? Is that even going to be efficient? So I'm curious, when we start talking about WebRTC potentially being supported in iOS, are you guys concerned about battery life, or data usage at all there?
John: Ultimately I think if we don't have to do any transmuxing in the browser, that would be great. For something like WebTorrent, the easiest way from our point of view would be if the MediaSource API were different and didn't require it. But for more conventional video players, I think that's more of an issue of MPEG-DASH versus HLS; maybe HLS isn't the right choice to keep using for the long term. And then you can have things that are already in the right format on disk, or for live streaming, instead of doing anything client-side.
Phil: And I think Apple's play from here on in is very much, we had the announcement yesterday, native HEVC support in iOS. And interestingly, in the session that happened today, HEVC delivery to iOS devices is all going to be fMP4; it's not going to be supported in transport stream containers. So I think that's a really obvious statement as to the direction Apple's going with the way you need to do transmuxing when you're sticking it into a MediaSource buffer. I think Apple will finally get on the bandwagon soon.
John: Yeah, that sounds like the right direction to go.
Matt: Cool, so I guess I wanted to wrap this up by finding out what's next for you guys? I mean it seems a little bit like, I think you even mentioned this in your bio Feross, but you guys are a little bit mad scientist, you've kind of built some crazy stuff, are building some crazy stuff, languages, frozen plantains or whatever.
Are there any crazy future plans for WebTorrent, or do you think you guys are going to focus on other things. I'm just curious like where you think you want to push WebTorrent forwards versus, like use it as a basis for other things. What's the next mad scientist role?
Feross: So yeah, I think with WebTorrent, it feels like it's in a pretty good place right now, where it all works. And we have a desktop app that has a lot of happy users. Sure, there are things that could be better about it, but it's an open source project for a reason. If you have a complaint, you know, there's a way to fix it. Pull requests welcome. So it feels like it's in a good place in that regard, where there are also lots of contributors who all have commit access, and I recently just moved every repo in WebTorrent.
There's a lot of different modules, because it's a very modular torrent app. There are dozens of repos, every little piece of it: parsing torrent files, parsing magnet links, talking to trackers, talking to the BitTorrent DHT to find peers, all these pieces. They're all now in an organization, just WebTorrent on GitHub. And so it feels to me like it's in a good place, where if I go away or disappear, there'll be other people who can work on it and keep it going.
So that was big for me, I never made a GitHub organization before, it felt like I was giving away my baby in some ways. Like I didn't know what was going to happen, but I think it was the right thing to do. Because it hasn't been just me working on it for a long time now, so it makes sense to do that. But, as far as like future plans for WebTorrent, I guess just keep fixing bugs, making it more solid. I hope more torrent apps add support for it, and I might, if I have time later, send some pull requests to some other torrent apps to add WebRTC.
But the biggest blocker right now is that WebRTC as a library is really hefty; the different implementations that exist all include all the video and audio stuff. So if you wanted to add WebRTC support to a popular torrent app, like let's say Transmission for Mac, I guess that's on all platforms now, you'd have to link in this huge C++ library.
So there are some up and coming implementations in C that just do data channel, and they're really small and lightweight. And once those get solid, I think it would be pretty easy to just send a little pull request to a couple of these different torrent implementations, and then we'll have a really strong network where they can all talk to browsers. So that's probably the biggest thing I'm excited about.
Matt: I'd love to introduce you to Nick Chadwick, we can talk about this more later, but at a recent video meetup he talked about an open source project he's working on called librtpcpp, I think I got that right.
Feross: I think that's one of the data channel implementations that I looked at.
Matt: Yeah. Because like you said, like a lot of the current examples are, bring in all the world of Chromium, or something like that. So he wrote a small tool to hopefully give you the things that you need to be able to use WebRTC reasonably in a C++ project.
Feross: Yeah, that would be huge. That would really help, because right now using it in Node, if you want to have your server be a WebRTC endpoint, is a little bit painful. Actually, some people's approaches are pretty hilarious; one of what's considered the best approaches right now is to run a headless Electron instance, which is all of Chromium basically.
Well, all of the Chromium renderer anyway, and use the WebRTC implementation from that. It works the most reliably on most platforms, because Electron is really good at cross-platform. But what a heavyweight way to do it. I don't know, John, if you have anything to add.
John: Well, I think from my side, I would like to maybe do some more work on Videostream to support containers that aren't MP4, in particular WebM or MKV, which are very similar. I'm personally excited about the AV1 codec, and not sure when it's going to be supported or where, but I think we're definitely going to want to support the WebM container in Videostream at some point for that.
Steve: Yeah that'd be great.
Matt: Cool. Yeah, so I think that's all the time we have for today. But I wanted to really thank you guys for comin' in, this has been awesome so far. As I say we're wrapping up.
Feross: We didn't get to talk about banana chips.
John: Oh yeah!
Matt: Okay wait. Let's close on that note.
Feross: I don't know if you want to publicly share what you're doing, is it a secret?
John: No, that's fine. Well this started as a, mostly just me kind of messing around with things. Well I was trying to make this device, sort of like an opposite microwave for rapidly cooling drinks. I realize that's not too practical, it does work, but it's not very practical, it kind of splashes everywhere and it takes too long. But I realized vacuum-frying has been around for a while, the idea is you can fry something at relatively low temperatures.
And what that does, is it means that it doesn't get terribly cooked, but it does get that nice flavor with a bit of oil gets into it. And I realized, well I can do an even better job of that, if I have a really low temperature. So I managed to make some banana chips that taste pretty good, but currently it's extremely slow in small batches and a manual process, so we'll see if I ever actually make that into anything, but they do taste good.
Matt: Kickstarter.com
John: Maybe at some point.
Feross: Small batches sounds like a plus to me, that's the thing people go for, you know?
Matt: Artisan.
John: For now, we're talking like, to make a half a cup of banana chips you have to sit there for an hour twiddling knobs. So, it's not quite ready for prime time.
Matt: Just a mere $30 per chip.
John: Exactly.
Matt: Just sell some at The Mill with the toast, it'll be awesome.
Phil: Yeah, if there's one city you're going to do it in, it's gotta be San Francisco.
Matt: Awesome. Well thanks again for joining us you guys, I really appreciate it.
Feross: Thanks for having us.