TCP and its History

Nov 28, 2018 · 32 min

Episode description

We look at the history of the Transmission Control Protocol and why the next version of HTTP won't be relying upon it. What does TCP do and how does it work?

Transcript

Speaker 1

Get in touch with technology with TechStuff from HowStuffWorks.com. Hey there, and welcome to TechStuff.

I'm your host, Jonathan Strickland. I'm an executive producer at iHeartRadio and HowStuffWorks, and I love all things tech. Recently, as I was looking over tech news, I saw that the venerable Transmission Control Protocol, a.k.a. TCP, is getting ready to take its bow upon the release of the next version of Hypertext Transfer Protocol, or HTTP. In other words, it will no longer be part of how HTTP works. So wait, what does all that mean? That's a whole

lot of initialisms, and why does it matter? Where did TCP come from, anyway? Well, most of the time we group TCP together with Internet Protocol, or IP, so it's pretty common to hear people talk about the TCP/IP protocol. But that name is misleading, as generally what is meant by that is a suite of protocols, not just those two. Though, to be fair, when Robert Kahn and Vint Cerf were first working on the transport rules of the Internet, they lumped it all together in one

protocol called TCP. So just a reminder: let's start on the very basic definitions here. A protocol is essentially a set of rules or directions. So it's the parameters that we create so that computers know what to do when we tell them to do stuff. And a good protocol should be functional and consistent. You should be able to get the same result every time you follow

those rules if you give the same inputs. Now, while there are several protocols in the TCP/IP suite, the TCP and IP ones are particularly important. No big surprise, since those are the ones we use whenever we refer to these protocols. So a quick reminder about

how computers send data over the Internet. Computers do not send enormous files all in one go, because that would be difficult to scale as the network of computers got larger and the file sizes got bigger as well, and it would mean that if something were to go wrong during the transmission of a file, you would at best end up with a corrupt file and you'd have to start all over again. At worst, you wouldn't end up

with anything at all. So either way, you would have to figure out how to start up the process again. So to facilitate sending this information across the network so that it's not unmanageable, this protocol suite divvies up the file

into smaller packets of data. Each packet has information associated with it that identifies where it is coming from, where it is headed on the Internet, and how the information contained within the packet fits in with all the other packets of information for that same file, so that the computer on the other end of the communication channel can get all those packets and then put it all back together so that you get the file that was sent

by the first computer. TCP defines how applications can create channels of communication across a network. It also manages how a message is assembled into those smaller packets before they are then transmitted over the Internet and then reassembled at the destination address, and it makes sure that the recipient computer has actually received each packet in sequence to verify

that the entire file has made it across. So it's kind of an error-checking mechanism, or a way of ensuring that the information computer A is sending to computer B gets to where it's going. Without these rules, if you sent something from computer A, you would never really be sure whether computer B got it. This is a set of rules that tells computer B to say, hey, by the way, once you get all these, let computer A know so that everyone knows that the transmission is complete.
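To make the packet idea concrete, here is a minimal Python sketch of splitting a message into numbered packets and putting it back together on the other end. It is an illustration only, not how real TCP is implemented: the function names `packetize` and `reassemble`, the 8-byte chunk size, and the "total packets" field are all assumptions made for this example (real TCP numbers bytes rather than packets and carries no total count).

```python
import random

def packetize(message: bytes, size: int = 8):
    """Split a message into (sequence_number, total_packets, chunk) tuples."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]

def reassemble(packets):
    """Rebuild the message from packets in any order; return None if any are missing."""
    if not packets:
        return None
    total = packets[0][1]
    by_seq = {seq: chunk for seq, _, chunk in packets}
    if len(by_seq) != total:
        return None  # at least one packet never arrived
    return b"".join(by_seq[seq] for seq in range(total))

message = b"Computers do not send enormous files all in one go."
packets = packetize(message)
random.shuffle(packets)               # packets may arrive in any order
print(reassemble(packets) == message)
print(reassemble(packets[1:]))        # one packet lost, so nothing can be rebuilt yet
```

The sequence numbers are what let the receiver both restore the original order and notice that something is missing.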

IP, by the way, defines how to address and route each packet to make sure it reaches the right destination. It's technically on a layer lower than the TCP protocol. Now, way back in the early 1970s, you had a team working on a project called ARPANET. This is going to go back to DARPA, which I covered in a series of episodes recently, so this kind of ties in with that more than a little bit. ARPANET was an early computer network, and in a way it

was a precursor to the Internet. Remember, the Internet is a network of networks. ARPANET was a network, period. And while the team was working on this, they realized that the protocols they had been using for ARPANET were functional but not scalable. If you were to go beyond just one network, if you wanted to connect two

networks together, you really needed a different solution. And as this network would get bigger, the situation would become untenable and so some of the team got to work designing new sets of rules for networked communication that could keep things running smoothly even as the network would get bigger and bigger, something that was truly scalable. One of the people working on this was a guy named Robert Kahn,

one of the fathers of the Internet. You often hear about him and his buddy Vint Cerf, who together would create TCP. So Robert Kahn comes on over to DARPA and he's part of the IPTO department. That's the department that's in charge of creating networks and

that sort of thing. He specifically wanted to replace an earlier set of rules, an earlier protocol called the Network Control Program, or NCP. And the reason for that gets a little technical, but I figure we can go a bit further than just "it doesn't scale well," because that doesn't really tell you much. So for just a second, let's talk about what Kahn was envisioning back in the early 1970s. He wanted a protocol that

was going to do certain things. He felt that computer scientists had connected these distant computers together using the telephone system, which created the first wide area network, but that was not going to be sustainable on a broader scale. He knew that the key component of the network technology at ARPANET was the Interface Message Processor, or IMP, and an IMP is kind of like

a router. It was a packet-switching node that would serve as a connection between different computers on ARPANET. In addition to IMPs, the team on ARPANET was working on a host-to-host protocol, which would become the Network Control Protocol, and developers began to create applications to run on those networks, like email. Kahn would even demonstrate a twenty-node network in 1972,

so a network consisting of twenty computers. Kahn was also working on technology for a packet radio network that would actually use radio waves to send data back and forth across different computers, and that was going to use packet switching specifically because radio is tricky stuff. If a signal were lost or jammed, then the information that was being sent across the network would be lost and you'd end up with miscommunications and failures, so you had to have

a way to deal with this. Originally, Kahn had intended to develop a protocol specifically for packet radio networks to have the sort of error correction mechanism in there, a way of guaranteeing that the information from one system would get to another, and then this network would be able to interface with other networks like ARPANET, using the already established NCP as a transport layer. But there was a big problem. NCP could only address networks

and machines down to the IMP level. NCP would rely upon ARPANET itself for end-to-end reliability, so it worked just fine if you were in ARPANET, if your machine was directly connected into that network. But if you wanted to interconnect ARPANET with another network, something had to change, because NCP could not handle identifying and responding to errors and delivering information to the computers that were outside of ARPANET.

NCP just didn't have that capability. ARPANET handled everything within the network, but it didn't have anything designed to handle stuff from outside that network. So at first Kahn had planned to only work on the packet radio networks and just concentrate on that, but ultimately his quest to create a protocol that could ensure messages were arriving at their destinations across different networks expanded beyond just the

packet radio application. So Kahn wanted an open network architecture, something that would allow any sort of networked system to interconnect with another and still have rules in place to ensure that the data was getting to where it needed to go. So Robert Kahn and Vint Cerf were two computer scientists who were working on this. They were the

authors of these protocols. Vint Cerf had been one of the people to create NCP as well. Fun fact, by the way: TCP did not originally stand for Transmission Control Protocol back when Kahn and Cerf first proposed it. Instead, it stood for Transmission Control Program, and that's a subtle difference, to be sure. They wrote the first version of TCP in 1973, and they published a fully documented and revised version in 1974 under RFC 675. It was specifically

titled Specification of Internet Transmission Control Program. Not long after the initial creation, other folks began to realize that it might be a better idea to break out the functions of TCP into two sets of protocols, and that's where we get TCP/IP. Because, again, before it was all lumped together, and then they figured this would make more sense if we separated them out into two sets of rules. The creation of TCP/IP predates the Open Systems Interconnection, or

OSI, layer model. And I've talked about the OSI model in a past episode of TechStuff. But the OSI model describes how different parts of a telecommunications or computer system communicate with one another. They have layers to describe the different functions. But TCP/IP layers are pretty similar to OSI layers, so we can talk about the two as being at least somewhat analogous. It's an abstract idea that's meant to describe how each layer fits within a grand scheme.

So layers that are near the bottom of the stack support all the layers that are on top of them. Layers at the top do not necessarily support any other layers. They rely on the ones below them, but they don't support any layers above them. And again, this is an abstraction. There are not actual literal layers in these systems, but within this framework, you could say TCP would be on layer four. That would be the transport layer. The

Internet Protocol is one layer further down. It's on layer three, meaning it is a little closer to the basic hardware layer of the system. The lowest layer is the hardware, and that means that IP supports TCP above it, and above TCP are the application layers, where you have stuff like File Transfer Protocol, email, and HTTP. Those are all on

top of it. I've got more to say about what TCP is and what it does in just a moment, but first let's take a quick break to thank our sponsor. So when laying out the rules for TCP, Bob Kahn had a few requirements, and this is from the Internet Society's page on the history of the Internet. They were: each distinct network would have to stand on its own, and no internal changes should be required to any

such network to connect it to the Internet. Next, communications would be on a best-effort basis, so if a packet did not make it to the final destination, it would shortly be retransmitted from the source. So this is the error correction part. If a packet on its way to computer B never makes it, then computer A will retransmit that same packet. Black boxes would be used to connect

these networks. These would later be called gateways and routers, and there would be no information retained by the gateways about the individual flows of packets passing through them, thereby just keeping them very simple and avoiding complicated adaptation and recovery from various failure modes. So they were really just a means of controlling traffic flow, but not monitoring traffic flow, and there would be no global control at the operations level.

Those were his requirements. Those would develop into more granular requirements as the work would continue on the protocols. And Vint Cerf gave a really, really good explanation of how TCP works in a short video, and he used a postcard analogy, and I highly recommend checking it out because he just puts it very simply. I'm gonna kind of paraphrase what he said here. He compared TCP to sending a book to a friend, and you're using the postal service,

except your postal service is very peculiar. They will not carry anything other than postcards. So you cannot actually send the physical book as is to your friend because the post office is not gonna carry that. So what you have to do is cut your book up so that you can fit maybe about half a page on a postcard, and then you can send that postcard through the mail, and then you have to send all of the book

in a series of postcards to your friend. But then you realize, hey, wait, because of the way I have to cut up this book, sometimes there's no indication there about a page number, so there's no way of knowing just from the page where this page fits in relation to the rest of the book. So then you number every single postcard, and that way your friend knows what order they go in, they know the sequence. Now, there's no guarantee that any one postcard will actually make it

all the way through. There's also no guarantee your friend will receive the postcards in the same order that you sent them. But by numbering, your friend will know which postcards they have received. So if they get postcard number eighty-three but they didn't get postcard number eighty-two, they can send you a message alerting you that they are missing one, and you can retransmit, or resend. So your friend can send a postcard back to you that essentially says, hey, I got all the postcards up to

number eighty-two or whatever, but that's it. And this would let you know that you need to resend those postcards, which means you have to keep a copy of the

postcards you've sent. You can't just send your only copy, because you'd be up the creek if your buddy says, hey, I didn't get that. And if nothing comes back to you, if your friend never says, oh, I received everything, then you would have to start resending postcards until you finally got a message that says, hey, I got all the postcards, thank you, I'm going to read the book now, or whatever it might be. But that's how

TCP works. But instead of it being, you know, physical postcards, we're talking digital information. There's never a guarantee that the information you send is actually going to get to your destination or that it will all arrive sequentially. But these safeguards mean that your computer will know when to send stuff again to guarantee transmission. The United States Department of Defense adopted TCP/IP as a standard in 1980.
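Going back to the postcard analogy for a moment, the "keep a copy and resend until your friend confirms" rule can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not real TCP: the function name `send_reliably`, the 30 percent loss rate, and the fixed random seed are all made up for illustration, and real TCP uses timers, sliding windows, and byte-based sequence numbers.

```python
import random

def send_reliably(chunks, loss_rate=0.3, seed=1, max_rounds=1000):
    """Deliver every chunk over a simulated lossy channel by resending until acknowledged.

    Returns (chunks_as_received_in_order, number_of_transmissions).
    """
    rng = random.Random(seed)   # seeded so the run is repeatable
    received = {}               # receiver side: sequence number -> chunk
    acked = 0                   # sender side: receiver confirmed everything below this number
    transmissions = 0
    rounds = 0
    while acked < len(chunks):
        rounds += 1
        if rounds > max_rounds:
            raise RuntimeError("channel too lossy; gave up")
        # The sender kept copies, so it resends every postcard not yet acknowledged.
        for seq in range(acked, len(chunks)):
            transmissions += 1
            if rng.random() >= loss_rate:      # this postcard survived the trip
                received[seq] = chunks[seq]
        # The receiver replies "I have everything up to N" (a cumulative acknowledgment).
        next_missing = 0
        while next_missing in received:
            next_missing += 1
        if rng.random() >= loss_rate:          # the reply postcard can get lost too
            acked = next_missing
    return [received[seq] for seq in range(len(chunks))], transmissions

pages = ["page 1", "page 2", "page 3", "page 4", "page 5"]
delivered, sent = send_reliably(pages)
print(delivered == pages, "after", sent, "transmissions")
```

On a lossless channel (`loss_rate=0.0`) every postcard is sent exactly once; the lossier the channel, the more retransmissions it takes, but the receiver still ends up with the whole book in order.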

DARPA was able to change over in advance of everyone else, which allowed for partitioning of the military networks from non-military networks, and that would carry forward, so you have MILNET, which is its own separate network that's based on essentially the same architecture as the general Internet. TCP/IP would become the official transport layer for ARPANET on January 1, 1983. This was called

a flag day. Now, that is an event that involves incorporating a critical change in a very large system in a simultaneous way, like it has to change throughout the system at the same time, and that is really tricky to do. The bigger the system, obviously, the harder it is for you to make a global change all

at the same time. The transition had been planned out for years in advance because this would require network administrators to change over to the TCP/IP protocol all at the same time, and surprisingly it went off without any really major problems. So that's pretty cool. By the mid-1980s, the Internet was an established thing, though really only a relatively small number of people were

aware of it. If you worked at DARPA, or if you were at a university with a really good computer science curriculum, or maybe you worked in a research facility, or maybe you were in the military, then you might know about it. A few other government offices also were on the early Internet, but apart from that and a few major businesses, it was largely a thing of mystery. The general public was pretty much ignorant of the Internet

for almost a decade. It wouldn't be until the emergence of the World Wide Web that more people would become aware of the Internet, and in fact, at that point, the World Wide Web and the Internet would often be confused as meaning the same thing. A lot of people would refer to the World Wide Web as the Internet, not realizing that really the World Wide Web is one application built on top of the Internet,

it is not itself the Internet. In the early nineties, speaking of the Web, a guy named Tim Berners-Lee, who was working for a little scientific research organization called CERN, had a bright idea. And his idea was for an application protocol on top of the Internet that would facilitate communications between client computers and server computers, including file transfers and the ability for a server to refer a client to a different server. And that would be the

foundation for the World Wide Web. And just in case you didn't pick up on my stupid joke, CERN is not a little scientific research organization. It's the European Organization for Nuclear Research and it is a huge, huge deal. Among the many things it does is oversee the Large Hadron Collider, so big, big organization. Ultimately, the purpose of HTTP, which was created by Tim Berners-Lee, was to create a means of linking different documents together through

what is called hypertext. And you've seen these. These are those highlighted words in web pages, and when you click on one, you go to a different web page. And that's the whole point: clicking on hypertext sends a command to navigate to a new page. And because the rules for HTTP allow for one server to refer a client to another server, those two web pages don't have to, quote unquote, live on the same

server together. So we're talking about the very basic foundation of how the World Wide Web works, with the interlinking documents that allow you to hop from one page or one site to another. The features of HTTP version 0.9, which was the first one released to the public, included the following: a client-server, request-response protocol; an ASCII protocol

running over a TCP/IP link. It was designed to transfer hypertext documents, or HTML, and the connection between server and client is closed after every request. And that's it. It was bare-bones stuff, but this was the beginning of something truly transformational. In fact, I could honestly say I would not have the career

I have without this invention. So the HTTP standard took off pretty quickly. Tim Berners-Lee had set the stage, and then a team at the National Center for Supercomputing Applications, or NCSA, made the first popular web browser, called Mosaic. One of the programmers on that team was a guy named Marc Andreessen, who went on to co-found the Mosaic Communications Corporation and

eventually publish a new browser called Netscape. Now, at that same time, the Internet Engineering Task Force was organizing a team called the HTTP Working Group, dedicated to improving this HTTP protocol, and it was quickly developing in several different directions. And by that I mean a lot of different people had started by taking HTTP version 0.9 and then tweaking it independently of each other. So it's evolving in different directions simultaneously.

So while there's a shorthand that refers to HTTP 1.0, there is not an actual standard 1.0. There were many, quote unquote, flavors of 1.0 because there were so many different variations on that. The IETF working group would publish a standard for HTTP version 1.1 under RFC 2068, if you want to read it. It's a little technical, but this version would be tweaked and updated before it was officially released.

Version 2.0 of HTTP wouldn't come out until 2015. That is a long time between versions 1.1 and 2.0, and only about a third of all websites in the world today support

version 2.0 as the standard. Most websites are still using 1.0 or 1.1, so it may come as something of a surprise to hear that HTTP 3.0 is right around the corner when not even a majority of sites are on the most recent version, 2.0. And perhaps an even bigger surprise is that, unlike the earlier versions, this HTTP protocol will not rely upon TCP. I'll explain more in just a second,

but first let's take another quick break to thank our sponsor. Okay, so why would HTTP 3.0 ditch TCP, which has been a part of the framework of the Internet since the very beginning, since before there was an Internet? Well, it mostly comes down to two big things: speed and efficiency. So when Robert Kahn and Vint Cerf were working on TCP, they were building out a protocol to handle any sort of application that would be built on top of what was to become

the Internet. So it has very much a one-size-fits-all kind of approach. It provided a useful, or really, I mean, at this point I should just say necessary, set of features to facilitate communication. But some of those are excessive or not as pertinent to the types of traffic that happen over HTTP, or they impede some of the functions that

HTTP handles. For example, in an effort to establish a connection between a client and a server, TCP requires a number of back-and-forth messages, essentially saying, "Hey, over there." The server says, "Yeah, what is it?" The client says, "I want to talk to you." The server says, "All right, hang on a second." And the computer says, "Now a good time?" And the server says, "Yeah, yeah, let's go ahead and do that." It's far more technical than that, but there's this series that goes back and forth in order for a communication

channel to be established between client and server. That gets even more complicated if you want to have an encrypted connection over Secure Sockets Layer, or SSL, on a website. So you know the little lock that you see in the address bar when you visit a secure website,

that's part of SSL. Well, to establish that kind of connection between your computer, or your computer's browser, which is the client, and the server which houses the website you're visiting, it requires even more round trips between the two to first establish the connectivity and then establish

the encrypted communications. So the process is good for making sure that there is an actual route for data to follow, but it's not the most straightforward approach if you want to use HTTP, particularly if you want to use encrypted connections. There is another protocol, however, called User Datagram Protocol, or UDP, and that can serve as the foundation for a new transport layer for HTTP 3.0. UDP has a big advantage over TCP. It is incredibly simple and it is

incredibly fast. It is a transport layer protocol just like TCP, but unlike TCP, UDP does not have the same features to ensure communications are established or successful, so that could be a big drawback, right? UDP transmissions are unordered, so a later message can arrive ahead of an earlier one, and that can be very confusing if you haven't built

in a way of dealing with that. There's also no means for the receiving computer to know if something has gone wrong, if packets go missing. It doesn't know if it doesn't have all the different pieces, if it's just over pure UDP. But UDP can act as a base to build upon. The protocols don't have to be the

end-all, be-all. That's your starting point. So Google has taken UDP as its starting point and built upon it to create an experimental network protocol called QUIC, for Quick UDP Internet Connections. The IETF has taken this experimental protocol and worked on creating a standardized version, which in some ways has moved away from what Google's initial design was all about. But the writing is on the wall for TCP as far as the

HTTP standard is concerned. Moving forward, the transport layer that was initially published in the 1970s is going to have to make way for a lighter, more agile, and less cumbersome standard. In addition to the move away from TCP, the new version of HTTP will be more secure. QUIC, as designed by Google, transports data by encrypting it by default. It is not an added layer on top of everything. It is the default layer.
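To put rough numbers on the round-trip point from a moment ago, here is a small back-of-the-envelope calculation in Python. The round-trip counts are typical textbook figures for a brand-new connection and are not taken from the episode; the 50 millisecond round-trip time and the names `HANDSHAKE_ROUND_TRIPS` and `setup_delay_ms` are assumptions chosen purely for illustration.

```python
# Round trips needed before the first HTTP request can be sent on a fresh connection.
# Typical figures, assumed for illustration; real numbers vary with configuration.
HANDSHAKE_ROUND_TRIPS = {
    "TCP only (unencrypted)": 1,   # the TCP three-way handshake costs one round trip
    "TCP + TLS 1.2": 1 + 2,        # TCP handshake, then a full TLS 1.2 handshake
    "TCP + TLS 1.3": 1 + 1,        # TCP handshake, then a TLS 1.3 handshake
    "QUIC (first contact)": 1,     # transport and encryption are set up together
}

def setup_delay_ms(round_trips: int, rtt_ms: float) -> float:
    """Time spent on handshakes before any request data can flow."""
    return round_trips * rtt_ms

rtt_ms = 50.0  # assumed round-trip time between client and server
for name, trips in HANDSHAKE_ROUND_TRIPS.items():
    print(f"{name}: {trips} round trip(s), {setup_delay_ms(trips, rtt_ms):.0f} ms before the first request")
```

The gap grows with distance: the longer each round trip takes, the more every extra handshake step costs, which is a big part of why folding encryption into the transport setup is attractive.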

Google's build, which is sometimes called HTTP over QUIC, is supported in the latest versions of Google Chrome and in the Opera web browser. Right now, only a few websites actually make use of it. Most of them belong to Google, though Facebook has also been incorporating it. It's going to be a really long path to travel to get widespread adoption, because right now less than two percent of all websites support QUIC. Meanwhile, TCP will still

be in use. Just because it's being phased out of future versions of HTTP does not mean that this protocol is completely obsolete. As I mentioned earlier in this episode, the World Wide Web is just one application on top of the Internet. There are lots of others that will still make use of that venerable set of rules, and some websites may never move off of it, since it requires work to make the transition, and let's be honest, it's not always the highest priority for some businesses that

maintain websites out there. But it is interesting to me to see this move away from TCP. I have always associated TCP as being a truly integral part of the Internet, and it still will be. It just won't necessarily be as integral to web browsing as it used to be. Fascinating stuff to me. I hope you guys enjoyed this episode.

I know it got a little more techie with the protocols than usual, but I thought this was a big deal and one that maybe, probably, I'm guessing, is not going to get widely reported outside of tech news circles. I doubt that you're gonna, you know, turn on the local news and some anchor is going to say, "And in other news, the World Wide Web is moving away from this ancient set of rules." I just don't see that making the news, but it is important anyway. If you guys

have any suggestions for future episodes of TechStuff, maybe it's a technology, a company, a person in tech, maybe there's someone you would want me to talk with about technology, you should send me those thoughts. You can email the show at techstuff@howstuffworks.com, or you can go to our website, that's techstuffpodcast.com, and you can find other ways to contact me there. Don't forget to head on over to our merchandise store, that's at teepublic.com/techstuff.

Everything you purchase there goes to benefit the show. We greatly appreciate that, and there's some pretty cool things over there. If you haven't checked it out, go see if there are any designs that you particularly like. And oh, remember we've been nominated in the Science and Technology category of the iHeartRadio Podcast Awards. You can head on over to the website for the iHeartRadio Podcast Awards and vote up to five times a day.

You can dedicate all five of those votes to TechStuff if that is your desire. But whatever you want to do, you should go check out all of those different categories and see if there are any other shows you really like. Maybe you can discover some shows you didn't even know existed. I always love finding new podcasts. This is a great way of finding some really high-quality ones, and I'll talk to you again really soon. For more on this and thousands of other topics, visit HowStuffWorks.com.

Transcript source: Provided by creator in RSS feed: download file