Brought to you by the reinvented 2012 Camry. It's ready. Are you? Get in touch with technology with TechStuff from HowStuffWorks.com. Hello again, everyone, and welcome to TechStuff. My name is Chris Pollette, and I am the tech editor here at HowStuffWorks.com. Sitting across from me, as he always does, is senior writer Jonathan Strickland. Hey there. All right, so what do you want to talk about today? Well, I wanted to talk about something that comes to us
courtesy of a little listener mail. This listener mail comes from Samuel, and Samuel says: Hi. First of all, thanks for the great info. I really like your show. I just started listening to you guys, and I am liking it. I would like to know more about distributed computing services and how I would be able to sell my computer's idle time or CPU cycles, and also whether it would be a good idea to do it. Thanks again, Sam from Pasco, Washington. Bye-bye, Sam. It's nice talking to you. So,
distributed computing. We're going to go into what distributed computing is and the whole idea of selling your CPU cycles, which I'll just go ahead and get out of the way. As far as I can tell, there are no projects currently available where you can actually sell your CPU cycles. In most cases, you're
donating your cycles. There was a proposed project at one point to compensate people for their computer time, but that one kind of fizzled out before it ever reached a point where payments could be made to participants. So as it stands, as of the recording of this podcast and as far as I am aware, the only way you can participate in distributed computing projects is by either working at a particular facility or donating your computer's
CPU cycles. Should we talk about what distributed computing is and how it works? I just wanted to make sure that we answered Samuel's question and didn't forget about it. No, I'm absolutely okay with that. So, distributed computing: take it away. Well, distributed computing is basically a way of crunching data. And in a lot of cases there are massive, massive
projects underway. A lot of the ones I'm sure we'll be talking about in a few minutes are science related, where scientists, for example, are collecting lots and lots of information, but they don't have the computing power to crunch all those numbers in a handy fashion. So, I would guess about the mid-nineties, that's when I
started seeing information on distributed computing. They started saying, well, hey, you guys all want to help us out with these projects, so why don't we find a way to let you contribute your computer's processing cycles from when your machine is idle? Say overnight, if you leave your computer on, it's not doing a whole lot other than just sort of sitting there asleep. So why don't you help us out by crunching some of these numbers for us and sending them back, and we can use this to get our
experiment further down the road. Right, right. It speeds things up considerably, because really, when you're working on one of these huge projects, you only have a couple of options available to you, because, as Pollette was mentioning, you get tons and tons of information. If you were to use a single regular computer to try to process that, it could take thousands or even millions of years to complete
the calculations, because we're talking about enormous figures here, and often lots of different kinds of information coming in that you have to analyze, and the computer may only be able to concentrate on one at a time. Your other option is to try to book some time with a supercomputer, and some of these projects are doing that too. They're using supercomputers, not just distributed computing. But supercomputer time is precious. There are only so many
supercomputers out there. Many of them were built with a specific task in mind, so there's not a whole lot of time for them to dedicate to other projects. So if your project does not have a supercomputer, it may be a struggle for you to get the money and the time on one of those machines to do your number crunching. So what do you do? Well, if the problem that you're working on can be broken
down into smaller problems, you can look into distributed computing. Now, some of you may be thinking this sounds a lot like what we talked about when we covered multi-core processors. It is not dissimilar. The two are very close together, in a way. You're talking about a big problem that you can break down into smaller problems,
and each processor, or in distributed computing each computer, can work on a portion of that problem, solve it, and send it back to a master computer, which then incorporates everything that has been gathered up to that point, and you end up solving the big problem much faster than you
would if you tackled it with just one machine. Yeah. Actually, if you're a longtime TechStuff fan, you've heard us talk about this on a number of occasions, including when we talked about the PlayStation 3s being used by the military, and the many occasions on which we've mentioned Google. They sort of do the same thing, if you think about it, because rather than buying lots and
lots of really expensive hardware, they use sort of run-of-the-mill PCs running Linux. They're sort of custom built, but they're still not high-end, super powerful computers. And by using those smaller, less powerful computers, they're able to distribute the load of the entire network and spread their budget out a little bit more, too. Right. And I have two kind of fun definitions. "Fun"
might be a stretch. I have two definitions of distributed computing from two different individuals. The first one comes from Andrew S. Tanenbaum, and he says a distributed computing system is a collection of independent computers that appears
to its users as a single coherent system. So, in other words, from a macro point of view, if you get the bird's-eye view on this, yeah, it's a system that's made up of lots and lots of computers, but they're all working to the same end, and they have a kind of master that is coordinating everything. If you look at this as one big system, then you just think, oh, each of these
computers is really a processor. And another example of distributed computing we've talked about in the past that fits that definition is a botnet. Yes, that's also a good example. And you could argue that the Internet itself is, in a way, a distributed computing system, although it's not really geared to a specific project. It's not solving a problem so much, but the computing power of
moving all that information is distributed. Yes. So then the second definition I wanted to mention comes from Leslie Lamport, who says you know you have a distributed system when the crash of a computer you've never heard of halts all progress. So that brings us to some of the considerations you have to have in mind
when you start to build a distributed computing system. Now, to build a distributed computing system, what you really need is an algorithm, which is of course a set of instructions. An algorithm is really just a set of instructions, usually in mathematical format, that tells computers what they have to do. It gives them a step-by-step procedure. A distributed algorithm determines how to divide up
a problem. It determines how to send that information out to the different nodes or clients. The clients and nodes in this case would be the individual computers out there. It would also have to have information on how long to wait for
a response from that client. In other words, if you send out a problem to a computer, and let's say that you are using your computer to be part of one of these distributed computing projects, but you happen to go on vacation for a full
month and you've turned your computer off. Well, in that month-long vacation, while you're enjoying life and drinking tropical drinks and listening to Beach Boys songs, your computer is not working on the problem that was sent to it by the master computer. If you don't build into the algorithm the fact that, hey, sometimes computers are not going to send back the information you expect, then the whole project could come to
a halt, theoretically. Or you could build into the algorithm: if you don't hear back from the client within X number of days, then send the problem to another client. That way you have redundancy in there. If one computer fails, the whole project doesn't come to a crashing halt, as Leslie had pointed out. That's something you
have to build into the algorithm, as well as the idea that none of these individual computers is infallible. You have to plan for the fact that some of these are going to crash. Users may turn their machines off, or they may abandon the project, and you have to build that into your system so that it doesn't bring the whole thing down in the process. I'm sorry, you were going
to say? Well, no, I didn't know where you were going next, because I had an example. One of the projects that has become really sort of an overarching way of working on different kinds of distributed computing projects is the University of California, Berkeley's BOINC. Yes, BOINC, the Berkeley Open Infrastructure for Network Computing. I couldn't read my own handwriting there. And basically what this is, is a piece
of client software. You can download the software and it will enable you to participate in all kinds of projects, sometimes more than one at a time. Basically, a scheduling server sends out the instructions to your computer and says, okay, here's your information. It judges your CPU and memory, so it says, I'm not going to give you any more than you can handle,
you know, go ahead and work on this. And the computer is also able to download any necessary files it needs to execute. Basically, if the software is not up to date, it can update itself so that it can go ahead and process the information. Then your computer does the calculations, sends the results back up to the data server, and tells the scheduling server, hey, I'm free. And basically they give you credit for the amount of cycles your computer has
participated in. And it does send out each problem to more than one computer, actually two in this case. To be fair, rather than give all the credit to one person: say Jonathan and I both have BOINC installed on our computers, and both of us are working on SETI@home, and Jonathan turned his computer off one night when I left mine on, so his wasn't able to process the information quite as fast, because he was, you know, about ten
hours short of mine in processing cycles. So basically, to be fair, because Jonathan's computer still did work to get the unit done, it gives both of us credit for how much time our CPUs spent working on the problem. So it sort of serves a dual purpose, because on the one hand, if my computer had been off the entire time, the unit still needed to be finished. But at the same
time they want to give us credit for it, because part of the fun in participating in some of these projects is that they give you credit, and you get to rack up hours, and you can join teams, and the teams can compete with one another. So it's mostly for fun, for bragging rights. It's not a compensation thing. But when you think about it, you're helping contribute to humans expanding our knowledge about the universe in some way, depending on what the project is.
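The ideas discussed so far (divide the problem into units, hand each unit to two clients, reassign a unit if a client goes quiet, and credit everyone for the CPU time they report) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not BOINC's actual code or protocol: the names Coordinator, WorkUnit, request_work, report, REPLICAS, and DEADLINE are all hypothetical, time is counted in abstract "ticks" rather than days, and the "problem" is simply summing a list of numbers.

```python
from dataclasses import dataclass, field

REPLICAS = 2   # assumption: each unit goes to two clients, as in the discussion
DEADLINE = 3   # hypothetical: ticks to wait before giving a unit to someone else


@dataclass
class WorkUnit:
    unit_id: int
    numbers: list
    assigned: dict = field(default_factory=dict)  # client -> tick it was handed out
    results: dict = field(default_factory=dict)   # client -> reported answer

    def done(self):
        return len(self.results) >= REPLICAS


class Coordinator:
    """Toy 'master computer': splits the data, hands out units, collects answers."""

    def __init__(self, data, chunk_size):
        self.units = [
            WorkUnit(i, data[start:start + chunk_size])
            for i, start in enumerate(range(0, len(data), chunk_size))
        ]
        self.credit = {}  # client -> total CPU ticks reported

    def request_work(self, client, now):
        """Return a unit that still needs another copy computed, or None."""
        for unit in self.units:
            if unit.done() or client in unit.assigned or client in unit.results:
                continue
            # Forget clients that have been silent past the deadline.
            for other, since in list(unit.assigned.items()):
                if now - since > DEADLINE:
                    del unit.assigned[other]
            if len(unit.assigned) + len(unit.results) < REPLICAS:
                unit.assigned[client] = now
                return unit
        return None

    def report(self, client, unit, answer, cpu_ticks):
        """Record a client's answer and credit it for the CPU time it reports."""
        unit.assigned.pop(client, None)
        unit.results[client] = answer
        self.credit[client] = self.credit.get(client, 0) + cpu_ticks

    def total(self):
        """Combine finished units, counting only those whose copies agree."""
        combined = 0
        for unit in self.units:
            answers = set(unit.results.values())
            if unit.done() and len(answers) == 1:
                combined += answers.pop()
        return combined


def demo():
    boss = Coordinator(list(range(1, 9)), chunk_size=4)
    first = boss.request_work("chris", now=0)
    boss.request_work("jonathan", now=0)      # takes a copy, then goes on vacation
    second = boss.request_work("sam", now=0)  # first unit is full, so sam gets the next
    boss.report("chris", first, sum(first.numbers), cpu_ticks=5)
    boss.report("sam", second, sum(second.numbers), cpu_ticks=4)
    # Later on, jonathan's copy is overdue, so it is handed to sam instead.
    retry = boss.request_work("sam", now=5)
    boss.report("sam", retry, sum(retry.numbers), cpu_ticks=5)
    last = boss.request_work("chris", now=5)
    boss.report("chris", last, sum(last.numbers), cpu_ticks=4)
    return boss.total(), boss.credit


if __name__ == "__main__":
    print(demo())
```

In the demo, the client that never reports back does not stall the project: once its deadline passes, its copy of the unit is handed to another client, and only the clients that actually returned results accumulate credit. A real system would also have to handle disagreeing answers, cheating, and differing hardware, which this sketch ignores.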
There are lots of different versions out there, as emergencies go on outside of our studio. But besides the whole algorithm thing and the whole scheduling thing that you pointed out, there are two major categories of distributed computing. You have synchronous, where everything is, as you would imagine from the name, synchronized, so that everything is going on a very set schedule, and then there's asynchronous. Now, asynchronous is harder to set up.
It is much more difficult to create an algorithm that accommodates an asynchronous system, but it's also the one that allows more people to join, because, as you pointed out, there are going to be times when someone's computer is off, or perhaps the person is using the computer for a really processor-heavy application. And so most of these programs have middleware
that will ease back on your computer while you're using it. It may always be running in the background, but it will start to consume less and less power as you demand more and more for whatever application you need. So let's say you're doing a lot of video editing and you need a lot of processor power at that time. It'll back off, or you can turn it off entirely for the duration of your work, which
I sometimes do, depending on what I have to do. Right, especially if you start feeling like your computer's acting sluggish and you start looking at the processes you can halt, and that's usually one of the first to go. But an asynchronous system can compensate for that. You know, it doesn't
matter if your computer goes offline for a while. And so that's another one of those things that you have to take into consideration, along with the fact that if a computer does go down, you don't want your project to crash. Another thing you have to keep in mind is that for a lot of these, actually, I would say pretty much any distributed computing system that is not a very tightly controlled grid computing system, you're going to be working with
lots and lots of different kinds of hardware. Definitely. I mean, we work in the same company, Chris and I. Really? Yeah, oddly. So Chris and I work in the same company, but our computers are different. Yes. His computer and my computer are not exactly the same; you know, the specs are different. His is better. His E key works and mine doesn't. I'm really tired of typing the number three instead of
E over and over. I think that's called the minor vowel shift. Yes, yes, I had a vowel movement. Minor English major joke there. Yeah, we're sorry. We apologize. At any rate, you have to be able to build your system so that it can compensate for the fact that people
are not using the exact same machine. Now, for some of these distributed computing models, if you were, say, a research facility and you just decided, you know, instead of trying to spend several hundred million dollars to build a supercomputer, we're going to spend several thousand dollars and buy up, you know, a dozen PCs, you could theoretically make sure that every single one of them matches exactly to a
specific model. Now, in that case, a synchronous approach might even work, because if all you're doing is relying on computers that you yourself are maintaining, you can make sure that everything's working properly. It's not the same as a distributed computing system where anyone can join. And really, it's the Internet Protocol that made all of this possible, as far as opening it up
to the public. Before that, you would just have various standards of computer-to-computer communication to try to create a distributed computing system. I'm saying "distributed computing" so much it's starting to lose all meaning to me. Right. But we should also mention there are different variations of distributed computing, and the definitions are not set in stone. They're not so well defined
that it's easy to differentiate one from the other. So, for example, grid computing: you might think of grid computing as being a little more structured, because, you know, the word "grid" sounds that way. Right, right. It doesn't necessarily mean that. I mean, some people use grid computing to mean any kind of distributed computing system. Or there's utility computing, which kind of suggests that there's a transaction involved, like you are paying a company in order to access
a certain amount of processing power or certain applications. It's kind of a software-as-a-service spin-off. Then there's cloud computing, which sounds like it's a little less structured than grid computing, and that's usually what I think of when I think of these big projects that allow the public to participate. But again, these definitions are just kind of guidelines. There's a lot
of overlap, and some people use the terms interchangeably. Yeah, so don't get too confused, because no one knows what they're talking about. Yeah. I always think of distributed computing as being the participatory projects that you opt into, right? I mean, that's usually the way I hear it referred to. And you mentioned SETI@home, which is the Search for Extraterrestrial Intelligence. Yes, it was the very first project I ever worked on
in a distributed sense. Honestly, I think we just need to start searching for the terrestrial kind first and then work our way up. A dumb human joke. I'm sorry. Yeah. SETI@home was the first project that I ever participated in like this, and basically it takes information gathered by a radio telescope and
sends it out to different people. Now, they used to have their own individual client, which was really pretty because it had lots and lots of multicolored blocks on it, and when it was up as a screensaver, it made it look like your computer was doing something really, really cool. And what your computer really was doing, just so you know, was analyzing the data from this radio telescope and looking for any kind of patterns
or spikes that might indicate an extraterrestrial radio transmission. Yeah. So they were listening, and we were helping them figure out whether we were actually picking up anything other than I Love Lucy reruns, which oddly come from Venus. That is not true. Yeah. Actually, now they're part of BOINC, or they're not part of BOINC, but their data can be processed through BOINC.
So if you download the BOINC client, you can do that, and a lot of projects are done that way too. But yeah, at the moment I'm participating in three. Wow, what are the other two? Well, no, actually, okay, good, this is a good transition, because I wanted to talk about the different kinds of things, because we're not just talking, you know, strictly
science stuff. Although science and life sciences are two of the big categories of participatory distributed computing projects like this, there's also stuff you can do with cryptography. There are other internet and finance projects you could work on, plus mathematics, language, art, and even puzzles and games that you can participate in. See, I got confused. I thought cryptography was when you sent people to the forbidden zone. No. Right. So anyway,
that was a Superman joke. And I'm sorry. But two of them I participate in, and I'm ignoring you, through BOINC. One of them is the Similarity Matrix of Proteins, and the other is Rosetta@home, both of which are life sciences related. And then I run a third one on our aforementioned work computers. Those two are on our test machine. The other one is a Stanford project that's not run through BOINC, called Folding@home.
That's proteins, right, Yes, it is also proteins. They're basically looking the proteins fold um in a way that helps them work. And we don't really understand what it takes for the proteins to actually activate, so we're looking at you know, we're looking at how the proteins fold and how that helps determine what the proteins do. Yeah, exactly exactly, And it's uh, it's the difference in this and the point stuff is, you know, the Folding its home has
a screensaver components. So when my computer is you know, a sleep when I go to get coffee or something like that. Uh, it has all these funny looking look molecule looking things up on the screen and solely are are building the the supposedly building the protein. Yeah. And uh, I even convinced, uh stuff you should know as Josh Clark to participate in the process too. So we have
a house stuff Works group. Yes, we do on on that group, so you know you can always join us there, and I um, I'm going to start groups too for the other two projects I'm working on in our house stuff works dot Com groups. Hopefully I can get ones for the other groups and we can really get a how stuff works dot Com movement behind distributed computing and have a have fun trying to take on the other groups with it. And will make sure that we include that when we do the blog post right up for
this this episode. Yeah, because I haven't done them all yet, so I'm gonna have to put those together. So there are a couple of other ones that we can talk about. There's the Milky Way at Home projects. That's that one's mapping out the Sagittarius stream and analyzing the data from that. There's Einstein at Home, which is looking for gravitational waves, particularly from pulsars um. There's the amount of time you can put in on that one is sort of relative nice. Yeah,
it breaks down the closer you get to the Big Bang. These projects, of course, like the other ones we mentioned, just involve massive amounts of data that need to be analyzed, and that's why these projects are so important and why they depend so heavily upon volunteers, because otherwise, if you just dedicated, you know,
a dozen machines to this, it would take forever. But other big projects use distributed computing as well. The Large Hadron Collider, well, when it's working, is collecting terabytes of data, and so they're using distributed computing to analyze all of that. And talking about the Big Bang, I mean, that's really what the LHC is doing. It's slamming these protons together at such magnificent speeds that the
resulting explosion shows, sort of in miniature, and I'm talking unimaginable miniature, the sort of effects that you would have witnessed at the Big Bang, had there been some way for you to actually witness them. Now, I haven't heard whether they're going to allow the public to participate in the crunching of those numbers. It's going to be just the scientific community, as far as I can tell. That is one of those closed grid computing systems where I haven't seen anything about a
public version. CERN is pretty tight with that. That also leads to another issue. First of all, to run any of these at-home type projects, normally you have to download some software. I mean, pretty much,
you have to download some software. This is the software that allows your computer to communicate with the master computer, and it allows your computer to accept assignments and then work on them, because it usually has whatever program is needed to analyze the data and then send the information back. So, yeah, just getting the numbers isn't going to really help you much if you can't crunch them. Right. Well, that means you have
to download some software onto your computer. So right there, there's a security vulnerability, as there is anytime you have to download software. Now, most of these are coming from very trustworthy sources, and I would imagine that they have pretty good security on their products so that someone's not going to, you know, do a man-in-the-middle attack and
have you download malware instead of the middleware. But you may also worry about the fact that this is essentially giving another computer access to your computer's processing power. Hopefully there are no back doors built into that software that would let someone, say, get administrative control of your machine or spy on any of your data. Again, most of these projects are from very reputable sources, and it's a good bet that there's no malicious intent behind them.
But that doesn't mean there isn't a vulnerability, right? Yes, there could be vulnerabilities. I mean, sometimes people just don't think of that when they're building a tool, because they're thinking, how can I achieve this goal? They're not thinking, how can I make sure someone doesn't take advantage of this tool to do something naughty? Right. So just, you know, buyer beware. Naughty, naughty.
Another thing to consider, if you're at all environmentally conscious, is that if your computer is on but not being used, say overnight, and you're contributing to one of these distributed computing projects, yes, you're helping out science, but you're also running your computer when you don't absolutely have
to. Right. So it's kind of a trade-off. You know, if you're thinking about doing good, you have two ways to do good and they're sort of at odds with each other, so you have to decide which is more important to you: turning your computer off or... Right. So you're talking about conserving electricity. I didn't actually say that,
but there's another element to it besides that. So there's the conserving-electricity concern, where if you do participate in this project, you are going to be consuming more power. So not only is it less green, but it also could impact your electricity bill, although probably not to a huge extent, but it could still affect it. The other thing to remember is if you have a cap on how much data you can access on the internet, because some ISPs
cap how much you can access. If you're running this, that's going to count toward that cap. So let's say that you like to watch lots of streaming movies and stuff. Well, that eats up a lot
of bandwidth pretty quickly. And if you are running this in the background the whole time, you may start inching toward that cap faster than you would imagine, because you're always maintaining that connection with the master computer, or at least it's checking back and receiving data and sending data on a fairly
regular basis. So if you do have a cap on your broadband usage, think about that before you sign up for one of these things too, because if you're anywhere close to that cap, this might put you over the edge, depending on how often you're letting it run in the background. Did you have anything else to add? No. Yeah, I really focused on the public aspect of this because I thought it would be more fun to talk about. Sure, and it's the part
that applies. Yeah, it's the part that applies to our listeners the most too, unless they're all working at CERN, in which case, hey y'all. Hey y'all. They'd probably have to have that translated, since they're in Switzerland. That's true. I can't speak any of those languages. Bonjour, y'all. Yeah, there you go. Part of it's in France. But no, I mean, it's kind of fun to participate in these groups.
And certainly you're welcome to join any of the HowStuffWorks.com distributed computing groups that we'll set up, and, you know, like I said, maybe we'll take out some of those other media companies with extreme prejudice. Indeed. Excellent. Well, that was a really good discussion, and it leads us into our second round
of listener mail. This listener mail comes from Cameron, and Cameron says: Hi, Jonathan and Chris. I looked through your podcast episodes and couldn't find one covering the topic of daylight saving time. Please cover the history and why it was made, like you guys normally do. I love the podcast, keep up the great work. Well, thanks, Cameron. But you know what, daylight saving time really isn't in our wheelhouse. It's not really tech related.
But what I suggest is that you send this request to either Stuff You Missed in History Class, which could give you the historical lowdown on daylight saving time, or Stuff You Should Know. And if you write to Stuff You Should Know and ask them if they can cover the topic, you can also tell them that they smell. The smell wars are heating up. Yeah, yeah, they smell. We've already received emails that we smell, so they also smell. Actually, technically... never mind, I can't go.
Thanks so much for your mail, Cameron. Please continue to listen. Sorry we can't tackle that one. It's just not tech enough, I think. But if any of you have any questions, criticisms, or concerns, or you just want to say hi, you can write us. Our email address is techstuff@howstuffworks.com. We have articles about grid computing, cloud computing, all sorts of distributed computing things, and zombie computers. All that you can find at the website, HowStuffWorks.com.
And we will talk to you again really soon. For more on this and thousands of other topics, visit HowStuffWorks.com. And be sure to check out the new TechStuff blog, now on the HowStuffWorks homepage. Brought to you by the reinvented 2012 Camry. It's ready. Are you?
