How Camera Stabilization Works

Speaker 1

00:04

Get in touch with technology with TechStuff from HowStuffWorks.com. And welcome to TechStuff. I am your host, Jonathan Strickland. I'm an executive producer at HowStuffWorks. I like you and I like technology, and I don't really know that much about you. I just get a good feeling, but I know a pretty good amount about technology. Today, I want to talk about, you know, tech. And as I'm sure most of you guys know by now, some of you know because you're doing it

00:35

at this moment, I live stream TechStuff recordings on Wednesdays and Fridays over at twitch.tv/techstuff. There's a chat room over there, and I like to spend some time chatting with viewers and listeners before the show and during breaks. And during one of those breaks in a recent show, one listener put forth a request for a future episode of TechStuff, and the request was for camera stabilization or image stabilization. So that's what I'm gonna talk about. The future is now. So I'm

01:04

gonna talk about image stabilization in general. And there are two really big, really general ways to go about this. One is to use mechanical elements to help compensate for the little, or sometimes not so little, jittery movements we make as we try to capture video or still images using handheld cameras. And the other is the software approach, in which algorithms and programs attempt to reduce the jitters, sometimes reverse the jitters, using some sort of trickery on

01:34

the back end. We're gonna look at both of those different approaches in this episode. Now, before I do that, I think it might be a good idea to talk about how digital video cameras work in the first place, so that you understand what is going on before we even get into stabilization. That way, we can better understand the actual mechanics of stabilization when we get there. So let's crack open a digital camcorder, let's say. You guys remember camcorders, right? I imagine they're still a thing.

02:05

I don't know. I haven't been in the market for one for a really long time, but these days there are lots of other devices out there that can capture high resolution video, including several smartphones. So camcorder is a term I don't really hear very much anymore. Anyway, that doesn't matter, as the camcorder's innards are pretty much the same as you would find in any digital camera

02:22

device these days. Only now we've gotten really good at miniaturizing those components, so you don't have to lug a big thing on your shoulder, nor do you need something that is part camera, part VCR. And if you don't know what a VCR is, you'll have to listen to an older version of TechStuff. I'm not going to go into that here. First, cameras record visual images. I know that breaks it down to an incredibly simplistic and almost silly statement, but it is something to keep in

02:48

mind that we are talking about visual information. Now there's also audio information. Cameras mostly incorporate microphones these days. If you have a video camera that doesn't have a microphone and you aren't using an external microphone, you might be an avant-garde performer, or you may just have a really crappy camera. Breaking this down even further, cameras capture light. Because everything we see is from light that's bouncing off

03:13

of surfaces or filtering through materials. Cameras attempt to capture that combination of lighting effects in an effort to replicate it in some way, typically to replicate it in a way that is as faithful as possible to what we would see with our naked eyes, although obviously you can set cameras so that you're capturing stuff and altering it so it doesn't look the way it would naturally. Thus you have all the different types of filters out there.

03:40

You Instagram fanatics out there know all about that, taking photos with various filters. But generally speaking, cameras are capturing light. Old cameras did this by focusing that light on photosensitive film. So you would take light, you would direct it to this film, and that would alter the film chemically.

04:01

You would then develop the film, putting it through other chemicals that would give you a negative, a reverse of the image that you want, at least in color, and then you would use that negative to produce prints, and that would replicate what your eyes saw based upon what the camera was capable of capturing. Assuming you had a good camera, it would look pretty much the way you saw it in person. Now, cameras typically use one or

04:27

more lenses to focus light in this way. In fact, when we say a camera lens, we really mean a series of lenses that are encased in some sort of form factor. It's very easy to think of a camera lens as being a single thing, because you might have a camera body and then a selection of different lenses where you can attach or detach them. But each of those lenses contains several glass lenses inside of it. It's not just a lens inside a casing. Digital cameras,

04:58

by the way, do this too. They don't use film to capture images. They use image sensors to capture them, but they still use lenses to focus the light onto the appropriate part of the camera, this being the sensor as opposed to photosensitive film. The image sensors are solid state devices, and the two most common types of image sensors are CCD, also known as charge-coupled devices, and CMOS, or complementary metal

05:28

oxide semiconductors. But what the heck does that even mean? Well, CCD sensors are the older of the two, and they led the way for many, many years in terms of image quality. CCD was considered to be the top for image quality. It was also more expensive and less energy efficient,

05:45

but it produced the best images. A CCD sensor was able to produce much sharper, higher quality images, but CMOS sensors are more efficient, they're more power efficient. There's also some more built-in functionality with CMOS sensors than there is with CCDs. Plus, eventually the CMOS quality pretty much caught up with and

06:07

in some ways surpassed CCDs. So while there are differences, and there are still different kinds of cameras out there that have different kinds of sensors, and different photographers will favor one versus the other, the differences in quality between the two have largely diminished. It's not as dramatic as it once was. Some budget cameras have CCDs, a lot of the better cameras have CMOS sensors, but that's not necessarily across

06:38

the board. Like photoreactive film, those image sensors are photosensitive. So light, in the form of photons, hits the photosites. Photosites are the little photosensitive elements on these image sensors. Each photosite essentially is a pixel, so you get one photosite per pixel. When a photon hits a photosite, this produces an electrical charge. Now a bright light will end up

07:04

generating a stronger charge than a dim light will. The photosites must register the relative brightness of the light that's hitting them, as well as the relative brightness of the color of light, that is, red, green, or blue light, in order to reproduce color images. Otherwise you're just gonna get black and white images. Some sensors do this by having a color filter above each pixel photosite, and they will have a different color filter over the

07:31

different photosites. This is called a Bayer filter array, named after Bryce Bayer, who invented the array. It's a filter consisting of a mosaic or array of red, green, and blue filters above the photosites on the sensor chip. So it looks kind of like a checkerboard, except you've got three colors, not two, and one fourth of those filters are red, one fourth of those filters are blue, and the other half are green. So you have twice as many green as you have either blue or red.
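To make that one-quarter red, one-quarter blue, one-half green split concrete, here is a minimal sketch in Python. It assumes the common RGGB tile arrangement, which is only one possible Bayer layout, and the function names are hypothetical, made up for this illustration.

```python
from collections import Counter


def bayer_pattern(rows, cols):
    """Build a rows x cols grid of filter colors.

    Assumption: the repeating 2x2 tile is laid out as R G / G B (RGGB).
    Real sensors may start the tile on a different corner.
    """
    tile = [["R", "G"],
            ["G", "B"]]
    return [[tile[r % 2][c % 2] for c in range(cols)] for r in range(rows)]


def color_fractions(grid):
    """Return the share of photosites covered by each filter color."""
    counts = Counter(color for row in grid for color in row)
    total = sum(counts.values())
    return {color: counts[color] / total for color in "RGB"}


if __name__ == "__main__":
    grid = bayer_pattern(4, 4)
    for row in grid:
        print(" ".join(row))
    print(color_fractions(grid))
```

For any grid with an even number of rows and columns, the green share comes out to one half and red and blue to one quarter each, which matches the description above.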

08:04

So why is that? Why would you have twice as many green filters? Well, that's to mimic the color sensitivity of the human eye, because our color sensitivity is not even across the board from a biological standpoint. Some filters, some Bayer filters, will include different shades of green, so you won't just have one single shade of green filter. You might have two. You might have one that's like a dark forest green and one that's kind of more

08:26

of a lighter green or something along those lines. Uh. Some filters will use clear filters in place of some of the green, and that's in order to improve light sensitivity, but in general, that's how most of those filters work. When the camera records an image, initially, it's a picture that's just red, green, and blue in the raw file. So if you were to look at a raw photo image before any processing has happened, it would look really weird. It wouldn't look color realistic at all. It could look

08:58

pretty awful. It's what in photography is often called false color. But then you have this image processing unit that's part of digital cameras, and it can take this raw file and convert it into more natural imaging using what is called a demosaic stage. So it takes all that jumbled information and makes some meaning out of it. One cool thing that happens with this software is how it determines the color of an individual pixel. So remember images are

09:26

made up of millions of these pixels. A pixel is essentially a point of light, or if you prefer, a single point of color in an image, and the more pixels you have, the smaller they are. Let's say that you have the same dimensions of a photo. Let's start with that. So let's say you're using an eight by ten photo. This is just for the purposes of explaining pixels and resolution. If you increase the resolution of an eight by ten photo, so you're not changing the size of the image, you're

09:56

just changing the resolution. If you increase the resolution, that means you're packing more pixels into that same physical space. That means the pixels themselves have to be smaller. If you are reducing the resolution, you're decreasing the number of pixels. That means the size of each individual pixel gets larger.
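The arithmetic behind that is just a division, and a short Python sketch can show it. The pixel counts below are made-up illustration values, not figures from any particular camera, and the function name is hypothetical.

```python
def pixel_size_inches(print_width_in, pixels_across):
    """Width of one pixel when pixels_across pixels span print_width_in inches."""
    return print_width_in / pixels_across


if __name__ == "__main__":
    # Same eight-inch-wide print, three assumed resolutions.
    for pixels in (800, 2400, 4800):
        size = pixel_size_inches(8, pixels)
        print(f"{pixels} pixels across 8 inches: each pixel is {size:.5f} in wide")
```

The print size never changes here. Only the pixel count does, so more pixels across the same eight inches has to mean each pixel is smaller, and fewer pixels means each one is larger.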

10:14

So if you've ever seen an image where someone has taken a digital photo that was for a very small set of dimensions and then they've blown it up, it looks very blocky or pixelated, that's because it was a certain resolution and a certain size, and now you've stretched it, so all of those pixels have individually been stretched as well. If you want to increase resolution, then you have to reduce pixel size to add more pixels to that image. This only works to make an image sharper up to

10:45

a point. Higher resolution does not automatically mean a better quality of picture. There are other elements that are also very important, things like color representation and contrast ratio. So it's not just resolution that you need to think about. This is why we've done episodes of TechStuff in the past where we've talked about the megapixel myth, the idea that if you go out and buy a camera that has more megapixels, it automatically takes better photos

11:13

than a camera that has a lower megapixel count. That's just not necessarily the case. It might be true, but it's not because of the megapixels. It's because of other elements as well. The only time you really have to worry about really high megapixel counts is if you wanted to take a photo and then blow it up to a really large size and you didn't want to have too much distortion when you were doing that. You want a really high megapixel count so that you can do

11:39

that without losing too much on the resolution side. Well, the way this image processing software tends to work is it looks at each individual pixel, and then it looks at all that pixel's neighbors, and it starts to say, well, based upon all the neighbors of this pixel, what color should this specific pixel be? And it starts to extrapolate information based on this, and it makes some decisions about what color each pixel should be based upon the nature

12:08

of its neighbors. And as it turns out, this software is pretty good. It can reproduce colors fairly faithfully. And that seems pretty interesting to me when you think about it. You know, how do you start with that, right? When you first start, you don't necessarily have any natural colors. It's all red, green, blue. But by using this kind of process of deduction, the image

12:32

processing software can create a natural looking image. And it's all about, again, looking at all the neighbors of this one pixel to determine what color it should be. This happens super fast, by the way. It sounds like something that would take a really long time, but the software is incredibly fast. Now, Bayer filters are found in many, but not all, cameras. Some use other types of filter systems. The Foveon sensor has red, green, and blue filters

12:58

over every single photosite. So instead of having either a red, or green, or blue filter over each photosite, every single photosite has all three, and some photographers say that this produces more natural looking images. Others say they are completely bonkers crazy, wah wah. I guess it really depends on your perspective and what you have experience with, because as far as I can tell, it's not that different. But then my visual acuity is

13:30

nowhere near the level of, say, a professional photographer's. The important thing to remember here is that the image sensor generates an electrical charge based off these photons, whether it's measuring the red, green, or blue, or just the amount of light in general, and the camera has to measure that electrical charge and convert that value into a digital value, or convert that electrical charge into a digital value with an analog to digital converter, and then you're

13:59

left with digital data for all the processing. So the software does all the work on the back end, so you've got the hard work on the front end, where the image sensor has to take all this light information, convert it into electrical charges, measure that, and convert that into digital information, and then the software takes that digital information and interprets it to create

14:23

the image that you're looking at. There are other things to take into consideration here, such as the shutter speed of the camera. The shutter is a device that cuts off light exposure to the sensor in the camera. The shutter speed determines the exposure time of the image that you're taking. So a short exposure is good if you're trying to take an image of something that is moving very quickly, but you need a lot of

14:48

light to do that. You're not leaving the shutter open very long for light to get to the image sensor. The same was true of film cameras. You need a very fast shutter speed, but you need a lot of light in order for the light to be able to register against the photosensitive material, whether

15:06

it was film or an image sensor. If you're using a very slow shutter speed, where the shutter is open for a longer amount of time, and this is all relative, by the way, we're talking about fractions of a second. If you're leaving the shutter open for a longer amount of time, that's great for low light activities, like if you want to take an image of something

15:26

that's in really low light. Using a longer shutter speed, where it's open longer, is a good idea, but any motion is going to insert a lot of blur into that image. So you want it to be very still if you're going to use a slower shutter speed. If you wanted to do something like high speed camera work, where you're going to show something back in super slow motion, that shutter is moving incredibly fast. You're talking

15:55

about taking thousands of images every single second. In order to do that, you have to have it very, very well lit. So if you've ever been on a set where they're shooting super high speed footage in order to show it back at slow speed, then you know what I'm talking about. The lights in those kind of situations tend to be out of control. They're really actually

16:21

very much in control. They're just extremely well lit. So the whole reason I wanted to talk about this bit is that if you move a camera around while you're taking images, whether it's a snapshot or video, you can end up with a jittery mess, and it's really difficult to hold completely still. If you've ever tried to just hold still, you know, even if you've got the hands of a surgeon, you're gonna notice a little bit of a jitter in there. Most of us have some of that whenever we operate

16:47

a camera. Some of it's more obvious than others, and a little bit people can tend to look past, but more than a little bit, it gets very distracting. So if you ever wanted to do something like move around while you're shooting video, you might have so much jitter that it's distracting, and you want to figure out a way to reduce that, and we found numerous ways to cut

17:07

that down. Some ways just involved locking down the camera, so you might put it on a tripod, and then the camera's motion is strictly limited and you can operate it without it having too much jitter. But then you have a pretty static shot. You could use cranes and dollies, which tend to have very smooth movement along certain directions, but you are limited in the ways you can move the camera in both of those. You can't go anywhere

17:35

an actor can go. For example, if you wanted to shoot a film, you could put the camera on a dolly that isn't on a track. So there's some track systems. Those are great. If you have uneven ground or rough terrain and you put tracks down, you can have a nice smooth camera operation, but again you're limited to just moving on the tracks, or you could do a wheeled dolly that can move around an area, but then you need an area that's pretty smooth and flat.

18:02

That really still limits what you can do with a camera. So one of the ways that we have come up with to improve camera operation and to be able to go in the same places where, say, performers can go is with Steadicams. Now, the Steadicam was invented in the mid nineteen seventies as a way for camera operators to move around while running a camera and get smooth footage. Whether it's with a documentary or with a film where you're following actual characters, the Steadicam can remove the jitter

18:33

created by walking or running. And when we move around in our own bodies, this is happening all the time. There's always this jitter as we're walking or running, but we don't really notice it because our brains are really good at smoothing all of that out to create a more steady experience in our consciousness. So it's only when you're really concentrating on it that you become aware of it. It's sort of similar to how the world does not spontaneously go away and then come back every time you

18:58

blink your eyes. It's only when you really think about blinking your eyes that you start to notice it. And now you're noticing it, aren't you? Sorry about that. My bad. Don't worry. You'll stop thinking about blinking your eyes in just a minute and then everything will be fine. So how does a Steadicam stabilize the image so that you can get the really cool effects you see in movies, like the amazing Copa shot in Goodfellas? So if you're not familiar with the sequence I'm talking about,

19:24

it's in the movie. Well, in the film, there's a sequence that's a little bit more than two and a half minutes long, and it's an uncut shot. It tracks two characters as they cross a New York street. They walk down some stairs into the kitchen of a busy restaurant. They walk through the kitchen, passing by dozens of extras as they're moving around in the kitchen as cooks or waiters, they emerge onto a dining floor, and they're seated right up front at a stage. And

19:56

this is all in a single uncut shot. And it was using a film camera, not a digital camera. This was in the early nineties, and there were other challenges there. For example, the way the film camera worked, it was actually pulling film from one side of the camera and it would go in through the camera, where you would expose the film to light coming through the camera. The exposed film would then go into a canister on the

20:21

other side of the camera. Now, what that meant was that as you shot film, the weight of the camera started to shift, because the side where the exposed film was coming out started getting heavier and heavier, so you had to compensate for that. It's actually pretty remarkable, if you watch that sequence, that it doesn't just slowly start tilting towards the right because the camera was

20:43

getting progressively more heavy on the right side. By the way, if you're curious, there's some great behind the scenes documentaries and interviews about the copa shot. It's one of those things that's talked about in film school. It took them eight takes and that was it, and they actually finished it in half a day of shooting, which if you've ever been on a film shoot for a two and a half minute sequence uncut, to get completed in eight

21:10

takes is pretty phenomenal. Also, it tells you the difference between someone, say, like Martin Scorsese and Stanley Kubrick, because if Kubrick were shooting that, he would have still been doing it years later. Anyway, the Steadicam was a big part of why this shot was even possible, because it meant following these actors through these different environments, including down stairs and through a crowded kitchen, and it was something that you just could not do on a track or

21:42

on a wheeled dolly. And it was all because of the Steadicam. Now, the Steadicam itself was invented, again, in the seventies, not in the nineties. It was invented by a guy named Garrett Brown, who was

21:52

a commercial director and a producer. He was not an engineer, and he was just trying to come up with a way to remove all this jitter that was coming around whenever you wanted to do a handheld shot and you wanted to follow an actor along a place where you couldn't have these other traditional camera setups, and he got a rough idea of it and called it the Brown's Stabilizer. In nine three, fortunately, a more refined version would become

22:19

the first Steadicam, and it consisted of three components. You had an articulated isoelastic arm that would attach to a sled. The sled is kind of a housing for the camera and for all the camera components. It looks like a big pole, and then the various components attach to either the top or the bottom of the pole, the camera at the top and then everything else closer towards the bottom. And then a vest that helped distribute the weight of the system and

22:49

provide more stability. So you had an isoelastic arm that on one end would attach to the vest and thus have that, you know, just weight distribution. On the other end, it attached to the sled, which held the camera and its various components, like a battery pack, a monitor, that kind of thing. Now, the arm of the original Steadicam was a lot like a swing arm lamp with a spring-loaded arm. If you ever look at

23:17

one of those (in fact, I have some right in front of me right now, because the microphones I use are on this style of arm), you have two bars that make up each segment of the arm, and they're in parallel with one another. These two bars would then end at metal blocks, and they would also join together with a pivot point, a pivot joint that will allow them to kind of bend like a human arm would, and the camera sled would fit onto the arm.

23:47

The sled would consist of that assembly that holds the camera, the battery, the motor, counterweights, and some systems. All of that's very important, and it's on what's called the sled pole, the central piece of the sled, and that fits into the arm of the Steadicam. Now, the weight of the sled and the camera is constantly pulling the arm that's attached to the vest downward, right? So imagine that you're holding a weight at arm's length. Let's say it's

24:13

a fifteen pound weight. You're constantly feeling gravity pulling down on that weight that's being held out at arm's length. The Steadicam arm is the same sort of thing. It's holding up the sled and the camera. Well, the way it holds it up and keeps it in the same relative position, so that the camera doesn't just continuously sink towards the floor, is through this spring-loaded system. It's

24:38

counteracting that downward force. The parallel metal bars in each arm have these spring systems that are creating a force that's directly opposite the downward force of the weight from the sled and the camera. In fact, it's precisely calibrated so it will stay that way. The only time you change the elevation of the camera is if the

25:00

camera operator wants to. The camera operator can raise or lower the camera, but it will then stay in that position relative to the camera operator, because the arm's spring-loaded system is counteracting that downward pull from gravity. It's really clever. It's also very difficult, obviously, to explain

25:21

this in an audio format. Fortunately, there are numerous articles, including one on HowStuffWorks, about how Steadicams work, as well as videos on various platforms that show how Steadicam systems work, to kind of help you get a visual reference to what I'm talking about. So if you still are having trouble imagining this, I highly recommend checking out an article or watching a video to get a

25:46

little deeper understanding. To get into that spring system in further detail would require a very deep discussion, and again, it's very difficult to explain it in an audio format. So let's just say that there's a system of pulleys and springs inside these arms that provide that counter force, and we'll move on to talk about the

26:05

sled and the sled pole. So the sled includes an element called a gimbal, and a gimbal is a mechanism that keeps something in a position of relative stability despite a moving environment, typically horizontally. And in fact, gimbals have been around for centuries. In the good old days, ships used compasses mounted on gimbals, and that allowed the compass to appear horizontal even as the ship was pitching or rolling

26:30

on rough seas. Typically gimbals consist of a couple of rings pivoted at right angles to achieve this effect, and so the system around the compass moves, but the compass itself appears to be steady relative to you. The camera sled also changes a camera's center of gravity. So the center of gravity, obviously, that's the point at which you can balance an object, and with cameras, typically their center of gravity doesn't make it very easy

27:03

to keep them nice and stable. But adding them to the sled, if you add mass to something, you change its center of gravity, and typically the handle, the control system for a Steadicam, is located very close to the actual center of gravity of the whole system. That allows you to have much more control over the camera and stabilize it. The sled also changes the moment of inertia for the camera system. That means that it becomes more resistant to rotation, and that's

27:38

really important for image stabilization. So the effect of a Steadicam is that you get this gliding shot that isn't bothered by all the shakes of walking or running. And there are tons of examples of Steadicam shots in cinema, the Goodfellas shot being a really famous one. Or there's also one in Kubrick's The Shining. There's a part where Danny is running away from his dad in a hedge maze in the snow, and we're following right behind him as he's making turns left and right, and, uh,

28:09

it's a Steadicam shot. It's pretty impressive. So that's another famous one. But there are tons of them in movies now. As we'll see in a bit, some cameras today incorporate equally ingenious ways to reduce unwanted motion in footage that do not require you to strap it to some sort of complicated apparatus that you have to then wear so that you can take some of the weight onto your own body. And we'll talk about them in just a second, but first let's take a quick break to thank our sponsor.

28:45

All right, so let's say you want to get smooth video, or you plan on capturing still images, but you want to use a lower shutter speed to capture in low light, and you're gonna be holding the camera by hand. You're gonna want to reduce that jitter, or else what you'll get back won't look nearly as good as what you want, what you saw in person, unless you're really going for that shaky cam look, in which case image stabilization isn't really your bag, baby. For a camera to correct for

29:10

that kind of motion, you need a few elements. The first, which you'll find in both of the major solutions to this problem would be sensors that can detect camera motion. For a camera to correct for a shaking motion. It first has to detect that shaking is happening in the first place, or else nothing can happen. So let's consider the type of sensors found in these solutions that can detect camera motions. So, for example, we'll talk about optical

29:37

image stabilization, or OIS, solutions. That's one of the two that we're going to chat about in this section: optical image stabilization and in-camera systems. Those are the two big ones. Both of those rely on sensors to detect when motion is happening. Otherwise there's no way to compensate for it, right? You have to know something's happening before you can fix it. Well, optical image stabilization uses the optical pathway, in other words, the lenses, to

30:06

correct for camera shaking. And from a very basic standpoint, the secret is that these lenses have a specific movable lens, or floating lens, inside of them, so an actual lens, singular, inside this lens casing will be movable with respect to the rest of the system inside of it. It can shift left, right, up, and down. Movable lenses are pretty cool, and I can't wait to talk more about them in a second, but again, in order to move the lens to where it needs to be, you have

30:38

to detect the shaking in the first place. It's a little complicated. So you've got these sensors, and with these solutions, that's typically two piezoelectric angular velocity sensors, which do

30:49

what gyroscopes do. And the reason you need two of them is because each sensor really only detects motion along one element of movement, like horizontal or vertical, So you need two of them at sort of ninety degrees from each other in order to detect vertical motion versus horizontal motion, and together they can detect the motion you would find

31:10

in your typical camera shaking movements. They're not going to be able to fix anything that's dramatic, so if you whip the camera left or right or up or down, it's not gonna be able to compensate for that. But for just the small motions that our hands make while we're holding a camera and trying to capture an image, they can often correct for that. So you gotta keep that in mind. It's really the jitters that these

31:38

things fix, not the big stuff. Also, optical image stabilizers cannot correct for rotation along the optical axis. In other words, let's say you're holding up a camera, and it's locked in on the way the image is showing up, like landscape, and you start rotating the camera so that you're doing either a clockwise or

32:05

counterclockwise motion, tilting the image. These sorts of sensors don't detect that kind of momentum, that rotational momentum, in that respect, so you could get tilt, but no jitter, in this case. So how do those sensors work? Well, we gotta break it down, and it's a little scary when you see something saying piezoelectric angular velocity sensors. What the heck does that mean? Well, if you break it down, it's not that tough. First, you've got piezoelectric. You've

32:35

probably heard of the piezo electric effect. This refers to the ability of certain materials to generate an alternating current voltage when those materials are subjected to mechanical stress or vibration. They also will do the opposite. They will vibrate if you subject them to an alternating current voltage, so it goes either way. Quartz crystals do this, and that's why they were used in watches and still are in some watches.

33:05

So it's a very predictable behavior. If you know the basics of that material, you can replicate it over and over and over again. It's always going to be the same. Next, we have the term angular velocity, so that refers to the change in rotational angle around an axis per unit of time, and we express this in degrees per second. That's how you measure angular velocity

33:30

in degrees per second. So a piezo electric angular velocity sensor is one that detects changes in rotational angles around an axis by generating an alternating current voltage in response to mechanical stress. And that mechanical stress is brought to us courtesy of the Coriolis effect, or the Coriolis force, I really should say, not the Coriolis effect, which is a very specific thing that refers to an enormous system

33:55

called planet Earth. But the Coriolis force, this is an inertial force that was first described by Gustave-Gaspard Coriolis, a French engineer of some renown, and as Encyclopedia Britannica puts it, quote, Coriolis showed that if the ordinary Newtonian laws of motion of bodies are to be used in a rotating frame of reference, an inertial force, acting to the right of the direction of body motion for counterclockwise rotation of the reference frame or to the left for

34:25

a clockwise rotation, must be included in the equations of motion. So essentially it's talking about detecting specific types of velocity, of changes in, well, not even changes, but just the direction and speed of motion. The important thing to remember is that any motion would cause these piezo electric sensors to deform slightly, so they can take different shapes. And these are very tiny, tiny sensors, but they can take different physical shapes, like a tuning fork shape is

34:57

not uncommon. And as you move these things, it deforms them. They change their shape slightly because of that inertia, and that mechanical stress thus generates an alternating current voltage. Because they are piezo electric materials, the sensor will detect that change in voltage or that generation of voltage. And it gets a little more technical from here, and it also gets really complicated. So kind of like the steadicam, we're

35:25

gonna take a bird's eye view of this. Essentially, the sensors consist of sensing arms and drive arms of this piezo electric material, and the motion sensed produces a potential difference, an electrical potential difference. It's this potential difference that indicates to the sensor the change in angular velocity, and the sensor's output is an electrical signal which can then be

35:50

processed by a microcomputer. The other big element to this optical image stabilization system is the movable lens, that floating lens inside the overall camera lens. So this is completely contained within a camera lens itself. If you've ever seen a camera where you've got a body that you can attach different types of lenses to, you attach the lens, and if you need a different lens, you pop the first one off, you put a new

36:16

one on. This system is completely contained within those individual lens cases. It's not in the actual camera body itself. So we're talking about a piece of glass, essentially, that's shaped a very specific way, that's in a movable frame inside this lens, and movable so that it can go up, down, left or right, but it still remains aligned front and back with the rest of the lens assembly. So it's all meant to direct light back to the image sensor in a proper way. And you typically have a lens

36:50

in the focusing group. That's the one that's closest to the end, the part that, you know, is facing out to the outward world. That's the focusing group lens. And then towards the back of the lens array you have some other lenses that are meant to direct the light properly. Between those two sets is where you put the image stabilization lens. Really, you don't do it, the lens manufacturer does. Don't open your lenses, that's crazy talk.

37:17

But the stabilization lens is in between these two other sets. So you've got the focusing lens at one end of the lens array. You've got the other groups in the very back that are directing it towards the image sensor. In between those, you have this floating image stabilizer lens, and it's not fixed in relation to the rest of the lens array the way the other

37:42

lenses are. So it's in this frame that uses electromagnets to move the frame with respect to the rest of the lens array, and it's this frame that reacts to the information from the motion sensors. So the motion sensors are picking up jitter, and based upon those electrical potential differences, the system is able to identify how much jitter is

38:07

coming up and down versus left and right. It sends that information to the microcomputer, or microcontroller rather, that controls the movement of the frame holding this image stabilizer lens. The image stabilizer lens then is moved into place so that the light coming in through the focusing lens can be redirected towards the lenses in the back of this lens array to thus go to the image

38:33

sensor in the very back of the camera. And the end effect is supposed to be as if there was not any jitter at all, as if there was no broken line between the light coming in through the focusing lens to the image sensor in the back. So, if you want to think about it another way, imagine that you've got a character, let's call him Aziz, and Aziz has a mirror, and Aziz is standing at the entrance of a cave and light is coming in, and you need to have some light directed

39:07

from the outside of the cave into the cave, so you can read some really funky hieroglyphics that someone wrote years ago that suggested maybe aliens came down ages ago. And you yell, Aziz, light! And Aziz has to manipulate the mirror in such a way so that light coming from outside is then reflected to go further into the cave and illuminate the cave wall. That's what this image stabilizer lens is doing, except instead of reflecting light, obviously, it's directing the light by making it

39:36

go through a specific part of the lens. And yes, that was a Fifth Element reference for those of you who are paying attention. And if you don't know what that is, you should watch The Fifth Element because it's amazing. It doesn't have a lot of steadicam shots in it,

39:50

but it's a great movie anyway. The cool thing about this particular approach, this optical image stabilization approach, is that it can turn any compatible camera into an image stabilized camera because all the technology is inside the lens itself. So if you have a lens that's got this image stabilization system in it, then you can attach that to

40:13

any compatible camera and you get that image stabilization. The downside is it really makes those lenses more expensive, and lenses are not cheap to start off with, so if you're building up a selection of lenses and you want them all to have image stabilization capabilities, you start really racking up the cost pretty quickly. But there is another route to go, and that is called the in camera stabilizer. This takes a different approach to stabilization, but it uses

40:41

a very similar philosophy. So instead of having a floating lens internally in that array that can move around and redirect light, it actually has the image sensor itself on a movable frame. So this would be as if you could move that cave wall so that it was in line with the light. As opposed to moving the mirror to redirect the light to the cave wall, you're actually moving the sensor itself. Otherwise it's behaving pretty much

41:10

the same way as the optical image stabilizer. These are also sometimes called mechanical image stabilizers, to differentiate the two, or in camera image stabilizers, because it's inside the camera, the body of the camera itself, not inside the lenses. So the big pro here is that if you've got that in the camera, it doesn't matter what kind of lens you use, because it's already got the image stabilization

41:31

in there. The downside is, if you're using lenses for something that's further away, you're using like zoom lenses, that kind of thing, the reduction in jitter is less effective than it would be if you were using the optical image stabilizer. So depending upon your use, one may be better than the other. Depending on your budget, the in

41:52

camera version may be better. And in fact, you can find this specific type of image stabilization, and lots of smartphones out there actually have this capability of moving the image sensor tiny tiny amounts to reduce jitter. Not all of them do this, by the way. Some of them use post process image stabilization, which we'll talk about in

42:11

a minute. So it's kind of a neat and elegant solution. If you have stabilized your camera, let's say that you've got one of these two systems in place in your camera system, and let's say that you've decided to really lock down your camera, let's say that you put it on a tripod, you actually probably want to turn off image stabilization at that point. If you don't turn it off, the motions of the stabilizer itself can end up being picked up by the system, which

42:42

then tries to compensate for that movement. Which puts me in mind of the time I was in college and tried tight rope walking for the first and only time in my life. So let me explain, because this analogy I think is very apt. When I tried tight rope walking, I stepped on the tight rope, and it was really something between a tight rope and a slack rope, and as I put weight on my leg, my leg began

43:03

to shake. Now that caused the rope to move around, and I tried to catch my balance but found myself overcompensating, so I would shift and then the rope would move the other way, and I'd try and shift again, and it was just getting worse and worse, and it caused my leg to shake more, and instead of steadying myself, I just found the lower half of my body was

43:22

going crazy and not in a fun way. So, after approximately fifteen seconds of trying this, I realized that tight ropes were probably just one of those things I wasn't meant to experience, and I stopped. Now, the image stabilizer system kind of behaves the way my legs did on that tight rope in these situations. So you've got your image stabilized camera on a tripod and your image stabilization

43:43

is turned on. A little motion within the system itself could be picked up as a camera shake, so then it tries to compensate, but because the camera is not actually shaking, it's on a steady tripod, it then starts to detect the solution for that shake as its own problem, and it becomes this feedback loop. And this can actually introduce motion blur and jitter in your photos and your video even though you've got your camera

44:07

mounted to a stable tripod. So you might want to turn off image stabilization in that process unless you plan on doing a panning shot. So panning shots can be tricky too. You want a nice even pan. You're either going left to right or right to left, and you

44:22

want to avoid introducing any jitter in the image vertically. Well, some image stabilizers have a panning mode so that way they compensate for vertical movements, but not horizontal movements, and those systems will remove some of that jitter while still allowing for a nice smooth panning motion. With both optical image stabilizers and mechanical image stabilizers, it took a lot of engineering to figure out precisely how to make the internal mechanisms respond in such a way to preserve the

44:47

integrity of an image while simultaneously removing jitter. And while the basics could be found in much older technologies that have been in use for centuries, like the gimbal, getting that precision and response time to a level that is useful in a dynamic use case such as taking video or photos required a whole lot of engineering. So I'm really impressed at this technology. But we still have one more variation to talk about because some systems don't use any moving parts

45:13

at all to create image stabilization. Instead, they try to beat the problem with software in a post process solution. So how does that work? Well, we'll find out, but first let's take another quick break to thank our sponsor. Some digital cameras, particularly some smartphones, have what is called virtual image stabilization or electronic image stabilization, or sometimes just post process image stabilization, and these systems don't use moving parts to adjust sensors or lenses in order to keep

45:50

the image nice and steady. Instead, they use software. So what's going on here? Well, from a high level perspective, a program attempts to reverse any shake found in an image algorithmically, and some do this in a pretty basic way. Imagine you have a video open on YouTube. So let's just say you've got a regular video. It's not full screen, it's just the video in YouTube's desktop application. Now, imagine that the actual video footage extends beyond the borders of

46:23

that video frame. I mean, even if you were looking at it in full screen mode. Imagine that the video itself would extend a little bit beyond every single border, just a touch. So what you're seeing is not the full frame of video. It's a section, a cropped section of that video. The edges are cut off because those edges provide some cheat room, a little bit of buffer for the purposes of image stabilization. Now, this approach relies

46:49

on some assumptions. The program does not necessarily know which elements of a video you're really interested in, so it has to make some guesses. Let's say that you're taking a video of a kid running, and the kid's running out through the snow in a hedge maze, and you're running after the kid. Maybe you're shouting out encouraging words about this nifty hotel you've been hired to look after, and your video footage would be really shaky

47:12

because you're holding the camera as you're running. You don't have a steadicam, so every single running step you're taking is jittering the camera. But feeding the footage through an algorithm can smooth things out a bit, and the algorithm recognizes that the kid you're chasing is the interesting thing in the frame. It's doing this through some processing and figuring out which pixels are changing the most and which ones are staying more or less the same,

47:35

and kind of drawing some conclusions based on that. So while the camera shakes around, the algorithm repositions the frame of view for the audience, so the kid remains more or less in the same general area relative to the screen. So you can think of it as like a picture in picture sort of thing, and the picture in picture, that frame, is moving around with relation to the

47:57

rest of the big picture view. But if you just stare at whatever is happening inside that picture within picture, it looks nice and steady, or at least compared to the overall image. This is not a perfect system, obviously. It can sometimes be very off putting, but it is a common one that's in use in post process image stabilization. The way this is done practically has changed over the years, like the way that people have actually designed the algorithms so that they could make

48:31

this happen. That has changed. So a very early version of this would have the camera, or essentially the image processing software, identify a point in the background that was clearly defined and lit. So let's say that you've got an image of a person standing in a field, and there's a wooded forest, as opposed to the unwooded forest, in the background, and there's a particular tree that's

49:00

in nice sharp relief. Well, the image processing software might focus on that tree and say, all right, we're going to lock onto this section of the tree, and we want that section of the tree to be in this relative position in our frame of view for the duration of this video. So you're hand holding the camera, pointing it at somebody who's standing in the field chatting, and the image processing is trying to keep that one part of the tree in that one part of the frame.

49:31

It stabilizes the image even if there's a little bit of jitter as you're holding it, pointing it at the person who's talking in the video. That works okay. It's a little primitive. You could actually do a fairly nice static shot that way, but you can't move the camera at all, because if you start moving the camera purposefully, I mean, beyond just the little jitters, then that frame of reference is going to move dramatically and the software can't handle that. It has to, you know,

50:02

keep an eye on a relatively stable shot. So this is for static images, static video. A slightly more advanced version of that same approach would pick two reference points, so one on either side of the frame, and then it would essentially draw an imaginary line between those two points.
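
The two reference point idea can be sketched in a few lines of code. This is only a rough illustration under some loud assumptions: the positions of the two points in each frame are already known (finding them is the hard part and is not shown), the camera is not zooming, and the function name correction_from_reference_points is made up for this example rather than taken from any real camera or library.

```python
import math


def correction_from_reference_points(ref_a, ref_b, cur_a, cur_b):
    """Return (dx, dy, angle_deg) that lines the current frame up with the reference.

    ref_a, ref_b: (x, y) pixel positions of the two points in the first frame.
    cur_a, cur_b: positions of the same two points in the current frame.
    Hypothetical helper: rotate the current frame by angle_deg about the
    midpoint of its two points, then shift it by (dx, dy).
    """
    # Angle of the imaginary line between the two points, in each frame.
    ref_angle = math.atan2(ref_b[1] - ref_a[1], ref_b[0] - ref_a[0])
    cur_angle = math.atan2(cur_b[1] - cur_a[1], cur_b[0] - cur_a[0])
    # Rotation needed to bring the line back to its original alignment,
    # wrapped into the range [-180, 180) degrees.
    angle_deg = (math.degrees(ref_angle - cur_angle) + 180.0) % 360.0 - 180.0

    # Shift that moves the midpoint of the current line back onto the
    # midpoint of the reference line.
    ref_mid = ((ref_a[0] + ref_b[0]) / 2, (ref_a[1] + ref_b[1]) / 2)
    cur_mid = ((cur_a[0] + cur_b[0]) / 2, (cur_a[1] + cur_b[1]) / 2)
    dx = ref_mid[0] - cur_mid[0]
    dy = ref_mid[1] - cur_mid[1]
    return dx, dy, angle_deg


# Pure jitter: both points moved 3 pixels right and 2 pixels up (y grows downward).
print(correction_from_reference_points((100, 200), (500, 200), (103, 198), (503, 198)))
# (-3.0, 2.0, 0.0)
```

With a single reference point only the shift can be recovered. The second point is what makes the rotation measurable, because the tilt of the line between the two points changes when the camera rolls.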

50:18

So let's say it's identified a tree on one side and a bush on the other side, and said, all right, based upon this setup, we want these two points to remain in the same general spots in your frame of view, and the line that's between them will always be at the same alignment no matter what. This would allow you to actually have some rotation of the camera as well, and the processor could account for that and correct

50:49

for it. So, let's say you're holding a smartphone and you're trying to take video of someone, and you're using this particular version of image stabilization. If the right side of your smartphone were to dip down a little bit and the left side lifts up a little bit, you would have a little bit of a tilt to the video before you corrected it. Because of this particular form of post process image stabilization, it would detect that change relative to that imaginary line and correct for it. Uh,

51:19

depending on how dramatic your turn was, you might actually notice this while looking at the playback, because unless you crop further into the image, you might start seeing the edges of where the cut off is for the video popping up, and it can be a little weird. You may have even seen this in some videos online where you start seeing an edge kind of creep into the video a little bit. That's due to this post process image stabilization, whether it was for jitter or for rotation,

51:48

and it can be very off putting. That's why most people who use this, who are actual video editors, will digitally punch in a little bit. They'll crop the image so that they cut off those edges so that you can't see that when it happens. Of course, the problem with that is that you have to be at a high enough resolution where, when you digitally punch in, it's not a noticeable decrease in quality of the

52:12

actual image itself. For moving shots, virtual image stabilization might look at individual pixels and track how they change over time, and it interprets that as motion. So if these pixels are changing in ways where you see one pixel changing and the pixel next to it has become the same as the pixel that was to its left, and then two pixels down it becomes what was two pixels to

52:38

the left, etcetera, etcetera. It can then start to interpret this as motion and starts to work out where things are moving. And then, in editing, you can again crop your video to remove those jolting edges and punch in a bit. Some folks from Google and Georgia Tech actually developed a new method of virtual image stabilization a few years ago. So I guess it's not new now, but it was a couple of years ago, and they published their work in a paper titled Auto-Directed Video Stabilization

53:05

with Robust L1 Optimal Camera Paths. The paper also gets super technical. It is out there free for you to read, so feel free to seek it out if you are technically minded. Here's the bit that I think is really important. They lay out that this post process video stabilization requires three steps, and the first is to estimate the original camera path, which is the one that's all shaky and stuff. It's the one you want to fix.
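
That first step can be sketched very simply. This is a minimal illustration, not the method from the paper: it assumes the frame to frame shifts have already been measured somehow, and it treats camera motion as plain sliding left, right, up and down, with no rotation or zoom. The function name estimate_camera_path is made up for this example.

```python
def estimate_camera_path(frame_shifts):
    """Turn per-frame (dx, dy) shifts into a camera path, one position per frame.

    frame_shifts: list of (dx, dy) tuples, the measured shift between each
    pair of consecutive frames (assumed to be given; measuring them is not shown).
    """
    path = [(0.0, 0.0)]  # the first frame is the starting point
    for dx, dy in frame_shifts:
        x, y = path[-1]
        # Each new position is the previous position plus the latest shift.
        path.append((x + dx, y + dy))
    return path


shaky_shifts = [(2, -1), (-3, 2), (1, 1)]
print(estimate_camera_path(shaky_shifts))
# [(0.0, 0.0), (2.0, -1.0), (-1.0, 1.0), (0.0, 2.0)]
```

The zigzag in that list of positions is the shake. A real system would track much richer motion than a simple shift, but the idea of adding up small frame to frame movements into one path is the same.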

53:32

The second step is to estimate a new smooth camera path, and the third step is to synthesize a stabilized video by following the estimated smooth path. This is not easy to do, and there are lots of different ways to try and do it, and we're always seeing developments in

53:50

this space. However, that being said, even as this post process stabilization technique advances and evolves, I think most filmmakers would argue that the optical image stabilization and the in camera stabilization systems are far superior, and that you can get some image stabilization that's all right for basic use in this post process approach, especially if you're going to just do something like share something online, like it's an online video or, you know, something on Facebook or Twitter,

54:22

it's not that big a deal to have this post process image stabilization then. But if you want something that is more of a professional level, hands down, the argument I have seen is that you should go optical image stabilization if you can, in camera stabilization if optical is too cost prohibitive. But either of them is far more reliable and effective than post process image stabilization, and they are not going to create the weird artifacts

54:53

that you might see using a software based solution. So it turns out that in this case, the mechanical one, the electronic mechanical one, might be better than the software one. That may not always be the case. We may eventually get to a point where there are advanced enough algorithms and advanced enough cameras where it becomes a non factor,

55:15

but we're not there yet. So generally speaking, image stabilization is all about trying to take out the frailties of being human, trying to remove that little element of humanity where we get that imperfection that creeps into our art. And for some of us, that means that we get an image that we actually really wanted to convey to our audience. For other people, they may say, well, that kind of removes some of the humanity from the art itself,

55:44

and I'm not here to make either argument. I think that there are some amazing uses of image stabilization that create really compelling effects and are great for storytelling, and I think there are elements where you don't want that, where you want something more naturalistic and shaky, to convey a specific kind of mood or tone. And really it comes down to the intent of what you

56:09

are doing and the theme that you're going for. I don't think that either is necessarily inferior or superior to the other, and I enjoy plenty of work that incorporates both types of elements, hopefully purposefully. Sometimes you get these effects just by happenstance because people just didn't know any better, and that can still be effective, but it's a little less special to me than when people go into it knowing what they're doing. Thank you so much

56:41

for joining me on this episode. It was a lot of fun to look into a more technical aspect. I've got another one coming up soon. That's also extremely technical, and also was a suggestion that was left in the twitch dot tv slash tech Stuff chat room. Remember, on Wednesdays and Fridays, I do live stream my recordings of tech Stuff, so just go to twitch dot tv slash tech Stuff. You can see the schedule there. Also, if you have any suggestions for future episodes of tech Stuff, you can

57:08

send me a message. My email address is tech Stuff at how stuff works dot com. People ask me for it all the time. I say it in every single episode, tech Stuff at how stuff works dot com. Or you can drop me a line on Facebook or Twitter. The handle at both of those is tech Stuff H S W. That's it for me. I'll talk to you again really soon. For more on this and thousands of other topics, visit how stuff works dot com.

Transcript source: Provided by creator in RSS feed