15: Depth Perception

MIT OpenCourseWare · Intermediate · 3mo ago

Key Takeaways

This video lecture covers depth perception, including shape from shading, Lambertian reflectance, and Emmert's law, with a focus on how the brain infers 3D structure from 2D images. The lecture discusses various cues to depth, such as shadows, texture gradients, and linear perspective, and how the visual system uses these cues to infer depth and distance.

Full Transcript

Let's talk about depth perception. The problem we're going to talk about today is that the input to the visual system consists of two-dimensional images. The world is three-dimensional, and people are pretty good at estimating its three-dimensional structure. You can hold a shape in your hand and visually assess it. You can tell how far away things are from you, and so on. So the extraction of three-dimensional structure from the two-dimensional images that the eyes receive is important for lots of things: for navigating through space, for interacting with objects, and so forth. And the most reliable source of information about the three-dimensional structure of the world is visual, at least for humans. Some other animals use sonar and other things like that.
So, the keys to the solution. One key is the fact that you have two eyes, and we're not going to talk about that in this lecture. That'll be for next time. The other key is the set of assumptions that we make about the world, which we have presumably learned or internalized over evolution or development, coupled with implicit knowledge of the physics and the geometry of the light that comes from objects to the eyes.
And so we're going to talk a lot about depth cues. We've talked about cues previously in the class. When we talk about a cue, we mean a source of information: some aspect of the stimulus that provides information, in this case about a variable that we care about, like depth. There are lots of visual cues to 3D shape and depth. Stereopsis is the one that is best known, but you can see three dimensions without it. If you just close one eye, the world doesn't really look that different, right? You can still tell that certain things are further away than others, and you can see the three-dimensional shape of objects pretty well.
So this is a partial list of cues to depth. We'll talk about a whole bunch of these: shape from shading and drop shadows, linear perspective, texture gradients, familiar size, height in field, aerial perspective. Last time we talked about the kinetic depth effect. There's also motion parallax. Those are depth cues from motion. Then there's occlusion and focal blur. And this is not an exhaustive list.
So the first thing I want to talk about today is shape from shading. We've alluded to shape from shading previously in the class. It refers to the fact that variations in luminance are often interpreted as shape, and the basis of shape from shading is that many surfaces are approximately Lambertian. We talked about Lambertian reflectance in the context of lightness perception. A surface is Lambertian if the amount of reflected light is determined by the angle between the incident light ray and the surface normal vector, but the surface otherwise scatters light equally in all directions. And this is a picture that depicts this. The black arrow is the surface normal. This is a horizontal surface, so the normal is perpendicular to it. The yellow arrow is the direction of the illumination. And then the little blue arrows are scattered light rays. What you're supposed to take away from this is that irrespective of the direction of the illumination, the light gets scattered equally in all directions. But you can see that the most light reflects off of the surface when the illumination is perpendicular to the surface, and as the illumination becomes more oblique, less light is scattered. So this is the characteristic of a Lambertian surface. And the basis of shape from shading is kind of the opposite of what is shown in this diagram.
So this diagram is showing a situation where the surface orientation is fixed and the illumination direction is varied. That doesn't happen very often in the world. Typically in the world there'll be one light source, and so the direction of illumination will be fixed in some particular direction, but the orientation of the surface will vary, because an object is curved, for example. And so for a fixed lighting direction, the amount of reflected light will vary with the surface orientation. It follows that if you knew the direction of illumination, and if you also knew that the intensity changes in the image were only due to shading, then you could infer shape.
And there are a bunch of classic demonstrations that people do seem to do this under many conditions. This is a classic stimulus, an illusion if you will, where the physical stimulus is these circles, and inside each circle there's a luminance gradient. It just goes from light to dark or from dark to light. And when you look at this, what most people see is a set of bumps and a set of craters. So does anybody want to hazard a guess as to why some of the circles look like bumps and others like craters? >> Because you expect that the light is coming from above. So it looks like it would be illuminated from above. >> Yeah, that's exactly right. The visual system seems to have a prior that favors illumination from above. By default, we very commonly assume that illumination comes from above. And so if the illumination were coming from below, then the inferred shape would be the reverse of what it is, right? The things that are bumps would be craters and vice versa.
Shape from shading is important in art. Bas-relief is a style of sculpture that relies exclusively on shape from shading.
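As an aside, the Lambertian model and the bump-versus-crater ambiguity described above can be sketched numerically. This is only a minimal illustration, not something shown in the lecture: it assumes a one-dimensional vertical cross-section through a surface viewed head-on, a distant light source, no cast shadows beyond clamping at zero, and hypothetical names (`lambertian_intensity`, `shade_profile`). It shows that a dent lit from above produces the same shading as a bump lit from below, which is why a prior on lighting direction is needed to pick one interpretation.

```python
import math

def lambertian_intensity(normal, light, albedo=1.0):
    """Lambert's cosine law: reflected intensity is proportional to the
    cosine of the angle between the unit normal and the unit light direction."""
    dot = sum(n * l for n, l in zip(normal, light))
    return albedo * max(0.0, dot)  # surfaces facing away from the light get no light

def shade_profile(heights, light, dy):
    """Shade a vertical cross-section z = h(y) of a surface viewed head-on.
    Vectors are (y, z) pairs: y points up in the image, z toward the viewer."""
    shades = []
    for i in range(1, len(heights) - 1):
        slope = (heights[i + 1] - heights[i - 1]) / (2 * dy)  # dh/dy
        length = math.hypot(slope, 1.0)
        normal = (-slope / length, 1.0 / length)  # unit surface normal
        shades.append(lambertian_intensity(normal, light))
    return shades

DY = 0.1
ys = [-1 + DY * i for i in range(21)]   # bottom of the circle to the top
bump = [1 - y * y for y in ys]           # surface bulging toward the viewer
dent = [-h for h in bump]                # same profile pushed into the surface

elevation = math.radians(45)             # assumed light elevation
light_above = (math.sin(elevation), math.cos(elevation))
light_below = (-math.sin(elevation), math.cos(elevation))

bump_from_above = shade_profile(bump, light_above, DY)
dent_from_above = shade_profile(dent, light_above, DY)
bump_from_below = shade_profile(bump, light_below, DY)

# A bump lit from above is bright at the top; a dent lit from above is bright
# at the bottom, and is indistinguishable from a bump lit from below.
print("bump, light above:", [round(s, 2) for s in bump_from_above])
print("dent, light above:", [round(s, 2) for s in dent_from_above])
print("bump, light below:", [round(s, 2) for s in bump_from_below])
```

The image alone cannot separate the last two cases; only the assumed lighting direction does.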
So there's some material here and somebody carves some 3D structure into it, but then all of the luminance variation comes purely from shape. It's illuminated from a particular direction, and then you're able to see the 3D structure via shape from shading. You can also see evidence of shape from shading in lots of real-world photographs. This is a photograph of a crater, either on the moon or Mars, I forget which. But it doesn't really look like a crater here, right? The prior favoring light from above is again at work. So this is the crater. It's just flipped upside down here. There are lots of examples of photographs where, if they're taken from an orientation where the illumination is coming from a direction that you don't expect, they look funny. This is one that I just saw the other day that's kind of striking. This is a picture of sand dunes. But it really doesn't look like sand dunes, right? It looks like a whole bunch of little craters on the surface. Here's the other orientation, and you can see the correct 3D structure.
This one is a pretty awesome demonstration of the same thing. This is a physical thing that was actually engineered. There are some 3D shapes that are embossed into this thing, and then it's lit from a particular direction. But it's on this table that can rotate. And so when you look at it now, it kind of looks like this is sticking out, right? And this is sticking out. But whoa, now it flips. Okay, it's going to get even cooler. Now they're going to pour some liquid so you can verify which parts are actually the valleys. Watch closely. Yeah. So you know that the places where the liquid is present are indented in the surface, right?
But your priors on the direction of the illumination are causing you to misperceive the shape from shading, right? So again, a great example of how perception is kind of encapsulated and sometimes cognitively impenetrable.
Okay, so that's shape from shading. Again, the key principles of shape from shading: it works for Lambertian surfaces. The most straightforward way to infer shape from shading is to know or assume a direction of illumination. Sometimes we assume the wrong direction of illumination, and so you get the shape wrong. And critically, you also have to assume that the intensity variation is due to shape rather than due to, say, paint. This is actually something you could print out on a piece of paper, and so the intensity variation here would be due to paint, and so you misperceive things. Now, in the real world, things are actually much more complicated, because shapes often do have reflectance variation on them, and people are actually pretty good at this. If you take an object that has reflectance variation, you're pretty good at correctly estimating the shape. So your visual system has this amazing ability to separately infer the contribution due to shading and the contribution due to reflectance. That's a little bit beyond the scope of what we're talking about here and not terribly well understood, but remarkable.
Another very powerful cue to depth is shadows. This is just a display that has a few shapes, and you can see that some are in front of the others. You can tell that this occludes this and this occludes this. So there are these occlusion cues present here. But then when we add shadows to the display, which are typically called drop shadows, you now actually see the objects as being separated in depth, right?
So shadows are a very powerful cue to depth. Typically in the real world they are caused by one object occluding the illumination, so blocking the illumination, and so the region of a shadow is typically dark in the image. And the visual system seems to have internalized that regularity, because when you artificially make the shadows lighter rather than darker, and you can do this in graphics programs, then you don't get the sense of depth. And what does this look like to people, actually? >> What's that? >> It looks like >> Yeah. Like spray paint, kind of, right? To me it looks like somebody took a can of spray paint and sprayed it. So again, it really doesn't have the same effect as a shadow. It doesn't look like one. And if you just invert the whole image, then it also doesn't really produce a sense of depth. So shadows have to be dark. In computer graphics programs you can play around with both shape from shading and drop shadows.
And these things, again, are happening all the time in the world, and you typically just don't even notice them, right? This is just a picture of eggs. In this case, it's pretty obvious that the eggs are sitting on a table, because you get these shadows that suggest that the eggs are sitting on the table and blocking the light. If you remove the shadows, the eggs kind of look like they're floating in space in front of the table, right?
But if you get rid of the shading, so you just color all of the eggs white, now you lose the 3D shape, right? Normally you look at these eggs and they all just look white. So it's like your visual system is separating out the contribution of the shading from that of the reflectance. And remember how, when we looked at the Retinex algorithm and ran it on an actual image, Retinex would make the mistake of interpreting some of the shading that's due to shape as being due to illumination, right? Because Retinex assumes that all of the gradual variation in luminance is due to illumination, which is not a correct model of the world. There's lots of luminance variation due to shading.
So the visual system is really sensitive to the relationships between the positions of shadows and the depth relationships of objects to the surfaces on which they sit. The drop shadows have to align correctly for an object to sit on a surface. You remember in the very first class, we showed this demonstration of a ball that rolls across the screen, and depending on the trajectory of the shadow, you see these very different 3D trajectories. And there are lots of these really cool images, I kind of collect them, where just by accident the shadow is in the wrong place, and then it causes things to look like they're levitating. This is a situation where I guess there were a lot of flagpoles on this particular beach, and there's one that just happens to be casting a shadow. The person's standing on some platform on the sand, and the alignment of the shadow and the surface is good enough that the person looks like they're floating.
Here's a situation where there's, I guess, just some dirt on the ground that happens to be in approximately the right place and approximately the right color to plausibly be a shadow, and so it causes the trash can to float. This is a cat with supernatural powers. Again, same kind of thing, just a dark patch in the right place. You can also get some pretty funny instances of this involving water, because the shadows will be cast on the bottom of the ocean in this particular case, and that can cause the boat to seem like it's floating. This is another one that I came across where the particular angle of the sun causes the shadow to be in a place where the train looks like it's floating. So you start looking for these things and you see them all over the place.
So shadows are caused by the geometric relationship between an object and a light source. And so when you're using the shadow to infer depth, you're implicitly taking advantage of that relationship, right? The visual system has internalized something about the relationship between light sources, objects, and shadows. And so you might imagine that what happens is that you're actually inferring an entire scene layout in your head, where there's a light source in a particular place and shadows that are cast relative to objects. And that doesn't actually seem to be what is happening. This is an interesting demonstration where you get pretty good depth from the shadows in the top two circles of circles. But logically, that display doesn't really make sense, right? Because you can tell from the direction of the shadows relative to the objects that in this case the light is coming from the upper left, right?
In this case, the light must be coming from the bottom left for the shadows to be where they are, right? And so you couldn't really generate an image like this without some really contrived setup with two sources of light that were each only cast on part of the scene. But it doesn't seem to bother you, right? On the other hand, down here, where the positioning of the shadows has been randomized locally, that really seems to kill the sense of depth for the most part. So there's some degree of local consistency that is really required for this to happen, but it doesn't seem like you're modeling the global scene. And we've seen some other hints of this so far in the class. Remember when we were talking about lightness illusions, we saw evidence that people weren't really modeling a full global version of the scene. There's more local, mid-level stuff going on. Any questions about shadows?
So, another regularity that influences depth is the geometrical relationship between where objects are in the world and where they project on the retina. Very often we're standing on a ground plane, and things that are further away from us tend to be on the ground plane. And so things that are further away tend to be higher in the visual field. As things get further away, they also tend to project smaller images. And so you can get these effects like this: the one on the left gives a pretty compelling sense that the surface is receding in depth, because the objects get smaller as they go up in the visual field, right? That's more or less what would happen in a typical situation in the world. The one on the right has a much weaker effect, right?
Because it violates that geometric regularity that is typical of what you would observe in the world. So again, the common theme here is that the visual system has internalized the regularities of the world and uses those to solve this ill-posed problem of depth perception. And this is kind of an example of a texture gradient, in the sense that you have all these repeated elements and the texture changes in some systematic way that's consistent with something receding in depth. And those are actually super common. This is a case where an artist decided to trick the visual system. This is a funny pattern that got painted on a perfectly flat floor. But instead of interpreting this as what it is, which is a flat floor that's got a weird warped texture on it, you infer a situation where the shape of the floor is curved and the texture is uniform and just pasted onto the curved surface. Of course, the inference is ill-posed, right? And so you've got to have priors to constrain the inference. In this case, the prior is that the texture is uniform, because textures in the world tend to be uniform rather than having this weird kind of warped shape. Here's another pretty cool example of the same kind of thing. All right. So, texture gradients: another really powerful cue to depth.
Another very important thing that is related to depth perception is what's called Emmert's law. Emmert's law describes the relationship between the size that we perceive things to be, the distance that we perceive them to be away from us, and the visual angle that they subtend. And so the idea is shown here, right?
Which is that you can have two objects of different sizes, and if the big one is further away, it will subtend the same visual angle as a small one that's closer. So there's this relationship between visual angle, inferred distance, and inferred size, because in the actual world there would be a relationship between visual angle, distance, and size. And so there are lots of these instances where either you know how big something is because it's a familiar object, and you use that to infer how far away it is, or you know how far away something is and you use that to infer the size.
All right. So, we're going to do an experiment to verify this. This experiment is going to involve staring at one of the lights in the ceiling. You're going to do this for, I think, about 30 seconds. So get kind of close to a light and look up at it. The purpose of doing this is that we're going to burn a temporary afterimage into your eyes. And the reason we're doing that is that when you generate an afterimage in the eye, it covers some fixed portion of your retina, right? So it corresponds to a particular visual angle that would normally project onto the retina. So the idea is that after you stare at this thing, you're going to get this afterimage, and it's going to have a fixed extent on your retina. All right, hopefully you got a pretty good afterimage. So now I want you to look at your hand. That's going to be some particular size. Now take your hand down and look at the wall. And the thing should look a lot bigger. You might have to keep blinking to get the afterimage to restore itself. All right. So, what we just did is we generated a stimulus of a fixed extent on the retina.
So now I'm looking at it on the back wall and it's enormous, right? So you can cause it to look different sizes by looking at things at different distances. It seems to be the case that the visual system assumes that the retinal stimulation is due to something at the surface that you're looking at, right? And so if you look at something that's close, you assume that the thing is small. If you look at something that's further away, you infer that it's large. Yeah? >> Why does blinking cause that? >> Oh, it's just refreshing the adaptation. You're temporarily bleaching the photoreceptors, and I think probably what's happening is that there may be some downstream adaptation that you then reset with the blinking. So you're just making the afterimage visible again. I don't know the exact mechanism, actually.
Okay, so Emmert's law: the relationship between how big things look and how far away they seem. That's one example of this, and it's a pretty good party trick. And you see Emmert's law in lots of different settings. Here, this is a situation where the big balls look a lot closer than the small balls, right? This is a case where the visual system seems content to assume that the objects are all the same size, and thus the changes in size are attributed to differences in distance. Here's a situation where you know the relative size of these body parts, right? You know that hands are of a certain size relative to somebody's face. And so the fact that the hand is bigger in the one on the left leads you to conclude that it's a lot closer. So again, the hand looks closer, and this all happens implicitly. And again, if you look around, you can find funny instances where this plays some tricks on perception.
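The size-distance relationship behind the afterimage demonstration can be worked through numerically. The sketch below is an illustration under assumed numbers rather than anything measured in the lecture: the afterimage angle (2 degrees) and the viewing distances (0.4 m to the hand, 6 m to the wall) are made up, and the function names are hypothetical. It uses the standard geometry relating physical size, distance, and visual angle.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given physical
    size viewed frontally at a given distance (same length units)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

def implied_size(angle_deg, distance):
    """Physical size that would produce the given visual angle if the
    object were located at the given distance."""
    return 2 * distance * math.tan(math.radians(angle_deg) / 2)

# Assumed values for illustration only.
afterimage_angle = 2.0          # fixed extent on the retina, in degrees
hand_distance = 0.4             # metres
wall_distance = 6.0             # metres

size_on_hand = implied_size(afterimage_angle, hand_distance)
size_on_wall = implied_size(afterimage_angle, wall_distance)

print(f"implied size on the hand: {size_on_hand * 100:.1f} cm")
print(f"implied size on the wall: {size_on_wall * 100:.1f} cm")
print(f"ratio: {size_on_wall / size_on_hand:.1f}")
```

For a fixed visual angle the implied size scales linearly with the assumed distance, so the afterimage attributed to the far wall is larger by exactly the ratio of the two distances.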
This is an interesting photo that I came across where the girl in the middle kind of looks like she's floating. I think the reality is that she's a lot taller than her brothers, right? But your visual system seems to assume that these things are all probably the same size, and thus that the girl has to be closer than she actually is, which means she's floating in the air. Something like that. Here we've got some really big pigeons. This is a funny photo. Can anybody tell what's going on here? >> Yeah. >> Yeah. But you kind of don't notice the ledge at first, right? So it looks like they're right next to the car. If you assume that the cars and the birds are at the same distance, then they're about the same size on the retina, which means they're about the same size physically. So you've got giant pigeons.
All right, Emmert's law. Any questions about Emmert's law? In some cases, the size is unambiguous, and thus that determines the apparent distance. In some cases, the distance is less ambiguous, and that will determine the size. Yeah? >> Why do we perceive this as large? >> Yeah, I think it's a good question. I don't know the answer to that, and I'm not sure that it's completely known. You might imagine, well, most of us have a lot of experience with toy cars, right? So you might think that it could be the opposite. It's interesting that you anchor on the car rather than the pigeon.
Okay. Another cue to depth is aerial perspective. In general, things that are further away are more blurry, because as light travels through the air, it gets scattered. And things also tend to look a little bit more bluish.
And so you can often see this in photographs. This is a photograph of a beautiful landscape, and the stuff that's further away is lower contrast, blurrier, and a little bit bluer. So again, it's a regularity of the way that optics works that we seem to have learned.
Another really important depth cue is linear perspective. Linear perspective is due to the fact that lines that are parallel in the world will generally converge in the image, unless they lie in a plane that's parallel to the image plane. If you have lines that are in a plane that's parallel to the image plane, those will remain parallel in the image. But if they're not in such a plane, they will converge. And you can see in this particular case that this line and this line are a good example of that. Or that and that. So the question is, how do we know that lines are parallel in the world? In principle, these lines might not be parallel in the world, right? Does anybody have an idea why lines are parallel in the world? What would cause things to be parallel? >> Humans. >> Humans. Yeah, I think that's one possibility. Maybe also gravity. But yeah, people like to construct rectangles, so there are a lot of things in human-constructed environments that tend to be parallel.
And so this is a classic illusion where you get depth from both linear perspective and from texture. Again, we've got linear perspective: these lines are converging, which makes it look like that's a lot further away than this. There's also a texture gradient. And what's cool about this is that it looks like there's a really big monster chasing a smaller monster, right? But in fact, the two monsters are exactly the same size in the image. So this is again Emmert's law, right?
So you've got the same visual angle being subtended by this monster and that monster. The depth cues indicate that this one is further away than this one. So Emmert's law tells you that this one has got to be proportionally bigger than this one. Here's another cool example. We've got two cigarettes, and they're positioned on a drawing that is making use of linear perspective to cause this to look like it's closer to you than this. The two cigarettes are physically the same size, and they're lying on a table, so they're the same size on the retina. But they look very different, right?
This is a thing that I just saw that's kind of interesting. Check this out and see if you can figure out what's going on. This is an artist who's drawing this thing. It's going to look pretty 3D. Now he's going to jump. Notice that the artist has changed in size, right? So it looks like this person is changing in size. Anybody want to try to explain why the person looks small here, and bigger here, and very big there? >> Yeah. >> Yeah. So in reality, this person is jumping horizontally by a pretty big distance, right? So the distance from the camera is changing, but you are misperceiving this person as jumping up and down vertically. And if you assume that the distance is remaining constant, then the person's got to be changing in size. So in this case, the depth cues are so powerful that they overcome the presumably strong prior you have that people stay the same size.
All right. Another point I want you to take away from this is that we see in 3D. What you perceive is your inference of the three-dimensional structure of the world. Now, that is derived from these two-dimensional retinal inputs.
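The projective geometry behind linear perspective and height in field, as described earlier, can be illustrated with an idealized pinhole camera. This is a minimal sketch under stated assumptions, not a model from the lecture: the focal length, eye height (1.5 m), rail spacing, and depths are arbitrary illustrative values, and `project` is a hypothetical helper. It shows two parallel rails on the ground converging in the image and rising toward the horizon with distance, while lines in a plane parallel to the image plane stay parallel.

```python
def project(point, focal_length=1.0):
    """Idealized pinhole projection of a 3D point (x, y, z) onto the image
    plane. x is rightward, y is upward, z is depth in front of the camera."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)

EYE_HEIGHT = 1.5                      # assumed camera height above the ground
depths = [2.0, 5.0, 20.0, 200.0]      # assumed distances along the ground

# Two rails that are parallel in the world, lying on the ground plane.
left_rail = [project((-1.0, -EYE_HEIGHT, z)) for z in depths]
right_rail = [project((1.0, -EYE_HEIGHT, z)) for z in depths]

separations = [r[0] - l[0] for l, r in zip(left_rail, right_rail)]
image_heights = [l[1] for l in left_rail]

# Two horizontal lines in a plane parallel to the image plane (constant z).
lower_line = [project((x, 0.0, 5.0)) for x in (-2.0, 0.0, 2.0)]
upper_line = [project((x, 1.0, 5.0)) for x in (-2.0, 0.0, 2.0)]
gaps = [u[1] - l[1] for l, u in zip(lower_line, upper_line)]

print("rail separation in image:", [round(s, 3) for s in separations])
print("rail height in image:", [round(h, 3) for h in image_heights])
print("gap between fronto-parallel lines:", [round(g, 3) for g in gaps])
```

The rail separation shrinks in proportion to 1/z and the image height approaches the horizon at 0, whereas the gap between the fronto-parallel lines is constant along their length.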
And it's very difficult for you to actually counteract the visual system's recovery of 3D shape. This is a classic demonstration that was constructed by Roger Shepard, who's a cognitive psychologist who was also an amateur artist, and so he published a book of these illusions that he came up with. This is probably the most famous. The illusion here is that the parallelograms here and here are physically the same shape. The same size, same shape in the image. Yeah, you're shaking your head. You don't believe me, right? It doesn't look possible. So here's one way to see this. This is the image rotated and overlaid, and you can see that they're actually the same. Here's another animation. Yeah, but it's really hard to believe. And so the idea is that these two identical shapes in the image look like different shapes in the world, because the tables are oriented differently and angled differently. You're inferring these different 3D shapes in the world, and it's really impossible to access the 2D shapes in the image. Yeah? >> Maybe this is another illusion in the image, but is it relevant that the parallelogram seems slightly askew? The bottom of the long table doesn't look parallel to the top of the long table to me. >> Yeah, it doesn't to me either. But this one doesn't quite look exactly rectangular either. >> Like >> I don't think that's important. I think that's just another version. I think in the original version they actually look pretty well matched. But the point is that this distance in the image is the same as this distance in the image, right?
But when you look at it, it looks like it's a much greater distance, right? Because of the foreshortening effect of projective geometry. You have this thing that's getting projected onto an image, and you're inferring the three-dimensional shape in the world, overcoming that projective process or inverting it correctly. So that's an important theme.
And then the thing that we'll end with here is this issue of ambiguity and bistability. One of the amazing things about perception, and vision in particular, is that we're constantly solving these ill-posed problems, right? Three-dimensional interpretation of the world in particular is always ambiguous, especially if you look at the world with one eye. There are lots of different possible three-dimensional structures that are consistent with the image that you observe. But usually we see the one interpretation that is apparently deemed most likely by our visual system. So vision is typically quite stable. But sometimes, especially in the more impoverished displays that we often make use of in perceptual science, there are multiple interpretations that are equally likely and the percept is bistable. This is called a Necker cube. It's a wireframe cube. If you just look at this one on the left, you'll be able to see it in two different orientations. One orientation is that this square is out in front. The other orientation is that this square is out in front. Those are both consistent with the image. If you look at it, your percept will flip from one to another, maybe every five or 10 seconds or something like that. Raise your hands if you've seen it flip. Almost everybody. Yeah. And there are lots of famous examples like this. So this is combining two famous illusions.
In this particular case, the face can be seen as an older woman from the side or a younger woman kind of from the back. Okay? And this person is holding their hands up in a silhouette. And that's the duck-rabbit illusion, right? You can see it as a duck or as a rabbit. Okay? So, this is a joke, a cartoon that I saw recently. So, the bartender says, "Can I see your ID? Wait, never mind. Wait, yeah, I need to see your ID. Wait." You know, constantly switching between these interpretations of someone who's underage and someone who's overage, right? Okay. And so this bistability, we had a question about this last time, right? Bistability is super interesting, right? The fact that you don't get stuck in one interpretation. And, you know, one idea is that this is a situation where the posterior, so the probability of the world given the stimulus, is multimodal, right? So there are multiple interpretations that are both probable, and maybe your visual system is sampling from the posterior, so that you sometimes see one thing and sometimes see another. Maybe it relates to adaptation, and what happens is you see one thing, and so some neurons adapt, and then the other interpretation ends up getting favored in some way. It could potentially be due to implementation constraints. You know, we don't really know, but it's fairly prominent. And then these are some other examples of bistable things analogous to the duck-rabbit illusion. All right, let's end there. So next time we'll resume talking about monocular depth cues, and then we'll get into stereopsis, how you derive depth information from having two eyes.
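The ambiguity and "sampling from the posterior" ideas in the lecture can be illustrated with a small sketch. It is not the instructor's model, and every name in it (`rotate`, `project`, `front_face`, `posterior`, the chosen angles, the equal priors, the all-or-none likelihood) is an assumption made for illustration. It builds a cube, forms a depth-reversed second interpretation, checks that both give the same orthographic image, and then draws percepts from the resulting two-peaked posterior.

```python
import math
import random

def rotate(p, yaw, pitch):
    """Rotate point p = (x, y, z) about the y-axis (yaw), then the x-axis (pitch)."""
    x, y, z = p
    x, z = x * math.cos(yaw) + z * math.sin(yaw), -x * math.sin(yaw) + z * math.cos(yaw)
    y, z = y * math.cos(pitch) - z * math.sin(pitch), y * math.sin(pitch) + z * math.cos(pitch)
    return (x, y, z)

# Cube centred on the origin; the two faces we track are local z = -0.5 and z = +0.5.
LOCAL = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

# Interpretation A: the cube in an arbitrary (assumed) orientation; z is depth.
scene_a = [rotate(p, yaw=0.5, pitch=0.4) for p in LOCAL]
# Interpretation B: depth-reversed version, still a cube, same image coordinates.
scene_b = [(x, y, -z) for x, y, z in scene_a]

def project(scene):
    """Orthographic projection: keep image coordinates, discard depth."""
    return [(x, y) for x, y, _ in scene]

def front_face(scene):
    """Return which tracked face (by local z) has the smaller mean depth, i.e. is nearer."""
    depths = {}
    for local, (_, _, z) in zip(LOCAL, scene):
        depths.setdefault(local[2], []).append(z)
    return min(depths, key=lambda face: sum(depths[face]) / len(depths[face]))

def posterior(image, hypotheses, prior):
    """Bayes' rule with an all-or-none likelihood: 1 if the scene reproduces the image, else 0."""
    unnorm = {name: prior[name] * float(project(scene) == image)
              for name, scene in hypotheses.items()}
    total = sum(unnorm.values())
    return {name: weight / total for name, weight in unnorm.items()}

image = project(scene_a)
hypotheses = {"A": scene_a, "B": scene_b}
post = posterior(image, hypotheses, prior={"A": 0.5, "B": 0.5})

# Draw a sequence of percepts from the posterior (independent draws, fixed seed).
rng = random.Random(0)
percepts = rng.choices(list(post), weights=list(post.values()), k=20)

print("posterior:", post)
print("front face under A:", front_face(scene_a), "| under B:", front_face(scene_b))
print("percept sequence:", "".join(percepts))
```

Both interpretations produce the identical image but disagree about which square face is in front, which is the Necker cube situation described in the lecture. The independent draws are a simplification: they do not reproduce the dwell times of several seconds mentioned above, and the adaptation account would need some state carried from one percept to the next.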

Original Description

MIT 9.35, Spring 2024 Instructor: Josh McDermott View the complete course: https://ocw.mit.edu/courses/9-35-perception-spring-2024 YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP62-9RweyYBIpkqfo5dfcuS8 This lecture covers how the brain infers how near or distant objects in the visual field are. License: Creative Commons BY-NC-SA More information at https://ocw.mit.edu/terms More courses at https://ocw.mit.edu Support OCW at http://ow.ly/a1If50zVRl We encourage constructive comments and discussion on OCW’s YouTube and other social media channels. Personal attacks, hate speech, trolling, and inappropriate comments are not allowed and may be removed. More details at https://ocw.mit.edu/comments.

This video lecture teaches how the brain infers 3D structure from 2D images using various cues to depth, such as shape from shading, shadows, and texture gradients. It closes with the Shepard tables illusion and bistable percepts such as the Necker cube and the duck-rabbit illusion.

Key Takeaways
  1. Stare at one of the lights in the ceiling for 30 seconds
  2. Generate an afterimage in the eye
  3. Look at your hand and then at the wall to see the afterimage's effect on perceived size
  4. Observe the Shepard illusion
  5. Analyze the duck-rabbit illusion
💡 The visual system uses various cues to depth, such as shape from shading, shadows, and texture gradients, to infer 3D structure from 2D images.
