Depth and Subjective Transparency:
A survey of the literature

Dave Johnson

INTRODUCTION

Mist in the air, the surface of a pond, and imperfectly cleaned glass are familiar media known to be physically transparent (or, more properly, physically semi-transparent) because not all of the light incident upon them is reflected or absorbed. Under the proper conditions, physical transparency is ubiquitous, as Sir Isaac Newton observed in his famous synthesis on optical phenomena: "The least parts of almost all natural Bodies are in some measure transparent: And the Opacity of those Bodies ariseth from the multitude of Reflexions caused in their internal Parts" (1730, page 248).

Just as important, however, these media are also usually perceived as transparent. When we believe that we can see through a medium, we call that medium perceptually or subjectively transparent and the phenomenon is called subjective transparency.

In nature, physical and subjective transparency are usually intimately linked: one is rarely found without the other. However, in the laboratory of the psychophysicist, the link between them can be broken in striking ways, and a large literature has grown up to describe and attempt to explain the resulting paradoxes. A few of these contributions are mentioned here. Lightness, contrast, and transparency perception are considered by Koffka (1935), Metelli (1970, 1974a, 1974b, 1982), Beck et al. (1984), Metelli et al. (1985), and Watanabe et al. (1992). Cognitive factors are considered by Rock (1983). Figural, coding, and complexity factors are considered by Koffka (1935), Leeuwenberg (1978), Attneave (1982), and Restle (1982). Ecological factors are considered by J.J. Gibson (1982). Motion factors are considered by Watanabe & Cavanagh (1991) and Victor & Conte (1992). Illusory contours and color factors are considered by Meyer & Senecal (1983) and Grossberg (1993). Most of these factors will not be discussed further in this paper.

This paper will attempt to do two things. First, it will survey the literature of subjective transparency from the point of view of depth cues. Second, it will offer an answer to the question: How do we see a transparent surface?

DEPTH

Disparity cues may be used to induce the illusion of transparency. When the transparent and opaque surfaces are seen at different depths, the effect is called stereoscopic transparency to distinguish it from other forms of subjective transparency that do not involve a sensation of depth. The effect of stereoscopic transparency is easily produced in the laboratory with properly configured stereograms. A number of researchers have examined this effect and a few of their results will be presented here.
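The link between depth separation and disparity that this section relies on can be made concrete with a little geometry. The following is a minimal sketch, assuming a point on the midline between the two eyes and treating disparity as the difference between two vergence angles. The function name, the parameter names, and the 6.5 cm interocular distance in the usage lines are illustrative assumptions and are not taken from any of the papers surveyed here.

```python
import math

def disparity_arcmin(interocular_m, fixation_m, target_m):
    """Disparity (in arcminutes) of a midline target relative to fixation.

    Computed as the difference between the vergence angle needed for the
    target and the vergence angle needed for the fixation point. Positive
    values mean the target is nearer than fixation (crossed disparity).
    """
    def vergence(distance_m):
        # Angle subtended at a midline point by the two eyes, in radians.
        return 2.0 * math.atan(interocular_m / (2.0 * distance_m))

    radians = vergence(target_m) - vergence(fixation_m)
    return math.degrees(radians) * 60.0

# Hypothetical viewing situation: fixate an opaque surface at 1 m and
# place a semi-transparent sheet 10 cm in front of it.
sheet_disparity = disparity_arcmin(0.065, 1.0, 0.9)
same_plane = disparity_arcmin(0.065, 1.0, 1.0)
```

With both surfaces in the same plane the disparity is zero, which is the geometric sense in which there can be no stereoscopic transparency without physical (or simulated) separation.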

Depth cues alone are not sufficient for a percept of stereoscopic transparency. Masin (1991) reports that viewing attitude also plays a crucial role: "When the viewing attitude is normal, the probabilities of activation of representations of transparency can be set at about 1 by figural ... or stereoscopic conditions of transparency" (page 564). His paper does not, however, explore the effects of attitude on stereoscopic transparency specifically.

How important is physical separation (or equivalently, disparity) to subjective transparency? Clearly, without it, one cannot have a percept of stereoscopic transparency. Watanabe et al. (1992) study the McCollough effect and its interaction with subjective transparency. Their results will not be reported here. However, they assert that a physically separated pattern (one in which one pattern is closer than another) is more easily separated perceptually than an ordinary test pattern. "This might be because the subjective separation of transparent surfaces is not so complete as physical separation."

Zanforlin (1982) considers a number of stereograms which involve what appear to be two figures, one behind the other. Figure 13.16, which shows two stereograms (page 263), is a particularly good example. In the second stereogram, with its black overlaying vertical rectangle, when the left image is presented to the right eye and the right image is presented to the left eye, "the horizontal rectangle appears further away, it also appears opaque and behind the vertical rectangle. But when the stereograms are reversed the horizontal rectangle appears nearer, above the black rectangle and transparent" (page 263). He goes on to say (page 264) that "Completion of the horizontal contours across the vertical surface is possible because they are the same color. If the vertical surface is a different color, the 'crossing' is not possible and the two parts break..."

Zanforlin (1982) does not focus on subjective transparency as such, and so does not present a model. However, regarding the "bending" which may occur in the forming of the binocular Gestalt, he writes, "a modification or 'rupture' of the complex Gestalt may result although this depends on the relative strength of the forces that intervene at monocular and binocular level" (page 265). This might be interpreted as follows: when figural and reflectance cues are consistent with the depth cues, an ordinary percept of transparency (or opacity) results. Otherwise, the visual system tends to "bend" one of the surfaces so that the percept that would be expected results.

Some instances of stereoscopic transparency are more difficult to see than others. Akerstrom & Todd (1988) performed a number of experiments to determine the conditions under which this percept is most difficult. They used random-dot stereograms with varying dot densities and disparities. Subjects were significantly better at seeing the transparent surface in the stereogram when the disparities were very small and/or when the dot densities were lower. Interestingly, they reported that "with random-dot stereograms of opaque surfaces, in contrast, a comparable increase in element density has no significant effect on observers' perceptions". They also found that "overlapping surfaces are harder to segregate perceptually than nonoverlapping surfaces" (page 431).
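To make the two manipulated variables concrete, the following is a minimal sketch of how a random-dot stereogram of two overlapping transparent planes might be generated. It is not the stimulus-generation procedure of Akerstrom & Todd (1988), whose details are not reproduced in this survey. The function name, the even split of dots between the planes, and the horizontal wrap-around are all assumptions made for illustration.

```python
import random

def transparent_rds(width, height, density, disparity, seed=0):
    """Return (left, right) half-images as lists of rows of 0/1 pixels.

    Every dot in the left image belongs to one of two planes that cover
    the same region. Dots on the first plane are copied unshifted to the
    right image (zero disparity). Dots on the second plane are shifted
    by `disparity` pixels, so the two planes interleave in the same
    visual directions, which is the situation described as stereoscopic
    transparency.
    """
    rng = random.Random(seed)
    left = [[0] * width for _ in range(height)]
    right = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if rng.random() < density:
                left[y][x] = 1
                # Assumption: each dot joins either plane with equal odds.
                shift = disparity if rng.random() < 0.5 else 0
                # Wrap around horizontally so no dot falls off the edge.
                right[y][(x - shift) % width] = 1
    return left, right

# Hypothetical stimuli: a sparse, small-disparity pair and a dense,
# large-disparity pair of the kind the experiments compare.
easy_left, easy_right = transparent_rds(64, 64, density=0.05, disparity=2)
hard_left, hard_right = transparent_rds(64, 64, density=0.40, disparity=8)
```

Raising `density` puts more dots from both planes into every neighborhood, and raising `disparity` widens the range over which matches must be found. These are the two parameters that the findings above associate with greater difficulty.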

The difference in difficulty between perceiving opacity and transparency in random-dot stereograms is striking. On page 423 of Akerstrom & Todd (1988), two random-dot stereograms are shown. Figure 1 represents transparency and Figure 2 represents opacity. I had significant difficulty seeing the binocular image in Figure 1, but no trouble at all seeing Figure 2. The density of dots and the disparity are comparable in the two figures.

The experiments of Akerstrom & Todd were conducted, in part, to determine the role of competitive interactions in any neural model of subjective transparency. They point out (pp. 421-422) that Marr et al. had difficulty correctly representing input which contained transparent surfaces in a random-dot stereogram. One might expect that competitive interactions would prevent the simultaneous perception of two depth values in the same direction of visual space.

Akerstrom & Todd concluded (1988, page 431) that "there are several sources of evidence to suggest that the perception of stereoscopic transparency in actual human observers cannot be based on a purely cooperative process". One source is the finding that overlapping surfaces are harder to segregate. This would indicate that there is competition between these surfaces or between the perception of the overlap of these surfaces and some other percept. Another source of evidence is the fact that segregation becomes significantly more difficult as the density of dots increases, suggesting a competitive interaction. And "finally, the perceptual 'filling in' of background regions that typically occurs with random-dot stereograms of opaque surfaces is effectively inhibited when two such surfaces overlap one another", suggesting that the inhibitory interactions are involved in the prevention of the filling-in of the occluded surface.

They also suggest that any neural model based solely on position and disparity must fail. They cite the fact that long presentation times were required in many instances for the subjects to fuse the images and that this "is a source of compelling evidence that vergence eye movements may play a critical role in this phenomenon. ... Their specific strategies for directing attention and eye movements seem to be particularly important in this regard." Another indication that position and disparity "cannot tell the whole story" is the finding that "the ability of observers to segregate overlapping transparent surfaces perceptually can sometimes be facilitated when the individual depth planes are appropriately labeled with other sources of information, such as color" (page 431).

What value of depth will be assigned to a transparent surface by the visual system? The answer to this question depends on the disparity and local contrast of the various regions in the pattern. Watanabe & Cavanagh (1992) study this question in a series of experiments. They note the Spillman and Redies result that "when a transparent textured pattern is held above the Ehrenstein figure, the subjective surfaces appear to lie not in the plane of the figures but in the plane of the overlying texture" (page 527). They call this phenomenon depth capture. Though the stimuli are presented on a computer monitor and viewed by the subjects through polarized glasses, an equivalent result will occur if the random-dot pattern is copied to a transparent sheet and held above patterns which, by themselves, give rise to illusory contours.

When the Kanizsa square is used, Watanabe & Cavanagh report that "the perceived depth of the colored square was captured by the texture most frequently when the luminance contrast of the colored square was nearly equiluminant to the background. Depth capture occurred more frequently for smaller disparities. The depth capture was stronger when the texture was placed behind the display plane than when it was in front." They suggest that this last result may be because the dots "always occluded the colored square providing a cue that it should lie behind the texture" (page 528).

They suggest that equiluminance facilitates depth capture because "chromatic contours provide only a weak disparity signal" (page 529). Thus, the chromatic contour disparity may be overridden by the disparity signals from the dots on the transparent surface.

The experiments of Watanabe & Cavanagh (1992) also showed that the depth value assigned to the transparent surface and to the captured illusory contour was not an average of the depths of the two original surfaces. "Rather, the apparent depth of the colored square was captured by the texture" (page 530).

In addition, Watanabe & Cavanagh (1992) determined that depth capture was stronger for lined Kanizsa figures than for solid Kanizsa figures. They also report that the effect holds for other illusory figures besides the Kanizsa square. Why? They argue that the disparity signals produced by thin lines differ from those produced by solid regions, and that this accounts for the difference in depth capture seen in their experiments (page 531).

Nakayama et al. (1990) conduct a number of experiments designed to explore the relationship of depth to subjective transparency. They ask subjects to examine Ehrenstein and Kanizsa square stereograms to determine how the perception of transparency and other qualities of the stimuli are correlated with disparity. They write, "Rather than seeing transparency as a perceptual endpoint, determined by seemingly more primitive processes, we interpret perceived transparency as much as cause as an effect" (page 497).

In one of their stereograms (Figure 6, page 502), a Kanizsa square can be viewed either with the "square" above the circles or below. When it is above, the square is clearly visible, floating above the circles. When it is below, there are no illusory contours to define its edges. In the monocular images, weak illusory contours can be seen.

HOW DO WE SEE A TRANSPARENT SURFACE?

As the foregoing examples have demonstrated, the perception of transparency is a complex phenomenon which links seemingly disparate effects such as brightness perception, illusory contours and neon color spreading, motion, and depth. We are led back to our original question: How do we see a transparent surface?

In the context of a much broader theory of human vision, Stephen Grossberg (1993) has begun to propose a mechanistic solution to our question. Since space limitations prevent a thorough analysis of Grossberg's contribution, and because the primary focus of this paper is on the influence of depth on subjective transparency, I will attempt to summarize Grossberg (1993) from that perspective.

Grossberg (1993) presents a theory based on an extension of the BCS/FCS and FACADE models which were originally designed to explain such phenomena as subjective contours, brightness constancy and assimilation, and binocular fusion. The present theory attempts to show "how spatially sparse disparity cues can generate continuous surface representations at different perceived depths" (page 463). Grossberg assumes the existence of pools of cells, some for the representation of near-zero disparity, and others for non-zero disparity. "The non-zero disparity cells are themselves assumed to be segregated into separate cell pools that are organized ... to correspond to different relative depths of an observed image feature" (page 465).

Grossberg (1993) asks "why do not all such surfaces look transparent?" The answer: "The theory suggests that this does not happen because the boundaries corresponding to closer objects are added to the boundaries corresponding to further objects in the filling-in domains". These boundaries restrict filling-in in regions that are further away. "This restriction upon filling-in of surface properties does not prevent boundaries from being completed behind an occluded region" (page 466).

How, then, do we see something behind a transparent surface at all? The filling-in of the region behind an occluder "does sometimes occur, as during transparency phenomena" (page 470). Though Grossberg does not explicitly describe how this may occur, we have enough information to propose an answer. Suppose that one surface which is physically semi-transparent lies nearer to an observer than another surface which is physically opaque. On the nearer surface are regions of opacity (or else we would not be able to see the surface at all). The boundaries of these regions of opacity add to the boundaries in the further layer to prevent filling-in there (and so to occlude the corresponding region from our view). But only those regions are occluded in the further surface. All of the rest of the further surface will be allowed to fill in.
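The proposal can be illustrated with a toy computation. The following is a minimal one-dimensional sketch, assuming that filling-in behaves like diffusion that is blocked wherever a boundary is present, and that the boundaries of the nearer surface are simply added to those of the further surface. It is not an implementation of the equations in Grossberg (1993); the function names, the diffusion rule, and the nine-cell layout are hypothetical choices made for illustration.

```python
def fill_in(features, boundaries, steps=200, rate=0.25):
    """Spread feature values along a row of cells.

    boundaries[i] is True when a boundary lies between cell i and
    cell i + 1. Feature signal diffuses between neighboring cells
    unless a boundary separates them.
    """
    values = [float(f) for f in features]
    for _ in range(steps):
        updated = values[:]
        for i in range(len(values) - 1):
            if not boundaries[i]:
                flow = rate * (values[i] - values[i + 1])
                updated[i] -= flow
                updated[i + 1] += flow
        values = updated
    return values

def add_near_boundaries(far_boundaries, near_boundaries):
    """Boundaries of the nearer surface are added to the further layer."""
    return [far or near for far, near in zip(far_boundaries, near_boundaries)]

# Further surface: feature signals at both ends, no boundaries of its own.
far_features = [1, 0, 0, 0, 0, 0, 0, 0, 1]
far_boundaries = [False] * 8

# Nearer surface: one region of opacity covering cells 4 and 5, so its
# boundaries lie between cells 3|4 and between cells 5|6.
near_boundaries = [False] * 8
near_boundaries[3] = True
near_boundaries[5] = True

occluded = fill_in(far_features, add_near_boundaries(far_boundaries, near_boundaries))
unoccluded = fill_in(far_features, far_boundaries)
```

In `occluded`, the cells behind the region of opacity receive no feature signal while the cells on either side fill in. In `unoccluded`, the signal spreads across the whole row. In two dimensions the unoccluded parts of the further surface would remain connected around small regions of opacity, which this one-dimensional sketch cannot show.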

But how do we see the transparent surface? Grossberg (1993) assumes that there exist copies of the BCS and FCS at different disparities. Consider only the BCS and FCS copies corresponding to the disparity associated with the nearest surface and assume that this surface contains regions of opacity. Not all such surfaces give rise to the perception of transparency. For example, your hand held in front of your face defines a planar surface at the depth of your hand. However, that plane is not perceived as transparent. Something more is needed. Again, Grossberg (1993) does not explicitly address this issue. However, under the right conditions, for example when many small, partially connected regions of contrast are spread across the surface, the local contrasts may be small enough and the featural information strong enough that featural information can bleed from its sources and fill in the surface in the same way that the neon color signal fills in the Ehrenstein figure. It is this featural signal that we see.

CONCLUSION

Though there are still many unanswered questions in the field of human vision, the synthesis of Grossberg (1993), building on the work of many outstanding psychophysicists and neurobiologists, begins to close the loop with the optics of Isaac Newton.

REFERENCES

Akerstrom, R.B. & Todd, J.T. (1988). The perception of stereoscopic transparency. Perception & Psychophysics, 44, 421-432.

Attneave, F. (1982). Prägnanz and Soap Bubble Systems. In J. Beck (Ed.) Organization and Representation in Perception. New Jersey: Lawrence Erlbaum Associates.

Beck, J., Prazdny, K., & Ivry, R. (1984). The perception of transparency with achromatic colors. Perception & Psychophysics, 35, 407-422.

Gibson, J.J. (1982). What is Involved in Surface Perception? In J. Beck (Ed.) Organization and Representation in Perception. New Jersey: Lawrence Erlbaum Associates.

Grossberg, S. (1993). A solution of the figure-ground problem for biological vision. Neural Networks, 6, 463-484.

Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt Brace.

Leeuwenberg, E. (1978). Quantification of Certain Visual Pattern Properties: Salience, Transparency, Similarity (pp. 277-298). In E.L.J. Leeuwenberg and H.F.J.M. Buffart (Eds.) Formal Theories of Visual Perception. New York: John Wiley and Sons.

Masin, S.C. (1991). A weighted-average model of achromatic transparency. Perception & Psychophysics, 49, 563-571.

Metelli, F. (1970). An algebraic development of the theory of perceptual transparency. Ergonomics, 13, 59-66.

Metelli, F. (1974a). The Perception of Transparency. Scientific American, 230 (April), 90-98.

Metelli, F. (1974b). Achromatic color conditions in the perception of transparency. In R.B. MacLeod & H.L. Pick (Eds.) Perception (pp. 95-116). Ithaca, NY: Cornell University Press.

Metelli, F. (1982). Some Characteristics of Gestalt-Oriented Research in Perception. In J. Beck (Ed.) Organization and Representation in Perception. New Jersey: Lawrence Erlbaum Associates.

Metelli, F., DaPos, O., & Cavadon, A. (1985). Balanced and unbalanced, complete and partial transparency. Perception & Psychophysics, 38, 354-366.

Meyer, G.E. & Senecal, M. (1983). The illusion of transparency and chromatic subjective contours. Perception & Psychophysics, 34, 58-64.

Nakayama, K., Shimojo, S., & Ramachandran, V.S. (1990). Transparency: Relation to depth, subjective contours, luminance, and neon color spreading. Perception, 19, 497-513.

Newton, Sir Isaac (1730). Opticks or A Treatise of the Reflections, Refractions, Inflections and Colours of Light. New York: Dover Publications.

Restle, F. (1982). Coding Theory as an Integration of Gestalt Psychology and Information Processing Theory. In J. Beck (Ed.) Organization and Representation in Perception. New Jersey: Lawrence Erlbaum Associates.

Rock, I. (1983). The Logic of Perception. Cambridge: MIT Press.

Victor, J.D. & Conte, M.M. (1992). Coherence and transparency of moving plaids composed of Fourier and non-Fourier gratings. Perception & Psychophysics, 52, 403-414.

Watanabe, T. & Cavanagh, P. (1991). Texture and motion spreading, the aperture problem, and transparency. Perception & Psychophysics, 50, 459-464.

Watanabe, T., Zimmerman, G., & Cavanagh, P. (1992). Orientation-contingent color aftereffects mediated by subjective transparent structures. Perception & Psychophysics, 52, 161-166.

Watanabe, T. & Cavanagh, P. (1992). Depth capture and transparency of regions bounded by illusory and chromatic contours. Vision Research, 32, 527-532.

Zanforlin, M. (1982). Figure Organization and Binocular Interaction (pp. 251-267). In J. Beck (Ed.) Organization and Representation in Perception. New Jersey: Lawrence Erlbaum Associates.