Physically-Based Materials – Energy Conservation and Reciprocity

Last year we released the Enterprise PBR Material, a physically-based material that follows industry standards and is flexible enough for use in real-time applications as well as high-quality offline rendering. The material fulfills two key requirements. The first is consistency: renderings of a certain material must look the same across different applications, rendering algorithms, and lighting conditions. The second is ease of use: it must be easy to build realistic-looking materials, even for inexperienced users.

The detailed scenarios and platforms for which we optimized the material were presented in a talk by Meseth et al. (2019) (video, requires login). It targets mobile phones and web browsers, VR and AR applications, real-time rasterization and ray tracing, on low-end devices and high-end workstations, but also interactive and offline global illumination rendering, on a single machine or in a cluster. In all these scenarios, we have a single source of truth, which means that the material is configured once and used everywhere without changes.

Physically-based materials and physically-based rendering naturally provide consistency and make it possible to design an easy-to-use material model. Although the terms “physically-based material” and “PBR material” are widely used, there is a lot of misunderstanding and ambiguity around them. For this reason I would like to define the term “physically-based material”, show the math behind it, and describe why sticking to the fundamental physical principles of light transport is the key to cross-renderer consistency.

Definition

A material defines the reflective, transmissive, and emissive properties at each surface point of the mesh the material is assigned to. I will refer to these properties as the surface properties.

Sometimes, for example in Enterprise PBR, the material also defines what happens in the object’s volume below the surface, i.e., the volumetric (or subsurface) properties. The left image shows refraction due to the index of refraction being different inside the material than outside. The right image shows subsurface scattering, an effect in which the light penetrates the surface, interacts with the volume according to absorption and scattering coefficients, and exits the surface at a different point.

A very intuitive description of materials in rendering can be found in chapter 1.1 of the MDL handbook (2020).

In a physically-based material, surface and volume properties are defined independently of the light sources, scene geometry, and rendering algorithm. Light direction and distance, for example, as well as squared-distance falloff or ambient color, are not part of the material.

The reflective and transmissive properties of a physically-based material are described via the bidirectional scattering distribution function (BSDF). The BSDF is a 4-dimensional function $f(\omega_i, \omega_o)$. It takes the direction of incoming light $\omega_i$ and the outgoing direction $\omega_o$, and returns the ratio of scattered (reflected or transmitted) radiance along $\omega_o$ to the irradiance incident from $\omega_i$. The directions $\omega$ are given as azimuth and zenith angle, hence the 4 dimensions. More details about the BSDF can be found in the PBR Book, Pharr et al. (2019) and, of course, Wikipedia. I want to focus on the three main properties that make the BSDF physically-based: positivity, energy conservation, and reciprocity.

Properties

Positive

A physically-based BSDF is positive: negative radiance doesn’t make sense in light transport.

$f(\omega_i, \omega_o) \ge 0$

Energy Conserving

A physically-based BSDF is energy-conserving: the total energy of light reflected or transmitted is equal to or less than the energy of incident light. For all directions $\omega_o$,

$\int_\Omega f(\omega_i, \omega_o) \cos\theta_i \mathrm d \omega_i \le 1$

A BSDF that violates energy conservation adds light to the scene, similar to an emitter or light source. While emitters are of course a crucial part of any scene (scenes without lights are boring), emission has to be treated as a separate component in a physically-based material, not as part of reflection or transmission. Typically, the emissive properties are described via the emission distribution function (EDF), but let’s stick to BSDFs for now.

Although energy conservation is intuitively obvious, a surprisingly large number of implementations violate this property. An easy test for it is the white furnace test: render a sphere surrounded by a constant, white light source, seen from an orthographic camera. If the sphere is brighter than its surroundings, the material is not energy-conserving. If testing is so easy, why do so many implementations of physically-based materials violate this property?
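Numerically, the furnace test boils down to checking that the directional albedo $\int_\Omega f(\omega_i, \omega_o) \cos\theta_i \mathrm d \omega_i$ never exceeds 1. A minimal Monte Carlo sketch (the BRDFs here are hypothetical stand-ins: a Lambertian with albedo 1, and a “boosted” variant that fails the test):

```python
import numpy as np

rng = np.random.default_rng(0)

def furnace_albedo(brdf, v, n=100_000):
    # Estimate the directional albedo: the integral of f(v, l) * cos(theta_l)
    # over the hemisphere. With cosine-weighted sampling (pdf = cos(theta)/pi)
    # the estimator is simply the mean of pi * f(v, l).
    u1, u2 = rng.random(n), rng.random(n)
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    l = np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1.0 - u1)], axis=-1)
    return np.mean(np.pi * brdf(v, l))

lambert = lambda v, l: 1.0 / np.pi   # albedo 1: conserves energy exactly
boosted = lambda v, l: 1.3 / np.pi   # adds 30% energy: fails the furnace test

v = np.array([0.0, 0.0, 1.0])
print(furnace_albedo(lambert, v))  # ≈ 1.0 -> passes
print(furnace_albedo(boosted, v))  # ≈ 1.3 -> brighter than the furnace
```

The same estimator works for any BRDF expressed in the local shading frame, which makes it a handy unit test for a material implementation.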

Most of the time, producing a small amount of extra energy (a few percent) can be tolerated, especially in real-time rendering with rasterizers. In global illumination (GI) renderers, however, adding a small amount of energy at each bounce quickly becomes a major source of light in the scene. If the renderer (or the artist) is not prepared for that, the scene converges slowly and the (pseudo-)random noise of Monte Carlo integration does not disappear. A skilled artist may be able to detect those problems and react accordingly, but unskilled users are lost.

Reciprocal

A physically-based BSDF is reciprocal: swapping incident and outgoing light directions does not change the result. For all pairs of directions $\omega_i$ and $\omega_o$,

$f(\omega_i, \omega_o) = f(\omega_o, \omega_i)$
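For a typical microfacet BRDF this symmetry can be spot-checked numerically: the distribution evaluated at the half-vector, the masking terms, and the Fresnel term on $v \cdot h = l \cdot h$ are all symmetric under swapping $v$ and $l$. A small sketch, assuming a GGX distribution with Smith masking and Schlick Fresnel in a local frame with normal $(0, 0, 1)$:

```python
import numpy as np

def microfacet_brdf(v, l, alpha=0.2, f0=0.04):
    # Isotropic GGX distribution, separable Smith masking, Schlick Fresnel
    # on the half-vector; shading frame with normal n = (0, 0, 1).
    h = (v + l) / np.linalg.norm(v + l)
    nv, nl, nh, vh = v[2], l[2], h[2], np.dot(v, h)
    a2 = alpha * alpha
    d = a2 / (np.pi * (nh * nh * (a2 - 1.0) + 1.0) ** 2)
    g1 = lambda c: 2.0 * c / (c + np.sqrt(a2 + (1.0 - a2) * c * c))
    fresnel = f0 + (1.0 - f0) * (1.0 - vh) ** 5
    return fresnel * d * g1(nv) * g1(nl) / (4.0 * nv * nl)

v = np.array([0.3, 0.1, 0.95]); v /= np.linalg.norm(v)
l = np.array([-0.5, 0.2, 0.84]); l /= np.linalg.norm(l)
# Swapping v and l leaves every factor unchanged, since v.h equals l.h
# by construction of the half-vector.
print(np.isclose(microfacet_brdf(v, l), microfacet_brdf(l, v)))  # True
```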

In reality, light sources like the sun or light bulbs emit photons that scatter at objects until they eventually hit the eye or camera sensor. In ray tracing, the direction is reversed: a path starts at the camera sensor and is traced through the scene until it hits the light source. This inversion is only correct mathematically if the BSDF is reciprocal. If the BSDF is not reciprocal, we not only violate a fundamental physical property, but also make the BSDF unusable in bidirectional rendering algorithms.

Global illumination rendering algorithms profit considerably from bidirectional path tracing. Light paths started at the camera (path tracing) are combined with light paths started at the light source (light tracing) via multiple importance sampling. This only works if path tracing and light tracing produce the same result. Many bidirectional methods have been developed, like two-way path tracing, progressive photon mapping, Veach-style Metropolis light transport, VCM, and more.

Almost all BSDFs found in literature and implementations are reciprocal by design: Lambert, Oren-Nayar, microfacet-based BSDFs with GGX or Beckmann distribution, etc. However, a material for realistic surfaces usually combines several of these elemental BSDFs. A simple plastic BSDF, for example, is a combination of Lambertian reflection and GGX microfacet. It’s tricky to make the combined BSDF reciprocal, and, therefore, a lot of renderers violate reciprocity.

The main reason for that is that many renderers do not make use of bidirectional algorithms. In this case, the additional complexity is not worth the increase in physical plausibility. And even if the renderer is bidirectional, the artifacts introduced by this simplification are so small that you may decide to just live with it. It will, however, make debugging and verification harder, as it is impossible to compare path tracing and light tracing to check and validate the implementations.

For the Enterprise PBR material we decided to strictly fulfill the criteria for a physically-based material. We believe that this will guarantee us consistency with today’s and tomorrow’s rendering algorithms, from rasterization to bidirectional global illumination, and whatever new physically-based techniques will be developed in the future.

In the next sections, I’d like to have a close look at various material models and rendering techniques to check whether they are physically-based or not.

Case Study: Split Sum Approximation and Pre-Filtered Environment Maps

Pre-filtering the environment map is a popular technique in rasterizers to handle image-based lighting (IBL), see for example Karis (2013). It’s an approximation to quickly compute the lighting integral with low error for typical environment maps. The integral is split into two integrals (split-sum approximation). The first part consists of the environment map preconvolved with the microfacet distribution function, ignoring the view-dependent shape of the distribution. The second part consists of the view-based directional albedo of the BRDF (DFG LUT). The two parts are pre-computed independently and stored as a mip-mapped cube map and a lookup table.
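Schematically, and in the notation of this article (this is my paraphrase of Karis’s formulation, with $L_i$ the environment radiance and $D$ the microfacet distribution), the split looks roughly like

$\int_\Omega L_i(l) \, f(v, l) \cos\theta_l \, \mathrm d l \approx \underbrace{\frac{\int_\Omega L_i(l) \, D(l) \cos\theta_l \, \mathrm d l}{\int_\Omega D(l) \cos\theta_l \, \mathrm d l}}_{\text{pre-filtered environment map}} \cdot \underbrace{\int_\Omega f(v, l) \cos\theta_l \, \mathrm d l}_{\text{DFG lookup table}}$

The first factor is precomputed per roughness level into the mip-mapped cube map; the second is the directional albedo stored in the 2D lookup table.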

There are two sources of errors: the split-sum approximation and the view-independence. The following image taken from Karis (2013) shows the reference at the top, the split-sum approximation in the middle and the full approximation including view-independence at the bottom.

Each approximation on its own makes the method non-physical, and each violates the fundamental principle of physically-based rendering: light and material have to be independent.

Considering that the method is an approximation, being non-physical is totally fine. The approximation just has to be used carefully and under the right circumstances. In particular, it is not suited for defining a renderer-independent material model or a ground truth that other renderers have to match. Other renderers, especially those focused on image quality, may not want or may not be able to use an approximation for image-based lighting. Ground truth has to be defined via an unbiased renderer; only then is it truly unambiguous. The Unreal Engine, for example, includes a Path Tracer to generate reference images. There is also an RTX-based path tracer from NVIDIA for Unreal Engine, see this video.

Case Study: Dielectric Materials

A common model for dielectric materials is a Fresnel-weighted combination of a Lambertian diffuse and a microfacet specular reflection. The diffuse reflection $f_d(v, l) = \frac{1}{\pi}$ models the subsurface scattering and attenuation inside the material, the specular reflection $f_s(v, l)=\frac{D(h) G(v, l)}{4 (v \cdot n)(l \cdot n)}$ models the reflection at the surface. The Fresnel term $F$ models the ratio of reflection and transmission at the surface: the fraction of light not reflected at the surface is transmitted and, thus, handled by the diffuse term. I am using $v$, $l$, $h$, and $n$ here to denote the view and light direction, the half-vector and the surface normal, respectively.

$f(v, l) = (1-F) \, f_d(v, l) + F \, f_s(v, l)$

In this equation I left out an important detail: the direction used to compute the Fresnel term. Let’s have a look at different approaches used in the wild and check if they are physically-based.

Half-vector based microfacet Fresnel, no diffuse Fresnel

$f(v, l) = f_d(v, l) + F(v \cdot h) \, f_s(v, l)$

Is it reciprocal? Yes. Is it energy-conserving? The white furnace test shows: no! This shouldn’t be a surprise: the diffuse BRDF already reflects all incoming energy, so whatever we add on top will break energy conservation.
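The failure is easy to reproduce with the furnace integral. The sketch below (assuming a Lambertian diffuse with albedo 1, GGX with Smith masking, and Schlick Fresnel with $F_0 = 0.04$ — hypothetical but typical choices) estimates the albedo of this variant; the diffuse term alone already integrates to 1, so the specular term pushes the total above 1:

```python
import numpy as np

rng = np.random.default_rng(1)

def schlick(c, f0=0.04):
    return f0 + (1.0 - f0) * (1.0 - c) ** 5

def furnace_no_diffuse_fresnel(theta_v, alpha=0.3, n=200_000):
    v = np.array([np.sin(theta_v), 0.0, np.cos(theta_v)])
    # cosine-weighted hemisphere samples for l (pdf = cos(theta_l) / pi)
    u1, u2 = rng.random(n), rng.random(n)
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    l = np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1.0 - u1)], axis=-1)
    h = v + l
    h /= np.linalg.norm(h, axis=-1, keepdims=True)
    nv, nl, nh, vh = v[2], l[:, 2], h[:, 2], h @ v
    a2 = alpha * alpha
    d = a2 / (np.pi * (nh * nh * (a2 - 1.0) + 1.0) ** 2)
    g1 = lambda c: 2.0 * c / (c + np.sqrt(a2 + (1.0 - a2) * c * c))
    f_s = d * g1(nv) * g1(nl) / (4.0 * nv * nl)
    f = 1.0 / np.pi + schlick(vh) * f_s      # no Fresnel on the diffuse term
    return np.mean(np.pi * f)

print(furnace_no_diffuse_fresnel(np.radians(45)))  # > 1: adds energy
```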

Half-vector based microfacet Fresnel, half-vector based diffuse Fresnel

$f(v, l) = (1-F(v \cdot h)) \, f_d(v, l) + F(v \cdot h) \, f_s(v, l)$

Is it reciprocal? Yes. Is it energy conserving? No.

View-vector based microfacet Fresnel, view-vector based diffuse Fresnel

$f(v, l) = (1-F(v \cdot n)) \, f_d(v, l) + F(v \cdot n) \, f_s(v, l)$

Is it reciprocal? No. Is it energy conserving? Yes.

But wait… why is it energy-conserving if we use $(v \cdot n)$ in both Fresnel terms, but not if we use $(v \cdot h)$? Isn’t it in both cases just a Fresnel-based linear blend between diffuse and specular terms?

Recall the definition of energy conservation (with $\omega_i = l$ and $\omega_o = v$):

$L_o(v) = \int_\Omega f(v, l) \cos\theta_l \mathrm d l \le 1$

Substituting $f(v,l)$ with the view-vector based combination $F(v \cdot n)$ gives us

\begin{aligned} L_o(v) &= \int_\Omega \left[ (1-F(v \cdot n)) \, f_d(v, l) + F(v \cdot n) \, f_s(v, l) \right] \cos\theta_l \mathrm d l \\ &= \int_\Omega (1-F(v \cdot n)) \, f_d(v, l) \cos\theta_l \mathrm d l + \int_\Omega F(v \cdot n) \, f_s(v, l) \cos\theta_l \mathrm d l \\ &= (1-F(v \cdot n)) \underbrace{\int_\Omega f_d(v, l) \cos\theta_l \mathrm d l}_{\leq 1} \, + F(v \cdot n) \underbrace{\int_\Omega f_s(v, l) \cos\theta_l \mathrm d l}_{\leq 1} \end{aligned}

As the integration doesn’t depend on $F(v \cdot n)$, we can pull it out of the integrals. We know that the individual BRDFs are energy-conserving, i.e., the two integrals without Fresnel terms are less than or equal 1. Therefore, any linear blend between those will also be energy-conserving.
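The derivation can be spot-checked numerically. The sketch below (again assuming a Lambertian diffuse with albedo 1, GGX with Smith masking, and Schlick Fresnel) estimates the furnace albedo of the view-vector blend for several view angles; all stay at or below 1, up to Monte Carlo noise (single-scattering GGX even loses a bit of energy):

```python
import numpy as np

rng = np.random.default_rng(2)

def schlick(c, f0=0.04):
    return f0 + (1.0 - f0) * (1.0 - c) ** 5

def furnace_view_blend(theta_v, alpha=0.3, n=400_000):
    v = np.array([np.sin(theta_v), 0.0, np.cos(theta_v)])
    # cosine-weighted hemisphere samples for l (pdf = cos(theta_l) / pi)
    u1, u2 = rng.random(n), rng.random(n)
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    l = np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1.0 - u1)], axis=-1)
    h = v + l
    h /= np.linalg.norm(h, axis=-1, keepdims=True)
    nv, nl, nh = v[2], l[:, 2], h[:, 2]
    a2 = alpha * alpha
    d = a2 / (np.pi * (nh * nh * (a2 - 1.0) + 1.0) ** 2)
    g1 = lambda c: 2.0 * c / (c + np.sqrt(a2 + (1.0 - a2) * c * c))
    f_s = d * g1(nv) * g1(nl) / (4.0 * nv * nl)
    fr = schlick(nv)                  # view-vector Fresnel, constant in l
    f = (1.0 - fr) / np.pi + fr * f_s
    return np.mean(np.pi * f)

for t in (0.0, 0.8, 1.4):
    print(furnace_view_blend(t))  # all <= 1 (up to Monte Carlo noise)
```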

This doesn’t hold if we use the half-vector based combination $F(v \cdot h)$. The Fresnel term depends on $l$ (because $h = \frac{v+l}{\|v+l\|}$) and cannot be pulled out of the integral. We are stuck with

\begin{aligned} L_o(v) &= \int_\Omega \left[ (1-F(v \cdot h)) \, f_d(v, l) + F(v \cdot h) \, f_s(v, l) \right] \cos\theta_l \mathrm d l \\ &= \int_\Omega (1-F(v \cdot h)) \, f_d(v, l) \cos\theta_l \mathrm d l + \int_\Omega F(v \cdot h) \, f_s(v, l) \cos\theta_l \mathrm d l \end{aligned}

and cannot simplify further.

Half-vector based microfacet Fresnel, view-vector based diffuse Fresnel

$f(v, l) = (1-F(v \cdot n)) \, f_d(v, l) + F(v \cdot h) \, f_s(v, l)$

Is it reciprocal? No: if we swap $v$ and $l$, the result is different, $f(v, l) \neq f(l, v)$. Is it energy-conserving? Yes.

Since it is not reciprocal, swapping $v$ and $l$ gives a different BRDF, so we can also check its energy conservation. Unfortunately, the swapped version is not energy conserving anymore.

With a small modification, we can make it reciprocal, as shown in the next section, which also explains why it is energy conserving.

Half-vector based microfacet Fresnel, view and light-vector based diffuse Fresnel

$f(v, l) = (1-F(v \cdot n))(1-F(l \cdot n)) \, f_d(v, l) + F(v \cdot h) \, f_s(v, l)$

Is it reciprocal? Yes. Is it energy-conserving? Yes.

We know by construction of the half-vector $h$ that $(v \cdot h)$ is greater than at least one of $(v \cdot n)$ and $(l \cdot n)$. From this it follows that $(v \cdot h)$ is also greater than the product $(v \cdot n)(l \cdot n)$.

The following plots show this behavior for three different values of $\theta_v$ (the angle between $v$ and $n$). Either the orange or the green line, or both, are below the blue line, as is the red line.

As the Schlick Fresnel is monotonically increasing with $(1-\cos\theta)$, we can observe a similar behavior with the Fresnel values, just the other way around. The red line is now above the blue.

As a result, we can use $1-(1-F(v \cdot n))(1 - F(l \cdot n))$ as an upper bound for $F(v \cdot h)$. We can safely conclude that if

$f(v, l) = (1-F(v \cdot n))(1-F(l \cdot n)) \, f_d(v, l) + [1-(1-F(v \cdot n))(1-F(l \cdot n))] \, f_s(v, l)$

is energy conserving, then

$f(v, l) = (1-F(v \cdot n))(1-F(l \cdot n)) \, f_d(v, l) + F(v \cdot h) \, f_s(v, l)$

is so too.

Similar to before, it is possible to split the integral and pull $(1 - F(v \cdot n))$ as well as $F(v \cdot n)$ out of it, leaving us with a linear blend between two energy-conserving BRDFs.
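The Fresnel bound used above can itself be spot-checked numerically over random direction pairs (a sketch, assuming Schlick Fresnel with $F_0 = 0.04$):

```python
import numpy as np

rng = np.random.default_rng(3)

def schlick(c, f0=0.04):
    return f0 + (1.0 - f0) * (1.0 - c) ** 5

def hemisphere(n):
    # uniform directions on the upper hemisphere, normal n = (0, 0, 1)
    z, u = rng.random(n), rng.random(n)
    r, phi = np.sqrt(1.0 - z * z), 2.0 * np.pi * u
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)

v, l = hemisphere(100_000), hemisphere(100_000)
h = v + l
h /= np.linalg.norm(h, axis=-1, keepdims=True)
vh = np.sum(v * h, axis=-1)
# F(v.h) never exceeds 1 - (1 - F(v.n)) * (1 - F(l.n))
bound = 1.0 - (1.0 - schlick(v[:, 2])) * (1.0 - schlick(l[:, 2]))
print(np.all(schlick(vh) <= bound + 1e-9))  # True: the bound holds everywhere
```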

Unfortunately, although it is physically-based according to our definition, it has an issue. It loses a lot of energy at grazing angles if the microfacet roughness is high.

Other Approaches

There are a few approaches that are reciprocal and energy-conserving with only little energy loss. In the Enterprise PBR Material, we use the approach introduced by Kelemen and Szirmay-Kalos (2001), later refined by Kulla and Conty (2017). The downside is that it relies on low-resolution lookup tables with pre-integrated data, introducing error and complexity into the implementation. The following image taken from the Enterprise PBR specification shows the approach in the white furnace test. As you can see, this approach does the trick: the image is (almost) uniformly white.

Apart from that, there are more sophisticated layering techniques that simulate scattering effects within and between layers, like layerlab from Jakob et al. (2014). Although much more realistic, this approach is too slow for real-time rendering and not well-suited for textured materials.

Case Study: Node-based Materials and the View Vector

Node-based material models like Imageworks’ Open Shading Language (OSL) or NVIDIA’s Material Definition Language (MDL) give artists huge flexibility in defining the appearance of a surface. There is a small detail in which OSL and MDL differ that makes the latter physically-based: the way directional color effects are exposed in the shading language.

Let’s write the following material in OSL:

surface SimpleMaterial (
    color base = color(1, 0, 0),
    color edge = color(0, 1, 0),
    output closure color bsdf = 0)
{
    float c5 = pow(abs(dot(I, N)), 5);
    bsdf = diffuse(N) * (edge + (base - edge) * c5);
}

Do you see the problem? It’s the I. OSL gives you access to the incident ray direction in the code:

vector I: The incident ray direction, pointing from the viewing position to the shading position P.

The material we just wrote is not reciprocal, because the resulting BSDF contains a $(v \cdot n)$ term without a corresponding $(l \cdot n)$ term. In fact, it is impossible to fix our OSL material, because the outgoing ray direction O is not available. This makes sense: when the shader is evaluated, this direction is not yet known; it is about to be computed. OSL clearly aims at unidirectional path tracers, for which reciprocity is not a concern.
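The non-reciprocity is easy to see with concrete numbers. A small sketch mimicking the OSL material above (a hypothetical Python re-implementation, not OSL itself):

```python
import numpy as np

def tinted_diffuse(v, l, base=1.0, edge=0.5):
    # diffuse(N) scaled by a view-dependent tint, as in the OSL snippet;
    # with normal n = (0, 0, 1), dot(I, N) becomes the z-component of v.
    c5 = abs(v[2]) ** 5
    return (edge + (base - edge) * c5) / np.pi

v = np.array([0.0, 0.0, 1.0])                   # looking along the normal
l = np.array([np.sin(1.2), 0.0, np.cos(1.2)])   # grazing light direction
print(tinted_diffuse(v, l), tinted_diffuse(l, v))  # differ: not reciprocal
```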

MDL does not expose the view direction in the shading language. Instead, there are a few closures like directional_factor or measured_curve_factor that can modify the result of other closures based on view and light direction. Our material looks like this in MDL:

material SimpleMaterial(
    color base = color(1, 0, 0),
    color edge = color(0, 1, 0))
= material(
    surface: material_surface(
        scattering: df::directional_factor(
            normal_tint: base,
            grazing_tint: edge,
            base: df::diffuse_reflection_bsdf())));

It is more restrictive, but reciprocal. It gives the renderer the chance to use, for example, the half-vector to compute the tint factor. For this reason MDL is better suited for bidirectional path tracing and, therefore, the better choice if physically-based rendering and consistency are the goal.

Case Study: Refraction Textures

When it comes to refraction, a material describes the boundary surface between two media. In addition, many renderers also treat the volume enclosed by the surface as part of the material, see Schmidt and Budge (2002), and the volumes may be overlapping and nested. A priority attached to the material or mesh determines which volume is active at a certain position in space. This makes it easy for artists to create realistic nested refraction (ice cubes floating in ice tea inside a glass), but there is a limitation.

Nested dielectrics with spatially-varying index of refraction (IOR) defined on the surface by a 2D texture violate reciprocity. To understand that statement, let’s first have a look at a scene in which reciprocity is maintained: a box with refractive index $\eta$ inside air.

A ray is traced through the box from left to right, refracted into and out of the box, see image below. The stroke intensity indicates the index of refraction at the interface, varying between $\eta = 1.1$ and $\eta = 1.5$. Snell’s law ($\eta_1 \sin\theta_1 = \eta_2 \sin\theta_2$) tells us the angle with respect to the surface normal, i.e., the amount by which the ray is bent. When entering from the left (red), we bend the ray with the transition 1.0->1.1; when leaving, with 1.5->1.0. That might not be realistic, as the transition 1.1->1.5 is missing, but so far this is no problem for our simulation: the behavior is reciprocal. If we trace the ray in the opposite direction (green), from right to left, we have to reverse the computation of Snell’s law, but we end up with the same light path.

Now let’s put another box into our box. It has a constant IOR of 1.8 and a higher priority than the surrounding box. We now need to make use of IOR tracking: when we hit the inner box, we remember the IOR of the outer box and use it to compute the refraction angle. From left to right:

To check reciprocity let’s try the reverse path:

It’s different! When we enter the inner box from the right, we have to use the IOR we remember from entering the outer box (1.5), and that is different from the IOR we remembered before (1.1). The method is not reciprocal.
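The broken round trip can be reproduced with two applications of Snell’s law at the left face of the inner box (a sketch; the IOR values match the example above, the incidence angle is arbitrary):

```python
import numpy as np

def refract_sin(sin_theta, eta_from, eta_to):
    # Snell's law: eta_1 * sin(theta_1) = eta_2 * sin(theta_2)
    return sin_theta * eta_from / eta_to

sin_in = np.sin(np.radians(30))

# Forward: entering the inner box through its left face. The tracker
# remembered IOR 1.1 from where the ray entered the outer box on the left.
sin_inside = refract_sin(sin_in, 1.1, 1.8)

# Reverse: leaving the inner box through the same face. This time the
# tracker remembered IOR 1.5 from entering the outer box on the right.
sin_back = refract_sin(sin_inside, 1.8, 1.5)

print(sin_in, sin_back)  # differ: the reversed ray does not retrace the path
```

For a reciprocal method, the reversed refraction at the same surface point would have to use the same IOR pair, and `sin_back` would equal `sin_in` exactly.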

The problem can be solved with a volumetric (3D) texture that defines the IOR at each position in 3D space. Then we can do ray marching, and at each step bend the ray according to the IOR in the cell. The 3D data can either come from an actual volume texture, or there is a mapping that maps each 3D location onto a point on a 2D texture.

Closing Words

The concept of physically-based materials is intuitive, but many material models ignore one or the other property. Since a material model is usually closely coupled to a certain implementation in a specific renderer, bending the rules is acceptable and can be a good trade-off to simplify the implementation. As soon as we want to target a diverse set of renderers and rendering algorithms, from high-performance, approximative real-time rasterization to high-quality, unbiased bidirectional global illumination, we need to strictly follow the rules to achieve consistency.

The definition of physically-based materials via reciprocity and energy conservation is not very restrictive. It doesn’t limit the BSDF to a certain microfacet distribution, shadowing-masking term, or Fresnel term. It doesn’t mention valid ranges for the index of refraction, differentiate between dielectrics and conductors, or consider IOR stacking or multiple scattering. So why did I not include them in the definition? Because these things are important if we want to quickly build and configure realistic-looking materials from a small number of parameters. In other words: they are important if we build an analytical model that predicts measured data with as little error as possible. They are not important for consistency in unbiased rendering algorithms.

As mentioned in the beginning, the other key requirement for the Enterprise PBR Material is to make it easy to build realistic-looking materials. This is where the metallic-roughness workflow comes into play. The next part will show what “realistic-looking material” means, why “realistic” is not enough, and why the metallic-roughness workflow is a good choice.