Ray Tracer
Overview
This project implements a physically-based renderer using the well-known technique of ray tracing. More specifically, it uses path tracing to implement global illumination for more realistic lighting in 3D renderings. Furthermore, this ray tracer is optimized for speed and quality by using a Bounding Volume Hierarchy (BVH) as an acceleration structure and by using adaptive sampling as a noise reduction technique. Overall, it can produce quality renderings of 3D scenes in reasonable amounts of time. Some sample renderings are presented below, as well as explanations of the techniques and optimizations to make it all work.
Part 1: Ray Generation and Intersection
Ray tracing works by emulating the way light travels in the real world: photons originate at light sources and bounce off various objects until they reach your eyes. In practice, this is done in reverse, tracing the path light takes by starting from the eye and looking along some direction toward an object that reflects light from a light source. Thus, the ray tracer starts from a given camera position and view direction, generates a viewing ray for each pixel in the image, and figures out the pixel's color based on the object the ray hit and its position/orientation relative to the scene's lighting.
Let us first approach the ray generation. We start with an image that we want to create in screen space coordinates, e.g. 800 x 600 pixels from a particular viewpoint. Samples are generated for each pixel by casting rays from the camera (or our eyes) through that pixel on the image plane, giving us a parametric equation for the ray,

r(t) = o + td,

where o is the ray's origin, d is the direction the ray travels, and t is a parameter that varies to make r(t) represent any position along the ray.
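The ray-generation step can be sketched as follows. This is a minimal illustrative setup, not the project's actual camera code: it assumes a pinhole camera at the origin looking down -z, with hypothetical field-of-view parameters.

```python
import math

def generate_ray(px, py, width, height, hfov_deg=90.0, vfov_deg=60.0):
    """Generate a camera-space viewing ray through pixel (px, py).

    Assumes a pinhole camera at the origin looking down -z, with the
    image plane at z = -1 (a common convention; details vary by renderer).
    """
    # Map the pixel center to [0, 1]^2 image coordinates.
    u = (px + 0.5) / width
    v = (py + 0.5) / height
    # Map to the image-plane extents determined by the field of view.
    x = (2.0 * u - 1.0) * math.tan(math.radians(hfov_deg) / 2.0)
    y = (2.0 * v - 1.0) * math.tan(math.radians(vfov_deg) / 2.0)
    o = (0.0, 0.0, 0.0)   # ray origin: the camera position
    d = (x, y, -1.0)      # ray direction toward the image-plane pixel
    # Normalize d so that t measures distance along the ray.
    n = math.sqrt(d[0]**2 + d[1]**2 + d[2]**2)
    d = (d[0] / n, d[1] / n, d[2] / n)
    return o, d
```

A real renderer would also transform these camera-space rays into world space using the camera's position and orientation.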
Next, we want to figure out which object the ray hits. We'll explore intersections with two types of geometries: triangles and spheres.
We can figure out if the ray hits a triangle by calculating the barycentric coordinates of the intersection point with the plane containing the triangle. This can be done with the equation

o + td = (1 - u - v)p1 + u*p2 + v*p3,

where p1, p2, and p3 are the triangle's vertices and (1 - u - v), u, v are the barycentric coordinates. This equation can be rearranged to isolate the unknowns (t, u, v) for the intersection:

td + u(p1 - p2) + v(p1 - p3) = p1 - o,

or in matrix-vector form,

[d, (p1 - p2), (p1 - p3)] * [t; u; v] = [p1 - o].

This can be solved with basic linear algebra techniques, but a particularly efficient approach is the Möller–Trumbore intersection algorithm. If the system is solvable with a valid t > 0 and barycentric coordinates satisfying u >= 0, v >= 0, and (1 - u - v) >= 0, the intersection point lands inside the triangle, and we are done!
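The Möller–Trumbore algorithm can be sketched like this; it is an illustrative standalone version, not necessarily the project's implementation:

```python
def ray_triangle_intersect(o, d, p1, p2, p3, eps=1e-9):
    """Möller–Trumbore ray/triangle intersection.

    Returns (t, u, v) if the ray o + t*d hits the triangle (p1, p2, p3)
    with t > 0, otherwise None.
    """
    def sub(a, b):   return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
    def dot(a, b):   return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
    def cross(a, b): return (a[1]*b[2]-a[2]*b[1],
                             a[2]*b[0]-a[0]*b[2],
                             a[0]*b[1]-a[1]*b[0])

    e1, e2 = sub(p2, p1), sub(p3, p1)     # triangle edge vectors
    pvec = cross(d, e2)
    det = dot(e1, pvec)
    if abs(det) < eps:                    # ray parallel to triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(o, p1)
    u = dot(tvec, pvec) * inv_det         # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, e1)
    v = dot(d, qvec) * inv_det            # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, qvec) * inv_det           # distance along the ray
    return (t, u, v) if t > eps else None
```

The barycentric tests reject the intersection as soon as one coordinate falls outside [0, 1], which is what makes this formulation efficient.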
Now for spheres. To determine intersection here, we can substitute the ray position into the sphere equation and solve for t:

|o + td - c|^2 - R^2 = 0,

where c is the sphere center, R is the sphere radius, and o + td is a point on the sphere when the equation holds. Expanding this out, we get a quadratic equation in t (where "." denotes the dot product):

(d . d)t^2 + (2(o - c) . d)t + ((o - c) . (o - c) - R^2) = 0.

If we can solve for a valid t > 0, the ray intersects the sphere.
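The quadratic can be solved directly with the standard formula; here is one possible sketch:

```python
import math

def ray_sphere_intersect(o, d, c, R):
    """Solve the quadratic |o + t*d - c|^2 = R^2 for the nearest t > 0.

    Returns the smallest positive root, or None if the ray misses.
    """
    oc = (o[0]-c[0], o[1]-c[1], o[2]-c[2])               # o - c
    a = d[0]**2 + d[1]**2 + d[2]**2                      # d . d
    b = 2.0 * (oc[0]*d[0] + oc[1]*d[1] + oc[2]*d[2])     # 2 (o - c) . d
    cc = oc[0]**2 + oc[1]**2 + oc[2]**2 - R*R            # (o-c).(o-c) - R^2
    disc = b*b - 4.0*a*cc
    if disc < 0.0:
        return None                                      # no real roots: miss
    s = math.sqrt(disc)
    # Prefer the nearer root; fall back to the far one if we start inside.
    for t in ((-b - s) / (2.0*a), (-b + s) / (2.0*a)):
        if t > 0.0:
            return t
    return None
```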
Once we have the object intersection information, we can do a lighting and color calculation for each pixel to accurately represent what our eyes should see. For now, we can use a debug color based on the surface normal at the intersection to verify that our implementations of ray generation and object intersection work as expected.
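One common way to turn a unit surface normal into such a debug color is to remap its components from [-1, 1] to [0, 1]:

```python
def normal_debug_color(n):
    """Map a unit surface normal in [-1, 1]^3 to an RGB color in [0, 1]^3.

    A common debug-shading trick: (n + 1) / 2 turns each normal
    component into a visible color channel.
    """
    return tuple((n[k] + 1.0) * 0.5 for k in range(3))
```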
Part 2: Bounding Volume Hierarchy
Though we now have the methods to compute ray-object intersections, it would still take O(n) time to loop through all n objects and test them against the ray. This can be improved significantly, to roughly O(log n) time per ray, which in practice can mean a speedup of thousands of times, depending on the scene. To do this, we can use what are typically called ray tracing acceleration structures: spatial data structures that help optimize ray traversal performance.
Here, we will use a bounding volume hierarchy (BVH), which is a tree structure that partitions all the shapes in the scene into disjoint groups of objects, whose bounding boxes may overlap. The ray then only needs to traverse the subset of objects within the bounding boxes that it intersects.
There are several heuristics one can choose for the actual partitioning and construction of the BVH. The approach taken here is to split a parent bounding box into two children at the midpoint along the box's longest side. This tends to partition space more evenly, so that the ray can skip the traversal of more objects on average.
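The midpoint-split construction described above might look like the following sketch, assuming each object exposes hypothetical `centroid` (a 3-tuple) and `bbox` ((min, max) corner pair) attributes:

```python
class BVHNode:
    """A BVH node: a bounding box plus either two children or a leaf list."""
    def __init__(self, bbox, left=None, right=None, objects=None):
        self.bbox, self.left, self.right, self.objects = bbox, left, right, objects

def build_bvh(objects, leaf_size=4):
    """Build a BVH by splitting at the midpoint of the box's longest axis."""
    # Compute the bounding box of all objects in this node.
    lo = tuple(min(ob.bbox[0][k] for ob in objects) for k in range(3))
    hi = tuple(max(ob.bbox[1][k] for ob in objects) for k in range(3))
    if len(objects) <= leaf_size:
        return BVHNode((lo, hi), objects=objects)
    # Pick the longest axis of the bounding box and split at its midpoint.
    axis = max(range(3), key=lambda k: hi[k] - lo[k])
    mid = 0.5 * (lo[axis] + hi[axis])
    left = [ob for ob in objects if ob.centroid[axis] < mid]
    right = [ob for ob in objects if ob.centroid[axis] >= mid]
    if not left or not right:            # degenerate split: fall back to halves
        half = len(objects) // 2
        left, right = objects[:half], objects[half:]
    return BVHNode((lo, hi),
                   left=build_bvh(left, leaf_size),
                   right=build_bvh(right, leaf_size))
```

During traversal, a ray first tests a node's bounding box and only descends into children whose boxes it actually hits, which is where the logarithmic behavior comes from.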
The time to ray trace a 3D scene with 1 ray (sample) per pixel is measured in the table below, using both the naive approach (testing every object against the ray) and the BVH-accelerated approach. As we can see, the BVH helps render scenes of even hundreds of thousands of triangles within a fraction of a second. Without the BVH, the same scenes can take several minutes for the same result. The last measurement in the table below is already a nearly 10,000x speedup! This difference would be felt even more as we increase the size, quality, and complexity of the scene (i.e. resolution, samples per pixel, and number of triangles).
Scene | Number of Triangles | Naive Approach (seconds) | BVH Accelerated (seconds) |
---|---|---|---|
Teapot | 2,464 | 9.5083 | 0.0417 |
Cow | 5,856 | 20.0684 | 0.0480 |
Bunny | 33,696 | 99.6719 | 0.0415 |
Bench | 67,808 | 130.8857 | 0.0198 |
Dragon | 105,120 | 328.4802 | 0.0478 |
Wall-E | 240,326 | (over 10 minutes) | 0.0705 |
Here are a few of the moderately complex scenes that can now be ray traced with BVH acceleration, using simple surface normal (debug) coloring and 1 ray sample per pixel. Without using an acceleration structure, we would not be able to ray trace some 3D scenes with decent quality in a reasonable amount of time.
Part 3: Direct and Global Illumination
To complete the rendering, we must calculate the light reflected off the surface in the direction of the viewing ray. This depends on the properties of the material and its position/orientation relative to light sources. We can also make a distinction between direct illumination, which arrives straight from light sources, and indirect illumination, which accounts for light bouncing off other surfaces. We can perform all of these calculations to make a physically-based renderer by implementing a method called path tracing.
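The recursive structure of a diffuse path tracer can be sketched as below. The scene interface (a `scene_intersect` callback returning a hit dict) and the uniform hemisphere sampler are assumptions for illustration only; a real renderer would add cosine-weighted importance sampling, direct light sampling, and Russian roulette termination.

```python
import math
import random

def sample_hemisphere(n):
    """Sample a uniform random direction on the hemisphere around normal n."""
    while True:
        d = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
        length = math.sqrt(sum(x * x for x in d))
        if 0.0 < length <= 1.0:                     # rejection-sample the ball
            d = tuple(x / length for x in d)
            cos = sum(d[k] * n[k] for k in range(3))
            return d if cos > 0.0 else tuple(-x for x in d)

def trace(origin, direction, scene_intersect, depth=0, max_depth=3):
    """Minimal diffuse path-tracing sketch.

    `scene_intersect(origin, direction)` is an assumed callback returning
    None on a miss, or a dict with 'point', 'normal', 'emission', 'albedo'.
    """
    hit = scene_intersect(origin, direction)
    if hit is None or depth >= max_depth:
        return (0.0, 0.0, 0.0)                      # black background
    # Indirect term: bounce a random ray and recurse (global illumination).
    bounce_dir = sample_hemisphere(hit['normal'])
    indirect = trace(hit['point'], bounce_dir, scene_intersect,
                     depth + 1, max_depth)
    # Emitted light plus reflected incoming light.
    return tuple(hit['emission'][k] + hit['albedo'][k] * indirect[k]
                 for k in range(3))
```

Averaging many such traced samples per pixel converges toward the physically correct result, which is why the renderings below need many samples per pixel.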
Here are some renderings of 3D scenes path traced for global illumination! This is the Stanford Bunny in the Cornell Box.
Here is another example.