
1. Introduction
Computer graphics primitives are represented by mathematical functions → MLPs used as neural graphics primitives, e.g. NeRF, Neural Sparse Voxel Fields, DeepSDF, ACORN
- The inputs of the neural network need to be encoded (mapped) into higher dimensions to extract high approximation quality from compact models
- [-] Prior encodings rely on heuristics and structural modifications that complicate training → task-specific, limit GPU performance
- Multiresolution hash encoding
- Adaptivity
- Mapping a cascade of grids to fixed-size arrays of feature vectors → no structural updates needed
- coarse resolution → 1:1 mapping
- fine resolution → spatial hash function automatically prioritizing sparse areas with the most important fine detail (see the hash sketch after this list)
- Efficiency
- $O(1)$ hash table lookup
- No pointer-chasing
- Independent of task
- Gigapixel image, Neural SDF, NRC, NeRF
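
A minimal sketch of the per-level spatial hash, assuming the XOR-with-large-primes hash described in the paper; the table size and the test coordinates are placeholders chosen for illustration:

```python
import numpy as np

# Spatial hash: XOR the integer grid coordinates multiplied by large primes,
# then wrap into a table with T entries. Coarse levels have fewer grid
# vertices than table entries (effectively 1:1); fine levels collide, and
# training resolves collisions in favor of the sparse, detailed regions.
PRIMES = (1, 2_654_435_761, 805_459_861)

def spatial_hash(grid_coords, table_size):
    """Map integer grid coordinates of shape (..., d) to indices in [0, table_size)."""
    h = np.zeros(grid_coords.shape[:-1], dtype=np.uint64)
    for dim in range(grid_coords.shape[-1]):
        h ^= grid_coords[..., dim].astype(np.uint64) * np.uint64(PRIMES[dim])
    return h % np.uint64(table_size)

corners = np.array([[0, 0, 0], [127, 64, 3]])
print(spatial_hash(corners, table_size=2**14))  # hypothetical table size T
```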
2. Background and Related work
Frequency Encodings
- Transformers introduced encoding scalar positions as a multiresolution sequence of $L \in \mathbb{N}$ sine and cosine functions
$$
\mathrm{enc}(x) = \Bigl( \sin(2^0 x), \sin(2^1 x), \dots, \sin(2^{L-1} x),\; \cos(2^0 x), \cos(2^1 x), \dots, \cos(2^{L-1} x) \Bigr)
$$
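
A minimal NumPy sketch of this frequency encoding, following the formula above (the choice of $L$ and the sample inputs are just illustrative):

```python
import numpy as np

# Frequency encoding: map each scalar to L sine and L cosine features at
# geometrically increasing frequencies 2^0, 2^1, ..., 2^(L-1).
def frequency_encoding(x, L=4):
    """x: array of scalars. Returns features of shape (..., 2*L)."""
    freqs = 2.0 ** np.arange(L)            # 2^0, ..., 2^(L-1)
    angles = x[..., None] * freqs          # broadcast over frequencies
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

x = np.array([0.25, 0.5])
print(frequency_encoding(x, L=4).shape)    # (2, 8)
```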
Parametric Encodings
- Arrange additional trainable parameters in an auxiliary data structure, e.g. a grid or tree, and look up and interpolate these parameters depending on the input vector (see the grid-lookup sketch below)
- Larger memory footprint for a smaller computational cost
- For each gradient step → every parameter in the MLP needs to be updated, but for the trainable input encoding, only a small number of parameters are affected
- By reducing the size of the MLP, such parametric models can typically be trained to convergence much faster without sacrificing approximation quality
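
A hedged sketch of the dense-grid flavour of a parametric encoding, to make the gradient-sparsity point concrete: only the grid corners surrounding the query receive gradients, while the small MLP that consumes the interpolated feature is updated at every step (grid resolution and feature dimension are made-up illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
GRID_RES, FEAT_DIM = 64, 2                 # illustrative sizes, not from the paper
grid = rng.normal(size=(GRID_RES, GRID_RES, FEAT_DIM))  # trainable parameters

def lookup(x):
    """x: 2D point in [0, 1)^2 -> bilinearly interpolated feature of length FEAT_DIM."""
    pos = x * (GRID_RES - 1)
    i0 = np.floor(pos).astype(int)
    i1 = np.minimum(i0 + 1, GRID_RES - 1)
    w = pos - i0                           # interpolation weights in [0, 1)
    f00, f10 = grid[i0[0], i0[1]], grid[i1[0], i0[1]]
    f01, f11 = grid[i0[0], i1[1]], grid[i1[0], i1[1]]
    return ((1 - w[0]) * (1 - w[1]) * f00 + w[0] * (1 - w[1]) * f10
            + (1 - w[0]) * w[1] * f01 + w[0] * w[1] * f11)

print(lookup(np.array([0.3, 0.7])))
```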
Coordinate Encoder
- A large auxiliary coordinate encoder neural network (ACORN) is trained to output dense feature grids around $\mathbf{x}$
Sparse Parametric Encodings
- Dense grids → consume too much memory, fixed resolution
- Allocate too much memory to empty space
- Natural scenes exhibit smoothness → motivates a multiresolution decomposition (see the comparison sketch below)
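
A rough back-of-the-envelope comparison of parameter counts, dense grid vs. hash table; the feature dimension, resolutions, and table size $T = 2^{19}$ are illustrative values, not taken from these notes:

```python
# Dense 3D feature grids grow cubically with resolution, even over empty space;
# a hash table caps the parameter count per level at T entries regardless of
# resolution (illustrative numbers only).
FEAT_DIM = 2
T = 2**19                                  # placeholder table size
for res in (128, 512, 1024):
    dense_params = res**3 * FEAT_DIM
    hashed_params = min(res**3, T) * FEAT_DIM
    print(f"res={res:4d}  dense={dense_params:>14,}  hashed={hashed_params:>11,}")
```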