Background

In the graphics pipeline, a transformation maps a 3D location $x$ in canonical coordinates (world coordinates) to image space. It consists of three steps:

Camera Transformation
Projection Transformation
Viewport Transformation

Camera Transformation (world space ⇒ camera space)

"A camera transformation is a rigid body transformation that places the camera at the origin in a convenient orientation. It depends only on the position and orientation, or pose, of the camera."

Projection Transformation (camera space ⇒ NDC)

A projection transformation maps points in camera space to a $[-1,1]^3$ cube centered at the origin. Such a cube is called the canonical view volume, and its coordinates are the normalized device coordinates (NDC).

Viewport Transformation (NDC ⇒ screen space)

A viewport transformation "flattens" the $[-1,1]^3$ NDC cube and maps the resulting $2 \times 2$ square to a raster image that is $H$ pixels high and $W$ pixels wide.
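
The three steps above can be chained into a single world-to-pixel mapping. The following is a minimal NumPy sketch, not a definitive implementation. It assumes an OpenGL-style convention (the camera looks down the $-z$ axis, with a symmetric frustum described by near `n`, far `f`, right `r`, and top `t`) and a viewport whose pixel origin is the top-left corner. All function names and the example pose are hypothetical.

```python
import numpy as np

def world_to_camera(R_c2w, t_c2w):
    """4x4 rigid-body matrix that moves the camera to the origin.

    R_c2w (3x3): camera axes as columns, in world coordinates (assumed input).
    t_c2w (3,):  camera position in world coordinates (assumed input).
    """
    M = np.eye(4)
    M[:3, :3] = R_c2w.T              # inverse of a rotation is its transpose
    M[:3, 3] = -R_c2w.T @ t_c2w      # sends the camera position to the origin
    return M

def perspective(n, f, r, t):
    """OpenGL-style projection for a symmetric frustum (assumed convention)."""
    return np.array([
        [n / r, 0.0,   0.0,                0.0],
        [0.0,   n / t, 0.0,                0.0],
        [0.0,   0.0,   -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0,   0.0,   -1.0,               0.0],
    ])

def viewport(x_ndc, W, H):
    """Drop z and map NDC (x, y) in [-1, 1]^2 to continuous pixel coordinates."""
    u = (x_ndc[0] + 1.0) * 0.5 * W   # x = -1 is the left edge, x = +1 the right edge
    v = (1.0 - x_ndc[1]) * 0.5 * H   # y = +1 is the top row, y = -1 the bottom row
    return np.array([u, v])

def world_to_pixel(x_world, R_c2w, t_c2w, n, f, r, t, W, H):
    """Camera transformation, then projection (with divide by w), then viewport."""
    x_cam = world_to_camera(R_c2w, t_c2w) @ np.append(x_world, 1.0)
    clip = perspective(n, f, r, t) @ x_cam
    x_ndc = clip[:3] / clip[3]       # perspective divide yields NDC
    return viewport(x_ndc, W, H)

# Hypothetical setup: camera at (0, 0, 5) with identity rotation,
# so it looks down -z toward the world origin.
R = np.eye(3)
t_cam = np.array([0.0, 0.0, 5.0])
pixel = world_to_pixel(np.zeros(3), R, t_cam,
                       n=1.0, f=10.0, r=1.0, t=1.0, W=640, H=480)
```

With this setup the world origin sits on the optical axis, so `pixel` lands at the image center. Pixel-center conventions (for example a half-pixel offset) differ between systems and are left out here.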

🤔 So how does NeRF make use of these transformations?

✅ NeRF's MLP takes the **camera position in world coordinates (camera position w.r.t. the world coordinate frame)** as input, so the camera transformation is not used. In addition, the image is not rendered from a 3D NDC volume (it is produced by querying the MLP), so the viewport transformation is not used either.
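
To make the point above concrete: rather than moving the scene into camera space, NeRF-style pipelines typically cast one ray per pixel and express it directly in world coordinates using the camera-to-world pose. The sketch below follows the convention commonly seen in NeRF code (pinhole camera looking down $-z$), but it is only an illustration. The names `get_rays`, `focal` (focal length in pixels), and `c2w` (camera-to-world matrix) are assumptions, as is the example pose.

```python
import numpy as np

def get_rays(H, W, focal, c2w):
    """Return per-pixel ray origins and directions in world coordinates.

    c2w is a 3x4 or 4x4 camera-to-world matrix (assumed input); its upper-left
    3x3 block is the rotation and its last column is the camera position.
    """
    # Pixel grid: i runs over columns (x), j over rows (y); both have shape (H, W).
    i, j = np.meshgrid(np.arange(W, dtype=np.float64),
                       np.arange(H, dtype=np.float64), indexing="xy")
    # Ray directions in camera space for a pinhole camera looking down -z.
    dirs = np.stack([(i - 0.5 * W) / focal,
                     -(j - 0.5 * H) / focal,
                     -np.ones_like(i)], axis=-1)          # (H, W, 3)
    # Rotate directions into world space; every ray starts at the camera position.
    rays_d = dirs @ c2w[:3, :3].T
    rays_o = np.broadcast_to(c2w[:3, 3], rays_d.shape)
    return rays_o, rays_d

# Hypothetical pose: identity rotation, camera positioned at (1, 2, 3).
c2w = np.eye(4)
c2w[:3, 3] = [1.0, 2.0, 3.0]
rays_o, rays_d = get_rays(H=4, W=4, focal=2.0, c2w=c2w)
```

Sample points along each ray, `rays_o + s * rays_d` for a depth parameter `s`, are then already in world coordinates when they are passed to the MLP.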

🧐 Then how is the projection transformation carried out in NeRF? Let's take a look.

Projection Transformation in NeRF

🗒️Camera Coordinates