Local Light Field Fusion (LLFF) is a practical and robust deep learning solution for capturing and rendering novel views of complex real-world scenes for virtual exploration. The dataset consists of both renderings and real images of natural scenes. The synthetic images are rendered from SUNCG and UnrealCV: SUNCG contains 45,000 simplistic house and room environments with texture-mapped surfaces and low geometric complexity, while UnrealCV contains a few large-scale environments modeled and rendered with extreme detail. The real images are 24 scenes captured with a handheld cellphone.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
BilaRF Dataset
Project Page | Arxiv | Code
This dataset contains our own captured nighttime scenes, synthetic data generated from the RawNeRF dataset, and editing samples. To use the data, please go to 'Files and versions' and download 'bilarf_data.zip'. The source images with EXIF metadata are available for download at this Google Drive link. The dataset follows the file structure of NeRF LLFF data (forward-facing scenes). In addition, editing samples are stored in the 'edits/'… See the full description on the dataset page: https://huggingface.co/datasets/Yuehao/bilarf_data.
The shiny folder contains 8 scenes with challenging view-dependent effects used in our paper. We also provide additional scenes in the shiny_extended folder. The test images for each scene used in our paper are every eighth image in alphabetical order.
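A minimal sketch of that split convention (assuming sorted filenames in each scene's images/ directory and that the held-out views start from the first image; the function and paths here are illustrative, not part of the released code):

```python
import os

def split_scene(image_dir, step=8):
    """Hold out every `step`-th image (alphabetical order) as a test view."""
    names = sorted(os.listdir(image_dir))
    test = names[::step]                              # assumed offset: indices 0, 8, 16, ...
    train = [n for n in names if n not in set(test)]  # remaining images are training views
    return train, test

train_views, test_views = split_scene("scene/images")
```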
Each scene contains the following directory structure:

scene/
    dense/
        cameras.bin
        images.bin
        points3D.bin
        project.ini
    images/
        image_name1.png
        image_name2.png
        ...
        image_nameN.png
    images_distort/
        image_name1.png
        image_name2.png
        ...
        image_nameN.png
    sparse/
        cameras.bin
        images.bin
        points3D.bin
        project.ini
    database.db
    hwf_cxcy.npy
    planes.txt
    poses_bounds.npy
dense/ folder contains COLMAP's output [1] after the input images are undistorted. images/ folder contains undistorted images. (We use these images in our experiments.) images_distort/ folder contains raw images taken from a smartphone. sparse/ folder contains COLMAP's sparse reconstruction output [1].
Our poses_bounds.npy is similar to the LLFF [2] file format with a slight modification. This file stores an Nx14 numpy array, where N is the number of cameras. Each row is split into two parts of sizes 12 and 2. The first part, reshaped into 3x4, is the camera extrinsic (camera-to-world transformation); the second part stores the distances from that viewpoint to the first and last planes (near, far). These distances are computed automatically from the scene's statistics using LLFF's code. (For details on how they are computed, see this code.)
hwf_cxcy.npy stores the camera intrinsics (height, width, focal length, principal point x, principal point y) in a 1x5 numpy array.
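A minimal parsing sketch for these two files (assuming numpy and the layout described above; the file paths and variable names are illustrative, not part of the dataset's tooling):

```python
import numpy as np

# Per-camera poses and depth bounds: one row of 14 values per camera.
poses_bounds = np.load("scene/poses_bounds.npy")    # shape (N, 14)
poses = poses_bounds[:, :12].reshape(-1, 3, 4)      # camera-to-world extrinsics, (N, 3, 4)
near_far = poses_bounds[:, 12:]                     # per-view near/far plane distances, (N, 2)

# Shared intrinsics: height, width, focal length, principal point (cx, cy).
h, w, focal, cx, cy = np.load("scene/hwf_cxcy.npy").reshape(-1)[:5]
```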
planes.txt stores information about the MPI planes. The first two numbers are the distances from a reference camera to the first and last planes (near, far). The third number indicates whether the planes are placed equidistantly in depth space (0) or in inverse depth space (1). The last number is the padding size in pixels on all four sides of each MPI plane; i.e., the total dimension of each plane is (H + 2 * padding, W + 2 * padding).
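As an illustration of how that placement flag could be interpreted (a sketch under the assumptions above; the parsing and spacing logic here is ours, not code shipped with the dataset):

```python
import numpy as np

def mpi_plane_depths(planes_txt_path, num_planes):
    """Place `num_planes` MPI planes between near and far, either
    equidistantly in depth (flag 0) or in inverse depth (flag 1)."""
    near, far, invdepth_flag, padding = np.loadtxt(planes_txt_path)[:4]
    if int(invdepth_flag) == 0:
        # uniform spacing in depth
        depths = np.linspace(near, far, num_planes)
    else:
        # uniform spacing in inverse depth (disparity)
        depths = 1.0 / np.linspace(1.0 / near, 1.0 / far, num_planes)
    return depths, int(padding)
```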
References:
[1] COLMAP structure-from-motion (Schönberger and Frahm, 2016).
[2] Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines (Mildenhall et al., 2019).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Quantitative comparison on LLFF. Our proposed method outperforms other methods on real-world forward-facing scenes; "ft" indicates results fine-tuned on each scene individually.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
LLFF
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Ablation studies. We perform ablation studies on LLFF with 3 input views, where DPT (V2) denotes more advanced depth priors, DOSL denotes the dynamic optimal sampling layer, and PII denotes per-layer inputs incorporation.
Mip-NeRF 360 is an extension of Mip-NeRF that uses a non-linear scene parameterization, online distillation, and a novel distortion-based regularizer to overcome the challenges of unbounded scenes. The dataset consists of 9 scenes (5 outdoor and 4 indoor), each containing a complex central object or area with a detailed background.
The dataset used in the paper is not explicitly named, but the authors state that they used real scenes as well as scenes from IN2N [9] and LLFF [29].
Neural Radiance Fields (NeRF) is a method for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. The dataset contains three parts: the first two are synthetic renderings of objects, called Diffuse Synthetic 360° and Realistic Synthetic 360°, while the third consists of real images of complex scenes. Diffuse Synthetic 360° consists of four Lambertian objects with simple geometry; each object is rendered at 512x512 pixels from viewpoints sampled on the upper hemisphere. Realistic Synthetic 360° consists of eight objects with complicated geometry and realistic non-Lambertian materials; six are rendered from viewpoints sampled on the upper hemisphere and the remaining two from viewpoints sampled on a full sphere, all at 800x800 pixels. The real images of complex scenes consist of 8 forward-facing scenes captured with a cellphone at 1008x756 pixels.