Simulating camera measurements through wave-optics (with PyTorch support!) 📷

Eric Bezzam
9 min read · Jan 9, 2023

As part of recent work on privacy-enhancing classification with lensless cameras, we needed a large set of data to train and test how different types of cameras would perform on various classification tasks. Ideally, we would record all the data with each camera. However, this approach has two problems:

  1. The amount of time required to collect the data.
  2. We would have to re-collect all the data if the camera hardware changed (which was the case in our setup).

Therefore, we turned to simulation. In this post, we describe how to efficiently simulate wave-based optics to replicate an imaging system, and we provide Python code that is compatible with PyTorch 🔥

Photo by Matt Paul Catalano on Unsplash

When developing algorithms that try to make sense of their surroundings through sensors (cameras, microphones, temperature sensors, etc.), the availability of real-world data can be vital. However, certain constraints may make it impractical or even unethical to obtain such data. For example, it may be too time-consuming and cumbersome to manually record the data, or too invasive to place a camera or microphone in a private setting. To this end, simulation can provide an attractive alternative for obtaining large amounts of data quickly and cheaply. Moreover, simulation can serve as a useful sanity check before assembling a complicated physical setup.

For a recent work on privacy-enhancing classification with lensless cameras, we needed a large set of images for training and testing purposes. Moreover, this data had to be collected for multiple cameras, namely different mask-based lensless cameras, in which the lens is replaced with another type of “encoding” element: a mask. The image below shows the difference between a conventional lensed camera and a mask-based lensless camera. See this post for more on lensless imaging.

Conventional (top) vs. lensless (bottom) imaging. Image by author.

In our work, we compared different mask designs and, for one of the cameras, we “learned” the best mask for a given task. This “learning” amounted to adjusting the mask pattern, similar to updating the weights of a neural network. Consequently, at each update-step (i.e., after performing gradient descent to update the mask), our system would yield a completely different set of measurements due to a change in camera hardware (i.e., mask pattern), meaning we would have to re-collect our dataset. This would be highly impractical for tens of thousands of images! Simulations allow us to “re-collect” this new dataset numerically.

From scene to sensor ➡️

The first step in simulation is identifying an appropriate model. We assume that the scene is a 2D plane parallel to the sensor plane and at a distance d1 from our camera. We can model the scene-to-sensor relationship as a matrix-vector multiplication:

y = Hx,

where y is a vectorized version of the 2D sensor measurement, x is a vectorized version of the 2D scene, and H is a dense matrix to characterize the input-output relation, sometimes called the transmission matrix in optics literature. In a typical lensed camera, this H should be close to identity. For a lensless camera, there is a one-to-many mapping between the scene and the sensor, resulting in a highly multiplexed image (as shown in “lensless capture” in the image above).

Depending on the scene and sensor resolutions, H can be very large, namely 10⁶ x 10⁶ for a 1-megapixel (MP) scene and sensor resolution. To reduce the “effective” size of H, a common approach is to assume linear shift invariance (LSI), or an “infinite memory effect” in optics terminology.

Consequently, we can use a convolutional model to describe the relationship between the scene and the sensor:

y = p ∗ x,

where ∗ denotes a 2D convolution and p is a 2D filter or kernel; the rows of H in the matrix-vector formulation would be shifted versions of the vectorized p. This p has a few names in the literature: impulse response or point spread function (PSF). We will use the latter, which is more common in optics and imaging. For our 1 MP scene and sensor resolution example, our representation of H can be reduced from 10⁶ x 10⁶ parameters to just 10⁶, and even fewer for PSFs with compact support, e.g. that of a lens.
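As a quick illustration (with made-up sizes, not part of the actual pipeline), the LSI assumption means the matrix-vector product with H can be computed as an FFT-based 2D convolution with the PSF, so the full matrix never has to be stored:

import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
scene = rng.random((1000, 1000))  # 1 MP scene (illustrative values)
psf = rng.random((1000, 1000))    # PSF at the same resolution (illustrative values)

# Plays the role of multiplying the vectorized scene by H, but runs in
# O(N log N) and only requires the 10^6 PSF values instead of a 10^6 x 10^6 matrix.
measurement = fftconvolve(scene, psf, mode="same")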

Obtaining the PSF 💥

Now that we have a model, we need the different components and parameters involved, with the PSF being a key part of our simulation. There are two ways to obtain the PSF: (1) measuring it from the system itself or (2) simulating it. In a separate post, we show how to measure the PSF for our low-cost lensless camera. In the paper (Eq 4), we describe how to simulate the PSF of masks for which we have an analytic function M(x):

Eq 4 from https://arxiv.org/abs/2211.12864

The above (rather ugly!) equation can be described by three steps:

  1. Propagate spherical waves from the scene to the mask, according to the scene-to-mask distance d1.
  2. Multiply with the mask.
  3. Propagate from the mask to the sensor (separated by a distance d2) with free-space optical wave propagation, which can be done by convolving with a wavelength-dependent kernel H.
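Below is a minimal NumPy sketch of these three steps for a single wavelength and an on-axis point source. This is not the waveprop implementation: the grid size, pixel pitch, distances, and the random binary mask standing in for M(x) are all illustrative assumptions.

import numpy as np

wavelength = 532e-9     # [m], green light (assumed)
pitch = 4e-6            # [m], simulation grid pitch (assumed)
N = 512                 # grid size (assumed)
d1, d2 = 0.4, 4e-3      # scene-to-mask and mask-to-sensor distances [m] (assumed)

coords = (np.arange(N) - N // 2) * pitch
X, Y = np.meshgrid(coords, coords)
k = 2 * np.pi / wavelength

# 1) spherical wave from an on-axis point source, propagated over d1 to the mask
r = np.sqrt(X**2 + Y**2 + d1**2)
u = np.exp(1j * k * r) / r

# 2) multiply with the mask (random binary amplitude mask as a stand-in for M(x))
mask = (np.random.rand(N, N) > 0.5).astype(float)
u = u * mask

# 3) free-space propagation over d2 via the angular spectrum method,
#    i.e. convolution with a wavelength-dependent kernel in the frequency domain
fx = np.fft.fftfreq(N, d=pitch)
FX, FY = np.meshgrid(fx, fx)
arg = (1 / wavelength) ** 2 - FX**2 - FY**2
H_free = np.where(arg > 0, np.exp(1j * 2 * np.pi * d2 * np.sqrt(np.maximum(arg, 0))), 0)
u_sensor = np.fft.ifft2(np.fft.fft2(u) * H_free)

psf = np.abs(u_sensor) ** 2  # intensity PSF at this wavelength

The actual simulation in waveprop accounts for details this sketch glosses over, e.g. sampling requirements, off-axis source positions, and multiple wavelengths for RGB.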

Simulation 🤖

With a PSF in hand (whether by measurement or simulation) we are ready to simulate some data! Given a digital image (grayscale, RGB, or even hyperspectral), the following steps need to be performed.

  1. Prepare object plane: resize and pad the original image according to the physical dimensions of the setup and camera sensor.
  2. Convolve with PSF.
  3. (Optionally) downsample: perhaps you use a higher resolution PSF than your actual camera sensor.
  4. Add noise: e.g. shot noise to replicate noise at the sensor or Gaussian noise.
  5. Quantize according to the bit depth of the sensor.

Appendix B of our paper describes this process in more detail.
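As a rough picture of these steps, here is a hedged NumPy sketch for a grayscale image. It is not the waveprop/LenslessPiCam code; the resolutions, SNR, and bit depth are made-up values, and shot noise is replaced by a Gaussian stand-in.

import numpy as np
from scipy.signal import fftconvolve

def simulate(image, psf, pad_shape=(512, 512), sensor_shape=(256, 256), snr_db=40, bit_depth=12):
    """Illustrative scene-to-sensor simulation for a grayscale image (made-up sizes)."""
    # 1) prepare the object plane: place the (already resized) image on a black canvas
    obj = np.zeros(pad_shape)
    h, w = image.shape
    top, left = (pad_shape[0] - h) // 2, (pad_shape[1] - w) // 2
    obj[top:top + h, left:left + w] = image

    # 2) convolve with the PSF
    meas = fftconvolve(obj, psf, mode="same")

    # 3) downsample to the sensor resolution (simple strided subsampling here)
    fy, fx = pad_shape[0] // sensor_shape[0], pad_shape[1] // sensor_shape[1]
    meas = meas[::fy, ::fx]

    # 4) add noise at a target SNR (Gaussian stand-in for shot noise)
    noise_var = np.mean(meas ** 2) / (10 ** (snr_db / 10))
    meas = meas + np.random.normal(scale=np.sqrt(noise_var), size=meas.shape)

    # 5) quantize according to the sensor bit depth
    meas = np.clip(meas, 0, None)
    return np.round(meas / meas.max() * (2 ** bit_depth - 1))

In LenslessPiCam/waveprop, these steps (with proper physical resizing, shot noise, and RGB support) are handled by the FarFieldSimulator class described at the end of this post.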

Python implementation 🐍

With LenslessPiCam and waveprop, you can readily perform the above simulation with a PSF (measured 40cm away) of our low-cost lensless camera that uses the Raspberry Pi HQ sensor. You can use a different PSF by simply specifying a different file path in the configuration files.

First download and install the necessary libraries:

# developer install of LenslessPiCam
git clone git@github.com:LCAV/LenslessPiCam.git
cd LenslessPiCam
conda create --name lensless python=3.9
conda activate lensless
(lensless) pip install -e .

# extra dependencies for simulation and metrics
(lensless) pip install -r recon_requirements.txt

1) Simulating a single file

A single file can be simulated for the lensless camera (and reconstructed) with the following script:

# from LenslessPiCam
(lensless) python scripts/sim/single_file.py

It will by default use the configuration from this file, namely simulating an image of a “3” from the MNIST dataset with an object height of 12 cm.

Outputs from scripts/sim/single_file.py.

2) Simulating a folder of files

The following script (1) simulates an entire dataset of lensless measurements, (2) recovers an estimate for each image, and (3) quantifies the reconstruction quality. Note that you should first download a subset of the CelebA dataset (from here) and place it in the data folder.

# from LenslessPiCam
(lensless) python scripts/sim/dataset.py

It will by default use the configuration from this file. The simulation will be done in RGB, and a random object height and shift are applied.

Outputs from scripts/sim/dataset.py

3) Simulating a dataset with PyTorch compatibility

The following script creates a PyTorch Dataset and DataLoader out of the above mini CelebA dataset, such that the simulated data can be conveniently used in a machine learning pipeline with PyTorch to train and/or test over batches.

# from LenslessPiCam
(lensless) python scripts/sim/torch_custom_dataset.py

It will by default use the configuration from this file, which will try to use a GPU (if available) for convolving each batch with the PSF. We have noticed that using a GPU can be considerably faster; this is set through the device_conv parameter of SimulatedDatasetFolder.

Roughly 5x faster for the above example with a batch size of 4.
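For reference, what the script sets up is roughly the following. This is only a sketch: the module path of SimulatedDatasetFolder and some argument names are assumptions, and the parameters shown are the ones discussed in this post.

import torch
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
from waveprop.dataset_util import SimulatedDatasetFolder  # assumed module path
from lensless.io import load_psf

psf = ToTensor()(load_psf("data/psf/tape_rgb.png"))  # PSF path is illustrative

ds = SimulatedDatasetFolder(
    path="data/celeba_mini",    # folder of original images (illustrative path)
    image_ext="jpg",
    psf=psf,
    object_height=[0.25, 0.3],  # [m], random height within this range
    scene2mask=0.4,             # [m]
    mask2sensor=4e-3,           # [m]
    sensor="rpi_hq",
    snr_db=40,
    device_conv="cuda" if torch.cuda.is_available() else "cpu",
)
loader = DataLoader(ds, batch_size=4, shuffle=True)
for target, simulated in loader:  # unpacking order as in the indexing example at the end of this post
    pass  # train / evaluate on the simulated batch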

4) Simulating a built-in PyTorch dataset

As a final example, we show how to simulate an already available PyTorch Dataset. This can be done with the following script:

# from LenslessPiCam
(lensless) python scripts/sim/torch_dataset.py

It will by default use the configuration from this file, which simulates a subset of MNIST in RGB (and downloads the dataset if not available locally). Another dataset can be set in the configuration file or through the Terminal.

# from LenslessPiCam
(lensless) python scripts/sim/torch_dataset.py files.dataset=cifar10

At the moment, only MNIST, FashionMNIST, and CIFAR10 are supported. Other datasets can be used by loading them as usual and passing the dataset object to SimulatedPytorchDataset.
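For example, another torchvision dataset could be wrapped along these lines (a sketch only: the module path is an assumption, and the argument names follow the examples given at the end of this post):

from torchvision.datasets import CIFAR100
from torchvision.transforms import ToTensor
from waveprop.dataset_util import SimulatedPytorchDataset  # assumed module path
from lensless.io import load_psf

psf = ToTensor()(load_psf("data/psf/tape_rgb.png"))  # PSF path is illustrative

ds = CIFAR100(root="data", download=True, transform=ToTensor())
ds_prop = SimulatedPytorchDataset(
    dataset=ds,
    psf=psf,
    object_height=0.25,  # [m]
    scene2mask=0.4,      # [m]
    mask2sensor=4e-3,    # [m]
    sensor="rpi_hq",
    snr_db=40,
    target="label",      # return the CIFAR-100 label as the target
)
target, simulated_image = ds_prop[0]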

Conclusion

And that’s it! We hope you find the above description and code useful for simulating your own data. Whether for lensless imaging, other computational imaging tasks, or other types of propagation, this approach of using an impulse response is a very convenient and flexible technique for many applications 🚀

* Behind the scenes *

For those interested, here we take a deeper dive into the Python code.

The core class is FarFieldSimulator within the waveprop library (soon to be open-source on GitHub!). It is initialized with the following simulation parameters:

  • Point spread function (PSF).
  • Object height, or a range of values.
  • Scene-to-mask distance.
  • Mask-to-sensor distance.
  • Sensor name, defined in waveprop.devices (only “rpi_hq” as of Jan 2022).
  • (Optionally) output dimension to resize the image at the sensor.
  • (Optionally) signal-to-noise ratio (SNR) in dB for adding shot noise.

After creating an instance of FarFieldSimulator, the propagate method can be used to simulate a given digital image.

# FarFieldSimulator comes from the waveprop library (module path may differ
# depending on the waveprop version)
from waveprop.simulation import FarFieldSimulator

simulator = FarFieldSimulator(
    psf=psf,
    object_height=object_height,
    scene2mask=scene2mask,
    mask2sensor=mask2sensor,
    sensor=sensor,
    output_dim=output_dim,
    snr_db=snr_db,
    max_val=max_val,
)
simulated_image = simulator.propagate(original_image)

When calling propagate, the steps described under “Simulation 🤖” (above) are applied.

PyTorch compatibility

For use with PyTorch, the SimulatedDatasetFolder and SimulatedPytorchDataset classes in waveprop provide wrappers around FarFieldSimulator with the __getitem__ and __len__ methods expected by a PyTorch Dataset object.

Some features and a few things to keep in mind:

  • The PSF must have its height and width as the final dimensions, as is the convention for images in PyTorch. This reshaping (along with normalization to [0, 1]) can be conveniently done with torchvision's ToTensor transform.
from lensless.io import load_psf
from torchvision.transforms import ToTensor


psf = load_psf(psf_fp)
psf = ToTensor()(psf)
  • For SimulatedDatasetFolder, the dataset path and the image extension should be provided:
ds = SimulatedDatasetFolder(
    ...
    path=ds_path,
    image_ext="jpg",  # defaults to JPG
)
  • For SimulatedPytorchDataset, a PyTorch Dataset is provided as input:
ds_prop = SimulatedPytorchDataset(
    dataset=ds,
    ...
)
  • Random augmentations can be applied to both SimulatedDatasetFolder and SimulatedPytorchDataset: horizontal/vertical flipping, rotation, rescaling (height), and shifting.
ds = SimulatedDatasetFolder(  # or SimulatedPytorchDataset
    ...
    object_height=[0.1, 0.4],  # uniformly pick from this range
    random_vflip=0.5,   # probability
    random_hflip=0.5,   # probability
    random_rotate=90,   # random rotation within [-90, 90]
    random_shift=True,  # randomly shift while keeping the object fully visible
)
  • Indexing the new dataset returns a simulated image and a target, which you can decide to be (1) the original image, (2) the image at the object plane (the original image resized and padded to match the PSF shape and object height), or (3) the label of the original dataset (only for SimulatedPytorchDataset, e.g. if you want to return the MNIST digit or CIFAR label).
ds = SimulatedPytorchDataset(
    ...
    target="original",  # "original", "object_plane", or "label"
)

target, simulated_image = ds[0]

