I have a few questions about the inference code.

What exactly do the parameters image, masked_image and masked_latents mean?
Why are image and masked_image both set to latent (the representation of the masked input test sample in latent space)?
Why is masked_latents set to None?
How should these inputs be set during the training phase?