Incorporating Your Own Images

A common strategy for making training data reflective of the data collected by real astronomical surveys is to utilize real images in your training data. Two specific strategies might be

In both situations, you can use deeplenstronomy's BACKGROUNDS. In the configuration file, you specify the use of background images with an entry at the same level as the DATASET, COSMOLOGY, GEOMETRY, etc. keys:

BACKGROUNDS: 
    PATH: <path/to/image_directory_name> #(no trailing '/')
    CONFIGURATIONS: <configuration list, e.g. ['CONFIGURATION_1'] or ['CONFIGURATION_1', 'CONFIGURATION_3']>

Image directory structure

The image directory should have a structure that looks like this:

image_directory_name
├── map.txt
├── g.fits
├── i.fits
├── r.fits
└── z.fits

You will need a FITS file named <band>.fits for each band you choose to simulate, plus an optional map.txt file for incorporating the properties of the objects in your images into your simulations.

FITS file layout

Each FITS file should contain all the images of a single band, stacked in a single array, and the index of each image in the array should line up across all the FITS files. For example, if you are using 20 cutouts of 100 by 100 pixels each, then the following code:

from astropy.io import fits

# Open the g-band stack and inspect the array in the primary HDU
with fits.open('g.fits') as hdul:
    print(hdul[0].data.shape)

should print (20, 100, 100).
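Before writing the per-band files, it can help to confirm that every band's stack has the same shape. The following is a minimal sketch using NumPy only. The helper `check_band_stacks` is hypothetical and not part of deeplenstronomy, and the zero-filled arrays stand in for the arrays you would read from each FITS file.

```python
import numpy as np


def check_band_stacks(stacks):
    """Return the common (n_images, height, width) shape of all band stacks.

    `stacks` maps a band name to a 3D array. This helper is hypothetical
    and is not part of deeplenstronomy.
    """
    shapes = {band: np.asarray(data).shape for band, data in stacks.items()}
    for band, shape in shapes.items():
        if len(shape) != 3:
            raise ValueError(f"{band}: expected a 3D stack, got shape {shape}")
    if len(set(shapes.values())) != 1:
        raise ValueError(f"band stacks disagree in shape: {shapes}")
    return next(iter(shapes.values()))


# Stand-in data: 20 cutouts of 100 x 100 pixels in each of four bands.
stacks = {band: np.zeros((20, 100, 100)) for band in "griz"}
common_shape = check_band_stacks(stacks)
print(common_shape)
```

A matching shape does not guarantee that image N is the same object in every band, so the ordering still has to be kept consistent when the stacks are built.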

Automatic resizing

If your supplied images are bigger than the value of IMAGE.numPix specified in the configuration file, deeplenstronomy will automatically crop them; if they are smaller, it will pad them with zeros. In either case, the images are modified equally on each side, so the central pixels of the original remain at the center of the resized image.
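To make the centering behavior concrete, here is a minimal NumPy sketch of a symmetric crop or zero-pad. The function `center_resize` is a hypothetical illustration, not deeplenstronomy's own code, and the library may handle odd size differences differently than this sketch does.

```python
import numpy as np


def center_resize(image, num_pix):
    """Crop or zero-pad a 2D image to (num_pix, num_pix) about its center.

    Illustrative only; not deeplenstronomy's implementation.
    """
    out = np.zeros((num_pix, num_pix), dtype=image.dtype)
    src, dst = [], []
    for size in image.shape:
        if size >= num_pix:
            # Crop: drop the excess pixels, split across both sides.
            start = (size - num_pix) // 2
            src.append(slice(start, start + num_pix))
            dst.append(slice(0, num_pix))
        else:
            # Pad: place the image in the middle of a zero-filled frame.
            start = (num_pix - size) // 2
            src.append(slice(0, size))
            dst.append(slice(start, start + size))
    out[tuple(dst)] = image[tuple(src)]
    return out


image = np.arange(36).reshape(6, 6)
cropped = center_resize(image, 4)   # keeps the central 4 x 4 pixels
padded = center_resize(image, 10)   # surrounds the image with zeros
print(cropped.shape, padded.shape)
```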

The optional map.txt file

If there are properties of the image that you would like to incorporate into the simulations, the map.txt file is what you need. If you do not provide a map.txt file, then your images are used at random in your simulation.

The map.txt file should have one row per image, plus a header row. In the example above, the map file would need a header and 20 rows. The whitespace-delimited columns are deeplenstronomy parameters, such as 'PLANE_1-OBJECT_1-REDSHIFT_g'. Each row should hold the properties of the galaxy in the corresponding real image.

For example, a map.txt file that looks like this:

exposure_time_g exposure_time_r exposure_time_i exposure_time_z
120 90 90 30
60 60 60 60
... 18 more rows

would tell deeplenstronomy the exposure time used in each band when collecting each real image. This way, when you put your real images into your simulations, you will be forcing the simulations to have the same exposure time distribution as your images.
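A map file in this layout can be written with the standard library alone. The sketch below is illustrative: `write_map` and `count_map_rows` are hypothetical helpers, the column names are copied from the example above, and only the two rows shown there are written.

```python
import os
import tempfile

# Column names and values copied from the example above; use your own data.
columns = ["exposure_time_g", "exposure_time_r", "exposure_time_i", "exposure_time_z"]
rows = [
    [120, 90, 90, 30],
    [60, 60, 60, 60],
]


def write_map(path, columns, rows):
    """Write a whitespace-delimited file: one header row, then one row per image."""
    with open(path, "w") as stream:
        stream.write(" ".join(columns) + "\n")
        for row in rows:
            if len(row) != len(columns):
                raise ValueError("each row needs one value per column")
            stream.write(" ".join(str(value) for value in row) + "\n")


def count_map_rows(path):
    """Return the number of data rows (non-empty lines, excluding the header)."""
    with open(path) as stream:
        lines = [line for line in stream if line.strip()]
    return len(lines) - 1


# Written to a temporary directory here; in practice the file belongs
# in your image directory next to the <band>.fits files.
map_path = os.path.join(tempfile.mkdtemp(), "map.txt")
write_map(map_path, columns, rows)
n_rows = count_map_rows(map_path)
print(n_rows)
```

Comparing `n_rows` with the first axis of the stacked FITS arrays is a quick way to catch a map file that has drifted out of step with the images.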

Example

For a full walkthrough of this feature, check out the "Using Real Lens Galaxies in Your Simulations" notebook.

Appendix - The DES Cutout Tool

If you happen to be using the Dark Energy Survey Cutout Tool (or a similar tool), the following script can be used or adapted to take a directory of images and stack them into the "one-band-per-file" style of deeplenstronomy.

# Format DES bulk cutout images for deeplenstronomy

import json
import sys

from astropy.io import fits
import numpy as np
import pandas as pd

if len(sys.argv) < 3:
    sys.exit("Need image_dir and json_list as command-line args")
image_dir = sys.argv[1]
json_file = image_dir + '/' + sys.argv[2]

# The JSON file lists the bands and the base file names of the cutouts
with open(json_file) as stream:
    cutout_info = json.load(stream)

# Lowercase every band name except 'Y'
bands = [x.lower() if x != 'Y' else x for x in cutout_info['bands'].split(',')]
files = cutout_info['files']

images = {b: [] for b in bands}

# Collect the pixel data, keeping the same image order in every band
for file_path in files:
    for band in bands:
        filename = image_dir + '/' + file_path + '_' + band + '.fits'

        with fits.open(filename) as hdul:
            # Copy so the data stay valid after the file is closed
            images[band].append(hdul[0].data.copy())

# Save one stacked FITS file per band in the current directory
for band in bands:
    hdu = fits.PrimaryHDU(np.array(images[band]))
    hdu.writeto(band + '.fits')

# Save a CSV with the file names, in the same order as the stacked images
files = [[f] for f in files]
pd.DataFrame(data=files, columns=['FILENAME']).to_csv('test_files.csv')