Science with deeplenstronomy!

The notebooks up until this point have demonstrated the features of deeplenstronomy at their most basic level. This notebook will show how I would design a science-ready training set by putting some of those features to use.

We'll be going through the configuration file example.yaml, which simulates a 4-class dataset consisting of galaxy-galaxy lensing, galaxy-galaxy lensing where the lens is a real DES galaxy, lensed quasars, and a galaxy by itself.

Top Level Dataset Properties

Like in the other examples, we start by selecting the size of our dataset and telling deeplenstronomy where to save the images at the end of the day.

Next up is the cosmology section. Here I could also specify $T_{cmb}$, $N_{eff}$, and other parameters of the astropy.cosmology.FlatLambdaCDM class, but we can get away with just specifying $H_0$ and $\Omega_m$ since that is technically all that is needed for the cosmology calculation performed internally.
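To see why $H_0$ and $\Omega_m$ are enough for distance calculations, here is a minimal sketch of a flat $\Lambda$CDM comoving distance in plain Python. The values H0 = 70 and Om0 = 0.3 are placeholders chosen for illustration and are not necessarily the ones in example.yaml. Radiation is neglected, which is a simplification of this sketch and not a statement about what deeplenstronomy does internally.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def efunc(z, om0):
    """Dimensionless Hubble parameter E(z) for flat LambdaCDM, radiation neglected."""
    return math.sqrt(om0 * (1.0 + z) ** 3 + (1.0 - om0))


def comoving_distance_mpc(z, h0=70.0, om0=0.3, steps=1000):
    """Comoving distance in Mpc from a trapezoidal integral of 1/E(z)."""
    if z == 0:
        return 0.0
    dz = z / steps
    # Trapezoid rule: half weight on the two end points, full weight inside.
    total = 0.5 * (1.0 / efunc(0.0, om0) + 1.0 / efunc(z, om0))
    for i in range(1, steps):
        total += 1.0 / efunc(i * dz, om0)
    return (C_KM_S / h0) * total * dz
```

Flatness fixes the dark energy density at 1 - Om0, so no further parameters are needed for this calculation.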

Camera and Observing Conditions

Now normally we would write the IMAGE and SURVEY sections next to describe the camera and observing conditions, but I will make use of deeplenstronomy's built-in surveys functionality and just pass survey='des' when calling the deeplenstronomy.make_dataset() function.
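As a rough sketch, the call could be wrapped like this. The import path `deeplenstronomy.deeplenstronomy` and passing the configuration file as the first positional argument are assumptions about the installed version, and `simulate_des_dataset` is a hypothetical helper name, so check both against the package documentation before relying on them.

```python
def simulate_des_dataset(config_path="example.yaml"):
    """Run the simulation using the built-in DES survey settings."""
    # Imported lazily so this sketch can be read without the package installed.
    # The module path below is an assumption about the package layout.
    import deeplenstronomy.deeplenstronomy as dl

    # survey="des" stands in for hand-written IMAGE and SURVEY sections.
    return dl.make_dataset(config_path, survey="des")
```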

Describing the Objects to Simulate

Time to describe the objects we want to put in our dataset. Onward to the SPECIES section!

I'll start with a galaxy. Its name will be "SimulatedLens" and I'll use this name to refer to it later on. I'll add to this galaxy a "Light Profile" to describe how its light will appear and a "Mass Profile" to tell deeplenstronomy how it will affect the light of objects behind it.

For the Light Profile, I'm choosing to call the lenstronomy "SERSIC_ELLIPSE" model, then I set its parameters:

For the Mass Profile, I'm choosing to call the lenstronomy "SIE" model, then I set its parameters:

Next, I'll write the properties for the galaxy I plan to use as a source galaxy in the gravitational lensing systems. I'll name it "SimulatedSource". I'll use a similar Light Profile, and I can get away without a Mass Profile because the source galaxy's mass will not affect the lensing.

The parameters will be defined in a similar way to the Light Profile of SimulatedLens, but this time I will draw center_x and center_y from narrow distributions to account for gravitational lensing where the source and lens are not exactly aligned.
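To illustrate what drawing from a narrow distribution means here, the following sketch samples small source offsets in plain Python. The uniform shape and the 0.1 arcsec half-width are assumptions made for illustration; the actual distribution in example.yaml may differ.

```python
import random

rng = random.Random(42)  # fixed seed so the draws are reproducible

HALF_WIDTH = 0.1  # arcsec; hypothetical width, not taken from example.yaml


def draw_source_offset(rng, half_width=HALF_WIDTH):
    """Draw (center_x, center_y) uniformly from a small box around the lens center."""
    return (rng.uniform(-half_width, half_width),
            rng.uniform(-half_width, half_width))


# Five example offsets, each a small displacement from perfect alignment.
offsets = [draw_source_offset(rng) for _ in range(5)]
```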

To finish off the galaxies I'll define in this configuration file, I will make a third galaxy called "DarkLens". The reason for the name is I am going to simulate no light at all coming from this galaxy, and then I am going to use a real DES image of a lens galaxy in its place.

The Light Profile here is very simple because I'm intending it to be meaningless. I set the magnitude to 100 so that the light will be dozens of orders of magnitude too faint to detect, and then I add values to all the other parameters to satisfy both deeplenstronomy and lenstronomy.
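The size of that suppression follows from the definition of the magnitude scale, where a difference of $\Delta m$ magnitudes is a flux ratio of $10^{-0.4 \Delta m}$. Here is a small sketch, using a hypothetical detection limit of magnitude 24 that is only meant to set the scale:

```python
def flux_ratio_log10(mag_faint, mag_reference):
    """log10 of flux(mag_faint) / flux(mag_reference) from the magnitude definition."""
    return -0.4 * (mag_faint - mag_reference)


# 24 is a hypothetical detection limit used only for illustration.
orders_below_limit = -flux_ratio_log10(100.0, 24.0)
```

With these numbers the magnitude-100 object sits roughly 30 orders of magnitude below the assumed limit, so it contributes nothing visible to the image.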

The Mass Profile here is important. To make the training dataset realistic, I want to make sure that the real image of a galaxy that I use has its properties encoded into the lensing calculations. Specifically, I will use the measured velocity dispersion of the DES galaxy in each image in the simulation. Thus, I am specifying sigma_v instead of theta_E in this case.
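For intuition on how a velocity dispersion relates to an Einstein radius, here is a sketch of the standard singular isothermal relation $\theta_E = 4\pi (\sigma_v / c)^2 \, D_{ls} / D_s$. I am assuming this textbook formula for illustration; it is not necessarily the exact conversion deeplenstronomy performs internally, and the input values are made up.

```python
import math

C_KM_S = 299792.458                       # speed of light in km/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # radians to arcseconds


def einstein_radius_arcsec(sigma_v_km_s, d_ls_over_d_s):
    """Einstein radius of a singular isothermal lens, in arcseconds."""
    theta_rad = 4.0 * math.pi * (sigma_v_km_s / C_KM_S) ** 2 * d_ls_over_d_s
    return theta_rad * ARCSEC_PER_RAD


# sigma_v = 250 km/s and D_ls / D_s = 0.5 are illustrative values only.
theta_e = einstein_radius_arcsec(250.0, 0.5)
```

The quadratic dependence on sigma_v is the reason a measured dispersion is worth carrying into the simulation: doubling it quadruples the Einstein radius.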

Closing out the SPECIES section, I'll include a quasar so that we can simulate lensed quasars along with the other cases of galaxy-galaxy lensing.

I'll name this object "Quasar" and specify that I want the POINTSOURCE to be located within the galaxy that I named SimulatedSource earlier. Since a quasar is always at the center of a galaxy, I will leave off the sep, sep_unit, and angle parameters, in which case deeplenstronomy will place the POINTSOURCE at the center of its HOST. Lastly, I plan to overwrite the magnitude with a realistic, color-dependent distribution using deeplenstronomy's USERDIST functionality.

Object Placements

Now I'll fill in the GEOMETRY section to tell deeplenstronomy how I want the objects oriented. Let's start with a straightforward galaxy-galaxy lensing system. Each system I choose to create will be called a CONFIGURATION. I've named this one "GalaxyGalaxySimulated" so I can keep track that I'm doing galaxy-galaxy lensing and using all simulated light. I also specify FRACTION: 0.25 so that this type of system will make up 1/4 of the total images that get produced.

Because I want to put one galaxy behind another one, I need two PLANEs. In each plane I write the names of the objects I want and set the redshift.

Next let's put in a second configuration where I keep everything the same, except I'll switch out "SimulatedLens" for "DarkLens". This will be the configuration where I put in the real DES images later on. This one I'll name "GalaxyGalaxyReal" to indicate the use of real images while still doing galaxy-galaxy lensing.

Next let's make the dataset a little more fun and put in a lensed quasar system. I'll call this configuration "GalaxyQuasarSimulated". Everything is the same as CONFIGURATION_1, but I have added OBJECT_2: Quasar to PLANE_2. This line tells deeplenstronomy to include the object in the SPECIES section named "Quasar". Recall that "Quasar" has a HOST of "SimulatedSource", so deeplenstronomy is already prepared to put this object in that galaxy.

Lastly, I'll include a configuration where I only simulate a single galaxy. The motivation here is to simulate a background class for a neural network to train on. Presumably the largest background you'll face will be individual galaxies, but you can get as detailed as you want. To make "JustAGalaxy" I will copy CONFIGURATION_1, but I will remove the second plane that contained the background galaxy.
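The four configurations can be summarized as a Python dictionary, which makes it easy to sanity-check the class balance. This is a simplified mirror of the GEOMETRY section and not the literal YAML: the key layout inside each plane is flattened, the REDSHIFT values are placeholders, the FRACTION values other than the first are assumed to be 0.25, and the dataset size of 100 is hypothetical.

```python
# Simplified mirror of the GEOMETRY section described in the text.
# REDSHIFT values are placeholders; FRACTIONs after the first are assumed.
geometry = {
    "CONFIGURATION_1": {
        "NAME": "GalaxyGalaxySimulated",
        "FRACTION": 0.25,
        "PLANE_1": {"OBJECT_1": "SimulatedLens", "REDSHIFT": 0.2},
        "PLANE_2": {"OBJECT_1": "SimulatedSource", "REDSHIFT": 0.7},
    },
    "CONFIGURATION_2": {
        "NAME": "GalaxyGalaxyReal",
        "FRACTION": 0.25,
        "PLANE_1": {"OBJECT_1": "DarkLens", "REDSHIFT": 0.2},
        "PLANE_2": {"OBJECT_1": "SimulatedSource", "REDSHIFT": 0.7},
    },
    "CONFIGURATION_3": {
        "NAME": "GalaxyQuasarSimulated",
        "FRACTION": 0.25,
        "PLANE_1": {"OBJECT_1": "SimulatedLens", "REDSHIFT": 0.2},
        "PLANE_2": {"OBJECT_1": "SimulatedSource", "OBJECT_2": "Quasar",
                    "REDSHIFT": 0.7},
    },
    "CONFIGURATION_4": {
        "NAME": "JustAGalaxy",
        "FRACTION": 0.25,
        "PLANE_1": {"OBJECT_1": "SimulatedLens", "REDSHIFT": 0.2},
    },
}


def images_per_configuration(geometry, size):
    """Number of images each configuration contributes to a dataset of `size`."""
    return {cfg["NAME"]: round(cfg["FRACTION"] * size)
            for cfg in geometry.values()}


total_fraction = sum(cfg["FRACTION"] for cfg in geometry.values())
counts = images_per_configuration(geometry, size=100)  # hypothetical size
```

Checking that the fractions sum to one before simulating is a cheap way to catch a mistyped FRACTION.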

Including the real galaxy images

Recall that in CONFIGURATION_2 I've utilized the object we aptly named "DarkLens" and that we effectively gave this object zero apparent luminosity. The reasoning here is that we wanted to have a mass profile, but utilize the light from a real image. To do that, I'll use the BACKGROUNDS feature of deeplenstronomy.

The CONFIGURATIONS argument is a list containing only CONFIGURATION_2, which specifies that I want these background images to be used for CONFIGURATION_2 and nothing else.

I've supplied a directory example_background_images of 46 DES images of galaxies (and the galaxies also overlap with SDSS) split into each band that I'm simulating in this dataset. I also give a map.txt file, which tells deeplenstronomy the properties of the galaxies in the images.

Each of the galaxies here has a measured velocity dispersion, so in the map.txt file I am telling deeplenstronomy about the velocity dispersion of each galaxy (each galaxy is one row and the rows are aligned with the index of the image in the FITS files).

Then, with the header row, I am telling deeplenstronomy which properties these velocity dispersions map to in the simulations. Since the configuration with "DarkLens" is CONFIGURATION_2 and "DarkLens" is OBJECT_1 in PLANE_1, and since the velocity dispersion is characterized by the sigma_v parameter of "DarkLens"'s MASS_PROFILE_1, we can create a header string of:

CONFIGURATION_2-PLANE_1-OBJECT_1-MASS_PROFILE_1-sigma_v-

and finally since the velocity dispersion is not a color-dependent quantity, we make the velocity dispersion the same for each band.
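As a sketch of how such a header row and a matching data row could be assembled, the snippet below appends a band name after the trailing hyphen of the header string. That reading of the trailing hyphen, and the band list itself, are my assumptions for illustration, so compare against the supplied map.txt before reusing this.

```python
BASE = "CONFIGURATION_2-PLANE_1-OBJECT_1-MASS_PROFILE_1-sigma_v-"
BANDS = ["g", "r", "i", "z"]  # assumed band list; use the bands in your dataset


def map_header(base=BASE, bands=BANDS):
    """One column name per band, all pointing at the same physical parameter."""
    return [base + band for band in bands]


def map_row(sigma_v, bands=BANDS):
    """Repeat a single velocity dispersion across every band column."""
    return [sigma_v] * len(bands)
```

For example, `map_row(230.0)` gives the same value four times, one per band column, which is what a color-independent quantity should look like.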

Including Realistic Object Colors

In the same vein as simulating a dark galaxy knowing we would include real images, recall that we also left flag values of -5 in the LIGHT_PROFILE_1-magnitude parameter for "SimulatedLens", "SimulatedSource", and "Quasar". In this section we will finally give these objects physically meaningful brightnesses and colors.

I'll do this by utilizing USERDISTs. In total there are 7 USERDISTs that I am utilizing. For each one I specify its filename and the mode I will be using. In this case I've chosen to use the sample mode, which draws from the raw points in the file rather than interpolating a grid of points. This mode is much more efficient when correlating more than a couple of parameters.

Let's start with the first 3 USERDISTs: filenames low_z_galaxy_colors_config_*.txt. These three files are identical with the exception of the header row, which I will cover next. The columns in the files were produced with the following query:

SELECT DNF_ZMEAN_SOF, MAG_PSF_G, MAG_PSF_R, MAG_PSF_I, MAG_PSF_Z, MAG_PSF_Y 
FROM Y3_GOLD_2_2 
WHERE EXTENDED_CLASS_COADD = 3 and DEC > -44 and DEC < -40 and DNF_ZMEAN_SOF < 0.3 and ROWNUM < 101;

Ignoring some of the DES-specific column and table names, what I've done is select the magnitude in each band and the redshift of 100 galaxies in a random patch of sky for low redshift galaxies. I've then put the query results directly into the USERDIST files with updated column names:

The column names match how deeplenstronomy will track the object. It can take some getting used to. I've also added a WEIGHT column. These weights represent the relative frequency at which each row is used. They are defined relative to each other, so it's not necessary that they sum to one.
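To show what relative weights mean in practice, here is a small standalone sketch of weighted row sampling with Python's standard library. The row labels and weights are invented, and this is not deeplenstronomy's own sampling code.

```python
import random

rows = ["row_a", "row_b", "row_c"]   # stand-ins for lines of a USERDIST file
weights = [2.0, 1.0, 1.0]            # relative weights; they sum to 4, not 1

# Normalizing shows the sampling probabilities the weights imply.
total = sum(weights)
probabilities = [w / total for w in weights]

rng = random.Random(0)  # fixed seed for reproducibility
# random.choices accepts unnormalized weights directly.
draws = rng.choices(rows, weights=weights, k=1000)
```

Here "row_a" is drawn about twice as often as either other row, and scaling all three weights by the same factor would change nothing.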

As I mentioned earlier, the only difference between low_z_galaxy_colors_config_1.txt, low_z_galaxy_colors_config_3.txt, and low_z_galaxy_colors_config_4.txt is the header row. In the latter two files, CONFIGURATION_1 has been replaced with CONFIGURATION_3 and CONFIGURATION_4, respectively. Thus, I am still using the same galaxy colors and redshifts, but I am attributing them to different objects in the simulation. These files are used for CONFIGURATION_1, CONFIGURATION_3, and CONFIGURATION_4 because they target "SimulatedLens" in each case.
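Since only the header changes between these files, a copy for another configuration can be produced by rewriting the first line. Below is a minimal sketch; `retarget_userdist` is a hypothetical helper, and the column names and values in the excerpt are made up to keep the example short.

```python
def retarget_userdist(text, old="CONFIGURATION_1", new="CONFIGURATION_3"):
    """Return the file contents with only the header row pointed at a new configuration."""
    header, sep, body = text.partition("\n")
    return header.replace(old, new) + sep + body


# Hypothetical two-column excerpt; the real files carry a redshift column
# and one magnitude column per band in addition to WEIGHT.
example = (
    "CONFIGURATION_1-PLANE_1-OBJECT_1-REDSHIFT WEIGHT\n"
    "0.21 1.0\n"
    "0.17 1.0\n"
)
retargeted = retarget_userdist(example)
```

Restricting the replacement to the header row leaves the data rows byte-for-byte identical, which matches the description of the three files.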

Similarly, the files high_z_galaxy_colors_config_1.txt, high_z_galaxy_colors_config_2.txt, and high_z_galaxy_colors_config_3.txt were produced with the query:

SELECT DNF_ZMEAN_SOF, MAG_PSF_G, MAG_PSF_R, MAG_PSF_I, MAG_PSF_Z, MAG_PSF_Y 
FROM Y3_GOLD_2_2 
WHERE EXTENDED_CLASS_COADD = 3 and DEC > -44 and DEC < -40 and DNF_ZMEAN_SOF > 0.5 and ROWNUM < 101;

and these files target the redshift and magnitude of "SimulatedSource" everywhere it appears in the simulation.

Lastly, I use the file quasar_colors.txt to put in fake quasar colors, where I have created the distribution of magnitudes to make the objects appear blue and bright at the redshifts they will be placed at in the simulation. I would recommend using real AGN observations if you are working with lensed quasars.

Summary

That is all the information in the file example.yaml. We have utilized built-in astronomical surveys, realistic galaxy colors, real images of galaxies, and physically motivated distributions of all parameters to give us a rich training set ready for a neural network. At this point, we're ready to call deeplenstronomy.make_dataset() and simulate our images.

The Simulated Dataset

We can inspect the images using the deeplenstronomy visualization features.

Here's CONFIGURATION_1, which was galaxy-galaxy lensing with both the source and lens being simulated:

All the metadata for these images is stored here:

Here's CONFIGURATION_2, which used real DES images of the lens galaxy with a simulated source behind it:

Here is CONFIGURATION_3, which was the same as CONFIGURATION_1 but with a quasar in the source galaxy:

And finally here is CONFIGURATION_4 which is just a galaxy by itself.