Getting Started with deeplenstronomy

The purpose of deeplenstronomy is to let users generate entire datasets from a single input file. This framework separates the role of the programmer from that of the astronomer when creating datasets, and will hopefully make datasets easy to reproduce across the field.

The Backbone of deeplenstronomy

When using deeplenstronomy, the vast majority of your time will be spent constructing the "configuration file" that tells deeplenstronomy how to make your dataset. This configuration file is a YAML file in which you set the attributes of the different objects in your dataset. As an example, you define the properties of the camera used to collect the images with a section like this:

IMAGE:
    PARAMETERS:
        exposure_time: 90
        numPix: 100
        pixel_scale: 0.263
        psf_type: 'GAUSSIAN'
        read_noise: 7
        ccd_gain: 6.083

For a full introduction to the different parts of a deeplenstronomy configuration file, please read the Creating deeplenstronomy Configuration Files documentation.
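
To make the nesting of the IMAGE section above concrete, here is a minimal sketch of the same values as a nested Python dictionary. This only illustrates the structure and is not how deeplenstronomy stores a configuration internally. The helper get_param is hypothetical, and treating pixel_scale as arcseconds per pixel is an assumption.

```python
# Nested-dict view of the IMAGE section shown above (a sketch of the
# structure only, not deeplenstronomy's internal representation).
config = {
    "IMAGE": {
        "PARAMETERS": {
            "exposure_time": 90,
            "numPix": 100,
            "pixel_scale": 0.263,
            "psf_type": "GAUSSIAN",
            "read_noise": 7,
            "ccd_gain": 6.083,
        }
    }
}


def get_param(cfg, dotted_key):
    """Hypothetical helper: look up 'SECTION.PARAMETERS.name' in a nested dict."""
    node = cfg
    for part in dotted_key.split("."):
        node = node[part]
    return node


num_pix = get_param(config, "IMAGE.PARAMETERS.numPix")
pixel_scale = get_param(config, "IMAGE.PARAMETERS.pixel_scale")

# Assumption: pixel_scale is in arcseconds per pixel, so the product is
# the image width in arcseconds.
image_width = num_pix * pixel_scale
print(f"{num_pix} pixels wide, about {image_width:.1f} arcsec across")
```

The dotted path used here mirrors the SECTION.PARAMETERS.name form that appears in deeplenstronomy's error messages.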

Creating a Dataset

Once you have your configuration prepared, all you need to do is import deeplenstronomy and call the deeplenstronomy.make_dataset() function on your configuration file. This code will look something like this:

import deeplenstronomy.deeplenstronomy as dl

config_file_name = 'demo.yaml' # name of your file
dataset = dl.make_dataset(config_file_name)

The Generating Datasets documentation will introduce you to the options that exist when using the deeplenstronomy.make_dataset() function.
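
If you call make_dataset() from a script, it can help to confirm that the configuration file exists before handing it over. The wrapper below is a minimal sketch: make_dataset_checked is a hypothetical helper, not part of deeplenstronomy, and it uses only the dl.make_dataset(config_file_name) call shown above.

```python
from pathlib import Path


def make_dataset_checked(config_file_name):
    """Hypothetical helper: verify the file exists, then call dl.make_dataset()."""
    config_path = Path(config_file_name)
    if not config_path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {config_path}")

    # Imported here so the path check above still runs on machines where
    # deeplenstronomy is not installed.
    import deeplenstronomy.deeplenstronomy as dl

    return dl.make_dataset(str(config_path))
```

With a valid file, dataset = make_dataset_checked('demo.yaml') behaves like the direct call; with a mistyped file name, it stops early with a clear message.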

Making deeplenstronomy Work for You

deeplenstronomy has several features for creating a dataset to match various science goals.

These features were designed to enable users to create datasets usable in deep learning problems. The example notebooks in this repository will introduce you to each of deeplenstronomy's capabilities. Enjoy!

A Note on Reproducibility

One of the main guiding principles of deeplenstronomy is that you can send a configuration file and its associated text files to a collaborator, who can then completely recreate your dataset. To make your work fully reproducible, you can (and should) specify a random seed.

You can specify a random seed in your configuration file like this:

DATASET:
    NAME: test_dataset
    PARAMETERS:
        SIZE: 10 # number of images in your dataset
        OUTDIR: temp_dir # directory to save your simulations
        SEED: 6 # your favorite random seed (a positive integer)
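
The reason a seed makes a dataset reproducible can be shown with Python's standard random module. This is a generic sketch of seeded random draws, not deeplenstronomy's own sampling code; draw_parameters is a hypothetical stand-in for drawing simulation parameters.

```python
import random


def draw_parameters(seed, size):
    """Hypothetical stand-in: draw `size` uniform values from a seeded generator."""
    rng = random.Random(seed)  # independent generator; global state is untouched
    return [rng.uniform(0.0, 1.0) for _ in range(size)]


first_run = draw_parameters(seed=6, size=10)
second_run = draw_parameters(seed=6, size=10)
other_seed = draw_parameters(seed=7, size=10)

print(first_run == second_run)  # same seed, so the draws are identical
print(first_run == other_seed)  # different seed, so the draws differ
```

Two runs with the same seed produce the same sequence of draws, which is why sharing the seed along with the configuration file lets a collaborator regenerate the same dataset.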

When Errors Arise

Before beginning the dataset generation step, deeplenstronomy checks your configuration file for anything it cannot understand. It alerts you to every problem it finds and then returns control to you so that you can fix the configuration file. The error messages look like this:

SURVEY.PARAMETERS.magnitude_zero_point is missing from the Config File
SURVEY.PARAMETERS.num_exposures is missing from the Config File
SURVEY.PARAMETERS.seeing is missing from the Config File
SURVEY.PARAMETERS.BANDS is missing from the Config File
SURVEY.PARAMETERS.sky_brightness is missing from the Config File
COSMOLOGY.PARAMETERS.H0 is missing from the Config File
IMAGE.PARAMETERS.numPix cannot be drawn from a distribution
Missing SURVEY section from config file
Fatal error(s) detected in config file. Please edit and rerun.

These checks are as exhaustive as possible, and will catch most of the things that would cause crashes when generating the dataset. That being said, if you do manage to produce a real crash, please open an issue in the deeplenstronomy GitHub repository.
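
The "report everything, then stop" behavior described in this section can be sketched as follows. This is an illustration of the pattern only, not deeplenstronomy's actual checking code: REQUIRED_KEYS is a small illustrative subset, and find_missing is a hypothetical helper.

```python
# Illustrative subset of required entries, written as dotted paths.
REQUIRED_KEYS = [
    "SURVEY.PARAMETERS.seeing",
    "COSMOLOGY.PARAMETERS.H0",
    "IMAGE.PARAMETERS.numPix",
]


def find_missing(config, required_keys):
    """Hypothetical helper: collect one message per missing entry."""
    errors = []
    for dotted_key in required_keys:
        node = config
        for part in dotted_key.split("."):
            if not isinstance(node, dict) or part not in node:
                errors.append(f"{dotted_key} is missing from the Config File")
                break
            node = node[part]
    return errors


# A deliberately incomplete configuration, as a nested dict.
config = {
    "IMAGE": {"PARAMETERS": {"numPix": 100}},
    "COSMOLOGY": {"PARAMETERS": {}},
}

errors = find_missing(config, REQUIRED_KEYS)
for message in errors:
    print(message)
if errors:
    print("Fatal error(s) detected in config file. Please edit and rerun.")
```

Collecting every problem before stopping means a single pass over the file shows everything that needs fixing, rather than one error per run.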