Incorporating FITS Region, CRTF I/O and improving Regions

Google Summer Of Code, 2018 — Final Report

Our primary motive was to be able to deal with CRTF, the CASA Region Text Format (#119), in the astropy/regions package. That meant implementing a parser as well as a writer for it. The grammar of this format was simple enough that it was not a big deal: each region specification fits on a single line with a limited set of keywords, so the regex module of the Python standard library (re) was a suitable fit. Its C engine not only helped me match the strings against the specification but also helped extract the meaningful components in the right order. The regions.Region class has a lot of subclasses for different shapes, which use astropy's higher-level classes to represent their coordinates and other attributes. So, to ease the serialization/deserialization process, there is an intermediate Shape class. When you read a CRTF/DS9/FITS region file in regions, the parser (CRTFParser for CRTF) first converts it into a ShapeList (a list of Shape objects), which is then converted into a list of regions.Region objects by the to_regions() method.

When you serialize a list of regions.Region objects, exactly the reverse happens. The to_shape_list() function converts the list of regions.Region objects into a ShapeList object, which is then turned into strings. These steps are combined in crtf_objects_to_string(), and write_crtf() writes a file directly.

PRs #173 and #186 deal with this.
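To give a feel for the regex approach, here is a minimal sketch that matches a single CRTF circle line and pulls out its components with named groups. The pattern, the `parse_circle` helper and the `"reg"` default are my own assumptions for illustration; they are not the grammar used inside regions, which covers many more shapes, units and keywords.

```python
import re

# Hypothetical pattern for one shape only (circle); illustrative, not the
# pattern used by the regions package.
CIRCLE = re.compile(
    r"""
    ^\s*
    (?P<include>[+-])?\s*                 # optional include/exclude sign
    (?:(?P<type>ann)\s+)?                 # optional annotation marker
    circle\s*
    \[\s*\[\s*(?P<x>[^,\[\]]+?)\s*,\s*(?P<y>[^,\[\]]+?)\s*\]\s*,
    \s*(?P<radius>[^,\[\]]+?)\s*\]        # radius closes the outer bracket
    \s*,?\s*(?P<meta>.*?)\s*$             # trailing key=value pairs
    """,
    re.VERBOSE | re.IGNORECASE,
)

# A value is either a bracketed list such as [I, Q] or a bare token.
META_PAIR = re.compile(r"(\w+)\s*=\s*(\[[^\]]*\]|[^,\s]+)")


def parse_circle(line):
    """Split a CRTF circle line into its components (all kept as strings)."""
    match = CIRCLE.match(line)
    if match is None:
        raise ValueError(f"not a CRTF circle: {line!r}")
    parts = match.groupdict()
    return {
        "include": parts["include"] != "-",
        "type": parts["type"] or "reg",   # assumed default for this sketch
        "center": (parts["x"], parts["y"]),
        "radius": parts["radius"],
        "meta": dict(META_PAIR.findall(parts["meta"])),
    }


result = parse_circle(
    "ann circle[[18h12m24s, -23d11m00s], 2.3arcsec], corr=[I, Q], color=blue"
)
print(result["center"], result["radius"], result["meta"])
```

The named groups are what make the "right order" part easy: each component comes back under its own key regardless of how much whitespace the line contains.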

A short example:

>>> from regions import CRTFParser, crtf_objects_to_string
>>> reg_str = """
... #CRTF0
... global coord=B1950, frame=BARY, corr=[I, Q], color=blue
... # A simple circle region:
... ann circle[[18h12m24s, -23d11m00s], 2.3arcsec]
... """
>>> parser = CRTFParser(reg_str, 'warn')
>>> print(parser.shapes[0])
Type : ann
Coord sys : fk4
Region type : circle
Meta: {'frame': 'BARY', 'corr': ['I', 'Q'], 'color': 'blue', 'include': True, 'type': 'ann'}
Composite: False
Include: True
>>> regs = parser.shapes.to_regions()
>>> print(regs[0])
Region: CircleSkyRegion
center: <SkyCoord (FK4: equinox=B1950.000, obstime=B1950.000): (ra, dec) in deg
(273.1, -23.18333333)>
radius: 2.3 arcsec
>>> print(regs[0].meta) # Meta data
{'frame': 'BARY', 'corr': ['I', 'Q'], 'include': True, 'type': 'ann'}
>>> print(regs[0].visual) # Visual data taken by matplotlib
{'color': 'blue'}
>>> print(crtf_objects_to_string(regs, 'fk4'))
global coord=fk4
+ann circle[[273.100000deg, -23.183333deg], 0.000639deg], frame=BARY, color=blue, corr=[I, Q]

The API here is almost the same as that of the `DS9` I/O in `regions`: just replace crtf with ds9 in the names above. Also, as you can see, regions.Region has meta and visual attributes. Handling them was a bit tricky. Earlier, regions dealt only with DS9 and simply stored the raw metadata in plain dictionaries. The meta keys were never validated or documented. Now, to make this package compatible with CRTF, and with other formats in the future, there was a need to settle on a fixed name for each equivalent meta key and then map it to its format-specific name while reading and writing. Therefore I created (#179, #192) the RegionMeta and RegionVisual classes, which handle the meta key validation. Also, the whole serialization process for DS9 used to go directly from Region objects to strings without the intermediate Shape class, so to make it consistent we added the Shape layer there too. The Shape can now be used as an API. It also helps deal with various region file formats without actually converting them to Region objects.
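The idea of validating meta keys and mapping format-specific names onto one fixed name can be sketched with a small dict subclass. The class name `ValidatedMeta`, the key set and the `text` to `label` mapping below are hypothetical, chosen only to show the mechanism; the real RegionMeta and RegionVisual classes define their own lists.

```python
class ValidatedMeta(dict):
    """Dictionary that only accepts a fixed set of keys (illustrative)."""

    # Hypothetical key set; the real lists live in the regions package.
    valid_keys = frozenset({"label", "frame", "corr", "include", "type"})
    # Format-specific spellings mapped onto one canonical name (assumed).
    key_mapping = {"text": "label"}

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        key = self.key_mapping.get(key, key)
        if key not in self.valid_keys:
            raise KeyError(f"{key!r} is not a valid meta key")
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        # dict.update bypasses __setitem__, so route every item through it.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


meta = ValidatedMeta({"frame": "BARY", "text": "my circle"})
print(meta)  # the 'text' key is stored under its canonical name 'label'
```

Doing the check in `__setitem__` means a misspelled key fails at the point of assignment rather than silently disappearing when the region is written out.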

There were certain Region shapes, such as text (#177), ellipse annulus and rectangle annulus, that were supported by both DS9 and CRTF files but not yet implemented in regions. Also, regions.Region objects depend heavily on the validity of their attributes, so it was very important to make sure that we can rely on them. One approach was to make them immutable, but I came across issue #83, where we were hoping to validate the value on every assignment. It looked like an ugly task, but Python's descriptor classes made it elegant and efficient. So I spent some time creating descriptor classes tailored to our exact needs. #196 addressed the above problems.
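For readers who have not used descriptors, here is a minimal sketch of validation on every assignment. `PositiveScalar` and `CirclePixelSketch` are hypothetical names made up for this example; the descriptors in regions are tailored to pixel and sky coordinates and astropy quantities.

```python
import numbers


class PositiveScalar:
    """Descriptor that validates a value every time it is assigned."""

    def __set_name__(self, owner, name):
        # Called once at class creation; remember where to store the value.
        self.name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if isinstance(value, bool) or not isinstance(value, numbers.Real):
            raise ValueError(f"{self.name} must be a real scalar")
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        setattr(instance, self.private_name, value)


class CirclePixelSketch:
    """Toy region class; only the radius is validated here."""

    radius = PositiveScalar()

    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius  # goes through PositiveScalar.__set__


circle = CirclePixelSketch(1, 2, 3)
circle.radius = 4.5       # accepted
try:
    circle.radius = -1    # rejected, the old value is kept
except ValueError as err:
    print(err)
```

The region stays mutable, but an invalid value can never be stored, which is the trade-off discussed in #83.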

In the meanwhile, to keep things from getting too monotonous, my mentors suggested (or rather I asked them, does it matter?) that we integrate the regions package with the Spectral-Cube package. Spectral-Cube, as the name suggests, deals with FITS spectral cubes, the third dimension being the spectral dimension. Since CRTF regions also contain spectral meta information, it was exciting to work out how to use the extracted spectral information for extracting a subcube. It took me a couple of days to get familiar with the spectral-cube package, the techniques used to mask compound regions, and the various spectral coordinates and frequency/velocity relations. It was my first time working with images and these tools, so it was quite exhilarating. Basically, we had to replace pyregion with regions. Pull request #488 contains the implementation of subcube_from_region(), subcube_from_ds9() and subcube_from_crtf(), which help extract a subcube in both spatial and spectral dimensions from regions.Region objects or directly from DS9 and CRTF region files.

Now that we were able to parse and handle all the metadata of the regions, I decided to use the visual data in the matplotlib plotting methods. I got acquainted with matplotlib's API in the meanwhile. The plot methods for point, text and annulus regions were missing. PR #194 implements the as_patch() method for annulus regions, uses the visual metadata to construct the matplotlib patches, and sets the default values for several parameters according to the DS9 convention.

Regions + Matplotlib

By the end of the second phase, we had done all of the above and were technically ahead of schedule. Then we came across the FITS spatial region specification, which we decided to add to the IO package (#200). FITS is the de facto standard for images in astronomy, and writing a region table to a FITS file would certainly help users. Also, astropy has extensive support for these files, which made the table easier to parse. The writer, though tricky, was doable. PR #198 implements the parser as well as the writer. In addition to this, I made sure to improve the documentation alongside. I also implemented a metaclass, MetaRegion, to make sure that docstrings are properly inherited from abstract methods so that Sphinx can automatically pick them up and add them to the docs (#188, #199).
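The docstring inheritance trick can be sketched as follows. This is a simplified illustration, not the actual MetaRegion code: the names `DocInheritingMeta`, `RegionSketch` and `CircleSketch` are assumptions made up for this example.

```python
import abc
import inspect


class DocInheritingMeta(abc.ABCMeta):
    """Copy a missing method docstring from the nearest base class."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        for attr_name, attr in namespace.items():
            # Only plain functions that were defined without a docstring.
            if not inspect.isfunction(attr) or attr.__doc__:
                continue
            for base in cls.__mro__[1:]:
                parent = getattr(base, attr_name, None)
                if parent is not None and getattr(parent, "__doc__", None):
                    attr.__doc__ = parent.__doc__
                    break
        return cls


class RegionSketch(metaclass=DocInheritingMeta):
    @abc.abstractmethod
    def area(self):
        """Return the area of the region."""


class CircleSketch(RegionSketch):
    def __init__(self, radius):
        self.radius = radius

    def area(self):  # no docstring here; it is inherited at class creation
        return 3.141592653589793 * self.radius ** 2


print(CircleSketch.area.__doc__)
```

Because the copy happens when the subclass is created, documentation tools that read `__doc__` directly see the inherited text without any per-method boilerplate.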

These were some of the major things that I worked on throughout the summer. In addition to these, I fixed a couple of side bugs: the ds9 pixel-origin offset, the handling of negative bounds in compound masks, and the ds9 ellipse width conversion. I also made sure to include unit tests alongside the code so that nothing breaks.

So, this is pretty much what I did this summer. I am grateful to the Astropy community for giving me the opportunity to work with such awesome astronomers and developers. None of this would have been possible without Google administering the program. Without exaggeration, the backbone of any GSoC project is its mentors. No amount of words will suffice, but I take this opportunity to thank my mentors Adam Ginsburg and Miguel De Val-Boro for patiently taking the time to guide and review my work throughout these three months. This project would not have been possible without their confidence and belief in me. Also, a big shout-out to Google Summer of Code, the orgs, mentors and students for making this program a grand success. This has been one of my best summers, no doubt.

This is the list of all the merged pull requests to astropy/regions and spectral-cube, in chronological order.