A beautiful summer spent coding

GSoC-2018 with Python Hydra

Published in

Hydra Ecosystem Developers

Aug 11, 2018

Hey! I am writing this post as the summer comes to an end, and so does the 3rd phase of GSoC. I really wish it wasn't limited to summer; clap if you agree!

I worked on a parser that converts an OpenAPI document into Hydra API documentation. This greatly lowers the barrier to entry: people can migrate to hydrus easily, in one command to be precise. The parser was built over several iterations, and I think we have a pretty robust one now.

My idea was to parse the OpenAPI doc for the information that `doc_maker` needs to create a Hydra document, so I started by referring to the sample Hydra doc generator given here. From there I managed to break the process into the following steps:

Create a HydraDoc using `doc_writer`, with details like title, description, API name, base URL, etc.
Create a `HydraClass` with details like class title, description, and endpoint variable.
Attach properties (`HydraClassProp`) and operations (`HydraClassOp`) to the `HydraClass` defined before, and specify whether the class is a collection.
Attach this Hydra class to the API doc.
Repeat the above steps for every class.
Finally, add baseResource, baseCollection, and the entrypoint to the API doc.
Optionally, convert this Hydra doc to a dict using the generate method on HydraDoc.
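
To make the flow of these steps concrete, here is a minimal sketch. `SketchDoc` and `SketchClass` are hypothetical stand-ins written just for this example; they are not the `doc_writer` classes and their signatures are assumptions. The base resource, base collection and entrypoint step is left out.

```python
from dataclasses import dataclass, field


@dataclass
class SketchClass:
    """Hypothetical stand-in for a Hydra class (not the hydrus HydraClass)."""
    title: str
    desc: str
    endpoint: bool = False
    props: list = field(default_factory=list)
    ops: list = field(default_factory=list)

    def generate(self) -> dict:
        # Serialise the class in the shape used by the Hydra excerpts in this post.
        return {
            "@id": f"vocab:{self.title}",
            "@type": "hydra:Class",
            "title": self.title,
            "description": self.desc,
            "supportedProperty": list(self.props),
            "supportedOperation": list(self.ops),
        }


@dataclass
class SketchDoc:
    """Hypothetical stand-in for the API documentation object."""
    title: str
    desc: str
    classes: list = field(default_factory=list)

    def add_supported_class(self, cls: SketchClass) -> None:
        self.classes.append(cls)

    def generate(self) -> dict:
        return {
            "title": self.title,
            "description": self.desc,
            "supportedClass": [c.generate() for c in self.classes],
        }


# Step 1: create the document.
doc = SketchDoc("Petstore", "Sample API")
# Steps 2 and 3: create a class, then attach properties and operations.
order = SketchClass("Order", "Order class", endpoint=True)
order.props.append({"@type": "SupportedProperty", "title": "id"})
order.ops.append({"method": "GET", "title": "Find order by ID"})
# Step 4: attach the class to the document (repeat for every class).
doc.add_supported_class(order)
# Last step: convert the whole document to a dict.
as_dict = doc.generate()
```

The point of the sketch is only the order of calls: document first, classes next, properties and operations attached to each class, serialisation last.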

With these steps identified, I faced another hurdle: how should I map the objects provided by OpenAPI to the objects provided by Hydra? This was tricky in most places and almost impossible in some, because OpenAPI is far more extensive and descriptive than Hydra. Also, since Hydra is still evolving, it lacked many features that OpenAPI offers, such as giving the location of an input parameter ("header", "query", "path" or "body") or providing information about headers. Keeping all that in mind, we came up with the following mapping between objects in OpenAPI and Hydra:

Class Representation:

In OpenAPI, classes are generally defined locally under the `definitions` key, although they can actually appear under any key (this is why we needed a second iteration; I will come back to this in a while). In Hydra, classes are of type `hydra:Class`. I used `doc_maker` to create the Hydra class, and it was quite easy once I had the required information from the OpenAPI doc. Below is how classes/objects are represented in OpenAPI and Hydra.

OpenAPI Specification

definitions:
  Order:
    type: "object"
    description: "Order class"
    properties:
      id:
        type: "integer"
        format: "int64"
      petId:
        type: "integer"
        format: "int64"
      quantity:
        type: "integer"
        format: "int32"
      shipDate:
        type: "string"
        format: "date-time"
      status:
        type: "string"
        description: "Order Status"
        enum:
        - "placed"
        - "approved"
        - "delivered"
      complete:
        type: "boolean"
        default: false
    xml:
      name: "Order"

Hydra Doc Specification

{
    "@id": "vocab:Order",
    "@type": "hydra:Class",
    "description": "Order class",
    "supportedOperation": [
    ],
    "supportedProperty": [
    ],
    "title": "Order"
},
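
As a rough illustration of this mapping, the sketch below turns one OpenAPI definition (already loaded into a Python dict) into the Hydra class shape shown above. The function name `definition_to_hydra_class` is hypothetical, and falling back to the class name when no description is given is my assumption, not necessarily what the parser does.

```python
def definition_to_hydra_class(name: str, definition: dict) -> dict:
    """Build an empty Hydra class entry from an OpenAPI definition."""
    return {
        "@id": f"vocab:{name}",
        "@type": "hydra:Class",
        # Assumption: use the class name when the definition has no description.
        "description": definition.get("description", name),
        "supportedOperation": [],  # filled in later from the paths
        "supportedProperty": [],   # filled in later from the properties
        "title": name,
    }


order_definition = {"type": "object", "description": "Order class"}
hydra_order = definition_to_hydra_class("Order", order_definition)
```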

Properties:

By properties I mean the properties of a class. OpenAPI defines the properties within the class definition itself. A property may reference another class defined locally, or it may give a URL. We can define properties using `HydraClassProp` from `doc_writer`, and we later attach each prop to the `HydraClass` we defined in the previous step. Here is how it looks!

OpenAPI Specification

Order:
    type: "object"
    description: "Order class"
    properties:
      id:
        type: "integer"
        format: "int64"
      petId:
        type: "integer"
        format: "int64"
      quantity:
        type: "integer"
        format: "int32"
      shipDate:
        type: "string"
        format: "date-time"
      status:
        type: "string"
        description: "Order Status"
        enum:
        - "placed"
        - "approved"
        - "delivered"
      complete:
        type: "boolean"
        default: false
    xml:
      name: "Order"

Hydra Api Document

"supportedProperty": [
    {
        "@type": "SupportedProperty",
        "property": "",
        "readonly": "true",
        "required": "false",
        "title": "id",
        "writeonly": "true"
    },
    {
        "@type": "SupportedProperty",
        "property": "",
        "readonly": "true",
        "required": "false",
        "title": "petId",
        "writeonly": "true"
    },
    {
        "@type": "SupportedProperty",
        "property": "",
        "readonly": "true",
        "required": "false",
        "title": "quantity",
        "writeonly": "true"
    },
    {
        "@type": "SupportedProperty",
        "property": "",
        "readonly": "true",
        "required": "false",
        "title": "shipDate",
        "writeonly": "true"
    },
    {
        "@type": "SupportedProperty",
        "property": "",
        "readonly": "true",
        "required": "false",
        "title": "status",
        "writeonly": "true"
    },
    {
        "@type": "SupportedProperty",
        "property": "",
        "readonly": "true",
        "required": "false",
        "title": "complete",
        "writeonly": "true"
    }
],
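
A minimal sketch of how such entries could be produced from a definition dict follows. `properties_to_supported` is a hypothetical helper; the fixed `readonly`/`writeonly` values simply mirror the output above, while reading `required` from the definition's `required` list and copying a `$ref` into `property` are my assumptions.

```python
def properties_to_supported(definition: dict) -> list:
    """Turn the `properties` of an OpenAPI definition into supportedProperty entries."""
    required = set(definition.get("required", []))
    supported = []
    for prop_name, schema in definition.get("properties", {}).items():
        supported.append({
            "@type": "SupportedProperty",
            # Assumption: keep a reference if the property points to another class.
            "property": schema.get("$ref", ""),
            "readonly": "true",
            "required": "true" if prop_name in required else "false",
            "title": prop_name,
            "writeonly": "true",
        })
    return supported


order_definition = {
    "type": "object",
    "properties": {
        "id": {"type": "integer", "format": "int64"},
        "petId": {"type": "integer", "format": "int64"},
        "quantity": {"type": "integer", "format": "int32"},
        "shipDate": {"type": "string", "format": "date-time"},
        "status": {"type": "string"},
        "complete": {"type": "boolean", "default": False},
    },
}
supported = properties_to_supported(order_definition)
```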

Operations on Classes:

Operations are defined in OpenAPI under the `paths` key. For each one we are given an endpoint (the path), the method (get, post, put, delete), and the input, response and other details. We extract these details and create a supportedOperation using `HydraClassOp`, which we later add to the Hydra class defined previously. We parse each method and add it to the API documentation for the respective class. Here is how it looks!

OpenAPI Documentation

paths:
  /pet:
    post:
      tags:
      - "pet"
      summary: "Add a new pet to the store"
      description: ""
      operationId: "addPet"
      consumes:
      - "application/json"
      - "application/xml"
      produces:
      - "application/xml"
      - "application/json"
      parameters:
      - in: "body"
        name: "body"
        description: "Pet object that needs to be added to the store"
        required: true
        schema:
          $ref: "https://schema.org/mainEntity"
      responses:
        405:
          description: "Invalid input"
      security:
      - petstore_auth:
        - "write:pets"
        - "read:pets"

Hydra Api Documentation

"supportedClass": [
    {
        "@id": "vocab:Pet",
        "@type": "hydra:Class",
        "description": "Pet",
        "supportedOperation": [
            {
                "@type": "http://schema.org/UpdateAction",
                "expects": "vocab:Pet",
                "method": "POST",
                "possibleStatus": [
                    {
                        "description": "Invalid input",
                        "statusCode": 405
                    }
                ],
                "returns": "null",
                "title": "Add a new pet to the store"
            }
        ],
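
The extraction described above can be sketched roughly as follows. `path_item_to_operations` is a hypothetical helper, and the rules it uses (expect the class only when there is a body parameter, always return "null", skip non-numeric response keys) are simplifying assumptions that happen to reproduce the example, not the parser's actual logic.

```python
HTTP_METHODS = {"get", "post", "put", "delete"}


def path_item_to_operations(path_item: dict, class_id: str) -> list:
    """Turn the methods under one OpenAPI path into supportedOperation entries."""
    operations = []
    for method, details in path_item.items():
        if method not in HTTP_METHODS:
            continue  # skip keys such as path-level "parameters"
        has_body = any(
            param.get("in") == "body" for param in details.get("parameters", [])
        )
        statuses = [
            {"description": resp.get("description", ""), "statusCode": int(code)}
            for code, resp in details.get("responses", {}).items()
            if str(code).isdigit()  # ignore keys such as "default"
        ]
        operations.append({
            "expects": class_id if has_body else "null",
            "method": method.upper(),
            "possibleStatus": statuses,
            "returns": "null",
            "title": details.get("summary", ""),
        })
    return operations


pet_path = {
    "post": {
        "summary": "Add a new pet to the store",
        "parameters": [{"in": "body", "name": "body", "required": True}],
        "responses": {405: {"description": "Invalid input"}},
    }
}
ops = path_item_to_operations(pet_path, "vocab:Pet")
```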

Once we had established these broad mappings, we moved on to implementing them in phase 1.

First Iteration (56c32231a)

In this iteration we started by parsing the objects defined under the `definitions` key and adding those classes to the HydraDoc; then we parsed the paths to get the endpoints and the operations on them. The thing to note is that we added classes and operations to the doc as soon as they were parsed. We maintained 2 global variables containing the parsed classes; as operations were parsed, we appended them and associated them with their classes.

Now all this worked great! But it was way too limiting and hard to maintain. We had also made several assumptions (some voluntary, some accidental). For example, we assumed classes would always be defined under the `definitions` key, which certainly wasn't true. We also assumed that the endpoint would contain the class name, which on further study turned out to be untrue as well.
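
One way to avoid the first assumption is to follow the `$ref` pointer itself instead of looking under a fixed key. The sketch below resolves a local JSON-pointer-style reference against the loaded document; `resolve_local_ref` is a hypothetical helper and not necessarily how the parser ended up doing it.

```python
def resolve_local_ref(document: dict, ref: str):
    """Return (class_name, definition) for a local reference like '#/definitions/Pet'."""
    if not ref.startswith("#/"):
        raise ValueError(f"not a local reference: {ref}")
    # Undo JSON Pointer escaping: "~1" stands for "/" and "~0" for "~".
    parts = [
        part.replace("~1", "/").replace("~0", "~")
        for part in ref[2:].split("/")
    ]
    node = document
    for part in parts:
        node = node[part]  # raises KeyError if the reference is dangling
    return parts[-1], node


spec = {
    "definitions": {"Pet": {"type": "object"}},
    "components": {"schemas": {"Order": {"type": "object"}}},
}
pet_name, pet_def = resolve_local_ref(spec, "#/definitions/Pet")
order_name, order_def = resolve_local_ref(spec, "#/components/schemas/Order")
```

Because the class name comes from the last segment of the reference, the same code works wherever the class is stored in the document.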

We were still trying to find a way to identify collections in an OpenAPI doc.

The second iteration came about to remove these assumptions and to generalise the parser so that it could cover most OpenAPI documents.

Second Iteration

We discussed that it was time to upgrade hydrus to accommodate the requirements of OpenAPI. We had to add the following features to the existing hydrus:

Allow custom endpoints: Until now the endpoint was the same as the class name; we quickly realised this had to change.
Allow arrays as input parameters: In OpenAPI we had several cases where we needed to pass a number of ids to the endpoint in an array. hydrus did not support this at the time; now it does 😌

We made minor operational changes during this time too, like the change of id to uuid; my mate Sandeep did this part. Meanwhile, we were talking to the hydra:publiclist about how to achieve client-server synchronisation using checksums and other mechanisms.

After the changes were made to hydrus, I had to implement them in the parser, but 😅 the parser was very tough to modify because its functions were tightly coupled. Making a change in one place broke something somewhere else, because things were getting added to the Hydra doc as soon as they were parsed, and we had to keep a record of what had been parsed and what hadn't. I realised I could finish this for now, but it would be hard for upcoming developers to modify. The second phase was about to end in, I think, 5 days, but I decided I had to make the parser more maintainable with a better code design (thanks Lorenzo and Akshay). I took a deep breath and started writing a new parser, learning from the previous iterations and working around problems more gracefully than before. This is how the third iteration was born, in 2 days!

Third Iteration

I started by maintaining a global `dict` containing all the info about the classes and operations parsed so far; the idea was to not add anything to the HydraApiDocumentation until the parsing was done. The `dict` had a structure like the one below:

global_ = {
    "class_names": set(),          # names of the classes parsed so far
    "class_name1": {
        "name": "class_name1",     # we put the class name here
        "class_definition": None,  # HydraClass
        "op_definition": [],       # [HydraSupportedOperation]
        "prop_definition": [],     # [HydraSupportedProperty]
        "collection": False,       # bool
        "path": None,              # latest endpoint assigned to the class
    },
    "class_name2": {
        "name": "class_name2",     # we put the class name here
        "class_definition": None,  # HydraClass
        "op_definition": [],       # [HydraSupportedOperation]
        "prop_definition": [],     # [HydraSupportedProperty]
        "collection": False,       # bool
        "path": None,              # latest endpoint assigned to the class
    },
}

This for sure made working with the parser a 😂 (I was going for fun there).

Now we could parse classes and operations and change them easily at any point without worrying about breaking something. At the end, we simply iterate over this `dict` and add everything to the HydraApiDocumentation.

for name in global_["class_names"]:
    # Attach every property and operation collected for this class.
    for prop in global_[name]["prop_definition"]:
        global_[name]["class_definition"].add_supported_prop(prop)
    for op in global_[name]["op_definition"]:
        global_[name]["class_definition"].add_supported_op(op)
    # A collection class should not also be flagged as an endpoint.
    if global_[name]["collection"] is True:
        if global_[name]["class_definition"].endpoint is True:
            global_[name]["class_definition"].endpoint = False

    # Only now is the finished class added to the API documentation.
    api_doc.add_supported_class(
        global_[name]["class_definition"],
        global_[name]["collection"],
        collection_path=global_[name]["path"])
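
The collecting side of this design can be sketched as below. `entry_for` is a hypothetical helper, and plain dicts stand in for the Hydra objects; the idea it illustrates is that anything may be recorded in any order because nothing reaches the API doc until the final loop.

```python
def entry_for(registry: dict, name: str) -> dict:
    """Return the record for a class, creating an empty one on first use."""
    if name not in registry["class_names"]:
        registry["class_names"].add(name)
        registry[name] = {
            "name": name,
            "class_definition": None,
            "op_definition": [],
            "prop_definition": [],
            "collection": False,
            "path": name,  # assumption: default the endpoint to the class name
        }
    return registry[name]


registry = {"class_names": set()}

# An operation on Pet is seen in the paths before the class itself is parsed.
entry_for(registry, "Pet")["op_definition"].append({"method": "POST"})
entry_for(registry, "Pet")["path"] = "pet"

# Later the definition is parsed and recorded on the same entry.
entry_for(registry, "Pet")["class_definition"] = {"title": "Pet"}
entry_for(registry, "Pet")["prop_definition"].append({"title": "name"})

# Flags can still change at any point before the final loop runs.
entry_for(registry, "Pet")["collection"] = True
```

One tradeoff of keeping the names set and the records in the same `dict` is that a class literally called `class_names` would collide with the bookkeeping key.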

References :

Blog Posts :

https://medium.com/openapi-hydra-parser-gsoc2018

Written by Vaibhav Chellani