Boost your YAML with autocompletion and validation

Alex Molev
Sep 1, 2020 · 10 min read


What is this article about

In this article I will walk you through, step by step, how to develop autocompletion and validation for YAML files, and how to set up a development environment for Visual Studio Code and IntelliJ. I’ll wrap up with how to automatically make the JSON Schema available in many IDEs, so everyone can use it without configuring it manually. Currently the supported IDEs are:

  • IntelliJ IDEA
  • PhpStorm
  • PyCharm
  • Rider
  • RubyMine
  • Visual Studio 2013+
  • Visual Studio Code
  • Visual Studio for Mac
  • WebStorm
  • JSONBuddy

Overview

Recently there has been an increase in YAML/JSON usage in different DevTools. We can see it in tools such as K8S, Terraform, CloudFormation, CircleCI, etc. It provides an easy, declarative way to define an application, infrastructure, or CI/CD pipeline. The benefit is that without any prior knowledge of a programming language you can easily define readable (arguably) configuration. The downside is that it’s configuration, and most of the time we end up reading documentation or looking for examples to understand the structure.

In Cloudify we use blueprints, which, as you can guess, are YAML files whose purpose is to declaratively define Environment as a Service. One of the challenges is remembering the structure, as there are many options. To make life easier for our customers, we added autocompletion and validation for the blueprint files with the assistance of JSON Schema. Yes, JSON Schema works for both JSON and YAML files. It allows you to add descriptions to properties, declare a property’s type, mark a property as mandatory, seal an object to particular properties, change a property’s structure based on a value, and more.

JSON Schema

JSON Schema is a vocabulary that allows you to annotate and validate JSON documents. https://json-schema.org

There is a publicly available PDF that covers all the options.
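As a toy illustration of my own (not taken from the Cloudify schema), a schema that annotates and validates a single property looks like this:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "port": {
      "description": "The port the server listens on.",
      "type": "number"
    }
  },
  "required": ["port"]
}
```

A YAML document containing port: 8000 passes; port: eighty, or a document with no port at all, is flagged by the IDE.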

YAML Language Support by Red Hat

Provides comprehensive YAML Language support to Visual Studio Code, via the yaml-language-server, with built-in Kubernetes syntax support.

It supports JSON Schema draft-07 and below.

SchemaStore

SchemaStore.org is an open source project that manages different JSON Schemas. It already contains a vast number of schemas and encourages others to contribute to its GitHub repo. It’s already integrated with a number of IDEs, which makes it easier to distribute your solution among your users.

So let’s get started!!!

Cloudify Blueprint YAML

Before jumping in and implementing the JSON Schema file, let’s walk through what a Cloudify blueprint file looks like and what it consists of.

tosca_definitions_version: cloudify_dsl_1_3

imports:
  - http://cloudify.co/spec/cloudify/5.0.5/types.yaml

inputs:
  webserver_port:
    description: The HTTP web server port.
    default: 8000

node_templates:
  http_web_server:
    type: cloudify.nodes.WebServer
    properties:
      port: { get_input: webserver_port }
    interfaces:
      cloudify.interfaces.lifecycle:
        create:
          implementation: install.py
          executor: central_deployment_agent
        delete:
          implementation: uninstall.py
          executor: central_deployment_agent

The blueprint above defines the installation and deletion of a web server. I’ll start by providing the final JSON schema for the YAML above and then explain each section. In the JSON schema I’ve tried to cover as many topics as possible.

{
  "$id": "https://example.com/blueprint.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Blueprint",
  "description": "Cloudify Blueprint as described https://docs.cloudify.co/5.0.5/developer/blueprints/",
  "type": "object",
  "properties": {
    "tosca_definitions_version": {
      "type": "string",
      "enum": [
        "cloudify_dsl_1_0",
        "cloudify_dsl_1_1",
        "cloudify_dsl_1_2",
        "cloudify_dsl_1_3"
      ]
    },
    "imports": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "inputs": {
      "type": "object",
      "properties": {
        "webserver_port": {
          "type": "object",
          "properties": {
            "description": { "type": "string" },
            "type": { "type": "string" },
            "default": { "type": "number" }
          }
        }
      }
    },
    "node_templates": {
      "type": "object",
      "patternProperties": {
        "": {
          "type": "object",
          "properties": {
            "type": { "type": "string" },
            "properties": { "type": "object" },
            "interfaces": {
              "type": "object",
              "patternProperties": {
                "": {
                  "type": "object",
                  "properties": {
                    "create": { "$ref": "#/definitions/interfaceAction" },
                    "delete": { "$ref": "#/definitions/interfaceAction" }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "definitions": {
    "interfaceAction": {
      "type": "object",
      "properties": {
        "implementation": { "type": "string" },
        "executor": { "type": "string" }
      }
    }
  }
}

The final result is more complex and can be found here, as inspiration for what can be done with JSON Schema.

Setting up development environment for Visual Studio Code

I’m using a MacBook and all the examples will be written accordingly. On other OSs the differences are not significant, so you should be able to follow the guidelines.

Install YAML Plugins

First things first, you need to install the YAML Language plugin.

Open Extensions by going to Code->Preferences->Extensions or use keyboard shortcut SHIFT+CMD+X

Look for YAML and install YAML Language Support by Red Hat

Apply JSON Schema to your YAML files

Open Settings by going to Code->Preferences->Settings or use keyboard shortcut CMD+,

Open the Schema settings: search for YAML (you should find it under Extensions), look for Yaml: Schemas, and click on Edit in settings.json

Add a file match to apply the JSON Schema to YAML files.

{
  "yaml.schemas": {
    "/Users/alex/WS/IntelliSense/cloudify.json": ["*.cfy.yaml"]
  }
}

In the example above the JSON schema is stored in the cloudify.json file, and it will be applied to all files that end with .cfy.yaml.
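The extension can also fetch the schema from a URL instead of a local path. For example (the URL below is hypothetical until the schema is actually published):

```json
{
  "yaml.schemas": {
    "https://json.schemastore.org/cloudifyblueprint": ["*.cfy.yaml"]
  }
}
```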

Setting up development environment for PyCharm

I’m using PyCharm by JetBrains, but you can do the same with the other JetBrains IDEs.

Open preferences by going to PyCharm->Preferences or use keyboard shortcut CMD+,

Look for JSON Schema in the sidebar

Add your mapping; in my example I’m adding a mapping with the name cloudify.

Schema file or URL points to the JSON Schema file.

The JSON Schema will be applied to all files that end with .cfy.yaml.

Explaining JSON Schema for blueprint

Every JSON schema has to start with the following properties:

{
  "$id": "https://example.com/blueprint.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Blueprint",
  "description": "Cloudify Blueprint as described https://docs.cloudify.co/5.0.5/developer/blueprints/",
  "type": "object",
  "properties": {
    ...
  }
  ...
}

Even though the properties are self-explanatory, let’s go over them one by one:

  • $id: A unique identifier for the schema; best practice is for it to be a publicly available URL. It’s not mandatory, and you can skip it if you don’t have references to other schemas.
  • $schema: The JSON Schema draft version. The examples here use draft-07, the latest draft supported by the YAML Language Support extension; it’s always best to check the official website for the current draft.
  • title: The title of the schema, used for display purposes only.
  • description: A description of the schema, used for display purposes only.
  • type: The type of the object.
  • properties: Holds all of the JSON property definitions.

tosca_definitions_version

Let’s take the first line and add it to our schema:

tosca_definitions_version: cloudify_dsl_1_3
...

tosca_definitions_version is a top-level property that specifies the DSL version. The versions that are currently defined are cloudify_dsl_1_0, cloudify_dsl_1_1, cloudify_dsl_1_2, and cloudify_dsl_1_3.

{
  ...
  "properties": {
    "tosca_definitions_version": {
      "type": "string",
      "enum": [
        "cloudify_dsl_1_0",
        "cloudify_dsl_1_1",
        "cloudify_dsl_1_2",
        "cloudify_dsl_1_3"
      ]
    }
    ...
  }
  ...
}
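To see what the enum keyword buys you, here is a minimal hand-rolled sketch in Python (my own illustration of the rule, not the validator the IDE plugins actually use — they rely on the yaml-language-server):

```python
import json

# The "tosca_definitions_version" fragment from the schema above.
fragment = json.loads("""
{
  "type": "string",
  "enum": ["cloudify_dsl_1_0", "cloudify_dsl_1_1",
           "cloudify_dsl_1_2", "cloudify_dsl_1_3"]
}
""")

def validate_enum(value, fragment):
    # "enum" means the value must equal one of the listed literals,
    # and "type": "string" means it must be a string.
    return isinstance(value, str) and value in fragment["enum"]

print(validate_enum("cloudify_dsl_1_3", fragment))  # True
print(validate_enum("cloudify_dsl_2_0", fragment))  # False
```

This is exactly the check the IDE performs as you type, surfacing the allowed values as completions.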

imports

imports enable the author of a blueprint to reuse blueprint files, or parts of them, and to use predefined types.

...
imports:
- http://cloudify.co/spec/cloudify/5.0.5/types.yaml
...

As you can see, imports is an array of URLs; let’s create a property for it.

{
  ...
  "properties": {
    ...
    "imports": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
    ...
  }
}

inputs

inputs are parameters that are injected into a blueprint when it’s executed.

...
inputs:
  webserver_port:
    description: The HTTP web server port.
    default: 8000
...

There may be multiple inputs, and each input schema has the following rules:

  • description: is a string
  • type: is a string
  • default: any
  • constraints: list of dicts
  • required: boolean
{
  "properties": {
    ...
    "inputs": {
      "type": "object",
      "properties": {
        "webserver_port": {
          "type": "object",
          "properties": {
            "description": { "type": "string" },
            "type": { "type": "string" },
            "default": { "type": "number" }
          }
        }
      }
    }
    ...
  }
}
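The snippet above covers only description, type, and default. As a hedged sketch of how the two remaining rules from the list could be expressed (this part is my own extension, not taken from the article’s schema), constraints becomes an array of objects and required a boolean; an empty schema {} for default admits any type:

```json
{
  "webserver_port": {
    "type": "object",
    "properties": {
      "description": { "type": "string" },
      "type": { "type": "string" },
      "default": {},
      "constraints": {
        "type": "array",
        "items": { "type": "object" }
      },
      "required": { "type": "boolean" }
    }
  }
}
```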

node_templates

node_templates:
  http_web_server:
    type: cloudify.nodes.WebServer
    properties:
      port: { get_input: webserver_port }
    interfaces:
      cloudify.interfaces.lifecycle:
        create:
          implementation: install.py
          executor: central_deployment_agent
        delete:
          implementation: uninstall.py
          executor: central_deployment_agent

node_templates is an interesting use case. We want to allow it to be a dictionary; in our case we have only one key, http_web_server. To allow any key we will use patternProperties (an empty pattern is a regular expression that matches any key) instead of the previously used properties.

Each key in the dictionary is an object which has 3 properties:

  • type is a string.
  • properties is an object.
  • interfaces is a dictionary similar to node_templates; each interface key has 2 actions, create and delete, which look the same. In that case we can declare a definition and add a $ref to it in each action.

{
  ...
  "properties": {
    "node_templates": {
      "type": "object",
      "patternProperties": {
        "": {
          "type": "object",
          "properties": {
            "type": { "type": "string" },
            "properties": { "type": "object" },
            "interfaces": {
              "type": "object",
              "patternProperties": {
                "": {
                  "type": "object",
                  "properties": {
                    "create": { "$ref": "#/definitions/interfaceAction" },
                    "delete": { "$ref": "#/definitions/interfaceAction" }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "definitions": {
    "interfaceAction": {
      "type": "object",
      "properties": {
        "implementation": { "type": "string" },
        "executor": { "type": "string" }
      }
    }
  }
}
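Local references like #/definitions/interfaceAction are pointers into the same document. As a hedged illustration (this is not how the IDE plugins are implemented, and real JSON Pointer resolution also handles escaping), resolving one is just walking the path segments:

```python
# The "definitions" section from the schema above.
schema = {
    "definitions": {
        "interfaceAction": {
            "type": "object",
            "properties": {
                "implementation": {"type": "string"},
                "executor": {"type": "string"},
            },
        }
    }
}

def resolve_local_ref(ref, document):
    # "#/definitions/interfaceAction" -> walk ["definitions", "interfaceAction"]
    if not ref.startswith("#/"):
        raise ValueError("only local refs are handled in this sketch")
    node = document
    for part in ref[2:].split("/"):
        node = node[part]
    return node

target = resolve_local_ref("#/definitions/interfaceAction", schema)
print(sorted(target["properties"]))  # ['executor', 'implementation']
```

Because create and delete both $ref the same definition, a change to interfaceAction is picked up by every action that references it.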

Taking JSON Schema to the next level

The example we walked through above is a very simple use case; JSON Schema has some advanced topics. I recommend reading more about the JSON Schema options and browsing other JSON Schemas in the SchemaStore repo to get a better understanding of how they can be built and how to solve different use cases.

Going Production

Once we are done with the development, we can move to the next phase: making the schema available to everyone without each user having to download it and configure their IDE locally.

We will be using the open source SchemaStore to make our schema available in the following IDEs:

  • IntelliJ IDEA
  • PhpStorm
  • PyCharm
  • Rider
  • RubyMine
  • Visual Studio 2013+
  • Visual Studio Code
  • Visual Studio for Mac
  • WebStorm
  • JSONBuddy

The process is very easy. It takes less than 20 minutes of effort to prepare a request to add your schema to the SchemaStore repo, and it took me less than half a day to get approval.

Prerequisite

  1. A GitHub account.
  2. Git installed on your laptop.
  3. npm installed on your laptop.

So let’s go through, step by step, what you need to do:

Fork the SchemaStore project

Go to the SchemaStore repo and fork it to your account

Clone it to your laptop

Go to the forked SchemaStore repository in your GitHub account and copy the repo link

git clone git@github.com:alexmolev/schemastore.git

Enter the cloned repo:

cd schemastore

Open a new branch

Run the following command, replacing add_cloudify_blueprint_schema with your schema name.

git checkout -b add_cloudify_blueprint_schema

Add Your Schema File

Copy your JSON Schema file to the src/schemas/json directory; in my case it was cloudifyblueprint.json

Add schema to the catalog

Open the file src/api/json/catalog.json.

Add the following JSON snippet to the catalog file:

{
  "name": "cloudifyblueprint",
  "description": "Schema for Cloudify Blueprint",
  "fileMatch": ["*.cloudify.yaml"],
  "url": "https://json.schemastore.org/cloudifyblueprint"
}

Add Tests

Create a folder with your schema name in the following path: src/test/cloudifyblueprint

Even though we have created a JSON Schema that targets YAML files, we need to provide a valid JSON file that matches the JSON Schema we added.

In my case I’ve added the file cloudify-test.json

Feel free to add more than one file to cover different use cases
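For instance, a minimal test file could be the blueprint from the beginning of the article expressed as JSON (my guess at a valid instance — the article does not show the actual contents of cloudify-test.json):

```json
{
  "tosca_definitions_version": "cloudify_dsl_1_3",
  "imports": ["http://cloudify.co/spec/cloudify/5.0.5/types.yaml"],
  "inputs": {
    "webserver_port": {
      "description": "The HTTP web server port.",
      "default": 8000
    }
  }
}
```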

Validate Build Pass

Go back to the src directory

Run the command npm install to install all the dependencies

Run the command npm run build to build and see if tests passed successfully

The output is very long, but you should look for the following things:

Your JSON Schema passed

Your test passed

Commit and push your Changes

Check the git status and see the changed files.

git status

Not sure why, but running the build modified 3 additional files that are not part of our change.

Make sure to add only your files to the commit by executing the command:

git add api/json/catalog.json schemas/json/cloudifyblueprint.json test/cloudifyblueprint/

Commit your changes with a valid message

git commit -m "Adding Cloudify Blueprint Schema"

Push your changes to your forked repo:

git push origin add_cloudify_blueprint_schema

Open Pull Request

Go to your GitHub account and open the forked SchemaStore repo. You’ll see a suggestion to open a pull request for the branch you just pushed.

Click on the Compare & pull request button.

Review that all the details are correct. Make sure that the pull request is opened against SchemaStore; the image below shows it going from my fork to the SchemaStore organization.

Double check that all the changes are correct and all the files in place.

Don’t forget to press the Create pull request button, then wait until it is merged to master.

Validate the JSON Schema from SchemaStore is used in PyCharm

Open Settings, use keyboard shortcut CMD+,

In the sidebar look for Remote JSON Schemas and validate that Allow downloading JSON Schema... and Use schemastore.org JSON Schema catalog are checked.

Create a file that matches the fileMatch pattern.

Validate that in the bottom-right corner of the status bar you can see your schema name.

Conclusion

Creating a JSON schema is very easy, and more importantly, if you are building a tool that relies on YAML configuration, whether for customers or for in-house use, a schema can boost adoption and productivity.

Nevertheless, if you have a complex JSON Schema, or parts of it are updated frequently, I recommend writing a script that generates the file automatically and making it part of your CI/CD pipeline, so your JSON Schema is always up to date.
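As a minimal sketch of that idea (assuming the DSL version list is the part that changes; the names and structure here are illustrative, not Cloudify's actual build tooling):

```python
import json

# Single source of truth for the DSL versions; regenerate the schema from it
# in CI so the published schema never drifts from the product.
DSL_VERSIONS = [
    "cloudify_dsl_1_0",
    "cloudify_dsl_1_1",
    "cloudify_dsl_1_2",
    "cloudify_dsl_1_3",
]

schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Blueprint",
    "type": "object",
    "properties": {
        "tosca_definitions_version": {"type": "string", "enum": DSL_VERSIONS},
    },
}

schema_text = json.dumps(schema, indent=2)
print(schema_text)  # write this to cloudifyblueprint.json in your build step
```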
