A guide to tagging resources in AWS
Jason Haines, Principal Consultant at Versent, explains how to make AWS visibility and management easier by tagging your resources.
When using AWS at scale, tagging becomes an essential tool to manage and report on your resources.
This post gives guidance on defining a tagging policy tailored to your organisation. Before we jump in and talk about how to tag, it is worthwhile understanding what tagging is and why you would use it.
Tagging associates metadata with each resource in AWS. This metadata is not required for AWS to operate. A common example is the ‘Name’ tag, which lets you give a resource a human-readable description rather than relying on the Instance ID. Almost all AWS resources support tagging. Currently, up to 50 tag key/value pairs can be assigned to each resource.
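As a minimal sketch of what this metadata looks like in practice, the snippet below assumes tags arrive as a list of Key/Value dictionaries (the shape commonly returned by the AWS SDKs) and falls back to the resource ID when no ‘Name’ tag is present. The helper names are illustrative, not part of any AWS API.

```python
def tags_to_dict(tag_list):
    """Convert [{"Key": k, "Value": v}, ...] into a plain dict."""
    return {tag["Key"]: tag["Value"] for tag in tag_list}


def display_name(resource_id, tag_list):
    """Prefer the human-readable 'Name' tag, else fall back to the ID."""
    return tags_to_dict(tag_list).get("Name", resource_id)


if __name__ == "__main__":
    print(display_name("i-0123456789abcdef0", [{"Key": "Name", "Value": "web-frontend"}]))
    print(display_name("i-0123456789abcdef0", []))
```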
Why bother tagging items?
Overall, it is incredibly hard to manage your AWS resources without tagging. When your manager asks you why instance i-abcdexpense is an x1 instance, you want to answer quickly and know who to speak to, to resolve the problem.
Other reasons for tagging resources:
- Reporting — tagging data can be used for reporting using AWS Console and billing reports;
- Management — tools like Stax use tags to group and classify your cloud data;
- Permissions — permissions on many AWS resources can be restricted based on tag data;
- Filtering — AWS Resource Groups allow the console to be filtered based on tag values;
- Automated Processes — tags can store key data items that drive scheduled shutdowns, backup policies, or AWS Config policies.
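To illustrate the automated-process case, here is a small sketch of a scheduled-shutdown selector. It assumes a hypothetical tag, myorg:shutdownHourUtc, holding the hour (0 to 23) at which an instance should be stopped, and it works on plain dictionaries rather than calling AWS.

```python
# Hypothetical tag key; your tagging policy would define the real one.
SHUTDOWN_TAG = "myorg:shutdownHourUtc"


def instances_to_stop(instances, current_hour_utc):
    """Return IDs of instances whose shutdown tag matches the current hour.

    `instances` is a list of {"id": str, "tags": dict} records.
    Instances without the tag, or with a non-numeric value, are skipped.
    """
    due = []
    for instance in instances:
        raw = instance["tags"].get(SHUTDOWN_TAG)
        if raw is None:
            continue
        try:
            hour = int(raw)
        except ValueError:
            continue  # malformed value: leave the instance alone
        if hour == current_hour_utc:
            due.append(instance["id"])
    return due
```

Skipping malformed values rather than failing is a design choice: an automated job should not stop a resource based on a tag it cannot interpret.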
“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
It’s important to get agreement on what tags to use quickly — before you get invited to your 10th tagging meeting. Our recommendation is to build your own tagging policy based on best practices, publish it within your organisation and then enforce it.
Below are some principles to consider when defining your tagging strategy. They won’t all apply in every situation, and the strategy you settle on should reflect the unique requirements of your organisation.
Master Data source
AWS Tags should not be used as a de-facto configuration management database. Where a Master Data source exists in your organisation, your AWS tags should be used as a foreign key pointing to that data.
If your organisation already has a Configuration Management Database, store in the tag a value that uniquely identifies a record in that data source, in the same way that a Social Security Number uniquely identifies US citizens and residents.
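The foreign-key idea can be sketched as follows. The in-memory CMDB dictionary and its fields are made up for illustration; in practice the lookup would go to whatever master data source your organisation runs. The tag key info:appId is borrowed from the Name tag example later in this post.

```python
# Stand-in for a real master data source (hypothetical records).
CMDB = {
    "101": {"name": "Payments API", "owner": "payments-team@example.com"},
}


def resolve_application(tags, cmdb):
    """Follow the 'info:appId' tag to the authoritative record, if any."""
    app_id = tags.get("info:appId")
    if app_id is None:
        return None
    return cmdb.get(app_id)


if __name__ == "__main__":
    print(resolve_application({"info:appId": "101"}, CMDB))
```

Only the identifier lives on the resource; the owner and other details stay in the master data source, so they cannot drift out of date in the tags.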
Use redundant data sparingly
It can be useful to store a limited amount of redundant data in AWS tags, in particular values that are inconvenient to look up during operational use. For example, during an outage investigation it is easier to read an application name, or the email address of an application owner, than to look it up in another system.
In general, values that are not easily human readable (e.g. GUIDs or infrequently used IDs) can have a human-readable name or description added in another tag. e.g. A tag ApplicationGUID may be supplemented with a tag ApplicationName.
Some downstream systems can also make use of redundant data.
Data available through AWS APIs should not be duplicated in Tags.
Use the Name tag for redundant data
The ‘Name’ tag is frequently used in the AWS GUI to identify resources. It is cosmetic and should be composed of redundant data. It helps to sort the resources in the AWS console and keep them grouped. A common anti-pattern is to use the Name tag as a compound value that holds data not available anywhere else.
If the Name tag contains compounded fields, these fields should also be available through other tags. e.g. if Name=‘PROD-app101’ then there should be other tags carrying this information: info:env=PROD and info:appId=101.
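One way to keep the Name tag purely redundant is to derive it from the other tags and check that the two agree. The sketch below assumes the naming pattern from the example above (environment, a hyphen, then "app" and the application ID); your own pattern will differ.

```python
def expected_name(tags):
    """Build the cosmetic Name value from the authoritative tags."""
    return f"{tags['info:env']}-app{tags['info:appId']}"


def name_is_consistent(tags):
    """True if Name matches what the other tags say it should be."""
    try:
        return tags.get("Name") == expected_name(tags)
    except KeyError:
        # A component tag is missing, so Name carries data found nowhere else.
        return False
```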
Tags should be (semi-) static information
Information that is frequently updated should not be stored in tags. e.g. CreatedBy could be tagged, but LastRebootedAt should not be.
Environments are elastic
A traditional data centre often has a fixed number of environments. For non-production, these environments are often shared-use or particular environments (e.g. stress/load testing) are booked for exclusive use. In a cloud environment, multiple environments can be provisioned on-demand and terminated when no longer required.
Tagging policy on environment names should reflect this.
Tag keys can use colon-separated namespaces (e.g. AWS reserves the prefix ‘aws:’ for its own tag keys). You can use a prefix such as ‘myorg:’ for tags specific to your organisation. Generally applicable tags can use a namespace like ‘info:’. Sub-namespaces can be reserved for use by departments in the organisation, such as ‘myorg:app:’ for application developer data.
It is important to check that your namespace is supported by all AWS resources. Some AWS resources have slightly different specifications.
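A short sketch of how tooling might work with namespaced keys: it splits a key into its namespace and local name and flags keys that collide with the reserved ‘aws:’ prefix. Treating the prefix check as case-insensitive is an assumption made here for safety.

```python
def split_key(key):
    """Split 'myorg:app:owner' into ('myorg:app', 'owner').

    Keys without a namespace, such as 'Name', return (None, 'Name').
    """
    namespace, separator, name = key.rpartition(":")
    return (namespace if separator else None), name


def uses_reserved_prefix(key):
    """True if the key starts with the prefix AWS reserves for itself."""
    return key.lower().startswith("aws:")
```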
Use camelCase for Key names
AWS-generated tags (such as aws:cloudformation:stack-id) use kebab-case. However, kebab-case is not supported by all AWS resource types. Instead, use camelCase key names by preference (e.g. ‘info:createdBy’). Ensure that the case convention is applied consistently to all keys and values.
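The convention can be checked mechanically. The pattern below is one possible reading of it, assuming lowercase namespace segments followed by a camelCase local name; adjust it to whatever your policy actually specifies.

```python
import re

# One or more lowercase 'namespace:' segments, then a camelCase name.
KEY_PATTERN = re.compile(r"^(?:[a-z][a-z0-9]*:)+[a-z][a-zA-Z0-9]*$")


def is_valid_key(key):
    """True if the key follows the assumed namespace + camelCase convention."""
    return bool(KEY_PATTERN.match(key))
```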
Use a version number
Plan for the future. Your tagging needs may change over time, possibly in incompatible ways. Defining an ‘info:taggingVersion’ tag can save you problems later. Most likely, your needs will change slowly and a simple integer version value will be sufficient. You probably won’t need complicated semantic versions.
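As a sketch of why the version tag helps, the function below upgrades a tag set from one version of a policy to the next. The version 1 to version 2 change shown (renaming info:environment to info:env) is invented for illustration, as is the assumption that untagged resources are treated as version 1.

```python
CURRENT_VERSION = 2


def upgrade_tags(tags):
    """Return a copy of `tags` migrated to CURRENT_VERSION."""
    upgraded = dict(tags)
    version = int(upgraded.get("info:taggingVersion", "1"))
    if version < 2:
        # Hypothetical v1 -> v2 change: a key was renamed.
        if "info:environment" in upgraded:
            upgraded["info:env"] = upgraded.pop("info:environment")
        version = 2
    upgraded["info:taggingVersion"] = str(version)
    return upgraded
```

Because the version is a plain integer, each migration step is a simple comparison, and tooling can tell at a glance which resources still need upgrading.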
If you are storing multiple values in a tag, you need to define the structure at the start. This will ensure that everyone uses the same format. For example,
- Single-Value: TagName=value
- Multi-Value: TagName=value1:value2 or TagName=value1-value2
- Multi-Attribute: TagName=attribute1=value1/attribute2=value2
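The formats listed above are straightforward to parse once they are fixed. A minimal sketch, assuming ‘:’ as the default multi-value separator and that values themselves never contain the separator characters:

```python
def parse_multi_value(value, separator=":"):
    """'value1:value2' -> ['value1', 'value2']"""
    return value.split(separator)


def parse_multi_attribute(value):
    """'attribute1=value1/attribute2=value2' -> dict of attributes."""
    attributes = {}
    for pair in value.split("/"):
        name, _, attribute_value = pair.partition("=")
        attributes[name] = attribute_value
    return attributes
```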
Mandatory and Optional Tags
Each tag should be defined with its scope (e.g. enterprise wide, department wide, or team wide) and its compliance level (e.g. mandatory or optional). Your tagging policy should be both defined and enforced. If the policy is not enforced, it is unlikely to add much value, and you will end up with a splintered standard, which is a much harder end state to manage.
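Enforcement usually starts with a report of what is missing. The sketch below lists the mandatory keys absent from a resource’s tags; the set of mandatory keys is a made-up example, and empty values are treated as missing.

```python
# Hypothetical mandatory keys; your policy defines the real list.
MANDATORY_KEYS = frozenset({"myorg:departmentId", "info:env"})


def missing_mandatory(tags, mandatory=MANDATORY_KEYS):
    """Return the mandatory keys that are absent or empty, sorted."""
    return sorted(key for key in mandatory if not tags.get(key))


def is_compliant(tags, mandatory=MANDATORY_KEYS):
    """True when every mandatory key has a non-empty value."""
    return not missing_mandatory(tags, mandatory)
```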
Define a time format and use it consistently. Don’t invent your own; use an industry standard that has been tried and tested. For example, ISO 8601 with the complete date plus hours, minutes and seconds: YYYY-MM-DDThh:mm:ssTZD (e.g. 1997-07-16T19:20:30+01:00). I like this one as it enables you to sort by date quickly to find the most recent snapshots or backups.
Using a buildID in the tag enables you to test multiple versions of infrastructure code and quickly trace an environment back to its released code base.
Before you define your tagging policy, you should understand what your tags will be used for. This doesn’t have to be a long or complicated exercise. You should, however, spend some time documenting who will be using the tags. Tagging is used by many areas of the organisation, so considering their requirements will simplify processes across the business.
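A sketch of producing such timestamps with the Python standard library. Normalising to UTC before formatting is a choice made here, not something the format requires: it means a plain string sort matches chronological order even when the original times came from different zones.

```python
from datetime import datetime, timedelta, timezone


def tag_timestamp(moment):
    """Format an aware datetime as an ISO 8601 string in UTC, to the second."""
    if moment.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


if __name__ == "__main__":
    # The example from the text, 19:20:30 at UTC+01:00.
    local = datetime(1997, 7, 16, 19, 20, 30, tzinfo=timezone(timedelta(hours=1)))
    print(tag_timestamp(local))
```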
One typical and important use of tagging is to categorise and attribute the cost of service use. Be sure to understand your organisation’s hierarchy and where costs are attributed. You can look at how existing cost reports work (e.g. travel expenses) to help understand the structure. There will likely be multiple levels of the organisation that require reports. e.g. the CIO may want a cost report broken down by department, and each department may want a cost report by project team.
Keep in mind which master data sources your organisation has for this hierarchy data. Use well-known department or cost centre codes as appropriate.
AWS IAM policies allow you to restrict access to resources based on tag values. For example, members of a certain department can only terminate EC2 instances owned by that department. Determine the types of segregation you need and be sure your tagging policy supports them. Treat this as protection against accidental operations, as malicious users may be able to bypass it.
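The multi-level reporting described above amounts to grouping cost line items by different tag keys. A minimal sketch, using invented line items and a hypothetical myorg:projectTeam tag alongside myorg:departmentId:

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum the cost of line items grouped by the value of `tag_key`.

    Items without the tag are reported under 'untagged' so that
    unattributed spend stays visible.
    """
    totals = defaultdict(float)
    for item in line_items:
        group = item["tags"].get(tag_key, "untagged")
        totals[group] += item["cost"]
    return dict(totals)


if __name__ == "__main__":
    items = [
        {"cost": 10.0, "tags": {"myorg:departmentId": "42", "myorg:projectTeam": "alpha"}},
        {"cost": 5.5, "tags": {"myorg:departmentId": "42", "myorg:projectTeam": "beta"}},
        {"cost": 2.0, "tags": {}},
    ]
    print(cost_by_tag(items, "myorg:departmentId"))  # view for the CIO
    print(cost_by_tag(items, "myorg:projectTeam"))   # view for a department
```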
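As a rough sketch of the department example, the function below builds an IAM policy document that allows terminating only those EC2 instances whose department tag matches a given value. The tag key and department value are illustrative, and the policy should be checked against the current IAM documentation and tested before use.

```python
import json


def department_terminate_policy(department_id, tag_key="myorg:departmentId"):
    """Build a policy allowing termination of one department's instances."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ec2:TerminateInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    # Only applies when the instance's tag equals the value.
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": department_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(department_terminate_policy("42"), indent=2))
```

A policy like this is only as trustworthy as the tags it reads, so it works best when the ability to change those tags is itself restricted.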
Understand the needs of your operations and development teams. Operations will typically need to identify the owner of a particular resource and map the other way from an application to the resources it is using. Development teams will often need to distinguish between multiple, non-production environments.
Third party systems
Identify any third-party systems that have tagging requirements.
Tagging data model
You are now ready to define your data model. For each tag, you should identify:
- Key name — with namespace
- Requisiteness — whether this key is optional or mandatory
- Data type — integer, string, enumeration, etc.
- Allowed values — e.g. a set of predefined values
For example:
- Key name: myorg:departmentId
- Requisiteness: mandatory
- Data type: integer; reference to org-chart database
- Allowed values: all
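The data model above maps naturally onto a small record type. The sketch below is one possible encoding, in which allowed_values=None stands for "all"; the myorg:environment definition and its values are invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TagDefinition:
    key: str                      # key name, with namespace
    mandatory: bool               # requisiteness
    data_type: str                # "integer", "string" or "enumeration"
    allowed_values: Optional[FrozenSet[str]] = None  # None means "all"

    def accepts(self, value: str) -> bool:
        """Check a tag value against the data type and allowed values."""
        if self.data_type == "integer":
            # Tag values are strings, so accept only non-negative ASCII digits.
            if not (value.isascii() and value.isdigit()):
                return False
        if self.allowed_values is not None and value not in self.allowed_values:
            return False
        return True


DEPARTMENT = TagDefinition("myorg:departmentId", mandatory=True, data_type="integer")
ENVIRONMENT = TagDefinition(
    "myorg:environment",
    mandatory=True,
    data_type="enumeration",
    allowed_values=frozenset({"prod", "test", "dev"}),  # hypothetical values
)
```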
Here are some example tags that we often use:
For management purposes, Stax can use your resource tags to create groupings of resources, which are used to:
- Compare metrics such as Cost across groups;
- Filter views of Stax down to specific groups;
- Assign compliance rules to specific groups.
For example, a grouping might be “Environment”. The groups you create in Stax might be simply “Prod” and “Non-prod”. Every myorg:environment tag value is allocated to one of these two groups in Stax, so that you can then track, compare and report on the cost, compliance or quality for all Prod and Non-prod resources. You could assign specific compliance rules to Non-prod, and different rules to Prod.
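The allocation of tag values to groups can be pictured with a few lines of plain Python. This is not Stax’s API, only an illustration of the mapping, and the set of values counted as production is an assumption.

```python
# Hypothetical tag values that count as production.
PROD_VALUES = frozenset({"prod", "production"})


def classify_environment(tags):
    """Allocate a resource to 'Prod' or 'Non-prod' from its environment tag.

    Anything not recognised as production, including untagged resources,
    falls into 'Non-prod', which suits on-demand environment names.
    """
    environment = tags.get("myorg:environment", "").lower()
    return "Prod" if environment in PROD_VALUES else "Non-prod"
```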
Using tagging makes management and visibility of AWS through a tool like Stax much easier, as you can align your views of each critical metric to a specific part of your business, thus enabling you to have the right conversations with the right people.
Principal Consultant, Versent
Technical Director, Stax