Adding Telemetry to Your Visual Studio Projects
Ready, set, build, get coffee!
If you’re a developer working on a medium-sized project, you certainly have gone through this process. After hitting F6 in Visual Studio (VS), you go grab a cup of coffee or press Alt+Tab to read the news, check Slack… Between coffee and a snack, have you ever measured how long it takes for your team to build a project?
We have that problem too at OutSystems. As we grew, so did our product and, ultimately, our solutions. Take Service Studio as an example: it’s a solution that has more than 100 projects. The sole reason it doesn’t have more is that we’ve started to break our monolith, isolate assets (plus their projects), and include them as NuGet packages.
It Slowly Gets Worse
Like most things in life, the slowdown didn’t happen overnight. Our solution grew throughout the years, and with it, the build time. We generate a good percentage of our code by using various tools to guide developers and avoid bad patterns. Linter, PreSharp, TypeScript — these are all great enhancements to a developer’s life but can impact your build if used without proper setup or maintenance.
We have over 150 engineers (as of March 2021) from many different teams evolving Service Studio alone. With our recent port to .NET Core (now .NET only), we had to mess around with projects to create the new Service Studio and evolve the current one while continually delivering value to our customers. With over a decade of development on it, it slowly got worse.
Add to this the hypergrowth path that OutSystems is on: balancing investment in legacy removal with improving our processes, all while delivering amazing features to our customers, is not an easy thing to do. I could hardly finish that sentence, let alone juggle this. But we do have one of the best engineering teams in the world. So, we did what we do best: change!
Back to this post’s focus: how do you measure the time it takes to build your projects? If you just want to measure the time spent on your machine, stop reading: you can quickly gather those metrics with any profiling tool, even VS. But if you want to understand how your entire department is doing, then let’s do it!
The Starting Point
At OutSystems, we currently use GoCD, a CI/CD (Continuous Integration/Continuous Delivery) system that gives us metrics about the time our build takes. Those metrics show the build step taking around 17 minutes (the build itself takes a bit less, about 14 minutes).
That was a good starting point because we could see from historical data that, just one year before, our solution took half (!) that time. That’s right, in one year, we doubled the time our CI/CD server takes to build our product. One might think, “That’s just processing time,” but it isn’t. It’s time the entire group has to wait before releasing. It’s part of the feedback loop.
A simple example: if we find a critical bug in production, and we want to release a version of Service Studio quickly, that means we have *at least* a 17-minute Mean Time to Recovery (MTTR).
So, while it is not our focus, our CI/CD system provides a good measure of how much time a solution is taking.
Build Times in Every Developer’s Machine
The CI/CD system allowed us to understand how much time our developers spent building the solution. Time, in this case, relates to cost. Even so, we cannot measure certain things, like developers going for a coffee or pressing Alt+Tab and breaking focus.
Before getting into the “how,” let me illustrate this by showing you our numbers: in 10 working days, our developers built Service Studio 2,123 times (either full solution or changed projects), resulting in around 38 hours of machine processing. If you extrapolate that to a year (approximately 253 working days), it results in approx. 961 hours. That’s 40 days (considering a 24-hour day).
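The extrapolation above is simple proportional arithmetic. Here is a minimal sketch that reproduces it from the figures in this post; the function name `extrapolate_hours` is only illustrative.

```python
def extrapolate_hours(sample_hours: float, sample_days: int, target_days: int) -> float:
    """Scale the build time measured over a sample window to a longer period."""
    return sample_hours / sample_days * target_days


# Figures from this post: 38 hours of builds in 10 working days,
# extrapolated to roughly 253 working days in a year.
yearly_hours = extrapolate_hours(38, 10, 253)
yearly_days = yearly_hours / 24  # expressed in 24-hour days

print(round(yearly_hours))  # 961
print(round(yearly_days))   # 40
```

Swap in your own sample window and totals to get a first estimate for your team.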
We’ve grouped every single build coming from the developers’ machines into categories:
The results show that 11.4% of the builds in our machines take more than eight minutes, even with incremental builds and building only things we change. That’s 242 builds in just 10 days! In our case, we needed to act. What’s your scenario?
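Once you have a list of build durations, a breakdown like this takes only a few lines. The sketch below is an assumption-laden illustration: the bucket edges and the sample durations are made up, and only the eight-minute threshold and the 11.4% of 2,123 builds come from our numbers.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bucket edges in minutes; pick whatever categories suit your team.
EDGES_MIN = [1, 3, 8]
LABELS = ["<=1 min", "1-3 min", "3-8 min", ">8 min"]


def bucket_counts(durations_min):
    """Count builds per duration bucket (upper edges are inclusive)."""
    return Counter(LABELS[bisect_left(EDGES_MIN, d)] for d in durations_min)


def share_over(durations_min, threshold_min):
    """Return (count, fraction) of builds longer than the threshold."""
    slow = sum(1 for d in durations_min if d > threshold_min)
    return slow, slow / len(durations_min)


# Made-up sample durations, in minutes, just to exercise the functions.
sample = [0.5, 2, 2.5, 4, 9, 12, 0.2, 6]
counts = bucket_counts(sample)
slow_count, slow_share = share_over(sample, 8)

# Sanity check on the figures quoted in this post: 11.4% of 2,123 builds.
builds_over_eight = round(0.114 * 2123)
print(builds_over_eight)  # 242
```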
To assess your own scenario, you’ll need to gather data. Here’s how you can do it. There are three different options, the third being the best:
1. Add pre- and postbuild events to every project
It’s the “ugliest” solution. Just edit every single project, and in the prebuild and postbuild events, gather the metrics and send them. Now, what happens when you’re editing 100+ projects? No. Way. One could create a script to edit all the .csproj files and add the info, but it’s still pollution. Nevertheless, if you’re going for this approach, try playing around with MSBuild Targets. Hammer time.
2. Every project depends on a placeholder
This is a much better option. Instead of adding the telemetry code to every single project, you create one “empty” project on which all others will depend. Then, in this last project, you add your telemetry code (preferably calling an external app, since it will make your life much easier when you need to troubleshoot).
In this setup, all projects depend on TelemetryProj, which has a target hooked in with AfterTargets to send the telemetry.
3. Change Directory.Build.targets (pick this!)
This is the best approach. Using Directory.Build.targets, a solution-wide file, you can have a centralized location for the telemetry code, and it will send it for every project in the solution. Here's the code:
(You can check my GitHub here: https://gist.github.com/cjaSource)
Things to notice:
- You *must* send the telemetry by launching a separate process that you “fire and forget.” We’re calling a REST (Representational State Transfer) endpoint to register the telemetry. While REST is lightweight, it’s still subject to network latency and server errors. We don’t want to add entropy to the developer’s work.
- At OutSystems, we use Cake, a free and open-source cross-platform build automation system with a C# DSL. Essentially, instead of creating a side application, we just wrote a Cake script to do the job.
- Notice the usage of a condition (BuildingSS_fcbat). It allowed us to avoid registering build times from other sources, like the CI/CD system mentioned before. We wanted to clock the devs only. The CI/CD systems must set this variable to skip telemetry.
- Directory.Build.targets should be alongside your solution file.
As a final note, this won’t work in .NET 5 (or .NET Core), since the .NET Core version of MSBuild doesn’t support the inline-code task factory. If you use the Directory.Build.targets above, you'll get a "Task factory 'CodeTaskFactory' is not supported on .NET Core version of MSBuild" message.
To fix it, you’ll need to place the code that launches the separate process alongside the call to the REST endpoint (or however you’re registering the telemetry), outside the targets file. Then, instead of using a CodeTaskFactory, you can use a simple Exec task.
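To illustrate the “fire and forget” launch mentioned earlier, here is a minimal Python sketch. It is not the Cake script we use; the function name `fire_and_forget` and the idea of handing it a ready-made sender command are assumptions for illustration. The only point is that the launcher starts a detached child and returns right away, so a slow or failing telemetry call never delays the build.

```python
import subprocess
import sys
from typing import Sequence


def fire_and_forget(command: Sequence[str]) -> None:
    """Start `command` as a detached process and return without waiting for it."""
    kwargs = {
        # Do not share the build's console streams with the child.
        "stdin": subprocess.DEVNULL,
        "stdout": subprocess.DEVNULL,
        "stderr": subprocess.DEVNULL,
    }
    if sys.platform == "win32":
        # Detach from the parent console and process group on Windows.
        kwargs["creationflags"] = (
            subprocess.DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
        )
    else:
        # Put the child in its own session on POSIX systems.
        kwargs["start_new_session"] = True
    subprocess.Popen(list(command), **kwargs)


# Hypothetical usage: the child would be whatever script posts to your endpoint.
# fire_and_forget([sys.executable, "send_telemetry.py", "MyProject", "12.3"])
```

Any failure inside the child stays invisible to the developer, which is the intent here; if you need to troubleshoot, have the child write to its own log file.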
Regardless of the option, it will give you telemetry by project, which is great because it pinpoints the projects that deserve a closer look first. If you want to have just the full build time, you’ll need to do some post-processing. In our case, we had a mechanism that ran every night to crunch the numbers.
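As a sketch of what that post-processing might look like, the snippet below rolls per-project records up into full build times and ranks projects by total time. The record shape is hypothetical (in particular, the `build_id` field that ties projects to one build invocation is an assumption, not something our telemetry is known to send), and the sample values are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class ProjectBuild(NamedTuple):
    build_id: str  # hypothetical: identifies one build invocation on one machine
    project: str
    start: float   # seconds since some reference point
    end: float


def full_build_seconds(records):
    """Wall-clock time per build: latest project end minus earliest project start."""
    spans = {}
    for r in records:
        lo, hi = spans.get(r.build_id, (r.start, r.end))
        spans[r.build_id] = (min(lo, r.start), max(hi, r.end))
    return {build: hi - lo for build, (lo, hi) in spans.items()}


def slowest_projects(records, top=3):
    """Projects ranked by the total time spent building them across all builds."""
    totals = defaultdict(float)
    for r in records:
        totals[r.project] += r.end - r.start
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Made-up sample records.
records = [
    ProjectBuild("b1", "Core", 0.0, 40.0),
    ProjectBuild("b1", "UI", 10.0, 100.0),
    ProjectBuild("b2", "Core", 500.0, 530.0),
]
per_build = full_build_seconds(records)
ranking = slowest_projects(records)
```

Using the earliest start and latest end, rather than summing project times, avoids overstating builds where projects compile in parallel.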
We’re currently acting on this problem (at the time of writing), but we don’t want to “fix it and pop the Champagne” because that will lead us to the same scenario a year from now. We need mechanisms that alert us when things deteriorate, such as alarms, pull request validations, and guidelines. In essence, we will ensure that when things slowly start to get worse, we will act fast.
Originally published at https://www.outsystems.com.