NUnit to xUnit automatic test conversion: pattern match

Dmitry Yakimenko
Mar 18, 2019

In the previous post I described how to use the Roslyn API to find code patterns in the C# AST and how to change the AST to rewrite the original code into something else. The goal was to automate the conversion of NUnit tests to xUnit. The approach I used was quite tedious, as I had to write a very long chain of ifs and typecasts to get the job done. Let’s try to do better this time. Let’s start with just the search part of our search-and-replace tool.

What would be great is to be able to specify structural patterns like this:

Assert.That(_, Is.EqualTo(_))
Assert.That(_, Is.EqualTo(true))
Assert.That(_, Throws.TypeOf<_>())

And they would match the actual code:

// Matched by 'Assert.That(_, Is.EqualTo(_))'
Assert.That(account.Id, Is.EqualTo(id));
Assert.That("".ToBytes(), Is.EqualTo(new byte[] {}));

// Matched by 'Assert.That(_, Is.EqualTo(true))'
Assert.That(info.IsMd5, Is.EqualTo(true));
Assert.That(token.BoolAt(path, true), Is.EqualTo(true));

// Matched by 'Assert.That(_, Throws.TypeOf<_>())'
Assert.That(() => Quad[-1], Throws.TypeOf<ArgumentOutOfRangeException>());
Assert.That(() => access(token, path), Throws.TypeOf<JTokenAccessException>());

At first this looks like quite a difficult task. But as it turns out, in its simple form it is not even that hard. I first got the idea when I was generating code for AST replacement with Roslyn Quoter. Looking at its source code I discovered a bunch of Parse* methods in the SyntaxFactory class.

So basically one function call will parse the snippet and return an AST for the given pattern:

var patternAst = SyntaxFactory.ParseExpression("Assert.That(_, Is.EqualTo(_))");

The one line above is equivalent to a wall of code like this:

var patternAst =
    InvocationExpression(
        MemberAccessExpression(
            SyntaxKind.SimpleMemberAccessExpression,
            IdentifierName("Assert"),
            IdentifierName("That")))
    .WithArgumentList(
        ArgumentList(
            SeparatedList<ArgumentSyntax>(
                new SyntaxNodeOrToken[]{
                    Argument(
                        IdentifierName("_")),
                    Token(SyntaxKind.CommaToken),
                    Argument(
                        InvocationExpression(
                            MemberAccessExpression(
                                SyntaxKind.SimpleMemberAccessExpression,
                                IdentifierName("Is"),
                                IdentifierName("EqualTo")))
                        .WithArgumentList(
                            ArgumentList(
                                SingletonSeparatedList<ArgumentSyntax>(
                                    Argument(
                                        IdentifierName("_"))))))})));
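One way to convince yourself that the parsed pattern really has this shape is to poke at the result. The following is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp package is referenced; the class name PatternDemo and the helper ParsePattern are made up for illustration and are not part of the original tool.

```csharp
using System;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical demo class, not part of the original tool.
public static class PatternDemo
{
    // Parses a pattern snippet into an expression AST.
    public static ExpressionSyntax ParsePattern(string pattern) =>
        SyntaxFactory.ParseExpression(pattern);

    public static void Main()
    {
        var patternAst = ParsePattern("Assert.That(_, Is.EqualTo(_))");

        // The root of the pattern is an invocation: Assert.That(...)
        var call = (InvocationExpressionSyntax)patternAst;

        // The callee expression and the number of arguments
        Console.WriteLine(call.Expression);
        Console.WriteLine(call.ArgumentList.Arguments.Count);

        // Each "_" is parsed as an ordinary identifier, which is what
        // lets us treat it as a placeholder later on
        var first = call.ArgumentList.Arguments[0].Expression;
        Console.WriteLine(first is IdentifierNameSyntax);
    }
}
```

The parser does not care that Assert or Is are not defined anywhere. It only builds syntax, which is exactly what a structural pattern needs.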

It feels like a total win already and we have not even done anything useful yet. But let’s find this pattern in a source AST. First, we need to parse the file we’re searching in:

var sourceAst = CSharpSyntaxTree.ParseText(File.ReadAllText(filename));

This gives us the list of all expression nodes in the AST:

var nodes = sourceAst.GetRoot().DescendantNodes().OfType<ExpressionSyntax>();
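To get a feel for what this enumeration yields, here is a small sketch that runs it on an inline snippet instead of a file. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced; the class ExpressionListing and the method AllExpressions are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical demo class, not part of the original tool.
public static class ExpressionListing
{
    // Returns every expression node in the source in document order,
    // parents before their children.
    public static List<ExpressionSyntax> AllExpressions(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        return tree.GetRoot()
                   .DescendantNodes()
                   .OfType<ExpressionSyntax>()
                   .ToList();
    }

    public static void Main()
    {
        // Inline sample standing in for File.ReadAllText(filename)
        var source = "class T { void M() { Assert.That(x, Is.EqualTo(1)); } }";

        // Print the kind and the text of every expression found
        foreach (var e in AllExpressions(source))
            Console.WriteLine($"{e.Kind()}: {e}");
    }
}
```

Note that nested expressions are listed too, so the inner Is.EqualTo(1) call shows up as a node of its own next to the outer Assert.That call. That is what lets a pattern match anywhere in the code, not only at statement level.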

And now we find the nodes that match:

foreach (var e in nodes)
{
    if (Ast.Match(e, patternAst))
    {
        // Line is zero-based; add 1 to get editor-style line numbers
        var line = e.GetLocation().GetLineSpan().StartLinePosition.Line;
        var code = e.NormalizeWhitespace();
        Console.WriteLine($" {line}: {code}");
    }
}

Obviously the Ast.Match function is the tricky one. But not as tricky, really. We recursively traverse both ASTs in parallel and see if they match:

// Lives in the static Ast class. IsPlaceholder and the Match overloads
// for tokens and node lists are not shown here.
public static bool Match(SyntaxNode code, SyntaxNode pattern)
{
    // A placeholder matches anything
    if (IsPlaceholder(pattern))
        return true;

    // Node types don't match. Clearly not a match.
    if (code.GetType() != pattern.GetType())
        return false;

    switch (code)
    {
        case ArgumentSyntax c:
        {
            var p = (ArgumentSyntax)pattern;
            return Match(c.Expression, p.Expression);
        }
        case ArgumentListSyntax c:
        {
            var p = (ArgumentListSyntax)pattern;
            return Match(c.OpenParenToken, p.OpenParenToken)
                && Match(c.Arguments, p.Arguments)
                && Match(c.CloseParenToken, p.CloseParenToken);
        }
        case IdentifierNameSyntax c:
        {
            var p = (IdentifierNameSyntax)pattern;
            return Match(c.Identifier, p.Identifier);
        }
        case InvocationExpressionSyntax c:
        {
            var p = (InvocationExpressionSyntax)pattern;
            return Match(c.Expression, p.Expression)
                && Match(c.ArgumentList, p.ArgumentList);
        }
        case LiteralExpressionSyntax c:
        {
            var p = (LiteralExpressionSyntax)pattern;
            return Match(c.Token, p.Token);
        }
        case MemberAccessExpressionSyntax c:
        {
            var p = (MemberAccessExpressionSyntax)pattern;
            return Match(c.Expression, p.Expression)
                && Match(c.Name, p.Name);
        }
        case GenericNameSyntax c:
        {
            var p = (GenericNameSyntax)pattern;
            return Match(c.Identifier, p.Identifier)
                && Match(c.TypeArgumentList, p.TypeArgumentList);
        }
        case TypeArgumentListSyntax c:
        {
            var p = (TypeArgumentListSyntax)pattern;
            return Match(c.LessThanToken, p.LessThanToken)
                && Match(c.Arguments, p.Arguments)
                && Match(c.GreaterThanToken, p.GreaterThanToken);
        }
        // Any node type not listed above is treated as a mismatch
        default:
            return false;
    }
}
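The function above leans on three helpers that are not listed: IsPlaceholder, a Match for tokens and a Match for node lists. The following is only a guess at what they might look like, not the code from the actual tool. It assumes that a placeholder is the bare identifier "_", that tokens are compared by kind and text, and that lists must have the same length. The class name MatchHelpers is made up, and to keep the sketch self-contained the list overload takes the node matcher as a parameter, whereas inside the Ast class it would simply call Match.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helpers, written for illustration only.
public static class MatchHelpers
{
    // Assumption: a placeholder is the identifier "_" on its own
    public static bool IsPlaceholder(SyntaxNode pattern)
    {
        return pattern is IdentifierNameSyntax id && id.Identifier.Text == "_";
    }

    // Assumption: tokens match when kind and text are equal.
    // Whitespace and comments (trivia) are ignored.
    public static bool Match(SyntaxToken code, SyntaxToken pattern)
    {
        return code.RawKind == pattern.RawKind && code.Text == pattern.Text;
    }

    // Assumption: lists match when they have the same length and
    // every pair of elements matches
    public static bool Match<T>(
        SeparatedSyntaxList<T> code,
        SeparatedSyntaxList<T> pattern,
        Func<SyntaxNode, SyntaxNode, bool> matchNode) where T : SyntaxNode
    {
        if (code.Count != pattern.Count)
            return false;

        for (var i = 0; i < code.Count; i++)
        {
            if (!matchNode(code[i], pattern[i]))
                return false;
        }

        return true;
    }

    public static void Main()
    {
        // "_" is a placeholder, any other identifier is not
        Console.WriteLine(IsPlaceholder(SyntaxFactory.ParseExpression("_")));
        Console.WriteLine(IsPlaceholder(SyntaxFactory.ParseExpression("id")));

        // Identifier tokens are compared by their text
        Console.WriteLine(Match(SyntaxFactory.Identifier("That"),
                                SyntaxFactory.Identifier("That")));
    }
}
```

Since the "_" inside TypeOf<_> is also parsed as an identifier name, the same placeholder check covers type arguments as well as regular arguments.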

So it’s basically a giant switch with one case per node type. Far from every type is covered here, just those that I needed to get my examples to work. I imagine that to cover most of the C# syntax I’d have to tediously write a couple of thousand lines of repetitive code. I’m not going to do it all any time soon. Just the stuff I need to cover my use cases.

With a few more lines of code added, this already becomes a useful tool for searching for code patterns in a codebase. Next time we’ll see how we can implement the replace part. The goal was to refactor, not just to search, wasn’t it? I have some ideas on how it could be done. See you next time.

Conclusion

Thanks to Roslyn’s awesome API, with just 172 lines of code we have a pretty advanced code grep. Sure, it’s just a toy and a proof of concept at the moment. It would take a serious effort to make it something more than that. But I’m happy with what is possible with so little effort. Amazing.

Originally published at detunized.net on March 16, 2019
