Implementing the ISIN check digit algorithm in C#

Michael Harges
18 min read · Oct 13, 2023


Implementing and optimizing a check digit algorithm with new challenges

Welcome to my third article on implementing various check digit algorithms in C#. My earlier articles covered the Luhn algorithm and the UPC/EAN algorithm. In this article I’ll look at the algorithm used by International Securities Identification Numbers (ISIN).

Before going further, I want to announce that I’ve recently released CheckDigits.Net, a .Net library of optimized implementations of 20 different check digit algorithms. Benchmarks comparing CheckDigits.Net to popular NuGet packages have shown performance increases of 3X, 10X and in some cases up to 50X, depending on the algorithm. In addition, CheckDigits.Net completely eliminates the memory allocations that are common in other packages. You can find CheckDigits.Net on NuGet or by searching for CheckDigits.Net in your IDE’s NuGet package manager.

ISINs are identifiers assigned to securities (stocks, mutual funds, etc.) and are used when processing trades. An ISIN is 12 characters long and consists of a two-letter code for the country that issued the security, nine alphanumeric characters and a single decimal check digit. This is the first algorithm I’ve covered that must handle alphanumeric characters, and as we’ll see below this leads to some interesting challenges.

Under the covers, the ISIN algorithm uses the Luhn algorithm with letters mapped to values between 10 and 35. The Wikipedia article for ISINs describes two different examples for calculating the ISIN check digit that differ slightly depending on whether the number of digits in the value (after translation of letters to numbers) is odd or even. However, a close reading of the description shows that the algorithm essentially creates a new string with the letters translated to numbers and then applies the Luhn algorithm to that string. For example:

| Character | Number Equivalent |
| --------- | ----------------- |
| 0 | 0 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| 9 | 9 |
| A | 10 |
| B | 11 |
| C | 12 |
| D | 13 |
| E | 14 |
| F | 15 |
| G | 16 |
| H | 17 |
| I | 18 |
| J | 19 |
| K | 20 |
| L | 21 |
| M | 22 |
| N | 23 |
| O | 24 |
| P | 25 |
| Q | 26 |
| R | 27 |
| S | 28 |
| T | 29 |
| U | 30 |
| V | 31 |
| W | 32 |
| X | 33 |
| Y | 34 |
| Z | 35 |

ISIN for Tesla = US88160R1014

Individual characters = U S 8 8 1 6 0 R 1 0 1 4
After translation = 30 28 8 8 1 6 0 27 1 0 1 4

Translated value = 302888160271014
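The translation step can be sketched in a few lines. Note that `TranslateIsin` is a hypothetical helper for illustration only, not part of the implementation we’ll build below:

```csharp
using System;
using System.Text;

// Hypothetical helper: expands each ISIN character into its
// numeric equivalent per the translation table above.
static string TranslateIsin(string isin)
{
   var sb = new StringBuilder();
   foreach (var ch in isin)
   {
      if (ch >= '0' && ch <= '9')
      {
         sb.Append(ch);            // digits map to themselves
      }
      else if (ch >= 'A' && ch <= 'Z')
      {
         sb.Append(ch - 'A' + 10); // A = 10 ... Z = 35
      }
   }
   return sb.ToString();
}

Console.WriteLine(TranslateIsin("US88160R1014")); // 302888160271014
```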

Like the Luhn algorithm, the ISIN algorithm can detect all single digit transcription errors (e.g. typing 5 instead of 4) and all two digit transposition errors (e.g. typing 76 instead of 67) except for transposing 90 -> 09 or vice versa. However, when it comes to letters, the algorithm has significant limitations. It cannot detect transpositions of two letters at all. Nor can it detect transpositions of any digit and the letters B, M or X.

Let’s look at why those limitations exist. The Luhn algorithm uses two weights, 1 and 2, and applies them in an alternating fashion with odd character positions (starting from the right-most non-check digit character) having weight 2 and even character positions having weight 1. After weighting the individual numbers, the weighted values are summed up and the sum is used to calculate the check digit. (There is an additional step where the values with weight 2 are collapsed down to a digit between 0 and 9, but that doesn’t have an impact in this discussion.) Because there are only two weights and letter characters all translate to two digits, the weights applied to the digits are unchanged if two letters are swapped. Here is a made-up example to demonstrate:

Example ISIN = US0000000AZ7 (check digit 7)

Individual characters = U S 0 0 0 0 0 0 0 A Z 7
After translation = 30 28 0 0 0 0 0 0 0 10 35

Digits = 3 0 2 8 0 0 0 0 0 0 0 1 0 3 5
Weights = 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

With A & Z swapped

Individual characters = U S 0 0 0 0 0 0 0 Z A 7
After translation = 30 28 0 0 0 0 0 0 0 35 10

Digits = 3 0 2 8 0 0 0 0 0 0 0 3 5 1 0
Weights = 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

Notice that while the right-most four digits have changed places, the weights applied to those digits are the same and thus the sum of the weighted values will remain the same. That means that the calculated check digit will be the same and the swap is not detected.

But what about swapping digits with the letters B, M or X? What is so special about those characters? It’s because of where they land in the table of translated values. B = 11, M = 22 and X = 33. Thus, when you swap one of those letters with a digit the weights applied to the translated digits remain the same, as does the sum and the calculated check digit. For example:

Example ISIN = US00000002X5 (check digit 5)

Individual characters = U S 0 0 0 0 0 0 0 2 X 5
After translation = 30 28 0 0 0 0 0 0 0 2 33

Digits = 3 0 2 8 0 0 0 0 0 0 0 2 3 3
Weights = 1 2 1 2 1 2 1 2 1 2 1 2 1 2

With 2 & X swapped

Individual characters = U S 0 0 0 0 0 0 0 X 2 5
After translation = 30 28 0 0 0 0 0 0 0 33 2

Digits = 3 0 2 8 0 0 0 0 0 0 0 3 3 2
Weights = 1 2 1 2 1 2 1 2 1 2 1 2 1 2

Once again, even though the right-most three digits have changed position, the weights applied to the individual digits are unchanged, and so the sum and the check digit are unchanged.
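We can confirm both cases numerically. The sketch below (illustrative helpers, not the article’s implementation) computes the Luhn weighted sum of the translated digits, excluding the check digit, and shows that the swaps described above leave the sum, and therefore the check digit, unchanged:

```csharp
using System;
using System.Text;

int[] doubled = { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };

// Translate letters to numbers, exactly as in the examples above.
string Translate(string value)
{
   var sb = new StringBuilder();
   foreach (var ch in value)
   {
      sb.Append(ch <= '9' ? ch - '0' : ch - 'A' + 10);
   }
   return sb.ToString();
}

// Luhn weighted sum over the translated digits, check digit excluded.
int WeightedSum(string isin)
{
   var digits = Translate(isin[..^1]);
   var sum = 0;
   var odd = true;
   for (var i = digits.Length - 1; i >= 0; i--)
   {
      var d = digits[i] - '0';
      sum += odd ? doubled[d] : d;
      odd = !odd;
   }
   return sum;
}

// Swapping A & Z, or 2 & X, leaves the weighted sum unchanged.
Console.WriteLine(WeightedSum("US0000000AZ7") == WeightedSum("US0000000ZA7")); // True
Console.WriteLine(WeightedSum("US00000002X5") == WeightedSum("US0000000X25")); // True
```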

Not all check digit algorithms that handle letters or alphanumeric characters have these weaknesses. ISO 7064 defines multiple approaches to handling check digits for alphanumeric values. But ISINs predate ISO 7064 and the Wikipedia article on ISINs mentions that there are protocols that layer additional check digits on ISIN values to provide better protection.

Given those weaknesses, why cover the ISIN algorithm? First, it is an international standard in common use in an application space that generates millions if not hundreds of millions of transactions daily. Second, there are still some interesting things to cover when optimizing the algorithm. So, let’s dig into it.

Implementation

As with the previous algorithms, we’ll follow a three-phase approach: create an initial implementation that meets the requirements, refactor as necessary to improve the implementation’s resilience to mal-formed input, and finally optimize the implementation for best performance. We’ll also use red-green development and define a suite of tests intended to prove that the implementation both meets the requirements and is algorithmically correct. The tools we’ll use are xUnit and FluentAssertions for unit tests and BenchmarkDotNet for creating benchmarks while optimizing the code.

We start with the requirements for a function to validate an ISIN check digit:

  • The input will be a string.
  • The input will be an ISIN (length 12, check digit located in the right-most position).
  • Output will be a boolean value where true = the input string contains a valid check digit and false = the input string does not contain a valid check digit.
  • The code should be resilient, meaning that invalid input should not throw an exception and instead should return false to indicate that there is not a valid check digit.

Next we create an empty method that we can write tests against.

public static class IsinAlgorithm
{
   public static Boolean ValidateCheckDigit(String str)
   {
      throw new NotImplementedException();
   }
}

// Example usage:
var str = "US88160R1014";

var isValid = IsinAlgorithm.ValidateCheckDigit(str);

For our first test case, we’ll cover the happy path and look at values that have valid check digits. We’ll use actual values that we can find on the web.

public class IsinAlgorithmTests
{
   [Theory]
   [InlineData("US0378331005")] // Apple
   [InlineData("AU0000XVGZA3")] // Treasury Corporation of Victoria
   [InlineData("GB0002634946")] // BAE Systems
   [InlineData("US30303M1027")] // Meta (Facebook)
   [InlineData("US02079K1079")] // Google Class C
   [InlineData("GB0031348658")] // Barclays
   [InlineData("US88160R1014")] // Tesla
   public void ValidateCheckDigit_ShouldReturnTrue_WhenValueContainsValidCheckDigit(string value)
      => IsinAlgorithm.ValidateCheckDigit(value).Should().BeTrue();
}

Next, let’s consider the less happy path of errors that we know that the algorithm can’t detect. We’ll use a mix of values from our first test where we’ve introduced errors as well as some dummy values generated from https://www.isindb.com/fix-isin-calculate-isin-check-digit/ that allow us to examine specific examples of nondetectable errors.

[Theory]
[InlineData("AU0000VXGZA3")] // AU0000XVGZA3 with two character transposition XV -> VX
[InlineData("US0000000QB4")] // US0000000BQ4 with two character transposition BQ -> QB
[InlineData("GB123909ABC8")] // GB123099ABC8 with two digit transposition 09 -> 90
[InlineData("GB8091XYZ349")] // GB8901XYZ349 with two digit transposition 90 -> 09
[InlineData("US1155334451")] // US1122334451 with two digit twin error 22 -> 55
[InlineData("US1122337751")] // US1122334451 with two digit twin error 44 -> 77
[InlineData("US9988773340")] // US9988776640 with two digit twin error 66 -> 33
[InlineData("US3030M31027")] // US30303M1027 with two character transposition 3M -> M3
[InlineData("US303031M027")] // US30303M1027 with two character transposition M1 -> 1M
[InlineData("AU000X0VGZA3")] // AU0000XVGZA3 with two character transposition 0X -> X0
[InlineData("G0B002634946")] // GB0002634946 with two character transposition B0 -> 0B
public void ValidateCheckDigit_ShouldReturnTrue_WhenValueContainsUndetectableError(string value)
   => IsinAlgorithm.ValidateCheckDigit(value).Should().BeTrue();

Then we consider the unhappy path and add tests for values with detectable errors.

[Theory]
[InlineData("US30703M1027")] // US30303M1027 with single digit transcription error 3 -> 7
[InlineData("US02079J1079")] // US02079K1079 with single character transcription error K -> J
[InlineData("GB0031338658")] // GB0031348658 with single digit transcription error 4 -> 3
[InlineData("US0387331005")] // US0378331005 with two digit transposition error 78 -> 87
[InlineData("US020791K079")] // US02079K1079 with two character transposition error K1 -> 1K
[InlineData("US99160R1014")] // US88160R1014 with two digit twin error 88 -> 99
[InlineData("GB0112634946")] // GB0002634946 with two digit twin error 00 -> 11
[InlineData("US12BB3DD566")] // US12AA3DD566 with two letter twin error AA -> BB
public void ValidateCheckDigit_ShouldReturnFalse_WhenValueContainsDetectableError(string value)
   => IsinAlgorithm.ValidateCheckDigit(value).Should().BeFalse();

Now let’s look at proving the algorithmic correctness of the implementation. We know that the Luhn algorithm applies weight 2 to odd position digits so we’ll create a test to prove that. If we use all zeros except for a single ‘1’ in an odd position we can quickly figure that the sum of the weighted digits would be 2 and that the check digit would be 8 (using the Luhn formula of (10 - (sum % 10)) % 10). If we shift that single ‘1’ two places right or left, then we can prove that doubling is being performed on odd position digits.
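Worked out as a quick sketch of the arithmetic (nothing here is production code, just the formula from the paragraph above):

```csharp
using System;

// A lone '1' in an odd position is doubled; all other digits contribute 0.
var sum = 2;
var checkDigit = (10 - (sum % 10)) % 10;  // the Luhn check digit formula
Console.WriteLine(checkDigit);            // 8
```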

[Theory]
[InlineData("000000000018")]
[InlineData("000000001008")]
[InlineData("000000100008")]
[InlineData("000010000008")]
[InlineData("001000000008")]
[InlineData("100000000008")]
public void ValidateCheckDigit_ShouldCorrectlyWeightOddPositionDigits(String value)
   => IsinAlgorithm.ValidateCheckDigit(value).Should().BeTrue();

We can use the same approach for even position digits, except that the sum would be 1 and the check digit would be 9.

[Theory]
[InlineData("000000000109")]
[InlineData("000000010009")]
[InlineData("000001000009")]
[InlineData("000100000009")]
[InlineData("010000000009")]
public void ValidateCheckDigit_ShouldCorrectlyWeightEvenPositionDigits(String value)
   => IsinAlgorithm.ValidateCheckDigit(value).Should().BeTrue();

We can check the handling of letters in much the same way. The difference is that a letter is translated into two digits. But if we use ‘A’ as our single non-zero character we can just as easily predict the check digit since ‘A’ conveniently translates to 10 (or digits 1 and 0). The only difference is that the check digits are swapped relative to the digit-only cases: 9 when ‘A’ starts in an odd position and 8 when it starts in an even position.

[Theory]
[InlineData("0000000000A9")]
[InlineData("00000000A009")]
[InlineData("000000A00009")]
[InlineData("0000A0000009")]
[InlineData("00A000000009")]
[InlineData("A00000000009")]
public void ValidateCheckDigit_ShouldCorrectlyWeightOddPositionLetters(String value)
   => IsinAlgorithm.ValidateCheckDigit(value).Should().BeTrue();

[Theory]
[InlineData("000000000A08")]
[InlineData("0000000A0008")]
[InlineData("00000A000008")]
[InlineData("000A00000008")]
[InlineData("0A0000000008")]
public void ValidateCheckDigit_ShouldCorrectlyWeightEvenPositionLetters(String value)
   => IsinAlgorithm.ValidateCheckDigit(value).Should().BeTrue();

And finally, let’s look at a case that is often overlooked in example Luhn algorithm implementations. If the sum is zero or a multiple of 10 then sum modulus 10 would be zero. Unless you correctly handle that case, the calculated check digit would be 10 instead of zero. So we add the following test cases (with the value used in the second case found by playing with https://www.isindb.com/fix-isin-calculate-isin-check-digit/ until a value with a check digit of zero was located).
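The edge case is easy to see in isolation (a quick sketch of the formula, not production code):

```csharp
using System;

// When the sum is 0 (or any multiple of 10), sum % 10 is 0.
var sum = 0;

// Without the outer modulus the "check digit" would be 10, which is not a digit:
Console.WriteLine(10 - (sum % 10));        // 10
// The outer % 10 folds that case back to zero:
Console.WriteLine((10 - (sum % 10)) % 10); // 0
```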

[Fact]
public void ValidateCheckDigit_ShouldReturnTrue_WhenInputIsAllZeros()
   => IsinAlgorithm.ValidateCheckDigit("000000000000").Should().BeTrue();

[Fact]
public void ValidateCheckDigit_ShouldReturnTrue_WhenCheckDigitIsCalculatedAsZero()
   => IsinAlgorithm.ValidateCheckDigit("CA120QWERTY0").Should().BeTrue();

These tests handle expected success and failure paths for well-formed input. Additionally, they cover algorithmic details that could be missed if we focused only on success and failure paths. We can be reasonably sure that any implementation that passes all these tests is a correct implementation of the algorithm.

With our tests in place, we can move on to implementation. I’m going to start with a literal interpretation of the description that I gave earlier and create a new string with the letters translated to numbers and then apply the Luhn algorithm to that string.

public static bool ValidateCheckDigit(String str)
{
   var sb = new StringBuilder();
   foreach (var ch in str)
   {
      if (ch >= '0' && ch <= '9')
      {
         sb.Append(ch);
      }
      else if (ch >= 'A' && ch <= 'Z')
      {
         sb.Append(ch - 55);
      }
      else
      {
         throw new ArgumentOutOfRangeException(nameof(str), str);
      }
   }

   return LuhnAlgorithm.ValidateCheckDigit(sb.ToString());
}

Here we simply walk the input string and add characters to a StringBuilder to construct the new string. Digit characters are added directly, and letter characters are first converted to the equivalent value in the conversion table by subtracting 55 from the character value. Given that the decimal value for ‘A’ in the ASCII table is 65, this subtraction results in a value between 10 and 35. We then pass the converted string to the Luhn algorithm implementation from my earlier article.
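A quick sanity check shows that subtracting 55 reproduces the translation table:

```csharp
using System;

// 'A' is 65 in ASCII, so subtracting 55 yields the table values 10..35.
Console.WriteLine('A' - 55); // 10
Console.WriteLine('R' - 55); // 27
Console.WriteLine('Z' - 55); // 35
```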

When the tests are run, amazingly, they all pass. But don’t be satisfied with this implementation. It’s pretty crappy, as I’ll show later. But for the moment, aside from the else clause containing the ArgumentOutOfRangeException, it is a working implementation that can serve as a baseline for future improvements. So now we can move on to adding resiliency.

Adding Resiliency

I won’t focus too much on this step as it is very similar to what was done in my earlier articles. When it comes to mal-formed input, we’re going to guard against a null input, an empty input, an input that is not the correct length for an ISIN and invalid characters (not 0–9, A–Z) in the input. Here are the tests for those cases and the changes to the initial implementation needed to pass those tests.


[Fact]
public void ValidateCheckDigit_ShouldReturnFalse_WhenInputIsNull()
=> IsinAlgorithm.ValidateCheckDigit(null!).Should().BeFalse();

[Fact]
public void ValidateCheckDigit_ShouldReturnFalse_WhenInputIsEmpty()
=> IsinAlgorithm.ValidateCheckDigit(String.Empty).Should().BeFalse();

[Fact]
public void ValidateCheckDigit_ShouldReturnFalse_WhenInputHasLengthLessThan12()
=> IsinAlgorithm.ValidateCheckDigit("00000000000").Should().BeFalse();

[Fact]
public void ValidateCheckDigit_ShouldReturnFalse_WhenInputHasLengthGreaterThan12()
=> IsinAlgorithm.ValidateCheckDigit("0000000000000").Should().BeFalse();

[Fact]
public void ValidateCheckDigit_ShouldReturnFalse_WhenInputContainsNonAlphanumericCharacter()
=> IsinAlgorithm.ValidateCheckDigit("US0378#31005").Should().BeFalse();



private const Int32 _expectedLength = 12;

public static bool ValidateCheckDigit(String str)
{
   if (String.IsNullOrEmpty(str) || str.Length != _expectedLength)
   {
      return false;
   }

   var sb = new StringBuilder();
   foreach (var ch in str)
   {
      if (ch >= '0' && ch <= '9')
      {
         sb.Append(ch);
      }
      else if (ch >= 'A' && ch <= 'Z')
      {
         sb.Append(ch - 55);
      }
      else
      {
         return false;
      }
   }

   return LuhnAlgorithm.ValidateCheckDigit(sb.ToString());
}

Note that the tests for length should use values that would otherwise pass if not for the length. My first version of these tests used “12345678901” and “1234567890123” and the tests passed, but only because the values didn’t have a valid check digit, not because the length was incorrect. So, I replaced them with a value that an earlier test had shown would pass (“000000000000”) and added/removed a zero. Details like this are easy to miss if you aren’t careful.

Optimization

Now we get to what I consider the most interesting thing about this algorithm. We start by creating a benchmark for the baseline implementation.

[MemoryDiagnoser]
public class IsinAlgorithmBenchmarks
{
   [Params("US0378331005", "AU0000XVGZA3", "US88160R1014")]
   public String Value { get; set; } = String.Empty;

   [Benchmark(Baseline = true)]
   public void BaseLine()
   {
      _ = IsinAlgorithm.ValidateCheckDigit(Value);
   }
}


| Method   | Value        | Mean      | Error    | StdDev   | Ratio | Gen0   | Allocated | Alloc Ratio |
|--------- |------------- |----------:|---------:|---------:|------:|-------:|----------:|------------:|
| BaseLine | AU0000XVGZA3 | 132.48 ns | 1.026 ns | 0.960 ns | 1.00  | 0.0484 | 304 B     | 1.00        |
| BaseLine | US0378331005 |  62.38 ns | 1.019 ns | 0.953 ns | 1.00  | 0.0254 | 160 B     | 1.00        |
| BaseLine | US88160R1014 |  68.08 ns | 0.902 ns | 0.800 ns | 1.00  | 0.0254 | 160 B     | 1.00        |

When we run the benchmark, we can see a couple of things right off the bat. Notice the [MemoryDiagnoser] attribute on the benchmark class. I normally include that attribute on all my benchmarks, though it hasn’t played a role in my previous articles. Here, however, it exposes the memory allocation involved in using the StringBuilder. In a small utility function, you want to minimize or eliminate memory allocations if possible because the garbage collector will eventually have to clean up that memory. If you’re processing a large volume of transactions, even a small memory allocation per invocation will cause the garbage collector to run more often. Depending on your service settings this could eventually impact the performance of your application/service. So, we should try to eliminate that allocation if possible.

Further, if you look at the mean times, you’ll see that the more letters in the value, the more work is being done to construct the string. Even the occurrence of a single extra letter in the value causes almost a 10% increase over the lowest value. And the value with 5 extra letters is approximately double the cost, both in execution time and memory allocated.

We could try to optimize the conversion of letters to pairs of digit characters that we add to the StringBuilder but that still wouldn’t address the memory allocation issue. Could we remove the allocation entirely?

Let’s look at the original Luhn algorithm implementation that is invoked after creating the new string.

public static class LuhnAlgorithm
{
   private static readonly Int32[] _doubledValues = new Int32[] { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };

   public static Boolean ValidateCheckDigit(String str)
   {
      if (String.IsNullOrEmpty(str) || str.Length < 2)
      {
         return false;
      }

      var sum = 0;
      var shouldApplyDouble = true;
      for (var index = str.Length - 2; index >= 0; index--)
      {
         var currentDigit = str[index] - '0';
         if (currentDigit < 0 || currentDigit > 9)
         {
            return false;
         }
         sum += shouldApplyDouble ? _doubledValues[currentDigit] : currentDigit;
         shouldApplyDouble = !shouldApplyDouble;
      }
      var checkDigit = (10 - (sum % 10)) % 10;

      return str[^1] - '0' == checkDigit;
   }
}

In this implementation, we process the input from right to left (skipping the right-most check digit position). We convert each character to an integer, double odd position digits using a lookup table, and sum the weighted values for each position. Then we calculate the check digit as (10 - (sum % 10)) % 10 and compare it to the right-most character.

If the input consisted of purely digit characters, then no conversions would be needed and we could use the above implementation directly. So, the question becomes “can we update that implementation to handle letters as well?” And the answer is a definite “yes”.

Every letter character in the value is converted to a number between 10 and 35 and that number is processed as two individual digit characters. So we could handle letters in the input by executing this statement

sum += shouldApplyDouble ? _doubledValues[currentDigit] : currentDigit;

twice, once for the first digit of the translated value and once for the second digit. Let’s see what that might look like.

private const Int32 _expectedLength = 12;
private static readonly Int32[] _doubledValues = new Int32[] { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };

public static bool ValidateCheckDigit_NoAllocation(string str)
{
   if (String.IsNullOrEmpty(str) || str.Length != _expectedLength)
   {
      return false;
   }

   var sum = 0;
   var oddPosition = true;
   for (var index = str.Length - 2; index >= 0; index--)
   {
      var ch = str[index];
      if (ch >= '0' && ch <= '9')
      {
         var digit = ch - '0';
         sum += oddPosition ? _doubledValues[digit] : digit;
         oddPosition = !oddPosition;
      }
      else if (ch >= 'A' && ch <= 'Z')
      {
         var number = ch - 55;
         var firstDigit = number / 10;
         var secondDigit = number % 10;
         sum += oddPosition
            ? firstDigit + _doubledValues[secondDigit]
            : _doubledValues[firstDigit] + secondDigit;
      }
      else
      {
         return false;
      }
   }
   var checkDigit = (10 - (sum % 10)) % 10;

   return str[^1] - '0' == checkDigit;
}

In this version, we don’t convert the character until we know if we’re dealing with a digit or a letter. If it’s a digit, we handle it exactly like the original Luhn algorithm, including updating the oddPosition flag. But if it’s a letter, we convert the letter to a number and then get the individual digits of that number. Then we add those numbers to the sum, applying the doubling table to either the first or second digit depending on if we’re currently on an odd position or not. Note that we don’t need to update the odd position flag because we’re effectively handling two characters at once and negating the odd position flag twice would leave the flag unchanged. Then we calculate the check digit from the sum as per the original Luhn algorithm.

When we run the unit tests against this implementation, we can show that this implementation is functionally equivalent to the original version. So, let’s see what the performance looks like.

[Benchmark]
public void NoAllocation()
{
   _ = IsinAlgorithm.ValidateCheckDigit_NoAllocation(Value);
}


| Method       | Value        | Mean      | Error    | StdDev   | Ratio | Gen0   | Allocated | Alloc Ratio |
|------------- |------------- |----------:|---------:|---------:|------:|-------:|----------:|------------:|
| BaseLine     | AU0000XVGZA3 | 137.85 ns | 2.675 ns | 3.285 ns | 1.00  | 0.0484 | 304 B     | 1.00        |
| NoAllocation | AU0000XVGZA3 |  27.14 ns | 0.217 ns | 0.203 ns | 0.20  | -      | -         | 0.00        |
| BaseLine     | US0378331005 |  62.33 ns | 1.277 ns | 1.470 ns | 1.00  | 0.0254 | 160 B     | 1.00        |
| NoAllocation | US0378331005 |  21.70 ns | 0.177 ns | 0.165 ns | 0.35  | -      | -         | 0.00        |
| BaseLine     | US88160R1014 |  68.47 ns | 0.704 ns | 0.659 ns | 1.00  | 0.0254 | 160 B     | 1.00        |
| NoAllocation | US88160R1014 |  23.09 ns | 0.218 ns | 0.204 ns | 0.34  | -      | -         | 0.00        |

Wow, what a difference that makes. As expected, the new version allocates no memory. But the performance improvement is quite impressive. Approximately 65% faster for the values with the fewest letters and 80% faster for the value with 5 more letters. We’ll definitely keep that change!

But can we further improve the performance? What if we move the calculations for letters to a lookup table like the Luhn digit doubling table? That change would look something like this:

private const Int32 _expectedLength = 12;
private static readonly Int32[] _doubledValues = new Int32[] { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };
private static readonly Int32[,] _lettersTable = BuildLetterLookupTable();

public static bool ValidateCheckDigit_Lookup(string str)
{
   if (String.IsNullOrEmpty(str) || str.Length != _expectedLength)
   {
      return false;
   }

   var sum = 0;
   var oddPosition = true;
   for (var index = str.Length - 2; index >= 0; index--)
   {
      var ch = str[index];
      if (ch >= '0' && ch <= '9')
      {
         var digit = ch - '0';
         sum += oddPosition ? _doubledValues[digit] : digit;
         oddPosition = !oddPosition;
      }
      else if (ch >= 'A' && ch <= 'Z')
      {
         sum += _lettersTable[oddPosition ? 1 : 0, ch - 65];
      }
      else
      {
         return false;
      }
   }
   var checkDigit = (10 - (sum % 10)) % 10;

   return str[^1] - '0' == checkDigit;
}

private static Int32[,] BuildLetterLookupTable()
{
   var table = new Int32[2, 26];

   for (var n = 0; n < 26; n++)
   {
      var number = n + 10;
      var firstDigit = number / 10;
      var secondDigit = number % 10;

      table[0, n] = _doubledValues[firstDigit] + secondDigit;
      table[1, n] = firstDigit + _doubledValues[secondDigit];
   }

   return table;
}

Here we’ve refactored the bulk of the letter calculations into a static method that pre-calculates the values for all letters and covers both even and odd permutations. That static method is used to initialize a static array of lookup values that is used to retrieve the appropriate value for each letter encountered.
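As a sanity check on the precomputed values, here is a standalone sketch that mirrors BuildLetterLookupTable and spot-checks a few entries against hand calculations:

```csharp
using System;

int[] doubled = { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };
var table = new int[2, 26];
for (var n = 0; n < 26; n++)
{
   var number = n + 10;       // letters translate to 10..35
   var first = number / 10;
   var second = number % 10;
   table[0, n] = doubled[first] + second; // letter begins in an even position
   table[1, n] = first + doubled[second]; // letter begins in an odd position
}

// 'X' translates to 33; doubling either 3 gives 6, so both entries are 9.
Console.WriteLine(table[0, 'X' - 65]); // 9
Console.WriteLine(table[1, 'X' - 65]); // 9
// 'A' translates to 10: even -> doubled(1) + 0 = 2, odd -> 1 + doubled(0) = 1.
Console.WriteLine(table[0, 'A' - 65]); // 2
Console.WriteLine(table[1, 'A' - 65]); // 1
```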

When we make our previous version the new baseline and compare it to the lookup table implementation, we get the following results.

| Method       | Value        | Mean     | Error    | StdDev   | Ratio | Allocated | Alloc Ratio |
|------------- |------------- |---------:|---------:|---------:|------:|----------:|------------:|
| NoAllocation | AU0000XVGZA3 | 27.96 ns | 0.117 ns | 0.092 ns | 1.00  | -         | NA          |
| Lookup       | AU0000XVGZA3 | 18.82 ns | 0.083 ns | 0.073 ns | 0.67  | -         | NA          |
| NoAllocation | US0378331005 | 21.82 ns | 0.185 ns | 0.164 ns | 1.00  | -         | NA          |
| Lookup       | US0378331005 | 17.76 ns | 0.148 ns | 0.138 ns | 0.81  | -         | NA          |
| NoAllocation | US88160R1014 | 23.28 ns | 0.158 ns | 0.148 ns | 1.00  | -         | NA          |
| Lookup       | US88160R1014 | 18.97 ns | 0.110 ns | 0.103 ns | 0.81  | -         | NA          |

Not bad. A minimum 19% improvement and for values with more letters we see a 33% improvement. This is another change that is worth keeping.

Summary

As you may have surmised from my description of the algorithm and its limitations, I’m not all that impressed by the ISIN algorithm. To be honest, the layering of the letter conversion on top of the Luhn algorithm feels to me more like a hack than a considered design choice. Still, given that ISINs were introduced in 1981, I’m criticizing from a position of 40+ years of hindsight. Nonetheless, I had fun figuring out exactly why the algorithm couldn’t detect letter transpositions. And exploring the algorithm did provide some interesting opportunities for optimization.

State of the art has progressed since the 1980s and in future articles I’ll cover later algorithms that handle alphanumeric values without the limitations of the ISIN algorithm.

Thanks for reading!

About Me

I’m just Some Random Programmer Guy, at least according to the nameplate that was once waiting for me on my first day at a startup. I’ve been around a while and my career has spanned FORTRAN and PL/1 on IBM mainframes to .Net Core microservices with some interesting forays with C/C++, OS/2, VB, C#, SQL, NoSQL and more along the way.

Code for this article is available in my public GitHub repository.
