Working with Unicode and Grapheme Clusters in Dart

Suragch
Published in Flutter Community
5 min read · Dec 13, 2019


As announced with Dart 2.7, we finally have support for properly handling grapheme clusters. That is a big win for those of us who do a lot of string manipulation.

What are Grapheme Clusters?

Grapheme is a wonder material made of carbon lattice… Oh, wait, no. Sorry, that’s graphene.

Graphene

Graphemes are a subset of highly shared internet images… Oh, wait, no. Sorry, that’s graph meme.

Graph meme

Graphemes are written characters.

The word “character” can have a lot of meanings, though. What a computer thinks of as a character and what a human thinks of as a character can be two different things.

Dart strings are Unicode text encoded as UTF-16. That means a string is stored as a sequence of 16-bit values, aka code units. Most of the time, one code unit translates nicely to one character:

a \u0061
b \u0062
θ \u03B8
家 \u5BB6
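
If you want to check that mapping yourself, here is a minimal sketch that prints the code units of those four characters. It only uses `String.codeUnits` and `String.length` from the core library. The helper name `codeUnitsAsHex` is made up for this example and is not part of any Dart API.

```dart
// Hypothetical helper (not a Dart API): returns the UTF-16 code units
// of [text] as uppercase, zero-padded, 4-digit hex strings.
List<String> codeUnitsAsHex(String text) {
  return text.codeUnits
      .map((unit) => unit.toRadixString(16).toUpperCase().padLeft(4, '0'))
      .toList();
}

void main() {
  for (final character in ['a', 'b', 'θ', '家']) {
    // String.length counts UTF-16 code units, not human-perceived characters.
    print('$character  length: ${character.length}  '
        'code units: ${codeUnitsAsHex(character)}');
  }
}
```

Each of these strings reports a length of 1, because each character fits in a single 16-bit code unit.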
