Strings and runes in Go. Is there a character data type in Go?

Quick tech learn
2 min read · Sep 1, 2023


In many programming languages, a string is a sequence of characters. Go has no separate character data type. Instead it has the rune, an integer type (an alias for int32) whose value represents a Unicode code point.

Assume you have a string with the value “hello” and you print its length like this:

package main

import (
	"fmt"
)

func main() {
	str := "hello"
	fmt.Println("Len:", len(str))
}

This prints Len: 5.

If you read English, that output looks correct: five letters, length 5. But the computer knows nothing about English; it only stores numbers. That is where Unicode comes in, and to understand it we first need to know what a code point is.

A code point is a number assigned to represent an abstract character in a system for representing text (such as Unicode). In Unicode, a code point is written in the form “U+1234”, where “1234” is the assigned number in hexadecimal. For example, the character “A” is assigned the code point U+0041.
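To make the idea concrete, here is a minimal sketch that prints a character's code point in the U+XXXX form. The helper name codePoint is made up for this example; %U is the standard fmt verb for Unicode notation.

```go
package main

import "fmt"

// codePoint is a hypothetical helper: it formats a rune as "U+XXXX".
func codePoint(r rune) string {
	return fmt.Sprintf("%U", r)
}

func main() {
	fmt.Println(codePoint('A')) // U+0041
	fmt.Println(codePoint('h')) // U+0068
}
```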

Now we can return to runes.

A rune is just a 32-bit integer value (the type rune is an alias for int32) that represents a Unicode code point. For example, the rune literal 'a' is actually the number 97.
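A short sketch of that claim: a rune variable reports its type as int32 and can be used in arithmetic like any other integer. The helper asNumber is hypothetical and exists only to show that no conversion is needed.

```go
package main

import "fmt"

// asNumber is a hypothetical helper that returns the integer behind a rune.
// No conversion is needed because rune is an alias for int32.
func asNumber(r rune) int32 {
	return r
}

func main() {
	r := 'a'
	fmt.Printf("%T %d %c\n", r, r, r) // int32 97 a
	fmt.Println(asNumber('a') == 97)  // true
	fmt.Println(string(r + 1))        // b
}
```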

A string behaves like a read-only []byte, so the len() function returns the number of raw bytes stored in it, not the number of characters.

fmt.Println("Len:", len(str))
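For “hello” the byte count and the character count happen to match, because each ASCII letter takes one byte in UTF-8. A minimal sketch with a non-ASCII string shows the difference. The helper counts is made up for this example; utf8.RuneCountInString comes from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts is a hypothetical helper: it returns the byte length of s
// and the number of runes (code points) in s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// "h\u00e9llo" is "héllo"; é (U+00E9) is encoded as two bytes in UTF-8.
	for _, s := range []string{"hello", "h\u00e9llo"} {
		b, r := counts(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
	// "hello": 5 bytes, 5 runes
	// "héllo": 6 bytes, 5 runes
}
```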

Indexing into a string yields the raw byte value at that index. Looping over the indexes of a string s (here holding the same “hello”) therefore visits every byte that makes up its code points, and the format verb decides how each byte is printed; %x prints it in hexadecimal.

for i := 0; i < len(s); i++ {
	fmt.Printf("%x ", s[i])
}

//Output
//68 65 6c 6c 6f

If you want the actual rune values, iterate with a for range loop, which decodes one UTF-8 encoded code point per iteration, and print each rune with the %#U format verb.

for i, runeValue := range s {
	fmt.Printf("%#U starts at %d\n", runeValue, i)
}

//Output
//U+0068 'h' starts at 0
//U+0065 'e' starts at 1
//U+006C 'l' starts at 2
//U+006C 'l' starts at 3
//U+006F 'o' starts at 4
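With an ASCII-only string the byte index and the rune position are always the same, which hides the difference. The sketch below repeats both loops on a string that contains a two-byte character. The helper runeStarts is hypothetical and only collects the start indexes that range reports.

```go
package main

import "fmt"

// runeStarts is a hypothetical helper: it returns the byte index at which
// each rune of s begins, as reported by a for range loop.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "h\u00e9llo" // "héllo": é (U+00E9) takes two bytes in UTF-8

	// Byte loop: é shows up as the two bytes c3 a9.
	for i := 0; i < len(s); i++ {
		fmt.Printf("%x ", s[i]) // 68 c3 a9 6c 6c 6f
	}
	fmt.Println()

	// Range loop: one iteration per rune, so the index skips from 1 to 3.
	for i, r := range s {
		fmt.Printf("%#U starts at %d\n", r, i)
	}
	fmt.Println(runeStarts(s)) // [0 1 3 4 5]
}
```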

So the final takeaway is that a string in Go is not an array of characters. It is a read-only slice of bytes. Go has no character type: each index in a string refers to a single byte, not a character, and one code point (rune) may span several bytes.
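If you do need to address a string by code point rather than by byte, one common approach is to convert it to a []rune first. This is only a sketch: the helper runeAt is made up for this example, and the conversion copies the whole string, so it is not free for long inputs.

```go
package main

import "fmt"

// runeAt is a hypothetical helper: it returns the n-th code point of s
// (counting from 0) by first converting the string to a []rune.
func runeAt(s string, n int) rune {
	return []rune(s)[n]
}

func main() {
	s := "h\u00e9llo" // "héllo"

	fmt.Printf("%x\n", s[1])         // c3: a single byte, the first half of é
	fmt.Printf("%c\n", runeAt(s, 1)) // é: the whole code point

	// s[0] = 'H' would not compile, because strings are read-only.
}
```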

Quick tech learn

Blogs are written by a full-stack developer with experience in a tech stack that includes Go, React, SQL, and MongoDB, and in cloud services such as AWS.