Encoding Data with the Go Binary Package
When encoding binary data for IO in Go, there is a multitude of options available ranging from gob, Protobuf, to roll-your-own encoder. This post examines Go’s encoding/Binary package used to encode and decode numeric values into binary data that can be used for many purposes.
A simple binary protocol
Imagine that you are writing Go code for small remote devices collecting sensor data to send to backend edge servers. To save bandwidth (and other resources), you decide to create a simple binary format for your data as outline below:
0 1 2 3 4 5 6 7
0123456701234567012345670123456701234567012345670123456701234567
+-------+-------+-------+-------+-------+-------+-------+------+
| SensorID | LocationID | Timestamp |
+-------+-------+-------+-------+-------+-------+-------+------+
| Temp |
+---------------+
The data is arranged in a fixed-size binary packet of 80 bits that divided into four fields including SensorID
(16 bit), LocationID
(16 bit), Timestamp
(32 bit), and Temperature
(16 bit).
The remainder of this post shows how to use the binary package in Go to encode numeric values into this binary packet which can then be used in IO operations.
Source code for this post https://github.com/vladimirvivien/go-binary
Data representation concepts
Before discussing how to use Go to encode the numeric values into binary data, let us review two important concepts about binary data representation using the binary
package.
Fixed-size values
The binary
package supports several ways of encoding and decoding fixed-size numerical values. They are called fixed because the underlying binary representation always utilize a fix number of bytes for a given type (i.e. 32 bits for all 32-bit size values). Some functions in package binary
can also encode and decode arrays, slices, and structs containing only fixed-size values.
The binary package also supports variable-length encoding of numeric values where smaller values require fewer bytes. However, this is not covered in this writeup.
Byte order
Another concept worth discussing is the endianness of encoded data. This is the order in which the bytes, that represent a numerical value, are arranged when the data is encoded. Go’s binary package provides interface binary.ByteOrder
:
type ByteOrder interface {
Uint16([]byte) uint16
Uint32([]byte) uint32
Uint64([]byte) uint64
PutUint16([]byte, uint16)
PutUint32([]byte, uint32)
PutUint64([]byte, uint64)
String() string
}
This interface is implemented by types binary.LittleEndian
and binary.BigEndian
. These types can be used to automatically encode/decode values with the proper endianness (byte order). A value should always be encoded and decoded using with the same byte order or risk data corruption.
Next, let us see how to use these types to encode our binary packet.
Encoding and decoding directly
Let us use the BigEndian
type to encode a single packet of the sensor data. Because each field in the packet is fixed-length, it is easy to explicitly construct the packet by populating each field individually, as shown in the following snippet, into a buffer (variable buf
):
package mainimport (
"encoding/binary"
...
)func main() {
buf := make([]byte, 10)
ts := uint32(time.Now().Unix())
binary.BigEndian.PutUint16(buf[0:], 0xa20c) // sensorID
binary.BigEndian.PutUint16(buf[2:], 0x04af) // locationID
binary.BigEndian.PutUint32(buf[4:], ts) // timestamp
binary.BigEndian.PutUint16(buf[8:], 479) // temp
fmt.Printf("% x\n", buf)
}
Source code github.com/vladimirvivien/go-binary/encode0.go
Using methods from type BigEndian
each field is encoded into slice buf
at pre-determined indexes to fill up all 10 bytes (80 bits). Note that each BigEndian
method call matches the size of its related field (i.e. PutUint16
is used to encode the 16-bit for the sensorID and so on).
Conversely, when decoding the data, the same byte order must be used to avoid improper reading of the encoded values. Assuming slicebuf
is available with the encoded binary data (either from in-memory or received from a network IO), it can be decoded using the following snippet :
func main() {
buf := <contains encoded bytes>
sensorID := binary.BigEndian.Uint16(buf[0:])
locID := binary.BigEndian.Uint16(buf[2:])
tstamp := binary.BigEndian.Uint32(buf[4:])
temp := binary.BigEndian.Uint16(buf[8:])
fmt.Printf("sid: %0#x, locID %0#x ts: %0#x, temp:%d\n",
sensorID, locID, tstamp, temp)
}
Source code github.com/vladimirvivien/go-binary/encode0.go
Using methods from type BigEndian
, each field is extracted from the slice containing the encoded data packet. Again, since the sized of each encoded number is fixed, each method call matches the expected data size as read from the index offset.
Encoding/decoding with an IO streams
Encoding and decoding data packets field by field, as shown above, can be error prone for larger packets with many fields. Fortunately, the binary
package supports Go’s streaming IO interfaces io.Reader
and io.Writer
when encoding and decoding data.
Encoding with binary.Write
Function binary.Write(w io.Writer, o ByteOrder, data interface{})
encodes parameter data
with the provided io.Writer
with the specified ByteOrder
. Parameter data
is provided as empty interface{}
(instead of []bytes
). This is because the Write
function can do some helpful magic to automatically encode the fixed-size values when they are passed in as:
- Numeric data types (i.e. bool, int8, uint8, int16, float32, etc.)
- Arrays, slices, or structs containing only fixed-size values
- Pointers to the types in the previous bullets
For instance, the following example introduces struct type packet
to encapsulate the previously defined fields from the data packet. Notice that the struct only contains numeric fields that appear in the expected order. The code uses function Write
to automatically encode values stored in packet
value dataIn
:
type packet struct {
Sensid uint32
Locid uint16
Tstamp uint32
Temp int16
}func main() {
dataIn := packet{
Sensid: 1, Locid: 1233, Tstamp: 123452123, Temp: 12,
}
buf := new(bytes.Buffer) err := binary.Write(buf, binary.BigEndian, dataIn)
if err != nil {
fmt.Println(err)
return
}
}
Source code github.com/vladimirvivien/go-binary/encode1.go
In the previous snippet, after function Write
is executed, writer buf
will write the 10 bytes encoded using BigEndian
. This time however, notice there is no explicit calls to any ByteOrder
methodPutXXX
as before. Function Write
handles the internal buffer bookkeeping and automatically figures out data sizes from the struct field types.
Decoding with binary.Read
Conversely, function binary.Read(r io.Reader, o ByteOrder, data interface{})
can read encoded data from an io.Reader
and decode it automatically into parameter data
which must be one of the followings:
- A pointer to a numeric value (i.e. bool, int8, uint8, int16, float32, etc.)
- A pointer to a struct or array of numeric values
- A slice of numbers or structs/arrays containing only numbers
For instance in the following code snippet, bytes from io.Reader buf
are decoded with function Read
and the decoded values are stored in the provided struct variable dataOut
:
type packet struct {
Sensid uint32
Locid uint16
Tstamp uint32
Temp int16
}func main() {
buf := <reader with encoded binary data>
var dataOut packet err := binary.Read(buf, binary.BigEndian, &dataOut)
if err != nil {
fmt.Println("failed to Read:", err)
return
}
}
Source code github.com/vladimirvivien/go-binary/encode1.go
When the previous code is executed, function Read
uses io.Reader buf
to read and decode binary values using the BigEndian
byte order. Notice there is no explicit calls to ByteOrder
methods UintXXX
as before. Function Read
automatically handles buffer bookkeeping and figures out the proper data sizes to correctly decode the data into the respective fields of variable dataOut
.
Encoding multiple packets
As mentioned above, functions binary.Read
and binary.Write
can also encode and decode data stored in arrays and slices. For instance the following snippet encodes and end decodes multiple data packets stored in a slice of packet
:
type packet struct {
Sensid uint32
Locid uint16
Tstamp uint32
Temp int16
}func main() {
dataOut := []packet{
{Sensid: 1, Locid: 1233, Tstamp: 123452123, Temp: 12},
{Sensid: 2, Locid: 4567, Tstamp: 133452124, Temp: 32},
{Sensid: 7, Locid: 8910, Tstamp: 143452125, Temp: -12},
} // encode a slice of packet
buf := new(bytes.Buffer)
err := binary.Write(buf, binary.LittleEndian, dataOut)
if err != nil {
fmt.Println(err)
return
} // decode all items from slice
dataIn := make([]packet, 3)
err := binary.Read(buf, binary.LittleEndian, dataIn)
if err != nil {
fmt.Println("failed to Read:", err)
return
} fmt.Printf("%v", dataIn)}
Source code github.com/vladimirvivien/go-binary/encode2.go
In the snippet above, slice dataOut
is encoded using function Write
and stored in byte buffer buf
. The reverse is done with function Read
which decode the data from buf
and reconstructs the slice of packet
which is assigned to variable dataIn
.
Conclusion
Package encoding/binary offers a convenient way of encoding data structures with numeric values into fixed-size binary representation. It can be used directly or as the basis of custom binary protocols. Because the package supports IO stream interfaces, it can be easily integrated in communication or storage programs using streaming IO primitives.
More about the binary package:
- Example source code for write up
- The encoding/binary Go doc
- Streaming IO in Go
- Let’s Make an NTP Client in Go
- Go Walkthrough: encoding/binary
Also, don’t forget to checkout my book on Go, titled Learning Go Programming from Packt Publishing.