Porting Java To Go(lang)

Published in KANO Engineering

4 min read · Apr 4, 2019

For the last few days, while our team was collecting user requirements and I had nothing else to do, I decided to port the Solr javabin decoder (https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java) to Go. Why? Javabin is compact, and because of how it stores data it may be a better way to communicate between Solr and its clients than JSON. It is also the default format for SolrJ, which says something about its performance compared to JSON (at least until Solr supports protobuf/gRPC).

Setting up porting environment

To get some degree of compatibility, we need a way to ensure that the Java and Go code produce the same output and behaviour for the same input. The simplest way is to unit test the decoding function, while adding debug lines to the Java code to trace how it runs. For testing we use Maven on the Java side and the built-in test tool (go test) on the Go side.

The test runs more or less like this:

1. Create some SimpleOrderedMap structure in Java.
2. Encode it to a byte array using JavaBinCodec.marshal.
3. Take the output of step 2 and decode it back in Java, observing the behaviour through the trace log.
4. Using the output of steps 2 and 3, replicate the decoding process in Go, ensuring that the output and behaviour are the same.
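The last step of that workflow can be sketched in Go roughly as follows. This is only an illustration: decodeBool is a hypothetical stand-in for the decoder under test, checkFixture is a made-up helper, and the tag values 1 and 2 are assumptions for the sketch. The idea is that the hex string is copied from the Java side and the expected value is what the Java trace log reported.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// decodeBool is a hypothetical stand-in for the Go decoder under test.
// The tag values 1 (true) and 2 (false) are assumptions for this sketch.
func decodeBool(data []byte) (bool, error) {
	if len(data) == 0 {
		return false, fmt.Errorf("empty input")
	}
	switch data[0] {
	case 1:
		return true, nil
	case 2:
		return false, nil
	}
	return false, fmt.Errorf("unknown tag %d", data[0])
}

// checkFixture decodes a hex dump captured from the Java side and compares
// the Go result with the value reported by the Java trace log.
func checkFixture(hexDump string, want bool) error {
	data, err := hex.DecodeString(hexDump)
	if err != nil {
		return err
	}
	got, err := decodeBool(data)
	if err != nil {
		return err
	}
	if got != want {
		return fmt.Errorf("fixture %s: got %v, want %v", hexDump, got, want)
	}
	return nil
}

func main() {
	fmt.Println(checkFixture("01", true))  // <nil>
	fmt.Println(checkFixture("02", false)) // <nil>
}
```

In a real project the same comparison would live in a _test.go file and run through go test, with one fixture per structure encoded on the Java side.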

We run all the code, No Exception

The first obvious problem when porting Java code to Go is exceptions: Go does not have any. Instead, Go functions can return multiple values, which gives the code a more “functional” feel, so every error has to be returned to the caller explicitly. So Java code like this:

protected Object readObject(DataInputInputStream dis) throws IOException {
    // if ((tagByte & 0xe0) == 0) {
    // if top 3 bits are clear, this is a normal tag

    // OK, try type + size in single byte
    switch (tagByte >>> 5) {
      case STR >>> 5:
        return readStr(dis, stringCache, readStringAsCharSeq);
      case SINT >>> 5:
        return readSmallInt(dis);
      case SLONG >>> 5:
        return readSmallLong(dis);
      case ARR >>> 5:
        return readArray(dis);
      case ORDERED_MAP >>> 5:
        return readOrderedMap(dis);
      case NAMED_LST >>> 5:
        return readNamedList(dis);
      case EXTERN_STRING >>> 5:
        return readExternString(dis);
    }

    switch (tagByte) {
      case NULL:
        return null;
      case DATE:
        return new Date(dis.readLong());
      case INT:
        return dis.readInt();
      case BOOL_TRUE:
        return Boolean.TRUE;
      case BOOL_FALSE:
        return Boolean.FALSE;
      case FLOAT:
        return dis.readFloat();
      case DOUBLE:
        return dis.readDouble();
      case LONG:
        return dis.readLong();
      case BYTE:
        return dis.readByte();
      case SHORT:
        return dis.readShort();
      case MAP:
        return readMap(dis);
      case SOLRDOC:
        return readSolrDocument(dis);
      case SOLRDOCLST:
        return readSolrDocumentList(dis);
      case BYTEARR:
        return readByteArray(dis);
      case ITERATOR:
        return readIterator(dis);
      case END:
        return END_OBJ;
      case SOLRINPUTDOC:
        return readSolrInputDocument(dis);
      case ENUM_FIELD_VALUE:
        return readEnumFieldValue(dis);
      case MAP_ENTRY:
        return readMapEntry(dis);
      case MAP_ENTRY_ITER:
        return readMapIter(dis);
    }

    throw new RuntimeException("Unknown type " + tagByte);
  }

will become more or less this:

func readObject(m interface{}, dis *bytes.Buffer) error {
	tagByte, err := dis.ReadByte()
	if err != nil {
		return err
	}
	// Debug trace, printed only after the tag was read successfully.
	fmt.Printf("TAG %x\n", tagByte)
	checkType := tagByte >> 5
	switch checkType {
	case STR >> 5:
		return readStr(dis, readStringAsCharSeq, &tagByte, m)
	case SINT >> 5:
		intRes, err := readSmallInt(dis, tagByte)
		if err != nil {
			return err
		}
		setValue(m, intRes)
		return nil
	case EXTERN_STRING >> 5:
		fmt.Println("Read ExternString")
		// Return directly so the error is not silently dropped.
		return readExternString(m, dis, &tagByte)
	case ORDERED_MAP >> 5:
		return readOrderedMap(m, dis, tagByte)
	}
	switch tagByte {
	case BOOL_FALSE:
		setValue(m, false)
		return nil
	case BOOL_TRUE:
		setValue(m, true)
		return nil
	case DOUBLE:
		dbl, err := readDouble(dis, tagByte)
		if err != nil {
			return err
		}
		setValue(m, dbl)
		return nil
	}
	// Mirror the Java version, which throws on an unknown tag.
	return fmt.Errorf("unknown type %d", tagByte)
}
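Since errors are ordinary values in Go, the caller can inspect them instead of catching exception classes. The sketch below is one possible way to do that, not the port's actual design: UnknownTagError, readTag and classify are hypothetical names, and the accepted tag range (0 to 2) is an arbitrary assumption to keep the example short.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// UnknownTagError is a hypothetical error type, not part of the original port.
type UnknownTagError struct{ Tag byte }

func (e *UnknownTagError) Error() string {
	return fmt.Sprintf("unknown type %d", e.Tag)
}

// readTag reads one tag byte. Only tags 0..2 are accepted here; that range
// is an illustrative assumption.
func readTag(dis *bytes.Buffer) (byte, error) {
	tag, err := dis.ReadByte()
	if err != nil {
		// %w keeps the underlying error visible to errors.Is.
		return 0, fmt.Errorf("reading tag: %w", err)
	}
	if tag > 2 {
		return 0, &UnknownTagError{Tag: tag}
	}
	return tag, nil
}

// classify shows how a caller can tell the failure modes apart.
func classify(data []byte) string {
	_, err := readTag(bytes.NewBuffer(data))
	var unknown *UnknownTagError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, io.EOF):
		return "truncated input"
	case errors.As(err, &unknown):
		return "unknown tag"
	}
	return "other error"
}

func main() {
	fmt.Println(classify([]byte{1}))    // ok
	fmt.Println(classify(nil))          // truncated input
	fmt.Println(classify([]byte{0x7f})) // unknown tag
}
```

This plays the role that separate catch blocks for IOException and RuntimeException would play on the Java side.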

Another example in java:

public long readLong() throws IOException {
    return  (((long)readUnsignedByte()) << 56)
            | (((long)readUnsignedByte()) << 48)
            | (((long)readUnsignedByte()) << 40)
            | (((long)readUnsignedByte()) << 32)
            | (((long)readUnsignedByte()) << 24)
            | (readUnsignedByte() << 16)
            | (readUnsignedByte() << 8)
            | (readUnsignedByte());
  }

the same in golang:

func readLong(dis *bytes.Buffer, tagByte byte) (uint64, error) {
	// hasil ("result") accumulates the 8 bytes, most significant byte first.
	var hasil uint64
	for idxC := 56; idxC >= 0; idxC -= 8 {
		tempByte, err := dis.ReadByte()
		if err != nil {
			// Debug trace: the shift position at which the read failed.
			fmt.Println("CurIDX", idxC)
			return 0, err
		}
		hasil |= uint64(tempByte) << uint(idxC)
	}
	return hasil, nil
}
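As an alternative to the hand-written loop, the same eight big-endian bytes could probably be read with the standard encoding/binary package. This is a sketch under that assumption, not what the port does: readLongBE and decodeLong are hypothetical names, and the result is converted to int64 so that it matches Java's signed long.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readLongBE is a hypothetical variant of readLong built on encoding/binary.
func readLongBE(dis *bytes.Buffer) (int64, error) {
	var raw [8]byte
	// io.ReadFull returns an error if fewer than 8 bytes remain.
	if _, err := io.ReadFull(dis, raw[:]); err != nil {
		return 0, err
	}
	// uint64 to int64 keeps the bit pattern, matching Java's signed long.
	return int64(binary.BigEndian.Uint64(raw[:])), nil
}

// decodeLong is a small convenience wrapper for byte slices.
func decodeLong(data []byte) (int64, error) {
	return readLongBE(bytes.NewBuffer(data))
}

func main() {
	v, err := decodeLong([]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff})
	fmt.Println(v, err) // -1 <nil>
}
```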

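As you can see, in Go we chose a “somewhat” functional style: the buffer and the current tag byte are passed in as arguments instead of being kept as fields on a decoder object. We hope this will simplify concurrency later (no mutex or the like), as long as each call works on its own buffer.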
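To illustrate why this style may help with concurrency, here is a minimal sketch in which several payloads are decoded in parallel without any lock. decodeAll is a hypothetical helper, and readLong here is a trimmed copy of the function above without the tag parameter and debug output. Each goroutine owns its buffer and writes only to its own slot in the result slices.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// readLong mirrors the loop shown earlier, minus the debug output.
func readLong(dis *bytes.Buffer) (uint64, error) {
	var result uint64
	for shift := 56; shift >= 0; shift -= 8 {
		b, err := dis.ReadByte()
		if err != nil {
			return 0, err
		}
		result |= uint64(b) << uint(shift)
	}
	return result, nil
}

// decodeAll decodes every payload in its own goroutine. No mutex is needed
// because goroutine i only touches results[i] and errs[i].
func decodeAll(payloads [][]byte) ([]uint64, []error) {
	results := make([]uint64, len(payloads))
	errs := make([]error, len(payloads))
	var wg sync.WaitGroup
	for i, p := range payloads {
		wg.Add(1)
		go func(i int, p []byte) {
			defer wg.Done()
			results[i], errs[i] = readLong(bytes.NewBuffer(p))
		}(i, p)
	}
	wg.Wait()
	return results, errs
}

func main() {
	results, errs := decodeAll([][]byte{
		{0, 0, 0, 0, 0, 0, 0, 1},
		{0, 0, 0, 0, 0, 0, 1, 0},
	})
	fmt.Println(results, errs) // [1 256] [<nil> <nil>]
}
```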

Sign? What sign?

The next difference between Java and Go is much less obvious. Java does not have unsigned integer types such as an unsigned byte, while Go does. This means we need to adapt some data types and operations. Java's unsigned right shift >>> has no direct counterpart in Go; on an unsigned value Go's >> already fills with zeros, so >>> simply becomes >> (and << stays <<). In Java, storing an unsigned byte requires an int, while in Go we can get away with a plain byte (uint8). Just make sure you know whether a value should be an int or a byte.
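The shift behaviour can be checked with a small experiment. The helper names below are hypothetical and the value 0xE0 is just an example of a byte with its top three bits set. javaUnsignedShift emulates what Java does for a byte operand: it promotes the byte to a 32-bit int with sign extension, then shifts with zero fill.

```go
package main

import "fmt"

// goShift shifts an unsigned byte; Go fills the top bits with zeros.
func goShift(b byte, n uint) byte { return b >> n }

// signedShift treats the same bits as a signed int8 (like a Java byte);
// the shift is then arithmetic and copies the sign bit.
func signedShift(b byte, n uint) int8 { return int8(b) >> n }

// javaUnsignedShift emulates Java's b >>> n for a byte operand: sign-extend
// to 32 bits first, then shift with zero fill.
func javaUnsignedShift(b byte, n uint) uint32 {
	return uint32(int32(int8(b))) >> n
}

func main() {
	const tag = 0xE0
	fmt.Println(goShift(tag, 5))                   // 7
	fmt.Println(signedShift(tag, 5))               // -1
	fmt.Printf("%#x\n", javaUnsignedShift(tag, 5)) // 0x7ffffff
}
```

With an unsigned byte, a plain >> 5 gives the three type bits directly, which is why the Go switch can compare against constants shifted the same way.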

This is no Casting

Another problem we found during porting is reading floating-point values. The encoding process writes a double as its 64 raw bits (the same bytes as a long), so we need a way to treat that bit string as a float. Some people might call this a cast, but it is not. A cast is like turning the integer 4 into the float 4.0. We don't want that conversion; we want to keep the bits exactly as they are and interpret them as a float/double.

In java we can do this

@Override
  public double readDouble() throws IOException {
    return Double.longBitsToDouble(readLong());    
  }

While in Go we can do this

func readDouble(dis *bytes.Buffer, tagByte byte) (float64, error) {
	res, err := readLong(dis, tagByte)
	if err != nil {
		return 0, err
	}
	resDouble := math.Float64frombits(res)
	return resDouble, nil
}
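To make the difference between a conversion and a reinterpretation concrete, here is a small sketch. convert and reinterpret are hypothetical helper names; 0x4010000000000000 is the IEEE 754 bit pattern of the double 4.0.

```go
package main

import (
	"fmt"
	"math"
)

// convert changes the representation: the integer 4 becomes the float 4.0.
func convert(v int64) float64 { return float64(v) }

// reinterpret keeps the 64 bits untouched and reads them as an IEEE 754 double.
func reinterpret(bits uint64) float64 { return math.Float64frombits(bits) }

func main() {
	fmt.Println(convert(4))                                  // 4
	fmt.Println(reinterpret(0x4010000000000000))             // 4
	fmt.Println(reinterpret(4) == 4)                         // false
	fmt.Println(math.Float64bits(4.0) == 0x4010000000000000) // true
}
```

Reinterpreting the integer 4 gives a tiny subnormal number rather than 4.0, which is exactly the mistake a plain conversion in the decoder would make in the other direction.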

For anyone interested in the development of the Go port of the javabin decoder, the source code can be found at https://github.com/kharism/solrjavabindecoder. Just remember that the project is still in somewhat active development, and updates may break compatibility.

Written by kharisma muchammad