From Java to C

Justin Dekeyser
Published in Javarevisited
8 min read · Jul 29, 2020

As a mathematician and advanced Java developer, I realized I had never touched the C programming language. So yesterday, taking advantage of a rainy day off, I decided to experiment with it. See this text as friendly feedback!

The genesis of this adventure was a disappointment with type inference in Java (I must confess!), after which I thought: the more complicated the language, the less you can actually do.

Then I thought about the very simple languages I knew (Cobol, SQL, and JS, to be honest) and I told myself that those look like good languages if and only if the amount of code you have to manage per code unit is small, mainly because of how little type safety they offer. This made me think about Haskell and the FP paradigm in general, where the basic motto is “write small pure functions only”.

Those rely on a strong type system, but other languages such as JavaScript can also be fun to code in if we focus on one small method at a time. Then, not really knowing why, I thought about C.

Before I started the journey, I only knew that C had structs (which nearly exist in Java as records) and pointers. Pointers do not really exist as such in Java, although every variable holding an Object should be thought of as one!

So I decided to give C a try, and I wanted to see how one could define the notion of Iterator.

The Java interface for an Iterator is pretty straightforward and looks like this (simplified):

interface Iterator<T> {
    /* Returns true if the iterator has a next element, false otherwise */
    boolean hasNext();

    /* Returns the next element, throws if no such element exists */
    T next();
}

Implementing this interface over a given array may look like this:

class IteratorOnArray<T> implements Iterator<T> {
    private final T[] array;
    private int counter = 0;

    IteratorOnArray(T[] array) { this.array = array; }

    @Override
    public boolean hasNext() {
        return counter < array.length;
    }

    @Override
    public T next() {
        return array[counter++];
    }
}

Java is a pretty straightforward language after all. So, what could be the equivalent of the above Java code, in C?

C has no interface but has header files

Every Java developer knows that C has header files (you should, as a Java dev, because they are strongly linked with native methods). The idea of a header file is to describe what a given piece of code exposes.

After taking time to understand the C type system, I finally came up with a preliminary version of the header file (iterator.h):

#ifndef ITERATOR_INTERFACE_H
#define ITERATOR_INTERFACE_H

typedef struct Cursor Cursor;

typedef struct Iterator {
    int (*hasNext)(struct Cursor*);
    int (*next)(struct Cursor*);
} Iterator;

// Return a new struct
Iterator iterator(void);

#endif

The first two lines (together with the closing #endif) are preprocessor directives forming an include guard: if the header file gets included twice in the same source file, the compiler will not complain about duplicate definitions. More interesting are the type definitions.
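To see the include guard at work, here is a minimal sketch that simulates including the same header twice in a single file. The Point type and the POINT_H macro are hypothetical names chosen for the illustration; they are not part of the iterator code.

```c
/* First "inclusion" of a hypothetical point.h */
#ifndef POINT_H
#define POINT_H
typedef struct Point {
    int x;
    int y;
} Point;
#endif

/* Second "inclusion": POINT_H is already defined, so the preprocessor
 * skips everything up to #endif and struct Point is not defined twice. */
#ifndef POINT_H
#define POINT_H
typedef struct Point {
    int x;
    int y;
} Point;
#endif

/* The type is usable exactly as if the header had been included once. */
int point_sum(Point p) {
    return p.x + p.y;
}
```

Without the guard, the second definition of struct Point would be rejected by the compiler as a redefinition.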

As you can see, we reserve a type called Cursor: a struct whose fields are not declared here. Such a type is said to be incomplete. The user will not be able to use it directly because, as far as I understand, the compiler must be able to determine the size of every piece of data it handles (sounds like Cobol).

Since we did not specify the fields of the struct, a user of Cursor cannot know its size and thus cannot declare a Cursor variable. Nevertheless, they can still use a pointer to Cursor, namely Cursor*, because, similarly to a Java reference, a pointer has a fixed size (typically 64 bits on a 64-bit platform) no matter what it points to.
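A small sketch of what a client can and cannot do with an incomplete type. The helper names cursor_pointer_size and pass_through are made up for this illustration.

```c
#include <stddef.h>  /* size_t, NULL */

/* Forward declaration only: in this file, Cursor is an incomplete type. */
typedef struct Cursor Cursor;

/* Allowed: the size of a pointer to an incomplete type is known. */
size_t cursor_pointer_size(void) {
    return sizeof(Cursor *);
}

/* Allowed: receiving and returning the pointer without looking inside. */
Cursor *pass_through(Cursor *cursor) {
    return cursor;
}

/* Not allowed while Cursor is incomplete (each line fails to compile):
 *
 *   Cursor c;                      the storage size of c is unknown
 *   size_t n = sizeof(Cursor);     the size of the struct is unknown
 *   cursor->counter = 0;           the fields are unknown
 */
```

This is exactly the position of a file that only includes iterator.h: it can hold and forward a Cursor*, nothing more.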

The Iterator struct is different and is closer to the usual notion of interface we know. Here, we give a description of the structure by stating that it has two fields that are pointers to functions of the signature

Cursor* -> int

Classic C has no dedicated boolean type (C99 later added one through <stdbool.h>). Conditions rather act as in JavaScript: 0 is falsy and any nonzero value is truthy, so we simply return an int. Here we keep the focus on iterators over integers, hence the int return type of next.

We also provide a factory method iterator() that can construct an iterator. The client can use it to obtain a basic iterator algorithm on the array. I’m not sure this is the most C-idiomatic way of exposing methods, but it has the advantage of not polluting the public scope too much with names like next and hasNext.

Note that in C, structs (like everything else) are passed and returned by value. Hence, if the function iterator() creates a struct and returns it, the whole struct is copied to the caller, and the local one is destroyed when the function's scope closes. With pointers being typically 64 bits, our Iterator struct is 2×64 bits in size.
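As a quick illustration of how C treats integers in conditions, here is a minimal sketch. The function names are hypothetical, and has_more assumes a C99 (or later) compiler for <stdbool.h>.

```c
#include <stdbool.h>  /* C99: bool, true, false */

/* Normalizes any int to 0 or 1 using C's truthiness rule:
 * 0 is false, every other value (including negatives) is true. */
int as_flag(int value) {
    return value ? 1 : 0;
}

/* Comparison operators always yield an int that is exactly 0 or 1. */
int less_than(int a, int b) {
    return a < b;
}

/* The same kind of check written with the C99 bool type. */
bool has_more(int counter, int size) {
    return counter < size;
}
```

Returning int from hasNext, as the header does, works with any C standard; bool is only a more expressive alternative.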
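The by-value semantics can be checked with a tiny experiment. This is only a sketch; the Pair type and the three functions are hypothetical and unrelated to the iterator code.

```c
typedef struct Pair {
    int first;
    int second;
} Pair;

/* Receives a copy: the caller's struct is left untouched. */
void increment_copy(Pair pair) {
    pair.first += 1;
}

/* Receives a pointer: the caller's struct is modified. */
void increment_in_place(Pair *pair) {
    (*pair).first += 1;
}

/* Returns a struct by value: the local variable is copied out to the
 * caller before it disappears with the function's scope. */
Pair make_pair(int first, int second) {
    Pair pair = { .first = first, .second = second };
    return pair;
}
```

After Pair p = make_pair(1, 2), calling increment_copy(p) leaves p.first at 1, while increment_in_place(&p) brings it to 2.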

The lack of Object

Careful from now on! As a reminder, C has no notion of object. As such, the pattern we have chosen decouples Cursor and Iterator. In other words: the way data is traversed and the current reading state are two different structures.

This is because there is no portable way of creating an iterator “bound” to some Cursor. More precisely, when calling something like it.hasNext(), the body of hasNext has no implicit reference to it (there is no this). That’s why we have chosen to split the two.

After some reading, I discovered there is kind of a way to achieve OOP in C, but let’s avoid copying Java paradigms and move on with “easy C” (remember: it’s my first day).
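To make the missing this concrete, here is a minimal sketch with a hypothetical Counter type (not part of the article's code): a function stored in a struct field only sees the arguments it is given explicitly.

```c
typedef struct Counter Counter;

struct Counter {
    int value;
    /* A function pointer stored in a field gets no hidden receiver:
     * the struct it should act on must be passed explicitly. */
    int (*increment)(Counter *self);
};

/* Increments the counter it is handed and returns the new value. */
static int counter_increment(Counter *self) {
    self->value += 1;
    return self->value;
}

/* Builds a counter starting at 0, wired to counter_increment. */
Counter counter_create(void) {
    Counter counter = { .value = 0, .increment = counter_increment };
    return counter;
}
```

A call therefore reads c.increment(&c): the variable c appears twice, once to find the function and once to tell it which data to work on.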

Cursor implementation and data hiding

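As mentioned, the Cursor structure is hidden from the client. We are going to define it in the iterator.c file:

#include <stdlib.h>    // malloc and free, used further down
#include "iterator.h"  // declares Cursor as an incomplete type

// Complete the Cursor type that the header only declares
struct Cursor {
    int counter;          // current position
    int arraySize;        // total number of elements
    int* arrayReference;  // pointer to the first element
};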

Some comments here on the business logic. The cursor pattern is closely related to the one that exists in SQL when fetching data from a SELECT result. A cursor knows a reference (pointer) to some array of integers, it knows the total size of the array, and it knows the current position (counter).

As in Java, C arrays are created with a fixed length:

int array[5] = {1,2,3,4,5};

Nevertheless, unlike in Java, a C array type is complete only when given a length. In other words, the type int[] is not complete, while int[5] is. In the Cursor structure we use int* instead: a pointer to the first element, which often plays the role of int[] (for instance as a function parameter) but carries no length information. Unlike in Java, there is no length field on arrays (as they are not objects), which is why the cursor stores arraySize itself. However, arrays are truly consecutive pieces of memory. In Java, this is not exactly guaranteed, and is left to JVM behavior.

As such, a C array converts to a pointer to its first element in most expressions, and the following are equivalent:

array[0] ~ *array // dereference the array (=pointer to first elem)
array[2] ~ *(array + 2) // dereference pointer to third elem
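The equivalences above, and the fact that the length gets lost once an array is seen through a pointer, can be sketched as follows. ARRAY_LENGTH, sum, by_index and by_pointer are hypothetical helpers written for this illustration.

```c
#include <stddef.h>  /* size_t */

/* Number of elements of a real array. This does NOT work on a pointer,
 * because sizeof would then measure the pointer itself. */
#define ARRAY_LENGTH(a) (sizeof(a) / sizeof((a)[0]))

/* Inside a function the array is only seen as a pointer to its first
 * element, so the length must be passed separately. */
int sum(const int *values, size_t length) {
    int total = 0;
    for (size_t i = 0; i < length; i++) {
        total += *(values + i);  /* same as values[i] */
    }
    return total;
}

/* Reads element i through index notation. */
int by_index(const int *values, size_t i) {
    return values[i];
}

/* Reads element i through pointer arithmetic. */
int by_pointer(const int *values, size_t i) {
    return *(values + i);
}
```

For int data[5] = {1,2,3,4,5}, ARRAY_LENGTH(data) evaluates to 5 where data is declared, and sum(data, ARRAY_LENGTH(data)) adds the five elements.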

On a typical 64-bit platform, our Cursor structure has a size of 64 + 2×32 bits (16 bytes), roughly what the same fields would occupy in Java.
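Rather than computing the size by hand, we can ask the compiler. The sketch below redefines Cursor locally so that it compiles on its own; the three helper functions are hypothetical. The actual numbers depend on the platform, and the compiler is free to add padding between or after fields.

```c
#include <stddef.h>  /* size_t, offsetof */

typedef struct Cursor {
    int counter;
    int arraySize;
    int *arrayReference;
} Cursor;

/* Size of the whole structure, padding included. */
size_t cursor_size(void) {
    return sizeof(Cursor);
}

/* Sum of the sizes of the individual fields, without padding. */
size_t cursor_fields_size(void) {
    return 2 * sizeof(int) + sizeof(int *);
}

/* Position of the pointer field, in bytes from the start of the struct. */
size_t array_reference_offset(void) {
    return offsetof(Cursor, arrayReference);
}
```

The only portable guarantee is that cursor_size() is at least cursor_fields_size(); on common 64-bit platforms both are 16.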

next and hasNext implementations

The implementations of next and hasNext, given some Cursor*, are then straightforward:

static int hasNext(Cursor* cursor) {
    return (cursor -> counter) < (cursor -> arraySize);
}

static int next(Cursor* cursor) {
    int currentCounter = cursor -> counter;
    cursor -> counter = currentCounter + 1;
    return *(cursor -> arrayReference + currentCounter);
}

Some words here. The static keyword basically means private: it ensures that the two functions are only directly reachable from the current file. The syntax -> may look to you like some nasty PHP (and maybe PHP was inspired by C?).

In fact, struct fields are reached as in Java with a . operator:

Cursor cursor;
cursor.counter; // the counter field of cursor

However, when cursor is a pointer Cursor* , you must first dereference it to access its field:

Cursor* cursor;
(*cursor).counter;

As this is a bit annoying to write, C provides a shortcut -> :

(*cursor).counter ~ cursor->counter

So far, so good. We can now provide an implementation of the iterator() factory method:

Iterator iterator(void) {
    Iterator iterator = {
        .hasNext = hasNext,
        .next = next
    };
    return iterator;
}

Here we do not really care about the extra copy made when returning the full iterator structure by value. The main reason is that, at least for now, the Iterator structure is fully immutable and quite light. As such, we can expect that a client creates only a few copies of the iterator.

Creating Cursors

All the mutable state of the iteration is contained in the Cursor structure. To recall, the structure's details are hidden from the client. This certainly has the advantage of protecting the client against state-mutating operations. We now need a way to give the client indirect access to a cursor, that is: how do we provide a Cursor*?

After some experiments, I realized that creating a structure can be done by following two steps:

  1. Allocate a sufficient amount of memory that holds the structure,
  2. Fill that memory.

The problem with memory allocation is that we need a way to ensure the memory will be freed later, when the client is no longer interested in the cursor.

In Java, when we have some resource to protect and we want to make sure someone is going to close it, we use the try-with-resources syntax. The idea also exists in Python (although not a lot of Python scripters know how to implement it). There is no such thing in C, but we can take inspiration from this behavior.

Since the client is not aware of the size of Cursor (because we exposed it as an incomplete type), we are going to let them talk to our cursor in an indirect way, while its lifecycle will remain our own concern.

Once in this mindset, the following implementation becomes straightforward:

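void iterate(void* array, int arraySize, Method method) {
    Cursor* mutRef = malloc(sizeof(Cursor));  // requires <stdlib.h>
    if (mutRef == NULL) return;               // allocation failed: nothing to iterate
    mutRef -> counter = 0;
    mutRef -> arraySize = arraySize;
    mutRef -> arrayReference = array;
    method(mutRef);  // the client only sees the cursor during this call
    free(mutRef);    // and we release it ourselves afterwards
}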

and we enrich the header file with the Method type definition and a new contract:

typedef void (*Method)(Cursor*);

void iterate(void* array, int arraySize, Method method);

Usages

After all this work, as the day was nearly over, I finally ended up with the following use case:

#include <stdio.h>
#include <stdlib.h>
#include "iterator.h"

void f(Cursor* cursor) {
    Iterator it = iterator();
    printf("Iterating on array\n");
    while (it.hasNext(cursor)) {
        printf("Element is %d\n", it.next(cursor));
    }
}

int main(void) {
    int array[5] = {1,2,3,4,5};
    iterate(array, 5, f);
    return 0;
}

Here, for the sake of this small warm-up with C, the iterator is only capable of handling int[] arrays, and the callback f takes no parameter other than Cursor* and returns nothing.

Well, is this idiomatic C? I don’t know, but at least it allowed me to cover many language features (at least, that’s the feeling I have).

It also shows how the way we think in Java can be exploited in other programming languages.
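One possible way around the "no other parameter" limitation, sketched here as a hypothetical extension rather than as part of the original code, is to hand the callback an extra void* context that it interprets as it wishes. MethodWithContext, iterateWithContext and sumInto are invented names; Cursor, hasNext and next are redefined locally so that the sketch compiles as a single file.

```c
#include <stdlib.h>  /* malloc, free */

typedef struct Cursor {
    int counter;
    int arraySize;
    const int *arrayReference;
} Cursor;

static int hasNext(Cursor *cursor) {
    return cursor->counter < cursor->arraySize;
}

static int next(Cursor *cursor) {
    return cursor->arrayReference[cursor->counter++];
}

/* Hypothetical callback type: like Method, plus an opaque context pointer. */
typedef void (*MethodWithContext)(Cursor *cursor, void *context);

/* Same lifecycle as iterate(): allocate, fill, call back, free.
 * Returns 0 on success, -1 if the cursor could not be allocated. */
int iterateWithContext(const int *array, int arraySize,
                       MethodWithContext method, void *context) {
    Cursor *cursor = malloc(sizeof *cursor);
    if (cursor == NULL) {
        return -1;
    }
    cursor->counter = 0;
    cursor->arraySize = arraySize;
    cursor->arrayReference = array;
    method(cursor, context);
    free(cursor);
    return 0;
}

/* Example callback: context is expected to point to an int accumulator. */
void sumInto(Cursor *cursor, void *context) {
    int *total = context;
    while (hasNext(cursor)) {
        *total += next(cursor);
    }
}
```

With int total = 0, a call such as iterateWithContext(array, 5, sumInto, &total) leaves the sum of the elements in total. The price is type safety: nothing checks that the context really points to an int.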

Conversely, having done this exercise also sheds light on our own Java code. For instance, the way we design our Iterator (at the beginning of this text) can be rethought: as in the C version, we could also have an Iterator agnostic of the actual array, knowing only how to move a cursor…

Working in C also makes you realize that sometimes generic types may be overused. Indeed, taking my current workplace as a bad example, you often find yourself defining a super abstract class with generic types so general that you cannot actually guess any useful behavior of those classes.

This is even more true when you realize that generic typing at the method level is in fact way more powerful than at the class level. Working with C really makes you feel that: generics on classes are actually a constraint, while generics on methods (and functional interfaces) are a power.

That being said, it was fun and inspiring!

Justin Dekeyser
Javarevisited

PhD. in Mathematics, W3C huge fan and Java profound lover ❤