DB1: Tokio Simple DB

Arjun Sunil Kumar
Distributed Systems Engineering
5 min read · Apr 28, 2022

Hey, it's midnight in Dallas, and I thought of starting the next project in Rust.

We will be covering a tiny DB built using tokio.rs. Before diving into the code, let's go over a few concepts.

Enum vs Struct:

https://www.reddit.com/r/rust/comments/93valt/differences_between_struct_and_enum/
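  • An enum is used for tagging: a value is exactly one of its variants at a time, which is why it is usually consumed with match.
  • A struct is similar to a class in Java in that it groups named fields that all exist together. Unlike a Java object, it is not heap-allocated by default. Both structs and enums have a size known at compile time and are stored inline (on the stack for local variables) unless they are placed in a Box, a Vec or a similar container.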
https://doc.rust-lang.org/book/ch06-01-defining-an-enum.html
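
To make the difference concrete, here is a minimal sketch. Point, Shape and area are made-up names for illustration and are not part of the tinydb code. It shows a struct holding all of its fields at once, an enum holding one variant at a time, and both having a compile-time size.

```rust
use std::mem::size_of;

// Hypothetical types, for illustration only.

// A struct groups fields that all exist at the same time.
struct Point {
    x: i32,
    y: i32,
}

// An enum value is exactly one of its variants, plus a hidden tag.
enum Shape {
    Square { side: i32 },
    Rect { w: i32, h: i32 },
}

// match forces us to handle every variant.
fn area(shape: &Shape) -> i32 {
    match shape {
        Shape::Square { side } => side * side,
        Shape::Rect { w, h } => w * h,
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("point = ({}, {})", p.x, p.y);
    println!("rect area = {}", area(&Shape::Rect { w: 2, h: 3 }));
    println!("square area = {}", area(&Shape::Square { side: 4 }));
    // Both sizes are compile-time constants; neither type needs the heap.
    println!(
        "Point: {} bytes, Shape: {} bytes",
        size_of::<Point>(),
        size_of::<Shape>()
    );
}
```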

Mutex:

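Mutex<T> wraps a value so that only one thread at a time can access it. Calling lock() returns a guard that gives access to the inner value and releases the lock when the guard goes out of scope.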

https://doc.rust-lang.org/book/ch16-03-shared-state.html
https://fongyoong.github.io/easy_rust/Chapter_43.html
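
A small sketch of locking with the standard library Mutex follows. The bump function and the counter map are assumptions made up for this example; only Mutex::new, lock() and the HashMap entry API come from the standard library.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical helper: increments the counter stored under `key`
// and returns the new count.
fn bump(counters: &Mutex<HashMap<String, u32>>, key: &str) -> u32 {
    // lock() blocks until the mutex is free and returns a guard.
    // unwrap() panics only if another thread panicked while holding the lock.
    let mut map = counters.lock().unwrap();
    let count = map.entry(key.to_string()).or_insert(0);
    *count += 1;
    *count
    // The guard is dropped here, which releases the lock.
}

fn main() {
    let counters = Mutex::new(HashMap::new());
    bump(&counters, "hits");
    let hits = bump(&counters, "hits");
    println!("hits = {}", hits);
}
```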

Arc vs Rc:

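Basically, we have the value (the apple in the linked example) in heap memory. A value is dropped when its last owner goes away. When we spawn a thread, we use move to transfer ownership of what the thread uses into it. If we moved the value itself, only that one thread could use it, and it would be dropped as soon as that thread finished. We don't want that, as some other thread might also need to read it. So we wrap the value in an Arc and give each thread its own clone of the Arc. Cloning does not copy the data; it only increments the reference counter, so the value is dropped only after the last clone is gone, even when the current thread finishes first. Rc works the same way but uses a non-atomic counter, so it cannot be sent across threads, while Arc uses an atomic counter and can. This is useful for singleton-like resources, say a database connection.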

https://doc.rust-lang.org/rust-by-example/std/arc.html
https://doc.rust-lang.org/std/rc/struct.Rc.html
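
The idea can be sketched with plain std threads. The sum_in_threads function is a made-up name for this example; it shares one counter between several threads by cloning an Arc, not by copying the data.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: adds 1 + 2 + ... + n, one thread per number,
// with all threads writing to the same shared total.
fn sum_in_threads(n: usize) -> usize {
    let total = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (1..=n)
        .map(|i| {
            // Cloning the Arc only bumps the reference count.
            let total = Arc::clone(&total);
            // `move` hands this clone to the thread; the original stays here.
            thread::spawn(move || {
                *total.lock().unwrap() += i;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("sum = {}", sum_in_threads(4));
}
```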

Rust Code:

Part 1: (Request Response)

Let's dissect the Rust code and understand each portion one at a time.

  1. Here we have a struct for the Database, which contains a HashMap wrapped in a Mutex.

2. Request and Response are enums: Request has the variants Get and Set, and Response has the variants Value, Set and Error.

Request

3. impl Request: this block adds a parse() function to the Request enum.

Simply put, below would be the structure without intermediate code.

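4. splitn: splits the line on spaces into at most 3 parts and returns them as an iterator (not an array). The third part keeps any remaining spaces, so a value may itself contain spaces.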

5. A simple tokenizer, similar to the ones in Java.

  • Here we use parts.next() to fetch the next token.
  • We use match to handle every case where a token is missing (None).
  • Finally, we use

    Ok(Request::Set {
        key: key.to_string(),
        value: value.to_string(),
    })

to build the enum value (either Request::Get or Request::Set).

  • In short, we are converting one line of the text protocol into a Request enum.
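
Putting steps 3 to 5 together, a parser along these lines could look as follows. This is a sketch modeled on the description above, not a copy of the original code: the error messages and the derive attributes are assumptions.

```rust
#[derive(Debug, PartialEq)]
enum Request {
    Get { key: String },
    Set { key: String, value: String },
}

impl Request {
    // Turns one protocol line such as "SET foo bar" into a Request.
    fn parse(input: &str) -> Result<Request, String> {
        // At most 3 parts: command, key, and the rest of the line as value.
        let mut parts = input.splitn(3, ' ');
        match parts.next() {
            Some("GET") => {
                let key = parts.next().ok_or("GET must be followed by a key")?;
                if parts.next().is_some() {
                    return Err("GET's key must not be followed by anything".into());
                }
                Ok(Request::Get { key: key.to_string() })
            }
            Some("SET") => {
                let key = match parts.next() {
                    Some(key) => key,
                    None => return Err("SET must be followed by a key".into()),
                };
                let value = match parts.next() {
                    Some(value) => value,
                    None => return Err("SET needs a value".into()),
                };
                Ok(Request::Set {
                    key: key.to_string(),
                    value: value.to_string(),
                })
            }
            // An empty line also lands here, because splitn yields "" for it.
            Some(cmd) => Err(format!("unknown command: {}", cmd)),
            None => Err("empty input".into()),
        }
    }
}

fn main() {
    for line in ["GET foo", "SET foo bar baz", "DEL foo"] {
        println!("{:?} -> {:?}", line, Request::parse(line));
    }
}
```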

Response:

Similar to implementing toString() on an enum in Java: a function turns each Response variant into the text that is sent back to the client.
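
A minimal sketch of such a function is shown here. The method name serialize, the field names and the exact output format are assumptions chosen for illustration; only the three variants Value, Set and Error come from the text above.

```rust
enum Response {
    Value { key: String, value: String },
    Set { key: String, value: String, previous: Option<String> },
    Error { msg: String },
}

impl Response {
    // Renders the response as one line of text for the client.
    fn serialize(&self) -> String {
        match self {
            Response::Value { key, value } => format!("{} = {}", key, value),
            Response::Set { key, value, previous } => {
                format!("set {} = `{}`, previous: {:?}", key, value, previous)
            }
            Response::Error { msg } => format!("error: {}", msg),
        }
    }
}

fn main() {
    let responses = [
        Response::Value { key: "foo".into(), value: "bar".into() },
        Response::Set { key: "foo".into(), value: "baz".into(), previous: None },
        Response::Error { msg: "no key qux".into() },
    ];
    for response in &responses {
        println!("{}", response.serialize());
    }
}
```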

Usage:

Part 2: (Server Loop)

  1. Up to line 10 (address binding), the code should be familiar, as we have seen it in our TCP Client & Server articles.

2. Arc :

  • Here we first create a plain HashMap and insert a value.
  • Then we create a new Database object using that HashMap.
  • We convert that Database object into an Arc so that it can be shared with every client handler. db.clone() creates another pointer to the original db; it does not copy the data.
  • Note that the inner HashMap is wrapped in a Mutex, which prevents concurrent modifications.
Using Arc to clone the DB
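
The steps above can be sketched without Tokio. In this example std::thread::spawn is a stand-in for the server's spawn call, and new_db, the field name map and the initial "foo" = "bar" entry are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

struct Database {
    map: Mutex<HashMap<String, String>>,
}

// Builds the shared database with one initial entry.
fn new_db() -> Arc<Database> {
    let mut initial = HashMap::new();
    initial.insert("foo".to_string(), "bar".to_string());
    Arc::new(Database {
        map: Mutex::new(initial),
    })
}

fn main() {
    let db = new_db();

    let handles: Vec<_> = (0..3)
        .map(|i| {
            // Each worker gets its own pointer to the same Database.
            let db = Arc::clone(&db);
            thread::spawn(move || {
                let mut map = db.map.lock().unwrap();
                map.insert(format!("key{}", i), format!("value{}", i));
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("entries: {}", db.map.lock().unwrap().len());
}
```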

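3. Ok((socket, _)) : Here we are spawning a new task with tokio::spawn for each client connection. A task is a lightweight unit of work scheduled by the Tokio runtime, not a dedicated OS thread.

New task for each socket connection (client connection)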

Removing all the intermediate code, it is simply below.

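4. async move: The async move block passed to tokio::spawn takes ownership of everything it captures, here the cloned db handle and the socket, so the task can keep using them after the accept loop moves on to the next connection.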

5. Framed: Stream + Sink. It reads from and writes to the socket frame by frame using the codec passed to it; in our case, LinesCodec converts bytes to text lines and back.

source code
use tokio_util::codec::{Framed, LinesCodec};
lines: a Framed<TcpStream, LinesCodec>, i.e. the TcpStream viewed as a stream of String lines.
  • Here we use lines.next().await to read lines one by one.
  • We use lines.send(response.as_str()).await to write the response back to the sink.

6. While loop: while let Some(result) = lines.next().await binds each incoming item to result and ends the loop when the stream is closed. We then use a match on result to see whether it is Ok(line) or Err(e).

7. Passing Request & getting a response, and sending it back:
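
The shape of steps 6 and 7 can be shown without a socket. In this sketch a plain iterator of Result values stands in for the framed stream, the returned Vec stands in for the sink, and serve and handle are made-up names. The real loop would await on lines.next() and lines.send().

```rust
// Stand-in for the per-connection loop.
fn serve<I, F>(mut incoming: I, handle: F) -> Vec<String>
where
    I: Iterator<Item = Result<String, String>>,
    F: Fn(&str) -> String,
{
    let mut sent = Vec::new();
    // The loop ends when next() returns None, i.e. the "connection" closes.
    while let Some(result) = incoming.next() {
        match result {
            // A decoded line: compute the response and "send" it back.
            Ok(line) => sent.push(handle(&line)),
            // A decoding error: log it and keep serving.
            Err(e) => eprintln!("error on decoding; error = {:?}", e),
        }
    }
    sent
}

fn main() {
    let incoming = vec![
        Ok("get foo".to_string()),
        Err("invalid utf-8".to_string()),
        Ok("set foo bar".to_string()),
    ];
    let sent = serve(incoming.into_iter(), |line| line.to_uppercase());
    println!("{:?}", sent);
}
```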

Part 3: Server Handler

  1. Parse a line into Request enum
  2. We get the DB map and lock it using lock(), which gives us exclusive access to the HashMap.
  3. Based on the Request type, we insert() or get() the value from the hashmap.
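
Steps 2 and 3 can be sketched as below. To keep the example short it starts from an already parsed Request (step 1 is the parse() function discussed earlier). The field names and the "no key" error message are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

struct Database {
    map: Mutex<HashMap<String, String>>,
}

enum Request {
    Get { key: String },
    Set { key: String, value: String },
}

#[derive(Debug, PartialEq)]
enum Response {
    Value { key: String, value: String },
    Set { key: String, value: String, previous: Option<String> },
    Error { msg: String },
}

fn handle_request(request: Request, db: &Database) -> Response {
    // Hold the lock for the whole request so the map cannot change under us.
    let mut map = db.map.lock().unwrap();
    match request {
        Request::Get { key } => match map.get(&key) {
            Some(value) => Response::Value { key, value: value.clone() },
            None => Response::Error { msg: format!("no key {}", key) },
        },
        Request::Set { key, value } => {
            // insert() returns the value that was stored before, if any.
            let previous = map.insert(key.clone(), value.clone());
            Response::Set { key, value, previous }
        }
    }
}

fn main() {
    let db = Database { map: Mutex::new(HashMap::new()) };
    let set = Request::Set { key: "foo".into(), value: "bar".into() };
    println!("{:?}", handle_request(set, &db));
    let get = Request::Get { key: "foo".into() };
    println!("{:?}", handle_request(get, &db));
}
```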

Conclusion:

Here we have completed our session on understanding tinydb, an in-memory HashMap wrapped in a server. In the next session, we will probably go through the mini-redis codebase.
