Source: https://blog.duyet.net/2022/01/rust-cheatsheet.html

Rust NewType & ADT Patterns

Published in lifefunk · 5 min read · Jul 19, 2024

As software engineers, it’s our job to analyze the business model and translate it into our codebase. In other words, the codebase should represent the business model, which is sometimes described as abstracting the business model. In reality, though, it’s common for the codebase to drift far from the business model it is supposed to serve.

This problem arises because we (as software engineers) and the business think and talk in different languages. There is a known solution for that: a ubiquitous language.

bliki: Ubiquitous Language (martinfowler.com)

Ubiquitous Language is the term Eric Evans uses in Domain Driven Design for the practice of building up a common, rigorous language between developers and users. This language should be based on the Domain Model used in the software — hence the need for it to be rigorous, since software doesn’t cope well with ambiguity

The Ambiguity

Let’s say we have a use case like this: simple user management, where a user has credentials and a marital status in their profile.

Imagine we model the data structures like this:

pub struct User {
  password: String,
  username: String
}

pub struct Profile {
  marital_status: String
}

These are just very simple structs in Rust, nothing fancy. But ask yourself: if User.username is a String and Profile.marital_status is also a String, are they the same thing?

Let’s say we have a method like this:

pub fn user_marital_status(status: String) {}

If everything is a String, then which kind of String is that method meant to receive? If we pass User.password or User.username, which are also Strings, is the call still valid? As far as the compiler is concerned it is, even though the result is meaningless.

This is where the ambiguity starts to creep in. With a single method and two simple structs like the example above, it may still be manageable. But what happens when we need to model a more complex domain or business model?
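To make the ambiguity concrete, here is a minimal sketch that reuses the structs above. The function body, the public fields, and the `wrong_but_compiles` helper are assumptions added for illustration (the original function is empty). The point is only that the compiler accepts a password where a marital status is expected, because both are plain `String`s.

```rust
pub struct User {
    pub password: String,
    pub username: String,
}

pub struct Profile {
    pub marital_status: String,
}

// Hypothetical body: echoes its input so the effect is observable.
pub fn user_marital_status(status: String) -> String {
    format!("marital status: {}", status)
}

// Hypothetical helper: a call that compiles but is semantically wrong.
pub fn wrong_but_compiles() -> String {
    let user = User {
        password: String::from("hunter2"),
        username: String::from("alice"),
    };
    // Nothing stops us from passing the password here: it is "just a String".
    user_marital_status(user.password)
}
```

Calling `wrong_but_compiles()` happily formats the password as if it were a marital status, and no compiler diagnostic points at the mistake.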

Rust NewType Patterns

For the example above, we can apply the Rust NewType pattern like this:

pub struct Password(String);
pub struct Username(String);
pub struct MaritalStatus(String);

These new structs can then be used as parameter types in our simple function:

pub fn user_marital_status(status: MaritalStatus) {}

That’s it. By applying this simple pattern, we remove a lot of ambiguity from our codebase. The code becomes more explicit and starts to follow the requirements of our business model.

I have a personal cryptography project that I’ve just refactored to follow this pattern too:
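The effect can be sketched as follows. The function body and the `example` helper are assumptions added so the call has an observable result. The commented-out line shows the kind of call the compiler now rejects.

```rust
pub struct Password(String);
pub struct Username(String);
pub struct MaritalStatus(String);

// Hypothetical body, added only for illustration.
pub fn user_marital_status(status: MaritalStatus) -> String {
    format!("marital status: {}", status.0)
}

// Hypothetical helper showing valid and invalid call sites.
pub fn example() -> String {
    let _password = Password(String::from("hunter2"));
    let _username = Username(String::from("alice"));
    let status = MaritalStatus(String::from("married"));

    // user_marital_status(_password);
    // ^ does not compile: expected `MaritalStatus`, found `Password`

    user_marital_status(status)
}
```

Each wrapper holds the same `String` underneath, but the three types are no longer interchangeable at call sites.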

GitHub - prople/crypto: A set of cryptography libraries used at Prople system (github.com)

Here are some examples of how I implemented the newtype pattern there.

Base traits:

pub trait Value<T> {
    fn get(&self) -> Result<T, CommonError>;
}

pub trait StringValue {
    fn get_string(&self) -> String;
}

/// `BytesValue` is a trait used to get a common byte array.
/// The return value is wrapped in a [`Bytes`] container object to simplify
/// byte array processing.
pub trait BytesValue {
    fn bytes(&self) -> Bytes;
}

/// `VectorValue` is a trait used to get the main vector value. Its generic parameter
/// indicates the data type stored inside the vector.
pub trait VectorValue<T> {
    fn vec(&self) -> Vec<T>;
}

ecdh/types.rs:

#[derive(PartialEq, Debug, Clone)]
pub struct PublicKeyBytes([u8; 32]);

impl Value<[u8; 32]> for PublicKeyBytes {
    fn get(&self) -> Result<[u8; 32], CommonError> {
        let byte_slice = self.bytes().slice(0..32);
        let byte_output = &byte_slice[..];

        let output: Result<[u8; 32], CommonError> = <&[u8; 32]>::try_from(byte_output)
            .map(|val| val.to_owned())
            .map_err(|_| CommonError::ParseValueError("unable to parse bytes".to_string()));

        output
    }
}

impl TryFrom<ByteHex> for PublicKeyBytes {
    type Error = EcdhError;

    fn try_from(value: ByteHex) -> Result<Self, Self::Error> {
        let result = hex::decode(value.hex())
            .map_err(|err| EcdhError::Common(CommonError::ParseHexError(err.to_string())))?;

        let peer_pub_bytes: [u8; 32] = match result.try_into() {
            Ok(value) => value,
            Err(_) => {
                return Err(EcdhError::ParsePublicKeyError(
                    "unable to parse given public key".to_string(),
                ))
            }
        };

        Ok(PublicKeyBytes(peer_pub_bytes))
    }
}

impl BytesValue for PublicKeyBytes {
    fn bytes(&self) -> Bytes {
        Bytes::from(self.0.to_vec())
    }
}

impl From<[u8; 32]> for PublicKeyBytes {
    fn from(value: [u8; 32]) -> Self {
        PublicKeyBytes(value)
    }
}

keysecure/types.rs:

#[derive(Clone, Debug)]
pub struct Password(String);

impl StringValue for Password {
    fn get_string(&self) -> String {
        self.0.to_owned()
    }
}

impl From<String> for Password {
    fn from(value: String) -> Self {
        Password(value)
    }
}

By implementing this pattern, we also gain the advantage that the Rust compiler helps ensure we work with domain model types rather than primitive data types.
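As a usage sketch, the `Password` type and `StringValue` trait from above are repeated here so the snippet stands alone. The `password_length` function and the `example` helper are hypothetical and not part of prople/crypto. They only illustrate that a function signature can demand the domain type, while `From<String>` keeps construction convenient via `.into()`.

```rust
pub trait StringValue {
    fn get_string(&self) -> String;
}

#[derive(Clone, Debug)]
pub struct Password(String);

impl StringValue for Password {
    fn get_string(&self) -> String {
        self.0.to_owned()
    }
}

impl From<String> for Password {
    fn from(value: String) -> Self {
        Password(value)
    }
}

// Hypothetical function: it accepts only a `Password`, never a bare `String`.
pub fn password_length(password: &Password) -> usize {
    password.get_string().chars().count()
}

// Hypothetical helper showing construction through `From<String>`.
pub fn example() -> usize {
    let password: Password = String::from("s3cret").into();
    // password_length(&String::from("s3cret")); // would not compile
    password_length(&password)
}
```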

The ADT (Algebraic Data Types)

Yesterday I found two recommended articles about this data type pattern:

Functional Domain Modeling in Rust - Part 1 (xebia.com)

Before discussing these data types further, we need to know what they are:

Algebraic data type - Wikipedia (en.wikipedia.org)

In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite type, i.e., a type formed by combining other types.

There are two kinds of ADT:

product types, created by combining two or more data types into a new type
sum types, also known as enums or tagged unions, represent data that can take on one of several possible values

In Rust, two basic constructs help us implement ADTs:

enum (sum types)
struct (product types)
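As a small sketch of how the two kinds look in Rust: all names below (`ContactMethod`, `Contact`, `describe`) are hypothetical and are not taken from the project or the linked articles. The enum is a sum type, because a value is exactly one variant. The struct is a product type, because a value holds every field at once.

```rust
/// Sum type: a contact method is an email OR a phone number, never both.
#[derive(Debug, Clone, PartialEq)]
pub enum ContactMethod {
    Email(String),
    Phone(String),
}

/// Product type: a contact has a name AND a contact method.
#[derive(Debug, Clone, PartialEq)]
pub struct Contact {
    pub name: String,
    pub method: ContactMethod,
}

/// The `match` must cover every variant, so adding a new variant later
/// turns every unhandled use into a compile error.
pub fn describe(contact: &Contact) -> String {
    match &contact.method {
        ContactMethod::Email(addr) => format!("{} via email {}", contact.name, addr),
        ContactMethod::Phone(num) => format!("{} via phone {}", contact.name, num),
    }
}
```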

From my personal project above, here is an implementation using enum:

#[derive(Clone, Debug, Serialize, Deserialize)]
#[serde(crate = "self::serde")]
pub enum ContextOptions {
    X25519,
    ED25519
}

impl ContextOptions {
    pub fn get(&self) -> String {
        match self {
            ContextOptions::X25519 => String::from("X25519"),
            ContextOptions::ED25519 => String::from("Ed25519"),
        }
    }
}

Here are more enum examples, taken from the article above:

enum ExperienceLevel {
    Junior,
    MidLevel,
    Senior,
}

enum InterviewStatus {
    Scheduled,
    Completed,
    Cancelled,
}

enum ApplicationStatus {
    Submitted,
    UnderReview,
    Rejected,
    Hired,
}

By modeling our data with ADTs, we can eliminate unnecessary ambiguity from our codebase and make our software more precise about what the business domain needs.
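Applied to the user example from the beginning, one possible sketch replaces the marital status string with an enum. The variant names and the return type are assumptions for illustration. A real domain may need a different set of states.

```rust
// Assumed set of states: adjust to whatever the business model defines.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MaritalStatus {
    Single,
    Married,
    Divorced,
    Widowed,
}

pub struct Profile {
    pub marital_status: MaritalStatus,
}

// Hypothetical body: maps each state to a label. Because the match is
// exhaustive, forgetting a state is a compile error, not a runtime bug.
pub fn user_marital_status(status: MaritalStatus) -> &'static str {
    match status {
        MaritalStatus::Single => "single",
        MaritalStatus::Married => "married",
        MaritalStatus::Divorced => "divorced",
        MaritalStatus::Widowed => "widowed",
    }
}
```

Compared with wrapping a `String`, the enum also rules out invalid values such as a misspelled status, since only the listed variants can be constructed.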

Outro

Although there are a lot of advantages to implementing NewTypes and ADTs in Rust, we have to be aware of the downsides too. One of them is the amount of boilerplate code that has to be attached to each new type, for example:

pub struct Nonce(Vec<u8>);

impl From<Vec<u8>> for Nonce {
    fn from(value: Vec<u8>) -> Self {
        Nonce(value)
    }
}

impl VectorValue<u8> for Nonce {
    fn vec(&self) -> Vec<u8> {
        self.0.to_owned()
    }
}

pub struct MessagePlain(Vec<u8>);

impl From<Vec<u8>> for MessagePlain {
    fn from(value: Vec<u8>) -> Self {
        MessagePlain(value)
    }
}

impl From<String> for MessagePlain {
    fn from(value: String) -> Self {
        MessagePlain(value.as_bytes().to_vec())
    }
}

impl VectorValue<u8> for MessagePlain {
    fn vec(&self) -> Vec<u8> {
        self.0.to_owned()
    }
}
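One possible way to soften this, shown here only as a sketch, is a small declarative macro that generates the repeated impls. The `bytes_newtype!` macro is hypothetical and not part of prople/crypto. It assumes every generated type wraps a `Vec<u8>` and needs the same two impls.

```rust
pub trait VectorValue<T> {
    fn vec(&self) -> Vec<T>;
}

// Hypothetical macro: generates the wrapper struct plus the impls
// that were written by hand for `Nonce` and `MessagePlain` above.
macro_rules! bytes_newtype {
    ($name:ident) => {
        pub struct $name(Vec<u8>);

        impl From<Vec<u8>> for $name {
            fn from(value: Vec<u8>) -> Self {
                $name(value)
            }
        }

        impl VectorValue<u8> for $name {
            fn vec(&self) -> Vec<u8> {
                self.0.to_owned()
            }
        }
    };
}

bytes_newtype!(Nonce);
bytes_newtype!(MessagePlain);

// Type-specific conversions still have to be written by hand.
impl From<String> for MessagePlain {
    fn from(value: String) -> Self {
        MessagePlain(value.into_bytes())
    }
}
```

The trade-off is that the macro hides the impls from a casual reader, so it is probably only worth it once several types share exactly the same shape.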

The other thing to keep in mind is that we also need to add more tests for each custom type, to make sure that data flows between the types as expected.

But, in my personal opinion, these downsides are well worth it. I’d rather accept the boilerplate and write more tests than work in a codebase with too much ambiguity in it.