Improved Error Handling

kitkat1
6 min read · Sep 9, 2023


Hi everyone. Since my last post, I’ve added some new features to my language: there are now more operators, such as modulus (%) and bitwise shift (<< and >>), and I improved the error handling system. Coding the new operators wasn’t much different from coding the others, so this post will focus on the error handling instead. If you’d like to see all the source code for this project, check out this GitHub repo.

My goal for the error handling improvements is to create a compiler flag that enables or disables detailed runtime errors. Errors that are not detailed won’t contain the line and column where the error occurred. While this makes the code harder to debug, it also makes the executable smaller and faster, since the position data no longer has to be stored in the bytecode and stepped over at runtime. In release builds, this could be useful for increasing performance.

I started the error handling improvements by creating a Log struct. Instead of simply returning a list of Strings when returning errors, I will return a list of Logs which can be printed. This is what the Log struct looks like:

/// Represents all possible errors as well as helpful debug information when relevant.
#[derive(Clone, PartialEq, Eq)]
pub struct Log
{
    pub log_type: LogType,
    pub line_and_col: Option<(usize, usize)>
}

/// An enum representing anything that can be logged.
#[derive(Clone, PartialEq, Eq)]
pub enum LogType
{
    Warning(WarningType),
    Error(ErrorType)
}

Each Log contains the type of log it is, as well as an optional line/column pair used for debugging. There are currently two different groups of logs: Warning and Error. The Warning type is used to indicate that something has gone wrong but that the code can still run, while the Error type indicates that the program will fail in some way. The WarningType and ErrorType enums store every possible message that can be displayed in the terminal. This is a very long list, so we won’t be showing it here. The formatting code for the Logs is also very long, but using the colored package, we’re able to create pretty error messages like the one below:

After doing this and changing all of the code to use this system, there are some instant benefits. For one, we don’t need a can_compile variable anymore, because we can just check whether the list of Logs contains any errors. We can also modify errors in one place, so enabling or disabling line and column numbers becomes straightforward.
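As a rough illustration of those two benefits, here is a minimal sketch of a check that could stand in for can_compile, plus a helper that drops position info from a list of logs. The function names has_errors and strip_positions are assumptions made for this example, and so is the ExampleWarning variant; only Log, LogType, and ErrorType::FatalError come from the post.

```rust
// Placeholder enums: the post does not list the real WarningType and ErrorType
// variants. ExampleWarning is hypothetical; FatalError appears in the post.
#[derive(Clone, PartialEq, Eq)]
pub enum WarningType
{
    ExampleWarning
}

#[derive(Clone, PartialEq, Eq)]
pub enum ErrorType
{
    FatalError
}

#[derive(Clone, PartialEq, Eq)]
pub enum LogType
{
    Warning(WarningType),
    Error(ErrorType)
}

#[derive(Clone, PartialEq, Eq)]
pub struct Log
{
    pub log_type: LogType,
    pub line_and_col: Option<(usize, usize)>
}

// Hypothetical stand-in for the old can_compile variable:
// returns true if at least one log is an Error (warnings are ignored).
pub fn has_errors(logs: &[Log]) -> bool
{
    logs.iter().any(|log| matches!(log.log_type, LogType::Error(_)))
}

// Hypothetical helper: removes the line/column pair from every log in place.
pub fn strip_positions(logs: &mut [Log])
{
    for log in logs.iter_mut()
    {
        log.line_and_col = None;
    }
}
```

With something like this, the compiler could continue to the next stage only when has_errors returns false, while still printing any warnings it collected.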

After this, I added a new flag, -detailed_errors, in cli_reader.rs. The process for this was the same as for the last flag, so I’ll move on to the vm. When I went to modify the vm, I noticed how repetitive it was. I decided to clean it up using closures, which are anonymous functions that can capture variables from their surroundings and be passed as arguments to other functions. I made one function for unary operators and one for binary operators, and now every operator uses one of those two. The unary function is not useful for this discussion, so below is the binary function, as well as the RuntimeError struct it uses.

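// Contains info about a runtime error that could happen.
struct RuntimeError<'a, T>
{
    // Returns true when the given operands should trigger the error.
    condition: &'a (dyn Fn(T) -> bool),
    // The error to log when the condition is met.
    error: ErrorType,
    // The current position in the bytecode.
    index: &'a mut usize,
    // The bytecode being executed.
    bytecode: &'a Vec<u8>,
}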

...

// Performs a binary operation on a pair of ints.
fn binary_int<F>(stack: &mut Vec<u8>, logs: &mut Vec<Log>, func: F, error: Option<RuntimeError<(i32, i32)>>)
    where F: Fn(i32, i32) -> i32
{
    // Detailed errors are only possible if this operation can fail at all.
    let detailed_err: bool = if let Some(error) = &error
    {
        get_detailed_err(error.bytecode)
    }
    else
    {
        false
    };
    if detailed_err && errors_stored_incorrectly(error.as_ref().expect("detailed_err is true"))
    {
        logs.push(Log{log_type: LogType::Error(ErrorType::FatalError), line_and_col: None});
        return;
    }

    // The right operand was pushed last, so it is popped first.
    let b: Option<i32> = pop_int_from_stack(stack);
    let a: Option<i32> = pop_int_from_stack(stack);
    let mut fail: bool = true;
    if let Some(a) = a
    {
        if let Some(b) = b
        {
            fail = false;
            if let Some(mut error) = error
            {
                handle_error(&mut error, (a, b), detailed_err, logs);
            }
            let c: i32 = func(a, b);
            stack.extend_from_slice(&c.to_le_bytes());
        }
    }
    // One of the operands could not be read from the stack.
    if fail
    {
        logs.push(Log{log_type: LogType::Error(ErrorType::FatalError), line_and_col: None});
    }
}

The RuntimeError struct stores the condition under which the error is thrown. The type of this condition is Fn(T) -> bool, a closure that takes a parameter of the generic type T and returns a boolean value. Since we can’t know the size of this type at compile time, we use the dyn keyword and put it behind a reference, whose size we do know. We also store the type of error that is thrown, the current index of the bytecode we’re at, and the bytecode itself. All of these except the error type are references with the lifetime parameter 'a. This just means that a RuntimeError can’t outlive any of the data it borrows: the compiler rejects any code where it would.
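To make the closure passing more concrete, here is a minimal sketch of how an operator and an error condition might be handed to a function shaped like binary_int. Check and apply_binary are hypothetical, stripped-down stand-ins for RuntimeError and binary_int: there is no stack, bytecode, or logging, and unlike the code above, this version skips the operation and returns None when the condition fires.

```rust
// Hypothetical, simplified stand-in for RuntimeError: only the condition is kept.
struct Check<'a, T>
{
    // A borrowed trait object, so the struct has a known size
    // regardless of which closure is stored.
    condition: &'a dyn Fn(T) -> bool,
}

// Hypothetical, simplified stand-in for binary_int.
// Applies `func` to the operands unless the optional check rejects them.
fn apply_binary<F>(a: i32, b: i32, func: F, check: Option<Check<(i32, i32)>>) -> Option<i32>
    where F: Fn(i32, i32) -> i32
{
    if let Some(check) = check
    {
        // The parentheses are needed to call a closure stored in a field.
        if (check.condition)((a, b))
        {
            return None;
        }
    }
    Some(func(a, b))
}

// Example operator built on top of apply_binary: division guarded
// by a division-by-zero condition.
fn checked_divide(a: i32, b: i32) -> Option<i32>
{
    let div_by_zero = |(_, b): (i32, i32)| b == 0;
    apply_binary(a, b, |a, b| a / b, Some(Check { condition: &div_by_zero }))
}
```

An operator that can’t fail, such as addition, would pass None for the check, in the same way binary_int accepts None for its error argument.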

The first thing binary_int does is find out whether there could be a detailed error. That depends on the value of the flag, which has been stored in the bytecode, and on whether the error argument is Some. If detailed errors are in play, errors_stored_incorrectly checks that the line and column information is available, and a fatal error is logged if it isn’t. After that, the code runs largely the same way as before. If the operation could throw an error, handle_error is called:

// Handles runtime errors.
fn handle_error<T>(error: &mut RuntimeError<T>, value: T, detailed_err: bool, logs: &mut Vec<Log>)
{
    let condition: &dyn Fn(T) -> bool = error.condition;
    let index: &mut usize = error.index;
    let bytecode: &Vec<u8> = error.bytecode;
    let ptr_size: usize = get_ptr_size(bytecode);
    if condition(value)
    {
        if detailed_err
        {
            // Read the line number (ptr_size little-endian bytes).
            let mut bytes : [u8; (usize::BITS / 8) as usize] = [0; (usize::BITS / 8) as usize];
            for i in 0..ptr_size
            {
                bytes[i] = bytecode[*index];
                *index += 1;
            }
            let line: usize = usize::from_le_bytes(bytes);
            // Read the column number, stored right after the line number.
            bytes = [0; (usize::BITS / 8) as usize];
            for i in 0..ptr_size
            {
                bytes[i] = bytecode[*index];
                *index += 1;
            }
            let col: usize = usize::from_le_bytes(bytes);
            logs.push(Log{log_type: LogType::Error(error.error.clone()), line_and_col: Some((line, col))});
        }
        else
        {
            logs.push(Log{log_type: LogType::Error(error.error.clone()), line_and_col: None});
        }
    }
    else if detailed_err
    {
        // No error occurred: skip the unused line and column bytes.
        *index += 2 * ptr_size;
    }
}

If the error condition is met for the given value, we log an error. However, only when detailed_err is set do we read the line and column info first. The final else if covers the case where no error occurs but detailed errors are enabled: the line and column bytes are still in the bytecode, so the index has to skip over them. Either way, the vm is ready to read the next instruction once the potential error has been processed. This is what we want: now, only if the flag is set will the line and column number be recorded with the error. Note that this only applies to runtime errors. For compile-time errors, line and column info is always shown. Hiding it never helps, and showing it makes a lot more sense, since the error has to be fixed before the code can do anything.
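The two decoding loops in handle_error do the same work, so they could be factored into one helper. The sketch below is one possible way to do that, not the code from the repo: read_usize and read_line_and_col are hypothetical names, and the bounds checks (returning None instead of panicking on short bytecode) are an addition of this example.

```rust
// Hypothetical helper: reads `ptr_size` little-endian bytes starting at `*index`
// and advances `*index` past them. Returns None, leaving `*index` unchanged,
// if the bytecode is too short or `ptr_size` is wider than a usize.
fn read_usize(bytecode: &[u8], index: &mut usize, ptr_size: usize) -> Option<usize>
{
    const USIZE_BYTES: usize = (usize::BITS / 8) as usize;
    if ptr_size > USIZE_BYTES
    {
        return None;
    }
    let end: usize = index.checked_add(ptr_size)?;
    let slice: &[u8] = bytecode.get(*index..end)?;
    // Unused high bytes stay zero, matching the zero-filled buffer in handle_error.
    let mut bytes: [u8; USIZE_BYTES] = [0; USIZE_BYTES];
    bytes[..ptr_size].copy_from_slice(slice);
    *index = end;
    Some(usize::from_le_bytes(bytes))
}

// Hypothetical helper: reads the line number followed by the column number.
// If the column read fails, `*index` has already moved past the line number.
fn read_line_and_col(bytecode: &[u8], index: &mut usize, ptr_size: usize) -> Option<(usize, usize)>
{
    let line: usize = read_usize(bytecode, index, ptr_size)?;
    let col: usize = read_usize(bytecode, index, ptr_size)?;
    Some((line, col))
}
```

In the detailed branch of handle_error, a single call to read_line_and_col would then produce the pair stored in line_and_col, and a None result could be reported as a FatalError.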

Here is the output of the same error as before when detailed errors are disabled.

As of right now, the line and column information is shown by default, and the flag must be set to false in the command line to prevent this info from being in the bytecode. My reasoning for this is that removing this information is only necessary in release builds, which are less common than debug builds.

That’s all for tonight. Next time, I’ll add boolean values, which I’ve been meaning to do for a while. Unfortunately, this will require a lot of rewriting in the parser. I love refactoring :).
