Unit Testing ChatGPT Prompts: Introducing Riteway for SudoLang

Eric Elliott
Published in JavaScript Scene
8 min read · Apr 3, 2023
Running Riteway’s usage example tests in SudoLang running on ChatGPT using GPT-4

I have been a long-time advocate of Test-Driven Development (TDD) because of its many productivity and quality benefits. You can read more about those in “TDD Changed My Life”.

When I realized that GPT-4 was capable of following complex instructions, one of the first things I thought of was trying to unit test prompts. To do that, I needed more precise control over the prompts. SudoLang is a powerful natural language, constraint-based pseudocode language for GPT-4 and other language model AIs.

Since language model outputs can vary significantly and sometimes lack determinism, unit tests play a pivotal role in guaranteeing that the model adheres to all software requirements while generating implementations.

Riteway has been my testing framework of choice since we built it during the development of the DevAnywhere.io source code because it enforces good practices in test writing. With SudoLang, Riteway becomes even more powerful by allowing the language model to provide intelligent feedback on the quality of your tests, and by using intelligent inference to produce quality unit tests for non-deterministic AI-generated outputs.

Constraints in SudoLang

SudoLang is a constraint-based programming language inspired by the constraints in Ivan Sutherland’s Sketchpad. In SudoLang, a constraint is a predefined condition or rule that the model should follow when generating and interpreting code. Constraints help improve the accuracy and relevance of the code functionality, ensuring that it meets the specified requirements. Constraints combine with AI’s omnipotent inference to automatically and dynamically synchronize related state changes in SudoLang programs.

A lot of the magic in Riteway for SudoLang comes from the powerful, AI-inferred omnipotent constraint solving.

Riteway for SudoLang: Built-in Constraints, Requirements, and Warnings

Riteway for SudoLang goes beyond traditional testing by incorporating constraints, requirements, and warnings into the testing framework. This enhanced functionality guides the model to generate more accurate and reliable code, while also giving you valuable feedback on the quality of your tests.

Riteway for SudoLang is designed to help developers create high-quality tests by enforcing constraints and requirements, and by providing warnings. The framework emphasizes five key questions every unit test must answer and follows the RITE Way acronym to ensure that tests are:

  1. Readable: Tests should be easily understandable, using clear language and formatting.
  2. Isolated (for unit tests) or Integrated (for integration tests): Tests should not depend on each other and should be isolated from external factors.
  3. Thorough: Tests should cover all likely edge cases, ensuring that the code works correctly in various scenarios.
  4. Explicit: Everything needed to understand the test should be included within the test itself. Tests should not rely on external, shared mutable state or other tests for context.

By adhering to these principles, the Riteway framework helps developers create tests that are easy to understand, maintain, and debug. The constraints and warnings built into the framework guide the user in crafting tests that effectively answer the following five questions:

  1. What is the unit under test? (The test should be in a named describe block)
  2. What is the expected behavior? (The $given and $should arguments should be adequate)
  3. What is the actual output? (The unit under test was exercised by the test)
  4. What is the expected output? (The $expected and/or $should arguments should be adequate)
  5. How can we find the bug? (The bug should be inferable from the test error message)

By focusing on these questions and the RITE Way acronym, the Riteway testing framework for SudoLang helps developers create high-quality tests that are easy to understand, maintain, and debug.
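To make the five questions concrete, here is a minimal JavaScript sketch of how a test report can carry the answers. The describeResult helper and the buggy double function are hypothetical names invented for illustration; they are not part of Riteway, and the JSON-based comparison is a simplification.

```javascript
// Hypothetical helper, not part of Riteway: builds a report from the fields
// a Riteway-style assertion carries, so the report answers the five questions.
const describeResult = (unit, { given, should, actual, expected }) => {
  // Simplified deep comparison via JSON; good enough for plain data.
  const pass = JSON.stringify(actual) === JSON.stringify(expected);
  // Questions 1 and 2: the unit under test and its expected behavior.
  const label = `${unit}: given ${given}, should ${should}`;
  // Questions 3, 4 and 5: actual vs. expected output, which points at the bug.
  return pass
    ? `ok - ${label}`
    : `not ok - ${label}\n  expected: ${JSON.stringify(expected)}\n  actual:   ${JSON.stringify(actual)}`;
};

// A deliberately buggy unit under test, to show what a failure looks like.
const double = (n) => n + 2;

const report = describeResult("double", {
  given: "the number 3",
  should: "return 6",
  actual: double(3),
  expected: 6,
});

console.log(report);
```

Reading the failure message alone, you can tell which unit failed, what it was supposed to do, and how the actual output differs, which is usually enough to start looking for the bug.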

The Riteway Test Runner

The Riteway test runner is designed to carefully interpret the tested code and generate realistic output for each test. It logs the results of each test case, providing insights on whether the test passed or failed. If a test fails, it offers suggestions on how to identify and fix the bug.

Example Test Cases and Output

Here’s an example of how Riteway for SudoLang can be used to create and run tests:

describe("Tests", {
assert({
should: "run tests",
actual: true,
expected: true
});
});

describe("pythagoreanTriple", {
interface pythagoreanTriple {
// generate a valid pythagorean triple
function pythagoreanTriple();
constraint a^2 + b^2 = c^2;
}

assert({
should: "generate a valid pythagorean triple",
actual: pythagoreanTriple(),
});
});

The output for these tests would look like this:

1. ✅ Should run tests: Actual: true, Expected: true
2. ✅ Should generate a valid pythagorean triple: Actual: [3, 4, 5], Expected: [3, 4, 5]
✅ 2 tests passed, 0 failed.
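For comparison, a conventional runner that produces a similar report might look roughly like the following JavaScript sketch. The runTests function and its output format are assumptions modeled on the sample output above, not Riteway's actual implementation. Since plain JavaScript has no inference engine, the expected value for the triple is hard-coded here.

```javascript
// Hypothetical mini-runner: names and output format are assumptions modeled
// on the sample output, not Riteway's real implementation.
const runTests = (tests) => {
  const results = tests.map(({ should, actual, expected }, index) => {
    // Simplified deep comparison via JSON; good enough for plain data.
    const pass = JSON.stringify(actual) === JSON.stringify(expected);
    const icon = pass ? "✅" : "❌";
    return {
      pass,
      text: `${index + 1}. ${icon} Should ${should}: Actual: ${JSON.stringify(actual)}, Expected: ${JSON.stringify(expected)}`,
    };
  });
  const passed = results.filter((result) => result.pass).length;
  const failed = results.length - passed;
  const summary = `${failed === 0 ? "✅" : "❌"} ${passed} tests passed, ${failed} failed.`;
  // One line per test, followed by a summary line.
  return [...results.map((result) => result.text), summary];
};

const output = runTests([
  { should: "run tests", actual: true, expected: true },
  { should: "generate a valid pythagorean triple", actual: [3, 4, 5], expected: [3, 4, 5] },
]);

output.forEach((line) => console.log(line));
```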

Handling Nondeterminism

Note that the pythagoreanTriple function above generates a random Pythagorean triple, nondeterministically. In traditional programming languages, we’d need to do some clever math in our expectation, potentially duplicating buggy logic from the implementation. In SudoLang, we lean on the inference engine with the following constraint:

constraint: if expected is undefined, infer it from the should description
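To see what that constraint saves you, here is a rough JavaScript sketch of the traditional approach, where the test has to restate the math as a property check because no fixed expected value exists. The generator below uses Euclid's formula and is a hypothetical stand-in; the article does not say how the SudoLang version picks its triple.

```javascript
// Hypothetical generator using Euclid's formula: for integers m > n > 0,
// (m^2 - n^2, 2mn, m^2 + n^2) is a Pythagorean triple.
// The random source is injectable so the function can be made deterministic.
const pythagoreanTriple = (random = Math.random) => {
  const n = 1 + Math.floor(random() * 10); // 1..10
  const m = n + 1 + Math.floor(random() * 10); // n+1..n+10, so m > n
  return [m * m - n * n, 2 * m * n, m * m + n * n];
};

// The property check: this is where a conventional test restates the math
// instead of comparing against one fixed expected value.
const isPythagoreanTriple = ([a, b, c]) =>
  [a, b, c].every((x) => Number.isInteger(x) && x > 0) &&
  a * a + b * b === c * c;

const triple = pythagoreanTriple();
console.log(triple, isPythagoreanTriple(triple));
```

The check itself is a second piece of logic that can be wrong, which is the duplication the inferred-expectation constraint is meant to avoid.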

Authoring JavaScript in SudoLang

Writing and maintaining a suite of unit tests in SudoLang is a great and easy way to specify well-defined requirements for AI code-generation tasks:

  • Paste the Riteway framework into the AI prompt.
  • In the same prompt, declare a SudoLang function and add any required constraints.
  • In the same prompt: list(requirements we should test for) |> generateRitewayTests() |> log:
function fibonacci(n) {
constraint sequence should begin with 0
find the nth fibonacci number
}

list(requirements we should test for) |> generateRitewayTests() |> log

This should produce output like:

Here are the requirements we should test for the fibonacci function:

The sequence should begin with 0.

The function should return the correct Fibonacci number for the given input.

The function should handle edge cases, such as negative numbers and non-integer inputs.

Based on these requirements, we can generate Riteway tests:

describe("fibonacci", {
assert({
should: "start the sequence with 0",
actual: fibonacci(0),
expected: 0
});

assert({
should: "return the correct Fibonacci number for the given input",
actual: fibonacci(5),
expected: 5
});

assert({
should: "return the correct Fibonacci number for another given input",
actual: fibonacci(10),
expected: 55
});

assert({
should: "handle negative numbers",
actual: fibonacci(-5),
expected: undefined // or an appropriate error message if an error is thrown
});

assert({
should: "handle non-integer inputs",
actual: fibonacci(3.5),
expected: undefined // or an appropriate error message if an error is thrown
});
});

These tests ensure that the fibonacci function correctly calculates Fibonacci numbers, starts the sequence with 0, and handles edge cases like negative numbers and non-integer inputs.

Notice that the suggested tests are incorrect, but with the test suite listed, you can request revisions with revise(test to revise, description of requirement change):

revise(
```
describe("fibonacci", {
assert({
should: "start the sequence with 0",
actual: fibonacci(0),
expected: 0
});

assert({
should: "return the correct Fibonacci number for the given input",
actual: fibonacci(5),
expected: 5
});

assert({
should: "return the correct Fibonacci number for another given input",
actual: fibonacci(10),
expected: 55
});

assert({
should: "handle negative numbers",
actual: fibonacci(-5),
expected: undefined // or an appropriate error message if an error is thrown
});

assert({
should: "handle non-integer inputs",
actual: fibonacci(3.5),
expected: undefined // or an appropriate error message if an error is thrown
});
});
```, {
please correct expectations based on the real fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34...
expect out-of-domain inputs to throw using try..catch
})

You might get something like this:

describe("fibonacci", {
assert({
should: "start the sequence with 0",
actual: fibonacci(1),
expected: 0
});

assert({
should: "return the correct Fibonacci number for the given input",
actual: fibonacci(6),
expected: 5
});

assert({
should: "return the correct Fibonacci number for another given input",
actual: fibonacci(10),
expected: 34
});

assert({
should: "throw for negative numbers",
actual: {
try {
fibonacci(-5);
} catch (error) {
error.message;
}
}
});

assert({
should: "throw for non-integer inputs",
actual: {
try {
fibonacci(3.5);
} catch (error) {
error.message;
}
}
});
});
  • Now run the tests: run(tests)
  • With that preview of the test results, you can refine constraints, add or revise tests, etc.
  • When you’re satisfied, transpile the tests.

In this case, I don’t actually like the throw conditions, so I’m omitting them in my transpiled tests.

Transpiling SudoLang Tests to JavaScript

You can transpile tests with:

ModuleName tests |>
transpile(JavaScript Riteway) |> // or (JavaScript Jest)
wrapWith(Markdown code block)

Which gives me:

describe("fibonacci", assert => {
assert({
given: "1",
should: "start the sequence with 0",
actual: fibonacci(1),
expected: 0
});

assert({
given: "a valid input",
should: "return the correct Fibonacci number for the given input",
actual: fibonacci(6),
expected: 5
});

assert({
given: "another valid input",
should: "return the correct Fibonacci number for another given input",
actual: fibonacci(10),
expected: 34
});
});
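For reference, here is one JavaScript implementation that would satisfy those three transpiled tests. It is a sketch, not output from the model: it assumes 1-based positions (so fibonacci(1) is 0, matching the revised expectations), and it does not validate out-of-domain inputs, in line with the decision to omit the throw tests.

```javascript
// Hypothetical implementation, assuming 1-based positions in the sequence
// 0, 1, 1, 2, 3, 5, 8, 13, 21, 34...
// so fibonacci(1) = 0, fibonacci(2) = 1, fibonacci(3) = 1, and so on.
// No input validation: negative or non-integer n is not rejected.
const fibonacci = (n) => {
  let [current, next] = [0, 1];
  // Advance the pair (n - 1) times to move from position 1 to position n.
  for (let position = 1; position < n; position += 1) {
    [current, next] = [next, current + next];
  }
  return current;
};

console.log([1, 6, 10].map((n) => fibonacci(n))); // [ 0, 5, 34 ]
```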

Riteway for SudoLang is Experimental

SudoLang works really well for very simple programs in both GPT-3 and GPT-4. For more complex programs, you’ll need GPT-4. The Riteway framework itself is pushing the boundaries of complexity that GPT-4 is capable of, so you should not rely on getting correct or expected results every time (or even half the time).

But the fact that it’s even possible to get this far is an extraordinary testament to what lies ahead. More than a decade ago, I spoke at a JavaScript conference and gave an interview where I talked about the future of programming. I’ve been interested in AI and generative technology all my adult life. In the interview, I talked about a future where software engineers don’t need to write all the implementation code anymore, and their primary responsibility will be to maintain the requirements with enough functional clarity to keep generative models on-track.

The beginning of that future is upon us today.

Wrapping Up

Riteway for SudoLang offers a powerful and intuitive way to test your AI-generated code. By combining the benefits of TDD with the flexibility of SudoLang and the intelligence of the language model, you can create robust and reliable software with greater confidence. Give Riteway for SudoLang a try and experience the difference for yourself.

Riteway for SudoLang is now available in the SudoLang examples folder on GitHub.

Enjoy!

👉 The Art of Effortless Programming

Next Steps

I’m offering 1:1 mentorship on AI whispering to help people interested in supercharging their experience with AI to solve complex problems. Sign up here.

What qualifies me to do that? Natural Language Processing (NLP) is what sucked me into programming in the first place, as a teenager. Unlike all the “experts” who just got into it since the launch of ChatGPT last year, I have been studying AI my entire adult life. I was among the first people with access to GPT-3, and I have been interacting with LLMs daily since summer 2020, and with chatbots in general for a long time before that. I have been writing and speaking about AI for years.

I’m featured in the new Cyborg Society documentary film, coming soon to a streaming service near you.

Professionally, I was a technology lead at a Stanford Research Institute spin-off video social network that used AI to do video speech to text with contextual awareness, and I’ve deployed AI models at scale with tens of millions of users.

Together, we’ll explore AI-first user interfaces, the future of AI, remaining challenges like continual learning, how to build incredible AI features into consumer products you can ship today, and more.

Interested? Reach out here.

Eric Elliott is a tech product and platform advisor, author of “Composing Software”, creator of SudoLang (the AI programming language), cofounder of EricElliottJS.com and DevAnywhere.io, and dev team mentor. He has contributed to software experiences for Adobe Systems, Zumba Fitness, The Wall Street Journal, ESPN, BBC, and top recording artists including Usher, Frank Ocean, Metallica, and many more.

He enjoys a remote lifestyle with the most beautiful woman in the world.
