Introducing a Python-Based Reasoning Engine for Deterministic AI
People want deterministic systems, so we are bringing old ideas back for the LLM age.
One of the biggest open problems is constraining how systems validate and reason about the information entering them. As we consume more and more unstructured data through stochastic LLMs, the ability to enforce rules and guardrails becomes even more important.
We have built a Python-based reasoning and validation framework, inspired by Pydantic, that makes it simple for developers and non-technical domain experts to build complex rule engines. The framework is easily extensible by developers, and we have fine-tuned a model that helps automate the construction of rules from natural language instructions. Every type of problem requires a certain level of customization, but many business problems can use similar frameworks (e.g. a logistics framework carries over to food delivery). This symbolic reasoning and validation framework is useful wherever you want to turn SOPs and other business guardrails into enforceable code. As the river crossing puzzle later in this article shows, it can also be used to model reasoning frameworks for problem solving.
These rule engines work well with graph data structures, but do not require a graph database to operate. In this case, our graph data structure sits in a JSON file. The framework is intended to be flexible whether you store your data in graph, relational or NoSQL databases. We use Python because semantic tools and languages (such as SHACL and RDF) are difficult to access and unfamiliar to the majority of developers in the Python ecosystem. Further, we can leverage familiar frameworks like Pydantic to bring validation and basic reasoning to existing workflows. This means full, stateful programs can be built within a single code file, whereas a state machine cannot be built in RDF anyway.
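To make the "graph in a JSON file" idea concrete, here is a minimal sketch that flattens nested JSON records into graph edges using only the standard library. The record is a trimmed copy of one employee from the sample later in this article, and the predicate names (`located_in`, `holds_document`) and the `to_triples` helper are our own illustrative assumptions, not part of the framework.

```python
import json
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Trimmed copy of one employee record from the sample later in this article.
RAW = """
{
  "employees": [
    {
      "name": "Sarah",
      "country": { "name": "Australia" },
      "documents": [{ "type": "heavy_lifting" }]
    }
  ]
}
"""


def to_triples(data: dict) -> List[Triple]:
    """Flatten nested employee records into (subject, predicate, object) edges."""
    triples: List[Triple] = []
    for emp in data["employees"]:
        name = emp["name"]
        # Predicate names are illustrative assumptions, not a fixed schema.
        triples.append((name, "located_in", emp["country"]["name"]))
        for doc in emp.get("documents", []):
            triples.append((name, "holds_document", doc["type"]))
    return triples


triples = to_triples(json.loads(RAW))
print(triples)
```

Rules can then be written against plain Python tuples, regardless of which database the records originally came from.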
This system comprises five parts:
- Data Structure
- Rules
- Reasoning Framework / LLM
- Agent
- State Machine
What does that mean? Let us compare a Reasoning Engine to a game of Chess:
- Data Structure = Chess Pieces
- Rules = Rules of Chess
- Reasoning Framework = Chess Player
- Agent = Hand that moves the Chess pieces
- State Machine = Records the Board, Status of pieces, etc.
There are two types of systems we are going to introduce: Validation Engines and Reasoning Engines. Validation Engines are a subset of Reasoning Engines, as they only check whether something is valid according to the stated Rules. A Validation Engine typically has no state machine, agent or reasoning framework, since it merely validates a data structure against the existing rules.
A Reasoning Engine helps to understand and ‘navigate’ a problem set. This involves choosing among potential moves while staying bound by the Rules. Because decisions are involved, a Reasoning Engine needs a state machine.
Validation & Reasoning Case Studies
We built a process aimed at streamlining the way non-technical operations/business users can contribute to an enterprise-grade rule engine, with developers in the loop.
The process is as follows:
- Non-technical ops people enter rules and constraints in natural language through a UI.
- These natural language rules are translated into a Python structure within our framework that your dev team can approve.
- As rules are entered, conflicts are detected and flagged automatically, so the developers on your team can streamline their validation checks on new rules.
- Once approved, rules automatically run in the background.
- By our estimate, this process cuts dev time roughly fivefold, and it can be extended to any use case.
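As a rough illustration of the conflict-flagging step in the list above, the sketch below compares a candidate rule against the rules already accepted. It is a deliberately naive heuristic of our own (same rule type with different parameters counts as a conflict), not the check our platform runs; `find_conflicts` and the `max_age` candidate are hypothetical names, and the rule dictionaries follow the JSON shape used in the case study later in this article.

```python
from typing import Any, Dict, List

RuleDict = Dict[str, Any]


def find_conflicts(existing: List[RuleDict], candidate: RuleDict) -> List[str]:
    """Return human-readable flags for a candidate rule; an empty list means no flags."""
    flags: List[str] = []
    for rule in existing:
        if rule["type"] != candidate["type"]:
            continue  # rules of a different type are assumed not to interact
        if rule["parameters"] == candidate["parameters"]:
            flags.append(f"duplicate of existing rule '{rule['type']}'")
        else:
            flags.append(
                f"conflicts with existing rule '{rule['type']}': "
                f"{rule['parameters']} vs {candidate['parameters']}"
            )
    return flags


existing_rules = [{"type": "min_age", "parameters": {"min_age": 18}}]

# A second min_age rule with a different threshold is flagged for review.
print(find_conflicts(existing_rules, {"type": "min_age", "parameters": {"min_age": 16}}))
# A rule of a new (hypothetical) type raises no flags.
print(find_conflicts(existing_rules, {"type": "max_age", "parameters": {"max_age": 65}}))
```

A flag here only means "a developer should look at this"; deciding whether two rules truly contradict each other still depends on the domain.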
This is a visualized snapshot of the front-end flow (this is of the river crossing example below). We include a video of what this flow looks like at the end of the article.
Breakdown of Dev Hours Spent
For each problem statement, the development process took about 5 hours. Most of that time went into understanding the use case and creating the data structure and reasoning framework that reflect the specific context of the problem. Once we have the right structure in mind, the model is used to flesh out the details of the rules, structure and reasoning framework.
The process is essentially:
- Identifying the data structure that should best be used to represent the set of rules and problem space
- Developing the reasoning framework to be used in conjunction with the data structure
- Contextualizing the model with the data structure used
- Plugging the UI into the contextualized model
Let us go in depth into a couple of use-cases.
Validation Engine (Simple, Stateless)
We now go into an anonymized case study. A mining company wants to validate that its existing and potential employees hold the qualifications and courses needed to do their jobs safely, and the requirements can vary from region to region. There are rules around age restrictions and around the qualifications needed for different roles and for different vehicle types.
A sample of the data would be as follows:
{
  "employees": [
    {
      "name": "Sarah",
      "age": 25,
      "country": { "name": "Australia" },
      "role": "Manager",
      "documents": [{ "type": "safe_handling_at_work" }, { "type": "heavy_lifting" }]
    },
    {
      "name": "John",
      "age": 17,
      "role": "Laborer",
      "country": { "name": "Australia" },
      "documents": [{ "type": "heavy_lifting" }]
    },
    {
      "name": "Alice",
      "age": 30,
      "role": "Dozer Operator",
      "country": { "name": "Brazil" },
      "documents": [{ "type": "heavy_lifting" }]
    }
  ]
}
And the rules, such as “Employees must be at least 18”, can be represented in the same way:
{
  "rules": [
    {
      "type": "min_age",
      "parameters": { "min_age": 18 }
    },
    {
      "type": "dozer_operator",
      "parameters": {
        "country": "Australia",
        "role": "dozer_operator",
        "document_type": "dozer_qualification"
      }
    }
  ]
}
Then, running it with a basic Python reasoning engine, we can get the following output.
Graph does not conform to rules:
Minimum_age must be 18
Role "Dozer_operator" must have document_type: "Dozer_qualification"
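For readers who want to see the shape of such a stateless check, here is a minimal sketch using plain dictionaries and the sample data above. It is not our production engine: the `CHECKS` registry, the helper names and the message wording are assumptions, as is the decision to treat `country` and `role` as scoping conditions and to normalize role labels so that "Dozer Operator" matches "dozer_operator".

```python
from typing import Any, Callable, Dict, List, Optional

Employee = Dict[str, Any]
Params = Dict[str, Any]

DATA = {"employees": [
    {"name": "Sarah", "age": 25, "country": {"name": "Australia"}, "role": "Manager",
     "documents": [{"type": "safe_handling_at_work"}, {"type": "heavy_lifting"}]},
    {"name": "John", "age": 17, "country": {"name": "Australia"}, "role": "Laborer",
     "documents": [{"type": "heavy_lifting"}]},
    {"name": "Alice", "age": 30, "country": {"name": "Brazil"}, "role": "Dozer Operator",
     "documents": [{"type": "heavy_lifting"}]},
]}
RULES = {"rules": [
    {"type": "min_age", "parameters": {"min_age": 18}},
    {"type": "dozer_operator", "parameters": {
        "country": "Australia", "role": "dozer_operator",
        "document_type": "dozer_qualification"}},
]}


def normalize(label: str) -> str:
    """Make 'Dozer Operator' and 'dozer_operator' compare equal."""
    return label.strip().lower().replace(" ", "_")


def check_min_age(emp: Employee, params: Params) -> Optional[str]:
    if emp["age"] < params["min_age"]:
        return f"{emp['name']}: age {emp['age']} is below the minimum of {params['min_age']}"
    return None


def check_role_document(emp: Employee, params: Params) -> Optional[str]:
    # Assumption: the rule applies only to the named role in the named country.
    if emp["country"]["name"] != params["country"]:
        return None
    if normalize(emp["role"]) != normalize(params["role"]):
        return None
    held = {doc["type"] for doc in emp["documents"]}
    if params["document_type"] not in held:
        return f"{emp['name']}: role {params['role']} requires {params['document_type']}"
    return None


# Maps each rule "type" to the function that evaluates it (hypothetical registry).
CHECKS: Dict[str, Callable[[Employee, Params], Optional[str]]] = {
    "min_age": check_min_age,
    "dozer_operator": check_role_document,
}


def validate(data: Dict[str, Any], rules: Dict[str, Any]) -> List[str]:
    """Run every rule against every employee and collect violation messages."""
    violations: List[str] = []
    for rule in rules["rules"]:
        check = CHECKS[rule["type"]]
        for emp in data["employees"]:
            message = check(emp, rule["parameters"])
            if message:
                violations.append(message)
    return violations


violations = validate(DATA, RULES)
print(violations)  # ['John: age 17 is below the minimum of 18']
```

Under this reading, Alice is not flagged by the Australian rule because she is based in Brazil. If the rule is meant to apply to the role regardless of country, the country check would be dropped and she would be flagged as well.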
Reasoning Engine (Agentic, Stateful)
Puzzle-Solving
In this example, we tackle the famous river crossing puzzle, which LLMs have tended to solve through rote memorization rather than reasoning.
A farmer returns from the market, where he bought a goat, a cabbage and a wolf. On the way home he must cross a river. His boat is small and won’t fit more than one of his purchases. He cannot leave the goat alone with the cabbage (because the goat would eat it), nor can he leave the goat alone with the wolf (because the goat would be eaten). How can the farmer get everything on the other side in this river crossing puzzle?
Although the latest LLMs (as of October ‘24) seem to have been trained on enough data to handle the river crossing puzzle specifically, this classic example shows that LLMs still need a lot of work before they can serve as a reasoning or validation engine for internal enterprise use cases. It is also unclear whether a sufficiently unusual rule, one outside the training data, would throw off the LLM.
For the process here, we take the following steps:
- We translate the original puzzle into a set of rules, a data structure, and a reasoning framework
- We show the original puzzle being solved
- We insert a number of new rules into the puzzle through natural language
- Our model turns the natural language instructions into rules
- We run validation checks and accept the new rules
- We run and solve the new puzzle with the new rules
- (Optional) We showcase how the State Machine is controlled and visible through a Knowledge Graph
The video below illustrates the specific step where natural language instructions are turned into rules that are automatically validated, checked, and accepted, with a human-in-the-loop component for both the non-technical domain expert and the developer.
Developer Walkthrough
For technical teams, we also include a few sample code snippets to showcase what we have done. The output of the basic process can be seen below. Because this is a symbolic validation engine, the run takes about 0.0003 seconds, and it could be made faster still.
Initial state:
Left bank: Wolf, Goat, Cabbage, Farmer
Right bank: empty
Solution:
Step 0:
Left bank: Wolf, Goat, Cabbage, Farmer
Right bank: empty
Step 1:
Left bank: Wolf, Cabbage
Right bank: Goat, Farmer
Step 2:
Left bank: Wolf, Cabbage, Farmer
Right bank: Goat
Step 3:
Left bank: Cabbage
Right bank: Wolf, Goat, Farmer
Step 4:
Left bank: Goat, Cabbage, Farmer
Right bank: Wolf
Step 5:
Left bank: Goat
Right bank: Wolf, Cabbage, Farmer
Step 6:
Left bank: Goat, Farmer
Right bank: Wolf, Cabbage
Step 7:
Left bank: empty
Right bank: Wolf, Goat, Cabbage, Farmer
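The states in the trace above are nodes in a small graph whose edges are legal crossings. As a minimal sketch of that state machine, assuming a plain tuple encoding of our own (farmer, wolf, goat, cabbage) rather than the framework's state model, the whole graph can be enumerated in a few lines:

```python
from itertools import product
from typing import Dict, List, Tuple

# State = bank ("left"/"right") of (farmer, wolf, goat, cabbage), in that order.
State = Tuple[str, ...]
BANKS = ("left", "right")


def is_safe(state: State) -> bool:
    farmer, wolf, goat, cabbage = state
    if goat == farmer:
        return True  # the farmer is supervising the goat
    # The goat is unattended: it must not share a bank with the wolf or the cabbage.
    return goat != wolf and goat != cabbage


def neighbours(state: State) -> List[State]:
    """All safe states reachable in one crossing (farmer alone or with one item)."""
    farmer = state[0]
    other = "right" if farmer == "left" else "left"
    result: List[State] = []
    for cargo in (None, 1, 2, 3):  # index of the item carried, if any
        if cargo is not None and state[cargo] != farmer:
            continue  # that item is on the opposite bank
        nxt = list(state)
        nxt[0] = other
        if cargo is not None:
            nxt[cargo] = other
        if is_safe(tuple(nxt)):
            result.append(tuple(nxt))
    return result


# Adjacency list over every safe state: this is the state machine as a graph.
graph: Dict[State, List[State]] = {
    s: neighbours(s) for s in product(BANKS, repeat=4) if is_safe(s)
}
edges = sum(len(v) for v in graph.values()) // 2  # each crossing is reversible
print(len(graph), "safe states,", edges, "crossings between them")
# 10 safe states, 10 crossings between them
```

Of the 16 possible placements only 10 are safe, which is why the search space for this puzzle is so small and why the solution is found almost instantly.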
The state machine can be represented graphically, as can be seen in a WhyHow graph here:
Rules can be added as simply as follows:
class GoatCabbageRule(Rule):
    def evaluate(self, state: WolfGoatCabbageState) -> bool:
        # Violated when goat and cabbage share a bank and the farmer is elsewhere
        return not (state.goat == state.cabbage and state.farmer != state.goat)

    def get_description(self) -> str:
        return "Goat cannot be left alone with cabbage"
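The rule above relies on a `Rule` base class and a `WolfGoatCabbageState` model that are not shown in this article. A minimal sketch of what they might look like follows, using a standard-library dataclass as a stand-in for a Pydantic model. The field defaults, the abstract interface and the `WolfGoatRule` description string are our assumptions for illustration, written by analogy with `GoatCabbageRule`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class WolfGoatCabbageState:
    """Each field holds the bank ('left' or 'right') that participant is on."""
    wolf: str = "left"
    goat: str = "left"
    cabbage: str = "left"
    farmer: str = "left"


class Rule(ABC):
    """Interface implied by the rule snippets: a predicate plus a description."""

    @abstractmethod
    def evaluate(self, state: WolfGoatCabbageState) -> bool:
        """Return True when the state satisfies the rule."""

    @abstractmethod
    def get_description(self) -> str:
        """Return the message to report when the rule is violated."""


class WolfGoatRule(Rule):
    def evaluate(self, state: WolfGoatCabbageState) -> bool:
        # Violated when wolf and goat share a bank and the farmer is elsewhere
        return not (state.wolf == state.goat and state.farmer != state.goat)

    def get_description(self) -> str:
        return "Wolf cannot be left alone with goat"


everyone_left = WolfGoatCabbageState()
farmer_took_cabbage = WolfGoatCabbageState(farmer="right", cabbage="right")

print(WolfGoatRule().evaluate(everyone_left))        # True
print(WolfGoatRule().evaluate(farmer_took_cabbage))  # False
```

Because each rule is a small class with a single predicate, adding or removing a constraint does not require touching the search or the state model.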
So, to add a new one, say “A wolf cannot be left alone with a chicken”, we can add the rule:
class ChickenWolfRule(Rule):
    def evaluate(self, state: StateModel) -> bool:
        # Violated when chicken and wolf share a bank and the farmer is elsewhere
        return not (state.chicken == state.wolf and state.farmer != state.chicken)

    def get_description(self) -> str:
        return "Wolf cannot be left alone with chicken"
And then add that rule to the engine:
from typing import List, Tuple

class RulesEngine:
    def __init__(self):
        # Every rule registered here is checked on each validation pass
        self.rules: List[Rule] = [
            WolfGoatRule(),
            GoatCabbageRule(),
            ChickenWolfRule()
        ]

    def validate_state(self, state: StateModel) -> Tuple[bool, List[str]]:
        # Collect the description of every rule the state violates
        violations = []
        for rule in self.rules:
            if not rule.evaluate(state):
                violations.append(rule.get_description())
        return len(violations) == 0, violations
We then get the following output:
Initial state:
Left bank: Wolf, Goat, Cabbage, Farmer
Right bank: empty
Rule violations: Farmer can only carry one item at a time
Solution found in 0.0003 seconds
We can then adjust our code so that the chicken is part of the state, which gives the following output:
Initial state:
Left bank: Wolf, Goat, Cabbage, Chicken, Farmer
Right bank: empty
All rules satisfied
Searching for solution…
Debug: Invalid - Wolf would eat goat
Debug: Invalid - Goat would eat cabbage
Debug: Invalid - Wolf would eat chicken
Debug: Invalid - Wolf would eat goat
Debug: Invalid - Wolf would eat goat
Debug: No valid moves found!
Current state: wolf='left' goat='left' cabbage='left' chicken='left' farmer='left'
Attempted moves were invalid due to rule violations
Current state: Left bank: Wolf, Goat, Cabbage, Chicken, Farmer
Right bank: empty
Found 0 possible moves
No solution found!
As can be seen above, adding the ChickenWolfRule creates an impossible situation with no solution. This failure has value, because it shows us the limits of our data and our system. To create a solvable extension, we can add a further rule, “A farmer can carry two things at once”, as follows:
from typing import Optional

class CarryingCapacityRule(Rule):
    # Assumption: the caller passes the state before the move as `previous`;
    # the original snippet mentions a previous state but does not show it.
    def evaluate(self, state: StateModel, previous: Optional[StateModel] = None) -> bool:
        """Check that farmer isn't carrying more than two items at once"""
        # Without a previous state (e.g. the initial state) nothing has moved yet
        if previous is None:
            return True
        items = ['wolf', 'goat', 'cabbage', 'chicken']
        # Count the items that changed bank between the previous state and this one
        items_moved = 0
        for item in items:
            if getattr(state, item) != getattr(previous, item):
                items_moved += 1
        # Farmer can carry up to two items
        return items_moved <= 2

    def get_description(self) -> str:
        return "Farmer can carry up to two items at a time"
And now, we get the following solution:
Initial state:
Left bank: Wolf, Goat, Cabbage, Chicken, Farmer
Right bank: empty
All rules satisfied
Searching for solution...
Current state: Left bank: Wolf, Goat, Cabbage, Chicken, Farmer
Right bank: empty
Solution found in 0.0007 seconds:
Step 0:
Left bank: Wolf, Goat, Cabbage, Chicken, Farmer
Right bank: empty
All rules satisfied
Step 1:
Left bank: Goat, Chicken
Right bank: Wolf, Cabbage, Farmer
All rules satisfied
Step 2:
Left bank: Goat, Chicken, Farmer
Right bank: Wolf, Cabbage
All rules satisfied
Step 3:
Left bank: empty
Right bank: Wolf, Goat, Cabbage, Chicken, Farmer
All rules satisfied
It solves the puzzle! And further, it does it in only 3 moves, instead of the original 7.
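The three outcomes in this walkthrough (7 crossings, no solution, 3 crossings) can be cross-checked independently with a short breadth-first search. The sketch below is not our engine: `solve`, `is_safe` and the frozenset state encoding are assumptions of ours, and the forbidden pairs and carrying capacity are passed in as plain parameters rather than as `Rule` classes.

```python
from collections import deque
from itertools import combinations
from typing import FrozenSet, Iterable, List, Optional, Tuple

State = Tuple[FrozenSet[str], str]  # (items on the left bank, farmer's bank)


def is_safe(left: FrozenSet[str], farmer: str, items: FrozenSet[str],
            forbidden: List[FrozenSet[str]]) -> bool:
    """A state is safe if no forbidden pair sits together on the bank without the farmer."""
    unattended = left if farmer == "right" else items - left
    return not any(pair <= unattended for pair in forbidden)


def solve(item_names: Iterable[str], forbidden_pairs: Iterable[Tuple[str, str]],
          capacity: int = 1) -> Optional[List[State]]:
    """Breadth-first search; returns a shortest list of states, or None if unsolvable."""
    items = frozenset(item_names)
    forbidden = [frozenset(p) for p in forbidden_pairs]
    start: State = (items, "left")
    goal: State = (frozenset(), "right")
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        left, farmer = path[-1]
        if path[-1] == goal:
            return path
        here = left if farmer == "left" else items - left
        other = "right" if farmer == "left" else "left"
        # Try crossing with every cargo of size 0..capacity taken from the farmer's bank
        for size in range(capacity + 1):
            for cargo in combinations(sorted(here), size):
                moved = frozenset(cargo)
                new_left = left - moved if farmer == "left" else left | moved
                state: State = (new_left, other)
                if state not in seen and is_safe(new_left, other, items, forbidden):
                    seen.add(state)
                    queue.append(path + [state])
    return None


classic = solve(["wolf", "goat", "cabbage"], [("wolf", "goat"), ("goat", "cabbage")])
print(len(classic) - 1)  # 7

with_chicken = ["wolf", "goat", "cabbage", "chicken"]
pairs = [("wolf", "goat"), ("goat", "cabbage"), ("wolf", "chicken")]
print(solve(with_chicken, pairs, capacity=1))           # None
print(len(solve(with_chicken, pairs, capacity=2)) - 1)  # 3
```

Breadth-first search guarantees the shortest sequence of crossings, so the counts match the step counts above even if the particular moves it picks differ from the trace shown.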
To further enhance the neurosymbolic nature of this system, we also have LLMs play a role in debugging it. The LLM suggests new rules that would overcome any overly restrictive rule entering the system (e.g. when the chicken rule makes the puzzle unsolvable, the LLM suggests the amended farmer carrying capacity as an automated debugging step).
For those interested in implementing a similar reasoning and validation engine, the WhyHow team is available for a services engagement. Given the amount of customization that needs to go into contextualizing the initial data structure and reasoning framework, we are currently only engaging with clients on a services/consulting basis to help create validation and reasoning engines. This includes access to the UI platform created, which is used as an internal tool for ourselves and our clients.
WhyHow.AI provides tools, services and processes for Structured Knowledge, Knowledge Graphs and deterministic Agentic RAG solutions. If you are interested in exploring any of our open-source tools (KG Studio, Knowledge Table) and services, feel free to chat with us here.
If you’re thinking about, in the process of, or have already incorporated knowledge graphs in RAG for accuracy, memory and determinism, follow our newsletter at WhyHow.AI or join our discussions about rules, determinism and knowledge graphs in RAG on our Discord.