A Turing Machine simulator written in Pharo

I am studying Turing Machines in a course named “Calculability and Complexity” this year. I wondered how easy it would be to write a Turing Machine simulator in Pharo.

Image shamefully taken from http://www.felienne.com/archives/2974

All the code written here is available on Github under MIT licence. You can load the package we create here on your image using:

Metacello new
baseline: 'TuringMachine';
repository: 'github://juliendelplanque/turing-machine:article/repository';
load.
The “article” branch of this git repository is used to keep a version of the project coherent with this article.

Did you say Turing Machine?

First thing first, what is a Turing Machine? Basically, it is a mathematical model that can simulate any algorithm’s logic.

Church-Turing axiom: Turing Machine model the intuitive notion of algorithm.

A Turing Machine is made of:

  • An infinite tape which contains cells where a symbol can be read or written.
  • A head that points a cell of the tape and can move left or right.
  • A set of rules that define what the machine should do when a symbol is read according to the current state of the machine.

From these “objects” the Turing Machine decides if an input tape is accepted or rejected. If you want to know more about these machines, check out Wikipedia’s article. I will not explain more about it since the goal of this story is not to teach you what it is but how to simulate it using Pharo.

Image taken from http://pharo.org/

Designing a Turing Machine in Pharo

What objects do we need to simulate a Turing machine? Well, let’s think about it a little:

  • The tape: an ordered infinite set of cells that can be read or written. Also, by convention the first cell contains the symbol ‘ç’ and can’t be modified. The tape hold a “word” which is a sequence of symbols.
  • The head: can read/write the tape and move left, right or stay on its current position.
  • A state: has a name and defines what action(s) to do according to the symbol read by the head on the tape.
  • A transition: represents what actions have to be done when moving from a state to another.
  • The Turing Machine: holds a current state and determinate when a word on a tape is accepted or refused.

Implementing the Turing Machine

Now it’s time to implement the machine. First, we create a package named “TuringMachine”.

An infinite tape

We need to simulate an infinite tape, this will be the role of an object we will call TMTape.

Object subclass: #TMTape
instanceVariableNames: ‘currentCell dataList announcer index’
classVariableNames: ‘’
category: ‘TuringMachine-Core’

The instance variable dataList will hold the data structure chosen to store symbols (the symbols we are talking about here are Turing Machine’s symbols represented by Character objects in Pharo) and currentCell will hold the current cell pointed by the head. index will hold the index of the current cell and announcer an Announcer object to launch announcements according to what’s happening to the tape.

What data structure should we use to be able to simulate an infinite tape? Well, it should be possible to do it with any SequenceableCollection provided by Pharo but, we want to simulate this “infinite” size efficiently (of course, we are limited by the size of your computer RAM). So we will use a data structure for which we have a good control on the size, a DoubleLinkedList.

A double linked list from Wikipedia

Let’s override #initialize message:

initialize
super initialize.
dataList := DoubleLinkedList new.

By definition, a tape always starts with ‘ç’ symbol, so we add it directly in #initialize message. Also, we set the index instance variable at 1.

initialize
super initialize.
dataList := DoubleLinkedList new.
dataList addLast: $ç.
index := 1

Now, we need to initialize currentCell instance variable but we have a little problem here, DoubleLinkedList have no message defined to access the first “node” of the DoubleLinkedList (in Pharo, nodes of linked lists are represented by Link class and in our case, by DoubleLink class) it only has a message named #first to access the first data of the list.

The extension feature of Pharo is very useful here! In Pharo, it is possible to add messages to a class already defined in the system without modifying the class. This can be done by adding a protocol named *YourPackageName (the star is important!) to the class to extend. So in our case, we add *TuringMachine protocol to DoubleLinkedList class. In this protocol we can add messages that have access to instance variables of DoubleLinkedList object and that’s what we need! Messages added in this protocol will be associated with TuringMachine package but you will be able to send these messages to DoubleLinkedList instances.

Let’s create the message #firstLink in DoubleLinkedList class:

firstLink
self isEmpty
ifTrue: [ CollectionIsEmpty signalWith: self ].
^ head

This message returns the first DoubleLink object of the list.

Back to TMTape’s #initialize message, we can now initialize currentCell and the final #initialize message looks like:

initialize
super initialize.
dataList := DoubleLinkedList new.
dataList addLast: $ç.
currentCell := dataList firstLink.
index := 1

Nice, our tape is now initialized correctly. We add #currentData accessor and #currentData: and #dataList: mutators. Let’s implement #next and #previous messages to navigate on the tape.

For #currentData:, we announce the fact that the current data has been modified using TMCurrentCellValueChanged announcement (related to the GUI, discussed in another article).

currentData: symbol
currentCell value: symbol.
self announcer announce: (TMCurrentCellValueChanged newValue: symbol)

That’s in #next message that we manage the “infinite size simulation”. To simulate that, we will dynamically add nodes with space character data inside them when we reach the end of the double linked list.

next
(currentCell nextLink isNil)
ifTrue: [
dataList addLast: Character space.
self announcer announce: TMCellAdded new ].
currentCell := currentCell nextLink.
index := index + 1.
self announcer announce: (TMCurrentCellChanged cellIndex: index)

As you can see, we use TMCellAdded and TMCurrentCellChanged classes. I do not detail the implementation of these classes here since they are related to the GUI (discussed in another article).

For #previous message, we have to manage when the head tries to go before the first symbol ‘ç’. To manage this case, we subclass Error class to define a new error named TMNavigationError:

Error subclass: #TMNavigationError
instanceVariableNames: ‘’
classVariableNames: ‘’
category: ‘TuringMachine-Errors’

Here is the implementation of #previous message:

previous
(currentCell value = $ç)
ifTrue: [ TMNavigationError signal: ‘There is nothing before $ç on a tape.’ ].
currentCell := currentCell previousLink.
index := index - 1.
self announcer announce: (TMCurrentCellChanged cellIndex: index)

Finally, we create the class message #newFrom: to easily create instance of TMTape from a SequenceableCollection:

newFrom: anSequenceableCollection
^ self new
dataList: (anSequenceableCollection collect: [ :each |
each isCharacter
ifFalse: [ each asCharacterDigit ]
ifTrue: [ each ] ]);
yourself

And that’s all for TMTape.

The head

We have the tape and we need a head to navigate, read and write it. Let’s create a TMHead object:

Object subclass: #TMHead
instanceVariableNames: ‘tape’
classVariableNames: ‘’
category: ‘TuringMachine-Core’

As you can see there is an instance variable named tape that will hold the tape to work with.

To let the user get correctly initialized instances of TMHead object, we add a class message #on: that takes a TMTape as parameter:

on: aTape
^ self new
tape: aTape;
yourself

Now we need to implement navigation messages.

We add #left message that moves the head on the left case adjacent to the current case.

left
tape previous

We also add #L message that just sends #left to self. We do this to have a notation close to what is written on paper.

L
self left

And we do the same kind of things for #right-#R and #none-#N (of course #none and #N do nothing).

Finally, we need messages to read/write the tape:

#read message:

read
^ tape currentData

#write: message:

write: aSymbol
tape currentData: aSymbol

And that’s all for the head.

States

Our states will hold actions to do according to the symbol read on the tape. To implement this feature, we will create a TMState class that holds an instance variable actionDict containing a Dictionary. This dictionary will map symbols to transitions.

Object subclass: #TMState
instanceVariableNames: ‘stateName actionDict’
classVariableNames: ‘’
category: ‘TuringMachine-Core’

stateName instance variable will hold the name of the state.

#initialize message will looks like:

initialize
super initialize.
actionDict := Dictionary new.

Let’s add a class message to have an easy way to create instance of TMState:

named: aString
^ self new
stateName: aString;
yourself

Now we need to be able to add mappings to the dictionary. This can be done using #when:do: message that we define like this (TMTransition is defined later in this story):

when: aSymbol do: aTMTransition
actionDict at: aSymbol put: aTMTransition

Finally, we add a message that will perform the correct transition on a TMMachine (that we will also define later) according to the symbol read:

readSymbolPointedBy: aHead using: aMachine
(actionDict
at: aHead read
ifAbsent: [ (TMUnknownSymbolError
symbol: aHead read
state: aMachine currentState) signal)
applyTo: aMachine

As you can see, if the symbol read is not a key of the dictionary, a TMUnknownError is raised. This error is defined like this:

Error subclass: #TMUnknownSymbolError
instanceVariableNames: ‘symbol state’
classVariableNames: ‘’
category: ‘TuringMachine-Errors’

A Turing Machine contains two special states:

  • The accept state that is reached when the input tape is accepted.
  • The reject state that is reached when the input tape is refused.

To represent these states, we create two subclasses of TMState: TMAcceptState and TMRejectState. These two states re-define #stateName to always return ‘accept’ and ‘reject’ respectively and they both redefine #readSymbolPointedBy:using: to do nothing since these two special states are finals.

Transitions

We are coming closer and closer to a full simulation of Turing Machines. In this section, we are going to implement transitions between states. First of all, we create the TMTransition class:

Object subclass: #TMTransition
instanceVariableNames: ‘write moveHead moveTo’
classVariableNames: ‘’
category: ‘TuringMachine-Core’

write holds a Character (a Turing Machine’s symbol) to write, moveHead holds a Symbol (here we are talking about Pharo’s Symbol class) included in {#right, #left, #none, #R, #L, #N} and moveTo holds a reference to a TMState. There is an accessor for each instance variable.

To set these instance variables, we provide the message #write:moveHead:moveTo: simply set each instance variable:

write: aCharacter moveHead: aSymbol moveTo: aState
write := aCharacter.
moveHead := aSymbol.
moveTo := aState

Then, we add a message that applies the transition to the Turing Machine:

applyTo: aMachine
aMachine head write: self write.
aMachine head perform: self moveHead.
aMachine moveToState: self moveTo

Once again, to make instantiation easy, we create a class message:

write: aCharacter moveHead: aSymbol moveTo: aState
^ self new
write: aCharacter moveHead: aSymbol moveTo: aState;
yourself

The eagerly awaited Turing Machine

We have all the pieces needed to build the Turing Machine, what we will do here is to create the TMMachine object to glue them together. Looking at how we defined transitions using TMTransition object, our TMMachine needs to, at least, implement the #head and #moveToState: messages.

Let’s create our TMMachine class:

Object subclass: #TMMachine
instanceVariableNames: ‘initialState currentState head acceptState rejectState’
classVariableNames: ‘’
category: ‘TuringMachine-Core’

initialState obviously holds the initial TMState of the Turing Machine, currentState holds its current TMState and head holds the TMHead instance used. Each of these variable has its accessor and its mutator. acceptState and rejectState hold the two special states “accept” and “reject”.

We also add #tape message that returns the head’s tape:

tape
^ self head tape

We create #accept and #reject messages that return respectively an instance of TMAcceptState and an instance of TMRejectState:

accept
“Return the accept state.”

^ TMAcceptState new
reject
“Return the reject state.”

^ TMRejectState new

Then, we add #start message to start the machine:

start
“Start the Turing Machine.”

[ self
currentState readSymbolPointedBy: self head
using: self ] on: TMUnknownSymbolError
do: [ self currentState: self reject ]

When there is no transition for a symbol, we assume that there is a transition to the reject state (i.e. transitions to reject state are implicit). That’s what is done using TMUnknownSymbolError.

We also need a message to change the TMMachine’s state:

moveToState: aState
“Change the machine’s state and continue the execution.”

self currentState: aState.
self start

To have an easy way to reset the machine, we add a #reset message:

reset
“Reset the machine by putting it back to initialState.”
    self currentState: self initialState

Finally we add #onTape: message on both class and instance side.

On instance side, it sets the machine on the tape given as parameter and reset it:

onTape: aTape
“Set the machine on the tape in parameter and reset it.”

self head: (TMHead on: aTape).
self reset

On class side, it creates a new instance of a TMMachine set on the tape given as a parameter:

onTape: aTape
^ self new
head: (TMHead on: aTape);
yourself

Here we are! Our Turing Machine is fully implemented!

How to use it?

It’s time to try what we have built. Let’s create a simple TuringMachine that given an input tape, modifies it to add a space at the beginning. For example, ‘ç123’ became ‘ç 123’ after the processing.

How the example machine looks like

Here is the implementation using objects we defined (you can find this example in #example1 TMMachine’s class message):

| machine qinit q1 qEnd q0Read qWritten q1Read |
machine := TMMachine new.
qinit := TMState named: ‘qinit’.
q1 := TMState named: ‘q1’.
qEnd := TMState named: ‘q-end’.
q0Read := TMState named: ‘q-0-read’.
qWritten := TMState named: ‘q-written’.
q1Read := TMState named: ‘q-1-read’.
qinit when: $ç do: (TMTransition write: $ç moveHead: #R moveTo: q1).
q1 when: $0 do: (TMTransition write: $0 moveHead: #R moveTo: q1);
when: $1 do: (TMTransition write: $1 moveHead: #R moveTo: q1);
when: Character space do: (TMTransition write: Character space moveHead: #L moveTo: qEnd).
qEnd when: $0 do: (TMTransition write: Character space moveHead: #R moveTo: q0Read);
when: $1 do: (TMTransition write: Character space moveHead: #R moveTo: q1Read);
when: $ç do: (TMTransition write: $ç moveHead: #N moveTo: machine accept).
q0Read when: Character space do: (TMTransition write: $0 moveHead: #L moveTo: qWritten).
q1Read when: Character space do: (TMTransition write: $1 moveHead: #L moveTo: qWritten).
qWritten when: Character space do: (TMTransition write: Character space moveHead: #L moveTo: qEnd).
machine initialState: qinit.

Just copy the preceding code in a Workspace, select it, right click and select “Do it”. You can now test the machine using a TMTape created from a String:

machine onTape: (TMTape newFrom: ‘0011’).
machine currentState stateName. "qinit"
machine start.
machine result. "' 0011'"
machine currentState stateName. "qaccept"

Just select these lines, right click and select “Do it” or “Print it”. As you can see we have a fully operational Turing Machine simulator.

Simplify the syntax to add transitions

We are able to create and use TMMachine but for now, I think the syntax is a bit too heavy. We are writing Smalltalk here, adding transitions should look simpler than it is now.

Let’s add a message to TMState in order to simplify transitions adding:

when: aCharacter write: anotherCharacter moveHead: aSymbol andMoveTo: aState
self
when: aCharacter
do: (TMTransition
write: anotherCharacter
moveHead: aSymbol
moveTo: aState)

This new message allows us to rewrite the preceding example like this:

| machine qinit q1 qEnd q0Read qWritten q1Read |
machine := TMMachine new.
qinit := TMState named: ‘qinit’.
q1 := TMState named: ‘q1’.
qEnd := TMState named: ‘q-end’.
q0Read := TMState named: ‘q-0-read’.
qWritten := TMState named: ‘q-written’.
q1Read := TMState named: ‘q-1-read’.
qinit when: $ç write: $ç moveHead: #R andMoveTo: q1.
q1 when: $0 write: $0 moveHead: #R andMoveTo: q1;
when: $1 write: $1 moveHead: #R andMoveTo: q1;
when: Character space write: Character space moveHead: #L andMoveTo: qEnd.
qEnd when: $0 write: Character space moveHead: #R andMoveTo: q0Read;
when: $1 write: Character space moveHead: #R andMoveTo: q1Read;
when: $ç write: $ç moveHead: #N andMoveTo: machine accept.
q0Read when: Character space write: $0 moveHead: #L andMoveTo: qWritten.
q1Read when: Character space write: $1 moveHead: #L andMoveTo: qWritten.
qWritten when: Character space write: Character space moveHead: #L andMoveTo: qEnd.
machine initialState: qinit.

Close to english isn’t it?

One more step for the syntax to add transitions

Now we have the message #when#write:#moveHead:#andMoveTo: it is better than before, but one may want to have a syntax “transitions oriented”. That’s what will be done here.

First, let’s add an extension in Character class (*TuringMachine protocol).

, aSymbol
self assert: aSymbol isSymbol.
^ { self . aSymbol }

Like that, when #, message is sent to a Character object with a Symbol as parameter, an array which contains the Character and the Symbol is returned.

Now, let’s add #to:transition: message to TMState:

to: aState transition: anAssociation
self
when: anAssociation key
write: anAssociation value first
moveHead: anAssociation value second
andMoveTo: aState

This allow us to write things like:

qInit to: q0 transition: $ç -> ($ç,#R).
The #-> message creates an Association between two objects (defined in Object class).

But what if there is more than one transition from a state to another? To be able to add multiple transition using a single message, we add #to:transitions: message (TMState):

to: aState transitions: anArrayOfAssociations
anArrayOfAssociations do: [ :assoc |
self to: aState transition: assoc ]

Which allows us to write:

q1 to: q1 transitions: {
$0 -> ($1,#R).
$1 -> ($0,#R) }

Conclusion

We have been able to build a fully functional Turing Machine simulator in only a few minutes using Pharo.

Where to go next? Well, our simulator seems to work well but it would be great to have a GUI (yes, that’s why announcements are used in some methods!) to see what it looks like. This will be implemented and explained in another story…