Optimistic Commands Locking: Our Experience

Tanya K.
Published in reSolve blog
4 min read · Mar 6, 2018

My team is working on a CQRS + ES framework built on AWS serverless architecture and FaaS. The FaaS model has many advantages, such as low infrastructure requirements and excellent scalability. On the other hand, FaaS requires handling potential command concurrency, which can occur in any system when several users change the same data at the same time. In FaaS, concurrency issues should be solved at the development stage, because such an architecture can scale quickly and automatically, which lets FaaS applications handle high load peaks without UX issues. Martin Fowler describes two solutions to command concurrency, which he calls the optimistic and pessimistic offline locks. We needed to decide which approach to use.

Why not Pessimistic Locking?

First, we discussed the advantages and disadvantages of both solutions and decided to try the pessimistic approach. The pessimistic offline lock prevents conflicts by allowing only one transaction at a time to access the data. Martin Fowler illustrates this with a diagram:

We chose Amazon ElastiCache for Redis for the pessimistic approach and ended up with code like this:

const redis = require("redis")
const { promisify } = require("util")

const client = redis.createClient()
// set and del are promisified versions of client.set and client.del
const set = promisify(client.set).bind(client)
const del = promisify(client.del).bind(client)

const acquireLock = async lockId => {
  // NX: set the key only if it does not exist; PX: expire it after TIMEOUT ms
  const ret = await set(lockId, 1, "PX", TIMEOUT, "NX")
  if (ret !== "OK") {
    // the lock is held by another transaction - retry until it is released
    await acquireLock(lockId)
  }
}

const releaseLock = async lockId => {
  await del(lockId)
}

await acquireLock(lockId)
try {
  await exec()
} finally {
  await releaseLock(lockId)
}

In this code, the exec() function writes data to the database and returns a promise that is fulfilled once it finishes (you can also use the redis-lock package instead of this code).

In the end, the pessimistic approach did not suit our needs: an aggregate's transactions can stay locked because of an unstable Internet connection, causing the system to become unresponsive.

Optimistic Locking with DynamoDB

After trying the pessimistic approach, we decided that the optimistic one better suits our needs. It requires no auxiliary Redis code, and we can handle failed database requests to resolve transaction concurrency problems. The optimistic offline lock prevents conflicts by detecting concurrency and rolling back the transaction.

We planned to implement this principle using the aggregate's event versioning. A new event gets a version number higher than that of the most recent stored event. The version is checked before the event is added to the store: the event is stored if its version is higher than the stored one and rejected if it is the same.
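The version check above can be sketched with a minimal in-memory store (the `store` map, the `appendEvent` name, and the plain `Error` are our illustration, not the framework's API):

```javascript
const store = new Map() // aggregateId -> ordered list of stored events

const appendEvent = event => {
  const events = store.get(event.aggregateId) || []
  const lastVersion =
    events.length > 0 ? events[events.length - 1].aggregateVersion : 0
  if (event.aggregateVersion <= lastVersion) {
    // a concurrent command already saved an event with this version
    throw new Error("OptimisticLockError")
  }
  events.push(event)
  store.set(event.aggregateId, events)
}
```

Of two commands racing on the same aggregate, only the first append with a given version succeeds; the second throws and can be retried against the updated aggregate.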

We chose a DynamoDB table as the event store for the described algorithm. The tuple (aggregateId, aggregateVersion) uniquely identifies an event, so we can make these two fields the table's primary key:

Resources:
  EventsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: aggregateId
          AttributeType: S
        - AttributeName: aggregateVersion
          AttributeType: N
        # - Other attributes
      KeySchema:
        - AttributeName: aggregateId
          KeyType: HASH
        - AttributeName: aggregateVersion
          KeyType: RANGE

The PutItem method adds items to a table, but by itself it overwrites an existing item with the same key. We used the ConditionExpression parameter to detect event concurrency: it checks whether an item with the same primary key already exists and prevents overwriting it. Since every event contains the aggregateId attribute, a condition like attribute_not_exists(aggregateId) is enough; with a composite key, the condition is evaluated against the item with the same aggregateId and aggregateVersion, so it fails exactly when that version is already stored.

const aws = require('aws-sdk')

const documentClient = new aws.DynamoDB.DocumentClient({
  accessKeyId,
  secretAccessKey,
  region
})

const saveEvent = async event => {
  try {
    await documentClient.put({
      TableName: tableName,
      Item: event,
      // reject the write if an event with this primary key already exists
      ConditionExpression: 'attribute_not_exists(aggregateId)'
    }).promise()
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') {
      // a concurrent command has already stored an event with this version
      throw new OptimisticLockError({
        aggregateId: event.aggregateId,
        aggregateVersion: event.aggregateVersion
      })
    }
    throw err
  }
}

Command concurrency is handled by recreating the aggregate and re-executing the command. We used Lambda, whose limited execution time naturally bounds the retries, so we could repeat attempts to apply failed events without limiting their count explicitly.

const executeCommand = async command => {
  // load and apply events, process command, create new event
  try {
    await saveEvent(event)
  } catch (err) {
    if (err instanceof OptimisticLockError) {
      // the aggregate changed under us - rebuild it and retry the command
      await executeCommand(command)
    } else {
      throw err
    }
  }
}
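If you are not running under a time-boxed environment like Lambda, you may prefer an explicit attempt limit. Here is a hypothetical bounded-retry wrapper (the `executeWithRetries` name, the `maxAttempts` parameter, and this `OptimisticLockError` class are our illustration, not the framework's API):

```javascript
class OptimisticLockError extends Error {}

// Retry a command executor a bounded number of times on version conflicts;
// any other error is rethrown immediately
const executeWithRetries = async (command, execute, maxAttempts = 5) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await execute(command)
    } catch (err) {
      if (!(err instanceof OptimisticLockError) || attempt === maxAttempts) {
        throw err
      }
      // a concurrent command won the race: rebuild the aggregate and retry
    }
  }
}
```

The executor is expected to rebuild the aggregate from the event store on each call, so every retry sees the events written by the competing command.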

Conclusion

We hope our experience helps you understand how to reSolve event concurrency issues and implement optimistic locking with DynamoDB. Feel free to comment and reach out to us on Twitter or Facebook. You can also learn more about our team and the product on GitHub.
