Prevent concurrent exception in Ohm/Redic
On one of the projects we catch a few exceptions in a short period of time which pointed to Ohm::Model
objects with the message RuntimeError: not connected
. Ohm is a gem that provides objects mapping for Redis and the issue in the first place was fixed with some tweaks around Redis connections, but the reason for the issue was not clear. Why connection to Redis was not established? Or maybe the connection was lost due to some reasons? So it was decided to clarify where such exception comes from and why.
Ohm by default use Redic adapter for connecting to Redis. In Redic client implementation lib/redic/client.rb:
def connect
establish_connection unless connected? @semaphore.synchronize do
yield
end
rescue Errno::ECONNRESET
@connection = false
retry
enddef establish_connection
begin
@connection = Redic::Connection.new(@uri, @timeout)
rescue StandardError => err
raise err, “Can’t connect to: %s” % @uri
end # ...
end
we could see that StandardError
exception during creation of connection instance is handled and exception message raised in result is different from what we’ve got, at the same time we handle Errno::ECONNRESET
exception which expects to re-establish connection via retry
if it to was lost. So, there is no reason why we could have RuntimeError: not connected
, to clarify it we need dive dipper into implementation of Redic::Connection
which simply wraps Hiredis::Connection
from hredis-rb gem.
Moving to Hiredis we could find connection implementation which by default lead us to Hiredis::Ext::Connection
which wraps C
implementation of connection used for better performance and maps EOFError
to already spotted in Redic exception Errno::ECONNRESET
module Hiredis
module Ext
class Connection
def read
_read
rescue ::EOFError
raise Errno::ECONNRESET
end
end
end
end
moving down the line to C-extension we could find definition of read
method https://github.com/redis/hiredis-rb/blob/master/ext/hiredis_ext/connection.c#L459
static VALUE connection_read(VALUE self) {
// ....
if (__get_reply(pc,&reply) == -1)
parent_context_raise(pc);
which means that if a connection is lost we invokeparent_context_raise(pc)
which expects that connection will be nullified and raised `EOFError` exception https://github.com/redis/hiredis-rb/blob/master/ext/hiredis_ext/connection.c#L61
static void parent_context_raise(redisParentContext *pc) {
// ...
parent_context_try_free(pc);
// ...
case REDIS_ERR_EOF:
/* Raise native Ruby EOFError */
rb_raise(rb_eEOFError,"%s",errstr);
as we already have seen EOFError
later is transformed to Errno::ECONRESET
and expect to be handled by Redic via creating a new connection, but if we will try to send data to such broken connection RuntimeError: not connected
will be raised, so why the connection was not re-established but instead we send data to bad one? It happens simply because Redic establish_connection
was involved outside of @semaphore.synchronize
block which expects to sync all execution, as a result, while Errno::ECONNRESET
was handled by one thread, others were trying to send data to old broken one and resulted in exception.
As conclusion, the issue was spotted and synchronization was fixed in PR to Redic https://github.com/amakawa/redic/pull/15