Ruby, I just blew your buffer.

John Allison
2 min read · Jun 4, 2013

Using Linux (I’m on Ubuntu)? Let’s make Ruby overflow its buffer:

$ irb
irb(main):001:0> require 'redis'
irb(main):002:0> (1..1025).inject([]) do |connections, i|
irb(main):003:1*   connections << Redis.new.tap do |conn|
irb(main):004:2*     conn.info
irb(main):005:2>   end
irb(main):006:1> end

Run it on a Linux system and your irb session will most likely crash with a buffer overflow error like:

*** buffer overflow detected ***: irb terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f92dd27082c]
/lib/x86_64-linux-gnu/libc.so.6(+0x109700)[0x7f92dd26f700]
/lib/x86_64-linux-gnu/libc.so.6(+0x10a7be)[0x7f92dd2707be]

Wait, what did that code just do?

All the code does is attempt to open 1,025 connections to a local Redis instance. More on why you may want to do that below, but let’s figure out why that crashes Ruby first.

Each connection to Redis opens a socket connection. Each socket connection creates a file descriptor. Is there a limit on how many file descriptors a process can create?

Yep, and by default it turns out to be 1,024. Great! Let’s increase that limit!

ulimit -n 2000
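
If you want to double check what limit the Ruby process itself sees, here's a minimal sketch using Process.getrlimit from the standard library. Keep in mind that ulimit only affects the current shell and whatever you launch from it, so run this in an irb started from that same shell. The variable names are just mine for illustration.

```ruby
# Ask the kernel for this process's open-file limits (RLIMIT_NOFILE).
# The soft limit is the one actually enforced; the hard limit is the
# ceiling the soft limit can be raised to without root.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

puts "soft limit: #{soft_limit}"
puts "hard limit: #{hard_limit}"
```

After the ulimit call above, the soft limit should come back as 2000.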

Now we should be able to create 2,000 file descriptors. Let’s try it again:

*** buffer overflow detected ***: irb terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f92dd27082c]
/lib/x86_64-linux-gnu/libc.so.6(+0x109700)[0x7f92dd26f700]
/lib/x86_64-linux-gnu/libc.so.6(+0x10a7be)[0x7f92dd2707be]

Hmm…

After some beard pulling, it turns out the redis-rb gem uses IO.select when opening socket connections. IO.select uses the underlying select system call, which tracks descriptors in a fixed-size set (fd_set) with room for only 1,024 of them (FD_SETSIZE) on many versions of Linux. Any file descriptor numbered 1,024 or higher simply doesn't fit, regardless of how many file descriptors you allow at the operating system level.
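
To make that concrete, here's a toy model of the fixed-size set in plain Ruby. This is not how glibc or Ruby actually implement it (the real thing is a bitmask in C, and the class name FdSetModel is something I made up); it's only a sketch of why descriptor number 1,024 has nowhere to go.

```ruby
# Toy model of select(2)'s fd_set: a fixed set with FD_SETSIZE slots,
# one per file descriptor number. Hypothetical class, for illustration only.
class FdSetModel
  FD_SETSIZE = 1024 # the value glibc uses on Linux

  def initialize
    @bits = Array.new(FD_SETSIZE, false)
  end

  # Mark a descriptor as "watched". Valid descriptor numbers are
  # 0..FD_SETSIZE-1; anything else would write past the end of the set.
  def set(fd)
    unless (0...FD_SETSIZE).cover?(fd)
      raise RangeError, "fd #{fd} does not fit in an fd_set of #{FD_SETSIZE} slots"
    end
    @bits[fd] = true
  end

  def set?(fd)
    (0...FD_SETSIZE).cover?(fd) && @bits[fd]
  end
end

fds = FdSetModel.new
fds.set(1023) # the last descriptor number that fits

begin
  fds.set(1024) # one past the end
rescue RangeError => e
  puts e.message
end
```

In C there is no friendly RangeError. Writing past the end of the real fd_set is exactly the kind of thing the fortify check in the backtrace above catches, and it kills the process instead.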

Bummer…

One alternative to select is to use poll. The poll system call takes an array of descriptors instead of a fixed-size set, so it has no built-in cap beyond the process's file descriptor limit, and you can spin up as many sockets as you’d like! Unfortunately, Ruby doesn’t implement an IO.poll call, so we’re out of luck unless we want to drop down to writing a C extension and manage all that stuff ourselves.

A few projects are starting to support poll in Ruby. EventMachine now supports it, as does Celluloid::IO through its nio4r gem.

There’s also a random io-poll gem, which adds an IO.select_using_poll method…

Even if there were a good option for using poll, we’d still need to fork any gems which use IO.select (like our redis gem above) in order to replace their select calls with poll.

That kinda sucks, but why do you need more than 1,024 connections anyway?

We’re using Sidekiq, a multi-threaded message processor, heavily for all of our backend processing and real-time segmentation at Customer.io. Generally, you want to have a pool of connections for Sidekiq, so the multi-threaded workers can all connect freely to your datastores without any contention.

Over the last few days, we began sharding some of our Redis stores using Redis::Distributed to consistently hash keys throughout our new shards.

Each instance of Redis::Distributed holds a connection open for each shard.

So, if you throw a bunch of them into a connection pool, your total number of connections grows pretty quickly. In our case:

60 (# of connections in pool) * 32 (# of redis shards) = 1,920

Boom.
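
If you want to run the numbers for your own setup, here's the same arithmetic as a quick sketch. The variable names are mine, and 1,024 is the select limit discussed above.

```ruby
# Back-of-the-envelope check: does the pool fit under select's limit?
select_limit = 1024 # FD_SETSIZE on many Linux systems
pool_size    = 60   # connections in the pool
shard_count  = 32   # Redis shards, one open socket each per pooled connection

total_connections = pool_size * shard_count
overshoot = total_connections - select_limit

puts "#{total_connections} connections, #{overshoot} more than select can track"
```

And that's before counting the descriptors the process already has open for everything else.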

John Allison

Software Developer, Golfer, Arkansas Razorback fan, founder of http://customer.io. You can find me on the twitters: @jrallison