Debugging Sidekiq Poison Pills

That one time a memory leak almost took down one of our apps — and how we fixed it

Brittney Johnson
Feb 22 · 6 min read

Uh-oh, there’s a problem

How we stopped the problem

queue = Sidekiq::Queue.new("maintenance")
queue.each do |job|
  job.delete if job.klass == 'Recurring::DisasterJob'
end

This diagram glosses over the storage details of the reliable queue, but it is a good comparison of the enqueuing and work strategies.

# we ran this five times: maintenance_0 through 4
queue = Sidekiq::Queue.new("maintenance_4")
queue.each do |job|
  job.delete if job.klass == 'Recurring::DisasterJob'
end
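
Running the same snippet by hand for each of the five queues is easy to get wrong under pressure. The loop below is a minimal sketch of one way to cover all of them in a single pass. It is not what we ran at the time. `purge_jobs`, `queue_for`, and `SHARDED_QUEUES` are hypothetical names introduced for illustration, and the sketch assumes each job object responds to `klass` and `delete`, as the jobs yielded by `Sidekiq::Queue#each` do.

```ruby
# Sketch only: `purge_jobs` and `queue_for` are hypothetical names, not Sidekiq APIs.
# `queue_for` is any callable that takes a queue name and returns an enumerable
# of jobs responding to `klass` and `delete`.
def purge_jobs(queue_names, job_class, queue_for:)
  deleted = Hash.new(0)
  queue_names.each do |name|
    queue_for.call(name).each do |job|
      next unless job.klass == job_class

      job.delete
      deleted[name] += 1
    end
  end
  deleted # queue name => number of jobs removed
end

# The five queue names from the snippet above: maintenance_0 through maintenance_4.
SHARDED_QUEUES = (0..4).map { |i| "maintenance_#{i}" }

# Possible usage from a console (assumes `require 'sidekiq/api'` has been run):
#   purge_jobs(SHARDED_QUEUES, 'Recurring::DisasterJob',
#              queue_for: ->(name) { Sidekiq::Queue.new(name) })
```

Passing the queue lookup in as a callable keeps the helper easy to try against plain Ruby objects before pointing it at a live Redis.
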
all_keys = []
$redis.scan_each(match: '*') do |key|
  all_keys << key
end

all_keys
  .select { |k| k.include?('queue') }
  .map { |k| [k, ($redis.llen(k) rescue "another data type for key")] }
  .sort_by { |arr| arr[1].to_i }
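
The inline `rescue` above is there because `LLEN` only works on list keys. A variation, sketched below under assumptions, asks Redis for each key's type first and lets the `SCAN` pattern do the filtering. `queue_lengths` is a hypothetical helper, and `redis` is assumed to behave like a redis-rb client (`scan_each`, `type`, `llen`). Unlike the snippet above, it leaves non-list keys out of the result instead of labeling them.

```ruby
# Sketch only: `queue_lengths` is a hypothetical helper, not part of redis-rb or Sidekiq.
# `redis` is assumed to respond to scan_each, type, and llen like a redis-rb client.
def queue_lengths(redis, pattern: '*queue*')
  lengths = {}
  redis.scan_each(match: pattern) do |key|
    # LLEN is only valid for lists, so skip sets, hashes, and other types.
    next unless redis.type(key) == 'list'

    lengths[key] = redis.llen(key)
  end
  # Array of [key, length] pairs, shortest list first.
  lengths.sort_by { |_key, length| length }
end

# Possible usage from a console: queue_lengths($redis)
```
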

Should we prevent this from happening again?


Gusto Engineering

Reengineering Payroll, Benefits, and HR for modern business. Hiring empathetic engineers in San Francisco, Denver and NYC! https://gusto.com/about/careers

Written by Brittney Johnson, software engineer @ Gusto
