Video of this talk available on Youtube, with thanks to Frank Rietta
See pt. I for background discussion
Okay. Worker platform: selected. Let’s do this.
Listen to the Tests
See commit cbc86ea
Aliveness of TDD notwithstanding, our current tests are exposing some problems with that open
call …
- It’s actually opening the cat gifs (which in fairness is pretty cute, but can clutter up your tabs quickly)
- It is not testing that the
open
command is run, because is can’t as currently structured
Being hard to test can be a symptom of bad abstraction, and I think it is in this case. I’d prefer to wrap our shell interactions in an object that we can inspect and modify as needed (e.g. if we decide to support a platform without open
). Here’s an approach at that, taking a bit of a nod from ActionMailer::Base#deliveries
-
class Shell
attr_reader :history
def initialize limit: 100
@history, @limit = [], limit
end
# Runs the command, maintaining a rolling window
# of recently executed commands
def run cmd
record cmd
system cmd unless Rails.env.test?
end
def record cmd
@history.shift if @history.length == @limit
@history << cmd
end
end
After adding an initializer so that we can access it, we can use the shell as follows:
class CatRequest < ActiveRecord::Base
…
def fulfill
Air.shell.run "sleep 5" # Pretend this is harder than it really is
update_attributes cat: Cat.choose
Air.shell.run "open #{cat.download_path} -a 'Google Chrome'"
end
end
and update the tests to make sure we’re executing the right system call:
expect( Air.shell.history.last ).to match /open.*#{req.cat.download_path}/
Bonus: since we don’t actually run the commands, our tests no longer have to do the five second wait thing either. Win!
Plumbing
See commits 7f18f3d and 0b6bb4a
We ended up going with Sidekiq, but your needs may differ; take a look at pt. I if you’re wondering what’s right for you. Setup was anticlimactically easy:
- Add
sidekiq
andsinatra
(forSidekiq::Web
) to the Gemfile - Install redis (
brew install redis
in my case) - Mount
Sidekiq::Web
under the/sidekiq
route like so
And then we just need to add a worker to do our fulfillment:
class CatRequestWorker
include Sidekiq::Worker
def perform id
cat_request = CatRequest.find id
cat_request.fulfill
end
end
and call it instead of doing the fulfillment inline:
def cat_me!
req = cat_requests.create!
# - req.fulfill
CatRequestWorker.perform_async req.id
end
Note that we’re passing off the id of the Request that we want to fulfill, not the Request itself. You should always pass JSON-able arguments into your jobs, and look up current state when the job actually starts to run. Trying to pass e.g. an ActiveRecord in could cause problems when trying to (de)serialize through Redis, and the record could be out of date by the time the worker actually runs.
Good news! We’re done. Fire up Sidekiq (with bundle exec sidekiq
) and try it out - you should be able to hammer the button repeatedly and (after a short wait), get a whole pounce of cat gifs.
Red ⇒ Green
See commit: 7d34a0e
Bad news! We broke the specs. Running manually seems to work though … what’s going on? Let’s take a closer look at the relevant test:
click_on "Cat Me"
req = @user.cat_requests.last
expect( req.cat ).to be_present
Hoist by our own petard! We introduced background workers so that our code wouldn’t have to wait around while we wrangled up some pictures, but now our tests aren’t waiting for the things they should be testing! We could do something to introduce a delay here (even prying for long enough), but it turns out this is a common problem that most worker gems have tools for. In Sidekiq’s case it’s:
require 'sidekiq/testing'
Sidekiq::Testing.inline!
As you can see, when running inline!
d, Sidekiq simply fakes the Redis round-trip by dumping and loading though JSON and then calling perform directly, without sparking off a new thread.
Coping with Failure
See commit: 1223764
Especially when dealing with external services, we have to allow for the possibility of a worker failing. Different systems do this somewhat differently, but Sidekiq will automatically retry failed jobs, as you can watch in the admin area by unleashing the chaos monkey on your worker:
raise "NOPE" if rand < 0.9
This is convenient, but has some ramifications on designing your workers: if a job dies somewhere in the middle, it’ll retry from the start, so we need to make our workers as robust as possible. In our case, I see two main defensive changes to make -
Since jobs may be run multiple times, we should make sure we never fulfill a request more than once:
def fulfill!
return if cat_id.present?
…
end
If we add logic for canceling a request by deleting it before it gets fulfilled, the worker’s lookup will fail and keep retrying. We should change it to something like:
def perform id
cat_request = CatRequest.where(id: id).first
cat_request.fulfill! if cat_request
end
There. Bulletproof. Now …