Building a VCR Clone in 28 Lines of Julia

The Ruby gem VCR is a tool that allows you to “[r]ecord your test suite’s HTTP interactions and replay them during future test runs for fast, deterministic, accurate tests”. It’s really useful for testing things like web API wrappers. It also contains 3,000 lines of code. Let’s use Cassette to build it in Julia with less than 1% of the code!

The first example in VCR’s README looks like this:

VCR.use_cassette("synopsis") do
  response = Net::HTTP.get_response(URI('http://www.iana.org/domains/reserved'))
  assert_match /Example domains/, response.body
end

We can make ours look very similar, since both Julia and Ruby have do-block syntax:

playback("synopsis.bson") do
    response = HTTP.get("http://www.iana.org/domains/reserved")
    @test match(r"Example domains", String(response.body)) !== nothing
end
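If do-blocks are new to you: in Julia, the block is simply passed as the first argument of the call. Here's a minimal sketch showing that the two spellings are equivalent. Note that with_label is a made-up helper for illustration, not part of our package:

```julia
# Hypothetical helper: takes a function first, then a label.
with_label(f, label) = "$label: $(f())"

# do-block form: the block becomes the first argument `f`.
a = with_label("synopsis") do
    1 + 1
end

# Equivalent explicit form with an anonymous function.
b = with_label(() -> 1 + 1, "synopsis")
```

This is why our entrypoint will take the function first and the file path second.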

Modes of Operation

VCR, at its core, records your HTTP requests and plays them back to you. More specifically:

  • If the specified data file exists, use it as the source of HTTP responses
  • If the file does not exist, make the HTTP requests and store the responses in the file

This gives us two modes of operation: recording mode and playback mode.

Recording Mode

To record HTTP responses, we need to hook into HTTP requests, adding a side effect after we’ve obtained a final response. This is where Cassette comes in! I’m not going to go over Cassette’s execution model in detail in this post, so if you’re unfamiliar, I’d definitely recommend reading this page to start, and checking out this one too if you have a few extra minutes.

First of all, we’ll need a recording mode context:

Cassette.@context RecordingCtx

Next, we’ll implement Cassette.posthook for our context, but before we can do that, we need to dive into HTTP’s internals a little bit. HTTP makes requests with the HTTP.request function; there are shortcuts for each HTTP verb, like HTTP.get and HTTP.post, but they themselves just call HTTP.request.

A first attempt at a posthook implementation might look like this:

Cassette.posthook(ctx::RecordingCtx, resp, ::typeof(HTTP.request), args...) =
    push!(ctx.metadata, resp)

Let’s try it out (see footnote 1):

julia> ctx = RecordingCtx(; metadata=[]);

julia> Cassette.overdub(ctx, () -> HTTP.get("https://httpbin.org/anything"));

julia> length(ctx.metadata)
6

What happened here? We only made one request, but we got 6 responses!

What we’ve failed to acknowledge is that HTTP’s request mechanism operates recursively on a stack of layers. There’s a nice diagram here (try the raw view if your font doesn’t render it nicely). Basically the pipeline of request(args) looks something like this:

a = request(FirstLayer, args)
b = request(SecondLayer, a)
c = request(ThirdLayer, b)
response = request(LastLayer, c)

What goes into each request call isn’t as important as the fact that we’re calling request a bunch of times for one HTTP round-trip. Because our posthook method accepts any set of arguments, it’s going to be called as many times as request is called, so we’ll end up with a bunch of extra results that we don’t want. What we really need is to figure out what the last layer type is, and specialize our posthook method on a first argument of that type. We can figure out exactly what the stack looks like by running HTTP.stack():

julia> HTTP.stack()
RedirectLayer{BasicAuthLayer{MessageLayer{RetryLayer{ExceptionLayer{ConnectionPoolLayer{StreamLayer{Union{}}}}}}}}

This looks a bit odd, but we can read it as a list from left to right. The last layer always happens to be the empty Union, Union{}, so let’s override our old posthook method with a no-op and write a new one that only matches that final call:

Cassette.posthook(ctx::RecordingCtx, resp, ::typeof(HTTP.request), args...) = nothing
Cassette.posthook(ctx::RecordingCtx, resp, ::typeof(HTTP.request), ::Type{Union{}}, args...) =
    push!(ctx.metadata, resp)
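To see why specializing on Type{Union{}} singles out exactly one call, here's a toy model of a layer stack in plain Julia. This is only a sketch: OuterLayer, InnerLayer, traced and hook are hypothetical names standing in for HTTP's layers and Cassette's posthook, and nothing here touches the network.

```julia
# Toy stand-ins for HTTP's layer types (hypothetical names).
abstract type Layer{Next} end
struct OuterLayer{Next} <: Layer{Next} end
struct InnerLayer{Next} <: Layer{Next} end

const seen = Any[]          # every layer that `request` was called with
const recorded = String[]   # responses captured by the specialized hook

# Catch-all hook: do nothing for intermediate layers.
hook(::Type, resp) = nothing
# Specialized hook: record only when the layer is the empty Union.
hook(::Type{Union{}}, resp) = push!(recorded, resp)

# Call `request` for a layer, then run the hook on its result,
# roughly the way a posthook runs after the original call returns.
function traced(layer, url)
    push!(seen, layer)
    resp = request(layer, url)
    hook(layer, resp)
    return resp
end

# Each layer hands off to the next one named in its type parameter.
request(::Type{OuterLayer{Next}}, url) where {Next} = traced(Next, url)
request(::Type{InnerLayer{Next}}, url) where {Next} = traced(Next, url)
# Bottom of the stack: produce the final response.
request(::Type{Union{}}, url) = "response for $url"

const ToyStack = OuterLayer{InnerLayer{Union{}}}
traced(ToyStack, "http://example.com")

length(seen)      # 3: request ran once per layer, plus the bottom
length(recorded)  # 1: only the Union{} call was recorded
```

The two real posthook methods above follow the same pattern: a catch-all that does nothing, and a more specific method that wins only for the final call.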

And let’s run our test again:

julia> ctx = RecordingCtx(; metadata=[]);

julia> Cassette.overdub(ctx, () -> HTTP.get("https://httpbin.org/anything"));

julia> length(ctx.metadata)
1

julia> typeof(ctx.metadata[1])
HTTP.Messages.Response

Much better! The last thing we need is to save the data to disk. We’ll write a function for this that we’ll call later:

function after(ctx::RecordingCtx, path)
    mkpath(dirname(path))
    BSON.bson(path; responses=ctx.metadata)
end

Playback Mode

Now that we’ve recorded some HTTP responses, we need to implement playback mode to use them. This turns out to be really easy. Recording mode was a bit hairy because we had to apply our posthook at the very end of the pipeline, but for playback, we want to intercept the call to HTTP.request as early as possible. So our method should accept any arguments, and it will immediately return a stored response instead of recursing. We’ll also implement the after function like we did above, except this one doesn’t need to do anything:

Cassette.@context PlaybackCtx
Cassette.overdub(ctx::PlaybackCtx, ::typeof(HTTP.request), args...) =
    popfirst!(ctx.metadata)
after(::PlaybackCtx, path) = nothing

We can see that this works easily enough:

julia> ctx = PlaybackCtx(; metadata=["this should really be a response"]);

julia> Cassette.overdub(ctx, () -> HTTP.get("https://httpbin.org/anything"))
"this should really be a response"

julia> isempty(ctx.metadata)
true
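One caveat worth knowing: if the code under test makes more requests than were recorded, popfirst! throws an ArgumentError on the empty array, which isn't the most helpful message. A slightly friendlier variant might look like the sketch below. Here next_response! is a hypothetical helper, not part of the 28 lines:

```julia
# Hypothetical helper: pop the next stored response, or fail with a clearer message.
function next_response!(queue::AbstractVector)
    isempty(queue) && error("no recorded responses left; delete the data file to record again")
    return popfirst!(queue)
end

queue = Any["first", "second"]
r1 = next_response!(queue)  # "first"
r2 = next_response!(queue)  # "second"
# A third call would throw an ErrorException with the message above.
```

The overdub method for PlaybackCtx could call a helper like this instead of popfirst! directly.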

Putting It Together

Now that we have our two modes working, we need to combine them with the playback function (our user API entrypoint). To figure out what mode we should operate in, we can check if the data file exists or not. If it does, we’re in playback mode, and if not, we’re in recording mode.

function playback(f, path)
    ctx = if isfile(path)
        data = BSON.load(path)
        PlaybackCtx(; metadata=data[:responses])
    else
        RecordingCtx(; metadata=[])
    end
    
    result = Cassette.overdub(ctx, f)
    after(ctx, path)
    return result
end

Let’s run it:

julia> isfile("test.bson")
false

julia> @time playback(() -> HTTP.get("https://httpbin.org/delay/5"), "test.bson");
  5.403699 seconds (51.95 k allocations: 2.944 MiB)

julia> isfile("test.bson")
true

julia> @time playback(() -> HTTP.get("https://httpbin.org/delay/5"), "test.bson");
  0.015231 seconds (35.24 k allocations: 1.831 MiB)

The first call to playback runs in recording mode, and takes a bit more than 5 seconds, as per the delay we specified. The second one runs in playback mode, takes just 15 milliseconds, and can run offline!
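If you'd like to poke at the record-or-replay logic without a network connection or Cassette, here's a rough offline sketch of the same idea. It makes a few assumptions: it uses the Serialization standard library instead of BSON, fake_fetch is a hypothetical stand-in for a slow HTTP call, and the fetching function is passed in explicitly rather than intercepted.

```julia
using Serialization: serialize, deserialize

# Hypothetical stand-in for a slow HTTP call; counts how often it really runs.
const live_calls = Ref(0)
function fake_fetch(url)
    live_calls[] += 1
    return "body of $url"
end

# Record-or-replay, mirroring the shape of `playback` above.
function toy_playback(f, path)
    if isfile(path)
        # Playback mode: serve stored responses in order.
        queue = deserialize(path)
        return f(url -> popfirst!(queue))
    else
        # Recording mode: call through, remember each response, then save.
        recorded = String[]
        result = f(url -> (r = fake_fetch(url); push!(recorded, r); r))
        mkpath(dirname(path))
        serialize(path, recorded)
        return result
    end
end

path = joinpath(mktempdir(), "cassettes", "test.jls")
first_run  = toy_playback(get_url -> get_url("http://example.com"), path)
second_run = toy_playback(get_url -> get_url("http://example.com"), path)
```

Both runs return the same body, but fake_fetch only runs during the first one; the second is served entirely from the file.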

Summary

If you haven’t already guessed, the title of this post is very much clickbait. VCR has a ton of features beyond its core capabilities, and we barely even covered the basics. This post exists more to showcase the utility of Cassette than to discredit VCR in any way.

On that note, I hope that this post helps show just how powerful Cassette can be! It really is one of my favourite Julia packages.

I wrote this code live on Twitch, and it was a lot of fun! Feel free to check out my channel for more stuff like this in the future.

Here’s the code we wrote in its entirety (just 28 lines according to Tokei):

module VCR

export playback

using BSON: BSON
using Cassette: Cassette
using HTTP: HTTP

Cassette.@context RecordingCtx
Cassette.posthook(ctx::RecordingCtx, resp, ::typeof(HTTP.request), ::Type{Union{}}, args...) =
    push!(ctx.metadata, resp)
function after(ctx::RecordingCtx, path)
    mkpath(dirname(path))
    BSON.bson(path; responses=ctx.metadata)
end

Cassette.@context PlaybackCtx
Cassette.overdub(ctx::PlaybackCtx, ::typeof(HTTP.request), args...) =
    popfirst!(ctx.metadata)
after(::PlaybackCtx, path) = nothing

function playback(f, path)
    ctx = if isfile(path)
        data = BSON.load(path)
        PlaybackCtx(; metadata=data[:responses])
    else
        RecordingCtx(; metadata=[])
    end
    
    result = Cassette.overdub(ctx, f)
    after(ctx, path)
    return result
end

end

Update

This package, with improvements, is now available at BrokenRecord.jl.


  1. If you run this code yourself, you’ll probably get a gigantic error message from deep in the compiler before it returns. I have no idea what’s causing it, but it seems safe to ignore and it only happens on the first run. See this issue for more details.