Tuesday, November 26, 2013

Unicast enumerators let you debug in production.

Have you ever had to debug something broken in production? Have you ever had to grep through mountains of logs to find a stack trace? What if I told you that you could get log data back in your API call response? But wait, there's more! What if you could run jobs via an API endpoint, allowing them to be scheduled from other services?

The ghost of Billy Mays here, and before I'm off to haunt that dick bag sham wow guy, I just wanted to tell you about this amazing opportunity to use Play Framework's unicast enumerators.

It's no secret that Klout fucks with iteratees. We use them all over the place: our collectors, our targeting framework, our HBase and ElasticSearch libraries. Recently, we discovered one weird trick to debug code in production using this framework. Bugs hate us!

Iteratees

If you don't know what an enumerator or iteratee is, usher yourself to the Play website to have a 20 megaton knowledge bomb dropped on your dome piece. If you're too lazy, let me give you a mediocre 10,000-foot overview: enumerators push out inputs, enumeratees modify those inputs, and iteratees are the sinks that receive those inputs.

For example, in the following code snippet:

List(1,2,3).map(_ + 1).foldLeft(0)(_ + _)

The list is the enumerator, the mapping is the enumeratee, and the folding is the iteratee. Of course, with real iteratees everything is done piecemeal and asynchronously. The 1 gets pushed to the _ + 1 enumeratee, which then pushes a 2 to the fold iteratee, which adds it to its running total.
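Here's roughly what the same pipeline looks like with the actual iteratee library. This is a minimal sketch, not anything from our codebase: the object and value names are made up for illustration, and the ExecutionContext import is only needed on Play 2.2 and later (it's harmless on 2.1).

```scala
import play.api.libs.iteratee.{Enumerator, Enumeratee, Iteratee}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global // needed on Play 2.2+, unused on 2.1

object IterateePipeline {
  // The enumerator: pushes out 1, 2, 3 and then EOF.
  val source: Enumerator[Int] = Enumerator(1, 2, 3)

  // The enumeratee: transforms each input on its way through.
  val addOne: Enumeratee[Int, Int] = Enumeratee.map[Int](_ + 1)

  // The iteratee: the sink, folding everything it receives into a sum.
  val sum: Iteratee[Int, Int] = Iteratee.fold[Int, Int](0)(_ + _)

  // &> glues the enumeratee onto the enumerator; |>>> feeds the result
  // into the iteratee and hands back a Future of the final value.
  def run(): Future[Int] = source &> addOne |>>> sum

  def main(args: Array[String]): Unit =
    println(Await.result(run(), 5.seconds)) // prints 9
}
```

Unlike the List version, nothing happens until the enumerator is applied to the iteratee, and the answer comes back as a Future.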

Play Framework

Interestingly enough, Play's HTTP handling is actually done by iteratees. Your controller actions return byte array enumerators, which get consumed by various iteratees. Everything else is just a bunch of sugar on top of this low-level interface.

Also, in Play 2.1 there is a unicast enumerator, which allows you to imperatively push inputs via a channel. Yeah, yeah we all know that imperative code is uglier than Steve Buscemi and Paul Scheer's unholy love child, but pushing inputs allows you to do some interesting things. If you haven't solved the fucking mystery yet, and are still as confused as you were by the ending to Prometheus, let me give you a giant fucking hint: you can push debug statements via the channel, which get spat out by the enumerator. This enumerator then gets consumed by one of Play's iteratees.
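A bare-bones sketch of that idea, assuming the Play 2.1 signature of Concurrent.unicast (Play 2.2 adds an implicit ExecutionContext, which the import below covers). The object name and the messages are made up; in a real request Play's own iteratee would be the consumer, so Iteratee.getChunks is only standing in for it here.

```scala
import play.api.libs.iteratee.{Concurrent, Enumerator, Iteratee}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global // needed on Play 2.2+, unused on 2.1

object UnicastDemo {
  // onStart is handed the channel once a consumer attaches to the enumerator.
  // Everything pushed into the channel comes out of the enumerator.
  val debugLines: Enumerator[String] = Concurrent.unicast[String](onStart = { channel =>
    channel.push("Fetching something from HBase")
    channel.push("HBase returned")
    channel.eofAndEnd() // send EOF and close the channel so the consumer can finish
  })

  // Stand-in consumer: collect every pushed line into a List.
  def collect(): Future[List[String]] = debugLines |>>> Iteratee.getChunks[String]
}
```

The pushes here all happen inside onStart for brevity. The interesting case is holding on to the channel and pushing from wherever your code happens to be running.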

So here's my implementation of this pattern. The call starts with a controller wrapper.

The event stream then gets passed down to the service method as an implicit.
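To make the implicit passing concrete, here's one guess at what that could look like. Everything named here is hypothetical: the shape of EventStream, the info method, and BlahService are stand-ins, and the line format just mimics the sample output further down. The only real API used is Concurrent.Channel from Play.

```scala
import java.text.SimpleDateFormat
import java.util.Date
import play.api.libs.iteratee.Concurrent

// Hypothetical EventStream: wraps the unicast channel when streaming is on,
// and silently does nothing when it is off.
case class EventStream(channel: Option[Concurrent.Channel[String]]) {
  def info(message: String): Unit = channel.foreach { c =>
    val stamp = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(new Date)
    c.push(s"[info] [$stamp] $message\n")
  }
}

// Hypothetical service: the stream rides along as an implicit parameter,
// so the business logic never touches the channel directly.
object BlahService {
  def fetchBlah()(implicit events: EventStream): String = {
    events.info("Fetching something from HBase")
    val result = """{"billymays" : "badass"}""" // pretend this came from HBase
    events.info("HBase returned")
    result
  }
}
```

With EventStream(None) in scope the service behaves exactly as it would without any of this, which is the point: regular requests don't pay for the debugging.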

When you curl the endpoint with a special header, the log pushes are returned as a chunked HTTP response AS THEY ARE PUSHED.

curl -H 'X-Klout-Use-Stream:debug' 'http://rip.billy.com/api/blah.json'

[info] [2013-11-26 09:57:50.931] Fetching something from HBase
[info] [2013-11-26 09:57:51.932] HBase returned in 1s. Durrgh.
[complete] [2013-11-26 23:25:32.499] {"billymays" : "badass"}
200

Here's the wrapper code itself. The implementation of the EventStream case class is left as an exercise for the reader.

Basically what's going on here is I look through the headers for my special header and see if it maps to a predefined event level. If it doesn't, I do the usual shit. If it does, I call Concurrent.unicast. This factory takes in a function that uses a channel of type T and returns an enumerator of type T. I take this channel and put it into an EventStream case class. This case class gets passed down to the business logic, which typically uses it as an implicit.
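The steps above might be sketched like this. Treat it as an approximation written against Play 2.1, not the actual Klout wrapper: Streamed, parseLevel, the set of levels, and the one-line EventStream are all assumptions, and only the header name comes from the curl example. Ok.stream is the Play 2.1 call for a chunked response (later versions call it Ok.chunked). This sketch only checks that the level is known; it does not filter messages by level.

```scala
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import play.api.libs.iteratee.Concurrent
import play.api.mvc.{Action, AnyContent, Controller, Results}
import scala.concurrent.Future

// Compact hypothetical EventStream so this block stands on its own.
case class EventStream(channel: Option[Concurrent.Channel[String]]) {
  def log(level: String, message: String): Unit =
    channel.foreach(_.push(s"[$level] $message\n"))
}

object Streamed extends Results {
  val HeaderName = "X-Klout-Use-Stream" // header from the curl example
  val Levels = Set("debug", "info")      // assumed predefined event levels

  // Keep the header value only if it maps to a known level.
  def parseLevel(header: Option[String]): Option[String] =
    header.map(_.trim.toLowerCase).filter(Levels.contains)

  def apply(block: EventStream => String): Action[AnyContent] = Action { request =>
    parseLevel(request.headers.get(HeaderName)) match {
      case None =>
        // No special header: the usual response, with logging switched off.
        Ok(block(EventStream(None)))
      case Some(_) =>
        // Special header: put the channel in an EventStream, run the business
        // logic off-thread, and stream whatever it pushes as chunks.
        val events = Concurrent.unicast[String](onStart = { channel =>
          val stream = EventStream(Some(channel))
          Future {
            try stream.log("complete", block(stream))
            finally channel.eofAndEnd()
          }
        })
        Ok.stream(events)
    }
  }
}

// Hypothetical usage: the stream is marked implicit so service calls can pick it up.
object BlahController extends Controller {
  def blah = Streamed { implicit events =>
    events.log("info", "Fetching something from HBase")
    """{"billymays" : "badass"}"""
  }
}
```

Running the block inside a Future keeps onStart from blocking, so each push goes out to the client while the work is still in progress.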

Word.

Application

In addition to being able to stream logs by request in production, you can by extension stream logs from a long-running request and call it a job. This means you can use external services like Jenkins to run an API endpoint like a job.

Yay! Also, Deltron:
