Clojure’s EDN Tagged Literals

We had an issue earlier where the EDN reader was breaking because it didn’t recognize the #bin tag.

The root problem there was that we were sending identity frames across the wire when we shouldn’t have been.

But it’s worth knowing how to treat the symptom, for cases where we want to send things that aren’t natively legal EDN. Handling those is one of the main points of using an extensible format.

Clojure introduced customizable “tagged literals” in 1.4. Pretty much all the Google results about this talk about the full-blown Clojure version. They’re all about setting up a data_readers.clj and using it to map each custom tag to the function the reader should call when it encounters that tag.
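Here's a rough sketch of what that full-blown version looks like. The demo/point tag, the my.app namespace, and read-point are all names I made up for illustration; none of them come from our code.

```clojure
;; A data_readers.clj at the root of the classpath is just a map from tag
;; symbols to the vars that handle them, e.g. (hypothetical names):
;;
;;   {demo/point my.app/read-point}
;;
;; The same mapping can be tried at the REPL by binding *data-readers*.
(defn read-point
  "Hypothetical reader function: turns the tagged vector into a map."
  [[x y]]
  {:x x :y y})

(binding [*data-readers* {'demo/point read-point}]
  (read-string "#demo/point [1 2]"))
;; => {:x 1, :y 2}
```

Note that this is clojure.core/read-string, not the EDN reader.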

We’re just serializing everything by calling pr-str on it. This mostly creates strings that the Clojure reader can automatically parse. For things that it can’t, there’s a multimethod that you can override: print-method. (print-dup is the related one, but it’s only used when *print-dup* is bound to true.)
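A minimal sketch of overriding the printer for a type of our own, so that pr-str emits a tagged literal. The Point record and the #demo/point tag are hypothetical, picked only to show the shape of it.

```clojure
;; Hypothetical record type, used only for illustration.
(defrecord Point [x y])

;; pr-str goes through print-method, so a method for Point controls the
;; text that ends up on the wire. The #demo/point tag is made up.
(defmethod print-method Point [p ^java.io.Writer w]
  (.write w "#demo/point ")
  (print-method [(:x p) (:y p)] w))

(pr-str (->Point 1 2))
;; => "#demo/point [1 2]"
```

Whoever reads that string back still needs a reader function registered for the tag, or they hit the same kind of error we saw with #bin.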

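The Clojure reader picks up those custom tags automatically, because it loads data_readers.clj. The EDN reader doesn’t: it ignores data_readers.clj, and by default it only knows the built-in #inst and #uuid tags. Any other reader functions have to be passed in explicitly on each call. It’s very specifically designed for deserializing input from untrusted sources.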
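For the EDN side, clojure.edn/read-string takes an options map with :readers (tag symbol to function) and :default (called with the tag and value for any tag that has no reader). A sketch, assuming a made-up #demo/point tag and a made-up payload for #bin, since I'm not reproducing the real format here:

```clojure
(require '[clojure.edn :as edn])

;; With no reader registered for the tag, clojure.edn throws.
(try
  (edn/read-string "#demo/point [1 2]")
  (catch RuntimeException e
    (.getMessage e)))

;; Passing :readers tells it how to build a value for that one tag.
(edn/read-string {:readers {'demo/point (fn [[x y]] {:x x :y y})}}
                 "#demo/point [1 2]")
;; => {:x 1, :y 2}

;; :default is the catch-all for tags with no reader. The "AAEC" payload is
;; invented for this example.
(edn/read-string {:default (fn [tag value] {:unknown-tag tag :value value})}
                 "#bin \"AAEC\"")
;; => {:unknown-tag bin, :value "AAEC"}
```

The nice part of this design is that the caller decides, per call, exactly which tags are allowed to do anything.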

The danger with the full Clojure reader is that a tag can cause the reader to execute any arbitrary code that strikes your fancy. Avoiding that is another of the big reasons for using EDN: before it was introduced, the default Lisp approach was just to trust the incoming data and use Lisp itself as the serialization format.
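The quickest way I know to see the difference is the reader's #= (read-eval) form, which is separate from tagged literals but shows the same risk. This is just a sketch with a harmless expression standing in for whatever an attacker would actually send:

```clojure
(require '[clojure.edn :as edn])

;; The full reader evaluates #= forms while reading, as long as *read-eval*
;; is true (which is the default).
(read-string "#=(+ 1 2)")
;; => 3

;; The EDN reader has no read-eval at all, so the same input is rejected.
(try
  (edn/read-string "#=(+ 1 2)")
  (catch RuntimeException _
    :rejected))
;; => :rejected
```

Binding *read-eval* to false makes clojure.core/read-string refuse #= as well, but reader functions registered through data_readers.clj would still run.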

That’s really a scary amount of power (which can be used for a lot of Good).

Anyway. That’s why the code sample from this morning was complaining about the #bin tag. EDN has no built-in representation for raw byte arrays, so the reader doesn’t know what to do with that tag by default. That’s one that we probably shouldn’t ever mess with.