The Novacut video editor will "save" the edits as JSON using a simple graph-based description. As it will be a distributed video editor, it's very important that the video and audio source files (produced by your HDSLR camera and digital audio recorder) be referenced in a globally unique way.
The solution is easy: reference these media files by their content hash. The question is, what hash? After some research and testing, I'm leaning toward Skein, specifically skein-512 with a 240-bit digest. But I would love some feedback on this. I would especially love some feedback from my former freeIPA teammates at Red Hat because, well, you people are security rock stars. And opinionated. Yes, Simo, I'm looking at you! So Rob, Pavel, Martin, John, Dmitri, Simo, Stephen, what do you think? My own rationale goes something like this:
- The hash needs an extremely long useful life: the video edit description is designed specifically for remixing, so these hashes will become the keepers of a (hopefully) large body of read/write culture
- At the same time, the hash should have a reasonably small digest size so it's URL friendly, easy to use in many contexts
- Ideally the hash would have a digest size that is a multiple of 40 bits so it can be cleanly base32 encoded with no padding (I'm avoiding base64 so I can use the hash to name files, even on case-insensitive file systems)
- sha1 (40 * 4 = 160 bits) is already considered pretty broken, so that doesn't sound future proof to me
- skein-512 is fast, has a conservative design with a large 512-bit internal state, and can produce any digest size desired (so we just pick our favorite multiple of 40 bits)
- A 240-bit digest means a birthday (collision) attack takes about 2**120 operations, which is darn close to the fuzzy feeling I get when anything security-related requires 2**128 operations to brute-force
- When we base32-encode a 240-bit digest, we get a 48-character string, which is still short enough to be fairly URL friendly
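The digest-size arithmetic in the list above can be checked with a quick Python sketch. It uses only the standard library and an all-zero placeholder in place of a real digest (Skein isn't in the Python standard library), and the `b32_length` helper is just a name made up for this illustration:

```python
import base64


def b32_length(bits):
    """Return (characters, padding) when base32-encoding a digest of `bits` bits."""
    digest = bytes(bits // 8)  # all-zero placeholder, NOT a real hash value
    encoded = base64.b32encode(digest).decode("ascii")
    stripped = encoded.rstrip("=")
    return len(stripped), len(encoded) - len(stripped)


# Compare a few candidate digest sizes; the generic birthday bound is bits / 2.
for bits in (160, 200, 240, 256):
    chars, padding = b32_length(bits)
    print("%d bits: %d chars, %d padding, birthday bound 2**%d"
          % (bits, chars, padding, bits // 2))
```

Sizes that are a multiple of 40 bits (160, 200, 240) come out with no `=` padding, while 256 bits does not, which is the whole reason for picking a multiple of 40.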
The only worry I have about Skein is that the rotational constants might be changed again, which would be quite disruptive. Not impossible to deal with (the editing format should really have a graceful way to migrate to a different hash anyway), but the timing would suck.
I saw on Bruce Schneier's blog that a constant will be changed in Threefish, the block-cipher used by Skein:
"Even with the attack, Threefish has a good security margin. Also, the attack doesn't affect Skein. But changing one constant in the algorithm's key schedule makes the attack impossible. NIST has said they're allowing second-round tweaks, so we're going to make the change. It won't affect any performance numbers or obviate any other cryptanalytic results -- but the best attack would be 33 out of 72 rounds."
As this change will change the value of Skein hashes, I'll wait to use Skein in dmedia till after the change. Hopefully it will be completed soon. In the meantime, I'll use a base32-encoded sha1 hash in dmedia. Depending on how many adventurous beta-testers we have, I may not provide a sha1 to skein migration path.
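For the curious, here is a rough sketch of what a base32-encoded sha1 content hash looks like in Python. This is only an illustration, not the actual dmedia code: the `content_id` name and the 1 MiB chunk size are assumptions made up for this example.

```python
import base64
import hashlib
import io


def content_id(fp, chunk_size=2**20):
    """Hash a binary file object in chunks and return the base32-encoded sha1.

    Hypothetical helper: the name and chunk size are illustrative only.
    """
    h = hashlib.sha1()
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    # 160 bits is a multiple of 40, so the result is 32 characters, no padding
    return base64.b32encode(h.digest()).decode("ascii")


# Usage with an in-memory stand-in for a real video file:
fake_video = io.BytesIO(b"pretend this is HDSLR footage")
print(content_id(fake_video))
```

Reading in chunks matters here because video files are big, and the uppercase-only base32 alphabet is what makes the result safe as a file name on case-insensitive file systems.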
Also, thanks for your input, Simo!