Nat!


Senior Mull

Reducing known plaintext in encrypted gzip archives

If you want to store encrypted backup files on a US server, how can you minimize the chance that the NSA reads them with one of its quantum crackers?

Here are my thoughts:

  1. Create a tar archive of your files.
  2. Use gzip to compress it. Don't use bzip2: it litters the output with magic numbers, which amount to known plaintext, and any known plaintext makes the cracker's job easier. gpg can and will compress its contents as well, but I assume, without knowing, that it leaves the usual headers in place.
  3. Use my newly written utility mulle-gz-header-utility to remove the gz header, which consists of 10 bytes of known or easily guessable plaintext. Now we are only encrypting the compressed data. [1]
  4. Encrypt with gpg --compress-algo none --symmetric --cipher-algo CAMELLIA256 and a long and complex passphrase. AES256 is recommended by the NSA, so let's use the Japanese algorithm instead.
  5. Upload and forget.
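
To make step 3 concrete, here is a minimal Python sketch of what stripping and restoring the fixed gzip header could look like. This is not the code of mulle-gz-header-utility; the function names `strip_gzip_header` and `restore_gzip_header` are made up for illustration, and the sketch assumes a gzip stream without optional header fields (FLG byte is zero), which is what you get when the input has no stored file name.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"
FIXED_HEADER_LEN = 10  # magic(2) + CM(1) + FLG(1) + MTIME(4) + XFL(1) + OS(1)


def strip_gzip_header(data: bytes) -> bytes:
    """Drop the fixed 10-byte gzip header (hypothetical helper).

    Refuses streams with optional header fields, because then the header
    is longer than 10 bytes and cannot be rebuilt from nothing.
    """
    if len(data) < FIXED_HEADER_LEN + 8 or data[:2] != GZIP_MAGIC:
        raise ValueError("not a gzip stream")
    if data[2] != 8:
        raise ValueError("compression method is not deflate")
    if data[3] != 0:
        raise ValueError("optional header fields present (FLG != 0)")
    return data[FIXED_HEADER_LEN:]


def restore_gzip_header(body: bytes) -> bytes:
    """Prepend a generic header so standard gzip tools accept the data again."""
    # magic, CM=8 (deflate), FLG=0, MTIME=0, XFL=0, OS=255 (unknown)
    header = GZIP_MAGIC + b"\x08\x00" + b"\x00\x00\x00\x00" + b"\x00\xff"
    return header + body


if __name__ == "__main__":
    original = b"some backup data " * 1000
    packed = gzip.compress(original)
    body = strip_gzip_header(packed)  # this is what would go into gpg
    assert gzip.decompress(restore_gzip_header(body)) == original
```

The restored header does not have to match the original byte for byte: the timestamp and OS fields are informational, so decompression works with generic values.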

With a high-entropy plaintext and a strong cipher, I believe this will be tough to crack no matter what.

[1] What is not removed is the gzip footer, which besides the CRC of the decompressed file also contains its length. The length of the decompressed file amounts to an easily guessable 32 bits of plaintext.

Update: Well, maybe not 32 bits, and not so easily guessable. Depending on the file size (let's assume 100 MB compressed), it's actually more like four leading zero bits, followed by a few bits which are more likely set than not.
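
To see how many bits of the length field are predictable, the footer can be inspected with a few lines of Python. This is only a sketch: the helper names `gzip_footer` and `leading_zero_bits` are made up here, and it assumes a single-member gzip stream, where the last 8 bytes hold the CRC-32 and the uncompressed length modulo 2^32, both little-endian.

```python
import gzip
import struct


def gzip_footer(data: bytes) -> tuple[int, int]:
    """Return (crc32, isize) from the last 8 bytes of a gzip stream."""
    if len(data) < 18:  # 10-byte header + 8-byte footer at minimum
        raise ValueError("too short to be a gzip stream")
    crc32, isize = struct.unpack("<II", data[-8:])
    return crc32, isize


def leading_zero_bits(value: int, width: int = 32) -> int:
    """Number of zero bits above the highest set bit in a width-bit field."""
    return width - value.bit_length()


if __name__ == "__main__":
    packed = gzip.compress(bytes(1000))
    crc32, isize = gzip_footer(packed)
    print(f"crc32={crc32:#010x} isize={isize}")
    # A hypothetical 200 MiB uncompressed archive leaves 4 zero bits on top.
    print(leading_zero_bits(200 * 1024 * 1024))
```

Since the field is stored little-endian, those "leading" zero bits sit in the very last byte of the file.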

BTW: Extracting the footer and storing it outside of the gpged archive is not a good idea!