Дима Рубинштейн (dimrub) wrote in gotchas,

A base64-encoded deja vu

I was writing an HTTP client of sorts that would upload files to an HTTP server I also controlled, using a base64-encoded multipart HTTP request. Everything worked fine until I added a test that transferred files of small random sizes. Every once in a while an upload would fail, and the server would report "malformed mime body" (or something along those lines). Digging into the server code, I discovered that the server considered the base64-encoded message invalid because of padding of incorrect length (the size of a base64-encoded message must be divisible by 4).

After digging some more through rather complex, multitiered code, I gave up and instead modified my test to transfer files of ever-increasing size. To my surprise, I immediately saw the method in the madness: files of sizes 1-57 went through just fine, sizes 58-114 resulted in an error, then it was fine again, and then 172-228 again formed a problematic range (so 57 seemed somehow to be a magic number here). At this point I was having a rather strong sense of deja vu.

Both the client encoder and the server decoder had a decent suite of unit tests, and they were all green. Changing the inputs in both to fall in the problematic size ranges didn't change anything. Out of sheer desperation, I copied the output of the encoder verbatim into the input of the decoder's tests (the only difference I could see from manually crafted input was that the encoder's output was broken into multiple lines). Voila! The test went red.

From there it was easy to spot the problem: when verifying the padding, the decoder did not discard the newline characters but looked at the total size of the encoded data. Each full line is 76 characters, a multiple of 4, so only the line breaks shift the total. That meant that whenever the number of line breaks was odd (that is, the number of lines was even), the newline characters (two per break: CR+LF) left the size of the message at 2 modulo 4. The longest line in the encoded message is 76 characters, which corresponds exactly to the 57 bytes of the original message I was observing, since base64 turns every 3 input bytes into 4 output characters (57 / 3 * 4 = 76).
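For the curious, here is a minimal Python sketch that reproduces the pattern. It is not the code from my project: the function names are made up for illustration, and it assumes the encoder wraps its output at 76 characters with CR+LF between lines and no trailing line break, which is what the failing size ranges suggest.

```python
import base64

LINE_LEN = 76  # base64 line length: 57 input bytes per full line


def encode_mime(data: bytes) -> bytes:
    """Base64-encode data, wrapped at 76 chars with CRLF between lines.

    Assumption: no CRLF after the last line.
    """
    raw = base64.b64encode(data)
    lines = [raw[i:i + LINE_LEN] for i in range(0, len(raw), LINE_LEN)]
    return b"\r\n".join(lines)


def naive_padding_ok(encoded: bytes) -> bool:
    """The buggy check: line breaks are counted as part of the length."""
    return len(encoded) % 4 == 0


def fixed_padding_ok(encoded: bytes) -> bool:
    """The corrected check: drop CR and LF before looking at the length."""
    stripped = encoded.replace(b"\r", b"").replace(b"\n", b"")
    return len(stripped) % 4 == 0


def failing_sizes(limit: int) -> list[int]:
    """Input sizes in 1..limit that the naive check rejects."""
    return [
        n for n in range(1, limit + 1)
        if not naive_padding_ok(encode_mime(bytes(n)))
    ]
```

Under these assumptions, `failing_sizes(228)` returns exactly the sizes 58-114 and 172-228, while `fixed_padding_ok` accepts every size in that range.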