Some files were uploaded to an FTP server in the wrong mode (via the command line). I believe some binary files were uploaded in TEXT mode, and now I cannot open them.
I don't have access to the original files. Can I somehow recover from this? Is there a tool that will let me get the files back in their correct format?
Answer
I recently faced the same problem: Linux -> Windows, ASCII mode. I've finished writing a Python program that recovers binaries that were transferred in ASCII mode. It's a byte brute-forcer, and here is how it works:
- Open damaged archive as byte stream.
- Find all occurrences of 0d followed by 0a (ASCII 13, ASCII 10).
- Remove all occurrences of 0d followed by 0a and store the byte addresses.
- Cycle through the stored addresses, restoring the 0d at some of them in case it was genuinely part of the binary, and try to open the result after each restoration. In my case I was dealing with bz2 archives, so a CRC check verified the integrity of the uncompressed data against the checksum stored in the archive.
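The steps above can be sketched roughly as follows. This is not the program described in this answer, only a minimal illustration under assumptions: the function names (`strip_crs`, `restore`, `is_valid_bz2`, `recover`) and the `max_restored` limit are hypothetical, "removing" a pair is taken to mean dropping the 0d and keeping the 0a, and validation relies on Python's standard `bz2.decompress`, which raises an error when the archive's checksum does not match.

```python
import bz2
from itertools import combinations


def strip_crs(damaged):
    """Drop the 0d of every 0d 0a pair.

    Returns the cleaned bytes and the offsets (in the cleaned data)
    of each 0a that had a 0d removed in front of it.
    """
    out = bytearray()
    positions = []
    i = 0
    while i < len(damaged):
        if damaged[i] == 0x0D and i + 1 < len(damaged) and damaged[i + 1] == 0x0A:
            positions.append(len(out))  # where the 0a lands in the output
            out.append(0x0A)
            i += 2
        else:
            out.append(damaged[i])
            i += 1
    return bytes(out), positions


def restore(cleaned, positions):
    """Re-insert a 0d in front of the 0a at each given offset."""
    out = bytearray()
    prev = 0
    for p in sorted(positions):
        out += cleaned[prev:p]
        out.append(0x0D)
        prev = p
    out += cleaned[prev:]
    return bytes(out)


def is_valid_bz2(data):
    """True if the data decompresses cleanly (bz2 verifies its CRC)."""
    try:
        bz2.decompress(data)
        return True
    except (OSError, ValueError):
        return False


def recover(damaged, is_valid=is_valid_bz2, max_restored=2):
    """Try restoring 0, 1, ... max_restored 0d bytes until a candidate validates."""
    cleaned, positions = strip_crs(damaged)
    for k in range(max_restored + 1):
        for subset in combinations(positions, k):
            candidate = restore(cleaned, subset)
            if is_valid(candidate):
                return candidate
    return None  # nothing validated within the search limit
```

Assuming the damaged archive is in a file (the file names here are placeholders), usage might look like `fixed = recover(open("damaged.bz2", "rb").read())`, followed by writing `fixed` back out in binary mode if it is not `None`. The `is_valid` parameter is there so that a different integrity check can be swapped in for other file formats.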
The number of genuine 0d 0a byte pairs in a binary will not be very high, because such a pair is unlikely to occur at any given position. Fixing a bz2 archive with this brute-force method takes under 10 seconds for files under 100 KB. I have not tried it with other file types, but it should be possible.
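As a rough sanity check on that estimate, here is a small back-of-the-envelope calculation. It assumes the bytes of a compressed file are close to uniformly random, which is my assumption and not something measured here; the function names are hypothetical.

```python
from math import comb


def expected_crlf_pairs(size_bytes):
    """Expected number of 0d 0a pairs in uniformly random data.

    Each of the (size - 1) adjacent byte pairs matches with
    probability 1 / 256**2 = 1 / 65536.
    """
    return max(size_bytes - 1, 0) / 65536


def candidate_count(n_positions, max_restored):
    """Number of candidates when restoring at most max_restored 0d bytes."""
    return sum(comb(n_positions, k) for k in range(max_restored + 1))


if __name__ == "__main__":
    print(expected_crlf_pairs(100 * 1024))
    print(candidate_count(400, 2))
```

Under that assumption, a 100 KB file is expected to contain only one or two genuine pairs, so restoring a handful of 0d bytes at most should be enough, while the number of candidates to test grows quickly with the number of 0a bytes in the file.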
I am not going to paste the code here, since this question is not programming related. It was also part of a competition task, so I'm not comfortable making the sources public, but if you do need it, please let me know.
Cheers, and Merry Christmas everyone! :)