Friday 13 December 2019

data recovery - Recovering corrupted files uploaded in wrong FTP mode


Some files were uploaded to an FTP server in the wrong transfer mode (via the command line). I believe some binary files were uploaded in TEXT (ASCII) mode, and now I cannot open them.


I don't have access to the original files. Can I somehow recover from this? Is there a tool that will restore the files to their correct format?



Answer



I recently had to face the same problem: Linux -> Windows, ASCII mode. I wrote a Python program that recovers binaries that were transferred in ASCII mode. It is a byte brute-forcer, and here is how it works:



  1. Open the damaged archive as a byte stream.

  2. Find all occurrences of 0d followed by 0a (ASCII 13, ASCII 10).

  3. Remove the 0d from every such pair and store the byte addresses.

  4. Cycle through the stored addresses and put a small number of 0d's back, in case they were genuinely part of the binary. After each restore, try to open the result. In my case I was dealing with bz2 archives, so I ran the CRC checksum over the uncompressed data and compared it with the checksum stored in the archive.
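The steps above can be sketched in Python roughly as follows. This is not the original program (which was not published), only a minimal illustration of the same idea. The function names, the `max_restored` limit, and the pluggable `is_valid` check are assumptions made for the example. The default validator relies on `bz2.decompress`, which raises an error when the stream is malformed or its CRC does not match.

```python
import bz2
from itertools import combinations


def find_crlf_offsets(data):
    """Return the offset of every 0d byte that is directly followed by 0a."""
    offsets = []
    i = data.find(b"\r\n")
    while i != -1:
        offsets.append(i)
        i = data.find(b"\r\n", i + 2)
    return offsets


def build_candidate(data, offsets, keep):
    """Drop the 0d at every offset, except at the offsets listed in `keep`."""
    out = bytearray()
    prev = 0
    for off in offsets:
        out += data[prev:off]
        if off in keep:
            out += b"\r"      # this 0d is assumed to be genuine
        prev = off + 1        # skip the 0d; the 0a is copied with the next slice
    out += data[prev:]
    return bytes(out)


def is_valid_bz2(blob):
    """True if the blob decompresses cleanly (bz2 verifies its own CRC)."""
    try:
        bz2.decompress(blob)
        return True
    except (OSError, ValueError):
        return False


def recover(data, max_restored=2, is_valid=is_valid_bz2):
    """Try candidates with 0, 1, ... max_restored genuine 0d 0a pairs.

    Returns the first candidate accepted by `is_valid`, or None.
    """
    offsets = find_crlf_offsets(data)
    for k in range(max_restored + 1):
        for keep in combinations(offsets, k):
            candidate = build_candidate(data, offsets, set(keep))
            if is_valid(candidate):
                return candidate
    return None
```

For another file format, pass a different `is_valid` function, for example one that compares against a known hash or tries to parse the file with the relevant library. The search grows quickly with `max_restored`, so it is only practical while the number of genuine 0d 0a pairs is small.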


Genuine 0d 0a byte pairs are rare in a binary: the probability that any given position in the original file holds a real 0d 0a pair is quite low, so only a few 0d's ever need to be restored. Fixing a bz2 archive with this brute-force method takes under 10 seconds for files under 100 KB. I have not tried it with other file types, but it should be possible, provided there is some way to verify each candidate.
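To put rough numbers on that, here is a back-of-the-envelope estimate. It assumes the original bytes are close to uniformly random, which is only an approximation for compressed data, and the helper names are made up for this example. Under that assumption a given byte is 0a with probability 1/256, and a given adjacent pair is 0d 0a with probability 1/65536.

```python
from math import comb


def expected_counts(n_bytes):
    """Expected number of 0a bytes and of genuine 0d 0a pairs in random data."""
    lone_lf = n_bytes / 256
    crlf_pairs = (n_bytes - 1) / 65536
    return lone_lf, crlf_pairs


def candidate_count(n_positions, max_restored):
    """Candidates to test when at most `max_restored` 0d's are put back."""
    return sum(comb(n_positions, k) for k in range(max_restored + 1))


lf, pairs = expected_counts(100 * 1024)
print(f"expected 0a bytes: {lf:.0f}, expected genuine 0d 0a pairs: {pairs:.2f}")
print("candidates for 400 positions, up to 2 restored:", candidate_count(400, 2))
```

So a 100 KB file would be expected to contain about 400 positions to consider but only one or two genuine pairs, which keeps the number of candidates in the tens of thousands rather than 2^400.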


I am not going to paste the code here, since this question is not programming-related. It was also a sort of competition task, and I'm not comfortable making the source public. If you do need it, please let me know.


Cheers, and Merry Christmas everyone! :)

