ubuntu - Can I repair/reuse a dying hard drive?

Saturday, 25 May 2019

I have a hard drive that's giving lots of read/write I/O errors, bad sectors, and general malfunctioning. It's a 2 TB Western Digital Caviar Green. The disk is dying, not dead, so it's still recognized by my system, I can access it, etc.

I hope this is not a duplicate: every other question deals with recovering data, which I have already done. If anyone wants to know about that process I can expand on it, but it basically involved using pvmove to move the whole drive chunk by chunk to another drive, getting tons of I/O errors along the way and having to restart and resume the moves several times. The drive was part of my 20+ TB LVM server, running Ubuntu 12.04. It's empty and unpartitioned now.

This is the drive's S.M.A.R.T. information. As you can see, there are several red flags, such as the error rate and the reallocations (it's an old and heavily used drive):

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   140   138   021    Pre-fail  Always       -       10000
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       661
  5 Reallocated_Sector_Ct   0x0033   192   192   140    Pre-fail  Always       -       62
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   026   026   000    Old_age   Always       -       54086
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       219
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       133
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       637609
194 Temperature_Celsius     0x0022   106   095   000    Old_age   Always       -       46
196 Reallocated_Event_Count 0x0032   138   138   000    Old_age   Always       -       62
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   001   001   000    Old_age   Offline      -       613558

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     53401         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
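
For anyone who wants to pull the warning signs out of that table programmatically, here is a minimal Python sketch. It parses a few rows copied from the attribute table above and applies two rules of thumb that are my own assumptions, not anything smartctl or Western Digital defines: a non-zero raw count on IDs 5, 196, 197 or 198 is flagged, and so is any normalized VALUE that has dropped to 1 or below. The names `WATCH_RAW`, `parse_rows` and `flag` are made up for this example.

```python
import re

# Five rows copied verbatim from the attribute table above.
SMART_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   192   192   140    Pre-fail  Always       -       62
  9 Power_On_Hours          0x0032   026   026   000    Old_age   Always       -       54086
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       637609
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   001   001   000    Old_age   Offline      -       613558
"""

# Columns: ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# Assumption: IDs where any non-zero raw count hints at media trouble.
WATCH_RAW = {5, 196, 197, 198}


def parse_rows(text):
    """Return one dict per attribute row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attr_id, name, value, worst, thresh, raw = m.groups()
            rows.append({
                "id": int(attr_id), "name": name, "value": int(value),
                "worst": int(worst), "thresh": int(thresh), "raw": int(raw),
            })
    return rows


def flag(rows):
    """Apply the two illustrative rules and return (name, reason) pairs."""
    flagged = []
    for r in rows:
        if r["id"] in WATCH_RAW and r["raw"] > 0:
            flagged.append((r["name"], "raw count is %d" % r["raw"]))
        elif r["value"] <= 1:
            flagged.append((r["name"], "normalized value has bottomed out"))
    return flagged


rows = parse_rows(SMART_ROWS)
flagged = flag(rows)
for name, reason in flagged:
    print(name, "-", reason)

# Power_On_Hours (ID 9) converted to years of continuous operation.
hours = next(r["raw"] for r in rows if r["id"] == 9)
power_on_years = hours / (24 * 365)
print("powered on for about %.1f years" % power_on_years)
```

On these rows the sketch flags the reallocated sectors, the load cycle count and the multi-zone error rate, and the power-on hours work out to a little over six years of continuous running.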

This is a small sample of the errors that appear in syslog when doing a simple dd of a few MB to the device:

[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] CDB: 
[vie may  4 12:08:45 2018] Write(10): 2a 00 00 00 c8 00 00 04 00 00
[vie may  4 12:08:45 2018] end_request: I/O error, dev sdg, sector 51200
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6400
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6401
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6402
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6403
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6404
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6405
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6406
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6407
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6408
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] Buffer I/O error on device sdg, logical block 6409
[vie may  4 12:08:45 2018] lost page write due to I/O error on sdg
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] Unhandled error code
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg]  
[vie may  4 12:08:45 2018] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] CDB: 
[vie may  4 12:08:45 2018] Write(10): 2a 00 00 00 cc 00 00 04 00 00
[vie may  4 12:08:45 2018] end_request: I/O error, dev sdg, sector 52224
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] Unhandled error code
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg]  
[vie may  4 12:08:45 2018] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] CDB: 
[vie may  4 12:08:45 2018] Write(10): 2a 00 00 00 d0 00 00 04 00 00
[vie may  4 12:08:45 2018] end_request: I/O error, dev sdg, sector 53248
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] Unhandled error code
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg]  
[vie may  4 12:08:45 2018] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] CDB: 
[vie may  4 12:08:45 2018] Write(10): 2a 00 00 00 d4 00 00 04 00 00
[vie may  4 12:08:45 2018] end_request: I/O error, dev sdg, sector 54272
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] Unhandled error code
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg]  
[vie may  4 12:08:45 2018] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[vie may  4 12:08:45 2018] sd 5:0:0:0: [sdg] CDB: 
[vie may  4 12:08:45 2018] Write(10): 2a 00 00 00 d8 00 00 04 00 00
[vie may  4 12:08:45 2018] end_request: I/O error, dev sdg, sector 55296
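
The hex bytes after Write(10) in that log are the SCSI command the kernel sent, and they line up with the sector numbers reported next to them. A small Python sketch of the decoding, using the standard WRITE(10) layout (opcode 0x2a, a 4-byte big-endian LBA in bytes 2 to 5, a 2-byte transfer length in bytes 7 to 8). The function name `decode_write10` is made up for this example, and the 512-byte sector and 4096-byte buffer block sizes are assumptions that happen to fit the numbers in the log:

```python
def decode_write10(cdb_hex):
    """Return (lba, block_count) from a WRITE(10) CDB given as spaced hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # spaces between byte pairs are skipped
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # starting logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # number of blocks to write
    return lba, blocks


SECTOR_BYTES = 512         # assumption: unit used in the "sector N" messages
BUFFER_BLOCK_BYTES = 4096  # assumption: unit used in the "logical block N" messages

lba, blocks = decode_write10("2a 00 00 00 c8 00 00 04 00 00")
next_lba, _ = decode_write10("2a 00 00 00 cc 00 00 04 00 00")
first_buffer_block = lba * SECTOR_BYTES // BUFFER_BLOCK_BYTES

print("first write starts at sector", lba, "and covers", blocks, "sectors")
print("next write starts at sector", next_lba)
print("first failing buffer block:", first_buffer_block)
```

So each failed command is one 1024-sector (512 KiB) write, the writes are consecutive, and sector 51200 corresponds to logical block 6400, the first one in the "Buffer I/O error" lines.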

IMO it seems like a case of hardware failure from old age, but I would like to know if anyone has a different idea about the cause.

I'm not stupid and I've spent enough time recovering its data, so I won't be putting important info in there, haha. I just want to know if there is any procedure (software or even hardware) that I can use to "repair" some of these bad sectors. This is mostly from a curiosity and wanting-to-learn point of view. If I end up keeping it, I will use it for testing stuff, having a backup of parts for my other drives, etc.

TL;DR: Can I "repair" a dying hard drive (not caring about its data)?

Answer

Yes, you can indeed repair it. However, it's sort of pointless. The repair would mean buying a second drive, swapping the platters (to get rid of the platter damage and bad sectors), and possibly flashing or replacing the interface board to reset the SMART data. You would literally be building a new drive in the hull of the old one, and that requires parts from another drive, which renders the whole repair pointless.

Unless there's something you're just desperate to keep, 2TB WD Greens tend to go on sale on Amazon often, and are pretty inexpensive now in general (under $70). I'd recover whatever data is left, if any, get a Torx screwdriver, pull the drive apart, and begin your project of a mirror made out of dead HDD platters.

HDD platters are the core of the device and are where the data lives. Often, bad sectors mean that the thin metal layer laid over the glass platter is becoming pitted or can no longer be magnetized. Older platters are no more durable, even though they are usually solid metal. Some really old ones even have precious metal cores. In all cases, once a sector can't be magnetized, it is useless to the drive. There's no real coming back from that.
