Ubuntu – Damaged swap partition – How and what to do

Tags: 12.04, hard drive, partitioning, swap

I needed GParted to format a microSD card (I don't fully trust Disk Utility, since the hard drives and memory cards I formatted with it seem to develop more problems than the ones I formatted with GParted), and almost had a heart attack… There's a damaged partition on my new hard drive, which is just over 2 months old!
Luckily it was only my SWAP partition, but still…

I'd like to know a few things…

  • How to rule out the possibility of hardware failure.
  • How this can happen if the problem's not hardware-related.
    • I only installed Linux once on this computer (waited for Ubuntu 12.04 to get released), so it's not a SWAP partition that remains from an earlier installation.
  • What can be the cause of physical hard drive damage.
    • The computer hasn't fallen off a desk or anything…
  • How to prevent this in future, if at all possible.
  • Additional: Is it normal that read speed (and write speed as well, I think) drops significantly over the course of a minute? If it's not, what are some possible ways to analyse/fix the issue?

No problem if you can't answer all of my questions.


Information

It's a read-only benchmark.
The partition labeled Windows holds a Windows 7 installation that I'm supposed to need for school at some point, and that I used before Ubuntu 12.04 was released. I have booted it a few times since installing Ubuntu 12.04. I don't know whether Windows' disk check can do any harm to Linux partitions, but it always seems to run a check disk after I change my partition table layout with a Linux application like GParted.
/dev/sda7 is the SWAP partition I'm talking about.


I guess my best option now is just to boot a Live CD and format /dev/sda7 again? My installed Ubuntu system refuses to format it.

Best Answer

Just a shot in the dark, but IMHO your swap partition isn't damaged at all. I've seen discrepancies between fdisk and GParted before and, sad to say, fdisk is almost always right.

Try this:

#> cat /proc/meminfo | grep -i swap
SwapCached:        10632 kB
SwapTotal:       2094076 kB
SwapFree:        2053324 kB

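You should see that your swap space is actually in use (or at least I hope so): a non-zero SwapTotal means swap is active, and SwapTotal minus SwapFree is the amount currently used.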
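If you'd rather have the numbers computed for you, here is a minimal Python sketch that does the same check on the text of /proc/meminfo. The function name `swap_usage` is just something I made up for illustration; the only assumption is the usual `Key:   value kB` layout shown above.

```python
def swap_usage(meminfo_text):
    """Extract swap figures (in kB) from the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        # Only keep the Swap* lines (SwapCached, SwapTotal, SwapFree).
        if sep and key.startswith("Swap"):
            fields[key] = int(rest.split()[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "used_kb": total - free,
        # A non-zero total means at least one swap area is active.
        "active": total > 0,
    }
```

To try it on a live system, pass it the file contents, e.g. `swap_usage(open("/proc/meminfo").read())`. With the sample output above it reports 40752 kB in use.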

The explanation for that "unreadable" /dev/mapper/cryptswap1 partition is that cryptswap1 is actually a mapped, encrypted swap space, so it's expected that no tool can make sense of what's in there. If you want to disable it, you can look at this thread: How to disable cryptswap?
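To confirm that the active swap area really is the mapped device rather than the raw partition, you can look at /proc/swaps. Below is a rough Python sketch that parses its text; `mapped_swap_areas` is a hypothetical helper name, and treating names that start with `/dev/mapper/` or `/dev/dm-` as device-mapper targets is my assumption about how the kernel lists them. It also assumes the swap file name contains no spaces.

```python
def mapped_swap_areas(swaps_text):
    """Parse the contents of /proc/swaps into a list of swap areas."""
    areas = []
    # The first line is the column header: Filename Type Size Used Priority.
    for line in swaps_text.splitlines()[1:]:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        name = parts[0]
        areas.append({
            "name": name,
            "type": parts[1],
            "size_kb": int(parts[2]),
            "used_kb": int(parts[3]),
            # Assumption: device-mapper targets show up under these prefixes.
            "mapped": name.startswith(("/dev/mapper/", "/dev/dm-")),
        })
    return areas
```

If the only entry has `mapped` set to true, swap is going through the encrypted mapping, and the raw partition underneath will look like garbage to any partitioning tool.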

Last but not least, your SMART status: at first sight (just looking at the raw read error rate and the seek error rate), I'd have said that your drive was about to melt down. But no, it's fine: I have a drive whose SMART output says exactly the same. I'll post the full output for reference (both for me and for others on future visits).

#> sudo smartctl --all /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-24-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.12
Device Model:     ST3250318AS
Serial Number:    9VM2R3AN
LU WWN Device Id: 5 000c50 015aa8d47
Firmware Version: CC35
User Capacity:    250,059,350,016 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Sun May 27 18:03:03 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  617) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   1) minutes.
Extended self-test routine
recommended polling time:    (  52) minutes.
Conveyance self-test routine
recommended polling time:    (   2) minutes.
SCT capabilities:          (0x103f) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       196559365
  3 Spin_Up_Time            0x0003   097   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       320
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   067   060   030    Pre-fail  Always       -       6277671
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       517
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       158
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       41
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   065   065   045    Old_age   Always       -       35 (Min/Max 21/35)
194 Temperature_Celsius     0x0022   035   040   000    Old_age   Always       -       35 (0 12 0 0)
195 Hardware_ECC_Recovered  0x001a   052   045   000    Old_age   Always       -       196559365
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       72748156060552
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       968998393
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       939693204

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And I have another drive that looks more "normal":

#> sudo smartctl --all /dev/sdc
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.0.6-gentoo-goomba-test-3] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (Adv. Format)
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA2437330
LU WWN Device Id: 5 0014ee 205473c89
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2,00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun May 27 18:16:09 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (37500) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 255) minutes.
Conveyance self-test routine
recommended polling time:    (   5) minutes.
SCT capabilities:          (0x3035) SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always  -       0
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always  -       1233
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always  -       390
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always  -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always  -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always  -       4988
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always  -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always  -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always  -       388
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always  -       33
193 Load_Cycle_Count        0x0032   135   135   000    Old_age   Always  -       197801
194 Temperature_Celsius     0x0022   119   109   000    Old_age   Always  -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always  -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always  -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline -       0
199 UDMA_CRC_Error_Count    0x0032   200   199   000    Old_age   Always  -       451
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And another one with some "real" errors, which is nevertheless still alive after several months of minor complaints:

#> sudo smartctl --all /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.0.6-gentoo-goomba-test-3] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Maxtor DiamondMax 20
Device Model:     MAXTOR STM3160211AS
Serial Number:    6PT56QN7
Firmware Version: 3.AAE
User Capacity:    160,041,885,696 bytes [160 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun May 27 18:33:59 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  430) seconds.
Offline data collection
capabilities:            (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   1) minutes.
Extended self-test routine
recommended polling time:    (  54) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   102   087   006    Pre-fail  Always       -       4542948
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       1011
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       11
  7 Seek_Error_Rate         0x000f   089   060   030    Pre-fail  Always       -       846828717
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       13126
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       1019
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   064   052   045    Old_age   Always       -       36 (Min/Max 22/37)
194 Temperature_Celsius     0x0022   036   048   000    Old_age   Always       -       36 (0 14 0 0 0)
195 Hardware_ECC_Recovered  0x001a   050   046   000    Old_age   Always       -       11583613
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   169   000    Old_age   Always       -       48
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 204 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 204 occurred at disk power-on lifetime: 5852 hours (243 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 2d 72 00 00 e0  Error: ICRC, ABRT 45 sectors at LBA = 0x00000072 = 114

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 3e 61 00 00 e0 00      00:01:52.203  READ DMA
  27 00 00 00 00 00 e0 00      00:01:52.133  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:01:52.125  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00      00:01:52.104  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:01:46.941  READ NATIVE MAX ADDRESS EXT

Error 203 occurred at disk power-on lifetime: 5852 hours (243 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 2d 72 00 00 e0  Error: ICRC, ABRT 45 sectors at LBA = 0x00000072 = 114

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 3e 61 00 00 e0 00      00:01:45.519  READ DMA
  c8 00 02 5f 00 00 e0 00      00:01:45.511  READ DMA
  27 00 00 00 00 00 e0 00      00:01:45.503  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:01:45.431  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00      00:01:45.423  SET FEATURES [Set transfer mode]

Error 202 occurred at disk power-on lifetime: 5852 hours (243 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 60 00 00 e0  Error: ICRC, ABRT at LBA = 0x00000060 = 96

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 02 5f 00 00 e0 00      00:01:45.519  READ DMA
  27 00 00 00 00 00 e0 00      00:01:45.511  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:01:45.503  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00      00:01:45.431  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:01:45.423  READ NATIVE MAX ADDRESS EXT

Error 201 occurred at disk power-on lifetime: 5852 hours (243 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 60 00 00 e0  Error: ICRC, ABRT at LBA = 0x00000060 = 96

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 02 5f 00 00 e0 00      00:01:44.035  READ DMA
  25 00 08 af 8a a1 e0 00      00:01:43.980  READ DMA EXT
  27 00 00 00 00 00 e0 00      00:01:43.972  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:01:43.968  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00      00:01:43.904  SET FEATURES [Set transfer mode]

Error 200 occurred at disk power-on lifetime: 5852 hours (243 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 b6 8a a1 e0  Error: ICRC, ABRT at LBA = 0x00a18ab6 = 10586806

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 08 af 8a a1 e0 00      00:01:44.035  READ DMA EXT
  25 00 06 41 8a a1 e0 00      00:01:43.980  READ DMA EXT
  27 00 00 00 00 00 e0 00      00:01:43.972  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:01:43.968  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00      00:01:43.904  SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
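If you don't want to eyeball these tables, the comparison can be done mechanically. smartctl considers an attribute failed when its normalized VALUE is at or below THRESH; on top of that, I'd keep an eye on the raw counts of attributes 5, 197 and 198 (reallocated, pending and offline-uncorrectable sectors), which is my own rule of thumb and not something smartctl enforces. Here is a rough Python sketch that parses the attribute table from `smartctl --all` output; the names `parse_attributes` and `summarize` are made up for this example.

```python
import re

# One attribute row: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
ATTR_LINE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Rule-of-thumb watch list (an assumption, not a smartctl rule).
WATCHED_IDS = {5, 197, 198}


def parse_attributes(smartctl_text):
    """Return the SMART attribute rows found in smartctl output."""
    attrs = []
    for line in smartctl_text.splitlines():
        match = ATTR_LINE.match(line)
        if match:
            attrs.append({
                "id": int(match.group(1)),
                "name": match.group(2),
                "value": int(match.group(3)),
                "worst": int(match.group(4)),
                "thresh": int(match.group(5)),
                # Only the leading integer of RAW_VALUE is kept.
                "raw": int(match.group(9)),
            })
    return attrs


def summarize(attrs):
    """Return (failed attribute names, watched attributes with non-zero raw)."""
    failed = [a["name"] for a in attrs if a["value"] <= a["thresh"]]
    nonzero = [(a["name"], a["raw"]) for a in attrs
               if a["id"] in WATCHED_IDS and a["raw"] > 0]
    return failed, nonzero
```

Run on the three outputs above, no attribute is at or below its threshold; the only watched counter that is non-zero is Reallocated_Sector_Ct on the Maxtor, with a raw value of 11.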

As for the read speed benchmark, I think that a speed decrease over time is normal. I suspect that caching mechanisms make the disk appear faster in the initial phase of testing and slower at the end. However, I see that your "worst" read speed is around 80 MB/s, which is well above my "best average" read speed (around 60 MB/s), so I wouldn't worry about this aspect.