wdp
Joined: 09 Sep 2006 Posts: 6 Location: Bremen
Posted: Sat Jan 12, 2008 4:32 am Post subject: haaaard.. disc |
Hey,
I found the following lines in my log files:
Code: |
Jan 12 01:12:07 irulan smartd[1815]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 64 to 65
Jan 12 01:12:07 irulan smartd[1815]: Device: /dev/hda, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 64 to 65
Jan 12 02:12:07 irulan smartd[1815]: Device: /dev/hdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 58 to 57
Jan 12 02:12:07 irulan smartd[1815]: Device: /dev/hdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 57
Jan 12 03:12:07 irulan smartd[1815]: Device: /dev/hda, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 65 to 64
|
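To track how often these values move and in which direction, the smartd messages can be pulled apart with a small script. This is only a minimal sketch in Python: the regular expression is written against the exact message wording shown above, and the function name `parse_smartd_line` is made up for illustration.

```python
import re

# Pattern written against the smartd "changed from X to Y" lines quoted above.
LINE_RE = re.compile(
    r"Device: (?P<dev>[^,\s]+), SMART (?P<kind>Prefailure|Usage) Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)

def parse_smartd_line(line):
    """Return a dict describing the attribute change, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    old, new = int(m.group("old")), int(m.group("new"))
    return {
        "device": m.group("dev"),
        "kind": m.group("kind"),
        "id": int(m.group("id")),
        "name": m.group("name"),
        "old": old,
        "new": new,
        "delta": new - old,  # negative means the normalized value dropped
    }

sample = (
    "Jan 12 02:12:07 irulan smartd[1815]: Device: /dev/hdc, SMART Prefailure "
    "Attribute: 1 Raw_Read_Error_Rate changed from 58 to 57"
)
change = parse_smartd_line(sample)
print(change)
```

The numbers in these messages are the normalized attribute values (the VALUE column of smartctl -A), not the raw counters.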
I tried to find out whether that's good or bad. At the moment I'm still not sure; maybe one of you can help?
smartctl -A shows:
hda
Code: |
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 065 051 006 Pre-fail Always - 115551025
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 55
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 079 060 030 Pre-fail Always - 17578685589
9 Power_On_Hours 0x0032 090 090 000 Old_age Always - 9156
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 64
194 Temperature_Celsius 0x0022 032 053 000 Old_age Always - 32
195 Hardware_ECC_Recovered 0x001a 065 051 000 Old_age Always - 115551025
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 188 000 Old_age Always - 74
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0
|
hdc
Code: |
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 058 055 006 Pre-fail Always - 134704791
3 Spin_Up_Time 0x0003 099 098 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 7
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 329042277
9 Power_On_Hours 0x0032 089 089 000 Old_age Always - 10253
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 44
194 Temperature_Celsius 0x0022 033 055 000 Old_age Always - 33
195 Hardware_ECC_Recovered 0x001a 058 054 000 Old_age Always - 134704791
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0
|
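One way to read these tables: smartctl considers an attribute failed when its normalized VALUE is less than or equal to THRESH, while the RAW_VALUE column is vendor-specific. The sketch below (Python, with the hypothetical helper names `parse_attribute_row` and `margin`) splits a row into the header fields and reports how far VALUE is above THRESH. The two sample rows are copied from the hda table above.

```python
def parse_attribute_row(row):
    """Split one `smartctl -A` row into the fields named in the table header."""
    f = row.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "flag": f[2],
        "value": int(f[3]),   # normalized value, e.g. "065" -> 65
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "type": f[6],
        "updated": f[7],
        "when_failed": f[8],
        "raw": int(f[9]),
    }

def margin(attr):
    """Distance between the normalized value and its failure threshold."""
    return attr["value"] - attr["thresh"]

def is_failing(attr):
    """An attribute counts as failed when VALUE <= THRESH."""
    return attr["value"] <= attr["thresh"]

rows = [
    "1 Raw_Read_Error_Rate 0x000f 065 051 006 Pre-fail Always - 115551025",
    "199 UDMA_CRC_Error_Count 0x003e 200 188 000 Old_age Always - 74",
]
attrs = [parse_attribute_row(r) for r in rows]
for a in attrs:
    print(a["name"], "margin:", margin(a), "failing:", is_failing(a))
```

By this measure none of the attributes listed for hda or hdc is at or below its threshold.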
Both hard disks are the same model; here's hdparm -i:
Code: |
Model=ST380011A, FwRev=3.06, SerialNo=3JV822FQ
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=2048kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=156299375
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 *udma4 udma5
AdvancedPM=no WriteCache=enabled
Drive conforms to: ATA/ATAPI-6 T13 1410D revision 2: ATA/ATAPI-1,2,3,4,5,6
|
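In the hdparm -i output the asterisk marks the transfer mode currently in use, so the line above says the drive supports up to udma5 but is running in udma4. A minimal Python sketch for extracting both facts from such a line follows; the function names `active_mode` and `highest_mode` are illustrative only.

```python
def split_modes(modes_line):
    """Return the mode tokens after the colon, e.g. ['udma0', ..., '*udma4', 'udma5']."""
    _label, _sep, modes = modes_line.partition(":")
    return modes.split()

def active_mode(modes_line):
    """Return the mode flagged with '*', or None if no mode is flagged."""
    for token in split_modes(modes_line):
        if token.startswith("*"):
            return token[1:]
    return None

def highest_mode(modes_line):
    """Return the last mode listed, i.e. the fastest one the drive reports."""
    tokens = split_modes(modes_line)
    return tokens[-1].lstrip("*") if tokens else None

line = "UDMA modes: udma0 udma1 udma2 udma3 *udma4 udma5"
print(active_mode(line), highest_mode(line))
```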
Why udma4 if it could run at udma5? In fact, hdc is running in udma5 while hda is running in udma4. I don't know why. My syslog shows the following:
Code: |
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
Switched to high resolution mode on CPU 0
Switched to high resolution mode on CPU 1
hda: ST380011A, ATA DISK drive
hda: selected mode 0x45
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: ST380011A, ATA DISK drive
hdc: selected mode 0x45
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 512KiB
hda: Host Protected Area detected.
current capacity is 156299375 sectors (80025 MB)
native capacity is 156301488 sectors (80026 MB)
hda: Host Protected Area disabled.
hda: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(100)
hda: cache flushes supported
hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 >
hdc: max request size: 512KiB
hdc: Host Protected Area detected.
current capacity is 156299375 sectors (80025 MB)
native capacity is 156301488 sectors (80026 MB)
hdc: Host Protected Area disabled.
hdc: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(100)
hdc: cache flushes supported
hdc: hdc1 hdc2 hdc3 hdc4 < hdc5 hdc6 hdc7 >
|
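Both drives report "selected mode 0x45" at probe time. Assuming that byte follows the usual ATA transfer-mode encoding (the upper bits select the mode family, the low three bits the mode number), 0x45 would be udma5, which fits the UDMA(100) shown for both drives. The Python sketch below decodes it under that assumption; `decode_xfer_mode` is a made-up helper, not a kernel or hdparm function.

```python
def decode_xfer_mode(mode):
    """Decode a transfer-mode byte, assuming the ATA SET FEATURES encoding."""
    number = mode & 0x07
    if mode & 0x40:
        return f"udma{number}"
    if mode & 0x20:
        return f"mdma{number}"
    if mode & 0x10:
        return f"sdma{number}"
    if mode & 0x08:
        return f"pio{number}"
    return "pio-default"

print(hex(0x45), "->", decode_xfer_mode(0x45))
print(hex(0x44), "->", decode_xfer_mode(0x44))
```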
No failure so far. After turning the RAID on and mounting it, I get:
Code: |
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 292k freed
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
ide0: reset: success
EXT3 FS on hda7, internal journal
|
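The hex values in these dma_intr lines are bit masks, and the names in braces are the bits that are set. As a minimal sketch in Python, the tables below contain only the bit names that actually appear in the log above (0x40 + 0x10 + 0x01 = 0x51 and 0x04 + 0x80 = 0x84); any other bit is reported as unknown instead of being guessed.

```python
# Bit names taken only from the log lines above; other bits are left unnamed.
STATUS_NAMES = {0x40: "DriveReady", 0x10: "SeekComplete", 0x01: "Error"}
ERROR_NAMES = {0x04: "DriveStatusError", 0x80: "BadCRC"}

def decode(value, names):
    """List the names of all set bits; report unnamed set bits as unknown."""
    found = [name for bit, name in names.items() if value & bit]
    leftover = value & ~sum(names)  # bits that are set but have no name here
    if leftover:
        found.append(f"unknown(0x{leftover:02x})")
    return found

print("status=0x51", decode(0x51, STATUS_NAMES))
print("error=0x84", decode(0x84, ERROR_NAMES))
```

BadCRC only shows up for hda, the same drive whose UDMA_CRC_Error_Count raw value is 74, while hdc shows 0.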
So as you can see, it starts with udma5. While or just before mounting hda7, it hits the BadCRC errors, resets ide0, and switches to udma4. If I try to change that manually with hdparm, I get the same errors again in dmesg and it automatically falls back to udma4.
The hard disks are running pretty fine, anyway. If I run e2fsck (read-only) on them, I get the following:
root partition - hda7
Code: |
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 1224228 has zero dtime. Fix? no
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(2469935--2469937) -(2470117--2470118)
Fix? no
Inode bitmap differences: -1224228
Fix? no
|
/usr - hda5
Code: |
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 194639 has zero dtime. Fix? no
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(418059--418096)
Fix? no
Inode bitmap differences: -194639
Fix? no
|
on hdc: hdc3
Code: |
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (817363, counted=817362).
Fix? no
|
hdc7 (not in use atm)
Code: |
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (619015, counted=1070784).
Fix? no
Free inodes count wrong (521339, counted=611636).
Fix? no
|
I checked the RAID device, too; there are no filesystem errors. So I'm not seeing anything 'critical'.
Does anyone have an idea about this? I mean: is the hard disk broken, should I replace it, will it die in a few weeks (I have had these errors for a while now and nothing has happened), or is this behavior correctable and maybe chipset/kernel related? ANY ideas?