
Command Timeout: Indicates a number of aborted operations due to hard disk timeout.
This is a critical parameter.
It would be a good idea to connect the drive to your computer and test it with SeaTools: http://www.seagate.com/support/downloads/seatools/

This is the list of known S.M.A.R.T. attributes supported by IDE and Serial ATA hard disks.
Note: some manufacturers may use the attributes for different purposes also.
Attributes not listed here are "vendor specific" attributes (their purpose is not known).

Raw Read Error Rate - Errors occurred while reading raw data from a disk. Indicates a problem with the disk surface or the read/write heads. Critical attribute.
Throughput Performance - General throughput performance of the hard disk. Indicates a problem with motor, servo or bearings.
Spin Up Time - Time needed by spindle to spin up to full RPM. Indicates a problem with motor or bearings. Critical attribute.
Start/Stop Count - Count of start/stop cycles of spindle. This value does not directly affect the condition of the drive.
Reallocated Sector Count (Reallocated Sectors Count) - Count of sectors moved to the spare area. Indicates a problem with the disk surface or the read/write heads. Critical attribute.
Command Timeout - Indicates a number of aborted operations due to hard disk timeout. Critical attribute.
Read Channel Margin - Margin of a channel while reading data. The exact function of this attribute is not specified.
Seek Error Rate - Rate of positioning errors of the read/write heads. Indicates a problem with servo or head. High temperature can also cause this problem. Critical attribute.
Seek Time Performance - Average time of seek operations of the heads. Indicates a problem with servo. Critical attribute.
Power-On Time Count - Total time the drive is powered on. The unit of the measure depends on the manufacturer.
Spin Retry Count - Retry count of spin start attempts. Indicates a problem with motor, bearings or power supply. Critical attribute.
Drive Calibration Retry Count - Number of attempts to calibrate a drive. Indicates a problem with motor, bearings or power supply.
Drive Power Cycle Count - Number of complete power on/off cycles. This value does not directly affect the condition of the drive.
Soft Read Error Rate - Number of software read errors. The number of uncorrectable read errors.
Airflow Temperature - Airflow temperature. The temperature of the air inside the hard disk housing.
Mechanical Shock - Count of problems caused by mechanical shock. Acceleration (for example falling) can cause mechanical shock.
Power off Retract Cycle - Count of power off cycles. This value does not directly affect the condition of the drive.
Load/Unload Cycle Count - Count of load/unload cycles. Number of cycles the head moved into landing zone position.
HDD Temperature - Disk temperature. The temperature inside the hard disk housing.
Hardware ECC Recovered - Count of correctable errors. Number of errors corrected by the internal error correcting mechanism.
Reallocation Event Count - Count of sector remap operations. Number of all (successful and failed) remap operations. Critical attribute.
Current Pending Sector Count - Count of unstable sectors. These pending sectors may be remapped to the spare area. Critical attribute.
Off-Line Uncorrectable Sector Count - Count of uncorrectable errors when reading/writing. Indicates a problem with the disk surface or the read/write heads. Critical attribute.
Ultra ATA CRC Error Count - Count of errors during data transfer between disk and host. Indicates a problem with the power supply or data cable.
Write Error Rate - Errors occurred while writing raw data to a disk. Indicates a problem with the disk surface or the read/write heads.
Soft Read Error Rate - Number of software read errors. The number of uncorrectable read errors.
Data Address Mark Errors - Number of data address mark errors. Number of incorrect or invalid address marks.
Run Out Cancel - Number of data correction errors. Invalid error correction checksum found during error correction.
Soft ECC Correction - Number of corrected data errors. Errors corrected by the internal error correction mechanism.
Thermal Asperity Rate - Number of thermal problems. Total number of problems caused by high temperature.
Flying Height - Head flying height. The height of the disk heads above the disk surface.
Spin High Current - Current value during spin up. The current needed to spin up the drive.
Spin Buzz - Number of cycles needed to spin up. The number of retries during spin up because of low current available.
Offline Seek Performance - Drive performance during offline operations. The seek performance of the drive during internal self tests.
Disk Shift - Distance the disk has shifted relative to the spindle. Incorrect disk spin can be caused by mechanical shock or high temperature.
G-Sense Error Rate - Number of mechanical errors. Number of errors resulting from shock or vibration.
Loaded Hours - Number of powered on hours. This value is constantly increasing (once per every hour).
Load/Unload Retry Count - Number of load/unload operations. The number of times the drive head enters/leaves the data zone.
Load Friction - Mechanical friction rate. The rate of friction between mechanical parts. Indicates a problem with the mechanical subsystem of the drive.
Load-in Time - Total time the heads are loaded. The time while the read/write heads are in the data zone.
Torque Amplification Count - Rate of torque increase. Torque increase during the spin up operation of the hard disk.
Power-off Retract Count - Number of power off cycles. The number of times the head was retracted as a result of power loss.
GMR Head Amplitude - Head positioning amplitude. Head moving distances between operations.
Hard Disk Temperature - Disk temperature. The temperature inside the hard disk housing.
Head Flying Hours - Number of head positioning hours. Time spent during the positioning of the drive heads.
Read Error Retry Rate - Number of retries during read operations. Number of errors found during reading a sector from disk surface.

I have the same configuration, a NV+v2 with 4 3TB ST3000DM001-9YN166 disks upgraded to firmware CCH4 (the latest).
And the identical error message:

Detected increasing command timeouts[524296] on disk 1 [ST3000DM001-9YN166, S1F03ANB]. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.

I just checked all the disks with SeaTools (two days ago, after the upgrade to the new firmware) and there were no problems.

what command timeout count do you see in the smart stats?

intothevoid

Aspirant

Oct 01, 2012

[Solved] Seagate ST3000DM001-9YN166: Load Cycle Count

Hi,

sorry if this has been discussed already, all i could find are topics about upgrading the firmware on these drives.

I've got a NV+v2 with 4 3TB ST3000DM001-9YN166 disks in it, firmware CCH4 (the latest).
System has been running for a couple of months now, and all was working smoothly until i got a warning 2 weeks ago:

Detected increasing command timeouts[65537] on disk 1 [ST3000DM001-9YN166, W1F083HJ]. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.

Looking into the SMART status, it indeed gives some ridiculously high number (4295032833) on command time out for disk 1. On the other 3 disks it's 0.
Another thing i noticed in the SMART status for all disks though, is that the load cycle count is quite high: around 43000 after 1800 power on hours.
After a bit of googling however, all i could find was this problem cropping up on WD green drives. The solution seems to be to set the idle time a bit higher.
I did find this old topic (http://www.readynas.com/forum/viewtopic.php?f=36&t=51536) explaining how to do it on some other seagate model, but i wanted to get some feedback before i attempt to change things via ssh. Don't wanna corrupt anything.
So basically my questions are:

1) Is the command time out problem in any way related to the lcc problem, or is this drive just about to die no matter what?
2) Does anybody else have these high lcc numbers with the ST3000DM001 drives?
3) Is it worthwhile changing the idle time setting with hdparm?
3b) If yes, are the instructions in that old topic still valid?

On a related note, i did install the ssh add-on and looked around a little, however when i try to get drive status with hdparm i get this:

root@nas:~# hdparm -i /dev/sda

/dev/sda:
 HDIO_GET_IDENTITY failed: Inappropriate ioctl for device

Same for sdb, sdc, and sdd
Am i using the right identifiers here? Sorry for the ignorance :oops:

Hope this isn't all too confusing, i tried to cram a lot of information in this post :-)
Thanks in advance for any help!

EDIT: apparently i do not have permission to use the url tag..
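
To put the load cycle figures from the post above in perspective, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, and the 300,000-cycle figure is only a placeholder assumption for a drive's rated load/unload cycles, not a value taken from this thread or from a Seagate datasheet.

```python
import math


def load_cycle_rate(load_cycles: int, power_on_hours: int) -> float:
    """Average head load/unload cycles per power-on hour."""
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return load_cycles / power_on_hours


def hours_until(limit_cycles: int, load_cycles: int, power_on_hours: int) -> float:
    """Power-on hours left before limit_cycles is reached at the current rate."""
    rate = load_cycle_rate(load_cycles, power_on_hours)
    if rate == 0:
        return math.inf
    remaining = max(limit_cycles - load_cycles, 0)
    return remaining / rate


# Figures reported in the post: about 43000 cycles after 1800 power-on hours.
rate = load_cycle_rate(43000, 1800)
print(f"{rate:.1f} cycles/hour, i.e. one every {60 / rate:.1f} minutes")

# ASSUMPTION: 300000 is a hypothetical rated cycle count, used only as an example.
print(f"{hours_until(300000, 43000, 1800):.0f} power-on hours to reach 300000 cycles")
```

With the numbers from the post this works out to roughly 24 cycles per hour, or one head park about every two and a half minutes.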

50 Replies


StephenB
Guru - Experienced User
Oct 20, 2012
HERBIEO wrote:
Do you think there could be a bug in this firmware version that is reporting errors that are not really happening?
Because looking at the SMART info there are no errors.
It looks that way. To confirm it you would need to remove the disk (NAS powered down first) and check the SMART stats on another device (for instance a PC).
spett
Aspirant
Oct 21, 2012
Got a similar problem myself with the same disks and NV+ V2.
Got 30065229831 Command Timeouts on Disk 1 after a disk test performed through the boot menu. Powered down the NAS, removed Disk 1 and ran SeaTools on it without finding any faults. Put the disk back in and ran a new disk test on the NAS; when that finished the command timeouts had doubled to 60130459662. Incredibly high number! Had a spare disk lying around, brand new. Removed Disk 1 once again and resynced to the brand new disk. Ran a new disk test with the same result as before, 30065229831 Command Timeouts on Disk 1.

What is happening here? Running RAIDiator 5.3.6

StephenB

Guru - Experienced User

Oct 21, 2012

spett wrote:
Got a similar problem myself with the same disks and NV+ V2.
Got 30065229831 Command Timeouts on Disk 1 after a disk test performed through the boot menu. Powered down the NAS, removed Disk 1 and ran SeaTools on it without finding any faults. Put the disk back in and ran a new disk test on the NAS; when that finished the command timeouts had doubled to 60130459662. Incredibly high number! Had a spare disk lying around, brand new. Removed Disk 1 once again and resynced to the brand new disk. Ran a new disk test with the same result as before, 30065229831 Command Timeouts on Disk 1.

What is happening here? Running RAIDiator 5.3.6

I don't believe the 30065229831 (or 60130459662) number, but there still might be something actually wrong in the SMART stats. Are you seeing this in email alerts/log entries? If so you should confirm by looking at the actual SMART statistics.

You might also try putting the drive into the PC again and looking at the SMART stats there (Seatools won't show them to you, but other tools will. Acronis Drive Monitor is one of several).
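
One possible reading of these huge numbers, offered as an assumption rather than anything confirmed in this thread: SMART raw values are 48 bits wide, and some tools display Seagate's Command Timeout raw value as three separate 16-bit words instead of one big integer. The sketch below (the function name is made up for illustration) splits the values quoted in this thread that way.

```python
from typing import Tuple


def split_raw48(raw: int) -> Tuple[int, int, int]:
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    if not 0 <= raw < (1 << 48):
        raise ValueError("raw value does not fit in 48 bits")
    return ((raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF)


# Values quoted in the alerts and SMART readings in this thread.
for raw in (65537, 524296, 4295032833, 30065229831, 60130459662):
    print(raw, "->", split_raw48(raw))
```

Read this way, 30065229831 is (7, 7, 7) and 60130459662 is (14, 14, 14), so the "doubling" would be small counters going from 7 to 14 rather than billions of timeouts. What each of the three words actually counts is not established here.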

spett

Aspirant

Oct 21, 2012

StephenB wrote:
spett wrote:
Got a similar problem myself with the same disks and NV+ V2.
Got 30065229831 Command Timeouts on Disk 1 after a disk test performed through the boot menu. Powered down the NAS, removed Disk 1 and ran SeaTools on it without finding any faults. Put the disk back in and ran a new disk test on the NAS; when that finished the command timeouts had doubled to 60130459662. Incredibly high number! Had a spare disk lying around, brand new. Removed Disk 1 once again and resynced to the brand new disk. Ran a new disk test with the same result as before, 30065229831 Command Timeouts on Disk 1.

What is happening here? Running RAIDiator 5.3.6
I don't believe the 30065229831 (or 60130459662) number, but there still might be something actually wrong in the SMART stats. Are you seeing this in email alerts/log entries? If so you should confirm by looking at the actual SMART statistics.

You might also try putting the drive into the PC again and looking at the SMART stats there (Seatools won't show them to you, but other tools will. Acronis Drive Monitor is one of several).

I have tried putting the drive in my computer, same number is saved in the smart stats in the drive..

StephenB
Guru - Experienced User
Oct 22, 2012
spett wrote:
I have tried putting the drive in my computer, same number is saved in the smart stats in the drive..
Since the drive itself creates/maintains the SMART stats, that means that (a) either the drive really is seeing that many Command Timeouts, or (b) there is a bug in the drive's firmware.

If you purchased these drives yourself, you probably will need to contact Seagate support.

What firmware is the drive running? Note you can check at Seagate to see if there is a firmware update for your drives (you can enter the drive model and its serial number, and their on-line tool should tell you).

spett

Aspirant

Oct 22, 2012

StephenB wrote:
spett wrote:
I have tried putting the drive in my computer, same number is saved in the smart stats in the drive..
Since the drive itself creates/maintains the SMART stats, that means that (a) either the drive really is seeing that many Command Timeouts, or (b) there is a bug in the drive's firmware.

If you purchased these drives yourself, you probably will need to contact Seagate support.

What firmware is the drive running? Note you can check at Seagate to see if there is a firmware update for your drives (you can enter the drive model and its serial number, and their on-line tool should tell you).

I upgraded them to firmware version CC4H because of a note in the hardware compatibility list for the NV+ V2, but I notice that this comment has been removed now.
But the weird thing is that the command timeouts only increase when doing a disk test through the boot menu. Did this test because I had frequent lock-ups using sabnzbd.

Did not find the online check at seagate.com, do you have a link?

HERBIEO
Aspirant
Oct 22, 2012
Online check https://apps1.seagate.com/downloads/request.html
but i think you are already running the latest firmware for those drives. Still worth a check.
spett
Aspirant
Oct 23, 2012
Found out it does have something to do with the NV+ V2. Since I now had a drive with a high command timeout count as a spare drive, I pulled out drive 2, which had 0 command timeouts, and resynced with the spare drive. Ran a new disk test and once again the command timeouts of drive 1 increased by 30065229831. Drive 2 still has the same amount as before the test.

Googling command timeouts, I get that this is probably because of a faulty power supply or corroded contacts.

What do you think? Is this a software fault or is my NAS faulty?
StephenB
Guru - Experienced User
Oct 23, 2012
The problem appears isolated to the disk test. So that would seem to rule out the supply and the contacts - they wouldn't mysteriously heal when the test completes.

It is conceivable that the disk test includes some commands that provoke the drive to fail. For instance, malformed/illegal commands, or perhaps legal but unusual commands that the drive firmware doesn't handle properly.

I'd suggest a support case.

In the meantime I would simply not run that test! And record/track the current counts and confirm that they never increase in normal operation.
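
The tracking suggested above can be done by hand, but a small comparison helper is one way to sketch it. Everything here is hypothetical: the function name, the dictionary layout, and the disk labels are illustrative, and how the readings are collected (front-end SMART page, email alerts, or another tool) is left open.

```python
from typing import Dict


def increased_counts(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Return {disk: delta} for each disk whose count rose since the previous reading."""
    return {
        disk: count - previous[disk]
        for disk, count in current.items()
        if disk in previous and count > previous[disk]
    }


# Example readings using the Command Timeout values spett reported around a disk test.
before = {"disk1": 30065229831, "disk2": 0}
after = {"disk1": 60130459662, "disk2": 0}

print(increased_counts(before, after))  # only disk1 changed
```

An empty result between two readings taken during normal operation would support the idea that the counts only move during the disk test.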

intothevoid

Aspirant

Oct 23, 2012

The problem appears isolated to the disk test. So that would seem to rule out the supply and the contacts - they wouldn't mysteriously heal when the test completes.

It is conceivable that the disk test includes some commands that provoke the drive to fail. For instance, malformed/illegal commands, or perhaps legal but unusual commands that the drive firmware doesn't handle properly.

I'd suggest a support case.

In the meantime I would simply not run that test! And record/track the current counts and confirm that they never increase in normal operation.

Is that test run automatically at boot, or on a schedule? I've never manually triggered it.
Both times my drive 'failed' (and had a higher command timeout value afterwards) the NAS was in normal operation.

EDIT: Included quote for top of page clarity.
