Command Timeout: indicates the number of operations aborted due to hard disk timeout.
This is a critical parameter.
It would be a good idea to connect the drive to your computer and test it with SeaTools: http://www.seagate.com/support/downloads/seatools/

This is the list of known S.M.A.R.T. attributes supported by IDE and Serial ATA hard disks.
Note: some manufacturers may also use the attributes for different purposes.
Attributes not listed here are "vendor specific" attributes (their purpose is not known).

Raw Read Error Rate - Errors that occurred while reading raw data from a disk. Indicates a problem with the disk surface or the read/write heads. Critical attribute.
Throughput Performance - General throughput performance of the hard disk. Indicates a problem with the motor, servo or bearings.
Spin Up Time - Time needed by the spindle to spin up to full RPM. Indicates a problem with the motor or bearings. Critical attribute.
Start/Stop Count - Count of start/stop cycles of the spindle. This value does not directly affect the condition of the drive.
Reallocated Sector Count (Reallocated Sectors Count) - Count of sectors moved to the spare area. Indicates a problem with the disk surface or the read/write heads. Critical attribute.
Command Timeout - Number of operations aborted due to hard disk timeout. Critical attribute.
Read Channel Margin - Margin of a channel while reading data. The exact function of this attribute is not specified.
Seek Error Rate - Rate of positioning errors of the read/write heads. Indicates a problem with the servo or heads. High temperature can also cause this problem. Critical attribute.
Seek Time Performance - Average time of seek operations of the heads. Indicates a problem with the servo. Critical attribute.
Power-On Time Count - Total time the drive is powered on. The unit of measure depends on the manufacturer.
Spin Retry Count - Retry count of spin start attempts. Indicates a problem with the motor, bearings or power supply. Critical attribute.
Drive Calibration Retry Count - Number of attempts to calibrate a drive. Indicates a problem with the motor, bearings or power supply.
Drive Power Cycle Count - Number of complete power on/off cycles. This value does not directly affect the condition of the drive.
Soft Read Error Rate - Number of software read errors. The number of uncorrectable read errors.
Airflow Temperature - Airflow temperature. The temperature of the air inside the hard disk housing.
Mechanical Shock - Count of problems caused by mechanical shock. Acceleration (for example falling) can cause mechanical shock.
Power off Retract Cycle - Count of power off cycles. This value does not directly affect the condition of the drive.
Load/Unload Cycle Count - Count of load/unload cycles. Number of cycles the head moved into the landing zone position.
HDD Temperature - Disk temperature. The temperature inside the hard disk housing.
Hardware ECC Recovered - Count of correctable errors. Number of errors corrected by the internal error correcting mechanism.
Reallocation Event Count - Count of sector remap operations. Number of all (successful and failed) remap operations. Critical attribute.
Current Pending Sector Count - Count of unstable sectors. These pending sectors may be remapped to the spare area. Critical attribute.
Off-Line Uncorrectable Sector Count - Count of uncorrectable errors when reading/writing. Indicates a problem with the disk surface or the read/write heads. Critical attribute.
Ultra ATA CRC Error Count - Count of errors during data transfer between disk and host. Indicates a problem with the power supply or data cable.
Write Error Rate - Errors that occurred while writing raw data to a disk. Indicates a problem with the disk surface or the read/write heads.
Soft Read Error Rate - Number of software read errors. The number of uncorrectable read errors.
Data Address Mark Errors - Number of data address mark errors. Number of incorrect or invalid address marks.
Run Out Cancel - Number of data correction errors. Invalid error correction checksum found during error correction.
Soft ECC Correction - Number of corrected data errors. Errors corrected by the internal error correction mechanism.
Thermal Asperity Rate - Number of thermal problems. Total number of problems caused by high temperature.
Flying Height - Head flying height. The height of the disk heads above the disk surface.
Spin High Current - Current value during spin up. The current needed to spin up the drive.
Spin Buzz - Number of cycles needed to spin up. The number of retries during spin up because of low current available.
Offline Seek Performance - Drive performance during offline operations. The seek performance of the drive during internal self tests.
Disk Shift - Distance the disk has shifted relative to the spindle. Incorrect disk spin can be caused by mechanical shock or high temperature.
G-Sense Error Rate - Number of mechanical errors. Number of errors resulting from shock or vibration.
Loaded Hours - Number of powered on hours. This value is constantly increasing (once every hour).
Load/Unload Retry Count - Number of load/unload operations. The number of times the drive head enters/leaves the data zone.
Load Friction - Mechanical friction rate. The rate of friction between mechanical parts. Indicates a problem with the mechanical subsystem of the drive.
Load-in Time - Total time the heads are loaded. The time during which the read/write heads are in the data zone.
Torque Amplification Count - Rate of torque increase. Torque increase during the spin up operation of the hard disk.
Power-off Retract Count - Number of power off cycles. The number of times the head was retracted as a result of power loss.
GMR Head Amplitude - Head positioning amplitude. Head moving distances between operations.
Hard Disk Temperature - Disk temperature. The temperature inside the hard disk housing.
Head Flying Hours - Number of head positioning hours. Time spent positioning the drive heads.
Read Error Retry Rate - Number of retries during read operations. Number of errors found while reading a sector from the disk surface.

I have the same configuration: a NV+v2 with 4 3TB ST3000DM001-9YN166 disks, upgraded to firmware CCH4 (the latest).
And the identical error message:

Detected increasing command timeouts[524296] on disk 1 [ST3000DM001-9YN166, S1F03ANB]. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.

I just checked all the disks with SeaTools (two days ago, after the upgrade to the new firmware) and there were no problems.

what command timeout count do you see in the smart stats?

intothevoid

Aspirant

Oct 01, 2012

[Solved] Seagate ST3000DM001-9YN166: Load Cycle Count

Hi,

sorry if this has been discussed already, all i could find are topics about upgrading the firmware on these drives.

I've got a NV+v2 with 4 3TB ST3000DM001-9YN166 disks in it, firmware CCH4 (the latest).
System has been running for a couple of months now, and all was working smoothly until i got a warning 2 weeks ago:

Detected increasing command timeouts[65537] on disk 1 [ST3000DM001-9YN166, W1F083HJ]. This often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.

Looking into the SMART status, it indeed gives some ridiculously high number (4295032833) on command time out for disk 1. On the other 3 disks it's 0.
Another thing i noticed in the SMART status for all disks though, is that the load cycle count is quite high: around 43000 after 1800 power on hours.
After a bit of googling however, all i could find was this problem cropping up on WD green drives. The solution seems to be to set the idle time a bit higher.
I did find this old topic (http://www.readynas.com/forum/viewtopic.php?f=36&t=51536) explaining how to do it on some other seagate model, but i wanted to get some feedback before i attempt to change things via ssh. Don't wanna corrupt anything.
So basically my questions are:

1) Is the command time out problem in any way related to the lcc problem, or is this drive just about to die no matter what?
2) Does anybody else have these high lcc numbers with the ST3000DM001 drives?
3) Is it worthwhile changing the idle time setting with hdparm?
3b) If yes, are the instructions in that old topic still valid?

On a related note, i did install the ssh add-on and looked around a little, however when i try to get drive status with hdparm i get this:

root@nas:~# hdparm -i /dev/sda

/dev/sda:
 HDIO_GET_IDENTITY failed: Inappropriate ioctl for device

Same for sdb, sdc, and sdd
Am i using the right identifiers here? Sorry for the ignorance :oops:

Hope this isn't all too confusing, i tried to cram a lot of information in this post :-)
Thanks in advance for any help!

EDIT: apparently i do not have permission to use the url tag..

50 Replies

toomanybarts
Aspirant
Oct 27, 2012
Cynan - good info, that makes me feel a little better, but I stress the "little"! :)
intothevoid
Aspirant
Oct 27, 2012
seems a little strange,

binary 100010001 is 111 in hex
hex 4295032833 is 100001010010101000000110010100000110011 in binary

or am i doing something wrong? :-)
dsm1212
Apprentice
Oct 27, 2012
intothevoid wrote:
seems a little strange,

binary 100010001 is 111 in hex
hex 4295032833 is 100001010010101000000110010100000110011 in binary

or am i doing something wrong? :-)

4295032833 is a decimal number. Convert it to hex. You treated it like a hex number.
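
To make the base conversion concrete, here is a minimal Python sketch that prints the raw values quoted in this thread in hex. The helper name `to_hex` and the labels are purely illustrative; nothing here comes from the ReadyNAS firmware or any SMART tool.

```python
# Command-timeout values quoted in this thread, all given in decimal.
REPORTED = {
    "SMART raw value, disk 1": 4295032833,
    "alert in the first post": 65537,
    "alert in the later report": 524296,
}


def to_hex(value: int) -> str:
    """Format a non-negative decimal integer as uppercase hex."""
    return f"0x{value:X}"


for label, value in REPORTED.items():
    print(f"{label}: {value} = {to_hex(value)}")

# The mix-up above: the digit string "100010001" read as binary is
# hex 111, while the decimal number 4295032833 is hex 100010001.
print(to_hex(int("100010001", 2)))
```

Seen in hex, each value is the same small count repeated in 16-bit steps, which is much easier to spot than in decimal.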
dsm1212
Apprentice
Oct 27, 2012
Btw I think that many timeouts would take centuries. So either Seagate has a fw bug or the ReadyNAS is not polling the SMART info correctly. Does the drive show this SMART data in a different system? If so then it's Seagate's problem I would think. However Netgear could provide an option to not treat this as a failed drive to work around the bug.
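
A rough back-of-the-envelope check of the "centuries" remark, as a Python sketch. The 1 second per timeout is a made-up assumption for illustration only (real command timeouts are typically longer), and the 1800 power-on hours is the figure from the first post; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_needed(timeouts: int, seconds_per_timeout: float) -> float:
    """Years required to accumulate `timeouts` back-to-back timeouts."""
    return timeouts * seconds_per_timeout / SECONDS_PER_YEAR


def implied_rate(timeouts: int, power_on_hours: float) -> float:
    """Timeouts per second needed to reach the count in the given hours."""
    return timeouts / (power_on_hours * 3600)


RAW = 4295032833  # raw value reported for disk 1, read as a single count

# Assumption: every timeout costs only 1 second.
print(f"{years_needed(RAW, 1.0):.0f} years of continuous timeouts")
print(f"{implied_rate(RAW, 1800):.0f} timeouts per second over 1800 hours")
```

Even under that generous assumption the count needs more than a century, so the raw value cannot be a plain counter of timeouts.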

StephenB

Guru - Experienced User

Oct 27, 2012

dsm1212 wrote:
Btw I think that many timeouts would take centuries. So either Seagate has a fw bug or the ReadyNAS is not polling the SMART info correctly. Does the drive show this SMART data in a different system? If so then it's Seagate's problem I would think. However Netgear could provide an option to not treat this as a failed drive to work around the bug.

Based on the limited information available, it appears that with this drive Seagate is formatting this parameter in a proprietary way - packing multiple counts into the same parameter. Hence the 0x100010001 or 0x700070007 or 0xE000E000E values that have been reported here. Exactly why they are doing this is unknown - perhaps it is simply a firmware bug in the drive, or perhaps the three subfields are somewhat different (and can hold different values in some failure modes). Since these counts have been confirmed by some users on a PC (using other SMART query tools), that doesn't seem to be a ReadyNAS issue.

But there are some other users who are getting alerts on this parameter, but are seeing 0 in the SMART parameter. That seems more likely to be a ReadyNAS bug.
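
As a rough illustration of the packing idea, the sketch below splits a raw value into three 16-bit subfields. The 16-bit width and the count of three are assumptions inferred from the repeating pattern in the reported values, not a documented Seagate format, and `split_fields` is a hypothetical helper.

```python
def split_fields(raw: int, width: int = 16, count: int = 3) -> list:
    """Split `raw` into `count` unsigned fields of `width` bits each.

    Fields are returned most significant first. The layout is an
    assumption based on the values reported in this thread.
    """
    mask = (1 << width) - 1
    return [(raw >> (width * i)) & mask for i in reversed(range(count))]


# Raw values mentioned above.
for raw in (0x100010001, 0x700070007, 0xE000E000E):
    print(f"0x{raw:X} -> {split_fields(raw)}")

# The decimal numbers from the alerts decode the same way.
for raw in (4295032833, 65537, 524296):
    print(f"{raw} -> {split_fields(raw)}")
```

Read this way, 0x100010001 is three copies of 1 and 0xE000E000E is three copies of 14, which are believable timeout counts, unlike the full decimal number.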

grr1
Aspirant
Oct 29, 2012
I have two ST3000DM001-1CH166 in my Ultra4, and I am getting:

Detected increasing uncorrectable errors
Detected increasing pending sector count

On Disk 1, and a couple of days later the NAS considered the disk dead. When I test it with Seatools on a PC everything is 100% healthy. Nothing found at all.

Starting to wonder if this is related to this thread... even though it is a slightly different model drive and NAS.
StephenB
Guru - Experienced User
Oct 29, 2012
What are the SMART stats?
intothevoid
Aspirant
Jan 28, 2013
Hi all,

a quick follow up:

After my last post in this thread, 3 months ago, I decided to try one final thing: I disabled the 'disk spin-down after X minutes of inactivity' power setting. Annnnd: no problems ever since! All disks are fine and have stopped failing.
I waited a while before posting this since the original problem occurred only about once a month, but I think after 3 months we can conclude that this fixes things :-)

natepiet

Aspirant

May 08, 2014

grr wrote:
I have two ST3000DM001-1CH166 in my Ultra4, and I am getting:

Detected increasing uncorrectable errors
Detected increasing pending sector count

On Disk 1, and a couple of days later the NAS considered the disk dead. When I test it with Seatools on a PC everything is 100% healthy. Nothing found at all.

Starting to wonder if this is related to this thread... even though it is a slightly different model drive and NAS.

I have started having this same problem on my NAS 314 - OS 6.1.7.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thu May 8 2014 22:57:40
Disk: Detected increasing uncorrectable error count: [1592] on disk 3 (Internal) [ST3000DM001-1CH166, Z1F3910G]. This condition often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
Thu May 8 2014 22:57:34
Disk: Detected increasing pending sector count: [1592] on disk 3 (Internal) [ST3000DM001-1CH166, Z1F3910G] 37 times in the past 30 days. This condition often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
Thu May 8 2014 22:55:39
Disk: Detected increasing ATA error count: [87] on disk 3 (Internal) [ST3000DM001-1CH166, Z1F3910G] 12 times in the past 30 days. This condition often indicates an impending failure. Please be prepared to replace this disk to maintain data redundancy.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What ended up happening in the end? Did you just buy a new disk and replace it?

StephenB
Guru - Experienced User
May 09, 2014
I suggest testing the disk with seatools in a PC. If you have 1592 pending sectors, the disk should fail the diagnostic.

If you can't run seatools, RMA the disk anyway. There is a code for that in the RMA template Seagate uses.

(BTW, if the disk is new, you are better off exchanging it with the seller, as Seagate will not give you a new disk in return. If you do end up RMAing, look for an option to send you the new disk before you return the defective one. In the US that costs almost the same as shipping the disk yourself, and you get the replacement disk a lot faster).
