ReadyNASOS 6.9.0-RC1 and Kernel Plus app

givememynamebak · ‎2017-10-12

Is linux-image-extra included with this RC? Docker is now having a hard time installing because this is missing.

Do I need to install it manually?

Thank you!

givememynamebak · ‎2017-10-13

Upgraded to this version 6.9.0-RC1 last night on my Pro6 and tonight I am being slammed with impending disk failure warnings on drive 3. Just coincidence?

Detected increasing pending sector: count [144] on disk 3 (Internal) [Hitachi HDS5C4040ALE630, PL2321LAG8VWNJ] 11 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.

Detected increasing ATA error count: [90] on disk 3 (Internal) [Hitachi HDS5C4040ALE630, PL2321LAG8VWNJ] 16 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.

The ATA error count went from 65 at 8:43pm to 90 at 9:27pm (about 45 minutes). Seems like this may have been from the update last night?

StephenB · ‎2017-10-15

Counts on both errors increased several times over the past 30 days, so it seems unlikely to be RC-1.

The pending sector counts are particularly concerning - those are sectors that could not be read.

punchbuggy · ‎2017-10-15

Hi. Another possibility is that these errors were always occurring, but just not being logged until RC1.

I recall similar errors on one of my WD Red drives in a previous OS release which went away with a subsequent release of a new branch i.e. the opposite situation. I assumed that error handling was better in the newer release.

I'd be keen to see how this matter progresses. I don't want the errors back... 😉

givememynamebak · ‎2017-10-15

It only started the night I installed RC1. Until then, there were no errors on any drive, and I check regularly. I've got a new drive on the way. This is using the default X-RAID (RAID 5), so I presume it will recover okay. Should I shut it down until the new drive arrives?

punchbuggy · ‎2017-10-15

You have the same configuration as my RN314. Your drive hasn't failed, so you can replace the one reporting errors when the new one arrives. After that you can try connecting it to a PC and run manufacturer diagnostics on it - would be telling if it reports it as clean. I keep an external SATA drive enclosure for just such requirements.

When I was seeing the ATA errors regularly, I upgraded the unit to 6.8 (I think it was) and the errors stopped appearing. Either better device handling or condition logging changed - either way, the same drive configuration is still running (4 x WD Red disks).

givememynamebak · ‎2017-10-15

Weird. Okay, I'll run manufacturer diagnostics once I get the new drive. I've got a SATA enclosure too... Will post back the results here when I am done. Thanks for the info!

Cheers!

punchbuggy · ‎2017-10-15

And I'll apologise for misinformation now - just checked. When I was getting drive errors, they went away with 6.4.0 - that was 2015. I must be getting old... 😞

givememynamebak · ‎2017-10-15

No worries, we all are. 😀 Still worth doing... Zero errors reported until 6.9.0-RC1... my error counts are now at like 2000... All since RC1. Who knows, it may be coincidence... But we'll see when I get my new drive from Newegg.

mdgm-ntgr · ‎2017-10-15

Have a look at smart_history.log. That will show the history of the error count increases for key values. You can compare that with the firmware update history in initrd.log. You may well find the errors started increasing before the firmware update.
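A minimal sketch of that comparison in Python, assuming initrd.log update lines look like "[YYYY/MM/DD HH:MM:SS UTC] Updated from ReadyNASOS X to Y." (the format of the excerpt posted later in this thread). The helper names `parse_updates` and `firmware_at` are made up for illustration, and the moment passed in is assumed to already be in UTC:

```python
import re
from datetime import datetime, timezone

# Assumed line format, based on the initrd.log excerpt posted in this thread.
UPDATE_RE = re.compile(
    r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) UTC\] "
    r"Updated from ReadyNASOS (.+) to (.+)\.$"
)

def parse_updates(lines):
    """Return (utc_time, old_version, new_version) for each update line."""
    updates = []
    for line in lines:
        m = UPDATE_RE.match(line.strip())
        if m:
            when = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
            updates.append((when.replace(tzinfo=timezone.utc), m.group(2), m.group(3)))
    return updates

def firmware_at(updates, moment):
    """Version running at `moment` (UTC): target of the latest earlier update."""
    earlier = [u for u in updates if u[0] <= moment]
    return max(earlier, key=lambda u: u[0])[2] if earlier else None

sample = [
    "[2017/09/28 06:52:38 UTC] Updated from ReadyNASOS 6.8.1 (RC2) to 6.8.1 (ReadyNASOS).",
    "[2017/10/13 05:36:41 UTC] Updated from ReadyNASOS 6.8.1 (ReadyNASOS) to 6.9.0 (RC1).",
]
updates = parse_updates(sample)
print(firmware_at(updates, datetime(2017, 10, 1, tzinfo=timezone.utc)))
# prints: 6.8.1 (ReadyNASOS)
```

Feeding it the time of the first error-count increase from smart_history.log (converted to UTC first) shows which firmware was running when the counts began to climb.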

Note we don't send out alerts for every change, only when there is a significant change.

Is it just the current pending sector count and ATA errors that have been increasing?

givememynamebak · ‎2017-10-16

The smart_history.log shows errors starting 2017-10-13 18:56:14:

2017-03-05 07:44:46  Hitachi HDS5C4040ALE  PL2321LAG98SWJ        0             0             0               -1          -1            0             0                  0         
2017-10-13 18:56:14  Hitachi HDS5C4040ALE  PL2321LAG8VWNJ        11            13            0               -1          -1            32            0                  11

The initrd.log shows the last upgrade as 2017/10/13 05:36:41 UTC - 10AM Arizona time?

[2017/03/05 14:42:38 UTC] Factory default initiated by Frontview
[2017/03/05 14:42:40 UTC] Defaulting to X-RAID mode, RAID level 5, 6 disks
[2017/03/05 14:43:19 UTC] Factory default initiated on ReadyNASOS 6.7.0-T180 (Beta 3).
[2017/03/26 15:41:38 UTC] Updated from ReadyNASOS 6.7.0-T180 (Beta 3) to 6.7.0-T206 (Beta 4).
[2017/05/06 18:28:00 UTC] Updated from ReadyNASOS 6.7.0-T206 (Beta 4) to 6.7.1 (ReadyNASOS).
[2017/05/27 10:54:03 UTC] Updated from ReadyNASOS 6.7.1 (ReadyNASOS) to 6.7.4 (ReadyNASOS).
[2017/06/24 16:06:57 UTC] Updated from ReadyNASOS 6.7.4 (ReadyNASOS) to 6.7.5 (ReadyNASOS).
[2017/08/03 05:50:28 UTC] Updated from ReadyNASOS 6.7.5 (ReadyNASOS) to 6.8.0 (RC1).
[2017/08/11 06:04:52 UTC] Updated from ReadyNASOS 6.8.0 (RC1) to 6.8.0 (ReadyNASOS).
[2017/09/14 03:19:55 UTC] Updated from ReadyNASOS 6.8.0 (ReadyNASOS) to 6.8.1 (RC2).
[2017/09/28 06:52:38 UTC] Updated from ReadyNASOS 6.8.1 (RC2) to 6.8.1 (ReadyNASOS).
[2017/10/13 05:36:41 UTC] Updated from ReadyNASOS 6.8.1 (ReadyNASOS) to 6.9.0 (RC1).

It's hard for me to tell since one is logging in UTC time and the other is Arizona (MST) timezone maybe? But it seems like the errors only started after the update.
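For what it's worth, the two timestamps can be lined up with a short Python sketch. It assumes the smart_history.log times are NAS local time (Arizona, MST), which is not confirmed anywhere in this thread, and it uses a fixed UTC-7 offset because Arizona does not observe daylight saving time. The function names are only illustrative:

```python
from datetime import datetime, timedelta, timezone

# Arizona stays on MST (UTC-7) all year, so a fixed offset is enough here.
MST = timezone(timedelta(hours=-7), "MST")

def initrd_to_mst(stamp):
    """Convert an initrd.log stamp like '2017/10/13 05:36:41 UTC' to MST."""
    utc = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S UTC")
    return utc.replace(tzinfo=timezone.utc).astimezone(MST)

def smart_to_mst(stamp):
    """Parse a smart_history.log stamp, ASSUMED to already be local MST."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=MST)

update = initrd_to_mst("2017/10/13 05:36:41 UTC")
first_error = smart_to_mst("2017-10-13 18:56:14")
gap = first_error - update

print(update.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2017-10-12 22:36:41 MST
print(gap)                                      # 20:19:33
```

Under that assumption the update landed at about 10:36 PM local time on Oct 12, and the first logged increase came roughly 20 hours later. If smart_history.log is actually in UTC, the gap would be about 13 hours instead.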

The errors I've received are 3 types and so far the last entries of each:

Sun Oct 15 2017 13:23:38

Disk: Detected increasing reallocated sector count: [1884] on disk 3 (Internal) [Hitachi HDS5C4040ALE630 PL2321LAG8VWNJ] 193 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.

Sun Oct 15 2017 12:44:16

Disk: Detected increasing pending sector: count [744] on disk 3 (Internal) [Hitachi HDS5C4040ALE630, PL2321LAG8VWNJ] 52 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.

Sun Oct 15 2017 3:52:22

Disk: Detected increasing ATA error count: [838] on disk 3 (Internal) [Hitachi HDS5C4040ALE630, PL2321LAG8VWNJ] 72 times in the past 30 days. This condition often indicates an impending failure. Be prepared to replace this disk to maintain data redundancy.

Do you think I should go back to 6.8.1 and see if that stops the errors, or are these legit errors? I am guessing maybe they weren't logged before or something? Although I do check often, maybe they weren't being incremented... not too sure.

StephenB · ‎2017-10-16

The NAS can't create reallocated or pending sectors on the drive (and those counts are managed and stored by the drives themselves).

If you have access to a Windows PC, the next step is to test the drive using the vendor tools in that PC. Hitachi is now owned by Western Digital, so that would be WDC Lifeguard. You can either connect the drive to a Windows SATA or eSATA port, or use a USB->SATA converter.

Power down the NAS, and if you remove more than one drive at a time, be careful to label them. Reinsert the drive(s) before powering up the NAS.

givememynamebak · ‎2017-10-16

Yeah, that's what I was thinking too, but I thought I'd ring the forum just in case, as it seemed a little too coincidental. I'll check out the manufacturer troubleshooting software when the new drive arrives. As a side note, and strangely, the errors magically stopped growing today. I think I shut off all scheduled defrags, balancing etc. last night... hmmm

mdgm-ntgr · ‎2017-10-16

When you're using the NAS heavily some of the error counts are much more likely to increase. For example the current pending sector count is increased when the NAS fails to read a sector on the disk. If it doesn't try to read any sectors then the error count won't increase.

givememynamebak · ‎2017-10-17

It was barely in use by me, if at all. System operations, drive utilities like defrag, or other scheduled repair operations are all I can really think of. New drive is in. Resync in progress, 1.25% complete, 37 hours to go. Will check out the other drive and post back by this weekend.

mdgm-ntgr · ‎2017-10-17

Defrag heavily tries to read the sectors on the disks. You can't defragment data without reading it.

johnw248 · ‎2017-10-18

You should check the serial number of the defective drive on the HGST RMA site; if it is still under warranty, they will replace it once returned. The replacements take some time: the ones they sent me were shipped from the factory in Taiwan to a UPS drop ship and then out to the user.

givememynamebak · ‎2017-10-18

Will do. Thank you.

givememynamebak · ‎2017-10-18

Wed Oct 18 2017 18:21:14	
Volume: Bit rot has detected an error within /data/.timemachine/ReadyNAS/Gary’s MacBook Pro.sparsebundle/bands/123 and cannot correct the error.
Wed Oct 18 2017 11:42:28	
Disk: Disk in channel 3 (Internal) changed state from RESYNC to ONLINE.
Wed Oct 18 2017 11:42:26	
Volume: Volume data health changed from Degraded to Redundant.
Wed Oct 18 2017 11:42:23	
Volume: Volume data is resynced.
Wed Oct 18 2017 1:00:09	
Volume: Volume data is Degraded.
Tue Oct 17 2017 21:29:41	
Volume: Resyncing started for Volume data.
Tue Oct 17 2017 21:28:54	
Volume: Volume data is Degraded.

Nice that my drive problem is gone... and the volume is Redundant. Was a painless update. Now it looks like bit rot has hit my .timemachine data. Sigh. I guess I can start over and back up anew, without many other options.

What kind of scheduling is recommended for the various Volume Schedule activities (Defrag, Scrub, Balance, Disk Test)? Monthly? Yearly? Weekly? Of the 3 ReadyNAS boxes I have, I've only enabled this recently, on this NAS running OS6, alternating weeks so that each activity runs once a month, and ironically (or maybe not so ironically) I got my first disk failure (above) on this ReadyNAS. Maybe I was beating the drives up too hard with that kind of schedule during off-peak hours... Thoughts anyone?

givememynamebak · ‎2017-11-18

Thank you John. No more warranty on that drive; it expired over a year ago, in April 2016. I have to say, this HGST drive is the first drive I've owned that broke a year and a half after warranty, making it the worst drive I've owned. Everything else lasted much longer than this, but I guess that's certainly not worth complaining about! 🙂
