Forum Discussion
HolgerGT86
Apr 08, 2020Guide
RN104 shutdown because disks exceed safe temperature
Hello, I'm using a RN104 at firmware level 6.10.3. The ReadyNAS is equipped with 4 times Hitachi HUA721010KLA330 1TB SATA drives, all 4 drives building one RAID5 X-RAID volume. Today the RN104 sh...
Sandshark
Jun 08, 2020Sensei
If your "stronger" (I assume you mean higher CFM) fan needs a higher RPM to attain that higher flow, then the NAS is not aware of that capability.
Do make sure the rear of the NAS is at least 6" (more is better) from the wall and that there is nothing that can cause the exhaust to re-circulate to the intake (like being hemmed in on a shelf).
HolgerGT86
Jun 08, 2020Guide
Hello Sandshark, hello StephenB,
many thanks for your responses. I'll try to answer your questions although most of those, if not all, are already documented in this thread.
I have no option to test the disk drives outside the RN104.
The disk temperature is really about 60°C. When I restart the RN104 and I can access the admin web interface, the disk temperature is still close to 60°C and there's very warm air coming out of the rear of the RN104.
There's nothing blocking the air circulation. The RN104 is placed on a shelf in a cellar room. The air temperature in that room is about 15°C all year. There must be a picture already in this thread, but I'll upload it again.
You are right that I mean higher RPM when I say stronger. The fan rotates at up to 3800 RPM and delivers higher air pressure and air flow than the original fan. I monitor it on a regular basis, and it usually runs at >3000 RPM when Time Machine backups or RN104 maintenance activities like defragmentation, data scrubbing, etc. are running. At the times when I monitor the disk performance and fan speed, the disk temperature is between 30°C and 45°C for the inside disk drives.
In fact, including the original fan shipped with the RN104, it is the 3rd fan I'm currently trying ... all showing same behaviour - so I doubt it's the fan itself.
The disk drives are 1TB Hitachi storage server drives of type HUA721010KLA330.
As I already documented, I so far failed to recreate the issue. I already placed the RN104 at my desk, having it aside all day. I loaded the RN104 with as much I/O as I could but everything worked fine. Even when data scrubbing is running for about 48 hours + regular time machine backups in parallel, everything is working fine.
As already said, it only happens exactly after defragmentation ended. But, I was not able to recreate it by running defragmentation manually.
I don't know what else to document here... using a SMART monitor will not help because I already know the disk drives' SMART values correctly report the 60°C. The only thing I was not able to "see" so far is the fan stopping when defragmentation ends.
I need to place the RN104 at my desk again and try to catch it with my eyes ... or I need to buy another NAS maybe as I'm tired to debug this behaviour ... don't know.
- HolgerGT86Jun 08, 2020Guide
Photo of the RN104 when placed on the shelf ...
- StephenBJun 08, 2020Guru - Experienced User
HolgerGT86 wrote:
I have no option to test the disk drives outside the RN104.
The disk temperature is really about 60°C.
I suggest purchasing a USB/SATA adapter or dock, so you can test the disk outside of the NAS. Adapters are quite inexpensive (about $20 for one that includes power - which you need). WD's Lifeguard diagnostic should be able to test the drives.
HolgerGT86 wrote:
In fact, including the original fan shipped with the RN104, it is the 3rd fan I'm currently trying ... all showing same behaviour - so I doubt it's the fan itself.
The disk drives are 1TB Hitachi storage server drives of type HUA721010KLA330.
I suspect it is one of the disks. Have you been using these with all three fans?
One option is to try replacing the one that feels the hottest - perhaps a WD Gold (which should run a bit cooler anyway). Though if you move up to a larger size, you'd have other options (Ironwolf Pro, or WD Red Pro). If that doesn't solve it, then you could perhaps then hot-swap the other Hitachi with the one you removed.
HolgerGT86 wrote:
As already said, it only happens exactly after defragmentation ended. But, I was not able to recreate it by running defragmentation manually.
That is a puzzle. Have you tried running the disk test on the volume wheel? Or only scrubs and defrags?
- HolgerGT86Jun 09, 2020Guide
I have been using all the 4 same disks all the time with all fans.
I have been running all available maintenance processes, including defrag, balance, disk test and data scrubbing, once a month for years.
I'm using the RN104 as backup target only. No app is installed in addition to the base firmware.
From the S.M.A.R.T. data, only the outer disk in slot 1 shows 2 relocations. All other error counters are 0 for all 4 disk drives.
I have some of these disk drives still available, so I'll maybe start replacing one by one over time.
Maybe I'll rotate the disks so that the inner disk drives reporting the 60°C will become the outer disk drives, just to see if the issue is really related to the current inner drives. But this will take some time as the Marvell processor is not the fastest one...
I can do hardware actions, but what I cannot do is debug the firmware. Because of this I'd be happy if someone with software debugging capabilities could have a look at why, on my RN104, the defrag startup message is always issued twice. Is this a messaging problem only, or is the process really started 2 times in parallel?
Have a great day,
Holger
- StephenBJun 09, 2020Guru - Experienced User
HolgerGT86 wrote:
on my RN104 the defrag startup message is always issued twice. Is this a messaging problem only, or is the process really started up 2 times in parallel?
I've seen reports of this, but haven't seen any information on the root cause.
But even if two defrag processes were running simultaneously, that wouldn't cause the disks to overheat. All that the process does is search for fragmented files, and then attempt to defrag them. There has to be something else going on. You've looked for other errors in the log zip file?
HolgerGT86 wrote:
Maybe I'll rotate the disks so that the inner disk drives reporting the 60°C will become the outer disk drives, just to see if the issue is really related to the current inner drives.
Power down the NAS, then remove/label the disks by their original slots. Then shuffle them as you wish. As long as you shuffle with the NAS powered down, it should work ok with no resyncs needed.
- HolgerGT86Jul 10, 2020Guide
Hello, good morning, good evening and good afternoon,
I'm updating this thread with a current status to keep it alive. The RN104 has been running without any issue since I disabled the defragmentation maintenance process. Data scrubbing runs for more than a day on the RN104 and has never caused an issue.
I'm planning to move the RN104 to my office desk in the next few days and run defragmentation again. I'll closely monitor the RN104's behaviour then.
I'll keep you updated.
Have a great day and stay healthy!
Holger
- HolgerGT86Jul 18, 2020Guide
Hello again,
I spent some hours testing this further. I placed the RN104 at my desk and monitored it through the web GUI and by sitting next to it.
I scheduled defrag and, as usual, defrag started up and 2 log entries were written that defrag was started.
The disk temperature went up to a maximum of 44°C for the inner 2 drives as long as defrag was running.
Defrag completed successfully after running for about 50-55 minutes.
When defrag completed, the interesting part started. Within 1-2 minutes after defrag completed, the temperature of the inner 2 disk drives increased up to 51/52°C! The fan speed was increased from 2730rpm to 3150rpm.
I logged in to the NAS using ssh and ran smartctl. The values returned by smartctl matched the temperature of the disk drives shown in the performance window of the GUI. It seems the disk drives are reporting their temperature accurately.
I attached a text file including the complete smartctl output ... just in case someone can see more than I do.
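For anyone who wants to compare the smartctl readings with the GUI more systematically, here is a minimal sketch that pulls the raw values out of an attribute table in the layout that `smartctl -A` prints. The sample rows and their values are made up for illustration (they are not taken from the attached file), and `parse_attributes` is a hypothetical helper name.

```python
import re

# Illustrative excerpt in the column layout printed by `smartctl -A /dev/sdX`.
# The numbers are invented for this sketch, not copied from the attachment.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       2
  9 Power_On_Hours          0x0012   095   095   000    Old_age   Always       -       38211
194 Temperature_Celsius     0x0002   115   115   000    Old_age   Always       -       52 (Min/Max 14/61)
"""


def parse_attributes(text):
    """Return {attribute_name: leading integer of RAW_VALUE} per attribute row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                attrs[fields[1]] = int(match.group())
    return attrs


attrs = parse_attributes(SAMPLE)
print(attrs["Temperature_Celsius"], attrs["Reallocated_Sector_Ct"])
```

On the NAS itself the same function could be fed the saved output of `smartctl -A` for each disk, so the four temperatures can be logged side by side while defrag finishes.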
The test confirms that this is not a fan problem; the temperature of the disk drives really does go up immediately after defrag completes, for whatever reason.
Would replacing the disk drives be the only option, or does someone have another idea about what's going on here and how to fix/ mitigate it?
Thanks and regards,
Holger
- StephenBJul 18, 2020Guru - Experienced User
HolgerGT86 wrote:
Would replacing the disk drives the only option or does someone have another idea what's going on here and how to fix/ mitigate it?
Maybe first power down the NAS, and swap drive 1,2 and 3,4 (moving the inner drives to the outer slots). Then do a defrag and see if the heating problem stays with the slots, or moves with the drives. After that test, I'd power down the NAS and restore the drives to their original positions.
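To make the outcome of that swap test unambiguous, the comparison could be written down roughly like this. It is only a sketch: the drive labels, the 50°C "hot" threshold and the function name `heat_follows` are assumptions for illustration, and the temperatures in the example are invented.

```python
def heat_follows(before, after, hot_c=50):
    """Classify a slot-swap test.

    before/after map slot number -> (drive_label, peak_temp_c).
    Returns "drives" if the same drives run hot in different slots,
    "slots" if the same slots run hot with different drives,
    otherwise "inconclusive".
    """
    def hot(readings):
        slots = {slot for slot, (_, temp) in readings.items() if temp >= hot_c}
        drives = {drive for drive, temp in readings.values() if temp >= hot_c}
        return slots, drives

    slots_before, drives_before = hot(before)
    slots_after, drives_after = hot(after)
    if not slots_before or not slots_after:
        return "inconclusive"  # nothing overheated in one of the runs
    same_slots = slots_before == slots_after
    same_drives = drives_before == drives_after
    if same_drives and not same_slots:
        return "drives"
    if same_slots and not same_drives:
        return "slots"
    return "inconclusive"


# Hypothetical readings: drives B and C start in the inner slots 2 and 3.
before = {1: ("A", 44), 2: ("B", 52), 3: ("C", 51), 4: ("D", 43)}
# After swapping 1<->2 and 3<->4, the inner slots are still the hot ones.
after = {1: ("B", 43), 2: ("A", 52), 3: ("D", 51), 4: ("C", 44)}
print(heat_follows(before, after))
```

A result of "slots" would point at airflow or position inside the chassis, while "drives" would point at the disks themselves.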
I would certainly consider replacing the drives. Enterprise-class drives (in my opinion) are overkill for the RN100 - it is limited by its processor speed, not the disks. I suggest starting over - two WD30EFRX or two ST3000VN007 would cost a total of about $200 (current US prices) and would give you the same capacity you are using now. Then you'd have two slots for expansion (and if you like, you can put the drives in bay 1 and bay 4 to create more distance between them).
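The "same capacity" point can be checked with simple arithmetic, assuming single redundancy with equal-size disks (a two-disk mirror keeps one disk's worth of space, RAID 5 keeps n - 1). This is a rough sketch with nominal sizes; it ignores file system overhead and the TB/TiB difference, and `usable_tb` is a hypothetical helper.

```python
def usable_tb(disk_sizes_tb):
    """Nominal usable capacity with single redundancy and equal-size disks.

    Two disks behave like a mirror (one disk's worth of space); three or
    more behave like RAID 5 (n - 1 disks' worth). Both equal (n - 1) * size.
    """
    n = len(disk_sizes_tb)
    if n < 2:
        raise ValueError("redundancy needs at least two disks")
    if len(set(disk_sizes_tb)) != 1:
        raise ValueError("this sketch only handles equal-size disks")
    return (n - 1) * disk_sizes_tb[0]


print(usable_tb([1, 1, 1, 1]))  # current setup: four 1 TB disks
print(usable_tb([3, 3]))        # suggested setup: two 3 TB disks
```

Both calls give 3 TB, which is why two 3 TB drives match the current four-drive volume while leaving two bays free.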
Avoid WD EFAX models between 2 TB and 6 TB - they use SMR technology, which IMO isn't a great choice for RAID. The older EFRX line is fine. Ironwolf drives are also a good option. BTW, many desktop models are also SMR, so I wouldn't use them either.
- SandsharkJul 18, 2020Sensei
HolgerGT86 wrote:
I logged in to the NAS using ssh and ran smartctl. The values returned by smartctl matched the temperature of the disk drives shown in the performance window of the GUI. It seems the disk drives are reporting their temperature accurately.
I attached a text file including the complete smartctl output ... just in case someone can see more than I do.
The test confirms that this is not a fan problem; the temperature of the disk drives really does go up immediately after defrag completes, for whatever reason.
Since both the ReadyNAS OS and smartctl rely on the drive reporting its own temperature (and the OS probably calls smartctl for that), this only indicates the reported temperature is consistent, not correct. But with more than one drive reporting a similar temperature, it likely is correct.
It sounds like an airflow problem, but with the NAS out on your desk, there shouldn't be anything external contributing to that. The ambient air temperature is within normal comfort range, correct?
Does the fan speed initially drop at the conclusion of the defrag and then have to jump up when the temperature starts to increase? I've seen the hystereses and step sizes of the fan control to be less than optimal on legacy NASes, but I don't have a lot of newer ones (and no 104) to evaluate them for those cases. It may be that the greater amount of heat produced by the enterprise drives is outside the expectations Netgear had when establishing the fan control parameters for a 104. But I've run 2TB and 3TB drives from the same family in both 2 and 4 bay legacy NASes running OS6.x and had no issues with high-intensity processes like initial sync or scrub (I don't use defrag, I think it's mostly a waste of time on RAID and it can quickly increase the space used by snapshots).
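To illustrate what I mean by hysteresis and step size, here is a generic stepped fan controller. To be clear, this is not Netgear's actual fan logic, which I haven't seen; the target temperature, dead band, step size and RPM limits are all assumed values chosen only for the example (3800 RPM is the maximum Holger reported for his replacement fan).

```python
def next_fan_rpm(current_rpm, temp_c, *, target_c=45, band_c=3,
                 step_rpm=200, min_rpm=1000, max_rpm=3800):
    """One control step of a simple stepped fan controller with hysteresis.

    Inside the dead band (target_c +/- band_c) the speed is left alone, so
    the fan does not hunt up and down around the target. Outside the band
    the speed moves by one fixed step per call, clamped to the RPM limits.
    """
    if temp_c > target_c + band_c:
        return min(current_rpm + step_rpm, max_rpm)
    if temp_c < target_c - band_c:
        return max(current_rpm - step_rpm, min_rpm)
    return current_rpm


# Hypothetical temperature trace around the end of a defrag run.
rpm = 2730
for temp in [44, 46, 49, 52, 52, 47, 41]:
    rpm = next_fan_rpm(rpm, temp)
    print(temp, rpm)
```

With a small step and a wide band, a controller like this reacts only after the drives are already several degrees over the target, and then climbs one step at a time. If the drives produce more heat than the parameters were tuned for, the temperature can keep rising while the fan is still stepping up.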
- StephenBJul 18, 2020Guru - Experienced User
Sandshark wrote:
But I've run 2TB and 3TB drives from the same family in both 2 and 4 bay legacy NASes running OS6.x and had no issues with high-intensity processes like initial sync or scrub (I don't use defrag, I think it's mostly a waste of time on RAID and it can quickly increase the space used by snapshots).
Still, it is weird that this happens repeatedly after a defrag. It does make me wonder if there is something off with the file system.
Another thing that HolgerGT86 could try is doing a factory reset, rebuild the NAS, and then reload the data from backup. The risk here is that the disks might overheat during the RAID sync. But the fact that they don't overheat during a scrub suggests that they won't.
- HolgerGT86Jul 19, 2020Guide
StephenB wrote:
HolgerGT86 wrote:
Would replacing the disk drives be the only option, or does someone have another idea about what's going on here and how to fix/ mitigate it?
Maybe first power down the NAS, and swap drive 1,2 and 3,4 (moving the inner drives to the outer slots). Then do a defrag and see if the heating problem stays with the slots, or moves with the drives. After that test, I'd power down the NAS and restore the drives to their original positions.
I would certainly consider replacing the drives. Enterprise-class drives (in my opinion) are overkill for the RN100 - it is limited by its processor speed, not the disks. I suggest starting over - two WD30EFRX or two ST3000VN007 would cost a total of about $200 (current US prices) and would give you the same capacity you are using now. Then you'd have two slots for expansion (and if you like, you can put the drives in bay 1 and bay 4 to create more distance between them).
Avoid WD EFAX models between 2 TB and 6 TB - they use SMR technology, which IMO isn't a great choice for RAID. The older EFRX line is fine. Ironwolf drives are also a good option. BTW, many desktop models are also SMR, so I wouldn't use them either.
Hello StephenB,
many thanks for your response. I already moved the inner drives to the outer slots and vice versa. The RN104 came up fine in this configuration.
I'm using this type of enterprise drives because I received a bunch of them for free. Nevertheless, I'm thinking about replacing them. Thanks for providing recommendations. Besides this, I now left the front door open although I do not believe it will help. The outside air temperature is always about 15°C all the year and except the front door, nothing is blocking the air flow.
I suspect an internal disk drive cooling issue when the drives slow down after defrag completes. I don't know whether they reduce their platter rotation speed and this then reduces their cooling ... it doesn't matter anymore. I have already turned off defrag.
- HolgerGT86Jul 19, 2020Guide
Sandshark wrote:
HolgerGT86 wrote:
I logged in to the NAS using ssh and ran smartctl. The values returned by smartctl matched the temperature of the disk drives shown in the performance window of the GUI. It seems the disk drives are reporting their temperature accurately.
I attached a text file including the complete smartctl output ... just in case someone can see more than I do.
The test confirms that this is not a fan problem; the temperature of the disk drives really does go up immediately after defrag completes, for whatever reason.
Since both the ReadyNAS OS and smartctl rely on the drive reporting its own temperature (and the OS probably calls smartctl for that), this only indicates the reported temperature is consistent, not correct. But with more than one drive reporting a similar temperature, it likely is correct.
It sounds like an airflow problem, but with the NAS out on your desk, there shouldn't be anything external contributing to that. The ambient air temperature is within normal comfort range, correct?
Does the fan speed initially drop at the conclusion of the defrag and then have to jump up when the temperature starts to increase? I've seen the hystereses and step sizes of the fan control to be less than optimal on legacy NASes, but I don't have a lot of newer ones (and no 104) to evaluate them for those cases. It may be that the greater amount of heat produced by the enterprise drives is outside the expectations Netgear had when establishing the fan control parameters for a 104. But I've run 2TB and 3TB drives from the same family in both 2 and 4 bay legacy NASes running OS6.x and had no issues with high-intensity processes like initial sync or scrub (I don't use defrag, I think it's mostly a waste of time on RAID and it can quickly increase the space used by snapshots).
Hello Sandshark,
many thanks for your response, too. I assume the temperature is correct as the drives are really warm/ hot.
The temperature at my desk is about 24°C, in the cellar room, where the RN104 is usually located, the temperature is about 15°C all the year.
When defrag completed, the fan speed did not go down. It started to increase immediately, first slowly, then in larger steps. It's definitely not the fan ... it is a new one providing more air pressure than the original one. Please see my response to StephenB, too.
The issue only occurs after defrag has run and completed. There are no issues at all when running data scrubbing or disk test.
We can stop here ... I'll think about how to "resolve" this issue.
Thanks again!
- HolgerGT86Jul 19, 2020Guide
StephenB wrote:
Sandshark wrote:
But I've run 2TB and 3TB drives from the same family in both 2 and 4 bay legacy NASes running OS6.x and had no issues with high-intensity processes like initial sync or scrub (I don't use defrag, I think it's mostly a waste of time on RAID and it can quickly increase the space used by snapshots).
Still, it is weird that this happens repeatedly after a defrag. It does make me wonder if there is something off with the file system.
Another thing that HolgerGT86 could try is doing a factory reset, rebuild the NAS, and then reload the data from backup. The risk here is that the disks might overheat during the RAID sync. But the fact that they don't overheat during a scrub suggests that they won't.
I'll think about starting over from scratch and restoring a backup. I'm not sure, though, how to reload all data from backup including the user access rights and share settings, as I'm backing up the RN104 to 2 USB drives attached to it. Besides this, the RN104 is used to store Mac Time Machine backups and Windows 10 backups. I'm afraid of losing/ breaking things, so I'll delay this activity until autumn/ winter.
Many thanks for all your support/ help. You may close this thread.
Stay healthy!