Problems with READYNAS system (strange system errors)

MC1967
Aspirant

So, I purchased a used 3200 on eBay that the seller listed as working, but it had no drive caddies. I also happened to have a couple of extra motherboards for a 4200v2 (one with an E-3450 CPU in it already). I needed iSCSI storage to replace a D-Link iSCSI SAN I bought that didn't work out, so I figured that I would convert the used 3200 into a 4200, so I could use 12 of the 4TB drives I had purchased for the D-Link. This is the same system I posted about previously, looking for data on speeds of RAID 5 vs RAID 10.

 

Well, I've rebuilt two other systems in the past, and not had any trouble with either of them. This "new" system, though, is a mess. I had pulled the original USB boot device and made a clone of the one in my working 4200v2. After pulling and updating the DMI info from the new board (after pulling the old MB), I put it back. I had some cables and RAM from the other projects (all verified working with a 4200). The device goes through POST, no problem. I can get into the BIOS fine, and see nothing out of the ordinary.

 

But, once it boots into OS 6 and is "power on" for a length of time (which varies), I start getting errors in the system log. Power errors, temp errors, fan not working errors (which seems to be either fan 1 or 3). I've tried all the regular troubleshooting steps I know - pulled the power supplies and used two from a working system, changed power cords, changed the power receptacle the system is plugged into in my rack, run it cover off and on my work table - and nothing makes the errors go away. Last night, I even replaced the motherboard with the other X8SI6-F-NI015 motherboard I had, and I'm STILL getting errors. I've attached a screenshot of the error log.

 

Does anyone have any ideas? Is it a bad case, somehow? Could it be the wiring harness that the power supplies plug into? The OS 6 boot device is an exact copy of the one in my other, working 4200v2, and that system does not show any of these errors.

 

Model: ReadyNAS-4200v2|ReadyNAS 4200v2
Message 1 of 11

All Replies
Sandshark
Sensei

Re: Problems with READYNAS system (strange system errors)

Is the board you used from an RN4200 (lacks IPMI) or an RD5200 (has IPMI)?  They have identical part numbers.  If it's from an RD5200, then the IPMI and OS are likely having collisions WRT monitoring the board.  See the SuperMicro X8SI6 documentation as to how to disable the IPMI.

 

If that's not it, maybe it's the board that controls the redundant power supplies, though I'm not sure why that would cause fan errors.

 

A problem with the system configuration file could also cause this, but I don't see in your process how that would have happened.  The config for a 4200V2 in early OS6 was wrong, but it's been right for a while now.

 

I did the conversion you did and had no issues with it.  Since you cloned the boot USB, the two NAS should show the same serial number, which will cause issues if you ever run ReadyCloud, but nothing else should cause a problem.

Message 2 of 11
MC1967
Aspirant

Re: Problems with READYNAS system (strange system errors)

Both boards were from 5200s and have an IPMI port. I will check the BIOS and see if it's enabled. The box is currently up and running on my work bench with the top off and plugged into a monitor and keyboard. So, getting into the BIOS is easy right now. I saw in another thread here that OS 6.10.6 has been released and I downloaded/installed it to this system to see what happens.

 

I copied the serial number from the DMI info of the old 5200 system along with the UUID and serial number of the motherboard to the new DMI file I pushed to the motherboard, so ReadyCloud shouldn't be an issue.

 

I'll shut it down and check the IPMI setup over lunch. I'll report back what I find.

Message 3 of 11
MC1967
Aspirant

Re: Problems with READYNAS system (strange system errors)

Turns out that to turn off IPMI/BMC, you have to move a jumper. Turned it off and there is no IPMI setup option in the BIOS advanced settings screen now. System is up and running, so I'm just waiting to see if any more of those errors show up. There's been no pattern, so it might take a day or so. If shutting off IPMI solves the problem, I'll mark that reply as the solution.

Message 4 of 11
MC1967
Aspirant

Re: Problems with READYNAS system (strange system errors)

Well, it was good for a while, but the fan error began again after about 6 hours. Then, after I had gone to bed, the HDD temp warning popped up again for disk 7 (which had no issues in the D-Link device, FWIW) and the system shut down.

 

This morning, before I turned it back on, I swapped the MB fan connectors for fans 1 and 2. Can't hurt and I'm really just not sure what else to try at this point. Went through all the logs last night and nothing really jumped out at me, but I did see all the fan speed permission errors you had reported on in a different thread.

 

Has anyone else had something like these issues happen? I would think it was the motherboard, but I'm getting the same issues with two different boards. I have another 3200 in my rack. It's just there for backups, so I could put the current board I'm working with into that case, to test the case. But, I've never run across a bad case in the past.

 

I've attached the latest part of the logs with the current errors.

Message 5 of 11
Sandshark
Sensei

Re: Problems with READYNAS system (strange system errors)

Switching cases would also switch out the board that controls the redundant power supplies, so it may be worth a try, though I am having trouble figuring out how all the errors could come from that.

 

Unfortunately, the performance charts don't include drive temps or fan speed.  But do the CPU or system temp give any indication that the temperature really might be out of control?

 

What setting is in the BIOS for the fans?  I believe it needs to be set to max for the ReadyNASOS control to work (though you'll still get the permission errors -- I never did track down the cause of those).

 

It may be helpful to create a task that puts the fan speeds and temperatures into a file for review/plotting and see if it really looks like these are real or false alarms.
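For anyone who wants to try this, here is a minimal sketch of such a logger in Python. It assumes Python 3 and the lm-sensors `sensors` command are available on the NAS, which I have not verified for every firmware; the function names, the CSV path, and the 60-second interval are all just placeholders, and the sensor labels on your board will likely differ.

```python
import csv
import re
import subprocess
import time

# Matches lm-sensors style lines such as "fan1:  0 RPM" or "SYSTIN:  +36.0°C".
# The exact labels and units depend on the board and driver (assumption).
READING = re.compile(
    r"^(?P<label>[^:]+):\s+\+?(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>RPM|V|°?C)\b"
)


def parse_sensors(text):
    """Return {label: (value, unit)} for every line that looks like a reading."""
    readings = {}
    for line in text.splitlines():
        match = READING.match(line.strip())
        if match:
            label = match.group("label").strip()
            readings[label] = (float(match.group("value")), match.group("unit"))
    return readings


def log_sensors(path="sensor_log.csv", interval_s=60):
    """Append one timestamped CSV row per reading, forever (stop with Ctrl-C)."""
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        while True:
            # Assumes the `sensors` command is installed and on the PATH.
            output = subprocess.check_output(["sensors"], universal_newlines=True)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for label, (value, unit) in parse_sensors(output).items():
                writer.writerow([stamp, label, value, unit])
            handle.flush()  # keep the file current in case of a sudden shutdown
            time.sleep(interval_s)
```

Calling `log_sensors()` from an SSH session would build up a CSV that can be plotted in a spreadsheet. If a logged reading jumps to something like 0 rpm or 114 C for a single sample and then returns to normal, that points toward a false alarm rather than a real fault.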

 

 

Message 6 of 11
MC1967
Aspirant

Re: Problems with READYNAS system (strange system errors)

I checked the performance chart for a twenty-four hour period that included the shutdown for the hard drive overtemp condition. System and CPU both never rose above 30 degrees Celsius, even right before the shutdown. The box is up on my work table with the top off, so I'm not really sure how the drive temp got that high. The temp in my cellar is around 68 degrees Fahrenheit, so no issue there.

 

I'm not a big Linux guy, so writing that task will be a bit of a problem. I know enough to get data through SSH, ping, etc., but have never written a cron job.

 

Now I see that there is a power warning, "System: AVCC voltage in enclosure Internal is out of spec. (0.00 V)", that came in this morning. I checked all those voltages in the BIOS yesterday and they were fine.

 

I'm not home until later tonight, but I guess over the weekend, I'll put this board and CPU in the other 3200 case and see what happens. I mean, the motherboards are used, so theoretically one could have issues, but two boards showing the exact same issues?

Message 7 of 11
Sandshark
Sensei

Re: Problems with READYNAS system (strange system errors)

Take a look here: CPU-and-HDD-Temp-Logging .  He doesn't use a cron job, just a small script that continuously runs in the background.

Message 8 of 11
MC1967
Aspirant

Re: Problems with READYNAS system (strange system errors)

I'll take a look. This whole issue is really strange. Friday night, I started to get system overtemp warnings (in the range of 114°C) and went down to check the system, as I had left it running on my work bench. The case cover was loosely set over the running internals and was NOT hot to the touch, which you would figure it would be if the internal temp was 237°F. Removing the cover, there was no large blast of heat in my face. Yet, as I was working with it, another temp warning came in.

 

In frustration, I shut everything down and called it a night. Yesterday, I was going to remove the 4200 motherboard and put the original 3200 back in to see if I would get the same errors with that board. Disconnected all the fan power, the main 24 and 8 pin cables, removed the hard drives (I was going to use the set in my 3200, as it had 2TB drives in bays 5-12), then noticed that it looked like the front control panel connector might be on backwards.

 

Back upstairs to the motherboard manuals for each board. Could this be the cause of the weirdness? I was hopeful. Flipped around the connector, plugged everything else back in, and powered up. No light on the front panel, so flipped the connector around one more time, back to the way I had it. Lights on the panel now! Everything was running, so I left it and went back to my Saturday run of chores, figuring that eventually the errors would start again and then I would swap the boards.

 

Well, we are now far past 24 hours, and not a SINGLE error. None. I am at a loss to figure this whole mess out. I'm almost afraid to shut it down and put it back in the rack, for fear of the damn errors starting again. The hard drive and system/CPU temps have been stable as well: none of the HDDs are over 38°C, and the CPU shows 34°C while the system temp is 36°C.

 

Ever since I swapped the fan connectors, the fan 1 errors have also stopped.

 

Here's the run of log messages up to this point:

 

Nov 13, 2021 11:31:15 AM  System: ReadyNASOS background service started.
Nov 12, 2021 07:06:53 PM  System: The system is shutting down.
Nov 12, 2021 06:54:07 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 06:39:02 PM  System: System temperature in enclosure Internal exceeded safety threshold. (113 C).
Nov 12, 2021 06:38:19 PM  System: ReadyNASOS background service started.
Nov 12, 2021 06:36:42 PM  System: The system is rebooting.
Nov 12, 2021 06:35:46 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 06:20:43 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 06:20:13 PM  System: 3VSB voltage in enclosure Internal is out of spec. (0.08 V).
Nov 12, 2021 06:05:40 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 05:50:36 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 05:35:28 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 05:20:27 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 05:05:22 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 04:50:18 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 04:35:12 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 04:20:11 PM  System: System temperature in enclosure Internal exceeded safety threshold. (114 C).
Nov 12, 2021 05:45:54 AM  System: ReadyNASOS background service started.
Nov 12, 2021 05:25:31 AM  System: The system is shutting down.
Nov 12, 2021 01:09:48 AM  System: +3.3V voltage in enclosure Internal is out of spec. (0.35 V).
Nov 12, 2021 12:49:42 AM  System: Vbat voltage in enclosure Internal is out of spec. (4.08 V).
Nov 12, 2021 12:49:41 AM  System: 3VSB voltage in enclosure Internal is out of spec. (4.08 V).
Nov 11, 2021 12:45:47 PM  System: Vbat voltage in enclosure Internal is out of spec. (0.00 V).
Nov 11, 2021 10:07:51 AM  System: AVCC voltage in enclosure Internal is out of spec. (0.00 V).
Nov 11, 2021 07:19:21 AM  System: ReadyNASOS background service started.
Nov 10, 2021 10:23:52 PM  System: The system is shutting down.
Nov 10, 2021 10:23:42 PM  Disk: System is shutting down because disk in channel 7 (Internal) exceeded safe temperature threshold (60 C).
Nov 10, 2021 10:23:41 PM  Disk: Disk in channel 7 (Internal) exceeded safe temperature threshold (60 C).
Nov 10, 2021 09:59:30 PM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 10, 2021 08:59:15 PM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 10, 2021 07:59:11 PM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 10, 2021 02:28:55 PM  System: Alert configuration was saved.
Nov 10, 2021 01:29:52 PM  System: Service protocol SSH is disabled.
Nov 10, 2021 11:18:16 AM  System: ReadyNASOS background service started.
Nov 10, 2021 10:40:50 AM  System: The system is shutting down.
Nov 10, 2021 09:09:41 AM  System: ReadyNASOS background service started.
Nov 10, 2021 09:09:41 AM  System: Firmware was upgraded to 6.10.6.
Nov 10, 2021 09:06:16 AM  System: The system is rebooting.
Nov 10, 2021 05:45:34 AM  System: ReadyNASOS background service started.
Nov 10, 2021 03:01:49 AM  System: AVCC voltage in enclosure Internal is out of spec. (1.18 V).
Nov 10, 2021 03:01:40 AM  System: The system is shutting down.
Nov 10, 2021 03:01:29 AM  Disk: System is shutting down because disk in channel 7 (Internal) exceeded safe temperature threshold (60 C).
Nov 10, 2021 03:01:28 AM  Disk: Disk in channel 7 (Internal) exceeded safe temperature threshold (60 C).
Nov 10, 2021 02:13:55 AM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 10, 2021 01:13:47 AM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 10, 2021 12:13:44 AM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 09, 2021 11:13:40 PM  System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).
Nov 09, 2021 10:07:43 PM  System: Vcore voltage in enclosure Internal is out of spec. (0.02 V).
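
One quick way to see the pattern in a log like this is to group the alerts by type and look at the values each one reported. Below is a rough sketch in Python, assuming the messages have been copied out of the log as plain strings; `group_readings` is a made-up helper name, not anything from ReadyNAS OS, and it only organizes the readings rather than diagnosing anything.

```python
import re
from collections import defaultdict

# Finds the reading at the end of an alert, e.g. "(114 C)", "(0.08 V)", "(0 rpm)".
VALUE = re.compile(r"\((?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>C|V|rpm)\)")


def group_readings(messages):
    """Map each alert type to the list of values it reported, in input order."""
    readings = defaultdict(list)
    for message in messages:
        match = VALUE.search(message)
        if not match:
            continue  # entries such as "The system is shutting down." have no reading
        # Drop the reading so the same alert with different values groups together.
        kind = message[:match.start()].strip().rstrip(".")
        readings[kind].append(float(match.group("value")))
    return dict(readings)


# A few messages copied from the log above.
sample = [
    "System: System temperature in enclosure Internal exceeded safety threshold. (114 C).",
    "System: System temperature in enclosure Internal exceeded safety threshold. (113 C).",
    "System: 3VSB voltage in enclosure Internal is out of spec. (0.08 V).",
    "System: Fan Fan 1 in enclosure Internal speed is below threshold. (0 rpm).",
    "System: The system is shutting down.",
]

for kind, values in group_readings(sample).items():
    print(kind, values)
```

Run over the whole log, this makes it easy to spot things like the system temperature sitting at almost exactly 114 C for hours, or the same voltage rail reading 0.08 V one day and 4.08 V another.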

 

Message 9 of 11
MC1967
Aspirant

Re: Problems with READYNAS system (strange system errors)

So, it's been about four days with no more errors. I put the device back in the rack yesterday and guess I will just hope for the best. Weird, though. Wish I could have figured out what was going on.

 

I'll mark Sandshark's reply as a "solution", since he was the only one to reply.

Message 10 of 11
Leenhart
Aspirant

Re: Problems with READYNAS system (strange system errors)

Hi. This thread showed up while I was searching for help with my ReadyNAS Ultra 2 (small 2-bay NAS with an external power brick, circa 2011). I've had warnings of the type: "System: V+12 voltage in enclosure Internal is out of spec."

 

Swapping the power brick didn't help, which eventually led me to a complete tear-down and a step-by-step rebuild with as few components as possible. First, I could run trouble-free with just the main board and the power/network PCB. Then with the SATA connector board and a single HDD attached. But when I mounted the fan to that setup... I got the errors!

 

Thus, I suggest that a faulty fan may be the issue for my NAS, and possibly for your setup too. I have replaced the fan with a new Noctua fan and re-assembled the unit. So far no errors (13+ hours running, including its dedicated bi-weekly backup scheme).

 

Note in retrospect: I have had this type of error before and believed the problem was solved by changing the power brick. I have also had a situation where a zip-tie strip from a cable behind the NAS was stuck through the fan cover and obstructed the fan, so I consider it likely that the fan has an internal short circuit and/or damaged coils.

 

Those are my thoughts so far, hopefully a step towards a solution for those of us who have old but nice hardware that should ideally keep running. Good luck!

 

Message 11 of 11