
PSU not detected after upgrade to FW 6.9.1

scottonaharley
Aspirant

PSU not detected after upgrade to FW 6.9.1

After upgrading to FW version 6.9.1 from 6.8.x, I see log messages indicating that PSU1 and PSU2 are not detected.  The unit is populated with 12 4TB drives.  I've attempted to upload the logs, but the zip file is not accepted.  Furthermore, jpg, gif, png and pdf are the only file types accepted.  Am I expected to convert every file in the log download from .log (text) to a PDF?

Model: RN4220X|ReadyNAS 4220 10Gbase-T
Message 1 of 14


All Replies
StephenB
Guru

Re: PSU not detected after upgrade to FW 6.9.1


@scottonaharley wrote:

Am I expected to convert every file in the log download from .log (text) to a PDF?


No.  You shouldn't post the log zip here, because the logs contain some personal information that would then be accessible to the whole planet.

 

You are supposed to email the zip using the instructions here: https://kb.netgear.com/21543/How-do-I-send-all-logs-to-ReadyNAS-Community-moderators

 

Another approach is to PM (private message) the mod, and send a link to the log zip (using google drive, dropbox, ...).  You send a PM using the envelope link in the upper right of the forum page.

 

Either way, normally logs aren't sent unless a mod requests them (and they are directed to that mod).

 

 

Message 2 of 14
evan2
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

@scottonaharley

I checked my RN4220X, RN4220S and RN3220; the PSUs work OK after updating to 6.9.1.

Could you please send the logs so we can check?

How do I send all logs to ReadyNAS Community moderators?
https://kb.netgear.com/21543/How-do-I-send-all-logs-to-ReadyNAS-Community-moderators

 

 

Message 3 of 14
scottonaharley
Aspirant

Re: PSU not detected after upgrade to FW 6.9.1

I sent in a set of logs earlier this week via email per the instructions.  I haven't tried to disable just the PSU messages; it should be possible, but it appears that the GUI does not have that level of granularity.

 

Message 4 of 14
evan2
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

@scottonaharley

I checked your log and there is an I2C error. I need to check with the HW developers to find out what problem is happening on this device.

Jan 18 12:10:01 apollo CRON[21692]: (root) CMD (/bin/bash /opt/replication/etc/init.d/watchdog.sh)
Jan 18 12:10:01 apollo CRON[21688]: pam_unix(cron:session): session closed for user root
Jan 18 12:10:01 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:01 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:02 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:02 apollo rn-expand[4952]: PSU read voltage b2 err [-11] state [0], will retry[retry=4].
Jan 18 12:10:02 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:02 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:12 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:12 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:13 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:13 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:13 apollo rn-expand[4952]: PSU read voltage b0 err [-11] state [0], will retry[retry=4].
Jan 18 12:10:13 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:13 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:13 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:14 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:14 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:14 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:14 apollo rn-expand[4952]: PSU read voltage b2 err [-11] state [0], will retry[retry=4].
Jan 18 12:10:14 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:14 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:78
Jan 18 12:10:15 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:15 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:78
Jan 18 12:10:19 apollo snapperd[6268]: THROW: subvolume is not a btrfs subvolume
Jan 18 12:10:19 apollo snapperd[6268]: reading failed
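
For anyone wanting to summarize a log like the one above, here is a minimal Python sketch that tallies the `i2c_nct6775_xfer` timeout lines by address and command. The function name `count_i2c_timeouts` and the shortened `SAMPLE` are illustrative only; the regular expression is modeled on the lines in this excerpt and may need adjusting for other firmware versions. As a side note, the `b0`/`b2` values in the `rn-expand` lines look like the 8-bit forms of 7-bit addresses 58 and 59 (hex), but nothing in this thread confirms that reading.

```python
import re
from collections import Counter

# Pattern modeled on the kernel lines quoted in this thread, e.g.
#   "... kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88"
TIMEOUT_RE = re.compile(
    r"i2c_nct6775_xfer: data timed out addr:([0-9a-fA-F]+) cmd:([0-9a-fA-F]+)"
)


def count_i2c_timeouts(lines):
    """Return a Counter keyed by (addr, cmd) for each I2C timeout line."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the log excerpt above (shortened for illustration).
SAMPLE = """\
Jan 18 12:10:01 apollo kernel: i2c_nct6775_xfer: data timed out addr:59 cmd:88
Jan 18 12:10:02 apollo rn-expand[4952]: PSU read voltage b2 err [-11] state [0], will retry[retry=4].
Jan 18 12:10:12 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:88
Jan 18 12:10:14 apollo kernel: i2c_nct6775_xfer: data timed out addr:58 cmd:78
""".splitlines()

if __name__ == "__main__":
    for (addr, cmd), n in sorted(count_i2c_timeouts(SAMPLE).items()):
        print(f"addr {addr} cmd {cmd}: {n} timeout(s)")
```

Run against a full log file (for example by passing `open(path)` instead of `SAMPLE`), this shows at a glance whether both addresses are timing out or only one.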
  

 

Message 5 of 14
c3po
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

I debugged one RMA RN3220 with this problem. It turned out that the problem might be related to pollutants in the air 😞

See the attached picture (RN4220-BP-Dust.JPG). The black dust is weakly conductive, which resulted in an I2C bus voltage drop on the backplane. While that does not affect backplane I2C bus operation, it does affect the I2C bus buffer for the PSU PMBus, which is sensitive to voltage level. After I cleaned the backplane with a dust blower, the system was back to normal.

Message 6 of 14
c3po
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

If the PSU is not detected right after boot-up, I would appreciate it if you could try this and report back:

1. Gracefully shut down the NAS.

2. Take out all hard drives (keep them in order and put them back in the same order later, even though this is not strictly needed).

3. Visually inspect the backplane for excessive dust. Use a can of compressed air to clean the backplane, especially the area with components. If cleaning from the front does not recover PSU monitoring, you can also get access to the back of the backplane by removing the top cover.

4. Replace the drives (and the cover).

5. Check whether the PSU is found; chances are it will be.

Message 7 of 14
scottonaharley
Aspirant

Re: PSU not detected after upgrade to FW 6.9.1

While your suggestion is excellent, dust is probably not the cause in this case.  The units operate in a temperature-controlled, clean environment.  While it is not as clean as a true clean room, the air is still very clean.  So clean that units operating for a year or more have no visible dust on the fan grills or inside.  In addition, this issue began on the reboot immediately following the update to 6.9.1 from 6.8.x, so it is likely related to that change rather than a hardware issue.

 

Since this unit is a storage domain for a virtual server environment I will have to wait until the maintenance window is available to take it offline and open the cover to clean the backplane (both on the drive side and on the back side) as well as the PSU connection points.

 

 

Message 8 of 14
c3po
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

There is a serial console port on the back of the RN4220/RN3220. The settings are 115200-8N1.

The BIOS posts two types of message regarding the PSU during POST (Power On Self Test):

PSU x is not present (This indicates that PSU is not detected)

PSU x is not electrified (Normal message)

If the BIOS manages to detect the PSU but the firmware fails to, then we know for sure the problem is caused by the firmware update.
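
The comparison described above can be sketched in a few lines of Python, assuming the serial console output has already been captured to text. This is only an illustration: the exact console wording and PSU numbering are assumptions based on the two messages quoted in this post, and `psus_missing_in_bios` and `triage` are hypothetical helper names, not part of any NETGEAR tool.

```python
import re

# Assumed wording, taken from the post above: "PSU x is not present".
# Real console output may differ in spacing or numbering.
NOT_PRESENT_RE = re.compile(r"PSU\s+(\d+)\s+is not present")

HARDWARE_HINT = "BIOS also misses it: look at hardware / I2C bus"
FIRMWARE_HINT = "BIOS did not flag it: points at firmware"


def psus_missing_in_bios(console_text):
    """Return the set of PSU numbers the BIOS reported as not present."""
    return {int(n) for n in NOT_PRESENT_RE.findall(console_text)}


def triage(console_text, firmware_missing):
    """For each PSU the firmware cannot see, say where to look first.

    firmware_missing: iterable of PSU numbers reported missing by the OS/GUI.
    """
    bios_missing = psus_missing_in_bios(console_text)
    return {
        psu: HARDWARE_HINT if psu in bios_missing else FIRMWARE_HINT
        for psu in sorted(firmware_missing)
    }


if __name__ == "__main__":
    captured = "PSU 1 is not present\nPSU 2 is not electrified\n"
    for psu, hint in triage(captured, {1, 2}).items():
        print(f"PSU {psu}: {hint}")
```

Any PSU that the BIOS does not flag as "not present" is treated here as detected by the BIOS, which is a simplification of the rule stated above.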

 

We did review the I2C driver code history; it has not changed for a long time because the PMBus is driven by a mature Nuvoton super I/O chip.

 

If the server is running in a server room, I agree that the chance of pollutants or moisture causing the problem is extremely low. We have many RN4220/RN3220 units running in not-so-well-controlled labs, and none of these systems has this issue. We have done many firmware updates and downgrades across versions, and these were OK.

 

I don't have any other easy, no-risk ways for you to troubleshoot this problem. If you are comfortable with handling drives and screws (several dozen of them!), I can send you a good drive cage with a backplane, and you can swap out yours to see if the problem goes away. The reason I suspect the backplane is that only the backplane shares an I2C bus with the PSU. All other I2C devices are on a different I2C bus.

 

 

Message 9 of 14
scottonaharley
Aspirant

Re: PSU not detected after upgrade to FW 6.9.1

I'll check it out this weekend during the scheduled maintenance window and see what the results are.

 

Message 10 of 14
scottonaharley
Aspirant

Re: PSU not detected after upgrade to FW 6.9.1

This past weekend I was able to shut down the unit, remove the power cords, remove the power supplies, release the drives from each slot and blow out any dust (there was no visible dust inside the unit, and while compressed air was blowing there was no visible dust being displaced), and clean any dirt from the contacts on the back of the power supplies (there was no visible difference after cleaning the contacts).

 

The unit was reassembled and powered up.  Since 6.9.2 had been downloaded it was installed as the machine rebooted.  

 

Upon accessing the GUI the power supplies were both shown as installed and functioning.  

 

While the problem is solved, I'm not 100% sure that dirt was the cause; the reappearance of the power supplies could instead be associated with the extended amount of time the unit was powered down.

Message 11 of 14
c3po
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

The RMA unit I debugged was not able to detect the PSU (across many cold boots and warm reboots) until I disassembled the backplane and cleaned it. Based on your observations, I will double-check the I2C signal traces to see whether shock or vibration could somehow cause a signal integrity issue on the I2C bus. Thanks for helping us try to pin down the root cause. Although we moved drives back and forth many times between good and bad units, and the bad unit stayed bad with all drives going in and out many times, I tend to agree with you that it might not be caused by dust.

Message 12 of 14
scottonaharley
Aspirant

Re: PSU not detected after upgrade to FW 6.9.1

There is one other observation I made while the unit was disassembled. I cleaned the power supply connector by applying an electronic contact cleaner spray to a cotton swab and then using the swab to clean the contact surface. While there was little or no indication of dirt on the swab, I noticed marks on the power supply contact surface, which I presume to be from the connector on the backplane. It wouldn't be unusual to see such marks where there is metal-to-metal contact, but rather than being centered on each of the power supply contact surfaces, the marks were right on the edge.

Perhaps the issue is related to the alignment of the back plane receptacle for the power supply to the connector on the power supply?

Since there were no code changes (per your earlier post) to the power monitoring function before the problem or after it was resolved, this points to a mechanical issue (the physical connection), with the root cause being either dirt, oxidation of the contacts, or physical alignment of the contacts.

Message 13 of 14
c3po
NETGEAR Expert

Re: PSU not detected after upgrade to FW 6.9.1

Thanks for your input! Should this issue happen again on any other RN4220/3220, I will ask our tech support to determine whether it is caused by dirt, oxidation or misalignment. It has to be one of these three possible causes.

 

The PSU has its own backplane in the PSU chamber. If the problem is caused by misalignment, I imagine that swapping the two PSUs (or pulling the offending PSU) may get one or both PSUs back.

We will first try to isolate whether it is caused by the hard drive backplane or the PSU backplane, and then pin down the root cause so that we know how to solve or prevent it.

Message 14 of 14