scottonaharley
Dec 06, 2017Aspirant
PSU not detected after upgrade to FW 6.9.1
After upgrading to FW version 6.9.1 from 6.8.x, I see log messages indicating that PSU1 and PSU2 are not detected. The unit is populated with twelve 4TB drives. I've attempted to upload the logs ...
scottonaharley
Jan 25, 2018Aspirant
While your suggestion is excellent, it probably does not apply in this case. The units operate in a temperature-controlled, clean environment. It is not as clean as a true clean room, but the air is still very clean, so clean that units operating for a year or more have no visible dust on the fan grilles or inside. In addition, this issue began on the reboot immediately following the update from 6.8.x to 6.9.1, so it is likely related to that change rather than a hardware issue.
Since this unit is a storage domain for a virtual server environment, I will have to wait until the maintenance window is available to take it offline, open the cover, and clean the backplane (both the drive side and the back side) as well as the PSU connection points.
c3po
Jan 25, 2018NETGEAR Expert
There is a serial console port on the back of the RN4220/RN3220. The settings are 115200-8N1.
The BIOS prints two types of PSU message during POST (Power-On Self-Test):
PSU x is not present (This indicates that PSU is not detected)
PSU x is not electrified (Normal message)
If the BIOS detects the PSU but the firmware does not, then we know for sure that the problem is caused by the firmware update.
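The check described above can be sketched in Python. This is only an illustration: it assumes the POST output has already been captured to text with a terminal program on the serial console (115200-8N1), and that the messages match the wording quoted in this post. The function names `psus_missing_at_post` and `likely_cause` are hypothetical, and the "check hardware" branch is an assumption drawn from the rest of this thread rather than something the BIOS reports.

```python
import re

# Assumed message format, taken from the wording quoted in this post.
# Real BIOS output may differ in spacing or capitalization.
NOT_PRESENT = re.compile(r"PSU\s*(\d+)\s+is not present", re.IGNORECASE)


def psus_missing_at_post(console_text):
    """Return the set of PSU numbers the BIOS reported as 'not present'."""
    return {int(m.group(1)) for m in NOT_PRESENT.finditer(console_text)}


def likely_cause(console_text, firmware_missing):
    """Compare BIOS POST output with the PSUs the firmware fails to detect.

    If the BIOS saw a PSU that the firmware reports missing, the firmware
    is the suspect. Otherwise the firmware is not singled out, and the
    hardware path (connector, backplane) is worth checking (assumption).
    """
    bios_missing = psus_missing_at_post(console_text)
    verdict = {}
    for psu in sorted(firmware_missing):
        if psu in bios_missing:
            verdict[psu] = "not firmware alone; check hardware"
        else:
            verdict[psu] = "firmware"
    return verdict


# Hypothetical captured POST output for a two-PSU unit.
sample = """PSU 1 is not present
PSU 2 is not electrified
"""
print(likely_cause(sample, firmware_missing={1, 2}))
```

With this sample, PSU 1 is missing at the BIOS level as well, so the firmware is not singled out for it, while PSU 2 was seen by the BIOS and would point to the firmware.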
We reviewed the I2C driver code history; it has not changed for a long time, because the PMBus is driven by a mature Nuvoton Super I/O chip.
If the unit is running in a server room, I agree that the chance of pollutants or moisture causing the problem is extremely low. We have many RN4220/RN3220 units running in not-so-well-controlled labs, and none of them has this issue. We have upgraded and downgraded these units through many firmware versions without problems.
I don't have any other easy, no-risk way for you to troubleshoot this problem. If you are comfortable handling drives and screws (several dozen of them!), I can send you a known-good drive cage with backplane, and you can swap out yours to see whether the problem goes away. I suspect the backplane because it is the only component that shares an I2C bus with the PSUs. All other I2C devices are on a different I2C bus.
- scottonaharleyJan 25, 2018Aspirant
I'll check it out this weekend during the scheduled maintenance window and see what the results are.
- scottonaharleyJan 31, 2018Aspirant
This past weekend I was able to shut down the unit, remove the power cords, remove the power supplies, release the drives from each slot, and blow out any dust (there was no visible dust inside the unit, and no visible dust was displaced while the compressed air was blowing). I also cleaned the contacts on the back of the power supplies (there was no visible difference after cleaning the contacts).
The unit was reassembled and powered up. Since 6.9.2 had already been downloaded, it was installed as the machine rebooted.
Upon accessing the GUI, both power supplies were shown as installed and functioning.
While the problem is solved, I'm not 100% sure that dirt was the cause; the reappearance of the power supplies could instead be associated with the extended amount of time the unit was powered down.
- c3poJan 31, 2018NETGEAR Expert
The RMA unit I debugged could not detect its PSU, through many cold boots and warm reboots, until I disassembled the backplane and cleaned it. Based on your observations, I will double-check the I2C signal traces to see whether shock or vibration could somehow cause a signal integrity issue on the I2C bus. Thanks for helping us try to pin down the root cause. Although we moved drives back and forth many times between good and bad units, and the bad unit stayed bad with all drives going in and out many times, I tend to agree with you that it might not be caused by dust.
- scottonaharleyFeb 01, 2018Aspirant
There is one other observation I made while the unit was disassembled. I cleaned the power supply connector with an electronic contact cleaner spray, applied to a cotton swab that I then used to clean the contact surfaces. While there was little or no indication of dirt on the swab, there were marks on the power supply contact surfaces, which I presume came from the connector on the backplane. It wouldn't be unusual to see such marks where there is metal-to-metal contact, but rather than being centered on each of the power supply contact surfaces, the marks were right on the edge.
Perhaps the issue is related to the alignment of the backplane receptacle with the connector on the power supply?
Since there were no code changes (per your earlier post) to the power monitoring function before the problem appeared or after it was resolved, this points to a mechanical issue (the physical connection), with the root cause being either dirt, oxidation of the contacts, or physical alignment of the contacts.
- c3poFeb 01, 2018NETGEAR Expert
Thanks for your input! Should this issue happen again on any other RN4220/3220, I will ask our tech support to determine whether it is caused by dirt, oxidation, or misalignment. It has to be one of these three possible causes.
The PSUs have their own backplane in the PSU chamber. If the problem is caused by misalignment, I imagine that swapping the two PSUs (or pulling the offending PSU) may bring one or both PSUs back.
We will first try to isolate whether it is caused by the hard drive backplane or the PSU backplane, then pin down the root cause so that we know how to solve or prevent it.