Joey17
Nov 30, 2020Tutor
Orbi RBR50 Pulsing White and no Network Appearing
A week ago my Orbi mesh network completely went down (an RBR50 with 2 RBS50 satellites). The RBR50 was pulsing white and the network could not be found on any of my devices. I believe the satellites...
Joey17
Dec 02, 2020Tutor
Connecting to the serial ports at least gave me a bit more information about the problem. It looks like there's something corrupted about the OpenWrt image because it runs into an error verifying the checksum:
## Checking Image at 843bffc0 ...
   Legacy image found
   Image Name:   ARM OpenWrt Linux-3.14.77
   Image Type:   ARM Linux Kernel Image (lzma compressed)
   Data Size:    39061504 Bytes = 37.3 MiB
   Load Address: 40908000
   Entry Point:  40908000
   Verifying Checksum ... Bad Data CRC
rootfs checksum error
## Booting kernel from FIT Image at 84000000 ...
   Using 'config@1' configuration
   Trying 'kernel@1' kernel subimage
     Description:  ARM OpenWrt Linux-3.14.77
     Type:         Kernel Image
     Compression:  gzip compressed
     Data Start:   0x840000e4
     Data Size:    3523429 Bytes = 3.4 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x80208000
     Entry Point:  0x80208000
     Hash algo:    crc32
     Hash value:   8b1e9c99
     Hash algo:    sha1
     Hash value:   d3b5236481fead36da4d4b1da5363d14b5fb553e
   Verifying Hash Integrity ... crc32 error!
Bad hash value for 'hash@1' hash node in 'kernel@1' image node
Bad Data Hash
Is there a way to reinstall OpenWrt? I'm assuming the standard TFTP firmware updates don't update OpenWrt, but correct me if I'm wrong. Using TFTP to upload a new firmware version doesn't change this error message.
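For anyone who wants to check an image off the device, here is a rough Python sketch of the kind of check behind the "Verifying Checksum ... Bad Data CRC" line. It assumes the stock U-Boot legacy image layout (a 64-byte big-endian header followed by the payload, both covered by CRC32); Netgear's build may wrap or extend this, so treat it as an illustration rather than a description of what the Orbi boot loader does. The function names and the OS/arch/type/compression codes in the builder are illustrative choices.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# magic, hcrc, time, size, load, entry, dcrc, os, arch, type, comp, name[32]
HEADER_FMT = ">7I4B32s"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 64 bytes


def check_legacy_uimage(blob: bytes) -> dict:
    """Verify the header CRC and data CRC of a legacy uImage blob."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("too short for a legacy uImage header")
    (magic, hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack(HEADER_FMT, blob[:HEADER_SIZE])
    if magic != UIMAGE_MAGIC:
        raise ValueError("bad magic: not a legacy uImage")
    # The header CRC is computed with the hcrc field itself set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    data = blob[HEADER_SIZE:HEADER_SIZE + size]
    return {
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "size": size,
        "load": load,
        "entry": entry,
        "header_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == hcrc,
        "data_ok": len(data) == size and (zlib.crc32(data) & 0xFFFFFFFF) == dcrc,
    }


def build_legacy_uimage(payload: bytes, name: str = "test image",
                        load: int = 0x40908000, entry: int = 0x40908000) -> bytes:
    """Build a synthetic image for testing (type codes are illustrative)."""
    dcrc = zlib.crc32(payload) & 0xFFFFFFFF
    fields = [UIMAGE_MAGIC, 0, 0, len(payload), load, entry, dcrc,
              5, 2, 2, 3, name.encode("ascii")]
    header = struct.pack(HEADER_FMT, *fields)
    fields[1] = zlib.crc32(header) & 0xFFFFFFFF  # fill in the header CRC
    return struct.pack(HEADER_FMT, *fields) + payload
```

A single flipped byte in the payload is enough to make `data_ok` false while `header_ok` stays true, which matches the pattern in the log: the header parses fine but the data fails verification.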
pbarham
Dec 03, 2020Apprentice
Joey17 wrote:Connecting to the serial ports at least gave me a bit more information about the problem. It looks like there's something corrupted about the OpenWrt image because it runs into an error verifying the checksum:
Is there a way to reinstall OpenWrt? I'm assuming the standard TFTP firmware updates don't update OpenWrt, but correct me if I'm wrong. Using TFTP to upload a new firmware version doesn't change this error message.
As far as I understand it, the `fw_recovery` command of the U-Boot console should upload a firmware image (from the Netgear site) that includes the OpenWrt firmware and various other firmware blobs that seem to be used for the radio interfaces (at least, that's what the log output seemed to show). But just because you sent an image from a laptop doesn't necessarily mean the router will accept it!
What I did to recover my router was:
1) Press a key on the serial console to interrupt the boot process and drop into the U-Boot command line
2) There you can poke around and examine the various config settings (`board_data_show` and various related commands)
3) You can start a tftp server that will accept a firmware image and write it to the flash memory using the `fw_recovery` command. It will tell you the IP address it's listening on (which was 192.168.1.250 for my router!) and you'll see logging when it receives data.
4) You can get U-Boot to check the image and validate it using the `bootdni` command (this will show you the kernel, OpenWrt versions, etc.)
What console output do you get from step 3)? The first few times I tried this the update process complained that the new firmware image wasn't valid for the device. This was because the hardware id and model id had been corrupted in the config memory. I got error messages telling me what was expected and what was found. I couldn't upload the new image successfully until I'd fixed all these.
[I wrote a post with more detailed info in the thread that you linked to in an earlier message.]
I also at one point found some relevant threads on an OpenWrt hacking site suggesting that the U-Boot firmware can run diagnostic tests on your flash memory to see if the chip is damaged (I expect this is unlikely).
- pbarhamDec 03, 2020Apprentice
pbarham wrote:The first few times I tried this the update process complained that the new firmware image wasn't valid for the device. This was because the hardware id and model id had been corrupted in the config memory. I got error messages telling me what was expected and what was found. I couldn't upload the new image successfully until I'd fixed all these.
To clarify: as far as the tftp client on my laptop was concerned, the firmware file had been sent (PUT) to the router successfully. But the router was failing to validate the image file and refusing to write it to the flash. It's impossible to tell whether this is happening without looking at the serial console output! (Three things were going wrong for me: not setting 'binary mode' on the TFTP transfer, and the router not believing that it was the correct image file, for two different reasons.)
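On the 'binary mode' point: TFTP (RFC 1350) names the transfer mode in the write request, and "octet" is the binary one, while "netascii" rewrites line-ending bytes and so corrupts a firmware image. The Python sketch below shows what a binary PUT looks like on the wire. The packet helpers follow the RFC; `tftp_put` is a bare-bones illustration with no retries, and the filename and the 192.168.1.250 address in the usage note are just the values from this thread, not something every router will use.

```python
import socket
import struct

OP_WRQ, OP_DATA, OP_ACK, OP_ERROR = 2, 3, 4, 5
BLOCK_SIZE = 512


def build_wrq(filename: str, mode: str = "octet") -> bytes:
    """Write request: opcode, filename, NUL, mode, NUL."""
    return (struct.pack(">H", OP_WRQ) + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def data_packets(payload: bytes):
    """Yield DATA packets; the last one carries fewer than 512 bytes."""
    block, offset = 1, 0
    while True:
        chunk = payload[offset:offset + BLOCK_SIZE]
        yield struct.pack(">HH", OP_DATA, block & 0xFFFF) + chunk
        if len(chunk) < BLOCK_SIZE:
            break  # a short (possibly empty) block ends the transfer
        offset += BLOCK_SIZE
        block += 1


def parse_ack(packet: bytes) -> int:
    """Return the acknowledged block number, or raise on anything else."""
    opcode, block = struct.unpack(">HH", packet[:4])
    if opcode != OP_ACK:
        raise ValueError(f"expected ACK, got opcode {opcode}: {packet[4:]!r}")
    return block


def tftp_put(host: str, filename: str, payload: bytes,
             port: int = 69, timeout: float = 5.0) -> None:
    """Send payload in octet mode (sketch: no retransmission on timeout)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_wrq(filename), (host, port))
        reply, peer = sock.recvfrom(516)  # server answers from a new port
        if parse_ack(reply) != 0:
            raise IOError("write request was not acknowledged with block 0")
        for packet in data_packets(payload):
            sock.sendto(packet, peer)
            reply, _ = sock.recvfrom(516)
            expected = struct.unpack(">H", packet[2:4])[0]
            if parse_ack(reply) != expected:
                raise IOError(f"unexpected ACK while sending block {expected}")
```

Usage would be along the lines of `tftp_put("192.168.1.250", "firmware.img", open("firmware.img", "rb").read())`. Even when every block is acknowledged, the router can still reject the image afterwards, so the serial console remains the only place to confirm the flash write.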
So long as you're happy you have the correct firmware image from Netgear for your router (RBR50 vs RBR50v2), you can just set the router's hardware ID to the ID that the firmware image file contains and the error messages go away. Once the firmware is written to flash successfully you should (hopefully) see different behavior after a reset.
Of course, there may be something totally different happening in your case - but from what I read on the OpenWrt sites, it's pretty hard to get the router into a totally unrecoverable state. So long as U-Boot is working you ought to be able to get things back (so long as there's no actual hardware failure).
The U-Boot help command shows commands called 'nand' and 'mmc' that I think are the ones where the tests and utilities live for the flash. The firmware gets uploaded to the MMC memory.
- Joey17Dec 03, 2020Tutor
Thanks pbarham! That's all very helpful.
I've gotten farther now in some ways, but some new issues have surfaced. Using the `fw_recovery` command worked well and it didn't seem to have any problems accepting the firmware image over tftp. I was able to access the Orbi webpage after that for the first time in this whole process. Interestingly the web page showed it was running firmware version 2.3.5.30 even though the version I transferred via tftp was 2.7.1.60. And it still remembered the SSID even though I had factory reset it many times during this process.
Overall though it seemed to be functioning well at this point. I could connect to the network from one of my devices and the router was connecting to one of my satellites. I wasn't connected to my modem though, so I couldn't test the actual internet connection. Once I did move it back near the modem and connect it, though, it failed to boot up again, with the same checksum validation error as before. As before, I used the `fw_recovery` command and it got past the checksum validation error, but eventually ran into a new error:
Mission mode: Firmware CHIP Version -1610612717
ath10k_ahb a000000.wifi: Firmware loaded from user helper succesfully
ol_swap_seg_alloc: Successfully allocated memory for SWAP size=262144
ath10k_ahb a000000.wifi: Firmware loaded from user helper succesfully
Swap: bytes_left to copy: fw:16; dma_page:101685
Swap: wrong length read:0
ol_swap_wlan_memory_expansion: Swap total_bytes copied: 160459 Target address 417c40
scn=dbac04c0 target_write_addr=417c40 seg_info=deb29a10
ol_transfer_swap_struct:Code swap structure successfully downloaded for bin type =2
bin_filename=IPQ4019/hw.1/athwlan.bin swap_filename=/lib/firmware/IPQ4019/hw.1/athwlan.codeswap.bin
ol_transfer_bin_file: Downloading firmware file: IPQ4019/hw.1/athwlan.bin
SQUASHFS error: xz decompression failed, data probably corrupt
SQUASHFS error: squashfs_read_data failed to read block 0xc42c0d
SQUASHFS error: xz decompression failed, data probably corrupt
SQUASHFS error: squashfs_read_data failed to read block 0xc42c0d
ath10k_ahb a000000.wifi: firmware, attempted to load /lib/firmware/IPQ4019/hw.1/athwlan.bin, but failed with error -5
ath10k_ahb a000000.wifi: Direct firmware load failed with error -5
ath10k_ahb a000000.wifi: Falling back to user helper
SQUASHFS error: xz decompression failed, data probably corrupt
SQUASHFS error: squashfs_read_data failed to read block 0xc42c0d
ath10k_ahb a000000.wifi: Firmware loaded from user helper succesfully
ol_transfer_bin_file: Failed to get IPQ4019/hw.1/athwlan.bin
ol_ath_download_firmware : ol_transfer_bin_file failed. phase:3
ol_ath_attach error status -1
Though upon retrying now I'm not even getting this far. Now I'm strangely struggling to even turn on the router successfully. The power LED always turns green, but most of the time the LED ring isn't lighting up and no output gets transmitted over the serial connection. It does successfully turn on maybe one out of every 20 or so attempts, but even when it does it goes back to that original checksum error and if I try the tftp upload, it hangs at the step of resetting the CPU immediately after rebooting.
One time the LED ring started flashing green and I got this kernel panic message, which potentially indicates a hardware problem:
@@@@@@@@@ DNI Kernel panic @@@@@@@@@@
mmc0: Starting tests of card mmc0:0001...
mmc0: Test case 3. Basic read (with data verification)...
mmc_test_verify_read_____dni_panic_flag1
mmc0: Result: OK
mmc0: Tests completed.
console_panic_event_________write 1 times
mmc0: Starting tests of card mmc0:0001...
mmc0: Test case 2. Basic write (with data verification)...
mmc0: Result: ERROR (-22)
mmc0: Tests completed.
mmc0: Starting tests of card mmc0:0001...
mmc0: Test case 5. Basic proc write (with data verification)...
mmc0: Result: ERROR (-22)
mmc0: Tests completed.
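When a serial capture gets long, it can help to boil the mmc test output down to one line per test case. The Python sketch below pairs each "Test case" line with the "Result" line that follows it; the regular expression is written against the exact wording in the log above, so other firmware versions may need adjustments. The function name is made up for illustration.

```python
import re

# Matches either a "Test case N. <name>..." line or a "Result: STATUS (code)" line.
PATTERN = re.compile(
    r"Test case (?P<num>\d+)\. (?P<name>.+?)\.\.\."
    r"|Result: (?P<status>[A-Z]+)(?: \((?P<code>-?\d+)\))?"
)


def summarize_mmc_tests(log: str) -> list:
    """Return one dict per test case with its status and optional error code."""
    results = []
    pending = None  # (case number, name) waiting for its Result line
    for match in PATTERN.finditer(log):
        if match.group("num"):
            pending = (int(match.group("num")), match.group("name"))
        elif pending is not None:
            code = match.group("code")
            results.append({
                "case": pending[0],
                "name": pending[1],
                "status": match.group("status"),
                "code": int(code) if code is not None else None,
            })
            pending = None
    return results
```

Run against the panic log, this gives a passing read test (case 3) and two failing write tests (cases 2 and 5), both with code -22, which on Linux corresponds to EINVAL. Reads passing while writes fail fits the pattern of firmware uploads that appear to succeed but fail verification on the next boot.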
- pbarhamDec 03, 2020Apprentice
Unfortunately, that's all starting to look like a genuine hardware fault to me. The log messages suggest that it's having trouble reliably writing the MMC flash chip.
If it were me, I'd try to find some low level flash read/write test commands in the 'mmc' part of the U-Boot console and see if those pass. If not, then your next steps probably depend on how comfortable you are with electronics and soldering ;-( I expect that the chip would be a small surface mount part - so *really* hard to desolder and replace. There might be some dry joints on the legs or on a nearby capacitor ... but I doubt that I'd have the confidence to work on that myself given the limited tools I have access to at home.
Like you, I'm also a little suspicious of the firmware version mismatch when you got it to boot. That does suggest one possibility.... you'll note that there are actually two fw_recovery commands and there is another command (I can't remember its name) to set the 'boot partition'. It may be the case that the boot loader allows for two sets of firmware and you can configure which one is booted. You *may* be successfully updating one, but booting the one that's still corrupt?
(Also, if you do manage to get access to the web interface again, then you could try to do a fw update via that.)
Long shots, I know...
Another Hail Mary, if it's an unrepairable hardware fault, would be to buy a cheap 'broken' RBR50 from eBay and see if you can recover *that* one! (I'm sure most ppl give up and sell theirs for parts!)
Good luck!
P
- Joey17Dec 09, 2020Tutor
Thanks again for all your help pbarham. I unfortunately didn't get it working again... I'm pretty convinced at this point that it was in fact a hardware failure :(
Oh well, at least I learned a lot trying to fix it. Your suggestions definitely helped me get a lot further than I would have otherwise so I really appreciate it!