4.1.9(T6) Data Corruption

slayerking
Aspirant

4.1.9(T6) Data Corruption

After updating to the T6 beta on my ReadyNAS Duo, all downloads would be corrupt. For example, if I used uTorrent, as I have done for years, to download straight to the NAS, all the data would be corrupt. The same goes for other download managers too. I tested with Firefox's download manager and got the same thing: corrupt data.
Message 1 of 41
yoh-dah
Guide

Re: 4.1.9(T6) Data Corruption

By chance did you change the memory module on the Duo?
Message 2 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

Nope, just updated the firmware to T6. I'm back on 4.1.8 now and it's fine again. It was fine copying from PC to the NAS, but if I downloaded a file with, say, Firefox that had the NAS as its destination, the file would be corrupt 100% of the time.
Message 3 of 41
mdgm-ntgr
NETGEAR Employee Retired

Re: 4.1.9(T6) Data Corruption

What version of Firefox?
Message 4 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

mdgm wrote:
What version of Firefox?

All of them. I tried 11, 10, 8, 7 and so on down to 3. It's not just Firefox; I only used Firefox as an example. Any program I tried that downloaded from the net produced the corruption. Transferring files from PC to NAS worked fine.
Message 5 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

I had been running this firmware for a while (since Mar 31 10:55:34) and then hit issues.

I tried writing as the disks were powering up, the file that was written was corrupt. The application (KeePass) even acknowledged the file had become corrupt.

---------------------------
KeePass
---------------------------
The new file's content does not match the data that KeePass has written, i.e. writing to the file has failed and it might be corrupted now.
Please try saving again, and if that fails, save the database to a different location.
---------------------------
OK
---------------------------

trying to open the file that had just been written resulted in:

---------------------------
KeePass
---------------------------
Failed to load the specified file!
The file signature is invalid. Either the file isn't a KeePass database file at all or it is corrupted.
---------------------------
OK
---------------------------

Agree with the previous poster (and the poster in viewtopic.php?f=17&t=61990#p350165): copying the files using Windows Explorer doesn't seem to trigger the bug, but writing directly to the NAS from a program does.

I then opened another file and tried saving, MORE corruption. I rebooted the NAS.

Saving the same file again even after the reboot of the NAS still caused corruption errors. Reverted back to T2 and it instantly fixed it (without even rebooting my PC!). Bug is CLEARLY in T6.

Update: Have reported to ReadyNAS support Case ID 18343146.
Message 6 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

Right. I think I know what causes it now. Here is what I have sent to NETGEAR support:

I have done some further testing myself and managed to find the cause of this and can now reproduce it 100% of the time without fail.

Basically, if you read information about the file while it is still being written to, the NAS running T6 corrupts it. Any of these will do it: refreshing the folder you are writing in, switching to that folder, having the file highlighted with its folder as the active window (sometimes Windows Explorer itself keeps updating as the file grows), or the program reading the file while it's writing.

So in my case, what must have happened is I had explorer open in that folder (because I must have had it open to have double clicked to open it) while KeePass was trying to write to that file, explorer kept trying to read the file-size while KeePass was writing. That corrupted it.

I have made a program in C# to demonstrate the T6 flaw. This is the best way to cause the error and observe how it has corrupted the file (unless you have a good Internet connection and don't mind re-downloading lots of large files?). Understandably sending you my programs over the Internet is a bad idea (they could be viruses) so I attach my source code instead so that you can see what the code does and decide whether you trust it enough to run it on your computer.

The one thing I can't work out is why some byte patterns seem immune to this corruption. My program writes all of the possible byte values to the file, though, and it reproduces the flaw every time (on T6 only).

CentralHeating.txt (apologies for the RAIDiator testing pun 😉 ) - Simply copy this into the free Microsoft Visual Studio Express (or one of the paid versions if you have it) as a C# Console application. You will need to change the path to a writeable place on your NAS where you want the file to be stored. When the program runs, it creates a file with a repeated pattern of bytes. Then it reads back the same file and checks to see if it is still the same pattern. If the pattern is not the same as what it should have written, the program will tell you that the file is bad!

Now normally this will work perfectly ("SUCCESS - Program read back the file exactly as it should have been written."), HOWEVER, if you open up a Windows Explorer window, go to the folder on your NAS where the file is being written and highlight it (Windows Explorer will keep updating the file size while it is being written to), maybe pressing or even holding F5 to refresh the folder a few times (just to be sure) as my program is running, then the NAS will get distracted and corrupt the pattern in the file, rendering the file useless.

In a real world scenario this would mean if I was downloading a large binary file using Firefox and I had the file I was downloading to highlighted (or I kept pressing F5 or refresh in the folder where the file was being saved) as it was writing in Windows Explorer, my downloaded file would be corrupt (this doesn't seem to happen for all downloads though).

If you would prefer, we can arrange a time for you to ring me and I can show you this remotely. Alternatively, if you don't want to compile my program and would rather I sent you one you can just run, and you have a computer you don't mind running my programs on (a test virtual machine or something?), then let me know and I will send you the .exe file.

When I go back to the T2 firmware, my program behaves normally even when I have left Windows Explorer open in the folder I am writing to and refresh it while the file in that folder is being written to. The bug is CLEARLY in T6 and I would guess has something to do with samba file locking. There is a page about Samba locking here (http://www.samba.org/samba/docs/man/Sam ... cking.html) which should help further diagnose why the connection gets broken by clients attempting to read files while another user is performing a write operation.

Hope this helps,
Matthew

P.s. On my NAS the configuration is "Disable full data journaling", "Disable journaling", "Enable fast CIFS writes" and the share ("/temp") allows Guest users to read/write (which is who I am logged in as) but the errors still occur even if I turn all these off and reboot the device.


using System;
using System.IO;

namespace CentralHeating
{
    class Program
    {
        // Where the file should be written to. The file will be deleted if it already exists.
        const string outputFile = @"\\192.168.1.3\temp\thrash\Radiator.key";

        static void Main(string[] args)
        {
            // Write the program banner.
            Console.WriteLine("----------------------------");
            Console.WriteLine("Matthew1471! CentralHeating");
            Console.WriteLine("----------------------------");
            Console.WriteLine();

            // Perform the "write of a binary file and then verify" test (deleting the file if it already exists).
            writeThenVerifyFile();

            Console.WriteLine("Program Completed. Press any key to exit.");
            Console.ReadKey();
        }

        private static void writeThenVerifyFile()
        {
            // Trash the output file if it already exists.
            if (File.Exists(outputFile)) { File.Delete(outputFile); }

            // Whether there was a bad byte in the file.
            bool badByte = false;

            // Update the user.
            Console.WriteLine("Starting Test..");

            // The using statement in C# ensures the resources get cleaned up properly.
            using (FileStream fileStream = new FileStream(outputFile, FileMode.CreateNew, FileAccess.ReadWrite, FileShare.ReadWrite))
            {
                // We want to make a 25MB file.
                for (int loopCount = 0; loopCount < 102400; loopCount++)
                {
                    // Update the user on our progress.
                    if (loopCount % 10240 == 1) { Console.WriteLine("Writing " + (loopCount / 1024) + "%.."); }

                    // Write to the file all the bytes from 0x00 - 0xFF.
                    for (int count = 0; count < 256; count++) { fileStream.WriteByte((byte)count); }
                }
            }

            // Open the file we have just written.
            using (FileStream fileStream = new FileStream(outputFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                // We want to read the 25MB file.
                for (int loopCount = 0; loopCount < 102400; loopCount++)
                {
                    // Update the user on our progress.
                    if (loopCount % 10240 == 1) { Console.WriteLine("Reading " + (loopCount / 1024) + "%.."); }

                    // Read 256 bytes.
                    for (int count = 0; count < 256; count++)
                    {
                        // Read in a byte from the file.
                        byte readByte = (byte)fileStream.ReadByte();

                        // Check if the read byte matches the expected byte for this position in the file.
                        if (readByte != (byte)count)
                        {
                            // This now could be classed as a "corrupt" file.
                            badByte = true;

                            // Notify the user that a bad byte was encountered.
                            Console.WriteLine("ERROR - While reading the file, what should be 0x" + count.ToString("X2") + ", we instead found 0x" + readByte.ToString("X2") + ".");
                        }
                    }
                }
            }

            Console.WriteLine();

            // If there was not a bad byte in the file then the user will need a success message.
            if (!badByte) { Console.WriteLine("SUCCESS - Program read back the file exactly as it should have been written."); }

            Console.WriteLine();
        }
    }
}
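
The manual half of this test (keeping Explorer open on the folder and pressing F5) could probably be automated by polling the file's size from a second thread while the pattern is being written. The following is only a rough sketch under that assumption: it has not been confirmed that a FileInfo.Length query sends the same SMB requests as an Explorer refresh, so a clean run proves nothing. The names ConcurrentStatRepro, WriteWhilePolling and FirstBadOffset, the 50 ms polling interval and the default temp-folder path are all made up for illustration; pass a path on the NAS share as the first argument. It also writes in 256-byte blocks rather than byte by byte, which may or may not matter.

```csharp
using System;
using System.IO;
using System.Threading;

static class ConcurrentStatRepro
{
    // Writes `blocks` repetitions of the 0x00-0xFF pattern to `path` while a
    // background thread keeps asking for the file's size, then verifies the file.
    // Returns the offset of the first bad byte, or -1 if the file is intact.
    public static long WriteWhilePolling(string path, int blocks)
    {
        byte[] pattern = new byte[256];
        for (int i = 0; i < pattern.Length; i++) { pattern[i] = (byte)i; }

        using (var stop = new ManualResetEventSlim(false))
        {
            // Stand-in for Explorer refreshing the folder (an assumption, see above).
            var poller = new Thread(() =>
            {
                while (!stop.IsSet)
                {
                    try { long ignored = new FileInfo(path).Length; }
                    catch (IOException) { /* file not created yet or briefly unavailable */ }
                    Thread.Sleep(50); // arbitrary interval
                }
            });
            poller.Start();
            try
            {
                using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.ReadWrite))
                {
                    for (int block = 0; block < blocks; block++) { stream.Write(pattern, 0, pattern.Length); }
                }
            }
            finally
            {
                stop.Set();
                poller.Join();
            }
        }
        return FirstBadOffset(path, (long)blocks * pattern.Length);
    }

    // Returns the offset of the first byte that breaks the repeating pattern
    // (a truncated or over-long file also counts), or -1 if everything matches.
    public static long FirstBadOffset(string path, long expectedLength)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            for (long offset = 0; offset < expectedLength; offset++)
            {
                // ReadByte returns -1 at end of file, which never equals the expected value.
                if (stream.ReadByte() != (int)(offset % 256)) { return offset; }
            }
            return stream.ReadByte() == -1 ? -1 : expectedLength;
        }
    }

    static void Main(string[] args)
    {
        // Hypothetical default; pass a path on the NAS share to test the NAS itself.
        string path = args.Length > 0 ? args[0] : Path.Combine(Path.GetTempPath(), "Radiator.key");
        long bad = WriteWhilePolling(path, 102400); // 102400 x 256 bytes = 25MB, as in the program above
        Console.WriteLine(bad < 0 ? "SUCCESS - file intact." : "ERROR - first bad byte at offset " + bad + ".");
    }
}
```

Reporting only the first bad offset (instead of printing every bad byte) keeps the output short when a file comes back truncated.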
Message 7 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

Unfortunately I don't have Explorer open in any of the NAS folders. If I use any program to write to the NAS, the file gets corrupted, but not if I copy from PC to the NAS myself. But yes, it is clearly a bug in T6.
Message 8 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

Maybe your applications are reading the files back as they are writing them?
Message 9 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

It's a possibility. I know uTorrent would at some point; not sure about the others though. All I noticed was that the files were never complete on the NAS and gave the errors you posted earlier. I could download the same file time after time, and every time it would be a different size and 100% of the time corrupt.
Message 10 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

Just thinking out loud but I wonder if some anti-virus products as part of their "real time" scanning would re-scan (i.e. read) a file that has just been written to? That way even an application's data where the application does not explicitly read the file would corrupt.

As for BitTorrent, I guess it depends on whether the particular implementation caches the pieces that it is trying to send in memory (if possible; granted, some machines might have so little memory that almost everything has to come straight from storage) or whether it goes to the disk the moment a peer requests a piece.

Either way whatever the cause I'm happy that I have a concrete test case that should be reproducible so the devs aren't having to spend days trying to figure out what is introducing occasional corruption and I'm pretty happy that T2 is pretty safe with my data (and can easily test any future versions with my program rather than waiting until I lose a file again 😉 ).

I'm sure by fixing my data issue they will fix yours.

Did you try T2? T2 fixed all the performance issues with 4.1.8 for me. I wouldn't mind betting T2 will not give you corruption either.
Message 11 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

Yeah T2 worked well but I deleted it so just went back to 4.1.8
Message 12 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

T2 - http://www.readynas.com/download/beta/r ... r-4.1.9-T2

I hope they work out the issues with T6, because I really liked the fact there was a newer Samba in it.

The reason I drifted into beta territory in the first place was because 4.1.8 has a speed issue (http://blogx.co.uk/Comments.asp?Entry=855)
Message 13 of 41
ahpsi1
Tutor

Re: 4.1.9(T6) Data Corruption

Crap, I just corrupted half a directory of inventory Excel files by upgrading to T6 on an 1100. Was running T2 and didn't have any issues but wanted the newer Samba as a new DC goes live next week. Now I must revert to off-site backups for the corrupted files. Should have read this thread before upgrading. Hard to believe T6 is still available for D/L given what it can do to your data. :?
Message 14 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

ahpsi, sorry to hear that.

It might be worth you ringing Netgear and asking if you can be added to my case (Case ID 18343146). Might help to get T6 pulled quicker. I think because I've been the only user to raise the issue so far they have hesitated over pulling it.

I hope they do keep the new Samba (and just fix whatever the problem is) rather than just decide T2 *is* the release.
Message 15 of 41
ahpsi1
Tutor

Re: 4.1.9(T6) Data Corruption

Well, I do have great faith in the Jedi and know they monitor the forums. In the end (and I may be wrong in this) I think Netgear farms the first-level tech support out through the same channels their other products are supported through, but when the issues get to a certain level they come back into a smaller group that actually knows what is going on and can actually affect product development. I would hope that, given the severity of the issue (data corruption ranks very high in my list of 'bad' things when we are talking about a data storage device), this thread is already on their radar, assuming you and I haven't done anything monumentally stupid and caused our own issues. I've already reverted the 1100s and Duos I'd updated and initiated new one-way RSYNC jobs to clear any corrupted data, so I'm not too worried, but I will keep watching for an official communication regarding this issue.

Thanks for all the work you've done in identifying and reproducing this, great work!
Message 16 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

matthew1471 wrote:
T2 - http://www.readynas.com/download/beta/r ... r-4.1.9-T2

I hope they work out the issues with T6, because I really liked the fact there was a newer Samba in it.

The reason I drifted into beta territory in the first place was because 4.1.8 has a speed issue (http://blogx.co.uk/Comments.asp?Entry=855)

Thanks for the link.
Yeah, I'm amazed it hasn't been pulled yet.
Message 17 of 41
ianch99
Aspirant

Re: 4.1.9(T6) Data Corruption

Reading this thread on data corruption in T6 worries me as I am currently on this version. If I understand things correctly, I download a large(?) file directly onto a network share and then read it back to compare against the same file downloaded locally. The network drive copy will be corrupt and the local one ok.

I used Firefox 11 on XP and downloaded iTunesSetup.exe (75 MB) twice: once onto an (SMB) network share hosted by my Duo v1, and once onto a local D: drive. I then ran cksum against both files:

[:C:/] cksum d:/Downloads/iTunesSetup.exe //ianch-nas/backup/Utility/iTunesSetup.exe
573707582 74982768 d:/Downloads/iTunesSetup.exe
573707582 74982768 //ianch-nas/backup/Utility/iTunesSetup.exe

As you can see, no difference ..
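
For anyone without cksum on Windows, the same local-versus-NAS comparison can be sketched in C#. This is only an illustration: it uses SHA-256 rather than the CRC that cksum computes, so the numbers will not match the cksum output above, and the names CompareCopies, Sha256Hex and SameContent are invented for this example. The two file paths are taken from the command line.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class CompareCopies
{
    // Returns the SHA-256 digest of a file as an upper-case hex string.
    public static string Sha256Hex(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
        }
    }

    // True when both files have the same length and the same digest.
    public static bool SameContent(string first, string second)
    {
        return new FileInfo(first).Length == new FileInfo(second).Length
            && Sha256Hex(first) == Sha256Hex(second);
    }

    static int Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: CompareCopies <local file> <NAS file>");
            return 2;
        }
        bool same = SameContent(args[0], args[1]);
        Console.WriteLine(same ? "MATCH" : "MISMATCH");
        return same ? 0 : 1;
    }
}
```

The length check comes first because a truncated download is caught without hashing either file.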

I am unsure if I do or do not have a problem 😞

Have NetGear Support verified this corruption problem on T6 and if so, what is their advice?
Message 18 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

ianch99 wrote:
Reading this thread on data corruption in T6 worries me as I am currently on this version. If I understand things correctly, I download a large(?) file directly onto a network share and then read it back to compare against the same file downloaded locally. The network drive copy will be corrupt and the local one ok.


Any file, large or small. Let me explain. I have the downloads folder on my NAS mapped to drive W: and the media folder mapped to X:. When I use Firefox to download to drive W: (the NAS), the file is corrupt. Take Pidgin for example: it's 9,175KB in size. If I download it on the T6 firmware it will be a different size (say 8,964KB). I downloaded it 4 times and it was never the same size and was never complete.

Trying to run said file results in this error message "Failed to load the specified file! The file signature is invalid. Either the file isn't a KeePass database file at all or it is corrupted." or roundabouts.
Message 19 of 41
ianch99
Aspirant

Re: 4.1.9(T6) Data Corruption

I did exactly this i.e. download a file from the internet directly onto a smb (ReadyNAS Duo v1) share and the file size & cksum was correct .. no corruption

Have NetGear confirmed this as a problem in T6?

Thanks
Message 20 of 41
slayerking
Aspirant

Re: 4.1.9(T6) Data Corruption

ianch99 wrote:
I did exactly this i.e. download a file from the internet directly onto a smb (ReadyNAS Duo v1) share and the file size & cksum was correct .. no corruption

Have NetGear confirmed this as a problem in T6?

Thanks

Well continue what you are doing then as you seem to be doing something different. I can replicate it 100% of the time which is confirmed by matthew1471 and ahpsi found out the hard way too.
Message 21 of 41
ianch99
Aspirant

Re: 4.1.9(T6) Data Corruption

I am more concerned that NetGear is silent regarding this issue. For a NAS product, what is more important than data corruption? Please can someone from NetGear respond on this thread and update us with their position on this? Do we all keep using T6 or not?
Message 22 of 41
matthew1471
Guide

Re: 4.1.9(T6) Data Corruption

ahpsi wrote:
Thanks for all the work you've done in identifying and reproducing this, great work!

No problem, my degree is in Computer Science and I probably have a few more tools at my disposal (Visual Studio 2010 etc) than the average user reporting "the NAS is doing something weird", so it only seemed fair to give a full disclosure and save Netgear a couple of weeks of fruitless searching.

ianch99 wrote:
I am more concerned that NetGear is silent regards this issue.


To be fair, it is something they will need to check themselves before they admit it. Given that they may not have explicitly tested reading file information at the same time as writing, it is something they're going to need to take a bit of time to go through (it may even be that this problem is due to some weird way Windows 7 uses SMB, which would make reproducing it in their lab even more fiddly). T2 had no noticeable side effects apart from a few users saying it was slower (which is all T6 fixes over T2), but I'd rather have slightly slower than "can't guarantee it writes the file properly".

To be honest, I am impressed Netgear has provided as much of their time and attention as they have so far. How many other companies will a) not accuse it of being your PC or networking issues (with most call centres and ISPs I have to pull the "trust me, I know what I'm doing and seeing" card), b) not refuse to support you because it's a beta and c) not refuse to support you because you've replaced the RAM?

I'm definitely impressed with their support, just think they could probably note "Some users are experiencing corruption with T6, as a workaround use T2 if you have any concerns." on the download page.

ianch99 wrote:
I did exactly this i.e. download a file from the internet directly onto a smb (ReadyNAS Duo v1) share and the file size & cksum was correct .. no corruption


You only hit this issue if something reads information about the file (it could just be the current file size) while it is being written. This could be caused by your anti-virus program (reading the file you've just written to see if it's bad), your BitTorrent client (uploading a piece of the file you've already downloaded to someone else) or just refreshing the folder a few times while it is downloading (which Windows 7 even seems to do itself sometimes if you have the folder open as the active window).

ianch99 wrote:
I download a large(?) file

You (or Windows) are more likely to refresh the folder a few times while it's writing if the file is large. In theory it could be just about any file size, but it's more fiddly to read the folder at *exactly* the time it's writing, and far easier to reproduce with your 20GB / 12 hour download while you keep going in and out of the same folder for other files etc. My "proof of concept" program generates a file that I think is around 25MB, and I can corrupt that by refreshing the folder I write to about 5 or 6 times while it's writing, so it can be just about any file really.

I managed to "real life scenario" reproduce this while on the phone to Netgear by downloading iTunes using Firefox and setting the file to be saved to my NAS, having that folder open and pressing F5 a few times in that folder while it was downloading (Windows said it wasn't even sure if it was a file it was able to run once I did this).. this is a show-stopper for me.

I agree you could just stop trying to read the file information while it's downloading (and for most files particularly small ones you seem to get away with it) but the risks are too severe for me to see this as a decent workaround to keep running T6. It's definitely reproducible in my set-up and instantly goes away the moment I put T2 back on. I don't mind if you continue to use T6 or not (and I'm happy enough with the program I wrote being able to reproduce it for any Netgear engineer who cares to give it a spin), but for me my decision is clear :D.
Message 23 of 41
matt_hargett
Tutor

Re: 4.1.9(T6) Data Corruption

This may explain a weird iTunes database corruption issue I get when I have podcasts/apps downloading in iTunes while my phone is syncing at the same time. I was trying to track it down and make it reproducible before reporting it. Thanks for going to the effort! I'm sure they'll fix it ASAP and release an updated beta now that they have a reproducible case.
Message 24 of 41
ianch99
Aspirant

Re: 4.1.9(T6) Data Corruption

Thanks for the great explanation. I missed the reading whilst writing part .. I did your "iTunes" test, downloading the 75 MB file using Firefox (on XP) direct to a mounted NAS network drive. All the while the download was in progress, I hit F5 (once per second) and could see the file size increasing. At the end I checked the size & cksum against the local copy of the same file and they were the same. I could also run the .exe from the network drive .. strange ..

.. I then switched to a Win 7 PC and repeated the same process ... the cksum is different!! 🙂 Seems to be a Win 7 only problem? Possibly related to a Samba upgrade in T6?

I think I will go back to 4.1.8 and wait for 4.1.9 ..
Message 25 of 41