Forum Discussion
berillio
Oct 24, 2021 (Aspirant)
vertical expansion - of the wrong nas?
Hello Forum, I have an issue, but describing it became War and Peace (600 pages of it), so this is the short of it: I have three NASs (say A, B, C) (4x 4TB, 4x 4TB, 4x 8TB). A & B run OS-6.10.3, C ...
StephenB
Oct 24, 2021 (Guru - Experienced User)
berillio wrote:
A 4TB disk failed on NAS A. I can replace it with a new 8TB I have – and start expanding, which is good because I need more storage space.
but
That is NOT the NAS which needs expanding. It is NAS B which needs expanding; it has one aging disk but also three newish ones (April 2020).
What can/should I do?
I tend to favor minimal steps - so I'd normally just replace the failed disk in A with the 8 TB drive. Since it isn't going to expand (the volume only grows once a second larger disk is added), you can replace the 8 TB drive with a 4 TB later on if you want to.
However, your plan A would also work. Generally I test my drives in a PC before inserting them into the NAS - first running the full non-destructive generic test, and following that up with the full erase/write-zeros test. I'd recommend doing that on the 4 TB drive you want to re-use. Erasing it will also avoid any confusion in NAS A when you add it.
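For what it's worth, here is a minimal sketch of a comparable two-pass test as it could be scripted on a Linux PC - an illustration only, assuming smartmontools and badblocks are installed, run as root, and using /dev/sdX as a placeholder for the drive under test (the second pass wipes the drive):

#!/usr/bin/env python3
# Minimal sketch of the two-pass drive test described above, assuming a
# Linux PC with smartmontools and badblocks installed, run as root.
# DEVICE is a hypothetical placeholder -- point it at the drive under test.
# WARNING: the second pass overwrites every sector with zeros.
import subprocess

DEVICE = "/dev/sdX"  # placeholder -- substitute the real device node

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Pass 1: non-destructive read-write surface test (existing data is preserved).
run(["badblocks", "-nsv", DEVICE])

# Pass 2: destructive write test with a zero pattern (erase / write zeros).
run(["badblocks", "-wsv", "-t", "0", DEVICE])

# Afterwards, re-read the SMART attributes; growth in reallocated, pending
# or uncorrectable sector counts means the drive should not be reused.
run(["smartctl", "-A", DEVICE])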
It would be good to make sure you have a backup of the unique files of each NAS before you manipulate its drives. So if you want to expand anyway, getting a second 8 TB drive for that purpose makes sense.
I'm not sure what your long term plan here is. I suggest designating one NAS as the "primary" NAS, and putting all the content you have on that NAS. Then use the other 2 as backups. You can over time expand them all to have the same size (giving you two full backups), but in the beginning you can back up some shares on each (giving you one full backup between the two NAS).
If you go with that suggestion, then you first should figure out what capacity you want in each of the NAS.
berillio wrote: I can expand NAS A, then move the ENTIRE contents from NAS A to NAS B and vice versa. Lengthy (~8TB each way), incredibly messy, and (if I understood right, as they are 90% a mystery to me) all the snapshots will get jumbled up and possibly useless.
You might also want to re-think how much retention you really need in the snapshots. For me, snapshots are a way to recover from user error. If retention is too short, then you might not realize that you need to recover something until it's too late. If the retention is too long, then you end up using a lot of disk space, and getting a lot of fragmentation in the main shares. I tend to use 3-month retention on the snapshots (though some shares have shorter retention).
Just to clarify this. If you
- use NAS backup jobs to copy everything on A to B
- then use NAS backup jobs to copy everything on B to A
A and B will have identical content in the main shares, but A and B will retain their original snapshots. The original B snapshots won't be on A (and vice versa).
FWIW, I do use NAS->NAS backup. My RN526x is the primary NAS. I do share-by-share backup to the other NAS (running daily), with daily snapshots enabled on each NAS.
The snapshots on the backup NAS are similar to the primary, but not identical. If a folder is renamed on the main NAS, then the rsync backup ends up doing a copy/delete. So the snapshots will reflect that (using more storage on the backup). But it is close enough for my purpose. If I rename a really large folder, I can always go into the backup NAS and rename the folder there as well.
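To see concretely why a rename on the primary turns into a copy plus delete on the backup, here is a minimal local sketch (assuming rsync is installed; it uses throwaway temp directories rather than real NAS shares, and --dry-run so nothing is actually transferred; the folder and file names are made up for illustration):

#!/usr/bin/env python3
# Minimal local demo of rsync treating a renamed folder as copy + delete.
# Assumes rsync is installed; uses throwaway temp directories, not real
# NAS shares, and --dry-run so nothing is actually transferred.
import pathlib
import shutil
import subprocess
import tempfile

work = pathlib.Path(tempfile.mkdtemp())
src, dst = work / "src", work / "dst"

# Build a small source tree and an identical "backup" copy of it.
(src / "photos_2020").mkdir(parents=True)
(src / "photos_2020" / "img.dat").write_bytes(b"x" * 1024)
shutil.copytree(src, dst)

# Rename the folder on the "primary" side only.
(src / "photos_2020").rename(src / "photos-2020-archive")

# rsync has no notion that the two folders hold the same data: the itemized
# output shows the renamed folder being sent in full and the old one being
# deleted on the destination.
subprocess.run(
    ["rsync", "-a", "--delete", "--itemize-changes", "--dry-run",
     str(src) + "/", str(dst) + "/"],
    check=True,
)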
Sandshark
Oct 24, 2021 (Sensei)
Plan 1 has the risk that the older drive will fail during the NAS B re-sync. Just how old it is and whether there are any SMART errors are factors in the likelihood of failure. How long can you afford to have NAS A non-redundant? (Base that on how much "churn" there is on it, how well it is backed up, and whether or not you have any SMART errors on the remaining drives.) If you can wait for the time it takes to expand NAS B, then replace the older drive in NAS B with an 8TB. Then, when that sync finishes, replace a newer one and move that newer 4TB into NAS A. Before moving it, you may want to zero it on a PC, especially if the volume name on NAS B is the same as on NAS A.
- berillio, Oct 24, 2021 (Aspirant)
thanks Stephen B and Sandshark.
My fault, it was in the "war & peace" but not in the summary.
NAS A is OFFLINE.
That is the NAS with the failing drive - actually it has not failed yet; I started receiving emails as the error count climbed (32, 40, 48 errors), so I saved the logs and switched it off before it did.
NAS A is rarely accessed or written to; it can wait offline for a week or two.
Actually, if I power it on to copy the data BEFORE doing any volume rebuilding, would it be better to do it with the failing disk removed (it is already out), or to power it back on and hot-remove it?
I once had a disk failure initiate a daisy-chain failure of a healthy disk, so I would prefer to REMOVE the disk before the NAS fails it.
- StephenB, Oct 24, 2021 (Guru - Experienced User)
berillio wrote:
Actually, if I power it on to copy the data BEFORE doing any volume rebuilding, would it be better to do it with the failing disk removed (it is already out), or to power it back on and hot-remove it?
I once had a disk failure initiate a daisy-chain failure of a healthy disk, so I would prefer to REMOVE the disk before the NAS fails it.
The puzzle here is whether you have multiple disks on the edge or not.
If only a single disk is at risk, then removing it is fine. But if multiple disks are struggling, then I think it's best to leave them all in place. The theory there is that the bad sectors aren't likely to overlap, so all the data is still recoverable.
Personally I've never seen a case where a failing disk provoked a failure on a second drive.
- berillio, Oct 25, 2021 (Aspirant)
Stephen, Sandshark,
Thanks again for coming back
Some NAS/disk history.
The array in NAS A was built in stages in a RN104. Started with a 4TB Seagate, added 2 WD Reds, one of which failed and was replaced, and a 4th Red.
In March (?) 2020 the RN104 failed; in April I bought two RN214s (NAS A & B) and migrated the array from the RN104 to NAS A. The failed Red subsequently passed WD diagnostics and went into Slot 1 of NAS B with 3 more WD Reds, all powered up in May 2020. Faultless so far, and it is the MOST used; PCs are writing to it at all times (i.e., right now).
NAS A: slot 1 is the only drive which has a non-zero sector count [64]
slot  disk                DoM        fitted
1     ST4000DM000-1F2168  20-May-13  Sept 2013 (?)
2     WD40EFRX-68N32N0    15-May-18  Oct 2018 (the replacement)
3     WD40EFRX-68WT0N0    02-Feb-15  April 2015
4     WD40EFRX-68N32N0    07-Mar-18  Oct 2018
From disk_info.log (any health data not listed here was ZERO); I don't know if there is other data of interest in the other logs.
NAS A,
Disk 1 (the “failing” disk)
Date of Manufacture (DoM) 20-May-2013
Fitted September (?) 2013
Current Pending Sector Count: 64
Uncorrectable Sector Count: 64
Temperature: 47
Start/Stop Count: 542
Power-On Hours: 57893
Power Cycle Count: 389
Load Cycle Count: 543
Disk 2
DoM 15-May-2018
Fitted October 2018
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 48
Start/Stop Count: 280
Power-On Hours: 22363
Power Cycle Count: 180
Load Cycle Count: 24
Disk 3
DoM 02-Feb-15
Fitted April 2015
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 49
Start/Stop Count: 461
Power-On Hours: 48656
Power Cycle Count: 340
Load Cycle Count: 7847
Disk 4
DoM 07-Mar-18
Fitted Oct-2018
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 43
Start/Stop Count: 298
Power-On Hours: 24677
Power Cycle Count: 193
Load Cycle Count: 307
NAS B - fully populated & started in May 2020
slot  disk              DoM          notes
1     WD40EFRX-68WT0N0  02-Feb-2015  failed, then passed WD Data Lifeguard diagnostics
2     WD40EFRX-68N32N0  26-Jan-2020
3     WD40EFRX-68N32N0  26-Jan-2020
4     WD40EFRX-68N32N0  26-Jan-2020
NAS B – all populated and started in May 2020
Disk 1
DoM 02-Feb-2015
Fitted April 2015, failed May 2020 (passed WD Data Lifeguard diagnostics)
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 46
Start/Stop Count: 1631
Power-On Hours: 39333
Power Cycle Count: 275
Load Cycle Count: 8708
Disk 2
DoM 26 Jan 2020
Fitted 5 May 2020
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 47
Start/Stop Count: 1296
Power-On Hours: 12847
Power Cycle Count: 66
Load Cycle Count: 1291
Disk 3
DoM 26 Jan 2020
Fitted 5 May 2020
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 47
Start/Stop Count: 1207
Power-On Hours: 12830
Power Cycle Count: 61
Load Cycle Count: 1201
Disk 4
DoM 26 Jan 2020
Fitted 5 May 2020
Current Pending Sector Count: 0
Uncorrectable Sector Count: 0
Temperature: 43
Start/Stop Count: 1140
Power-On Hours: 12796
Power Cycle Count: 60
Load Cycle Count: 1134
Sandshark – I appreciate and share your concern about Disk 1 in NAS B, the next oldest disk.
Unfortunately, I am not at all familiar with reading SMART data.
This is smart_history.log for NAS A
time model serial realloc_sect realloc_evnt spin_retry_cnt ioedc cmd_timeouts pending_sect uncorrectable_err ata_errors
------------------- -------------------- -------------------- ------------ ------------ -------------- ---------- ------------ ------------ ----------------- ----------
2013-11-07 20:06:55 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 0 0 0
2015-04-10 10:06:04 WDC WD40EFRX-68WT0N0 WD-WCC4E2KNHZ2N 0 0 0 -1 -1 0 0 0
2015-04-10 22:49:49 WDC WD40EFRX-68WT0N0 WD-WCC4E5AU82XY 0 0 0 -1 -1 0 0 0
2018-10-03 13:53:27 WDC WD40EFRX-68N32N0 WD-WCC7K6HY4DPN 0 0 0 -1 -1 0 0 0
2019-01-30 23:14:03 WDC WD40EFRX-68N32N0 WD-WCC7K7HD8AJD 0 0 0 -1 -1 0 0 0
2019-04-19 12:12:05 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 8 8 0
2021-10-20 10:58:53 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 16 16 0
2021-10-20 11:02:54 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 24 24 0
2021-10-20 11:04:54 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 32 32 0
2021-10-20 11:06:57 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 40 40 0
2021-10-20 11:12:59 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 48 48 0
2021-10-20 11:15:00 ST4000DM000-1F2168 W300G5AN 0 0 0 0 0 64 64 0
And for NAS B:
time model serial realloc_sect realloc_evnt spin_retry_cnt ioedc cmd_timeouts pending_sect uncorrectable_err ata_errors
------------------- -------------------- -------------------- ------------ ------------ -------------- ---------- ------------ ------------ ----------------- ----------
2020-05-02 21:01:07 WDC WD40EFRX-68WT0N0 WD-WCC4E2KNHZ2N 0 0 0 -1 -1 0 0 0
2020-05-05 02:36:57 WDC WD40EFRX-68N32N0 WD-WCC7K0SF9U38 0 0 0 -1 -1 0 0 0
2020-05-05 20:58:12 WDC WD40EFRX-68N32N0 WD-WCC7K4JKN2L4 0 0 0 -1 -1 0 0 0
2020-05-07 04:06:28 WDC WD40EFRX-68N32N0 WD-WCC7K6YX6PYY 0 0 0 -1 -1 0 0 0
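Incidentally, those smart_history.log excerpts are plain whitespace-separated columns, so a few lines of Python can pick out any drive whose pending or uncorrectable counts are non-zero. This is a hypothetical helper, not a NETGEAR tool; point LOGFILE at the file extracted from the NAS log zip, and note the column layout is taken from the excerpts above:

#!/usr/bin/env python3
# Hypothetical helper (not a NETGEAR tool): scan smart_history.log and
# flag drives with non-zero pending/uncorrectable sector counts.
# Column layout is taken from the log excerpts above.

LOGFILE = "smart_history.log"  # assumed path to the extracted log

latest = {}  # serial -> (timestamp, model, pending, uncorrectable)

with open(LOGFILE) as fh:
    for line in fh:
        fields = line.split()
        # Skip the header row and the dashed separator line.
        if len(fields) < 11 or not fields[0][:1].isdigit():
            continue
        timestamp = " ".join(fields[:2])   # date + time
        counters = fields[-8:]             # the eight numeric columns
        serial = fields[-9]
        model = " ".join(fields[2:-9])     # model may contain a space ("WDC WD40...")
        pending = int(counters[5])         # pending_sect
        uncorrectable = int(counters[6])   # uncorrectable_err
        latest[serial] = (timestamp, model, pending, uncorrectable)

for serial, (timestamp, model, pending, uncorrectable) in latest.items():
    flag = "  <-- watch/replace this drive" if pending or uncorrectable else ""
    print(f"{timestamp}  {model:<22} {serial:<18} "
          f"pending={pending} uncorrectable={uncorrectable}{flag}")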
Stephen, re "disks on the edge":
Well, AFAIK, everything is fine (but so it was last Thursday when, by COMPLETE FLUKE, I happened to have the admin page open, saw the alerts, and shut down NAS A).
Simply judging by the hours of operation (I don’t know how to judge the other parameters), the next oldest disks are NAS B slot 1 and NAS A Slot 3.
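For a rough sense of scale, the power-on hours from the logs above convert to continuous years like this (a quick sketch; 24 x 365 = 8,760 hours per year):

# Rough conversion of the power-on hours reported above into continuous years.
HOURS_PER_YEAR = 24 * 365

drives = {
    "NAS A slot 1 (ST4000DM000, failing)": 57893,
    "NAS A slot 3 (WD40EFRX)": 48656,
    "NAS B slot 1 (WD40EFRX, ex-RN104)": 39333,
}

for name, hours in drives.items():
    print(f"{name}: {hours} h  ~= {hours / HOURS_PER_YEAR:.1f} years powered on")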
I am currently transferring ~1.25TB from NAS B to NAS C; I am planning to fit an 8TB drive to a PC to copy the NAS B data onto it.
“Personally I've never seen a case where a failing disk provoked a failure on a second drive.”
I believe that this is what happened on my NV+ v2, because only one disk was logging errors, and because now that the disks are in a PC with a Linux dual boot, two recovery suites can see the entire data set, and one of them can also read/show the data in its free version. I never REALLY tried to recover the data for lack of expertise (I know nothing about Linux), time, and priority. But maybe I am wrong: there was a REAL failure on another disk, and although the data "appears" to be recoverable, the recovery would fail. Now any recovery attempt will have to wait even longer, as I have to take those disks out to make space for the 8TB in that PC.