25
submitted 1 week ago* (last edited 1 week ago) by lightrush@lemmy.ca to c/coffee@lemmy.world
32
submitted 1 week ago* (last edited 1 week ago) by lightrush@lemmy.ca to c/selfhosted@lemmy.world

I'm syncoiding from my normal RAIDz2 to a backup mirror made of 2 disks. I looked at zpool iostat and I noticed that one of the disks consistently shows less than half the write IOPS of the other:

                                        capacity     operations     bandwidth 
pool                                  alloc   free   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
storage-volume-backup                 5.03T  11.3T      0    867      0   330M
  mirror-0                            5.03T  11.3T      0    867      0   330M
    wwn-0x5000c500e8736faf                -      -      0    212      0   164M
    wwn-0x5000c500e8737337                -      -      0    654      0   165M

This is also evident in iostat:

     f/s f_await  aqu-sz  %util Device
    0.00    0.00    3.48  46.2% sda
    0.00    0.00    8.10  99.7% sdb

The difference is also evident in the temperatures of the disks. The busier disk is 4 degrees warmer than the other. The disks are identical on paper and bought at the same time.
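
One way to read the zpool iostat sample above is to divide each disk's write bandwidth by its write operations, which gives the average size of a write request. Here is a minimal Python sketch, assuming the size suffixes are binary (M = MiB). The helper names are made up for illustration.

```python
# Average write request size per disk, from the zpool iostat sample above.
# Assumption: zpool iostat suffixes are binary (K = 1024, M = 1024**2, ...).

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Turn a value such as '164M' into bytes; plain numbers pass through."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return float(text[:-1]) * UNITS[text[-1].upper()]
    return float(text)


def avg_request_kib(write_ops, write_bandwidth):
    """Average bytes per write operation, expressed in KiB."""
    return parse_size(write_bandwidth) / write_ops / 1024


# (write operations per second, write bandwidth) copied from the sample.
disks = {
    "wwn-0x5000c500e8736faf": (212, "164M"),
    "wwn-0x5000c500e8737337": (654, "165M"),
}

for name, (ops, bandwidth) in disks.items():
    print(f"{name}: {avg_request_kib(ops, bandwidth):.0f} KiB per write")
```

Both disks move about the same number of bytes per second, so the disk with fewer operations is receiving larger requests on average. These numbers alone don't say why the requests end up sized differently.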

Is this behaviour expected?

[-] lightrush@lemmy.ca 21 points 2 months ago
6
submitted 2 months ago* (last edited 2 months ago) by lightrush@lemmy.ca to c/uoft@lemmy.ca

UofT is closing the [first.last]@mail.utoronto.ca email accounts for alumni. You can get a new email account that will end in @alumni.utoronto.ca but this won't happen automatically. You have to request one. Check your current UTmail+ inbox for an email titled "Notice of Upcoming Email Account Closure"

[-] lightrush@lemmy.ca 24 points 2 months ago

I'm dying here. 🤣

117
Sugar island (lemmy.ca)
submitted 2 months ago by lightrush@lemmy.ca to c/coffee@lemmy.world
5
Splosion (lemmy.ca)
submitted 5 months ago by lightrush@lemmy.ca to c/shittyfoodporn@lemmy.ca
22
submitted 5 months ago* (last edited 5 months ago) by lightrush@lemmy.ca to c/selfhosted@lemmy.world

I built a 5x 16TB RAIDz2, filled it with data, then I discovered the following.

Sequentially reading a single file from the file system gave me around 40MB/s. Reading multiple files in parallel brought the total throughput into the hundreds of megabytes per second, which is where I'd expect it. This is really weird. The 5 disks show 100% utilization during single-file reads. Writes are supremely fast, whether single-threaded or parallel. Reading directly from each disk gives >200MB/s.

Splitting the RAIDz2 into two RAIDz1s, or into one RAIDz1 and a mirror, improved reads to a bit over 100MB/s. Better, but still not where it should be.

I have an existing RAIDz1 made of 4x 8TB disks on the same machine. That one reads with 250-350MB/s. I made an equivalent 4x 16TB RAIDz1 from the new drives and that read with about 100MB/s. Much slower.

All of this was done with ashift=12 and default recordsize. The disks' datasheets say their block size is 4096.

I decided to try RAIDz2 with ashift=13 even though the disks really say they've got 4K physical block size. Lo and behold, the single file reads went to over 150MB/s. 🤔

Following from there, I got full throughput when I increased the recordsize to 1M. This produces full throughput even with ashift=12. My existing 4x 8TB RAIDz1 pools with ashift=12 and recordsize=128K read single files fast.
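
To put rough numbers on the recordsize and ashift combinations above, here is a back-of-the-envelope Python sketch of how much of one record lands on each data disk. It assumes a simplified RAIDz model in which a full, uncompressed record is split evenly across the data disks in whole sectors; padding, compression and metadata are ignored. The function name is hypothetical.

```python
import math


def per_disk_chunk_kib(recordsize, ashift, disks, parity):
    """Approximate largest per-disk data chunk for one full record, in KiB.

    Simplified model: the record is cut into sectors of 2**ashift bytes and
    the sectors are spread evenly over the (disks - parity) data disks.
    """
    sector = 1 << ashift
    data_disks = disks - parity
    sectors = math.ceil(recordsize / sector)
    return math.ceil(sectors / data_disks) * sector // 1024


KIB = 1024
layouts = [
    ("5-disk RAIDz2, ashift=12, recordsize=128K", 128 * KIB, 12, 5, 2),
    ("5-disk RAIDz2, ashift=13, recordsize=128K", 128 * KIB, 13, 5, 2),
    ("5-disk RAIDz2, ashift=12, recordsize=1M", 1024 * KIB, 12, 5, 2),
    ("4-disk RAIDz1, ashift=12, recordsize=128K", 128 * KIB, 12, 4, 1),
]

for label, recordsize, ashift, disks, parity in layouts:
    chunk = per_disk_chunk_kib(recordsize, ashift, disks, parity)
    print(f"{label}: about {chunk} KiB per data disk")
```

Under this model, recordsize=1M gives each disk a chunk several times larger per record than 128K does, which fits the observation that bigger records read faster. The model does not explain why ashift=13 helped, or why the older 4x 8TB pool is fast at 128K, since those per-disk chunks come out nearly the same size.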

Here's a diff of the queue dump of the old and new drives. The left side is a WD 8TB from the existing RAIDz1, the right side is one of the new HC550 16TB

< max_hw_sectors_kb: 1024
---
> max_hw_sectors_kb: 512
20c20
< max_sectors_kb: 1024
---
> max_sectors_kb: 512
25c25
< nr_requests: 2
---
> nr_requests: 60
36c36
< write_cache: write through
---
> write_cache: write back
38c38
< write_zeroes_max_bytes: 0
---
> write_zeroes_max_bytes: 33550336

Could the max_*_sectors_kb being half on the new drives have something to do with it?


Can anyone make any sense of any of this?
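
For anyone who wants to reproduce a comparison like the one above, here is a minimal Python sketch. It assumes the queue dump is the set of attribute files under /sys/block/<device>/queue; the function names are made up for illustration.

```python
import os


def read_queue_settings(queue_dir):
    """Read every readable regular file in a block device's queue directory."""
    settings = {}
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as handle:
                settings[name] = handle.read().strip()
        except OSError:
            # Some attributes cannot be read; skip them.
            continue
    return settings


def diff_settings(left, right):
    """Return {name: (left_value, right_value)} for attributes that differ."""
    names = sorted(set(left) | set(right))
    return {
        name: (left.get(name), right.get(name))
        for name in names
        if left.get(name) != right.get(name)
    }


def compare_devices(left_dev, right_dev, sys_block="/sys/block"):
    """Compare the queue attributes of two devices, e.g. 'sda' and 'sdb'."""
    left = read_queue_settings(os.path.join(sys_block, left_dev, "queue"))
    right = read_queue_settings(os.path.join(sys_block, right_dev, "queue"))
    return diff_settings(left, right)
```

Calling compare_devices("sda", "sdb") with your own device names returns only the attributes that differ. Note that max_hw_sectors_kb is a read-only limit reported by the hardware and driver, while max_sectors_kb can be tuned up to that limit.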

1
submitted 5 months ago by lightrush@lemmy.ca to c/canadapolitics@lemmy.ca

I wasn't aware Steve Paikin and John McGrath had a podcast on Ontario politics. I like it!

[-] lightrush@lemmy.ca 31 points 5 months ago

Thanks for the warning ⚠️🙏

This isn't my first rodeo with ZFS on USB. I've been running USB for a few years now. Recently I ran this particular box through a battery of tests and I'm reasonably confident that with my particular set of hardware it'll be fine. It passed everything I threw at it, once connected to a good port on my machine. But you're generally right and as you can see I discussed that in the testing thread, and I encountered some issues that I managed to solve. If you think I've missed something specific - let me know! 😊

[-] lightrush@lemmy.ca 26 points 5 months ago

That was the cheapest option. 🤭

[-] lightrush@lemmy.ca 16 points 5 months ago* (last edited 5 months ago)
  • 8x 8TB in a set of 2, some shucked WDs, some IronWolfs
  • 5x 16TB in a set of 2, "recertified" WDs from serverpartdeals.com
178
submitted 5 months ago* (last edited 5 months ago) by lightrush@lemmy.ca to c/selfhosted@lemmy.world
[-] lightrush@lemmy.ca 10 points 5 months ago

git merge --no-ff

73
submitted 5 months ago* (last edited 5 months ago) by lightrush@lemmy.ca to c/selfhosted@lemmy.world

Why

I'm running a ZFS pool of 4 external USB drives. It's a mix of WD Elements and enclosed IronWolfs. I'm looking to consolidate it into a single box since I'm likely to add another 4 drives to it in the near future and dealing with 8 external drives could become a bit problematic in a few ways.

ZFS with USB drives

There have been recurring questions about ZFS with USB. Does it work? How well does it work? Is it recommended? The answer is complicated, but it boils down to this: yes, it works, and it can work well so long as everything on your USB path is good. That's difficult to ensure, since it's not generally known what USB-SATA bridge chipset an external USB drive has, whether it has firmware bugs, whether it requires quirks, whether it's stable under sustained load, and so on. That difficulty is then multiplied by the number of drives in the system. In my setup, for example, I swapped multiple enclosure models until I stumbled on a rock-solid one. I also had to install heatsinks on the ASM1351 USB-SATA bridge ICs in the WD Elements drives to stop them from overheating and dropping dead under heavy load. With this in mind, if a multi-bay unit like the OWC Mercury Elite Pro Quad proves to be as reliable as some anecdotes say, it could become a go-to recommendation for USB DAS that eliminates a lot of those variables, leaving just the host side, since it comes with a cable too. And the host side tends to be reliable since it's typically either Intel or AMD. See the Testing section for some tidbits about AMD.

Initial observations of the OWC Mercury Elite Pro Quad

  • Built like a tank, heavy enclosure, feet screwed-in not glued
  • Well designed for airflow. Air enters the front, goes through the disks, PSU, main PCB and exits from the back. Some IronWolfs that averaged 55°C in individual enclosures sit at 43°C in here
  • It's got a Good Quality DC Fan (check pics). So far it's pretty quiet
  • Uses 4x ASM235CM USB-SATA bridge ICs which are found in other well-regarded USB enclosures. It's newer than the ASM1351 which is also reliable when not overheating
  • The USB-SATA bridges are wired to a USB 3.1 Gen 2 hub - VLI-822. No SATA port multipliers
  • The USB hub is heatsinked
  • The ASM235CM ICs have a weird thick thermal pad attached to them but without any metal attached to it. It appears they're serving as heatsinks themselves which might be enough for the ICs to stay within working temps
  • The main PCB is an all-solid-cap affair
  • The PSU shows electrolytic caps which is unsurprising
  • The main PCB is connected to the PSU via standard molex connectors like the ones found in ATX PSUs. Therefore if the built-in PSU dies, it could be replaced with an ATX PSU
  • It appears to rename the drives to its own "Elite Pro Quad A/B/C/D" naming, however hdparm -I /dev/sda seems to return the original drive information. The disks appear with their internal designations in GNOME Disks. The kernel maps them in /dev/disk/by-id/* according to those as before. I moved my drives into it, rebooted and ZFS started the pool as if nothing had happened
  • SMART info is visible in GNOME Disks as well as smartctl -x /dev/sda
  • It comes with both USB-C to USB-C cable and USB-C to USB A
  • Made in Taiwan
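
To check for yourself that the drives keep their persistent names after moving into an enclosure, here is a small Python sketch that lists which kernel device each /dev/disk/by-id entry currently points at. The function name is hypothetical, and it assumes a Linux system where those entries are symlinks.

```python
import os


def map_by_id(by_id_dir="/dev/disk/by-id"):
    """Map each persistent name to the kernel device it currently points at."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            # Resolve the symlink and keep only the device name, e.g. 'sda'.
            mapping[name] = os.path.basename(os.path.realpath(link))
    return mapping
```

Running map_by_id() before and after the move and comparing the keys should show the same wwn-* and ata-* style names, even if the sdX letters behind them have changed.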

Testing

  • No errors in the system logs so far
  • I'm able to pull 350-370MB/s sequential from my 4-disk RAIDz1
  • Loading the 4 disks together with hdparm results in about 400MB/s total bandwidth
  • It's hooked up via USB 3.1 Gen 1 on a B350 motherboard. I don't see a significant difference in the observed speeds whether it's on the chipset-provided USB host, or the CPU-provided one
  • Completed a manual scrub of a 24TB RAIDz1 while also being loaded with an Immich backup, Plex usage, Syncthing rescans and some other services. No errors in the system log. Drives stayed under 44°C. Stability looks promising
  • Will pull a drive and add a new one to resilver once the latest changes get to the off-site backup
  • Pulled a drive from the pool and replaced it with a spare while the pool was live. SATA hot plugging seems to work. Resilvered 5.25TB in about 32 hours while the pool was in use. Found the following vomit in the logs repeating every few minutes:
Apr 01 00:31:08 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 01 00:31:08 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:31:08 host kernel: scsi host11: uas_eh_device_reset_handler success
Apr 01 00:32:42 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 01 00:32:42 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:32:42 host kernel: scsi host11: uas_eh_device_reset_handler success
Apr 01 00:33:54 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 01 00:33:54 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:33:54 host kernel: scsi host11: uas_eh_device_reset_handler success
Apr 01 00:35:07 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 01 00:35:07 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:35:07 host kernel: scsi host11: uas_eh_device_reset_handler success
Apr 01 00:36:38 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 01 00:36:38 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:36:38 host kernel: scsi host11: uas_eh_device_reset_handler success

It appears to be only related to the drive being resilvered. I did not observe resilver errors

  • Resilvering iostat shows numbers in line with the 500MB/s of the USB 3.1 Gen 1 port it's connected to:
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
   314.60       119.9M        95.2k         0.0k     599.4M     476.0k       0.0k sda
   264.00       119.2M        92.0k         0.0k     595.9M     460.0k       0.0k sdb
   411.00       119.9M        96.0k         0.0k     599.7M     480.0k       0.0k sdc
   459.40         0.0k       120.0M         0.0k       0.0k     600.0M       0.0k sdd
  • Running a second resilver on a chipset-provided USB 3.1 port while looking for USB resets like previously seen in the logs. The hypothesis is that there's instability with the CPU-provided USB 3.1 ports as there have been documented problems with those
    • I had the new drive disconnect upon KVM switch, where the KVM is connected to the same chipset-provided USB controller. Moved the KVM to the CPU-provided controller. This is getting fun
    • Got the same resets as the drive began the sequential write phase:
Apr 02 16:13:47 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 02 16:13:47 host kernel: usb 6-2.4: reset SuperSpeed USB device number 9 using xhci_hcd
Apr 02 16:13:47 host kernel: scsi host11: uas_eh_device_reset_handler success
  • 🤦 It appears that I read the manual wrong. All the 3.1 Gen 1 ports on the back IO are CPU-provided. Moving to a chipset-provided port for real and retesting... The resilver entered its sequential write phase and there have been no resets so far. The peak speeds are a tad higher too:
      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
   281.80       130.7M        63.2k         0.0k     653.6M     316.0k       0.0k sda
   273.00       130.1M        56.8k         0.0k     650.7M     284.0k       0.0k sdb
   353.60       130.8M        63.2k         0.0k     654.0M     316.0k       0.0k sdc
   546.00         0.0k       133.2M         0.0k       0.0k     665.8M       0.0k sdd
  • Resilver finished. No resets or errors in the system logs
  • Did a second resilver. Finished without errors again
  • Resilver while connected to the chipset-provided USB port takes around 18 hours for the same disk that took over 30 hours via the CPU-provided port
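
The reset messages quoted in the list above are easy to miss in a long log. Here is a minimal Python sketch that counts them per USB port and measures the gaps between them. It assumes log lines in exactly the format shown above (English month names, zero-padded day, no year); the function names are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# Apr 01 00:31:08 host kernel: usb 6-3.4: reset SuperSpeed USB device ...
RESET_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"usb (?P<port>[\d.-]+): reset SuperSpeed USB device"
)


def find_resets(lines):
    """Return (timestamp, usb_port) for every reset line in the log."""
    resets = []
    for line in lines:
        match = RESET_RE.match(line.strip())
        if match:
            # The log carries no year, so strptime falls back to 1900.
            stamp = datetime.strptime(match["stamp"], "%b %d %H:%M:%S")
            resets.append((stamp, match["port"]))
    return resets


def summarize(resets):
    """Count resets per port and list the gaps between them in seconds."""
    counts = Counter(port for _, port in resets)
    stamps = sorted(stamp for stamp, _ in resets)
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return counts, gaps


# First three reset events from the log quoted above.
sample = """\
Apr 01 00:31:08 host kernel: scsi host11: uas_eh_device_reset_handler start
Apr 01 00:31:08 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:31:08 host kernel: scsi host11: uas_eh_device_reset_handler success
Apr 01 00:32:42 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
Apr 01 00:33:54 host kernel: usb 6-3.4: reset SuperSpeed USB device number 12 using xhci_hcd
"""

counts, gaps = summarize(find_resets(sample.splitlines()))
print(dict(counts), gaps)
```

Feeding it the full kernel log from a resilver gives a quick yes/no on whether a given port produced resets, and how regularly.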

Verdict so far

~~The OWC passed all of the testing so far with flying colors.~~ Even though the resilver finished successfully, there were silent USB resets in the logs with the OWC connected to CPU-provided ports. Multiple ports exhibited the same behavior. When connected to a B350 chipset-provided port, on the other hand, the OWC finished two resilvers with no resets, and faster: 18 hours vs 32 hours. My hypothesis is that these silent resets are likely related to the known USB problems with Ryzen CPUs. The OWC itself passed testing with flying colors when connected to a chipset port.
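
To put the 18 vs 32 hours in perspective, here is a quick Python calculation of the average resilver rate in each case. It assumes both resilvers wrote the same 5.25TB and that TB means decimal terabytes; neither is confirmed by the numbers above.

```python
def average_rate_mb_s(terabytes, hours):
    """Average throughput in MB/s (decimal units) for a transfer."""
    return terabytes * 1e12 / (hours * 3600) / 1e6


# Assumption: the same 5.25TB was resilvered in both runs.
cpu_port = average_rate_mb_s(5.25, 32)
chipset_port = average_rate_mb_s(5.25, 18)

print(f"CPU-provided port:     {cpu_port:.1f} MB/s average")
print(f"Chipset-provided port: {chipset_port:.1f} MB/s average")
print(f"Speedup:               {chipset_port / cpu_port:.2f}x")
```

Both averages come out well below the 120-133MB/s seen in the iostat samples, which suggests a good part of each resilver ran below that peak.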

I'm buying another one for the new disks.

Pics

General

PSU

34

Swapped out the stock Gateron Brown switches for MX2A Ergo Clear. Added o-rings to shorten the key travel to 2.5-3mm. The result feels dramatically different. Individual key presses feel amazing. I'm not sure about typing yet.

74
submitted 7 months ago by lightrush@lemmy.ca to c/canada@lemmy.ca

Just got reminded of this classic!

41
submitted 7 months ago* (last edited 7 months ago) by lightrush@lemmy.ca to c/mechanicalkeyboards@lemmy.ml

It's a very interesting feel, unlike any other mechanical keyboard I've had. I've been typing on DSA for over half a decade now so I'm used to the flat row profile, but the increased surface size still messes with my brain a bit. I like it. There's some similarity to the feel of a laptop keyboard.

More info on SP G20.

[-] lightrush@lemmy.ca 23 points 8 months ago* (last edited 8 months ago)

- Hey ChatGPT, is it normal for my A4 to be burning this much oil? ...

- Yes.

[-] lightrush@lemmy.ca 19 points 8 months ago

Gotta hide the drywall horror show the HVAC people left. 😅

[-] lightrush@lemmy.ca 13 points 1 year ago

I'll be damned. I had no idea Conde Nast owned it. That said I can see a more recent injection of $150M by Tencent too.

[-] lightrush@lemmy.ca 17 points 1 year ago

It's already indexing it.

[-] lightrush@lemmy.ca 35 points 1 year ago* (last edited 1 year ago)

@SteelBeard@lemmy.world , you should add a link to the announcement which explains why Beehaw defederated since this looks to be the top question many are asking.

[-] lightrush@lemmy.ca 12 points 1 year ago* (last edited 1 year ago)

It's already monetised. Just click on the links under Donations in the main sidebar or straight to the OpenCollective page for a glimpse. We pay for it with our money. That's how we know we're not the product.
