Reduce bandwidth costs with dm-cache: fast local SSD caching for network storage

(devcenter.upsun.com)

30 points by tlar 9 September 2025 | 9 comments

Comments

rbranson 2 hours ago

> For e-commerce workloads, the performance benefit of write-back mode isn’t worth the data integrity risk. Our customers depend on transactional consistency, and write-through mode ensures every write operation is safely committed to our replicated Ceph storage before the application considers it complete.

Unless the writer is always blindly overwriting entire files at once (no read-then-write), consistency requires consistent reads AND writes. Even then, potential ordering issues creep in. It would be really interesting to hear how they deal with it.
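A minimal sketch of the read-then-write concern, assuming (the article does not confirm this) that two hosts each keep their own local cache in front of one shared backend. The `Backend` and `WriteThroughCache` classes and the `stock` key are hypothetical and do not model dm-cache or Ceph internals. Every write is committed to the backend before returning, yet a stale local read still loses an update:

```python
class Backend:
    """Shared, authoritative store (stand-in for replicated network storage)."""
    def __init__(self):
        self.blocks = {}


class WriteThroughCache:
    """Hypothetical per-host cache: writes hit the backend first, reads may be served locally."""
    def __init__(self, backend):
        self.backend = backend
        self.local = {}

    def read(self, key):
        # Only a miss goes to the backend; a hit returns whatever was cached earlier.
        if key not in self.local:
            self.local[key] = self.backend.blocks.get(key, 0)
        return self.local[key]

    def write(self, key, value):
        self.backend.blocks[key] = value  # committed to the backend before returning
        self.local[key] = value


backend = Backend()
host_a = WriteThroughCache(backend)
host_b = WriteThroughCache(backend)

host_a.write("stock", 10)
host_b.read("stock")                             # host B now caches 10
host_a.write("stock", host_a.read("stock") - 1)  # host A sells one, backend holds 9
host_b.write("stock", host_b.read("stock") - 1)  # host B reads its stale 10, writes 9

final_stock = backend.blocks["stock"]
print(final_stock)  # two units were sold, but the backend records only one decrement
```

Under these assumptions the writes are durable and the result is still wrong, which is the gap between "writes are safely committed" and "reads and writes are consistent".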

0xbadcafebee 2 hours ago

This is good timing; I was just looking at a use case where we need more IOPS and the only immediate solutions involve allocating way more high-performance disks or network storage. The problem with a cache is having a large dataset with random access, so repeated cache hits might not be frequent. But I had a theory that you could still make an impact on performance and lower your storage performance requirements. I may try this out, but it is block-level, so it's a bit intrusive.

Another option I haven't tried is tmpfs with an overlay. Initial access is RAM, falls back to underlying slower storage. Since I'm mostly doing reads, should be fine, writes can go to the slower disk mount. No block storage changes needed.
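The hit-rate question raised in this comment can be explored with a rough simulation. The sketch below is an illustration only: it uses a plain LRU cache as a stand-in (not dm-cache's actual policy), and the block count, cache size, and the 1/rank skew are all made-up assumptions rather than measurements from any real workload.

```python
import random
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Replay a sequence of block numbers through an LRU cache and return the hit fraction."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)


rng = random.Random(42)     # fixed seed so runs are repeatable
NUM_BLOCKS = 10_000         # hypothetical dataset size, in blocks
CACHE_BLOCKS = 1_000        # hypothetical cache size: 10% of the dataset
NUM_ACCESSES = 100_000

# Uniform random access: every block is equally likely.
uniform = [rng.randrange(NUM_BLOCKS) for _ in range(NUM_ACCESSES)]

# Skewed access: block popularity falls off as 1/rank (an assumed, Zipf-like shape).
weights = [1.0 / (rank + 1) for rank in range(NUM_BLOCKS)]
skewed = rng.choices(range(NUM_BLOCKS), weights=weights, k=NUM_ACCESSES)

uniform_rate = lru_hit_rate(uniform, CACHE_BLOCKS)
skewed_rate = lru_hit_rate(skewed, CACHE_BLOCKS)

print(f"uniform hit rate: {uniform_rate:.2f}")
print(f"skewed hit rate:  {skewed_rate:.2f}")
```

With uniform access the hit rate stays close to the cache-to-dataset ratio, so the backing store still serves roughly `(1 - hit_rate)` of all reads. Any skew in the access pattern raises the hit rate, which is where a cache could lower the IOPS required from the slower tier.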

mrkurt 1 hour ago

dm-cache writeback mode is both amazing and terrifying. It reorders writes, so not only do you lose data if the cache fails, you probably just corrupted the entire backing disk.
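A toy illustration of why reordered write-back is worse than plain data loss. It assumes an application that writes a data block and then a commit record pointing at it; the block names and the flush orders are hypothetical and say nothing about the order dm-cache actually uses.

```python
def flush_until_failure(dirty_writes, flush_order, writes_before_failure):
    """Apply cached writes to the backing store in flush_order, stopping when the cache dies."""
    backing = {}
    for index in flush_order[:writes_before_failure]:
        block, value = dirty_writes[index]
        backing[block] = value
    return backing


def is_consistent(backing):
    """A commit record is only valid if the data it refers to also reached the disk."""
    return "commit_record" not in backing or "data_block" in backing


# Application order: data first, then the commit record that refers to it.
app_writes = [
    ("data_block", "order #1001"),
    ("commit_record", "order #1001 committed"),
]

# The cache device fails after only one of the two dirty blocks has been flushed.
in_order = flush_until_failure(app_writes, flush_order=[0, 1], writes_before_failure=1)
reordered = flush_until_failure(app_writes, flush_order=[1, 0], writes_before_failure=1)

print(is_consistent(in_order))   # data without a commit: the transaction is simply lost
print(is_consistent(reordered))  # commit without data: the backing store is now inconsistent
```

When flushes preserve application order, a cache failure loses recent work but leaves a state the application could have observed. Once the order changes, the backing disk can end up in a state that never existed from the application's point of view.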

kayson 2 hours ago

I was looking into SSD caching recently and decided to go with Open-CAS instead, which should be more performant (didn't test it personally): https://github.com/Open-CAS/open-cas-linux/issues/1221

It's maintained by Intel and Huawei and the devs were very responsive.

AtlasBarfed 3 hours ago

"When deploying infrastructure across multiple AWS availability zones (AZs), bandwidth costs can become a significant operational expense"

An expense in the age of 100 Gbit networking that exists entirely because AWS can get away with charging the suckers, um, customers for it.