Hi HN, author here.
I built BlockFrame because I wanted the durability of distributed object storage (erasure coding, bit-rot protection) but for local, single-node archives.
It is a storage engine that shards files into Reed-Solomon blocks (RS(30,3) or RS(1,3)), so data can be rebuilt as long as no more than three shards in a group are lost or corrupted. It then exposes this engine via a FUSE/WinFSP interface so you can access the data using standard tools (read, seek) without needing custom APIs.
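As a rough illustration of what the two schemes cost on disk, here is a small sketch. It assumes RS(k, m) means k data shards plus m parity shards; `rs_profile` is a hypothetical helper for the arithmetic, not part of BlockFrame.

```python
from fractions import Fraction


def rs_profile(data_shards: int, parity_shards: int) -> dict:
    """Summarise an RS(k, m) layout: k data shards plus m parity shards (assumed notation)."""
    total = data_shards + parity_shards
    return {
        "total_shards": total,
        # Bytes stored on disk per byte of user data.
        "storage_ratio": Fraction(total, data_shards),
        # Reed-Solomon can rebuild the data from any k of the k + m shards,
        # so up to m shards may be missing or known-bad.
        "max_lost_shards": parity_shards,
    }


for k, m in [(30, 3), (1, 3)]:
    p = rs_profile(k, m)
    print(
        f"RS({k},{m}): {p['total_shards']} shards, "
        f"{float(p['storage_ratio']):.2f}x storage, "
        f"survives {p['max_lost_shards']} lost shards"
    )
# RS(30,3): 33 shards, 1.10x storage, survives 3 lost shards
# RS(1,3): 4 shards, 4.00x storage, survives 3 lost shards
```

Both layouts tolerate the same number of lost shards; the difference is that RS(30,3) pays about 10% extra space while RS(1,3) pays 4x.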
Key features:
Engine Layer: Handles the heavy lifting of parity calculation and Merkle tree verification.
Access Layer: Virtual filesystem driver allows zero-copy access and random seeking on multi-gigabyte datasets.
Self-Healing: The engine transparently reconstructs corrupted sectors during reads.
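To make the verification step concrete, here is a minimal sketch of how a Merkle tree over block hashes can flag which blocks need reconstruction. It is an illustration only: the hash function (SHA-256), the tree layout (odd nodes duplicated), and the names `merkle_root` and `find_corrupt_blocks` are assumptions, not BlockFrame's actual implementation.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise up to a single root (an odd node is paired with itself)."""
    if not leaf_hashes:
        raise ValueError("need at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def find_corrupt_blocks(blocks: list[bytes], expected_leaves: list[bytes]) -> list[int]:
    """Return indices of blocks whose hash no longer matches the stored leaf hash."""
    return [
        i
        for i, (block, leaf) in enumerate(zip(blocks, expected_leaves))
        if sha256(block) != leaf
    ]


blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
leaves = [sha256(b) for b in blocks]
root = merkle_root(leaves)

# Simulate bit rot in block 2.
damaged = list(blocks)
damaged[2] = b"block-X"

print(merkle_root([sha256(b) for b in damaged]) == root)  # False
print(find_corrupt_blocks(damaged, leaves))  # [2]
```

In a self-healing read path, the flagged indices would be treated as missing shards and handed to the Reed-Solomon decoder for reconstruction.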
I’m graduating in May and aiming for systems roles in the UK. I'd love feedback on the architecture; any feedback is highly appreciated. Thank you.
RS(1,3) is a slow way to store four copies.
I'm open to suggestions for better erasure coding storage ratio :)
There is nothing wrong with the ratio. The point is that at this ratio, plain replication plus the Merkle tree takes less computation than erasure coding. My suggestion stands: store four copies and use the Merkle tree to determine which one is valid.
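A minimal sketch of that suggestion, assuming each block's expected SHA-256 hash is available from the Merkle tree's leaves; `read_valid_copy` is a hypothetical name, and the hash choice is an assumption.

```python
import hashlib


def read_valid_copy(copies: list[bytes], expected_hash: bytes) -> bytes:
    """Return the first replica whose hash matches the trusted leaf hash."""
    for copy in copies:
        if hashlib.sha256(copy).digest() == expected_hash:
            return copy
    raise IOError("all replicas failed verification")


original = b"archive segment"
expected = hashlib.sha256(original).digest()

# Three of the four replicas have rotted; one intact copy is enough.
copies = [b"archive segmenX", b"", original, b"garbage"]
print(read_valid_copy(copies, expected) == original)  # True
```

This survives up to three bad copies out of four, the same tolerance as RS(1,3), with only hashing and no Galois-field arithmetic on the read or write path.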
You know what, that might be a very good idea if encoding on my really crappy hard drive didn't already take under a second. But I do see where you're coming from, and for larger files that concept does apply. BlockFrame wants to protect your files from circumstances you can't control, and the concept is still to keep each file as its own entity and use RS to keep it planted, so nothing funky happens. Splitting small files tends to hurt random access speeds, which is why they're not split up.