1.) A mirror with an attached Tasmota Power Plug that I can turn on and off via curl to spin up an USB-Backup-HD:
```shell
# power on the USB backup disk via the Tasmota plug
curl "$TASMOTA_HOST/cm?cmnd=POWER+ON"
# preparation and pool imports
# ...
# clone the active pool onto usb pool
zfs send --raw -RI "$BACKUP_FROM_SNAPSHOT" "$BACKUP_UNTIL_SNAPSHOT" | pv | zfs recv -Fdu "$DST_POOL"
```
To prevent partial data loss I use zfs-auto-snapshot, zrepl or sanoid, which I configure to snapshot every 15 minutes and keep daily, weekly, monthly and yearly snapshots as long as possible.
To clean up my space when having too many snapshots, I wrote my own zfs-tool (https://github.com/sandreas/zfs-tool), where you can do something like this:
Yeah, there's a trend of people who don't actually believe in software freedoms releasing a subset of their proprietary software under free software licenses and pretending.
It's really just a bait and switch to try to get free community engagement around a commercial product. It's fundamentally dishonest. I call it "open source cosplay". They're not real open source projects (in the sense that if you write a feature under a free software license that competes with their paid proprietary software, there's zero percent chance it will be upstreamed, even if all of the users of the project want it) so they shouldn't get the credit for being such just because they slapped a free software license on a fraction of their proprietary code.
Invariably they also want contributors to sign a rights-assignment CLA so they can reuse free software contributions (that they didn't pay for) in their for-profit proprietary project. Never sign a CLA that assigns rights.
Some open source projects flat-out illegally "relicensed" open source contributions as a proprietary license when they wanted to start selling software (CapRover). Some just start removing features or refuse to integrate features (Minio, Mattermost, etc). Many (such as Minio) use nonfree fake open source licenses like the AGPL[1].
It's all a scam by people who don't care about software freedoms. If you believe in software freedoms, you never release any software that isn't free software.
I loved minio until they silently removed 99% of the admin UI to push users towards the paid offering. It just disappeared one day after fetching the new minio images. The only evidence of the change online was discussions by confused users in the GitHub issues
I have also been considering this for some time. Been comparing MinIO, Garage, and Ceph. MinIO may not be wise given their recent moves, as another commenter noted. Garage seems ok but their git doesn’t show much activity these days so I wonder if it too will be abandoned. Which leaves us with Ceph. It may have a steeper learning curve, but it also offers the most flexibility, as one can do object as well as block and file. Gonna set up a single node with 9 OSDs soon and give it a go, but always looking for input if anyone would like to provide some.
If I can reassure you about Garage, it's not at all abandoned. We have active work going on to make a GUI for cluster administration, and we have applied for a new round of funding for more low-level work on performance, which should keep us going for the next year or so. Expect some more activity in the near future.
I manage several Garage clusters and will keep maintaining the software to keep these clusters running. But concerning the "low level of activity in the git repo": we originally built Garage for some specific needs, and it fits these needs quite well in its current form. So I'd argue that "low activity" doesn't mean it's not reliable, in fact it's the contrary: low activity means that it works well for us and there isn't a need to change anything.
Of course implementing new features is another deal, I personally have only limited time to spend on implementing features that I don't need myself. But we would always welcome outside contributions of new features from people with specific needs.
I appreciate the response! Thanks for the update. I will continue keeping an eye on the project then and possibly give it a try. I have read the docs and was considering setting it up across two sites. The implementation seemed to address this pain point of latency with distributed storage solutions.
I've used Ceph in a home lab setting for 9 years or so now. Since cephadm it has gotten even easier to manage, even though it really was never that hard. A few pointers. First, no SMR drives: they have such bad performance that they can periodically drop out of the cluster. Second, no consumer SSDs/NVMe devices. You need power loss protection (PLP) on your drives. Ceph writes directly to the drive and ignores the cache, so without PLP you may literally get slower performance than rust.
You also want fast networking; I just use 10Gbps. I have 5 nodes, each with 6 rust and 1 NVMe drive. I colocate my MON and MDS daemons with my OSDs; each node has 64GB of RAM and I use around 40GB.
Usage is RBD for a three-node OpenStack cluster, and CephFS. I have about 424TiB raw between rust and NVMe.
I have an ancient Qnap NAS (2015) which is on borrowed time and I’m trying to figure out what to replace it with. Keep going back and forth between rolling my own with a Jonsbo case vs. a prebuilt like the new Ubiquiti boxes. This is an attractive third option of a modest compute box (raspy, NUC, etc.) paired with a JBOD over USB. Can you still use something like TrueNAS with a setup like that?
If you don't mind using m.2 drives, check out Beelink ME Mini. The m.2 drives are going to be limited by the PCIe 3.0 x1 bus [1], but you get a very neat and small appliance-like box that can handle home storage workloads really well. Just keep in mind that you'll need an additional OS drive: TrueNAS will probably wear that eMMC out in a matter of a year or two.
Neat. Depending on your use case it might make sense.
Still I wonder what they use for backup? For many use cases downtime is acceptable, but data loss is generally not. Did I miss it in the post?
OP here. I currently have some things synced to a cloud S3. The long term plan would be to replicate the setup at another location to take advantage of garage region/nodes, but I need to wait for the money for that.
ZFS is RAM hungry, plus doesn't like USB connections (like the article implied). So, I've been eyeing btrfs as a way to set up my NAS drives. Would I miss something in that setup?
ZFS is not RAM hungry. The only official memory requirements that exist specify 768MB system RAM, and that's the minimum requirement for the whole OS (Solaris 9). Just like any other file system, ZFS will use memory if available and release it when other parts of the system need it.
Also ZFS is perfectly happy with USB connections. In fact it's the best type of FS to have if your storage is unreliable due to its self healing capabilities. Not that modern USB is unreliable nowadays and there are plenty of DAS solutions that rely on 3.x USB.
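On the memory point: for anyone who still wants a hard ceiling on the cache, OpenZFS on Linux exposes the ARC limit as the `zfs_arc_max` module parameter, in bytes. A minimal sketch, assuming the default sysfs location of that parameter; the helper names and the 4 GiB figure are made up for illustration and are not a recommendation.

```shell
#!/bin/sh
# Sketch: cap the ZFS ARC on Linux through the zfs_arc_max module parameter.
# Helper names are hypothetical; the size used below is an arbitrary example.

# Convert whole GiB to bytes, the unit zfs_arc_max expects.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Write the limit to the parameter file. Needs root on a real system, and the
# value only lasts until the module is reloaded or the machine reboots.
set_arc_max_gib() {
    param="${2:-/sys/module/zfs/parameters/zfs_arc_max}"
    gib_to_bytes "$1" > "$param"
}

gib_to_bytes 4   # prints 4294967296
```

Running `set_arc_max_gib 4` as root would apply that cap to the running system.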
Yeah, this was an effort to get around cloud costs for large amounts of 'low value' data that I have but use in my other home servers for processing. I still sync some smaller result sets to an S3 in the cloud for redundancy as well as for CDN uses.
Why are you calling it S3? That is a proprietary Amazon cloud technology. Why not call it what it is, e.g. ZFS, file store, or object store? Let's not dilute terms.
That's a good point, it is S3 compatible object storage, not just S3. My experience with AWS S3 has impacted the way I use object storage and since this project is synced to another S3 compatible object storage using the S3 protocol, in my head I just call it all S3.
Okay, weird to call it S3, if it is just object storage somewhere else. It's like saying "EKS" if you mean Kubernetes, or talking about "self hosting EC2" by installing qemu.
AWS S3 was the first S3-compatible API provider; nowadays most cloud providers and a bunch of self-hosted software support S3(-compatible) APIs. Call it Object Store (which is a bit unspecific) or call it S3-Compatible.
EKS and EC2 on the other hand are a set of tools and services, operated by AWS for you - with some APIs surrounding them that are not replicated by any other party (at least for production use).
In hindsight you are correct about the title not being accurate.
"It's trivial to have storage"
I'd argue this wasn't trivial for me. Buying $1k of drives+JBOD, acquiring the second hand laptop, and getting ZFS working with USB took a couple of tries, and finally moving my projects to using the dual local network S3 object storage vs cloud S3 took a fair amount of time.
"however you want and shove the S3 API on top."
You're right here too. I did find this part pleasantly trivial, which I didn't know before, and hence the article about how pleased I was that this part ended up being trivial and has remained trivial once the other parts were set up.
If it's just the mainboard and no screen, OP could put it in a dedicated case like the CoolerMaster one:
https://www.coolermaster.com/en-global/products/framework/
Those are pretty cool. I meant to highlight more, that the laptop has done super well. I can't even tell it's on as I hear no fan / no heat. I guess laptops are pretty good for this as they are great at sipping power when there is a low load.
Back in 2012 or so, I reused an old netbook (an Asus Eee PC) with an Atom CPU & 1GB of RAM, installed Ubuntu Server, and used it as a home server. It handled the printer, DNS-VPN proxying for streaming, and a few other things admirably for years. (And ironically it was resilient to Spectre, because that early Atom was an in-order design that predates Intel adding speculative execution to the Atom line.)
Eventually, the thing that kicked the bucket was actually the keyboard (and later the fan started making "my car won't start" noises occasionally). Even the horribly-slow HDD (that handled Ubuntu Server surprisingly well) hadn't died yet.
One question: are you keeping it plugged in continuously, or do you disconnect it sometimes? Keeping it at 100% will slowly degrade the battery, and it might then fail during a power loss.
In my country power loss happens occasionally, so keeping the battery in good health is important. I'm planning to set up two unused laptops and a mobile as servers (for different purposes), and power management and battery health have been an issue.
The Framework BIOS (at least recent versions, but not the original, so maybe OP hasn't updated) has a setting so that, if the power is plugged in continuously, it won't fully charge the battery, even if your normal limit is 100%.
Here is a knowledgebase article that goes into the details:
https://knowledgebase.frame.work/en_us/framework-laptop-13-b...
You can manually specify a fixed maximum, which was available from 3.04 (which was the BIOS I had back in late 2022 when I got mine).
One interesting note in the new functionality is that charging will let the battery drain by some percent before recharging.
You can also control this from userspace. On Linux, use something like:
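A minimal sketch of one way this can look, assuming the kernel exposes the standard power-supply attribute `charge_control_end_threshold` for the battery. Whether it does depends on the kernel version and driver, and the battery name `BAT1` and the function name are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: cap battery charging through the kernel's power-supply sysfs files.
# Assumptions: the battery shows up as BAT1 and its driver exposes
# charge_control_end_threshold; neither is guaranteed on every machine.

set_charge_limit() {
    limit="$1"                                  # percent, 1-100
    dir="${2:-/sys/class/power_supply/BAT1}"    # assumed default battery path
    attr="$dir/charge_control_end_threshold"

    # Reject anything that is not a plain integer in range.
    case "$limit" in
        ''|*[!0-9]*) echo "limit must be an integer" >&2; return 2 ;;
    esac
    if [ "$limit" -lt 1 ] || [ "$limit" -gt 100 ]; then
        echo "limit must be between 1 and 100" >&2
        return 2
    fi
    # Bail out if this kernel/driver does not offer the attribute.
    if [ ! -e "$attr" ]; then
        echo "no charge threshold attribute at $attr" >&2
        return 1
    fi
    printf '%s\n' "$limit" > "$attr"            # needs root on a real system
}
```

Calling `set_charge_limit 80` as root would ask the firmware to stop charging at 80%.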
A battery may turn into a spicy pillow. Please do consider a UPS.
A UPS is already mandatory for my desktop, but all of them go bad in a year or two, unfortunately. Which is why I am considering laptops as servers.
I have the original mainboard (i5, 11th gen) the laptop came with in that coolermaster case. Generally really quiet, but fans have kicked in a few times and it was pretty noticeable. Since it's VESA mountable, I might just move it into the rack at some point and let the rack fans take care of everything.
Here's a link to the case on the Framework marketplace:
https://frame.work/ca/en/products/cooler-master-mainboard-ca...
I put my original mainboard in one of these when I upgraded. It's fantastic. I had it VESA-mounted to the back of a monitor for a while which made a great desktop PC. Now I use it as an HTPC.
Or a 10” mini rack!! https://deskpi.com/products/deskpi-rackmate-t0-plus-rackmoun...
Not sure that framework adds anything here, I host around the same on a RPi5 and a few disks. Instead of ZFS I use lvm2, it's pretty good on the RPi5. ZFS eats up too much RAM for what I use it for there.
I’ve not heard of garage before but it looks quite interesting. I use s3 a lot for work but for homelab backups I’ve always just used borg on borgbase. Now I’m wondering whether I could use garage to pair a local node and AWS glacier for cheap redundancy of a large media library (I’m assuming that ~all of the reading is automatically done from the local node). TFA doesn’t really talk much about the actual experience of using garage - would love to hear more opinions from those who use it for self-hosting.
Edit: Realised you can’t use glacier since storage has to be mounted to the ec2 compute running the garage binary as a filesystem. So doesn’t really make sense as media library backup over just scheduling a periodic borg / restic backup to glacier directly.
Another alternative is ZeroFS[1], just store your stuff directly to S3.
[1] https://github.com/Barre/ZeroFS
That looks very interesting - will look into it, thanks.
You could still pair to AWS S3 and have an aggressive lifecycle policy to move the data to glacier after a short time window. I had this setup at a previous job for data. After X days it would switch from standard tier to glacier. In your case, X could equal 1.
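A rough sketch of such a lifecycle rule, with X = 1. The rule ID and the bucket name in the usage comment are placeholders, not anything from the setup described above.

```shell
#!/bin/sh
# Sketch: emit an S3 lifecycle configuration that transitions every object
# to Glacier after a given number of days. The rule ID is a made-up name.

lifecycle_json() {
    days="${1:-1}"
    cat <<EOF
{
  "Rules": [
    {
      "ID": "move-to-glacier",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": $days, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
EOF
}

# Usage (bucket name is a placeholder):
#   lifecycle_json 1 > lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration \
#       --bucket my-backup-bucket \
#       --lifecycle-configuration file://lifecycle.json
```

The empty prefix makes the rule apply to the whole bucket; a narrower prefix would limit it to one folder.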
I haven't needed to interact with Garage itself specifically. I've been using Boto3 / awscli / s3cmd / rclone for everything S3 API related and it's worked great. Garage was a few commands to set up, turn on, and get API keys set up, and then it was left to run on its own for the past 4 months.
So in that sense, I've loved it.
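For readers who have not pointed a stock S3 client at a self-hosted endpoint before, it can look roughly like this. The endpoint URL, region, bucket and file names are placeholders to be replaced with values from your own Garage configuration; the helper only prints the command so it can be inspected first.

```shell
#!/bin/sh
# Sketch: build aws-cli invocations against a self-hosted S3 endpoint.
# Assumptions: endpoint and region below are placeholders; credentials come
# from the usual AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables.

S3_ENDPOINT="${S3_ENDPOINT:-http://127.0.0.1:3900}"
S3_REGION="${S3_REGION:-garage}"

# Print the aws-cli command line for a given "s3" subcommand.
s3_cmd() {
    printf 'aws --endpoint-url %s --region %s s3 %s\n' \
        "$S3_ENDPOINT" "$S3_REGION" "$*"
}

s3_cmd ls
s3_cmd cp ./results.tar.gz s3://my-bucket/results.tar.gz
```

Everything else (sync, rm, presign) works the same way once the endpoint is overridden.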
I'd like more elaboration on the technical side. Not literally how to do the same and what commands to use, but more along the lines of how the ZFS pools are configured, or whether Garage is opinionated and configures it all by itself. Are there mirrors in there? Or is it just individual pools that sync from some disks to others?
I have 2 USB disks and want to make a cheapo NAS but I always doubt between making a ZFS mirror, making 2 independent pools and using one to back up the other, or just going the alternate route and using SnapRAID, so I can mix in more older HDDs for maximum usage of the hardware I already own.
> doubt between making a ZFS mirror, making 2 independent pools
You will gain protection against bit-rot and self-healing (via scrubs) with a mirror. Also faster reads.
> mix more older HDDs
You can do this with ZFS too! As long as you have two HDDs of the same size (or similar size, so as not to lose too much to unused space), you can also add them as a mirrored vdev to your existing pool (or make a new one for backups as you wrote). Only the two disks in a mirror need to be of similar size, not all disks in a pool.
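As a sketch of what adding such a pair looks like: the pool and device names are placeholders, and the script only prints the command unless `DRY_RUN=0` is set, since adding a vdev to a pool is not something to try casually.

```shell
#!/bin/sh
# Sketch: grow an existing pool by adding a second mirrored pair of disks.
# Pool and device names are placeholders. Dry-run by default.

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

add_mirror_pair() {
    pool="$1"; disk_a="$2"; disk_b="$3"
    run zpool add "$pool" mirror "$disk_a" "$disk_b"
}

add_mirror_pair tank /dev/disk/by-id/old-disk-1 /dev/disk/by-id/old-disk-2
```

ZFS then stripes new writes across both mirrors, so the pool's capacity is the sum of the pairs.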
My understanding is that Garage is not opinionated and could easily have worked without ZFS. I installed ZFS in Ubuntu, and then later installed Garage.
As for the ZFS setup, I kept it simple and did RAID5/raidz1. I'm no expert in that, and have been starting to think about it again as the pool approaches 33% full.
I saw this comment in another thread here that sounded interesting as well by magicalhippo: "I've been using ZFS for quite a while, and I had a realization some time ago that for a lot of data, I could tolerate a few hours worth of loss. So instead of a mirror, I've set up two separate one-disk pools, with automatic snapshots of the primary pool every few hours, which are then zfs send/recv to the other pool."
This caught my attention as it matches my use case well. My original idea was that RAID5 would be good in case an HD fails, and that I would replicate the setup at another location, but the overall costs (~$1k USD) are enough that I haven't done that yet.
If you know where to look/are a little lucky, you can get an adequate RAID5 going for like $500-800 depending on the storage you need. I grabbed a QNAP 4 bay (no SSD caching) and 4x refurbished enterprise HDDs (14TB each) for just under $700 all-in last November, if memory serves. Pretty reasonable for a 42TB RAID5 IMO.
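The 42TB figure follows from single parity: one disk's worth of space goes to parity, so usable capacity is (n - 1) × disk size. A tiny sketch of that arithmetic, which ignores filesystem overhead and the TB vs TiB difference (the function name is made up):

```shell
#!/bin/sh
# Usable capacity of a single-parity array (RAID5 / raidz1) of equal disks:
# usable = (number of disks - 1) * size of one disk.

single_parity_usable() {
    disks="$1"; size_tb="$2"
    echo $(( (disks - 1) * size_tb ))
}

single_parity_usable 4 14   # prints 42
```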
I'd rather go with an old Dell T30 and 2x10TB Seagate Exos in ZFS RAID1 mode (Mirror). This thing would make me nervous every day, even with a daily backup in place... While the Dell T30 would also make me nervous, you could at least plug the disks into any other device and are not wiring up everything with some easy to pull out cables ;)
However, garage sounds nice :-) Thanks for posting.
I've been using ZFS for quite a while, and I had a realization some time ago that for a lot of data, I could tolerate a few hours worth of loss.
So instead of a mirror, I've set up two separate one-disk pools, with automatic snapshots of the primary pool every few hours, which are then zfs send/recv to the other pool.
This gives me a lot more flexibility in terms of the disks involved, one could be SSD other spinning rust for example, at the cost of some read speed and potential uptime.
Depending on your needs, you could even have the other disk external, and only connect it every few days.
I also have another mirrored RAID pool for more precious data. However almost all articles on ZFS focus on the RAID aspect, while few talk about the less hardware demanding setup described above.
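A minimal sketch of that two-pool arrangement, assuming pools named `fast` and `slow` and a made-up snapshot naming scheme. It prints the commands by default; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: snapshot a primary pool and replicate the increment to a second,
# independent single-disk pool. Pool names and the snapshot naming scheme
# are assumptions. Dry-run by default.

SRC_POOL="${SRC_POOL:-fast}"
DST_POOL="${DST_POOL:-slow}"

replicate() {
    prev="$1"   # last snapshot that already exists on both pools
    new="$2"    # snapshot to create now
    snap_cmd="zfs snapshot -r $SRC_POOL@$new"
    send_cmd="zfs send -R -I $SRC_POOL@$prev $SRC_POOL@$new | zfs recv -Fdu $DST_POOL"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $snap_cmd"
        echo "+ $send_cmd"
    else
        sh -c "$snap_cmd" && sh -c "$send_cmd"
    fi
}

# Placeholder names; a real job would look up the previous snapshot itself.
replicate "auto-prev" "auto-$(date +%Y-%m-%d-%H%M)"
```

Run from cron every few hours, this bounds the data at risk to whatever changed since the last successful send.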
Interesting idea... thanks for sharing.
I have two setups.
1.) A mirror with an attached Tasmota Power Plug that I can turn on and off via curl to spin up an USB-Backup-HD:
2.) A backup server that pulls backups via zsync (https://gitlab.bashclub.org/bashclub/zsync/), to ensure ransomware has no chance.
To prevent partial data loss I use zfs-auto-snapshot, zrepl or sanoid, which I configure to snapshot every 15 minutes and keep daily, weekly, monthly and yearly snapshots as long as possible.
To clean up my space when having too many snapshots, I wrote my own zfs-tool (https://github.com/sandreas/zfs-tool), where you can do something like this:
That's a really cool idea and matches my use case well. I just copy pasted it to another person in this thread who was asking about the ZFS setup.
Your use case perfectly matches mine in that I wouldn't mind much about a few hours of data loss.
I guess the one issue is that it would require more disks, which at current prices is not cheap. I was surprised how expensive they were when I bought them 6 months ago, and was even more surprised when I looked recently and the same drives cost even more now.
I opted to use a two disk mirror, and offline the slow disk. Hourly cronjob to online the slow disk, wait, and then offline it again.
Gives me the benefit of automatic fixes in the event of bit rot in any blocks more than an hour old too.
That sounds cool; is it possible to just query the ZFS system to know when it has finished synchronizing the slow disk, before bringing it offline again? Do you think that stopping and spinning the disk again, 24 times a day, is not going to cause much wear to the motors?
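On the first question: OpenZFS 2.0 and later has `zpool wait`, which blocks until a given activity such as a resilver has finished. A sketch of the hourly job built on it, not tested against a real pool; the pool and device names are placeholders, and it only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the hourly job: bring the slow mirror member online, block until
# the resilver is done, then take it offline again.
# Assumes OpenZFS 2.0+ for "zpool wait". Dry-run by default.

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sync_slow_disk() {
    pool="$1"; disk="$2"
    run zpool online "$pool" "$disk" &&
    run zpool wait -t resilver "$pool" &&
    run zpool offline "$pool" "$disk"
}

sync_slow_disk tank /dev/disk/by-id/slow-disk
```

Chaining with `&&` means the disk stays online if any step fails, which is the safer state to be left in.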
That is another way, though annoying if you've set up automatic error reporting.
Just wanted to share a quiet successful self hosting.
Does this JBOD consist of SSD? HDDs in that amount can be rather noisy.
Yeah they are HDs and are surprisingly noisy.
It's weird to me that "owning a computer that runs stuff" is now "self-hosting", just feels like an odd phrasing. Like there's an assumption that all computers belong to someone else now, so we have to specify that we're using our own.
Think services
You can own a computer and not run any services at all. Most people do.
Deciding to run your own services, like email, means a lot of work that most people aren’t interested in or capable of doing.
It’s the difference between using your computer to consume things or produce things.
It’s not clear from the blog post if the S3 is accessible from outside their home. I agree with the parent that purely local services aren’t what typically counts as “self-hosting”.
We call it self hosting because it is typically hosted by someone else, get it?
Kind of like how installing got renamed to side-loading.
Let's not kid ourselves that maintaining 10TB with resiliency handling and other controls built in is something that is trivial. It is only trivial due to the offerings that Cloud computing has made easy.
Self-hosting implies those features without the cloud element and not just buying a computer.
> Let's not kid ourselves that maintaining 10TB with resiliency handling and other controls built in is something that is trivial.
It is though. People in tech need to stop pretending everything they are doing is super complicated.
10tb fits on one disk though - it may not be trivial but it's not overly complicated setting up a raid-1. Off-site redundancy and backup of course does make it more complicated however.
And all of those things are more steps than "buying a computer".
Reminds me of the "Dropbox can be built in a weekend"
You can buy a 10TB+ external drive which uses RAID1.
You can also buy a computer with this — not a laptop, and I don't know about budget desktops, but on Dell's site (for example) it's just a drop-down selection box.
Moot point. It really depends on your expectations.
Self-hosting 10TB in an enterprise context is trivial.
Self hosting 10TB at home is easy.
The thing is: once you learn enough ZFS, whether you’re hosting 10 or 200TB it doesn’t change much.
The real challenge is justifying to yourself spending for all those disks. But if it’s functional to yourself spending hobby…
I love Garage. It just works. I have Garage running on a few older Odroid HC2's, primarily for k8s Velero backup, and it's just set and forget.
Nice to see Garage mentioned. I was deciding between S3-compatible self-hosted alternatives and ended up choosing SeaweedFS. It seems to require less manual configuration compared to Garage
Very cool! I replaced my mainboard on my framework and am trying to convert it to a backup for my nas.
Could you talk a little more about your zfs setup? I literally just want it to be a place to send snapshots but I’m worried about the usb connection speed and the accidentally unplugging it and losing data
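For the "just a place to send snapshots" case, here is a minimal sketch of the usual `zfs send` / `zfs receive` pipeline, built as a command string in Python. The dataset and snapshot names (`tank/data`, `backup/data`, `monday`, `tuesday`) are hypothetical and not taken from the article's setup.

```python
import shlex
from typing import Optional


def build_snapshot_pipeline(source: str, snapshot: str, target: str,
                            previous: Optional[str] = None) -> str:
    """Return a shell pipeline that replicates one snapshot to a backup pool."""
    send = ["zfs", "send"]
    if previous is not None:
        # -i sends only the blocks changed since the previous snapshot.
        send += ["-i", f"{source}@{previous}"]
    send.append(f"{source}@{snapshot}")
    # -u leaves the received filesystem unmounted on the backup side.
    receive = ["zfs", "receive", "-u", target]
    return f"{shlex.join(send)} | {shlex.join(receive)}"


# Hypothetical names, for illustration only.
print(build_snapshot_pipeline("tank/data", "tuesday", "backup/data", previous="monday"))
```

The sketch only prints the command, so it can be reviewed before anything touches a pool. After the first full send, incremental sends keep the amount of data crossing the USB link small.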
Previous discussion of Garage:
https://news.ycombinator.com/item?id=41013004
I do something similar with Minio although I still have most of my stuff on normal file system.
What do you use self-hosted S3 for? I feel like all the use cases I can think of would be better served by a network attached file system.
A fair few things want blob object storage like S3. NFS does not scale to ridiculous levels horizontally or vertically. S3 does things like de-duplication and other funky tricks.
So if you want to use an app that needs S3 then you need to deploy S3 and not NFS.
I run a minio cluster (S3) for Veeam backups at work. I also run multiple NFS for Veeam and VMware datastores.
Tools for the job mate!
I use a self-hosted s3 compatible object storage for cheap storage of Android apks and logs. I do push various processed items to Digital Ocean Spaces, so having the ability to use rclone / the same functions for both is useful.
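A rough sketch of why the same tooling works for both: with an S3-compatible service, only the endpoint changes, while bucket and key addressing stay the same. The endpoints, port, and bucket name below are made up for illustration, and the sketch assumes path-style addressing.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class S3Target:
    endpoint: str  # base URL of any S3-compatible service
    bucket: str

    def object_url(self, key: str) -> str:
        """Path-style URL: <endpoint>/<bucket>/<key>."""
        return f"{self.endpoint.rstrip('/')}/{self.bucket}/{quote(key, safe='/')}"


# Both endpoints and the bucket name are hypothetical.
local = S3Target("http://garage.lan:3900", "apk-archive")
cloud = S3Target("https://nyc3.digitaloceanspaces.com", "apk-archive")

for target in (local, cloud):
    print(target.object_url("logs/2025/app build.log"))
```

Real requests also need to be signed, which clients like rclone handle for you. The sketch only shows that the addressing is identical across providers.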
The article didn't load for me (HN hug of death) so I was confused too as are many others in this comment thread (and some have been downvoted for simply asking). Just use the correct terminology; if you're in tech you know there's no room or excuse for ambiguity.
Getting into S3 myself and really curious about what Garage has to offer vs the more mature alternatives like Minio. From what I gather, it kinda works better with small (a few kilobytes) files or something?
Minio recently started removing features from the community version. https://news.ycombinator.com/item?id=44136108
How awful. It seems to be a pattern nowadays?
Some former colleagues still using gitlab ce tell me they also removed features from their self-hosted version, particularly from their runners.
Yeah, there's a trend of people who don't actually believe in software freedoms releasing a subset of their proprietary software under free software licenses and pretending.
It's really just a bait and switch to try to get free community engagement around a commercial product. It's fundamentally dishonest. I call it "open source cosplay". They're not real open source projects (in the sense that if you write a feature under a free software license that competes with their paid proprietary software, there's zero percent chance it will be upstreamed, even if all of the users of the project want it) so they shouldn't get the credit for being such just because they slapped a free software license on a fraction of their proprietary code.
Invariably they also want contributors to sign a rights-assignment CLA so they can reuse free software contributions (that they didn't pay for) in their for-profit proprietary project. Never sign a CLA that assigns rights.
Some open source projects flat-out illegally "relicensed" open source contributions as a proprietary license when they wanted to start selling software (CapRover). Some just start removing features or refuse to integrate features (Minio, Mattermost, etc). Many (such as Minio) use nonfree fake open source licenses like the AGPL[1].
It's all a scam by people who don't care about software freedoms. If you believe in software freedoms, you never release any software that isn't free software.
[1]: https://sneak.berlin/20250720/the-agpl-is-nonfree/
I loved minio until they silently removed 99% of the admin UI to push users towards the paid offering. It just disappeared one day after fetching the new minio images. The only evidence of the change online was discussions by confused users in the GitHub issues
I have also been considering this for some time. Been comparing MinIO, Garage, and Ceph. MinIO may not be wise given their recent moves, as another commenter noted. Garage seems ok but their git doesn’t show much activity these days so I wonder if it too will be abandoned. Which leaves us with Ceph. May have a higher learning curve but also offers the most flexibility as one can do object as well as block and file. Gonna set up a single node with 9 OSD’s soon and give it a go but always looking for input if anyone would like to provide some.
If I can reassure you about Garage, it's not at all abandoned. We have active work going on to make a GUI for cluster administration, and we have applied for a new round of funding for more low-level work on performance, which should keep us going for the next year or so. Expect some more activity in the near future.
I manage several Garage clusters and will keep maintaining the software to keep these clusters running. But concerning the "low level of activity in the git repo": we originally built Garage for some specific needs, and it fits these needs quite well in its current form. So I'd argue that "low activity" doesn't mean it's not reliable, in fact it's the contrary: low activity means that it works well for us and there isn't a need to change anything.
Of course implementing new features is another deal, I personally have only limited time to spend on implementing features that I don't need myself. But we would always welcome outside contributions of new features from people with specific needs.
I appreciate the response! Thanks for the update. I will continue keeping an eye on the project then and possibly give it a try. I have read the docs and was considering setting it up across two sites. The implementation seemed to address this pain point with distributed storage solutions and latency.
I've used Ceph in a home lab setting for 9 years or so now. Since cephadm, it has gotten even easier to manage, even though it was never really that hard. A few pointers. First, no SMR drives: they have such bad performance that they can periodically drop out of the cluster. Second, no consumer SSDs/NVMe devices. You need power loss protection (PLP) on your drives. Ceph writes directly to the drive and ignores the cache, so without PLP you may literally get slower performance than spinning rust.
You also want fast networking, I just use 10Gbps. My nodes each are 6 rust and 1 NVMe drive each, 5 nodes. I colocate my MONs and MDS daemons with my OSDs, each node has 64GB of RAM and I use around 40GB.
Usage is RBD for a three node OpenStack cluster, and CephFS. I have about 424TiB between rust and NVMe raw.
The point about smr drives cannot be stressed enough.
SMR drives are an absolutely shit-tier choice in terms of drives.
I have an ancient Qnap NAS (2015) which is on borrowed time and I’m trying to figure out what to replace it with. Keep going back and forth between rolling my own with a Jonsbo case vs. a prebuilt like the new Ubiquiti boxes. This is an attractive third option of a modest compute box (raspy, NUC, etc.) paired with a JBOD over USB. Can you still use something like TrueNAS with a setup like that?
If you don't mind using M.2 drives, check out the Beelink ME Mini [0]. The M.2 drives are going to be limited by the PCIe 3.0 x1 bus [1], but you get a very neat and small appliance-like box that can handle home storage workloads really well. Just keep in mind that you'll need an additional OS drive: TrueNAS will probably wear that eMMC out in a matter of a year or two.
[0] https://www.bee-link.com/products/beelink-me-mini-n150
[1] https://youtu.be/TkFfTekB3eM?t=1034
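To put the PCIe 3.0 x1 limit in numbers, here is a back-of-the-envelope sketch. It assumes the standard PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding and ignores protocol overhead, so real drives will land somewhat lower.

```python
def pcie_throughput_mb_s(gigatransfers_per_s: float, payload_bits: int,
                         encoded_bits: int, lanes: int = 1) -> float:
    """Usable bandwidth in MB/s after line encoding, before protocol overhead."""
    usable_bits_per_s = gigatransfers_per_s * 1e9 * payload_bits / encoded_bits * lanes
    return usable_bits_per_s / 8 / 1e6


pcie3_x1 = pcie_throughput_mb_s(8.0, 128, 130)
gigabit_ethernet = 1e9 / 8 / 1e6  # raw line rate of a 1 Gbit/s link

print(f"PCIe 3.0 x1: ~{pcie3_x1:.0f} MB/s per drive, 1GbE: {gigabit_ethernet:.0f} MB/s")
```

That works out to a bit under 1 GB/s per drive, which is still several times what a gigabit network link can carry, so for a home NAS the network is likely to be the bottleneck first.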
Local storage should be like a home appliance, not something we build even though we can.
When things inevitably need attention it’s not about diy.
i'd be stressed out while watering those plants.
Plants look very portable
The laptop is easy to repair, at least.
Neat. Depending on your use case it might make sense. Still I wonder what they use for backup? For many use cases downtime is acceptable, but data loss is generally not. Did I miss it in the post?
OP here. I currently have some things synced to a cloud S3. The long-term plan would be to replicate the setup at another location to take advantage of Garage regions/nodes, but I need to wait for the money for that.
Thanks for the lead on Garage S3. Everyone's always recommending minIO and Ceph which are just not fun to work with.
What enclosure houses the JBOD?
Don't know about that one but can recommend Terramaster DAS, they don't cheap out on the controller. I have a d4-320 connected to my NUC.
ZFS is RAM hungry, plus doesn't like USB connections (like the article implied). So, I've been eyeing btrfs as a way to set up my NAS drives. Would I miss something in that setup?
ZFS is not RAM hungry. The only official memory requirements that exist specify 768MB of system RAM, and that's the minimum requirement for the whole OS (Solaris 9). Just like any other file system, ZFS will use memory if available and release it when other parts of the system need it.
https://docs.oracle.com/cd/E19253-01/819-5461/gitgn/index.ht...
Also ZFS is perfectly happy with USB connections. In fact it's the best type of FS to have if your storage is unreliable due to its self healing capabilities. Not that modern USB is unreliable nowadays and there are plenty of DAS solutions that rely on 3.x USB.
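If you want to see how much memory the ZFS cache (ARC) is actually holding, here is a small sketch that parses the ARC statistics file. It assumes OpenZFS on Linux, where the counters are exposed at `/proc/spl/kstat/zfs/arcstats` as `name type data` rows after two header lines. The sample text below is made up so the example runs anywhere.

```python
from typing import Dict

ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"  # assumption: OpenZFS on Linux


def parse_arcstats(text: str) -> Dict[str, int]:
    """Parse 'name type data' rows into a dict of counters."""
    stats = {}
    # The first two lines are a kstat header and the column names.
    for line in text.splitlines()[2:]:
        fields = line.split()
        if len(fields) == 3:
            name, _type, value = fields
            stats[name] = int(value)
    return stats


# Made-up sample; on a real system, read ARCSTATS_PATH instead.
sample = """\
13 1 0x01 2 544 1000 2000
name                            type data
size                            4    2147483648
c_max                           4    8589934592
"""

stats = parse_arcstats(sample)
print(f"ARC uses {stats['size'] / 2**30:.1f} GiB of a {stats['c_max'] / 2**30:.1f} GiB cap")
```

Here `size` is the current cache size and `c_max` the ceiling it may grow to. Watching `size` shrink under memory pressure is a quick way to check the "releases it when needed" behaviour on your own box.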
With the metadata only on the internal drive, isn't this a SPOF?
Given that it's JBOD over USB I don't think this is aimed at redundancy
Yeah, this was an effort to get around cloud costs for large amounts of 'low value' data that I have but use in my other home servers for processing. I still sync some smaller result sets to an S3 in the cloud for redundancy as well as for CDN uses.
I thought zfs is doing the RAID.
It could be. Author didn't specify. zfs isn't inherently redundant or RAID so it may or may not have redundancy
Why are you calling it S3? That is a proprietary Amazon cloud technology. Why not call it what it is, e.g. ZFS, file store, or object store? Let's not dilute terms.
> Garage implements the Amazon S3 API and thus is already compatible with many applications.
https://garagehq.deuxfleurs.fr/
Yes, it's S3 API compatible, but it's not S3. The originally submitted article title misleads by claiming it's S3. There is no valid excuse.
That's a good point, it is S3 compatible object storage, not just S3. My experience with AWS S3 has impacted the way I use object storage and since this project is synced to another S3 compatible object storage using the S3 protocol, in my head I just call it all S3.
Amazing, I will try Garage.
What brand of HDD did you use?
Read up on backblaze hard drive reports. Great source of info
I went with IronWolf, likely due to price, though interestingly they are 25% more expensive than when I bought them six months ago.
10TB, you could just mirror 2 drives with that, seen people serving 10PB at home by this point I'm sorry to say
I really don't get it. Do they host it on Amazon S3 or do they self-host it on a NAS?
They built an object storage system exposing an S3-compatible API, by using https://garagehq.deuxfleurs.fr/
Okay, weird to call it S3, if it is just object storage somewhere else. It's like saying "EKS" if you mean Kubernetes, or talking about "self hosting EC2" by installing qemu.
> weird to call it S3
I feel that is a bit of an unfair assessment.
AWS S3 was the first S3-compatible API provider; nowadays most cloud providers and a bunch of self-hosted software support S3(-compatible) APIs. Call it Object Store (which is a bit unspecific) or call it S3-compatible.
EKS and EC2 on the other hand are a set of tools and services, operated by AWS for you - with some APIs surrounding them that are not replicated by any other party (at least for production use).
You can even do S3-on-ZFS
Which solutions do you find stable?
S3 is both a product and basically an API standard.
Garage talks the same S3 API.
GarageFS S3 compatibility https://garagehq.deuxfleurs.fr/documentation/reference-manua... vs
SeaweedFS vs. JuiceFS https://juicefs.com/docs/community/comparison/juicefs_vs_sea...
It’s self-hosted, and self-hosted NASes can run the S3 storage protocol locally as well.
Yeah, that's pretty standard for object storage to be S3-compatible. I think azure blob is the only one that doesn't support it.
>>About 5 months ago I made the decision to start self hosting my own S3.
Is it eleven nines of durability? No. You didn't build S3. You built a cheapo NAS.
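For scale, a quick sketch of what "eleven nines" means in practice, assuming it is read as a 99.999999999% chance per object per year of not being lost. The object count is an arbitrary example.

```python
def expected_annual_losses(num_objects: int, nines: int) -> float:
    """Expected number of objects lost per year at the given durability."""
    annual_loss_probability = 10.0 ** -nines
    return num_objects * annual_loss_probability


# Example: ten million stored objects at eleven nines of durability.
losses_per_year = expected_annual_losses(10_000_000, 11)
print(f"about one lost object every {1 / losses_per_year:,.0f} years")
```

A single JBOD at home is nowhere near that figure, which is why the backup question elsewhere in the thread matters more than the API on top.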
And won't be charged for ingress, egress or IOPS etc, it's better than bad, it's good. Happy times.
I think it's pretty obvious he's talking about the protocol not the amazon service...
title is 'Self hosting 10TB IN S3.'
Yes, it's obvious, but it's a terrible title. I don't really get the point. It's trivial to have storage however you want and shove the S3 API on top.
In hindsight you are correct about the title not being accurate.
"It's trivial to have storage" I'd argue this wasn't trivial for me. Buying $1k of drives+JBOD, acquiring the second hand laptop, getting ZFS working with USB took a couple tries and finally moving my projects to using the dual local network S3 object storage vs cloud S3 took a fair amount of time.
"however you want and shove the S3 API on top." You're right here too. I did find this part pleasantly trivial, which I didn't know before, and hence the article about how pleased I was that this part ended up being trivial and has remained trivial once the other parts were set up.