RogerBW's Blog

Building a File Server 3: software

03 October 2017

In this part of the series on building a file server, I'll talk about software.

You could just install FreeNAS. I'd rather have a server that I can patch and fix like all my other servers. So I'll ignore that option and do it the fun way.

You have several choices to make here. To combine discs into mirrors, stripes, and RAID volumes, you can use md (the Linux multi-disc driver) or ZFS (available on Linux but supposedly more robust on FreeBSD/OpenIndiana). If you use md, you should probably put a volume manager on top of it (so that you can extend the array later without major pain), and you'll need a filesystem on top of that (for example the current Linux standard, ext4; I think btrfs is probably still too flaky for production use). If you use ZFS, it acts as volume manager and filesystem too. ZFS uses different terminology: its RAID6 is "raidz2", its RAID1 is "mirror", and it has no special term for RAID0. At this point I use ZFS for convenience (it also incorporates incremental remote backups), though ext4 on md has served me well in the past.

One caution: ZFS starts slowing down when it gets more than 80% full, and at 90% is downright sluggish. Plan capacity accordingly.
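A rough sketch of a fill-level check, assuming the output format of zpool list -H -o name,capacity (one tab-separated "name NN%" line per pool). The function name and the default threshold of 80 are my own choices for illustration.

```shell
# Reads "name<TAB>NN%" lines on stdin and prints a warning for each pool
# at or above the threshold (first argument, default 80).
warn_full_pools() {
    threshold=${1:-80}
    while IFS="$(printf '\t')" read -r name cap; do
        pct=${cap%\%}                       # strip the trailing % sign
        case $pct in
            ''|*[!0-9]*) continue ;;        # skip lines with no numeric value
        esac
        if [ "$pct" -ge "$threshold" ]; then
            printf 'WARNING: pool %s is %s%% full\n' "$name" "$pct"
        fi
    done
}

# Typical use, for example from cron:
#   zpool list -H -o name,capacity | warn_full_pools 80
```

Run regularly, something like this gives some notice before the pool reaches the sluggish range.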

My OS drives use md RAID-1, because boot support for ZFS was not reliable when I built these machines. I understand it's better now.

Create the pool. Yes, you must use ashift=12 (that is, 2^12 = 4096-byte sectors) so that the pool's sector size matches what a modern disc wants.

zpool create -o ashift=12 storage raidz2 /dev/disk/by-path/…
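The ashift value is the base-2 logarithm of the sector size in bytes. A minimal sketch of that arithmetic follows; the function name is invented, and the sysfs path in the comment is where Linux usually exposes the size a disc reports (sdX is a placeholder). Some discs report 512 bytes whatever they do internally, which is one reason to set the value by hand.

```shell
# Prints the ashift (log2 of the sector size) for a power-of-two sector size.
ashift_for() {
    size=$1
    shift_val=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        shift_val=$((shift_val + 1))
    done
    printf '%s\n' "$shift_val"
}

# To see what a disc claims (sdX is a placeholder):
#   ashift_for "$(cat /sys/block/sdX/queue/physical_block_size)"
```

A 512-byte sector gives 9 and a 4096-byte sector gives 12.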

Create filesystems within the pool. This compression mode is so light on CPU usage that it provides a speed increase (fewer bytes have to be read off the disc).

zfs create -o compression=lzjb storage/foo

For filesystems that may have significant duplicated data (e.g. backups of multiple machines), you can add -o dedup=sha256 to save some space; note that this wants lots of RAM.

Remember that RAID is not a backup system. I'll repeat that, because it's important: RAID is not a backup system. If you delete or corrupt a terribly important file, that change will be faithfully mirrored across all your redundant discs before you have time to say "oh shit". ZFS offers snapshots as a way of getting round this, but really you want a full backup too. Which, in practice, probably means building another machine to do the same job, though maybe with less redundancy and it doesn't need to be running full-time. My current setup has a full mirror with the same hardware setup and capacity as the primary.
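ZFS snapshots plus incremental send are one way to feed that second machine. A rough sketch, assuming the backup machine is reachable over ssh as backuphost (a made-up name) and has a pool also called storage; the snapshot labels are likewise invented. It only prints the commands, so they can be checked before being run, and it assumes an initial full zfs send of the earlier snapshot has already been received on the other side.

```shell
# Prints (does not run) the commands for one incremental backup step.
# Arguments: dataset, previous snapshot label, new snapshot label, ssh host.
backup_plan() {
    dataset=$1 prev=$2 new=$3 host=$4
    # Take the new snapshot on the primary.
    printf 'zfs snapshot %s@%s\n' "$dataset" "$new"
    # Send only the changes between the two snapshots to the backup machine.
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive %s\n' \
        "$dataset" "$prev" "$dataset" "$new" "$host" "$dataset"
}

# Example with invented labels:
#   backup_plan storage/foo 2017-10-02 2017-10-03 backuphost
```

Because only the difference between two snapshots crosses the network, the backup machine can stay switched off between runs.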

Use the tools of your choice to map the internal device names of your discs to their actual serial numbers. I tend to use

hdparm -I /dev/disk/by-path/… |grep Serial.Number

Keep the results of this somewhere that isn't only on the fileserver. When a disc fails, the software will tell you the internal device name, but it's nice to be able to confirm that with the serial number.
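To do this for every disc at once, something like the following may help. It is a sketch: the function names and the otherhost machine are invented, and it assumes hdparm -I prints a "Serial Number:" line as it does for ordinary SATA discs.

```shell
# Pulls the serial number out of `hdparm -I` output supplied on stdin.
extract_serial() {
    awk -F: '/Serial Number/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim padding around the value
        print $2
        exit
    }'
}

# Prints "device<TAB>serial" for every whole disc under /dev/disk/by-path,
# skipping partition entries (which end in -partN). Needs root for hdparm.
list_serials() {
    for dev in /dev/disk/by-path/*; do
        case $dev in *-part[0-9]*) continue ;; esac
        printf '%s\t%s\n' "$dev" "$(hdparm -I "$dev" 2>/dev/null | extract_serial)"
    done
}

# Typical use, keeping the list on another machine (otherhost is hypothetical):
#   list_serials | ssh otherhost 'cat > fileserver-discs.txt'
```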

I like to run relatively little software on my fileservers, because I have other machines too and I want the fileserver to put all its efforts into serving files; get_iplayer will run on a different box. (I do run mpd on the file server, though, for convenience of access.) If you don't have other machines that run all the time, you may want to put other software on the server, which is much easier with a straight Linux or FreeBSD installation than with FreeNAS.

To get existing data on: if you're using a conventional PC chassis, you may have had room for a DVD drive, in which case you can copy DVDs and CDs directly (dvdbackup and cdparanoia are recommended); otherwise just pull data across the network. I was running four CD drives in parallel (on different machines) when I did my own mass ripping. This may take days or weeks, but you only have to do it once; then the physical media can go into the loft to serve as unusually bulky licence keys.

NFS is the traditional way of getting data onto and off a storage server in the Unix world. Authentication by anything other than IP address (and remember, this is traditionally UDP, so anyone on the LAN can send a packet claiming to be from anywhere) is such a nightmare that I've never got it working, even with Kerberos, so I supply NFS read-only. For read-write access I use sshfs, which with modern CPUs is plentifully fast.
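A minimal sketch of the sshfs side, assuming the server answers to ssh as fileserver and the dataset is mounted at /storage/foo there; both names, the function name and the mount point are invented for illustration. The reconnect option asks sshfs to re-establish the connection if it drops.

```shell
# Mounts a remote directory over ssh.
# Arguments: remote spec (host:/path) and a local mount point.
mount_share() {
    remote=$1   # for example fileserver:/storage/foo
    target=$2   # for example "$HOME/mnt/foo"
    mkdir -p "$target" && sshfs -o reconnect "$remote" "$target"
}

# Typical use:
#   mount_share fileserver:/storage/foo "$HOME/mnt/foo"
# and to unmount again on Linux:
#   fusermount -u "$HOME/mnt/foo"
```

Authentication is then whatever ssh is already set up to use, which avoids the NFS problem entirely.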

For Windows machines, Samba is still the way to go. I don't have any Windows machines any more, hurrah. I think Macs probably talk this too. iOS and Android can barely do anything by default, but apps can persuade them to talk sensible protocols.

If you have a smart TV or similar closed-source hardware, you may want to look into a DLNA server.

And of course boring old HTTP still works.

The final part will deal with maintenance.

Tags: computing

See also:
Building a File Server 1: planning
Building a File Server 2: hardware
Building a File Server 4: maintenance


  1. Posted by Peter at 10:48am on 04 October 2017

    A couple of notes on your choice of zpool/zfs command options:

    You can set the dataset properties on the root dataset at pool-creation time by using the -O option of the zpool command, which means you don't have to remember it later in "zfs create". For example: "zpool create -o ashift=12 -O compression=lz4 -O atime=off -O recordsize=1M -O canmount=noauto ...". This is also much less faff than going back and doing "zfs set" for each individual property later.

    lzjb compression is effectively an obsolete wart from a decade ago, with lz4 being rather faster and marginally more efficient at compression. The only downside of using lz4 is that your pool is no longer importable on Oracle Solaris or other ancient ZFS releases. Depending on your particular OpenZFS port and version, "compression=on" will enable either lzjb or lz4, so it's best to be explicit.

