04 Aug 2019
Once again, various life stressors have resulted in me attempting to retreat
into my little shell, and whenever that happens I always seem to find something
new to work on. In this particular instance, the stress seemed to perfectly
coincide with my discovery of a new Linux distribution, NixOS. I became very
fond of the distribution almost immediately, and after two weeks of living on
it and learning a bit more about it, I recently wiped the MacBook and
reinstalled it with NixOS as its sole operating system.
The next series of blog posts, assuming I manage to stay consistent, will be
about my journey using NixOS, why I have become so fond of it so quickly, and
other interesting things I come up with along the way.
Why NixOS?
The predominant form of package management in Linux distributions tends to be
imperative. Even in distributions like Gen/Funtoo, the build file is in general
a static entity composed of variables that resolve how imperative commands,
such as emerge system, are executed. Essentially, we spend our time telling the
system how we want it to accomplish what we actually want, and by following the
list of imperative commands we (theoretically) end up with what we wanted.
The problem is, this doesn’t always work. To quote GlaDOS, “This former Linux
Developer would like to remind you that Dependency Hell is a real place
where you WILL be sent at the first sign of poor imperative order control.”
Essentially, because we are so focused on how something will be accomplished,
if anything along that chain breaks we end up in a genuinely unfortunate
situation.
This is not a new problem, and many distributions have attempted to resolve
it. Back in 2004/2005, when I was starting to get interested in Linux, I spent
some time talking to a friend about what I was using to tinker. While I cannot
remember the specific distributions, I do know that RPM was the primary
mode of package management on them. He essentially scoffed at me, and said while
he was glad I was using Linux, he wished me luck when I inevitably discovered
“RPM Hell,” perhaps one of the earliest forms of dependency hell.
When I inquired as to what he was using, he told me about Gentoo. This was a
very long time ago, back when Stage 1/Stage 2/Stage 3 installations were still
a thing. The benefit, I was told, is that because literally every part of the
package is imperatively defined ahead of time, you mitigate the chance of
dependency hell, because you’re modifying the entire operating system as a
single unit.
This began my extremely long and at times complicated history with the
Gen/Funtoo community during my past life. I still hold an incredible soft spot
for their ecosystem, but there are failure cases that can occur even within the
Portage subsystem, and doing things imperatively often means that a problem
isn’t discovered until the middle of rebuilding your system, for example during
an emerge world. Not to mention there is a performance penalty, which was even
worse during the P3/P4 days, for building everything completely from source.
Various hybrid binary/source systems were developed over time. One that I
remember offhand is Sabayon, which, if I remember correctly, started as a layer
over Gentoo: unmodified packages were installed from binaries, while anything
customized was built from source. However, this technique only solves the speed
issue; it does nothing to resolve the problems of imperative management.
How does Nix Help?
So Nix works by utilizing a declarative form of package management.
Essentially, instead of focusing on telling Nix how to build our system, we
tell it what system we want. We declare the operating system we desire as a
function, Nix evaluates that function, and then it decides how to realize that
system. This is combined with transactional, atomic procedures. Every
“generation” of Nix is a standalone operating system. Installing or
uninstalling packages never leaves any cruft behind, because packages are
simply symbolically linked into the active system. Removing a package from our
declaration removes the request to link it to the system, and as far as the
resulting system is concerned, it never existed.
Nix, in particular, is a functional package manager, an even stricter form of
declarative package management. Instead of rehashing the explanations, I’ll let
the wonderful Nix webpage explain the difference. NixOS is simply an entire
operating system built on top of Nix.
Beyond the benefits discussed on the official Nix website, there are other
benefits to putting the entire operating system under functional control.
As an example, let us take the boot process. Since the system we are going to
build below utilizes encrypted swap and an encrypted root, you would expect
that we need to handle a custom initrd, a custom fstab, and so on. You would be
right! But instead of juggling multiple files to get this done, we can do
everything right in our configuration.nix.
Swap:
swapDevices = [
  {
    device = "/dev/disk/by-uuid/f6533f92-baf2-4804-afda-880a7b5975ac";
    encrypted = {
      enable = true;
      keyFile = "/mnt-root/root/swap.key"; # Believe it or not, this is correct.
      label = "nixos-swap";
      blkDev = "/dev/disk/by-uuid/6babbdb8-26ec-43ee-b7ab-76b43015acd3";
    };
  }
];
Root FS:
boot = {
  initrd = {
    luks = {
      devices = {
        decrypted-disk-name = {
          device = "/dev/disk/by-uuid/0765a1fc-6045-45af-978e-db49609bc0e3";
          keyFile = "/root.key";
        };
      };
    };
  };
};
Additionally, while some distributions have opted to provide several tools for
building your grub.cfg, those still rely on modifying external files, for
example under /etc/default or other directories. Instead, we add this right
into our configuration.nix as well. We are just declaring what we want Nix to
do; whatever else it decides it needs is up to it.
Grub Configuration:
boot = {
  loader = {
    grub = {
      device = "nodev"; # This isn't for BIOS.
      efiInstallAsRemovable = true; # Try to use Standard EFI.
      efiSupport = true; # This IS for EFI.
      enable = true; # Grub is needed for our weird shit
      enableCryptodisk = true; # Add LUKS support
      extraInitrd = "/boot/initrd.keys.gz"; # LUKS Key
      zfsSupport = true; # Add ZFS support
    };
  };
};
Some people may note right away that we’re building our key into the initrd and
may worry about security issues, but we will get to that as well!
Needless to say, practically everything that is handled in scattered
configuration files on a normal Linux distribution is instead located in one
centralized file. We can break that file up and import others as well, similar
to any other programming language. Essentially, NixOS reduces the entire
operating system to a series of Nix-language source files, and we let Nix
handle all the rest!
About Our System
So, for this initial article I will be discussing what I wanted out of my new
daily driver operating system, and how I went about implementing it. You can
look at my entire Nix-Configuration through the GitHub repository
of the same name, but I won’t be referring to any specific files yet
because the repository is going to change layouts several times throughout this
series of articles, as I attempt to convert to a more functional form of
creating my system.
So, without further ado, what are our requirements?
- Encrypted Root Partition
- ZFS Root Partition
- Encrypted Boot(!) Partition, to protect our kernels and initrd
- Encrypted Swap
- Hibernation Support with encrypted swap(!)
There are actually more requirements, but these form the basis for this article.
Before we get into it, it is worth noting that I used several different sources
to compile all the steps needed to accomplish everything.
First, for Nix on ZFS, the NixOS Wiki Page of the same name was
instrumental in solving the basic requirements of our work. We skip over ZFS
native encryption because, while it may not be leaky in ways that matter for
most threat models, it is still slightly leaky. From man zfs:
zfs will not encrypt metadata related to the pool structure, including dataset
names, dataset hierarchy, file size, file holes, and dedup tables.
Next, we need to ensure that we can encrypt the boot directory. This blog post
was instrumental in getting things to work. Had I not found this post, it’s
possible that I would have forgone boot partition encryption, and I’m very glad
I was able to get it done.
Last, but not least, encrypted swap with hibernation was enabled by following
the first answer to this Stack Exchange question, with some gentle
modifications.
Additionally, it is worth noting that the configuration.nix file was simply
copied from my previous trial system and tweaked from there. The primary reason
the repository is out of date is that while trying to get the system up and
running, I ignored most standards of aesthetics, so I’d like to get it cleaned
up properly before releasing it.
Let’s get started!
Part 1 - Live Environment Pre-Work
While installing the trial system, I simply used the minimal disk environment,
without a GUI, and used my phone to access documentation. However, with the
number of things I wanted to try this time, I figured it would be best to have a
graphical environment to refer to the three sources mentioned above. This poses
a unique problem on a Mac, as the proprietary NVIDIA driver is the only driver
at the time of writing that will get an X-Session up and running.
Adding additional complexity is the fact that while preparing for this process,
I disassembled my MacBook and somewhere along the way caused some sort of issue
with the IO Board, which means I was limited to only one USB 3.0 port on the
left side of the computer.
To start, I used macOS to install the new macOS beta. This was important because
the only way to update MacBook firmware is through macOS. By installing a beta
release, I was trying to get out ahead of any firmware updates to be released in
the next six months. Fingers crossed, this is all that will be required, and we
won’t need to figure out how to get macOS back because of a new firmware
exploit.
Once the beta was installed, and everything was good to go, I downloaded a quick
live-image of ElementaryOS (I knew the Nix live image would be a problem, and
wanted to wait till the system was ready for installation to deal with it), and
used Etcher to write it to a USB disk.
Rebooting into ElementaryOS, I ran a series of commands on the main SSD.
Starting with blkdiscard, I initiated a manual TRIM on the disk to mark every
sector as free of data. Next, I used an ATA Secure Erase command, per the Arch
Wiki memory cell clearing article, to reset the drive to factory default write
speed. For good measure, I ran the --security-erase-enhanced form of the wipe.
Finally, I ran another blkdiscard on the drive, just to be really sure that
everything was gone.
Next, we rebooted into ElementaryOS again, this time telling GRUB I wanted the
whole live system to be stored in RAM. When this was done, I downloaded the
NixOS Graphical Install CD, and burned the ISO to the USB drive.
Rebooting, I was presented with the boot menu, and I made sure to load NixOS
entirely into RAM as well. The NixOS live CD does not contain any non-free
firmware or software, which means the Mac’s Broadcom WIFI chipset will not be
detected. Since I already needed to deal with one small issue with the live
system, it was easier to simply unplug the USB drive and use a USB->Ethernet
dongle to connect for the majority of the installation.
So, now we are inside the NixOS live system, at a command prompt, and we have
an internet connection. Attempting to boot into a GUI, as expected, results in
a failure to find a valid display device. This isn’t as much of an issue as it
could be on other systems; we simply need to edit /etc/nixos/configuration.nix
on the live system to include services.xserver.videoDrivers = [ "nvidia" ]; and
then run nixos-rebuild switch. Once this completes, we run the given command to
start the X-Session, and voilà, it works. Checking all of our networking areas,
we see that we have a proper internet connection, and we can move on to the
more fun things.
Part 2 - Disk Configuration
This part is fairly straightforward. We use gdisk to set up three partitions.
The first partition is our EFI Boot Partition; the second will be our swap
partition, made large enough to handle hibernation plus a little extra; and the
third is our new root partition, where ZFS will live.
$ gdisk /dev/sda
GPT fdisk (gdisk) version 1.0.4
Partition table scan:
MBR: not present
BSD: not present
APM: not present
GPT: not present
Creating new GPT entries in memory.
Command (? for help): o
This option deletes all partitions and creates a new protective MBR.
Proceed? (Y/N): Y
Command (? for help): n
Partition number (1-128, default 1): 1
First sector (34-2097118, default = 2048) or {+-}size{KMGTP}:
Last sector (2048-2097118, default = 2097118) or {+-}size{KMGTP}: +200M
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300): ef00
Changed type of partition to 'EFI System'
Command (? for help): n
Partition number (1-128, default 2): 2
First sector (34-2097118, default = 2048) or {+-}size{KMGTP}:
Last sector (2048-2097118, default = 2097118) or {+-}size{KMGTP}: +20G
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300): 8200
Changed type of partition to 'Linux swap'
Command (? for help): n
Partition number (2-128, default 3): 3
First sector (34-2097118, default = 411648) or {+-}size{KMGTP}:
Last sector (411648-2097118, default = 2097118) or {+-}size{KMGTP}:
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300):
Changed type of partition to 'Linux filesystem'
Command (? for help): w
Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!
Do you want to proceed? (Y/N): Y
OK; writing new GUID partition table (GPT) to /dev/sda.
The operation has completed successfully.
Now we can take a look at what we have:
$ gdisk /dev/sda
GPT fdisk (gdisk) version 1.0.4
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Command (? for help): p
Disk /dev/sda: 977105060 sectors, 465.9 GiB
Model: APPLE SSD SM0512
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): C35223A0-E004-474E-8B79-230B64658AB0
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 977105026
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)
Number  Start (sector)  End (sector)  Size       Code  Name
   1              2048        411647  200.0 MiB  EF00  EFI System
   2            411648      42354687  20.0 GiB   8200  Linux swap
   3          42354688     977105026  445.7 GiB  8300  Linux filesystem
Command (? for help): q
With everything now set up on disk, it is time to build our filesystems.
Part 3 - Filesystems
With our disk structures in place, let’s talk about our filesystems. There will
be three “main” ones, but it gets a bit more complex than that. First, let’s
start by setting up our new EFI System Partition:
$ mkfs.vfat /dev/sda1
That solves that issue. Next, we need to set up our two encrypted partitions.
Despite the fact that we are going to use keyfiles, we should still establish a
typed passphrase in the event we need to tweak the partitions from outside the
operating system built on it. After setting up the encryption, we open each
encrypted container and assign it a friendly name to work with.
$ cryptsetup luksFormat /dev/sda2
Enter passphrase:
Verify passphrase:
Command successful.
$ cryptsetup luksFormat /dev/sda3
Enter passphrase:
Verify passphrase:
Command successful.
$ cryptsetup luksOpen /dev/sda2 nixos-swap
Enter passphrase:
Command successful.
$ cryptsetup luksOpen /dev/sda3 nixos-root
Enter passphrase:
Command successful.
Okay, so now we have our containers. The next step is to create the basic
filesystem inside each one: swap on our swap device, ZFS on our ZFS device. It
is important that we use the /dev/disk/by-id/ entry, which combines the UUID
with the friendly name of the device. This makes identifying things easier when
we work with them, and helps ZFS understand what exactly is going on.
$ mkswap /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-deadbeef-nixos-swap
Setting up swapspace version 1, size = 20971520 KB
Before we get to the ZFS setup, I’d like to explain the options I am using.
While the explanations are available on the wiki page, they are restated here
for convenience.
- -O compression=lz4: Disk space on an SSD is more valuable than CPU time.
  Using LZ4 will not impact the user experience to any discernible degree.
- -O normalization=formD: Filenames are stored as normalization-form-D Unicode.
  While not really required, it could let you do some interesting things, and
  in general I like to use Unicode wherever possible.
- -O xattr=sa: Boosts performance for certain file attributes; this could
  become useful if I ever attempt system hardening (I likely will at some
  point).
- -O acltype=posixacl: Required for systemd-journald.
- -O mountpoint=none: Turns off ZFS’ automount machinery. In certain instances,
  ZFS’ and NixOS’ boot-time automounting machinery could trigger a race
  condition and prevent the system from booting. This allows us to bypass that
  possibility completely.
- -o ashift=12: Forces 4K sectors. It is very likely ZFS would have done this
  anyway, but instead of risking the chance that it reads the hardware
  incorrectly, I just declare it manually.
So, with that out of the way, here is what our nice bulky zpool creation
command looks like:
$ zpool create -O compression=lz4 -O normalization=formD -O xattr=sa -O acltype=posixacl -O mountpoint=none -o ashift=12 zroot /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-deadbeef-nixos-root
With our zpool initialized, next we need to form the filesystems under it. We
will do three things. First, we will create a separate dataset for our /home
directory, so that user data is kept somewhere separate from the root partition.
Next, we will define a root
dataset, and within that, a nixos
dataset. This
means, should we ever want to, we could run multiple distributions off of the
same ZPool by nesting them under the root
dataset, and pointing their /home
at our home dataset. Is it likely we will ever do this? No, but it would be nice
to have if we ever decided to try it!
Once again, we set our mount points to legacy to ensure that the automount
machinery has absolutely nothing to go on, preventing it from firing during
boot.
$ zfs create -o mountpoint=none zroot/root
$ zfs create -o mountpoint=legacy zroot/root/nixos
$ zfs create -o mountpoint=legacy zroot/home
Where does this leave us? As of right now, we have our EFI System Partition,
freshly created with nothing on it. We have our Swap Partition, nested within a
LUKS encrypted volume, and we have our three ZFS datasets, within our zroot
zpool, within a LUKS encrypted volume.
Not yet done, we need to mount everything to the proper locations, and then do
some additional work to make sure everything will boot as we want it to. To
begin with, we will set up the easy ZFS mount points. Next, we need to mount
our EFI System Partition to /mnt/efi. We do this because it will allow GRUB to
write to the EFI partition, and have that point to our actual, encrypted, boot
directory, which we also create here. Lastly, for some additional work we will
do in a moment, we manually create the /root directory.
$ mount -t zfs zroot/root/nixos /mnt
$ mkdir /mnt/home
$ mount -t zfs zroot/home /mnt/home
$ mkdir /mnt/efi
$ mount /dev/sda1 /mnt/efi
$ mkdir /mnt/boot
$ mkdir /mnt/root
Now, we mentioned above that we would like to have the system automatically
unlock. This is secure, because before we can even access GRUB directly, we will
have to type in our decryption passphrase for our nixos-root partition.
Essentially, everything we are about to do will be encrypted based on that
master passphrase anyways, so there’s no real chance of a leak occurring.
To do this, we will create two binary keyfiles. swap.key will be the binary
key for the swap partition, and root.key the binary key for the root
partition. We will use LUKS to assign those keys to their respective LUKS
encrypted volumes, allowing the volumes to be decrypted both with a binary
keyfile and with a passphrase. The root.key file will then be packaged into a
CPIO archive, and GRUB will append this to the initrd image made by NixOS.
During boot, we will type in our master passphrase to unlock GRUB, select our
boot entry, and then GRUB will hand over control to the initrd after appending
our CPIO archive. The initrd will unlock /dev/sda3 using root.key and then hand
over control to systemd, which will continue the boot and load swap.key to
unlock the swap partition. Since the swap partition is not re-encrypted with a
random key on every boot, this process is repeatable, which is what allows
hibernation to function properly.
The end result of all of this is that a single master passphrase only needs to
be entered once to allow the system to boot properly. Without this method, we
would have to re-enter the nixos-root
passphrase twice, and the nixos-swap
passphrase once. I am not sure, but this also might break our hibernation
capabilities.
Let’s get started. First we will create our binary keyfiles from /dev/urandom,
then assign them to the volumes, then create the CPIO archive and stash it
where it needs to be.
$ dd count=4096 bs=1 if=/dev/urandom of=/mnt/root/root.key
$ dd count=4096 bs=1 if=/dev/urandom of=/mnt/root/swap.key
$ cryptsetup luksAddKey /dev/sda2 /mnt/root/swap.key
Enter passphrase:
Command successful.
$ cryptsetup luksAddKey /dev/sda3 /mnt/root/root.key
Enter passphrase:
Command successful.
$ cd /mnt/root
$ echo ./root.key | cpio -o -H newc -R +0:+0 --reproducible | gzip -9 > /mnt/boot/initrd.keys.gz
With that last string of commands, we are all set. To recap what was
accomplished in this section:
- We created the FAT filesystem for our EFI System Partition
- LUKS formatted /dev/sda2 and /dev/sda3 with a passphrase
- Opened /dev/sda2 and /dev/sda3 and assigned them to nixos-swap and nixos-root, respectively
- Created a swap FS on nixos-swap
- Created a zpool on nixos-root
- Created 3 datasets on nixos-root
- Mounted everything correctly
- Generated 2 binary keyfiles
- Assigned each binary keyfile to its respective partition
- Generated a CPIO archive for our initrd.
It’s time to move on to installing NixOS and configuring it to make use of our
work.
End of Part 1
While I had intended for us to have a system up and running by the end of part
one, this post is close to breaking 600 lines, and to be completely honest, this
is the most I have written in quite a while. Part 2 will cover getting the
system up and running, as well as a little preview of what our
configuration.nix
file will look like. Expect that installment to be quite a
bit smaller than this one. Finally, part 3 will deal with the initial
declarative configuration of our /home
directory.
The end goal is that all user data will end up in a ~/.library directory,
similar to the nix-store, and upon login, a symbolic-link farm will be built
according to our declarative home-management system. In this way, everything
from /etc/nixos to various dotfiles will be kept in an easy-to-understand
layout, and simply linked into their less-easy-to-understand locations by the
derivation created by home-management.
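To make the idea concrete, here is a toy sketch of a symlink farm in shell. The
.library layout and the vimrc file are hypothetical placeholders, not the
actual home-management implementation:

```shell
# Build a tiny symlink farm: managed files live in one directory,
# and the (stand-in) home directory only holds links into it.
TARGET="$(mktemp -d)"        # scratch stand-in for $HOME
LIBRARY="$TARGET/.library"   # stand-in for the managed store
mkdir -p "$LIBRARY/dotfiles"
printf 'set number\n' > "$LIBRARY/dotfiles/vimrc"

# One dot-prefixed symlink per managed file.
for f in "$LIBRARY/dotfiles"/*; do
  ln -sf "$f" "$TARGET/.$(basename "$f")"
done
ls -l "$TARGET/.vimrc"
```

A declarative tool would regenerate the farm from the declaration on each
login, so stale links disappear along with their entries.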
But, for another time.
10 Dec 2018
There are yet more updates to the blog. The first is that we now have an actual
domain name: zyradyl.moe. In keeping with my tradition of complete
transparency, the domain was acquired through Gandi.net after I found out that
IWantMyName was unable to accept Discover cards. While I am still supportive of
IWMN as a company, if they don’t accept my card I am unable to use them.
Next, the DNS for this site is now handled through Cloudflare, which means that
this site is now fully available via HTTPS with a valid SSL certificate. So,
small victories.
While running through the process of updating the blog, I noticed several things
were broken and went ahead and fixed those:
- The “Site Version” link in the sidebar now properly links to the GitHub
  source repository.
- A long-standing issue with pagination has been corrected by updating to the
  jekyll-paginate-v2 gem and rewriting the appropriate Liquid blocks.
- GitHub Pages does not support the v2 gem. Therefore, the site has been
  downgraded back to the v1 gem, and the Liquid blocks were cleaned up based on
  trial and error.
- Related posts are now actually related! This is accomplished by iterating
  through tags at compile time and creating a list of related posts. While this
  may not always be accurate, it is far more accurate than the time-based
  system Jekyll uses by default.
- A small issue has been corrected with the header file used across pages.
  There was a typo that was generating invalid HTML. It didn’t cause any
  visible issues, but it was a problem all the same.
- The archive page now uses a new Liquid code block. This resolves the
  long-standing </ul> problem, where the code would generate trailing closing
  tags.
- HTTPS links have been enforced across the board. I cannot promise the site
  that you visit will have a valid SSL certificate, but we will certainly try
  to redirect the connection over SSL now.
- HTML Proofer is still throwing a few errors related to my consistent use of
  the Introduction and Conclusion headers, but these are not actual errors.
- Even these errors have been fixed. HTML Proofer now returns a completely safe
  site.
I’m also in the process of going back through previous posts and cleaning up
the YAML front matter. While this front-matter previously had very little
impact on the site, it now can matter quite a lot with the way the related
posts system works.
09 Dec 2018
So, another week has gone by, and it is time to update this blog with what I
have learned. Unfortunately, I was not able to run any data-transfer
experiments this week. I decided to revisit the base system to focus on
encrypting backup data while it is at rest, which was one of the remaining
security vulnerabilities in this process. While end-users still have to trust
me, they can at least be assured the data is encrypted at rest. Essentially, if
the system were ever stolen, or our apartment door broken down, we would just
have to cut power and the data would be safe. Please keep in mind that this
week’s post only covers the root drive. I didn’t make much progress because of
things happening at work, but this is a nice, strong foundation to build upon.
Many of the steps in this post were cobbled together from various sources across
the internet. At the bottom of this post you can find a works cited that will
show the posts that I used to gather the appropriate information.
End Goal
The end goal is to ensure that the operating system’s root drive is encrypted at
rest. Full Disk Encryption is not an active security measure, it is a passive
one. It is primarily there to ensure that should the system ever be stolen, it
would not be readable. The root partition will not host any user data, so the
encryption should be transparent and seamless.
In short, we will utilize a USB key holding a keyfile, which LUKS will use to
unlock the LVM array, allowing the initramfs to hand control over to the
operating system.
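On Debian, one common way to wire this up is a /etc/crypttab entry that uses
the passdev keyscript to read the keyfile from a separate device; the name,
UUID, label, and path below are placeholders for illustration, not my actual
values:

```
# /etc/crypttab (illustrative placeholders only)
# <target>      <source>        <key file (device:path)>                  <options>
djehuti_crypt   UUID=xxxx-xxxx  /dev/disk/by-label/KEYUSB:/keys/root.key  luks,keyscript=/lib/cryptsetup/scripts/passdev
```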
Notes
Because we are using a solid state drive, and we will be filling the drive with
data, it was important for me to over-provision the drive. The SSD we’re using
comes with 240GB of space. We can assume that there is some form of manufacturer
over-provisioning in play to get that number, if I had to guess I would assume
there is actually 256GB of NAND memory on the drive, but only 240GB are made
available to the user. This is a fairly reasonable level of over-provisioning.
However, since we plan to fill the drive with pseudorandom data in order to
obfuscate the amount of data actually in use, this 16GB could be used up quite
quickly. SSDs cannot actually rewrite sectors in place; they have to run a
READ/ERASE/WRITE cycle. This is typically handled by writing the new block to
an over-provisioned area and then pointing the drive’s firmware at that block.
In this way the drive avoids the ERASE penalty, which can be on the order of
0.5 seconds per block.
Essentially then, every single write to the drive will require a
READ/ERASE/WRITE cycle, so padding the over-provisioning is a very good idea. It
will help with wear leveling and prevent severe write amplification, while also
making the drive “feel” faster.
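To put rough numbers on this (assuming, as above, 256GB of raw NAND with 240GB
exposed, plus the 200GB usable size we settle on later in this post), the
reserved fraction works out as follows:

```shell
# Back-of-the-envelope over-provisioning math, in integer percent.
# The figures are the assumptions from the text, not measured values.
RAW=256      # assumed raw NAND, GB
VENDOR=240   # GB the vendor exposes
USABLE=200   # GB left visible after our later HPA change
VENDOR_OP=$(( (RAW - VENDOR) * 100 / RAW ))  # vendor reserve, ~6%
TOTAL_OP=$(( (RAW - USABLE) * 100 / RAW ))   # reserve after HPA, ~21%
echo "${VENDOR_OP}% -> ${TOTAL_OP}%"
```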
Prior Work
Before we get into the new installation, we need to prepare the drive for its
new role. Unless the flash cells are at their default state, the firmware will
regard them as holding data and will not utilize them for wear leveling, thus
rendering the over-provisioning useless.
To begin, boot the system via a Debian Live-CD and open up a root prompt using
sudo.
If you, like me, prefer to work remotely, you will then need to run a sequence
of commands to prep the system for SSH access: update the package lists,
install OpenSSH, set a password for the live-CD user, and finally start the
service. Once all of this is complete, you can log in from a more comfortable
system.
# apt-get update
# apt-get install openssh-server
# passwd user
# systemctl start sshd
We will need to install one last software package, hdparm. Run apt-get install
hdparm to grab it. Once you have done so, run hdparm -I /dev/sda. Under
“Security” you are looking for the words “not frozen”. If it says frozen, and
you are working remotely, you will need to access the physical console to
suspend/resume the machine. This should unfreeze access to the ATA security
system.
The first thing we need to do is to run an ATA Enhanced Erase. After this is
done, I still like to run blkdiscard
just to make sure every sector has been
marked as empty. Finally, we will use hdparm
to mark a host-protected-area,
which the drive firmware will be able to use as an over-provisioning space.
To calculate the HPA size, figure out what size you want to be available to
you, convert that into bytes, and divide by 512, the sector size. This will
give you the number to pass to hdparm.
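As a worked example, here is the arithmetic behind the sector count used in the
hdparm invocation below (200GB usable, decimal units, 512-byte sectors):

```shell
# Convert the desired usable size (200 GB, decimal) into 512-byte sectors.
TARGET_BYTES=$(( 200 * 1000 * 1000 * 1000 ))
SECTORS=$(( TARGET_BYTES / 512 ))
echo "$SECTORS"   # 390625000, the value passed to hdparm -Np
```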
# hdparm --user-master u --security-set-pass Eins /dev/sda
# hdparm --user-master u --security-erase-enhanced Eins /dev/sda
# blkdiscard /dev/sda
# hdparm -Np390625000 --yes-i-know-what-i-am-doing /dev/sda
# reboot
Once this is done reboot immediately. There is a lot that can go wrong if
you fail to reboot. At this point, I swapped out my disk for the Debian
installer. If you are doing this on your own 2006-2008 MacMini, you may want
to use the AMD64-mac ISO that the Debian project provides.
From here, we just have to confirm that the drive shows up how we want in the
installer (200GB in size, in my case), and we can proceed with the installation.
Installation
Most of the Debian installation process is self explanatory. The only point
where I will interject is partitioning. Because of the way the MacMini2,1
boots, it is important that we use an MBR based grub installation. You can do
a 32bit EFI installation, but it is very fragile, and I’m not a fan of fragile
things. That being said, I still wanted the ability to use GPT partitions. I
like being able to label everything from the partition up to the individual
filesystems.
Accomplishing this is actually fairly easy these days. You just need to create
a 1MB grub_bios partition as part of your scheme and you’re good to go. To get
the level of control we need, we will select manual partitioning when prompted
to set up our partitions in the installer.
Create a new partition table (This will default to GPT), and then lay out your
initial partition layout. It will look something like this:
<PART #> <SIZE> <NAME> <FILESYSTEM> <FLAGS>
#1 1MB BIOS_PARTITION none grub_bios
#2 1GB BOOT_PARTITION ext4 bootable
#3 199GB ROOT_PARTITION crypto crypto
When you select “Physical Volume For Encryption” it will prompt you to configure
some features. You can customize the partition there, but I actually wanted more
options than the GUI provided, so I accepted the defaults and planned to
re-encrypt later. Please make sure to allow the installer to write encrypted
data to the partition. Since we have already set up a customized HPA, a
potential attacker already knows the maximum amount of cipher text that can
be present, and if the HPA is disabled they would likely be able to gain
access to more. Therefore, it is important that we take every possible
precaution.
Once this is done, you should scroll to the top where it will say “Configure
Encryption” or something similar. Select this option, then select the physical
volume we just set up, and it should drop you back to the partitioning menu.
This time, however, you will be able to see the newly unlocked crypto partition
as something that we can further customize.
Select that volume and partition it like so:
<PART #> <SIZE> <NAME> <FILESYSTEM> <FLAGS>
#1 199GB none lvm
The LVM option will show up in the menu as "Physical Volume for LVM." From here,
we go back up to the top of our menu and select "Configure Logical Volume
Manager." You will then be taken to a new screen where it should show that you
have one PV available for use. Create a new volume group that fills the entire
PV and name it as you would like. For this project, I named it djehuti-root
and completed setup.
Next, we need to create a Logical Volume for each partition that we would like
to have. For me, this looked like the following:
<Logical Volume> <Size> <Name>
#1 30GB root-root
#2 25GB root-home
#3 10GB root-opt
#4 05GB root-swap
#5 05GB root-tmp
#6 10GB root-usr-local
#7 10GB root-var
#8 05GB root-var-audit
#9 05GB root-var-log
#10 05GB root-var-tmp
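For reference, the non-interactive equivalent of the layout above would look roughly like the following lvcreate calls. This is only a sketch: it assumes the volume group djehuti-root already exists on the unlocked PV, requires root and the LVM2 tools, and is meant for illustration rather than for running from inside the installer.

```shell
# Sketch only: recreate the logical volumes from the table above
# inside the djehuti-root volume group (requires root and LVM2).
lvcreate -L 30G -n root-root djehuti-root
lvcreate -L 25G -n root-home djehuti-root
lvcreate -L 10G -n root-opt djehuti-root
lvcreate -L 5G  -n root-swap djehuti-root
lvcreate -L 5G  -n root-tmp djehuti-root
lvcreate -L 10G -n root-usr-local djehuti-root
lvcreate -L 10G -n root-var djehuti-root
lvcreate -L 5G  -n root-var-audit djehuti-root
lvcreate -L 5G  -n root-var-log djehuti-root
lvcreate -L 5G  -n root-var-tmp djehuti-root
```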
Your layout may be similar. Once this is done, you can exit out and you will
see that all of your logical volumes are now available for formatting. Since I
wanted to stick with something stable, and most importantly resizable (more on
why later), I picked ext4 for all of my partitions. We will tweak mount
options later. For now, the end product looked like the following:
<PARTITION> <FS> <MOUNT POINT> <MOUNT OPTIONS>
/dev/sda2 ext4 /boot defaults
/dev/djehuti-root/root-root ext4 / defaults
/dev/djehuti-root/root-home ext4 /home defaults
/dev/djehuti-root/root-opt ext4 /opt defaults
/dev/djehuti-root/root-swap swap none defaults
/dev/djehuti-root/root-tmp ext4 /tmp defaults
/dev/djehuti-root/root-usr-local ext4 /usr/local defaults
/dev/djehuti-root/root-var ext4 /var defaults
/dev/djehuti-root/root-var-audit ext4 /var/audit defaults
/dev/djehuti-root/root-var-log ext4 /var/log defaults
/dev/djehuti-root/root-var-tmp ext4 /var/tmp defaults
Once everything is set up appropriately, follow through the installation until
you get to the tasksel portion. You really only want to install an SSH server
and the standard system utilities pack. Once the installation completes, reboot
into your server and make sure everything boots appropriately. We're going to be
doing some offline tweaking after this point, so ensuring that everything is
functioning as-is will save you a lot of headache.
Once you are satisfied the initial installation is functioning and booting
correctly, it is time to move on to re-encrypting the partition with our own
heavily customized parameters.
Re-Encryption
This process isn’t so much difficult as it is simply time consuming. Go ahead
and reboot your system to the boot media selection screen. You will want to
swap out your Debian Installation CD for the Debian LiveCD that we used earlier.
Once the disks have been swapped, boot into the live environment and then
bring up a shell. We will first need to install the tools that we will use, and
then run the actual command. The command itself is fairly self-explanatory,
so I won't walk through it, but I will explain the reasoning behind the
parameters below.
# apt-get update
# apt-get install cryptsetup
# cryptsetup-reencrypt /dev/sda3 --verbose --use-random --cipher serpent-xts-plain64 --key-size 512 --hash whirlpool --iter-time <higher number>
So, onto the parameters:
- cipher - I picked Serpent because it is widely acknowledged to be
a more “secure” cipher. Appropriate text from the above link
is as follows: “The official NIST report on AES competition
classified Serpent as having a high security margin along
with MARS and Twofish, in contrast to the adequate security
margin of RC6 and Rijndael (currently AES).” The speed
trade-off was negligible for me, as the true bottleneck in the
system will be network speed, not disk speed.
- key-size - The XTS algorithm requires double the number of bits to
achieve the same level of security. Therefore, 512 bits
are required to achieve an AES-256 level of security.
- hash - In general, I prefer hashes that have actually had extensive
cryptanalysis performed to very high round counts. In the best
known attack on Whirlpool, a worst case situation where the
attacker controls almost all aspects of the hash, the time
complexity is still 2^128 on 9.5 of 10 rounds. This
establishes a known time to break of over 100 years.
- iter-time - The higher your iteration time, the longer it takes to unlock,
but it also makes each guess at the passphrase more expensive
for an attacker. So if we combine what we know above with a
large iteration time, we gain fairly strong security at the
expense of a long unlock time when using a passphrase.
Once these specifications have been entered, you simply need to press enter and
sit back and relax as the system handles the rest. Once this process is
complete, you should once again reset and boot into the system to verify that
everything is still working as intended. If it is, you are ready for the next
step, which is automating the unlock process.
Auto-Decryption
There are a few ways to handle USB key based auto-decryption. The end goal is
to actually use a hardware security module to do this, and I don’t anticipate
the FBI busting down my door any time soon for hosting the data of my friends
and family, so I opted for one that is easily extendable.
Essentially, the key will live on an ext4 filesystem. It will be a simple
hidden file, so nothing extremely complex to find. This shouldn't be considered
secure at this point, but it is paving the way to a slightly more secure
future.
The first thing that I did, though it isn't strictly necessary, is write random
data to the entire USB stick. In my case, the USB drive could be found at
/dev/sdb.
# dd if=/dev/urandom of=/dev/sdb status=progress bs=1M
Once this is done, we’ve effectively destroyed the partition table. We will
recreate a GPT table, and then create a partition that fills the usable space
of the drive.
# apt update
# apt install parted
# parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart KEYS ext4 0% 100%
(parted) quit
Now we just create the filesystem, a mount point for the filesystem, and make
our new LUKS keyfile. Once the file has been created, we just add it to the
existing LUKS header.
# mkfs.ext4 -L KEYS /dev/sdb1
# mkdir /mnt/KEYS
# mount LABEL=KEYS /mnt/KEYS
# dd if=/dev/random of=/mnt/KEYS/.root_key bs=1 count=4096 status=progress
# cryptsetup luksAddKey /dev/sda3 /mnt/KEYS/.root_key
After this point, the setup diverges a bit depending on what guide you follow.
We will stick close to the guide posted to the Debian mailing list for now, as
that guide got me a successful boot on the first try. The others are slightly
more elegant looking, but at the expense of added complexity. As such, they may
end up being the final configuration, but for this prototyping phase they are
a bit excessive.
We have to modify the crypttab file to enable the keyfile to be loaded off of
our freshly set up key drive.
sda3_crypt UUID="..." /dev/disk/by-label/KEYS:/.root_key:5 luks,initramfs,keyscript=/lib/cryptsetup/scripts/passdev,tries=2
At this point, we need to repackage our startup image, update grub, and reboot
to test the whole package.
# update-initramfs -tuck all
# update-grub
# reboot
At this point the system should boot automatically, but you will notice a weird
systemd-based timeout that happens. This is mentioned in the guide posted to
the Debian Stretch mailing list, and is fairly easy to solve. We just need to
create an empty service file to prevent systemd from doing its own thing.
# touch "/etc/systemd/system/systemd-cryptsetup@sda3_crypt.service"
# reboot
At this point, everything should boot correctly and quickly. You may notice a
few thrown errors, but it shouldn't be anything severe, just services loading
out of order.
At this point, it used to be possible to allow for the creation of a fallback
in the event that the key drive wasn’t present, but that seems to have been
removed. I plan to look into it further when I have more time.
Conclusion
This concludes the first part of the Operating System setup process. The next
step was originally planned to be thin-provisioning the partitions inside the
djehuti-root volume group, but there seem to be some problems in getting
the system to boot from a thin-provisioned root. I'm looking into a weird
combined system, where the root is static but all the accessory partitions are
thinly provisioned, but it will take time to tinker with this and report back.
Thin Provisioning isn’t strictly required, but it is a rather neat feature and
I like the idea of being able to create more partitions than would technically
fit. I’m not sure when this would be useful, but we will see.
Once all of this is finalized, we will move on to hardening the base system,
and last but not least creating the Stage 1 Project page. Then it is back to
experiments with data synchronization. This is a fairly large step back in
progress, but I am hopeful it will result in a better end product, where
security can be dynamically updated as needed.
Works Cited
The following sources were invaluable in cobbling this process together. I
sincerely thank the authors both for figuring the process out and documenting
the process online.
03 Dec 2018
Hello! It’s that time of the week again, where I update everyone on my latest
work. This episode is far less technical and focuses more on the concept of a
“One and Done” backup solution, aka the holy grail of data maintenance.
It fucking sucks.
Introduction
This entry is slightly unidirectional. The concept of a simple, easy to
implement, catch everything you might ever need solution is quite literally the
holy grail, yet it has never honestly been implemented. Sure, user data is
generally scooped up, but in the day and age of game mods, and with some
development projects taking place outside of the User directory, it seemed
prudent to at least attempt the full backup. Well, I’ve been attempting it
for seven days. Here’s what I’ve found.
Focus
We will not be focusing on the space impact of a complete backup. This is
actually fairly negligible. With out-of-band deduplication, only one set of
operating system files would ever be stored, so server side storage would reach
a weird type of equilibrium fairly quickly. Instead, I’ll talk about three
things:
- Metadata Overhead
- Metadata Processing
- Initial Synchronization
There may be another post tonight talking about additional things, but this
deserves its own little deal.
Metadata Overhead
A fully updated Windows 10 partition of your average gamer, aka my fiancé, is
composed of 479,641 files and 70,005 directories which comprise a total
data size of ~216 GiB. This is actually just the C drive and typical
programs. If you factor in the actual game drive in use by our test case, that
drive contains 354,315 files and 29,111 directories which comprise a
total of ~385 GiB of space.
In summation, an initial synchronization of what is typically considered a "full
system backup" comprises 833,956 files and 99,116 directories totaling
~601GiB, which results in an average filesize of ~755KiB and an average
directory size of ~9 files.
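Those averages can be double-checked with a bit of shell arithmetic (integer division, so the results are truncated):

```shell
# ~601 GiB across 833,956 files, expressed in KiB per file.
echo $((601 * 1024 * 1024 / 833956))    # 755
# Files per directory (truncated; the true value is ~8.4).
echo $((833956 / 99116))                # 8
```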
SyncThing creates a block store that is comprised of, by default, 128KiB
blocks. This means that for our system, assuming the data is contiguous, we need
4923392 Metadata Entries. Assuming the files are NOT contiguous, this is
4,923,392 metadata entries. Assuming the files are NOT contiguous, this is
probably closer to about 5 million metadata entries. As of right now, the
server side metadata storage for the testing pool is at 1.7 GiB and initial
synchronization is not yet complete. Extrapolating a bit, we can assume that
2.0 GiB would not be an unreasonable size for a final server side data
store.
The client side store, at the time of writing, is approximately 1 GiB and
may grow slightly larger. However, I will use 1 GiB. This means that there
is a plausible total of 3GiB of metadata overhead representing an overhead
percentage of ~0.5% across the pool. Scaling up, this means 10 clients
with 1TB of data each would require 51.2GB of Metadata.
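The block count and overhead figures above can be reproduced directly: 601 GiB divided into 128 KiB blocks, and 3 GiB of metadata measured against the pool size.

```shell
# Number of 128 KiB blocks needed for ~601 GiB of contiguous data.
echo $((601 * 1024 * 1024 / 128))                   # 4923392
# Metadata overhead: 3 GiB against 601 GiB, as a percentage.
awk 'BEGIN { printf "%.2f%%\n", 3 / 601 * 100 }'    # 0.50%
```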
Metadata Processing
Should anything happen to the metadata store, it would need to be rebuilt by
data reprocessing. This introduces a potentially massive liability, as scanning
frequency would need to be reduced to not impact the rebuild operation.
The server is capable of a hash rate of 107MB/s. I am picking the server’s
hash rate because it is both the slowest hash rate of the pool and would have
the most metadata that would need to be rebuilt.
For a complete rebuild of the data of our current cluster, it would take the
server ~96 Minutes during which no data synchronization could occur. This
equates to a minimum of 1 Missed Hourly Update and could potentially result
in up to 2 missed hourly updates if the timing was unfortunate enough.
For a complete rebuild of the data of our theoretical cluster, we will allow for
a hash rate of 300MB/s. The total data needed to be rebuilt would be 10TB.
This would result in a database rebuild time of ~10 Hours, which could result
in up to 11 missed synchronization attempts.
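Both rebuild estimates follow from the same back-of-the-envelope arithmetic (treating MB and MiB as roughly interchangeable, as the figures above do):

```shell
# Current cluster: ~601 GiB (~615,424 MB) at 107 MB/s, in minutes.
awk 'BEGIN { printf "%.0f\n", 601 * 1024 / 107 / 60 }'    # 96
# Theoretical cluster: 10 TB (10,000,000 MB) at 300 MB/s, in hours.
awk 'BEGIN { printf "%.1f\n", 10000000 / 300 / 3600 }'    # 9.3
```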
Initial Synchronization
The initial synchronization is composed of three primary parts. First, the
client and host must agree on what folders to synchronize. Second, the client
must build a database of the content hosted locally. Finally, utilizing a
rolling hash algorithm, data is entered into the metadata cache and transmitted
to the server.
Per the developer of SyncThing, millions of small files are the worst case
scenario for the backup system. In my independent, albeit anecdotal, testing,
the synchronization process was still running after 7 days. This represents a
very poor user experience and would not be ideal for a widespread rollout.
Conclusion
The primary goal of a backup utility is to synchronize files and achieve cross
system consistency as quickly as possible. While it is true that eventually
consistent systems are utilized in large scale operations, this type of
consistency is allowable only, in my opinion, at data sizes over 10TB. The
current testing set is approximately 1TB at most, and thus this is unacceptable.
Either the backup paradigm must change, or the utility used to implement it
must change. While I do not expect to find any faster utilities for performing
the backup process, I do plan to continue to experiment. At this time, however,
it seems that the most likely way to make the process as friendly as possible
would be the implementation of a default backup subset, with additional data
added upon user request, and after the high priority synchronization had been
completed.
30 Nov 2018
Yes, it’s that time of year again. I have updated the blog! You can find more
information below.
Analytics
Google Analytics is back. I understand that some people may not like being
tracked, but at that point you should have an ad or tracking blocker installed.
I recommend looking at the Brave Browser, which is what I personally use, or
installing uBlock Origin. The reason I have added this back to the blog is that
I have noticed links to this blog appearing in various places over the web, and
I would like to be able to detect how much traffic I am getting from these
links.
I wholeheartedly understand if you disapprove of the use of Google Analytics. If
anyone is able to suggest a better service, that collects less user data, please
open an issue in the GitHub repository for this site. I will gladly change
providers as long as the new one is also free.
Theme Updates
I forked the Lanyon repository and applied all the currently pending pull
requests. This should keep everything up to date with the latest version of
Jekyll. To make it easier for anyone else looking for that information, I
created a new PR in the Lanyon repository pointing to my mergers. You can also
find them under my GitHub site.
Fonts
Somehow the fonts on the site got nuked during the upgrade. They are back in
place now.
Page Speed
The new updates have not yet been optimized. Previously I used Google’s page
load speed thingy to optimize the site. Maintaining that by hand is one of the
reasons the upgrade was so painful to implement. I’m looking for a way to
automate the process the same way that I have currently automated the link tests
that I run prior to a push. This will likely involve writing a new process in
the Rakefile, so it will take some time. In the meantime, the only thing that
is really being pulled is a few font files and the analytics script, so the
load impact shouldn’t be too bad.
Organization
There is a weird issue between rendering the site on my local machine and the
way GitHub pages renders it on their side. To solve this I created a new
template and appended it to all the pages that should not appear in the sidebar.
Hopefully this strikes a good balance between being able to use standardized
templates, as well as ease of use. The new template simply imports the existing
page template under a new name.
Images
While I am primarily focused on text on this blog, I have recently included a
few images. These are also hosted on GitHub, so I cleaned up the way the image
directory is laid out. This has impacted all of one image, but should make any
expansion in the future much easier.
Liquid Changes
There was some issue with the way the Liquid on the Tags page was written. This
has been corrected accordingly.
About Page
The about page has been correctly updated! My view on some of the listed issues
has evolved in recent years. You will now find those things struck through with
comments added underneath.
Future Improvements
There is still minifying, javascript inlining, and font work to be done to make
the page run faster. Additionally, the tags page is simply a disaster. Between
all of that and the project pages, there is still a lot left to be done.
Hopefully, all will be accomplished in time.
Conclusion
Hopefully all of these changes make the site a bit more enjoyable to use. I do
understand if the use of analytics bothers you. Please make an issue in the
tracker, or if someone already has, comment accordingly. If there are enough
people using this site that honestly care, I might consider removing the
analytics while I do further research.