MDADM and a single LVM to boot from (on a Debian based system)

Yesterday I moved a customer's server to new disks. To get some extra features like snapshotting I opted to convert the current static disk layout to LVM on top of MDADM. As a bonus I did the entire sync online to a new disk connected to my laptop which acted as a degraded mirror.

Yes, I know GPT is the way to go, but I started this move at 7 PM to minimise business impact, and since the new disks are still only 500G I went for MBR, which was already in place. For GPT "RAID" check this answer

This tutorial can also be used to move a non-RAID server to a RAID setup or to move a server to a new machine that will have MDADM RAID.

First create the partition table

parted /dev/sdz 
mklabel msdos
mkpart primary ext2 0% 100%
set 1 lvm on

Create the MDADM mirror and LVM group + partitions (I will only use 2 in this tutorial but you can use this as a guide to get a better solution for a professional environment)

mdadm --create --verbose /dev/md0 --level=mirror --raid-devices=2 /dev/sdz1 missing
pvcreate /dev/md0
vgcreate lvm /dev/md0
lvcreate -L 500M lvm -n boot
lvcreate -L 20G lvm -n root
mkfs.ext4 /dev/mapper/lvm-boot
mkfs.ext4 /dev/mapper/lvm-root

Next, mount the new partitions and start cloning the old system

mount /dev/mapper/lvm-root /mnt
mkdir /mnt/boot
mount /dev/mapper/lvm-boot /mnt/boot
rsync -aAXv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} user@src:/ /mnt/
*here you should stop all services on the src and rerun the rsync command for a final sync*
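The two-pass clone above can be sketched as a small helper that only prints the rsync command, so you can review it before running it. The function name clone_cmd and the --delete flag on the final pass are my own assumptions, not part of the original procedure; --delete removes files from the copy that have disappeared on the source since the first pass.

```shell
# Print (do not run) the rsync command for one pass.
# $1: "first" or "final"; $2: source, e.g. user@src:/ ; $3: destination, e.g. /mnt/
clone_cmd() {
    extra=""
    # Assumption: the final pass also deletes files that vanished on the source.
    if [ "$1" = "final" ]; then
        extra=" --delete"
    fi
    printf 'rsync -aAXv%s' "$extra"
    # Same exclude list as the command above, one --exclude per path.
    for p in '/dev/*' '/proc/*' '/sys/*' '/tmp/*' '/run/*' '/mnt/*' '/media/*' '/lost+found'; do
        printf " --exclude='%s'" "$p"
    done
    printf ' %s %s\n' "$2" "$3"
}
```

Calling clone_cmd first user@src:/ /mnt/ prints the first-pass command; once the services on the src are stopped, clone_cmd final user@src:/ /mnt/ prints the final one.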

Next we will set up the bootloader and adapt the system files

for i in /dev /dev/pts /proc /sys /run; do sudo mount -B $i /mnt$i; done
sudo cp /etc/resolv.conf /mnt/etc/resolv.conf
sudo chroot /mnt
*get the IDs for the LVM partitions, for example with blkid*
vi /etc/fstab
*replace the IDs of the old /, /boot, ...*
apt-get install lvm2 mdadm
mdadm --examine --scan >> /etc/mdadm/mdadm.conf
*delete the previous lines starting with ARRAY, if any*
vi /etc/mdadm/mdadm.conf
*now update initramfs to make sure it contains MDADM and LVM2 support and install grub to the new disk*
update-initramfs -k all -u -v
grub-install /dev/sdz
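If you would rather not clean up the ARRAY lines by hand in vi, the stale entries can be filtered out first and the fresh scan appended afterwards. This is only a sketch: strip_array_lines is a hypothetical helper name, and the temporary file path in the usage comment is an assumption.

```shell
# Print the given mdadm config (or stdin) without any ARRAY lines.
strip_array_lines() {
    sed '/^ARRAY/d' "$@"
}

# Possible usage inside the chroot (not executed here):
#   strip_array_lines /etc/mdadm/mdadm.conf > /tmp/mdadm.conf.new
#   mdadm --examine --scan >> /tmp/mdadm.conf.new
#   mv /tmp/mdadm.conf.new /etc/mdadm/mdadm.conf
```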

At this point you should be able to shut down the src server and boot from the new disk after replacing the old one (or, if you are using this to move a server, by starting the dst server). If you are able to boot successfully and have the second disk already in place, we can now "restore" the mirror.

*copy the MBR*
dd if=/dev/sdz of=/dev/sdy bs=512 count=1
partprobe /dev/sdy
*add the disk to the MDADM RAID to start rebuilding*
mdadm --manage /dev/md0 --add /dev/sdy1
*check if the rebuild is started*
cat /proc/mdstat
*just to make sure, reinstall grub on the boot disks; choose both /dev/sdz and /dev/sdy*
dpkg-reconfigure grub-pc

Another reboot after the rebuild is finished can be good to verify everything.
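To know when the rebuild is finished without staring at /proc/mdstat, a small helper can pull out just the percentage. This is a sketch that assumes the usual "recovery = 8.5%" or "resync = 8.5%" progress line in /proc/mdstat; mdstat_progress is a name I made up.

```shell
# Print the rebuild percentage found in an mdstat file, if any.
# $1: file to read (defaults to /proc/mdstat). Prints nothing when no rebuild is running.
mdstat_progress() {
    grep -oE '(recovery|resync) *= *[0-9.]+%' "${1:-/proc/mdstat}" | grep -oE '[0-9.]+%'
}
```

Running mdstat_progress in a loop with sleep gives a simple progress indicator; an empty result means no recovery or resync line was found.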

Fully encrypted ZFS root on Linux using LUKS (Ubuntu 16.04)

Since I wanted the joy of compression, fast resilvering, caches, ... on my workstation, I started looking into ZFS with LZ4 compression on top of a bunch of LUKS devices. I used 6 * 128GB MLC SSDs and put them in this great backplane: IcyDock MB996SP-6SB

Plain ZFS with an eCryptFS home folder on top wasn't a good solution because that would render the LZ4 compression useless: encrypted data should look random, so if you can compress it, your encryption method is useless...

So these are the steps I took to get it working:

Get an Ubuntu desktop live USB/CD and boot it. Next do this to get the necessary packages

sudo apt-get update
sudo apt-get install cryptsetup debootstrap zfs mdadm

Get started by making a DOS-type partition table on /dev/sda with an 80MB primary partition and a second primary partition that uses all that is left. Afterwards you can copy the partition table to the other disks using this simple dd command, which copies only the first sector.

sudo dd if=/dev/sda of=/dev/sdb bs=512 count=1

Please note that this only works for a DOS table with primary partitions. For logical partitions and GPT you will have to use something more advanced: dump the table of the source disk and write it to the target disk (here /dev/sdb)

sfdisk -d /dev/sda > part_table
sfdisk /dev/sdb < part_table

But this is what I did, since I only used primary partitions anyway.

dd if=/dev/sda of=/dev/sdb bs=512 count=1
dd if=/dev/sda of=/dev/sdc bs=512 count=1
dd if=/dev/sda of=/dev/sdd bs=512 count=1
dd if=/dev/sda of=/dev/sdf bs=512 count=1
dd if=/dev/sda of=/dev/sdg bs=512 count=1
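The repeated dd lines can also be generated with a loop. The sketch below only prints the commands so you can double-check the target disks before anything is overwritten; mbr_copy_cmds is a hypothetical helper name.

```shell
# Print one dd command per target disk that copies the first sector (MBR) of the source.
# $1: source disk; remaining arguments: target disks.
mbr_copy_cmds() {
    src=$1
    shift
    for dst in "$@"; do
        printf 'dd if=%s of=%s bs=512 count=1\n' "$src" "$dst"
    done
}
```

For my disks that would be mbr_copy_cmds /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sdf /dev/sdg; once the list looks right, pipe the output to sh as root to actually run it.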

Next encrypt your second primary partitions using luksFormat. By default LUKS will use AES with a 256-bit key. You can benchmark to see which cipher is the fastest on your machine. Either way, make sure your CPU supports hardware AES (AES-NI) and that this function is enabled in the BIOS/EFI, otherwise you will have a lot of overhead!


cryptsetup benchmark

Check if AES is enabled (works on AMD and Intel CPUs)

grep -m1 -o aes /proc/cpuinfo
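For use in a script, the same check can be turned into a yes/no test. This sketch matches aes as a whole word on the flags line, so a flag that merely contains "aes" does not count; has_aes_flag is a made-up name and the optional file argument exists only to make it easy to try out.

```shell
# Succeed (exit status 0) when the CPU flags list contains the aes flag.
# $1: cpuinfo file to read (defaults to /proc/cpuinfo).
has_aes_flag() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw aes
}
```

Typical use would be: has_aes_flag && echo "hardware AES available".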

The actual formatting with the default settings (in my case the ones with the best performance)

cryptsetup luksFormat /dev/sda2
cryptsetup luksFormat /dev/sdb2
cryptsetup luksFormat /dev/sdc2
cryptsetup luksFormat /dev/sdd2
cryptsetup luksFormat /dev/sdf2
cryptsetup luksFormat /dev/sdg2

Now open (unlock) the containers so you can use them

cryptsetup luksOpen /dev/sdg2 crypt_sdg2
cryptsetup luksOpen /dev/sdf2 crypt_sdf2
cryptsetup luksOpen /dev/sdd2 crypt_sdd2
cryptsetup luksOpen /dev/sdc2 crypt_sdc2
cryptsetup luksOpen /dev/sdb2 crypt_sdb2
cryptsetup luksOpen /dev/sda2 crypt_sda2

And create a zpool with 3 mirrors (VDEVs) to get a RAID10. I know you lose a lot of space this way, but LZ4 makes up a little for that, and the increased speed, reliability and, most importantly, shorter resilvering time make mirrors preferable over RAIDZ[1-3].

As an aside: I see a lot of tutorials that ask you to set the ashift (sector alignment) to 12 (4k native), but since all my SSDs report a physical and logical sector size of 512 (ashift=9) I don't see why you would want to do that. Anyway, ZFS should detect the sector size and align automatically.
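The ashift value is just the base-2 logarithm of the sector size, so 512 gives 9 and 4096 gives 12. A minimal sketch of that arithmetic, assuming the sector size is a power of two (ashift_for is a hypothetical name):

```shell
# Print the ashift (log2) that corresponds to a sector size in bytes.
# $1: sector size, assumed to be a power of two (e.g. 512 or 4096).
ashift_for() {
    size=$1
    bits=0
    while [ "$size" -gt 1 ]; do
        size=$(( size / 2 ))
        bits=$(( bits + 1 ))
    done
    echo "$bits"
}
```

To see what a disk reports you could run ashift_for "$(cat /sys/block/sda/queue/physical_block_size)".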

zpool create rpool mirror crypt_sda2 crypt_sdb2 mirror crypt_sdc2 crypt_sdf2 mirror crypt_sdd2 crypt_sdg2
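With more disks the mirror pairs get tedious to type, so here is a sketch that groups a list of devices two by two into the argument list that zpool create expects. mirror_vdev_args is a name I made up, and it simply pairs the devices in the order given.

```shell
# Print "mirror A B mirror C D ..." for an even number of devices.
mirror_vdev_args() {
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "mirror_vdev_args: need an even number of devices" >&2
        return 1
    fi
    sep=""
    while [ "$#" -gt 0 ]; do
        printf '%smirror %s %s' "$sep" "$1" "$2"
        sep=" "
        shift 2
    done
    printf '\n'
}
```

The output is meant to be pasted after zpool create rpool once you have checked that each pair is the one you want.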

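Next we will enable compression on the pool (compression=on selects LZ4 when the pool's lz4_compress feature is enabled, and LZ4 is the best/fastest one available; use compression=lz4 if you want to be explicit)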

zfs set compression=on rpool

Now create a / mountpoint, make it bootable, make sure that the pool itself isn't mounted, and export (stop) the entire pool

zfs create rpool/ROOT
zpool set bootfs=rpool/ROOT rpool
zfs set mountpoint=none rpool
zfs set mountpoint=/ rpool/ROOT
zpool export rpool

Next create a RAID10 of the first primary partitions to use as /boot

mdadm --create /dev/md0 --level=10 --raid-devices=6 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1
mkfs.ext4 /dev/md0

Next import (start) your pool and redirect all mounting points to /mnt

zpool import -R /mnt rpool

And prepare a basic Ubuntu 16.04

debootstrap xenial /mnt

Get the UUIDs

blkid | grep LUKS

Next put the UUIDs of all your LUKS containers in /etc/crypttab inside your target directory.

echo "crypt_sda2 UUID=b86435dd-71cd-45cf-abde-ee373554915b none luks" >> /mnt/etc/crypttab
echo "crypt_sdb2 UUID=8d370731-8b6c-4789-9d15-68b5c6a8d74f none luks" >> /mnt/etc/crypttab
echo "crypt_sdc2 UUID=260bb228-a1b8-4739-8ce7-a4671b4d723b none luks" >> /mnt/etc/crypttab
echo "crypt_sdd2 UUID=9e35fc89-bd1c-4db6-b9fc-15d311652f0b none luks" >> /mnt/etc/crypttab
echo "crypt_sdf2 UUID=35129e92-3fb6-4118-aada-5dc2be628c05 none luks" >> /mnt/etc/crypttab
echo "crypt_sdg2 UUID=3ef442d5-ed6e-4a4a-bcce-84f3c31acf32 none luks" >> /mnt/etc/crypttab
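Instead of copying the UUIDs by hand, the crypttab lines can be generated from the blkid output. This is a sketch under the assumption that blkid prints lines of the form /dev/sda2: UUID="..." TYPE="crypto_LUKS"; blkid_to_crypttab is a hypothetical helper name.

```shell
# Read blkid output on stdin and print one crypttab line per LUKS container.
# The body runs in a subshell so the variables stay local.
blkid_to_crypttab() (
    while IFS= read -r line; do
        case $line in
            *'TYPE="crypto_LUKS"'*) ;;   # keep LUKS containers only
            *) continue ;;
        esac
        dev=${line%%:*}          # /dev/sda2
        name=${dev##*/}          # sda2
        uuid=${line#* UUID=\"}   # drop everything up to the UUID value
        uuid=${uuid%%\"*}        # drop everything after it
        printf 'crypt_%s UUID=%s none luks\n' "$name" "$uuid"
    done
)
```

Review the output of blkid | blkid_to_crypttab first, then append it to /mnt/etc/crypttab.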

Set your hostname

echo "SoloTheatre" > /mnt/etc/hostname
echo " SoloTheatre" >> /mnt/etc/hosts

Prepare a chroot environment and enter it

mount /dev/md0 /mnt/boot
mount --bind /dev /mnt/dev
mount --bind /dev/pts /mnt/dev/pts
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt /bin/bash --login
hostname SoloTheatre

Force initramfs to be cryptsetup aware

echo "export CRYPTSETUP=y" >> /usr/share/initramfs-tools/conf-hooks.d/forcecryptsetup

Now put all the LUKS containers in an initramfs config to make sure they are picked up and presented for decryption when booting

echo "target=crypt_sda2,source=UUID=b86435dd-71cd-45cf-abde-ee373554915b,key=none,rootdev,discard" >> /etc/initramfs-tools/conf.d/cryptroot
echo "target=crypt_sdb2,source=UUID=8d370731-8b6c-4789-9d15-68b5c6a8d74f,key=none,rootdev,discard" >> /etc/initramfs-tools/conf.d/cryptroot
echo "target=crypt_sdc2,source=UUID=260bb228-a1b8-4739-8ce7-a4671b4d723b,key=none,rootdev,discard" >> /etc/initramfs-tools/conf.d/cryptroot
echo "target=crypt_sdd2,source=UUID=9e35fc89-bd1c-4db6-b9fc-15d311652f0b,key=none,rootdev,discard" >> /etc/initramfs-tools/conf.d/cryptroot
echo "target=crypt_sdf2,source=UUID=35129e92-3fb6-4118-aada-5dc2be628c05,key=none,rootdev,discard" >> /etc/initramfs-tools/conf.d/cryptroot
echo "target=crypt_sdg2,source=UUID=3ef442d5-ed6e-4a4a-bcce-84f3c31acf32,key=none,rootdev,discard" >> /etc/initramfs-tools/conf.d/cryptroot
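Since these lines carry the same names and UUIDs as /etc/crypttab, they can be derived from it rather than typed twice. A sketch, with crypttab_to_cryptroot as a made-up helper name and the key=none,rootdev,discard options taken from the lines above:

```shell
# Convert crypttab lines ("name UUID=... none luks") into cryptroot lines.
# Reads the given files, or stdin when no file is given; comment lines are skipped.
crypttab_to_cryptroot() {
    awk '!/^[[:space:]]*#/ && NF >= 2 {
        printf "target=%s,source=%s,key=none,rootdev,discard\n", $1, $2
    }' "$@"
}
```

Inside the chroot, crypttab_to_cryptroot /etc/crypttab prints the lines that would go into /etc/initramfs-tools/conf.d/cryptroot.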

Link all the LUKS containers, since update-grub doesn't care about the default /dev/mapper/ directory

ln -sf /dev/mapper/crypt_sda2 /dev/crypt_sda2
ln -sf /dev/mapper/crypt_sdb2 /dev/crypt_sdb2
ln -sf /dev/mapper/crypt_sdc2 /dev/crypt_sdc2
ln -sf /dev/mapper/crypt_sdd2 /dev/crypt_sdd2
ln -sf /dev/mapper/crypt_sdf2 /dev/crypt_sdf2
ln -sf /dev/mapper/crypt_sdg2 /dev/crypt_sdg2

Next set up apt repositories

echo "deb xenial main universe restricted multiverse
deb xenial-security universe multiverse main restricted
deb xenial-updates universe multiverse main restricted
" > /etc/apt/sources.list

And install the bare necessities to get started. Replace ubuntu-minimal with ubuntu-desktop if you are planning to use this system as a desktop computer.

apt-get update
apt-get install mdadm zfs zfs-initramfs grub-pc linux-image-generic ubuntu-minimal cryptsetup
#install grub to all the disks you used for the /boot mdadm RAID
apt-get upgrade
apt-get dist-upgrade

When you see the ncurses window for grub-pc, make sure grub uses the ZFS as root and select all your physical disks to install grub on (/dev/sda,/dev/sdb,/dev/sdc,/dev/sdd,/dev/sdf,/dev/sdg). If you didn't get this window you can run:

sudo dpkg-reconfigure grub-pc


Set the UUID of your md raid device and all the LUKS containers you used for ZFS in /etc/fstab

UUID=c6c15ae8-2453-4e7e-8013-d5ce88d97800 /boot auto defaults 0 0
UUID=bff05b3e-bbec-4aba-a4d3-9d6f8b6f28c9 / zfs defaults 0 0

Force initramfs and grub update

update-initramfs -k all -c
update-grub

Set swap (4G is sufficient on most modern systems)

zfs create -V 4G -b $(getconf PAGESIZE) -o compression=zle -o logbias=throughput -o sync=always -o primarycache=metadata -o secondarycache=none -o com.sun:auto-snapshot=false rpool/swap
mkswap -f /dev/zvol/rpool/swap
echo /dev/zvol/rpool/swap none swap defaults 0 0 >> /etc/fstab

Create a sudo user, then exit the chroot and unmount everything before rebooting into your new environment

adduser USERNAME
usermod -a -G adm,cdrom,sudo,dip,plugdev,lpadmin,sambashare,libvirtd USERNAME
exit
umount /mnt/boot
umount /mnt/dev/pts
umount /mnt/dev
umount /mnt/proc
umount /mnt/sys
zfs umount -a
zpool export rpool