echo "

MDADM and a single LVM to boot from (on a Debian-based system)

Yesterday I moved a customer's server to new disks. To gain some extra features like snapshotting, I opted to convert the static disk layout to LVM on top of MDADM. As a bonus, I did the entire sync online, to a new disk connected to my laptop that acted as a degraded mirror.

Yes, I know GPT is the way to go, but I started this move at 7 PM to minimise business impact, and since the new disks are still only 500 GB I stuck with the MBR layout that was already in place. For GPT "RAID" check this answer.

This tutorial can also be used to move a non-RAID server to a RAID setup or to move a server to a new machine that will have MDADM RAID.

First create the partition table

parted /dev/sdz 
mklabel msdos
mkpart primary ext2 0% 100%
set 1 raid on
quit
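
If you want to double-check the new layout before continuing, parted can print it back:

parted /dev/sdz print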

Create the MDADM mirror and the LVM volume group + logical volumes (I will only create two volumes in this tutorial, but you can use this as a guide to build a more complete layout for a professional environment)

mdadm --create --verbose /dev/md0 --level=mirror --raid-devices=2 /dev/sdz1 missing
pvcreate /dev/md0
vgcreate lvm /dev/md0
lvcreate -L 500M lvm -n boot
lvcreate -L 20G lvm -n root
mkfs.ext4 /dev/mapper/lvm-boot
mkfs.ext4 /dev/mapper/lvm-root
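
Before copying anything over, it doesn't hurt to verify that the degraded mirror and both logical volumes look the way you expect:

cat /proc/mdstat
vgs
lvs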

Next, mount the new volumes and start cloning the old system

mount /dev/mapper/lvm-root /mnt
mkdir /mnt/boot
mount /dev/mapper/lvm-boot /mnt/boot
rsync -aAXv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} user@src:/ /mnt/
*here you should stop all services on the src and rerun the rsync command for a final sync*
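
What exactly needs to be stopped depends on the server; as a rough sketch for a typical web/database box (the service names are only examples):

ssh user@src "systemctl stop apache2 mariadb cron"
rsync -aAXv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} user@src:/ /mnt/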

Next we will set up the bootloader and adapt the system files

for i in /dev /dev/pts /proc /sys /run; do sudo mount -B $i /mnt$i; done
sudo cp /etc/resolv.conf /mnt/etc/resolv.conf
sudo chroot /mnt
blkid
*get the IDs for the LVM partitions*
vi /etc/fstab
*replace the IDs of the old /, /boot, ... (see the example after this block)*
apt-get install lvm2 mdadm
mdadm --examine --scan >> /etc/mdadm/mdadm.conf
*delete the previous lines starting with ARRAY, if any*
vi /etc/mdadm/mdadm.conf
*now update initramfs to make sure it contains MDADM and LVM2 support and install grub to the new disk*
update-initramfs -k all -u -v
grub-install /dev/sdz
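
For reference, the fstab entries and the mdadm.conf ARRAY line should end up looking roughly like this; the placeholders come from your own blkid and --examine --scan output (in fstab you can also simply use /dev/mapper/lvm-root and /dev/mapper/lvm-boot instead of UUIDs):

*/etc/fstab*
UUID=<uuid-of-lvm-root> / ext4 errors=remount-ro 0 1
UUID=<uuid-of-lvm-boot> /boot ext4 defaults 0 2
*/etc/mdadm/mdadm.conf*
ARRAY /dev/md0 metadata=1.2 name=<hostname>:0 UUID=<raid-uuid>

When grub-install has finished, leave the chroot and unmount everything before detaching the disk:

exit
for i in /run /sys /proc /dev/pts /dev; do sudo umount /mnt$i; done
sudo umount /mnt/boot /mnt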

At this point you should be able to shut down the src server and boot from the new disk after swapping it in (or, if you use this to move a server, by starting the dst server). If you are able to boot successfully and already have the second disk in place, we are now going to "restore" the mirror.

*copy the MBR*
dd if=/dev/sdz of=/dev/sdy bs=512 count=1
partprobe /dev/sdy
*add the disk to the MDADM RAID to start rebuilding*
mdadm --manage /dev/md0 --add /dev/sdy1
*check if the rebuild is started*
cat /proc/mdstat
*just to make sure, reinstall grub on the boot disks; choose both /dev/sdz and /dev/sdy*
dpkg-reconfigure grub-pc

Another reboot after the rebuild has finished is a good way to verify that everything works.
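
To confirm the mirror is healthy before (and after) that reboot, check the array details; both members should be listed as active sync:

mdadm --detail /dev/md0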

Checking disk space usage on a Unix-based appliance with Nagios/Icinga without installing any scripts or NRPE

We had a vCenter (appliance) that stopped accepting changes. After some quick investigation we found that the log partition was full. After some cleanup, an extension and a reboot the vCenter worked fine again, but to prevent this from happening in the future we wanted the disk usage to be monitored by Nagios as well. We already monitored the availability of the web interface, SSH and ping.

Since VMware doesn't support third-party software on their appliances (ESXi, VCSA, vCenter Support Assistant, ...), I wasn't keen on having to reinstall the NRPE client after every update of a VMware appliance. So I came up with this solution:


First of all, if you haven't done so yet, switch your appliance to key-based authentication to avoid having to store your password inside a Nagios config file.

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2100508
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002866
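
On the Nagios/Icinga side this boils down to generating a key for the user the checks run as and getting the public key onto the appliance. A rough sketch, assuming the checks run as the local nagios user and you connect to the appliance as root (if ssh-copy-id doesn't get along with the appliance shell, the KB articles above show how to place the key manually):

su -s /bin/bash - nagios
ssh-keygen -t rsa
ssh-copy-id root@vcenter.koendiels.be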


I used this guy's script, but added a wrapper to avoid having to install the script on the destination host

https://supporthandbook.wordpress.com/2011/10/03/monitoring-hosts-on-nagios-without-nrpe/

And took some ideas from this thread

http://unix.stackexchange.com/questions/87405/how-can-i-execute-local-script-on-remote-machine-and-include-arguments


The script

/usr/lib64/nagios/plugins/bash_scripts/check_disk_without_nrpe.sh

#!/bin/bash
## checks the used disk space for Nagios
## usage: check_disk_without_nrpe.sh mountpoint critical_used% warning_used%
size=$(df -Ph "$1" | tail -1 | awk '{print $5}')
size=${size%\%}

if [ "$size" -gt "$2" ]
then
    echo "Critical: $1 usage exceeded $2%, current usage $size%"
    exit 2
fi

if [ "$size" -gt "$3" ]
then
    echo "Warning: $1 usage exceeded $3%, current usage $size%"
    exit 1
fi

echo "OK: $1 current usage $size%"
exit 0
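
You can test the script locally before wiring it into Nagios, for example with a 90% critical and 75% warning threshold on the root filesystem:

/usr/lib64/nagios/plugins/bash_scripts/check_disk_without_nrpe.sh / 90 75
echo $?
*0, 1 or 2 = OK, warning or critical as far as Nagios is concerned*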

The wrapper to execute the script on the remote host

/usr/lib64/nagios/plugins/bash_scripts/check_disk_usage_over_ssh.sh

#!/bin/bash
ssh -oStrictHostKeyChecking=no "$2@$1" "bash -s" -- < /usr/lib64/nagios/plugins/bash_scripts/check_disk_without_nrpe.sh "$3 $4 $5" 2>/dev/null
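
Run by hand, with the host address and SSH user that the host config further down will pass in, that becomes:

/usr/lib64/nagios/plugins/bash_scripts/check_disk_usage_over_ssh.sh vcenter.koendiels.be root / 90 75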

Add this inside /etc/nagios/objects/commands.cfg

# Check mount point disk usage over SSH (you will have to add the nagios user's key to the user you are trying to connect as)
define command{
    command_name check_disk_no_nrpe
    command_line /usr/lib64/nagios/plugins/bash_scripts/check_disk_usage_over_ssh.sh $HOSTADDRESS$ $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$
}

And then implement the check inside the config file for the host you want to check, /etc/nagios/conf.d/hosts/vcenter.cfg

define service{
    use generic-service ; Inherit default values from a template
    host_name vcenter.koendiels.be
    service_description Check / usage
    check_command check_disk_no_nrpe!root!/!90!75
    contact_groups it-team
}

Repeat this for every mount point you want to check
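
For example, a service for the log volume that caused the trouble in the first place; the /storage/log mount point is only an assumption here, check df on your appliance for the actual path:

define service{
    use generic-service ; Inherit default values from a template
    host_name vcenter.koendiels.be
    service_description Check /storage/log usage
    check_command check_disk_no_nrpe!root!/storage/log!90!75
    contact_groups it-team
}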
