echo "

Ansible check if machines can connect to a port on a server

I had to test whether all hosts were able to connect to a certain port on a certain server. Ansible was the perfect tool, but since not all machines have nc or nmap installed, I had to work around it with Python. I will be checking whether TCP port nagios.company.com:5666 is open.

The script, nagios.py (it will be copied to and executed on the remote host by Ansible). This only works for TCP; changing SOCK_STREAM to SOCK_DGRAM will always return 0 because UDP is connectionless.

#!/usr/bin/python
import socket

# open a TCP connection to the Nagios server; connect_ex returns 0 on success
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)  # fail fast instead of hanging on filtered ports
result = sock.connect_ex(('nagios.company.com', 5666))
sock.close()
if result == 0:
    print("OK")
else:
    print("Not OK")

The playbook, nagios.yml:

- name: Nagios connectivity test
  hosts: all
  tasks:
    - name: script
      script: /tmp/nagios.py
      register: nagios
    - debug: msg="{{ nagios.stdout }}"

The run command, grepping for the hosts that can't connect:

ansible-playbook /tmp/nagios.yml | grep -B1 Not\ OK
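
For a quick manual check from a shell, bash's /dev/tcp pseudo-path can do the same test without nc or nmap; a minimal sketch using the same host and port as above:

# /dev/tcp is a bash feature, not a real device; prints "OK" if the connection succeeds within 5 seconds
timeout 5 bash -c '</dev/tcp/nagios.company.com/5666' 2>/dev/null && echo "OK" || echo "Not OK"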

Swap cleanup script (update)

Today I had a machine that had no swap space left, which in turn triggered a monitoring alert. The memory pressure was already gone, but the swap wasn't being cleared fast enough, so I searched and found this script as a nice solution. The script was a bit older and was made for a version of free that separated buffers and caches. In newer versions they are combined as "buff/cache" and a new field shows the total available memory, which gave me a chance to simplify the script.

clearswap.sh

#!/bin/bash

free_mem="$(free | grep 'Mem:' | awk '{print $7}')"
used_swap="$(free | grep 'Swap:' | awk '{print $3}')"

echo -e "Free memory:\t$free_mem kB ($((free_mem / 1024)) MiB)\nUsed swap:\t$used_swap kB ($((used_swap / 1024)) MiB)"
if [[ $used_swap -eq 0 ]]; then
    echo "Congratulations! No swap is in use."
elif [[ $used_swap -lt $free_mem ]]; then
    echo "Freeing swap..."
    sudo swapoff -a
    sudo swapon -a
else
    echo "Not enough free memory. Exiting."
    exit 1
fi
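
A minimal usage sketch; it assumes the invoking user has sudo rights for swapoff/swapon:

# make the script executable and run it
chmod +x clearswap.sh
./clearswap.sh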

Thanks to Scott Severance for the initial script.

Automatically clean up logs with a cron job

A lot of our applications generate enormous log files while running on Tomcat, Play, ... so we have to intervene and clean them up. Since zipping those logs makes a huge difference in disk space consumption, this is how I do it. Of course you can use ZFS with native disk compression too :)

I install this cron (as the user that runs the application). It gzips the rotated, dated logs that are no longer in use and removes the gzipped archives after 30 days. And the neat thing is that recent versions of Vim can still read the log while zipped, deflating it on the fly.

find /opt/ -type f \( -mtime +30 -iname "*.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log.gz" -exec rm -f {} \; -o -iname "*.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log" -exec gzip {} \; \)
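
A hedged sketch of the crontab entry (installed with crontab -e as the application user); the schedule is just an example:

# compress rotated logs and purge gzipped archives older than 30 days, daily at 03:15
15 3 * * * find /opt/ -type f \( -mtime +30 -iname "*.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log.gz" -exec rm -f {} \; -o -iname "*.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log" -exec gzip {} \; \)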

 

Setting a Grub password using Ansible (update)

After a Red Hat upgrade, the template for the password changed and now uses a variable that OpenSCAP doesn't read, which makes our test fail. On top of that, the test checks for the use of common administrator account names like root, admin or administrator. This update solves the issue, and from now on we use a dedicated user instead of root for Grub2.

Today I had to update and verify that we have a boot entry password for Grub on all machines. We needed to do this to comply with the Certified Cloud Service Provider's OpenSCAP benchmark.

This only prevents a person with physical access from booting into single user mode! The machine can still be booted without needing a password.

RHEL6 machines all use legacy boot, while on RHEL7 we also distinguish between EFI and non-EFI machines.

 

First generate the hashes (on a RHEL6 and on a RHEL7 node)

RHEL6

grub-crypt --sha-512

RHEL7

grub2-mkpasswd-pbkdf2

And to finish... Here are the Ansible lines:

playbook lines:

#GRUB
- name: "grub v1 | add password"
  lineinfile: dest=/etc/grub.conf regexp='^password ' state=present line='password --encrypted {{ grub_password_v1_passwd }}' insertafter='^timeout'
  when: rhel6
  tags: grub-password

- stat: path=/sys/firmware/efi/efivars/
  register: grub_efi
  when: rhel7
  tags: grub-password

- name: remove unwanted grub.cfg on EFI systems
  file:
    state: absent
    path: /boot/grub2/grub.cfg
  when: rhel7 and grub_efi.stat.exists == True
  tags: grub-password

- name: Install user template to make sure grub2-mkconfig doesn't mess up the config
  template:
    src: 01_users.j2
    dest: /etc/grub.d/01_users
    owner: root
    group: root
    mode: '0700'
  notify:
     - grub2-mkconfig EFI
     - grub2-mkconfig MBR
  when: rhel7
  tags: grub-password

- name: "grub v2 EFI | add password"
  lineinfile: dest=/etc/grub2-efi.cfg regexp="^password_pbkdf2 {{ grub_user }} " state=present insertafter=EOF line='password_pbkdf2 {{ grub_user }} {{ grub_password_v2_passwd }}'
  when: rhel7 and grub_efi.stat.exists == True
  tags: grub-password

- name: "grub v2 MBR | add password"
  lineinfile: dest=/etc/grub2.cfg regexp="^password_pbkdf2 {{ grub_user }} " state=present insertafter=EOF line='password_pbkdf2 {{ grub_user }} {{ grub_password_v2_passwd }}'
  when: rhel7 and grub_efi.stat.exists == False

vars:

grub_password_v1_passwd: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
grub_password_v2_passwd: grub.pbkdf2.sha512.10000.xxxxxxxxxxxxxxxxxxx
grub_user: loginuser

 Handlers:

- name: grub2-mkconfig EFI
  command: grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
  when: grub_efi.stat.exists == True

- name: grub2-mkconfig MBR
  command: grub2-mkconfig -o /boot/grub2/grub.cfg
  when: grub_efi.stat.exists == False

01_users.j2:

#!/bin/sh -e

cat << "EOF"
set superusers="{{ grub_user }}"
export superusers
password_pbkdf2 {{ grub_user }} {{ grub_password_v2_passwd }}
EOF
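
To sanity-check the result after the handlers have run, a quick manual check (paths taken from the handlers above; only one of the two files will exist depending on EFI or MBR boot):

# the password entry should appear in the generated grub.cfg
grep '^password_pbkdf2' /boot/efi/EFI/redhat/grub.cfg /boot/grub2/grub.cfg 2>/dev/null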

Script to clean up all the ARF/OpenScap compliance reports in Satellite

Since we only need to know the last compliance check, I made a script to clean up all the previous reports before the next compliance check runs.

#!/bin/bash
#this script removes all the arf reports from the satellite server
###

#settings
USER=ronly
PASS=xxxxxxxxxxx
URI=https://localhost

#check amount of reports
while [ $(curl -k -u $USER:$PASS $URI/api/v2/compliance/arf_reports/ | python -m json.tool | grep \"\total\": | cut -f2 -d":" | cut -f1 -d"," | sed "s/ //g") -gt 0 ]; do
        #fetch reports
        for i in $(curl -k -u $USER:$PASS $URI/api/v2/compliance/arf_reports/ | python -m json.tool | grep \"\id\": | cut -f2 -d":" | cut -f1 -d"," | sed "s/ //g")
        #delete reports
        do
                curl -k -u $USER:$PASS -i -H "Content-Type: application/json" -X DELETE $URI/api/v2/compliance/arf_reports/$i
        done
done
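
A hedged sketch of how this could be scheduled so it runs before the compliance checks; the script path, schedule and log file are assumptions:

# /etc/cron.d/clean-arf-reports -- run the cleanup before foreman_scap_client kicks in
30 22 * * 0 root /usr/local/bin/clean_arf_reports.sh >> /var/log/clean_arf_reports.log 2>&1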

To manually rerun the benchmark on all machines I use the following Ansible command:

ansible all -m shell -a 'eval $(grep foreman_scap_client /var/spool/cron/root | cut -f6-7 -d" " | sed '/^$/d')'

 Update: Red Hat published my script https://access.redhat.com/solutions/3040861

 Update: After Satellite 6.3 the location for the cron rule has changed to /etc/cron.d/foreman_scap_client_cron

Make a CSV file of all the cronjobs on all the systems managed by Ansible

To get a monthly overview of all the cronjobs on all the systems, I wrote a wrapper (in bash) that creates the CSV lines and an Ansible playbook that collects them into one CSV file.

This is the wrapper (collect_cron.sh):

#!/bin/bash
#this script will return a CSV file containing the server,user,cronjob
##
#this is set to be able to use filters on wildcards
shopt -s extglob
#here we store the hostname since we only need to declare this once
HOST=$(hostname|cut -d"." -f1)
#here we start looping through all the cron files except the ones filtered by the pipe-separated list
for f in $(ls /var/spool/cron/*;ls /etc/cron.d/!(*@(sysstat|0hourly)) 2>>/dev/null )
do
        #here we store the content of the current cron file
        COMMAND=$(cat $f)
        #here we loop over the individual jobs in the file while filtering out comments and empty lines
        echo "$COMMAND" | sed /^#/d | sed /^\s*$/d | while read line;
        do
                #here we start printing a line for our CSV file
                #starting with the host
                printf $HOST","
                #here we check if it is a user or a system cron and we print accordingly
                if [[ $f == /var/spool/cron/* ]];
                then
                        USER=${f##*/}
                        printf $USER","
                else
                        printf "system,"
                fi
                #and finally here we print the actual command and since we desire a new line echo is used here instead of printf
                echo "$line"
        done
done

 And the matching playbook (made by a colleague):

- hosts: all
  #gather_facts: no
  tasks:
#  - name: create folder
#    local_action: file dest=/tmp/cron_collect state=directory owner=root group=root mode=0700
  - block:
    - name: "collect crons on system"
      script: "{{ playbook_dir}}/../../scripts/collect_cron.sh"
      register: crons
      ignore_errors: yes
    - name: move to csv file
      local_action: copy content="{{ crons.stdout }}" dest=/opt/systems/cron_collect/{{ ansible_fqdn }}.csv
- hosts: localhost:ansibleserver01
  gather_facts: no
  tasks:
  - name: combine into one file
    assemble:
      src: /opt/systems/cron_collect/
      dest: /tmp/croncollection.csv
      owner: bescorli
      group: sysauto
      mode: 0640
  - name: remove blanks
    lineinfile:
      dest: /tmp/croncollection.csv
      regexp: '^\s$'
      state: absent
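
A hedged example of the monthly trigger; the playbook path, user and mail recipient are assumptions based on our other cron jobs:

# /etc/cron.d/cron-collect -- monthly cron overview mailed to root
0 6 1 * * sysauto cd /opt/systems/ansible && ansible-playbook playbooks/systems/collect_cron.yml && mail -s "Monthly cron overview" root < /tmp/croncollection.csv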

 

Simple script to get the IP of a list of hostnames (update)

I made a file called to_find_ip which has a hostname on every line, and a simple bash script to process the file and return the matching IP for each.

The script, get_ip_for_list_of_hostnames.sh (use getent ahostsv4 if you only need IPv4 addresses):

#!/bin/bash
# read one hostname per line and print its resolved address
while read -r p; do
    getent hosts "$p" | cut -f1 -d ' '
done < "$1"

To run the script:

bash /tmp/get_ip_for_list_of_hostnames.sh /tmp/to_find_ip

Use Ansible to report which systems need to reboot

I created this cronjob using an Ansible playbook to see if any of the Ansible-managed hosts have been up too long or have a newer kernel installed than the one running. This gives us a monthly overview of which machines should be rebooted. This only works for RPM-based Red Hat systems. Adapt root at the end of the cron line to the e-mail address you want the report sent to, and make sure the system mail service is configured correctly.

/etc/cron.d/uptime-and-kernel-upgrade-report

# Send a report mail every month with the ansible managed hosts that have an uptime equal or higher than 300 days OR have a newer kernel installed than the one running
0 0 1 * * sysauto /usr/bin/flock -x -n /opt/systems/ansible -c 'cd /opt/systems/ansible/ ; ansible-playbook playbooks/systems/check_uptime_and_kernel_upgrade.yml | grep " has " | sed -e "s/^[ \t]*//" | mail -E -s "Monthly report: Systems that need a reboot" root'

check_uptime_and_kernel_upgrade.yml

- hosts: all
  tasks:
    - name: "Check for machines that have an uptime that exceeds 300 days"
      shell: echo "$(hostname) has been up for $(uptime | cut -d ',' -f 1 | cut -d ' ' -f 4) days"
      when: ansible_uptime_seconds > 25920000
      register: uptime_exceeded
    - name: "Check for machines that aren't running the latest installed kernel"
      shell: LAST_KERNEL=$(rpm -q --last kernel | perl -pe 's/^kernel-(\S+).*/$1/' | head -1);CURRENT_KERNEL=$(uname -r);test $LAST_KERNEL = $CURRENT_KERNEL || echo "$(hostname) has a newer kernel installed than the one running"
      ignore_errors: true
      register: reboot_hint
    - debug: var=uptime_exceeded.stdout_lines
    - debug: var=reboot_hint.stdout_lines

 
