Safe Automatic Cleanup of Old Objects (on OpenStack Swift Containers)

Bogdan Enache
METRO SYSTEMS Romania
9 min read · Aug 27, 2019

This article continues the previous one, Lightweight (MariaDB) Database Backup to OpenStack Swift Containers, which I strongly advise the reader to peruse first. While this article refers specifically to backups stored on OpenStack Swift Containers, the algorithms (and overall logic) described here can be applied to other storage systems as well.

Keeping in mind the criteria set out in the previous article: if we take regular backups (to OpenStack Swift containers) and at some point decide that too many files are clogging them, we can set up a script that cleans up objects older than a certain age.

Such a script can also be adapted to clean up other types of files, not just backups.

Also, this entire series of articles can serve as an entry point for people who want to learn Bash scripting and might be interested in some suggested good practices.

Implementation Goals

We would like to keep the cleanup script as portable as the backup script (hint: read the previous article), so we will implement it in Bash too. While this might seem like a rudimentary choice to many, years of experience have taught me (among other things) two important lessons:

  • to keep things as simple as possible (maybe we don’t want hundreds of megabytes of dependencies and libraries for Python or Perl, do we?);
  • Bash is really powerful if properly used for a suitable scope;

The entire script must be safe: it must never delete the minimal set of data that has to be kept, and it must allow dry runs.

Design

We’ll call this cleanup script swift_cleanup_mul.sh.

It will take as input arguments:

  • an oslist.txt file, which contains the list of containers that need cleanup; it might be the same one as in the previous article;
  • the maximum age of objects to keep, expressed in days;
  • an action, which can be “delete” (del) or “dry run” (dry);
  • a safeguard parameter k that forces the script to keep a minimum number of files, even if they are too old (optional);
  • a path under which to clean up (optional);
  • a regex pattern that object names must match (optional);
  • a verbose switch (optional).

While most of these arguments are obvious, the safeguard parameter k needs some explaining: it is there so that the cleanup script doesn’t keep deleting older files when the backup script(s) stop working or are disabled. We really wouldn’t want to lose files in such conditions. While this seems like a “no-brainer”, you would be amazed at the number of cleanup scripts that do not implement such functionality.
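
The safeguard boils down to a small piece of arithmetic. The following is a minimal sketch of it, not the script's actual code: the helper name deletable_count and its three positional arguments are made up for this illustration.

```bash
# Hypothetical helper: how many of the old objects may be deleted?
#   $1 = total objects found
#   $2 = objects older than the age limit
#   $3 = minimum number of objects to keep (the safeguard k)
deletable_count() {
    local total=$1 old=$2 keep=$3
    if [ "$keep" -ge "$total" ]; then
        echo 0                      # fewer objects than the floor: touch nothing
    elif [ $((total - old)) -ge "$keep" ]; then
        echo "$old"                 # enough recent objects remain: all old ones may go
    else
        echo $((total - keep))      # delete only the surplus above the floor
    fi
}

deletable_count 10 4 7   # only 6 recent objects, floor is 7: 3 may be deleted
deletable_count 10 4 5   # 6 recent objects cover the floor of 5: all 4 may be deleted
deletable_count 5 5 7    # backups stopped, 5 old objects, floor is 7: nothing is deleted
```

The last call is the scenario the safeguard exists for: every object is past the age limit, yet none of them is removed.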

Also, a dry run is needed to see exactly what such a script would do to the files and to test our input parameters, because we wouldn’t want to accidentally delete all our backups.
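
As an illustration of the dry-run idea, here is a minimal sketch. It assumes a hypothetical wrapper called run_action, which is not part of the script described in this article; it either prints a command or executes it, depending on the chosen action.

```bash
# Hypothetical helper: run a command for real, or only print it.
#   $1 = action ("dry" or "del"), remaining arguments = the command to run
run_action() {
    local act=$1
    shift
    if [ "$act" = "dry" ]; then
        echo "[dry-run] $*"    # show what would be executed, change nothing
    else
        "$@"                   # execute the command with its arguments intact
    fi
}

tmp_file=$(mktemp)
run_action dry rm -f "$tmp_file"   # prints the command, the file still exists
run_action del rm -f "$tmp_file"   # actually removes the file
```

Running with the dry action first and switching to del only after the printed commands look right is the workflow the action parameter is meant to support.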

Actual Implementation

We start by printing a nice usage banner if no arguments are given, which is a standard recommended practice:

# Print usage if no arguments are given
if [ $# -eq 0 ]; then
    echo 'Swift cleanup script for multiple destinations.'
    echo ' No argument(s) supplied.'
    echo ' Usage: swift_cleanup_mul.sh -o <oslist> -t <days> -a <act> [ -k <no> ] [ -p <path> ] [ -m <regex> ] [ -v ]'
    echo ' Options:'
    echo ' -o <oslist> Text file that contains the definitions for OpenStack container targets,'
    echo ' having a row format:'
    echo ' ENV CONTAINER RCFILE'
    echo ' -t <days> Cleanup objects older than this many days.'
    echo ' -a <act> Take <act> action for cleanup, where <act> can be:'
    echo ' dry : dry run, just print, but do not do anything;'
    echo ' del : delete the object(s);'
    echo ' -k <no> Optionally keep at least this many objects, even if they should be cleaned-up.'
    echo ' -p <path> Optionally cleanup under this sub-path only.'
    echo ' -m <regex> Optionally object name must match this regular expression, otherwise it will be skipped.'
    echo ' -v Activate verbose (debug) mode.'
    echo ''
    exit
fi

We initialize our internal variables with default values:

DBG=0
KEEP=0

We parse the input arguments using getopts:

# Get input arguments
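while getopts ":t:a:k:o:m:p:v" option; do
    case ${option} in
        t) DAYS=${OPTARG} ;;
        a) ACT=${OPTARG} ;;
        k) KEEP=${OPTARG} ;;
        o) OS_FILE=${OPTARG} ;;
        m) ARC_REGEX=${OPTARG} ;;
        p) C_PATH=${OPTARG} ;;
        v) DBG=1 ;;
        # Added: the leading ':' puts getopts in silent mode, so report bad options ourselves
        :) echo "[ERROR] Option -${OPTARG} requires an argument." >&2; exit 5 ;;
        \?) echo "[ERROR] Unknown option: -${OPTARG}." >&2; exit 5 ;;
    esac
done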

We then define our internal functions log() and verbose(). The first one always prints the passed string, adding a timestamp as a prefix, while the latter only prints it if the debug flag is enabled.

# Functions to print logs, verbose or not
log() {
    echo "[$(date +'%Y-%m-%d %H:%M:%S')]: ${1}"
}
verbose() {
    if [ "${DBG}" = 1 ]; then
        log "${1}"
    fi
}

We validate the input parameters, making sure that k and t are non-negative integers:

# Check args
if [ -z "${DAYS}" ]; then
    log "[ERROR] Days to keep the files not provided. Use -t option."
    exit 1
fi
if [ -z "${ACT}" ]; then
    log "[ERROR] Action must be provided. Use -a option."
    exit 2
fi
if [ "${ACT}" != 'del' ] && [ "${ACT}" != 'dry' ]; then
    log "[ERROR] Action must be one of: dry, del. You specified: ${ACT}."
    exit 3
fi
if [ -z "${OS_FILE}" ]; then
    log "[ERROR] OpenStack container file not provided. Use -o option."
    exit 4
fi
# Check OS container list file
if [ ! -r "${OS_FILE}" ]; then
    log "[ERROR] List of containers \"${OS_FILE}\" does not exist or is not readable."
    exit 11
fi
# Check that days are numeric (digits only, so 0 is accepted)
nore='^[0-9]+$'
if ! [[ "${DAYS}" =~ $nore ]]; then
    log "[ERROR] Days to keep the files must be a non-negative integer argument."
    exit 12
fi
# Check that keep is numeric, only if defined
if [ -n "${KEEP}" ]; then
    if ! [[ "${KEEP}" =~ $nore ]]; then
        log "[ERROR] Number of minimum objects to keep must be a non-negative integer, if defined."
        exit 13
    fi
fi

We generate the base dir, which we will need later:

# Generate basedir
BASE_DIR=$(dirname "${OS_FILE}")

We then parse the container file as in the previous article, checking that we have the right variables:

# Read OS container list file
grep -v '^#' "${OS_FILE}" | while read -r LINE || [ -n "$LINE" ]; do
read -r FENV FCONT FRC REST <<< "${LINE}"
if [ ! -z "${FENV}" ] && [ ! -z "${FCONT}" ] \
&& [ ! -z "${FRC}" ] && [ -z "${REST}" ]; then
# Determine if RC files have relative or absolute paths (start with / or not)
# If relative, then they will be relative to the OS_FILE
if [[ "${FRC}" != /* ]]; then
ARC="${BASE_DIR}/${FRC}"
else
ARC=${FRC}
fi
verbose "[DEBUG] Found container \"${FCONT}\" on env \"${FENV}\" using rc file \"${FRC}\" -> \"${ARC}\"."
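
To make the row format and the path resolution concrete, here is a small self-contained sketch. The file contents are illustrative only (the absolute RC path in particular is made up), and resolve_rc is a hypothetical helper that mirrors the relative/absolute rule used above.

```bash
# Build a sample container list in a temporary directory (values are illustrative).
list_dir=$(mktemp -d)
cat > "${list_dir}/oslist.txt" <<'EOF'
# ENV  CONTAINER        RCFILE
pp     app_backup_pp    os_openrc.d/pp_cloud01.sh
prod   app_backup_prod  /etc/openstack/prod_cloud01.sh
EOF

# Hypothetical helper: absolute RC paths are kept as they are,
# relative ones are resolved against the directory of the list file.
resolve_rc() {
    local base_dir=$1 rc=$2
    case "$rc" in
        /*) echo "$rc" ;;
        *)  echo "${base_dir}/${rc}" ;;
    esac
}

# Skip comment lines, split each row into its three columns and print the result.
grep -v '^#' "${list_dir}/oslist.txt" | while read -r fenv fcont frc rest; do
    echo "${fenv} ${fcont} $(resolve_rc "${list_dir}" "${frc}")"
done
```

Resolving relative paths against the list file means the list and its RC files can be moved around together as one directory.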

We check for errors, and if none, we source the RC file:

if [ ! -r "${ARC}" ]; then
log "[ERROR] OpenStack RC file \"${ARC}\" does not exist or is not readable. Will continue with the others."
continue
else
source "${ARC}"
EC=$?
if [ "$EC" != "0" ]; then
log "[ERROR] Could not source RC file \"${ARC}\". Will continue with the others."
continue
else

We get the object listings from Swift, checking for errors:

# Get listings, with subpath or not
if [ -n "${C_PATH}" ]; then
OBJ_LIST=$(swift list -l "${FCONT}" -p "${C_PATH}")
else
OBJ_LIST=$(swift list -l "${FCONT}")
fi
EC=$?
if [ "$EC" != "0" ]; then
log "[ERROR] Could not get Swift listings using RC file \"${ARC}\", error <${EC}>. Will continue with the others."
continue
fi
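
The error check above relies on the fact that, after an assignment from a command substitution, $? holds the exit status of the command that ran inside it. A minimal sketch of that pattern follows; fake_list is a stand-in invented for this example so that it runs without a Swift client.

```bash
# Stand-in for "swift list": prints one line and fails for the container "broken".
fake_list() {
    echo "listing for $1"
    [ "$1" != "broken" ]
}

for cont in app_backup_pp broken; do
    listing=$(fake_list "$cont")
    ec=$?          # exit status of the command inside $(...)
    if [ "$ec" != "0" ]; then
        echo "[ERROR] Could not list ${cont}, error <${ec}>. Will continue with the others."
        continue
    fi
    echo "[OK] ${listing}"
done
```

Saving the status into a variable right away matters, because any later command overwrites $?.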

We then proceed to parse the listings returned by Swift:

# Process listing from Swift
OBJS_FOUND=0
OBJS_TO_DEL=''
OBJS_TO_DEL_NO=0
while IFS= read -r OBJ_LINE; do
# Process line
read -r OBJ_SIZE OBJ_DATE OBJ_TIME OBJ_X OBJ_NAME OBJ_REST <<< "${OBJ_LINE}"
# OBJ_REST must be empty: lines with extra fields (e.g. names with spaces) are skipped
if [ ! -z "${OBJ_SIZE}" ] && [ ! -z "${OBJ_DATE}" ] && [ ! -z "${OBJ_TIME}" ] \
&& [ ! -z "${OBJ_X}" ] && [ ! -z "${OBJ_NAME}" ] \
&& [ -z "${OBJ_REST}" ]; then

If requested, we check that object names match our regex:

# Check that object name matches our regex, if requested
if [ ! -z "${ARC_REGEX}" ]; then
if ! [[ "${OBJ_NAME}" =~ $ARC_REGEX ]] ; then
continue
fi
fi

We then increment the number of objects found, compute the date difference and check that our object is old enough to be a candidate for our action (whatever that is, delete or dry-run):

OBJS_FOUND=$((OBJS_FOUND+1))
# Calculate date difference
DATE_DIFF=$((($(date +%s)-$(date --date="${OBJ_DATE}" +%s))/(60*60*24)))
if [ ${DATE_DIFF} -gt ${DAYS} ]; then
verbose "[DEBUG] Found object named \"${OBJ_NAME}\", modified on \"${OBJ_DATE}\" at \"${OBJ_TIME}\" (${DATE_DIFF} days ago) - might get cleaned-up."
printf -v OBJS_TO_DEL "%s\n%s %s" "${OBJS_TO_DEL}" "${OBJ_NAME}" "${OBJ_DATE}"
OBJS_TO_DEL_NO=$((OBJS_TO_DEL_NO+1))
else
verbose "[DEBUG] Found object named \"${OBJ_NAME}\", modified on \"${OBJ_DATE}\" at \"${OBJ_TIME}\" (${DATE_DIFF} days ago) - will not get cleaned-up."
fi
fi
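
The age computation can be tried in isolation. The following sketch wraps it in a hypothetical function, age_in_days, which accepts an optional reference time and works in UTC so that the result is reproducible; both are assumptions of this example, as the script itself uses the current local time. Like the script, it needs GNU date for the --date option.

```bash
# Hypothetical helper: whole days between a date (YYYY-MM-DD) and a reference time.
#   $1 = object date
#   $2 = optional reference time in epoch seconds (defaults to now)
age_in_days() {
    local obj_date=$1
    local now=${2:-$(date +%s)}
    local then_s
    then_s=$(date -u --date="${obj_date}" +%s) || return 1
    echo $(( (now - then_s) / 86400 ))   # 86400 = 60*60*24 seconds per day
}

ref=$(date -u --date='2018-11-22 05:45:04' +%s)
age_in_days 2018-11-11 "$ref"   # 11 full days
age_in_days 2018-11-19 "$ref"   # 3 full days
```

Note that the script compares with -gt, so an object whose age is exactly equal to the limit is still kept.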

We close the listing loop and sort the candidates by date, oldest first:

# Process list of objects to cleanup
done < <(printf '%s\n' "${OBJ_LIST}")
# Sort by date (oldest first); printf -v already stored real newlines in the list
OBJS_TO_DEL=$(sort -k 2 <<< "${OBJS_TO_DEL}")

Pass number 2 of our algorithm’s logic is necessary for our safeguard parameter k (making sure we don’t delete all our backups in case the backup scripts stop working):

# Step 2 of list processing
OBJS_TO_DEL_2=''
OBJS_TO_DEL_NO_2=0
OBJS_REM=$((OBJS_FOUND-OBJS_TO_DEL_NO))
verbose "[DEBUG] 1st step: ${OBJS_FOUND} objects found, ${OBJS_TO_DEL_NO} to get cleanup, ${OBJS_REM} remaining. Possible list to cleanup:"
verbose "${OBJS_TO_DEL}"
# Check if objects to keep >= object to cleanup or total
if [ ${KEEP} -ge ${OBJS_FOUND} ]; then
log "[INFO] Found ${OBJS_FOUND} objects in total, and need to forcefully keep at least ${KEEP}, so no action is taken [container \"${FCONT}\", using RC \"${FRC}\"]."
OBJS_TO_DEL_2=''
OBJS_TO_DEL_NO_2=0
elif [ ${OBJS_REM} -ge ${KEEP} ]; then
OBJS_TO_DEL_2=${OBJS_TO_DEL}
OBJS_TO_DEL_NO_2=${OBJS_TO_DEL_NO}
else
OBJS_TO_DEL_NO_2=$((${OBJS_FOUND}-${KEEP}))
verbose "[DEBUG] Determined that will cleanup ${OBJS_TO_DEL_NO_2} object(s) in total."
# Walk the sorted list until OBJS_TO_DEL_NO_2 objects are queued
INC=0
while IFS= read -r OBJ2_LINE; do
read -r OBJ2_NAME OBJ2_DATE OBJ2_REST <<< "${OBJ2_LINE}"
if [ ! -z "${OBJ2_NAME}" ] && [ ! -z "${OBJ2_DATE}" ] && [ -z "${OBJ2_REST}" ]; then
verbose "[DEBUG] Adding object \"${OBJ2_NAME}\" to cleanup queue 2."
printf -v OBJS_TO_DEL_2 "%s\n%s %s" "${OBJS_TO_DEL_2}" "${OBJ2_NAME}" "${OBJ2_DATE}"
INC=$((${INC}+1))
if [ ${INC} -ge ${OBJS_TO_DEL_NO_2} ]; then
break
fi
fi
done < <(printf '%s\n' "${OBJS_TO_DEL}")
fi
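
Picking the N oldest candidates can also be expressed as a short pipeline. This is an alternative sketch rather than the loop used above; oldest_n is a hypothetical helper that reads "NAME DATE" lines on standard input.

```bash
# Hypothetical helper: print the N oldest entries from "NAME DATE" lines on stdin.
oldest_n() {
    local n=$1
    # Drop empty lines, sort on the date column (ISO dates sort chronologically
    # when compared as text), then keep the first N lines.
    grep -v '^$' | sort -k 2 | head -n "$n"
}

printf '%s\n' \
    'db1/b.zip 2018-11-05' \
    'db1/c.zip 2018-11-09' \
    'db1/a.zip 2018-11-01' | oldest_n 2
```

The explicit loop in the script has the advantage that every queued object can be logged in verbose mode as it is added.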

If we still have candidates, we start applying the desired action:

# Take actions
if [ ${OBJS_TO_DEL_NO_2} -gt 0 ]; then
OBJS_TO_DEL_2=`echo -e "${OBJS_TO_DEL_2}"`
verbose "[DEBUG] Found <${OBJS_TO_DEL_NO_2}> objects to take actions on: ${OBJS_TO_DEL_2}."
while IFS= read -r OBJ2_LINE; do
read -r OBJ2_NAME OBJ2_DATE OBJ2_REST <<< "${OBJ2_LINE}"
if [ ! -z "${OBJ2_NAME}" ] && [ ! -z "${OBJ2_DATE}" ] && [ -z "${OBJ2_REST}" ]; then

Dry-run action will only log(), and do nothing else:

# Dry-run action
if [ "${ACT}" = 'dry' ]; then
log "[INFO] Dry-run action on object \"${OBJ2_NAME}\", env \"${FENV}\" [container \"${FCONT}\", path \"${C_PATH}\", using RC \"${FRC}\"]."

The delete action deletes the object from Swift (using a sub-path or not) and checks for errors:

# Delete action
elif [ "${ACT}" = 'del' ]; then
if [ -n "${C_PATH}" ]; then
swift delete -p "${C_PATH}" "${FCONT}" "${OBJ2_NAME}"
else
swift delete "${FCONT}" "${OBJ2_NAME}"
fi
EC=$?
if [ "$EC" != "0" ]; then
log "[ERROR] Could not delete object \"${OBJ2_NAME}\", env \"${FENV}\" [container \"${FCONT}\", path \"${C_PATH}\", using RC \"${FRC}\"]: error <${EC}>. Will continue with the others."
else
log "[INFO] Succesfully deleted object \"${OBJ2_NAME}\", env \"${FENV}\" [container \"${FCONT}\", path \"${C_PATH}\", using RC \"${FRC}\"]."
fi
fi
fi
done < <(printf '%s\n' "${OBJS_TO_DEL_2}")

And finally, we close our script:

else
log "[INFO] No actions to take, 0 objects eligible for cleanup on env \"${FENV}\" [container \"${FCONT}\", path \"${C_PATH}\", using RC \"${FRC}\"]."
fi
fi
fi
else
log "[ERROR] Container file list \"${OS_FILE}\" has incorrect format, please check it."
exit 22
fi
echo '----------'
done
# The loop runs in a pipeline subshell, so pass its exit status on (e.g. 22 from above)
exit $?

This script can be run, for example, like this:

./swift_cleanup_mul.sh -o /some/path/oslist.txt -t 15 -k 7 -a del -m '\.zip$' -p some_app

This deletes objects older than 15 days while always keeping at least 7, considers only object names that match the regex \.zip$ (that is, names ending in .zip), and restricts the cleanup to the sub-path “some_app”.

Output will be similar to:

db1/pp_db1_2018_11_11.sql.zip
[2018-11-22 05:45:04]: [INFO] Succesfully deleted object "db1/pp_db1_2018_11_19.sql.zip", env "pp" [container "app_backup_pp", path "db1", using RC "os_openrc.d/pp_cloud01.sh"].
----------
db1/prod_db1_2018_11_11.sql.zip
[2018-11-22 05:45:07]: [INFO] Succesfully deleted object "db1/prod_db1_2018_11_19.sql.zip", env "prod" [container "app_backup_prod", path "db1", using RC "os_openrc.d/prod_cloud01.sh"].
----------

Conclusions

This script and similar ones have been running for quite some time on different systems. While it might seem overwhelming at first, the checks in place are needed to keep the data safe (we don’t want to delete valuable data, do we?).

One can extend this script to other storage systems, or to other types of data besides backups, but the logic presented here should suffice.
