11 Dec 2010
A while ago, at work, I tried my hand at a little shell script parallelization. The script compresses a whole bunch of files at once, making full use of all 12 cores on our server. It could easily be modified for any task that processes a bunch of files in a directory.
    #!/bin/bash

    MAX_CONCURRENT=4
    TOTAL_COUNT=0
    BASE_PATH=
    COMPRESSOR=bzip2
    COMPRESSOR_OPTS='--best'
    VERBOSE=
    FILE_EXTENSIONS="gz bz2 tgz lzma"
    # Matches names that already end in a compressed extension, e.g. foo.tar.gz
    FILE_EXT_PATTERN='(tar)?\.('$(echo $FILE_EXTENSIONS | tr ' ' '|')')$'

    # Test mode (-t): pretend to work for 3 seconds, then print the file name
    dry_run() { sleep 3; echo "$@"; }

    while getopts "m:jzltp:v" flag; do
        case $flag in
            m) MAX_CONCURRENT="$OPTARG";;
            j) COMPRESSOR=bzip2; COMPRESSOR_OPTS='--best';;
            z) COMPRESSOR=gzip; COMPRESSOR_OPTS='--best';;
            l) COMPRESSOR=lzma; COMPRESSOR_OPTS='--fast';;
            t) COMPRESSOR=dry_run; COMPRESSOR_OPTS='';;
            p) BASE_PATH="$OPTARG";;
            v) VERBOSE=1;;
        esac
    done
    shift $((OPTIND - 1)) # Get rid of processed args

    # If we haven't received a path via a switch, fall back to arguments
    if [ -z "$BASE_PATH" ] && [ -n "$1" ]; then
        BASE_PATH="$1"
    elif [ -z "$BASE_PATH" ]; then
        echo "Please specify the directory containing files to compress."
        exit 1
    fi

    [ $VERBOSE ] && echo "Compressing files in $BASE_PATH"

    # The loop reads from a process substitution rather than a pipe, so it runs
    # in this shell and TOTAL_COUNT is still set once the loop has finished
    while IFS= read -r x; do
        [ ! -f "$x" ] && continue
        # Skip files that already have a compressed sibling (foo -> foo.gz etc.)
        if [ $(for ext in $FILE_EXTENSIONS; do [ -f "$x".$ext ] && echo; done | wc -l) -gt 0 ]; then
            [ $VERBOSE ] && echo "Skipping $x: already compressed"
            continue
        fi
        RUNNING_JOBS=$(jobs -rp | wc -l)
        [ $VERBOSE ] && [ $RUNNING_JOBS -gt $((MAX_CONCURRENT - 1)) ] &&
            echo -n "Spawned more than $MAX_CONCURRENT $COMPRESSOR jobs. Waiting for one to complete..."
        # Block until the number of running jobs drops below the limit
        while [ $RUNNING_JOBS -gt $((MAX_CONCURRENT - 1)) ]; do
            pid=$(jobs -rp | head -n1)
            [ -n "$pid" ] && wait $pid
            RUNNING_JOBS=$(jobs -rp | wc -l)
        done
        [ $VERBOSE ] && echo "continuing. ($(jobs -rp | wc -l) jobs are running)"
        $COMPRESSOR $COMPRESSOR_OPTS "$x" &
        TOTAL_COUNT=$((TOTAL_COUNT + 1))
    done < <(ls -1d "${BASE_PATH%/}"/* | grep -vE "$FILE_EXT_PATTERN")

    wait # Let the remaining jobs finish
    [ $VERBOSE ] && echo "$TOTAL_COUNT files compressed."
After writing it, I looked around to find the proper way to do this, and found many other options.
Of these, xargs -P seems the handiest: it is available on just about every Unix OS, so there is no additional binary or script to install.
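For comparison, here is a rough sketch of what the same job might look like with xargs -P. It is not a drop-in replacement for the script above. The function name compress_dir is made up for illustration, gzip is hard-coded for brevity, and the sketch assumes a find and xargs that support -maxdepth, -print0, -0, -r and -P (the GNU and BSD versions do, but these flags are not in POSIX).

```shell
#!/bin/bash
# Sketch only: compress every regular file directly inside a directory,
# running at most N gzip processes at a time.
# compress_dir is a hypothetical helper name, not part of the original script.
compress_dir() {
    local dir="$1"
    local max_jobs="${2:-4}"   # same default as MAX_CONCURRENT above

    # -maxdepth 1 -type f : regular files in the directory itself, no recursion
    # ! -name '*.gz' ...  : skip files that already look compressed
    # -print0 / -0        : NUL-separated names, so spaces in names are safe
    # -r                  : do nothing on empty input (a no-op on BSD xargs)
    # -n 1                : one file per gzip process
    # -P "$max_jobs"      : run up to that many processes in parallel
    find "$dir" -maxdepth 1 -type f \
        ! -name '*.gz' ! -name '*.bz2' ! -name '*.tgz' ! -name '*.lzma' \
        -print0 | xargs -0 -r -n 1 -P "$max_jobs" gzip --best
}
```

Calling it as compress_dir /path/to/files 12 (the path is a placeholder) would keep up to 12 gzip processes busy until the file list runs out. xargs handles the bookkeeping that the jobs/wait loop does by hand in my script.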