11 Dec 2010
parallel unix-verses

A while ago, at work, I tried my hand at a little shell script parallelization. My script's purpose is to compress a whole bunch of files at once, making full use of all 12 cores on our server. However, the script could easily be modified for any task which processes a bunch of files in a directory.

#! /bin/bash

MAX_CONCURRENT=4
TOTAL_COUNT=0
BASE_PATH=
COMPRESSOR=bzip2
COMPRESSOR_OPTS='--best'
VERBOSE=
FILE_EXTENSIONS="gz bz2 tgz lzma"
FILE_EXT_PATTERN='(tar)?\.('$(echo $FILE_EXTENSIONS|tr ' ' '|')')$'

while getopts "m:jzltp:v" flag; do
  case $flag in
    m) MAX_CONCURRENT="$OPTARG";;
    j) COMPRESSOR=bzip2; COMPRESSOR_OPTS='--best';;
    z) COMPRESSOR=gzip; COMPRESSOR_OPTS='--best';;
    l) COMPRESSOR=lzma; COMPRESSOR_OPTS='--fast';;
    t) dry_run() { sleep 3; echo "$@"; }; COMPRESSOR=dry_run; COMPRESSOR_OPTS='';; # test mode: pause, then print the file name
    p) BASE_PATH="$OPTARG";;
    v) VERBOSE=1;;
  esac
done
shift $((OPTIND - 1)) # Get rid of processed args

# If we haven't received a path via a switch, fall back to arguments
if [ -z "$BASE_PATH" ] && [ -n "$1" ]; then
  BASE_PATH="$1"
elif [ -z "$BASE_PATH" ]; then
  echo "Please specify the directory containing files to compress."
  exit 1
fi
RUNNING_JOBS=''

[ $VERBOSE ] && echo "Compressing files in $BASE_PATH"
# Feed the loop through process substitution instead of a pipe: a piped loop
# runs in a subshell, so TOTAL_COUNT would be lost by the time we report it.
while IFS= read -r x || [ -n "$(jobs -rp)" ]; do
  if [ -n "$x" ]; then
     [ ! -f "$x" ] && continue
     if [ $(for ext in $FILE_EXTENSIONS; do [ -f "$x".$ext ] && echo; done|wc -l) -gt 0 ] ; then
       [ $VERBOSE ] && echo "Skipping $x: already compressed"
       continue
     else
       RUNNING_JOBS=$(jobs -rp |wc -l)
       [ $VERBOSE ] && [ $RUNNING_JOBS -gt $((MAX_CONCURRENT - 1)) ] && echo -n "Spawned more than $MAX_CONCURRENT $COMPRESSOR jobs. Waiting for one to complete..."
       while [ $RUNNING_JOBS -gt $((MAX_CONCURRENT - 1)) ]; do
         pid=$(jobs -rp |head -n1)
         # Only wait when there really is a running job to wait for
         [ -n "$pid" ] && wait $pid
         RUNNING_JOBS=$(jobs -rp |wc -l)
       done
       [ $VERBOSE ] && echo "continuing. ($(jobs -rp |wc -l) jobs are running)"
       $COMPRESSOR $COMPRESSOR_OPTS "$x" &
       TOTAL_COUNT=$((TOTAL_COUNT + 1))
     fi
  else
    # Input is exhausted: wait for the remaining jobs to finish
    wait
  fi
done < <(ls -1d "${BASE_PATH%/}"/* |grep -vE "$FILE_EXT_PATTERN")
[ $VERBOSE ] && echo "$TOTAL_COUNT files compressed."
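
Stripped of option handling, the throttling idea in the script above comes down to a few lines. What follows is only a minimal sketch, assuming bash 4.3 or later for wait -n; the throttle function and the MAX_JOBS variable are illustrative names, not part of the script.

```shell
#!/bin/bash
# Hypothetical helper: run "$1" once per remaining argument,
# keeping at most MAX_JOBS background jobs alive at a time.
MAX_JOBS=${MAX_JOBS:-4}

throttle() {
  local cmd="$1"; shift
  local item
  for item in "$@"; do
    # Block while the number of running background jobs is at the limit.
    while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
      wait -n   # bash >= 4.3: returns as soon as any one job exits
    done
    "$cmd" "$item" &
  done
  wait   # let the last few jobs finish
}
```

Something like `MAX_JOBS=12; throttle bzip2 /path/to/files/*` would then keep up to twelve compressors busy. Compared with waiting on one specific pid, wait -n frees a slot as soon as any job exits rather than when the oldest one does.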

After writing it, I looked around to find the proper way to do this, and found many other options.

Of these, xargs -P seems the handiest: it is available on just about every Unix OS, so there is no additional binary or script to install.
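
For comparison, here is a minimal sketch of the same job done with xargs -P. It assumes a find and xargs that support -maxdepth, -print0, -0 and -P (the GNU and BSD versions do; these are extensions rather than POSIX requirements). The compress_dir function and its argument order are illustrative, and unlike the script above it does not skip files that already have a compressed sibling.

```shell
#!/bin/bash
# Hypothetical helper: compress_dir DIR MAX_JOBS COMPRESSOR [OPTIONS...]
# Runs COMPRESSOR on every regular file directly inside DIR,
# with at most MAX_JOBS compressor processes running at once.
compress_dir() {
  local dir="$1" max="$2"
  shift 2
  # Skip files that already carry a compressed extension, and pass the
  # names NUL-separated so spaces and newlines in file names are safe.
  find "$dir" -maxdepth 1 -type f \
       ! -name '*.gz' ! -name '*.bz2' ! -name '*.tgz' ! -name '*.lzma' \
       -print0 |
    xargs -0 -n 1 -P "$max" "$@"
}
```

A call such as `compress_dir /path/to/files 12 bzip2 --best` would then use all 12 cores. The -n 1 hands each process a single file, so the work spreads evenly even when file sizes vary a lot.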

ZettaZebra.com © 2012, david clymer