Well, for one thing, it doesn't quote arguments correctly. If you do this... tou...

kazinator · on June 21, 2023

  #!/bin/sh

  DEFER=

  defer() {
    local arg
    local cmdline=""

    for arg in ${@+"$@"} ; do
      [ -n "$cmdline" ] && cmdline="$cmdline "
      case $arg in
      *"'"* )
        case $arg in
        *[\"\$\\\`]* )
          cmdline="$cmdline'$(printf "%s" "$arg" | sed -e "s/'/'\\\\''/g")'"
          ;;
        * )
          cmdline="$cmdline\"$arg\""
          ;;
        esac
        ;;
      *'"'* | *['$*?[(){};&|<>#\`']* | '~'* )
        cmdline="$cmdline'$arg'"
        ;;
      *' '* | *"$(printf '\t')"* ) # second pattern matches a tab character
        cmdline="$cmdline\"$arg\""
        ;;
      '' )
        cmdline="$cmdline''" # keep empty arguments instead of dropping them
        ;;
      * )
        cmdline="$cmdline$arg"
        ;;
      esac
    done
    DEFER="$cmdline; $DEFER"
    # printf "DEFER=%s\n" "$DEFER" # debugging
    trap "command $DEFER" EXIT
  }

(The quoting code is complicated because it's taken from a program that quotes "neatly". Words that don't need quoting aren't quoted, and single or double quotes are used, whichever is nicer. It's not valuable if all you do is execute, but it could help the debug printf be more understandable.)
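
If the quoted string only has to be executed and never read, a much shorter helper that single-quotes everything would do. A minimal sketch in plain POSIX sh; `quote_word` is a made-up name, not part of the code above:

```sh
# quote_word (hypothetical helper): print $1 wrapped in single quotes,
# with every embedded ' rewritten as '\'' so eval reads it back verbatim.
# The ( ) body runs in a subshell, so the variables stay private
# without needing `local`.
quote_word() (
  rest=$1
  out=
  while :; do
    case $rest in
      *\'*)
        head=${rest%%\'*}        # text before the first '
        rest=${rest#*\'}         # text after it
        out="$out$head'\\''"     # the ' itself becomes '\''
        ;;
      *) break ;;
    esac
  done
  printf "'%s'" "$out$rest"
)

quote_word "it's"; echo    # prints: 'it'\''s'
```

The result can be fed back through eval, e.g. `eval "set -- $(quote_word "$x")"`, and the word comes back unchanged.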

patrec · on June 21, 2023

No need for the gobbledygook. Just use printf %q, which will quote neatly, too.

    defer() {
       new_cmd="$(printf "%q " "$@")"
       DEFER="${new_cmd% }; $DEFER"
       trap "command $DEFER" EXIT
    }

Yeah, %q is not in POSIX, so this won't work with e.g. dash – but trying to stick to POSIX is almost always a dumb idea.
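
To see why %q is enough here, this round trip quotes a few awkward words and lets eval parse them back. A sketch for bash only; the sample words are made up:

```bash
#!/bin/bash
# Quote each word with %q, then have eval split the string back into words.
set -- touch "a b" "it's" '$HOME'
quoted=$(printf '%q ' "$@")
eval "set -- $quoted"
printf '<%s>\n' "$@"
# prints:
# <touch>
# <a b>
# <it's>
# <$HOME>
```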

patrec · on June 21, 2023

(And, BTW, the original code, for all its escaping gymnastics, will only work correctly in bash anyway)

kazinator · on June 21, 2023

Did you spot anything in the for loop that is Bash specific?

That was taken from code that doesn't use local and which absolutely has to work in various other shells.

I just happened to plant it into a function that uses the local feature. That is not just a Bash feature; it's in the Korn Shell, and Dash has it too.

patrec · on June 21, 2023

The trap EXIT only does the right thing in bash (run on any exit condition, try-finally-style). For dash and unfortunately even for zsh, you'd have to specify multiple signal handlers:

     > bash -c 'trap "echo cleaning up" EXIT; while :; do sleep 1; done'
     ^Ccleaning up
     > dash -c 'trap "echo cleaning up" EXIT; while :; do sleep 1; done'
     ^C
     > zsh -c 'trap "echo cleaning up" EXIT; while :; do sleep 1; done'
     ^C

What makes it even more irritating is that you need to manually clear the handlers and kill yourself if you trap INT etc, so you can't just add a few more signals to the trap clause above either.
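
For reference, the manual dance described above looks roughly like this in plain sh. A sketch, not a drop-in: `on_exit_or_signal` is a made-up name, and the choice of INT, TERM and HUP is an assumption:

```sh
# on_exit_or_signal CMD (hypothetical): run CMD once, whether the shell
# exits normally or is killed by INT, TERM or HUP.
on_exit_or_signal() {
  trap "$1" EXIT
  for _sig in INT TERM HUP; do
    # On a signal: clear both traps so CMD cannot run twice, run CMD,
    # then re-send the signal so the parent sees the real cause of death.
    trap "trap - EXIT $_sig; $1; kill -s $_sig \"\$\$\"" "$_sig"
  done
}

# usage: on_exit_or_signal 'echo cleaning up'
```

Note that CMD is spliced into the trap strings as text, so it needs the same quoting care as the deferred commands in this thread.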

IMO the lack of [[, printf %q and sane trap semantics, plus the fact that >99.9% of systems that have a POSIX shell also have bash, means that outside of very specific circumstances (e.g. it needs to run in busybox) trying to limit oneself to POSIX sh is ill-advised.

kazinator · on June 21, 2023

In general, limiting yourself to POSIX has the benefit that you think harder about what not to do in the shell.

If something can't be done easily in POSIX scripting, it's probably a poor fit for shell programming, even if it can be done nicely in Bash.

Writing scripts in Bash only makes sense when all of these assumptions hold:

1. Every system you care about has the FOSS language "GNU Bash" installed.

2. At least one of those systems has no other FOSS language you could use for programming that is vastly better than Bash.

3. At least one system installation meeting condition (2) is locked down; nothing can be installed.

If just one of these doesn't hold, you have one reason or another not to use Bash. (If just 3 doesn't hold, you do have a reason to use it: installing things on target systems creates a dependency and requires space. That has to be weighed against being stuck with shell scripting.)

patrec · on June 21, 2023

> If something can't be done easily in POSIX scripting, it's probably a poor fit for shell programming,

I can see where you are coming from, because I used to think along similar lines. In particular, whilst Bash is an ugly and baroque mess, Bourne shell, at its core, is actually a simple and elegant design, marred by a handful of irritating flaws. Since neither is particularly suitable for writing robust code, it's very tempting to say stick to sh, and where that does not suffice, pick a better language.

Unfortunately, I was forced to conclude that a) often there isn't a great alternative to shell scripting and b) bash is sufficiently ubiquitous and POSIX did a sufficiently bad job again that POSIX sh is generally not worth bothering with.

There are a handful of features, none of which are unique to bash (ksh has all of them, but sadly lost out), but all of which are absent from POSIX sh, that make it significantly easier to write robust scripts. The crucial ones, I think, are:

1. sane trap EXIT (ksh also has this, zsh sadly doesn't but has "always")

2. process substitution (ksh and zsh also have this)

3. printf %q (ksh and zsh also have this)

Not needing to use ugly cruft like [ "x$ans" = xyes ] is also good, but less essential.
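
A small illustration of item 2, with made-up input: in bash's default configuration a loop fed by a pipe runs in a subshell, so whatever it computes is lost, while process substitution keeps the loop in the current shell.

```bash
#!/bin/bash
count=0
printf '%s\n' one two three | while IFS= read -r line; do
  count=$((count + 1))
done
piped=$count            # still 0: the loop ran in a subshell

count=0
while IFS= read -r line; do
  count=$((count + 1))
done < <(printf '%s\n' one two three)
substituted=$count      # 3: the loop ran in this shell

echo "$piped $substituted"   # prints: 0 3
```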

Take your code for example (and you're clearly someone with hardcore unix skills): it's much more complicated, slower, and more brittle, all due to the absence of 1 & 3.

Is that an indication that you shouldn't write anything that needs to do clean-up in shell script? I don't think so. People write lots of really useful stuff with bash all the time (devenv, nix build steps etc.). I'd argue that all three of your conditions typically hold for anything that's well expressed in < 1 page of bash. The reasons are:

A. When sh is available, so is bash in almost all cases.

B. Any competent devops or sysadmin person can handle bash, and so can a lot of normal developers. The same is not true for any of the typically plausible replacement languages (python, perl, ruby, ...). Furthermore, all of these have trouble expressing some shell idioms equally well and, apart from perl5 which is fairly fossilized at this point, bring significant versioning problems.

C. It's not just about being locked down, it's also about not wanting to add extra attack-surface, maintenance & mental overhead, bloat etc. If you are Jane Street and use OCaml for everything throughout your org, you can probably (and profitably!) avoid almost all bash scripting, but that's a fairly unique situation.

Basically, my recommendation is to avoid shell scripts for anything where it does not have a clear advantage over python or similar. And where it does, to avoid most of the extra functionality that bash offers over sh for scripting, but to make use of at least 1-3 above where they are a natural solution. Because emulating any of these with posix shell is both painful and error-prone, and all of them are very often useful.

adrianmonk · on June 22, 2023

Here's my solution to the problem I pointed out.

The basic idea is to let the shell do expansion / word splitting on the to-be-deferred command when defer() is called, then save the words into an array, and avoid doing expansion again. This should avoid running into quoting problems in the first place.

    #! /bin/bash

    defer_run() {
      while [ ${#DEFER_OFFSETS[@]} -gt 0 ]
      do
        # Get / remove last element of offset and length arrays.
        local offset=${DEFER_OFFSETS[-1]}; unset 'DEFER_OFFSETS[-1]'
        local len=${DEFER_LENGTHS[-1]}; unset 'DEFER_LENGTHS[-1]'

        # Get a slice of the main array corresponding to one deferred
        # command's word list. Execute that word list as a command.
        "${DEFER_LIST[@]:$offset:$len}"
      done
    }

    defer() {
      declare -g -a DEFER_LIST DEFER_OFFSETS DEFER_LENGTHS
      trap defer_run EXIT

      DEFER_OFFSETS+=( ${#DEFER_LIST[@]} )
      DEFER_LENGTHS+=( $# )
      DEFER_LIST+=( "$@" )
    }

    defer echo a  b
    defer echo "c  d"

This prints:

    c  d
    a b

The quoted string "c  d" is preserved all the way through as one argument to echo, double space included. The non-quoted ones work as expected too.

Note that since bash doesn't have lists of lists, I've stuck all the word-lists of all the deferred commands together into one giant array, and I've saved offsets and length so I can separate them out again.
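
The slicing step can be tried on its own. A sketch with hand-filled arrays standing in for what two defer calls would have stored (the lowercase names are made up for the demo):

```bash
#!/bin/bash
# Flat word list for two commands: `echo a b` and `echo "c  d"`.
list=(echo a b echo "c  d")
offsets=(0 3)
lengths=(3 2)

# Expand the second command's slice; each array element stays one word.
"${list[@]:${offsets[1]}:${lengths[1]}}"   # prints: c  d
```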

Since expansion doesn't happen again after you've deferred a command, you can't expand (say) a variable when the deferred command runs. But if you want to do that, define a function and defer it, or just use eval:

    defer eval 'echo $myvar'

remram · on June 21, 2023

You mean it doesn't quote arguments automatically; it is safe so long as you always quote the command:

    defer 'touch "a b"'

(a usability pitfall I know, but probably not fixable easily)
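
To make the pitfall concrete, here is a guess at a string-based defer of the kind being discussed (the original is not reproduced in this subthread, so treat this definition as an assumption), called both ways:

```sh
# Assumed shape of a string-based defer: join the arguments with spaces,
# prepend them to the pending list, and re-install the EXIT trap.
DEFER=
defer() {
  DEFER="$*; $DEFER"
  trap "$DEFER" EXIT
}

# Each call runs in its own subshell so its EXIT trap fires right away.
( defer printf '%s,' "a b" )     # prints: a,b,   (argument was re-split)
echo
( defer 'printf "%s," "a b"' )   # prints: a b,   (quoting survived)
echo
```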

edit: maybe escaping could be added though, using sed?

    cmd=
    for arg in "$@"; do
        cmd="${cmd}$(printf -- %s "${arg}" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/'/") "
    done
    DEFER="${cmd}; ${DEFER}"

ramses0 · on June 21, 2023

    printf "%q" "$VAR"

P.S. quotes matter, surprisingly or unsurprisingly because “bash”
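
A quick way to see the difference, in bash, with a made-up value: unquoted, the variable is split before printf ever sees it, so %q quotes each fragment separately and the original spacing is gone.

```bash
#!/bin/bash
VAR='a  b'
unquoted=$(printf '%q ' $VAR)     # printf receives two words: a and b
quoted=$(printf '%q ' "$VAR")     # printf receives one word: a  b

eval "set -- $unquoted"; echo "$#"   # prints: 2
eval "set -- $quoted";   echo "$#"   # prints: 1
```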

remram · on June 21, 2023

I suspect that this works only for bash and won't work in other shells.

patrec · on June 21, 2023

It works in bash and zsh, which are pretty much the only Bourne-like shells that matter.