Bash – Shell valid function name characters

bash, function, lksh, mksh, shell, zsh

Using extended Unicode characters in function names is, no doubt, useful for many users.

Simpler shells (ash (busybox), dash) and ksh fail with:

tést() { echo 34; }

tést

But bash, mksh, lksh, and zsh seem to allow it.

I am aware that POSIX valid function names use this definition of Names. That means this regex:

[a-zA-Z_][a-zA-Z0-9_]*
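
As a rough illustration of that regex, here is a minimal sketch in POSIX sh that checks whether a string is a POSIX Name. The function name `is_posix_name` and the sample strings are made up for this example; the `LC_ALL=C` override is an assumption that keeps the bracket ranges limited to plain ASCII letters.

```shell
#!/bin/sh
# is_posix_name STRING  (hypothetical helper, not part of any shell)
# Returns 0 if STRING matches [a-zA-Z_][a-zA-Z0-9_]*, 1 otherwise.
# The body is a subshell, so the LC_ALL override does not leak out.
is_posix_name() (
    LC_ALL=C    # assumption: force ASCII-only meaning for a-z, A-Z, 0-9
    case $1 in
        # empty, bad first character, or a bad character anywhere
        '' | [!a-zA-Z_]* | *[!a-zA-Z0-9_]*) exit 1 ;;
        *) exit 0 ;;
    esac
)

# Sample names, chosen only for illustration.
for candidate in test _tmp2 2fast 'tést' 'my-func'; do
    if is_posix_name "$candidate"; then
        printf '%s: POSIX Name\n' "$candidate"
    else
        printf '%s: not a POSIX Name\n' "$candidate"
    fi
done
```

A name that fails this check may still be accepted by a particular shell; the check only says whether the name is guaranteed to be portable.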

However, the POSIX text on function definitions also says:

An implementation may allow other characters in a function name as an extension.

The questions are:

  • Is this accepted and documented?
  • Where?
  • For which shells (if any)?

Related questions:
Its possible use special characters in a shell function name?
I am not interested in using meta-characters (>) in function names.

Upstart and bash function names containing “-”
I do not believe that an operator (subtraction "-") should be part of a name.

Best Answer

Since the POSIX documentation allows it as an extension, nothing prevents an implementation from behaving this way.

A simple check (run in zsh):

$ for shell in /bin/*sh 'busybox sh'; do
    printf '[%s]\n' $shell
    $=shell -c 'á() { :; }'
  done
[/bin/ash]
/bin/ash: 1: Syntax error: Bad function name
[/bin/bash]
[/bin/dash]
/bin/dash: 1: Syntax error: Bad function name
[/bin/ksh]
[/bin/lksh]
[/bin/mksh]
[/bin/pdksh]
[/bin/posh]
/bin/posh: á: invalid function name
[/bin/yash]
[/bin/zsh]
[busybox sh]
sh: syntax error: bad function name

This shows that bash, zsh, yash, ksh93 (which ksh is linked to on my system), pdksh and its derivatives allow multibyte characters in function names.
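
The loop above relies on zsh's `$=shell` word splitting. A portable variant could look like the following sketch in POSIX sh; the helper name `accepts_funcname` and the list of shell names are assumptions for illustration, and shells that are not installed are skipped.

```shell
#!/bin/sh
# accepts_funcname SHELL NAME  (hypothetical helper)
# Returns 0 if SHELL can define and call a function called NAME,
# and a non-zero status otherwise (including when SHELL is missing).
accepts_funcname() {
    "$1" -c "$2() { :; }; $2" </dev/null >/dev/null 2>&1
}

name='á'
# Assumed list of shell names; adjust to what is installed locally.
for sh in ash bash dash ksh mksh yash zsh; do
    command -v "$sh" >/dev/null 2>&1 || continue
    if accepts_funcname "$sh" "$name"; then
        printf '%-5s accepts %s\n' "$sh" "$name"
    else
        printf '%-5s rejects %s\n' "$sh" "$name"
    fi
done
```

An explicit list is used instead of a glob such as `/bin/*sh` because the glob can also match programs that are not shells.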

yash was designed to support multibyte characters from the beginning, so it is no surprise that it works.

Another piece of documentation you can refer to is the ksh93 manual:

A blank is a tab or a space. An identifier is a sequence of letters, digits, or underscores starting with a letter or underscore. Identifiers are used as components of variable names. A vname is a sequence of one or more identifiers separated by a . and optionally preceded by a .. Vnames are used as function and variable names. A word is a sequence of characters from the character set defined by the current locale, excluding non-quoted metacharacters.

So switching to the C locale:

$ export LC_ALL=C
$ á() { echo 1; }
ksh: á: invalid function name

makes it fail.
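
To compare locales without changing the current session, the same definition can be tried in a child shell under each locale. This is only a sketch: the helper name `defines_in_locale` is made up, and `en_US.UTF-8` is an assumed locale name that may not exist on every system (check `locale -a`).

```shell
#!/bin/sh
# defines_in_locale LOCALE SHELL NAME  (hypothetical helper)
# Runs SHELL with LC_ALL=LOCALE and returns 0 if it can define and
# call a function called NAME, non-zero otherwise.
defines_in_locale() {
    LC_ALL=$1 "$2" -c "$3() { echo 1; }; $3" </dev/null >/dev/null 2>&1
}

shell=ksh                  # assumption: ksh is ksh93, as in the answer
utf8_locale=en_US.UTF-8    # assumption: this locale is installed

if command -v "$shell" >/dev/null 2>&1; then
    for loc in C "$utf8_locale"; do
        if defines_in_locale "$loc" "$shell" 'á'; then
            printf '%s, LC_ALL=%s: accepted\n' "$shell" "$loc"
        else
            printf '%s, LC_ALL=%s: rejected\n' "$shell" "$loc"
        fi
    done
fi
```

Because `LC_ALL` is set only for the child process, the locale of the calling shell stays untouched.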
