Shell Corner: Bash Dynamically Loadable Built-in Commands


UnixReview.com
June 2006

Hosted by Ed Schaefer

Bash shell programmers can improve the efficiency of their scripts by using the shell's dynamically loadable built-in commands. This month, Chris F.A. Johnson shows us how to use them.

Bash Dynamically Loadable Built-In Commands
by Chris F.A. Johnson

If you use the shell for serious programming, as I do, speed of execution is a serious issue. A script should not appear sluggish; it should not be noticeably slower than a program written in Perl or Python — or even C. One of the major contributors to slowdown of scripts is starting a new process, whether it is an external command or command substitution. (All shells except Korn Shell 93 create a new process for command substitution.) I first noticed how long command substitution takes when I was writing a script to print a form on the screen. A function converted dates in ISO form (YYYY-MM-DD) to a "human friendly" DD Month YYYY with a_date=$(format_date "$date"). There were only four date fields in the form, but with the conversion there was a noticeable delay; without it, the screen was redrawn immediately.
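
The article does not show format_date itself, so the following is only a sketch of the kind of function described, under the assumption that the input is always YYYY-MM-DD. The names _format_date and _FORMAT_DATE are hypothetical. It illustrates the two calling styles: printing the result (which forces callers into command substitution) and storing it in a variable (which starts no new process).

```shell
# Hypothetical reconstruction; not the author's original function.

# Store "DD Month YYYY" in the variable _FORMAT_DATE; no subshell needed.
_format_date() {
    local y=${1%%-*} rest=${1#*-}
    local m=${rest%%-*} d=${rest#*-}
    set -- January February March April May June July \
           August September October November December
    shift "$(( 10#$m - 1 ))"      # 10# stops 08 and 09 being read as octal
    _FORMAT_DATE="$d $1 $y"
}

# Print the result; callers that want it in a variable must use $(...).
format_date() {
    _format_date "$1" && printf '%s\n' "$_FORMAT_DATE"
}

# Style 1: command substitution, which forks in most shells.
a_date=$(format_date 2006-04-05)

# Style 2: call the function, then copy the variable.
_format_date 2006-04-05
b_date=$_FORMAT_DATE
```

Both styles produce the same string; only the second avoids creating a process for each conversion.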

When I started writing Unix shell scripts, I used a Bourne shell. It was far more powerful than the Amiga or MS-DOS shells I had used previously, but it still relied on external commands for most useful work. There was no arithmetic in the shell; I used expr and awk for calculations. I used expr, cut, tr, basename, and various other commands to manipulate strings.

With the Korn shell, and the later POSIX/SUS standardization, string chopping (via parameter expansion: ${var%PATTERN}, ${var#PATTERN}, etc.) and integer arithmetic were brought into the shell itself, speeding up many operations. It became possible to write many useful programs without calling any external commands.
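
As a small illustration of the expansions mentioned above, here are shell-only stand-ins for a few common uses of basename, dirname, and expr. The path and variable names are made up for the example.

```shell
path=/usr/local/lib/libfoo.so.1

file=${path##*/}       # strip longest match of */  (like basename)
dir=${path%/*}         # strip shortest trailing /* (like dirname)
stem=${file%%.*}       # strip from the first dot onward
ext=${file##*.}        # keep what follows the last dot

total=$(( 6 * 7 ))     # integer arithmetic, formerly: expr 6 \* 7

printf '%s\n' "$file" "$dir" "$stem" "$ext" "$total"
```

These are not exact replacements in every edge case (a path with no slash, or one ending in a slash, needs extra handling), but they cover the common ones without starting a process.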

This still left many trivial operations requiring external commands (converting uppercase letters to lowercase, for example). Bash has a solution — commands that can be compiled and loaded at run time if and when needed.

Compiling and Loading Bash Built-Ins

The bash source package has a directory full of examples ready to be compiled. To do that, download the source from ftp://ftp.cwru.edu/pub/bash/bash-3.1.tar.gz. Unpack the tarball, cd into the top-level directory, and run the configure script:

wget ftp://ftp.cwru.edu/pub/bash/bash-3.1.tar.gz
gunzip bash-3.1.tar.gz
tar xf bash-3.1.tar
cd bash-3.1
./configure

The configure script creates Makefiles throughout the source tree, including one in examples/loadables. In that directory are the source files for built-in versions of a number of standard commands "whose execution time is dominated by process startup time". You can cd into that directory and run make:

cd examples/loadables
make -k   ##  I use -k because I get some errors.

You'll now have a number of commands ready to load into your bash shell. These include:

logname  basename  dirname   tee
head     mkdir     rmdir     uname
ln       cat       id        whoami

There are also some useful new commands:

print     ## Compatible with the ksh print command
finfo     ## Print file information
strftime  ## Format date and time

These built-ins can be loaded into a running shell with:

enable -f filename built-in-name

They include documentation, and the help command can be used with them, just as with other built-in commands:

$ enable -f ./strftime strftime
$ help strftime
strftime: strftime format [seconds]
    Converts date and time format to a string and displays it on the
    standard output.  If the optional second argument is supplied, it
    is used as the number of seconds since the epoch to use in the
    conversion, otherwise the current time is used.
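
A script that relies on a loadable built-in may want to fall back to the external command when the shared object is missing. The following is a minimal sketch, not something from the bash distribution: load_builtin is a hypothetical helper, and the default directory is an assumption borrowed from the enable command at the end of this article.

```shell
# load_builtin NAME [DIR]
# Try to enable the built-in NAME from the file DIR/NAME.
# Returns non-zero if the file is missing or bash cannot load it.
load_builtin() {
    local name=$1 dir=${2:-$HOME/src/loadables}   # assumed location
    [ -f "$dir/$name" ] || return 1
    enable -f "$dir/$name" "$name" 2>/dev/null
}

if load_builtin strftime; then
    echo "using the built-in strftime"
else
    echo "built-in strftime not available; falling back to date"
fi
```

Because enable -f returns a failure status when the file cannot be loaded, the same test also covers a bash that was built without dynamic loading.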

Modifying Loadable Built-Ins

With the strftime command, I can now do date arithmetic without external commands. For example, to get yesterday's date (a very frequently asked question in the newsgroups):

strftime %Y-%m-%d $(( $(strftime %s) - 86400 ))

That script has one drawback: it uses command substitution. Timings must not be taken too literally (they can vary a great deal even on the same system, depending on what else is running at the time), but they give a useful basis for comparison. The difference between using the built-in strftime (with command substitution) and the GNU date command is surprisingly small:

$ time strftime %Y-%m-%d $(( $(strftime %s) - 86400 ))
2006-04-04

real    0m0.006s
user    0m0.000s
sys     0m0.005s
$ time date -d yesterday +%Y-%m-%d
2006-04-04

real    0m0.007s
user    0m0.000s
sys     0m0.007s

In absolute terms, neither takes long, but a script may contain many such commands, and they may be repeated many times. Since built-in commands are executed in the current shell, why not have strftime set a variable instead of printing the result? I added an option to strftime to store the result in a variable rather than printing it on stdout. The difference was significant:

$ time {
     strftime -p now %s
     strftime %Y-%m-%d $(( $now - 86400 ))
}
2006-04-04

real    0m0.000s
user    0m0.000s
sys     0m0.000s
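
The same store-in-a-variable pattern can be tried without compiling anything, using the -v option of the stock printf built-in. This is an analogous sketch, not the modified strftime described here, and it assumes bash 4.2 or later for the %(fmt)T format and a C library whose strftime supports %s.

```shell
# Current time as seconds since the epoch, stored directly in "now".
# The argument -1 means "the current time".
printf -v now '%(%s)T' -1

# Format a time 86400 seconds earlier, again without command substitution.
printf -v yesterday '%(%Y-%m-%d)T' "$(( now - 86400 ))"

echo "$yesterday"
```

As with the strftime example, subtracting 86400 seconds is only an approximation of "yesterday" and can be off around daylight-saving changes.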

The changes to strftime.c are relatively minor. I included the header for bash's internal options parser:

#include "bashgetopt.h"

Then I declared two variables:

  int ch;
  char *var = NULL;

The longest piece of code parses the options, which are passed as a linked list and parsed by bash's own function:

   reset_internal_getopt ();
   while ((ch = internal_getopt (list, "p:")) != -1)
     switch(ch) {
       case 'p':
         var = list_optarg; /* should add check for valid variable name */
         break;
       default:
         builtin_usage();
         return (EX_USAGE);
     }
   list = loptend;

The bind_variable function stores the result in a shell variable if the -p option was used:

   if ( var )
     bind_variable (var, tbuf, 0);
   else
     /* ...the existing code that prints tbuf on stdout... */

Finally, there are two lines to add to the documentation. The first is added to the array of strings that are printed by help strftime:

   "OPTION: -p VAR - Store the result in shell variable VAR",

The second modifies the existing short documentation (usage) string:

   "strftime [-p VAR] format [seconds]",   /* usage synopsis; becomes short_doc */

The final strftime.c file is shown in Listing 1.

Writing New Bash Built-Ins

To write your own loadable commands, create a directory for them and copy the Makefile and the template.c files from bash-3.1/examples/loadables into it. The Makefile, which was created by running ./configure at the root of the bash source tree, contains the location of that source so that header files can be found. Make sure that top_dir points to the same place as BUILD_DIR. I also strip out all that I don't need. My resulting Makefile is shown in Listing 2, and template.c is shown in Listing 3. The template.c file is compilable and has the bare bones necessary to write a dynamically loadable built-in, plus the skeleton for adding command-line options. There are three necessary sections:

  • The function that implements the built-in,
  • A struct containing the documentation to be printed with the help command, and
  • A struct telling bash where to find the built-in and its documentation, along with a short documentation (usage) string.

These are outlined in examples/loadables/hello.c (see Listing 4).

To write a new built-in command, I use a script that changes the references to template in template.c to the name of my built-in and adds it to the Makefile (see Listing 5).
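
Listing 5 is not reproduced in this text, so the following is only a sketch of the renaming half of such a script, not the author's version. The function name new_builtin is hypothetical, it assumes template.c is in the current directory, and it leaves the Makefile untouched because the layout of that file is not shown here.

```shell
# new_builtin NAME
# Copy template.c to NAME.c, replacing every occurrence of "template".
new_builtin() {
    local name=$1
    # Accept only names that are valid C identifiers; this also keeps
    # the sed expression below safe.
    case $name in
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*)
            echo "usage: new_builtin NAME (a valid C identifier)" >&2
            return 2 ;;
    esac
    if [ ! -f template.c ]; then
        echo "new_builtin: template.c not found" >&2
        return 1
    fi
    if [ -e "$name.c" ]; then
        echo "new_builtin: $name.c already exists" >&2
        return 1
    fi
    sed "s/template/$name/g" template.c > "$name.c"
}
```

Running new_builtin lcase in the directory holding template.c would create lcase.c with the function, documentation array, and struct renamed, ready for the real code to be filled in.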

One of the scripts most frequently requested in the Unix and Linux newsgroups converts filenames from uppercase (or partly uppercase) to lowercase. This usually means calling tr once for every file. (ksh has typeset -u, but it's non-standard and not implemented in bash.)
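
For reference, the usual tr-per-file approach looks roughly like the sketch below. The function name lower_names is made up for the example, and it assumes a case-sensitive filesystem. Each file costs one command substitution, one tr process, and one mv process, which is the overhead a built-in avoids.

```shell
# lower_names [DIR]
# Rename every entry in DIR (default: current directory) to lowercase.
lower_names() {
    local dir=${1:-.} f new
    for f in "$dir"/*; do
        [ -e "$f" ] || continue                  # directory was empty
        new=$dir/$(printf '%s' "${f##*/}" | tr '[:upper:]' '[:lower:]')
        [ "$f" = "$new" ] && continue            # already lowercase
        if [ -e "$new" ]; then                   # do not clobber anything
            echo "lower_names: skipping $f: $new exists" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}
```

Called as lower_names /path/to/dir, it leaves names that are already lowercase alone and refuses to overwrite an existing file.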

A shell function is quite efficient for converting short strings:

lcase()
{
    word=$1
    while [ -n "$word" ]
    do
      temp=${word#?}
      case ${word%"$temp"} in
          A*) _LWR=a ;;        B*) _LWR=b ;;
          C*) _LWR=c ;;        D*) _LWR=d ;;
          E*) _LWR=e ;;        F*) _LWR=f ;;
          G*) _LWR=g ;;        H*) _LWR=h ;;
          I*) _LWR=i ;;        J*) _LWR=j ;;
          K*) _LWR=k ;;        L*) _LWR=l ;;
          M*) _LWR=m ;;        N*) _LWR=n ;;
          O*) _LWR=o ;;        P*) _LWR=p ;;
          Q*) _LWR=q ;;        R*) _LWR=r ;;
          S*) _LWR=s ;;        T*) _LWR=t ;;
          U*) _LWR=u ;;        V*) _LWR=v ;;
          W*) _LWR=w ;;        X*) _LWR=x ;;
          Y*) _LWR=y ;;        Z*) _LWR=z ;;
          *) _LWR=${word%"$temp"} ;;   ## not uppercase: keep the current character
      esac
      printf "%s" "$_LWR"
      word=$temp
    done
}

...but it drags when used for long words, and approaches the execution time of tr. A built-in command would be an order of magnitude faster, so I wrote lcase (see Listing 6).

Having done that, I added the inverse, ucase, to convert lowercase to uppercase. Then I wrote icase to swap case, converting upper to lower and lower to upper. Next came ncase, which creates a pattern that matches either upper or lower case:

$ icase "John Doe"
jOHN dOE
$ ncase qwerty
[Qq][Ww][Ee][Rr][Tt][Yy]

Finally, I added cap, to capitalize the first letters of words. I amalgamated all of these into a single file (Listing 7, case.c), and they are all enabled with a single command:

  enable -f $HOME/src/loadables/case ucase lcase icase ncase cap
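
To make the behaviour of icase and ncase concrete without the compiled file, here are shell-function approximations. They are not the code in Listing 7. They assume bash 4.0 or later for the ${c^^} and ${c,,} expansions, and passing non-letters through unchanged is a guess at what the built-ins do.

```shell
# Swap the case of every letter in $1.
icase() {
    local word=$1 c out=
    while [ -n "$word" ]; do
        c=${word%"${word#?}"}        # first character of what is left
        word=${word#?}
        case $c in
            [[:upper:]]) out+=${c,,} ;;
            [[:lower:]]) out+=${c^^} ;;
            *)           out+=$c ;;
        esac
    done
    printf '%s\n' "$out"
}

# Turn $1 into a glob pattern that matches it in either case.
ncase() {
    local word=$1 c out=
    while [ -n "$word" ]; do
        c=${word%"${word#?}"}
        word=${word#?}
        case $c in
            [[:alpha:]]) out+="[${c^^}${c,,}]" ;;
            *)           out+=$c ;;
        esac
    done
    printf '%s\n' "$out"
}
```

These print their results, so capturing them still requires command substitution; like the lcase function earlier, they are adequate for short strings only.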

Chris Johnson is the author of Shell Scripting Recipes: A Problem Solution Approach, Apress (2005). When not pushing shell scripting to the limits, Chris composes cryptic crosswords and teaches chess.

