Command-line arguments

It is standard practice in UNIX for information to be passed from the command line directly into a program through the use of one or more command-line arguments, or switches. Switches are typically used to modify the behavior of a program, or to set the values of some internal parameters. You have already encountered several of these--for example, the "ls" command lists the files in your current directory, but when the switch -l is added, "ls -l" produces a so-called ``long'' listing instead. Similarly, "ls -l -a" produces a long listing, including ``hidden'' files, the command "tail -20" prints out the last 20 lines of a file (instead of the default 10), and so on.

Conceptually, switches behave very much like arguments to functions within C, and they are passed to a C program from the operating system in precisely the same way as arguments are passed between functions. Up to now, the main() statements in our programs have had nothing between the parentheses. However, UNIX actually makes available to the program (whether the programmer chooses to use the information or not) two arguments to main: an array of character strings, conventionally called argv, and an integer, usually called argc, which specifies the number of strings in that array. The full statement of the first line of the program is

    int main(int argc, char** argv)

(The syntax char** argv declares argv to be a pointer to a pointer to a character. Here it points to the first element of an array of pointers, each of which points to a character string--in other words, argv behaves as an array of character strings. You could also write this as char* argv[]. Don't worry too much about the details of the syntax, however--the use of the array will be made clearer below.)
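As a small illustration of the two spellings, here is a sketch with two hypothetical helper functions (count_strings and print_strings are names invented for this example). One declares its parameter as char* argv[] and the other as char** argv; both accept the same array. The sketch also relies on the guarantee in the C standard that argv[argc] is a null pointer, so the end of the array can be found without argc.

```c
#include <stdio.h>

/* Hypothetical helper: count the strings in an argv-style array by
   walking to the terminating null pointer (argv[argc] is NULL). */
int count_strings(char* argv[])     /* same type as char** argv */
{
    int n = 0;

    while (argv[n] != NULL)
      n++;

    return n;
}

/* Hypothetical helper: print each string, using pointer notation
   instead of array indexing. */
void print_strings(char** argv)
{
    char** p;

    for (p = argv; *p != NULL; p++)
      printf("%s\n", *p);
}
```

Called from main as count_strings(argv), the first function should return the same number as argc.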

When you run a program, the array argv contains, in order, all the information on the command line when you entered the command (strings are delimited by whitespace), including the command itself. The integer argc gives the total number of strings, and is therefore equal to the number of arguments plus one. For example, if you typed

      a.out -i 2 -g -x 3 4

the program would receive

      argc = 7
      argv[0] = "a.out"
      argv[1] = "-i"
      argv[2] = "2"
      argv[3] = "-g"
      argv[4] = "-x"
      argv[5] = "3"
      argv[6] = "4"

Note that the arguments, even the numeric ones, are all strings at this point. It is the programmer's job to decode them and decide what to do with them.
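One possible way to decode a numeric argument is the standard library function strtol, which reports where it stopped reading and so makes it possible to reject strings such as "3x". The following is only a sketch: parse_int is a hypothetical helper name, and the choice to reject empty strings and trailing characters is an assumption about what a program would want.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert the string s to an int.
   Returns 1 on success and stores the result in *out.
   Returns 0 if s has no digits, has trailing characters,
   or does not fit in an int. */
int parse_int(const char* s, int* out)
{
    char* end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);

    if (end == s || *end != '\0')   /* no digits, or junk after them */
      return 0;

    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
      return 0;

    *out = (int) v;
    return 1;
}
```

With the command line shown above, parse_int(argv[2], &n) would store 2 in n and return 1.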

The following program simply prints out its own name and arguments:

#include <stdio.h>

int main(int argc, char** argv)
{
    int i;

    printf("argc = %d\n", argc);

    /* Loop over every string, including the command name argv[0]. */

    for (i = 0; i < argc; i++)
      printf("argv[%d] = \"%s\"\n", i, argv[i]);

    return 0;
}

UNIX programmers have certain conventions about how to interpret the argument list. They are by no means mandatory, but it will make your program easier for others to use and understand if you stick to them. First, switches and key terms are always preceded by a ``-'' character. This makes them easy to recognize as you loop through the argument list. Then, depending on the switch, the next arguments may contain information to be interpreted as integers, floats, or just kept as character strings. With these conventions, the most common way to ``parse'' the argument list is with a for loop and a switch statement, as follows:

#include <stdio.h>
#include <stdlib.h>
 
int main(int argc, char** argv)
{
    /* Set defaults for all parameters: */
 
    int a_value = 0;
    float b_value = 0.0;
    char* c_value = NULL;
    int d1_value = 0, d2_value = 0;
 
    int i;
 
    /* Start at i = 1 to skip the command name. */
 
    for (i = 1; i < argc; i++) {
 
      /* Check for a switch (leading "-"). */
 
      if (argv[i][0] == '-') {
 
          /* Use the next character to decide what to do. */
 
          switch (argv[i][1]) {
 
            case 'a':   a_value = atoi(argv[++i]);
                        break;
 
            case 'b':   b_value = atof(argv[++i]);
                        break;
 
            case 'c':   c_value = argv[++i];
                        break;
 
            case 'd':   d1_value = atoi(argv[++i]);
                        d2_value = atoi(argv[++i]);
                        break;
 
          }
      }
    }
 
    printf("a = %d\n", a_value);
    printf("b = %f\n", b_value);
    if (c_value != NULL) printf("c = \"%s\"\n", c_value);
    printf("d1 = %d, d2 = %d\n", d1_value, d2_value);

    return 0;
}

Note that argv[i][j] means the j-th character of the i-th character string, with both counts starting from zero. The if statement checks for a leading ``-'' (character 0), then the switch statement allows various courses of action to be taken depending on the next character in the string (character 1 here). Note the use of argv[++i] to increase i before use, allowing us to access the next string in a single compact statement. The functions atoi and atof are declared in stdlib.h. They convert character strings to ints and doubles, respectively; the double returned by atof is converted to float when it is assigned to b_value.

A typical command line might be:

      a.out -a 3 -b 5.6 -c "I am a string" -d 222 111

(The use of double quotes with -c here makes sure that the shell treats the entire string, including the spaces, as a single object.)
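One caveat: the loop above assumes that every switch is followed by its value. If the command line ends early (for example "a.out -a"), argv[++i] is argv[argc], which is a null pointer, and passing that to atoi is an error. The sketch below shows one way to guard against this. The helper names value_after and parse_a are hypothetical, and only the -a switch is handled to keep the example short.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return the string that follows switch argv[i],
   or NULL (after printing a message) if the command line ends first. */
char* value_after(int argc, char** argv, int i)
{
    if (i + 1 >= argc) {
      fprintf(stderr, "%s: switch %s needs a value\n", argv[0], argv[i]);
      return NULL;
    }
    return argv[i + 1];
}

/* Parse only "-a <int>".  Return the value given, or fallback if -a
   is absent or has no value after it. */
int parse_a(int argc, char** argv, int fallback)
{
    int i, a_value = fallback;
    char* s;

    for (i = 1; i < argc; i++) {
      if (argv[i][0] == '-' && argv[i][1] == 'a') {
          s = value_after(argc, argv, i);
          if (s == NULL) break;
          a_value = atoi(s);
          i++;                  /* skip the value just consumed */
      }
    }
    return a_value;
}
```

The same check can be applied before each argv[++i] in the full switch statement.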

Arbitrarily complex command lines can be handled in this way. Finally, here's a simple program showing how to place parsing statements in a separate function whose purpose is to interpret the command line and set the values of its arguments:

 
 
                  /********************************/
                  /*                              */
                  /*    Getting arguments from    */
                  /*       the Command Line       */
                  /*                              */
                  /********************************/


                               /* Steve McMillan        */
                               /* Written: Winter 1995  */
 
 
#include <stdio.h>
#include <stdlib.h>
 
void get_args(int argc, char** argv, int* a_value, float* b_value)
{
    int i;
 
    /* Start at i = 1 to skip the command name. */
 
    for (i = 1; i < argc; i++) {
 
      /* Check for a switch (leading "-"). */
 
      if (argv[i][0] == '-') {
 
          /* Use the next character to decide what to do. */
 
          switch (argv[i][1]) {
 
            case 'a':   *a_value = atoi(argv[++i]);
                        break;
 
            case 'b':   *b_value = atof(argv[++i]);
                        break;
 
            default:    fprintf(stderr,
                        "Unknown switch %s\n", argv[i]);
          }
      }
    }
}
 
int main(int argc, char** argv)
{
    /* Set defaults for all parameters: */
 
    int a = 0;
    float b = 0.0;
 
    get_args(argc, argv, &a, &b);
 
    printf("a = %d\n", a);
    printf("b = %f\n", b);

    return 0;
}
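
For comparison, POSIX systems provide a library function, getopt (declared in unistd.h), that follows the same ``-'' conventions and can replace the hand-written loop. The sketch below handles the same -a and -b switches; the function name get_args_getopt is made up for this example, and it is offered only as one possible alternative, not as part of the original program.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for POSIX declarations */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>               /* getopt, optarg, optind */

/* Sketch: parse "-a <int>" and "-b <float>" with getopt.
   Returns 0 on success, -1 on an unknown switch or missing value. */
int get_args_getopt(int argc, char** argv, int* a_value, float* b_value)
{
    int c;

    /* Reset the scan position so the function can be called more
       than once (works with glibc; other systems may differ). */
    optind = 1;

    /* In "a:b:" each colon means the preceding switch takes a value. */

    while ((c = getopt(argc, argv, "a:b:")) != -1) {

      switch (c) {

        case 'a':   *a_value = atoi(optarg);
                    break;

        case 'b':   *b_value = (float) atof(optarg);
                    break;

        default:    return -1;  /* getopt has already printed a message */
      }
    }
    return 0;
}
```

It would be called exactly like get_args above: get_args_getopt(argc, argv, &a, &b). One difference from the hand-written loop is that getopt also accepts a value attached directly to its switch, as in "-a3".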