Efficient String Parsing in C

Time:01-23

I'm implementing some C getter/setter functions for an embedded system that take a C string as input.

I need to parse commands of the form command option option option, where each option itself needs further parsing. As a simple example, in

set_speed M1=10 M2=20, set_speed needs to be parsed out first, and then each of the tokens M1=10 and M2=20 needs to be split further.

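Unfortunately, strtok cannot be nested: it keeps a single internal scan position, so starting a second tokenization inside the loop of the first one loses the outer position. If it could be nested, the problem would be simple.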

CodePudding user response:

getopt is a POSIX function (not part of standard C) for parsing command-line parameters passed as an argv array. There's a good example of it here...

https://www.gnu.org/software/libc/manual/html_node/Example-of-Getopt.html

CodePudding user response:

Using strtok() to split both the words and the options is possible, but a bit of a pain. POSIX strtok_r() would work better, but it isn't in standard C (though it is easy to write yourself). In most languages regular expressions would be a good choice, but again they're not in standard C, only in POSIX or third-party libraries like PCRE2. A few other routes come to mind, such as sscanf() or a routine created by a parser generator, but they aren't really efficient or suitable for an embedded environment. (ragel or re2c may be worth exploring, since they compile to C code embedded in a larger source file and don't need a support framework, but I'm not very familiar with using them.)
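To illustrate the "easy to write yourself" remark, here is a minimal sketch of a reentrant tokenizer in the spirit of strtok_r(). The names tok_r and split_command are hypothetical, not part of any library, and the sketch assumes options are separated by spaces or tabs and have the form KEY=VALUE. Because the caller owns the scan position, the inner split on '=' does not disturb the outer split on whitespace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modelled on POSIX strtok_r(): the scan position
 * lives in *save instead of hidden static state, so tokenizations can
 * be nested. Destructively modifies the input string. */
char *tok_r(char *str, const char *delims, char **save) {
  if (str == NULL) {
    str = *save;                 /* Continue where the last call stopped */
  }
  if (str == NULL) {
    return NULL;
  }
  str += strspn(str, delims);    /* Skip leading delimiters */
  if (*str == '\0') {
    *save = str;
    return NULL;                 /* No more tokens */
  }
  char *end = str + strcspn(str, delims);
  if (*end != '\0') {
    *end++ = '\0';               /* Terminate token, step past delimiter */
  }
  *save = end;
  return str;
}

/* Hypothetical helper: splits "cmd K=V K=V ..." in place.
 * Returns the number of options stored, or -1 on a malformed option
 * or an empty line. At most max options are stored. */
int split_command(char *line, char **command,
                  char *keys[], char *vals[], int max) {
  char *outer = NULL;
  char *word;
  int n = 0;

  *command = tok_r(line, " \t", &outer);
  if (*command == NULL) {
    return -1;
  }
  while (n < max && (word = tok_r(NULL, " \t", &outer)) != NULL) {
    char *inner = NULL;          /* Separate state for the nested split */
    keys[n] = tok_r(word, "=", &inner);
    vals[n] = tok_r(NULL, "=", &inner);
    if (keys[n] == NULL || vals[n] == NULL) {
      return -1;
    }
    n++;
  }
  return n;
}
```

Called on a writable copy of "set_speed M1=10 M2=20", split_command() should leave the command pointing at set_speed and fill the key/value arrays with M1/10 and M2/20. The values are still strings at that point and need a separate numeric conversion.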

However, it's easy enough to parse strings like this in a single pass with just standard string search functions and a bit of pointer manipulation:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Note: Destructively modifies its argument
void parse(char *str) {
  static const char *ws = " \t"; // Whitespace characters that separate tokens
  char *command = str;

  // Find the end of the first word
  str += strcspn(str, ws);
  if (*str) {
    *str++ = 0; // Terminate command and step past the separator
  }
  printf("Command: %s\n", command);

  while (*str) {
    // Skip leading whitespace
    str += strspn(str, ws);
    if (!*str) {
      // That was actually trailing whitespace at the end of the string
      break;
    }

    // Split at = sign
    char *option = str;
    str = strchr(str, '=');
    if (!str) {
      fputs("Missing = after option!\n", stderr);
      exit(EXIT_FAILURE);
    }
    *str++ = 0; // Terminate option and step past the = sign

    // Parse the numeric argument
    char *valstr = str;
    double val = strtod(valstr, &str);
    if (valstr == str || !strchr(ws, *str)) {
      fprintf(stderr, "Non-numeric argument to %s!\n", option);
      exit(EXIT_FAILURE);
    }

    printf(" Option %s, value %f\n", option, val);
  }
}

int main(void) {
  char command_string[] = "set_speed M1=10 M2=20";
  parse(command_string);
  return 0;
}

Example:

$ gcc -g -O -Wall -Wextra -o demo demo.c
$ ./demo
Command: set_speed
 Option M1, value 10.000000
 Option M2, value 20.000000
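On an embedded target, printing and calling exit() is usually not what you want. As a rough sketch of the same single-pass approach, the variant below stores the results in a caller-provided structure and returns a status code instead. The names parse_command, struct parsed_command, enum parse_status and the MAX_OPTIONS limit are assumptions made for illustration, not part of the code above.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_OPTIONS 8  /* Assumed upper bound on options per command */

struct option_value {
  const char *name;    /* Points into the (modified) input string */
  double value;
};

struct parsed_command {
  const char *name;    /* Points into the (modified) input string */
  struct option_value options[MAX_OPTIONS];
  int count;
};

enum parse_status {
  PARSE_OK = 0,
  PARSE_MISSING_EQUALS,
  PARSE_BAD_NUMBER,
  PARSE_TOO_MANY
};

/* Note: destructively modifies str, like the original parse() */
enum parse_status parse_command(char *str, struct parsed_command *out) {
  static const char ws[] = " \t";

  out->name = str;
  out->count = 0;

  /* Terminate the command word */
  str += strcspn(str, ws);
  if (*str) {
    *str++ = '\0';
  }

  for (;;) {
    str += strspn(str, ws);          /* Skip whitespace between options */
    if (!*str) {
      return PARSE_OK;               /* Reached the end of the input */
    }
    if (out->count == MAX_OPTIONS) {
      return PARSE_TOO_MANY;
    }

    char *name = str;
    str = strchr(str, '=');
    if (!str) {
      return PARSE_MISSING_EQUALS;
    }
    *str++ = '\0';                   /* Terminate the option name */

    char *end;
    double value = strtod(str, &end);
    if (end == str || !strchr(ws, *end)) {
      return PARSE_BAD_NUMBER;       /* Nothing parsed, or trailing junk */
    }

    out->options[out->count].name = name;
    out->options[out->count].value = value;
    out->count++;
    str = end;
  }
}
```

The caller can then dispatch on the command name and look up M1 and M2 in the options array. If floating point is too costly on the target, strtol() could be swapped in for strtod() with the same end-pointer check.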