Optimizing string creation

From Nutwiki
Revision as of 17:02, 27 October 2016 by Harald (Talk | contribs) (1 revision imported)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Overview

This page shows some examples of how to fill up strings using C. Along with the source a detailed comparison of the different tests is provided to show how the compiler behaves differently depending on what you do.

Before you continue: These examples and the numbers and details for the comparisons were obtained by compiling for the Ethernut 2.1b. Depending on your board the binary sizes may differ, the compiler behaviour should stay largely the same, though.

Another thing that you may encounter depending on your Nut/OS version: In recent versions Nut/OS gained a new function called NutHeapDump. By default, this function uses fprintf and thereby includes sprintf no matter what you do. Since we don't want that, you should first open /nut/os/heap.c and find this part at the end:


<source lang="c"> /*!

* \brief Dump heap memory to a given stream.
*/

void NutHeapDump(void * stream) {

   HEAPNODE *node;
  1. ifdef NUTMEM_SPLIT_FAST
   for (node = heapFastMemFreeList; node; node = node->hn_next) {
       fprintf(stream, "%p(%d)\n", node, (int) node->hn_size);
   }
  1. endif
   for (node = heapFreeList; node; node = node->hn_next) {
       fprintf(stream, "%p(%d)\n", node, (int) node->hn_size);
   }
  1. ifdef NUTDEBUG_HEAP
   for (node = heapAllocList; node; node = node->ht_next) {
       fprintf(stream, "%p(%u) %s:%d\n", node, (int) node->ht_size, node->ht_file, node->ht_line);
   }
  1. endif

} </source>

Change it to this so that fprintf will only be invoked when debugging is actually used:

<source lang="c"> /*!

* \brief Dump heap memory to a given stream.
*/

void NutHeapDump(void * stream) {

  1. ifdef NUTDEBUG_HEAP
   HEAPNODE *node;
  1. endif
  1. ifdef NUTMEM_SPLIT_FAST
   for (node = heapFastMemFreeList; node; node = node->hn_next) {
       fprintf(stream, "%p(%d)\n", node, (int) node->hn_size);
   }
  1. endif
  1. ifdef NUTDEBUG_HEAP
   for (node = heapFreeList; node; node = node->hn_next) {
       fprintf(stream, "%p(%d)\n", node, (int) node->hn_size);
   }


   for (node = heapAllocList; node; node = node->ht_next) {
       fprintf(stream, "%p(%u) %s:%d\n", node, (int) node->ht_size, node->ht_file, node->ht_line);
   }
  1. endif

} </source>

Now rebuild the file (e.g. using the configurator to rebuild Nut/OS). Any subsequent application compiles will now use this little fix and not include fprintf by default.

Source Code

<source lang="c">

  1. include <dev/board.h>
  2. include <stdio.h>
  3. include <io.h>

/* choose a test to compile using this define */

  1. define TEST 6
  1. if (TEST == 1)

void CombineStrings(char *target, char *str1, char *str2) {

   sprintf(target, "%s%s", str1, str2);

}

  1. endif
  1. if (TEST == 2)

void CombineStrings(char *target, char *str1, char *str2) {

   strcat(target, str1);
   strcat(target, str1);

}

  1. endif

int main(void) {

   char str[32];
  1. if (TEST == 1)
   CombineStrings(str, "Hello ", "world!");
  1. endif
  1. if (TEST == 2)
   CombineStrings(str, "Hello ", "world!");
  1. endif
  1. if (TEST == 3)
   sprintf(str, "Hello world!");
  1. endif
  1. if (TEST == 4)
   strcpy(str, "Hello world!");
  1. endif
  1. if (TEST == 5)
   sprintf(str, "Hello world%c", '!');
  1. endif
  1. if (TEST == 6)
   CONST char *template = "Hello world!";
   int i = 0;
   while (*(template + i) != '\0') {
       *(str + i) = *(template + i);
   }
  1. endif
   for (;;);
   return 0;

} </source>

Before compiling, make sure you set the define TEST to the number of the test you want to compile.

Test details

  • 1: ...passes two source strings and a target string to a function. This function then joins them via sprintf.
  • 1: ...passes two source strings and a target string to a function. This function then joins them via strcat.
  • 3: ...uses sprintf without any parameters to fill a string with content.
  • 4: ...uses strcpy to fill a string with content.
  • 5: ...uses sprintf with a single character as parameter which is then inserted into the string.
  • 6: ...accesses the memory directly to copy a string from one location to another.

Binary sizes

  • 1: 8,840 Bytes
  • 2: 6,668 Bytes
  • 3: 6,606 Bytes
  • 4: 6,606 Bytes
  • 5: 8,740 Bytes
  • 6: 6,562 Bytes

Depending on the target you compile for the binary sizes may differ. The differences should still be alike. These numbers were obtained by compiling for Ethernut 2.1b with Ethernut 4.9.8

Test details

Test 1

In this test we use sprintf to combine two strings into one. While this solution works, it is very unelegant and wastes some CPU time, RAM and flash memory that could be needed elsewhere. Compare the sizes of the first 2 binaries. They do the same thing but the one with sprintf is a whopping 2 kilobytes bigger then the other. The only difference between them is the function used to concatenate the strings.

Of course sprintf does not add 2k to your binary every time you use it. It only does that if it is used at least once. Even if you have to include it already anyway, it is still a very slow function. If you can, always think about faster, more fitting solutions for the problem at hand.

Test 2

This test achieves absolutely the same results as the first. Instead of sprintf we use strcat, though. This is a way faster method of concatenating strings and it doesn't force us to link sprintf into the finished binary either.

Test 3

We already talked about sprintf and the fact that it is pretty huge for a single method. Since the compiler is aware of that it tries to optimize sprintf wherever it appears. In this test we use sprintf without any parameters for the format string. Interestingly, the binary isn't as big as the one in Test 1. The reason for this is the compiler. It detects that

<source lang="c"> sprintf(str, "Hello world!"); </source>

is technically the same as

<source lang="c"> strcpy(str, "Hello world!"); </source>

and replaces it, thus saving the unnecessary inclusion of sprintf.

Test 4

Test 4 serves mainly as a proof-of-concept for Test 3. Compile it and then compare the binary size. Exactly the same, although we used sprintf in the previous test. You may also want to check the .lst files for both, as they are the exactly the same as well.

Test 5

The fifth test demonstrates the opposite: As soon as even a single character (or any other data type, for that matter) is passed to sprintf, it can't be optimized away anymore and adds about 2k to your binary.

Test 6

The sixth test now does string copying in a very quick, resource-saving way. It starts inspecting the string at the beginning and copies characters to the other string until it finds a '\0' character. Since this character means "end of string" we can be sure to have copied the whole string afterwards.

Why not use strlen to get the string length, you ask? strlen works in the same way as the while loop here: It starts at the beginning and counts until it reaches '\0'. So while it would have worked, this variant is way faster because it copies the string while obtaining its length, not afterwards. This may save precious CPU time in tight spots.

Conclusion

Most applications have to create strings at certain points. If you have to create them, spend a moment thinking about the string you are creating.

  • Is it empty? If so, use memset to create it or create it as mystring = "\0";
  • Is it constant? If so, declare it as constant. If you're using an AVR processor, you may even want to keep those in program memory to save some more space on runtime.
  • Does it require only minor modifications? If your string manipulation is predictable in such a way that you could easily access it via simple code, don't use sprintf. Spend the few moments writing the additional code, it will save your application lots of resources later on.
  • Does it only require appending one string to another? There is a function strcat. But often you won't even need that, for example when concatenating constant strings. Just place the first string in memory and start writing the second string directly where the first strings '\0' is located.

While the optimizations in this article may seem minor, keep in mind that all those examples only covered the bare minimum to handle their tasks. In real applications, these minor differences can easily blow up enough to make your application work slow or run out of RAM altogether.