Both the ATtiny and ATmega CPUs have a very limited amount of RAM (256 bytes for the ATtiny45, 512 bytes for the ATtiny85 and 1K for the ATmega8/168), and it's easy to hit the limit without some careful programming. I've been working on a project based around an ATtiny85 that barely fits into the 512 bytes available. While analysing the code I came across a few common (and not so common) tricks to help reduce the amount of RAM used by your program, so I thought I'd take the opportunity to share them here.

To determine how much memory you are using you can use the avr-size command (part of the avr-gcc suite) as follows:

avr-size --mcu=attiny85 --format=avr myfile.elf  

The output of the command not only tells you how much flash and RAM your program uses, it also tells you what that is as a percentage of the memory available on the specified MCU. The output looks like this:

AVR Memory Usage
----------------
Device: attiny85

Program:    3644 bytes (44.5% Full)
(.text + .data + .bootloader)

Data:        372 bytes (72.7% Full)
(.data + .bss + .noinit)

I put an avr-size command in the link step of the Makefile for my projects, which makes it easy to check whether I am approaching the memory limits of the chip I'm using.

The remainder of this post looks at what things are placed into RAM and what options you have to minimise that memory usage.

Uninitialised global or static variables

This is the most obvious. Any global variable (or static variable declared inside a function) will have space allocated for it in the RAM. A declaration like the following ...

uint8_t myArray[12]; // 12 bytes of uninitialised memory  

... will set aside 12 bytes of memory.

There is not a lot you can do to avoid this apart from minimising your globals and statics. Reusing variables is a possible option (at the cost of making your code harder to understand) or re-evaluating if the variable really needs to be global.

If you are using large arrays (input buffers for example or as a space to build strings) you need to evaluate how big they really need to be and trim them down to the minimum required size.
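As a rough illustration of the reuse idea, a union lets two buffers that are never needed at the same time share one block of RAM. This is only a sketch: the names (scratch_t, g_scratch, rx_buffer, text_buffer) and the sizes are made up for the example, and it assumes the two uses really never overlap in time.

```c
#include <stdint.h>

/* Hypothetical buffers: one for incoming bytes, one for building a
 * string. Assumption: the program never needs both at the same time. */
typedef union {
    uint8_t rx_buffer[24];   /* used while receiving */
    char    text_buffer[16]; /* used while formatting output */
} scratch_t;

/* One global instead of two: 24 bytes of .bss rather than 24 + 16. */
scratch_t g_scratch;

/* Returns the number of bytes the shared block occupies in RAM. */
uint8_t scratch_size(void) {
    return (uint8_t)sizeof(g_scratch);
}
```

The cost is the one mentioned above: the code becomes harder to follow, because writing to one member silently destroys the contents of the other.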

To see what is being allocated in the uninitialised data segment (the .bss section) you can use the avr-objdump command:

avr-objdump -t myfile.elf | grep "\.bss"  

For my sample program the output looks something like this:

0080009c l    d  .bss    00000000 .bss  
0080019c l     O .bss    00000038 g_state  
0080009c l     O .bss    00000100 g_framebuffer  
008001d4 g       .bss    00000000 __bss_end  
0080009c g       .bss    00000000 __bss_start  

The first column gives the address, so we can calculate the total size from the difference between the addresses of __bss_start and __bss_end. In this case it's 0x80009c to 0x8001d4, or 312 bytes. The second numerical column gives the size of each variable - 0x100 (256) for g_framebuffer and 0x38 (56) for g_state. Note that this does not match the total RAM usage reported above - there is another segment (the initialised data segment, or .data section) that gets added to it as well.
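The arithmetic can be checked with a few lines of C on a PC. This is just a worked calculation using the addresses from the sample output; the helper name section_size and the two macros are invented for the sketch.

```c
#include <stdint.h>

/* Addresses copied from the sample avr-objdump output. */
#define BSS_START 0x0080009cUL
#define BSS_END   0x008001d4UL

/* Size of a section from its start and end symbols. The end symbol
 * points one byte past the last byte of the section. */
uint32_t section_size(uint32_t start_addr, uint32_t end_addr) {
    return end_addr - start_addr;
}
```

section_size(BSS_START, BSS_END) gives 312, the same as adding the two variable sizes (0x100 + 0x38). Subtracting that from the 372 bytes reported by avr-size leaves 60 bytes, which is the part that belongs to the .data section.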

Initialised variables (including constant strings)

These are variables that are initialised with a value at compile time.

const char *cszMessage = "This is my message";  
uint8_t myArray[] = { 0x01, 0x02, 0x03 };  

A difference between these and uninitialised variables is that they take up space in both flash and RAM (before your program starts the values are copied from flash to RAM to ensure they are initialised with the right values).

This applies to const (read only) data as well. The AVR family is based on the Harvard architecture, which means that code memory and data memory are completely separate, and ordinary pointers and load instructions only reach data memory. To access data stored in code memory you need to use special instructions.

Luckily avr-libc (the C library used with avr-gcc) provides a set of macros to help deal with this situation; these are defined in the 'avr/pgmspace.h' header file. Essentially you need to declare your variables as being in program memory and then use specific macros to access them. A lookup table might be implemented as follows:

#include <avr/pgmspace.h>

// Table is kept in flash only; nothing is copied to RAM
const uint8_t lookup[] PROGMEM = { 1, 2, 3, 4, 5, 7 };

static inline uint8_t getEntry(int index) {
  return (uint8_t)pgm_read_byte_near(lookup + index);
}

The macros are documented in the avr-libc manual and a Google search will lead to plenty of examples. The most common use of the pgmspace.h utilities is to keep constant strings in flash only; the runtime library provides an alternative set of string manipulation functions (the ones with a _P suffix, such as strcpy_P) that work with such strings.
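To make the string case concrete, here is a minimal sketch of a string that lives only in flash and is copied into a small RAM buffer when needed. The helper copy_flash_string and the name g_message are made up for illustration (avr-libc already ships strcpy_P and strncpy_P for this job), and the #else branch is an assumption added so the sketch also compiles on a PC, where there is no separate flash address space.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (assumption, for trying the sketch off-target):
 * flash and RAM share one address space, so a plain read works. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* The string is placed in flash only; no copy is made in .data. */
const char g_message[] PROGMEM = "This is my message";

/* Hypothetical helper: copy a flash string into a RAM buffer of
 * dst_size bytes, always terminating it. Returns characters copied. */
uint8_t copy_flash_string(char *dst, const char *flash_src, uint8_t dst_size) {
    uint8_t i = 0;
    if (dst_size == 0) {
        return 0;
    }
    while (i < (uint8_t)(dst_size - 1)) {
        char c = (char)pgm_read_byte(flash_src + i);
        if (c == '\0') {
            break;
        }
        dst[i++] = c;
    }
    dst[i] = '\0';
    return i;
}
```

The RAM buffer only needs to exist while the string is in use, so it can be a short-lived local rather than a permanent resident of the .data section.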

You shouldn't limit yourself to strings though - any lookup tables or blocks of constant data are valuable candidates for optimisation.

'switch()' Statements

This one was a bit of a surprise to me and I only came across it while trying to track down a 'missing' block of 60 bytes in the initialised data segment that I couldn't match to anything I had declared.

To see what is being placed in the data segment you can use the avr-objdump command as described earlier. Here is an example that shows only the items in the .data section:

avr-objdump -t myfile.elf | grep "\.data"  

The output will look something like this:

00800060 l    d  .data  00000000 .data
00800060 l     O .data  0000003c CSWTCH.6
0080009c g       .data  00000000 __data_end
0080009c g       .data  00000000 _edata
00800060 g       .data  00000000 __data_start

You can see the CSWTCH.6 reference taking up 0x3c (60) bytes. This doesn't match up to any of my declared variables or constants so I was at a bit of a loss as to where it was coming from.

It turns out that when avr-gcc generates code for a switch() statement it may create a lookup table (the CSWTCH symbol) so that it can index straight to the result for the value being passed to the switch() rather than testing each case in turn. A lookup table is initialised data and therefore gets copied into RAM prior to running the program.

I would have expected the compiler to generate code to access this directly from flash (and I'm not alone, there is a bug report against this behaviour). If you really need to save the memory you will have to change the code from something like this ...

switch(c) {
  case 0:         // Do something
    break;
  case 1:         // Do something
    break;
  case 2:         // Do something
    break;
  default:        // Do something else
    break;
}

... to something like this ...

if(c==0) {
  // Do something
} else if(c==1) {
  // Do something
} else if(c==2) {
  // Do something
} else {
  // Do something else
}

That will avoid having a lookup table generated and free up some precious RAM. In this case the code is still very readable (even if it may be a little slower to execute) so it's not a huge trade-off.
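Another option worth considering, when the switch() does nothing more than pick a constant for each case, is to write the table by hand and place it in flash with PROGMEM. The sketch below is an assumption-laden illustration rather than a drop-in fix: the table contents, g_case_values, DEFAULT_VALUE and map_value are all invented, and the #else branch exists only so the code can be tried on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (assumption, for trying the sketch off-target). */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical mapping: the values selected for c = 0, 1 and 2. */
static const uint8_t g_case_values[] PROGMEM = { 10, 20, 30 };
#define CASE_COUNT (sizeof(g_case_values) / sizeof(g_case_values[0]))
#define DEFAULT_VALUE 0

/* Same result as a switch() that yields one constant per case, but
 * the table stays in flash and is read one byte at a time. */
uint8_t map_value(uint8_t c) {
    if (c < CASE_COUNT) {
        return pgm_read_byte(&g_case_values[c]);
    }
    return DEFAULT_VALUE; /* plays the role of the default: branch */
}
```

This keeps the speed of a table lookup while costing no RAM, at the price of only working when every case reduces to a value rather than a block of code.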

The Stack

The stack is dynamically allocated and lives in the unused portion of RAM reported by avr-size. In the example output above I am explicitly allocating 372 bytes out of an available 512 which leaves 140 bytes for the stack.

Every time you call a function the return address will be pushed to the stack and occupy space in RAM until the called function is finished. If you have deeply nested function calls (or a recursive function) this can take up a lot of space.

The stack is also used for local variables and parameters in functions - when your function is called, space is allocated for any local variables it declares, and when the function returns this memory is released again. From what I have seen, though, the avr-gcc compiler does a good job of using registers for local variables and parameters, so this may not be a significant problem.

It is difficult to determine the maximum amount of stack space your program is going to require so you need to make sure there is a reasonable amount of unused RAM available. For an average program I would be dubious about anything less than 64 bytes.

Some steps you can take to minimise your stack usage include minimising the number of local variables in functions (reuse variables where possible rather than declaring a new one) and reducing the call depth (you could declare smaller functions as inline for example).

Stack overflow can be very difficult to diagnose: there is no stack checking at runtime, and using more memory than is available will corrupt the values of other variables, so the symptoms may look like random failures. If you are having trouble, this article describes a useful technique to see how much stack is actually being used.
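One widely used way to measure stack usage (not necessarily the one in the article mentioned above) is "stack painting": fill the unused RAM with a known byte at startup, let the program run for a while, then count how many bytes still hold the pattern. The sketch below works on any block of memory so it can be tried on a PC. On the AVR you would point it at the region between the end of .bss and the stack pointer; how you obtain those addresses is left out here and is an assumption of the sketch. The names paint_region and count_untouched and the 0xC5 pattern are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define PAINT_BYTE 0xC5 /* arbitrary pattern chosen for this sketch */

/* Fill len bytes starting at start with the pattern. Call this once,
 * as early as possible, before the stack has grown. */
void paint_region(volatile uint8_t *start, size_t len) {
    size_t i;
    for (i = 0; i < len; i++) {
        start[i] = PAINT_BYTE;
    }
}

/* The AVR stack grows downwards, so bytes it never reached sit at
 * the low end of the region. Count them from the bottom up. */
size_t count_untouched(const volatile uint8_t *start, size_t len) {
    size_t n = 0;
    while (n < len && start[n] == PAINT_BYTE) {
        n++;
    }
    return n;
}
```

The count is a high-water mark, not a guarantee: it only reflects the code paths that actually ran, and a stack value that happens to equal the pattern at the boundary would make the result slightly optimistic.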


So there you have it - a nice collection of simple tricks to optimise your memory usage, as well as some useful tools to help you determine what is being used where and by what. Be careful to avoid premature optimisation though - save your memory optimisation until you need it.

In the case of this particular project I am trying to squeeze as much as I can into an ATtiny85. If you find yourself in a similar situation, or need to add features to an existing project, these tips might come in very useful.