433-253 - C Recap

Matt Giuca :: 253 Index

Bad memory

As you have probably seen by now, sometimes crazy stuff happens in C. By crazy stuff, I mean you're looking at the code, and there's just no logical way it could possibly be doing what it's doing, but it's doing it anyway! This includes things like:

Usually, when you see completely unexpected behaviour it's because somewhere in your program, something is writing to memory it shouldn't be writing to. The bad news is, the part of the code where the bug shows up is likely to be in a completely different spot to the part of the code that's causing it. The bug could even show up inside calls to standard library functions. (If gdb is telling you that realloc or strcmp crashed, I'm sorry to tell you it's probably your fault).

As you should know by now, C pointers are completely unsafe and unchecked. It is possible to make a pointer point anywhere you like, and it is then possible to read from or write to that location. The fact that it's unchecked is a big part of what makes C so fast ("A C program is like a fast dance on a newly waxed dance floor by people carrying razors." -- Waldi Ravens). But if you do access any memory outside of where you're "supposed" to, the program is sooner or later going to stop working.

Absolutely anything can happen from this point onwards. It could crash, it could make garbage output, it could continue happily, or it could make demons fly out of your nose.

When you get seg faults, and when you don't

While segmentation faults are pretty common, they are in no way guaranteed to come up when you access memory out of bounds (a lot of students don't realise this). A segfault only happens when you touch memory the operating system hasn't given your program access to. Generally, accessing memory that's close to the bounds will not cause a segmentation fault (it will just cause random badness), because that memory usually still belongs to your program, while accessing memory far away (such as through random pointers or NULL pointers) usually will. The lesson here is not to rely on seg faults to catch memory errors for you.
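Since the segfault can't be relied on, one option is to do the check yourself. The following is a minimal sketch, assuming a hypothetical helper called checked_set (not part of any library) that you pass the array length to by hand:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: store value at a[i] only if i is inside the array.
 * len is the number of elements in a, which the caller must keep track of.
 * Returns 1 on success, 0 if the index is out of bounds. */
int checked_set(int *a, size_t len, size_t i, int value)
{
    assert(a != NULL);
    if (i >= len) {
        fprintf(stderr, "checked_set: index %lu out of bounds (length %lu)\n",
                (unsigned long) i, (unsigned long) len);
        return 0;
    }
    a[i] = value;
    return 1;
}
```

With an array of 10 ints, checked_set(x, 10, 10, 4) prints a message and returns 0 instead of quietly writing past the end.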

If you don't get a segmentation fault, you may simply write over some other memory. "Other memory" could very well mean something very important. Sooner or later, something will screw up. When it does, it's usually impossible to track the bug down from the "screw-up point" (usually the segfault) - you need to track it down from the point where the violation occurred.
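To see this kind of silent overwrite without relying on undefined behaviour, here is a small sketch. It assumes a single 16-byte block that we pretend is split into a "name" region and a "tag" region; the function name and the sizes are made up for illustration. The copy never leaves the block, so nothing crashes, but the neighbouring region gets trashed:

```c
#include <assert.h>
#include <string.h>

#define NAME_SIZE  8    /* bytes 0-7 of the block: the "name" region */
#define TOTAL_SIZE 16   /* bytes 8-15 of the block: the "tag" region */

/* Hypothetical buggy function: copies src to the start of block without
 * checking it against NAME_SIZE. A long src spills into the tag region. */
void unchecked_copy_name(char *block, const char *src)
{
    size_t n = strlen(src) + 1;   /* include the terminating '\0' */
    assert(n <= TOTAL_SIZE);      /* keep the demo itself inside the block */
    memcpy(block, src, n);        /* bug: should be limited to NAME_SIZE */
}
```

If the tag region holds "SAFE" and you copy in "Bartholomew" (11 characters plus the terminator), the tag region then reads "mew". No crash, no warning, just wrong data that something else will trip over later.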

Common errors which cause bad memory accesses

Ways to find these problems

When you see a segfault or other random memory error, the debugging can be long and painful. (Not just for students! I can sit with a student for 15 minutes and still not figure out the problem!) The first thing to do, however (after panicking), is run two tools immediately.

GDB

The first tool to use when tracking down a segfault is GDB. We cover GDB in depth in the subject 433-252, which I believe is a corequisite for 253. Here I will not talk about its advanced uses, just what you should do immediately to track down a segfault.

Use the following commands:

With these commands, you should be able to examine the state of the program to get a better idea of what is going on. Refer to the 433-252 lecture/lab notes for further instructions on using GDB.
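As a rough illustration, here is a small function that crashes when given NULL, with a typical GDB session sketched in the comment. The file name crash.c and the function name_length are hypothetical; the commands shown are standard GDB commands picked for this example.

```c
#include <stddef.h>

/* Hypothetical function: returns the length of the string s.
 * It dereferences s without checking for NULL, so name_length(NULL)
 * will typically crash with a segmentation fault. */
size_t name_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/*
 * Assuming a main() that calls name_length(NULL), saved as crash.c:
 *
 *   $ gcc -g -o crash crash.c     compile with debugging information
 *   $ gdb ./crash
 *   (gdb) run                     run the program until it segfaults
 *   (gdb) backtrace               show the chain of function calls
 *   (gdb) print s                 print a variable in the current frame
 *   (gdb) up                      move up to the calling function
 */
```

Here "print s" would show a null pointer, and "up" takes you to the caller so you can see where the bad pointer came from.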

However, as discussed above, the point where the segfault occurred is not always the place where things started to go wrong. Things often start to go wrong when you access an array out of bounds. For that you need ...

BGCC =D

The most useful and often overlooked device for catching these bugs is a little tool we introduce in 252 called "bgcc" (bounds-checking GCC). I really can't recommend this tool enough.

Basically, this is exactly the same as normal gcc, except the compiled programs will check for memory violations at runtime, and tell you as soon as one is detected. On the CSSE machines, you can use this right away - just type "bgcc" in place of "gcc".

With this in place, you will see errors right away (when the memory violation occurs, not later on), and they are actual errors, not random bugs. Furthermore, they tell you the line of code which caused it, and the pointer location you tried to access.

So take a quick look in GDB without BGCC, but don't spend too long there - run it through BGCC next. Using both tools, you can see both where the segfault occurs, and where the violation occurs too.

Note that bgcc can't find all types of memory violations. For instance, it has trouble handling huge stack overflows. Also, it doesn't work within a malloc'd block. This means it cannot detect if you wrote too far past one field of a struct into another field of that same struct. Keep these in mind.
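For the struct case, where the tool can't help, a manual length check is one way to protect yourself. This is a minimal sketch, assuming a made-up struct student and a hypothetical setter set_name; the field sizes are arbitrary:

```c
#include <assert.h>
#include <string.h>

struct student {
    char name[8];    /* room for 7 characters plus the terminating '\0' */
    char grade[8];   /* the next field in the same struct */
};

/* Hypothetical checked setter: copies src into s->name only if it fits.
 * Returns 1 on success, 0 (leaving the struct untouched) if src is too long. */
int set_name(struct student *s, const char *src)
{
    assert(s != NULL && src != NULL);
    if (strlen(src) >= sizeof s->name)
        return 0;
    strcpy(s->name, src);   /* safe: we know src fits, terminator included */
    return 1;
}
```

A plain strcpy(s->name, src) with a long src would run straight on into grade. Checking against sizeof s->name means the check stays correct if you later change the size of the field.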

Using BGCC with GDB

The main problem with BGCC is that although it tells you the line a violation occurred on, it doesn't let you examine the state.

If you use bgcc with -g, you can run it through GDB as normal. If you do this, it will break on the BGCC-detected violation, not the segfault, which lets you access the state at the time of the violation.

Keep in mind that because of the extra BGCC code, you may find yourself buried deep in unknown functions. Just type "up" a few times until you find yourself in familiar code.

Please put these techniques into practice before asking your demonstrator for help. Learning the tools makes you a more effective debugger so you can get back to doing the algorithms and data structures (remember them?).

Bad memory quiz

1. Which of these code snippets cause seg faults? Which ones don't? (What do they do instead?) Which ones can bgcc detect?

Note: Try to guess what will happen, then try it out in a C program.

a)

    int* x;
    printf("%d\n", x);

b)

    int* x;
    printf("%d\n", *x);

c)

    int* x = NULL;
    printf("%d\n", *x);

d)

    int* x = (int*) malloc(sizeof(int) * 10);
    x[10] = 4;

e)

    int* x = (int*) malloc(10);
    x[6] = 4;

f)

    int x[10];
    x[10] = 4;

g)

    char* x = "flip";
    x[2] = 'o';

2. Download and compile this C code. What happens?

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main()
    {
        int i;
        int t;
        int* a1 = (int*) malloc(sizeof(int) * 500);
        int* a2 = (int*) malloc(sizeof(int) * 500);
        /* Populate arrays */
        srand(time(NULL));
        for (i=0; i<=5000; i++)
        {
            a1[i] = rand();
            a2[i] = i;
        }
        /* Swap the two arrays */
        for (i=0; i<=5000; i++)
        {
            t = a1[i];
            a1[i] = a2[i];
            a2[i] = t;
        }
        /* Print the arrays */
        for (i=0; i<=5000; i++)
        {
            printf("%d : %d\n", a1[i], a2[i]);
        }
        free(a1);
        free(a2);
    }

Before you fix the ridiculously-easy-to-spot-but-I'm-tired-and-can't-be-bothered-making-anything-more-subtle error, apply the above tools to the code and see how they report the problems. What does GDB help you find? What does BGCC help you find?

Fix the first-and-most-obvious error, and you should find the program runs correctly without segfaulting. Is there another issue? (Hint: Yes). Which tool helps you find it?

3. Compile question 3 of the malloc quiz with BGCC (the static one in the question, not your answer). Can BGCC detect the problem of overflowing the array? Does it depend on the input?