Python Tutor is a free tool that has been used by tens of millions of people since 2010 to visualize and debug code step-by-step, mostly for introductory courses (e.g., CS1/CS2). Despite its name, it also visualizes C and C++ code (in addition to Java and JavaScript) to help students understand critical concepts and debug homework assignments.
This article shows instructors how Python Tutor can illustrate key concepts from a wide range of C and C++ courses. If you think this tool may be helpful for your staff or students, please share these links in relevant course materials, chat groups, mailing lists, discussion forums, or social media:
(Also, if you teach in Java, check out what the Java visualizer can do as well.)
One of the most distinctive yet challenging aspects of learning C or C++ (rather than, say, Python or Java) is that we actually care about where data resides in memory.
This example visualization shows data in the globals area, stack, heap, and read-only memory regions (denoted by the red "this is in read-only storage" label). You can step back and forth using the slider or buttons under the code:
The visualizer renders C strings as null-terminated char arrays (with a '\0' at the end of each). Note how s1 is a pointer to a string in read-only memory (because it's a string literal), whereas s3 is a pointer to a string on the heap due to strdup(). And s2 is an inline char array within the stack frame of main. The printf line prints all three as strings, so without this visualization it's hard to tell where each resides in memory.
Now toggle the "C/C++ details" selector at the bottom-right corner (under the stack frame of main) to "show memory addresses." The visualization now shows the memory address where each global/stack/heap value resides. Notice how s1 has the pointer value 0x400644 and s3 has the value of 0x5402040; these are the memory addresses of the char arrays they each point to.
For more detail, choose "byte-level view of data" to see the contents of each raw byte of data in both hex and binary. This is useful when teaching low-level memory operations such as bit shifting or masking. See "Binary-Level View of Data" for another example.
If you use gdb or print statements to display run-time data, you will see garbage values if a block of memory is uninitialized (i.e., not assigned a value yet). This can mislead novices, who may think those are real values. In contrast, the visualizer uses Valgrind to track exactly which bytes are uninitialized so garbage values aren't shown.
Using the same example as above, if you rewind to Step 2, you'll see a series of "?" marks representing uninitialized values on the stack when main() first starts running:
Then if you step forward by clicking "Next >", each "?" will progressively fill in with the data initialized at each execution step.
Here's an example showing a stack array and three pointers into the middle of it:
To see the exact pointer values, toggle the "C/C++ details" selector at the bottom right to "show memory addresses."
We've already seen arrays in the above example. Structs and unions render similarly, and can themselves contain nested arrays/structs/unions. For instance, here is a pointer to a heap-allocated array of 3 Person structs, each containing an inline character array (firstName) and a nested struct (birthday):
For unions, the visualizer shows how all members share the same memory address. Here's an example contrasting structs and unions:
Now click "Next >" once to run line 18. Since all union fields share the same memory address (0xFFF000BD4), when line 18 is run, all those fields get initialized at once. In contrast, note how each struct field has its own separate address, so initializing one does not automatically initialize all the others.
And as always, toggle "C/C++ details" to "byte-level view of data" to see more details about what is going on at the binary level.
Here's an example of C++ classes showing two Rectangle objects (one global and one on the stack), a call to the copy constructor (triggered by rect2 = rect1 in Step 2), and the this pointer in the display() member function in Step 10:
Novices may struggle to understand the different ways that parameters can be passed to functions. To clarify, here is a visualization of passing parameters by value, by pointer, and by C++ reference:
Note how each function call is visualized as a stack frame underneath main, and how the visualization shows whether the x parameter is a copy of, or a pointer to, myNumber.
The visualizer uses Valgrind to detect and report out-of-bounds errors. For instance, let's say you're walking a pointer along a heap array of ints. What happens when you overflow or underflow? The pointer ends up pointing to a skull emoji 💀 next to the array since the address is in unallocated memory. Step through this example to see:
Here's the exact same example, except now I've toggled the "C/C++ details" selector to "show memory addresses." This lets you see the memory address of each heap array element above it in gray (e.g., the first element is at 0x5402040), and when you take each step to do pointer arithmetic, you can see exactly what value the pointer p holds and when it under/overflows the array.
Now what happens when p points to a global array instead of a heap-allocated one? Sometimes when you do pointer arithmetic, it can overflow into the spot where a neighboring global variable resides! Step through this example to see how:
The first overflow shows a skull emoji 💀 at address 0x601044, but then p++ moves the pointer to 0x601048, which happens to be the start of the arr2 global array in memory. This example can show students both the raw power and danger of using pointers.
Here's a more advanced example that shows the level of detail that the visualizer captures. Here arr is initialized to a heap array with values 65, 66, and 67 (which correspond to the ASCII values for the characters A, B, C, respectively). Now we perform type punning by assigning char* s to arr so that both point to the same block of heap memory:
Now you can view this block of memory as either an array of int (through the arr pointer) or an array of char (through the s pointer). Click the "type punning: [switch views]" link right below the array to switch between the two views.
If you're in the int array view and step through s++ repeatedly, you'll see a bunch of skulls since s is often pointing into the middle of an int element. But if you switch to the char array view, then each s++ is always properly aligned with the char-sized array elements.
Here's the same example as above, except with "C/C++ details" toggled to "byte-level view of data." Now if you step through s++, note how it points into the middle of each int array element since s++ advances 1 byte at a time whereas each int is 4 bytes wide:
To illustrate low-level concepts such as bit shifting, masking, and integer over/underflows, you can toggle "C/C++ details" to "byte-level view of data." This will show all bytes in memory as both hex and binary. Here is an example that shows an unsigned int wrapping around from UINT_MAX (4294967295) back to 0 when you run x++:
This binary-level view makes it clear what's happening under the hood. There's a lot displayed on-screen in this view, so here's a brief guide:
- The memory address where x resides in the stack is 0xFFF000BDC.
- Since unsigned int is 4 bytes on this machine, each of the four bytes is displayed at the bottom, with labels 0xFFF000BDC:, 0xFFF000BDD:, 0xFFF000BDE:, and 0xFFF000BDF:, which are four contiguous bytes starting at 0xFFF000BDC.
- Before x is assigned a value, each byte is uninitialized and shown as ?.
- Next, x gets assigned the value of UINT_MAX, which fills up all bytes with 1's. The hex value for each byte is 0xFF and the binary is 11111111.
- The first x++ forces the bytes to wrap around from all 1's to all 0's, which is why you see 0x00 00000000 for all bytes (again hex + binary).
- The second x++ increments the value by 1, so the lowest byte now has the value 0x01 in hex and 00000001 in binary.
Pointers can point to other pointers, even through multiple levels of function calls. How many levels deep can we go?
The C and C++ visualizers in Python Tutor can help your students understand and debug a variety of code that they encounter in introductory or intermediate-level courses.
Feel free to share these links in relevant course materials, chat groups, mailing lists, discussion forums, social media, or anywhere else: