Algorithms and Data Structures

Spring 2021

March 4 - pointers and all that

... start recording.

Questions about anything?

I've posted an assignment for Monday: essentially one coding project in each of Python and C, with several incremental pieces. Please do work on your skills in both languages, reading up online on how these things work. As usual I'll have my own version on Monday that you can compare with your work.

If you're having trouble along the way, ask for help from other classmates, the tutors, or me, either on the slack channel, in zoom tutoring times, or by making an appointment.

Today I want to continue the conversation we started on Monday, talking about linked lists, stacks, queues, and how those can be implemented in Python and C.

And I'd like to start by going over some programming ideas that come up throughout this material: pointers/references, pass-by-value vs pass-by-reference, objects, structs, allocated memory, and what each of those looks like in these two languages. These ideas appear in many languages, though the details differ.

There are a number of books and online sources on the resources page where you can read about these topics in C and Python. One C tutorial that explains pointers and malloc is learn-c.org; look under Pointers, Structures, and Dynamic Allocation.

pointers

Consider this Python code. What's going on? Run it at pythontutor.com.

a = [1, 2, 3]
b = [1, 2, 3]
print("a is b ? ", a is b)
print("id(a) ", hex(id(a)))
print("id(b) ", hex(id(b)))

c = b
print("id(c) ", hex(id(c)))
print("c is b ? ", c is b)

Python code rarely uses these "identities" directly, except to tell whether or not two variables refer to the same object in memory. (In CPython, id() happens to be the object's memory address.)
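Why does it matter whether two names refer to the same object? Here is a small sketch that extends the example above; the append call is my addition for illustration, not part of the original code. Changing the list through one name shows up under every other name for that same object, but not in a separate list that merely has equal contents.

```python
# Two names for one list versus two separate lists with equal contents.
a = [1, 2, 3]
b = [1, 2, 3]   # a different list object that happens to hold the same values
c = b           # no copy is made: c is another name for the same object as b

c.append(4)     # mutate the list through the name c

print("b :", b)             # the change is visible through b as well
print("a :", a)             # a is a separate object and is untouched
print("a == b ?", a == b)   # == compares contents
print("c is b ?", c is b)   # is compares identity
```

If you actually want an independent copy, something like c = list(b) or c = b.copy() makes a new list.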

But C uses addresses quite a lot. Here are two operators that you may be unfamiliar with: * and & .

// --- ptrs.c ---
#include <stdio.h>
int main(){
  int    a = 1234;
  int    b = 5678;
  int*   a_p = &a;        // pointer to a
  int*   b_p = &b;        // pointer to b
  int**  a_pp = &a_p;     // pointer to pointer to a
  int**  b_pp = &b_p;     // pointer to pointer to b

  printf("--- address and the values within them --- \n");
  printf("&a,     a     : %12p  %14d \n", &a,    a);
  printf("&b,     b     : %12p  %14d \n", &b,    b);
  printf("&a_p,   a_p   : %12p  %12p \n", &a_p,  a_p);
  printf("&b_p,   b_p   : %12p  %12p \n", &b_p,  b_p);
  printf("&a_pp,  a_pp  : %12p  %12p \n", &a_pp, a_pp);
  printf("&b_pp,  b_pp  : %12p  %12p \n", &b_pp, b_pp);
  printf("--- follow the pointer --- \n");
  printf("*a_p   : %d \n", *a_p);
  printf("**a_pp : %d \n", **a_pp);
  printf("--- sizes of things (in bytes) --- \n");
  printf("sizeof(int)   : %zu \n", sizeof(int));     // %zu is the format for size_t
  printf("sizeof(int*)  : %zu \n", sizeof(int*));
  printf("sizeof(int**) : %zu \n", sizeof(int**));

  return 0;
}

Compiling and running that on jupyter.bennington gives

jim@jupyter:~$ gcc ptrs.c -o ptrs
jim@jupyter:~$ ./ptrs
--- address and the values within them ---
&a,     a     : 0x7fff10f261e0            1234
&b,     b     : 0x7fff10f261e4            5678
&a_p,   a_p   : 0x7fff10f261e8  0x7fff10f261e0
&b_p,   b_p   : 0x7fff10f261f0  0x7fff10f261e4
&a_pp,  a_pp  : 0x7fff10f261f8  0x7fff10f261e8
&b_pp,  b_pp  : 0x7fff10f26200  0x7fff10f261f0
--- follow the pointer ---
*a_p   : 1234
**a_pp : 1234
--- sizes of things (in bytes) ---
sizeof(int)   : 4
sizeof(int*)  : 8
sizeof(int**) : 8

Discuss what this is all about.

The notation is set up so that int* i_ptr; and int *i_ptr; are both OK, and mean the same thing: i_ptr is a pointer to (i.e. holds the address of) an integer.

In C, given an array like int numbers[10], the name numbers behaves in most expressions like an int*, a pointer to the array's first element. So numbers[0] is the same as *numbers, and numbers[1] is the same as *(numbers+1). (One exception: sizeof(numbers) gives the size of the whole array, not of a pointer.)

memory allocation

Consider the following Python code.

def foo():
    a = [1,2,3]
    print("id(a) is ", id(a))
    b = [4,5,6]
    return a

aa = foo()
print(aa)
print("id(aa) is ", id(aa))
print(b)

Quick quiz: What happened to the memory where [1,2,3] was? What happened to the memory where [4,5,6] was? How does the program know the difference? Was the whole array returned ... or just a reference to it? (Hint: is id(a) the same as id(aa), or different?)
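One way to check your answers once you've made a guess is a variant of the code above. The names make_lists, kept, and dropped are mine, chosen for illustration. The function hands back its list's id along with the list, so the caller can compare identities, and the attempt to use the other list from outside the function is wrapped in try/except so the script keeps running.

```python
def make_lists():
    kept = [1, 2, 3]
    dropped = [4, 5, 6]      # never returned; no name refers to it after the call
    return kept, id(kept)    # also hand back id(kept) so the caller can compare

aa, id_inside = make_lists()
same_object = (id(aa) == id_inside)
print("same object ?", same_object)   # a reference came back, not a copy

try:
    print(dropped)           # this name only existed inside make_lists
except NameError as err:
    print("NameError:", err)
```

Once nothing refers to the [4,5,6] list any more, Python is free to reclaim its memory; in CPython that happens through reference counting.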

In C, just like in Python, we don't usually copy an array when we return it - instead we return its address. But we also have to make sure that the array still exists after the function returns. (Python does that for you.)

You might try to do this ...

// THIS CODE HAS A SERIOUS BUG IN IT.
int* foo(){
  int a[] = {1,2,3};
  int b[] = {4,5,6};
  return a;     // Here "a" is the same as &a[0], address of array.
}

... but you would be cruising for a bruising. The address where those numbers were stored is returned, but the memory itself is not kept. In C, a function's local variables live only as long as the call does (they sit in its stack frame), and that memory is up for grabs once the function returns.

If in C you want a function that creates and returns a block of memory, you must instead use malloc (i.e. "memory allocate") or a similar library function, which reserves memory in the heap, the region set aside for that process's dynamic memory. The call to malloc returns a pointer to that memory, which stays reserved until you release it with free.

// THIS IS OK.
#include <stdlib.h>                    // for malloc
int* foo(){
  int* a = malloc(3 * sizeof(int));    // space for 3 integers
  int i;
  for (i=0; i<3; i++){ a[i] = i+1; }   // fill with 1,2,3 ; valid indices are 0..2
  return a;                            // the caller should free(a) when done
}

C is a primitive, low-level language. Given only the pointer, you can't even tell how long the array is; its size has to be sent along as another parameter!

In-class exercise ... all of us.

Write and test a Python program that implements a MyArray class
which has get(n), set(n, value), and length() methods.

Write and test a short C program that implements a similar myarray struct
with functions new_myarray(n), get(n, array), set(n, value, array).

I've attached myarray.py and myarray.c, the two files
that we created and ran.
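For reference, here is one possible shape for the Python half of the exercise. It is a minimal sketch and not necessarily what the attached myarray.py contains: the fill argument, the _check helper, and the choice to raise IndexError are my assumptions. The length is stored explicitly, which mirrors what the C struct has to do.

```python
class MyArray:
    """A fixed-length array with get, set, and length methods."""

    def __init__(self, n, fill=0):
        if n < 0:
            raise ValueError("length must be non-negative")
        self._n = n                  # remember the length, as a C struct would
        self._data = [fill] * n      # underlying storage

    def _check(self, i):
        # Reject indices outside 0 .. n-1 (no negative indexing here).
        if not 0 <= i < self._n:
            raise IndexError(f"index {i} out of range for length {self._n}")

    def get(self, i):
        self._check(i)
        return self._data[i]

    def set(self, i, value):
        self._check(i)
        self._data[i] = value

    def length(self):
        return self._n


arr = MyArray(3)
arr.set(1, 42)
print(arr.get(0), arr.get(1), arr.length())
```

A C version would presumably keep the same two pieces of information, a length and a pointer to malloc'ed storage, inside the struct.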

Depending on how much time we have, return to Monday's notes to discuss

asides

https://cs.bennington.college/courses/spring2021/algorithms/notes/pointers
last modified Thu March 4 2021 5:55 pm

attachments

file        last modified              size
myarray.c   Thu Mar 04 2021 05:55 pm   1.0K
myarray.py  Thu Mar 04 2021 05:55 pm   1.2K