I was looking for a computer science book to read and learn more about the fundamentals, when I luckily came across the book mentioned in the title, shortly named SICP. The authors Harold Abelson and Gerald J. Sussman (with Julie Sussman), who are MIT professors, used this as a textbook for MIT introductory programming courses. The guide is also currently being taught in Berkeley EECS department, as the course CS61A. It is available for free and also can be purchased in amazon, where it has some encouraging reviews.
The book has opened my mind in terms of programming, I feel like there are more doors in the path to solve a computer exercise writing maintainable and powerful code, and I have not even finished the book yet! The following ideas are going to show you why. Let us do a review of what I found to be the most significant aspects of chapter 1: Building abstraction with procedures.
Introduction to Scheme
The book is based on Scheme programming language, which comes from the second oldest programming language in widespread use: Lisp. Here are some examples of expressions in an Scheme interpreter:
>(+ 137 349) 486 >(- 1000 334) 666 >(* 5 10 2) 100 >(+ (* 3 5) (- 10 6) -9 1) ; nested combinations 11
The group of elements inside a parenthesis are called combinations, and the leftmost element in the combination is called the operator (or procedure). This convention is called prefix notation. We can also name variables and procedures as follows:
; variables ; (define <name> (<body>)) >(define size 2) >size 2 >(define pi 3.14159) >(define radius 10) >(define area (* pi (* radius radius))) >area 314.159 ; procedures ; (define (<name> <parameters>) (<body>)) >(define (square x) (* x x)) >(square 4) 16 >(define (sum-squares x y) (+ (square x) (square y)) ) >(sum-squares 3 4) 25
When the interpreter evaluates a combination, it does the following:
- Evaluates the subexpressions of the combination
- Applies the procedure (or operator) to the arguments
For example, applying these rules to the following combination we obtain the following steps:
(+ (square 6) (square 10)) (+ (* 6 6) (* 10 10)) (+ 36 100) 136
This process is called the substitution model, it is a way to understand the steps that the interpreter does to solve a combination. This tool is useful to analyze the shapes and steps of procedures, terms that we will review later.
We can also perform different operations depending on the result of a test, with conditional expressions using cond and if:
>(define (abs x) (cond ((> x 0) x) ((< x 0) (- x)) ((= x 0) 0))) >(abs (- 5 10)) 5 >(define (abs x) (if (< x 0) (- x) x)) >(abs -10) 10
Procedures and the Processes they generate
In this section the book explains us that it is important to visualize the processes generated by various types of procedures, in order to predict the consequences of executing a procedure, otherwise we would be like chess players that know how to move the pieces but none about strategy and tactics. We are going to describe processes in terms of their shape and the amount of steps it takes to reach a result.
For example, let the following procedure for computing factorials:
>(define (factorial n) (if (= n 1) 1 (* n (factorial (- n 1)))))
It is a recursive procedure, meaning that it is defined in terms of itself. Using the substitution model we can watch this procedure’s shape as shown next:
(factorial 4) (* 4 (factorial 3)) (* 4 (* 3 (factorial 2))) (* 4 (* 3 (* 2 (factorial 1)))) (* 4 (* 3 (* 2 1))) (* 4 (* 3 2)) (* 4 6) 24 ; shape of expansion followed by contraction
Similarly, we can recast factorial as the following procedure:
(define (factorial n) (define (iter counter result) (if (> counter n) result (iter (+ counter 1) (* counter result)) ) ) (iter 1 1))
Which produces this shape using the substitution model:
(factorial 4) (iter 1 1) (iter 2 1) (iter 3 2) (iter 4 6) (iter 5 24) 24 ; constant space shape
Let us compare the two processes. Both compute the same result and each require a number of steps proportional to n. However, the first procedure of factorial produces a chain of deferred operations, which produces a shape of expansion followed by contraction. The length of the chain of deferred multiplications (the amount of information needed to keep track of) grows linearly with n in this example, just like the number of steps. That is why this kind of process is called a linear recursive process.
On the other hand, the second process has a constant space shape, this means that the information needed to keep track of does not increase. The process just needs to keep track of the variables product, counter and n, called the state variables. This kind of process is called iterative process, it is described by its state variables and a fixed rule that updates the state variables until the final state, reaching the result.
One does not have to confound recursive process with recursive procedure. A recursive procedure is a procedure defined in terms of itself, but when we talk about a recursive process we are describing the way the process evolves. In fact, our second definition of factorial uses a recursive procedure iter, which produces an iterative process.
Another kind of recursive process is the tree recursion. For example, we have the following numbers sequence:
0, 1, 1, 2, 3, 5, 8, 13, 21….
This is the Fibonacci sequence of numbers, which follows the rule that the next number is the sum of the two previous numbers. We can easily define the sequence as a recursive procedure:
(define (fib n) (cond ((= n 0) 0) ((= n 1 ) 1) (else (+ (fib (- n 1)) (fib (- n 2)))) ) )
This procedure works, but it is not efficient. The number of steps grows exponentially as n grows, and the shape of the process (the tree) grows linear with n, because the interpreter needs to keep track only of the nodes above the node of computation. Notice that, for instance, to compute (fib 5) the procedure computes (fib 4) and (fib 3), and to compute (fib 4) it computes (fib 3) and (fib 2). This means that (fib 3) is computed twice, leading to redundant computation (same occurs with (fib 2)). Figure 1 illustrates the shape of the process, where we can visualize (fib 3) being computed twice.
Visualizing processes like this helps us to determine if it is going to be efficient or not. In this example, using the notions of shape and steps we could decide that this is not an adequate procedure to compute Fibonacci numbers (look for exercise 1.19 in the book to appreciate the efficient implementation). We must not conclude that tree recursive processes are useless. They are a powerful tool for operating hierarchically data structures (SICP page 50) .
Procedures as Black-Box Abstractions
Let us imagine a black box as something that receives an input and gives an output, but we do not want to know what is inside the box. To view a procedure as a black box means that we have to forget about the implementation of the procedure and just focus on what it computes (the arguments and the output). We must forget or suppress the details of the implementation because, when building more complex procedures, we are not concerned with the details of the lower procedures. This enables us to build complex procedures based on identifiable, reusable procedures. “A user should not need to know how the procedure is implemented in order to use it” (SICP page 35).
As an example, consider the following implementation of the square root:
(define (average x y) (/ (+ x y) 2)) (define (sqrt n) (define (good-enough? guess) (< (abs (- (square guess) n)) 0.001) ) (define (improve guess) (average guess (/ n guess)) ) (define (sqrt-iter guess) (if (good-enough? guess) guess (sqrt-iter (improve guess)) ) ) (sqrt-iter 1.0) )
We defined three procedures inside sqrt definition: good-enough?, improve and sqrt-iter. Notice that good-enough? is built on top of the procedures abs and square, which were defined previously. Does it matter their implementation at this point? No, we just care about their computation, we are using them as black-boxes. Same goes for improve, which is built on top of average. If the programmer that defined average finds a better algorithm and redefines the procedure, it doesn’t modify the definition of improve, because average will return the same output.
High-Order Procedures and First-Class status
Sometimes we are building procedures that share the same programming patterns. Consider the following procedure that computes the sum of the squares in a given range:
(define (sum-squares a b) (if (> a b) 0 (+ (square a) (sum-squares (+ a 1) b))))
And also consider the following procedure that computes the sum of a sequence of terms in the series:
1, 5, 9, 13, 17, 21, …
(define (sum-series a b) (if (> a b) 0 (+ (- (* 2 a) 1) (sum-series (+ a 2) b))))
These two procedures share a pattern, that can be shown like the following procedural template:
(define (<name> a b) (if (> a b) 0 (+ (<term> a) (<name> (<next> a) b))))
How can we build an abstraction based on this template, in order to deal with summation and its variations? We would like to give this template a name, and be able to give it procedures as arguments, do not we?. In other words, we want a procedure that accepts other procedure as argument and returns a result. Is that possible in any programming language? Not in every language, but it is becoming more and more common. Well, Scheme has this wonderful capability of passing procedures as arguments, so let us define a sum procedure to build the previous abstraction:
(define (sum term next a b) (if (> a b) 0 (+ (term a) (sum term next (next a) b))))
term and next are procedures that determine the computation of the current term and the index of the next term, respectively. Now that we implemented the summation as a procedure, let use it as a black-box and redefine the initial sums:
(define (sum-squares a b) (sum square 1+ a b)) (define (sum-series a b) (sum (lambda(x) (- (* 2 x) 1)) (lambda(x) (+ x 2))) a b))
Beautiful and concise, isn’t it? Notice that the procedure 1+ is a primitive operator that increases a number by 1. Also notice that in the definition of sum-series we used lambda expressions, also called anonymus functions. They are useful to save time and space if one does not want to bound a procedure to a name. To be clear, the following definition of sum-series is identical in terms of results, without using lambda:
(define (term x) (- (* 2 x) 1)) (define (inc_2 x) (+ x 2)) (define (sum-series a b) (sum term inc_2 a b))
Scheme awards procedures full first-class status, this means that procedures do have the following privileges:
- They may be named by variables
- They may be passed as arguments to procedures
- They may be returned as the results of procedures
- They may be included in data structures (see Chapter 2)
We have not seen yet how a procedure returns another procedure and why this could be useful, the last example is going to illustrate this.
In order to conclude, I am going to define again the square root sqrt procedure in terms of other high-order procedures (fixed-point, iterative-improvement, average-damp), and you will see how powerful is to build abstractions (compare this with the previous definition of sqrt). By reading this post I hope you got a sense about the abstractions that can be built with procedures and how enlightening is this book.
(define (iterative-improvement good-enough? improve-guess) (define (iter guess) (if (good-enough? guess) guess (iter (improve-guess guess)))) iter) (define tolerance 0.0001) (define (fixed-point f first-guess) ((iterative-improvement (lambda (y) (< (abs (- (f y) y)) tolerance)) f) first-guess)) (define (average-damp f) (lambda (x) (/ (+ x (f x)) 2))) (define (sqrt x) (fixed-point (average-damp (lambda (y) (/ x y))) 1.0))