All notes
Programming

Concepts

Unwinding The Stack

c2.com: unwinding the stack.

Unwinding the stack is the restoration/cleanup of ActivationRecords that occur when control is transferred from one activation record to another; usually from the callee's activation record to the caller's.

When a function calls another:

When the function returns, the opposite things have to happen:

Problems in C++

One longstanding issue with C/C++ and modern optimizing compilers is the interaction of setjmp/longjmp (and asynchronous signal handlers) with stack unwinding. Calling longjmp causes a "long distance" transfer of flow control, consuming many stack frames. Return values are a non-issue with longjmp(), but correct destructor semantics and register un-spilling are problems, as longjmp bypasses the compiler-emitted code to do this.

In C++, the presence of destructors especially complicates this. However, C++ has a feature--ExceptionHandling--that largely replaces setjmp/longjmp() (the one issue that remains is calling longjmp() in a signal handler); so use of setjmp/longjmp in C++ is generally discouraged.

Activation Records

c2.com: activation record.

Activation Record: A data structure containing the important state information for a particular instance of a function call (or something that resembles a function call). May exist entirely in memory, or be partially stored in registers.

Typically, an ActivationRecord contains:

In many (most) programming languages, ActivationRecords maintain a strict LastInFirstOut (LIFO) discipline and are thus stored on a stack (often referred to as TheStack). For this reason, ActivationRecords are also commonly known as stack frames.

Limitations

It can't support the following features (which violate LIFO discipline).

In languages with these, other arrangements are often used, such as ContinuationImplementation.

Continuations

Definition

A continuation reifies the program control state, i.e. the continuation is a data structure that represents the computational process at a given point in the process's execution; the created data structure can be accessed by the programming language, instead of being hidden in the runtime environment.

Reification
[,riefi'keishen] 具体化。 The process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language.

Continuations are useful for encoding other control mechanisms in programming languages such as exceptions, generators, coroutines, and so on.

First-class continuations

The term continuations can also be used to refer to first-class continuations, which are constructs that give a programming language the ability to save the execution state at any point and return to that point at a later point in the program, possibly multiple times.

Uses

Continuations simplify and clarify the implementation of several common design patterns including coroutines/green threads and exception handling, by providing the basic, low-level primitive which unifies these seemingly unconnected patterns.

The Smalltalk Seaside web framework switches continuations when switching pages. The use of continuations shields the programmer from the stateless nature of the HTTP protocol. Other framworks include: Continuity for Perl, Wee for Ruby, Nagare framework for Python, Wt for C++, and the Apache Cocoon Web application framework also provides continuations.

In C, longjmp can be used to jump from the middle of one function to another, provided the second function lies deeper in the stack.

More examples include: coroutines in Simula 67, Lua, and Perl; tasklets in Stackless Python; generators in Icon and Python; continuations in Scala (starting in 2.8); fibers in Ruby (starting in 1.9.1); the backtracking mechanism in Prolog; monads in functional programming; and threads.

Disadvantages

Continuations are the functional expression of the GOTO statement, and the same caveats apply.

CoRoutine

Definition

Coroutines are functions or procedures that save control state between calls (as opposed to, but very similar to, Generators, such as Random Number Generators, that save data state between calls).

The most common example is a lexical analyser that behaves differently on different calls because it is tracking e.g. whether it is currently inside a function definition or outside a function definition.

Coroutines are very important as one of the very few examples of a form of concurrency that is useful, and yet constrained enough to completely avoid the typical difficulties (race conditions, deadlock, etc). Synchronization is built-in to the paradigm.

Lex and GNU Flex implement coroutine-based lexical analyzers, as can be seen by their support for entering and leaving user-defined states that allow for state-dependent pattern recognition.

It's worth noting that any direct implementation of a state machine simulates coroutines, or you could say that coroutines directly implement state machines, indicating that they are theoretically both fundamental and well-behaved.

Virtual method table

Definition

It is a mechanism used in a programming language to support dynamic dispatch (or run-time binding, late binding).

Whenever a class defines a virtual function, most compilers add a hidden member variable to the class which points to an array of pointers to (virtual) functions. These pointers are used at runtime to invoke the appropriate function implementations.

Acronyms: VMT, vtable, or vftable. Other names: dispatch table, virtual function table.

Implementation

Class's own vtable

Every class (base or derived) with virtual functions (or is derived from a class that uses virtual functions) is given it’s own virtual table.

This table is simply a static array that the compiler sets up at compile time. A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry simply points to the most derived version of the functions.

*__vptr

In the base class, the compiler also adds a hidden pointer (and thus inherited by derived classes), which we will call *__vptr.

*__vptr is set (automatically) when a class instance is created so that it points to the virtual table for that class.

Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer. Consequently, it makes each class object allocated bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.

To summarize, it’s really quite simple: the *__vptr in each class points to the vtable for that class. The entries in the vtable point to the most-derived version of the function objects of that class are allowed to call.


class Base
{
public:
    FunctionPointer *__vptr; // Added by the compiler.
    virtual void function1() {};
    virtual void function2() {};
};
 
class D1: public Base
{
public:
    virtual void function1() {};
};
 
class D2: public Base
{
public:
    virtual void function2() {};
};

int main()
{
    D1 cClass;
    Base *pClass = &cClass; // pClass is a base pointer which only points to the Base portion of cClass. However, note that *__vptr is in the Base portion of the class, so pClass has access to this pointer. Finally, pClass->__vptr points to the D1 virtual table! Consequently, even though pClass is of type Base, it still has access to D1’s virtual table.
    pClass->function1(); // It looks up which version of function1() to call in D1’s virtual table. This has been set to D1::function1(). Therefore, pClass->function1() resolves to D1::function1()!
}

Disadvantages

It takes 3 steps (use the __vptr pointer to find vtable, look up in the vtable, direct function call) in virtual function call. Indirect funtion call by function pointer costs 2 steps. Thus virtual function is slower. But the cost could be ignored.

Memory allocation

Virtual memory

Virtual memory systems separate the memory addresses used by a process from actual physical addresses, allowing separation of processes and increasing the effectively available amount of RAM using paging or swapping to secondary storage.

The quality of the virtual memory manager can have an extensive effect on overall system performance.

The Stack

Advantages:

Another feature of the stack is that there is a limit (varies with OS) on the size of variables that can be store on the stack.

The Heap

Unlike the stack, variables created on the heap are accessible by any function, anywhere in your program. Heap variables are essentially global in scope.