
# Concepts

## Unwinding The Stack

Unwinding the stack is the restoration and cleanup of ActivationRecords that occurs when control is transferred from one activation record to another, usually from the callee's activation record back to the caller's.

When a function calls another:

• The actual parameters to the called function must be put in the proper place (assume that they have already been computed)
• If local variables are cached in registers, they must be spilled (saved to TheStack or somewhere else in memory) so that the registers are available for the called function to use.
• In languages like C++ which allow objects on TheStack, constructors must be invoked.

When the function returns, the opposite things have to happen:

• The return value or thrown exception (if any) must be placed in the proper place.
• The previously spilled registers in the caller must be unspilled (put back in registers) so that subsequent computation in the caller can use them.
• In C++, destructors for stack objects must be invoked.
• In languages with UnwindProtect or similar ("finally" blocks in Java or C#), these clauses must be executed.

### Problems in C++

One longstanding issue with C/C++ and modern optimizing compilers is the interaction of setjmp/longjmp (and asynchronous signal handlers) with stack unwinding. Calling longjmp causes a "long distance" transfer of flow control, discarding many stack frames at once. Return values are a non-issue with longjmp(), but correct destructor semantics and register un-spilling are problems, as longjmp bypasses the compiler-emitted code that handles them.

In C++, the presence of destructors especially complicates this. However, C++ has a feature, ExceptionHandling, that largely replaces setjmp/longjmp (the one remaining issue is calling longjmp from a signal handler), so use of setjmp/longjmp in C++ is generally discouraged.

## Activation Records

Activation Record: A data structure containing the important state information for a particular instance of a function call (or something that resembles a function call). May exist entirely in memory, or be partially stored in registers.

Typically, an ActivationRecord contains:

• A pointer to the ActivationRecord of the caller (commonly known as the DynamicChain). ContinuationPassingStyle differs: there, the caller is reached via the supplied continuation instead.
• In languages which allow nested LexicalScoping, a pointer to the ActivationRecord of the enclosing scope (aka the ReferencingEnvironment); this pointer is called the StaticChain. C and C++ disallow nested function definitions, so they need no StaticChain.
• Local variables, and actual parameters to the function. They are both frequently cached in registers.
• Space for the return value of the function.

In many (most) programming languages, ActivationRecords maintain a strict LastInFirstOut (LIFO) discipline and are thus stored on a stack (often referred to as TheStack). For this reason, ActivationRecords are also commonly known as stack frames.

### Limitations

A strict LIFO stack cannot support the following features, which violate the LIFO discipline:

• FirstClass LexicalClosures (a closure can outlive its referencing environment).
• Continuations, excluding UpwardContinuation.
• CoRoutines, BlocksInSmalltalk, and a few other control structures that are found in various languages.

In languages with these, other arrangements are often used, such as ContinuationImplementation.

## Continuations

### Definition

A continuation reifies the program control state, i.e. the continuation is a data structure that represents the computational process at a given point in the process's execution; the created data structure can be accessed by the programming language, instead of being hidden in the runtime environment.

Reification: the process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language.

Continuations are useful for encoding other control mechanisms in programming languages such as exceptions, generators, coroutines, and so on.

### First-class continuations

The term continuation can also be used to refer to first-class continuations: constructs that give a programming language the ability to save the execution state at any point and return to that point later in the program, possibly multiple times.

### Uses

Continuations simplify and clarify the implementation of several common design patterns including coroutines/green threads and exception handling, by providing the basic, low-level primitive which unifies these seemingly unconnected patterns.

The Smalltalk Seaside web framework switches continuations when switching pages; this use of continuations shields the programmer from the stateless nature of the HTTP protocol. Other frameworks include Continuity for Perl, Wee for Ruby, the Nagare framework for Python, and Wt for C++; the Apache Cocoon Web application framework also provides continuations.

In C, longjmp can be used to jump from the middle of one function back into another, provided the target setjmp was executed by a function that is still active on the stack (longjmp can only discard frames, never recreate them); this acts as an escape (upward) continuation.

More examples include: coroutines in Simula 67, Lua, and Perl; tasklets in Stackless Python; generators in Icon and Python; continuations in Scala (starting in 2.8); fibers in Ruby (starting in 1.9.1); the backtracking mechanism in Prolog; monads in functional programming; and threads.

Continuations are the functional expression of the GOTO statement, and the same caveats apply.

## CoRoutine

### Definition

Coroutines are functions or procedures that save control state between calls (as opposed to, but closely related to, Generators, such as random number generators, which save data state between calls).

The most common example is a lexical analyser that behaves differently on different calls because it is tracking e.g. whether it is currently inside a function definition or outside a function definition.

Coroutines are very important as one of the very few examples of a form of concurrency that is useful, and yet constrained enough to completely avoid the typical difficulties (race conditions, deadlock, etc). Synchronization is built-in to the paradigm.

Lex and GNU Flex implement coroutine-based lexical analyzers, as can be seen by their support for entering and leaving user-defined states that allow for state-dependent pattern recognition.

It's worth noting that any direct implementation of a state machine simulates coroutines (equivalently, coroutines directly implement state machines), indicating that they are theoretically both fundamental and well-behaved.

## Virtual method table

### Definition

It is a mechanism used in a programming language to support dynamic dispatch (or run-time binding, late binding).

Whenever a class defines a virtual function, most compilers add a hidden member variable to the class which points to an array of pointers to (virtual) functions. These pointers are used at runtime to invoke the appropriate function implementations.

Abbreviations: VMT, vtable, or vftable. Other names: dispatch table, virtual function table.

### Implementation

#### Class's own vtable

Every class (base or derived) that declares virtual functions, or is derived from a class that does, is given its own virtual table.

This table is simply a static array that the compiler sets up at compile time. A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry simply points to the most derived version of that function.

#### *__vptr

In the base class, the compiler also adds a hidden pointer member (which is thus inherited by derived classes); we will call it *__vptr.

*__vptr is set (automatically) when a class instance is created so that it points to the virtual table for that class.

Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer member. Consequently, it makes each allocated class object bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.

To summarize, it's really quite simple: the *__vptr in each class points to the vtable for that class. The entries in the vtable point to the most-derived versions of the functions that objects of that class are allowed to call.


```cpp
typedef void (*FunctionPointer)(); // illustrative only: the real __vptr is invisible

class Base
{
public:
    FunctionPointer *__vptr; // hidden member added by the compiler
    virtual void function1() {};
    virtual void function2() {};
};

class D1: public Base
{
public:
    virtual void function1() {};
};

class D2: public Base
{
public:
    virtual void function2() {};
};

int main()
{
    D1 cClass;
    Base *pClass = &cClass; // a base pointer to the Base portion of cClass
    pClass->function1();    // resolves through the vtable to D1::function1()
}
```

pClass is a base pointer, so it only points to the Base portion of cClass. However, *__vptr lives in that Base portion, so pClass has access to it, and pClass->__vptr points to D1's virtual table. Consequently, even though pClass is of type Base, it still has access to D1's virtual table. The call pClass->function1() therefore looks up function1() in D1's virtual table, whose entry has been set to D1::function1(), so pClass->function1() resolves to D1::function1()!

A virtual function call takes 3 steps (use the __vptr pointer to find the vtable, look up the function in the vtable, then make an indirect call), whereas an indirect call through a plain function pointer takes 2 steps and a direct call takes 1. Virtual functions are therefore slightly slower, but the overhead is usually negligible.

# Memory allocation

## Virtual memory

Virtual memory systems separate the memory addresses used by a process from actual physical addresses, allowing separation of processes and increasing the effectively available amount of RAM using paging or swapping to secondary storage.

The quality of the virtual memory manager can have an extensive effect on overall system performance.