All notes
Os

# Basics

## Endianness

"Little Endian" means that the low-order byte of the number is stored in memory at the lowest address - The little end comes first. For example, a 4 byte LongInt Byte3 Byte2 Byte1 Byte0 will be arranged in memory as follows:

// Little Endian.
// Big Endian.


Intel processors (those used in PC's) use "Little Endian" byte order, Motorola processors (those used in Mac's) use "Big Endian" byte order.

### Which is Better?

In "Little Endian" form, assembly language instructions for picking up a 1, 2, 4, or longer byte number proceed in exactly the same way for all formats: first pick up the lowest order byte at offset 0.

Also, because of the 1:1 relationship between address offset and byte number (offset 0 is byte 0), multiple precision math routines are correspondingly easy to write.

In "Big Endian" form, by having the high-order byte come first, you can always test whether the number is positive or negative by looking at the byte at offset zero.

The numbers are also stored in the order in which they are printed out, so binary to decimal routines are particularly efficient.

The Windows .BMP format, since it was developed on a "Little Endian" architecture, insists on the "Little Endian" format.

Adobe Photoshop, JPEG use Big Endian.

### Conversion

Function Reverse (N:LongInt) : LongInt ;
Var B0, B1, B2, B3 : Byte ;
Begin
B0 := N Mod 256 ;
N  := N Div 256 ;
B1 := N Mod 256 ;
N  := N Div 256 ;
B2 := N Mod 256 ;
N  := N Div 256 ;
B3 := N Mod 256 ;
Reverse := (((B0 * 256 + B1) * 256 + B2) * 256 + B3) ;
End ;

// A more efficient version that depends on the presence of hexadecimal numbers, bit masking operators AND, OR, and NOT, and shift operators SHL and SHR might look as follows:

Function Reverse (N:LongInt) : LongInt ;
Var B0, B1, B2, B3 : Byte ;
Begin
B0 := (N AND $000000FF) SHR 0 ; B1 := (N AND$0000FF00) SHR  8 ;
B2 := (N AND $00FF0000) SHR 16 ; B3 := (N AND$FF000000) SHR 24 ;
Reverse := (B0 SHL 24) OR (B1 SHL 16) OR (B2 SHL 8) OR (B3 SHL 0) ;
End ;


### Examples


#include <stdio.h>

/* function to show bytes in memory, from location start to start+n*/
void show_mem_rep(char *start, int n)
{
int i;
for (i = 0; i < n; i++)
printf(" %.2x", start[i]);
printf("
");
}

/*Main function to call above function for 0x01234567*/
int main()
{
int i = 0x01234567;
show_mem_rep((char *)&i, sizeof(i));
getchar();
return 0;
}
// Little Endian output: 67 45 23 01

A quick way to determine endianness of your machine:


#include <stdio.h>
int main()
{
unsigned int i = 1;
char *c = (char*)&i;
if (*c)
printf("Little endian");
else
printf("Big endian");
getchar();
return 0;
}

Since size of character is 1 byte when the character pointer is de-referenced it will contain only first byte of integer. If machine is little endian then *c will be 1 (because last byte is stored first) and if machine is big endian then *c will be 0.

### History of Endian-ness

Where does this term "endian" come from? Jonathan Swift was a satirist (he poked fun at society through his writings). His most famous book is "Gulliver's Travels", and he talks about how certain people prefer to eat their hard boiled eggs from the little end first (thus, little endian), while others prefer to eat from the big end (thus, big endians) and how this lead to various wars.

### Misconceptions on registers

However, if you have a 32 bit register storing a 32 bit value, it makes no sense to talk about endianness. The register is neither big endian nor little endian. It's just a register holding a 32 bit value. The rightmost bit is the least significant bit, and the leftmost bit is the most significant bit. There's no reason to rearrange the bytes in a register in some other way.

## Number conversion

• Decimal to Other Base System
• Other Base System to Decimal
• Other Base System to Non-Decimal
• Shortcut method - Binary to Octal
• Shortcut method - Octal to Binary
• Shortcut method - Binary to Hexadecimal
• Shortcut method - Hexadecimal to Binary

### Decimal to Other Base System

 

Steps	Operation	Result	Remainder  Re-interp
Step 1	29 / 2		14		1          1+2*14
Step 2	14 / 2		7		0          1+2*(7*2+0)
Step 3	7 / 2		3		1          1+2*((3*2+1)*2+0)
Step 4	3 / 2		1		1          1+2*(((1*2+1)*2+1)*2+0)
Step 5	1 / 2		0		1          1+1*2^4+1*2^3+1*2^2+0*2

  NOTE: The remainders have to be arranged in the reverse order so that the first remainder becomes the least significant digit (LSD) and the last remainder becomes the most significant digit (MSD).

Decimal Number: $29_{10}$ = Binary Number : $11101_2$.

### Other base system to Decimal System

$11101_2 = ((1 \times 2^4) + (1 \times 2^3) + (1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0))_{10} = 29_{10}$

### Other base system to non-Decimal System

1. Convert the original number to a decimal number (base 10).
2. Convert the decimal number so obtained to the new base number.

### Shortcut method - Binary to Octal

1. Divide the binary digits into groups of three (starting from the right):
2. Convert each group of three binary digits to one octal digit.

$10101_2$
$010_2\, 101_2$
$2_8\, 5_8$
$25_8$

### Shortcut method - Octal to Binary

1. Convert each octal digit to a 3 digit binary number (the octal digits may be treated as decimal for this conversion).
2. Combine all the resulting binary groups (of 3 digits each) into a single binary number.

$25_8$
$2_{10} 5_{10}$
$010_2\, 101_2$
$010101_2$.

### Shortcut method - Binary to Hexadecimal

1. Divide the binary digits into groups of four (starting from the right).
2. Convert each group of four binary digits to one hexadecimal symbol.

$10101_2$
$0001_2\, 0101_2$
$1_{10}\, 5_{10}$
$15_{16}$

### Shortcut method - Hexadecimal to Binary

$15_{16}$
$1_{10}\, 5_{10}$
$0001_2\, 0101_2$
$00010101_2$

### Exams

#### Hex to Bytes

$$0x7F = 7\times 16 + 15 = 127$$ $$0xFE = 15\times 16 + 14 = 254$$ $$0xFFFE = 255 254$$

## Two's complement

First of all, we should note that a complement system is an operation on a binary number. And in a signed data representation system, these operations are used to obtain a negative number from a positive number. There are the following two operations.

One's complement of a binary number is simply its binary inversion of all its bits, e.g., one's complement of 1001001 is 0110110. This system has a disadvantage that, one's complement of 0000 0000 is 1111 1111, which indicates there is a 0 and a -0 within this system. Therefore, the one's complement system could only represent a value ranging from $[-2^{N-1}+1, 2^{N-1}-1 ]$, where $N$ is bit number. For example, a 8-bit data could have value within $[-127 127]$.

Two's complement of a binary number consists of two steps:

1. The first step is right the one's complement.
2. The second step is add one to the result.

We could see that, two's complement of "0000 0000" is "1 0000 0000", and if we omit the overflowed highest bit, the two's complement of 0 is just 0. Perfect! For this reason, the two's complement system could represent a value ranging from $[-2^{N-1}, 2^{N-1}-1 ]$. No wonder it is now the prevalent choice for most systems.

A question: char v=128. What is the real value of v?
128 in unsigned 8 bit representation is 1000 0000 ($2^7$, 1 followed by 7 0's). Now the char type $v$ sees it as a negative number since the highest bit is 1. Using two's complement backwards on 1000 0000 (we should still keep the highest bit. Don't omit it just because it only represents the sign, otherwise we will wrongly get 0) - minus 1 from it (get 0111 1111) and then do binary inversion, we could get the absolute value of this negative number: 128 (1000 0000)! So in the end, the v has a value of -128.

## Processes

SegmentFalu.t 每个进程有独立的地址空间，资源句柄

# Composition

## Memory

### RAM

Resident Set Size (RSS) is the portion of memory occupied by a process that is held in main memory (RAM). The rest of the occupied memory exists in the swap space or file system, either because some parts of the occupied memory were paged out, or because some parts of the executable were never loaded.

Paging is a memory management scheme. In this scheme, the operating system retrieves data from secondary storage in same-size blocks called pages. The secondary storage (e.g. hard disk) is used to let programs exceed the size of available physical memory (RAM).

So paging out means the OS allocate secondary memories in page for current program use.

### File system

#### Block/Cluster size

For ext2 or ext3, each file occupies one or multiple blocks (not in half or part). All blocks have the same size, usually 1024, 2048 or 4096 bytes. That block size is what you specify with mke2fs -b. A file of size 2049 bytes will occupy 2 blocks in a file system with 2048 block size.

The FAT filesystem used in particular by MS-DOS and early versions of Windows has a similarly simple space allocation called clusters. The concept is the same with blocks.

Some filesystems (Reiserfs and Btrfs) have a more sophisticated allocation scheme: they have fixed-size blocks, but can use the same block to store the last few bytes of more than one file. This is known as block suballocation.

##### Block size in Unix Utilities

Unix utilities often use the word "block" to mean an arbitrarily-sized unit, typically 512 bytes or 1kB, for a historical reason - disks and filesystems at the old time often operated in 512B chunks.

Use du -B or df -B to check the block size.

The following file is of 13B, and the file system is in 2048B (which corresponds to 4 blocks since "stat" uses 512B for block size).


stat test
# File: test'
# Size: 13            Blocks: 4          IO Block: 2048   regular file
# Device: 700h/1792d  Inode: 15          Links: 1
`
##### Sector size in disks

Most disks present an interface that shows the disk as a bunch of sectors. The disk can only write or read a whole sector, not individual bits or bytes. Most hard disks have 512-byte sectors, though 4kB-sector disks started appearing a couple of years ago.

The disk sector size is not directly related to the filesystem block size, but having a block be a whole number of sectors is better for performance.

# Concepts

## Turing complete

A problem is said to be Turing-complete if it can only be solved by a Turing machine or any system that is TuringEquivalent.

A given programming language is said to be Turing-complete if it can be shown that it is computationally equivalent to a Turing machine. That is, any problem that can be solved on a Turing machine using a finite amount of resources (i.e., time and tape), can be solved with the other language using a finite amount of its resources.

Typically, one proves a given language being Turing-complete by providing a recipe for translating any given Turing machine program into an equivalent program in the language in question. Alternately, one can provide a translation scheme from another language, one that has already been proven to be Turing-complete.

Nearly every existing computer language is Turing-complete.

## Kernel mode, User mode

In Kernel mode, the executing code has complete and unrestricted access to the underlying hardware. It can execute any CPU instruction and reference any memory address. Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system. Crashes in kernel mode are catastrophic; they will halt the entire PC.

In User mode, the executing code has no ability to directly access hardware or reference memory. Code running in user mode must delegate to system APIs to access hardware or memory. Due to the protection afforded by this sort of isolation, crashes in user mode are always recoverable. Most of the code running on your computer will execute in user mode.

How the switch occurs. The switch from user mode to kernel mode is not done automatically by CPU. CPU is interrupted by interrupts (timers, keyboard, I/O). When interrupt occurs, CPU stops executing the current running program, switch to kernel mode, executes interrupt handler. This handler saves the state of CPU, performs its operations, restore the state and returns to user mode.