Network

OSI layer models

OSI: Open Systems Interconnection. MTU: maximum transmission unit.

From lower to higher:

  1. Physical layer. It is concerned with the transmission and reception of the unstructured raw bit stream over a physical medium.
    twisted pair, coaxial cable, optical fiber, wireless.
    Ethernet, DSL, Bluetooth, USB.
  2. Data link layer. It provides (virtually) error-free transfer of data frames from one node to another over the physical layer.
    It is divided into two sub layers:
    • MAC, media access control, controls how a computer on the network gains access to the medium and permission to transmit on it.
    • LLC, Logical link control, controls frame synchronization, flow control and error checking. Provides multiplexing mechanisms that make it possible for several network protocols such as IP, IPX, Decnet and Appletalk to coexist within a multipoint network and to be transported over the same network medium.
    Includes: ARP, L2TP, PPP, IEEE 802.5/ 802.2, IEEE 802.3/802.2.
  3. Network layer: it takes all routing decisions, dealing with end to end data transmission, such as the operation of the subnet, deciding which physical path the data should take based on network conditions, priority of service, and other factors.
    IP, IPsec, ICMP, AppleTalk.
  4. Transport layer: TCP, UDP, SPX.
  5. Session layer: controls the dialogues (connections) between computers.
    NFS, RPC, SOCKS, PPTP, SPDY, NetBIOS, Named pipe.
  6. Presentation layer: defines and encrypts/decrypts data types from the application layer.
    MIME, XDR, MPEG, GIF, etc.
  7. Application layer: keeps track of how each application talks to another application.
    HTTP, DHCP, Gopher, DNS.

Data link Layer, L2

ARP

The Address Resolution Protocol (ARP) is a telecommunication protocol used for resolution of network layer addresses into link layer addresses, a critical function in multiple-access networks.
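
To make the resolution step concrete, here is a minimal sketch of the 28-byte ARP request used on Ethernet/IPv4 networks. The function name build_arp_request and the constant ARP_LEN are made up for this note; the sketch only fills the ARP payload and does not build the surrounding Ethernet frame or send anything.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { ARP_LEN = 28 };  /* size of an Ethernet/IPv4 ARP payload */

/* Write an ARP "who-has" request into out[ARP_LEN].
   All multi-byte fields are big-endian (network byte order). */
void build_arp_request(uint8_t out[ARP_LEN],
                       const uint8_t sender_mac[6],
                       const uint8_t sender_ip[4],
                       const uint8_t target_ip[4])
{
    assert(out && sender_mac && sender_ip && target_ip);
    memset(out, 0, ARP_LEN);
    out[0] = 0x00; out[1] = 0x01;    /* hardware type: Ethernet (1) */
    out[2] = 0x08; out[3] = 0x00;    /* protocol type: IPv4 (0x0800) */
    out[4] = 6;                      /* hardware (MAC) address length */
    out[5] = 4;                      /* protocol (IPv4) address length */
    out[6] = 0x00; out[7] = 0x01;    /* operation: request (1) */
    memcpy(out + 8,  sender_mac, 6); /* sender hardware address */
    memcpy(out + 14, sender_ip, 4);  /* sender protocol address */
    /* out[18..23]: target hardware address, unknown, left as zeros */
    memcpy(out + 24, target_ip, 4);  /* target protocol address */
}
```

The host that owns the target IP answers with operation 2 (reply) and its own MAC address in the sender hardware address field.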

MAC

MAC layer is the lower sublayer of the data link layer (layer 2) of the seven-layer OSI model. The MAC sublayer provides addressing and channel access control mechanisms that make it possible for several terminals or network nodes to communicate within a multiple access network that incorporates a shared medium, e.g. an Ethernet network. The hardware that implements the MAC is referred to as a media access controller.

Network Layer, L3

AppleTalk

AppleTalk was a proprietary suite of networking protocols developed by Apple Inc. for their Macintosh computers. AppleTalk includes a number of features that allow local area networks to be connected with no prior setup or the need for a centralized router or server of any sort. Connected AppleTalk-equipped systems automatically assign addresses, update the distributed namespace, and configure any required inter-networking routing. It is a plug-n-play system.

Session Layer, L5

SOCKS

TCP

Termination

Wikipedia.

tcpdump

Protocols

SNMP

Wikipedia.

BGP

bkjia.com.

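BGP (Border Gateway Protocol) is mainly used to interconnect the ASes (autonomous systems) of the Internet. Its primary function is to control route propagation and to select the best route. BGP is an enhanced, complete and scalable protocol developed by the Internet Engineering Task Force. A data center that uses BGP to implement dual-line or multi-line interconnection is called a BGP data center.

China Netcom, China Telecom, China Tietong and some large private IDC operators all have AS numbers, and most major network operators in China achieve multi-line interconnection through BGP and their own AS numbers. To use this scheme, an IDC has to apply to CNNIC (China Internet Network Information Center) or APNIC (Asia-Pacific Network Information Centre) for its own IP address block and AS number (note: 21Vianet is currently a member of both APNIC and CNNIC and bills itself as China's largest carrier-neutral Internet infrastructure provider), and then announce that IP block to the other operators' networks via BGP. Once interconnected through BGP, the operators' backbone routers determine the best route to the IDC's IP block, so that users of different operators all get fast access.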

Wikipedia: BGP.

Border Gateway Protocol (BGP) is a standardized exterior gateway protocol designed to exchange routing and reachability information among autonomous systems (AS) on the Internet. Because it runs over TCP (port 179), it is classed as an application-layer protocol.

The protocol is often classified as a path vector protocol but is sometimes also classed as a distance-vector routing protocol. The Border Gateway Protocol makes routing decisions based on paths, network policies, or rule-sets configured by a network administrator and is involved in making core routing decisions.
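
As a rough illustration of the "path vector" idea, the sketch below keeps the full AS path for each candidate route, rejects any path that already contains the local AS number (a loop), and prefers the shortest remaining path. The names route, path_has_loop and pick_best are invented for this note, and this is a deliberate simplification: real BGP best-path selection also weighs policy attributes such as local preference before and after AS path length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One candidate route to a prefix: the AS numbers it has traversed. */
struct route {
    const uint32_t *as_path;
    size_t          len;
};

/* A path that already contains our own AS number would form a loop. */
int path_has_loop(const struct route *r, uint32_t my_as)
{
    assert(r != NULL);
    for (size_t i = 0; i < r->len; i++)
        if (r->as_path[i] == my_as)
            return 1;
    return 0;
}

/* Return the index of the loop-free route with the shortest AS path
   (the first one wins a tie), or -1 if every candidate is rejected. */
int pick_best(const struct route *routes, size_t n, uint32_t my_as)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (path_has_loop(&routes[i], my_as))
            continue;
        if (best < 0 || routes[i].len < routes[(size_t)best].len)
            best = (int)i;
    }
    return best;
}
```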

BGP may be used for routing within an autonomous system. In this application it is referred to as Interior Border Gateway Protocol, Internal BGP, or iBGP.

The current version of BGP is version 4 (BGP4 or BGP-4) codified in RFC 4271 since 2006. The major enhancement in version 4 was support for Classless Inter-Domain Routing and use of route aggregation to decrease the size of routing tables.

Most Internet service providers must use BGP to establish routing between one another (especially if they are multihomed).

Ports

http://www.cyberciti.biz/faq/linux-unix-open-ports/.

# Increase local port range by typing the following command (Linux specific example):
echo 1024 65535 > /proc/sys/net/ipv4/ip_local_port_range

# You can also change the TCP keepalive time, i.e. how many seconds a connection stays idle before the first keepalive probe is sent (Linux specific example):
echo 2000 > /proc/sys/net/ipv4/tcp_keepalive_time

Commands

curl


# -s, --silent

# -S, --show-error
    When used with -s it makes curl show an error message if it fails.

# -f, --fail
    (HTTP) Fail silently (no output at all) on server errors. This is mostly done to better enable scripts etc. to better deal with failed attempts. In normal cases when an HTTP server fails to deliver a document, it returns an HTML document stating so (which often also describes why and more). This flag will prevent curl from outputting that and return error 22.

# -L, --location. If the server reports that the requested page has moved to a different location (indicated with a Location: header and a 3XX response code), this option will make curl redo the request on the new place. If used together with -i/--include or -I/--head, headers from all requested pages will be shown. When authentication is used, curl only sends its credentials to the initial host.  If a redirect takes curl to a different host, it won’t be able to intercept the user+password.

# -k, --insecure. (SSL) This option explicitly allows curl to perform "insecure" SSL connections and transfers. All SSL connections are attempted to be made secure by using the CA certificate bundle installed by default. This makes all connections considered "insecure" fail unless -k, --insecure is used.

# -m, --max-time seconds. Maximum time in seconds that you allow the whole operation to take.

# -i, --include. (HTTP) Include the HTTP-header in the output.

# It is recommended to add "&& echo" after every curl command to enforce line break.
curl http://ipecho.net/plain && echo

# Use curl to get the web content from a virtual host,
# by telling it the hostname. Reference.
curl -H 'Host: mydomain.com' myIP

# Check if the port is open.
# If "connected", it is open. "refused", closed. "timeout", firewalled.
telnet mydomain.com portNum

StackOverflow.
# -g, --globoff. Switches off the "URL globbing parser", in order to prevent curl interpreting bracket letters: {}[].
# curl is trying to interpret the square brackets as a globbing pattern.
curl -g "http://192.168.1.1:12345/info?sort=[(_updated,-1)]&page=1"

ip

TecMint.


ip addr show
ip route show

sudo ip link set eth1 up
sudo ip link set eth1 down

sudo ip addr add 192.168.50.5/24 dev eth1
sudo ip addr del 192.168.50.5/24 dev eth1

# "via" denotes gateway.
sudo ip route add 10.10.20.0/24 via 192.168.50.100 dev eth0
sudo ip route del 10.10.20.0/24
sudo ip route add default via 192.168.50.100

route

See route note.

lokkit

About.


sudo lokkit -s http -s ssh

netstat


# -a, --all. Show both listening and non-listening sockets.
# -t: tcp, -u: udp.
# -p, --program. Show the PID and program name.
# -l: listening
# -e, --extend (display additional info).
# -n, --numeric (don't look up for host, port and user names)
netstat -tuplen
netstat -lnp # Check all the open ports.
netstat -ap | grep 192.168.1.1: #Check all connections to 192.168.1.1 and which program makes these.

# lsof: list open files.
# -i, listing of all Internet and HP-UX network files.
# -a, causes list selection options to be ANDed.
# -p s. Selects or excludes process. s could be: "123,^456", '^' means negation.
# -r, repeat. -r1 repeat every second.
lsof -r1 -i -a -p "$(pidof -s firefox)"
# To list open IPv4 connections use the lsof command:
lsof -Pnl +M -i4

Both netstat and lsof are less reliable than nmap for checking the network status: they list local sockets, whereas nmap probes the ports over the network.

nmap


nmap -sT -O localhost

nmap -sP 192.168.0.*      # Ping scan: report which hosts on 192.168.0.x are up.
nmap 192.168.0.1-3        # Scan the three hosts 192.168.0.1, .2 and .3.
nmap 192.168.0.*          # Scan the whole 192.168.0.x network.
nmap -p 22,23,80 ip       # Scan the given ports.
nmap -p 22-80 192.168.0.*
nmap -sU 192.168.0.*      # UDP scan.
nmap -sT 192.168.0.*      # TCP scan.
nmap -sS 192.168.0.3      # Half-open scan.
nmap -O 192.168.0.3       # Overall information (OS type, port status, ...).
nmap -v 192.168.0.3       # Verbose mode.

# Check if the port is associated with the official list of known services
cat /etc/services | grep portnum

ss

ss is used to dump Socket Statistics.


# Display all TCP sockets.
ss -t -a

# Display all UDP sockets.
ss -u -a

# Display all established ssh connections.
ss -o state established '( dport = :ssh or sport = :ssh )'

# Find all local processes connected to X server.
# -x, --unix: display only unix domain sockets.
ss -x src /tmp/.X11-unix/*

# List all the tcp sockets in state FIN-WAIT-1 for our apache to network 193.233.7/24 and look at their timers.
ss -o state fin-wait-1 '( sport = :http or sport = :https )' dst 193.233.7/24

# Watch all tcp connections to 192.168.1.1
watch -n 1 --difference=cumulative 'ss -est | grep 192.168.1.1'

telnet


# Always set an escape character (-e) so that you can get back to the telnet prompt and quit by typing it.
telnet -e escapeChar ip port

nc


# Listen on port 60001; -k keeps listening after a client disconnects.
nc -kl 60001

# In another computer, send Hello to the server.
echo 'Hello!' | nc serverIP 60001

Alternatively, you can run "nc -kl port" on one machine and "nc serverIP port" on another, and talk to each other by typing into stdin: the simplest possible chat app.

IPTables

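A more accurate name is iptables/netfilter. iptables is a userspace tool: as a user, you use it on the command line to put firewall rules into the default tables. netfilter is a kernel module: it is built into the kernel and performs the actual filtering.

iptables places rules into the default chains (INPUT, OUTPUT and FORWARD). All traffic (IP packets) is checked against the relevant chain, whose rules decide how to handle each packet, for example accepting or dropping it. These actions are called targets, and the two most common built-in targets are DROP and ACCEPT.

The 3 default chains:

  • INPUT: packets addressed to the local host.
  • FORWARD: packets routed through the host.
  • OUTPUT: packets generated by the local host.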

iptables options:

-t table
	filter: input, forward, output.
		Default table.
	nat: prerouting, output, postrouting.
		This table is consulted when a packet that creates a new connection is encountered.
	mangle: prerouting, output, input, forward, postrouting.
		This table is used for specialized packet alteration.
	raw: prerouting, output.
		This table is used mainly for configuring exemptions from connection tracking in combination with the NOTRACK target.

Commands:
-A, --append chain rule
-D, --delete chain rule/rulenum
-I, --insert chain [rulenum] rule. If the rule number is 1, the rule or rules are inserted at the head of the chain. This is also the default if no rule number is specified.
-R, --replace chain rulenum rule
-L, --list [chain]. Usually used with -n to suppress DNS lookups. If no chain is selected, all chains are listed.
-S, --list-rules [chain]
-F, --flush [chain]. This is equivalent to deleting all the rules one by one. All the chains in the table are flushed if none is given.
-Z, --zero [chain [rulenum]]. Zero the packet and byte counters in all chains, or only the given chain, or only the given rule in a chain. It is legal to specify the -L, --list (list) option as well, to see the counters immediately before they are cleared.
-N, --new-chain chain
-X, --delete-chain [chain]
-P, --policy chain target
-E, --rename-chain oldChain newChain
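
Rule order matters: rules added with -A or -I are checked from the head of the chain, the first matching rule with a terminating target decides, and the -P policy applies only when nothing matched. The toy model below sketches that traversal. It is not how netfilter is implemented; the rule struct, its fields and run_chain are hypothetical names for this note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum target { ACCEPT, DROP };

/* A toy rule: match on source network and destination port. */
struct rule {
    uint32_t    src_net;   /* host byte order, e.g. 0xC0A80100 = 192.168.1.0 */
    uint32_t    src_mask;  /* 0xFFFFFF00 for /24; 0 matches any source */
    uint16_t    dport;     /* 0 means "any port" */
    enum target verdict;
};

/* Walk the chain from the head; the first matching rule wins.
   If no rule matches, the chain policy decides. */
enum target run_chain(const struct rule *rules, size_t n, enum target policy,
                      uint32_t src, uint16_t dport)
{
    assert(rules != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct rule *r = &rules[i];
        int src_ok  = (src & r->src_mask) == (r->src_net & r->src_mask);
        int port_ok = r->dport == 0 || r->dport == dport;
        if (src_ok && port_ok)
            return r->verdict;
    }
    return policy;
}
```

With a DROP rule for one address at the head and an ACCEPT rule for port 80 from a /24 behind it, the blacklisted address is dropped even on port 80, which is why blacklist rules are usually inserted with -I rather than appended with -A.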

Parameters
-s, --source address/mask
-d, --destination address/mask
-j, --jump target. The target can be a user-defined chain, one of the special builtin targets which decide the fate of the packet immediately, or an extension.
-g, --goto chain.
-i, --in-interface name.
-o, --out-interface name.

Other
-n, --numeric. IP and port numbers will be in numeric format.
-p, --protocol
-m, --match module.
	iptables can use extended packet matching modules. These are loaded in two ways: implicitly, when -p is specified, or explicitly with the -m; various extra command line options become available, depending on the specific module. You can specify multiple extended match modules in one line, and you can use the -h or --help options after the module has been specified to receive help specific to that module.

State module
--state state
	INVALID: the  packet could  not  be identified for some reason.
	ESTABLISHED: the packet is associated with a connection which has seen  packets  in  both  directions.
	NEW: the packet has started a new connection, or otherwise associated with a connection  which  has  not  seen packets  in both directions.
	RELATED: the packet is starting a new connection, but is associated with an existing connection, such as an FTP data transfer, or an ICMP error.

Reference. Ref on disabling iptables.


# Check if iptables installed.
rpm -q iptables

# Check if it is running as modules.
lsmod | grep ip_tables

# List all active rules.
iptables -L --line-numbers

# Start iptables
system-config-securitylevel

# Add rule.
iptables -I INPUT -p tcp --dport 5000 -j ACCEPT
iptables -A INPUT -m state --state NEW -p tcp --dport 2345 -j ACCEPT
iptables -I INPUT 5 -m state --state NEW -m tcp -p tcp --dport 62085 -j ACCEPT

# Blacklist IP
iptables -I INPUT -s 61.153.104.170 -j DROP

# Delete the first rule.
iptables -D INPUT 1

# Reference.

## verify new firewall settings 
/sbin/iptables -L INPUT -n -v

# Stop iptables, disable iptables
# Save newly added firewall rules, and disable iptables.
# iptables: Saving firewall rules to /etc/sysconfig/iptables.
service iptables save
service iptables stop
# If you are using IPv6 firewall, enter:
service ip6tables save
service ip6tables stop

## Open port 80 and 443 for 192.168.1.0/24 subnet only ##
/sbin/iptables -A INPUT -s 192.168.1.0/24  -m state --state NEW -p tcp --dport 80 -j ACCEPT
/sbin/iptables -A INPUT -s 192.168.1.0/24 -m state --state NEW -p tcp --dport 443 -j ACCEPT

# Accept packets from trusted IP addresses by MAC
iptables -A INPUT -s 192.168.0.4 -m mac --mac-source 00:50:8D:FD:E6:32 -j ACCEPT

Choosing match patterns

# TCP packets from 192.168.1.2:
iptables -t nat -A POSTROUTING -p tcp -s 192.168.1.2 [...]

# UDP packets to 192.168.1.2:
iptables -t nat -A POSTROUTING -p udp -d 192.168.1.2 [...]

# all packets from 192.168.x.x arriving at eth0:
iptables -t nat -A PREROUTING -s 192.168.0.0/16 -i eth0 [...]

# all packets except TCP packets and except packets from 192.168.1.2:
iptables -t nat -A PREROUTING ! -p tcp ! -s 192.168.1.2 [...]

# packets leaving at eth1:
iptables -t nat -A POSTROUTING -o eth1 [...]

# TCP packets from 192.168.1.2, ports 12345 to 12356, to 123.123.123.123, port 22
# (a backslash indicates continuation on the next line)
iptables -t nat -A POSTROUTING -p tcp -s 192.168.1.2 --sport 12345:12356 \
         -d 123.123.123.123 --dport 22 [...]

# Source-NAT: Change sender to 123.123.123.123
iptables [...] -j SNAT --to-source 123.123.123.123

# Mask: Change sender to outgoing network interface
iptables [...] -j MASQUERADE

# Destination-NAT: Change recipient to 123.123.123.123, port 22
iptables [...] -j DNAT --to-destination 123.123.123.123:22

# Redirect to local port 8080
iptables [...] -j REDIRECT --to-ports 8080

Example: allow some user to login outside LAN

http://serverfault.com/questions/310459/allowgroups-and-match-address-for-ssh. This presumes you have the inside sshd listening on port 2200 and the outside sshd listening on port 2201, and that each one is using an appropriately configured sshd_config file.

# Connect inside users to "inside" sshd.
iptables -t nat -A PREROUTING -s 192.168.1.0/24 -p tcp --dport 22 -j REDIRECT --to-ports 2200

# Connect outside users to "outside" sshd.
iptables -t nat -A PREROUTING ! -s 192.168.1.0/24 -p tcp --dport 22 -j REDIRECT --to-ports 2201

iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 2200 -j ACCEPT
iptables -A INPUT -p tcp --dport 2201 -j ACCEPT

FAQ

Why there is an accept all rule but connections are still blocked

http://unix.stackexchange.com/questions/60953/incoming-accept-all-iptables-rule-still-appearing

Type iptables -vL instead. You will find that the accept-all rule applies only to the lo interface:

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
55651   48M ACCEPT     all  --  any    any     anywhere             anywhere             state RELATED,ESTABLISHED
  109  5255 ACCEPT     icmp --  any    any     anywhere             anywhere
    1    35 ACCEPT     all  --  lo     any     anywhere             anywhere

NAT

This table is consulted when a packet that creates a new connection is encountered. It consists of three built-ins: PREROUTING (for altering packets as soon as they come in), OUTPUT (for altering locally-generated packets before routing), and POSTROUTING (for altering packets as they are about to go out). In short, "PREROUTING - DNAT for incoming traffic, OUTPUT - DNAT for outgoing traffic, POSTROUTING - SNAT for outgoing traffic" Ref.

Ref. The command

iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE

can be explained in the following way:

-t nat	 	select table "nat" for configuration of NAT rules.
-A POSTROUTING	 	Append a rule to the POSTROUTING chain (-A stands for "append").
-o eth1	 	this rule is valid for packets that leave on the second network interface (-o stands for "output")
-j MASQUERADE	 	the action that should take place is to 'masquerade' packets, i.e. replacing the sender's address by the router's address.

Using the MASQUERADE target, every packet receives the IP of the router's outgoing interface. The advantage over SNAT is that dynamically assigned IP addresses from the provider do not affect the rule, so there is no need to adapt the rule. For ordinary SNAT you would have to change the rule every time the IP of the outgoing interface changes. As for SNAT, MASQUERADE is meaningful within the POSTROUTING chain only.

# Transparent proxying:
# (local net at eth0, proxy server at port 8080)
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-ports 8080 

SNAT, DNAT, masquerade

http://server.zdnet.com.cn/server/2008/0317/772069.shtml.

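The following command SNATs all packets from the 10.8.0.0 network to one of the addresses 192.168.5.3, 192.168.5.4 or 192.168.5.5 before sending them out.

iptables -t nat -A POSTROUTING -s 10.8.0.0/255.255.255.0 -o eth0 -j SNAT --to-source 192.168.5.3-192.168.5.5

With the following configuration there is no need to specify the SNAT target IP. Whatever dynamic IP eth0 currently has, MASQUERADE automatically reads the current address of eth0 and performs SNAT with it.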

iptables -t nat -A POSTROUTING -s 10.8.0.0/255.255.255.0 -o eth0 -j MASQUERADE

TTL

TTL is short for Time to Live. It usually appears in IP packets, DNS records or HTTP, with a slightly different meaning in each.

IP packets

TTL is an 8-bit field in the Internet Protocol (IP) header, and thus its maximum is 255. Under IPv6, it is renamed hop limit. Every time the IP packet is forwarded by a router, its TTL value is decreased by 1, and when it reaches 0, the packet is discarded and an ICMP error, Time Exceeded, is sent back to the sender. TTL here is used to kill those immortals (packets that would otherwise loop forever) and keep our internet clean. The Wikipedia page says TTL is in theory measured in seconds, but in practice it is measured in hops, thus the name in IPv6.
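
On the sending side, the initial TTL of outgoing IPv4 packets can be chosen per socket with the IP_TTL socket option. The helpers below are a minimal sketch; set_ttl and get_ttl are names made up for this note, and the code assumes an IPv4 socket (IPv6 uses a different option for the hop limit).

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set the TTL for packets sent on an IPv4 socket. Returns 0 or -1. */
int set_ttl(int fd, int ttl)
{
    assert(ttl >= 1 && ttl <= 255);   /* the header field is 8 bits wide */
    return setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof ttl);
}

/* Read the TTL back from the socket; returns -1 on error. */
int get_ttl(int fd)
{
    int ttl = 0;
    socklen_t len = sizeof ttl;
    if (getsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, &len) == -1)
        return -1;
    return ttl;
}
```

Sending probes with TTL 1, 2, 3, ... and collecting the Time Exceeded errors is the idea behind traceroute.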

DNS records

TTLs also occur in the DNS, where each item in the zone file has a TTL. When a caching/recursive nameserver fetches a resource record from an authoritative nameserver, it caches the record for the time specified by its TTL (measured in seconds). A shorter TTL in the zone file puts a heavier load on the authoritative nameserver, but is useful when a critical address is about to change. The recommended practice is to lower the TTL before changing such addresses. Some caching nameservers do not respect the TTL set by the authoritative nameserver, so it is not guaranteed that all downstream DNS records are renewed once the TTL has expired.
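
The caching rule can be sketched as follows, assuming a resolver that stores when it fetched each record. The struct cached_record and the function remaining_ttl are hypothetical names for this note, not part of any resolver library.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

struct cached_record {
    time_t   fetched_at;  /* when the resolver received the answer */
    uint32_t ttl;         /* seconds, as sent by the authoritative server */
};

/* Seconds the record may still be served from cache; 0 means expired
   and the resolver has to ask the authoritative server again. */
uint32_t remaining_ttl(const struct cached_record *rec, time_t now)
{
    assert(rec != NULL);
    if (now <= rec->fetched_at)
        return rec->ttl;
    double age = difftime(now, rec->fetched_at);
    if (age >= (double)rec->ttl)
        return 0;
    return rec->ttl - (uint32_t)age;
}
```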

HTTP

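TTLs are also present in HTTP: response headers such as Cache-Control: max-age and Expires limit how long a response may be cached, and the Max-Age and Expires attributes of a cookie limit its lifetime. Their significance is similar to that described above.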

SSL certificate

  1. Generate a new CSR (Certificate Signing Request). Godaddy now asks for a key of at least 2048 bits and SHA-2.
  2. Re-key your certificate.

Generate a CSR

Godaddy: generate CSR.

  1. First generate and submit a Certificate Signing Request (CSR) to the Certification Authority (CA).
  2. The CSR contains your certificate-application information, including your public key.
  3. Use your Web server software to generate the CSR, which will also create your public/private key pair used for encrypting and decrypting secure transactions.

DN asked for CSR

The Web server software will use this information to create your Web server certificate's distinguished name (DN). Distinguished names uniquely identify individual servers:

About DBA. Doing Business As (DBA): The operating name of a company, as opposed to the legal name of the company. See Entrepreneur webpage, and this good article for reference.

Install SSL cert on Apache

Godaddy tutorial.

Use apachectl graceful to restart.

Use GlobalSign page to test whether SSL certificate is installed successfully.

Re-keying SSL cert

Re-keying is the process of generating a new private key for your existing SSL certificate. Your Web server uses the private key to decrypt secure information.

The information in your new CSR must be identical to the information for your existing certificate, i.e. you cannot change the organization.

If you need to change your certificate details, you must revoke the certificate in your account, purchase a new SSL credit, and complete the SSL request again.

Network programming

See the server and client example at the end of the getaddrinfo(3) manual page.

What is a socket

Ref. In layman's terms, a socket is an endpoint of communication between two systems on a network. More precisely, a socket is the combination of an IP address and a port on one system. So on each system a socket exists for a process interacting with the socket on the other system over the network. The combination of the local socket and the socket on the remote system is also known as a 'four-tuple' or '4-tuple'. Each connection between two processes running on different systems can be uniquely identified through its 4-tuple.

Client example

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define BUF_SIZE 500

int main(int argc, char *argv[])
{
    struct addrinfo hints;
    struct addrinfo *result, *rp;
    int sfd, s, j;
    size_t len;
    ssize_t nread;
    char buf[BUF_SIZE];

    if (argc < 3) {
        fprintf(stderr, "Usage: %s host port msg...\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Obtain address(es) matching host/port */
    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_family = AF_UNSPEC;    /* Allow IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM; /* Datagram socket */
    hints.ai_flags = 0;
    hints.ai_protocol = 0;          /* Any protocol */

    s = getaddrinfo(argv[1], argv[2], &hints, &result);
    if (s != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(s));
        exit(EXIT_FAILURE);
    }

    /* getaddrinfo() returns a list of address structures.
       Try each address until we successfully connect(2).
       If socket(2) (or connect(2)) fails, we (close the socket
       and) try the next address. */

    for (rp = result; rp != NULL; rp = rp->ai_next) {
        sfd = socket(rp->ai_family, rp->ai_socktype,
                     rp->ai_protocol);
        if (sfd == -1)
            continue;

        if (connect(sfd, rp->ai_addr, rp->ai_addrlen) != -1)
            break;                  /* Success */

        close(sfd);
    }

    if (rp == NULL) {               /* No address succeeded */
        fprintf(stderr, "Could not connect\n");
        exit(EXIT_FAILURE);
    }

    freeaddrinfo(result);           /* No longer needed */

    /* Send remaining command-line arguments as separate
       datagrams, and read responses from server */

    for (j = 3; j < argc; j++) {
        len = strlen(argv[j]) + 1;
                /* +1 for terminating null byte */

        if (len + 1 > BUF_SIZE) {
            fprintf(stderr,
                    "Ignoring long message in argument %d\n", j);
            continue;
        }

        if (write(sfd, argv[j], len) != (ssize_t) len) {
            fprintf(stderr, "partial/failed write\n");
            exit(EXIT_FAILURE);
        }

        nread = read(sfd, buf, BUF_SIZE);
        if (nread == -1) {
            perror("read");
            exit(EXIT_FAILURE);
        }

        /* buf is not guaranteed to be null-terminated: bound the print */
        printf("Received %ld bytes: %.*s\n", (long) nread, (int) nread, buf);
    }

    exit(EXIT_SUCCESS);
}
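
The client above needs something to talk to. As a rough sketch of the server side (not the manual's server code), the function below plays both roles inside one process on the loopback interface: a "server" socket bound to 127.0.0.1 echoes one datagram back to a "client" socket that uses connect(2), write(2) and read(2) like the example. The name udp_echo_roundtrip and the 500-byte buffer are choices made for this note.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg to an in-process UDP echo endpoint on 127.0.0.1 and copy the
   reply into reply[cap]. Returns the reply length, or -1 on error. */
ssize_t udp_echo_roundtrip(const char *msg, char *reply, size_t cap)
{
    struct sockaddr_in addr;
    struct sockaddr_storage peer;
    socklen_t alen = sizeof addr, plen = sizeof peer;
    char buf[500];
    ssize_t n, result = -1;
    size_t len;
    int srv, cli;

    assert(msg != NULL && reply != NULL);
    len = strlen(msg) + 1;                 /* +1: send the null byte too */
    if (len > sizeof buf)
        return -1;

    srv = socket(AF_INET, SOCK_DGRAM, 0);
    cli = socket(AF_INET, SOCK_DGRAM, 0);
    if (srv == -1 || cli == -1)
        goto out;

    /* "Server": bind to loopback; port 0 lets the kernel choose a port. */
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(srv, (struct sockaddr *)&addr, &alen) == -1)
        goto out;

    /* "Client": connect(2) fixes the peer, write(2) sends one datagram. */
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        write(cli, msg, len) != (ssize_t)len)
        goto out;

    /* "Server": receive the datagram and send it back to its sender. */
    n = recvfrom(srv, buf, sizeof buf, 0, (struct sockaddr *)&peer, &plen);
    if (n == -1 ||
        sendto(srv, buf, (size_t)n, 0, (struct sockaddr *)&peer, plen) != n)
        goto out;

    result = read(cli, reply, cap);        /* "Client": read the echo */
out:
    if (srv != -1) close(srv);
    if (cli != -1) close(cli);
    return result;
}
```

A real server would run the recvfrom()/sendto() pair in a loop in its own process; the single round trip here only keeps the sketch short.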

htonl, htons, ntohl, ntohs

Ref. Convert values between host and network byte order.

#include <arpa/inet.h>
uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);
uint32_t ntohl(uint32_t netlong);
uint16_t ntohs(uint16_t netshort);
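
A small sketch of what the conversion does to the bytes in memory: network byte order is big-endian, so the most significant byte comes first regardless of the host CPU. The helper names port_to_wire and port_from_wire are made up for this note.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a port number in wire format (big-endian) into out[2]. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    assert(out != NULL);
    uint16_t net = htons(port);     /* host -> network byte order */
    memcpy(out, &net, sizeof net);  /* bytes now appear as sent on the wire */
}

/* Read a port number back from two wire bytes. */
uint16_t port_from_wire(const unsigned char in[2])
{
    uint16_t net;
    assert(in != NULL);
    memcpy(&net, in, sizeof net);
    return ntohs(net);              /* network -> host byte order */
}
```

For port 8080 (0x1F90) the wire bytes are 0x1F followed by 0x90 on both little-endian and big-endian hosts.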

getaddrinfo

Hints

Hints usually have ai_family, ai_socktype, ai_protocol, ai_flags set, while other fields set to 0.

If the AI_PASSIVE flag is specified in hints.ai_flags, and node is NULL, then the returned socket addresses will be suitable for binding a socket that will accept connections. The returned socket address will contain the "wildcard address" (INADDR_ANY for IPv4 addresses, IN6ADDR_ANY_INIT for IPv6 address). The wildcard address is used by applications (typically servers) that intend to accept connections on any of the host's network addresses. If node is not NULL, then the AI_PASSIVE flag is ignored.

INADDR_ANY

StackOverflow.

printf("%u", (unsigned) htonl(INADDR_ANY));
// prints 0, which means 0.0.0.0.

// in inet.h, it shows:

# define INADDR_ANY ((unsigned long int) 0x00000000)
# define INADDR_NONE    0xffffffff
# define INPORT_ANY 0

// While INADDR_LOOPBACK means 127.0.0.1.

If the AI_PASSIVE flag is not set in hints.ai_flags, then the returned socket addresses will be suitable for use with connect(2), sendto(2), or sendmsg(2). If node is NULL, then the network address will be set to the loopback interface address (INADDR_LOOPBACK for IPv4 addresses, IN6ADDR_LOOPBACK_INIT for IPv6 address); this is used by applications that intend to communicate with peers running on the same host.

service sets the port in each returned address structure. If this argument is a service name, it is translated to the corresponding port number. This argument can also be specified as a decimal number, which is simply converted to binary. If service is NULL, then the port number of the returned socket addresses will be left uninitialized. If AI_NUMERICSERV is specified in hints.ai_flags and service is not NULL, then service must point to a string containing a numeric port number. This flag is used to inhibit the invocation of a name resolution service in cases where it is known not to be required.

Either node or service, but not both, may be NULL.
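
The effect of a NULL node can be checked directly. The sketch below resolves a numeric service for IPv4/TCP with node set to NULL and returns the first address, so that the AI_PASSIVE (wildcard) and non-passive (loopback) cases described above can be compared. The function name resolve_null_node is made up for this note.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve a NULL node for IPv4/TCP and copy the first result into *out.
   flags is passed through as hints.ai_flags. Returns 0 on success,
   otherwise the getaddrinfo() error code (see gai_strerror()). */
int resolve_null_node(const char *service, int flags, struct sockaddr_in *out)
{
    struct addrinfo hints, *res = NULL;
    assert(service != NULL && out != NULL);

    memset(&hints, 0, sizeof hints);           /* unused fields must be 0 */
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = flags | AI_NUMERICSERV; /* service is a port number */

    int rc = getaddrinfo(NULL, service, &hints, &res);
    if (rc != 0)
        return rc;
    memcpy(out, res->ai_addr, sizeof *out);    /* AF_INET: a sockaddr_in */
    freeaddrinfo(res);
    return 0;
}
```

Calling it with AI_PASSIVE should yield 0.0.0.0 (suitable for bind()), and calling it with flags set to 0 should yield 127.0.0.1 (suitable for connect()).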

The getaddrinfo() function allocates and initializes a linked list of addrinfo structures, one for each network address that matches node and service, subject to any restrictions imposed by hints, and returns a pointer to the start of the list in res. The items in the linked list are linked by the ai_next field.

There are several reasons why the linked list may have more than one addrinfo structure, including: the network host is multihomed, accessible over multiple protocols (e.g., both AF_INET and AF_INET6); or the same service is available from multiple socket types (one SOCK_STREAM address and another SOCK_DGRAM address, for example). Normally, the application should try using the addresses in the order in which they are returned. The sorting function used within getaddrinfo() is defined in RFC 3484; the order can be tweaked for a particular system by editing /etc/gai.conf (available since glibc 2.5).

If hints.ai_flags includes the AI_CANONNAME flag, then the ai_canonname field of the first of the addrinfo structures in the returned list is set to point to the official name of the host.

// Reference.

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
/**
@retval 0 success.
@retval non-0 error. Use gai_strerror() to translate it.
*/
int getaddrinfo(const char *node, const char *service,
                const struct addrinfo *hints,
                struct addrinfo **res);

void freeaddrinfo(struct addrinfo *res);

const char *gai_strerror(int errcode);

struct addrinfo {
	int              ai_flags;
	int              ai_family;
	int              ai_socktype; // SOCK_STREAM, SOCK_DGRAM.
	int              ai_protocol;
	socklen_t        ai_addrlen;
	struct sockaddr *ai_addr;
	char            *ai_canonname;
	struct addrinfo *ai_next;
};

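Reference: stackoverflow. The reason why we can cast pointers to the other structures to sockaddr* is that most functions first need only the address family, and on a given platform that field has the same size and offset in all those structures (16 bits at offset 0 on Linux; 8 bits after sa_len on Mac OS, as shown below).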

struct sockaddr {
	unsigned short    sa_family;    // address family, AF_xxx
	char              sa_data[14];  // 14 bytes of protocol address
};

// In Mac OS, sys/socket.h, we have a different one.
struct sockaddr {
	__uint8_t	sa_len;		/* total length */
	sa_family_t	sa_family;	/* [XSI] address family */
	char		sa_data[14];	/* [XSI] addr value (actually larger) */
};

// Defined in "/usr/include/netinet/in.h".
struct sockaddr_in {
    short            sin_family;   // AF_INET
    unsigned short   sin_port;     // e.g. htons(3490)
    struct in_addr   sin_addr;     // see struct in_addr, below
    char             sin_zero[8];  // zero this if you want to
};

/* Internet address. */
// in_addr. Should be assigned one of the INADDR_* values (e.g., INADDR_ANY) or set using the inet_aton library functions or directly with the name resolver (see gethostbyname).
struct in_addr {
    uint32_t       s_addr;     /* address in network byte order */
};

// Man7.org.
// cp: Internet host address with the IPv4 numbers-and-dots notation.
// Stores it in the structure that inp points to.
// Returns nonzero if the address is valid, zero if not.
// The two functions are in /usr/include/arpa/inet.h
int inet_aton(const char *cp, struct in_addr *inp);
char *inet_ntoa(struct in_addr in);

struct sockaddr_in6 {
    u_int16_t       sin6_family;   // address family, AF_INET6
    u_int16_t       sin6_port;     // port number, Network Byte Order
    u_int32_t       sin6_flowinfo; // IPv6 flow information
    struct in6_addr sin6_addr;     // IPv6 address
    u_int32_t       sin6_scope_id; // Scope ID
};

struct sockaddr_storage {
    sa_family_t  ss_family;     // address family

    // all this is padding, implementation specific, ignore it:
    char      __ss_pad1[_SS_PAD1SIZE];
    int64_t   __ss_align;
    char      __ss_pad2[_SS_PAD2SIZE];
};
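
The cast to sockaddr* can be exercised with a small helper that accepts any of the structures above, reads sa_family, and then casts to the matching concrete type. This is a sketch only: addr_to_text is a name made up for this note, and it uses inet_ntop(), the address-family-aware sibling of inet_ntoa(), which is not otherwise covered in these notes.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Format the address behind a generic sockaddr pointer into buf.
   Returns buf, or NULL for an unsupported family or a too-small buffer. */
const char *addr_to_text(const struct sockaddr *sa, char *buf, socklen_t cap)
{
    assert(sa != NULL && buf != NULL);
    switch (sa->sa_family) {          /* readable through any sockaddr_* */
    case AF_INET: {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        return inet_ntop(AF_INET, &in4->sin_addr, buf, cap);
    }
    case AF_INET6: {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        return inet_ntop(AF_INET6, &in6->sin6_addr, buf, cap);
    }
    default:
        return NULL;
    }
}
```

Declaring the storage as struct sockaddr_storage and passing it as (struct sockaddr *) keeps the caller independent of whether the address turns out to be IPv4 or IPv6.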

SIGPIPE

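// Ignore SIGPIPE, so that writing to a closed socket returns -1 with
// errno set to EPIPE instead of killing the process.
#include <signal.h>
struct sigaction act;
act.sa_handler = SIG_IGN;
act.sa_flags = 0;
sigemptyset(&act.sa_mask);
sigaction(SIGPIPE, &act, NULL);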

socket

The file descriptor tables

Each running process has a file descriptor table which contains pointers to all open i/o streams. When a process starts, three entries are created in the first three cells of the table. Entry 0 points to standard input, entry 1 points to standard output, and entry 2 points to standard error. Whenever a file or other i/o stream is opened, a new entry is created in this table, usually in the first available empty slot.

The socket system call returns an entry into this table; i.e. a small integer. This value is used for other calls which use this socket. The accept system call returns another entry into this table. The value returned by accept is used for reading and writing to that connection.
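
The "first available empty slot" behaviour can be observed with a short experiment: open a socket, close it, and open another one. The function name slot_is_reused is made up for this note, and the result assumes a single-threaded program in which nothing else opens a descriptor in between.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a socket, close it, open another one, and report whether the
   second call was handed the same table slot as the first.
   Returns 1 if reused, 0 if not, -1 on error. */
int slot_is_reused(int domain, int type)
{
    int first = socket(domain, type, 0);
    if (first == -1)
        return -1;
    assert(first >= 0);              /* descriptors are small non-negative ints */
    close(first);                    /* frees the slot in the table */

    int second = socket(domain, type, 0);
    if (second == -1)
        return -1;
    close(second);
    return first == second;
}
```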

// Reference.

#include <sys/types.h>
#include <sys/socket.h>
/**
	@brief Creates an endpoint for communication and returns a fd.

	@param[in]	domain	AF_INET (ipv4), AF_INET6.
	@param[in]	type	SOCK_STREAM (tcp), SOCK_DGRAM (UDP).
	@param[in]	protocol Usually 0.
*/
int socket(int domain, int type, int protocol);

setsockopt

Reference.

#include <sys/socket.h>

int setsockopt(int socket, int level, int option_name,
	const void *option_value, socklen_t option_len);

// Example. 
int socket_fd;
if ((socket_fd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
	perror("Create socket error");
	exit(EXIT_FAILURE);
}

int flag = 1;
// Set options at the socket level: SOL_SOCKET.
// Used to solve Error "address already in use" from bind().
if (setsockopt(socket_fd, SOL_SOCKET, SO_REUSEADDR, &flag, sizeof(flag)) == -1)
	perror("setsockopt(SO_REUSEADDR) error");

Bind Error: "Address already in use"

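// Reference. After a TCP socket is closed, the kernel keeps the connection of the side that closed first in the TIME_WAIT state for twice the maximum segment lifetime (60 seconds on Linux; up to 4 minutes with the 2-minute lifetime given in RFC 793). During that time bind() to the same local address fails unless SO_REUSEADDR is set. "A socket is a 5 tuple (proto, local addr, local port, remote addr, remote port). SO_REUSEADDR just says that you can reuse local addresses. The 5 tuple still must be unique!"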

shutdown, close

Reference. Sockets can be closed with close(fd). If there is still data waiting to be transmitted, close() will try to complete the transmission. The SO_LINGER socket option specifies a timeout period.

A more precise alternative is shutdown(), which can close each direction of the connection independently.

/**
	@param[in] how
		0 (SHUT_RD): Stop receiving data.
		1 (SHUT_WR): Stop sending data.
		2 (SHUT_RDWR): Stop both.

	@retval 0 success.
	@retval -1 error.
*/
int shutdown(int socket, int how);
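The errno cases include: EBADF (socket is not a valid file descriptor), EINVAL (invalid value for how), ENOTCONN (the socket is not connected), and ENOTSOCK (the descriptor does not refer to a socket).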
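A sketch of a half-close with shutdown(), using a connected AF_UNIX socket pair so that no network setup is needed. The function name is hypothetical. After one end calls shutdown(fd, SHUT_WR), the other end still receives the queued data and then sees end-of-stream (recv() returns 0).

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg on one end of a socket pair, shut down that end's write side,
   and drain the other end until end-of-stream.
   Returns the number of bytes read before EOF, or -1 on error. */
ssize_t half_close_demo(const char *msg, size_t len)
{
    int sv[2];
    char buf[64];
    ssize_t n, total = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], msg, len, 0) != (ssize_t)len ||
        shutdown(sv[0], SHUT_WR) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* recv() returns 0 once the peer has shut down its write side and
       all queued data has been consumed. */
    while ((n = recv(sv[1], buf, sizeof buf, 0)) > 0)
        total += n;

    close(sv[0]);
    close(sv[1]);
    return n == 0 ? total : -1;
}
```

Note that shutdown() does not release the descriptor; close() is still required afterwards.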

connect

// Reference.

#include <sys/types.h>
#include <sys/socket.h>
/**
	@brief Connect sockfd to the addr.

	Generally, connection-based protocol sockets may successfully
	connect() only once; connectionless protocol sockets may use
	connect() multiple times to change their association.

	@retval 0	Success.
	@retval -1	Error.

*/
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
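A minimal sketch of connect() against a listener on the loopback interface. The helper names are hypothetical, and the listener binds to port 0 so that the kernel picks a free port. The second connect() on the same stream socket is expected to fail, illustrating the "only once" rule above (EISCONN is the error Linux reports).

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill addr with 127.0.0.1:port (port in host byte order). */
static void loopback_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Listen on 127.0.0.1 with a kernel-chosen port.
   Returns the listening fd and stores the port, or returns -1. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    loopback_addr(&addr, 0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, 1) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Connect an existing socket to 127.0.0.1:port. Returns 0 or -1. */
int connect_loopback(int fd, unsigned short port)
{
    struct sockaddr_in addr;

    loopback_addr(&addr, port);
    return connect(fd, (struct sockaddr *)&addr, sizeof addr);
}
```

The kernel completes the TCP handshake for connections waiting in the listen queue, so connect_loopback() can succeed before the server side calls accept().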

getpeername

Reference. getpeername() returns the address of the peer connected to the socket sockfd, in the buffer pointed to by addr.

#include <sys/socket.h>
int getpeername(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

For stream sockets, once a connect(2) has been performed, either socket can call getpeername() to obtain the address of the peer socket.

On the other hand, datagram sockets are connectionless. Calling connect(2) on a datagram socket merely sets the peer address for outgoing datagrams sent with write(2) or send(2). The caller of connect(2) can use getpeername() to obtain the peer address that it earlier set for the socket. However, the peer socket is unaware of this information, and calling getpeername() on the peer socket will return no useful information (unless a connect(2) call was also executed on the peer). Note also that the receiver of a datagram can obtain the address of the sender when using recvfrom(2).
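The datagram case can be sketched as follows; the function names and the port number used in the usage note are hypothetical. connect() on a UDP socket sends nothing, it only records the default peer, which getpeername() then reports. On a socket that was never connected, getpeername() fails (with ENOTCONN on Linux).

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Record 127.0.0.1:port as the default peer of a UDP socket and read it
   back. Returns the peer port in host byte order, or -1 on error. */
int udp_peer_port(unsigned short port)
{
    struct sockaddr_in peer, got;
    socklen_t len = sizeof got;
    int result = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    peer.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&peer, sizeof peer) == 0 &&
        getpeername(fd, (struct sockaddr *)&got, &len) == 0)
        result = ntohs(got.sin_port);
    close(fd);
    return result;
}

/* Call getpeername() on a UDP socket that was never connected.
   Returns the resulting errno, 0 if the call unexpectedly succeeded,
   or -1 if the socket could not be created. */
int unconnected_peer_errno(void)
{
    struct sockaddr_in got;
    socklen_t len = sizeof got;
    int err = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;
    if (getpeername(fd, (struct sockaddr *)&got, &len) == -1)
        err = errno;
    close(fd);
    return err;
}
```

For example, udp_peer_port(9999) reports 9999 even if nothing is listening on that port, because no datagram is exchanged.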

Recv

Explanations are given in the code comments.

// Reference.

#include <sys/types.h>
#include <sys/socket.h>

/** @brief Receive contents.

	Receive contents from either connection-based/-less sockets.

	If no messages are available at the socket, the calls wait for a
	message to arrive, unless the socket is nonblocking, in which case
	-1 is returned (and errno is set to EAGAIN
	or EWOULDBLOCK).

	An application can use select, poll,
	or epoll to determine when more data arrives on a
	socket.

	@retval	The length of the received message in bytes, 0 when a stream
		peer has performed an orderly shutdown, or -1 on error.
*/

ssize_t recv(int sockfd, void *buf, size_t len, int flags);

ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct
	sockaddr *src_addr, socklen_t *addrlen);

ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);
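A sketch of a single nonblocking receive attempt that separates "no data yet" from end-of-stream and real errors. The function name and the -2 return code are hypothetical. MSG_DONTWAIT is assumed to be available (it is on Linux and the BSDs); setting O_NONBLOCK on the descriptor would have the same effect.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Try to receive without blocking.
   Returns the number of bytes read (> 0), 0 on end-of-stream,
   -2 if no data is available yet, or -1 on any other error. */
ssize_t try_recv(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);

    /* Portable code checks both names; they may or may not be equal. */
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A caller would typically react to -2 by going back to select, poll, or epoll and retrying when the descriptor is reported readable.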

EAGAIN and EWOULDBLOCK

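CompGroups. These two errno values may be different integers (POSIX allows either; on Linux they are the same value), but they indicate the same condition: the operation would block on a nonblocking descriptor. Portable code should check for both.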

Ref: why epoll is better than poll and select.

select

// Reference.

/** @brief Synchronous I/O multiplexing.

	select() and pselect() are used to monitor
	multiple file descriptors, waiting until one or more of them become
	ready.

	@param[in] nfds Number of FDs. It's the highest-numbered file
		descriptor in any of the three sets, plus 1.

	@param[in] timeout The argument specifies the interval that
		select() should block waiting for a file descriptor to become
		ready.

	@note on timeout
		If both fields of the timeval structure are zero, then select()
		returns immediately. (This is useful for polling.) If timeout is
		NULL (no timeout), select() can block indefinitely.

	@retval Number of FDs contained in the three returned sets, 0 if the
		timeout expired before any FD became ready, or -1 on error.

*/

////////// - select --

/* According to POSIX.1-2001 */
#include <sys/select.h>

/* According to earlier standards */
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>	
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set
           *exceptfds, struct timeval *timeout);

/** @brief Four macros provided to manipulate the file descriptor sets.
*/

// Remove fd from set.
void FD_CLR(int fd, fd_set *set);
// Add fd to set.
void FD_SET(int fd, fd_set *set);
// Check if fd is in set.
int  FD_ISSET(int fd, fd_set *set);
// Clear set.
void FD_ZERO(fd_set *set);

////////// - pselect --
	
#include <sys/select.h>

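/**
Specified in POSIX.1-2001. Takes a struct timespec timeout and a signal
mask that is applied atomically for the duration of the call.
*/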
int pselect(int nfds, fd_set *readfds, fd_set *writefds,
            fd_set *exceptfds, const struct timespec *timeout,
            const sigset_t *sigmask);

// Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
// pselect(): _POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600

poll

// Reference.

#include <poll.h>

struct pollfd
{
	int fd;         /* file descriptor */
	short events;     /* requested events */
	short revents;    /* returned events */
};

/** 
	@brief poll performs a similar task
	to select: waits for one of a set of file descriptors
	to become ready to perform I/O.

	Specifying a negative value in timeout means an infinite timeout.
	Specifying a timeout of zero causes poll() to return immediately.

	@retval 0: the call timed out and no FD was ready.
	@retval -1: error.
	@retval positive number: number of structures having non-zero revents.

*/

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

////////// Separator

#define _GNU_SOURCE         /* See feature_test_macros(7) */
#include <signal.h>
#include <poll.h>

/**
Linux specific.
*/
int ppoll(struct pollfd *fds, nfds_t nfds,
        const struct timespec *timeout_ts, const sigset_t *sigmask);
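The same wait expressed with poll(), as a sketch with a hypothetical function name. Unlike select(), poll() takes an array of pollfd structures, so there is no FD_SETSIZE limit and the requested events are not overwritten by the call.

```c
#include <poll.h>

/* Wait up to timeout_ms milliseconds for input on fd.
   Returns the revents bit mask (> 0) if something happened,
   0 on timeout, or -1 on error. */
int wait_events_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ready;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* requested events */
    pfd.revents = 0;       /* filled in by poll() */

    ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;
    return pfd.revents;
}
```

The caller should test individual bits of the result: POLLIN means data can be read, while POLLHUP or POLLERR may be reported even though they were not requested.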

epoll

accept

This Stack Overflow discussion is a good reference. accept() is a blocking call: while it waits, the calling thread sleeps and the CPU is free to execute other threads. Once accept() returns, the thread resumes and executes the steps that follow.

Reference.

#include <sys/types.h>
#include <sys/socket.h>

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.

The argument sockfd is a socket that has been created with socket(2), bound to a local address with bind(2), and is listening for connections after a listen(2).

If no pending connections are present on the queue, and the socket is not marked as nonblocking, accept() blocks the caller until a connection is present. If the socket is marked nonblocking and no pending connections are present on the queue, accept() fails with the error EAGAIN or EWOULDBLOCK.

In order to be notified of incoming connections on a socket, you can use select(2) or poll(2). A readable event will be delivered when a new connection is attempted and you may then call accept() to get a socket for that connection. Alternatively, you can set the socket to deliver SIGIO when activity occurs on a socket; see socket(7) for details.

The following code snippet comes from this page.

#include <fcntl.h>      /* fcntl, O_NONBLOCK */
#include <sys/ioctl.h>  /* ioctl, FIONBIO */

int setNonblocking(int fd)
{
    int flags;

    /* If they have O_NONBLOCK, use the Posix way to do it */
#if defined(O_NONBLOCK)
    /* Fixme: O_NONBLOCK is defined but broken on SunOS 4.1.x and AIX 3.2.5. */
    if (-1 == (flags = fcntl(fd, F_GETFL, 0)))
        flags = 0;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
#else
    /* Otherwise, use the old way of doing it */
    flags = 1;
    return ioctl(fd, FIONBIO, &flags);
#endif
}
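A sketch of the nonblocking accept() behaviour described above, with hypothetical helper names. The listener is bound to 127.0.0.1 with a kernel-chosen port and switched to nonblocking mode with fcntl(); with no pending connection, accept() then fails immediately instead of blocking.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a nonblocking TCP listener on 127.0.0.1, port chosen by the
   kernel. Returns the listening fd, or -1 on error. */
int nonblocking_listener(void)
{
    struct sockaddr_in addr;
    int flags;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, 8) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns a connected fd, -2 if no connection is pending,
   or -1 on any other error. */
int try_accept(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);

    if (fd == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return fd;
}
```

In an event loop, try_accept() would be called after select or poll reports the listening socket as readable, and repeated until it returns -2.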

send

Reference.

#include <sys/types.h>
#include <sys/socket.h>

ssize_t send(int sockfd, const void *buf, size_t len, int flags);

ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
		const struct sockaddr *dest_addr, socklen_t addrlen);

The send() call may be used only when the socket is in a connected state (so that the intended recipient is known). The only difference between send() and write(2) is the presence of flags. With a zero flags argument, send() is equivalent to write(2). Also, the following call send(sockfd, buf, len, flags); is equivalent to sendto(sockfd, buf, len, flags, NULL, 0);

If the message is too long to pass atomically through the underlying protocol, the error EMSGSIZE is returned, and the message is not transmitted.

On success, these calls return the number of bytes sent. On error, -1 is returned, and errno is set appropriately.
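On a stream socket the returned count can be smaller than len, so callers that must deliver a whole buffer usually loop. A minimal sketch, assuming a blocking socket; the name send_all is hypothetical:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send() until all len bytes have been handed to the kernel.
   Returns 0 on success, -1 on error (errno is left as set by send()). */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                    /* skip what was already sent */
        len -= (size_t)n;
    }
    return 0;
}
```

On a nonblocking socket the loop would also need to handle EAGAIN/EWOULDBLOCK by waiting for writability before retrying.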

Buffer size

Look at /proc/sys/net/ipv4/tcp_rmem (for read) and /proc/sys/net/ipv4/tcp_wmem (for write) to see the minimum, default and maximum memory size values (in bytes), respectively. There is also /proc/sys/net/core/rmem_default for recv and /proc/sys/net/core/wmem_default for send.

http://stackoverflow.com/questions/7865069/how-to-find-the-socket-buffer-size-of-linux.

int n;
socklen_t m = sizeof(n);
int fdsocket;
fdsocket = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); // example
getsockopt(fdsocket, SOL_SOCKET, SO_RCVBUF, &n, &m);
// n now holds the socket's receive buffer size in bytes

bind

Reference.

#include <sys/types.h>
#include <sys/socket.h>

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

When a socket is created with socket(2), it exists in a name space (address family) but has no address assigned to it. bind() assigns the address specified by addr to the socket referred to by the file descriptor sockfd. Traditionally, this operation is called “assigning a name to a socket”.

It is normally necessary to assign a local address using bind() before a SOCK_STREAM socket may receive connections (see accept(2)).
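A sketch of the "no address until bind()" point, with hypothetical helper names. getsockname() reports the local address of a socket; binding to port 0 is assumed here so that the kernel picks a free port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Return the local port of an AF_INET socket in host byte order
   (0 if the socket has no port assigned yet), or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    memset(&addr, 0, sizeof addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}

/* Bind fd to 127.0.0.1:port; port 0 lets the kernel choose.
   Returns 0 on success, -1 on error. */
int bind_loopback(int fd, unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    return bind(fd, (struct sockaddr *)&addr, sizeof addr);
}
```

Before bind_loopback() the reported port is 0; afterwards it is the port the kernel assigned, which a server could then announce to its clients.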

Others

# Check the connection.
traceroute rubygems.org

Protocols

OSPF

Ref. Open Shortest Path First (OSPF) is a routing protocol for Internet Protocol (IP) networks. It uses a link state routing algorithm and falls into the group of interior routing protocols, operating within a single autonomous system (AS).

FAQ

Find IP address

# CentOS (en0 is the interface name here; list interfaces with `ifconfig -a`):
ifconfig en0 | grep "inet " | awk '{ print $2 }'
# In .bash_profile:
alias wcfip="ifconfig en0 | grep \"inet \" | awk '{ print \$2 }'"

# Get external IP: Reference.
# The ; means "run the commands sequentially".
# The echo prints a trailing newline.
curl http://ipecho.net/plain; echo
# I prefer this one: (if curl fails, don't echo)
curl http://ipecho.net/plain && echo

Good DNS servers

Why should we always use a public DNS server? Simply put, because DNS is not encrypted! A man-in-the-middle can therefore log your queries and infer your most visited websites and frequently used software (since the software contacts its own sites for news and updates).

Recommended 3rd-party DNS servers: