HTTP

Basics

List of HTTP headers.

Status

2xx: Success

3xx: Redirection

This class of status code indicates the client must take additional action to complete the request. Many of these status codes are used in URL redirection.

4xx: Client Error

5xx: Server Error
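
As a rough illustration of the status classes above, the first digit of a status code determines its class. The sketch below is a minimal helper under that assumption; the function name status_class is hypothetical, and the 1xx (Informational) class is included even though it is not listed in these notes.

```python
def status_class(code: int) -> str:
    """Return the class name of an HTTP status code based on its first digit."""
    classes = {
        1: "Informational",  # not listed in the notes above, added for completeness
        2: "Success",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    try:
        return classes[code // 100]
    except KeyError:
        raise ValueError(f"not a valid HTTP status code: {code}") from None


if __name__ == "__main__":
    for code in (200, 304, 401, 503):
        print(code, status_class(code))
```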

Authentication

Basic

The username and password are effectively sent in clear text: they are only Base64-encoded, not encrypted. In the following example, "YWxpY2U6YWxpY2U=" decodes to "alice:alice".


GET /basic_auth/test.html HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: 127.0.0.1:8000
Connection: Keep-Alive
Authorization: Basic YWxpY2U6YWxpY2U=
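
To see why Basic authentication offers no confidentiality, the header value can be decoded with nothing more than a Base64 decoder. The following is a minimal sketch; the helper names are hypothetical and only Python's standard library is used.

```python
import base64


def build_basic_authorization(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_authorization(header_value: str) -> tuple[str, str]:
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the fields; the password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    print(parse_basic_authorization("Basic YWxpY2U6YWxpY2U="))
```

Anyone who can observe the request can do the same, which is why Basic authentication should only be used over an encrypted connection.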

Digest

The password itself is never transmitted; the client sends an MD5 digest instead. The server challenges the client with a nonce value and expects a response that is the digest of the username, realm, password, the given nonce value, the HTTP method, and the requested URL.

Client Request:

GET /digest_auth/test.html HTTP/1.1
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: 127.0.0.1:8000

Server Response:

HTTP/1.1 401 Authorization Required
Date: Tue, 20 Oct 2009 08:16:43 GMT
Server: Apache/2.2.14 (Win32)
WWW-Authenticate: Digest realm="Members only", 
  nonce="LHOKe1l2BAA=5c373ae0d933a0bb6321125a56a2fcdb6fd7c93b", algorithm=MD5, qop="auth"
Content-Length: 401
Content-Type: text/html; charset=iso-8859-1

Client Request again:

GET /digest_auth/test.html HTTP/1.1
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: 127.0.0.1:8000
Authorization: Digest username="bob", realm="members only",
  qop="auth", algorithm="MD5", uri="/digest_auth/test.html",
  nonce="5UImQA==3d76b2ab859e1770ec60ed285ec68a3e63028461",
  nc=00000001, cnonce="1672b410efa182c061c2f0a58acaa17d",
  response="3d9ebe6b9534a7135a3fde59a5a72668"

In the headers above, qop stands for quality of protection.
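
The response value can be reproduced by hand. The sketch below follows the computation defined in RFC 2617 for qop="auth" with algorithm MD5; the function names are hypothetical. Bob's password is not given in these notes, so the response in the capture above cannot be checked here.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' field for qop=auth, algorithm=MD5."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = md5_hex(f"{method}:{uri}")                 # identifies the request
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


if __name__ == "__main__":
    # The password "secret" is a placeholder assumption, not from the capture.
    print(digest_response(
        "bob", "members only", "secret", "GET", "/digest_auth/test.html",
        "5UImQA==3d76b2ab859e1770ec60ed285ec68a3e63028461",
        "00000001", "1672b410efa182c061c2f0a58acaa17d",
    ))
```

The server performs the same computation with its stored credentials and compares the results, so the password never crosses the wire.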

Disadvantage

Wikipedia.

HTTP/1.1

Content-Type

multipart/form-data

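StackOverflow: what does enctype multipart form data mean. HTML forms provide three methods of encoding: application/x-www-form-urlencoded (the default), multipart/form-data, and text/plain. The following form uses multipart/form-data: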

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8"/>
  <title>upload</title>
</head>
<body>
  <form action="http://localhost:8000" method="post" enctype="multipart/form-data">
    <p><input type="text" name="text1" value="text default">
    <p><input type="text" name="text2" value="a&#x03C9;b">
    <p><input type="file" name="file1">
    <p><input type="file" name="file2">
    <p><input type="file" name="file3">
    <p><button type="submit">Submit</button>
  </form>
</body>
</html>

Run nc -l 8000 and submit the form; nc prints the HTTP request:

POST / HTTP/1.1
Host: localhost:8000
Connection: keep-alive
Content-Length: 679
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: null
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary2tN4WpT3j1PrU0Y7
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8,zh;q=0.6,zh-CN;q=0.4,af;q=0.2

------WebKitFormBoundary2tN4WpT3j1PrU0Y7
Content-Disposition: form-data; name="text1"

text default
------WebKitFormBoundary2tN4WpT3j1PrU0Y7
Content-Disposition: form-data; name="text2"

aωb
------WebKitFormBoundary2tN4WpT3j1PrU0Y7
Content-Disposition: form-data; name="file1"; filename=""
Content-Type: application/octet-stream


------WebKitFormBoundary2tN4WpT3j1PrU0Y7
Content-Disposition: form-data; name="file2"; filename=""
Content-Type: application/octet-stream


------WebKitFormBoundary2tN4WpT3j1PrU0Y7
Content-Disposition: form-data; name="file3"; filename=""
Content-Type: application/octet-stream


------WebKitFormBoundary2tN4WpT3j1PrU0Y7--

The boundary parameter is set automatically by the browser; here it is "----WebKitFormBoundary2tN4WpT3j1PrU0Y7".
In the message body, each part starts with "------WebKitFormBoundary2tN4WpT3j1PrU0Y7", which is the boundary with an extra "--" prepended. The final delimiter is "------WebKitFormBoundary2tN4WpT3j1PrU0Y7--", which also has a trailing "--".
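
The delimiter rules can be sketched in a few lines. The helper below is a simplified illustration for text fields only (no file parts, no escaping of field names); the function name build_multipart is hypothetical.

```python
def build_multipart(fields: dict[str, str], boundary: str) -> bytes:
    """Encode simple text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")  # each part starts with "--" + boundary
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")               # blank line separates part headers from content
        lines.append(value)
    lines.append(f"--{boundary}--")    # final delimiter has a trailing "--"
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


if __name__ == "__main__":
    boundary = "----WebKitFormBoundary2tN4WpT3j1PrU0Y7"
    body = build_multipart({"text1": "text default", "text2": "aωb"}, boundary)
    print(f"Content-Type: multipart/form-data; boundary={boundary}")
    print(body.decode("utf-8"))
```

The boundary must not occur inside any part's content, which is why browsers generate a long random string.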

POST vs. PUT

w3.org

Safe methods: GET, HEAD (RFC 7231 additionally classifies OPTIONS and TRACE as safe).
Idempotent methods: GET, HEAD, PUT, DELETE, OPTIONS, TRACE.

PUT vs POST in REST.
POST: used to modify and update a resource.
PUT: used to create a resource or overwrite it, at a URL that you specify.

POST /questions/<existing_question> HTTP/1.1
Host: wahteverblahblah.com

# Note that the following is an error:

POST /questions/<new_question> HTTP/1.1
Host: wahteverblahblah.com

# If the URL is not yet created, you should not be using POST to create it while specifying the name. This should result in a 'resource not found' error because <new_question> does not exist yet. You should PUT the <new_question> resource on the server first.

# You could, however, create a resource with POST like this:

POST /questions HTTP/1.1
Host: wahteverblahblah.com

# Note that in this case the resource name is not specified; the new object's URL path would be returned to you.

# For a new resource:
PUT /questions/<new_question> HTTP/1.1
Host: wahteverblahblah.com

# To overwrite an existing resource:
PUT /questions/<existing_question> HTTP/1.1
Host: wahteverblahblah.com
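
The difference in idempotency can be illustrated with a toy in-memory store. This is only a sketch of the convention described above, not a server implementation; the class QuestionStore, its methods, and the returned status codes are assumptions chosen for illustration.

```python
import itertools


class QuestionStore:
    """Toy resource store contrasting PUT and POST semantics."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def put(self, path: str, body: str) -> int:
        """Create or overwrite the resource at a client-chosen URL (idempotent)."""
        created = path not in self.items
        self.items[path] = body
        return 201 if created else 200

    def post_to_collection(self, body: str) -> tuple[int, str]:
        """Create a resource under /questions; the server chooses the URL."""
        path = f"/questions/{next(self._ids)}"
        self.items[path] = body
        return 201, path

    def post_to_item(self, path: str, body: str) -> int:
        """Update an existing resource; fails if it does not exist yet."""
        if path not in self.items:
            return 404
        self.items[path] = body
        return 200


if __name__ == "__main__":
    store = QuestionStore()
    print(store.put("/questions/what-is-http", "v1"))   # creates the resource
    print(store.put("/questions/what-is-http", "v1"))   # same state as before
    print(store.post_to_collection("new question"))     # new URL on every call
    print(store.post_to_item("/questions/missing", "x"))
```

Repeating the same PUT leaves the store in the same state, while repeating the same POST to the collection creates a new resource each time.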

Cache

Disable cache

A minimal set of headers that works across common clients and proxies:

Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0

Cache-Control is defined by the HTTP 1.1 spec for clients and proxies (and is implicitly required by some clients alongside Expires). Pragma is defined by the HTTP 1.0 spec for prehistoric clients. Expires is defined by both the HTTP 1.0 and 1.1 specs for clients and proxies. In HTTP 1.1, Cache-Control takes precedence over Expires, so Expires matters only for HTTP 1.0 proxies.
Pragma can be omitted if you don't care about HTTP 1.0 clients (HTTP 1.1 was introduced in 1997). If the server automatically includes a valid Date header, you could theoretically omit Cache-Control too and rely on Expires only, but that may fail if, for example, the end user changes the operating system date and the client software relies on it.
Other Cache-Control parameters such as max-age are irrelevant if the three directives above (no-cache, no-store, must-revalidate) are specified. The Last-Modified header is only interesting if you actually want the response to be cached, so you don't need to specify it here.
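
In application code, the same three headers can be merged into whatever headers a response already carries. The sketch below assumes response headers are represented as a plain dict; the names NO_CACHE_HEADERS and with_no_cache are hypothetical.

```python
# The three headers from the notes above.
NO_CACHE_HEADERS = {
    "Cache-Control": "no-cache, no-store, must-revalidate",  # HTTP 1.1
    "Pragma": "no-cache",                                    # HTTP 1.0 clients
    "Expires": "0",                                          # HTTP 1.0 proxies
}


def with_no_cache(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the response headers with caching disabled."""
    merged = dict(headers)
    merged.update(NO_CACHE_HEADERS)  # override any existing caching headers
    return merged


if __name__ == "__main__":
    original = {"Content-Type": "text/html", "Cache-Control": "max-age=3600"}
    for name, value in with_no_cache(original).items():
        print(f"{name}: {value}")
```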

HTTP header or HTTP meta tags?

A header set in the HTTP response takes precedence over the HTML meta tag. The HTML meta tag is only used when the page is viewed from the local file system via a file:// URL.

Apache

Using Apache .htaccess file:

<IfModule mod_headers.c>
    Header set Cache-Control "no-cache, no-store, must-revalidate"
    Header set Pragma "no-cache"
    Header set Expires 0
</IfModule>

HTML

<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Expires" content="0" />

If-Modified-Since

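CNBlogs. While recently analyzing Squid's access.log file, I noticed that even for HTTP requests to the same file, the network traffic generated by the first and second requests differs. As everyone knows, the client browser has a cache that stores previously visited web page files. In fact, the cache stores not only the page files but also the last-modified time of each file as sent by the server.
If-Modified-Since is a standard HTTP request header. When sending an HTTP request, the browser includes the last-modified time of its cached copy, and the server compares this time with the actual last-modified time of the file on the server.
If the times match, the server returns HTTP status 304 (without the file content), and the client displays its locally cached copy. If the times differ, the server returns HTTP status 200 with the new file content; the client discards the old file, caches the new one, and displays it.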

StackOverflow.

When handling an If-Modified-Since header field, some servers will use an exact date comparison function, rather than a less-than function, for deciding whether to send a 304 (Not Modified) response. To get best results when sending an If-Modified-Since header field for cache validation, clients are advised to use the exact date string received in a previous Last-Modified header field whenever possible.
This indicates that you should send a Last-Modified header when you expect/want the client to send If-Modified-Since.
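
The server-side decision can be sketched as follows. This is a simplified illustration that uses a less-than-or-equal comparison (as noted above, some servers compare for exact equality instead); the function name conditional_status is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # unconditional request
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # an unparseable date is ignored
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    return 304 if last_modified.replace(microsecond=0) <= since else 200


if __name__ == "__main__":
    mtime = datetime(2009, 10, 20, 8, 16, 43, tzinfo=timezone.utc)
    last_modified_header = format_datetime(mtime, usegmt=True)
    print("Last-Modified:", last_modified_header)
    # The client echoes the exact string back in If-Modified-Since.
    print(conditional_status(last_modified_header, mtime))
```

Echoing the exact Last-Modified string back, as the advice above recommends, makes the result the same under either comparison strategy.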

HTTP/2

newRelic.com: http2 best practices.

HTTP 1.1 is a text protocol: it supports interaction with Web servers using text in a telnet session on port 80: typing "GET / HTTP/1.1" returns an HTML document. In comparison, requests and responses in HTTP/2 are represented by a stream of binary frames, described as a “basic protocol unit”.

Some new features in HTTP/2 don’t map to HTTP 1.1. Server push (also known as “cache push”) and stream reset are features that correspond to types of binary frames. Frames can also have a priority that allows clients to give servers hints about the priority of some assets over others.

One of the easiest ways to actually see the individual binary frames is by using the net-internals tab of Google Chrome (type chrome://net-internals/#http2 into the address bar).

All major browsers require HTTP/2 connections to be secure for a practical reason: an extension of TLS called Application-Layer Protocol Negotiation (ALPN) lets servers know the browser supports HTTP/2 (among other protocols) and avoids an additional round-trip.

A key performance problem with HTTP 1.1 is latency: without connection reuse, every asset fetch needs a new TCP connection. HTTP 1.1 offers different workarounds for latency issues, including pipelining and persistent (keep-alive) connections. However, pipelining was never widely implemented, and persistent connections suffer from head-of-line blocking: the current request must complete before the next one can be sent on the same connection.

Long-standing workarounds that aim to reduce the number of connections by bundling related assets:

With unbundled assets in HTTP/2, there is greater opportunity to aggressively cache smaller pieces of a Web application (wcfNote: the cached files are compared between versions with MD5): a very small change doesn't require the entire concatenated file to be downloaded again.

In practice, HTTP/2 requires TLS, because browsers only support it over secure connections. Let's Encrypt is a good free certificate authority.

Session Management

HTTP is a stateless (or non-persistent) protocol. A few techniques can be used to maintain state information across multiple HTTP requests, see below.

References: NTU-ehchua.

Cookies

On client's request, the server-side program sends a response message containing a "Set-Cookie" response header.

Version 0

Cookie Version 0 "Set-Cookie" Header (Netscape)

Set-Cookie: cookie-name=cookie-value; expires=date; path=path-name; domain=domain-name; secure

Version 1

Cookie Version 1 "Set-Cookie" Header (RFC2109/RFC2965)

Set-Cookie: cookie-name=cookie-value; Comment=text; Domain=domain-name; Path=path-name; Max-Age=seconds; Version=1; Secure

Client "Cookie" request header

The client returns the cookie(s) to the matching domain and path in the subsequent requests, using a "Cookie" request header.

Cookie: cookie-name-1=cookie-value-1; cookie-name-2=cookie-value-2; ...
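
Both directions of the exchange can be sketched with Python's standard http.cookies module. The helper names below are hypothetical, and the attribute choices (Path, Max-Age, Secure) are only an example.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, max_age: int, path: str = "/") -> str:
    """Build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    cookie[name]["max-age"] = max_age
    cookie[name]["secure"] = True
    return cookie[name].OutputString()


def parse_cookie_header(header_value: str) -> dict[str, str]:
    """Parse the value of a Cookie request header into a name/value dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    print("Set-Cookie:", make_set_cookie("sessionid", "1111", 3600))
    print(parse_cookie_header(
        "cookie-name-1=cookie-value-1; cookie-name-2=cookie-value-2"))
```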

Other methods

Hidden field in the HTML form

<form method="post" action="url">
  <input type="hidden" name="sessionid" value="1111">
  <input type="submit">
</form>
All the pages have to be dynamically generated to update this hidden field.

URL rewriting

http://host:port/shopping.html;sessionid=value
You must rewrite all the URLs in all the HTML files that are sent to the client so that they include this unique session ID.
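
As a rough sketch of the rewriting step for a single URL, the session ID can be attached as a path parameter with urllib.parse. The function name add_session_id is hypothetical, and rewriting every link in a generated page is left out.

```python
from urllib.parse import urlparse


def add_session_id(url: str, session_id: str) -> str:
    """Append ';sessionid=<id>' to the path of a URL, keeping any query string."""
    parts = urlparse(url)
    return parts._replace(params=f"sessionid={session_id}").geturl()


if __name__ == "__main__":
    print(add_session_id("http://host:8080/shopping.html", "1111"))
    print(add_session_id("http://host:8080/shopping.html?item=3", "1111"))
```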

Tokens

A few major problems arose with the Server-based Authentication:

UseTokenBasedAuth.

JSON Web Token (JWT)

TheAnatomy

JWTs are self-contained: a token carries all the necessary information within itself. This means a JWT can transmit basic information about itself (the header), a payload (usually user information), and a signature.

Format: "header.payload.signature". Payload could be:

  1. Registered Claims: iss (issuer), sub (subject), aud (audience), exp (expiration time), nbf (not before, time before which the token MUST NOT be accepted), iat (issued at time), jti (JWT ID)
  2. Public Claims: name, admin
  3. Private Claims.

How to generate signature part:

var encodedString = base64UrlEncode(header) + "." + base64UrlEncode(payload);
// The secret is the signing key held by the server.
HMACSHA256(encodedString, 'secret');

// Header
{
  "typ": "JWT",
  "alg": "HS256"
}

// Payload
{
  "iss": "scotch.io",
  "exp": 1300819380,
  "name": "Chris Sevilleja",
  "admin": true
}

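// The final JWT is shown below. The header and payload are base64url-encoded; the signature here appears hex-encoded, whereas a standard JWT base64url-encodes the signature as well: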
// eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzY290Y2guaW8iLCJleHAiOjEzMDA4MTkzODAsIm5hbWUiOiJDaHJpcyBTZXZpbGxlamEiLCJhZG1pbiI6dHJ1ZX0.03f329983b86f7d9a9f5fef85305880101d5e302afafa20154d094b229f75773
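
The signing steps can be written out with Python's standard library. This is a minimal sketch of HS256 signing and verification, not a replacement for a maintained JWT library: it does not check exp or any other claim, and the function names and the secret b"secret" are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _hs256(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return b64url(digest)


def sign_jwt(header: dict, payload: dict, secret: bytes) -> str:
    """Return header.payload.signature using HMAC-SHA256."""
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _hs256(signing_input, secret)


def verify_jwt(token: str, secret: bytes) -> bool:
    """Check only the signature; claims such as exp are not validated here."""
    signing_input, dot, signature = token.rpartition(".")
    if not dot:
        return False
    expected = _hs256(signing_input, secret)
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    token = sign_jwt(
        {"alg": "HS256", "typ": "JWT"},
        {"iss": "scotch.io", "exp": 1300819380, "name": "Chris Sevilleja", "admin": True},
        b"secret",
    )
    print(token)
    print(verify_jwt(token, b"secret"))
```

Because the header and payload are only encoded, not encrypted, anyone holding the token can read them; the signature only proves that they were not altered by someone without the secret.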

Prevent MitM when using JWT

Can user fake request if having JWT?

Speed up

HTTP pipelining

Multiple HTTP requests are sent on a single TCP connection without waiting for the corresponding responses.

HTTP keep-alive

Uses a single TCP connection to send and receive multiple HTTP requests/responses.

HLS

HTTP Live Streaming.

Send chunks of video for streaming using HTTP protocol?

StackOverflow.

  1. If you only want to play from the start of the file then it's fairly straightforward -
    • Make a standard HTTP request and start playing as soon as you've buffered enough video that the download will finish before playback catches up with it.
    • Seeking is trickier. You could take the approach that sites like YouTube used to take which is to simply not allow the user to seek until the file has downloaded enough to reach that point in the video (or just leave them looking at a spinner until that point is reached).
  2. To do better you need to be in control of the streaming client.
    • One suggestion is to treat the file as chunks and make byte range requests for one chunk at a time. When the user seeks into the middle of the file, you can work out the byte offset into the file and start making byte range requests from that point.
    • If the video format contains some sort of index at the start then you can use this to work out file offsets.
    • If the format doesn't have any form of index but is encoded at a constant bit rate (CBR), you can do an initial HEAD request, read the Content-Length header to find the size of the file, and estimate the byte offset for a given playback time from that.
    • If you have control of the file format and the server, you could make life easier by making each chunk a separate resource. This is how Apple HTTP live streaming (HLS) and Microsoft smooth streaming both work. These also do more clever tricks such as allowing a client to switch between multiple versions of the stream encoded at different bit rates to cope with differences in bandwidth.
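
The byte range arithmetic for the chunked approach can be sketched as follows. Both helpers are hypothetical, and the CBR estimate assumes the whole file is media data at a single constant bit rate, ignoring container headers.

```python
def chunk_range_header(chunk_index: int, chunk_size: int, total_size: int) -> str:
    """Return the Range header value for the given fixed-size chunk."""
    start = chunk_index * chunk_size
    if start >= total_size:
        raise ValueError("chunk starts past the end of the file")
    end = min(start + chunk_size, total_size) - 1  # Range end is inclusive
    return f"bytes={start}-{end}"


def cbr_seek_offset(seconds: float, bits_per_second: int, total_size: int) -> int:
    """Estimate the byte offset of a playback time in a constant-bit-rate file."""
    offset = int(seconds * bits_per_second / 8)
    return min(offset, total_size - 1)


if __name__ == "__main__":
    total = 5_000_000  # would come from Content-Length of a HEAD request
    offset = cbr_seek_offset(10, 800_000, total)
    chunk = offset // 500_000
    print("Range:", chunk_range_header(chunk, 500_000, total))
```

The total size would normally come from the Content-Length of the initial HEAD request mentioned in the list above.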