The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems.It is invented by Tim Berner Lee. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access. A typical flow over HTTP involves a client machine making a request to a server, which then sends a response message.
HTTP is the standard protocol for communication between web browsers and web servers. HTTP specifies how a client and server establish a connection, how the client requests data from the server, how the server responds to that request, and finally, how the connection is closed. HTTP connections use the TCP/IP protocol for data transfer. For each request from client to server, there is a sequence of four steps:
-
The client opens a
TCPconnection to the server on port80, by default; other ports may be specified in the URL. -
The client sends a message to the server requesting the resource at a specified path. The request includes a header, and optionally (depending on the nature of the request) a blank line followed by data for the request.
-
The server sends a response to the client. The response begins with a response code, followed by a header full of metadata, a blank line, and the requested document or an error message.
-
The server closes the connection.
This is the basic HTTP 1.0 procedure. In HTTP 1.1 and later, multiple requests and responses can be sent in series over a single TCP connection. That is, steps 2 and 3 can repeat multiple times in between steps 1 and 4. Furthermore, in HTTP 1.1, requests and responses can be sent in multiple chunks. This is more scalable.
Each request and response has the same basic form: a header line, an HTTP header containing metadata, a blank line, and then a message body.
HTTP persistent connection, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using a single TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/response pair. The newer HTTP/2 protocol uses the same idea and takes it further to allow multiple concurrent requests/responses to be multiplexed over a single connection.
A client indicates that it’s willing to reuse a socket by including a Connection field in the HTTP request header with the value Keep-Alive:
Connection: Keep-Alive
This will continue until either the client or the server decides that the conversation is over and in this case they omit the Connection: header from the last message sent or, better, they add the keyword close to it:
Connection: close
The URL class transparently supports HTTP Keep-Alive unless explicitly turned off. That is, it will reuse a socket if you connect to the same server again before the server has closed the connection. You can control Java’s use of HTTP Keep-Alive with several system properties:
-
Set
http.keepAlivetotrueorfalseto enable/disable HTTP Keep-Alive. (It is enabled by default.) -
Set
http.maxConnectionsto the number of sockets you’re willing to hold open at one time. The default is5. -
Set
http.keepAlive.remainingDatatotrueto let Java clean up after abandoned connections (Java 6 or later). It isfalseby default. -
Set
sun.net.http.errorstream.enableBufferingtotrueto attempt to buffer the relatively short error streams from400- and 500-levelresponses, so the connection can be freed up for reuse sooner. It isfalseby default. -
Set
sun.net.http.errorstream.bufferSizeto the number ofbytesto use for buffering error streams. The default is4,096bytes. -
Set
sun.net.http.errorstream.timeoutto the number ofmillisecondsbefore timing out a read from the error stream. It is300milliseconds by default.
The defaults are reasonable, except that you probably do want to set sun.net.http.errorstream.enableBuffering to true unless you want to read the error streams from failed requests.
The HTTP specification includes a collection of methods that are used to interact with server-side resources. There are commonly referred to as HTTP request methods or HTTP verbs and are intended to cover all possible types of interaction with resources.
While HTTP request methods typically perform different operations, there is an overlap in functionality, and depending on the task, several HTTP requests will have to be made before it is complete. There are also HTTP method properties to consider, including whether a HTTP request is safe, idempotent, or cacheable.
Some HTTP methods are:
GETPOSTPUTHEADDELETEPATCHOPTIONSCONNECTTRACE
The two most common HTTP methods are: GET and POST.
GET is used to request data from a specified resource. Note that the query string (name/value pairs) is sent in the URL of a GET request:
/test/example.php?name1=value1&name2=value2
GETrequests can be cachedGETrequests remain in the browser historyGETrequests can be bookmarkedGETrequests should never be used when dealing with sensitive dataGETrequests have length restrictionsGETrequests are only used to request data (not modify)
POST is used to send data to a server to create/update a resource.
The data sent to the server with POST is stored in the request body of the HTTP request:
POST /test/example.php HTTP/1.1
Host: example.com
name1=value1&name2=value2
POSTrequests are never cachedPOSTrequests do not remain in the browser historyPOSTrequests cannot be bookmarkedPOSTrequests have no restrictions on data length
The GET method retrieves a representation of a resource identified by a URL. The exact location of the resource you want to GET from a server is specified by the various parts of the path and query string. How different paths and query strings map to different resources is determined by the server. The URL class doesn’t really care about that. As long as it knows the URL, it can download from it.
POST and PUT are more complex. In these cases, the client supplies the representation of the resource, in addition to the path and the query string. The representation of the resource is sent in the body of the request, after the header. That is, it sends these four items in order:
-
A starter line including the method, path and query string, and HTTP version
-
An HTTP header
-
A blank line (two successive carriage return/linefeed pairs)
-
The body
For example, this POST request sends form data to a server:
POST /example/text.php HTTP 1.0
Date: Sun, 27 Apr 2013 12:32:36
Host: www.example.org
Content-type: application/x-www-form-urlencoded
Content-length: 54
username=ram+bahadur&email=ram%40example.org
In this example, the body contains an application/x-www-form-urlencoded data, but that’s just one possibility. In general, the body can contain arbitrary bytes. However, the HTTP header should include two fields that specify the nature of the body:
-
A
Content-lengthfield that specifies how many bytes are in the body (54 in the preceding example) -
A
Content-typefield that specifies the MIME media type of the bytes (application/x-www-form-urlencodedin the preceeding example).
The application/x-www-form-urlencoded MIME type used in the preceding example is common because it’s how web browsers encode most form submissions. Thus it’s used by a lot of server-side programs that talk to browsers. However, it’s hardly the only possible type you can send in the body. For example, a camera uploading a picture to a photo sharing site can send image/jpeg. A text editor might send text/html. It’s all just bytes in the end.
Many websites use small strings of text known as cookies to store persistent client-side state between connections. Cookies are passed from server to client and back again in the HTTP headers of requests and responses. Cookies can be used by a server to indicate session IDs, shopping cart contents, login credentials, user preferences, and more. For instance, a cookie set by an online bookstore might have the value ISBN=0802099912&price=$34.95 to specify a book that I’ve put in my shopping cart.
However, more likely, the value is a meaningless string such as ATVPDKIKX0DER, which identifies a particular record in a database of some kind where the real information is kept. Usually the cookie values do not contain the data but merely point to it on the server.
Cookies are limited to nonwhitespace ASCII text, and may not contain commas or semicolons.
To set a cookie in a browser, the server includes a Set-Cookie header line in the HTTP header. For example, this HTTP header sets the cookie “cart” to the value “ATVPDKIKX0DER”:
HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: cart=ATVPDKIKX0DER
If a browser makes a second request to the same server, it will send the cookie back in a Cookie line in the HTTP request header like so:
GET /index.html HTTP/1.1
Host: www.example.org
Cookie: cart=ATVPDKIKX0DER
Accept: text/html
As long as the server doesn’t reuse cookies, this enables it to track individual users and sessions across multiple, otherwise stateless, HTTP connections.
The CookieManager class provides a precise implementation of CookieHandler. This separates the storage of cookies from the policy surrounding accepting and rejecting cookies. A CookieManager is initialized with a CookieStore and a CookiePolicy. The CookieStore manages storage, and the CookiePolicy object makes policy decisions on cookie acceptance/rejection.
Java 5 includes an abstract java.net.CookieHandler class that defines an API for storing and retrieving cookies. However, it does not include an implementation of that abstract class, so it requires a lot of grunt work. Java 6 fleshes this out by adding a concrete java.net.CookieManager subclass of CookieHandler that you can use. However, it is not turned on by default. Before Java will store and return cookies, you need to enable it:
CookieManager manager = new CookieManager();
CookieHandler.setDefault(manager);If all you want is to receive cookies from sites and send them back to those sites, you’re done. That’s all there is to it. After installing a CookieManager with those two lines of code, Java will store any cookies sent by HTTP servers you connect to with the URL class, and will send the stored cookies back to those same servers in subsequent requests. However, you may wish to be a bit more careful about whose cookies you accept. You can do this by specifying a CookiePolicy. Three policies are predefined:
CookiePolicy.ACCEPT_ALL: All cookies allowed.CookiePolicy.ACCEPT_NONE: No cookies allowed.CookiePolicy.ACCEPT_ORIGINAL_SERVER: Only first party cookies allowed.
For example, this code fragment tells Java to block third-party cookies but accept firstparty cookies:
CookieManager manager = new CookieManager();
manager.setCookiePolicy(CookiePolicy.ACCEPT_ORIGINAL_SERVER);
CookieHandler.setDefault(manager);That is, it will only accept cookies for the server that you’re talking to, not for any server on the Internet.
If you want more fine-grained control, for instance to allow cookies from some known domains but not others, you can implement the CookiePolicy interface yourself and override the shouldAccept() method:
public boolean shouldAccept(URI uri, HttpCookie cookie)The methods inside the CookieManager class:
getCookieStore(): This method will retrieve the current cookie store.setCookiePolicy(CookiePolicy cookiePolicy): This method will set the cookie policy of this cookie manager.get(URI uri, Map<String, List<String>> requestHeaders): This method will get all the applicable cookies from a cookie cache for the specified URI in the request header.put(URI uri, Map<String, List<String>> responseHeaders): This method will set all the applicable cookies.
Example:
import java.io.IOException;
import java.net.*;
import java.util.List;
public class CookieManagerExample {
public static void main(String[] args) throws IOException {
String uri = "https://www.example.com/";
CookieManager cookieManager = new CookieManager();
CookieHandler.setDefault(cookieManager);
CookiePolicy cookiePolicy = CookiePolicy.ACCEPT_ORIGINAL_SERVER;
cookieManager.setCookiePolicy(cookiePolicy);
URL url = new URL(uri);
URLConnection connection = url.openConnection();
connection.getContent();
CookieStore cookieStore = cookieManager.getCookieStore();
List<HttpCookie> cookieList = cookieStore.getCookies();
for (HttpCookie cookie : cookieList) {
System.out.println("Domain name is: " + cookie.getDomain());
System.out.println("Cookie name is: " + cookie.getName());
}
}
}A CookieStore is an interface in Java that is a storage area for cookies. It is used to store and retrieve cookies. A CookieStore is responsible for removing HTTPCookie instances that have expired.
The CookieManager adds the cookies to the CookieStore for every incoming HTTP response by calling CookieStore.add() and retrieves the cookies from the CookieStore for every outgoing HTTP request by calling CookieStore.get().
It is sometimes necessary to put and get cookies locally. For instance, when an application quits, it can save the cookie store to disk and load those cookies again when it next starts up. You can retrieve the store in which the CookieManager saves its cookies with the getCookieStore() method:
CookieStore store = manager.getCookieStore();The CookieStore class allows you to add, remove, and list cookies so you can control the cookies that are sent outside the normal flow of HTTP requests and responses:
public void add(URI uri, HttpCookie cookie): Adds one HTTP cookie to the store.public List<HttpCookie> get(URI uri): Retrieves cookies whose domain matches theURI.public List<HttpCookie> getCookies(): Get all cookies inCookieStorewhich are not expired.public List<URI> getURIs(): Get allURIsthat identify cookies inCookieStore.public boolean remove(URI uri, HttpCookie cookie): Removes a cookie fromCookieStore.public boolean removeAll(): Removes all cookies in theCookieStore.
Example:
import java.net.*;
import java.util.*;
public class CookieStoreExample {
public static void main(String[] args) {
// CookieManager and CookieStore
CookieManager cookieManager = new CookieManager();
CookieStore cookieStore = cookieManager.getCookieStore();
// creating cookies and URI
HttpCookie cookieA = new HttpCookie("First", "1");
HttpCookie cookieB = new HttpCookie("Second", "2");
URI uri = URI.create("https://www.example.com/");
// Method 1 - add(URI uri, HttpCookie cookie)
cookieStore.add(uri, cookieA);
cookieStore.add(null, cookieB);
System.out.println("Cookies successfully added\n");
// Method 2 - get(URI uri)
List cookiesWithURI = cookieStore.get(uri);
System.out.println("Cookies associated with URI in CookieStore : " + cookiesWithURI + "\n");
// Method 3 - getCookies()
List cookieList = cookieStore.getCookies();
System.out.println("Cookies in CookieStore : " + cookieList + "\n");
// Method 4 - getURIs()
List uriList = cookieStore.getURIs();
System.out.println("URIs in CookieStore" + uriList + "\n");
// Method 5 - remove(URI uri, HttpCookie cookie)
System.out.println("Removal of Cookie : " + cookieStore.remove(uri, cookieA));
List remainingCookieList = cookieStore.getCookies();
System.out.println("Remaining Cookies : " + cookieList + "\n");
// Method 6 - removeAll()
System.out.println("Removal of all Cookies : " + cookieStore.removeAll());
List EmptyCookieList = cookieStore.getCookies();
System.out.println("Empty CookieStore : " + cookieList);
}
}Example:
Cookies successfully added
Cookies associated with URI in CookieStore : [First="1"]
Cookies in CookieStore : [First="1", Second="2"]
URIs in CookieStore[http://www.example.com]
Removal of Cookie : true
Remaining Cookies : [Second="2"]
Removal of all Cookies : true
Empty CookieStore : []