<
meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<
title>Chapter
1. Fundamentals<
/title><
link rel="stylesheet" href="css/hc-tutorial.css" type="text/css"><
meta name="generator" content="DocBook XSL-NS Stylesheets V1.73.2"><
link rel="start" href="index.html" title="HttpClient Tutorial"><
link rel="up" href="index.html" title="HttpClient Tutorial"><
link rel="prev" href="preface.html" title="Preface"><
link rel="next" href="connmgmt.html" title="Chapter 2. Connection management"><
/head><
body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><
div xmlns:fo
="http://www.w3.org/1999/XSL/Format" class="banner"><
a class="bannerLeft" href="http://www.apache.org/" title="Apache Software Foundation"><
img style="border:none;" src="images/asf_logo_wide.gif"><
/a><
a class="bannerRight" href="http://hc.apache.org/httpcomponents-core/" title="Apache HttpComponents Core"><
img style="border:none;" src="images/hc_logo.png"><
/a><
div class="clear"><
/div><
/div><
div class="chapter" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title"><
a name="fundamentals"><
/a>Chapter
1. Fundamentals<
/h2><
/div><
/div><
/div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e37"><
/a>
1.1. Request execution<
/h2><
/div><
/div><
/div>
<
p>The most essential function of HttpClient is to execute HTTP methods. Execution of an
HTTP method involves one or several HTTP request / HTTP response exchanges, usually
handled internally by HttpClient. The user is expected to provide a request object to
execute, and HttpClient is expected to transmit the request to the target server and
return a corresponding response object, or throw an exception if execution was
unsuccessful.<
/p>
<
p> Quite naturally, the main entry point of the HttpClient API is the HttpClient
interface that defines the contract described above. <
/p>
<
p>Here is an example of the request execution process in its simplest form:<
/p>
<
pre class="programlisting">
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream instream = entity.getContent();
    int l;
    byte[] tmp = new byte[2048];
    // Read the content to the end of the stream; the bytes are discarded here
    while ((l = instream.read(tmp)) != -1) {
    }
}
</pre>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e43"><
/a>1.1.1. HTTP request<
/h3><
/div><
/div><
/div>
<
p>All HTTP requests have a request line consisting of a method name, a request URI, and an
HTTP protocol version.<
/p>
<
p>HttpClient supports out of the box all HTTP methods defined in the HTTP/1.1
specification: <code class="literal">GET</code>, <code class="literal">HEAD</code>,
<code class="literal">POST</code>, <code class="literal">PUT</code>,
<code class="literal">DELETE</code>, <code class="literal">TRACE</code> and
<code class="literal">OPTIONS</code>. There is a special class for each method type: <
code class="classname">HttpGet<
/code>,
<
code class="classname">HttpHead<
/code>, <
code class="classname">HttpPost<
/code>,
<
code class="classname">HttpPut<
/code>, <
code class="classname">HttpDelete<
/code>,
<
code class="classname">HttpTrace<
/code>, and <
code class="classname">HttpOptions<
/code>.<
/p>
<
p>The Request-URI is a Uniform Resource Identifier that identifies the resource upon
which to apply the request. HTTP request URIs consist of a protocol scheme, host
name, optional port, resource path, optional query, and optional fragment.<
/p>
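<p>As a rough illustration of these components, the sketch below takes a URI apart with the
JDK's own <code class="classname">java.net.URI</code> rather than with any HttpClient class.
The class name <code class="classname">UriParts</code> and the sample URI are made up for
this example.</p>

```java
import java.net.URI;

final class UriParts {
    /** Lists each component of the given URI on its own line. */
    static String describe(URI uri) {
        return "scheme=" + uri.getScheme() + "\n"
             + "host=" + uri.getHost() + "\n"
             + "port=" + uri.getPort() + "\n"
             + "path=" + uri.getPath() + "\n"
             + "query=" + uri.getQuery() + "\n"
             + "fragment=" + uri.getFragment();
    }

    public static void main(String[] args) {
        // Hypothetical URI that carries every component named above
        URI uri = URI.create("http://www.example.com:8080/search?q=httpclient&hl=en#top");
        System.out.println(describe(uri));
    }
}
```

<p>When the URI carries no explicit port, <code class="methodname">getPort()</code> returns -1,
and absent optional components such as the query or fragment are reported as null.</p>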
<
pre class="programlisting">
HttpGet httpget = new HttpGet(
    "http://www.google.com/search?hl=en&q=httpclient&btnG=Google+Search&aq=f&oq=");
</pre>
<
p>HttpClient provides a number of utility methods to simplify creation and
modification of request URIs.<
/p>
<
p>A URI can be assembled programmatically:<
/p>
<
pre class="programlisting">
URI uri = URIUtils.createURI("http", "www.google.com", -1, "/search",
    "q=httpclient&btnG=Google+Search&aq=f&oq=", null);
HttpGet httpget = new HttpGet(uri);
System.out.println(httpget.getURI());
</pre>
<p>stdout ></p>
<pre class="programlisting">
http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
</pre>
<
p>A query string can also be generated from individual parameters:<
/p>
<
pre class="programlisting">
List<NameValuePair> qparams = new ArrayList<NameValuePair>();
qparams.add(new BasicNameValuePair("q", "httpclient"));
qparams.add(new BasicNameValuePair("btnG", "Google Search"));
qparams.add(new BasicNameValuePair("aq", "f"));
qparams.add(new BasicNameValuePair("oq", null));
URI uri = URIUtils.createURI("http", "www.google.com", -1, "/search",
    URLEncodedUtils.format(qparams, "UTF-8"), null);
HttpGet httpget = new HttpGet(uri);
System.out.println(httpget.getURI());
</pre>
<p>stdout ></p>
<pre class="programlisting">
http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e72"><
/a>1.1.2. HTTP response<
/h3><
/div><
/div><
/div>
<
p>An HTTP response is a message sent by the server back to the client after having
received and interpreted a request message. The first line of that message consists
of the protocol version followed by a numeric status code and its associated textual
phrase.</p>
<
pre class="programlisting">
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
System.out.println(response.getProtocolVersion());
System.out.println(response.getStatusLine().getStatusCode());
System.out.println(response.getStatusLine().getReasonPhrase());
System.out.println(response.getStatusLine().toString());
</pre>
<p>stdout ></p>
<pre class="programlisting">
HTTP/1.1
200
OK
HTTP/1.1 200 OK
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e78"><
/a>1.1.3. Working with message headers<
/h3><
/div><
/div><
/div>
<
p>An HTTP message can contain a number of
headers describing properties of the
message such as the content length, content type and so on. HttpClient provides
methods to retrieve, add, remove and enumerate
headers.<
/p>
<
pre class="programlisting">
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
    "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
    "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
Header h1 = response.getFirstHeader("Set-Cookie");
System.out.println(h1);
Header h2 = response.getLastHeader("Set-Cookie");
System.out.println(h2);
Header[] hs = response.getHeaders("Set-Cookie");
System.out.println(hs.length);
</pre>
<p>stdout ></p>
<pre class="programlisting">
Set-Cookie: c1=a; path=/; domain=localhost
Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
2
</pre>
<
p>The most efficient way to obtain all
headers of a given
type is by using the
<
code class="interfacename">HeaderIterator<
/code> interface.<
/p>
<
pre class="programlisting">
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
    "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
    "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
HeaderIterator it = response.headerIterator("Set-Cookie");
while (it.hasNext()) {
    System.out.println(it.next());
}
</pre>
<p>stdout ></p>
<pre class="programlisting">
Set-Cookie: c1=a; path=/; domain=localhost
Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
</pre>
<
p>It also provides convenience methods to parse HTTP messages into individual header
elements.</p>
<pre class="programlisting">
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
    HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
    "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
    "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
HeaderElementIterator it = new BasicHeaderElementIterator(
    response.headerIterator("Set-Cookie"));
while (it.hasNext()) {
    HeaderElement elem = it.nextElement();
    System.out.println(elem.getName() + " = " + elem.getValue());
    NameValuePair[] params = elem.getParameters();
    for (int i = 0; i < params.length; i++) {
        System.out.println(" " + params[i]);
    }
}
</pre>
<p>stdout ></p>
<pre class="programlisting">
c1 = a
 path=/
 domain=localhost
c2 = b
 path=/
c3 = c
 domain=localhost
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e93"><
/a>1.1.4. HTTP entity<
/h3><
/div><
/div><
/div>
<
p>HTTP messages can carry a content entity associated with the request or response.
Entities can be found in some requests and in some responses, as they are optional.
Requests that use entities are referred to as entity enclosing requests. The HTTP
specification defines two entity enclosing methods: <code class="literal">POST</code> and
<code class="literal">PUT</code>. Responses are usually expected to enclose a content
entity. There are exceptions to this rule, such as responses to the
<code class="literal">HEAD</code> method and <code class="literal">204 No Content</code>,
<code class="literal">304 Not Modified</code>, and <code class="literal">205 Reset Content</code>
responses.</p>
<
p>HttpClient distinguishes three kinds of entities, depending on where their
content originates:</p>
<div class="itemizedlist"><ul type="disc">
<li><p><b>streamed: </b>The content is received from a stream, or generated on the fly. In
particular, this category includes entities being received from HTTP
responses. Streamed entities are generally not repeatable.</p></li>
<li><p><b>self-contained: </b>The content is in memory or obtained by means that are independent
from a connection or other entity. Self-contained entities are generally
repeatable. This type of entity will mostly be used for entity
enclosing HTTP requests.</p></li>
<li><p><b>wrapping: </b>The content is obtained from another entity.</p></li>
</ul></div>
<
p>This distinction is important
for connection management when streaming out
content
from an HTTP response. For request entities that are created by an application and
only sent using HttpClient, the difference between streamed and self-contained is of
little importance. In that case, it is suggested to consider non-repeatable entities
as streamed, and those that are repeatable as self-contained.<
/p>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h4 class="title"><
a name="d4e117"><
/a>1.1.4.1. Repeatable entities<
/h4><
/div><
/div><
/div>
<
p>An entity can be repeatable, meaning its content can be read more than once.
This is only possible with self-contained entities (like
<code class="classname">ByteArrayEntity</code> or
<code class="classname">StringEntity</code>).<
/p>
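<p>The difference can be sketched with plain JDK streams, without any HttpClient classes. In
this hypothetical <code class="classname">RepeatableDemo</code>, a byte array stands in for a
self-contained entity and a single stream instance stands in for a streamed one.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class RepeatableDemo {
    /** Reads the stream to its end and returns what was read as a UTF-8 string. */
    static String drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[2048];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "important message".getBytes(StandardCharsets.UTF_8);

        // Self-contained: the bytes stay in memory, so every read can open a fresh stream
        System.out.println(drain(new ByteArrayInputStream(content)));
        System.out.println(drain(new ByteArrayInputStream(content)));

        // Streamed: one stream instance is exhausted after the first pass
        InputStream once = new ByteArrayInputStream(content);
        System.out.println(drain(once));
        System.out.println(drain(once).isEmpty()); // nothing is left to read
    }
}
```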
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h4 class="title"><
a name="d4e122"><
/a>1.1.4.2. Using HTTP entities<
/h4><
/div><
/div><
/div>
<
p>Since an entity can represent both binary and character content, it has
support for character encodings (to support the latter, i.e. character
content).</p>
<p>The entity is created when executing a request with enclosed content or when
the request was successful and the response body is used to send the result back
to the client.</p>
<
p>To read the content from the entity, one can either retrieve the input stream
via the <code class="methodname">HttpEntity#getContent()</code> method, which returns
a <code class="classname">java.io.InputStream</code>, or one can supply an output
stream to the <code class="methodname">HttpEntity#writeTo(OutputStream)</code> method,
which will return once all content has been written to the given stream.<
/p>
<
p>When the entity has been received with an incoming message, the
<code class="methodname">HttpEntity#getContentType()</code> and
<code class="methodname">HttpEntity#getContentLength()</code> methods can be used
for reading the common metadata such as <code class="literal">Content-Type</code> and
<code class="literal">Content-Length</code> headers (if they are available). Since the
<code class="literal">Content-Type</code> header can contain a character encoding for
text MIME types like text/plain or text/html, the
<code class="methodname">HttpEntity#getContentEncoding()</code> method is used to
read this information. If the headers aren't available, a length of -1 will be
returned, and null for the content type. If the <code class="literal">Content-Type</code>
header is available, a <code class="interfacename">Header</code> object will be
returned.</p>
<p>When creating an entity for an outgoing message, this metadata has to be
supplied by the creator of the entity.</p>
<pre class="programlisting">
StringEntity myEntity = new StringEntity("important message",
"UTF-8");
System.out.println(myEntity.getContentType());
System.out.println(myEntity.getContentLength());
System.out.println(EntityUtils.getContentCharSet(myEntity));
System.out.println(EntityUtils.toString(myEntity));
System.out.println(EntityUtils.toByteArray(myEntity).length);
</pre>
<p>stdout ></p>
<pre class="programlisting">
Content-Type: text/plain; charset=UTF-8
17
UTF-8
important message
17
</pre>
</div>
</div>
<div class="section" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="d4e143"></a>1.1.5. Ensuring release of low level resources</h3></div></div></div>
<p>When finished with a response entity, it's important to ensure that all entity
content has been fully consumed, so that the connection can be safely returned to
the connection pool and re-used by the connection manager for subsequent requests.
The easiest way to do so is to call the
<code class="methodname">HttpEntity#consumeContent()</code> method to consume any
available content on the stream. HttpClient will automatically release the
underlying connection back to the connection manager as soon as it detects that the
end of the content stream has been reached. The
<code class="methodname">HttpEntity#consumeContent()</code> method is safe to call more
than once.</p>
<
p>There can be situations, however, when only a small portion of the entire response
content needs to be retrieved and the performance penalty for consuming the
remaining content and making the connection reusable is too high. In such cases one
can simply terminate the request by calling the
<code class="methodname">HttpUriRequest#abort()</code> method.</p>
<
pre class="programlisting">
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream instream = entity.getContent();
    int byteOne = instream.read();
    int byteTwo = instream.read();
    // Do not need the rest
    httpget.abort();
}
</pre>
<p>The connection will not be reused, but all low level resources held by it will be
correctly deallocated.</p>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e152"><
/a>1.1.6. Consuming entity content<
/h3><
/div><
/div><
/div>
<
p>The recommended way to consume
content of an entity is by using its
<
code class="methodname">HttpEntity#getContent
()<
/code> or
<
code class="methodname">HttpEntity#writeTo
(OutputStream
)<
/code> methods. HttpClient
also comes with the <
code class="classname">EntityUtils<
/code>
class, which exposes several
static methods to more easily read the content or information from an entity.
Instead of reading the <
code class="classname">java.io.InputStream<
/code> directly, one can
retrieve the whole content body in a string / byte array by using the methods from
this
class. However, the use of <
code class="classname">EntityUtils<
/code> is
strongly discouraged unless the response entities originate from a trusted HTTP
server and are known to be of limited length.<
/p>
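<p>One way to guard against unexpectedly large content is to cap the number of bytes read.
The sketch below uses only JDK streams; the <code class="classname">BoundedRead</code> class
and its <code class="methodname">readAtMost</code> method are hypothetical helpers and not
part of HttpClient.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BoundedRead {
    /** Reads the whole stream, failing as soon as more than limit bytes have arrived. */
    static byte[] readAtMost(InputStream in, int limit) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[2048];
            int n;
            while ((n = in.read(buf)) != -1) {
                if (out.size() + n > limit) {
                    throw new IllegalStateException("content exceeds " + limit + " bytes");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A 10-byte source fits within a 2048-byte cap
        InputStream small = new ByteArrayInputStream(new byte[10]);
        System.out.println(readAtMost(small, 2048).length);
    }
}
```

<p>With a real response, the stream passed in would be the one obtained from the entity,
and the limit would be chosen by the application.</p>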
<
pre class="programlisting">
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    long len = entity.getContentLength();
    if (len != -1 && len < 2048) {
        System.out.println(EntityUtils.toString(entity));
    } else {
        // Stream content out
    }
}
</pre>
<
p>In some situations it may be necessary to be able to read entity content more than
once. In this case entity content must be buffered in some way, either in memory or
on disk. The simplest way to accomplish that is by wrapping the original entity with
the <code class="classname">BufferedHttpEntity</code> class. This will cause the content of
the original entity to be read into an in-memory buffer. In all other ways the entity
wrapper will behave like the original one.<
/p>
<
pre class="programlisting">
HttpGet httpget = new HttpGet("http://localhost/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null) {
    entity = new BufferedHttpEntity(entity);
}
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e164"><
/a>1.1.7. Producing entity content<
/h3><
/div><
/div><
/div>
<
p>HttpClient provides several classes that can be used to efficiently stream out
content through HTTP connections. Instances of those classes can be associated with
entity enclosing requests such as <
code class="literal">POST<
/code> and <
code class="literal">PUT<
/code>
in order to enclose entity content into outgoing HTTP requests. HttpClient provides
several classes for most common data containers such as string, byte array, input
stream, and file: <
code class="classname">StringEntity<
/code>,
<
code class="classname">ByteArrayEntity<
/code>,
<
code class="classname">InputStreamEntity<
/code>, and
<
code class="classname">FileEntity<
/code>.<
/p>
<
pre class="programlisting">
File file = new File("somefile.txt");
FileEntity entity = new FileEntity(file, "text/plain; charset=\"UTF-8\"");
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
</pre>
<
p>Please note <
code class="classname">InputStreamEntity<
/code> is not repeatable, because it
can only read from the underlying data stream once. Generally it is recommended to
implement a custom <
code class="interfacename">HttpEntity<
/code>
class which is
self-contained instead of using generic <
code class="classname">InputStreamEntity<
/code>.
<
code class="classname">FileEntity<
/code> can be a good starting point.<
/p>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h4 class="title"><
a name="d4e179"><
/a>1.1.7.1. Dynamic
content entities<
/h4><
/div><
/div><
/div>
<
p>Often HTTP entities need to be generated dynamically based on a particular
execution context. HttpClient provides support for dynamic entities by using the
<code class="classname">EntityTemplate</code> entity class and the
<code class="interfacename">ContentProducer</code> interface. Content producers
are objects which produce their content on demand, by writing it out to an
output stream. They are expected to be able to produce their content every time
they are requested to do so. So entities created with
<code class="classname">EntityTemplate</code> are generally self-contained and
repeatable.</p>
<
pre class="programlisting">
ContentProducer cp = new ContentProducer() {
    public void writeTo(OutputStream outstream) throws IOException {
        Writer writer = new OutputStreamWriter(outstream, "UTF-8");
        writer.write("<response>");
        writer.write("  <content>");
        writer.write("    important stuff");
        writer.write("  </content>");
        writer.write("</response>");
        writer.flush();
    }
};
HttpEntity entity = new EntityTemplate(cp);
HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h4 class="title"><
a name="d4e186"><
/a>1.1.7.2. HTML forms<
/h4><
/div><
/div><
/div>
<
p>Many applications frequently need to simulate the process of submitting an
HTML form, for instance, in order to log in to a web application or submit input
data. HttpClient provides the special entity class
<code class="classname">UrlEncodedFormEntity</code> to facilitate the
process.</p>
<
pre class="programlisting">
List<NameValuePair> formparams = new ArrayList<NameValuePair>();
formparams.add(new BasicNameValuePair("param1", "value1"));
formparams.add(new BasicNameValuePair("param2", "value2"));
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, "UTF-8");
HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);
</pre>
<p>This <code class="classname">UrlEncodedFormEntity</code> instance will use the so
called URL encoding to encode parameters and produce the following
content:</p>
<pre class="programlisting">
param1=value1&param2=value2
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h4 class="title"><
a name="d4e194"><
/a>1.1.7.3.
Content chunking<
/h4><
/div><
/div><
/div>
<
p>Generally it is recommended to let HttpClient choose the most appropriate
transfer encoding based on the properties of the HTTP message being transferred.
It is possible, however, to inform HttpClient that chunk coding is preferred
by setting <code class="methodname">HttpEntity#setChunked()</code> to true. Please note
that HttpClient will use this flag as a hint only. This value will be ignored
when using HTTP protocol versions that do not support chunk coding, such as
HTTP/1.0.</p>
<
pre class="programlisting">
// The second argument is the charset name, as in the earlier StringEntity example
StringEntity entity = new StringEntity("important message",
    "UTF-8");
entity.setChunked(true);
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
</pre>
</div>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e199"><
/a>1.1.8. Response handlers<
/h3><
/div><
/div><
/div>
<
p>The simplest and the most convenient way to handle responses is by using
<
code class="interfacename">ResponseHandler<
/code> interface. This
method completely
relieves the user from having to worry about connection management. When using a
<
code class="interfacename">ResponseHandler<
/code> HttpClient will automatically
take care of ensuring release of the connection back to the connection manager
regardless of whether the request execution succeeds or causes an exception.<
/p>
<
pre class="programlisting">
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://localhost/");
ResponseHandler<byte[]> handler = new ResponseHandler<byte[]>() {
    public byte[] handleResponse(
            HttpResponse response) throws ClientProtocolException, IOException {
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            return EntityUtils.toByteArray(entity);
        } else {
            return null;
        }
    }
};
byte[] response = httpclient.execute(httpget, handler);
</pre>
</div>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e205"><
/a>
1.2. HTTP execution context<
/h2><
/div><
/div><
/div>
<
p>Originally HTTP was designed as a stateless, request-response oriented protocol.
However, real world applications often need to be able to persist state information
through several logically related request-response exchanges. In order to enable
applications to maintain a processing state, HttpClient allows HTTP requests to be
executed within a particular execution context, referred to as HTTP context. Multiple
logically related requests can participate in a logical session if the same context is
reused between consecutive requests. HTTP context functions similarly to
<code class="interfacename">java.util.Map<String, Object></code>. It is
simply a collection of arbitrary named values. Applications can populate context
attributes prior to a request execution or examine the context after the execution has
been completed.</p>
<
p>In the course of HTTP request execution HttpClient adds the following attributes to
the execution context:<
/p>
<
div class="itemizedlist"><ul type="disc">
<li><p><b>'http.connection': </b><code class="interfacename">HttpConnection</code> instance representing the
actual connection to the target server.</p></li>
<li><p><b>'http.target_host': </b><code class="classname">HttpHost</code> instance representing the connection
target.</p></li>
<li><p><b>'http.proxy_host': </b><code class="classname">HttpHost</code> instance representing the connection
proxy, if used.</p></li>
<li><p><b>'http.request': </b><code class="interfacename">HttpRequest</code> instance representing the
actual HTTP request.</p></li>
<li><p><b>'http.response': </b><code class="interfacename">HttpResponse</code> instance representing the
actual HTTP response.</p></li>
<li><p><b>'http.request_sent': </b><code class="classname">java.lang.Boolean</code> object representing the flag
indicating whether the actual request has been fully transmitted to the
connection target.</p></li>
</ul></div>
<
p>For instance, in order to determine the final redirect target, one can examine the
value of the <code class="literal">http.target_host</code> attribute after the request
execution:</p>
<
pre class="programlisting">
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://www.google.com/");
HttpResponse response = httpclient.execute(httpget, localContext);
HttpHost target = (HttpHost) localContext.getAttribute(
    ExecutionContext.HTTP_TARGET_HOST);
System.out.println("Final target: " + target);
HttpEntity entity = response.getEntity();
if (entity != null) {
    entity.consumeContent();
}
</pre>
<p>stdout ></p>
<pre class="programlisting">
Final target: http://www.google.ch
</pre>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e246"><
/a>
1.3. Exception handling<
/h2><
/div><
/div><
/div>
<
p>HttpClient can throw two types of exceptions:
<code class="exceptionname">java.io.IOException</code> in case of an I/O failure such as
a socket timeout or a socket reset, and <code class="exceptionname">HttpException</code> that
signals an HTTP failure such as a violation of the HTTP protocol. Usually I/O errors are
considered non-fatal and recoverable, whereas HTTP protocol errors are considered fatal
and cannot be automatically recovered from.<
/p>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e251"><
/a>1.3.1. HTTP transport safety<
/h3><
/div><
/div><
/div>
<
p>It is important to understand that the HTTP protocol is not well suited
for all
types of applications. HTTP is a simple request/response oriented protocol which was
initially designed to support static or dynamically generated content retrieval. It
has never been intended to support transactional operations. For instance, the HTTP
server will consider its part of the contract fulfilled if it succeeds in receiving
and processing the request, generating a response and sending a status code back to
the client. The server will make no attempts to roll back the transaction if the
client fails to receive the response in its entirety due to a read timeout, a
request cancellation or a system crash. If the client decides to retry the same
request, the server will inevitably end up executing the same transaction more than
once. In some cases this may lead to application data corruption or inconsistent
application state.</p>
<
p>Even though HTTP has never been designed to support transactional processing, it
can still be used as a transport protocol for mission critical applications provided
certain conditions are met. To ensure HTTP transport layer safety the system must
ensure the idempotency of HTTP methods on the application layer.<
/p>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e255"><
/a>1.3.2. Idempotent methods<
/h3><
/div><
/div><
/div>
<
p>The HTTP/1.1 specification defines an idempotent method as<
/p>
[<span class="citation">Methods can also have the property of "idempotence" in
that (aside from error or expiration issues) the side-effects of N > 0
identical requests is the same as for a single request</span>]
<
p>In other words the application ought to ensure that it is prepared to deal with
the implications of multiple execution of the same method. This can be achieved, for
instance, by providing a unique transaction id and by other means of avoiding
execution of the same logical operation.<
/p>
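<p>A minimal sketch of the transaction id approach, written from the point of view of the
receiving application and using only the JDK. The <code class="classname">IdempotentAccount</code>
class is hypothetical and keeps everything in memory; a real application would need durable
storage for the applied ids.</p>

```java
import java.util.HashMap;
import java.util.Map;

final class IdempotentAccount {
    private long balance;
    // Balance recorded for each transaction id that has already been applied
    private final Map<String, Long> applied = new HashMap<>();

    /**
     * Applies the deposit at most once per transaction id and returns the
     * balance that resulted from the first application.
     */
    synchronized long deposit(String transactionId, long amount) {
        Long previous = applied.get(transactionId);
        if (previous != null) {
            return previous; // a retried request: replay the recorded result
        }
        balance += amount;
        applied.put(transactionId, balance);
        return balance;
    }

    synchronized long balance() {
        return balance;
    }

    public static void main(String[] args) {
        IdempotentAccount account = new IdempotentAccount();
        System.out.println(account.deposit("tx-1", 100)); // first attempt
        System.out.println(account.deposit("tx-1", 100)); // retry of the same request
        System.out.println(account.deposit("tx-2", 50));  // a new logical operation
    }
}
```

<p>Because the retried request carries the same id, it does not change the balance a second
time, so the client can resend it safely after a timeout.</p>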
<
p>Please note that this problem is not specific to HttpClient. Browser based
applications are subject to exactly the same issues related to the non-idempotency of HTTP
methods.</p>
<
p>HttpClient assumes non-entity enclosing methods such as <
code class="literal">GET<
/code> and
<
code class="literal">HEAD<
/code> to be idempotent and entity enclosing methods such as
<
code class="literal">POST<
/code> and <
code class="literal">PUT<
/code> to be not.<
/p>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e267"><
/a>1.3.3. Automatic exception recovery<
/h3><
/div><
/div><
/div>
<
p>By default HttpClient attempts to automatically recover from I/O exceptions. The
default auto-recovery mechanism is limited to just a few exceptions that are known
to be safe.</p>
<div class="itemizedlist"><ul type="disc">
<li><p>HttpClient will make no attempt to recover from any logical or HTTP
protocol errors (those derived from the
<code class="exceptionname">HttpException</code> class).</p></li>
<li><p>HttpClient will automatically retry those methods that are assumed to be
idempotent.</p></li>
<li><p>HttpClient will automatically retry those methods that fail with a
transport exception while the HTTP request is still being transmitted to the
target server (i.e. the request has not been fully transmitted to the
server).</p></li>
<li><p>HttpClient will automatically retry those methods that have been fully
transmitted to the server, but the server failed to respond with an HTTP
status code (the server simply drops the connection without sending anything
back). In this case it is assumed that the request has not been processed by
the server and the application state has not changed. If this assumption may
not hold true for the web server your application is targeting, it is highly
recommended to provide a custom exception handler.</p></li>
</ul></div>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e280"><
/a>1.3.4. Request retry handler<
/h3><
/div><
/div><
/div>
<
p>In order to enable a custom exception recovery mechanism one should provide an
implementation of the <code class="interfacename">HttpRequestRetryHandler</code>
interface.</p>
<
pre class="programlisting">
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpRequestRetryHandler myRetryHandler = new HttpRequestRetryHandler() {
    public boolean retryRequest(
            IOException exception,
            int executionCount,
            HttpContext context) {
        if (executionCount >= 5) {
            // Do not retry if over max retry count
            return false;
        }
        if (exception instanceof NoHttpResponseException) {
            // Retry if the server dropped connection on us
            return true;
        }
        if (exception instanceof SSLHandshakeException) {
            // Do not retry on SSL handshake exception
            return false;
        }
        HttpRequest request = (HttpRequest) context.getAttribute(
            ExecutionContext.HTTP_REQUEST);
        boolean idempotent = !(request instanceof HttpEntityEnclosingRequest);
        if (idempotent) {
            // Retry if the request is considered idempotent
            return true;
        }
        return false;
    }
};
httpclient.setHttpRequestRetryHandler(myRetryHandler);
</pre>
</div>
</div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e285"><
/a>
1.4. Aborting requests<
/h2><
/div><
/div><
/div>
<
p>In some situations HTTP request execution fails to complete within the expected time
frame due to high load on the target server or too many concurrent requests issued on
the client side. In such cases it may be necessary to terminate the request prematurely
and unblock the execution thread blocked in an I/O operation. HTTP requests being
executed by HttpClient can be aborted at any stage of execution by invoking the
<code class="methodname">HttpUriRequest#abort()</code> method. This method is thread-safe
and can be called from any thread. When an HTTP request is aborted its execution thread
blocked in an I/O operation is guaranteed to unblock by throwing an
<code class="exceptionname">InterruptedIOException</code>.<
/p>
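<p>The sketch below is a JDK-only analogue of this situation and does not use HttpClient
itself: a worker thread blocks in a read that never completes, a second thread releases it,
and the worker treats the resulting
<code class="exceptionname">InterruptedIOException</code> as a cancellation. The pipe and
the use of <code class="methodname">Thread#interrupt()</code> are assumptions made to keep
the example self-contained; with HttpClient the release would be triggered by the abort
method described above.</p>
<pre class="programlisting">
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class AbortSketch {

    // Blocks a worker in a read, releases it from the calling thread and
    // reports how the worker finished.
    static String demo() {
        final String[] outcome = new String[] { "not finished" };
        try {
            final PipedInputStream in = new PipedInputStream();
            PipedOutputStream neverWritten = new PipedOutputStream(in);
            Thread worker = new Thread(new Runnable() {
                public void run() {
                    try {
                        in.read(); // blocks: no data is ever written
                        outcome[0] = "completed";
                    } catch (InterruptedIOException ex) {
                        outcome[0] = "aborted"; // treated as a cancellation
                    } catch (IOException ex) {
                        outcome[0] = "failed: " + ex.getMessage();
                    }
                }
            });
            worker.start();
            Thread.sleep(200);  // give the worker time to block
            worker.interrupt(); // release the blocked thread
            worker.join(5000);
            neverWritten.close();
        } catch (Exception ex) {
            outcome[0] = "error: " + ex;
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
</pre>
<p>Catching <code class="exceptionname">InterruptedIOException</code> separately from other
I/O errors lets the caller distinguish a deliberate cancellation from a genuine transport
failure.</p>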
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e290"><
/a>
1.5. HTTP protocol interceptors<
/h2><
/div><
/div><
/div>
<
p>An HTTP protocol interceptor is a routine that implements a specific aspect of the HTTP
protocol. Usually protocol interceptors are expected to act upon one specific header or
a group of related headers of the incoming message or populate the outgoing message with
one specific header or a group of related headers. Protocol interceptors can also
manipulate content entities enclosed with messages, transparent content compression /
decompression being a good example. Usually this is accomplished by using the
'Decorator' pattern where a wrapper entity class is used to decorate the original
entity. Several protocol interceptors can be combined to form one logical unit.<
/p>
<
p>Protocol interceptors can collaborate by sharing information - such as a processing
state - through the HTTP execution context. Protocol interceptors can use HTTP context
to store a processing state
for one request or several consecutive requests.<
/p>
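<p>As a rough illustration of this kind of collaboration, the following JDK-only sketch
runs two interceptors over the same context: the first stores a value, the second reads it
back and turns it into a header. <code class="interfacename">InterceptorSketch</code>, the
<code class="literal">sketch.session-id</code> key and the
<code class="literal">X-Session</code> header are hypothetical names, and
<code class="classname">java.util.Properties</code> merely stands in for the HTTP execution
context.</p>
<pre class="programlisting">
import java.util.Properties;

class SharedContextSketch {

    // Runs two interceptors in order and returns the headers they produced.
    static String demo() {
        InterceptorSketch[] chain = new InterceptorSketch[] {
            new InterceptorSketch() {
                public void process(StringBuilder headers, Properties context) {
                    // First interceptor: record a processing state in the context.
                    context.setProperty("sketch.session-id", "abc123");
                }
            },
            new InterceptorSketch() {
                public void process(StringBuilder headers, Properties context) {
                    // Second interceptor: use the state left by the first one.
                    String id = context.getProperty("sketch.session-id");
                    headers.append("X-Session: ").append(id);
                }
            }
        };
        Properties context = new Properties();
        StringBuilder headers = new StringBuilder();
        for (int i = 0; i < chain.length; i++) {
            chain[i].process(headers, context);
        }
        return headers.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Hypothetical stand-in for a request interceptor; headers are plain text here.
interface InterceptorSketch {
    void process(StringBuilder headers, Properties context);
}
</pre>
<p>Because the second interceptor depends on a value written by the first, swapping the two
entries in the array would produce a header built from a missing value.</p>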
<
p>Usually the order in which interceptors are executed should not matter as long as they
do not depend on a particular state of the execution context. If protocol interceptors
have interdependencies and therefore must be executed in a particular order, they should
be added to the protocol processor in the same sequence as their expected execution
order.</p>
<
p>Protocol interceptors must be implemented as thread-safe. Similarly to servlets,
protocol interceptors should not use instance variables unless access to those variables
is synchronized.</p>
<
p>This is an example of how local context can be used to persist a processing state
between consecutive requests:<
/p>
<
pre class="programlisting">
DefaultHttpClient httpclient = new DefaultHttpClient();

HttpContext localContext = new BasicHttpContext();

AtomicInteger count = new AtomicInteger(1);

localContext.setAttribute("count", count);

httpclient.addRequestInterceptor(new HttpRequestInterceptor() {

    public void process(
            final HttpRequest request,
            final HttpContext context) throws HttpException, IOException {
        AtomicInteger count = (AtomicInteger) context.getAttribute("count");
        request.addHeader("Count", Integer.toString(count.getAndIncrement()));
    }

});

HttpGet httpget = new HttpGet("http://localhost/");
for (int i = 0; i < 10; i++) {
    HttpResponse response = httpclient.execute(httpget, localContext);

    HttpEntity entity = response.getEntity();
    if (entity != null) {
        entity.consumeContent();
    }
}
</pre>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e298"><
/a>
1.6. HTTP parameters<
/h2><
/div><
/div><
/div>
<
p>The <code class="interfacename">HttpParams</code> interface represents a collection of
immutable values that define the runtime behavior of a component. In many ways
<code class="interfacename">HttpParams</code> is similar to
<code class="interfacename">HttpContext</code>. The main distinction between the
two lies in their use at runtime. Both interfaces represent a collection of objects
organized as a map of keys to object values, but they serve distinct purposes:</p>
<
div class="itemizedlist"><ul type="disc"><li>
<p><code class="interfacename">HttpParams</code> is intended to contain simple
objects: integers, doubles, strings, collections and objects that remain
immutable at runtime.</p>
</li><li>
<p><code class="interfacename">HttpParams</code> is expected to be used in the
'write once - read many' mode. <code class="interfacename">HttpContext</code> is intended
to contain complex objects that are very likely to mutate in the course of HTTP
message processing.</p>
</li><li>
<p>The purpose of <code class="interfacename">HttpParams</code> is to define the
behavior of other components. Usually each complex component has its own
<code class="interfacename">HttpParams</code> object. The purpose of
<code class="interfacename">HttpContext</code> is to represent an execution
state of an HTTP process. Usually the same execution context is shared among
many collaborating objects.</p>
</li></ul></div>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e316"><
/a>1.6.1. Parameter hierarchies<
/h3><
/div><
/div><
/div>
<
p>In the course of HTTP request execution the
<code class="interfacename">HttpParams</code> of the
<code class="interfacename">HttpRequest</code> object are linked together with the
<code class="interfacename">HttpParams</code> of the
<code class="interfacename">HttpClient</code> instance used to execute the request.
This enables parameters set at the HTTP request level to take precedence over
<code class="interfacename">HttpParams</code> set at the HTTP client level. The
recommended practice is to set common parameters shared by all HTTP requests at the
HTTP client level and selectively override specific parameters at the HTTP request
level.</p>
<
pre class="programlisting">
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.getParams().setParameter(CoreProtocolPNames.PROTOCOL_VERSION,
        HttpVersion.HTTP_1_0);
httpclient.getParams().setParameter(CoreProtocolPNames.HTTP_CONTENT_CHARSET,
        "UTF-8");

HttpGet httpget = new HttpGet("http://www.google.com/");
httpget.getParams().setParameter(CoreProtocolPNames.PROTOCOL_VERSION,
        HttpVersion.HTTP_1_1);
httpget.getParams().setParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE,
        Boolean.FALSE);

httpclient.addRequestInterceptor(new HttpRequestInterceptor() {

    public void process(
            final HttpRequest request,
            final HttpContext context) throws HttpException, IOException {
        System.out.println(request.getParams().getParameter(
                CoreProtocolPNames.PROTOCOL_VERSION));
        System.out.println(request.getParams().getParameter(
                CoreProtocolPNames.HTTP_CONTENT_CHARSET));
        System.out.println(request.getParams().getParameter(
                CoreProtocolPNames.USE_EXPECT_CONTINUE));
        System.out.println(request.getParams().getParameter(
                CoreProtocolPNames.STRICT_TRANSFER_ENCODING));
    }

});
</pre>
<p>Output:</p>
<
pre class="programlisting">
HTTP/1.1
UTF-8
false
null
</pre>
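<p>The lookup order can be modelled with the JDK alone. In the sketch below
<code class="classname">java.util.Properties</code> and its built-in defaults chain stand in
for the request-level and client-level parameter collections. This is an assumption made for
illustration only: the real parameters are typed objects rather than strings, and the linking
is performed by HttpClient itself.</p>
<pre class="programlisting">
import java.util.Properties;

class ParamHierarchySketch {

    // Resolves four parameters at the request level, falling back to the
    // client level when the request does not set a value of its own.
    static String demo() {
        Properties clientParams = new Properties();
        clientParams.setProperty("http.protocol.version", "HTTP/1.0");
        clientParams.setProperty("http.protocol.content-charset", "UTF-8");

        // Request-level parameters use the client-level ones as defaults.
        Properties requestParams = new Properties(clientParams);
        requestParams.setProperty("http.protocol.version", "HTTP/1.1");
        requestParams.setProperty("http.protocol.expect-continue", "false");

        return requestParams.getProperty("http.protocol.version") + ", "
                + requestParams.getProperty("http.protocol.content-charset") + ", "
                + requestParams.getProperty("http.protocol.expect-continue") + ", "
                + requestParams.getProperty("http.protocol.strict-transfer-encoding");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
</pre>
<p>The protocol version comes from the request level, the content charset from the client
level, and the parameter that is set at neither level resolves to
<code class="literal">null</code>.</p>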
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h3 class="title"><
a name="d4e327"><
/a>1.6.2. HTTP parameters beans<
/h3><
/div><
/div><
/div>
<
p>The <code class="interfacename">HttpParams</code> interface allows for a great deal of
flexibility in handling configuration of components. Most importantly, new
parameters can be introduced without affecting binary compatibility with older
versions. However, <code class="interfacename">HttpParams</code> also has a certain
disadvantage compared to regular Java beans:
<code class="interfacename">HttpParams</code> cannot be assembled using a DI
framework. To mitigate this limitation, HttpClient includes a number of bean classes
that can be used to initialize <code class="interfacename">HttpParams</code>
objects using standard Java bean conventions.</p>
<
pre class="programlisting">
HttpParams params = new BasicHttpParams();
HttpProtocolParamBean paramsBean = new HttpProtocolParamBean(params);
paramsBean.setVersion(HttpVersion.HTTP_1_1);
paramsBean.setContentCharset("UTF-8");
paramsBean.setUseExpectContinue(true);

System.out.println(params.getParameter(
        CoreProtocolPNames.PROTOCOL_VERSION));
System.out.println(params.getParameter(
        CoreProtocolPNames.HTTP_CONTENT_CHARSET));
System.out.println(params.getParameter(
        CoreProtocolPNames.USE_EXPECT_CONTINUE));
System.out.println(params.getParameter(
        CoreProtocolPNames.USER_AGENT));
</pre>
<p>Output:</p>
<
pre class="programlisting">
HTTP/1.1
UTF-8
true
null
</pre>
<
div class="section" lang="en"><
div class="titlepage"><
div><
div><
h2 class="title" style="clear: both"><
a name="d4e337"><
/a>
1.7. HTTP request execution parameters<
/h2><
/div><
/div><
/div>
<
p>These are parameters that can impact the process of request execution:<
/p>
<
div class="itemizedlist"><
ul type="disc"><
li>
<p>
<b>'http.protocol.version': </b>
defines the HTTP protocol version used if not set explicitly on the request
object. This parameter expects a value of type
<code class="interfacename">ProtocolVersion</code>. If this parameter is not
set, HTTP/1.1 will be used.
</p>
</li><li>
<p>
<b>'http.protocol.element-charset': </b>
defines the charset to be used for encoding HTTP protocol elements. This
parameter expects a value of type <code class="classname">java.lang.String</code>.
If this parameter is not set, <code class="literal">US-ASCII</code> will be
used.
</p>
</li><li>
<p>
<b>'http.protocol.content-charset': </b>
defines the charset to be used by default for content body coding. This
parameter expects a value of type <code class="classname">java.lang.String</code>.
If this parameter is not set, <code class="literal">ISO-8859-1</code> will be
used.
</p>
</li><li>
<p>
<b>'http.useragent': </b>
defines the content of the <code class="literal">User-Agent</code> header. This
parameter expects a value of type <code class="classname">java.lang.String</code>.
If this parameter is not set, HttpClient will automatically generate a value
for it.
</p>
</li><li>
<p>
<b>'http.protocol.strict-transfer-encoding': </b>
defines whether responses with an invalid
<code class="literal">Transfer-Encoding</code> header should be rejected. This
parameter expects a value of type <code class="classname">java.lang.Boolean</code>.
If this parameter is not set, invalid <code class="literal">Transfer-Encoding</code>
values will be ignored.
</p>
</li><li>
<p>
<b>'http.protocol.expect-continue': </b>
activates the <code class="literal">Expect: 100-continue</code> handshake for
entity enclosing methods. The purpose of the
<code class="literal">Expect: 100-continue</code> handshake is to allow a client that is sending
a request message with a request body to determine if the origin server is
willing to accept the request (based on the request headers) before the
client sends the request body. The use of the
<code class="literal">Expect: 100-continue</code> handshake can result in a noticeable performance
improvement for entity enclosing requests (such as <code class="literal">POST</code>
and <code class="literal">PUT</code>) that require the target server's authentication.
<code class="literal">Expect: 100-continue</code> handshake should be used with
caution, as it may cause problems with HTTP servers and proxies that do not
support HTTP/1.1 protocol. This parameter expects a value of type
<code class="classname">java.lang.Boolean</code>. If this parameter is not set
HttpClient will attempt to use the handshake.
</p>
</li><li>
<p>
<b>'http.protocol.wait-for-continue': </b>
defines the maximum period of time in milliseconds the client should spend
waiting for a <code class="literal">100-continue</code> response. This parameter
expects a value of type <code class="classname">java.lang.Integer</code>. If this
parameter is not set HttpClient will wait 3 seconds for a confirmation
before resuming the transmission of the request body.
</p>
</li></ul></div>
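<p>The fallback behaviour described for the unset parameters above can be sketched with the
JDK alone. <code class="classname">java.util.Properties</code> again stands in for the
parameter collection and all values are kept as strings, which is a simplification: as noted
above, the real parameters expect typed values such as
<code class="interfacename">ProtocolVersion</code> or
<code class="classname">java.lang.Integer</code>. The class and method names are
hypothetical.</p>
<pre class="programlisting">
import java.util.Properties;

class ExecutionParamDefaultsSketch {

    // Resolves a few execution parameters, applying the documented default
    // whenever a parameter has not been set.
    static String describe(Properties params) {
        String version = params.getProperty(
                "http.protocol.version", "HTTP/1.1");
        String elementCharset = params.getProperty(
                "http.protocol.element-charset", "US-ASCII");
        String contentCharset = params.getProperty(
                "http.protocol.content-charset", "ISO-8859-1");
        int waitForContinueMillis = Integer.parseInt(params.getProperty(
                "http.protocol.wait-for-continue", "3000"));
        return version + " " + elementCharset + " " + contentCharset
                + " " + waitForContinueMillis;
    }

    // Only the content charset is set explicitly; the rest use defaults.
    static String demo() {
        Properties params = new Properties();
        params.setProperty("http.protocol.content-charset", "UTF-8");
        return describe(params);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
</pre>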
</div>
</div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="preface.html">Prev</a> </td><td width="20%" align="center"> </td><td width="40%" align="right"> <a accesskey="n" href="connmgmt.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Preface </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Chapter 2. Connection management</td></tr></table></div></body></html>