Package org.htmlparser.http
Class ConnectionManager
java.lang.Object
org.htmlparser.http.ConnectionManager
Handles proxies, password-protected URLs and request properties,
including cookies.
-
Field Summary
Fields
protected Hashtable mCookieJar
    Cookie storage, a hashtable (by site or host) of vectors of Cookies.
protected static Hashtable mDefaultRequestProperties
    Default Request header fields.
protected static SimpleDateFormat mFormat
    Cookie expiry date format for parsing.
protected ConnectionMonitor mMonitor
    The object to be notified prior to and after each connection.
protected String mPassword
    The user password for accessing the URL.
protected String mProxyHost
    The proxy server name.
protected String mProxyPassword
    The proxy user password.
protected int mProxyPort
    The proxy port number.
protected String mProxyUser
    The proxy user name.
protected boolean mRedirectionProcessingEnabled
    Flag determining if redirection processing is being handled manually.
protected Hashtable mRequestProperties
    Request header fields.
protected String mUser
    The user name for accessing the URL.
-
Constructor Summary
Constructors
ConnectionManager()
    Create a connection manager.
ConnectionManager(Hashtable properties)
    Create a connection manager with the given connection properties.
-
Method Summary
Methods
void addCookies(URLConnection connection)
    Generate an HTTP cookie header value string from the cookie jar.
protected Vector addCookies(Vector cookies, String path, Vector list)
    Add qualified cookies from cookies into list.
static final String encode(byte[] array)
    Encodes a byte array into BASE64 in accordance with RFC 2045.
String fixSpaces(String url)
    Turn spaces into %20.
protected String generateCookieProperty(Vector cookies)
    Creates the cookie request property value from the list of valid cookies for the domain.
boolean getCookieProcessingEnabled()
    Predicate to determine if cookie processing is currently enabled.
static Hashtable getDefaultRequestProperties()
    Get the current default request header properties.
protected String getDomain(String host)
    Get the domain from a host.
protected String getLocation(HttpURLConnection http)
    Get the Location field if any.
ConnectionMonitor getMonitor()
    Get the monitoring object, if any.
String getPassword()
    Get the URL user's password.
String getProxyHost()
    Get the proxy host name, if any.
String getProxyPassword()
    Get the proxy user's password.
int getProxyPort()
    Get the proxy port number.
String getProxyUser()
    Get the user name for proxy authorization, if any.
boolean getRedirectionProcessingEnabled()
    Predicate to determine if url redirection processing is currently enabled.
Hashtable getRequestProperties()
    Get the current request header properties.
String getUser()
    Get the user name to access the URL.
URLConnection openConnection(String string)
    Opens a connection based on a given string.
URLConnection openConnection(URL url)
    Opens a connection using the given url.
void parseCookies(URLConnection connection)
    Check for cookie and parse into cookie jar.
protected void saveCookies(Vector list, URLConnection connection)
    Save the cookies received in the response header.
void setCookie(Cookie cookie, String domain)
    Adds a cookie to the cookie jar.
void setCookieProcessingEnabled(boolean enable)
    Enables or disables cookie processing.
static void setDefaultRequestProperties(Hashtable properties)
    Set the default request header properties.
void setMonitor(ConnectionMonitor monitor)
    Set the monitoring object.
void setPassword(String password)
    Set the URL user's password.
void setProxyHost(String host)
    Set the proxy host to use.
void setProxyPassword(String password)
    Set the proxy user's password.
void setProxyPort(int port)
    Set the proxy port number.
void setProxyUser(String user)
    Set the user name for proxy authorization.
void setRedirectionProcessingEnabled(boolean enabled)
    Enables or disables manual redirection handling.
void setRequestProperties(Hashtable properties)
    Set the current request properties.
void setUser(String user)
    Set the user name to access the URL.
-
Field Details
-
mDefaultRequestProperties
protected static Hashtable mDefaultRequestProperties
Default Request header fields. So far this is just "User-Agent" and "Accept-Encoding".
-
mRequestProperties
protected Hashtable mRequestProperties
Request header fields.
-
mProxyHost
protected String mProxyHost
The proxy server name.
-
mProxyPort
protected int mProxyPort
The proxy port number.
-
mProxyUser
protected String mProxyUser
The proxy user name.
-
mProxyPassword
protected String mProxyPassword
The proxy user password.
-
mUser
protected String mUser
The user name for accessing the URL.
-
mPassword
protected String mPassword
The user password for accessing the URL.
-
mCookieJar
protected Hashtable mCookieJar
Cookie storage, a hashtable (by site or host) of vectors of Cookies. This will be null if cookie processing is disabled (default).
-
mMonitor
protected ConnectionMonitor mMonitor
The object to be notified prior to and after each connection.
-
mRedirectionProcessingEnabled
protected boolean mRedirectionProcessingEnabled
Flag determining if redirection processing is being handled manually.
-
mFormat
protected static SimpleDateFormat mFormat
Cookie expiry date format for parsing.
-
-
Constructor Details
-
ConnectionManager
public ConnectionManager()
Create a connection manager.
-
ConnectionManager
public ConnectionManager(Hashtable properties)
Create a connection manager with the given connection properties.
- Parameters:
properties - Name/value pairs to be added to the HTTP request.
-
-
Method Details
-
getDefaultRequestProperties
public static Hashtable getDefaultRequestProperties()
Get the current default request header properties. A String-to-String map of header keys and values. These fields are set by the parser when creating a connection.
- Returns:
- The default set of request header properties that will currently be used.
-
setDefaultRequestProperties
public static void setDefaultRequestProperties(Hashtable properties)
Set the default request header properties. A String-to-String map of header keys and values. These fields are set by the parser when creating a connection. Some of these can be set directly on a URLConnection, e.g. If-Modified-Since is set with setIfModifiedSince(long), but since the parser transparently opens the connection on behalf of the developer, these properties are not available before the connection is fetched. Setting these request header fields affects all subsequent connections opened by the parser. For more direct control, create a URLConnection, massage it the way you want and then set it on the parser.
From RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1:
5.3 Request Header Fields
The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method invocation.
    request-header = Accept                   ; Section 14.1
                   | Accept-Charset           ; Section 14.2
                   | Accept-Encoding          ; Section 14.3
                   | Accept-Language          ; Section 14.4
                   | Authorization            ; Section 14.8
                   | Expect                   ; Section 14.20
                   | From                     ; Section 14.22
                   | Host                     ; Section 14.23
                   | If-Match                 ; Section 14.24
                   | If-Modified-Since        ; Section 14.25
                   | If-None-Match            ; Section 14.26
                   | If-Range                 ; Section 14.27
                   | If-Unmodified-Since      ; Section 14.28
                   | Max-Forwards             ; Section 14.31
                   | Proxy-Authorization      ; Section 14.34
                   | Range                    ; Section 14.35
                   | Referer                  ; Section 14.36
                   | TE                       ; Section 14.39
                   | User-Agent               ; Section 14.43
Request-header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields MAY be given the semantics of request-header fields if all parties in the communication recognize them to be request-header fields. Unrecognized header fields are treated as entity-header fields.
- Parameters:
properties - The new set of default request header properties to use. This affects all subsequently created connections.
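The effect of a table of default headers can be sketched with plain java.net classes. This is a minimal illustration, not the library's internals: the class DefaultHeadersSketch, its helper methods and the User-Agent value are all hypothetical. It shows a String-to-String table being copied onto a URLConnection before the connection is used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical sketch; not part of org.htmlparser.
final class DefaultHeadersSketch {

    // Stand-in for a table of default request header fields (values are illustrative).
    static Hashtable<String, String> defaults() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("User-Agent", "ExampleParser/1.0");
        table.put("Accept-Encoding", "gzip, deflate");
        return table;
    }

    // Creates a URLConnection (without connecting) and copies every header onto it.
    static URLConnection open(String url, Map<String, String> headers) {
        try {
            URLConnection connection = URI.create(url).toURL().openConnection();
            for (Map.Entry<String, String> entry : headers.entrySet()) {
                connection.setRequestProperty(entry.getKey(), entry.getValue());
            }
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = open("http://example.com/", defaults());
        System.out.println(connection.getRequestProperty("User-Agent"));
    }
}
```

Request properties have to be set before the connection is established, which is why a table applied at creation time is convenient when the connection is opened on the caller's behalf.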
-
getRequestProperties
public Hashtable getRequestProperties()
Get the current request header properties. A String-to-String map of header keys and values, excluding proxy items, cookies and URL authorization.
- Returns:
- The request header properties for this connection manager.
-
setRequestProperties
public void setRequestProperties(Hashtable properties)
Set the current request properties. Replaces the current set of fixed request properties with the given set. This does not replace the Proxy-Authorization property, which is constructed from the setProxyUser(java.lang.String) and setProxyPassword(java.lang.String) values, or the Authorization property, which is constructed from the setUser(java.lang.String) and setPassword(java.lang.String) values. Nor does it replace the Cookie property, which is constructed from the current cookie jar.
- Parameters:
properties - The new fixed properties.
-
getProxyHost
public String getProxyHost()
Get the proxy host name, if any.
- Returns:
- Returns the proxy host.
-
setProxyHost
public void setProxyHost(String host)
Set the proxy host to use.
- Parameters:
host - The host to use for proxy access. Note: You must also set the proxy port.
-
getProxyPort
public int getProxyPort()
Get the proxy port number.
- Returns:
- Returns the proxy port.
-
setProxyPort
public void setProxyPort(int port)
Set the proxy port number.
- Parameters:
port - The proxy port. Note: You must also set the proxy host.
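The requirement that a proxy host and port be supplied together can be illustrated with the standard java.net.Proxy class. This is only a sketch of the pairing: ProxySketch and httpProxy are hypothetical names, and ConnectionManager may configure its proxy through a different mechanism.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical sketch; not part of org.htmlparser.
final class ProxySketch {

    // Builds an HTTP proxy descriptor; a host without a valid port (or the reverse) is rejected.
    static Proxy httpProxy(String host, int port) {
        if (host == null || host.isEmpty() || port <= 0 || port > 65535) {
            throw new IllegalArgumentException("both a proxy host and a valid port are required");
        }
        // createUnresolved avoids a DNS lookup when the descriptor is built.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        System.out.println(httpProxy("proxy.example.com", 8080));
    }
}
```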
-
getProxyUser
public String getProxyUser()
Get the user name for proxy authorization, if any.
- Returns:
- Returns the proxy user, or null if no proxy authorization is required.
-
setProxyUser
public void setProxyUser(String user)
Set the user name for proxy authorization.
- Parameters:
user - The proxy user name. Note: You must also set the proxy password.
-
getProxyPassword
public String getProxyPassword()
Get the proxy user's password.
- Returns:
- Returns the proxy password.
-
setProxyPassword
public void setProxyPassword(String password)
Set the proxy user's password.
- Parameters:
password - The password for the proxy user. Note: You must also set the proxy user.
-
getUser
public String getUser()
Get the user name to access the URL.
- Returns:
- Returns the user name that will be used to access the URL, or null if no authorization is required.
-
setUser
public void setUser(String user)
Set the user name to access the URL.
- Parameters:
user - The user name for accessing the URL. Note: You must also set the password.
-
getPassword
public String getPassword()
Get the URL user's password.
- Returns:
- Returns the URL password.
-
setPassword
public void setPassword(String password)
Set the URL user's password.
- Parameters:
password - The password for the URL.
-
getCookieProcessingEnabled
public boolean getCookieProcessingEnabled()
Predicate to determine if cookie processing is currently enabled.
- Returns:
- true if cookies are being processed.
-
setCookieProcessingEnabled
public void setCookieProcessingEnabled(boolean enable)
Enables or disables cookie processing.
- Parameters:
enable - if true, cookie processing will occur; otherwise cookie processing will be turned off.
-
setCookie
public void setCookie(Cookie cookie, String domain)
Adds a cookie to the cookie jar.
- Parameters:
cookie - The cookie to add.
domain - The domain to use in case the cookie has no domain attribute.
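A cookie jar organised as a hashtable of vectors, with a fallback domain for cookies that carry none, might look like the sketch below. CookieJarSketch and SimpleCookie are hypothetical stand-ins (the library's Cookie class carries more attributes), and replacing an existing cookie of the same name is an assumption of this sketch rather than documented behaviour.

```java
import java.util.Hashtable;
import java.util.Vector;

// Hypothetical sketch; not part of org.htmlparser.
final class CookieJarSketch {

    // Minimal cookie: a name, a value and an optional domain (null when absent).
    static final class SimpleCookie {
        final String name;
        final String value;
        final String domain;

        SimpleCookie(String name, String value, String domain) {
            this.name = name;
            this.value = value;
            this.domain = domain;
        }
    }

    // Keyed by domain or host; each entry holds the cookies stored for that site.
    private final Hashtable<String, Vector<SimpleCookie>> jar = new Hashtable<>();

    // Stores the cookie under its own domain, or under fallbackDomain when it has none.
    void setCookie(SimpleCookie cookie, String fallbackDomain) {
        String key = (cookie.domain != null) ? cookie.domain : fallbackDomain;
        if (key == null) {
            throw new IllegalArgumentException("no domain available for cookie " + cookie.name);
        }
        Vector<SimpleCookie> cookies = jar.computeIfAbsent(key, k -> new Vector<>());
        // Assumption: a newer cookie replaces an older one with the same name.
        cookies.removeIf(existing -> existing.name.equals(cookie.name));
        cookies.add(cookie);
    }

    // Number of cookies currently stored for the given domain or host.
    int count(String domain) {
        Vector<SimpleCookie> cookies = jar.get(domain);
        return cookies == null ? 0 : cookies.size();
    }

    public static void main(String[] args) {
        CookieJarSketch jar = new CookieJarSketch();
        jar.setCookie(new SimpleCookie("id", "1", null), "example.com");
        System.out.println(jar.count("example.com"));
    }
}
```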
-
getMonitor
public ConnectionMonitor getMonitor()
Get the monitoring object, if any.
- Returns:
- Returns the monitor, or null if none has been assigned.
-
setMonitor
public void setMonitor(ConnectionMonitor monitor)
Set the monitoring object.
- Parameters:
monitor - The monitor to set.
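The idea of an object that is notified prior to and after each connection can be sketched as a simple callback pair. The Listener interface and its method names below are hypothetical and deliberately differ from the library's ConnectionMonitor; the sketch only shows the ordering of the notifications around the connection step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of org.htmlparser.
final class MonitorSketch {

    // Stand-in callback interface for before/after notifications.
    interface Listener {
        void before(String url);
        void after(String url);
    }

    // Runs the action between the two notifications; a null listener is skipped.
    static void connect(String url, Listener listener, Runnable action) {
        if (listener != null) {
            listener.before(url);
        }
        action.run();
        if (listener != null) {
            listener.after(url);
        }
    }

    // Records the order in which the listener and the connection step are invoked.
    static List<String> trace(String url) {
        List<String> events = new ArrayList<>();
        Listener recorder = new Listener() {
            public void before(String u) { events.add("before " + u); }
            public void after(String u) { events.add("after " + u); }
        };
        connect(url, recorder, () -> events.add("connect " + url));
        return events;
    }

    public static void main(String[] args) {
        System.out.println(trace("http://example.com/"));
    }
}
```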
-
getRedirectionProcessingEnabled
public boolean getRedirectionProcessingEnabled()
Predicate to determine if url redirection processing is currently enabled.
- Returns:
- true if redirection is being processed manually.
-
setRedirectionProcessingEnabled
public void setRedirectionProcessingEnabled(boolean enabled)
Enables or disables manual redirection handling. Normally the HttpURLConnection follows redirections (HTTP response code 3xx) automatically if the followRedirects property is true. With this flag set, the ConnectionManager performs the redirection processing, the advantage being that cookies (if enabled) are passed in subsequent requests.
- Parameters:
enabled - The new state of the redirectionProcessingEnabled property.
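Manual redirection handling amounts to issuing one request at a time and resolving each Location value against the current URL. The sketch below is an assumption-laden illustration: RedirectSketch, Response, the fetch function and the limit of 20 hops are hypothetical, and the status check is narrowed to the common redirect codes. With a real HttpURLConnection, automatic following would first be switched off with setInstanceFollowRedirects(false).

```java
import java.net.URI;
import java.util.function.Function;

// Hypothetical sketch; not part of org.htmlparser.
final class RedirectSketch {

    static final int MAX_REDIRECTS = 20; // arbitrary safety limit for this sketch

    // Stand-in for one HTTP response: a status code plus an optional Location header.
    static final class Response {
        final int status;
        final String location; // null when the response has no Location header

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    // Resolves a Location value, which may be relative, against the current URL.
    static String resolve(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    // Follows redirects one hop at a time; each hop is a point where cookies could be attached.
    static String follow(String url, Function<String, Response> fetch) {
        String current = url;
        for (int hops = 0; hops < MAX_REDIRECTS; hops++) {
            Response response = fetch.apply(current);
            if (!isRedirect(response.status) || response.location == null) {
                return current;
            }
            current = resolve(current, response.location);
        }
        throw new IllegalStateException("too many redirects starting from " + url);
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://example.com/dir/page", "other"));
    }
}
```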
-
getLocation
protected String getLocation(HttpURLConnection http)
Get the Location field if any.
- Parameters:
http - The connection to get the location from.
-
openConnection
public URLConnection openConnection(URL url) throws ParserException
Opens a connection using the given url.
- Parameters:
url - The url to open.
- Returns:
- The connection.
- Throws:
ParserException - if an i/o exception occurs accessing the url.
-
encode
public static final String encode(byte[] array)
Encodes a byte array into BASE64 in accordance with RFC 2045.
- Parameters:
array - The bytes to convert.
- Returns:
- A BASE64 encoded string.
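For comparison, the same kind of encoding is available in the JDK through java.util.Base64. The sketch below is not the library's implementation: Base64Sketch and basicCredentials are hypothetical names, and the use of the HTTP Basic scheme for the credentials is an assumption. The JDK's basic encoder uses the same alphabet as RFC 2045 but does not insert line breaks, which suits header values.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch; not part of org.htmlparser.
final class Base64Sketch {

    // Encodes raw bytes as a single-line Base64 string.
    static String encode(byte[] array) {
        return Base64.getEncoder().encodeToString(array);
    }

    // Assumption: credentials are sent with the HTTP Basic scheme as "user:password".
    static String basicCredentials(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.ISO_8859_1);
        return "Basic " + encode(raw);
    }

    public static void main(String[] args) {
        System.out.println(encode("Man".getBytes(StandardCharsets.US_ASCII))); // prints TWFu
        System.out.println(basicCredentials("user", "pass")); // prints Basic dXNlcjpwYXNz
    }
}
```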
-
fixSpaces
public String fixSpaces(String url)
Turn spaces into %20. ToDo: make this more generic (see RFE #1010593 provide URL encoding/decoding utilities).
- Parameters:
url - The url containing spaces.
- Returns:
- The URL with spaces as %20 sequences.
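The transformation itself is a plain string substitution, as in this minimal sketch. SpaceFixSketch is a hypothetical name, and the code handles only the space character, not general URL encoding.

```java
// Hypothetical sketch; not part of org.htmlparser.
final class SpaceFixSketch {

    // Replaces every space with its percent-encoded form; other characters are untouched.
    static String fixSpaces(String url) {
        return url.replace(" ", "%20");
    }

    public static void main(String[] args) {
        System.out.println(fixSpaces("http://example.com/my file.html"));
    }
}
```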
-
openConnection
public URLConnection openConnection(String string) throws ParserException
Opens a connection based on a given string. The string is either a file, in which case file://localhost is prepended to a canonical path derived from the string, or a url that begins with one of the known protocol strings, e.g. http://. Embedded spaces are silently converted to %20 sequences.
- Parameters:
string - The name of a file or a url.
- Returns:
- The connection.
- Throws:
ParserException - if the string is not a valid url or file.
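The file-or-url decision described above can be sketched as a small normalisation step. Everything here is illustrative: ResourceNameSketch, toUrlString and the list of protocol prefixes are assumptions, and the slash handling for non-Unix paths is a guess at what a portable version would need.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;

// Hypothetical sketch; not part of org.htmlparser.
final class ResourceNameSketch {

    // Assumed set of known protocol prefixes; the library's actual list may differ.
    private static final String[] PROTOCOLS = { "http://", "https://", "ftp://", "file://" };

    // Returns a URL string for either a url or a file name, with spaces as %20.
    static String toUrlString(String string) {
        String lower = string.toLowerCase(Locale.ROOT);
        for (String protocol : PROTOCOLS) {
            if (lower.startsWith(protocol)) {
                return string.replace(" ", "%20");
            }
        }
        try {
            // Treat anything else as a file name and derive its canonical path.
            String path = new File(string).getCanonicalPath().replace('\\', '/');
            if (!path.startsWith("/")) {
                path = "/" + path;
            }
            return ("file://localhost" + path).replace(" ", "%20");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUrlString("http://example.com/a b"));
        System.out.println(toUrlString("notes.txt"));
    }
}
```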
-
addCookies
public void addCookies(URLConnection connection)
Generate an HTTP cookie header value string from the cookie jar.
The syntax for the header is:
    cookie         = "Cookie:" cookie-version 1*((";" | ",") cookie-value)
    cookie-value   = NAME "=" VALUE [";" path] [";" domain]
    cookie-version = "$Version" "=" value
    NAME           = attr
    VALUE          = value
    path           = "$Path" "=" value
    domain         = "$Domain" "=" value
- Parameters:
connection - The connection being accessed.
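A header value following the grammar above could be assembled as in this sketch. CookieHeaderSketch and SimpleCookie are hypothetical, and the choices to quote the $Version, $Path and $Domain values and to separate items with "; " are assumptions made for illustration; the library's exact output is not shown in this document.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of org.htmlparser.
final class CookieHeaderSketch {

    // Minimal cookie with optional path and domain (null when absent).
    static final class SimpleCookie {
        final String name;
        final String value;
        final String path;
        final String domain;

        SimpleCookie(String name, String value, String path, String domain) {
            this.name = name;
            this.value = value;
            this.path = path;
            this.domain = domain;
        }
    }

    // Builds the value part of a Cookie header, or returns null when there is nothing to send.
    static String headerValue(List<SimpleCookie> cookies) {
        if (cookies == null || cookies.isEmpty()) {
            return null; // the grammar requires at least one cookie-value
        }
        StringBuilder out = new StringBuilder("$Version=\"1\"");
        for (SimpleCookie cookie : cookies) {
            out.append("; ").append(cookie.name).append('=').append(cookie.value);
            if (cookie.path != null) {
                out.append("; $Path=\"").append(cookie.path).append('"');
            }
            if (cookie.domain != null) {
                out.append("; $Domain=\"").append(cookie.domain).append('"');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(headerValue(Arrays.asList(
                new SimpleCookie("id", "42", "/docs", ".example.com"),
                new SimpleCookie("lang", "en", null, null))));
    }
}
```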
-
addCookies
protected Vector addCookies(Vector cookies, String path, Vector list)
Add qualified cookies from cookies into list.
- Parameters:
cookies - The list of cookies to check (may be null).
path - The path being accessed.
list - The list of qualified cookies.
- Returns:
- The list of qualified cookies.
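One plausible reading of "qualified" is that a cookie's path must be a prefix of the path being accessed. The sketch below implements only that rule, as an assumption; the library may apply further checks (such as expiry) that are not described here. CookieFilterSketch and SimpleCookie are hypothetical names.

```java
import java.util.Vector;

// Hypothetical sketch; not part of org.htmlparser.
final class CookieFilterSketch {

    // Minimal cookie with a name and an optional path (null means "any path").
    static final class SimpleCookie {
        final String name;
        final String path;

        SimpleCookie(String name, String path) {
            this.name = name;
            this.path = path;
        }
    }

    // Appends cookies whose path prefixes the requested path; tolerates a null input or output list.
    static Vector<SimpleCookie> addCookies(Vector<SimpleCookie> cookies, String path, Vector<SimpleCookie> list) {
        if (cookies != null) {
            for (SimpleCookie cookie : cookies) {
                if (cookie.path == null || path.startsWith(cookie.path)) {
                    if (list == null) {
                        list = new Vector<>(); // created lazily, only when something qualifies
                    }
                    list.add(cookie);
                }
            }
        }
        return list;
    }

    public static void main(String[] args) {
        Vector<SimpleCookie> jar = new Vector<>();
        jar.add(new SimpleCookie("a", "/docs"));
        jar.add(new SimpleCookie("b", "/admin"));
        System.out.println(addCookies(jar, "/docs/index.html", null).size());
    }
}
```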
-
getDomain
protected String getDomain(String host)
Get the domain from a host.
- Parameters:
host - The supposed host name.
- Returns:
- The domain (with the leading dot), or null if the domain cannot be determined.
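A simple way to derive a domain with a leading dot is to drop the first label of the host name. The rules in this sketch are assumptions: it requires at least three labels and rejects dotted numeric addresses, which may or may not match the library's own checks. DomainSketch is a hypothetical name.

```java
// Hypothetical sketch; not part of org.htmlparser.
final class DomainSketch {

    // Returns the host minus its first label (keeping the leading dot), or null when undetermined.
    static String getDomain(String host) {
        String[] labels = host.split("\\.");
        if (labels.length < 3) {
            return null; // e.g. "localhost" or "example.com": nothing to strip
        }
        boolean numeric = true;
        for (int i = 0; i < host.length(); i++) {
            char c = host.charAt(i);
            if (!Character.isDigit(c) && c != '.') {
                numeric = false;
                break;
            }
        }
        if (numeric) {
            return null; // looks like an IP address such as 127.0.0.1
        }
        return host.substring(labels[0].length());
    }

    public static void main(String[] args) {
        System.out.println(getDomain("www.example.com")); // prints .example.com
    }
}
```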
-
generateCookieProperty
protected String generateCookieProperty(Vector cookies)
Creates the cookie request property value from the list of valid cookies for the domain.
- Parameters:
cookies - The list of valid cookies to be encoded in the request.
- Returns:
- A string suitable for inclusion as the value of the "Cookie:" request property.
-
parseCookies
public void parseCookies(URLConnection connection)
Check for cookie and parse into cookie jar.
- Parameters:
connection - The connection to extract cookie information from.
-
saveCookies
protected void saveCookies(Vector list, URLConnection connection)
Save the cookies received in the response header.
- Parameters:
list - The list of cookies extracted from the response header.
connection - The connection (used when a cookie has no domain).
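Parsing a cookie from a response header and falling back to the connection's host when no domain is given can be sketched as follows. SetCookieSketch, parse and domainFor are hypothetical, the input is assumed to be the value of a single Set-Cookie header, and quoting and other edge cases are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; not part of org.htmlparser.
final class SetCookieSketch {

    // Splits "name=value; Attr=val; Flag" into ordered pairs; the first pair is the cookie itself.
    static Map<String, String> parse(String header) {
        Map<String, String> parts = new LinkedHashMap<>();
        for (String piece : header.split(";")) {
            String trimmed = piece.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            String key = (eq < 0) ? trimmed : trimmed.substring(0, eq).trim();
            String value = (eq < 0) ? "" : trimmed.substring(eq + 1).trim();
            // Attribute names are matched case-insensitively; the cookie's own name is kept as is.
            parts.put(parts.isEmpty() ? key : key.toLowerCase(Locale.ROOT), value);
        }
        return parts;
    }

    // Uses the cookie's domain attribute when present, otherwise the host of the connection.
    static String domainFor(Map<String, String> parsed, String host) {
        return parsed.getOrDefault("domain", host);
    }

    public static void main(String[] args) {
        Map<String, String> parsed = parse("id=42; Path=/docs; Domain=.example.com; Secure");
        System.out.println(parsed + " stored under " + domainFor(parsed, "www.example.com"));
    }
}
```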
-