java.lang.Object
org.apache.commons.httpclient.URI
public class URI
extends java.lang.Object
implements Cloneable, Comparable, Serializable
A URI is a sequence of characters held as an array of char, which is not always represented as a sequence of octets held as an array of byte. The data types used are:
  URI character sequence: char
  octet sequence: byte
  original character sequence: String
URI Syntactic Components
- In general, written as follows:
  Absolute URI = <scheme>:<scheme-specific-part>
  Generic URI = <scheme>://<authority><path>?<query>
- Syntax
  absoluteURI = scheme ":" ( hier_part | opaque_part )
  hier_part = ( net_path | abs_path ) [ "?" query ]
  net_path = "//" authority [ abs_path ]
  abs_path = "/" path_segments
The following examples illustrate URIs that are in common use.
  ftp://ftp.is.co.za/rfc/rfc1808.txt -- ftp scheme for File Transfer Protocol services
  gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles -- gopher scheme for Gopher and Gopher+ Protocol services
  http://www.math.uio.no/faq/compression-faq/part1.html -- http scheme for Hypertext Transfer Protocol services
  mailto:mduerst@ifi.unizh.ch -- mailto scheme for electronic mail addresses
  news:comp.infosystems.www.servers.unix -- news scheme for USENET news groups and articles
  telnet://melvyl.ucop.edu/ -- telnet scheme for interactive services via the TELNET Protocol
Please note that there are many modifications from URL (RFC 1738) and relative URL (RFC 1808).
The expressions for a URI
For escaped URI forms
- URI(char[]) // constructor
- char[] getRawXxx() // method
- String getEscapedXxx() // method
- String toString() // method
For unescaped URI forms
- URI(String) // constructor
- String getXXX() // method
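The difference between the escaped and unescaped forms can be seen with a small, JDK-only sketch. It uses the JDK's java.net.URI as a stand-in, not this class; the class name EscapedFormsSketch and the example address are made up for illustration. In java.net.URI, getRawPath() plays the role of the escaped accessors and getPath() the role of the unescaped ones.

```java
import java.net.URI;

// Illustration only: java.net.URI is used here, not org.apache.commons.httpclient.URI.
final class EscapedFormsSketch {

    /** Path as written in the URI, with percent-escapes left intact. */
    static String escapedPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    /** Path with percent-escapes decoded into the original characters. */
    static String unescapedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String uri = "http://www.example.com/a%20dir/file%25.txt";
        System.out.println(escapedPath(uri));    // escaped form
        System.out.println(unescapedPath(uri));  // unescaped form
    }
}
```

The escaped form is the one that is safe to put on the wire; the unescaped form is meant for display or for comparison with original text.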
protected static final BitSet IPv4address
BitSet that combines digit and dot for IPv4address.
  IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
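A character set made of digits and dots can only say which characters are permitted; it cannot enforce the four-group structure of the production. The following sketch checks the production itself with a regular expression. It is a JDK-only illustration, not code from this class, and the name Ipv4GrammarSketch is hypothetical. Note that the production places no upper bound on each group.

```java
import java.util.regex.Pattern;

// Illustration only: checks the IPv4address production literally.
final class Ipv4GrammarSketch {

    // IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
    private static final Pattern IPV4 =
            Pattern.compile("[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+");

    /** True if the whole string matches the IPv4address production. */
    static boolean matchesIpv4Grammar(String host) {
        return IPV4.matcher(host).matches();
    }
}
```

Because the production is purely syntactic, a string such as 999.1.1.1 matches it; range checking would be a separate step.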
protected static final BitSet IPv6address
RFC 2373.
  IPv6address = hexpart [ ":" IPv4address ]
protected static final BitSet IPv6reference
RFC 2732, 2373.
  IPv6reference = "[" IPv6address "]"
protected static final BitSet URI_reference
BitSet for URI-reference.
  URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
protected char[] _authority
The authority.
protected char[] _fragment
The fragment.
protected char[] _host
The host.
protected boolean _is_IPv4address
protected boolean _is_IPv6reference
protected boolean _is_abs_path
protected boolean _is_hier_part
protected boolean _is_hostname
protected boolean _is_net_path
protected boolean _is_opaque_part
protected boolean _is_reg_name
protected boolean _is_rel_path
protected boolean _is_server
protected char[] _opaque
The opaque.
protected char[] _path
The path.
protected int _port
The port.
protected char[] _query
The query.
protected char[] _scheme
The scheme.
protected char[] _uri
This Uniform Resource Identifier (URI). The URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics.
protected char[] _userinfo
The userinfo.
protected static final BitSet abs_path
URI absolute path.
  abs_path = "/" path_segments
protected static final BitSet absoluteURI
BitSet for absoluteURI.
  absoluteURI = scheme ":" ( hier_part | opaque_part )
public static final BitSet allowed_IPv6reference
Those characters that are allowed for the IPv6reference component. The characters '[', ']' in IPv6reference should be excluded.
public static final BitSet allowed_abs_path
Those characters that are allowed for the abs_path.
public static final BitSet allowed_authority
Those characters that are allowed for the authority component.
public static final BitSet allowed_fragment
Those characters that are allowed for the fragment component.
public static final BitSet allowed_host
Those characters that are allowed for the host component. The characters '[', ']' in IPv6reference should be excluded.
public static final BitSet allowed_opaque_part
Those characters that are allowed for the opaque_part.
public static final BitSet allowed_query
Those characters that are allowed for the query component.
public static final BitSet allowed_reg_name
Those characters that are allowed for the reg_name.
public static final BitSet allowed_rel_path
Those characters that are allowed for the rel_path.
public static final BitSet allowed_userinfo
Those characters that are allowed for the userinfo component.
public static final BitSet allowed_within_authority
Those characters that are allowed for the authority component.
public static final BitSet allowed_within_path
Those characters that are allowed within the path.
public static final BitSet allowed_within_query
Those characters that are allowed within the query component.
public static final BitSet allowed_within_userinfo
Those characters that are allowed within the userinfo component.
protected static final BitSet alpha
BitSet for alpha.
  alpha = lowalpha | upalpha
protected static final BitSet alphanum
BitSet for alphanum (join of alpha & digit).
  alphanum = alpha | digit
protected static final BitSet authority
BitSet for authority.
  authority = server | reg_name
public static final BitSet control
BitSet for control.
protected static String defaultDocumentCharset
The default charset of the document (RFC 2277, 2396). The platform's charset is used for the document by default.
protected static String defaultDocumentCharsetByLocale
protected static String defaultDocumentCharsetByPlatform
protected static String defaultProtocolCharset
The default charset of the protocol. RFC 2277, 2396
public static final BitSet delims
BitSet for delims.
protected static final BitSet digit
BitSet for digit.
  digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
public static final BitSet disallowed_opaque_part
Disallowed opaque_part before escaping.
public static final BitSet disallowed_rel_path
Disallowed rel_path before escaping.
protected static final BitSet domainlabel
BitSet for domainlabel.
  domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
protected static final BitSet escaped
BitSet for escaped.
  escaped = "%" hex hex
protected static final BitSet fragment
BitSet for fragment (alias for uric).
  fragment = *uric
protected int hash
Cache the hash code for this URI.
protected static final BitSet hex
BitSet for hex.
  hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"
protected static final BitSet hier_part
BitSet for hier_part.
  hier_part = ( net_path | abs_path ) [ "?" query ]
protected static final BitSet host
BitSet for host.
  host = hostname | IPv4address | IPv6reference
protected static final BitSet hostname
BitSet for hostname.
  hostname = *( domainlabel "." ) toplabel [ "." ]
protected static final BitSet hostport
BitSet for hostport.
  hostport = host [ ":" port ]
protected static final BitSet mark
BitSet for mark.
  mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
protected static final BitSet net_path
BitSet for net_path.
  net_path = "//" authority [ abs_path ]
protected static final BitSet opaque_part
URI bitset that combines uric_no_slash and uric.
  opaque_part = uric_no_slash *uric
protected static final BitSet param
BitSet for param (alias for pchar).
  param = *pchar
protected static final BitSet path
URI bitset that combines absolute path and opaque part.
  path = [ abs_path | opaque_part ]
protected static final BitSet path_segments
BitSet for path segments.
  path_segments = segment *( "/" segment )
protected static final BitSet pchar
BitSet for pchar.
  pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
protected static final BitSet percent
The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.
protected static final BitSet port
Port, a logical alias for digit.
protected String protocolCharset
The charset of the protocol used by this URI instance.
protected static final BitSet query
BitSet for query (alias for uric).
  query = *uric
protected static final BitSet reg_name
BitSet for reg_name.
  reg_name = 1*( unreserved | escaped | "$" | "," | ";" | ":" | "@" | "&" | "=" | "+" )
protected static final BitSet rel_path
BitSet for rel_path.
  rel_path = rel_segment [ abs_path ]
protected static final BitSet rel_segment
BitSet for rel_segment.
  rel_segment = 1*( unreserved | escaped | ";" | "@" | "&" | "=" | "+" | "$" | "," )
protected static final BitSet relativeURI
BitSet for relativeURI.
  relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
protected static final BitSet reserved
BitSet for reserved.
  reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
protected static char[] rootPath
The root path.
protected static final BitSet scheme
BitSet for scheme.
  scheme = alpha *( alpha | digit | "+" | "-" | "." )
protected static final BitSet segment
BitSet for segment.
  segment = *pchar *( ";" param )
protected static final BitSet server
BitSet for server.
  server = [ [ userinfo "@" ] hostport ]
public static final BitSet space
BitSet for space.
protected static final BitSet toplabel
BitSet for toplabel.
  toplabel = alpha | alpha *( alphanum | "-" ) alphanum
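The hostname, domainlabel and toplabel productions fit together as sketched below. This is a JDK-only illustration that transcribes the three productions into a regular expression; it is not how this class validates hosts, and the name HostnameGrammarSketch is hypothetical.

```java
import java.util.regex.Pattern;

// Illustration only: a literal transcription of the three productions.
final class HostnameGrammarSketch {

    // domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
    private static final String DOMAINLABEL =
            "[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

    // toplabel = alpha | alpha *( alphanum | "-" ) alphanum
    private static final String TOPLABEL =
            "[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

    // hostname = *( domainlabel "." ) toplabel [ "." ]
    private static final Pattern HOSTNAME =
            Pattern.compile("(?:" + DOMAINLABEL + "\\.)*" + TOPLABEL + "\\.?");

    /** True if the whole string matches the hostname production. */
    static boolean matchesHostnameGrammar(String host) {
        return HOSTNAME.matcher(host).matches();
    }
}
```

The rule that a toplabel starts with a letter is what keeps a dotted-decimal address from also parsing as a hostname.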
protected static final BitSet unreserved
Data characters that are allowed in a URI but do not have a reserved purpose are called unreserved.
  unreserved = alphanum | mark
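Character classes such as unreserved are unions of smaller classes, which is a natural fit for java.util.BitSet. The sketch below shows one way such sets might be composed; the class name CharClassSketch and its helper methods are hypothetical and are not members of this class.

```java
import java.util.BitSet;

// Illustration only: composing character classes as unions of BitSets.
final class CharClassSketch {

    /** Set containing every character from 'from' to 'to', inclusive. */
    static BitSet range(char from, char to) {
        BitSet set = new BitSet(128);
        set.set(from, to + 1); // BitSet.set(from, to) excludes the upper bound
        return set;
    }

    /** Set containing exactly the characters of the given string. */
    static BitSet of(String chars) {
        BitSet set = new BitSet(128);
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        return set;
    }

    /** Union of the given sets; the inputs are left unchanged. */
    static BitSet union(BitSet... sets) {
        BitSet result = new BitSet(128);
        for (BitSet s : sets) {
            result.or(s);
        }
        return result;
    }

    static final BitSet DIGIT = range('0', '9');
    static final BitSet ALPHA = union(range('a', 'z'), range('A', 'Z'));
    static final BitSet ALPHANUM = union(ALPHA, DIGIT);   // alphanum = alpha | digit
    static final BitSet MARK = of("-_.!~*'()");
    static final BitSet UNRESERVED = union(ALPHANUM, MARK); // unreserved = alphanum | mark

    static boolean isUnreserved(char c) {
        return UNRESERVED.get(c);
    }
}
```

A membership test is then a single bit lookup, which is why the same sets can be reused as the "allowed" argument when encoding a component.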
public static final BitSet unwise
BitSet for unwise.
protected static final BitSet uric
BitSet for uric.
  uric = reserved | unreserved | escaped
protected static final BitSet uric_no_slash
URI bitset for encoding typical non-slash characters.
  uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
protected static final BitSet userinfo
BitSet for userinfo.
  userinfo = *( unreserved | escaped | ";" | ":" | "&" | "=" | "+" | "$" | "," )
public static final BitSet within_userinfo
BitSet for within the userinfo component like user and password.
protected URI()
Create an instance for internal use.
public URI(String original) throws URIException
Deprecated. Use URI(String, boolean)
Construct a URI from the given string.
  URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
A URI can be placed within double-quotes or angle brackets like "http://test.com/" and <http://test.com/>
- Parameters:
original
- the string to be converted to a URI character sequence. It is either an absoluteURI or a relativeURI.
- Throws:
URIException
- If the URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String original, String charset) throws URIException
Deprecated. Use URI(String, boolean, String)
Construct a URI from the given string with the given charset.
- Parameters:
original
- the string to be converted to a URI character sequence. It is either an absoluteURI or a relativeURI.
charset
- the charset string to do escape encoding
- Throws:
URIException
- If the URI cannot be created.
- See Also:
getProtocolCharset()
public URI(String scheme, String schemeSpecificPart, String fragment) throws URIException
Construct a general URI from the given components.
  URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
  absoluteURI = scheme ":" ( hier_part | opaque_part )
  opaque_part = uric_no_slash *uric
It's for absolute URI = <scheme>:<scheme-specific-part>#<fragment>.
- Parameters:
scheme
- the scheme string
schemeSpecificPart
- scheme_specific_part
fragment
- the fragment string
- Throws:
URIException
- If the URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String scheme, String host, String path, String fragment) throws URIException
Construct a general URI from the given components.
- Parameters:
scheme
- the scheme string
host
- the host string
path
- the path string
fragment
- the fragment string
- Throws:
URIException
- If the new URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String scheme, String authority, String path, String query, String fragment) throws URIException
Construct a general URI from the given components.
  URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
  absoluteURI = scheme ":" ( hier_part | opaque_part )
  relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
  hier_part = ( net_path | abs_path ) [ "?" query ]
It's for absolute URI = <scheme>:<path>?<query>#<fragment> and relative URI = <path>?<query>#<fragment>.
- Parameters:
scheme
- the scheme string
authority
- the authority string
path
- the path string
query
- the query string
fragment
- the fragment string
- Throws:
URIException
- If the new URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String scheme, String userinfo, String host, int port) throws URIException
Construct a general URI from the given components.
- Parameters:
scheme
- the scheme string
userinfo
- the userinfo string
host
- the host string
port
- the port number
- Throws:
URIException
- If the new URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String scheme, String userinfo, String host, int port, String path) throws URIException
Construct a general URI from the given components.
- Parameters:
scheme
- the scheme string
userinfo
- the userinfo string
host
- the host string
port
- the port number
path
- the path string
- Throws:
URIException
- If the new URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String scheme, String userinfo, String host, int port, String path, String query) throws URIException
Construct a general URI from the given components.
- Parameters:
scheme
- the scheme string
userinfo
- the userinfo string
host
- the host string
port
- the port number
path
- the path string
query
- the query string
- Throws:
URIException
- If the new URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment) throws URIException
Construct a general URI from the given components.
- Parameters:
scheme
- the scheme string
userinfo
- the userinfo string
host
- the host string
port
- the port number
path
- the path string
query
- the query string
fragment
- the fragment string
- Throws:
URIException
- If the new URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(String s, boolean escaped) throws URIException, NullPointerException
Construct a URI from a string. The input string can be either in escaped or unescaped form.
- Parameters:
s
- URI character sequence
escaped
- true if URI character sequence is in escaped form. false otherwise.
- Throws:
URIException
- If the URI cannot be created.
- Since:
- 3.0
- See Also:
getProtocolCharset()
public URI(String s, boolean escaped, String charset) throws URIException, NullPointerException
Construct a URI from a string with the given charset. The input string can be either in escaped or unescaped form.
- Parameters:
s
- URI character sequence
escaped
- true if URI character sequence is in escaped form. false otherwise.
charset
- the charset string to do escape encoding, if required
- Throws:
URIException
- If the URI cannot be created.
- Since:
- 3.0
- See Also:
getProtocolCharset()
public URI(char[] escaped) throws URIException, NullPointerException
Deprecated. Use URI(String, boolean)
Construct a URI as an escaped form of a character array. A URI can be placed within double-quotes or angle brackets like "http://test.com/" and <http://test.com/>
- Parameters:
escaped
- the URI character sequence
- Throws:
URIException
- If the URI cannot be created.
- See Also:
getDefaultProtocolCharset()
public URI(char[] escaped, String charset) throws URIException, NullPointerException
Deprecated. Use URI(String, boolean, String)
Construct a URI as an escaped form of a character array with the given charset.
- Parameters:
escaped
- the URI character sequence
charset
- the charset string to do escape encoding
- Throws:
URIException
- If the URI cannot be created.
- See Also:
getProtocolCharset()
public URI(URI base, String relative) throws URIException
Deprecated. Use URI(URI, String, boolean)
Construct a general URI with the given relative URI string.
- Parameters:
base
- the base URI
relative
- the relative URI string
- Throws:
URIException
- If the new URI cannot be created.
public URI(URI base, String relative, boolean escaped) throws URIException
Construct a general URI with the given relative URI string.
- Parameters:
base
- the base URI
relative
- the relative URI string
escaped
- true if URI character sequence is in escaped form. false otherwise.
- Throws:
URIException
- If the new URI cannot be created.
- Since:
- 3.0
public URI(URI base, URI relative) throws URIException
Construct a general URI with the given relative URI.
  URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
  relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
Resolving Relative References to Absolute Form.
Examples of Resolving Relative URI References
Within an object with a well-defined base URI of
  http://a/b/c/d;p?q
the relative URI would be resolved as follows:
Normal Examples
  g:h      = g:h
  g        = http://a/b/c/g
  ./g      = http://a/b/c/g
  g/       = http://a/b/c/g/
  /g       = http://a/g
  //g      = http://g
  ?y       = http://a/b/c/?y
  g?y      = http://a/b/c/g?y
  #s       = (current document)#s
  g#s      = http://a/b/c/g#s
  g?y#s    = http://a/b/c/g?y#s
  ;x       = http://a/b/c/;x
  g;x      = http://a/b/c/g;x
  g;x?y#s  = http://a/b/c/g;x?y#s
  .        = http://a/b/c/
  ./       = http://a/b/c/
  ..       = http://a/b/
  ../      = http://a/b/
  ../g     = http://a/b/g
  ../..    = http://a/
  ../../   = http://a/
  ../../g  = http://a/g
Some URI schemes do not allow a hierarchical syntax matching the hier_part syntax, and thus cannot use relative references.
- Parameters:
base
- the base URI
relative
- the relative URI
- Throws:
URIException
- If the new URI cannot be created.
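A few of the resolution examples above can be reproduced with the JDK alone. The sketch below uses java.net.URI.resolve as a stand-in for this constructor; it is not this class's implementation, and the name ResolveSketch is hypothetical.

```java
import java.net.URI;

// Illustration only: java.net.URI is used here, not org.apache.commons.httpclient.URI.
final class ResolveSketch {

    /** The base URI used in the examples. */
    static final URI BASE = URI.create("http://a/b/c/d;p?q");

    /** Resolves a relative reference against BASE and returns the result as text. */
    static String resolve(String relative) {
        return BASE.resolve(relative).toString();
    }

    public static void main(String[] args) {
        String[] references = {"g", "./g", "../g", "/g", "//g", "g?y#s", "g:h"};
        for (String reference : references) {
            System.out.println(reference + " = " + resolve(reference));
        }
    }
}
```

A reference that carries its own scheme, such as g:h, is returned unchanged because it is already absolute.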
public Object clone()
Create and return a copy of this object, the URI-reference containing the userinfo component. Note that the whole URI-reference, including the userinfo component, cannot be obtained as a String. To copy the identical URI object, including the userinfo component, this method should be used.
- Returns:
- a clone of this instance
public int compareTo(Object obj) throws ClassCastException
Compare this URI to another object.
- Parameters:
obj
- the object to be compared.
- Returns:
- 0 if they are the same, -1 if the comparison fails; the authority component is compared first
protected static String decode(String component, String charset) throws URIException
Decodes URI encoded string. This is a two-step mapping, one from URI characters to octets, and subsequently a second from octets to original characters:
  URI character sequence->octet sequence->original character sequence
A URI must be separated into its components before the escaped characters within those components can be safely decoded.
Notice that there is a chance that URI characters that are non UTF-8 may be parsed as valid UTF-8. A recent non-scientific analysis found that EUC encoded Japanese words had a 2.7% false reading; SJIS had a 0.0005% false reading; other encodings such as ASCII or KOI-8 have a 0% false reading.
The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.
The unescape method is internally performed within this method.
- Parameters:
component
- the URI character sequence
charset
- the protocol charset
- Returns:
- original character sequence
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- Since:
- 3.0
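The two-step mapping described for decode can be sketched with the JDK alone: first turn the URI character sequence into octets by undoing the "%" hex hex triplets, then interpret those octets in the given charset. This is a minimal illustration under those assumptions, not the code of this class; the name DecodeSketch is hypothetical, and it throws IllegalArgumentException where the documented method throws URIException.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Illustration only: URI character sequence -> octet sequence -> original characters.
final class DecodeSketch {

    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        throw new IllegalArgumentException("invalid hex digit: " + c);
    }

    static String decode(String component, String charset) {
        // Step 1: URI characters to octets.
        ByteArrayOutputStream octets = new ByteArrayOutputStream();
        for (int i = 0; i < component.length(); i++) {
            char c = component.charAt(i);
            if (c == '%') {
                if (i + 2 >= component.length()) {
                    throw new IllegalArgumentException("incomplete trailing escape pattern");
                }
                int high = hexValue(component.charAt(i + 1));
                int low = hexValue(component.charAt(i + 2));
                octets.write((high << 4) | low);
                i += 2; // skip the two hex digits just consumed
            } else if (c < 0x80) {
                octets.write(c); // unescaped URI characters are plain ASCII
            } else {
                throw new IllegalArgumentException("non-ASCII character in URI component");
            }
        }
        // Step 2: octets to original characters in the protocol charset.
        return new String(octets.toByteArray(), Charset.forName(charset));
    }
}
```

The same escaped text can decode to different characters under different charsets, which is why the charset is an explicit argument: %E9 is a complete character in ISO-8859-1 but only a fragment of one in UTF-8.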
protected static String decode(char[] component, String charset) throws URIException
Decodes URI encoded string. This is a two-step mapping, one from URI characters to octets, and subsequently a second from octets to original characters:
  URI character sequence->octet sequence->original character sequence
A URI must be separated into its components before the escaped characters within those components can be safely decoded.
Notice that there is a chance that URI characters that are non UTF-8 may be parsed as valid UTF-8. A recent non-scientific analysis found that EUC encoded Japanese words had a 2.7% false reading; SJIS had a 0.0005% false reading; other encodings such as ASCII or KOI-8 have a 0% false reading.
The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.
The unescape method is internally performed within this method.
- Parameters:
component
- the URI character sequence
charset
- the protocol charset
- Returns:
- original character sequence
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
protected static char[] encode(String original, BitSet allowed, String charset) throws URIException
Encodes URI string. This is a two-step mapping, one from original characters to octets, and subsequently a second from octets to URI characters:
  original character sequence->octet sequence->URI character sequence
An escaped octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing the octet code. For example, "%20" is the escaped encoding for the US-ASCII space character.
Conversion from the local filesystem character set to UTF-8 will normally involve a two-step process. First convert the local character set to the UCS; then convert the UCS to UTF-8. The first step in the process can be performed by maintaining a mapping table that includes the local character set code and the corresponding UCS code. The next step is to convert the UCS character code to the UTF-8 encoding. Mapping between vendor codepages can be done in a very similar manner as described above.
The only time escape encodings can safely be made is when a URI is being created from its component parts.
The escape and validate methods are internally performed within this method.
- Parameters:
original
- the original character sequence
allowed
- those characters that are allowed within a component
charset
- the protocol charset
- Returns:
- URI character sequence
- Throws:
URIException
- null component or unsupported character encoding
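The encoding direction can be sketched the same way: convert the original characters to octets in the given charset, then emit each octet either as itself, if it is in the allowed set, or as a "%" hex hex triplet. This is a JDK-only illustration, not the code of this class. The name EncodeSketch, the helper allowedOf and the set DEMO_ALLOWED are hypothetical, and the sketch returns a String where the documented method returns char[].

```java
import java.nio.charset.Charset;
import java.util.BitSet;

// Illustration only: original characters -> octet sequence -> URI character sequence.
final class EncodeSketch {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Builds an "allowed" set from the characters of the given string. */
    static BitSet allowedOf(String chars) {
        BitSet set = new BitSet(128);
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        return set;
    }

    /** Hypothetical demo set: letters, digits and the slash. */
    static final BitSet DEMO_ALLOWED = allowedOf(
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/");

    // Assumes an ASCII-compatible charset such as UTF-8 or ISO-8859-1.
    static String encode(String original, BitSet allowed, String charset) {
        if (original == null) {
            throw new IllegalArgumentException("null component");
        }
        // Step 1: original characters to octets.
        byte[] octets = original.getBytes(Charset.forName(charset));
        // Step 2: octets to URI characters.
        StringBuilder out = new StringBuilder(octets.length);
        for (byte b : octets) {
            int octet = b & 0xFF;
            if (octet < 0x80 && allowed.get(octet)) {
                out.append((char) octet);
            } else {
                out.append('%').append(HEX[octet >> 4]).append(HEX[octet & 0x0F]);
            }
        }
        return out.toString();
    }
}
```

Because "%" is not in the allowed set, a literal percent sign in the input comes out as %25, which matches the rule stated for the percent character.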
public boolean equals(Object obj)
Test whether this URI is equal to another object.
- Parameters:
obj
- an object to compare
- Returns:
- true if two URI objects are equal
protected boolean equals(char[] first, char[] second)
Test if the first array is equal to the second array.
- Parameters:
first
- the first character array
second
- the second character array
- Returns:
- true if they're equal
public String getAboveHierPath() throws URIException
Get the level above this hierarchy level.
- Returns:
- the above hierarchy level
- Throws:
URIException
- If getRawCurrentHierPath(char[]) fails.
- See Also:
decode
public String getAuthority() throws URIException
Get the authority.
- Returns:
- the authority
- Throws:
URIException
- If decode fails.
public String getCurrentHierPath() throws URIException
Get the current hierarchy level.
- Returns:
- the current hierarchy level
- Throws:
URIException
- If getRawCurrentHierPath(char[]) fails.
- See Also:
decode
public static String getDefaultDocumentCharset()
Get the recommended default charset of the document.
- Returns:
- the default charset string
public static String getDefaultDocumentCharsetByLocale()
Get the default charset of the document by locale.
- Returns:
- the default charset string by locale
public static String getDefaultDocumentCharsetByPlatform()
Get the default charset of the document by platform.
- Returns:
- the default charset string by platform
public static String getDefaultProtocolCharset()
Get the default charset of the protocol. An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used. To work globally either requires support of a number of character sets and to be able to convert between them, or the use of a single preferred character set. For support of global compatibility it is STRONGLY RECOMMENDED that clients and servers use UTF-8 encoding when exchanging URIs.
- Returns:
- the default charset string
public String getEscapedAboveHierPath() throws URIException
Get the escaped level above this hierarchy level.
- Returns:
- the escaped above hierarchy level
- Throws:
URIException
- If getRawCurrentHierPath(char[]) fails.
public String getEscapedAuthority()
Get the escaped authority.
- Returns:
- the escaped authority
public String getEscapedCurrentHierPath() throws URIException
Get the escaped current hierarchy level.
- Returns:
- the escaped current hierarchy level
- Throws:
URIException
- If getRawCurrentHierPath(char[]) fails.
public String getEscapedFragment()
Get the escaped fragment.
- Returns:
- the escaped fragment string
public String getEscapedName()
Get the escaped basename of the path.
- Returns:
- the escaped basename string
public String getEscapedPath()
Get the escaped path.
  path = [ abs_path | opaque_part ]
  abs_path = "/" path_segments
  opaque_part = uric_no_slash *uric
- Returns:
- the escaped path string
public String getEscapedPathQuery()
Get the escaped path and query.
- Returns:
- the escaped path and query string
public String getEscapedQuery()
Get the escaped query.
- Returns:
- the escaped query string
public String getEscapedURI()
Get the URI character sequence in escaped form. This is useful when the URI is to be transported by a protocol.
- Returns:
- the escaped URI string
public String getEscapedURIReference()
Get the escaped URI reference string.
- Returns:
- the escaped URI reference string
public String getEscapedUserinfo()
Get the escaped userinfo.
- Returns:
- the escaped userinfo
- See Also:
getAuthority()
public String getFragment() throws URIException
Get the fragment.
- Returns:
- the fragment string
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- See Also:
decode
public String getHost() throws URIException
Get the host.
  host = hostname | IPv4address | IPv6reference
- Returns:
- the host
- Throws:
URIException
- If decode fails.
- See Also:
getAuthority()
public String getName() throws URIException
Get the basename of the path.
- Returns:
- the basename string
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- See Also:
decode
public String getPath() throws URIException
Get the path.
  path = [ abs_path | opaque_part ]
- Returns:
- the path string
- Throws:
URIException
- If decode fails.
- See Also:
decode
public String getPathQuery() throws URIException
Get the path and query.
- Returns:
- the path and query string.
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- See Also:
decode
public int getPort()
Get the port. To get the specific default port, use the protocol-specific class that extends the URI class and has the server-based naming authority.
- Returns:
- the port; if -1, either the URI uses the default port for the scheme or the server-based naming authority is not supported in the specific URI.
public String getProtocolCharset()
Get the protocol charset used by this URI instance. It is set by the constructor for this instance. If it was not set by the constructor, the default protocol charset is returned.
- Returns:
- the protocol charset string
- See Also:
getDefaultProtocolCharset()
public String getQuery() throws URIException
Get the query.
- Returns:
- the query string.
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- See Also:
decode
public char[] getRawAboveHierPath() throws URIException
Get the level above this hierarchy level.
- Returns:
- the raw above hierarchy level
- Throws:
URIException
- If getRawCurrentHierPath(char[]) fails.
public char[] getRawAuthority()
Get the raw-escaped authority.
- Returns:
- the raw-escaped authority
public char[] getRawCurrentHierPath() throws URIException
Get the raw-escaped current hierarchy level.
- Returns:
- the raw-escaped current hierarchy level
- Throws:
URIException
- If getRawCurrentHierPath(char[]) fails.
protected char[] getRawCurrentHierPath(char[] path) throws URIException
Get the raw-escaped current hierarchy level in the given path. If the last namespace is a collection, the path string should end with a slash mark ('/').
- Parameters:
path
- the path
- Returns:
- the current hierarchy level
- Throws:
URIException
- no hierarchy level
public char[] getRawFragment()
Get the raw-escaped fragment. The optional fragment identifier is not part of a URI, but is often used in conjunction with a URI. The format and interpretation of fragment identifiers is dependent on the media type [RFC2046] of the retrieval result. A fragment identifier is only meaningful when a URI reference is intended for retrieval and the result of that retrieval is a document for which the identified fragment is consistently defined.
- Returns:
- the raw-escaped fragment
public char[] getRawHost()
Get the host.
host = hostname | IPv4address | IPv6reference
- Returns:
- the host
- See Also:
getAuthority()
public char[] getRawName()
Get the raw-escaped basename of the path.
- Returns:
- the raw-escaped basename
public char[] getRawPath()
Get the raw-escaped path.
path = [ abs_path | opaque_part ]
- Returns:
- the raw-escaped path
public char[] getRawPathQuery()
Get the raw-escaped path and query.
- Returns:
- the raw-escaped path and query
public char[] getRawQuery()
Get the raw-escaped query.
- Returns:
- the raw-escaped query
public char[] getRawScheme()
Get the scheme.
- Returns:
- the scheme
public char[] getRawURI()
Get the raw-escaped URI character sequence. This form is useful when the URI has to be transported over a protocol. It is clearly unwise to use a URL that contains a password which is intended to be secret. In particular, the use of a password within the 'userinfo' component of a URL is strongly disrecommended except in those rare cases where the 'password' parameter is intended to be public. To get the individual parts of the userinfo, use the methods of the scheme-specific URL class; what is available depends on that class.
- Returns:
- the URI character sequence
public char[] getRawURIReference()
Get the URI reference character sequence.
- Returns:
- the URI reference character sequence
public char[] getRawUserinfo()
Get the raw-escaped userinfo.
- Returns:
- the raw-escaped userinfo
- See Also:
getAuthority()
public String getScheme()
Get the scheme.
- Returns:
- the scheme, or null if the scheme is undefined
public String getURI() throws URIException
Get the URI character sequence as the original string.
- Returns:
- the original URI string
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- See Also:
decode
public String getURIReference() throws URIException
Get the original URI reference string.
- Returns:
- the original URI reference string
- Throws:
URIException
- If decode fails.
public String getUserinfo() throws URIException
Get the userinfo.
- Returns:
- the userinfo
- Throws:
URIException
- If decode fails.
- See Also:
getAuthority()
public boolean hasAuthority()
Tell whether or not this URI has an authority. It is equivalent to the isNetPath() method.
- Returns:
- true iff this URI has an authority
- See Also:
isNetPath()
public boolean hasFragment()
Tell whether or not this URI has a fragment.
- Returns:
- true iff this URI has a fragment
public boolean hasQuery()
Tell whether or not this URI has a query.
- Returns:
- true iff this URI has a query
public boolean hasUserinfo()
Tell whether or not this URI has userinfo.
- Returns:
- true iff this URI has userinfo
public int hashCode()
Return a hash code for this URI.
- Returns:
- a hash code value for this URI
protected int indexFirstOf(String s, String delims)
Get the earliest index at which any one of the given delimiters first occurs in the given string.
- Parameters:
s
- the string to be indexed
delims
- the delimiters used to index
- Returns:
- the earliest index, if any of the delimiters occurs
protected int indexFirstOf(String s, String delims, int offset)
Get the earliest index, starting from the given offset, at which any one of the given delimiters first occurs in the given string.
- Parameters:
s
- the string to be indexed
delims
- the delimiters used to index
offset
- the from index
- Returns:
- the earliest index, if any of the delimiters occurs
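The search described above can be sketched in plain Java. This is an illustration under stated assumptions, not the library's implementation: the class name is hypothetical, and returning -1 when no delimiter occurs is an assumption that this page does not confirm.

```java
final class IndexFirstOfSketch {
    // Returns the smallest index >= offset at which any character of delims
    // occurs in s. Returns -1 if none occurs (assumed convention).
    static int indexFirstOf(String s, String delims, int offset) {
        if (s == null || delims == null) {
            return -1;
        }
        int start = Math.max(offset, 0);
        int earliest = -1;
        for (int i = 0; i < delims.length(); i++) {
            int at = s.indexOf(delims.charAt(i), start);
            if (at >= 0 && (earliest == -1 || at < earliest)) {
                earliest = at;
            }
        }
        return earliest;
    }

    public static void main(String[] args) {
        String ref = "//host/path?query#frag";
        // Skip the leading "//" and find where the authority ends.
        System.out.println(indexFirstOf(ref, "/?#", 2)); // 6
    }
}
```

Scanning once per delimiter keeps the sketch short; for long delimiter sets a single pass over s with a lookup table would avoid the repeated scans.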
protected int indexFirstOf(char[] s, char delim)
Get the earliest index at which the given delimiter first occurs in the given array.
- Parameters:
s
- the character array to be indexed
delim
- the delimiter used to index
- Returns:
- the earliest index, if the delimiter occurs
protected int indexFirstOf(char[] s, char delim, int offset)
Get the earliest index, starting from the given offset, at which the given delimiter first occurs in the given array.
- Parameters:
s
- the character array to be indexed
delim
- the delimiter used to index
offset
- the offset
- Returns:
- the earliest index, if the delimiter occurs
public boolean isAbsPath()
Tell whether or not the relativeURI or hier_part of this URI is abs_path.
- Returns:
- true iff the relativeURI or hier_part is abs_path
public boolean isAbsoluteURI()
Tell whether or not this URI is absolute.
- Returns:
- true iff this URI is absoluteURI
public boolean isHierPart()
Tell whether or not the absoluteURI of this URI is hier_part.
- Returns:
- true iff the absoluteURI is hier_part
public boolean isHostname()
Tell whether or not the host part of this URI is hostname.
- Returns:
- true iff the host part is hostname
public boolean isIPv4address()
Tell whether or not the host part of this URI is IPv4address.
- Returns:
- true iff the host part is IPv4address
public boolean isIPv6reference()
Tell whether or not the host part of this URI is IPv6reference.
- Returns:
- true iff the host part is IPv6reference
public boolean isNetPath()
Tell whether or not the relativeURI or hier_part of this URI is net_path. It is equivalent to the hasAuthority() method.
- Returns:
- true iff the relativeURI or hier_part is net_path
- See Also:
hasAuthority()
public boolean isOpaquePart()
Tell whether or not the absoluteURI of this URI is opaque_part.
- Returns:
- true iff the absoluteURI is opaque_part
public boolean isRegName()
Tell whether or not the authority component of this URI is reg_name.
- Returns:
- true iff the authority component is reg_name
public boolean isRelPath()
Tell whether or not the relativeURI of this URI is rel_path.
- Returns:
- true iff the relativeURI is rel_path
public boolean isRelativeURI()
Tell whether or not this URI is relative.
- Returns:
- true iff this URI is relativeURI
public boolean isServer()
Tell whether or not the authority component of this URI is server.
- Returns:
- true iff the authority component is server
public void normalize() throws URIException
Normalizes the path part of this URI. Normalization is only meant to be performed on URIs with an absolute path. Calling this method on a relative path URI will have no effect.
- Throws:
URIException
- if there is no higher path level to normalize
- See Also:
isAbsPath()
protected char[] normalize(char[] path) throws URIException
Normalize the given hier path part. Algorithm taken from URI reference parser at http://www.apache.org/~fielding/uri/rev-2002/issues.html.
- Parameters:
path
- the path to normalize
- Returns:
- the normalized path
- Throws:
URIException
- if there is no higher path level to normalize
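To make the idea of path normalization concrete, here is a minimal stack-based sketch that removes "." and ".." segments from an absolute path. It is not the algorithm referenced above and not this class's code: the class name is hypothetical, it works on String rather than char[], and it throws IllegalArgumentException as a stand-in for URIException when ".." would climb above the root.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PathNormalizeSketch {
    // Removes "." and ".." segments from an absolute path.
    // Relative paths are returned unchanged, mirroring "no effect".
    static String normalize(String path) {
        if (path == null || !path.startsWith("/")) {
            return path;
        }
        Deque<String> stack = new ArrayDeque<>();
        String[] segments = path.split("/", -1); // segments[0] is always ""
        for (int i = 1; i < segments.length; i++) {
            String seg = segments[i];
            boolean last = (i == segments.length - 1);
            if (seg.equals(".")) {
                if (last) {
                    stack.addLast(""); // keep the trailing slash
                }
            } else if (seg.equals("..")) {
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("no higher path level: " + path);
                }
                stack.removeLast();
                if (last) {
                    stack.addLast(""); // result names a directory
                }
            } else {
                stack.addLast(seg);
            }
        }
        return "/" + String.join("/", stack);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/a/b/../c/./d")); // /a/c/d
    }
}
```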
protected void parseAuthority(String original, boolean escaped) throws URIException
Parse the authority component.
- Parameters:
original
- the original character sequence of the authority component
escaped
- true if original is escaped
- Throws:
URIException
- If an error occurs.
protected void parseUriReference(String original, boolean escaped) throws URIException
Parse a URI reference as a String with the character encoding of the local system or the document, in order to avoid any possibility of conflict with non-ASCII characters.
The following line is the regular expression for breaking down a URI reference into its components.
    ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
     12            3  4          5       6  7        8 9
For example, matching the above expression to http://jakarta.apache.org/ietf/uri/#Related results in the following subexpression matches:
                 $1 = http:
    scheme    =  $2 = http
                 $3 = //jakarta.apache.org
    authority =  $4 = jakarta.apache.org
    path      =  $5 = /ietf/uri/
                 $6 = <undefined>
    query     =  $7 = <undefined>
                 $8 = #Related
    fragment  =  $9 = Related
- Parameters:
original
- the original character sequence
escaped
- true if original is escaped
- Throws:
URIException
- If an error occurs.
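The regular expression quoted in the description can be tried directly with java.util.regex. The sketch below only demonstrates the group numbering on the example URI; it is not how this method is implemented, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UriReferenceRegexSketch {
    // The expression from the description, with the backslash doubled for Java.
    static final Pattern URI_REFERENCE = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns {scheme, authority, path, query, fragment}.
    // An entry is null when the component is undefined.
    static String[] split(String reference) {
        Matcher m = URI_REFERENCE.matcher(reference);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a URI reference: " + reference);
        }
        return new String[] {
            m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)
        };
    }

    public static void main(String[] args) {
        String[] parts = split("http://jakarta.apache.org/ietf/uri/#Related");
        String[] names = {"scheme", "authority", "path", "query", "fragment"};
        for (int i = 0; i < parts.length; i++) {
            System.out.println(names[i] + " = " + parts[i]);
        }
    }
}
```

In Java an optional group that did not take part in the match is reported as null, which corresponds to the undefined $6 and $7 entries for a reference without a query.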
protected boolean prevalidate(String component, BitSet disallowed)
Pre-validate the unescaped URI string within a specific component.
- Parameters:
component
- the string within the component
disallowed
- those characters disallowed within the component
- Returns:
- true if the component contains no disallowed characters; false if the component is undefined or incorrect
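A BitSet indexed by character code is a compact way to express a set of disallowed characters. The following sketch shows the kind of check described above; it is an illustration, not the library's code, and the class name and the bits helper are hypothetical.

```java
import java.util.BitSet;

final class PrevalidateSketch {
    // Hypothetical helper: builds a BitSet with one bit per listed character.
    static BitSet bits(String chars) {
        BitSet set = new BitSet(256);
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        return set;
    }

    // Returns false for an undefined (null) component or when any
    // character of the component is in the disallowed set.
    static boolean prevalidate(String component, BitSet disallowed) {
        if (component == null) {
            return false;
        }
        for (int i = 0; i < component.length(); i++) {
            if (disallowed.get(component.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BitSet disallowed = bits(" #");
        System.out.println(prevalidate("docs/index.html", disallowed)); // true
        System.out.println(prevalidate("docs/my file", disallowed));    // false
    }
}
```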
protected void readObject(ObjectInputStream ois) throws ClassNotFoundException, IOException
Read a URI.
- Parameters:
ois
- the object-input stream
protected char[] removeFragmentIdentifier(char[] component)
Remove the fragment identifier of the given component.
- Parameters:
component
- the component that may include a fragment
- Returns:
- the component with the fragment identifier removed
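As a rough illustration of the operation, the sketch below cuts a character array at the first '#'. It is an assumption that the first '#' marks the fragment and that a component without one is returned as-is; the class name is hypothetical.

```java
import java.util.Arrays;

final class FragmentSketch {
    // Returns the part of the component before the first '#',
    // or the component itself when it has no '#'.
    static char[] removeFragmentIdentifier(char[] component) {
        if (component == null) {
            return null;
        }
        for (int i = 0; i < component.length; i++) {
            if (component[i] == '#') {
                return Arrays.copyOf(component, i);
            }
        }
        return component;
    }

    public static void main(String[] args) {
        char[] stripped = removeFragmentIdentifier("/ietf/uri/#Related".toCharArray());
        System.out.println(new String(stripped)); // /ietf/uri/
    }
}
```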
protected char[] resolvePath(char[] basePath, char[] relPath) throws URIException
Resolve the base and relative path.
- Parameters:
basePath
- a character array of the basePath
relPath
- a character array of the relPath
- Returns:
- the resolved path
- Throws:
URIException
- if there is no higher path level to resolve
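The merge step of relative path resolution can be sketched as follows. This is a simplified illustration, not this method's implementation: the class and method names are hypothetical, it works on String rather than char[], and it deliberately leaves out the normalization of "." and ".." segments that a full resolution would apply afterwards.

```java
final class ResolvePathSketch {
    // An absolute relPath replaces the base path entirely. Otherwise the
    // relPath is appended to the base path up to and including its last '/'.
    static String merge(String basePath, String relPath) {
        if (relPath != null && relPath.startsWith("/")) {
            return relPath;
        }
        String base = (basePath == null) ? "" : basePath;
        int lastSlash = base.lastIndexOf('/');
        String dir = (lastSlash == -1) ? "" : base.substring(0, lastSlash + 1);
        return dir + (relPath == null ? "" : relPath);
    }

    public static void main(String[] args) {
        System.out.println(merge("/a/b/c", "d"));  // /a/b/d
        System.out.println(merge("/a/b/c", "/x")); // /x
    }
}
```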
public static void setDefaultDocumentCharset(String charset) throws URI.DefaultCharsetChanged
Set the default charset of the document. Note that a URI may contain mixed characters (e.g. ftp://host/KoreanNamespace/ChineseResource). To handle the bidirectional display of these character sets, the protocol charset can simply be used again, because handling of the BIDI control characters inserted at different points during composition is not yet implemented. The setter always succeeds and always throws a DefaultCharsetChanged exception, so the API programmer must call it in the following way:
    import org.apache.commons.httpclient.URI;
    import org.apache.commons.httpclient.URI.DefaultCharsetChanged;
    // ...
    try {
        URI.setDefaultDocumentCharset("EUC-KR");
    } catch (DefaultCharsetChanged cc) {
        // CASE 1: the exception could be ignored, when it is set by user
        if (cc.getReasonCode() == DefaultCharsetChanged.DOCUMENT_CHARSET) {
            // CASE 2: let user know the default document charset changed
        } else {
            // CASE 3: let user know the default protocol charset changed
        }
    }
The API programmer is responsible for setting the correct charset, and each application should remember which charset it supports.
- Parameters:
charset
- the default charset for the document
- Throws:
URI.DefaultCharsetChanged
- default charset changed
public static void setDefaultProtocolCharset(String charset) throws URI.DefaultCharsetChanged
Set the default charset of the protocol. The character set used to store files SHALL remain a local decision and MAY depend on the capability of local operating systems. Prior to the exchange of URIs they SHOULD be converted into an ISO/IEC 10646 format and UTF-8 encoded. This approach, while allowing international exchange of URIs, will still allow backward compatibility with older systems because the code set positions for ASCII characters are identical to the one byte sequence in UTF-8. An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used. The setter always succeeds and always throws a DefaultCharsetChanged exception, so the API programmer must call it in the following way:
    import org.apache.commons.httpclient.URI;
    import org.apache.commons.httpclient.URI.DefaultCharsetChanged;
    // ...
    try {
        URI.setDefaultProtocolCharset("UTF-8");
    } catch (DefaultCharsetChanged cc) {
        // CASE 1: the exception could be ignored, when it is set by user
        if (cc.getReasonCode() == DefaultCharsetChanged.PROTOCOL_CHARSET) {
            // CASE 2: let user know the default protocol charset changed
        } else {
            // CASE 3: let user know the default document charset changed
        }
    }
The API programmer is responsible for setting the correct charset, and each application should remember which charset it supports.
- Parameters:
charset
- the default charset for each protocol
- Throws:
URI.DefaultCharsetChanged
- default charset changed
public void setEscapedAuthority(String escapedAuthority) throws URIException
Set the authority. It can be one of server, hostport, hostname, IPv4address, IPv6reference, or reg_name. Note that there is no setAuthority method, because of escape encoding.
- Parameters:
escapedAuthority
- the escaped authority string
- Throws:
URIException
- If parseAuthority(java.lang.String,boolean) fails
public void setEscapedFragment(String escapedFragment) throws URIException
Set the escaped fragment string.
- Parameters:
escapedFragment
- the escaped fragment string
- Throws:
URIException
- escaped fragment not valid
public void setEscapedPath(String escapedPath) throws URIException
Set the escaped path.
- Parameters:
escapedPath
- the escaped path string
- Throws:
URIException
- encoding error or not proper for initial instance
- See Also:
encode(String,BitSet,String)
public void setEscapedQuery(String escapedQuery) throws URIException
Set the escaped query string.
- Parameters:
escapedQuery
- the escaped query string
- Throws:
URIException
- escaped query not valid
public void setFragment(String fragment) throws URIException
Set the fragment.
- Parameters:
fragment
- the fragment string.
- Throws:
URIException
- If an error occurs.
public void setPath(String path) throws URIException
Set the path.
- Parameters:
path
- the path string
- Throws:
URIException
- set incorrectly or fragment only
- See Also:
encode(String,BitSet,String)
public void setQuery(String query) throws URIException
Set the query. When the reserved special characters ("&", "=", "+", ",", and "$") within a query component cannot be misinterpreted, it is recommended to encode the whole query with this method. Additional APIs that handle the reserved special characters as each protocol requires are implemented in the protocol classes that inherit from URI, so refer to the same-named APIs in each protocol-specific class.
- Parameters:
query
- the query string.
- Throws:
URIException
- incomplete trailing escape pattern or unsupported character encoding
- See Also:
encode(String,BitSet,String)
public void setRawAuthority(char[] escapedAuthority) throws URIException, NullPointerException
Set the authority. It can be one of server, hostport, hostname, IPv4address, IPv6reference, or reg_name.
authority = server | reg_name
- Parameters:
escapedAuthority
- the raw escaped authority
- Throws:
URIException
- If parseAuthority(java.lang.String,boolean) fails
public void setRawFragment(char[] escapedFragment) throws URIException
Set the raw-escaped fragment.
- Parameters:
escapedFragment
- the raw-escaped fragment
- Throws:
URIException
- escaped fragment not valid
public void setRawPath(char[] escapedPath) throws URIException
Set the raw-escaped path.
- Parameters:
escapedPath
- the path character sequence
- Throws:
URIException
- encoding error or not proper for initial instance
- See Also:
encode(String,BitSet,String)
public void setRawQuery(char[] escapedQuery) throws URIException
Set the raw-escaped query.
- Parameters:
escapedQuery
- the raw-escaped query
- Throws:
URIException
- escaped query not valid
public String toString()
Get the escaped URI string. For security reasons, documents use the URI-reference form only without the userinfo component, as in http://jakarta.apache.org/. A URI-reference form with the userinfo component can still be parsed. In other words, this URI and any of its subclasses must not expose a URI-reference expression that contains the userinfo component, such as http://user:password@hostport/restricted_zone.
This means that the API client programmer must extract the user and password manually in order to use them for access. Subclasses will probably support that extraction, but not the exposure of a whole URI-reference expression.
- Returns:
- the escaped URI string
- See Also:
clone()
protected boolean validate(char[] component, BitSet generous)
Validate the URI characters within a specific component. Validation must be performed after escape encoding, or on a component that does not include escaped characters.
- Parameters:
component
- the character sequence within the component
generous
- those characters that are allowed within a component
- Returns:
- true if it is a correct URI character sequence
protected boolean validate(char[] component, int soffset, int eoffset, BitSet generous)
Validate the URI characters within a specific component. Validation must be performed after escape encoding, or on a component that does not include escaped characters. The check is generous rather than strict; strict validation may be performed before this method is called.
- Parameters:
component
- the character sequence within the component
soffset
- the starting offset of the given component
eoffset
- the ending offset of the given component; -1 means the length of the component
generous
- those characters that are allowed within a component
- Returns:
- true if it is a correct URI character sequence
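The generous check described above amounts to testing every character in a range against a BitSet of allowed characters. The sketch below illustrates that idea only. It assumes that eoffset is an inclusive index and that -1 stands for the last character of the component; neither detail is confirmed by this page, and the class name and the allowed helper are hypothetical.

```java
import java.util.BitSet;

final class ValidateSketch {
    // Hypothetical helper: builds a BitSet with one bit per listed character.
    static BitSet allowed(String chars) {
        BitSet set = new BitSet(256);
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        return set;
    }

    // Returns true when every character in [soffset, eoffset] is allowed.
    // eoffset == -1 means "up to the end of the component" (assumed inclusive).
    static boolean validate(char[] component, int soffset, int eoffset, BitSet generous) {
        int end = (eoffset == -1) ? component.length - 1 : eoffset;
        for (int i = soffset; i <= end; i++) {
            if (!generous.get(component[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BitSet letters = allowed("abcdefghijklmnopqrstuvwxyz");
        System.out.println(validate("http".toCharArray(), 0, -1, letters));  // true
        System.out.println(validate("ht tp".toCharArray(), 0, -1, letters)); // false
    }
}
```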
protected void writeObject(ObjectOutputStream oos) throws IOException
Write the content of this URI.
- Parameters:
oos
- the object-output stream