public class PercentEscaper extends UnicodeEscaper
A UnicodeEscaper that escapes some set of Java characters using the URI percent encoding scheme. The set of safe characters (those which remain unescaped) is specified on construction.
When encoding a String, the following rules apply:
If plusForSpace is true, the space character " " is converted into a plus sign "+".
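The encoding rules can be illustrated with a small self-contained sketch. It does not use PercentEscaper itself: the class PercentEncodeSketch and its escape helper are hypothetical names, and the sketch assumes that every character which is neither alphanumeric nor listed in safeChars is converted to UTF-8 and written as uppercase %XY triplets. It does not validate surrogate pairs.

```java
import java.nio.charset.StandardCharsets;

class PercentEncodeSketch {
    private static final char[] UPPER_HEX = "0123456789ABCDEF".toCharArray();

    // The ranges 0..9, a..z and A..Z are always left unescaped.
    private static boolean isAlwaysSafe(int cp) {
        return (cp >= '0' && cp <= '9') || (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z');
    }

    // Hypothetical helper: escapes everything that is not alphanumeric or in safeChars.
    static String escape(String input, String safeChars) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (isAlwaysSafe(cp) || safeChars.indexOf(cp) >= 0) {
                out.appendCodePoint(cp);
                continue;
            }
            // Encode the code point as UTF-8, then write each byte as %XY.
            byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            for (byte b : utf8) {
                out.append('%')
                   .append(UPPER_HEX[(b >> 4) & 0xF])
                   .append(UPPER_HEX[b & 0xF]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00FC is two bytes in UTF-8 (C3 BC), so it becomes two triplets.
        System.out.println(escape("a b/\u00FC", "-_.~")); // prints a%20b%2F%C3%BC
    }
}
```

Adding "/" to safeChars in the call above would leave the slash untouched, which is the kind of choice the constructor argument is meant to express.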
RFC 3986 defines the set of unreserved characters as the letters and digits plus "-", "_", "~", and ".". It goes on to state:
URIs that differ in the replacement of an unreserved character with its corresponding
percent-encoded US-ASCII octet are equivalent: they identify the same resource. However, URI
comparison implementations do not always perform normalization prior to comparison (see Section
6). For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT
(%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by
URI producers and, when found in a URI, should be decoded to their corresponding unreserved
characters by URI normalizers.
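The normalization described in the quoted passage can be sketched as follows. This is an illustration only, not part of PercentEscaper: the class UnreservedNormalizerSketch and its methods are hypothetical. It decodes percent-encoded octets that map to unreserved characters and rewrites the remaining triplets with uppercase hexadecimal digits.

```java
class UnreservedNormalizerSketch {
    private static final String UPPER_HEX = "0123456789ABCDEF";

    // Unreserved characters per the quoted passage: ALPHA, DIGIT, "-", ".", "_", "~".
    static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Returns 0..15 for an ASCII hex digit, or -1 for anything else.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    }

    static String normalize(String uri) {
        StringBuilder out = new StringBuilder(uri.length());
        int i = 0;
        while (i < uri.length()) {
            char c = uri.charAt(i);
            int hi = (c == '%' && i + 2 < uri.length()) ? hexValue(uri.charAt(i + 1)) : -1;
            int lo = hi >= 0 ? hexValue(uri.charAt(i + 2)) : -1;
            if (lo < 0) {
                // Not a well-formed %XY triplet: copy the character unchanged.
                out.append(c);
                i++;
                continue;
            }
            int octet = (hi << 4) | lo;
            if (isUnreserved(octet)) {
                out.append((char) octet); // decode, e.g. %7E becomes ~
            } else {
                out.append('%').append(UPPER_HEX.charAt(hi)).append(UPPER_HEX.charAt(lo));
            }
            i += 3;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("%7Euser/%41%2fb")); // prints ~user/A%2Fb
    }
}
```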
Note: This escaper produces uppercase hexadecimal sequences. From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."
|Modifier and Type|Field and Description|
|static String|SAFE_PLUS_RESERVED_CHARS_URLENCODER: Contains the safe characters plus all reserved characters.|
|static String|SAFECHARS_URLENCODER: A string of safe characters that mimics the behavior of URLEncoder.|
|static String|SAFEPATHCHARS_URLENCODER: A string of characters that do not need to be encoded when used in URI path segments, as specified in RFC 3986.|
|static String|SAFEQUERYSTRINGCHARS_URLENCODER: A string of characters that do not need to be encoded when used in URI query strings, as specified in RFC 3986.|
|static String|SAFEUSERINFOCHARS_URLENCODER: A string of characters that do not need to be encoded when used in the URI user info part, as specified in RFC 3986.|
|Constructor and Description|
|PercentEscaper(String safeChars): Constructs a URI escaper with the specified safe characters.|
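|Modifier and Type|Method and Description|
|protected char[]|escape(int cp): Escapes the given Unicode code point in UTF-8.|
|String|escape(String s): Returns the escaped form of a given literal string.|
|protected int|nextEscapeIndex(CharSequence csq, int index, int end): Scans a sub-sequence of characters from a given CharSequence, returning the index of the next character that requires escaping.|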
public static final String SAFECHARS_URLENCODER
public static final String SAFEPATHCHARS_URLENCODER
public static final String SAFE_PLUS_RESERVED_CHARS_URLENCODER
public static final String SAFEUSERINFOCHARS_URLENCODER
public static final String SAFEQUERYSTRINGCHARS_URLENCODER
public PercentEscaper(String safeChars)
safeChars - a non-null string specifying additional safe characters for this escaper (the ranges 0..9, a..z and A..Z are always safe and should not be specified here)
IllegalArgumentException - if any of the parameters are invalid
@Deprecated public PercentEscaper(String safeChars, boolean plusForSpace)
Deprecated. Use PercentEscaper(String safeChars) instead, which is the same as invoking this constructor with plusForSpace set to false. Escaping spaces as plus signs does not conform to the URI specification.
Constructs a URI escaper with the specified safe characters and optional handling of the space character.
safeChars - a non-null string specifying additional safe characters for this escaper. The ranges 0..9, a..z and A..Z are always safe and should not be specified here.
plusForSpace - true if ASCII space should be escaped to + rather than %20.
IllegalArgumentException - if safeChars includes characters that are always safe or characters that must always be escaped
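The effect of plusForSpace and the documented argument checks can be sketched without the library. The class PlusForSpaceSketch and its methods are hypothetical. The page does not list which characters "must always be escaped", so the sketch assumes that means "%" and, when plusForSpace is true, the space character. The sketch does not validate surrogate pairs.

```java
import java.nio.charset.StandardCharsets;

class PlusForSpaceSketch {
    private static final String UPPER_HEX = "0123456789ABCDEF";

    static boolean isAlphanumeric(int c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Hypothetical validation loosely following the documented exception cases.
    static void checkSafeChars(String safeChars, boolean plusForSpace) {
        for (int i = 0; i < safeChars.length(); i++) {
            char c = safeChars.charAt(i);
            if (isAlphanumeric(c)) {
                throw new IllegalArgumentException("Alphanumeric characters are always safe: " + c);
            }
            if (c == '%') { // assumption: '%' can never be declared safe
                throw new IllegalArgumentException("'%' must always be escaped");
            }
            if (c == ' ' && plusForSpace) { // assumption: space cannot be both safe and '+'
                throw new IllegalArgumentException("Space cannot be safe when plusForSpace is true");
            }
        }
    }

    static String escape(String input, String safeChars, boolean plusForSpace) {
        checkSafeChars(safeChars, plusForSpace);
        StringBuilder out = new StringBuilder();
        // Bytes >= 0x80 belong to multi-byte UTF-8 sequences and are always escaped.
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            if (octet < 0x80 && (isAlphanumeric(octet) || safeChars.indexOf(octet) >= 0)) {
                out.append((char) octet);
            } else if (octet == ' ' && plusForSpace) {
                out.append('+');
            } else {
                out.append('%').append(UPPER_HEX.charAt(octet >> 4)).append(UPPER_HEX.charAt(octet & 0xF));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a b+c", "-_.~", true));  // prints a+b%2Bc
        System.out.println(escape("a b+c", "-_.~", false)); // prints a%20b%2Bc
    }
}
```

With plusForSpace set to true, a literal "+" in the input still has to be escaped as %2B so that it can be told apart from an encoded space.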
protected int nextEscapeIndex(CharSequence csq, int index, int end)
Scans a sub-sequence of characters from a given CharSequence, returning the index of the next character that requires escaping.
Note: When implementing an escaper, it is a good idea to override this method for
efficiency. The base class implementation determines successive Unicode code points and invokes
UnicodeEscaper.escape(int) for each of them. If the semantics of your escaper are such that code
points in the supplementary range are either all escaped or all unescaped, this method can be
implemented more efficiently using CharSequence.charAt(int).
Note however that if your escaper does not escape characters in the supplementary range, you should either continue to validate the correctness of any surrogate characters encountered or provide a clear warning to users that your escaper does not validate its input.
See PercentEscaper for an example.
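A char-based scan of the kind the note describes might look like the sketch below. It is an illustration under stated assumptions, not the library's source: the class NextEscapeIndexSketch and the SAFE lookup table are hypothetical, and the sketch assumes every non-ASCII character (and therefore every surrogate) is escaped, which is what makes a plain charAt loop sufficient.

```java
class NextEscapeIndexSketch {
    // Lookup table indexed by char value; true means the character is left unescaped.
    private static final boolean[] SAFE = buildSafeTable("-_.~");

    static boolean[] buildSafeTable(String safeChars) {
        boolean[] table = new boolean[128];
        for (char c = '0'; c <= '9'; c++) table[c] = true;
        for (char c = 'a'; c <= 'z'; c++) table[c] = true;
        for (char c = 'A'; c <= 'Z'; c++) table[c] = true;
        for (int i = 0; i < safeChars.length(); i++) {
            char c = safeChars.charAt(i);
            if (c < table.length) table[c] = true; // the sketch ignores non-ASCII safe chars
        }
        return table;
    }

    // Returns the index of the first char in [index, end) that needs escaping, or end.
    static int nextEscapeIndex(CharSequence csq, int index, int end) {
        for (; index < end; index++) {
            char c = csq.charAt(index);
            // Anything outside the table, including surrogates, stops the scan.
            if (c >= SAFE.length || !SAFE[c]) {
                break;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(nextEscapeIndex("abc def", 0, 7)); // prints 3
    }
}
```

Because the scan stops at the first surrogate, the surrogate pair is handed to the code-point-aware escaping path, which is where validation can still take place.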
If you are escaping input in arbitrary successive chunks, then it is not generally safe to
use this method. If an input string ends with an unmatched high surrogate character, then this
method will throw IllegalArgumentException. You should ensure your input is valid UTF-16 before
calling this method.
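One way to check that input is valid UTF-16 before escaping is sketched below. The class SurrogateCheckSketch and the method requireWellFormed are hypothetical names, and the exception messages are made up for the illustration. The sketch also shows why splitting input into arbitrary chunks is risky: a chunk boundary that falls inside a surrogate pair leaves an unmatched high surrogate at the end of the first chunk.

```java
class SurrogateCheckSketch {
    // Throws IllegalArgumentException if seq contains an unpaired surrogate.
    static void requireWellFormed(CharSequence seq) {
        int i = 0;
        while (i < seq.length()) {
            char c = seq.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 == seq.length()) {
                    throw new IllegalArgumentException(
                            "Unmatched high surrogate at end of input (index " + i + ")");
                }
                if (!Character.isLowSurrogate(seq.charAt(i + 1))) {
                    throw new IllegalArgumentException(
                            "High surrogate not followed by a low surrogate at index " + i);
                }
                i += 2; // skip the complete pair
            } else if (Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException("Unexpected low surrogate at index " + i);
            } else {
                i++;
            }
        }
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // one supplementary code point, two chars
        requireWellFormed("ok " + emoji); // passes

        // Cutting the string in the middle of the pair leaves a dangling high surrogate.
        String firstChunk = emoji.substring(0, 1);
        try {
            requireWellFormed(firstChunk);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```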
Copyright © 2011–2020 Google. All rights reserved.