| field | value | date |
|---|---|---|
| author | Anubhav Joshi <anubhav9042@gmail.com> | 2014-07-22 17:55:22 +0530 |
| committer | Loic Bistuer <loic.bistuer@gmail.com> | 2014-10-16 02:31:17 +0700 |
| commit | 10b17a22bec2eaf44c3315614aea87c127caee46 | |
| tree | 39145c16ca06aa33050e1642076db4216d663a10 /docs | |
| parent | 3af5af1a61d73c533aca4fb0ea1f53e4f6300b17 | |
Fixed #19508 -- Implemented uri_to_iri as per RFC.
Thanks Loic Bistuer for helping in shaping the patch and Claude Paroz
for the review.
Diffstat (limited to 'docs')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | docs/ref/unicode.txt | 31 |
| -rw-r--r-- | docs/ref/utils.txt | 15 |
| -rw-r--r-- | docs/releases/1.8.txt | 3 |
3 files changed, 41 insertions, 8 deletions
diff --git a/docs/ref/unicode.txt b/docs/ref/unicode.txt
index 90201d2d33..21e8c537c8 100644
--- a/docs/ref/unicode.txt
+++ b/docs/ref/unicode.txt
@@ -173,11 +173,11 @@ URL from an IRI_ -- very loosely speaking, a URI_ that can contain Unicode
 characters. Quoting and converting an IRI to URI can be a little tricky, so
 Django provides some assistance.
 
-* The function ``django.utils.encoding.iri_to_uri()`` implements the
-  conversion from IRI to URI as required by the specification (:rfc:`3987`).
+* The function :func:`django.utils.encoding.iri_to_uri()` implements the
+  conversion from IRI to URI as required by the specification (:rfc:`3987#section-3.1`).
 
-* The functions ``django.utils.http.urlquote()`` and
-  ``django.utils.http.urlquote_plus()`` are versions of Python's standard
+* The functions :func:`django.utils.http.urlquote()` and
+  :func:`django.utils.http.urlquote_plus()` are versions of Python's standard
   ``urllib.quote()`` and ``urllib.quote_plus()`` that work with non-ASCII
   characters. (The data is converted to UTF-8 prior to encoding.)
 
@@ -213,12 +213,29 @@ you can construct your IRI without worrying about whether it contains
 non-ASCII characters and then, right at the end, call ``iri_to_uri()`` on the
 result.
 
-The ``iri_to_uri()`` function is also idempotent, which means the following is
-always true::
+Similarly, Django provides :func:`django.utils.encoding.uri_to_iri()` which
+implements the conversion from URI to IRI as per :rfc:`3987#section-3.2`.
+It decodes all percent-encodings except those that don't represent a valid
+UTF-8 sequence.
+
+An example to demonstrate::
+
+    >>> uri_to_iri('/%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93')
+    '/♥♥/?utf8=✓'
+    >>> uri_to_iri('%A9helloworld')
+    '%A9helloworld'
+
+In the first example, the UTF-8 characters and reserved characters are
+unquoted. In the second, the percent-encoding remains unchanged because it
+lies outside the valid UTF-8 range.
+
+Both ``iri_to_uri()`` and ``uri_to_iri()`` functions are idempotent, which means the
+following is always true::
 
     iri_to_uri(iri_to_uri(some_string)) = iri_to_uri(some_string)
+    uri_to_iri(uri_to_iri(some_string)) = uri_to_iri(some_string)
 
-So you can safely call it multiple times on the same IRI without risking
+So you can safely call it multiple times on the same URI/IRI without risking
 double-quoting problems.
 
 .. _URI: http://www.ietf.org/rfc/rfc2396.txt
diff --git a/docs/ref/utils.txt b/docs/ref/utils.txt
index c38579cb7a..1cbc23449b 100644
--- a/docs/ref/utils.txt
+++ b/docs/ref/utils.txt
@@ -271,7 +271,20 @@ The functions defined in this module share the following properties:
     since we are assuming input is either UTF-8 or unicode already, we can
     simplify things a little from the full method.
 
-    Returns an ASCII string containing the encoded result.
+    Takes an IRI in UTF-8 bytes and returns ASCII bytes containing the encoded
+    result.
+
+.. function:: uri_to_iri(uri)
+
+    .. versionadded:: 1.8
+
+    Converts a Uniform Resource Identifier into an Internationalized Resource
+    Identifier.
+
+    This is an algorithm from section 3.2 of :rfc:`3987#section-3.2`.
+
+    Takes a URI in ASCII bytes and returns a unicode string containing the
+    encoded result.
 
 .. function:: filepath_to_uri(path)
 
diff --git a/docs/releases/1.8.txt b/docs/releases/1.8.txt
index 7cdb6aaf77..94d09eed4f 100644
--- a/docs/releases/1.8.txt
+++ b/docs/releases/1.8.txt
@@ -348,6 +348,9 @@ Requests and Responses
 * The :attr:`HttpResponse.charset <django.http.HttpResponse.charset>` attribute
   was added.
 
+* ``WSGIRequestHandler`` now follows RFC in converting URI to IRI, using
+  ``uri_to_iri()``.
+
 Tests
 ^^^^^
 
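
For readers who want to see the behaviour described in the patched documentation without installing Django, the following is a minimal standard-library sketch. It is not the code from this commit: the name `uri_to_iri_sketch` is hypothetical, and the approach (unquote everything, then re-percent-encode any bytes that do not form valid UTF-8) is an assumption based only on the documented examples above.

```python
from urllib.parse import unquote_to_bytes


def uri_to_iri_sketch(uri):
    """Hypothetical stand-in for the documented uri_to_iri() behaviour.

    Percent-decodes the URI, then restores the percent-encoding of any
    bytes that are not part of a valid UTF-8 sequence.
    """
    # Decode every %XX escape into its raw byte value.
    raw = unquote_to_bytes(uri)
    while True:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # Re-encode only the offending bytes as %XX and try again.
            # The replacement is pure ASCII, so the loop always terminates.
            bad = raw[exc.start:exc.end]
            repaired = "".join("%%%02X" % byte for byte in bad).encode("ascii")
            raw = raw[:exc.start] + repaired + raw[exc.end:]


if __name__ == "__main__":
    print(uri_to_iri_sketch("/%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93"))
    print(uri_to_iri_sketch("%A9helloworld"))
```

On the two inputs used in the documentation, the sketch reproduces the documented results, and applying it a second time to either result leaves it unchanged. It makes no attempt to handle `None` or other non-string input, which a production implementation would need to consider.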
