Diffstat (limited to 'docs/ref/unicode.txt')
-rw-r--r-- docs/ref/unicode.txt 44
1 files changed, 35 insertions, 9 deletions
diff --git a/docs/ref/unicode.txt b/docs/ref/unicode.txt
index 46ce4138a4..85e48ae15d 100644
--- a/docs/ref/unicode.txt
+++ b/docs/ref/unicode.txt
@@ -45,6 +45,28 @@ rendering or anywhere else -- you have two choices for encoding those strings.
You can use Unicode strings, or you can use normal strings (sometimes called
"bytestrings") that are encoded using UTF-8.
+.. versionchanged:: 1.5
+
+In Python 3, the logic is reversed: normal strings are Unicode, and when you
+specifically want to create a bytestring, you have to prefix the string with
+``b``. As we do in Django's own code from version 1.5 onwards, we recommend
+that you import ``unicode_literals`` from the ``__future__`` module in your
+code. Then, when you specifically want a bytestring literal, prefix the
+string with ``b``.
+
+Python 2 legacy::
+
+ my_string = "This is a bytestring"
+ my_unicode = u"This is a Unicode string"
+
+Python 2 with unicode literals or Python 3::
+
+ from __future__ import unicode_literals
+
+ my_string = b"This is a bytestring"
+ my_unicode = "This is a Unicode string"
+
+
.. admonition:: Warning
A bytestring does not carry any information with it about its encoding.
@@ -182,7 +204,7 @@ An example might clarify things here::
>>> urlquote(u'Paris & Orléans')
u'Paris%20%26%20Orl%C3%A9ans'
- >>> iri_to_uri(u'/favorites/François/%s' % urlquote(u'Paris & Orléans'))
+ >>> iri_to_uri(u'/favorites/François/%s' % urlquote('Paris & Orléans'))
'/favorites/Fran%C3%A7ois/Paris%20%26%20Orl%C3%A9ans'
If you look carefully, you can see that the portion that was generated by
@@ -268,7 +290,9 @@ You can pass either Unicode strings or UTF-8 bytestrings as arguments to
``filter()`` methods and the like in the database API. The following two
querysets are identical::
- qs = People.objects.filter(name__contains=u'Å')
+ from __future__ import unicode_literals
+
+ qs = People.objects.filter(name__contains='Å')
qs = People.objects.filter(name__contains=b'\xc3\x85') # UTF-8 encoding of Å
Templates
@@ -276,9 +300,10 @@ Templates
You can use either Unicode or bytestrings when creating templates manually::
- from django.template import Template
- t1 = Template(b'This is a bytestring template.')
- t2 = Template(u'This is a Unicode template.')
+ from __future__ import unicode_literals
+ from django.template import Template
+ t1 = Template(b'This is a bytestring template.')
+ t2 = Template('This is a Unicode template.')
But the common case is to read templates from the filesystem, and this creates
a slight complication: not all filesystems store their data encoded as UTF-8.
@@ -316,14 +341,15 @@ characters.
The following code example demonstrates that everything except email addresses
can be non-ASCII::
+ from __future__ import unicode_literals
from django.core.mail import EmailMessage
- subject = u'My visit to Sør-Trøndelag'
- sender = u'Arnbjörg Ráðormsdóttir <arnbjorg@example.com>'
+ subject = 'My visit to Sør-Trøndelag'
+ sender = 'Arnbjörg Ráðormsdóttir <arnbjorg@example.com>'
recipients = ['Fred <fred@example.com>']
- body = u'...'
+ body = '...'
msg = EmailMessage(subject, body, sender, recipients)
- msg.attach(u"Une pièce jointe.pdf", "%PDF-1.4.%...", mimetype="application/pdf")
+ msg.attach("Une pièce jointe.pdf", "%PDF-1.4.%...", mimetype="application/pdf")
msg.send()
Form submission
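
A side note on the queryset hunk above, which treats ``'Å'`` and ``b'\xc3\x85'`` as the same value in two forms. That equivalence can be checked directly. The following is a minimal sketch, assuming Python 3, where the ``unicode_literals`` import is unnecessary. The names ``text`` and ``raw`` are illustrative and not part of the patch.

```python
# Sketch only, assuming Python 3: plain literals are Unicode (text) strings
# and bytestrings need the b prefix. On Python 2 the same behaviour would
# require `from __future__ import unicode_literals` as the first statement
# of the module.

text = "\u00c5"               # the single character Å as a Unicode string
raw = text.encode("utf-8")    # the same character as an explicit UTF-8 bytestring

# The bytestring is exactly the literal used in the queryset example.
assert raw == b"\xc3\x85"

# Decoding with the right codec gives the original text back.
assert raw.decode("utf-8") == text

# One character of text, two bytes once encoded.
assert len(text) == 1
assert len(raw) == 2
```

The round trip works only because the codec is named explicitly. As the admonition in the first hunk says, a bytestring carries no information about its own encoding.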