Tuesday, April 9, 2013

mailto: links and plus encoding

Python has urllib.quote() and urllib.quote_plus().  Which one should you use for URL-encoding mailto: links?  Answer: use urllib.quote(). urllib.quote_plus() converts spaces to '+', and in a mailto: link that causes confusion over whether the '+' is a space or actually part of a person's email address.
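Here is a minimal sketch of the difference. It assumes Python 3, where the two functions live in urllib.parse rather than directly in urllib; the sample string is made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote

text = "Hello world + friends"  # made-up sample text

encoded = quote(text)            # spaces become %20, '+' becomes %2B
encoded_plus = quote_plus(text)  # spaces become '+', '+' becomes %2B

print(encoded)       # Hello%20world%20%2B%20friends
print(encoded_plus)  # Hello+world+%2B+friends

# Plain percent-decoding, which is all a mailto: reader is asked to do,
# does not turn '+' back into a space, so the quote_plus() output
# decodes with literal plus signs where the spaces used to be.
print(unquote(encoded))       # Hello world + friends
print(unquote(encoded_plus))  # Hello+world+++friends
```

Only the quote() output survives the round trip unchanged.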

http://tools.ietf.org/html/rfc6068#page-8

When creating 'mailto' URIs, any reserved characters that are used in
the URIs MUST be encoded so that properly written URI interpreters
can read them. Also, client software that reads URIs MUST decode
strings before creating the mail message so that the mail message
appears in a form that the recipient software will understand. These
strings SHOULD be decoded before showing the message to the sending
user.

Software creating 'mailto' URIs likewise has to be careful to encode
any reserved characters that are used. HTML forms are one kind of
software that creates 'mailto' URIs. Current implementations encode
a space as '+', but this creates problems because such a '+' standing
for a space cannot be distinguished from a real '+' in a 'mailto'
URI. When producing 'mailto' URIs, all spaces SHOULD be encoded as
%20, and '+' characters MAY be encoded as %2B. Please note that '+'
characters are frequently used as part of an email address to
indicate a subaddress, as for example in <bill+ietf@example.org>.

The 'mailto' URI scheme is limited in that it does not provide for
substitution of variables. Thus, it is impossible to create a
'mailto' URI that includes a user's email address in the message
body. This limitation also prevents 'mailto' URIs that are signed
with public keys and other such variable information.
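Putting the encoding advice from the RFC excerpt into practice, a mailto: link could be assembled roughly like this. This is only a sketch, assuming Python 3's urllib.parse.quote(); build_mailto() is a hypothetical helper name, and the address, subject, and body are made up.

```python
from urllib.parse import quote

def build_mailto(address, subject="", body=""):
    """Hypothetical helper: build a mailto: URI with spaces encoded as %20."""
    # Leave '@' readable in the address; a '+' is encoded as %2B,
    # which the RFC text quoted above permits (MAY).
    uri = "mailto:" + quote(address, safe="@")
    params = []
    if subject:
        # safe="" so that '&', '=', '?' and '/' in the text are encoded too
        params.append("subject=" + quote(subject, safe=""))
    if body:
        params.append("body=" + quote(body, safe=""))
    if params:
        uri += "?" + "&".join(params)
    return uri

link = build_mailto("bob+lists@example.com",
                    subject="Hello world",
                    body="1 + 1 = 2")
print(link)
# mailto:bob%2Blists@example.com?subject=Hello%20world&body=1%20%2B%201%20%3D%202
```

Encoding every '+' as %2B is a choice made here to keep the link unambiguous; the RFC only says it MAY be done.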
