Re: Proposed Up/Down protocol description
On 24/05/2007, at 6:05 PM, Robert Kisteleki wrote:
Robert Loomans wrote:
<?xml version="1.0" encoding="UTF-8"?>
Is there a real reason why we're using UTF instead of something
simpler?
While browsing through the messages, I could find no attributes that
really needed UTF (unless a sender/recipient will be spelled with
Kanji or something).
a) It's the default encoding for XML.
b) It is a common convention in XML to use UTF-8 unless you have very
good reason not to.
I know. I just have doubts that I want to handle UTF names, URLs,
etc. when receiving a resource from the JPNIC region... For me,
that is a very good reason to restrict the fields to latin
character set or such.
Fields are not the entire document. The encoding declaration in the XML
preamble applies to the entire document. The fields are:
version - Restricted by data type to US ASCII digits
sender - Opaque token
recipient - Opaque token
msg_ref - Restricted by data type to US ASCII digits
type - Restricted by value enumeration
class_name - Opaque token
cert_url - URLs are printable US ASCII by definition
cert_ski - Base 64 encoded; printable US ASCII
resource sets - Restricted by specification to a subset of US ASCII
SIA head - URLs are printable US ASCII by definition
cert_aki - Base 64 encoded; printable US ASCII
cert_serial - Restricted by data type to US ASCII digits
status - Restricted by value enumeration
certificates - Base 64 encoded; printable US ASCII
status - Restricted by data type to US ASCII digits
last_msg_ref - Restricted by data type to US ASCII digits
description - Opaque string
The only fields which could possibly contain characters outside of
US ASCII are the opaque fields, and as long as your data storage and
string comparison methods are 8-bit clean, that makes absolutely no
difference to processing. None of them need to be lexically ordered,
none of them need to be interpreted for meaning: they're just opaque
sequences of octets.
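
To make the "opaque sequence of octets" point concrete, here is a minimal
sketch in Python. The message layout, the attribute values, and the helper
names opaque_octets and same_token are assumptions for illustration only;
they are not taken from the protocol description. The receiver parses the
document, re-encodes the attribute value as UTF-8 octets, and compares by
plain equality, with no ordering or interpretation.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: element name and attribute values are placeholders,
# not the actual protocol schema.
RAW = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<message sender="\u65e5\u672c-ca" recipient="child-1"/>'
).encode("utf-8")


def opaque_octets(raw: bytes, attr: str) -> bytes:
    """Return an attribute value as UTF-8 octets, with no interpretation."""
    root = ET.fromstring(raw)
    return root.attrib[attr].encode("utf-8")


def same_token(a: bytes, b: bytes) -> bool:
    """Plain octet equality: no case folding, normalization, or ordering."""
    return a == b
```

A sender token containing non-ASCII characters goes through exactly the
same code path as an ASCII one, provided storage and comparison are 8-bit
clean.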
If your data storage or comparison methods aren't 8-bit clean, or
there's some reason I'm missing that you'd want to restrict those
opaque values to some subset of UTF-8, that can be done in the data
typing for those field values, rather than at the XML document level.
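
As a sketch of restricting at the data-typing level, the check below
validates one opaque field on its own while the document stays UTF-8. The
specific restriction (printable US ASCII, no whitespace, 1 to 64
characters) and the names TOKEN_RE and is_valid_token are assumptions
chosen for illustration, not anything the protocol specifies.

```python
import re

# Assumed restriction for an opaque token: printable US ASCII without
# whitespace, between 1 and 64 characters long.
TOKEN_RE = re.compile(r"[\x21-\x7e]{1,64}")


def is_valid_token(value: str) -> bool:
    """Check one field value against the assumed per-field restriction."""
    return TOKEN_RE.fullmatch(value) is not None
```

A receiver that wants only ASCII tokens can reject a message on this
per-field check without the document encoding changing at all.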
--
bje