Ticket #43 (closed defect: fixed)

Opened 5 years ago

Last modified 4 years ago

How to deal with illegal and disallowed IRI characters

Reported by: duerst@it.aoyama.ac.jp Owned by:
Priority: major Milestone:
Component: 3987bis Version:
Severity: - Keywords:


(transferred from "Open Issues" section in draft -01)
Section "General percent-encoding of IRI components" used to apply only to characters in either 'ucschar' or 'iprivate', but then later said that systems accepting IRIs MAY also deal with the printable characters in US-ASCII that are not allowed in URIs, namely "<", ">", '"', space, "{", "}", "|", "\", "^", and "`". Larry felt that a MAY here would result in non-uniform behavior, because some systems would produce valid URI components and others wouldn't. Non-printable US-ASCII characters should be stripped by most software, so if they're passed on somewhere as IRI characters, encoding them makes sense.
The section also used to say "If these characters are found but are not converted, then the conversion SHOULD fail." but there is no notion of conversion failing -- every string is converted. Please note that the number sign ("#"), the percent sign ("%"), and the square bracket characters ("[", "]") are not part of the above list and MUST NOT be converted.
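For illustration only, the behavior under discussion could be sketched roughly as follows. This is not text from the draft: the function name `percent_encode_component` and the flag `encode_extra_ascii` are hypothetical, and the sketch simplifies by treating every non-ASCII character as if it were in 'ucschar' or 'iprivate'. The flag stands in for the MAY: with it off, the printable US-ASCII characters listed above pass through unencoded, which is the non-uniform behavior Larry was concerned about.

```python
# Hypothetical sketch; names and structure are assumptions, not from the draft.

# Printable US-ASCII characters not allowed in URIs (list as given in RFC 3987).
EXTRA_ASCII = frozenset('<>" {}|\\^`')


def percent_encode_component(text, encode_extra_ascii=True):
    """Percent-encode the UTF-8 octets of characters that need conversion.

    Simplification: any non-ASCII character is treated as 'ucschar' or
    'iprivate'. "#", "%", "[" and "]" are never converted.
    """
    out = []
    for ch in text:
        needs_encoding = ord(ch) > 0x7F or (
            encode_extra_ascii and ch in EXTRA_ASCII
        )
        if needs_encoding:
            # Encode each UTF-8 octet as %HH with uppercase hex digits.
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)
```

Under these assumptions, `percent_encode_component("a b")` gives `a%20b`, while `percent_encode_component("a b", encode_extra_ascii=False)` leaves the space in place and so does not yield a valid URI component. There is no failure path: every input string produces an output.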

Change History

comment:1 Changed 4 years ago by masinter@adobe.com

  • Status changed from new to closed
  • Resolution set to fixed

This section was rewritten long ago and this issue is no longer a problem.
