Ticket #30 (closed design: fixed)
Reported by: email@example.com
Owned by:
Is LWS permitted between the field-name and colon?
The grammar of RFC 2616 suggests that it is, because ":" is a separator character, and thus the rule for implied LWS between a token and a separator applies.
The wording suggests otherwise, although it is not explicit:
Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The field value MAY be preceded by any amount of LWS, though a single SP is preferred.
The wording explicitly states that LWS is permitted after the colon, suggesting that the intention is that it is not permitted before the colon.
Many authors have taken that interpretation, with the result that most of the servers I looked at do not accept LWS before the colon. (They should probably reject the request, but all of them instead treat it as an unknown header whose name token includes a space.)
Apache (now) and Mozilla accept LWS at that position.
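The two behaviours described above can be illustrated with a minimal Python sketch. The function names `split_naive` and `split_lenient` are hypothetical and are not taken from any server's code; they only model the two treatments of a header line that has already had its CRLF removed.

```python
from typing import Tuple


def split_naive(line: str) -> Tuple[str, str]:
    """Split at the first colon and keep the name untouched.

    Models servers that end up with an unknown header whose
    name token includes a trailing space.
    """
    name, _, value = line.partition(":")
    return name, value.strip(" \t")


def split_lenient(line: str) -> Tuple[str, str]:
    """Split at the first colon and drop SP/HT before the colon.

    Models implementations that accept LWS between the
    field-name and the colon.
    """
    name, _, value = line.partition(":")
    return name.rstrip(" \t"), value.strip(" \t")


line = "Host : example.com"
print(split_naive(line))    # ('Host ', 'example.com')
print(split_lenient(line))  # ('Host', 'example.com')
```

A program that matches on the field-name sees `Host` in one case and an unrecognised `Host ` in the other, which is the inconsistency at issue.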
What about LWS before the field-name?
At first sight, this doesn't make sense: LWS at the start of a line indicates folding. However, all implementations I looked at accept a line beginning with LWS immediately after the Request-Line or Status-Line. Some treat the initial LWS as part of the field-name (they don't enforce the limited character range of tokens); others skip the LWS.
Apache does not look for, and skip, LWS before the first field-name; neither do Squid, thttpd or lighttpd. Mozilla and phttpd do.
Technically, the grammar disallows LWS before the field-name: LWS is only implied _between_ words and separators.
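To make the folding problem concrete, here is a rough sketch of header unfolding in Python. The function name `unfold` and the choice to raise an error are assumptions for illustration; as noted above, real implementations either keep the LWS in the name or skip it. The input is assumed to be a list of header lines with CRLF already stripped.

```python
from typing import List


def unfold(lines: List[str]) -> List[str]:
    """Join continuation lines (starting with SP or HT) onto the previous line.

    Raises ValueError if the first header line starts with LWS,
    because there is no previous header line to fold it into.
    """
    out: List[str] = []
    for line in lines:
        if line[:1] in (" ", "\t"):
            if not out:
                raise ValueError("LWS before the first field-name")
            # Replace the folding LWS with a single SP.
            out[-1] = out[-1].rstrip(" \t") + " " + line.strip(" \t")
        else:
            out.append(line)
    return out


print(unfold(["Accept: text/html,", "  text/plain", "Host: example.com"]))
# ['Accept: text/html, text/plain', 'Host: example.com']

try:
    unfold([" Host: example.com"])
except ValueError as exc:
    print("rejected:", exc)  # rejected: LWS before the first field-name
```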
Both of these inconsistencies between programs, together with the fact that a lone CR is treated as LWS by some implementations and not by others, lead to potential security holes when non-compliant messages claim to be HTTP/1.1. Although it isn't the standard's role to state how a program should respond to every kind of invalid message, it would be good to clarify the following points, because they have security implications (which was Apache's stated reason for its change):
- Whether LWS is actually permitted between the field-name and colon. (Grammar says it is; wording suggests it isn't. Implementations vary.)
- Whether LWS is actually permitted before the field-name. (Grammar says it isn't. Implementations vary.)
- That a lone CR in a line is explicitly not allowed and SHOULD (or MUST?) be rejected, for the specific reason that implementations vary as to whether it is treated as LWS, which has security implications for programs that must match on the field-name.
- That invalid field-names (such as those containing control characters or LWS) SHOULD (or MUST?) be rejected.
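
For illustration only, the strictest reading of the four points above could be sketched as follows in Python. This is an assumption, not a resolution of the ticket: it treats every open question as "reject". The name `parse_field` is hypothetical, and the input is assumed to be a single, already unfolded header line with its terminating CRLF removed. The token character set follows the RFC 2616 definition (any CHAR except CTLs and separators).

```python
from typing import Tuple

# RFC 2616 separators, including SP and HT.
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')
# token = 1*<any CHAR except CTLs or separators>
TOKEN_CHARS = {chr(c) for c in range(33, 127)} - SEPARATORS


def parse_field(line: str) -> Tuple[str, str]:
    """Return (field-name, field-value) or raise ValueError."""
    if "\r" in line:
        # The caller already removed CRLF, so any CR left is a lone CR.
        raise ValueError("lone CR in header line")
    name, colon, value = line.partition(":")
    if not colon:
        raise ValueError("missing colon")
    # Rejects an empty name, LWS before the name, LWS before the
    # colon, and control characters inside the name.
    if not name or any(ch not in TOKEN_CHARS for ch in name):
        raise ValueError("invalid field-name: %r" % name)
    return name, value.strip(" \t")


print(parse_field("Host: example.com"))  # ('Host', 'example.com')

for bad in ("Host : example.com", " Host: example.com",
            "Ho\rst: example.com", "X\x01: y"):
    try:
        parse_field(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting rather than repairing means that every program on the path either sees the same field-name or sees an error, which removes the ambiguity that matters for security.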
- version set to d00
- Component set to messaging
- Milestone set to unassigned
- Status changed from new to closed
- Resolution set to fixed
- Status changed from closed to reopened
- Resolution fixed deleted