
Changeset 1240


Timestamp:
2011-03-30 07:03:49
Author:
julian.reschke@gmx.de
Message:

Remove ISO-8859-1 default for text/*, remove special-case for ISO-8859-1 from Accept-Charset (see #20)

Location:
draft-ietf-httpbis/latest
Files:
2 edited

  • draft-ietf-httpbis/latest/p3-payload.html

    r1234 r1240  
    546546         </li> 
    547547         <li>2.&nbsp;&nbsp;&nbsp;<a href="#protocol.parameters">Protocol Parameters</a><ul> 
    548                <li>2.1&nbsp;&nbsp;&nbsp;<a href="#character.sets">Character Encodings (charset)</a><ul> 
    549                      <li>2.1.1&nbsp;&nbsp;&nbsp;<a href="#missing.charset">Missing Charset</a></li> 
    550                   </ul> 
    551                </li> 
     548               <li>2.1&nbsp;&nbsp;&nbsp;<a href="#character.sets">Character Encodings (charset)</a></li> 
    552549               <li>2.2&nbsp;&nbsp;&nbsp;<a href="#content.codings">Content Codings</a><ul> 
    553550                     <li>2.2.1&nbsp;&nbsp;&nbsp;<a href="#content.coding.registry">Content Coding Registry</a></li> 
     
    703700      <p id="rfc.section.2.1.p.6">Implementors need to be aware of IETF character set requirements <a href="#RFC3629" id="rfc.xref.RFC3629.1"><cite title="UTF-8, a transformation format of ISO 10646">[RFC3629]</cite></a>  <a href="#RFC2277" id="rfc.xref.RFC2277.1"><cite title="IETF Policy on Character Sets and Languages">[RFC2277]</cite></a>. 
    704701      </p> 
    705       <h3 id="rfc.section.2.1.1"><a href="#rfc.section.2.1.1">2.1.1</a>&nbsp;<a id="missing.charset" href="#missing.charset">Missing Charset</a></h3> 
    706       <p id="rfc.section.2.1.1.p.1">Some HTTP/1.0 software has interpreted a Content-Type header field without charset parameter incorrectly to mean "recipient 
    707          should guess". Senders wishing to defeat this behavior <em class="bcp14">MAY</em> include a charset parameter even when the charset is ISO-8859-1 (<a href="#ISO-8859-1" id="rfc.xref.ISO-8859-1.1"><cite title="Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1">[ISO-8859-1]</cite></a>) and <em class="bcp14">SHOULD</em> do so when it is known that it will not confuse the recipient. 
    708       </p> 
    709       <p id="rfc.section.2.1.1.p.2">Unfortunately, some older HTTP/1.0 clients did not deal properly with an explicit charset parameter. HTTP/1.1 recipients <em class="bcp14">MUST</em> respect the charset label provided by the sender; and those user agents that have a provision to "guess" a charset <em class="bcp14">MUST</em> use the charset from the content-type field if they support that charset, rather than the recipient's preference, when initially 
    710          displaying a document. See <a href="#canonicalization.and.text.defaults" title="Canonicalization and Text Defaults">Section&nbsp;2.3.1</a>. 
    711       </p> 
    712702      <h2 id="rfc.section.2.2"><a href="#rfc.section.2.2">2.2</a>&nbsp;<a id="content.codings" href="#content.codings">Content Codings</a></h2> 
    713703      <p id="rfc.section.2.2.p.1">Content coding values indicate an encoding transformation that has been or can be applied to a representation. Content codings 
     
    798788      <p id="rfc.section.2.3.1.p.3">If a representation is encoded with a content-coding, the underlying data <em class="bcp14">MUST</em> be in a form defined above prior to being encoded. 
    799789      </p> 
    800       <p id="rfc.section.2.3.1.p.4">The "charset" parameter is used with some media types to define the character encoding (<a href="#character.sets" title="Character Encodings (charset)">Section&nbsp;2.1</a>) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined 
    801          to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character encodings other than "ISO-8859-1" 
    802          or its subsets <em class="bcp14">MUST</em> be labeled with an appropriate charset value. See <a href="#missing.charset" title="Missing Charset">Section&nbsp;2.1.1</a> for compatibility problems. 
    803       </p> 
    804790      <h3 id="rfc.section.2.3.2"><a href="#rfc.section.2.3.2">2.3.2</a>&nbsp;<a id="multipart.types" href="#multipart.types">Multipart Types</a></h3> 
    805791      <p id="rfc.section.2.3.2.p.1">MIME provides for a number of "multipart" types — encapsulations of one or more representations within a single message-body. 
     
    11481134      </p> 
    11491135      <div id="rfc.figure.u.16"></div><pre class="text">  Accept-Charset: iso-8859-5, unicode-1-1;q=0.8 
    1150 </pre><p id="rfc.section.6.2.p.5">The special value "*", if present in the Accept-Charset field, matches every character encoding (including ISO-8859-1) which 
    1151          is not mentioned elsewhere in the Accept-Charset field. If no "*" is present in an Accept-Charset field, then all character 
    1152          encodings not explicitly mentioned get a quality value of 0, except for ISO-8859-1, which gets a quality value of 1 if not 
    1153          explicitly mentioned. 
     1136</pre><p id="rfc.section.6.2.p.5">The special value "*", if present in the Accept-Charset field, matches every character encoding which is not mentioned elsewhere 
     1137         in the Accept-Charset field. If no "*" is present in an Accept-Charset field, then all character encodings not explicitly 
     1138         mentioned get a quality value of 0. 
    11541139      </p> 
    11551140      <p id="rfc.section.6.2.p.6">If no Accept-Charset header field is present, the default is that any character encoding is acceptable. If an Accept-Charset 
     
    15381523      <h2 id="rfc.references.1"><a href="#rfc.section.10.1" id="rfc.section.10.1">10.1</a> Normative References 
    15391524      </h2> 
    1540       <table>                                 
    1541          <tr> 
    1542             <td class="reference"><b id="ISO-8859-1">[ISO-8859-1]</b></td> 
    1543             <td class="top">International Organization for Standardization, “Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1”, ISO/IEC&nbsp;8859-1:1998, 1998.</td> 
    1544          </tr> 
     1525      <table>                               
    15451526         <tr> 
    15461527            <td class="reference"><b id="Part1">[Part1]</b></td> 
     
    17871768      <p id="rfc.section.C.p.1">Clarify contexts that charset is used in. (<a href="#character.sets" title="Character Encodings (charset)">Section&nbsp;2.1</a>) 
    17881769      </p> 
    1789       <p id="rfc.section.C.p.2">Change ABNF productions for header fields to only define the field value. (<a href="#header.fields" title="Header Field Definitions">Section&nbsp;6</a>) 
    1790       </p> 
    1791       <p id="rfc.section.C.p.3">Remove base URI setting semantics for Content-Location due to poor implementation support, which was caused by too many broken 
     1770      <p id="rfc.section.C.p.2">Remove the default character encoding for text media types; the default now is whatever the media type definition says. (<a href="#canonicalization.and.text.defaults" title="Canonicalization and Text Defaults">Section&nbsp;2.3.1</a>) 
     1771      </p> 
     1772      <p id="rfc.section.C.p.3">Change ABNF productions for header fields to only define the field value. (<a href="#header.fields" title="Header Field Definitions">Section&nbsp;6</a>) 
     1773      </p> 
     1774      <p id="rfc.section.C.p.4">Remove ISO-8859-1 special-casing in Accept-Charset. (<a href="#header.accept-charset" id="rfc.xref.header.accept-charset.3" title="Accept-Charset">Section&nbsp;6.2</a>) 
     1775      </p> 
     1776      <p id="rfc.section.C.p.5">Remove base URI setting semantics for Content-Location due to poor implementation support, which was caused by too many broken 
    17921777         servers emitting bogus Content-Location header fields, and also the potentially undesirable effect of potentially breaking 
    17931778         relative links in content-negotiated resources. (<a href="#header.content-location" id="rfc.xref.header.content-location.3" title="Content-Location">Section&nbsp;6.7</a>) 
    17941779      </p> 
    1795       <p id="rfc.section.C.p.4">Remove reference to non-existant identity transfer-coding value tokens. (<a href="#no.content-transfer-encoding" title="No Content-Transfer-Encoding">Appendix&nbsp;A.5</a>) 
     1780      <p id="rfc.section.C.p.6">Remove reference to non-existant identity transfer-coding value tokens. (<a href="#no.content-transfer-encoding" title="No Content-Transfer-Encoding">Appendix&nbsp;A.5</a>) 
    17961781      </p> 
    17971782      <h1 id="rfc.section.D"><a href="#rfc.section.D">D.</a>&nbsp;<a id="collected.abnf" href="#collected.abnf">Collected ABNF</a></h1> 
     
    20562041      <p id="rfc.section.E.15.p.1">Closed issues: </p> 
    20572042      <ul> 
     2043         <li> &lt;<a href="http://tools.ietf.org/wg/httpbis/trac/ticket/20">http://tools.ietf.org/wg/httpbis/trac/ticket/20</a>&gt;: "Default charsets for text media types" 
     2044         </li> 
    20582045         <li> &lt;<a href="http://tools.ietf.org/wg/httpbis/trac/ticket/276">http://tools.ietf.org/wg/httpbis/trac/ticket/276</a>&gt;: "untangle ABNFs for header fields" 
    20592046         </li> 
     
    20662053            <li><a id="rfc.index.A" href="#rfc.index.A"><b>A</b></a><ul> 
    20672054                  <li>Accept header field&nbsp;&nbsp;<a href="#rfc.xref.header.accept.1">2.3</a>, <a href="#rfc.xref.header.accept.2">5.1</a>, <a href="#rfc.iref.a.1"><b>6.1</b></a>, <a href="#rfc.xref.header.accept.3">7.1</a></li> 
    2068                   <li>Accept-Charset header field&nbsp;&nbsp;<a href="#rfc.xref.header.accept-charset.1">5.1</a>, <a href="#rfc.iref.a.2"><b>6.2</b></a>, <a href="#rfc.xref.header.accept-charset.2">7.1</a></li> 
     2055                  <li>Accept-Charset header field&nbsp;&nbsp;<a href="#rfc.xref.header.accept-charset.1">5.1</a>, <a href="#rfc.iref.a.2"><b>6.2</b></a>, <a href="#rfc.xref.header.accept-charset.2">7.1</a>, <a href="#rfc.xref.header.accept-charset.3">C</a></li> 
    20692056                  <li>Accept-Encoding header field&nbsp;&nbsp;<a href="#rfc.xref.header.accept-encoding.1">2.2</a>, <a href="#rfc.xref.header.accept-encoding.2">5.1</a>, <a href="#rfc.iref.a.3"><b>6.3</b></a>, <a href="#rfc.xref.header.accept-encoding.3">7.1</a></li> 
    20702057                  <li>Accept-Language header field&nbsp;&nbsp;<a href="#rfc.xref.header.accept-language.1">5.1</a>, <a href="#rfc.iref.a.4"><b>6.4</b></a>, <a href="#rfc.xref.header.accept-language.2">7.1</a></li> 
     
    21332120                     <ul> 
    21342121                        <li>Accept&nbsp;&nbsp;<a href="#rfc.xref.header.accept.1">2.3</a>, <a href="#rfc.xref.header.accept.2">5.1</a>, <a href="#rfc.iref.h.1"><b>6.1</b></a>, <a href="#rfc.xref.header.accept.3">7.1</a></li> 
    2135                         <li>Accept-Charset&nbsp;&nbsp;<a href="#rfc.xref.header.accept-charset.1">5.1</a>, <a href="#rfc.iref.h.2"><b>6.2</b></a>, <a href="#rfc.xref.header.accept-charset.2">7.1</a></li> 
     2122                        <li>Accept-Charset&nbsp;&nbsp;<a href="#rfc.xref.header.accept-charset.1">5.1</a>, <a href="#rfc.iref.h.2"><b>6.2</b></a>, <a href="#rfc.xref.header.accept-charset.2">7.1</a>, <a href="#rfc.xref.header.accept-charset.3">C</a></li> 
    21362123                        <li>Accept-Encoding&nbsp;&nbsp;<a href="#rfc.xref.header.accept-encoding.1">2.2</a>, <a href="#rfc.xref.header.accept-encoding.2">5.1</a>, <a href="#rfc.iref.h.3"><b>6.3</b></a>, <a href="#rfc.xref.header.accept-encoding.3">7.1</a></li> 
    21372124                        <li>Accept-Language&nbsp;&nbsp;<a href="#rfc.xref.header.accept-language.1">5.1</a>, <a href="#rfc.iref.h.4"><b>6.4</b></a>, <a href="#rfc.xref.header.accept-language.2">7.1</a></li> 
     
    21482135            <li><a id="rfc.index.I" href="#rfc.index.I"><b>I</b></a><ul> 
    21492136                  <li>identity (Coding Format)&nbsp;&nbsp;<a href="#rfc.iref.i.1">2.2</a></li> 
    2150                   <li><em>ISO-8859-1</em>&nbsp;&nbsp;<a href="#rfc.xref.ISO-8859-1.1">2.1.1</a>, <a href="#ISO-8859-1"><b>10.1</b></a></li> 
    21512137               </ul> 
    21522138            </li> 
  • draft-ietf-httpbis/latest/p3-payload.xml

    r1234 r1240  
    368368   <xref target="RFC2277"/>. 
    369369</t> 
    370  
    371 <section title="Missing Charset" anchor="missing.charset"> 
    372 <t> 
    373    Some HTTP/1.0 software has interpreted a Content-Type header field without 
    374    charset parameter incorrectly to mean "recipient should guess". 
    375    Senders wishing to defeat this behavior &MAY; include a charset 
    376    parameter even when the charset is ISO-8859-1 (<xref target="ISO-8859-1"/>) and &SHOULD; do so when 
    377    it is known that it will not confuse the recipient. 
    378 </t> 
    379 <t> 
    380    Unfortunately, some older HTTP/1.0 clients did not deal properly with 
    381    an explicit charset parameter. HTTP/1.1 recipients &MUST; respect the 
    382    charset label provided by the sender; and those user agents that have 
    383    a provision to "guess" a charset &MUST; use the charset from the 
    384    content-type field if they support that charset, rather than the 
    385    recipient's preference, when initially displaying a document. See 
    386    <xref target="canonicalization.and.text.defaults"/>. 
    387 </t> 
    388 </section> 
    389370</section> 
    390371 
     
    555536   data &MUST; be in a form defined above prior to being encoded. 
    556537</t> 
    557 <t> 
    558    The "charset" parameter is used with some media types to define the 
    559    character encoding (<xref target="character.sets"/>) of the data. When no explicit charset 
    560    parameter is provided by the sender, media subtypes of the "text" 
    561    type are defined to have a default charset value of "ISO-8859-1" when 
    562    received via HTTP. Data in character encodings other than "ISO-8859-1" or 
    563    its subsets &MUST; be labeled with an appropriate charset value. See 
    564    <xref target="missing.charset"/> for compatibility problems. 
    565 </t> 
    566538</section> 
    567539 
     
    10891061<t> 
    10901062   The special value "*", if present in the Accept-Charset field, 
    1091    matches every character encoding (including ISO-8859-1) which is not 
    1092    mentioned elsewhere in the Accept-Charset field. If no "*" is present 
    1093    in an Accept-Charset field, then all character encodings not explicitly 
    1094    mentioned get a quality value of 0, except for ISO-8859-1, which gets 
    1095    a quality value of 1 if not explicitly mentioned. 
     1063   matches every character encoding which is not mentioned elsewhere in the 
     1064   Accept-Charset field. If no "*" is present in an Accept-Charset field, then 
     1065   all character encodings not explicitly mentioned get a quality value of 0. 
    10961066</t> 
    10971067<t> 
     
    17221692<references title="Normative References"> 
    17231693 
    1724 <reference anchor="ISO-8859-1"> 
    1725   <front> 
    1726     <title> 
    1727      Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1 
    1728     </title> 
    1729     <author> 
    1730       <organization>International Organization for Standardization</organization> 
    1731     </author> 
    1732     <date year="1998"/> 
    1733   </front> 
    1734   <seriesInfo name="ISO/IEC" value="8859-1:1998"/> 
    1735 </reference> 
    1736  
    17371694<reference anchor="Part1"> 
    17381695  <front> 
     
    26012558</t> 
    26022559<t> 
     2560  Remove the default character encoding for text media types; the default 
     2561  now is whatever the media type definition says. 
     2562  (<xref target="canonicalization.and.text.defaults"/>) 
     2563</t> 
     2564<t> 
    26032565  Change ABNF productions for header fields to only define the field value. 
    26042566  (<xref target="header.fields"/>) 
     2567</t> 
     2568<t> 
     2569  Remove ISO-8859-1 special-casing in Accept-Charset. 
     2570  (<xref target="header.accept-charset"/>) 
    26052571</t> 
    26062572<t> 
     
    30763042  <list style="symbols">  
    30773043    <t> 
     3044      <eref target="http://tools.ietf.org/wg/httpbis/trac/ticket/20"/>: 
     3045      "Default charsets for text media types" 
     3046    </t> 
     3047    <t> 
    30783048      <eref target="http://tools.ietf.org/wg/httpbis/trac/ticket/276"/>: 
    30793049      "untangle ABNFs for header fields" 
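
The Accept-Charset change in this changeset (Section 6.2) may be easier to follow with a small sketch. The Python below is only an illustration, not part of the draft or of any implementation: the function names `parse_accept_charset` and `charset_quality` and the `legacy_latin1` flag are hypothetical, and the parsing is deliberately simplified (no validation of the field value). With `legacy_latin1=True` it mimics the removed r1234 wording, where an unmentioned ISO-8859-1 gets a quality value of 1; with the default it follows the r1240 wording, where every unmentioned character encoding gets 0 unless "*" is present.

```python
def parse_accept_charset(value):
    """Parse an Accept-Charset field value into {charset: qvalue}.

    Simplified, hypothetical helper: assumes a well-formed field value.
    """
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a charset listed without a q parameter has quality 1
        for param in params.split(";"):
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val.strip())
        prefs[name.strip().lower()] = q
    return prefs


def charset_quality(header, charset, legacy_latin1=False):
    """Return the quality value of `charset` for an Accept-Charset value.

    header: the field value, or None if the header field is absent.
    legacy_latin1: assumption for illustration; True reproduces the
    ISO-8859-1 special case that this changeset removes.
    """
    if header is None:
        # No Accept-Charset field: any character encoding is acceptable.
        return 1.0
    prefs = parse_accept_charset(header)
    charset = charset.lower()
    if charset in prefs:
        return prefs[charset]
    if "*" in prefs:
        # "*" matches every encoding not mentioned elsewhere in the field.
        return prefs["*"]
    if legacy_latin1 and charset == "iso-8859-1":
        return 1.0
    return 0.0


if __name__ == "__main__":
    header = "iso-8859-5, unicode-1-1;q=0.8"  # example from Section 6.2
    print(charset_quality(header, "iso-8859-1", legacy_latin1=True))
    print(charset_quality(header, "iso-8859-1"))
```

Under the old text, a request carrying the Section 6.2 example header implicitly accepted ISO-8859-1; under the new text it does not, so a client that wants ISO-8859-1 has to list it explicitly or include "*".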