From: Felix Krause
Newsgroups: comp.lang.ada
Subject: Re: HTTP with Simple Components: Status.Kind always File
Date: Thu, 20 Jul 2017 15:35:46 +0200
Message-ID: <2017072015354641511-contact@flyx.org>
References: <2017071720305687401-contact@flyx.org> <2017071918093536089-contact@flyx.org> <2017071923134489971-contact@flyx.org>

On 2017-07-20 08:57:57 +0000, Dmitry A. Kazakov said:

>> As I explained earlier, for a File, I only get the Path string. I
>> cannot parse this properly since HTTP server already unescaped the
>> escaping sequences. Let me give an example:
>>
>> curl "http://localhost:8088/foo?key=value"
>>
>> curl "http://localhost:8088/foo%3Fkey=value"
>
> I see, you want to the recognize the query part (and possibly the
> fragment part) even when no scheme present.
>
> I think what you want is illegal. I might be wrong, people claim ARM is
> complicated, they should read RFCs!
>
> Anyway, my interpretation of RFC 3986 is that the scheme part must be present:
>
> https://tools.ietf.org/html/rfc3986#section-1.1.1
>
> No scheme, no query part. Therefore in both
>
> GET /foo?key=value
> GET foo%3Fkey=value
>
> "foo?key=value" must be the path.

I read a bit into the HTTP/1.1 and URI specs. I think your
interpretation is correct with one minor error:

According to HTTP/1.1:

https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html

the Request-URI of an HTTP request is defined as follows:

    Request-URI = "*" | absoluteURI | abs_path | authority

"*" and authority are only for certain request types, so let's ignore
them. So the URI can be either an absoluteURI or an abs_path, both of
which are defined in the URI spec you linked:

    absolute-URI = scheme ":" hier-part [ "?" query ]

abs_path is declared obsolete in the URI spec, which states that the
replacement for the obsolete rule is path-absolute:

    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    segment       = *pchar
    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Since '?' is part of neither unreserved nor sub-delims, it is,
according to the spec, completely illegal in a path. So,

    GET /foo?key=value

should be rejected as it is illegal syntax. But

    GET /foo%3Fkey=value

is legal since pchar may contain pct-encoded.

Now, my problem is that this argument is quite far away from reality. I
tested various HTTP client implementations (curl, Chrome, Firefox), and
every one of them, when I tell it to get

    http://example.com/?key=value

sends the following Request-Line:

    GET /?key=value HTTP/1.1

So, HTTP clients do not seem to respect this part of the HTTP
specification. For me, this means that if I want to build a server
supporting the widespread HTTP clients, I need a server implementation
that supports this kind of GET request even though it violates the HTTP
specification. I understand your point that you want to conform to the
spec, but this seems to mean that it will not work well with existing
HTTP clients.
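
The grammar argument above can be checked mechanically. Here is a
minimal sketch in Python (not Ada, and not part of Simple Components)
that tests a string against path-absolute. The character sets for
unreserved and sub-delims are not quoted above; I took them from RFC
3986, sections 2.3 and 2.2. The name is_path_absolute is my own.

```python
import re

# Character classes from RFC 3986 (sections 2.2 and 2.3); these two sets
# are not quoted in the post and are filled in from the RFC.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT_ENCODED})"

# path-absolute = "/" [ segment-nz *( "/" segment ) ]
# segment-nz is one or more pchar, segment is zero or more pchar.
PATH_ABSOLUTE = re.compile(rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?")


def is_path_absolute(target: str) -> bool:
    """Return True if the whole string matches the path-absolute rule."""
    return PATH_ABSOLUTE.fullmatch(target) is not None
```

With this, is_path_absolute("/foo?key=value") is False because '?' is
not a pchar, while is_path_absolute("/foo%3Fkey=value") is True because
%3F is a pct-encoded triplet.
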

My suggestion is thus to parse the query even if you get a path. As I
see it, this will not directly violate the spec. HTTP defines that:

    Servers SHOULD respond to invalid Request-URIs with an appropriate
    status code.

If a path contains a '?', it is an invalid Request-URI. The SHOULD
allows you to handle this differently without violating the spec.

I look forward to your opinion.

Cheers,
Felix
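
To illustrate the suggestion: a minimal sketch, again in Python and
under my own assumptions, of splitting the raw request target at the
first literal '?' before any percent-decoding takes place. The function
name split_request_target is hypothetical and not part of Simple
Components. The query is deliberately left undecoded, since splitting
it into keys and values has to happen before decoding as well.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def split_request_target(target: str) -> Tuple[str, Optional[str]]:
    """Split a raw request target into (decoded path, raw query).

    The split happens on the first literal '?' in the undecoded string,
    so an escaped %3F stays part of the path.
    """
    raw_path, sep, raw_query = target.partition("?")
    # Decode only after splitting; decoding first would make
    # "/foo?key=value" and "/foo%3Fkey=value" indistinguishable.
    path = unquote(raw_path)
    # No '?' at all means there is no query (None), which is different
    # from an empty query ("/foo?" gives "").
    return path, (raw_query if sep else None)
```

For the two curl requests from my earlier example, this yields
("/foo", "key=value") for "/foo?key=value" and ("/foo?key=value", None)
for "/foo%3Fkey=value", which is exactly the distinction I cannot make
once I only get the unescaped Path string.
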