[Web-SIG] Random thoughts

Gregory (Grisha) Trubetskoy grisha at modpython.org
Fri Oct 31 10:54:46 EST 2003



On Thu, 30 Oct 2003, Greg Ward wrote:

> An aside: in the query string
>
>    ?name=Greg&colour=blue&age=31
>
> what exactly are 'name', 'colour', and 'age'?

Short answer: "field names"

Long answer:

I cannot claim to be an absolute expert on the matter, but here is my best
understanding:


In ?name=Greg&colour=blue&age=31

"name=Greg&colour=blue&age=31" is called "searchpart", "query information"
or simply "query"

from RFC 1808 sec 2.1 "URL Syntactic Components":

  <scheme>://<net_loc>/<path>;<params>?<query>#<fragment>

  - [snip] -

  "?" query    ::= query information, as per Section 3.3 of
                       RFC 1738 [2].

Then if we look at RFC 1738, it describes an HTTP URL specifically:

  An HTTP URL takes the form:

    http://<host>:<port>/<path>?<searchpart>
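
To see these components concretely, here is a minimal sketch using the
modern Python standard library (urllib.parse). The host, port and path
are made up for illustration; only the query comes from the example above.

```python
from urllib.parse import urlsplit

# Hypothetical URL: host, port and path are invented for this example.
url = "http://example.com:8080/cgi-bin/form?name=Greg&colour=blue&age=31"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.hostname)  # example.com
print(parts.port)      # 8080
print(parts.path)      # /cgi-bin/form
print(parts.query)     # name=Greg&colour=blue&age=31
```

Note that the library calls the part after "?" the "query", matching the
RFC 1808 terminology rather than "searchpart".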


Now, RFC 1866 (HTML) introduces the concept of a "form". Forms have a
METHOD attribute which lets you specify how the form is to be submitted.
When method is 'GET', the form will be submitted as "query information",
described above.

Since there are limits to what is allowed in a URL, the data has to be
"url encoded", as described in 8.2.1 of RFC 1866:

        2. The fields are listed in the order they appear in the
        document with the name separated from the value by `=' and
        the pairs separated from each other by `&'.

[Note BTW that the order is specified]

Therefore, 'name', 'colour', and 'age' are "field names", and 'Greg',
'blue', '31' are "field values".

A more clever example would be:

?name=Greg%20Ward&colour=blue&age=31

Here, "Greg Ward" is a form field value, while "Greg%20Ward" is just an
arbitrary chunk of a URL query with no particular meaning, no more
significant than "0Ward&col".
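
The distinction between the raw query and the decoded fields can be
sketched with Python's urllib.parse (an illustration only, not something
the RFCs prescribe for servers):

```python
from urllib.parse import parse_qsl, urlencode

query = "name=Greg%20Ward&colour=blue&age=31"

# parse_qsl returns (name, value) pairs in the order they appear in the
# query, with %XX escapes (and '+') decoded.
fields = parse_qsl(query)
print(fields)
# [('name', 'Greg Ward'), ('colour', 'blue'), ('age', '31')]

# Encoding the same pairs again. urlencode writes a space as '+' by
# default, which is the form-urlencoded convention from RFC 1866 8.2.1.
print(urlencode(fields))
# name=Greg+Ward&colour=blue&age=31
```

A list of pairs, rather than a dictionary, keeps the field order that
the RFC specifies.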


Here is the interesting part (RFC 1866 8.2.3):

   To process a form whose action URL is an HTTP URL and whose method is
   `POST', the user agent conducts an HTTP POST transaction using the
   action URI, and a message body of type `application/x-www-form-
   urlencoded' format as above.

Note that it doesn't say that the action URI cannot contain a query,
so based on this, I can have a form like this:

  <form method="post"
   action="http://blah/blah?some=form&data=as&query=info">

  <input type="text" name="bleh">

  ...


This form would result in a POST request containing a query string as
well.

To the best of my understanding, there is no formal specification of what
happens on the server side; the HTML RFCs describe only the client
behaviour. So it's up to the developer to decide whether field "some"
above is part of the form data.

Or to put it another way, it would not be correct to insist that POST
form data does NOT contain fields from the query; that should be optional
behaviour.

*My* inclination is that combined data from POST and query should be
*default* behaviour, and if you want to separate them, you may have to do
extra work.
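
A rough sketch of what "combined by default" could look like, using
Python's urllib.parse. The function name combined_fields and the sample
body are hypothetical, and listing query values before body values is
just one possible policy, not anything an RFC requires:

```python
from urllib.parse import parse_qs, urlsplit

def combined_fields(action_url, body):
    """Merge fields from the URL query with those from a urlencoded POST body.

    Hypothetical helper: query values come first, body values are appended.
    """
    fields = parse_qs(urlsplit(action_url).query)
    for name, values in parse_qs(body).items():
        fields.setdefault(name, []).extend(values)
    return fields

url = "http://blah/blah?some=form&data=as&query=info"
body = "bleh=hello&some=more"   # made-up POST body for illustration
print(combined_fields(url, body))
# {'some': ['form', 'more'], 'data': ['as'], 'query': ['info'], 'bleh': ['hello']}
```

Code that needs the two sources kept apart can still parse the query and
the body separately, which is the "extra work" case.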

IMHO adding a query to the action of a POST form is a simple technique for
injecting extra data into a form without resorting to hidden inputs.


Grisha



