POST or PUT without Content-Length bug

$Id: index.html,v 1.4 1998/01/22 21:14:45 dgaudet Exp $


Netscape has since fixed this bug in version 3.04 of Navigator. More information is available on their site. The rest of this page will be left intact for hysterical purposes.

This page collects everything I know about the Navigator browser bug that causes "POST or PUT without Content-length" errors to be logged and POSTs to fail.
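To make the error message concrete: the server needs a Content-Length header to know how many body bytes follow a POST or PUT. The Python sketch below shows the kind of check that leads to this message. It is an illustration only, not Apache's code; the function name check_request is hypothetical, and chunked request bodies are ignored for simplicity.

```python
def check_request(raw_headers: bytes):
    """Return an error string if a POST/PUT has no Content-Length, else None.

    Illustrative only: this is not how Apache implements the check.
    """
    lines = raw_headers.split(b"\r\n")
    # The request line looks like: METHOD SP URI SP VERSION
    method = lines[0].split(b" ", 1)[0].upper()

    header_names = set()
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, sep, _value = line.partition(b":")
        if sep:
            header_names.add(name.strip().lower())

    if method in (b"POST", b"PUT") and b"content-length" not in header_names:
        return "POST or PUT without Content-length"
    return None
```

A request whose header block arrives damaged or incomplete, so that the Content-Length line is missing, would trip this check even though the browser meant to send one.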

Doug Salot was kind enough to send me a tcpdump with full packets of the bug in action. His bug report is on file as PR#1142, and includes the full tcpdump and some of my commentary.

I pull apart tcpdumps with a tool called tcpshow. It gives ASCII dumps of the packets in a readable form. Note that it prints a lone . for each unprintable character, so you'll notice, for example, a . at the end of each header line (for the unprintable CR).
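If you have not seen this style of dump before, the following Python sketch mimics the rendering described above. It is not tcpshow's code. It assumes that LF is shown as a line break (which is why the CR appears as a trailing . on each header line) and that everything else outside printable ASCII becomes a dot.

```python
def printable_dump(data: bytes) -> str:
    """Render packet bytes as ASCII, with '.' for each unprintable byte."""
    chars = []
    for byte in data:
        if byte == 0x0A:
            # Assumption: LF is shown as a real line break.
            chars.append("\n")
        elif 0x20 <= byte <= 0x7E:
            # Printable ASCII: space through tilde.
            chars.append(chr(byte))
        else:
            # Everything else, including CR, becomes a lone dot.
            chars.append(".")
    return "".join(chars)


# Each header line ends in CR LF, so each rendered line ends in '.'.
print(printable_dump(b"POST /cgi-bin/form HTTP/1.0\r\nContent-length: 12\r\n\r\n"))
```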

First we'll look at a successful POST from the client. Go ahead, click here and bring it up in another window, and look at it while you read my commentary.

Well, ignore the rest of that dump, it worked. The critical thing to notice is how Navigator breaks POSTs up into two packets. The dump session I pulled this from shows many more POSTs and they all follow this pattern.

Now for a dump showing the bug. Go ahead, click here and bring it up in another window, and look at it while you read my commentary.

It looks like a cut-and-dried browser bug to me (I can even imagine how the code is structured and why it goes wrong). I can't find any fault in what the TCP/IP stacks are doing (except for the Windows RST issue, but that's a known stack problem). The server is behaving within the bounds of the protocols.

Notice that the server is based on Apache 1.1.3, which you can also recognize by its behaviour of flushing a packet after sending the response headers. This behaviour changed in 1.2b7 when networking performance improvements eliminated the extra flush. At any rate, this session shows that the bug is not new, so it is unlikely to be related to the networking performance improvements made in 1.2b7.

Doug's client is "Mozilla/3.02 (WinNT; U)". We have other reports of it happening with 3.03 on NT.

In PR#1237 we have a report of it occurring with 3.02 Gold on Win95. The submitter reports that it may be caused by his installation of MSIE 4.0.

In PR#1292 we have a report of it occurring with 3.03 on Win95.

Workaround

There isn't one. Ok, you can disable keep-alives (set KeepAlive Off in your config), but this is a bad idea because it increases network traffic. You can increase the length of keep-alives (set KeepAliveTimeout), but anything beyond one minute loses the benefits of keep-alive. Both of these should do the job, but are unacceptable. And given that the browser is Mozilla version 3, the most popular client out there, it's senseless to conditionally disable keep-alive based on browser type.

If someone can think of another workaround, I'd like to hear it. (Better yet, I'd like to see code for it.)

Further Work

I am unlikely to do further work on this problem because I cannot reproduce it myself. I don't use the platforms involved, and I don't have the time. But it affects so many folks that I hope some of them have the time to do more research.

What isn't known yet is whether certain aspects of Apache's earlier responses on the keep-alive connection can be tweaked to avoid the bug. For example, the 257th byte bug was discovered by varying the length of a header field one byte at a time until the bug manifested itself. Maybe something like this happens with this bug. You would have to hack up the code to work on this angle.
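As a rough illustration of that byte-at-a-time approach, here is a Python sketch that generates a series of keep-alive responses, each one byte longer than the last. It is a sketch under stated assumptions, not a tested reproduction recipe: the function names and the choice of an X-Pad header as the field being varied are picked purely for illustration, and nothing here is known to trigger the bug.

```python
def build_response(pad_len: int, body: bytes = b"ok\n") -> bytes:
    """Build a keep-alive response whose padding header value is pad_len bytes."""
    headers = [
        b"HTTP/1.0 200 OK",
        b"Connection: Keep-Alive",
        b"Content-Type: text/plain",
        b"Content-Length: " + str(len(body)).encode("ascii"),
        # Padding header used only to vary the response length (illustrative).
        b"X-Pad: " + b"x" * pad_len,
    ]
    return b"\r\n".join(headers) + b"\r\n\r\n" + body


def sweep(max_pad: int):
    """Yield (pad_len, response) pairs, growing the response one byte per step."""
    for pad_len in range(max_pad + 1):
        yield pad_len, build_response(pad_len)


# Print the total size of each candidate response in a small sweep.
for pad_len, response in sweep(4):
    print(pad_len, len(response))
```

To use something like this, you would serve each variant as the first response on a keep-alive connection, have an affected browser follow up with a POST on the same connection, and note which lengths, if any, make the POST fail.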

Here is a brief note on how you can generate useful tcpdump sessions should you find some novel twist to the bug:

Please don't mail dump files unless they're small (i.e. less than 50k)! And please don't mail them if they look like either of the two dumps listed above... we already know what those look like, and your dump is unlikely to help any.