From dgaudet-list-new-httpd@arctic.org Tue May 9 00:35:49 2000
Reply-To: new-httpd@apache.org
Date: Mon, 24 Apr 2000 19:18:48 -0700 (PDT)
From: dean gaudet
To: new-httpd@apache.org
Subject: aligned copies
X-comment: visit http://arctic.org/~dean/legal for information regarding copyright and disclaimer.

anyone feel like benchmarking with/without the below patch?  it can
probably be dropped into 2.0 without much effort.  the 32 is a guess...
might try with 8, 16, and 64.  4 is essentially a waste 'cause the
shortest padding header you can add is 4 bytes long.

-dean

---------- Forwarded message ----------
Date: Mon, 24 Apr 2000 19:00:59 -0700 (PDT)
From: dean gaudet
To: Artur Skawina
Cc: kumon@flab.fujitsu.co.jp, Linus Torvalds, Manfred Spraul,
    linux-kernel@vger.rutgers.edu
Subject: Re: lockless poll() (was Re: namei() query)
X-comment: visit http://arctic.org/~dean/legal for information regarding copyright and disclaimer.

On Mon, 24 Apr 2000, Artur Skawina wrote:

> kumon@flab.fujitsu.co.jp wrote:
> >
> > In the heavy duty case, csum_partial_copy_generic() becomes the new
> > winner of the worst time consuming function with the poll()
> > optimization. We are arranging the global figure now.
> >
> > Though csum_partial_copy_generic() is highly optimized with
> > hand-crafted code, it eats lots of time. It may be inevitable, but may
> > be reducible. We are now investigating why it does.
>
> csum_partial_copy_generic() could certainly be more optimized;
> attached is a snapshot of a version that does up to 20% better in
> dumb benchmarks, if and what difference there is for real loads
> i haven't yet measured.
>
> [patch vs 2.3.99pre6pre5, offsets are wrong]
>

here is a patch against apache-1.3 which forces it to align the length of
its headers to a 32-byte boundary.
if your benchmark is requesting objects greater than ~4k in size this will
cause apache to generate writev()s such as:

writev(3, [{"HTTP/1.1 200 OK\r\nDate: Tue, 25 A"..., 288}, {" [...]

[...] outbase = ap_palloc(p, fb->bufsiz + 2);
+    if (flags & B_WR) {
+#define ALIGN (32)
+        fb->outbase = ap_palloc(p, fb->bufsiz + 2 + ALIGN);
+        fb->outbase += ALIGN - ((long)fb->outbase % ALIGN);
+    }
     else
         fb->outbase = NULL;
Index: src/main/http_protocol.c
===================================================================
RCS file: /home/cvs/apache-1.3/src/main/http_protocol.c,v
retrieving revision 1.289
diff -u -r1.289 http_protocol.c
--- src/main/http_protocol.c	2000/02/20 01:14:47	1.289
+++ src/main/http_protocol.c	2000/04/25 01:59:00
@@ -1445,6 +1445,21 @@
     if (bs >= 255 && bs <= 257)
         ap_bputs("X-Pad: avoid browser bug" CRLF, client);
 
+#define ALIGN (32)
+    ap_bgetopt(client, BO_BYTECT, &bs);
+    bs += 2;    /* for the final terminating empty line */
+    if (bs % ALIGN) {
+        ap_bputc('X', client);
+        ap_bputc(':', client);
+        bs += 4;    /* 2 for "X:" and 2 for the final CRLF */
+        while (bs % ALIGN) {
+            ap_bputc('X', client);
+            ++bs;
+        }
+        ap_bputc('\r', client);
+        ap_bputc('\n', client);
+    }
+
     ap_bputs(CRLF, client);  /* Send the terminating empty line */
 }

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

---------- Forwarded message ----------
Date: Mon, 24 Apr 2000 19:12:24 -0700 (PDT)
From: Linus Torvalds
To: dean gaudet
Cc: Artur Skawina, kumon@flab.fujitsu.co.jp, Manfred Spraul,
    linux-kernel@vger.rutgers.edu
Subject: Re: lockless poll() (was Re: namei() query)

On Mon, 24 Apr 2000, dean gaudet wrote:
>
> here is a patch against apache-1.3 which forces it to align the length of
> its headers to a 32-byte boundary.  if your benchmark is requesting
> objects greater than ~4k in size this will cause apache to generate
> writev()s such as:

Nice.
The intel guys already did an experimental (and fairly ugly) patch to make
the kernel try to semi-pad the destination by selecting specific sizes for
the packets sent out over TCP. They claimed a 3% speedup in specweb (or
something) from that. The argument from Dave and Alan was that it should be
done from within the web-server.

		Linus
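
The padding arithmetic in the patch above can be checked outside Apache. The following is only a minimal sketch of that arithmetic, not code from the patch: pad_header_len(), write_pad_header() and align_up() are hypothetical names made up for illustration, and bs stands for the number of header bytes already written (what the patch reads back with BO_BYTECT). It assumes, as the patch does, that the padding header is "X:" followed by filler 'X' bytes and a CRLF, and that the output buffer has been over-allocated by the alignment before the pointer is bumped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: length of the padding header needed so that
 * bs + padding + 2 (the final empty line) is a multiple of align.
 * Returns 0 when the headers already end on a boundary. */
size_t pad_header_len(size_t bs, size_t align)
{
    size_t total = bs + 2;      /* account for the terminating CRLF */
    size_t pad;

    assert(align > 0);
    if (total % align == 0)
        return 0;
    pad = 4;                    /* 2 for "X:" and 2 for its own CRLF */
    total += 4;
    while (total % align != 0) {
        ++pad;                  /* one filler 'X' per iteration */
        ++total;
    }
    return pad;
}

/* Hypothetical helper: writes the padding header into out.
 * Returns the number of bytes written, or 0 if no padding is
 * needed or out is too small. */
size_t write_pad_header(char *out, size_t outsz, size_t bs, size_t align)
{
    size_t n = pad_header_len(bs, align);

    if (n == 0 || n > outsz)
        return 0;
    out[0] = 'X';
    out[1] = ':';
    memset(out + 2, 'X', n - 4);
    out[n - 2] = '\r';
    out[n - 1] = '\n';
    return n;
}

/* Hypothetical helper mirroring the buff.c hunk: always advances p by
 * 1..align bytes, so the caller must allocate align extra bytes. */
char *align_up(char *p, size_t align)
{
    return p + (align - ((uintptr_t)p % align));
}
```

With align set to 32, 26 bytes of headers need exactly the 4-byte minimum ("X:" plus CRLF), while 27 bytes need 35, since the 4-byte minimum overshoots the boundary and the loop has to pad up to the next one. That is the same reason a 4-byte alignment buys nothing.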