Subject Re: [Firebird-Architect] event based nonblocking io
Author Paul Ruizendaal
> There are many ways to interpret this data, none conclusive. But
> right off the bat, I will say that a single thread waiting on
> non-blocking I/O cannot exploit multi-core / SMP servers, which is just
> plain dumb.

Jim, non-blocking IO does not equate to single-threaded. For example, Gerd
Knorr's webfs HTTP server combines the two:
http://linux.bytesex.org/misc/webfs.html
This 10-year-old code running on a vanilla Linux box handles 1,000
simultaneous clients, saturating a 100 Mb/s link with hardly any system
load.

I agree with your observations on futexes. It is actually an area where
Windows was ahead of Linux. CRITICAL_SECTION on Windows works much like a
futex: a user-space fast path that only enters the kernel on contention.
In the same area, the Windows TransmitFile call preceded Linux/BSD's
sendfile (see below) by several years.

> On 9/24/2010 3:33 AM, marius adrian popa wrote:
>> nginx won over the apache threading model and here are some benchmarks
>> and papers about the model

Marius, writing high-performance web servers is about more than
nonblocking IO, although it is an important ingredient. Felix von Leitner
did a lot of measuring on *nix and Windows and his insights are worth a
read:
http://www.fefe.de/
http://bulk.fefe.de/scalable-networking.pdf
http://bulk.fefe.de/lk2006/talk.pdf
http://www.fefe.de/kludge/

High-performance web servers try to solve the "C10K" problem: how to
handle 10,000 simultaneous connections on a single box. This raises lots
of interesting issues; for example, a process can only open 1,024 file
handles on 'out-of-the-box' Linux and Windows. Next is how to react to
events on those 10,000 ports (and on the backend, the 10,000 file handles
for static content). "select" can only do 1,024 handles on Linux and 64 on
Windows by default; in both cases the kernel does a linear search for a
socket/file with an event and that scales really badly. Calls on handles
might block and the two naive ways to handle this won't work: most systems
cannot handle 10,000 processes very well, nor are 10,000 threads very
desirable. The solution lies in asynchronous IO and a callback
mechanism to react to events: epoll, IO signals, completion ports,
whatever.
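
To make the callback idea concrete, here is a minimal sketch in Python
(my choice of language for illustration, not something any of the servers
mentioned here use). It assumes the standard selectors module, which picks
epoll on Linux and kqueue on BSD and falls back to select elsewhere. The
names serve_echo and stop_after are made up for this example, and the
write path is simplified: a real server would buffer unsent data and wait
for the socket to become writable.

```python
import selectors
import socket


def serve_echo(listener, stop_after=None):
    """Run a single-threaded echo loop; return the number of closed clients.

    stop_after is a hypothetical knob so that a demo can end the loop.
    """
    sel = selectors.DefaultSelector()  # epoll/kqueue where available
    closed = 0

    def on_accept(sock):
        # The listener is readable: a new connection is waiting.
        conn, _addr = sock.accept()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, on_read)

    def on_read(conn):
        nonlocal closed
        data = conn.recv(4096)
        if data:
            # Simplification: assumes the small reply fits in the socket buffer.
            conn.send(data)
        else:
            # Empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
            closed += 1

    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, on_accept)
    try:
        while stop_after is None or closed < stop_after:
            # One wait covers every registered handle; the callback stored
            # in key.data is invoked only for handles that are ready.
            for key, _mask in sel.select(timeout=1.0):
                key.data(key.fileobj)
    finally:
        sel.close()
    return closed
```

One thread services every connection here, and the cost of a wakeup
depends on the number of ready handles rather than the number of
registered ones, which is the property that select lacks.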

But there is more: not all syscalls have a nonblocking option, notably
'stat'; the file system can become a bottleneck way before the disk
hardware is saturated; and properly doing "zero copy" from disk buffer to
ethernet card isn't a piece of cake either (for example, many cheaper
ethernet cards do not properly support the 'gather' operation required for
this, and the Linux 'sendfile' call doesn't really work well on multipart
HTTP replies).
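
For the simple case (one header followed by one file body), the zero-copy
path can be sketched as follows, again in Python for illustration only.
socket.sendfile uses os.sendfile where the platform supports it and falls
back to plain send calls otherwise; it requires a blocking socket. The
function name send_static_file and its header parameter are assumptions
made for this example.

```python
import socket


def send_static_file(conn, path, header=b""):
    """Send an optional header, then the file body; return body bytes sent."""
    with open(path, "rb") as f:
        if header:
            # The header lives in user space, so it goes out via a normal send.
            conn.sendall(header)
        # The body is handed to the kernel; where os.sendfile is available
        # the file data is not copied through a user-space buffer.
        return conn.sendfile(f)
```

A multipart reply would need a header, a file range, another header and so
on, which is where a single sendfile call per response stops being enough.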

Nginx was not the first to exploit this nonblocking IO design direction.
Many small servers were designed this way in the late 90's, such as thttpd,
fnord and webfs (thttpd is still used by tudou.com [the Chinese YouTube]
for static content). Next, Jan Kneschke (a MySQL contractor) built lighttpd
along these principles when he got frustrated with Apache, and it got quite
popular, e.g. it was used by YouTube. Somehow he lost the plot (i.e. making
lighttpd too complex) and the project floundered. Probably unaware of each
other's projects, Igor Sysoev had started on Nginx in 2002 for Rambler, and
kept it simple. In the last 3 years, he has mopped up where lighttpd left
off.

The question now is 'how is this relevant to RDBMS design?'. I would think
that in nearly all cases a traditional database engine has between 5 and 500
simultaneous clients. In that range the above techniques are not really
needed. It is only when you move to a design where transactions are short
and do not span requests that web-style design patterns become relevant.