Subject | Re: [Firebird-Architect] Re: Well, here we go again |
---|---|
Author | Jim Starkey |
Post date | 2008-06-18T15:55:15Z |
paulruizendaal wrote:
> Hi Jim,
>
> Thanks for your thoughts. It sounds like we have the beginnings of a
> good discussion here.
>
> "I have to disagree here. There are two different types of problems
> out there and, not surprisingly, they require different solutions.
> [..snip..] But most of the remaining 99.999% of Web applications
> aren't like Google or Amazon."
>
> We probably agree, but it would be good to be specific and avoid
> confusion. In this thread I have a Netsuite or a Salesforce in the
> back of my mind. Could you give one or two high volume example web
> apps that you would find a natural fit for this architecture
> discussion?
>
The social networking sites are interesting -- very high volume with a
significant update load. The two that come to mind are Facebook (uses
massive MySQL read replication and memcached) and LinkedIn (Oracle
based). Both suffer from update propagation time, and both are willing
to do (and pay for) whatever it takes.
Another interesting high volume application is Google ads. They use
InnoDB through MySQL for their master with read replication to handle
queries. I believe they also shard their data.
> "The metric for Web applications is page latency at a given load." II skipped Internet latency because there is nothing to be done about
> agree fully. It is almost a trueism.
>
> I'm not sure I understand the "sources of latency" list. It doesn't
> seem to be inclusive nor do you use it later in your reasoning. From
> that reasoning I derive a high level list of three:
> - internet (WAN) latency
> - app server latency (say, http + app code)
> - database latency
>
I skipped Internet latency because there is nothing to be done about
it. It is what it is and that's that. I didn't list actual compute
time because a) that's what it takes, b) it's small relative to
communication overhead, and c) machines keep getting faster with more
cores, so that isn't the bottleneck.
Database latency is another question, and one that is near and dear to
me. The costs involved are:
* Actual beneficial work (comparing values, transforming data, etc.)
* Replicating and/or writing data to disk
* Thread and lock synchronization
* Disk and/or network data access latency.
I think the actual beneficial work is the smallest component, so the
potential gain from optimizing it is the lowest.
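To put rough numbers on that intuition, here is a toy calculation; the
figures are invented purely for illustration, not measurements from any
system.

    // Invented numbers (microseconds) for one request, chosen only to
    // illustrate the shape of the argument.
    public class LatencyBudget {
        public static void main(String[] args) {
            double beneficialWork  = 50;  // comparing values, transforming data
            double replication     = 300; // replicating and/or writing data to disk
            double synchronization = 150; // thread and lock synchronization
            double access          = 500; // disk and/or network data access latency

            double total = beneficialWork + replication + synchronization + access;
            double totalIfWorkWereFree = replication + synchronization + access;

            System.out.printf("total latency: %.0f us%n", total);
            System.out.printf("with beneficial work free: %.0f us (%.1f%% better)%n",
                    totalIfWorkWereFree, 100 * (1 - totalIfWorkWereFree / total));
        }
    }

Even making the beneficial work free only improves the total by about 5%
in this made-up budget, which is why the other three items are where the
interesting engineering is.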
I hate the idea of a human having to wait while an archaic disk platter
slowly orbits the hub. Ugh. Adding memory reduces the probability of a
disk read, which is a very good thing. Using a cloud for an L2 cache
reduces it even more.
> Internet latency has been well studied & reported, eg. by Yahoo. Note
> that precompiling boilerplate does not only reduce the load on the
> database, but also helps internet latency by optimising request
> pipelining and client/proxy caching. Even with a very fast database,
> you would still want to do it.
>
> You address app server latency with the Netfrastructure example. Of
> course, connecting to a scripting and/or database server introduces
> significant latency. It is often countered by running stuff in a
> single process (eg. apache mod_php) or through connection pooling
> (eg. fcgi). The former scales up but not out, the second seems to
> scale out. On a last note, I hope you agree that scaling out the http
> servers is not a key issue for the purposes of this discussion.
>
I'm assuming that no real Web application can exist without at least a
de facto database. So a connection is required.
The question of connection pooling, however, is a tad interesting. I
rather like the idea of associating a piece of data with a connection
request that can be hashed to a subset of servers in the cloud to establish
a de facto node affinity and maximize cache efficiency. That accomplishes the
same thing as sharding without the programming overhead (the data
associated with the connection request would be the same data that
identifies a shard).
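As a rough sketch of the idea -- hypothetical class and node names, with
rendezvous hashing picked only because it is simple and stable under
membership changes, not because any particular product works this way:

    import java.util.List;

    // Hypothetical sketch: hash an affinity key carried with the connection
    // request (the same datum that would otherwise identify a shard) to a
    // preferred node, so requests for the same data land on a warm cache.
    public class AffinityRouter {
        private final List<String> nodes;  // current cloud membership

        public AffinityRouter(List<String> nodes) {
            this.nodes = nodes;
        }

        // Rendezvous (highest-random-weight) hashing: when a node joins or
        // leaves, only the keys that scored highest on that node move.
        public String nodeFor(String affinityKey) {
            String best = null;
            int bestScore = Integer.MIN_VALUE;
            for (String node : nodes) {
                int score = (affinityKey + "@" + node).hashCode();
                if (best == null || score > bestScore) {
                    bestScore = score;
                    best = node;
                }
            }
            return best;
        }

        public static void main(String[] args) {
            AffinityRouter router =
                    new AffinityRouter(List.of("node-a", "node-b", "node-c"));
            System.out.println(router.nodeFor("tenant-42"));  // stable preferred node
        }
    }

The affinity key travels with the connection request, so the application
never has to know which node holds which data warm; that is the part that
sharding normally pushes onto the programmer.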
> Remains the core of the current discussion: database latency at a
> given load. Whilst I do not fully agree with your reasoning, we are
> fully agreed on the fact that all current solutions are cumbersome
> for the dba/app designer and that is what we are trying to solve.
>
Yup. My lifetime ambition is to put DBAs out of work. Failing
miserably so far.
> "I believe that the relational model has passed the test of time andNo, I think complex records are a cumbersome result is restricting
> is organization of choice, other things being equal. The hierarchical,
> network (CODASYL), and "object oriented" database have all died while
> relational systems have thrived. The issue is not the model (though I
> prefer semantic extensions) but the implementation."
>
> Agree fully. That is why we have row stores, column stores, memory
> stores, etc. All the relational model, but implemented in different
> ways. I also like semantic extensions, and with that on offer,
> the "complex record" folks can be happy too -- for good or bad
> reasons.
>
No, I think complex records are a cumbersome result of restricting
transactions to a single row update. No matter how you slice it, the
complex row model leads to data redundancy that is difficult and
expensive to maintain.
> "Last I looked, Salesforce used a zillion MySQL servers with eachBeats me. I got my information from a former very senior technical type
> instance a separate MySQL database" .
>
> Hmmm, that statement totally puzzles me ... why would you think that?
> From their last annual report (in which lying is a criminal offence):
>
> "We built our service as a highly scalable, multi-tenant application
> written in Java and Oracle PL/SQL. We use commercially available
> hardware and a combination of proprietary and commercially available
> software, including database software from Oracle Corporation, to
> provide our service. The application server is custom-built and runs
> on a lightweight Java Servlet and Java Server Pages engine. We have
> custom-built core services such as database connection pooling and
> user session management tuned to our specific architecture and
> environment, allowing us to continue to scale our service. We have
> combined a stateless environment, in which a user is not bound to a
> single server but can be routed in the most optimal way to any number
> of servers, with an advanced data caching layer."
>
Beats me. I got my information from a former very senior technical type
almost a year ago. Maybe they've re-invented themselves since then.
> In my mind the key question is whether a memory database with write
> once & only (WOO) backfill scales out better than a real application
> cluster (RAC) design -- and I mean the design pattern, not just the
> Oracle implementation. Better means lower TCO, including hardware &
> engineers. RAC or WOO, that's the question.
>
I'm not talking about a single node database but a cloud. I don't think
there is any future for single node databases.
> "Pure in-memory (i.e. non-persistent) relational systems are muchYup.
> faster, but with the obvious drawback. If each node in a cloud can
> execute at the speed of an in-memory database, the cloud would scale
> linearly. But to make this work, we need replication, and the cost of
> replication grows with the number of nodes in the cloud. There are
> lots of clever things we can do to minimize the cost of replication, but
> we can[not] avoid it."
>
> In one of his recent papers on in-memory databases, Stonebraker
> argues that a confirmed replication to one or more other nodes is
> sufficient to make a transaction durable (sorry can't find the link
> anymore). After all, writing to disk doesn't mean durable either when
> there is an explosion in the server room.
>
> "The cost of a relational database engine isn't the SQL engine. Even aSorry, I wasn't trying to be clear. Let's leave it as an archive node
> really stupid SQL engine is fast. The problems are synchronizing
> memory and the disk to be ACID. [...] Even better is having SQL nodes
> never write at all. The network is better redundancy than a disk.
> Disks are bigger than memory, though, and more persistent, so things
> ought to get written to disk, but there is no reason for a SQL node
> to even bother."
>
> I guess you better define what tasks a SQL node does, and what tasks
> a storage node does, because I am totally not getting what you are
> trying to explain. Sorry to be so dimwitted.
>
>
Sorry, I wasn't trying to be clear. Let's leave it as: an archive node
participates in replication and from time to time writes stuff to disk.
The writing, however, isn't time critical as long as the data exists
someplace else in the cloud.
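A minimal sketch of that split, with hypothetical names and no claim that
any real engine does it exactly this way:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    // Hypothetical sketch of an archive node: it accepts replicated records
    // immediately (durability already exists as copies elsewhere in the cloud)
    // and writes them to disk on its own schedule, off the critical path.
    public class ArchiveNode {
        private final BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();
        private final ExecutorService writer = Executors.newSingleThreadExecutor();

        public ArchiveNode() {
            writer.submit(this::drainToDisk);
        }

        // Called by the replication layer; returns as soon as the record is queued.
        public void onReplicatedRecord(byte[] record) {
            pending.add(record);
            // the acknowledgement to the sender would go here -- no disk I/O yet
        }

        private void drainToDisk() {
            try {
                while (true) {
                    byte[] record = pending.take();
                    writeToDisk(record);  // batched and asynchronous in practice
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private void writeToDisk(byte[] record) {
            // placeholder for the actual archival write
        }
    }

The point is that the acknowledgement path never waits on the disk;
durability comes from the copies already elsewhere in the cloud, and the
archival write happens whenever the node gets around to it.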
--
James A. Starkey
President, NimbusDB, Inc.
978 526-1376