From mboxrd@z Thu Jan 1 00:00:00 1970
From: olehjalmar kristensen - Sun Microsystems - Trondheim Norway
Newsgroups: comp.lang.ada
Subject: Re: A Customer's Request For Open Source Software
Date: 04 Sep 2003 09:19:19 +0200
Organization: Sun Microsystems Inc., http://www.sun.com/
References: <3F44BC65.4020203@noplace.com> <20030822005323.2ff66948.david@realityrift.com> <3F4828D9.8050700@attbi.com> <3F4EA616.30607@attbi.com> <3F512BD1.8010402@attbi.com> <3F52AA5F.8080607@attbi.com> <3F5559A4.8030507@attbi.com> <34p5b.8361$Kj.843491@news20.bellglobal.com>
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.2

>>>>> "WWGV" == Warren W Gay VE3WWG writes:

WWGV> You might wonder why buffered "block" devices are not good
WWGV> enough for the purpose. I can't answer to the specifics, but
WWGV> only that database engines are in a position to better manage
WWGV> the cache based upon what they "know" needs to be done.
WWGV> Another important reason to control caching details is that when a
WWGV> transaction is committed, you need to guarantee that the
WWGV> data is written to the disk media (or can be recovered if
WWGV> the database is to be restarted at that point).

The cache management will typically be the same with both OS files and
raw devices. The main difference is, as you say, that you have better
control over a raw device with respect to the layout of your tables, so
you may be able to optimize disk writes better. Usually, you will be
able to get higher bandwidth when running on a raw device. Most
operating systems will let you wait until the data are physically on
disk even if you are using the file system, so there is no difference
with respect to the durability of data. All portable DBMSs need to do
their own cache management, so if you are running on files, both the OS
and the DBMS cache the same blocks, thereby wasting RAM. Also, the
replacement strategies may conflict, resulting in suboptimal
performance.

WWGV> The database
WWGV> must be recoverable at any given point anyhow, and this
WWGV> usually requires fine grained control over physical writes
WWGV> to the media. This aspect and performance means that the
WWGV> engine must balance performance with reliability (persistence),
WWGV> which are conflicting goals when using disk.

WWGV> It is this last area where oodles of persistent fast memory
WWGV> (instead of disk), can make a world of difference. In this
WWGV> case persistence = memory performance, which of course is
WWGV> where the win is. If disks became obsolete (one can hope),
WWGV> then I could see that new database engine design (internals)
WWGV> will become much different than what it is today. Certainly,
WWGV> many of the present compromises would be eliminated.

WWGV> --
WWGV> Warren W. Gay VE3WWG
WWGV> http://home.cogeco.ca/~ve3wwg

Possibly, but keep in mind that most current DBMSs are already CPU
bound when it comes to throughput.
Fast disks and techniques like group commit ensure that the log is
rarely a bottleneck, and everything you need to recover is in the log.
What you can get with large non-volatile memory is much lower latency
per transaction; that is, the response time for a single transaction
can be dramatically lower, even if the throughput in terms of TPS
stays the same.

And in case you were wondering whether you could do away with the log,
the answer is yes, if you create some kind of multi-version system.
But you always need some way of keeping track of the history, so you
can roll back your changes in case a transaction decides to abort for
some reason. Actually, one may say that the log IS the database; all
the rest is there only to give faster access to the latest version.

----------------------------------------------------------------------
There can be only One.