[geeks] Accountable Geeking

Mon Mar 11 19:49:07 CST 2002

> Okay, for simple stuff, can anyone give me reasons (not a flamewar, but
> reasons) to get acquainted with PostgreSQL instead of continuing to use
> MySQL?

It depends on what "simple" entails.  If you want your RDBMS to catch a
lot of silly errors, then features like foreign keys and constraints will
assist you a lot.  MySQL doesn't have them unless you use the InnoDB table
type, and seems to take the stance that foreign keys are a Bad Thing(TM).  
That sort of stuff helps out regardless of the size or complexity of the
database.  If you're using relations, foreign keys will prevent errors.

PostgreSQL has server-side scripting in pgsql, C, PERL, and TCL.  This
code can be referenced both through stored procedures and triggers.  So,
if you have multiple front-ends to the same database (example: a
trouble-ticket tracker that accepts tickets via email and interaction via
web or an X11 app), you can avoid duplicating a decent bit of code,
without having to write a broker (tier-2 in a 3-tier client/server model).
Particularly, you can avoid duplicating verification code, which is a Big
Win.

Also, the beauty of sub-selects cannot be overstated.  When used
properly, that greatly improve the readability of your SQL code, and
usually give a bit of optimization to boot.  For example, this query:

   SELECT firstname,lastname 
   FROM   employees 
   WHERE  empid IN
          (SELECT empid
           FROM   salaries
           WHERE  salary > 50000)

will tend to run faster and consume fewer resources than this query:

   SELECT employees.firstname AS firstname, employees.lastname AS
          lastname, salaries.empid AS salary
   FROM   employees, salaries
   WHERE  employees.empid = salaries.empid
   AND    salaries.salary > 50000

The former runs in O(2N) time and space, the latter runs in O(N^2) time
and O(2N) space[1].  This code can be optimized with a left join (since I
think that MySQL uses full joins for implicit joins--I'm not sure
anymore), but still won't be as fast as the sub-select.  MySQL does not
have subselects, but PostgreSQL does.

You could fake subselects with views in MySQL, except that MySQL has also
deicded that views aren't important.

MySQL is hands-down much faster for most things[2], especially if you
can't use persistent connections (as in the case of CGIs).  MySQL finally
has transactions, but most software written against MySQL doesn't use
them, because it's such a new feature.

>From a developer's point of view, MySQL has a much friendlier API.  The
whole process of obtaining a connection, building a query, and iterating
over the result set is appreciably nicer and more polished.  

In fact, most-everything about MySQL is more polished.  The console is
nicer, permissions are easier to manage, and there's a lot more
third-party support for it, both in terms of mindshare and applications.

However, I still don't trust MySQL's transaction or foriegn key code, as
it's so new.  I'm of the mindset that if I'm going to at-all be worried
about losing the coherency of the data, I'm going to use a RDBMS that has
supported transactions for quite a while.  In fact, I've chosen MS-SQL
(which is really just Sybase with 666 stamped on its forehead) over MySQL
before, for just that reason.  This stance is largely due to the fact that
transactions are very nontrivial to implement correctly, and I want to use
code that has had a while for most of the bugs to surface.

But, yeah, for web stuff that, at the end of the day doesn't matter all
-that- much, MySQL is great stuff.  It's fast and easy-to-use, and, for
toys and small projects, that's what matters.  If you want features,
though, PostgreSQL is approximately as featureful as any modern RDBMS
(probably not as feature-laden as Orcale, but definitely up there with
SyBase and MSSQL).  It's concurrency and locking features surpass just
about everything out there except for (maybe) Oracle.

$PreviousEmployer bought an wholly automated accounting/billing system
that runs on MySQL, despite my protests.  They are now feeling the pain of
broken relations because the software goofed, and the RDBMS didn't have
foreign keys to catch it.  Customers don't like getting cross-billed with
someone else's services.  The software also doesn't use transactions.  
The most I could do was put it on software RAID, journalize the
filesystem, and sacrifice a goat to it every Wednesday midnight.

I'd compare PostgreSQL and MySQL like I'd compare a Sun E250 to a Dell P3
Xeon server.  One is a lot faster for most things, but I'd sleep a lot
better trusting the other.

--Jonathan
[1] Warning: I did that calculation in my head.  It might be wrong.  Feel
    free to correct.
[2] Unless you nonportably optimise your SQL for PostgreSQL.  There are
    lots of optimization tricks, but most of them will make your SQL code
    a bit confusing.