Re: Ada DB bindings and APQ - Warren W. Gay VE3WWG

comp.lang.ada
 help / color / mirror / Atom feed

From: "Warren W. Gay VE3WWG" <ve3wwg@NoSpam.cogeco.ca>
Subject: Re: Ada DB bindings and APQ
Date: Tue, 14 Dec 2004 23:05:17 -0500
Date: 2004-12-14T23:05:17-05:00	[thread overview]
Message-ID: <4KOvd.54$jT5.8@read1.cgocable.net> (raw)
In-Reply-To: <zwjllkzbw80n.1faok6gffugtd$.dlg@40tude.net>

Dmitry A. Kazakov wrote:
> On Tue, 14 Dec 2004 02:25:22 -0500, Warren W. Gay VE3WWG wrote:
>>Dmitry A. Kazakov wrote:
>>
>>>After that hot "Windows Ada database support" discussion I have looked at
>>>APQ by Warren W. Gay. I do believe that it could be used as a starting
>>>point, because it was developed with generic DB access in mind. Here are
>>>things I consider important for further development of APQ:
>>>
>>>1. Dynamic binding to target DB.
>>>
>>>1.a. A factory which returns Root_Connection_Type'Class depending on the DB
>>>type parameter. Communication DLL should not be loaded until DB connection
>>>requested. Otherwise, all DBs have to be installed.
>>
>>It is not quite true that "all DBs have to be installed". The windows
>>version of APQ does indeed include a static library component. However,
>>the remaining part of the code rests in the DLL pieces. For example,
>>MySQL uses the module c_mysql.c (and in the latest version Sybase
>>uses c_sybase.c) as an interface between the statically linked APQ
>>client program and the Windows DLL programs. Under Linux/UNIX, the
>>shared libraries are used for the Database libraries (libpq for
>>PostgreSQL for example).
> 
> Yep, the first error one gets: lpq not found. libpq.a is not distributed
> with GNAT. This might become a problem in the future.

APQ 2.2 does not require you to compile for PostgreSQL, though
earlier versions probably did (I don't remember anymore). I think
that the URL idea that Brian May presented was a good one, and this
should be combined with the idea of dynamically loading shared
libraries upon demand as required (as you suggested).

>>>1.b. A factory which determines the DB type upon connection and returns an
>>>appropriate Root_Connection_Type'Class. (this might be difficult though)
>>
>>There is connection cloning available, but I see that
>>no factory form of it is available. I have made the newest
...
> I meant a quite wild thing: you do not know the DB type, only the
> host+DB+user+password. Then you connect (conventional ports are scanned,
> severs detected etc (:-))

Ah, but I thinks you ask too much ;-)  However, specifying the
database type in URL syntax does become a more practical
possibility.

>>>3. Anyway, Execute should take only one parameter:
>>>
>>>   Execute (Q); -- Query knows its connection
>>
>>See the above. From a design point of view, I didn't
>>like this, because the application writer could mistakenly
>>allow the connection to fall out of scope, and hence
>>create "undefined behavior" at this point (because of
>>a hidden connection reference).
> 
> Connection could be just a handle to some dynamic-scope connection object
> with reference counting. This will solve both the problem of above and
> still allow cloning connections.

I didn't like the use of handles for database objects. I purposely
stuck to using controlled objects, so that they can go out of
scope gracefully, be finalized (perhaps committed or rolled back)
and then destructed. Handles, like open Ada File_Types, never
get that maintenance.

>>>4. To add a scratch Query to each connection object and make anything
>>>available for a query for Root_Connection_Type as well.
>>
>>I'm not sure I follow this point. It sounds like you want
>>to combine Query_Type and Connection_Type for programmer
>>convenience. This is a bad idea, IMHO.
>>
>>Remember that there is a 1 -> many relationship between
>>the connection and query objects. Most DB client libraries
>>permit more than one active query on the same connection
>>(ie. connection sharing).
> 
> Yes, but from user's perspective, at least one query per connection is
> always necessary. In some cases the user will need more, but not so often.
> So the idea is that he just creates a connection and then does with it
> anything he would do with a query. Further many "connection-level" things
> require a query, if they return result sets, like "SHOW TABLES".

The thing is that you normally only create and connect to
the database once in the whole application (at least in the
normal case). You pass around the connection in the user
data in GUI programs or any other way you like in non-GUI
programs.

You then create, use and finalize Query_Type objects as
required, occaisionally specifying a connection when
required. This is really not that onerous to code. The
code reads better, because the programmer has a clear
concept of a connection and a query. By rolling it all
up into one object that does "all", the programmer must
visualize the components that are participating behind
the scenes.

When you use a database like Sybase that permits cursor
updates etc., then you definitely will be using 2 Query_Type
objects as a matter of course.

Rolling it all into one object blurs the distinctions, and
requires internal references to hidden objects, linked lists
and reference counting -- all in black box fashion.

I believe that the Ada approach of using objects for what
they do, is both sensible and more readable as it sits now.

Below you mentioned the idea of an environment object.  I was
tempted to implement that in APQ as well. However, I
compromised on this to consolidate the environment into
a combined environment + connection object for simplicity's
sake. The main reason for this was that in most applications,
you will use one connection and thus one environment.
But in purer terms, I would agree that APQ should have
an Environment_Type, that the Connection_Type uses.

> BTW, one could get rid of query type altogether if there would be two
> different connection-cloning. One that returns a handle to a new
> "connection" (physically just new query) for the same underlying connection
> object, another that makes a new connection.
> 
> I do not see much sense in all that levels of types exposed to the user. In
> ODBC it is even one more: environment > connection > statement. It is
> rather implementation details to me. (Everything is a file (:-))

ODBC is not the only one. Sybase implements an object that
also fills this role. I believe that you want to keep the
objects well separated by function, and I'll admit that I
compromised principle by combining environment & connection.
But I think OO design principles are such that you don't
want to roll everything into one massive object.

Now having said that, if you really insist on a one-object
approach, you might want to test drive the idea by
rolling your own "interface package" if you will, that
does just that using the APQ provided types. But I believe
that you'll quickly arrive at the unhappy conclusion that
it just gets very messy, and results in a more complicated
to understand object. I prefer to have 2 simple objects
instead of one complicated to understand object. The
code is much easier to read and understand this way.
It also separates the finalization of the query from the
finalization of the connection.

>>>5. Arguably, connection should maintain list of all queries.
>>
>>You seem to be at cross purposes. First you want connection
>>and query rolled into one, and then the connection should
>>maintain a list of queries?
> 
> Internally of course. This could be necessary for "light-weight" connection
> cloning.
> 
> Handle --> Query_Object --> Connection_Object
> 
> 1. Handle copy (+1 reference count of Query_Object)
> 2. Light-weight copy: creates new query (+1 reference count of connection)
> 3. Heavy-weight copy: creates a new query and a new connection

When designing an API, I always try to look at the way
it is going to be used and why it must exist. I shy away
from unnecessary API, unless it is frequently needed (to
save or simplify code).

So coming back to this, I have to ask, what is the
problem that this is intended to solve?

>>>6. At least a rudimentary tables management:
>>>
>>>   Get_Tables (Query : in out Root_Query_Type);
...
> Basically because of PostgreSQL, which lacks SQL query for that.

I think you can derive it from some "system tables".

>>>9. For row identifiers there should be three of them:
>>>
>>>-  Row_ID_DB_Type should return kind of "INTEGER NOT NULL AUTO_INCREMENT
>>>PRIMARY KEY" for MySQL, "SERIAL" for PostgreSQL etc.
>>>-  Row_Ref_DB_Type would return "INTEGER" to reference a row.
>>>-  Row_New_ID that would return the literal to be used in INSERT, like
>>>"NewID()", "NULL" etc
>>
>>I can see where you are going with this, but I can't see
>>a lot of value in doing this. Usually tables are declared
>>ahead of time, outside of the application (temp tables are
>>the exception - depending upon database vendor [Oracle does
>>things differently again]). Where you do declare them on
>>the fly, you have the opportunity to use proper primary
>>keys etc. (providing you have one of course). Perhaps
>>your application needs are different.
> 
> The above is the scenario when you are working with some pre-existing DB.

But if the tables are pre-existing, you are not concerned
with how the table's column is declared. The only normal
case for caring about this IMHO is for temp tables, but
that is its own can of worms (see portability section of
the latest manual for why).

> In that case you need to enumerate tables, table columns, their types.
> Which BTW would require even more complex stuff: to determine Ada data type
> most fitting to the column type. My case is much simpler. The application
> starts, it checks if DB exists, if not, it creates that empty. I believe,
> this is about 80% of needs for small applications.

These kinds of operations are important for SQL-helper
tools, but are not the normal kinds of operations for
normal applications, which tend to be more static.

>>>11. Presently values are placed using Append or Append_Quoted. It is IMO
>>>error prone. Proposal: to add dispatching Value, which will encode values
>>>according to the DB:
>>>
>>>   Prepare (Q, "INSERT ... VALUES (");
>>>   Value (Q, Customer, ","); -- Quote if necessary
>>>   Value (Q, Clock, ",");    -- Use appropriate time literal
>>>   Value (Q, Data, ")");
>>
>>Actually, for some database vendors, quotes are _automatically_
>>provided for data values that need them (bit, date/time etc.).
>>In some cases dates must be quoted, and in others not. Booleans
>>are fun also, since some take TRUE/FALSE, others T/F, and still
>>others 0/1, and seems to me IIRC, even '0'b, and '1'b.
>>
>>The Append_Quoted call has in mind a different purpose, and its
>>a subtle one: Append_Quoted is particularly important for character
>>strings, where special characters may be found within it. The most
>>obvious character that creates problems is the quote character itself!
>>
>>So if you need to build a query like:
>>
>>   INSERT INTO CUSTOMER ( NAME, ... )
>>                 VALUES ( 'Fred's Emporium', ... );
>>
>>you will experience a nice tidy SQL syntax error. For some database
>>vendors (like Sybase, IIRC), you want:
>>
>>   INSERT INTO CUSTOMER ( NAME, ... )
>>                 VALUES ( 'Fred''s Emporium', ... );
>>
>>for others, like PostgreSQL, you want:
>>
>>   INSERT INTO CUSTOMER ( NAME, ... )
>>                 VALUES ( 'Fred\'s Emporium', ... );
>>
>>The Append_Quoted call is designed to relieve you from having to
>>figure out which characters to quote, and how to quote it.
>>
>>For this call to work, it _must_ have both the Query_Type object
>>and the Connection_Type object. While this isn't strictly true
>>of all database vendors, APQ now requires it because most require
>>this, so that it can properly escape the string according to the
>>locale chosen. The locale is usually configured at the connection
>>level.
> 
> See, this is why they should not be separated (at least, as long as Ada
> does not support parallel type hierarchies.)
> 
> Anyway, there should be a uniform way of putting values into requests
> separated from SQL-keywords, and, probably, names of tables, of columns
> etc.

Doing table names in a portable way is problematic. Some databases
are not case sensitive, but they are case sensitive when they come
to TABLE names. Others can be configured to force lowercase table
names (MySQL IIRC), still others are always case sensitive (Sybase).

This is why "case policy" was introduced into APQ 2.2. This is also
why it is necessary to distinguish between certain SQL building
APIs and other others like Append_Quoted.

Believe me, if there was a unified way to do it all, I would have
welcomed it. But the reality is a bit more complicated than that ;-)

>>The Future of APQ:
>>------------------
>>
>>What I plan to do in the near future, is to make APQ available
>>somewhere accessible to all. I haven't given it much thought
>>yet, but I was considering sourceforge. I want to make it
>>available for others to contribute to it, because it needs
>>more work and time than I have available. Quite frankly I
>>personally don't have much interest in an M$ SQL Server
>>branch, yet I recognize this would be useful to others. Obviously,
>>other ports like Oracle would also be useful.
> 
> I think you could provide APQ.ODBC which would then cover anything "not yet
> ready". Though OID would be a hammer with ODBC.

That idea is good also. Perhaps someone will contribute that
for us all ;-)
--
Warren W. Gay VE3WWG
http://home.cogeco.ca/~ve3wwg

next prev parent reply	other threads:[~2004-12-15  4:05 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-12-12 17:42 Ada DB bindings and APQ Dmitry A. Kazakov
2004-12-14  7:25 ` Warren W. Gay VE3WWG
2004-12-14 17:37   ` Dmitry A. Kazakov
2004-12-14 18:29     ` Georg Bauhaus
2004-12-14 19:45       ` Dmitry A. Kazakov
2004-12-14 21:06         ` Georg Bauhaus
2004-12-15  8:19           ` Ole-Hjalmar Kristensen
2004-12-15 17:20           ` Dmitry A. Kazakov
2004-12-16 13:28             ` Georg Bauhaus
2004-12-17 13:23               ` Dmitry A. Kazakov
2004-12-14 23:26         ` Brian May
2004-12-15 17:43           ` Dmitry A. Kazakov
2004-12-15 21:54             ` Brian May
2004-12-15  4:05     ` Warren W. Gay VE3WWG [this message]
2004-12-15 18:26       ` Dmitry A. Kazakov
2004-12-16  2:53         ` Warren W. Gay VE3WWG
2004-12-18 16:43           ` Dmitry A. Kazakov
2004-12-18 20:36             ` Warren W. Gay VE3WWG
2004-12-18 22:21               ` Dmitry A. Kazakov
2004-12-19  0:53                 ` Warren W. Gay VE3WWG
2004-12-19 12:21                   ` Dmitry A. Kazakov
2004-12-20  5:33                     ` Warren W. Gay VE3WWG
2004-12-20 20:01                       ` Dmitry A. Kazakov
2004-12-20 20:54                         ` Warren W. Gay VE3WWG
2004-12-14 22:40   ` Brian May
2004-12-15  3:23     ` Warren W. Gay VE3WWG
2004-12-15 15:01       ` Georg Bauhaus
2004-12-17  4:31         ` Brian May
2004-12-15 10:48   ` Brian May
2004-12-16  1:40     ` Brian May
2004-12-16  3:10       ` Warren W. Gay VE3WWG
2004-12-17  4:55         ` Brian May
2004-12-17 11:13           ` Warren W. Gay VE3WWG

replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox