Advice re database reporting wanted 
 Advice re database reporting wanted

5/23/01

Asking where the real expertise is  (applications programmers, that is...)
-----------

I have recently put together an application which translates an ANSI X.12
document and stores it in any database for which an ODBC driver is
available.

Now I am thinking about adding some reports to the software, reporting FROM
that database.

The basic structure of the data input to the report is this:

For each document:

Header  (one DB row per header)
Zero to N header extra rows (one db row per fact)
One to 'n' line items  (one line item per DB row)
For each line, zero to 'n' Lineitem extra info (one db row per fact)

There are multiple documents in the database, and multiple documents may be
selected for inclusion on the report.

Since I was able to design the tables myself, you can always find all the
associated records for any document header. (You can also find them going
'bottom up' if you'd like, but that's not a requirement right now).

I have not done any reporting from a database in a couple of years, and was
looking at this pseudo-code structure:

SELECT Headers  Where (some condition) ==> Header Result Set

IF rows found

   PREPARE statement to find all related header extra info
    (SELECT columnsneeded from Extra_Header WHERE column1=? and column2=?)
   PREPARE Statement to find all related line items
      (SELECT columnsneeded from line_items where line_item_order_no=?)
   PREPARE Statement to find all related line item extra info
     (SELECT columnsneeded from extra_line_item_info where
line_item_order_no=?)
    (Bind parameter buffers for all three statements)

  DO UNTIL No-more-headers
    Fetch Header-Row
    Output Header info
    Move parameter info for extra header info to prepared statement; execute
    DO UNTIL No-more-extra-header-info
      Fetch extra-header-info
      report it
    END DO
    Move parameter info for line items to prepared statement and execute
    DO UNTIL No-more-line-items
      Fetch line item record
      Output
      move data to parameters for extra line item info & execute
      DO UNTIL No-More-Extra-Lineitem-Info
         Fetch  extra-line-item-info
         output it
      END DO
    END DO
 END DO
ENDIF      if any header rows returned

(Clean up and exit)
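The first method's nested-fetch structure can be sketched concretely. This is a minimal illustration only, using Python's sqlite3 module as a stand-in for raw ODBC calls; the table and column names (headers, extra_hdr, line_items, extra_li, order_no, etc.) are hypothetical, chosen just to mirror the header / extras / line-item hierarchy described above:

```python
import sqlite3

# In-memory stand-in for the ODBC data source; schema and sample rows are
# hypothetical and only illustrate the header -> extras -> lines hierarchy.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE headers   (order_no INTEGER, customer TEXT);
    CREATE TABLE extra_hdr (order_no INTEGER, fact TEXT);
    CREATE TABLE line_items(order_no INTEGER, line_no INTEGER, sku TEXT);
    CREATE TABLE extra_li  (order_no INTEGER, line_no INTEGER, fact TEXT);
    INSERT INTO headers    VALUES (1, 'ACME');
    INSERT INTO extra_hdr  VALUES (1, 'ship-via: truck');
    INSERT INTO line_items VALUES (1, 1, 'WIDGET');
    INSERT INTO extra_li   VALUES (1, 1, 'color: red');
""")

report = []
# Outer SELECT: one row per document header.
for order_no, customer in con.execute("SELECT order_no, customer FROM headers"):
    report.append(f"HEADER {order_no} {customer}")
    # Parameterized statement re-executed per header (the "prepared
    # statement with bound parameters" of the pseudocode).
    for (fact,) in con.execute(
            "SELECT fact FROM extra_hdr WHERE order_no = ?", (order_no,)):
        report.append(f"  HDR-EXTRA {fact}")
    for line_no, sku in con.execute(
            "SELECT line_no, sku FROM line_items WHERE order_no = ?",
            (order_no,)):
        report.append(f"  LINE {line_no} {sku}")
        for (fact,) in con.execute(
                "SELECT fact FROM extra_li WHERE order_no = ? AND line_no = ?",
                (order_no, line_no)):
            report.append(f"    LI-EXTRA {fact}")

print("\n".join(report))
```

The control-break logic falls out of the loop nesting for free, at the cost of one inner statement execution per header and per line item.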

A second way to do it is to retrieve all 4 table result sets at one time,
ordered by order number/line number, and control the fetches and outputs
from the multiple result sets and do my own control break logic.

Yet a third way to do it is to retrieve all 4 table results at once (i.e.,
only execute 4 SELECTs) into storage (RAM or disk), disconnect from the
database, and do my reporting based on the stored data.
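The third approach might look like the following sketch (Python's sqlite3 standing in for ODBC; all table and column names are hypothetical): exactly four SELECTs, each result set loaded into memory keyed for lookup, then the connection closed before any report output is produced.

```python
import sqlite3
from collections import defaultdict

# Hypothetical stand-in data source.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE headers   (order_no INTEGER, customer TEXT);
    CREATE TABLE extra_hdr (order_no INTEGER, fact TEXT);
    CREATE TABLE line_items(order_no INTEGER, line_no INTEGER, sku TEXT);
    CREATE TABLE extra_li  (order_no INTEGER, line_no INTEGER, fact TEXT);
    INSERT INTO headers    VALUES (1, 'ACME');
    INSERT INTO line_items VALUES (1, 1, 'WIDGET');
    INSERT INTO extra_li   VALUES (1, 1, 'color: red');
""")

# Exactly four SELECTs: pull every result set into RAM, keyed for lookup.
headers = con.execute(
    "SELECT order_no, customer FROM headers ORDER BY order_no").fetchall()
hdr_extra = defaultdict(list)
for order_no, fact in con.execute("SELECT order_no, fact FROM extra_hdr"):
    hdr_extra[order_no].append(fact)
lines = defaultdict(list)
for order_no, line_no, sku in con.execute(
        "SELECT order_no, line_no, sku FROM line_items "
        "ORDER BY order_no, line_no"):
    lines[order_no].append((line_no, sku))
li_extra = defaultdict(list)
for order_no, line_no, fact in con.execute(
        "SELECT order_no, line_no, fact FROM extra_li"):
    li_extra[(order_no, line_no)].append(fact)

con.close()  # database no longer needed; the report runs from memory

report = []
for order_no, customer in headers:
    report.append(f"HEADER {order_no} {customer}")
    report.extend(f"  HDR-EXTRA {f}" for f in hdr_extra[order_no])
    for line_no, sku in lines[order_no]:
        report.append(f"  LINE {line_no} {sku}")
        report.extend(f"    LI-EXTRA {f}"
                      for f in li_extra[(order_no, line_no)])
```

This trades memory for a short, predictable database session: only one statement is ever open at a time, which sidesteps any driver limit on concurrent statements.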

Questions:

1. I like the first method (gee, is it that obvious?), because it handles
all the control break logic; but am somewhat concerned about efficiency.
Given this is a report process (i.e., not interactive), am I worried about
nothing?

2. Am I asking for trouble trying to hold open the database and execute
many, many 'small' statements "on the fly" as opposed to issuing just 4
executions of a SELECT and handling 4 big result sets?

I am querying the database to find out the maximum number of statements
which may be open on a connection; unfortunately, the first DBMS I am
working with, MS-Access, returns "0" which means _either_ "unlimited" or
"information unavailable."

Or am I yet again worrying about nothing, as four (4) open result sets
should be considered trivial?

3. Since I'm out of practice, I'll take any thoughts anyone may have. I am
also new to working with databases on PCs, although I've worked with
databases on IBM M/F and Unix before.

4. Would it make more sense to license something like Crystal
Reports/Developer and use their API/Designer for doing stuff like this?

BACKGROUND:

1. Environment: Windows/32

2. Could, if necessary, code this to run the report in a separate thread
(but that is only a thought, as that introduces a whole new set of problems,
like maybe requiring a separate environment and connection for each thread?)

3. I don't care how 'hard' or 'tricky' it is to code. I do care to engineer
'maximum reliability' from the DBMS and driver. e.g., if you have experience
that a particular brand of DBMS is 'flaky' when multiple result sets are
open, that would affect my design. I know the DBMS vendors won't admit it if
they have this kind of problem, which is why I am asking here.

4. I am using 'native' API calls to the ODBC driver, so development language
flakiness is not an issue. (e.g., "Programs written in 'Joe's C++' have
trouble with ODBC database access").

5. "Extended" SQL capability (e.g., PL/SQL) is not an option.

Thanks,
Michael Mattias
Tal Systems
Racine WI USA



Sun, 09 Nov 2003 22:00:00 GMT  
 Advice re database reporting wanted


Quote:
> SELECT Headers  Where (some condition) ==> Header Result Set

> IF rows found

>    PREPARE statement to find all related header extra info
>     (SELECT columnsneeded from Extra_Header WHERE column1=? and
>     column2=?)
>    PREPARE Statement to find all related line items
>       (SELECT columnsneeded from line_items where line_item_order_no=?)
>    PREPARE Statement to find all related line item extra info
>      (SELECT columnsneeded from extra_line_item_info where
> line_item_order_no=?)
>     (Bind parameter buffers for all three statements)

>   DO UNTIL No-more-headers
>     Fetch Header-Row
>     Output Header info
>     Move parameter info for extra header info to prepared statement; execute
>     DO UNTIL No-more-extra-header-info
>       Fetch extra-header-info
>       report it
>     END DO
>     Move parameter info for line items to prepared statement and execute
>     DO UNTIL No-more-line-items
>       Fetch line item record
>       Output
>       move data to parameters for extra line item info & execute
>       DO UNTIL No-More-Extra-Lineitem-Info
>          Fetch  extra-line-item-info
>          output it
>       END DO
>     END DO
>  END DO
> ENDIF      if any header rows returned

> (Clean up and exit)

> A second way to do it is to retrieve all 4 table result sets at one
> time, ordered by order number/line number, and control the fetches and
> outputs from the multiple result sets and do my own control break
> logic.

Hi Michael

since SQL_MAX_CONCURRENT_ACTIVITIES may return 0, meaning an "unknown" or
"unlimited" number of open statements per *connection*, you should really
only design for several concurrent statements per connection if
SQL_MAX_CONCURRENT_ACTIVITIES >= the number of statements you need per
connection for that driver.

A more portable approach (as long as the database does not limit you to
just one active connection at a time) would be to open 4 connections
to the DB and use one statement per connection - however, I'd only
recommend doing this if you use connection pooling. When you use connection
pooling the driver manager will only open a new connection if all
cached connections are in use. That way you minimize the overhead
associated with new connections. Will your app support many concurrent
users? Are these long-running transactions? Keep in mind that connection
pooling support was kind of buggy prior to version 3.0 of the ODBC driver
manager - I'd have to look up the flaky version number(s)...
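The "several statements open on one connection" pattern that SQL_MAX_CONCURRENT_ACTIVITIES governs can be exercised concretely. This sketch (Python's sqlite3 standing in for ODBC, hypothetical table names) interleaves fetches from two cursors that are open simultaneously on a single connection - the outer cursor is still mid-result-set while the inner one executes and fetches:

```python
import sqlite3

# Hypothetical stand-in data source.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE headers (order_no INTEGER);
    CREATE TABLE lines   (order_no INTEGER, line_no INTEGER);
    INSERT INTO headers VALUES (1), (2);
    INSERT INTO lines   VALUES (1, 1), (2, 1);
""")

# Two statements concurrently active on ONE connection.
outer = con.cursor()
inner = con.cursor()
rows = []
outer.execute("SELECT order_no FROM headers ORDER BY order_no")
for (order_no,) in outer:          # outer result set still open...
    inner.execute("SELECT line_no FROM lines WHERE order_no = ?",
                  (order_no,))     # ...while the inner statement executes
    rows.append((order_no, [ln for (ln,) in inner.fetchall()]))
```

Whether a given ODBC driver tolerates this is exactly what the SQLGetInfo value is supposed to tell you; when it reports 0 you cannot distinguish "unlimited" from "unknown", which is why testing against the actual driver matters.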

I'd opt for option 1 - DBMSs are well optimised - no reason you should
duplicate the data access logic in your code - use the database.

Cheers

Florent Heyworth



Wed, 12 Nov 2003 20:16:51 GMT  
 Advice re database reporting wanted


Quote:


> > SELECT Headers  Where (some condition) ==> Header Result Set

> Hi Michael

> A more portable approach (as long as the database does not limit you
> from just one active connection at at time) would be to open 4 connections
> to the DB and use one statement per connection - however, I'd only
> recommend doing this if you use connection pooling.

Hmmm, that's something I had not thought of. I will think about this.

Quote:
> Will your app support many concurrent  users?

No.

Quote:
>Are these long running transactions?

No.

My major concern is end-user support. Many (most?) of the users for whom
this is designed  wear loafers, as tying their own shoes is considered a
"highly technical" activity.

Fortunately for this user base, once the product is installed and operating,
ain't nobody ever touches it again.

Thank you for your comments.

MCM



Thu, 13 Nov 2003 02:47:00 GMT  
 Advice re database reporting wanted
Since it's become obvious I haven't been working with databases lately, let
me ask more questions of the applications programmers here...

Working with ODBC, I created a database using the MS-Access driver. I put
some data in it. (And "some" is about all I inserted. For production
purposes the number of items I stored was peanuts).

The database (*.mdb)  then took up 26 Mb of disk space!  When I put the same
data into "flat" (comma-delimited) files, it takes up less than 2 Mb.

When I WinZipped the *.mdb file, it compressed to 1.8 Mb.

I looked back at my design, and realized I had created all the text columns
as "CHAR(size)" rather than "VARCHAR(size)". I know there is some
"whitespace" in the data, but not a lot except when there is no data, and
when the data were empty, I stored an explicit NULL in the columns. All the
numeric columns are defined to the DBMS as either CURRENCY (the Access
literal for SQL_NUMBER) or INTEGER. There are a total of about 210 columns
across seven different tables, about half of them text/character in nature.
There are no indexes (yet?).

Questions:

1. Is this kind of disk space utilization by the Access(r) brand DBMS "the
norm?"

2. If so, will this be true of all the major DBMS brands (e.g., Oracle)?

3. In general, would I save much space by going to VARCHAR in lieu of CHAR
types?

4. Am I all worked up about nothing again?

Unrelated question: does anyone know what the *.ldb files created by the
Access driver are? They are 64 byte files, apparently text, and look like
this:

U8MRL                           admin

Thanks,
--
Michael Mattias
Tal Systems
Racine WI USA



Fri, 14 Nov 2003 21:02:04 GMT  
 Advice re database reporting wanted
See the answers to your questions below:


Quote:
> Since it's become obvious I have been working with  databases, let me ask
> more questions of the applications programmers here...

> Working with ODBC, I created a database using the MS-Access driver. I put
> some data in it. (And "some" is about all I inserted. For production
> purposes the number of items I stored was peanuts).

> The database (*.mdb)  then took up 26 Mb of disk space!  When I put the same
> data into "flat" (comma-delimited) files, it takes up less than 2 Mb.

> When I WinZipped the *.mdb file, it compressed to 1.8 Mb.

> I looked back at my design, and realized I had created all the text columns
> as "CHAR(size)" rather than "VARCHAR(size)". I know there is some
> "whitespace" in the data, but not a lot except when there is no data, and
> when the data were empty, I stored an explicit NULL in the columns. All the
> numeric columns are defined to the DBMS as either CURRENCY (the Access
> literal for SQL_NUMBER) or INTEGER. There are a total of about 210 columns
> across seven different tables, about half of them text/character in nature.
> There are no indexes (yet?).

> Questions:

> 1. Is this kind of disk space utilization by the Access(r) brand DBMS "the
> norm?"

I wouldn't have been surprised if 2 MB of flat file turned into 10 MB of
Access file.  However 26 MB seems a little excessive.  Access trades space
for time.  By laying out the data in a less space efficient method, Access
allows for faster processing.

Quote:
> 2. If so, will this be true of all the major DBMS brands (e.g., Oracle)?

An Access database of 1 MB will convert to an SQL Server database of 1.3 to
1.5 MB.  I would expect the same sort of increase for conversion to Oracle
or Sybase.

Quote:
> 3. In general, would I save much space by going to VARCHAR in lieu of CHAR
> types?

If the strings were variable in length I would use VARCHAR.  That should
save space.  If they were fixed in length, I would still use VARCHAR even
though CHAR would give a slight saving in this case.  It's simpler that way.

Quote:
> 4. Am I all worked up about nothing again?

Possibly.  But to check, try doubling the amount of data which you are
inserting and change the CHARs to VARCHARs.  If the Access file doubles in
size there is a problem.  Having said that, I have clients who run 400 MB
Access files.  This is not recommended because Access does not run in a
client-server manner, whereas Oracle, SQL Server, etc. do.  The result is
that Access may start using significant network bandwidth for a query
involving large tables stored in an MDB file on a remote file server, because
all the SQL processing is done on the client.

Quote:

> Unrelated question: does anyone know what the *.ldb files created by the
> Access driver are? They are 64 byte files, apparently text, and look like
> this:

> U8MRL                           admin

The *.LDB files are Access lock files, used to keep track of record locking
in a multi-user environment.  Since Access always runs in multi-user mode,
these files are always created when you open an Access MDB file.  Access 97
or later should delete these files when their corresponding MDB files are
completely closed but earlier versions may leave *.LDB files lying around.

Cheers

Derek



Sat, 15 Nov 2003 06:32:14 GMT  
 Advice re database reporting wanted
On Mon, 28 May 2001 13:02:04 GMT, "Michael Mattias"

Quote:

>The database (*.mdb)  then took up 26 Mb of disk space!  When I put the same
>data into "flat" (comma-delimited) files, it takes up less than 2 Mb.

>When I WinZipped the *.mdb file, it compressed to 1.8 Mb.

>Questions:

>1. Is this kind of disk space utilization by the Access(r) brand DBMS "the
>norm?"

Yes, in my limited experience with Access.  We have a [test data]
275,000 record Access table which is over 75 meg. Converted to dBase3+
format, it's a little over 5 meg.  The final SDF (fixed-record-length
format) file is about 1.5 meg.
Quote:

>2. If so, will this be true of all the major DBMS brands (e.g., Oracle)?

>3. In general, would I save much space by going to VARCHAR in lieu of CHAR
>types?

>4. Am I all worked up about nothing again?

>Unrelated question: does anyone know what the *.ldb files created by the
>Access driver are? They are 64 byte files, apparently text, and look like
>this:

>U8MRL                           admin

As I recall they are "lock" files which tell Access that the table /
MDB is open and in use.

Quote:

>Thanks,

Ron  n1zhi

ICQ: 26516311


Sat, 15 Nov 2003 07:15:33 GMT  
 Advice re database reporting wanted
On Mon, 28 May 2001 13:02:04 GMT, "Michael Mattias"

Quote:

>2. If so, will this be true of all the major DBMS brands (e.g., Oracle)?

Not Visual FoxPro. Your EXE will be huge but data is stored in
DBF-style tables. There is also an optional DBC (database container)
which holds validity-checking rules, relationships or anything else
that a group of tables might share. The result is that the data is
larger than it would have been in FP/DOS or FP/Win but not orders of
magnitude larger.

Quote:
>4. Am I all worked up about nothing again?

I don't think so but this might just be my personal bugaboo. I don't
want to buy bigger drives just to have Microsoft eat up the space
because they're too lazy to do it right.


Anything here that looks like an opinion is mine and not the company which vastly underpays me.



Sun, 16 Nov 2003 00:22:51 GMT  
 