Help With File Processing (CW 5.5G)
Author Message

Hi

I have a problem that has me stumped.

I have a large TPS file (156 MB) that I want to convert to a somewhat
different format (I want to split an "amount" field into a debit or a
credit field).

To speed up the conversion, I have created a large (200 MB) RAM
disk and copied the file & conversion program to it. When I
run my conversion program, it runs very, very slowly.  If I abort the
process and restart it, the process is extremely fast until it reaches
a record that has not had the new fields populated.  Once it hits
these unprocessed records, the process slows to a crawl again.

The code I am using is:

      vlCount = Records(Trans)

      Loop Until(Eof(Trans))
         Next(Trans)
         If Error() Then Stop(Error()).

         If TRA:Amount < 0
            TRA:DEBIT_Amount  = -TRA:Amount
         Else
            TRA:CREDIT_Amount = TRA:Amount
         .
         Put(Trans)
         If Error() Then Stop(Error()).
         vlCount -= 1
         Display(?vlCount)
      .
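For readers outside Clarion, the split rule in the loop above amounts to the following (a Python sketch for illustration only; the function name is mine, the logic mirrors the TRA:Amount branch):

```python
def split_amount(amount):
    """Split a signed amount into (debit, credit).

    Negative amounts become debits (stored positive); everything
    else becomes a credit -- the same branch as the
    IF TRA:Amount < 0 test in the Clarion loop.
    """
    if amount < 0:
        return (-amount, 0)
    return (0, amount)
```

For example, `split_amount(-25)` yields `(25, 0)` and `split_amount(40)` yields `(0, 40)`.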

What is interesting is that if I just do a GET and then a PUT without
populating the new fields, the process is fast. As soon as I change a
value in one of the fields, the PUT seems to take a significant amount
of time.

Can anyone explain what is happening here? My theory is that CW 5.5
only writes a record on PUT if it has changed, hence the difference in
"put times" between the processed and unprocessed records.

A related question: even if the above theory is right, why does
the RAM disk not speed up the file processing? I cannot see any
appreciable difference between processing the file with a RAM disk
and without.

Thanks for the help!

Les



Sat, 21 May 2005 00:20:02 GMT  
Hi Les,
Yes, that is correct. You can only do a put after a GET( ).

The best way to do this type of conversion is to let the data dictionary
write the conversion code and modify it accordingly.  Also, if you have a
200 MB RAM disk and your data file is 156 MB, you would not have enough
memory allocated to the RAM disk to convert the file.  (Are you
writing to the RAM disk?)

Also, the use of STREAM( ) and FLUSH( ) would help with the conversion
speed.

--
Ben E. Brady
http://www.*-*-*.com/



Sat, 21 May 2005 06:26:05 GMT  
Les,

Are you running under XP or Win2K?  If so, this is what's been normal
for me since moving to MS$'s new and improved :) OSes.  Win95/98 was
orders of magnitude faster in disk access.  If you find out what's making
disk access so slow, I, and a whole lot of other Clarioneers, would like
to know.

Rich Knoch
www.compulink-ltd.com



Sat, 21 May 2005 08:07:39 GMT  
Ben

Many thanks for the heads up regarding the Stream() / Flush()
statements.

Here is what I found: processing the 1.4 million records "normally" takes
over 24 hours (about 1k records/minute). Using a RAM disk does not appear
to help this much.

Processing the same 1.4 million records using a RAM disk & the Stream()
command takes about 18 minutes (about 77k records/minute).
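Those figures are internally consistent, as a quick check shows:

```python
records = 1_400_000

hours_slow = records / 1_000 / 60   # ~23.3 hours at ~1k records/minute
minutes_fast = records / 77_000     # ~18.2 minutes at ~77k records/minute

print(round(hours_slow, 1), round(minutes_fast, 1))
```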

What this tells me is that the logical closing done by Clarion adds a
great deal of overhead.

Go figure.

Again, thanks for the help - it is a bit of a life saver.

Cheers

Les Kearney


Sat, 21 May 2005 09:49:53 GMT  
Rich

I am running Win98 (and based on your comments will think long & hard
about any change!).

What seems to have the biggest impact on processing speed is the
logical file closing that Clarion does. As my previous post to Ben
suggests, by using the Stream() function to disable Clarion's file
flushing, I go from 1k records/minute to 100k records/minute.

Go figure: you can be fast or you can be safe, but you can't be fast &
safe.

Cheers

Les Kearney




Sat, 21 May 2005 10:37:18 GMT  
You are welcome Les.
Glad I was able to help.

--
Ben E. Brady
www.clariondeveloper.com




Sat, 21 May 2005 13:16:14 GMT  
If this procedure is running on a standalone machine, or the files are
opened exclusively, you may want to use:

Stream(Filename)
  ! processing code here
Flush(Filename)


Sat, 21 May 2005 07:10:10 GMT  
Hi Les,

Many years ago, certainly with the introduction of the Clarion for
Windows beta, Topspeed issued a strong warning about using the
EOF and BOF functions. The following is from the help file and
documentation on the EOF function:

The EOF procedure is not supported by all file drivers, and can be very
inefficient even if supported (check the driver documentation).
Therefore, for efficiency and guaranteed file system support it is
not recommended to use this procedure. Instead, check the ERRORCODE()
procedure after each disk read to detect an attempt to read past the end
of the file.
The EOF procedure was most often used as an UNTIL condition at the top
of a LOOP, so EOF returns true after the last record has been read and
processed.
Not recommended, and still available for backward compatibility.

For that matter, it is better to code as follows:
  SET(Trans)
  LOOP
    NEXT(Trans)
    IF ERRORCODE() THEN BREAK.
    IF TRA:Amount < 0
       TRA:DEBIT_Amount  = -TRA:Amount
    ELSE
       TRA:CREDIT_Amount = TRA:Amount
    END
    PUT(Trans)
  END
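For comparison, the same read-until-failure shape in Python terms (a sketch only; an iterator raising StopIteration stands in for ERRORCODE() after NEXT):

```python
def process_all(next_record, handle):
    # Attempt each read and stop when it fails, rather than asking
    # "am I at the end?" before every read (the EOF() style).
    while True:
        try:
            rec = next_record()
        except StopIteration:
            break
        handle(rec)
```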

Regards,
Marius Luidens



Sat, 21 May 2005 12:13:16 GMT  
Hi Les
Definitely use Stream and Flush with such a large file; you will need
to open the file in exclusive mode.
I don't think the Topspeed driver supports BOF and EOF - you need the
code as previously described.

Open(Trans,ReadWrite+DenyAll)  ! Exclusive access
Stream(Trans)
Counter# = 0
SET(Trans)
LOOP
  NEXT(Trans)
  IF ERRORCODE() THEN BREAK.
  IF TRA:Amount < 0
     TRA:DEBIT_Amount  = -TRA:Amount
  ELSE
     TRA:CREDIT_Amount = TRA:Amount
  END
  PUT(Trans)
  Counter# += 1
  IF Counter# > 50   ! or whatever batch size you decide :)
     Flush(Trans)
     Stream(Trans)
     Counter# = 0
  END
END
Flush(Trans)
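The checkpointing idea above can be sketched in Python (`write` and `flush` are stand-ins for PUT and the FLUSH/STREAM pair; a batch of ~50 mirrors the counter in the post):

```python
def convert_with_checkpoints(records, write, flush, batch=50):
    # Flush every 'batch' records so a crash loses at most one
    # batch of work; streaming is re-entered implicitly here.
    pending = 0
    for rec in records:
        write(rec)
        pending += 1
        if pending >= batch:
            flush()
            pending = 0
    flush()  # final flush for the last partial batch
```

The trade-off is the one Les describes: larger batches are faster, smaller batches are safer.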

Regards
Tanya



Sat, 21 May 2005 15:51:42 GMT  
On Mon, 2 Dec 2002 14:26:05 -0800, "Ben E. Brady"

Quote:

>Hi Les,
>Yes, that is correct. You can only do a put after a GET( ).

?!??

You can do a PUT after any record retrieval (NEXT, PREVIOUS, GET,
REGET).

The OP is correct, in that a PUT without changing the values does not
write out anything to the file (at least with TPS files).

Jason



Mon, 23 May 2005 04:22:03 GMT  
Shailesh

Thanks for the help. I have used the stream command with much better
performance.

Cheers

Les


Tue, 24 May 2005 08:37:28 GMT  
Marius

Do you have a documentation reference for the EOF/BOF limitations? I
have been using them for years without any ill effect. I will make the
change, however, and would like to understand what is happening.

Cheers

Les


Tue, 24 May 2005 08:39:29 GMT  
Hi Les,
Other than the information in the help file and Clarion PDFs about the
EOF/BOF functions, I have no further documentation.
Realize that in your coding example the EOF function is being called
after each record read. The EOF/BOF functions take some time to execute,
and calling them thousands of times will certainly not be efficient.
Best Regards,
Marius Luidens


Tue, 24 May 2005 14:08:55 GMT  
 
 [ 13 posts ]
