Efficient string concatenation 

We need to build up an output file of multi-line messages.  Only at the end
do we know whether an individual message is valid or not.  It should then
go on the end of either a valid output file or an error
file.
My problem is that we have noticed that building up a string with embedded
CrLfs is a couple of hundred times slower than writing a file line by line.
I assume this is because the string is constantly being moved around memory
as it expands.  Is there any way to make string concatenation more
efficient?  It suits our application to have the full message available, as we also
want to send it out to message queueing systems as a unit.

At the moment, I am just using the "&" operator to build up the string.
Any help???
TvG



Mon, 28 May 2001 03:00:00 GMT  
One VB trick is to use byte arrays instead of strings in most cases. You can
convert a string to a byte array with a simple assignment, and vice versa
(although you get 2 bytes per character because VB strings are Unicode). Then convert
the byte array back to a string for other processing. Byte access to the data
works like C strings (by index), so you can bypass InStr, Mid, Left, etc. calls
with direct access, and if your text stays within the basic Latin range every
other byte is zero and can be skipped. Converting to byte arrays and back has a cost, but it
may be cheaper if you have to manipulate the data much.
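
A minimal sketch of the round trip described above, assuming VB5/VB6. The procedure name, the sample text and the carriage-return count are made up for illustration; only the two assignments between String and Byte array are the point.

```vb
' Sketch only: ByteArrayDemo and its sample text are hypothetical.
Sub ByteArrayDemo()
    Dim sSource As String
    Dim b() As Byte
    Dim i As Long
    Dim lCount As Long

    sSource = "line one" & vbCrLf & "line two"

    ' Plain assignment copies the string data: 2 bytes per character.
    b = sSource

    ' For characters below 256 the character code sits at the even index
    ' (0, 2, 4, ...) and the byte after it is zero, so step by 2.
    For i = 0 To UBound(b) Step 2
        If b(i) = 13 Then lCount = lCount + 1   ' count carriage returns
    Next i

    ' Assigning the array back rebuilds the string.
    sSource = b
    Debug.Print lCount, Len(sSource)
End Sub
```

Whether this beats InStr and Mid$ depends on how much scanning you do between the two conversions, so it is worth timing on your own data.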

Another alternative is one of the third party string packages.





Mon, 28 May 2001 03:00:00 GMT  
First, the ampersand performs some validation that you may not need to incur as
overhead.  The "+" will work faster.

Second, depending upon how you're copying, the native methods for string
manipulation are notorious for lack of speed.  There's a recent (last 6 months
or so) VBPJ article that deals with string manipulation via a memory copy that's
pretty good on the subject. The syntax looks something like this when used:
  client_tracking$ = Space(25)
  CopyMemory ByVal StrPtr(client_tracking$), ByVal StrPtr(SocketData) _
                    + BYTEPOS_CLIENT_TRACKING_NO, LENB_CLIENT_TRACKING_NO
This is a replacement for the Mid$ function; see the article for more details.
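
CopyMemory is not part of VB itself, so the snippet above needs a Declare. The declaration below is the commonly used alias for the kernel32 RtlMoveMemory export. The CopyMid wrapper is a hypothetical name of my own, not taken from the article; it is a sketch that assumes 2 bytes per character and does its own bounds check, because CopyMemory does none.

```vb
' Goes in the declarations section of a module or form.
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

' Hypothetical helper: returns lLength characters of sSource starting at
' the 1-based position lStart, or "" if the range is not inside sSource.
Private Function CopyMid(sSource As String, ByVal lStart As Long, _
                         ByVal lLength As Long) As String
    Dim sResult As String

    ' CopyMemory does no checking, so reject bad ranges here.
    If lStart < 1 Or lLength < 1 Then Exit Function
    If lStart + lLength - 1 > Len(sSource) Then Exit Function

    sResult = Space$(lLength)               ' preallocate the target
    CopyMemory ByVal StrPtr(sResult), _
               ByVal StrPtr(sSource) + (lStart - 1) * 2, _
               lLength * 2                  ' length is in bytes
    CopyMid = sResult
End Function
```

A wrong offset or length here can crash the IDE rather than raise a trappable error, so save your project before testing.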

I hope that I understood your question, and that this helps.

jb



Mon, 28 May 2001 03:00:00 GMT  
Here are two suggestions for you:

1) One thing that helps dramatically is preallocating your string and
using Mid$() to make your assignments.  Mid$() not only lets you get the
substring specified, it lets you *set* it as well.  So for example, you
can do

    sText = Space$(1024)
    lCurrPosition = 1

and instead of concatenating strings, replace parts of the above string
like

    Mid$(sText, lCurrPosition, Len(sNewText & vbCrLf)) = sNewText & vbCrLf
    lCurrPosition = lCurrPosition + Len(sNewText & vbCrLf)

to set its contents to what you want.  It's a bit more work, since you
have to keep track of your position in the string, and if you go beyond
the allocated space the string will not grow on its own, so you have to
check for that and enlarge it yourself.  When the message is complete, trim
the unused tail with Left$(sText, lCurrPosition - 1).  But since you don't have
to allocate more space and move an increasingly longer string around in
memory every time you append something to your string, your performance
will be greatly improved.

2) If the above seems like too much work, or the messages you generate
are really long to the point where available memory would be an issue,
and you're otherwise happy with the speed of writing the message to a
file, you can write your message line by line *to a temp file*, make your
determination of whether the message is valid or not at the end, and then
append the temp file to either the output file or error file.  There's a
good article on the proper use of temp files at
http://www.mvps.org/vbnet/code/fileapi/gettempfilename.htm.
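
Suggestion 1 above can be sketched as a small append routine. All names here (msBuffer, mlUsed, AppendLine, BufferText, ResetBuffer) are hypothetical, and the doubling growth policy is my own assumption rather than anything VB requires; the idea is only that the buffer is enlarged rarely instead of on every append.

```vb
' Hypothetical module-level buffer; names are for illustration only.
Private msBuffer As String      ' preallocated space, padded with blanks
Private mlUsed As Long          ' number of characters filled so far

Private Sub AppendLine(ByVal sNewText As String)
    Dim sPiece As String
    Dim lNeeded As Long
    Dim lNewSize As Long

    sPiece = sNewText & vbCrLf
    lNeeded = mlUsed + Len(sPiece)

    If lNeeded > Len(msBuffer) Then
        ' Grow to at least double the current size (minimum 1024)
        ' so that reallocations stay rare.
        lNewSize = Len(msBuffer) * 2
        If lNewSize < lNeeded Then lNewSize = lNeeded
        If lNewSize < 1024 Then lNewSize = 1024
        msBuffer = msBuffer & Space$(lNewSize - Len(msBuffer))
    End If

    ' Overwrite the blanks in place instead of concatenating.
    Mid$(msBuffer, mlUsed + 1, Len(sPiece)) = sPiece
    mlUsed = lNeeded
End Sub

Private Function BufferText() As String
    ' Return only the filled part, without the trailing padding.
    BufferText = Left$(msBuffer, mlUsed)
End Function

Private Sub ResetBuffer()
    ' Keep the allocated space and start the next message at the front.
    mlUsed = 0
End Sub
```

Call AppendLine for each line of a message, use BufferText once to get the whole message for the file or the queue, then call ResetBuffer before the next message.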

--



Mon, 28 May 2001 03:00:00 GMT  
<cut>
Quote:
>First, the ampersand performs some validation that you may not need to incur as
>overhead.  The "+" will work faster.

<cut>

Do you have any reference for that, or an example of when + is faster than &
for string concatenation?

Since + has to determine whether the operands are numeric or character it
seems like the exact opposite should be true.



Mon, 28 May 2001 03:00:00 GMT  
This might not be the best benchmark but??? You be the judge...
Using the following:

Private Sub Form_Click()
    Dim l As Long
    Dim ltime As Single
    Dim s As String
    ltime = Timer
    For l = 0 To 50000: s = s + "a": Next
    Debug.Print "Using '+' " & Timer - ltime
    s = ""
    ltime = Timer
    For l = 0 To 50000: s = s & "a": Next
    Debug.Print "Using '&' " & Timer - ltime
End Sub

I got about the same results after running it 3 times:
Using '+'   56.17969
Using '&'   55.69922
Yours may vary <grin>
166mhz Pentium
Winders 95a, VB5

Using the & has got my vote. Not because of the approx 1/2 sec
but for readability and unforgiving errors that the + can give if one
is not careful...
--
Have a good day.
Don





Mon, 28 May 2001 03:00:00 GMT  
Hello Tony

You could experiment with a workaround I've developed.

See the selection Howto[Concatenate] and sample VB
project Concat.

If using an external ActiveX Dll is not overkill then
you can use my FileProcess.dll at
http://www.bcsupernet.com/users/Murray/vbhomepage/fileprocess/index.htm
260+ functions for file, directory, drive, system, internet
operations. Win95 and Nt systems.

Best wishes, Murray

If you decide to experiment, I would appreciate knowing your results.



Mon, 28 May 2001 03:00:00 GMT  


Quote:
>First, the ampersand performs some validation that you may not need to incur
>as overhead.  The "+" will work faster.

Can you please explain the differences you found? I didn't find a different
behaviour of & and +.

DoDi



Wed, 30 May 2001 03:00:00 GMT  



Quote:
>My problem is that we have noticed that building up a string with embedded
>CrLfs is a couple of hundred times slower than writing a file line by line.

Your best bet is a String array for the lines. Then nothing is moved in memory
once the strings have been stored. A loop to write the strings to the appropriate
file should not be too much additional code ;-)
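
A rough sketch of that idea, assuming VB5/VB6 file I/O. The names (masLines, mlCount, AddLine, FlushLines) and the chunk size of 100 are hypothetical choices for illustration.

```vb
' Hypothetical names throughout; a sketch, not tuned for any real data.
Private masLines() As String    ' lines of the current message
Private mlCount As Long         ' how many slots are in use

Private Sub AddLine(ByVal sLine As String)
    ' Grow in chunks of 100 so that ReDim Preserve runs rarely.
    If mlCount = 0 Then
        ReDim masLines(0 To 99)
    ElseIf mlCount > UBound(masLines) Then
        ReDim Preserve masLines(0 To UBound(masLines) + 100)
    End If
    masLines(mlCount) = sLine
    mlCount = mlCount + 1
End Sub

Private Sub FlushLines(ByVal sFileName As String)
    Dim iFile As Integer
    Dim i As Long

    iFile = FreeFile
    Open sFileName For Append As #iFile
    For i = 0 To mlCount - 1
        Print #iFile, masLines(i)   ' Print # ends each line with CrLf
    Next i
    Close #iFile
    mlCount = 0                     ' start the next message
End Sub
```

Once the message is known to be valid or not, pass either the output file name or the error file name to FlushLines. This does not by itself give you the message as one string for the queue, so you would still need a single build step for that case.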

DoDi



Wed, 30 May 2001 03:00:00 GMT  
Try this:

Debug.Print "Hello " + 13 + " year old" 'this line raises a Type mismatch error

Debug.Print "Hello " & 13 & " year old" 'this line is valid

The first line fails because + sees a string and a number (13) and attempts
arithmetic addition, and "Hello " cannot be converted to a number.
The second line converts the 13 to a string first, so it acts
as if you had typed "Hello " + CStr(13) + " year old".

BTW, don't take the 13 year old offensively, it just came to mind.




Wed, 30 May 2001 03:00:00 GMT  
I think it is the other way around. The ampersand is an explicit command to
perform string concatenation whereas the + checks the data type of the
arguments and performs addition or concatenation depending on the types.

Lee




Wed, 30 May 2001 03:00:00 GMT  
I really don't think that it matters. You are talking about very small
differences in the time taken between one format and the other. The
important point is that if you are working with VERY large numbers of
strings then you should not ask VB to produce any string which will require
the movement in memory of other sub-strings. In such cases you should make
sure that all your strings are fixed-length.

Mike




Sat, 02 Jun 2001 03:00:00 GMT  

Quote:
> debug.print "Hello " + 13 + " year old" 'this line returns an error
> debug.print "Hello " & 13 & " year old" 'this line is valid

> The first line returns a error because, you are combining a string with a
> value (13) and another string.
> The second line is combining the 13 as if it were a string, so it would act
> as if you typed "Hello " + str(13) + " year old"

If I ever see code like that I _want_ it to fail, because there are
probably other horrible things in there that I'm going to have to dig out
anyway, and really obvious stuff like that makes for a good example to
show management.

ajm

--
Alan Miller \\ ajm at pobox.com \\ http://www.pobox.com/~ajm
"I consider the fact that NT requires twice to three times the staff to
maintain to be the Full Employment Act for LAN administrators." -CLB, 11/2/98



Sun, 03 Jun 2001 03:00:00 GMT  

Quote:


>> debug.print "Hello " + 13 + " year old" 'this line returns an error
>> debug.print "Hello " & 13 & " year old" 'this line is valid

>> The first line returns a error because, you are combining a string with a
>> value (13) and another string.
>> The second line is combining the 13 as if it were a string, so it would act
>> as if you typed "Hello " + str(13) + " year old"

>If I ever see code like that I _want_ it to fail, because there are
>probably other horrible things in there that I'm going to have to dig out
>anyway, and really obvious stuff like that makes for a good example to
>show management.

And

        10 + 3

doesn't give an error, but it probably won't give the answer you want in
this context (string concatenation).

At least "&" behaves.

   HTH,
   Bart.



Sun, 03 Jun 2001 03:00:00 GMT  
 