I'm getting results I don't understand when I set $: to something other than
the default (in several ports of Perl 4.036).  In particular, I get wrapped
lines that BEGIN with one of the characters in $:, even though a usable break
point occurs within a few bytes of the break location actually chosen.

Essentially, all I'm really trying to do is allow ^<<<~~ fields to break on
spaces and AFTER a '>'.  I get the impression that there is some built-in
tolerance factor that I do not know about, but since I've never attempted to
crawl around in the Perl source, I can't tell whether I misunderstand the way
built-in formats use $:, or whether I am misreading the description in the
Camel book and the man page.

I am very pressed for a solution, since my boss is going to make a mess of a
fairly nice, well-behaved program, if I don't come up with a reasonable
solution myself by Wednesday (and there isn't time for me to write the wrapping
routines myself).  What might help is if someone could point me at the source
file where the line breaks are chosen, in which case I can take a look and see
if I can trick the Perl interpreter into doing what I want it to do.

Am I correct in assuming that Perl will only 'look back' a certain number of
characters to find a place to break, so as to avoid, in general, making a lot
of unusually short lines?
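
For comparison, here is a minimal sketch of the break rule as I read it from
the documentation: take as much text as fits in the field, then back up to the
last character of $: inside the field and break AFTER it.  This is not the
interpreter's actual code; chop_field and its arguments are hypothetical names,
and the sketch assumes there is no look-back limit at all.  It is written in
Perl 5 syntax.

```perl
# Hypothetical helper, not part of Perl: split one field's worth of text
# off the front of $text.  Returns (line, remainder).
sub chop_field {
    my ($text, $width, $breaks) = @_;

    # Everything fits: nothing to wrap.
    return ($text, '') if length($text) <= $width;

    # Fallback if no break character is found: hard cut at the field width.
    my $cut = $width;

    # Scan backwards over the characters that fit in the field and stop at
    # the last one that appears in the break set.
    for (my $i = $width - 1; $i >= 0; $i--) {
        if (index($breaks, substr($text, $i, 1)) >= 0) {
            $cut = $i + 1;    # break AFTER the break character
            last;
        }
    }
    return (substr($text, 0, $cut), substr($text, $cut));
}

# Example: a 10-column field, breaking after '>', ';' or space.
my ($line, $rest) = chop_field('&alpha;&mu;&beta;', 10, '>; ');
print "$line\n$rest\n";
```

Under this rule a continuation line would never begin with a break character
unless two of them were adjacent in the text, which is why the output I am
seeing surprises me.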

Thank you for your kind attention to this naive enquiry.

Robert S. Kissel

Sat, 09 May 1998 03:00:00 GMT  
After running a few tests, I am now fairly sure I am dealing with an
interpreter bug.  Here's how you may reproduce the situation:

Suppose you choose $: = '>; ' as your break characters (as you might, for
instance, for a great big stream of SGML containing end-of-tag characters and
end-of-entity-reference characters).  Next, you set up your nice little
paragraph formatter:

format STDOUT=

Sat, 09 May 1998 03:00:00 GMT  
When a break character occurs PRECISELY one position to the RIGHT of the field,
the formatting of a string will NOT back up to the prior break-point (if any).
So, for example, if one is setting a big mess of SGML character references,
like &alpha;&beta;&gamma;... and one of the semi-colons falls EXACTLY one
position beyond the end of the ^-field, then one will NOT back up to the
previous semi-colon for the break.
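
A small script along the following lines should exercise that case.  It is a
sketch, not the original test: the 10-column field width and the test string
are assumptions, chosen so that the second semi-colon is the 11th character,
exactly one position beyond the end of the ^-field.  Whether the misbehaviour
shows up will depend on the interpreter version.

```perl
# Hypothetical reproduction; field width and test string are assumptions.
$: = '>; ';                          # allow breaks after '>', ';' or space
$text = '&alpha;&mu;&beta;&gamma;';  # the second ';' is the 11th character

# The ^-field below is 10 columns wide ('^' plus nine '<'); '~~' repeats
# the line until $text is used up.
format STDOUT =
^<<<<<<<<<~~
$text
.

write;
```

If the break logic backs up as described in the documentation, the first
output line should end at the first semi-colon; if it does not, the first line
runs to the full field width and the next line begins with the semi-colon.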

This looks like an off-by-one bug to me, but I'd like some confirmation of it,
either from a Perl porter or someone else who knows their way around 4.036.  Is
there a known bug here?  Is there a patch?  Is this gone in Perl 5?

Thank you for your kind attention and help.

Sat, 09 May 1998 03:00:00 GMT  
