What to do if a split matches beginning of string? 

Hi,

When splitting a string, (using the classic spaces example),



However, if the string is

" The quick brown fox."

I go about testing for the condition where the first element
is (I think) empty:

# if the first element is empty but defined,


But I would like to avoid that altogether, if there's some way
to prevent the first match from being scanned.
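
The code lines of this post did not survive in the archived copy. As a rough sketch of the situation being described, assuming the pattern was a single space (the variable names here are illustrative, not the poster's):

```perl
use strict;
use warnings;

# Assumed input and pattern; the original code lines are missing.
my $string = " The quick brown fox.";
my @words  = split / /, $string;

# The leading space is itself a match, so the first field is empty.
printf "%d fields, first is '%s'\n", scalar(@words), $words[0];

# The check described in the post: first element defined but empty.
shift @words if defined $words[0] && $words[0] eq '';
print join('*', @words), "\n";
```

The shift works, but it is exactly the after-the-fact test the question hopes to avoid.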

--
Robert Kiesling
Linux FAQ Maintainer

http://www.*-*-*.com/ ; http://www.*-*-*.com/
---



Tue, 23 Sep 2003 11:54:58 GMT  

Quote:
>Hi,

>When splitting a string, (using the classic spaces example),




>However, if the string is

>" The quick brown fox."

>I go about testing for the condition where the first element
>is (I think) empty:

># if the first element is empty but defined,


>But I would like to avoid that altogether, if there's some way
>to prevent the first match from being scanned.

Here are some ideas, plus a framework for testing others. The best here
are Option A and Option B-4. There are regex adepts hereabouts who can
improve on these...

#!perl -w

use strict;
my $string      = " The quick brown fox.";

# Option A: remove any initial spaces *before* rest of process
# $string       =~ s/^\s*//;

# Option B: try various splits
for (
   # 1. for reference
   [ '*' . $string . '*' ],

   # 2. standard split on space -- *the problem*
   [ split( / /   => $string ) ],

   # 3. word boundary required before space
   [ split( /\b / => $string ) ],

   # 4. remove initial spaces -  no change to $string
   [ split( / /   => ($string =~ m/\s*(.*)$/)[0] ) ],

   )
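    {
    # The loop body is missing from the archived post; joining each
    # result on '*' reproduces the output shown after __END__.
    print join( '*' => @$_ ), "\n";
    }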

__END__

output:
* The quick brown fox.*
*The*quick*brown*fox.   # arrgh
 The*quick*brown*fox.   # initial space stays with " The"
The*quick*brown*fox.    # aaah

output with initial spaces removed (uncomment Option A):
*The quick brown fox.*
The*quick*brown*fox.
The*quick*brown*fox.
The*quick*brown*fox.
--

   - Bruce

__bruce_van_allen__santa_cruz_ca__



Wed, 24 Sep 2003 01:48:49 GMT  
On Fri, 06 Apr 2001 03:54:58 GMT, Robert Kiesling wrote:

Hello Robert,

Quote:
>When splitting a string, (using the classic spaces example),



>However, if the string is
>" The quick brown fox."

in this special case try


perldoc -f split

Christian



Thu, 25 Sep 2003 00:57:05 GMT  

Quote:

> [...]
> However, if the string is

> " The quick brown fox."

> I go about testing for the condition where the first element
> is (I think) empty:

> # if the first element is empty but defined,


> But I would like to avoid that altogether, if there's some way
> to prevent the first match from being scanned.

split (' '," The quick brown fox.");

Works only for space as delimiter.
See perldoc -f split.
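
A small comparison may help show what the ' ' special case does differently. This is only a sketch; the input string is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical input: two leading spaces and a doubled interior space.
my $string = "  The quick  brown fox.";

my @single = split / /,   $string;   # every single space is a separator
my @runs   = split /\s+/, $string;   # runs of whitespace; leading empty field kept
my @magic  = split ' ',   $string;   # special case: leading whitespace skipped first

print join('*', @single), "\n";   # **The*quick**brown*fox.
print join('*', @runs),   "\n";   # *The*quick*brown*fox.
print join('*', @magic),  "\n";   # The*quick*brown*fox.
```

Note that /\s+/ still yields one leading empty field; only the literal ' ' form drops it.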
--
Cheers,
haj



Tue, 23 Sep 2003 23:51:02 GMT  
Others gave the best answer.

Even though I sometimes forget about them, one of the reasons I
appreciate Perl is that it has things like
   split ' ', $string;
which handle special cases like whitespace at the beginning of $string.

Downright thoughtful. Anyway, here's my previous exercise with the
*BEST* option added.

#!perl -w

use strict;
my $string      = " The quick brown fox.";

# Option A: remove any initial spaces *before* rest of process
# $string =~ s/^\s*//;

# Option B: try various splits
for (
   # 1. for reference
   [ '*' . $string . '*' ],

   # 2. standard split on space -- *the problem*
   [ split( / /   => $string ) ],

   # 3. word boundary required before space
   [ split( /\b / => $string ) ],

   # 4. remove initial spaces -  no change to $string
   [ split( / /   => ($string =~ m/\s*(.*)$/)[0] ) ],

   # 5. use ' ' instead of / /
   [ split( ' ' => $string ) ],

   )
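    {
    # The loop body is missing from the archived post; joining each
    # result on '*' reproduces the output shown after __END__.
    print join( '*' => @$_ ), "\n";
    }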

__END__

output:
* The quick brown fox.*
*The*quick*brown*fox.   # arrgh
 The*quick*brown*fox.   # initial space goes with " The"
The*quick*brown*fox.    # aaah
The*quick*brown*fox.    # the best *which I forgot previously*

output with initial spaces removed (Option A):
*The quick brown fox.*
The*quick*brown*fox.
The*quick*brown*fox.
The*quick*brown*fox.
The*quick*brown*fox.

--

   - Bruce

__bruce_van_allen__santa_cruz_ca__



Wed, 24 Sep 2003 02:41:26 GMT  
Quote:

>When splitting a string, (using the classic spaces example),



Your pattern splits on a single space, so you will get "interior" empty
strings with something like "The  quick brown fox."
                                ^^

Is that what you want?

Must be, because if you wanted to split on _all_ whitespace
characters instead of just the space character, you would
use the solution given in the description for split().

Quote:


>However, if the string is

>" The quick brown fox."

>I go about testing for the condition where the first element
>is (I think) empty:

So, your question is more accurately stated:

   How can I eliminate leading null fields?

Is that it?

Quote:
># if the first element is empty but defined,


>But I would like to avoid that altogether, if there's some way
>to prevent the first match from being scanned.

Remove the leading separator chars from the string before you split it:

   s/^ +//;
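
Put together with the split, that might look like the sketch below. It works on a copy so the original string is left alone; the variable names and input are assumptions for illustration:

```perl
use strict;
use warnings;

my $string = "   The quick brown fox.";    # hypothetical input

# Copy the string, then strip the leading separator characters from the copy.
(my $trimmed = $string) =~ s/^ +//;

my @words = split / /, $trimmed;
print join('*', @words), "\n";   # The*quick*brown*fox.
```

Interior empty fields from doubled spaces would still come through, which is the behavior asked about above.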

--
    Tad McClellan                          SGML consulting

    Fort Worth, Texas



Tue, 23 Sep 2003 21:23:52 GMT  
In the general case of a split which may return null fields which I want
to ignore, I find this idiom very helpful:

  $_ = '   a    bunch of words    with extra    spaces    ';


This solves a slightly different problem from the one you specified, of
course.  Of the solutions presented, using the magic split behavior of ' '
seems most direct.
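
The line carrying the idiom itself is missing from this copy of the post, so what follows is only a guess at the general technique (filtering out empty fields with grep), not necessarily the code that was posted:

```perl
use strict;
use warnings;

$_ = '   a    bunch of words    with extra    spaces    ';

# Assumption: split on a single space, then keep only non-empty fields.
my @words = grep { length($_) } split( / /, $_ );

print join('*', @words), "\n";   # a*bunch*of*words*with*extra*spaces
```

Unlike stripping the leading separator, this discards every empty field, interior ones included.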

--
   |   Craig Berry - http://www.cinenet.net/~cberry/
 --*--  "When the going gets weird, the weird turn pro."
   |               - Hunter S. Thompson



Wed, 24 Sep 2003 08:38:20 GMT  

Quote:


> > [...]
> > However, if the string is

> > " The quick brown fox."

> > I go about testing for the condition where the first element
> > is (I think) empty:

> > # if the first element is empty but defined,


> > But I would like to avoid that altogether, if there's some way
> > to prevent the first match from being scanned.

> split (' '," The quick brown fox.");

> Works only for space as delimiter.
> See perldoc -f split.

That is what nearly everyone pointed to - the example I gave is
actually a little contrived. It involves text, not only spaces. I
wasn't certain that there was an "approved" method to cope with a
leading match.  
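
For a delimiter that is not whitespace, one possible approach is to strip any leading run of the delimiter before splitting. The helper name split_skip_leading and the comma-separated input below are hypothetical, chosen only to illustrate the idea:

```perl
use strict;
use warnings;

# Hypothetical helper: drop leading delimiter matches, then split as usual.
sub split_skip_leading {
    my ($delim, $text) = @_;
    $text =~ s/^(?:$delim)+//;       # remove any run of delimiters at the start
    return split /$delim/, $text;    # interior empty fields are preserved
}

my @fields = split_skip_leading( qr/,/, ",,alpha,beta,,gamma" );
print join('*', @fields), "\n";   # alpha*beta**gamma
```

Only the leading match is suppressed here; empty fields in the middle of the string are kept.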

Thanks, though, to everyone - everyone seems to have one or two
different methods of coping with the first match, with or
without leading whitespace.

Robert

--
Robert Kiesling
Linux FAQ Maintainer

http://www.mainmatter.com/linux-faq/toc.html  http://www.mainmatter.com/
---



Wed, 24 Sep 2003 04:57:10 GMT  
 