String.Trim() behavior 

When I call .Split(' ') on a String, I still get back a large number of
empty strings in the returned array.

Please tell me how I can process a string to get back only an array of
the "words" within a particular line of text. Here is the code I am
currently using, which still gives me back some elements that are either
empty strings or spaces:

while ((input = docFile.ReadLine()) != null)
{
        tokens = input.Trim().Split(' ');
        for (int i = 0; i < tokens.Length; i++)
        {
                token = tokens[i];
                // do something with token
        }
}

Since adding the call to Trim() it seems to behave better, but as I said
there are still some empty characters. I even tried using:

Regex.Split(input, "[^0-9a-zA-Z]");

but I still got back empty strings and spaces.

What I really want is a way to read lines from a text file and get all of
the "words". I am defining a "word" as being an unbroken chunk of
alphanumeric characters.

Any help would be greatly appreciated. Thanks!

Josh



Sun, 20 Feb 2005 09:50:40 GMT  
 String.Trim() behavior
Josh, what does your string look like that you are splitting?

--
Greg
http://www.claritycon.com





Sun, 20 Feb 2005 10:11:08 GMT  
 String.Trim() behavior


Quote:
> Josh, what does your string look like that you are splitting?

I don't really know ahead of time... Assume that the string I am splitting
is a line of text from a text file....

basically all I am trying to do is to parse the individual words out of a
text file...

Arrays are 0 based in C# aren't they?

Josh



Sun, 20 Feb 2005 10:17:23 GMT  
 String.Trim() behavior
It might contain other embedded whitespace, like \t, \r, \n, etc.  You
might consider using regular expressions to remove anything that is
not A-Z, a-z, 0-9 first.

Jonathan Schafer

On Tue, 3 Sep 2002 21:11:08 -0500, "Greg Ewing"

Quote:

>Josh, what does your string look like that you are splitting?



Sun, 20 Feb 2005 10:32:01 GMT  
 String.Trim() behavior
Josh, yes, they are 0 based.

It's possible, as Jonathan said, that there are other whitespace characters
in your file (tabs, new lines, etc.) in among the spaces.  Each of these
would look like a space when printed.  Can you verify what the characters
are in your array elements?
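
One way to check what is actually in each array element is to print the code point of every character. The following is only a sketch: TokenInspector, Describe and Dump are made-up names for illustration, and it assumes the same Trim().Split(' ') call as in the original post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the thread) for inspecting split results.
public static class TokenInspector
{
    // Lists each character of a token as a hex code point, plus the token length.
    public static string Describe(string token)
    {
        List<string> parts = new List<string>();
        foreach (char c in token)
        {
            parts.Add("U+" + ((int)c).ToString("X4"));
        }
        return "[" + string.Join(" ", parts.ToArray()) + "] length=" + token.Length;
    }

    // Splits a line the same way the original post does and prints every element.
    public static void Dump(string line)
    {
        string[] tokens = line.Trim().Split(' ');
        for (int i = 0; i < tokens.Length; i++)
        {
            Console.WriteLine(i + ": " + Describe(tokens[i]));
        }
    }
}
```

An element that prints as "[] length=0" is an empty string produced by two adjacent spaces, while an element containing U+0009 holds a tab that Split(' ') did not treat as a separator.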

Two other options you have are to extract out only the words and single
spaces using Regex before you split, or combine the two with the
Regex.Split() function.  You can find more info on MSDN.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpre...
frlrfSystemTextRegularExpressionsRegexClassSplitTopic.asp
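
The "extract only the words" option could look roughly like this. It is a sketch, not a definitive implementation: WordExtractor and Words are hypothetical names, and the pattern assumes Josh's definition of a word as an unbroken run of 0-9, a-z and A-Z. Instead of describing the separators, it matches the words themselves with Regex.Matches, so no empty strings can appear in the result.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the thread).
public static class WordExtractor
{
    // One or more consecutive ASCII letters or digits.
    private static readonly Regex WordPattern = new Regex("[0-9a-zA-Z]+");

    // Returns every alphanumeric run in the line, in order of appearance.
    public static List<string> Words(string line)
    {
        List<string> words = new List<string>();
        foreach (Match m in WordPattern.Matches(line))
        {
            words.Add(m.Value);
        }
        return words;
    }
}
```

Because every match must contain at least one character, leading, trailing and repeated separators need no special handling.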

If that doesn't work could you post a line from the file which produces the
array with which you are having trouble?

--
Greg
http://www.claritycon.com





Sun, 20 Feb 2005 10:36:21 GMT  
 String.Trim() behavior


Quote:
> It might contain other embedded whitespace, like \t, \r, \n, etc.  You
> might consider using regular expressions to remove anything that is
> not A-Z, a-z, 0-9 first.

If you read the first email I sent, you would see that I did try using

Regex.Split(input, "[^0-9a-zA-Z]");

but still had the problem...

Josh




Sun, 20 Feb 2005 10:47:09 GMT  
 String.Trim() behavior


Quote:
> Josh, yes, they are 0 based.

> It's possible, as Jonathan said that there are other characters
> surrounded by spaces in your file (tabs, new lines, etc).  Each of
> these would look like spaces.  Can you verify what the characters are
> in your array elements?

> Two other options you have are to extract out only the words and
> single spaces using Regex before you split, or combine the two with
> the Regex.Split() function.  You can find more info on MSDN.

If you read the first email I sent, you would see that I did try using

Regex.Split(input, "[^0-9a-zA-Z]");

but still had the problem...

Josh



Sun, 20 Feb 2005 11:01:58 GMT  
 String.Trim() behavior
OK, could you post a line from your text file that gives you the array with
the spaces?  Did you try looking at the actual characters in the array to
see if they were control characters?  Did you try using Regex to extract
only the words?

--
Greg
http://www.claritycon.com/





Sun, 20 Feb 2005 20:44:09 GMT  
 String.Trim() behavior
That's because you are using Regex.Split the wrong way.  According
to the docs, [^aeiou] matches any single character NOT in the list (in
this case, aeiou).  So, you are splitting on every single character that
is not in 0-9a-zA-Z.  Any such character that occurs in your string will
cause a split at that point, INCLUDING \t, \r, \n.  When two of those
characters sit next to each other (a comma followed by a space, say),
the split between them returns an empty string.

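You should use Regex.Replace before splitting,

like

string s = Regex.Replace(sInput, "[^0-9a-zA-Z ]", " ");

That would leave you with only letters, numbers, and spaces in your
current line of text.  Replacing with a space rather than an empty string
keeps two words separated only by a tab or punctuation from being joined
together.

Then, you could split it into multiple fields, skipping the empty
entries that runs of spaces still produce.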
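
To make the difference concrete, here is a small sketch comparing three approaches. SplitDemo and its method names are invented for illustration, and the last method assumes a .NET version that provides StringSplitOptions.RemoveEmptyEntries.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical comparison helper (not from the thread).
public static class SplitDemo
{
    // The pattern from the original post: every single non-alphanumeric
    // character is its own delimiter, so adjacent delimiters yield "".
    public static string[] SplitSingle(string line)
    {
        return Regex.Split(line, "[^0-9a-zA-Z]");
    }

    // Adding + treats a run of delimiters as one, but a delimiter at the
    // very start or end of the line still yields an empty string there.
    public static string[] SplitRuns(string line)
    {
        return Regex.Split(line, "[^0-9a-zA-Z]+");
    }

    // Replace each run of non-alphanumerics with one space, then split on
    // spaces and drop whatever empty entries remain.
    public static string[] ReplaceThenSplit(string line)
    {
        string cleaned = Regex.Replace(line, "[^0-9a-zA-Z]+", " ");
        return cleaned.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For the input "a, b", SplitSingle returns three elements ("a", "", "b") because the comma and the space are two separate delimiters, while ReplaceThenSplit returns just "a" and "b".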

Jonathan Schafer




Sun, 20 Feb 2005 22:17:59 GMT  
 String.Trim() behavior
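using System;
using System.IO;

class readStr
{
        // Prints every line of a text file. The original left s undefined;
        // taking it as the file path parameter is an assumption.
        public void readFile(string s)
        {
                using (StreamReader sr = File.OpenText(s))
                {
                        String str = null;
                        while (null != (str = sr.ReadLine()))
                        {
                                Console.WriteLine(str);
                        }
                }
        }
}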
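
A reading loop like this can be combined with the thread's definition of a word (an unbroken chunk of alphanumeric characters) without using Regex at all. The sketch below is one possible way to do it: FileWords, ReadWords and IsAsciiAlnum are hypothetical names, and it takes a TextReader so it works with File.OpenText as well as an in-memory StringReader.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper (not from the thread).
public static class FileWords
{
    // True for 0-9, a-z and A-Z only, matching the character class used in the thread.
    private static bool IsAsciiAlnum(char c)
    {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Reads the reader line by line and collects every alphanumeric run.
    public static List<string> ReadWords(TextReader reader)
    {
        List<string> words = new List<string>();
        StringBuilder current = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (char c in line)
            {
                if (IsAsciiAlnum(c))
                {
                    current.Append(c);
                }
                else if (current.Length > 0)
                {
                    // A non-alphanumeric character ends the current word.
                    words.Add(current.ToString());
                    current.Length = 0;
                }
            }
            // The end of a line also ends a word.
            if (current.Length > 0)
            {
                words.Add(current.ToString());
                current.Length = 0;
            }
        }
        return words;
    }
}
```

With a file, the call would be wrapped as using (StreamReader sr = File.OpenText(path)) { ... FileWords.ReadWords(sr) ... }, where path is whatever file is being processed.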

with regards,

J.V.Ravichandran




Mon, 21 Feb 2005 23:31:00 GMT  
 