Tokenizing a string

 Tokenizing a string

I've got a string I'd like to tokenize, but it's not in a file, and it'd
be rather inefficient to write it to a file just to tokenize it.  Is
there any function I can use to pass this string to
tokenize.tokenize()?  

I could probably use re in this situation, but I'd really rather not, as
the eat() function I'm working with is really nifty, making it much more
convenient to just use tokenize().

Thanks in advance for your help!

--Mike



Wed, 04 Sep 2002 03:00:00 GMT  
 Tokenizing a string

Quote:

> I've got a string I'd like to tokenize, but it's not in a file, and it'd
> be rather inefficient to write it to a file just to tokenize it.  Is
> there any function I can use to pass this string to
> tokenize.tokenize()?

the "tokenize" function takes any callable that returns the
next line of source code each time it's called, and an empty
string when it runs out of data.

the easiest way to use this on a string is to wrap the
string in a StringIO object, and pass the readline method
to the tokenizer:

import tokenize
import StringIO

prog = "print 'hello'\n"

tokenize.tokenize(StringIO.StringIO(prog).readline)

## this prints:
##
## 1,0-1,5:     NAME    'print'
## 1,6-1,13:    STRING  "'hello'"
## 1,13-1,14:   NEWLINE '\012'
## 2,0-2,0:     ENDMARKER       ''

alternatively, you can use your own wrapper, such as:

import string

class Wrapper:
    def __init__(self, program):
        self.prog = string.split(program, "\n")
        if program[-1:] == "\n":
            del self.prog[-1] # drop the empty entry left by the trailing newline
    def __call__(self):
        # return the next line; tokenize stops once it gets ""
        try:
            return self.prog.pop(0) + "\n"
        except IndexError:
            return "" # end of list

tokenize.tokenize(Wrapper(prog))

hope this helps!

</F>

<!-- (the eff-bot guide to) the standard python library:
http://www.pythonware.com/people/fredrik/librarybook.htm
-->



Wed, 04 Sep 2002 03:00:00 GMT  
 Tokenizing a string

Quote:

> I've got a string I'd like to tokenize, but it's not in a file, and it'd
> be rather inefficient to write it to a file just to tokenize it.  Is
> there any function I can use to pass this string to
> tokenize.tokenize()?  

Use StringIO or cStringIO, or just split it on '\n' and write a
small class that hands back the lines one at a time.
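
For what it's worth, the cStringIO variant reads just like the StringIO
example in the earlier post -- a minimal sketch (the source string here
is only an example):

import tokenize
import cStringIO

source = "x = 1 + 2\n"

# cStringIO objects also have a readline method, so they plug
# straight into tokenize.tokenize
tokenize.tokenize(cStringIO.StringIO(source).readline)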

--

http://www.oreilly.com/news/prescod_0300.html
http://www.linux.org.il -- we put the penguin in .com



Wed, 04 Sep 2002 03:00:00 GMT  
 Tokenizing a string

Quote:

> I've got a string I'd like to tokenize, but it's not in a file, and it'd
> be rather inefficient to write it to a file just to tokenize it.  Is
> there any function I can use to pass this string to
> tokenize.tokenize()?  

You just need to pass tokenize a function that returns the next line
each time it's called (and an empty string when there's nothing left).
If your text is in a list of lines, you could use
 tokenize(lambda txt=txt: txt.pop(0))

StringIO or cStringIO is another good possibility.
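
Spelled out, with a guard for the end of the list (tokenize keeps calling
the function until it returns an empty string, so a bare pop(0) would
raise IndexError once the list is empty), a minimal sketch might look like:

import tokenize

lines = ["x = 1\n", "y = x + 2\n"]   # example input

def readline(lines=lines):
    # return the next line, or "" when the list is empty
    # so tokenize sees end-of-input
    if lines:
        return lines.pop(0)
    return ""

tokenize.tokenize(readline)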

- Gordon



Wed, 04 Sep 2002 03:00:00 GMT  
 