Optimization help needed: Search and Replace using dictionary of parameters 

How can I do this most efficiently:

    I have filenames and parameters in a sparse matrix that is a
dictionary:

    {'file1, parameter1': xxx, 'file1, parameter2': yyy,
    'file2, parameter1': zzz, 'file2, parameter2': hhh,
    'file2, parameter3': ggg,
    'file3, parameter1': ccc,
    'file4, parameter1': ddd, 'file4, parameter2': eee}

    I would like to run a search and replace on each file for each of
its parameters.

    For example: parameter1 in file1 should be replaced with xxx, etc.

    Before I start writing code, any ideas on the fastest way of
doing it?
        Regex or string functions? map() or readlines()?

    Any ideas are appreciated..

-pekka-



Sat, 19 Jun 2004 01:20:36 GMT  

Quote:
>     Before I start writing code, any ideas on the fastest way of
> doing it?
>         Regex or string functions? map() or readlines()?

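sed could be much faster (albeit less flexible) than Python
for this task.

Your data structure could be slow: with flat 'file, parameter' keys
you have to scan or parse every key to find the parameters of one file.
Use nested dictionaries instead, keyed first by filename: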

x = {
  'filename1':    { 'parameter': 'value',
                    'parameter2': 'value',
                    'parameter3': 'value' },
  'filename2':    { 'parameter2': 'value',
                    'parameter4': 'value',
                    'parameter6': 'value' },
  'filename3':    { 'parameter5': 'value',
                    'parameter7': 'value',
                    'parameter8': 'value' },
  'filename4':    { 'parameter': 'value',
                    'parameter3': 'value' }
}
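
If the data already exists in the flat form from the question, it can be regrouped once up front. This is only a sketch: it assumes every key looks like 'file, parameter' (comma plus one space), and the helper name nest_by_file is made up for illustration.

```python
def nest_by_file(flat):
    """Turn {'file, param': value} into {'file': {'param': value}}."""
    nested = {}
    for key, value in flat.items():
        # Assumption: the file name itself never contains ", ".
        filename, param = key.split(", ", 1)
        nested.setdefault(filename, {})[param] = value
    return nested


flat = {
    "file1, parameter1": "xxx",
    "file1, parameter2": "yyy",
    "file2, parameter1": "zzz",
}
nested = nest_by_file(flat)
print(nested["file1"])
```

After this, nested[filename] gives all the parameters for one file in a single lookup, with no scan over unrelated keys.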

As for string vs re, it depends.  Just use whichever one is easier
for your particular situation.  But take a special look at re.sub().

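  import re

  # Assumes lookup is the nested dictionary above and filename is
  # the name of the file currently being processed.
  def get_replacement(match):
      param = match.group(1)
      return lookup[filename][param]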

  for line in file.xreadlines():
      # Find and replace all tags that are set off in a certain way...
      line = re.sub(r'<<([A-Z0-9_]+)>>', get_replacement, line)
      out.write(line)
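
For anyone who wants to try the re.sub() callback idea on its own, here is a small self-contained sketch in current Python 3. The <<TAG>> syntax is the one from the snippet above; the function name substitute and the choice to leave unknown tags untouched are my own assumptions, not part of the original suggestion.

```python
import re

# Tags are set off as <<NAME>>, as in the snippet above.
TAG = re.compile(r"<<([A-Z0-9_]+)>>")


def substitute(text, params):
    """Replace each <<TAG>> with params[TAG]; unknown tags are left as-is."""
    def get_replacement(match):
        # match.group(0) is the whole tag, match.group(1) the name inside it.
        return params.get(match.group(1), match.group(0))
    return TAG.sub(get_replacement, text)


params = {"NAME": "pekka", "COUNT": "3"}
result = substitute("Hello <<NAME>>, <<COUNT>> files, <<UNKNOWN>> tag.", params)
print(result)
```

Because the replacement is a function, its return value is inserted literally, so backslashes in the parameter values need no escaping.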

To read lines from a file, the fastest thing is probably:

  # Convoluted, but speedy
  x = file.readlines(16000)
  while x:
      for line in x:
          blah_blah_blah(line)
      x = file.readlines(16000)

But it is usually plenty fast enough to do:

  # Quick and obvious
  for line in file.xreadlines():
      blah_blah_blah(line)

Or:

  # Quick and even more obvious, but new in Python 2.2
  for line in file:
      blah_blah_blah(line)
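
Since the goal is to change the files, the line loop usually ends up writing somewhere too. The following is a rough Python 3 sketch of one way to do that, not the only one: it streams lines into a temporary file and then swaps it in. The name rewrite_file and the transform argument are hypothetical, and it assumes the file is text in the default encoding.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Apply transform(line) to every line of path, then replace the file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # os.replace() does not have to cross filesystems.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:  # the file object yields one line at a time
            dst.write(transform(line))
    os.replace(dst.name, path)
```

A call such as rewrite_file("settings.txt", lambda line: line.replace("old", "new")) would then rewrite one file without ever holding all of it in memory.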

## Jason Orendorff    http://www.jorendorff.com/



Sat, 19 Jun 2004 02:37:32 GMT  

Quote:
> How can I do this most efficiently:

>     I have filenames and parameters in a sparse matrix that is a
> dictionary:

I don't remember who posted this snippet on c.l.py a long time ago, but it
works like a charm, so I haven't bothered changing it.

Actually it ought to be a standard method on the string object, or at least
available in the standard distribution somehow, as it gets asked about
regularly on the group.

But here it is again. Using this, it should be easy to just iterate over the
files you want changed and then use the class on each of them.

regards Max M

###################################################

import re, string

class MultiReplace:

    def __init__(self, repl_dict):
        # "compile" replacement dictionary

        # assume char to char mapping
        charmap = map(chr, range(256))
        for k, v in repl_dict.items():
            if len(k) != 1 or len(v) != 1:
                self.charmap = None
                break
            charmap[ord(k)] = v
        else:
            self.charmap = string.join(charmap, "")
            return

        # string to string mapping; use a regular expression
        keys = repl_dict.keys()
        keys.sort() # lexical order
        pattern = string.join(map(re.escape, keys), "|")
        self.pattern = re.compile(pattern)
        self.dict = repl_dict

    def replace(self, str):
        # apply replacement dictionary to string
        if self.charmap:
            return string.translate(str, self.charmap)
        def repl(match, get=self.dict.get):
            item = match.group(0)
            return get(item, item)
        return self.pattern.sub(repl, str)

if __name__ == '__main__':

    r = MultiReplace({"spam": "eggs", "spam": "eggs"})
    print r.replace("spam&eggs")
    ## eggs&spam

    r = MultiReplace({"a": "b", "b": "a"})
    print r.replace("keaba")
    ## kebab
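
The class above relies on the string module functions and the print statement of its day. For readers on Python 3, here is a sketch of the same idea, not a drop-in from any library. One deliberate difference: it tries the longest keys first instead of sorting lexically, so that a key like "spamalot" is not shadowed by "spam".

```python
import re


class MultiReplace:
    def __init__(self, repl_dict):
        self.dict = dict(repl_dict)
        if all(len(k) == 1 and len(v) == 1 for k, v in self.dict.items()):
            # Pure char-to-char mapping: a translation table is enough.
            self.table = str.maketrans(self.dict)
            self.pattern = None
        else:
            self.table = None
            # Longest keys first, so longer keys win over their prefixes.
            keys = sorted(self.dict, key=len, reverse=True)
            self.pattern = re.compile("|".join(map(re.escape, keys)))

    def replace(self, text):
        if self.table is not None:
            return text.translate(self.table)
        # One pass over the text; each match is looked up in the dictionary.
        return self.pattern.sub(lambda m: self.dict[m.group(0)], text)


swap = MultiReplace({"spam": "eggs", "eggs": "spam"})
print(swap.replace("spam&eggs"))   # eggs&spam

chars = MultiReplace({"a": "b", "b": "a"})
print(chars.replace("keaba"))      # kebab
```

Because everything happens in a single pass, swapping two words works; chaining two str.replace() calls would turn both into the same word.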



Sat, 19 Jun 2004 19:34:16 GMT  
Quote:
> I don't remember who posted this snippet on c.l.py a long time ago, but it
> works like a charm, so I haven't bothered changing it.
...
> if __name__ == '__main__':

>     r = MultiReplace({"spam": "eggs", "spam": "eggs"})
>     print r.replace("spam&eggs")
>     ## eggs&spam

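This actually outputs eggs&eggs: the dictionary literal repeats the "spam"
key, so it only holds one entry and "eggs" is never replaced. To get the
desired output of eggs&spam, the dictionary needs to be changed to:
{"spam": "eggs", "eggs": "spam"}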
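
A quick way to see this for yourself, written for Python 3. The helper multi_replace is only a stand-in I wrote for the regex branch of MultiReplace.replace, not the class itself.

```python
import re

broken = {"spam": "eggs", "spam": "eggs"}   # duplicate key: one entry survives
fixed = {"spam": "eggs", "eggs": "spam"}


def multi_replace(text, mapping):
    """Single-pass replacement of every key in mapping by its value."""
    pattern = re.compile("|".join(map(re.escape, mapping)))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


print(len(broken), len(fixed))
print(multi_replace("spam&eggs", broken))   # eggs&eggs
print(multi_replace("spam&eggs", fixed))    # eggs&spam
```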


Sun, 20 Jun 2004 04:10:52 GMT  

Well to do a followup to my own post ;-) I sort of needed the search and
replace functionality myself and so wrote a small script.

here goes...

###########################################################
# recursively replaces ALL occurrences of a string in files
# and optionally in the filenames too.

import re, string

class MultiReplace:

    def __init__(self, repl_dict):
        # "compile" replacement dictionary

        # assume char to char mapping
        charmap = map(chr, range(256))
        for k, v in repl_dict.items():
            if len(k) != 1 or len(v) != 1:
                self.charmap = None
                break
            charmap[ord(k)] = v
        else:
            self.charmap = string.join(charmap, "")
            return

        # string to string mapping; use a regular expression
        keys = repl_dict.keys()
        keys.sort() # lexical order
        pattern = string.join(map(re.escape, keys), "|")
        self.pattern = re.compile(pattern)
        self.dict = repl_dict

    def replace(self, str):
        # apply replacement dictionary to string
        if self.charmap:
            return string.translate(str, self.charmap)
        def repl(match, get=self.dict.get):
            item = match.group(0)
            return get(item, item)
        return self.pattern.sub(repl, str)

from os.path import walk, isdir, join, split
from os import rename

def getAllFiles(startDir):
    def visit(result, dirname, names):
        for file in names:
            result.append(join(dirname, file))
    result = []
    walk(startDir, visit, result)
    return result

def replace(startDir, mrDict, changeFileNames=0):
    # replaces ALL occurrences of a string in both files and filenames
    mr = MultiReplace(mrDict)
    # search and replace in file content
    files = getAllFiles(startDir)
    for file in files:
        # Replace in files
        if not isdir(file):
            # 'r+' opens an existing file for both reading and writing
            f = open(file, 'r+')
            content = f.read()
            newContent = mr.replace(content)
            f.seek(0)
            f.write(newContent)
            f.truncate()
            f.close()
        #
        if changeFileNames == 1:
            # Rename filenames
            head, tail = split(file)
            newTail = mr.replace(tail)
            rename(file, join(head, newTail))

if __name__ == '__main__':
    startDir = 'C:/zope/zope243/lib/python/Products/ots_User/'
    replace(startDir, {'ots_emne_003':'ots_User'}, changeFileNames=1)
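
One thing to watch in the script above: the file list is built top-down, so a directory can be renamed before the files inside it are processed, and their stored paths then no longer exist. Below is a sketch of the same job for Python 3 that walks bottom-up to avoid this. The name replace_in_tree is mine, and it assumes every file under the start directory is text in the default encoding.

```python
import os
import re


def replace_in_tree(start_dir, mapping, change_file_names=False):
    """Replace each key of mapping by its value in all files under start_dir."""
    if not mapping:
        return
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, keys)))

    def substitute(text):
        return pattern.sub(lambda m: mapping[m.group(0)], text)

    # topdown=False yields the deepest directories first, so renaming a
    # directory never invalidates a path that is still waiting its turn.
    for dirpath, dirnames, filenames in os.walk(start_dir, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "r") as f:
                content = f.read()
            new_content = substitute(content)
            if new_content != content:
                with open(path, "w") as f:
                    f.write(new_content)
        if change_file_names:
            for name in filenames + dirnames:
                new_name = substitute(name)
                if new_name != name:
                    os.rename(os.path.join(dirpath, name),
                              os.path.join(dirpath, new_name))
```

It would be called the same way as the original, for example replace_in_tree(startDir, {'ots_emne_003': 'ots_User'}, change_file_names=True). Try it on a copy of the tree first, since the files are changed in place.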



Sun, 20 Jun 2004 20:49:52 GMT  
 