sorting question 
Author Message
 sorting question

i would like to iterate through all possible combinations of the characters
in a string, and i am not thinking very clearly about how to approach this.
that is, if the initial string is 'bar', i want to print
['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
any hints?

thanks,
ds



Mon, 28 Jul 2003 08:02:29 GMT  
 sorting question

Quote:
> i would like to iterate through all possible combinations of the
characters
> in a string, and i am not thinking very clearly about how to approach
this.
> that is, if the initial string is 'bar', i want to print
> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> any hints?

> thanks,
> ds

Tim Peters was working on something a while ago, see

    http://www.deja.com/[ST_rn=ps]/getdoc.xp?AN=558836665

but I'm not sure where it eventually went to live

regards
 Steve
--
Tools, training and technology to help you meet your information needs



Mon, 28 Jul 2003 08:36:15 GMT  
 sorting question


Quote:
> i would like to iterate through all possible combinations of the
characters
> in a string, and i am not thinking very clearly about how to approach
this.
> that is, if the initial string is 'bar', i want to print
> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> any hints?

I adapted Sedgewick's permutation function (from Algorithms in C, pp
628-629). My solution (below) first generates a list of all
permutations of integers in the range (1..n), where n is the length of
the string to be permuted. It then translates the original string into
a list of permutations using the list of integers. Because generating
the list of permuted integers is expensive, you could cache the lists
so you wouldn't have to do it for every 3- or 4-character string.

Note that this process gets very slow and the lists get very long for
even short strings... after five characters you can see the hit.

`````````````````````````````````````
def visit(k, val, id, v, perms):
    "permutation by exhaustive search, from sedgewick"
    id += 1
    val[k] = id
    if id == v:
        perms.append(val[:])
    else:
        for i in range(v):
            if val[i] == 0:
                visit(i, val, id, v, perms)
    id -= 1
    val[k] = 0

def permutations(n):
    "generate list of all permutations of the integers 1..n"
    perms = []
    val = [0] * n
    visit(0, val, -1, n, perms)
    return perms

def permutestring(s):
    "generate list of all permutations of a string"
    perms = permutations(len(s))
    result = []
    for p in perms:
        t = ""
        for i in p:
            t += s[i-1]
        result.append(t)
    return result

#print permutations(4)
print permutestring("bar")
print permutestring("abcd")

`````````````````````````````````````

--
Greg Jorgensen
Portland, Oregon, USA

Sent via Deja.com
http://www.deja.com/



Mon, 28 Jul 2003 17:39:56 GMT  
 sorting question

Quote:

> i would like to iterate through all possible combinations of the characters
> in a string, and i am not thinking very clearly about how to approach this.
> that is, if the initial string is 'bar', i want to print
> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> any hints?

The following works for me:

def permute(choices, fixed=''):
    bag = []                  
    numchoices = len(choices)
    for pos in range(numchoices):
        start = choices[pos]
        rest = choices[:pos] + choices[pos+1:]
        bag.extend(permute(rest, fixed + start))
        if numchoices == 1: bag.append(fixed + start + rest)
    return bag

print permute('bar')
print permute('1234')

It produces the following output:

['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
['1234', '1243', '1324', '1342', '1423', '1432', '2134', '2143', '2314', '2341 '2413', '2431', '3124', '3142', '3214', '3241', '3412', '3421', '4123', '4132 '4213', '4231', '4312', '4321']

With longer strings, you probably don't want to store the results in memory,
so you can use a visitor function instead of a list:

def permute2(choices, visitfunc, fixed=''):
    numchoices = len(choices)              
    for pos in range(numchoices):
        start = choices[pos]
        rest = choices[:pos] + choices[pos+1:]
        permute2(rest, visitfunc, fixed + start)
        if numchoices == 1: visitfunc(fixed + start + rest)

permutations = []
permute2('1234567', permutations.append)  # pretty silly example of a visitor
print permutations

If your strings have repeated characters and you want only unique combinations:

permutations = {}
def visitor(v):
    permutations[v] = None
permute2('ababcbed', visitor)
print permutations.keys()

Hope that helps,

-Steve

--
Steve Purcell, Pythangelist
Get testing at http://pyunit.sourceforge.net/
Available for consulting and training.
"Even snakes are afraid of snakes." -- Steven Wright



Mon, 28 Jul 2003 19:20:45 GMT  
 sorting question

Quote:
> i would like to iterate through all possible combinations of the
characters
> in a string, and i am not thinking very clearly about how to approach
this.
> that is, if the initial string is 'bar', i want to print
> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> any hints?

Perhaps not very fast, but, IMHO, pretty cool:

class Perm:
    def __init__(self, seq):
        self._seq = list(seq)
        self._len = len(self._seq)
    def __getitem__(self, x):
        index = x
        indices = [-1]*(self._len-1)
        for n in range(self._len,1,-1):
            index,indices[self._len-n] = divmod(index,n)
        if index: raise IndexError,x
        result = self._seq[:]
        for i in range(len(indices)):
            index = i+indices[i]
            result[i],result[index] = result[index],result[i]
        return result

and now

for p in Perm('bar'):
    print ''.join(p)

Quote:
>>> for p in Perm('bar'):

...     print ''.join(p)
...
bar
abr
rab
bra
arb
rba

the idea is to associate to each number between 0 and
one less than the factorial of N (where N is the length
of the sequence being permuted) a specific permutation,
which is a list of N integers -- the first in
range(0,N), the second in range(0,N-1), ... the last
always 0 (so, no need to record or compute it).  As
usual for enumeration tasks, it looks a bit like a
problem of 'counting in some base', except that, here,
it's several bases -- N for the first digit, N-1 for
the second one, and so on.  Unfortunately I don't think
there is a close-form way to get the list of N integers
from the number it represents, whence the loop with
the divmod in it.

The second half of the __getitem__ method then gets
from the list of numbers to an actual permutation, by
a way that will remind one of 'Knuth shuffling' if one
has ever programmed the latter:-).  It can be done
in other good ways, though.

Alex



Mon, 28 Jul 2003 21:11:07 GMT  
 sorting question

  A little recursive solution to the problem.

  The idea here is firstly consider the base case of a string of length
one.  Easy, it's just that string.  

  Secondly, for a string of length n, where n>1 choose each character in
turn for the start of the result and use the characters left over to
construct the tail.

  The tail is all the combinations of the remaining letters.  Which is
the same problem again except the string is now of length n-1.  So
recurse until we reach the base case.

  Not sure if this is faster than Alex's code, but it's a few line less.

  Regards...

Alistair

-----------------------------------

text = 'bar'

def f(s):
  if len(s) == 1:
    return s[0]
  r = []
  for i in range(len(s)):
    for rest in f(s[:i] + s[i+1:]):
      r.append(s[i] + rest)
  return r

Quote:
>>> f(text)

['bar', 'bra', 'abr', 'arb', 'rba', 'rab']

-----------------------------------

Quote:



> > i would like to iterate through all possible combinations of the
> characters
> > in a string, and i am not thinking very clearly about how to approach
> this.
> > that is, if the initial string is 'bar', i want to print
> > ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> > any hints?

> Perhaps not very fast, but, IMHO, pretty cool:

> class Perm:
>     def __init__(self, seq):
>         self._seq = list(seq)
>         self._len = len(self._seq)
>     def __getitem__(self, x):
>         index = x
>         indices = [-1]*(self._len-1)
>         for n in range(self._len,1,-1):
>             index,indices[self._len-n] = divmod(index,n)
>         if index: raise IndexError,x
>         result = self._seq[:]
>         for i in range(len(indices)):
>             index = i+indices[i]
>             result[i],result[index] = result[index],result[i]
>         return result

> and now

> for p in Perm('bar'):
>     print ''.join(p)

> >>> for p in Perm('bar'):
> ...     print ''.join(p)
> ...
> bar
> abr
> rab
> bra
> arb
> rba

> the idea is to associate to each number between 0 and
> one less than the factorial of N (where N is the length
> of the sequence being permuted) a specific permutation,
> which is a list of N integers -- the first in
> range(0,N), the second in range(0,N-1), ... the last
> always 0 (so, no need to record or compute it).  As
> usual for enumeration tasks, it looks a bit like a
> problem of 'counting in some base', except that, here,
> it's several bases -- N for the first digit, N-1 for
> the second one, and so on.  Unfortunately I don't think
> there is a close-form way to get the list of N integers
> from the number it represents, whence the loop with
> the divmod in it.

> The second half of the __getitem__ method then gets
> from the list of numbers to an actual permutation, by
> a way that will remind one of 'Knuth shuffling' if one
> has ever programmed the latter:-).  It can be done
> in other good ways, though.

> Alex

> --
> http://mail.python.org/mailman/listinfo/python-list

--
-----------------------------------
Business Collaborator Team
Enviros Software Solutions
http://www.businesscollaborator.com
-----------------------------------


Mon, 28 Jul 2003 23:17:16 GMT  
 sorting question
Thank you all for the suggestions.

i am trying to unscramble scrambled words (which i am frustratingly bad at)
so i send all of the permutations to Word's spell checker and see what it
thinks are words.  I'll post the code this afternoon when i'm done in case
anyone wants it.

doug


Quote:
> i would like to iterate through all possible combinations of the
characters
> in a string, and i am not thinking very clearly about how to approach
this.
> that is, if the initial string is 'bar', i want to print
> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> any hints?

> thanks,
> ds



Tue, 29 Jul 2003 00:30:04 GMT  
 sorting question
Depending upon your constraints, another way to approach this problem may be
better.  If you can get your word (or Word) dictionary, then you can build a
compiled form that collapses your search.  The compiled form of a word takes
the letters of a word and sorts them.   Use the compiled form as a
dictionary key to the original word (or words).  For example:
'bar' and 'bra' both compile to 'abr'.  Thus

words = {'arb': ('bar', bra')}

When you want to check a word, compile it by sorting its letters and test
its key against your words dictionary.

    a=list('bar')
    a.sort()
    compiled_word = string.join(a, '')
    if words.has_key(compiled_word):
        print words[compiled_word]

This would be blazingly fast, but would depend upon compiling your word
list.  There are lots of pretty complete text lists of English words
available on the web.

Andrew Henshaw


Quote:
> Thank you all for the suggestions.

> i am trying to unscramble scrambled words (which i am frustratingly bad
at)
> so i send all of the permutations to Word's spell checker and see what it
> thinks are words.  I'll post the code this afternoon when i'm done in case
> anyone wants it.

> doug



> > i would like to iterate through all possible combinations of the
> characters
> > in a string, and i am not thinking very clearly about how to approach
> this.
> > that is, if the initial string is 'bar', i want to print
> > ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> > any hints?

> > thanks,
> > ds



Tue, 29 Jul 2003 10:57:31 GMT  
 sorting question


Quote:
> Depending upon your constraints, another way to approach this problem may
be
> better.  If you can get your word (or Word) dictionary, then you can build
a
> compiled form that collapses your search.  The compiled form of a word
takes
> the letters of a word and sorts them.   Use the compiled form as a
> dictionary key to the original word (or words).  For example:
> 'bar' and 'bra' both compile to 'abr'.  Thus
...snip..
> This would be blazingly fast, but would depend upon compiling your word
> list.  There are lots of pretty complete text lists of English words
> available on the web.

...snip...

There is a word list available at
http://www-personal.umich.edu/~jlawler/wordlist.html
using this list and the code at the end of this note, I get the following
output:

Quote:
>>> import Anagram
>>> a = Anagram.Anagram('c:/temp/wordlist.txt')
>>> a.unjumble('mujleb')
['jumble']
>>> a.unjumble('redaps')

['spared', 'spread']

(Interesting. It missed 'parsed', 'drapes', and 'rasped'.  You might look
for a better list!)
On my machine, it took eight seconds to load and compile the list.

AH

#################################
import string

class Anagram:
    def __init__(self, words_file_name):
        self.word_dict = {}
        for word in open(words_file_name).readlines():
            word = string.strip(word)
            if word:
                compiled = self.compile(word)
                if self.word_dict.has_key(compiled):
                    self.word_dict[compiled].append(word)
                else:
                    self.word_dict[compiled] = [word]

    def compile(self, word):
        l = list(word)
        l.sort()
        return string.join(l, '')

    def unjumble(self, word):
        compiled = self.compile(word)
        if self.word_dict.has_key(compiled):
            print self.word_dict[compiled]
        else:
            print '**not found**'



Tue, 29 Jul 2003 11:46:10 GMT  
 sorting question


Quote:
> Thank you all for the suggestions.

> i am trying to unscramble scrambled words (which i am frustratingly bad at)
> so i send all of the permutations to Word's spell checker and see what it
> thinks are words.  I'll post the code this afternoon when i'm done in case
> anyone wants it.

Hmm, sort of sounds similar to what I did the year before last.

The local 'Newspaper' (I give it more credit than it is due :-), the Adelaide
Advertiser, has a 9-letter word puzzle, where 9 letters are given, and the aim
is to make as many words as possible that are 4 letters or more, and contain
the magic 'middle' letter.

For instance:

C   A   R
D   A   B
R   O   O

would give many words, with cardboard as the 9-letter word.

My school had a competition where each homegroup had to try and find as many
words, and the results were tallied over the whole term.

Naturally, I wrote a python program, and took all of the fun out of it.

If anyone is interested, you might find it at:

http://www.chariot.net.au/~jaq/matt/NineLetter.zip

(It's almost 2Mb, and includes a large, almost[?] comprehensive word list.)
--
Matthew Schinckel
matt at null dot net



Tue, 29 Jul 2003 16:34:11 GMT  
 
 [ 10 post ] 

 Relevant Pages 

1. Listbox sorting question

2. Another sorting question

3. CW2003 Browse Sort Question

4. newbie locale/ascii sort question

5. Sort Question

6. sort question

7. Sort Question

8. Homework: Sorting Question

9. Sort question

10. a sorting question...

11. list sort question

12. List sorting question

 

 
Powered by phpBB® Forum Software