Permutations 
Author Message
 Permutations

Here's a puzzle for anyone interested. I've been messing with this off & on
over the weekend based on a posting in a different ng. The data is a table
of players and scores, each line/record containing the player's name and
several scores. The problem is to count how many times each player would win
out of all possible permutations of one score per player. If there were n
players with s scores each, there'd be s^n 'contests' based on all
permutations of scores.

Figuring out clear wins (no ties) is fairly simple. Given data looking like

A 24 30 85 16 59 20
B 18 49 16 37 50 92
C 90 23 64 58 34 11
D 57 86 5 11 64 2
E 11 49 26 68 90 76
F 91 93 38 90 78 33
G 100 17 71 51 94 72
H 21 56 71 8 52 20
I 64 45 4 47 8 4
J 40 69 54 69 28 47

the following script gives the number of clear wins.

{
  N = NR
  player[NR] = $1
  $1 = ""; $0 = $0 # delete player name, reparse fields, leaving only scores
  if (S < NF) S = NF # but no error checking below
  for (i = 1; i <= NF; ++i) score[NR, i] = $i

Quote:
}

END {
  for (i = 1; i <= N; ++i) { # for each player
    wins = 0
    for (j = 1; j <= S; ++j) {
      p = 1
      for (k = 1; k <= N; ++k) {
        if (k != i) {
          t = 0
          for (m = 1; m <= S; ++m) t += (score[i, j] > score[k, m])
          p *= t
        }
      }
      wins += p
    }
    print player[i], wins
  }

Quote:
}

The problem is handling ties. If there are m out of n players with the same
high score in any 'contest', then each of those players should earn 1/m
wins. Once ties are handled like this, the sum of all wins by all players
should equal the total number of contests/permutations, s^n. The goal is to
do so without using undue storage or processing time (10 players with 6
scores each gives 60,466,176 contests, so brute force is highly undesirable;
with 20 players and 10 scores each, brute force on a single CPU/FPU would
take a bit longer than the average human lifetime).

Anyone else want to play with this?



Sat, 18 Dec 2004 02:56:33 GMT  
 Permutations
Hallo,

Quote:

> out of all possible permutations of one score per player. If there were n

I'm afraid the word permutation means changing order and is not suitable here.

Def.: by a "contest" we mean selecting one value in each row.
(That gives s^n different contests.)
The row/player with a maximum value wins a point.
(1/ii points if the maximum is reached ii times.)

This should be summed over all such "contests".

Using your code to read the block in, it's possible to modify the END block
this way:

END {
  for (i = 1; i <= N; ++i) { # for each player
    wins = 0
    for (j = 1; j <= S; ++j) {
      split("",r)
      r_n = r[1] = 1
      r[0] = 0   # prevent accessing uninitialised element
      for (k = 1; k <= N; ++k) {
        if (k != i) {
          eq = t = 0
          for (m = 1; m <= S; ++m)
            if (score[i, j] > score[k, m])
              t++
            else if (score[i, j] == score[k, m])  ## spatne cislo radky!!!
              eq++
          if (eq > 0) {
            r_n++
            r[r_n] = 0   # prevent accessing uninitialised element
          }
          for (ii = r_n; ii >= 1; ii--)
            r[ii] = t * r[ii] + eq * r[ii-1]
        }
      }
      for (ii = 1; ii <= r_n; ii++)
        wins += r[ii] / ii
    }
    print player[i], wins
  }

Quote:
}

The complexity of this algorithm seems to be O(n^2*s*(s+x)),
where x stands for maximum number of winners in one contest.

I think there is a better way to implement this.
I hope to post the code tomorrow.

Thank you very much, Harlan, for this nice problem,
                Stepan Kasal



Sat, 18 Dec 2004 21:52:38 GMT  
 Permutations


Quote:
> Here's a puzzle for anyone interested. I've been messing with this off &
on
> over the weekend based on a posting in a different ng. The data is a table
> of players and scores, each line/record containing the player's name and
> several scores. The problem is to count how many times each player would
win
> out of all possible permutations of one score per player. If there were n
> players with s scores each, there'd be s^n 'contests' based on all
> permutations of scores.

> Figuring out clear wins (no ties) is fairly simple. Given data looking
like

> A 24 30 85 16 59 20
> B 18 49 16 37 50 92
> C 90 23 64 58 34 11
> D 57 86 5 11 64 2
> E 11 49 26 68 90 76
> F 91 93 38 90 78 33
> G 100 17 71 51 94 72
> H 21 56 71 8 52 20
> I 64 45 4 47 8 4
> J 40 69 54 69 28 47

> the following script gives the number of clear wins.

> {
>   N = NR
>   player[NR] = $1
>   $1 = ""; $0 = $0 # delete player name, reparse fields, leaving only
scores
>   if (S < NF) S = NF # but no error checking below
>   for (i = 1; i <= NF; ++i) score[NR, i] = $i
> }
> END {
>   for (i = 1; i <= N; ++i) { # for each player
>     wins = 0
>     for (j = 1; j <= S; ++j) {
>       p = 1
>       for (k = 1; k <= N; ++k) {
>         if (k != i) {
>           t = 0
>           for (m = 1; m <= S; ++m) t += (score[i, j] > score[k, m])
>           p *= t
>         }
>       }
>       wins += p
>     }
>     print player[i], wins
>   }
> }

> The problem is handling ties. If there are m out of n players with the
same
> high score in any 'contest', then each of those players should earn 1/m
> wins. Once ties are handled like this, the sum of all wins by all players
> should equal the total number of contests/permutations, s^n. The goal is
to
> do so without using undue storage or processing time (10 players with 6
> scores each gives 60,466,176 contests, so brute force is highly
undesirable;
> with 20 players and 10 scores each, brute force on a single CPU/FPU would
> take a bit longer than the average human lifetime).

> Anyone else want to play with this?

Harlan:

I think you've turned this into a computational problem when it doesn't have
to be.  The example you give can be figured out "by hand" in a few minutes.
To determine the wins for player A, for each possible A score, count how
many wins he would have against player B, player C, etc.  Multiply these
numbers, and sum over all player A scores.

For example:

A    25    97    55
B    62    18    85
C    80    47    12

To determine A wins:    for 25 -- 1 in B & 1 in C  = 1
                                  for 97 -- 3 in B & 3 in C = 9
                                   for 55 -- 1 in B and 2 in C = 2
        total for A: 12 wins out of 27 possible permutations

The "tie" calculations are not much more difficult.

TK



Sat, 18 Dec 2004 23:41:58 GMT  
 Permutations

Quote:


...
>> the following script gives the number of clear wins.

>> {
>>   N = NR
>>   player[NR] = $1
>>   $1 = ""; $0 = $0
>>   if (S < NF) S = NF # but no error checking below
>>   for (i = 1; i <= NF; ++i) score[NR, i] = $i
>> }
>> END {
>>   for (i = 1; i <= N; ++i) { # for each player
>>     wins = 0
>>     for (j = 1; j <= S; ++j) {
>>       p = 1
>>       for (k = 1; k <= N; ++k) {
>>         if (k != i) {
>>           t = 0
>>           for (m = 1; m <= S; ++m) t += (score[i, j] > score[k, m])
>>           p *= t
>>         }
>>       }
>>       wins += p
>>     }
>>     print player[i], wins
>>   }
>> }
...
>I think you've turned this into a computational problem when it doesn't
have
>to be.  The example you give can be figured out "by hand" in a few minutes.
>To determine the wins for player A, for each possible A score, count how
>many wins he would have against player B, player C, etc.  Multiply these
>numbers, and sum over all player A scores.
...
>The "tie" calculations are not much more difficult.

I understand how simple it is to find what I call 'clear wins' - only one
player with the highest score. My script does what you say to do
(implementing it does take 4 for loops), and that was my response in the
other unspecified ng in which this problem arose (it was
comp.apps.spreadsheets if you want to go look).

Ties are NOT as simple to handle. If you believe they are, share your
implementation of the simple algorithm needed. This MUST be implemented for
a computer to do.

Note that if there are 4 players with the same score, say B, C, E, G, then
you need to count ties for each of the combinations of tying players B & C,
B & E, B & G, C & E, C & G, E & G, B & C & E, B & C & G, B & E & G, C & E &
G, B & C & E & G. Consider

A    1    9
B    2    10
C    3    10
D    4    11
E    5    10
F    6    12
G    7    10
H    8    13

The ties are (A to H)

1, 10, 10, 4, 5, 6, 7, 8
1, 10, 3, 4, 10, 6, 7, 8
1, 10, 3, 4, 5, 6, 10, 8
1, 2, 10, 4, 10, 6, 7, 8
1, 2, 10, 4, 5, 6, 10, 8
1, 2, 3, 4, 10, 6, 10, 8
1, 10, 10, 4, 10, 6, 7, 8
1, 10, 10, 4, 5, 6, 10, 8
1, 10, 3, 4, 10, 6, 10, 8
1, 2, 10, 4, 10, 6, 10, 8
1, 10, 10, 4, 10, 6, 10, 8

Handling the permutations of m players having the same score is not as
trivial as you may think. It gets harder to implement if one or more players
have the multiple instances of a tying score.

Granted this isn't terribly difficult for a human to do with paper & pencil,
but it has to be given a computer implementation. This may be as pointless
as programming a robot to clean a room and struggling over where the robot
should start. Continue only if it interests you.



Sun, 19 Dec 2004 00:43:02 GMT  
 Permutations
...
Quote:
>Handling the permutations of m players having the same score

...

Improper terminology! Instead of 'permutations' I should have used
'combinations'.



Sun, 19 Dec 2004 00:46:13 GMT  
 Permutations

Quote:
> Hallo,


> > out of all possible permutations of one score per player. If there were
n

> I'm afraid the word permutation means changing order and is not suitable
here.

> Def.: by a "contest" we mean selecting one value in each row.
> (That gives s^n different contests.)
> The row/player with a maximum value wins a point.
> (1/ii points if the maximum is reached ii times.)

> This should be summed over all such "contests".

> Using your code to read the block in, it's possible to modify the END
block
> this way:

> END {
>   for (i = 1; i <= N; ++i) { # for each player
>     wins = 0
>     for (j = 1; j <= S; ++j) {
>       split("",r)
>       r_n = r[1] = 1
>       r[0] = 0   # prevent accessing uninitialised element
>       for (k = 1; k <= N; ++k) {
>         if (k != i) {
>           eq = t = 0
>           for (m = 1; m <= S; ++m)
>             if (score[i, j] > score[k, m])
>               t++
>             else if (score[i, j] == score[k, m])  ## spatne cislo radky!!!
>               eq++
>           if (eq > 0) {
>             r_n++
>             r[r_n] = 0   # prevent accessing uninitialised element
>           }
>           for (ii = r_n; ii >= 1; ii--)
>             r[ii] = t * r[ii] + eq * r[ii-1]
>         }
>       }
>       for (ii = 1; ii <= r_n; ii++)
>         wins += r[ii] / ii
>     }
>     print player[i], wins
>   }
> }

> The complexity of this algorithm seems to be O(n^2*s*(s+x)),
> where x stands for maximum number of winners in one contest.

> I think there is a better way to implement this.
> I hope to post the code tomorrow.

> Thank you very much, Harlan, for this nice problem,
> Stepan Kasal

Did you test this on Harlan's data?  What scores did you get?

I think the complexity is always going to be somewhere around O( (n*s)^2 )
    [ n^2*s*(s+x) = n^2 * (s^2 + s*x) ~ n^2 * s^2 = (n*s)^2 ]

I think have a solution that is more efficient (but the same complexity) by
virtue of  pre-sorting players' scores and doing rank multiplication instead
of counting; this can be used to speed calculation  by reducing s2, and
introduce 0 <= c < 1 in the expansion O( (n*s) * c*(n * s2) ).  ...but still
the same order of complexity.  I'm not sure, I may be able to remove n by
repeating an array creation for ties vs. clear wins, instead of searching an
existing array (n*s2) for ties.  (BTW, as for big-oh, little-oh, omega, and
on-the-order-of, I am not so knowledgeble as are those that might criticize
my characterization of complexity.)

I'll see what I can do.  Right now, I have just isolated the scoring subsets
for the data, which avoids a lot of work calculating wins for numbers that
can't possibly win.  I think the way I calculated the subsets, I can avoid
the work of counting the wins, and just do rank products.  But with the
holiday coming up, I may not be able to post, but maybe you can see where
I'd be headed.

Good luck (and post the scores, so I know what I'm after!)

    - Dan



Sun, 19 Dec 2004 02:16:18 GMT  
 Permutations


Quote:


> ...
> >> the following script gives the number of clear wins.

> >> {
> >>   N = NR
> >>   player[NR] = $1
> >>   $1 = ""; $0 = $0
> >>   if (S < NF) S = NF # but no error checking below
> >>   for (i = 1; i <= NF; ++i) score[NR, i] = $i
> >> }
> >> END {
> >>   for (i = 1; i <= N; ++i) { # for each player
> >>     wins = 0
> >>     for (j = 1; j <= S; ++j) {
> >>       p = 1
> >>       for (k = 1; k <= N; ++k) {
> >>         if (k != i) {
> >>           t = 0
> >>           for (m = 1; m <= S; ++m) t += (score[i, j] > score[k, m])
> >>           p *= t
> >>         }
> >>       }
> >>       wins += p
> >>     }
> >>     print player[i], wins
> >>   }
> >> }
> ...
> >I think you've turned this into a computational problem when it doesn't
> have
> >to be.  The example you give can be figured out "by hand" in a few
minutes.
> >To determine the wins for player A, for each possible A score, count how
> >many wins he would have against player B, player C, etc.  Multiply these
> >numbers, and sum over all player A scores.
> ...
> >The "tie" calculations are not much more difficult.

> I understand how simple it is to find what I call 'clear wins' - only one
> player with the highest score. My script does what you say to do
> (implementing it does take 4 for loops), and that was my response in the
> other unspecified ng in which this problem arose (it was
> comp.apps.spreadsheets if you want to go look).

> Ties are NOT as simple to handle. If you believe they are, share your
> implementation of the simple algorithm needed. This MUST be implemented
for
> a computer to do.

> Note that if there are 4 players with the same score, say B, C, E, G, then
> you need to count ties for each of the combinations of tying players B &
C,
> B & E, B & G, C & E, C & G, E & G, B & C & E, B & C & G, B & E & G, C & E
&
> G, B & C & E & G. Consider

> A    1    9
> B    2    10
> C    3    10
> D    4    11
> E    5    10
> F    6    12
> G    7    10
> H    8    13

> The ties are (A to H)

> 1, 10, 10, 4, 5, 6, 7, 8
> 1, 10, 3, 4, 10, 6, 7, 8
> 1, 10, 3, 4, 5, 6, 10, 8
> 1, 2, 10, 4, 10, 6, 7, 8
> 1, 2, 10, 4, 5, 6, 10, 8
> 1, 2, 3, 4, 10, 6, 10, 8
> 1, 10, 10, 4, 10, 6, 7, 8
> 1, 10, 10, 4, 5, 6, 10, 8
> 1, 10, 3, 4, 10, 6, 10, 8
> 1, 2, 10, 4, 10, 6, 10, 8
> 1, 10, 10, 4, 10, 6, 10, 8

> Handling the permutations of m players having the same score is not as
> trivial as you may think. It gets harder to implement if one or more
players
> have the multiple instances of a tying score.

> Granted this isn't terribly difficult for a human to do with paper &
pencil,
> but it has to be given a computer implementation. This may be as pointless
> as programming a robot to clean a room and struggling over where the robot
> should start. Continue only if it interests you.

Maybe the implementation is more difficult than I think, but conceptually
the basic idea for counting the ties and the partial points is already here.
Consider your example.  I've added a third column, just to consider the
cases when a player has more that one tie possible -- e.g., C has two scores
of 10.

Consider calculating the points for A for his "10" score.  The X column is
the number of clear wins, the Y column is the number of ties

                                 X          Y
A    1    9      10        --            --
B    2    10    15        1            1
C    3    10    10        1            2
D    4    11    12        1            0
E    5    10    10        1            2
F    6    12    14        1            0
G    7    10    8         2            1
H    8    13    10        1            1

For full points, as stated before, mulitply down column (array) X.  For half
points, go to column Y and multiply the Y entry (non-zero, but it really
doesn't matter) times the X entries not in the same row.  For example, with
the two C ties the calculation would be 2* (1*1*1*1*2*1).  Sum this for all
non-zero values in column Y and that gives the total half points.  For
one-third points, the idea is similar, but you take the Y elements 2 at a
time, then quarter points take the Y elements 3 at a time, up to (here)
one-sixth points, of which there should be four.

Wouldn't this give the correct answer?

To your last comment, no, I don't really want to write the code.

TK



Sun, 19 Dec 2004 02:38:16 GMT  
 Permutations

Quote:





> > ...
> > >> the following script gives the number of clear wins.

> > >> {
> > >>   N = NR
> > >>   player[NR] = $1
> > >>   $1 = ""; $0 = $0
> > >>   if (S < NF) S = NF # but no error checking below
> > >>   for (i = 1; i <= NF; ++i) score[NR, i] = $i
> > >> }
> > >> END {
> > >>   for (i = 1; i <= N; ++i) { # for each player
> > >>     wins = 0
> > >>     for (j = 1; j <= S; ++j) {
> > >>       p = 1
> > >>       for (k = 1; k <= N; ++k) {
> > >>         if (k != i) {
> > >>           t = 0
> > >>           for (m = 1; m <= S; ++m) t += (score[i, j] > score[k, m])
> > >>           p *= t
> > >>         }
> > >>       }
> > >>       wins += p
> > >>     }
> > >>     print player[i], wins
> > >>   }
> > >> }
> > ...
> > >I think you've turned this into a computational problem when it doesn't
> > have
> > >to be.  The example you give can be figured out "by hand" in a few
> minutes.
> > >To determine the wins for player A, for each possible A score, count
how
> > >many wins he would have against player B, player C, etc.  Multiply
these
> > >numbers, and sum over all player A scores.
> > ...
> > >The "tie" calculations are not much more difficult.

> > I understand how simple it is to find what I call 'clear wins' - only
one
> > player with the highest score. My script does what you say to do
> > (implementing it does take 4 for loops), and that was my response in the
> > other unspecified ng in which this problem arose (it was
> > comp.apps.spreadsheets if you want to go look).

> > Ties are NOT as simple to handle. If you believe they are, share your
> > implementation of the simple algorithm needed. This MUST be implemented
> for
> > a computer to do.

> > Note that if there are 4 players with the same score, say B, C, E, G,
then
> > you need to count ties for each of the combinations of tying players B &
> C,
> > B & E, B & G, C & E, C & G, E & G, B & C & E, B & C & G, B & E & G, C &
E
> &
> > G, B & C & E & G. Consider

> > A    1    9
> > B    2    10
> > C    3    10
> > D    4    11
> > E    5    10
> > F    6    12
> > G    7    10
> > H    8    13

> > The ties are (A to H)

> > 1, 10, 10, 4, 5, 6, 7, 8
> > 1, 10, 3, 4, 10, 6, 7, 8
> > 1, 10, 3, 4, 5, 6, 10, 8
> > 1, 2, 10, 4, 10, 6, 7, 8
> > 1, 2, 10, 4, 5, 6, 10, 8
> > 1, 2, 3, 4, 10, 6, 10, 8
> > 1, 10, 10, 4, 10, 6, 7, 8
> > 1, 10, 10, 4, 5, 6, 10, 8
> > 1, 10, 3, 4, 10, 6, 10, 8
> > 1, 2, 10, 4, 10, 6, 10, 8
> > 1, 10, 10, 4, 10, 6, 10, 8

> > Handling the permutations of m players having the same score is not as
> > trivial as you may think. It gets harder to implement if one or more
> players
> > have the multiple instances of a tying score.

> > Granted this isn't terribly difficult for a human to do with paper &
> pencil,
> > but it has to be given a computer implementation. This may be as
pointless
> > as programming a robot to clean a room and struggling over where the
robot
> > should start. Continue only if it interests you.

> Maybe the implementation is more difficult than I think, but conceptually
> the basic idea for counting the ties and the partial points is already
here.
> Consider your example.  I've added a third column, just to consider the
> cases when a player has more that one tie possible -- e.g., C has two
scores
> of 10.

> Consider calculating the points for A for his "10" score.  The X column is
> the number of clear wins, the Y column is the number of ties

>                                  X          Y
> A    1    9      10        --            --
> B    2    10    15        1            1
> C    3    10    10        1            2
> D    4    11    12        1            0
> E    5    10    10        1            2
> F    6    12    14        1            0
> G    7    10    8         2            1
> H    8    13    10        1            1

> For full points, as stated before, mulitply down column (array) X.  For
half
> points, go to column Y and multiply the Y entry (non-zero, but it really
> doesn't matter) times the X entries not in the same row.  For example,
with
> the two C ties the calculation would be 2* (1*1*1*1*2*1).  Sum this for
all
> non-zero values in column Y and that gives the total half points.  For
> one-third points, the idea is similar, but you take the Y elements 2 at a
> time, then quarter points take the Y elements 3 at a time, up to (here)
> one-sixth points, of which there should be four.

> Wouldn't this give the correct answer?

> To your last comment, no, I don't really want to write the code.

> TK

I'm not sure how multiplication of the rank of winning subsets (counting
wins per row and multiplying) is any less "computational" than scanning
comparative scores and adding 1 for each win found.

Quote:
> Wouldn't this give the correct answer?

I came up with a different derivation, and I think that this is the same
thing.
Let wins be 0, let ties be 1.  Let N = number of players, N-1 is less one
for the player whos number is being judged.

Then, all the combinations of wins and ties are covered by counting in
binary from 0 to an (N-1) bit number, all 1.  The scores can then be
multiplied, and divided by (the number of tie bits + 1)

To keep things brief, I've shortened the space to 4 players, since the
number of  individual addends is 2^(N-1), = 2^(4-1), = 2^3, = 8.

Quote:
>                                  X          Y
> A    1    9      10        --            --
> B    2    10    15        1            1
> C    3    10    10        1            2
> D    4    11    12        1            0

# of tie bits is number of  players winis shared with.

BCD
000    bits = 0    Bx * Cx * Dx / (0+1)
001    bits = 1    Bx * Cx * Dy / (1+1)
010    bits = 1    Bx * Cy * Dx / (1+1)
011    bits = 2    Bx * Cy * Dy / (2+1)
100    bits = 1    By * Cx * Dx / (1+1)
101    bits = 2    By * Cx * Dy / (2+1)
110    bits = 2    By * Cy * Dx / (2+1)
111    bits = 3    By * Cy * Dy / (3+1)
                     -------------------------
Score for A's 10:       sum()

This is a bit brutish, though.

Of course, every time a Qx or Qy is zero, you can zero those lines; any line
containing Dy can be skipped:
000    bits = 0    Bx * Cx * Dx / (0+1)
010    bits = 1    Bx * Cy * Dx / (1+1)
100    bits = 1    By * Cx * Dx / (1+1)
110    bits = 2    By * Cy * Dx / (2+1)

And, "no ties" means no need to do the bit-driven sum at all, it's just Bx *
Cx * Dx clear wins.

The number of rows with ties, T (=2) is the maximum number of bits that will
ever be set, so this reduces the size of the sum set, if you can figure out
a way to generate al of the N-1-bit numbers with T or fewer bits set.

I haven't figured out a way to count bits or generate the set of N-1-bit
numbers with T or fewer bits set, or eliminate 0 rows from the sum set, but
haven't thought about it much, either..

BTW, the "bit" model is only a bit model, to illustrate that the full count
represents all of the n-way computations.  I'm not really trying to do
bit-wise manipultion in awk.

    - Dan



Sun, 19 Dec 2004 11:43:39 GMT  
 Permutations
Hallo,

Quote:

> Consider calculating the points for A for his "10" score.  The X column is
> the number of clear wins, the Y column is the number of ties

>                                  X          Y
> A    1    9      10        --            --
> B    2    10    15        1            1
> C    3    10    10        1            2
> D    4    11    12        1            0
> E    5    10    10        1            2
> F    6    12    14        1            0
> G    7    10    8         2            1
> H    8    13    10        1            1

> For full points, as stated before, mulitply down column (array) X.  For half
> points, go to column Y and multiply the Y entry (non-zero, but it really
> doesn't matter) times the X entries not in the same row.  For example, with
> the two C ties the calculation would be 2* (1*1*1*1*2*1).  Sum this for all
> non-zero values in column Y and that gives the total half points.  For
> one-third points, the idea is similar, but you take the Y elements 2 at a
> time, then quarter points take the Y elements 3 at a time, up to (here)
> one-sixth points, of which there should be four.

> Wouldn't this give the correct answer?

it would but it would be too slow.  For the case above, you have to examine
all possible groups of winners.  Since there are 5 people (besides A) who can
have 10 points, there is 2^5 of such groups.

In general case, you get exponential complexity of the program, which means
that your program would be too slow in cases where ties are frequent.
(This can happen in cases when the range of possible points is small, for
example in cases where the numbers cannot be bigger then a certain fixed
limit independent on the size of the table, ie. on n and s.)

Stepan Kasal



Sun, 19 Dec 2004 15:40:20 GMT  
 Permutations

Quote:

> I think the complexity is always going to be somewhere around O( (n*s)^2 )

No, it can be improved, the trick is to sort *all scores,* not only scores
of one player.  Without ties, you can achieve O(n*s*log(n*s)) this way.
I hope to post the program soon.

Quote:
>     [ n^2*s*(s+x) = n^2 * (s^2 + s*x) ~ n^2 * s^2 = (n*s)^2 ]

x means (maximum) number of winning players in one contest.
It can be ~ n, consider case when the matrix is randomly filled with 0 and 1,
ie. the score has to be 0 or 1---there would be plenty of ties.

So it can be also expressed as O(max(n^3*s, (n*s)^2)), if you prefer,
but it isn't O((n*s)^2).

Quote:
> Good luck (and post the scores, so I know what I'm after!)

OK, so for Harlan's data, I've got this:
A 1716000
B 5617188
C 3252152
D 2165000
E 4398060
F 19737432
G 22273992
H 525600
I 127832
J 652920

The sum is really 6^10, as expected.

Yours,
        Stepan



Sun, 19 Dec 2004 16:04:56 GMT  
 Permutations


Quote:
> Hallo,


> > Consider calculating the points for A for his "10" score.  The X column
is
> > the number of clear wins, the Y column is the number of ties

> >                                  X          Y
> > A    1    9      10        --            --
> > B    2    10    15        1            1
> > C    3    10    10        1            2
> > D    4    11    12        1            0
> > E    5    10    10        1            2
> > F    6    12    14        1            0
> > G    7    10    8         2            1
> > H    8    13    10        1            1

> > For full points, as stated before, mulitply down column (array) X.  For
half
> > points, go to column Y and multiply the Y entry (non-zero, but it really
> > doesn't matter) times the X entries not in the same row.  For example,
with
> > the two C ties the calculation would be 2* (1*1*1*1*2*1).  Sum this for
all
> > non-zero values in column Y and that gives the total half points.  For
> > one-third points, the idea is similar, but you take the Y elements 2 at
a
> > time, then quarter points take the Y elements 3 at a time, up to (here)
> > one-sixth points, of which there should be four.

> > Wouldn't this give the correct answer?

> it would but it would be too slow.  For the case above, you have to
examine
> all possible groups of winners.  Since there are 5 people (besides A) who
can
> have 10 points, there is 2^5 of such groups.

> In general case, you get exponential complexity of the program, which
means
> that your program would be too slow in cases where ties are frequent.
> (This can happen in cases when the range of possible points is small, for
> example in cases where the numbers cannot be bigger then a certain fixed
> limit independent on the size of the table, ie. on n and s.)

> Stepan Kasal

My thanks to Dan and Stepan for their clear responses.  My posting to Harlan
(above) came from a(probable) misunderstanding on my part -- that he didn't
have a method for counting "ties" for this problem.  From the other
postings, I see that my method, while giving the correct numbers, is not
going to be the fastest, by any means.  However, one of Harlan's earlier
postings said:

"...with 20 players and 10 scores each, brute force on a single CPU/FPU
would
take a bit longer than the average human lifetime)."

I don't think my algorithm would have been that slow.

TK



Sun, 19 Dec 2004 21:57:33 GMT  
 Permutations

Quote:
> No, it can be improved, the trick is to sort *all scores,* not only scores

Yes, of course, I see now.  Harlan, how long did it that the "different ng"
to start sorting?

Quote:
> x means (maximum) number of winning players in one contest.

Sorry, I mis-read, I didn't see that x was on the order on n, I misread it
as on the order of s.

    - Dan


Quote:


> > I think the complexity is always going to be somewhere around O(
(n*s)^2 )

> No, it can be improved, the trick is to sort *all scores,* not only scores
> of one player.  Without ties, you can achieve O(n*s*log(n*s)) this way.
> I hope to post the program soon.

> >     [ n^2*s*(s+x) = n^2 * (s^2 + s*x) ~ n^2 * s^2 = (n*s)^2 ]

> x means (maximum) number of winning players in one contest.
> It can be ~ n, consider case when the matrix is randomly filled with 0 and
1,
> ie. the score has to be 0 or 1---there would be plenty of ties.

> So it can be also expressed as O(max(n^3*s, (n*s)^2)), if you prefer,
> but it isn't O((n*s)^2).

> > Good luck (and post the scores, so I know what I'm after!)

> OK, so for Harlan's data, I've got this:
> A 1716000
> B 5617188
> C 3252152
> D 2165000
> E 4398060
> F 19737432
> G 22273992
> H 525600
> I 127832
> J 652920

> The sum is really 6^10, as expected.

> Yours,
> Stepan



Sun, 19 Dec 2004 22:28:53 GMT  
 Permutations
Hallo,

Quote:

> I think there is a better way to implement this.
> I hope to post the code tomorrow.

I'm sorry but till now I have only the non-tie version of the more
effective algorithm.  The complexity is O(n*s*log(n*s)).

The program doesn't require all players to have the same length of their
scores list.
If we denote by  v  the number of all scores (n*s, if all n players
have the same number of possible scores), we can express the complexity
as  O(v*log(v)).

This program reports error if two players may have the same score,
ie. if a tie is possible.  (Thus it won't run on Harlans data.)

The program relies on gawk's asort().  For other awks you need to
implement a sort to run this code.

I hope to continue with this nice excercise on Monday.

BEGIN {
        exiting = 0

Quote:
}

{
        N = NR
        player_name[NR] = $1
        # trivial check for an error

        for (i = 2; i <= NF; ++i) {
                # remember all numbers we have
                val[$i] = $i
                # remember which player has them
                if ($i in player && player[$i] != N) {
                        printf "$%d: %s\n", i, $0
                        exiting = 1
                        exit(2)  # a tie detected!
                }
                player[$i] = N
                # count duplicates:
                occur[$i]++
                if (i == 2 || $i < player_min)
                        player_min = $i
        }
        # find out the minimal possible winning score (max of player minimums)
        if (N == 1 || player_min > max_min)
                max_min = player_min

Quote:
}

END {
        if (exiting) exit

        vals = asort(val)

        for (i = 1; i <= N; i++)
                wins[i] = small[i] = 0

        ind = 0
        do {
                ind++
                v = val[ind]
                small[player[v]] += occur[v]
        } while (v < max_min)

        p = 1
        for (i = 1; i <= N; i++)
                p *= small[i]

        # invariant: p contains number of all `contests' containing only
        # point values <= v

        pl = player[v]
        wins[pl] = p
        # in this particular case, the above line is the same as:
        # wins[pl] += (p / small[pl]) * occur[v]

        for (ind++; ind <= vals; ind++) {
                v = val[ind]
                pl = player[v]
                p /= small[pl]
                wins[pl] += p * occur[v]
                small[pl] += occur[v]
                p *= small[pl]
        }

        for (i = 1; i <= N; i++)
                print player_name[i], wins[i]

Quote:
}

With data

A 24 30 85 16 59 20
B 18 49 17 37 50 92
C 90 23 64 58 34 11
D 57 86 5 12 65 2
E 13 46 26 68 94 76
F 91 93 38 93 78 33
G 100 19 71 51 95 72
H 21 56 70 8 52 21
I 66 45 4 48 9 4
J 40 69 54 69 28 47

you get:

A 1716000
B 3752568
C 2549232
D 2166000
E 8103648
F 18565848
G 22363992
H 435600
I 158424
J 654864

Have a nice weekend (Friday is a holiday here in Prague),

Stepan



Mon, 20 Dec 2004 22:26:12 GMT  
 Permutations

Quote:
(Stepan Kasal) writes:

>> I think the complexity is always going to be somewhere around O( (n*s)^2 )

>No, it can be improved, the trick is to sort *all scores,* not only scores
>of one player.  Without ties, you can achieve O(n*s*log(n*s)) this way.
>I hope to post the program soon.

...

Sorting the scores does nothing useful. You need to handle all permutations of
one score per player. Consider something extremely simple.

A  1  2
B  2  1

The contests are 1/2, 1/1, 2/2, 2/1, so 1 solo win each and two ties each, so
each player scores 1 + 2 * 1/2 = 2, and the total of scores is 4, as required.
How would sorting help here?

Now if you mean sorting individual player's scores, followed by a graphical
form of data structure, starting with

A  1  2  3
B  2  4  6
C  3  6  9

and transformed to something (crudely visualized) like

A_1_2_3
B___2___4_6
C_____3___6_9

then maybe, which would make recognizing high scores and ties easier, but some
iterative mechanism would still be needed to count wins and ties.



Tue, 21 Dec 2004 06:09:35 GMT  
 Permutations
...

Quote:
>OK, so for Harlan's data, I've got this:
>A 1716000
>B 5617188
>C 3252152
>D 2165000
>E 4398060
>F 19737432
>G 22273992
>H 525600
>I 127832
>J 652920

>The sum is really 6^10, as expected.

These are the correct overall scores per player. I did a quasi-brute force
calculation in Excel to get the answers.


Tue, 21 Dec 2004 06:58:48 GMT  
 
 [ 22 post ]  Go to page: [1] [2]

 Relevant Pages 

1. all-permutations is faster than permutation

2. A permutation on permutations

3. Generating permutations

4. Finding permutations

5. All Permutations

6. All permutations

7. Permutation

8. Permutations

9. Linked list to Permutation vector

10. Puzzler: function of permutation vector

11. Fwd: re: Puzzler: Permutations and Anagrams

12. Puzzler: Permutations and Anagrams

 

 
Powered by phpBB® Forum Software