93
[A 2][A 2][A 2][A 2][A 2][A 2][A 2]
[B 8][B 8]
[C 4][C 4][C 4]
[D 4][D 4][D 4]
[E 1][E 1][E 1][E 1][E 1][E 1][E 1][E 1][E 1]
[F 6][F 6]
[G 6][G 6]
[H 3][H 3][H 3][H 3]

[I 2][I 2][I 2][I 2][I 2][I 2]
[J 19]
[K 14]
[L 4][L 4][L 4][L 4]
[M 6][M 6]
[N 3][N 3][N 3][N 3][N 3]
[O 2][O 2][O 2][O 2][O 2][O 2]
[P 4][P 4][P 4]

[Q 17]
[R 2][R 2][R 2][R 2][R 2][R 2]
[S 3][S 3][S 3][S 3][S 3]
[T 2][T 2][T 2][T 2][T 2][T 2][T 2][T 2][T 2]
[U 5][U 5][U 5]
[V 8][V 8]
[W 7][W 7]
[X 14]

[Y 7][Y 7]
[Z 18]
[  -1]

  This file is part of CardWords.
  (C) 1999 Tobias Peters

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

// example.cards

Here is the way I built this file:

I counted the character occurrences in the README.sgml file that I wrote:

$ ./cardwords_count example.charset  < README.sgml
A: 2723
B: 477
C: 1273
D: 1233
E: 3968
F: 666
G: 686
H: 1527
I: 2292
J: 11
K: 144
L: 1439
M: 742
N: 1967
O: 2205
P: 1080
Q: 56
R: 2334
S: 1887
T: 3599
U: 833
V: 495
W: 596
X: 126
Y: 595
Z: 28
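
For readers without the cardwords_count tool, a tally of this kind can be
sketched in Python. This is only an illustration: the function name
count_letters is made up here, and it assumes a case-insensitive count of
the letters A to Z. How cardwords_count actually uses example.charset is
not described in this file.

```python
from collections import Counter
import string


def count_letters(text):
    """Count occurrences of A-Z in text, ignoring case and other characters.

    Assumption: this mirrors the kind of tally shown above; it is not the
    actual cardwords_count implementation.
    """
    tally = Counter(ch for ch in text.upper() if ch in string.ascii_uppercase)
    # Report every letter, including those that never occur.
    return {letter: tally[letter] for letter in string.ascii_uppercase}


# Small demonstration on a sample sentence instead of README.sgml.
sample_counts = count_letters("CardWords is a word game.")
for letter in "ADSZ":
    print(f"{letter}: {sample_counts[letter]}")
```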

I thought that it might not be desirable to have cards as a directly
proportional representation of these occurrences, because:

- If I had only one card showing a `J', I would need over 300 cards
  showing `E's. This would be a little too many cards for a game.
- Characters that occur often will be considered `easy', and characters
  that occur only rarely in the language will be considered `difficult'.
  Words will therefore cross more often at easy characters than at
  difficult ones, so cards with easy characters are shared between more
  words. One needs fewer cards with easy characters and more cards with
  difficult characters.
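
The first point can be checked with a little arithmetic. The following
Python sketch (an illustration only, using the counts listed above) works
out how large a strictly proportional deck would be if the rarest letter,
`J', got exactly one card:

```python
# Occurrence counts for A..Z, copied from the cardwords_count output above.
counts = [2723, 477, 1273, 1233, 3968, 666, 686, 1527, 2292, 11, 144, 1439,
          742, 1967, 2205, 1080, 56, 2334, 1887, 3599, 833, 495, 596, 126,
          595, 28]

rarest = min(counts)  # 11, the count for J

# Cards per letter if the rarest letter is represented by exactly one card.
proportional = [n / rarest for n in counts]

e_cards = max(proportional)      # E is the most frequent letter
deck_size = sum(proportional)    # all letters together

print(f"E cards:   {e_cards:.0f}")
print(f"deck size: {deck_size:.0f}")
```

That comes to about 361 `E' cards and a deck of roughly 3000 cards.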

To do a bit of calculation, I started up Octave:

$ octave
Octave, version 2.0.13 (i386-pc-linux-gnu).
Copyright (C) 1996, 1997, 1998 John W. Eaton.
This is free software with ABSOLUTELY NO WARRANTY.
For details, type `warranty'.

octave:1> 

and cut-and-pasted the output from

$ cardwords_count example.charset <README.sgml | cut -d " " -f 2 | tr "\n" " "
2723 477 1273 1233 3968 666 686 1527 2292 11 144 1439 742 1967 2205 1080 56 2334 1887 3599 833 495 596 126 595 28

into octave:

octave:1> characters = [ 2723 477 1273 1233 3968 666 686 1527 2292 11 144 1439 742 1967 2205 1080 56 2334 1887 3599 833 495 596 126 595 28 ];
octave:2> max(characters) / min (characters)
ans = 360.73
octave:3> average = sum(characters) / 26
average = 1268.5

The second command computes the ratio between the count of the most frequent
character and the count of the least frequent one.

The third command calculates the average number of occurrences per character.

All occurrence counts should now move somewhat toward the average value to
account for easy and difficult characters. I first tried:

octave:4> for index = [1:26];
> cards(1,index) = (characters(index) - average) / 2 + average;
> endfor
octave:5> max(cards) / min (cards)
ans = 4.0925
octave:6> 

With that, however, there is only a factor of about 4 between the card
character with the most occurrences and the one with the fewest.

octave:6> for index = [1:26];
> cards(1,index) = (characters(index) - average) / 1.5 + average;
> endfor
octave:7> max(cards) / min (cards)
ans = 7.1323
octave:8> 

That's better, but still not what I want (I'd like to have a factor of at
least 10).

octave:8> for index = [1:26];
> cards(1,index) = (characters(index) - average) / 1.25 + average;
> endfor
octave:9>  max(cards) / min (cards)
ans = 13.059
octave:10> 
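
The three attempts differ only in the divisor, so they can be folded into
one small function. The Python sketch below is an illustration of the same
calculation; the names flatten and spread are made up here.

```python
# Occurrence counts for A..Z, copied from the cardwords_count output above.
counts = [2723, 477, 1273, 1233, 3968, 666, 686, 1527, 2292, 11, 144, 1439,
          742, 1967, 2205, 1080, 56, 2334, 1887, 3599, 833, 495, 596, 126,
          595, 28]


def flatten(counts, divisor):
    """Pull every count toward the average.

    A divisor of 1 leaves the counts unchanged; larger divisors shrink the
    distance of each count from the average.
    """
    average = sum(counts) / len(counts)
    return [(n - average) / divisor + average for n in counts]


def spread(counts, divisor):
    """Ratio between the largest and smallest flattened count."""
    cards = flatten(counts, divisor)
    return max(cards) / min(cards)


for divisor in (2, 1.5, 1.25):
    print(f"divisor {divisor}: max/min = {spread(counts, divisor):.4f}")
```

Because the deviations from the average sum to zero, flattening does not
change the total of the counts, only how unevenly they are distributed.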

That looks nice. The card-table has approx 180 cells. I think every second
cell in the game should remain free if one wants to move all cards to the
card-table; otherwise this gets too difficult.

So I want approx 90 cards.

octave:10> cards = cards * 90 / sum(cards)
cards =

 Columns 1 through 8:

  6.63664  1.73360  3.47128  3.38396  9.35449  2.14619  2.18985  4.02576

 Columns 9 through 16:

  5.69576  0.71632  1.00666  3.83366  2.31210  4.98629  5.50584  3.04996

 Columns 17 through 24:

  0.81456  5.78745  4.81165  8.54896  2.51075  1.77290  1.99338  0.96737

 Columns 25 and 26:

  1.99120  0.75343

octave:11> 

By rounding these values to the nearest integer I get my distribution
of cards (92 cards in total) -- only the card points are still missing.
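
The scaling and rounding step can be sketched in Python as follows. This is
an illustration only; it repeats the flattening with divisor 1.25 so that
it runs on its own, and it rounds halves upward, which is an assumption
that makes no difference here because no value lies exactly on a half.

```python
import math
import string

# Occurrence counts for A..Z, copied from the cardwords_count output above.
counts = [2723, 477, 1273, 1233, 3968, 666, 686, 1527, 2292, 11, 144, 1439,
          742, 1967, 2205, 1080, 56, 2334, 1887, 3599, 833, 495, 596, 126,
          595, 28]

# Flatten toward the average with the divisor 1.25 chosen above.
average = sum(counts) / len(counts)
flattened = [(n - average) / 1.25 + average for n in counts]

# Scale so that the (unrounded) values add up to the target of 90 cards.
target = 90
scaled = [x * target / sum(flattened) for x in flattened]

# Round each value to the nearest integer.
card_counts = [math.floor(x + 0.5) for x in scaled]

for letter, n in zip(string.ascii_uppercase, card_counts):
    print(f"{letter}: {n}")
print("total:", sum(card_counts))
```

The total after rounding is 92 rather than exactly 90, because each letter
is rounded independently.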

The points of cards with the same character meaning should not differ from
each other in an example card set. Differing points would make the game too
complicated for new users, especially because the client program cannot show
the card set yet.

I thought it would be nice if the sum of points of all cards showing `A'
were approximately the same as the sum of points of all cards showing
`B' etc.

So I made a second vector, `points', which holds the points for each
character meaning. Its maximum value is 19, because that is the highest
value that the gtk client allows, and it contains no negative points:


octave:12> points = 1 ./ cards;
octave:13> points = points * 19 / max(points)
points =

 Columns 1 through 8:

   2.0508   7.8508   3.9208   4.0219   1.4549   6.3415   6.2151   3.3807

 Columns 9 through 16:

   2.3895  19.0000  13.5200   3.5502   5.8865   2.7295   2.4719   4.4624

 Columns 17 through 24:

  16.7086   2.3517   2.8286   1.5920   5.4207   7.6768   6.8276  14.0692

 Columns 25 and 26:

   6.8351  18.0641

octave:14> 

Rounding these values to integers, I get the card set as given in this file,
without the wildcard.
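
The points calculation, including the final rounding, can be sketched in
Python like this. It is an illustration only and recomputes the unrounded
card values so that it runs on its own; rounding halves upward is an
assumption that does not matter for these values.

```python
import math
import string

# Occurrence counts for A..Z, copied from the cardwords_count output above.
counts = [2723, 477, 1273, 1233, 3968, 666, 686, 1527, 2292, 11, 144, 1439,
          742, 1967, 2205, 1080, 56, 2334, 1887, 3599, 833, 495, 596, 126,
          595, 28]

# Unrounded card values: flatten with divisor 1.25, then scale to 90 cards.
average = sum(counts) / len(counts)
flattened = [(n - average) / 1.25 + average for n in counts]
cards = [x * 90 / sum(flattened) for x in flattened]

# Points are inversely proportional to the number of cards, so that
# (number of cards) * (points) is roughly the same for every letter.
inverse = [1 / c for c in cards]

# Scale so that the rarest letter gets the maximum of 19 points.
max_points = 19
raw_points = [p * max_points / max(inverse) for p in inverse]

# Round each value to the nearest integer.
points = [math.floor(p + 0.5) for p in raw_points]

for letter, p in zip(string.ascii_uppercase, points):
    print(f"{letter}: {p}")
```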

octave:14> quit
$ 

Then I thought a wildcard would be nice, and added one.
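
Putting the pieces together, the card list at the top of this file can be
regenerated from the rounded counts and points. The Python sketch below is
an illustration only. It copies the layout seen at the top of this file
(total number of cards on the first line, one line per letter, a blank line
after every eight letters) and assumes that the final `[  -1]' entry is the
wildcard; whether CardWords requires exactly this layout is not stated
here. The function name card_lines is made up.

```python
import string

# Rounded number of cards and points per letter A..Z, as used in this file.
card_counts = [7, 2, 3, 3, 9, 2, 2, 4, 6, 1, 1, 4, 2, 5, 6, 3, 1, 6, 5, 9,
               3, 2, 2, 1, 2, 1]
card_points = [2, 8, 4, 4, 1, 6, 6, 3, 2, 19, 14, 4, 6, 3, 2, 4, 17, 2, 3,
               2, 5, 8, 7, 14, 7, 18]


def card_lines(counts, points):
    """Return the lines of a card list in the layout used by this file."""
    # First line: total number of cards, including one wildcard.
    lines = [str(sum(counts) + 1)]
    letters = string.ascii_uppercase
    for i, (letter, n, p) in enumerate(zip(letters, counts, points)):
        # Blank line between groups of eight letters, as in this file.
        if i > 0 and i % 8 == 0:
            lines.append("")
        # One "[letter points]" entry per card, all on one line.
        lines.append(f"[{letter} {p}]" * n)
    # Assumed wildcard entry: a blank character with -1 points.
    lines.append("[  -1]")
    return lines


print("\n".join(card_lines(card_counts, card_points)))
```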