INVERSE BOGGLE Boggle is a fairly well known word game distributed by Parker Brothers. There are also numerous web implementations and there is public domain code available on the web. In the most common form, 25 random letters are arranged in a 5x5 grid and the player then finds words by connecting letters within the grid in sequence. Longer words mean more points. The game is FAR too easy for the POTM gang. So. This time around we play "Inverse Boggle" (or 1/BOGGLE). I'll give your program a bunch of words, and you construct a grid that contains as many words as possible according to "boggle rules". ========== THE DETAILS UNLEASHED =================== I. The task You will write a program that takes a list of words as input. Your program will construct a 5x5 grid of letters. Your goal will be to create a square that "contains" as many of these words as possible. Your program will print out the letter square and then print out the words from the input that may be found in your square. More words means more score. Longer words mean more score. High score wins. II. Finding words in a square of letters - modified Boggle rules If you're unfamiliar with the game, check out some web sites like http://boggle.stanford.edu/ ... there are many variations possible - I have chosen a single rule set for this POTM. A. The letter square will be a 5x5 NPDNN grid of letters. Each grid position AVADS GRID: is occupied by one of 26 letters (A-Z). ITTEB Q and U are individual letters. All TPLFA letters will be upper case. A sample AUEMD letter square is shown at the right. B. A word may be found in the letter square by connecting letters with their neighbors. You can move from a letter to any other letter that touches it, even diagonally, WORDS: but you can't use the same individual letter more than once in a word. Some of the words that can be found in the grid above are: PIT PITTED BEADS DATES BELT MELTED FAME SETTLE Some words that can't be legally found are: FLEET (the E's don't touch) FELL (can't use the L twice) BELATED (the L and the A don't touch) ADDED (not enough D's) BELTED (uses the same E twice) C. ONLY words that are in the file I provide to your program are valid words ... they may be in any language, of length WORDSLIST 1-25 letters, or in no language whatever. For example, I may present a wordlist that contains the following "words": GESUNDHEIT, NPDNNN, POTM, GEORGE, XX, YOURNAME, MYNAME In this case ONLY these seven words would be worth score if they are found in the letter grid. Other sequences, like NAME or POT or CAT would not be considered valid. The wordlist may contain words as short as one letter or as long as 25 letters. The wordlist may contain words which are subsets of one another (like A,AT,EAT,SEAT,SEA). All these words can be found in a four letter sequence and each would be a valid word eligible for scoring if it was in the provided word list. All words in the wordlist will be unique - there will be no duplicates. D. Any word from the wordlist that can be found in your letter square will add to your score. A specific word can score only once even though the word may be found in SCORING: multiple places in your square. Each word will score points according to the number of letters in the word. A one letter word will score one point. A two letter word will score two points. A three letter word will score three points. < ... there is no mystery here ... > A ten letter word will score ten points. An N letter word will score N points for N = 1 to 25 Your total score will be the sum of the scores of all valid words that you find in your final letter grid. In order to score, the word must be present in your grid AND it must be present in your output word list. Your program will print the square you find followed by the words you claim are present. > The following rule was added after the initial distribution > of the rules: If your output contains a word which does > not result in a score (because it is not in the square, or > not in the wordlist, or a duplicate, or whatever) then 10 > points will be subtracted from your score for each such word. In the event of a tie, there will be three tiebreakers (each is applied in sequence ONLY if necessary to resolve ties of the earlier scores. 2) The FEWEST number of words found. (given that two programs have the same score, this rewards the one that uses longer words) --> 3) The FEWEST vowels (AEIOUY) in the square. (very unlikely I need to use this tiebreaker) --> 4) If two or more entries are still tied, I will apply the same set of scoring criteria to a second wordlist in order to break the tie. III. Your program SHOULD a) should accept a single argument that provides the file ACCEPT name of a wordlist I will provide. The wordlist will SINGLE contain between 5 and 1000 words, one per line. All words ARGUMENT will be in upper case letters using only (A-Z) and will THAT contain no white space, hyphens, apostrophes, etc. There PROVIDES will be no blank lines in the file. The wordlist for the THE system test is shown in the left margin (25 words). FILE NAME b) your output MUST be exactly as follows: OF * the letter square you generate must appear on the WORDLIST first five lines, five upper case letters per line. OUTPUT * starting with line six, you must list the words that LETTER can be found in your square - all in upper case. SQUARE * there should be no blanks, tabs, or anything other AND than upper case letters in your output. LIST WORDS Note that only the words you CLAIM are in your square YOU will be evaluated for possible scoring. In order to CLAIM score they must be present in the square AND they must PRESENT be present in the source list I provide. A KEWL c) I will run the program as follows: POTM nohup timex a.out INPUTFILENAME >output 2>timexoutput & EXTRAORDINAIRE THANKS ============================================================================= The following items are standard stuff for ALL the POTMs .... (but they occasionally will change ... so READ 'EM!) ============================================================================= I. About your programming: a) I compile on one machine (usually) and execute on another as follows: compilation machine: SunOS 5.3 Generic_101318-59 sun4m sparc execution machine : SunOS 5.3 Generic sun4c sparc b) The compilers I have available are (at least): SPARCompiler C++ 4.0 SPARCompiler C ANSI C compiler SVID compliant under SunOS 5.x gcc/g++ version 2.7.2 PERL version 5.002 Java, version 1.0 of the Java Developer Toolkit from SUN (as downloaded to a SPARC workstation) Scheme48 - a Lex variant All compilation will be done WITHOUT ANY OPTIMIZATION, and the standard math library will be loaded (even if not used). While this might not reflect the real world, it is at least consistent. No makefiles are permitted, if there are special instructions please so indicate in your program header comments. >>> IMPORTANT: submit early so we can resolve any >>> portability problems!!! (Particularly if you >>> are working in a PC environment. NOTE: assembly code submissions are NOT acceptable. I must be the one to compile your code so I can check for cheating! c) if you wish to submit a shell program, Bourne, Korn, and csh are available ... along with any bin or /usr/bin tools that are available as generic Unix tools - my judgement!!! Also nawk, awk, dc, or whatever. (again - submit early in case there are version differences) d) Temporary files may be created in /tmp, but MUST be removed when you are done ... creation of files anywhere else is strictly prohibited. e) Maximum size of the source code is roughly 25K characters - different rules may apply for specific POTMS, and comments don't count against you. f) Maximum size of executable is 1 Megabyte ... please!!!! g) Maximum malloc'able space is a bit over 50 Meg .... h) sizeof(short)=2; sizeof(int)=4; sizeof(long)=4; sizeof(char)=1 i) a = 0x80000000 = -2147483648 a - 1 = 2147483647 a + 1 = -2147483647 {a} is true. {a == 0} is false. II. The system testing .... a) mail me an entry as soon as possible - you can always submit another entry if you improve your solution .. but try and keep it down to one a week once the porting issues are resolved .... please! b) one entry per person (or team) please. I associate each entry name with an email address and a person for communication purposes. Communication is fine - and encouraged - but everyone's code should be their own unless there is a stated collaboration and a single team entry. Honor system! c) on receipt of your entry, I'll run a system test to make sure your program works ... you'll receive the results and a weekly standing of how you fared against other entries. (I usually will get to new mail once a night but perhaps not!) d) please make sure your program works on the system test problem. e) your program must perform the task specified within 10 minutes sys+user time as measured by timex on MY execution system. Your time will be provided along with your system test run so you can see the differences in speed between yours and mine. All execution time measurements used for tiebreakers (if any) will be measured using time via timex (similar to time command). ===> NOTE: A code fragment to measure elapsed time ===> is available from the POTM-master for the asking. III. SENDING ME YOUR ENTRY - PLEASE USE EMAIL!!! Please email (not uuto, no attachments) your source code to me at: enter@potm.ffast.att.com (preferred) hicinbothem@att.com (use only as a last resort!) IMPORTANT: Please use the following (or equivalent) lines at the front of the program you mail to me (this will help immeasurably!): /* POTM ENTRY: entryname (something clever please!) */ /* Your Name: First Last */ /* Your email: log@machine.att.com (or whatever) */ /* compile instructions (if other than "make entryname") */ NOTE: If you submit a shell program, please include these lines with a leading "#" and indicate which shell to run it under! IMPORTANT: ENTER EARLY - you will receive weekly standings and you will resolve any portability issues early. You may improve your entry at any time by simply sending me another entry - so it pays to enter earlier! (I process most everything in a day or so) IMPORTANT: If you don't hear from me within a few days - it may mean that the mail got screwed up .... please follow up with an inquiry to hicinbothem@att.com .... Thanks! If you have any questions, mail 'em to me - if I answer them I'll include them in the Frequently Asked Questions (FAQ) list I circulate with the weekly standings!!! Don't call me ... please! WATCH THE FAQ - ESPECIALLY IN THE FIRST FEW WEEKS AS ALL THE STUPID ERRORS I MAKE IN THE PROBLEM STATEMENT TURN UP!!!! Looking forward to your entry! (remember: enter@potm.ffast.att.com) =Fred