WORDSEARCH _________________________________________________________________ These are the long rules for the latest POTM. I have tried to be as clear as possible but if there are any questions please ask them as soon as possible so I can correct errors in the weekly FAQ mailings. My goal is to eliminate loopholes - if you have a question about wording, I probably meant the statement to have the simplest possible interpretation. ****** W O R D S E A R C H ****** (deadline is 23:59 PM EST December 31, 1999) I'm sure most of you have seen "wordsearch" puzzles in the newspapers - you are given an array of letters and asked to find words within the array. The POTM wordsearch is a little different (not all the words will be in the array, you can't re-use letters, and the words don't have to be in a straight line), but the idea is the same. Your job is to write a program to FIND some of the words in the letter grid. Do a better job than anyone else and YOU could win the POTM! ================== T H E L O N G R U L E S ================= I. YOUR PROGRAM AND YOUR TASK IN SUMMARY I will provide an input file in the first argument to your POTM entry ... the input file will contain a grid of letters (up to 26x80 at most), and a list of "words" (up to 500 at most). Your program tries to find words from this list somewhere in the grid - and prints out the location of any words it finds. Your score will be the total number of letters in the found words. Restrictions on exactly HOW your program may find words are below - and they may be a bit different than you're used to for other "wordsearches". II. SAMPLE INPUT: FINDING WORDS IN THE LETTER GRID A. The input file consists of the number of rows (2), then the number of columns (3 in this case), then the letter grid with one row of letters per input line, and then the wordlist. B. The rows are lettered from A through B in this case, and up to "Z" in the largest case. The columns are numbered 1, 2, 3 in this case, and up to 80 in the largest case. C. After the grid comes the list of words for the rest of the file - up to 500 "words" which are candidates. D. Words are found by listing the positions of their letters in the grid - subject to the following rules: 1. the letters of words must be "connected" either horizontally, vertically, or diagonally. Wrapping around the borders is not permitted. 2. once a letter is used for a word, the same letter in the same grid position may not be re-used in the same word, or in any other word. 3. a word may start anywhere in the grid. ############################################################################ # Sample input file (there will be no white space or comments in # the actual input files - they are added here just to help!) ############################################################################ 2 : Number of rows (A and B in this case) 3 : Number of columns (1, 2, and 3 ... numbering begins at 1) OTR : row A of the grid ... letters A1="O", A2="T", A3="R" AEH : row B of the grid ... letters B1="A", B2="E", B3="H" QUIZ : This word could not be found ... required letters not in grid! TEA : A2 B2 B1 THERE : This word could not be found ... the "E" cannot be reused TO : A2 A1 TOO : This word could not be found ... the "O" cannot be reused OTHER : A1 A2 B3 B2 A3 (note the diagonal connections may "cross") RATE : Not found ... letters are present but cannot be connected OATH : A1 B1 A2 B3 HEAT : B3 B2 B1 A2 TRH : A2 A3 B3 (words may not be actual words in any language) OAE : A1 B1 B2 HER : B3 B2 A3 ############################################################################ III. SCORING AND TIEBREAKERS A. Your score is the total number of LETTERS found in the words your program finds. From the example above, If you find TRH and OAE your score is six. If you find OTHER, your score is five. If you find TO and HER, your score is five. B. If any invalid words are claimed, your score will be zero. Words are invalid if the positions claimed do not actually spell a word in the wordlist, or if a position in the grid is reused in one or more words. C. In the event of a tie based on letters, the winner is determined by counting the number of WORDS that are found - a small number of words is better. Thus, in the example above, The entry of "OTHER" beats the entry of "TO" and "HER" because the score of five was achieved with one word rather than with two. D. If some number of entries are tied after processing a single final problem (based on both number of letters and words), only those tied entries will be subjected to a second grid. The winner will then be determined on the scores from the second grid. If there are still ties, the fastest (sys+user time) entry will win. IV. YOUR PROGRAM'S INPUT A. The path of a file containing the input information will be passed as an argument to your program, which then must read the file. Example: yourprogramname INPUTFILENAME B. The INPUT FILE will be as descibed below ... there will be no white space in the file. R : Number of rows (1 less than R less than 27) C : Number of columns (1 less than C less than 81) XXXXX... : first row of the grid, there wil be C letters in this line XXXXX... : second row of grid, there will be C letters in each line .... : there will be a total of R rows of C letters each WORDA : The first word to look for WORDB : The second word to look for .... : The rest of the words in the wordlist ... C. In the actual problem, there may be up to 26 rows and up to 80 columns. There may be up to 500 words in the wordlist ... the wordlist begins after the grid and goes until the end of file. The "words" in the wordlist are not necessarily real words in any language, not do they necessarily appear inthe grid! --* No words in the wordlist will be repeated. --* "Words" will be no longer than 80 characters long. D. All letters in the input file will be upper case (A-Z). V. YOUR PROGRAM'S OUTPUT A. Your program must generate one line of output for each word it finds. Your output should NOT contain the actual word that was found, but SHOULD contain the "coordinates" of the letters in the grid that make up the word. For example, in a larger grid, some sample output might be as follows - with one "word" per line: C1 D2 E3 D3 E2 A62 A63 A64 A65 A66 B66 C66 D79 E80 F80 X45 Y46 Z47 B. There should be no space between the letter which defines the "row" and the number which defines the "column". C. The coordinates of each found word must appear in order, such that the word can be deduced by looking in the indicated grid positions for the letters. D. The output lines do not need to be ordered in any way. There is one word represented per line, but those words do not need to be in any particular order. (That is, they do not need to be inthe same order in which they appeared in the wordlist.) --* E. If a word in the wordlist can be found in multiple places in the letter grid, it may be part of your output more than once provided completely different grid letters are used in each instance ============================================================================= The following items are standard stuff for ALL the POTMs .... (but they occasionally will change ... so READ 'EM!) ============================================================================= I. About your programming: a) As of 9/98 the POTM compiles and executes on and Ultra-2: potm potm 5.5.1 Generic_103640-17 sun4u sparc SUNW,Ultra-2 The Ultra-2 I'm on has 128Mb physical memory and 350Mb virtual memory. It runs Solaris 2.5.1. I must insist that your entry be contained in a SINGLE ASCII file, mailed in plaintext from a mailer that does not split long lines or insert strange characters. Please help keep my life simple!!! b) The compilers I have available are (at least): Sun WorkShop Compiler C 4.2 (ANSI C Compiler SVID compliant) Sun WorkShop Compiler C++ 4.2 GNU compilers gcc/g++ version 2.8.1 *** C/C++ Note: please do not use platform dependent include files *** (like conion.h for example) since they will cause my compilation *** to fail. If YOU need them, consider using ifdefs so that only *** YOUR compilation will see them ... thanks! PERL version 5.004 Python 1.5.1 Java, version 1.1.6 of the Java Developer Toolkit from SUN (as downloaded to a SPARC workstation) FORTRAN compilers from SUN, sorry - no PASCAL support! FORTRAN 90: WorkShop Compilers 4.2 10/22/96 FORTRAN 90 1.2 FORTRAN 77: WorkShop Compilers 4.2 30 Oct 1996 FORTRAN 77 4.2 All compilation will be done WITHOUT ANY OPTIMIZATION, and the standard math library will be loaded (even if not used). While this might not reflect the real world, it is at least consistent. No makefiles are permitted, if there are special instructions please so indicate in your program header comments. *** IMPORTANT: submit early so we can resolve any *** portability problems!!! (Particularly if you *** are working in a PC environment. NOTE: assembly code submissions are NOT acceptable. I must be the one to compile your code so I can check for cheating! c) if you wish to submit a shell program, Bourne, Korn, and csh are available ... along with any bin or /usr/bin tools that are available as generic Unix tools - my judgement!!! Also nawk, awk, dc, or whatever. (again - submit early in case there are version differences) d) Temporary files may be created in /tmp, but MUST be removed when you are done ... creation of files anywhere else is strictly prohibited. e) Maximum size of the source code is roughly 25K characters - different rules may apply for specific POTMS, and comments don't count against you. f) Maximum size of executable is 1 Megabyte ... please!!!! g) sizeof(short)=2; sizeof(int)=4; sizeof(long)=4; sizeof(char)=1 h) a = 0x80000000 = -2147483648 a - 1 = 2147483647 a + 1 = -2147483647 {a} is true. {a == 0} is false. i) RAND_MAX = 2^15 - 1 (32767) when using the rand() function. II. The system testing .... a) mail me an entry as soon as possible - you can always submit another entry if you improve your solution .. but try and keep it down to one a week once the porting issues are resolved .... please! b) one entry per person (or team) please. I associate each entry name with an email address and a person for communication purposes. Communication is fine - and encouraged - but everyone's code should be their own unless there is a stated collaboration and a single team entry. Honor system! Use the POTM bulletin board at: http://www.sitepowerup.com/mb/view.asp?BoardID=100152 c) on receipt of your entry, I'll run a system test to make sure your program works ... you'll receive the results and a weekly standing of how you fared against other entries. (I usually will get to new mail once a night but perhaps not!) d) please make sure your program works on the system test problem. e) your program must perform the task specified within the specified time limit (ten minutes) sys+user time as measured by timex on MY execution system. Your time will be provided along with your system test run so you can see the differences in speed between your machine and mine. All execution time measurements used for tiebreakers (if any) will be measured using (sys+user) time via timex (similar to time command). **** NOTE: A code fragment to measure elapsed time **** is available from the POTM-master for the asking. III. SENDING ME YOUR ENTRY - PLEASE USE EMAIL!!! Please email (not uuto, no attachments) your source code to me at: enter@att.com (preferred) hicinbothem@att.com (use only as a last resort!) WARNING!!! Please be sure your mailer does NOT truncate long lines or make any substitutions for characters. These kinds of problems will prevent the program from compiling when I receive it! IMPORTANT: Please use the following (or equivalent) lines at the front of the program you mail to me (this will help immeasurably!): /* POTM ENTRY: entryname (something clever please!) */ /* Your Name: First Last */ /* Your email: log@machine.att.com (or whatever) */ /* compile instructions (if other than "make entryname") */ NOTE: These comments should be referenced in whatever method is appropriate for your programming language of choice. IMPORTANT: ENTER EARLY - you will receive weekly standings and you will resolve any portability issues early. You may improve your entry at any time by simply sending me another entry - so it pays to enter earlier! (I process most everything in a day or so) IMPORTANT: If you don't hear from me within a week it may mean that the mail got screwed up .... please follow up with an inquiry to hicinbothem@att.com .... Thanks! If you have any questions, mail 'em to me - if I answer them I'll include them in the Frequently Asked Questions (FAQ) list I circulate with the weekly standings!!! Don't call me ... please! WATCH THE FAQ - ESPECIALLY IN THE FIRST FEW WEEKS AS ALL THE STUPID ERRORS I MAKE IN THE PROBLEM STATEMENT TURN UP!!!! Looking forward to your entry! (remember: enter@att.com) =Fred