Interfacing with the Acorn Teletext Adaptor - by - Gordon Horsington ---------------------------------------------------------------------- Module 4. Searching for keywords -------------------------------- The frame grabber used to illustrate Module 3 can be modified to search for keywords in the Teletext data. There are surprisingly few modifications needed to produce a simple keyword search for the serial magazines broadcast on BBC 1, BBC 2 and Channel 4. Neither CEEFAX nor ORACLE broadcast a keyword directory which can be used to identify the pages associated with with a particular word and so the best that can be done is to search for ASCII strings in the Teletext data downloaded from the adaptor. Essentially all that a keyword search needs to do is to load every page, search every packet associated with the current page, and print every page which contains the required ASCII string. The demonstration program KEYWORD uses this serial magazine keyword search technique but it also works with the parallel magazines broadcast on the ITV network, although it does not search parallel magazines very effectively. The reason why this simple idea does not work very well with parallel magazines is because it only loads and searches one page at a time. The ITV network currently broadcast up to four magazines in parallel and so up to three magazines will be ignored when a page is being loaded. The four magazines currently used by the ITV network are numbered 1, 2, 7 and 8. The program KEYWORD ignores magazine 8, which only contains subtitles, and can be made to ignore the ITV magazine 7, which only contains encrypted Air Call pages, but even then magazines 1 and 2 are still broadcast in parallel and so at least two pages need to be searched at the same time if the program is to work efficiently. Ideally the program should be capable of loading and searching up to 8 pages at the same time to cope with the worst possible case when all the available magazines are used in parallel. Modifying the program to cope with 8 buffers is not as difficult as it sounds and this has been left for you to do as an exercise. I have used a maximum keyword length of 8 characters in the demonstration program. I did this because very long text strings could take too long to find and the program could crash. Eight characters seems to be sufficient to identify most ASCII strings. I have stored the keyword in zero page to make the program run as fast as possible. Every byte of the keyword has bit 7 set (line 160) because bit seven of the every data byte is set ready for display after downloading (line 2260). The required word might be in lower case, upper case, or perhaps have just a capital first letter. To ensure the word is recognised whichever form it takes, each byte is converted to a capital before comparison (line 3620). The required keyword should, therefore, be entered in upper case. In the previous module, the program GRABBER loaded only the header packets into the computer's memory until the header of the required page was found. It then loaded packets 1 to 23 and displayed the page when the terminating header was found. As long as the page buffer was clear when the program started, and was cleared after printing the page, GRABBER could simply print out the entire contents of the buffer because it only ever loaded packets 1 to 23 of the required page. The keyword search program associated with this module does not have enough time to clear the buffer after searching every page and so it flags every packet as unused at the beginning of every page search and flags every packet it downloads as used when it is received. This is done using the array markers (line 50). The number &80 is used to mark a packet as unused and the packet number is stored to mark the packet as used. The flag byte grabflag is used again in the program KEYWORD. In this program it can have one of 4 values. &00 means that it is searching for any start-of-page header, ie. not a subtitle and not page xFF. &40 means that it has found a start-of-page header and it is loading that page into the buffer but it has yet not identified the keyword. &80 means that it is still loading the page into the buffer but the keyword has been found. &C0 means that the keyword has been found and an end-of-page header has been identified so that the required page has been loaded into the buffer. When grabflag stores &C0 the page can be printed and the search will continue for another page with the same keyword. The program will continue to search for the same keyword until the Escape key is pressed. If you modify the program to search parallel magazines more effectively you will need the equivalent of one grabflag for each buffer. When a page has been displayed on the screen the page number is taken from the ASCII text in the page header and displayed in the top left hand corner of the frame. This is a short cut method of displaying the page number and if you want to modify the program to include the sub-page number you will have to extract this number from the header data using the information provided in module 2. If you decide to take up the challenge of designing a frame grabber you might like to include some or all of the following features: 1) Page selection by page number and sub-page number. 2) Page selection by keyword. 3) Correct display of subtitles. 4) Detect and display newsflashes. 5) Use the alarm clock on BBC1 page 196. 6) Hold and release page. 7) Switch between reveal and conceal hidden characters. 8) Page dump to printer. 9) Save page in user memory, second processor or on disc, and re-load page when required. If you can create a program with these and any other useful features that improve on the facilities offered by the ATS frame grabber or the Edinburgh Software Products Keyword Frame Grabber (broadcast by BBC Telesoftware) then you might like to consider sending a copy of your program to BBC Telesoftware so that the best program can be transmitted. 10 REM> KEYWORD 20 MODE7 30 DIM mcode &1000 :REM: space for machine code 40 DIM buffer &400 :REM: page grabber buffer 50 DIM markers &20 :REM: packet markers 60 $buffer=STRING$(40," ") :REM: clear the header 70 PROCmcode :REM: assemble machine code 80 INPUT"TV channel (1-4) = "answer$ 90 channel?0=EVAL("&"+LEFT$(answer$,1))+&1B 100 IF channel?0 < &1C THEN channel?0 = &1C 110 IF channel?0 > &1F THEN channel?0 = &1F 120 INPUT"Search keyword (UPPER CASE) = "answer$ 130 answer$=LEFT$(answer$,8) 140 len=LEN(answer$)-1 150 FOR pass=0 TO len 160 searchtext?pass = (ASC(MID$(answer$,pass+1,1)) OR &80) 170 NEXT 180 searchtext?(len+1)=0 190 wordlength?0=len 200 VDU12,23,1,0;0;0;0; 210 CALL mcode 220 VDU30,23,1,1;0;0;0; 230 END 240 DEFPROCmcode 250 packet=&70 :REM: row number of current packet 260 magazine=&71 :REM: magazine number of current page 270 grabflag=&72 :REM: 0=searching, &40=loading, &80=found, &C0=loaded & found 280 usermag=&73 :REM: last magazine number 290 userpage=&74 :REM: last page number, low byte + high byte 300 channel=&76 :REM: TV channel 310 workspace=&77 :REM: 2 byte workspace 320 destination=&79 :REM: another 2 byte workspace 330 wordlength=&7B :REM: length of keyword 340 tempy=&7C :REM: temporary store for Y register 350 searchfor=&7D :REM: temporary store for character in search string 360 searchtext=&80 :REM: store for keyword 370 irq2v=&206 :REM: irq2 vector 380 ttxcontrol=&FC10 :REM: TTX control register, write only 390 ttxstatus=&FC10 :REM: TTX status register, read only 400 rowreg=&FC11 :REM: TTX row register, write only 410 datareg=&FC12 :REM: TTX data register, read & write 420 statclr=&FC13 :REM: TTX clear status register, read & write 430 oswrch=&FFEE 440 osbyte=&FFF4 450 FOR pass=0 TO 2 STEP 2 460 P%=mcode 470 [ OPT pass 480 JSR clearmark \ set up for searching 490 LDX irq2v \ load secondary interrupt vector 500 LDY irq2v+1 510 STX oldirq2v \ save secondary interrupt vector 520 STY oldirq2v+1 530 LDX #interrupt MOD 256 \ install new interrupt routine 540 LDY #interrupt DIV 256 550 SEI \ disable interrupts when altering vectors 560 STX irq2v 570 STY irq2v+1 580 CLI \ re-enable interrupts 590 LDA channel \ load (channel number + #&1C) 600 STA ttxcontrol \ enable TTX 610 .mainloop 620 LDA grabflag \ test to see if page grabbed 630 CMP #&C0 \ &C0 = page grabbed and ready to print 640 BNE continue \ branch if not ready 650 JSR checkdouble \ look for double height lines 660 JSR transfer \ transfer data to screen 670 JSR clearmark \ start searching again 680 .continue 690 LDX #&00 \ screen row 700 LDY #&08 \ screen column 710 JSR vdu31 \ VDU 31,8,0 720 .headloop 730 LDA buffer,Y 740 JSR oswrch \ write header on screen 750 INY 760 CPY #&28 \ decimal 40 770 BCC headloop 780 BIT &FF \ poll escape flag 790 BPL mainloop \ loop if escape not pressed 800 LDA #&7E \ decimal 126 810 JSR osbyte \ acknowledge escape 820 LDA #&00 830 STA ttxcontrol \ disable TTX 840 LDX oldirq2v \ load original vector 850 LDY oldirq2v+1 860 SEI \ disable interrupts when altering vectors 870 STX irq2v \ restore original vector 880 STY irq2v+1 890 CLI \ re-enable interrupts 900 RTS \ return to BASIC 910 .interrupt 920 BIT ttxstatus \ poll TTX hardware 930 BMI ttxinter \ branch if TTX interrupt 940 JMP (oldirq2v) \ not TTX interrupt 950 .ttxinter 960 LDA &FC \ interrupt accumulator save register 970 PHA \ push interrupt accumulator save register 980 TXA 990 PHA \ push X 1000 TYA 1010 PHA \ push Y 1020 LDA grabflag \ is a page waiting to be displayed? 1030 CMP #&C0 \ &C0 = page grabbed 1040 BEQ clearstatus \ clear status and RTI if page grabbed 1050 CLD \ clear decimal flag 1060 .startrow 1070 LDY #&00 \ start with row 0 1080 .readttxt 1090 STY rowreg \ try rows 0 to 15 1100 LDA datareg \ load framing code (#&27) 1110 BEQ emptyrow \ if zero try next row 1120 TYA 1130 PHA \ save row number 1140 JSR readpacket 1150 PLA 1160 TAY \ restore row number 1170 .emptyrow 1180 INY \ increment row number 1190 CPY #&10 \ try rows 0 - 15 1200 BNE readttxt 1210 .clearstatus 1220 LDA #&00 1230 LDY #&0F \ clear 16 rows in adaptor 1240 .clearloop 1250 STY rowreg 1260 STA datareg 1270 DEY 1280 BPL clearloop 1290 STA statclr \ clear status flags before returning 1300 PLA 1310 TAY \ restore Y 1320 PLA 1330 TAX \ restore X 1340 PLA 1350 STA &FC \ restore interrupt accumulator save register 1360 RTI \ return from interrupt 1370 .readpacket 1380 LDY datareg \ read magazine number 1390 LDA hamtable,Y \ de-ham it 1400 BMI cleargrab \ stop loading if error 1410 STA magazine \ save magazine number 1420 LDY datareg \ read packet number 1430 LDA hamtable,Y \ de-ham it 1440 BMI cleargrab \ stop loading if error 1450 STA packet \ save packet number 1460 LDA magazine \ load magazine number 1470 CMP #&08 \ bit 3 of mag. number is bit 0 of packet \ number 1480 ROL packet \ 5 bit packet number 1490 AND #&07 \ use only bits 0-2 1500 STA magazine \ 3 bit magazine number 1510 LDA packet 1520 CMP #&18 \ ignore TSDP, Datacast, etc. 1530 BCS exit \ ie. use Level 1 Teletext only 1540 PHA \ push packet number 1550 ASL A \ packet number * 2 1560 TAY 1570 LDA bufftable,Y \ load buffer address, lsb 1580 STA workspace \ store in zero page 1590 LDA bufftable+1,Y \ load buffer address, msb 1600 STA workspace+1 \ store in zero page 1610 PLA \ pull packet number 1620 CMP #&00 \ is it a header? 1630 BNE notheader 1640 TAX \ init index for hammed data 1650 .readheader 1660 LDY datareg \ read data register 1670 LDA hamtable,Y \ de-ham it 1680 BMI cleargrab \ stop loading if error 1690 STA buffer,X \ store de-hammed data 1700 INX \ increment index 1710 CPX #&08 \ use X = 0-7 1720 BCC readheader \ continue reading hammed data 1730 LDA grabflag \ are we loading or searching? 1740 BEQ checkstart \ branch if searching 1750 LDA magazine \ we must be loading a page 1760 CMP usermag \ is it the magazine we want? 1770 BNE exit \ branch if not the one we want 1780 LDA buffer \ low byte of page number 1790 CMP userpage \ is this the same as the one we are loading? 1800 BNE endfound \ end of page when different 1810 LDA buffer+1 1820 CMP userpage+1 \ is this the same as the one we are loading? 1830 BEQ exit \ not end of page if the same 1840 .endfound 1850 LDA grabflag \ grabflag = either #&40 or #&80 1860 ORA #&40 \ grabflag = #&C0 if keyword found 1870 STA grabflag \ update grabflag 1880 BMI exit \ branch if grabflag = #&C0 1890 .cleargrab 1900 JSR clearmark \ start searching again 1910 .exit 1920 RTS 1930 .checkstart 1940 LDA buffer+5 \ check for subtitle 1950 AND #&08 \ check bit 3 1960 BNE display \ don't load subtitle 1970 LDA buffer \ low nybble of current page 1980 CMP userpage \ are we loading it? 1990 BNE loadit \ if not then load it 2000 LDA buffer+1 \ high nybble of current page 2010 CMP userpage+1 \ are we loading it? 2020 BEQ display \ if loading display header, if not load page 2030 .loadit 2040 LDA #&40 \ page loading flag 2050 STA grabflag \ load this page 2060 LDA magazine \ current magazine number 2070 STA usermag \ load this magazine 2080 LDA buffer \ current page, low nybble 2090 STA userpage \ load this page 2100 LDA buffer+1 \ current page, high nybble 2110 STA userpage+1 \ load this page 2120 .display 2130 \ LSR buffer+6 \ check for suppress header 2140 \ BCS exit \ don't bother with suppressed headers 2150 LDY #&08 \ header data starts at byte 8 2160 BNE readmore \ go to read the header 2170 .notheader 2180 LDA grabflag \ is a page loading? 2190 BEQ return \ return if page not loading 2200 LDA magazine \ is the magazine number the one we want? 2210 CMP usermag \ compare with the one we are loading 2220 BNE return \ return if different magazine 2230 LDY #&00 \ read bytes 0 - 39 2240 .readmore 2250 LDA datareg \ read data register 2260 ORA #&80 \ set bit 7 for display 2270 STA (workspace),Y \ store in buffer 2280 INY \ increment index 2290 CPY #&28 \ decimal 40 2300 BNE readmore \ more data in this packet 2310 LDA packet \ load current packet number 2320 BEQ return \ return if header 2330 TAX \ A is greater than 0 and less than 24 2340 STA markers,X \ mark this as a valid packet 2350 BIT grabflag \ have we found the keyword yet? 2360 BMI return \ return if word found 2370 JSR wordsearch 2380 .return 2390 RTS 2400 .checkdouble 2410 LDX #&01 \ screen row numbers 1-23 2420 .nextcolumn 2430 LDA markers,X 2440 BMI carryset 2450 LDY #&00 \ screen column numbers 0-40 2460 JSR setup \ set up workspace for indirect addressing 2470 .singleloop 2480 LDA (workspace),Y 2490 JSR conceal \ check for concealed display 2500 CMP #&8D \ TTX double height character 2510 BEQ doubleheight 2520 INY 2530 CPY #&28 \ decimal 40 2540 BCC singleloop 2550 .carryset 2560 INX 2570 CPX #&17 \ decimal 23 2580 BCC nextcolumn 2590 RTS 2600 .doubleheight 2610 LDY #&00 2620 TXA 2630 PHA \ store row number 2640 INX 2650 TXA 2660 STA markers,X 2670 ASL A \ (row number + 1) * 2 2680 TAX 2690 LDA bufftable,X 2700 STA destination 2710 INX 2720 LDA bufftable,X 2730 STA destination+1 2740 PLA 2750 TAX \ restore row number 2760 .doubleloop 2770 LDA (workspace),Y 2780 JSR conceal \ check for concealed display 2790 STA (destination),Y 2800 INY 2810 CPY #&28 \ decimal 40 2820 BCC doubleloop 2830 INX 2840 CPX #&17 \ decimal 23 2850 BCC carryset 2860 RTS 2870 .conceal 2880 CMP #&98 \ TTX conceal display character 2890 BNE goback 2900 LDA #ASC(" ") \ substitute with a space 2910 STA (workspace),Y 2920 .goback 2930 RTS 2940 .setup 2950 TXA 2960 PHA \ store row number 2970 ASL A \ (row number) * 2 2980 TAX 2990 LDA bufftable,X 3000 STA workspace 3010 INX 3020 LDA bufftable,X 3030 STA workspace+1 3040 PLA 3050 TAX \ restore row number 3060 RTS 3070 .transfer 3080 LDA #&00 3090 TAX 3100 LDY #&02 3110 JSR vdu31 \ VDU 31,2,0 3120 LDA #ASC("P") 3130 JSR oswrch \ start printing page number 3140 LDX #&0F \ decimal 15 3150 .numberloop 3160 LDA buffer,X \ cheat by getting page number from header 3170 JSR oswrch 3180 INX 3190 CPX #&12 3200 BCC numberloop \ print 3 page bytes from header 3210 LDX #&01 \ screen rows 1-23 3220 .nextline 3230 LDY #&00 \ columns 0-39 3240 JSR vdu31 \ VDU 31,0,1-23 3250 LDA markers,X \ look for valid packet 3260 BPL markfound \ display valid packets only 3270 LDA #blanks MOD 256 \ else display 40 spaces 3280 STA workspace 3290 LDA #blanks DIV 256 3300 STA workspace+1 3310 JMP writescreen 3320 .markfound 3330 JSR setup \ set up workspace for indirect addressing 3340 .writescreen 3350 LDA (workspace),Y \ load data from buffer 3360 JSR oswrch \ write to screen 3370 INY 3380 CPY #&28 \ decimal 40 3390 BCC writescreen 3400 INX 3410 CPX #&18 \ decimal 24 3420 BCC nextline \ last packet number = 23 3430 RTS 3440 .vdu31 3450 LDA #&1F \ decimal 31 3460 JSR oswrch 3470 TYA 3480 JSR oswrch 3490 TXA 3500 JMP oswrch \ and return 3510 .wordsearch 3520 LDX #&00 \ index for search string 3530 LDY #&00 \ index for packet 3540 STY tempy \ temporary store for Y 3550 .wordloop 3560 LDA searchtext,X 3570 BEQ found 3580 PHA \ save character from keyword 3590 LDA (workspace),Y 3600 CMP #&E1 \ lower case 'a' 3610 BCC uppercase \ branch if upper case 3620 AND #&DF \ force lower to upper case 3630 .uppercase 3640 STA searchfor \ store character from TTX 3650 PLA \ restore character from keyword 3660 CMP searchfor 3670 BNE notfound \ branch if keyword does not match 3680 INX \ if matched look at next character 3690 INY 3700 CPY #&29 \ decimal 41 3710 BCC wordloop 3720 .notfound 3730 LDX #&00 \ reset index on keyword 3740 INC tempy \ increment index on TTX 3750 LDY tempy 3760 CPY #&29 \ decimal 41 3770 BCC wordloop \ branch if more data to test 3780 RTS 3790 .found 3800 LDA #&80 \ flag keyword found 3810 STA grabflag \ keyword found 3820 RTS 3830 .clearmark 3840 LDA #&80 3850 LDX #&17 \ decimal 23 3860 .loopmark 3870 STA markers,X 3880 DEX 3890 BNE loopmark 3900 STX grabflag \ grabflag = searching 3910 LDA #&0F \ end of page marker 3920 STA userpage 3930 STA userpage+1 3940 RTS 3950 .oldirq2v 3960 EQUW &00 3970 .hamtable 3980 EQUD &0101FF01 3990 EQUD &FF0100FF 4000 EQUD &FF0102FF 4010 EQUD &07FFFF0A 4020 EQUD &FF0100FF 4030 EQUD &00FF0000 4040 EQUD &0BFFFF06 4050 EQUD &FF0300FF 4060 EQUD &FF010CFF 4070 EQUD &07FFFF04 4080 EQUD &07FFFF06 4090 EQUD &070707FF 4100 EQUD &05FFFF06 4110 EQUD &FF0D00FF 4120 EQUD &FF060606 4130 EQUD &07FFFF06 4140 EQUD &FF0102FF 4150 EQUD &09FFFF04 4160 EQUD &02FF0202 4170 EQUD &FF0302FF 4180 EQUD &05FFFF08 4190 EQUD &FF0300FF 4200 EQUD &FF0302FF 4210 EQUD &0303FF03 4220 EQUD &05FFFF04 4230 EQUD &FF040404 4240 EQUD &FF0F02FF 4250 EQUD &07FFFF04 4260 EQUD &050505FF 4270 EQUD &05FFFF04 4280 EQUD &05FFFF06 4290 EQUD &FF030EFF 4300 EQUD &FF010CFF 4310 EQUD &09FFFF0A 4320 EQUD &0BFFFF0A 4330 EQUD &FF0A0A0A 4340 EQUD &0BFFFF08 4350 EQUD &FF0D00FF 4360 EQUD &0B0B0BFF 4370 EQUD &0BFFFF0A 4380 EQUD &0CFF0C0C 4390 EQUD &FF0D0CFF 4400 EQUD &FF0F0CFF 4410 EQUD &07FFFF0A 4420 EQUD &FF0D0CFF 4430 EQUD &0D0DFF0D 4440 EQUD &0BFFFF06 4450 EQUD &FF0D0EFF 4460 EQUD &09FFFF08 4470 EQUD &090909FF 4480 EQUD &FF0F02FF 4490 EQUD &09FFFF0A 4500 EQUD &FF080808 4510 EQUD &09FFFF08 4520 EQUD &0BFFFF08 4530 EQUD &FF030EFF 4540 EQUD &FF0F0CFF 4550 EQUD &09FFFF04 4560 EQUD &0F0FFF0F 4570 EQUD &FF0F0EFF 4580 EQUD &05FFFF08 4590 EQUD &FF0D0EFF 4600 EQUD &FF0F0EFF 4610 EQUD &0EFF0E0E 4620 .bufftable 4630 EQUW buffer 4640 EQUW buffer+40 4650 EQUW buffer+(2*40) 4660 EQUW buffer+(3*40) 4670 EQUW buffer+(4*40) 4680 EQUW buffer+(5*40) 4690 EQUW buffer+(6*40) 4700 EQUW buffer+(7*40) 4710 EQUW buffer+(8*40) 4720 EQUW buffer+(9*40) 4730 EQUW buffer+(10*40) 4740 EQUW buffer+(11*40) 4750 EQUW buffer+(12*40) 4760 EQUW buffer+(13*40) 4770 EQUW buffer+(14*40) 4780 EQUW buffer+(15*40) 4790 EQUW buffer+(16*40) 4800 EQUW buffer+(17*40) 4810 EQUW buffer+(18*40) 4820 EQUW buffer+(19*40) 4830 EQUW buffer+(20*40) 4840 EQUW buffer+(21*40) 4850 EQUW buffer+(22*40) 4860 EQUW buffer+(23*40) 4870 EQUW buffer+(24*40) 4880 .blanks 4890 EQUS STRING$(40," ") 4900 ] 4910 NEXT 4920 ENDPROC