The interface to the spelling checker RM (Relocatable Module) is defined in
terms of the procedure calling standard outlined in "The ARM Procedure Call
Standard" reference APCS Issue 2/01.

The entries are called via the SWI instruction (SWI numbers to be determined),
but the parameter passing conventions are those of compiled language procedure
calls - treat each entry just like an external procedure call but replace the
"BL addr" with "SWI number".

This makes it easy to call a procedure contained in a relocatable module:
simply write a 'stub' procedure in whatever language is being used (as long
as it conforms to the calling conventions).

For example, a call to a hypothetical procedure which takes two integer
parameters and returns an integer result would be written in (extended) Pascal
as follows:

program fred(input, output);

var a, i, j: integer;         {variables used by the call below}

function add(i, j: integer): integer;
begin
  *SWI_1234                   {R0, R1 have the parameters in them}
end {add};                    {R0 on exit holds the result}
                              { (Functions with embedded machine code don't
                                 need to assign to function result)  }
begin
  i := 1; j := 2;             {arbitrary example values}
  a := add(i, j)
end.

------------------------------------------------------------------------------

In the calls below, words are represented by two integers - byte pointers
to the first character of the word, and to the byte after the last character
of the word.  Similarly, store areas are bounded by two pointers - the first
inclusive and the second exclusive.

This convention makes it easier to operate on pointers in several ways.
For example, a text buffer of 1024 bytes at address 20000 would be delimited
by (20000,21024).  A ten character word at the start of the buffer would
be described by (20000,20010), i.e. the end-pointer is the start-pointer
plus the length.
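
The arithmetic above can be sketched in a few lines of Python.  This is only
an illustration of the (start, end) convention; the helper names are made up
for the example and are not part of the interface.

```python
# Half-open (start, end) ranges: start is inclusive, end is exclusive.
# All helper names here are hypothetical, chosen for illustration only.

def length(rng):
    """Number of bytes covered by the range."""
    start, end = rng
    return end - start

def is_empty(rng):
    """A range whose two pointers are equal covers nothing."""
    return length(rng) == 0

def contains(outer, inner):
    """True if inner lies wholly within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def rest_after(buffer, word):
    """The part of the buffer that follows the word: it simply starts
    at the word's end pointer, with no +1 or -1 adjustment needed."""
    return (word[1], buffer[1])

buffer = (20000, 21024)   # 1024-byte text buffer at address 20000
word = (20000, 20010)     # ten-character word at the start of the buffer
```

Because the end pointer is exclusive, the length is a plain subtraction and
the remainder of the buffer begins exactly where the word ends.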

------------------------------------------------------------------------------

type checkcode = (Correct {0}, Dubious {1}, Error {2});

function Check(WordStart,WordEnd: address): checkcode;

This function takes a word supplied as an entity and performs a simple check.
With this call, it is up to the caller to decide what constitutes a word:
The caller could pass "well-wisher's" as a single unit - if the dictionary
contains hyphenated words, and understands the possessive case, it will
return "correct" for this unit.  If it does not contain hyphenated words,
it is allowed to check the individual parts, i.e. "well", "wisher" and "s",
and again will return "correct".

However, if the unit were "well-washers", the second case above would still
say "correct", but the first case would return the result code "dubious" -
because the hyphenated unit was not found in a dictionary which does contain
hyphenated words.

Finally, if the unit were "wll-wisher's", both cases would return "error".

Callers who want an easy life will of course simply pass each contiguous
alphabetic sequence separately.
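
The three result codes can be illustrated with a toy model in Python.  This
is a sketch of the behaviour described above, not the module's actual
algorithm: the word sets, the function name and the treatment of the
possessive "s" are all assumptions made for the example.

```python
import re

CORRECT, DUBIOUS, ERROR = 0, 1, 2   # checkcode values from the text

def check(unit, words, hyphenated=None):
    """Toy model of Check.

    words      - set of plain dictionary words (hypothetical data)
    hyphenated - set of hyphenated entries, or None if the dictionary
                 does not contain hyphenated words at all
    """
    unit = unit.lower()
    # Split into contiguous alphabetic runs:
    # "well-wisher's" -> ["well", "wisher", "s"]
    parts = re.findall(r"[a-z]+", unit)
    suffixes = {"s"}   # assumption: a possessive "s" is accepted by itself
    if not parts or any(p not in words and p not in suffixes for p in parts):
        return ERROR
    if hyphenated is not None and "-" in unit:
        # Strip the possessive before looking up the hyphenated unit.
        stem = unit[:-2] if unit.endswith("'s") else unit
        if stem not in hyphenated:
            return DUBIOUS
    return CORRECT

words = {"well", "wisher", "washers"}
hyphenated = {"well-wisher"}
```

With this data, "well-wisher's" is correct for both kinds of dictionary,
"well-washers" is dubious only when hyphenated entries exist, and
"wll-wisher's" is an error either way.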

-----------------------------------------------------------------------------

function FindError(BufferStart, BufferEnd: address;
                   var WordStart, WordEnd: address): checkcode;

This function will search within the buffer limits for anything it finds
incorrect or dubious.  It will make its own decisions about what constitutes
a word.  A status result of "Correct" means that there are no more items
in the buffer to check.  This function would normally be called in a loop
as below.  The buffer might be the whole file, a cached block, or just that
part which is currently being displayed on screen.

An example of use would be:

   { BuffStart, BuffEnd are set up already }

   WHILE FindError(BuffStart, BuffEnd, WordStart, WordEnd) <> Correct DO BEGIN
       HighLight(WordStart, WordEnd);
       BuffStart := WordEnd;    { Skip past word ready for next search }
   END;
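
The same loop can be tried out in Python with a stand-in for FindError.  The
stand-in below is an assumption for illustration: it treats every contiguous
alphabetic run as a word and reports anything not in a small word set as an
error, which is much cruder than the real module is meant to be.

```python
import re

CORRECT, DUBIOUS, ERROR = 0, 1, 2   # checkcode values from the text

def find_error(buf, start, end, words):
    """Hypothetical stand-in for FindError: scan buf[start:end] for the
    first alphabetic run that is not in `words`.
    Returns (code, word_start, word_end); the two pointers are only
    meaningful when the code is not CORRECT."""
    for m in re.finditer(r"[A-Za-z]+", buf[start:end]):
        if m.group().lower() not in words:
            # Offsets from the slice are rebased onto the whole buffer.
            return ERROR, start + m.start(), start + m.end()
    return CORRECT, end, end

def misspelt_spans(buf, words):
    """Run the loop from the text, collecting spans instead of
    highlighting them."""
    spans = []
    start, end = 0, len(buf)
    while True:
        code, word_start, word_end = find_error(buf, start, end, words)
        if code == CORRECT:
            break
        spans.append((word_start, word_end))
        start = word_end          # skip past word ready for next search
    return spans
```

Each reported span is again a (start, end) pair with an exclusive end, so
moving the start of the search to the returned end pointer never rescans or
skips a character.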

---------------------------------------------------------------------------


function Correct(WordStart, WordEnd,
                 ResultStart, ResultEnd: integer;
                 var Always: boolean): integer;
                                        {result is count of words returned}

The word as returned from FindError above or as passed to Check is examined
and any possible alternatives are returned.  No more alternatives than will
fit in the buffer will be returned.  The alternatives are returned in
order of likelihood, as measured by the Damerau-Levenshtein spelling metric,
although this ordering may be changed under instruction from the caller (see
below).  You can expect only one or two corrections for average length words,
but more for shorter words.

"Always" will be true if the word should be corrected (to the first word in
the list - there may be others too...) without asking the user first.
Offering a flag leaves this decision to the utility writer or the user.

Each word in the buffer will be newline terminated (NL = 10).
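
As a rough illustration of the ranking and of the newline-terminated result
buffer, here is a Python sketch.  It uses the restricted
("optimal string alignment") form of the Damerau-Levenshtein distance and
scans a tiny word set exhaustively; the function names, the distance cut-off
and the alphabetical tie-break are assumptions, and the module itself may
well work differently.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    insertions, deletions, substitutions and adjacent transpositions
    each cost 1."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def correct(word, words, result_size, max_distance=2):
    """Return (count, buffer): the closest candidates, each terminated by
    NL (10), never exceeding result_size bytes."""
    ranked = sorted((osa_distance(word, w), w) for w in words)
    out = bytearray()
    count = 0
    for dist, candidate in ranked:
        entry = candidate.encode("ascii") + b"\n"
        if dist > max_distance or len(out) + len(entry) > result_size:
            break
        out += entry
        count += 1
    return count, bytes(out)
```

The function result is the count of words returned, as in the declaration
above, and the caller can recover the words by splitting the buffer on the
newline byte.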

---------------------------------------------------------------------------

type ActCode = (Only {0}, Preferred {1}, Detested {2});

procedure Correction(WrongStart, WrongEnd, RightStart, RightEnd: integer;
                     Action: ActCode);

The Action codes have the following meanings:

Only:  This alternative is the only one to be offered if this erroneous
word is found again.

Preferred: This alternative should be offered at the head of any multiple
alternatives.

Detested: This alternative should be offered at the foot of any list of
alternatives; hopefully it will drop off the end...

Example:  Correction( <"archemedes">, <"Archimedes">, Only);

(I'm cheating here: <"..."> is clearly a denotation for the two pointers)

If a correction is only wanted once, the utility should not bother to call
"Correction" at all.
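
One way to picture the effect of the three action codes on a list of
alternatives is the Python sketch below.  It is an assumption about how the
recorded preferences might be applied, using an ordinary dictionary keyed on
(wrong, right) pairs; none of these names belong to the interface.

```python
ONLY, PREFERRED, DETESTED = 0, 1, 2   # ActCode values from the text

def apply_preferences(wrong, ranked, prefs):
    """Reorder `ranked` (alternatives in order of likelihood) according
    to the preferences recorded for the erroneous word `wrong`.
    prefs maps (wrong, right) -> action code (hypothetical structure)."""
    actions = {right: act for (w, right), act in prefs.items() if w == wrong}
    only = [right for right, act in actions.items() if act == ONLY]
    if only:
        return only[:1]            # the only alternative to be offered
    head = [right for right, act in actions.items() if act == PREFERRED]
    foot = [right for right in ranked if actions.get(right) == DETESTED]
    middle = [right for right in ranked if right not in actions]
    return head + middle + foot
```

A detested alternative stays in the list but at the foot, so it is the first
to be lost when the result buffer is too small to hold everything.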

---------------------------------------------------------------------------


function AddTemp(WordStart, WordEnd: address): boolean;

This adds a new word to the store dictionary.  Action will have to be taken
to save the store dictionary to a file.  This will be done externally
by a *-command in the RM, the details of which are not needed here.
The Boolean result is in case it is not possible to add new words to the
dictionary; I cannot foresee this being likely, but callers should be
prepared to take action (a status-line message perhaps) or not as preferred.

----------------------------------------------------------------------------

function RemoveTemp(WordStart, WordEnd: address): boolean;

This removes a word from the dictionary - possibly by adding it to a list
of "bad" words rather than by actually removing it in situ.  This is to
allow the implementor to force permanent dictionary updates to go through
some sort of validation.  Again, this will be external and does not concern
this document.
The boolean result is in case the removal cannot be done, although I do
not intend to draw the distinction between the word not having been found,
and it being there but not removable for implementation reasons.
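
A small Python model may make the two calls and their boolean results more
concrete.  The class, its capacity limit and the "bad word" set are
assumptions for illustration; in particular the real store dictionary will
not be a Python set.

```python
class TempDictionary:
    """Toy model of the store dictionary with temporary changes."""

    def __init__(self, permanent, capacity=1000):
        self.permanent = frozenset(permanent)  # never modified in situ
        self.added = set()                     # words added by AddTemp
        self.bad = set()                       # permanent words "removed"
        self.capacity = capacity               # room for added words

    def contains(self, word):
        if word in self.bad:
            return False
        return word in self.permanent or word in self.added

    def add_temp(self, word):
        """True if the word is in the dictionary afterwards."""
        if self.contains(word):
            return True
        if word in self.permanent:
            self.bad.discard(word)   # re-admit a previously removed word
            return True
        if len(self.added) >= self.capacity:
            return False             # no room: caller may warn the user
        self.added.add(word)
        return True

    def remove_temp(self, word):
        """True if the word was present and is now gone; False covers
        both 'not found' and 'could not be removed'."""
        if not self.contains(word):
            return False
        if word in self.added:
            self.added.remove(word)
        else:
            self.bad.add(word)       # hide it rather than delete it
        return True
```

Keeping the permanent words untouched and recording removals separately is
what lets permanent updates go through a later validation step.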

----------------------------------------------------------------------------

That's all I've got in at the moment.  The problems I can see are interfacing
to BCPL, and space constraints on the dictionary.  Please let me know once
you have digested this whether you have any suggestions or comments.

Graham.

P.S.  Points still under consideration are advisory messages about word
usage (e.g. license, licence); consistency of equivalent words (enquire,
inquire); acronyms (O.H.M.S., OHMS); capitalisation, proper names and place
names; and English and American spellings.  I can also sometimes suggest
when a word might be better hyphenated even though there is no hyphenated
version in the dictionary: if you offer "counterinsurgency" it will not find
it, but it will find "counter" and "insurgency" and may advise that
perhaps a hyphen would be better here.  I have not decided whether this should
be a standard part of the "Correct" procedure or should be on an option flag.
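
The hyphenation suggestion could work along the lines of this Python sketch,
which tries every split point and accepts the first one where both halves
are dictionary words.  The function name and the minimum part length are
assumptions; the real checker may use a quite different test.

```python
def suggest_hyphenation(word, words, min_part=3):
    """Return a hyphenated suggestion such as "counter-insurgency", or
    None if the word is already known or no split gives two known words.
    min_part avoids trivial splits like "a-lone" (assumed threshold)."""
    if word in words:
        return None
    for i in range(min_part, len(word) - min_part + 1):
        left, right = word[:i], word[i:]
        if left in words and right in words:
            return left + "-" + right
    return None
```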

I think we can get away with omitting these from A-writer, and having a
stand-alone programme to handle them separately.  I am also discussing
a grammar checker (unix "style" like, but better) with a chap at Edinburgh
who has done one which seems pretty good.