There are some options which we might want to enable either via %control bits
or PARM settings:

1) Column-major vs row-major arrays: to switch them over, all array indices
   will have to be reversed, e.g. A(I,J) will have to be implemented as
   A[J][I] to duplicate Imp's layout.

2) Strictly speaking Imp does not define any order for evaluating the
   parameters in a procedure call; in practice, however, it has always been
   left to right. C also leaves the order unspecified, but in practice many
   compilers evaluate right to left. We could - under control of an option -
   actually reverse all the parameters in the generated C code. (Alternatively
   we could leave them alone and use %systemroutinespec to indicate that an
   external procedure should use the C convention.)

3) Turn on/off unassigned variable checking.

4) Very unlikely to implement this, but *could* reverse the bytes of any
   variable as it is written to or read from memory. The code would look
   ghastly though!

5) Implement consts using #define (with whatever method we come up with to
   handle undoing them at the end of a scope) vs replacing constant
   expressions with their folded value when possible.

6) Imp strings vs C strings. Note that Imp80/IOPT for EMAS3 introduced the
   concept of string descriptors, which were 64 bit objects with flags in the
   upper 32 bits and the address in the lower 32 bits.

7) Make the generated C more idiomatic at the expense of correctness, for the
   creation of a new C source which would then be modified and maintained
   manually, as opposed to a C source that was a stage in the Imp compilation
   and never visible to the programmer. Especially the conversion of Imp
   streams to C files, and folding of multiple calls into a single printf,
   etc. (May need more than one control option.) Use of C library vs Imp
   library. %mainep etc.

8) Backtracing options (as in the earlier "imps" Imp to C translator) and
   possible %signal handling.
9) Case style for generated C:
   a) all upper case
   b) Initial cap
   c) all lower case but with renaming for clashes
   d) follow the Imp case but with renaming for clashes using first case seen
   e) ditto but using most frequently seen version

10) Use of %dynamicroutinespec calls.

11) Handling of %include files.

12) Always adjust array base vs add extra zeroth item for (1:n) arrays.

13) It may need to be a command-line PARM rather than a %control, but it
    should be possible to specify big-endian byte sex and get the effect of
    that by compiling with the MIPS gcc (which is supported with Multilib - at
    least on my x86 it works under emulation using binfmt - I haven't checked
    to see if this is also the case on ARM).

n) Add more here as they come up.

-------------------------------------------------------------------------------

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  float f;
  // This confirms that C does type coercion bottom-up - 5/3 looks
  // like two ints so the result of that subexpression is 1, not 1.66666
  // and therefore the output is 1.5 rather than 2.16666

  // There is no direct parallel of this problem in Imp because Imp has
  // separate '/' and '//' operators for real and integer division, and
  // I don't think there are any other operators that could exhibit
  // different behaviour by treating their arguments as integers rather
  // than reals.  However this is an idiosyncrasy of C which we must
  // be aware of when translating Imp to C!  (Explicit casts will be
  // necessary for '3/5' or any operands of '/' that are typed as
  // integers.)
  f = 5/3 + 0.5;
  fprintf(stderr, "%2.8f\n", f);
  exit(0);
}

-------------------------------------------------------------------------------

The CST code creates tuples with:

  return T[0] = mktuple(G_ELSEQ, alt, phrases, T);

The AST code gets it wrong and returns T[0] which contains nothing sensible.
(It should contain an index to arbitrary user tuples.) Also $$ and $1 $2 etc
yacc-style to be added to takeon?
Aside: still need to put P_xxx in the grammar at index G_x...

-------------------------------------------------------------------------------

gtoal@linux:~/github/uparse-main$ cat array2d.c
int fred[6][4];
$define fred[a][b] fred[(a)+1][(b)+2]
fred[23][45] = 123;
gtoal@linux:~/github/uparse-main$ gtcpp < array2d.c
#line 1 ""
int fred[6][4];

fred[(23)+1][(45)+2] = 123;

Would be helpful to extend it with scopes - $begin/$end as in HAL. That would
also imply a command to push a definition, so that an earlier one encountered
before the last $begin is restored on a $end ...

btw may want to reverse the order of indexes because I believe Imp and C
differ as to row/column-major.

C's evaluation order for parameters (unspecified by the standard, and right to
left with many compilers) is annoying - but I did find a note by Peter
Stephens that Imp's order of evaluation was undefined and that if you wanted
to force an order you should assign each parameter to a variable in whatever
order you wanted before calling the procedure. So just document and maybe warn
rather than implement differently.

-------------------------------------------------------------------------------

#pragma GCC diagnostic error "-Wuninitialized"
  foo(a);                       /* error is given for this one */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wuninitialized"
  foo(b);                       /* no diagnostic for this one */
#pragma GCC diagnostic pop
  foo(c);                       /* error is given for this one */
#pragma GCC diagnostic pop
  foo(d);                       /* depends on command line options */

 - to allow checks but suppress them for things like << or calculating a hash
   where the multiplication is allowed to overflow. Maybe invoke with a
   %control?

-------------------------------------------------------------------------------

The issue of name clashes with C header files etc...
a possible avenue might be:

  ctags -v /usr/include/stdio.h -o /dev/stdout| grep -v ^_ |grep -v "^[^ ]*_"|grep -v ^[A-Z]|sort

-------------------------------------------------------------------------------

Some useful tips about bad C constructs to avoid in
https://progforperf.github.io/Expert_C_Programming.pdf

-------------------------------------------------------------------------------

Here's a valid Imp program...

gtoal@linux:~/github/uparse-main/lang/imp80$ cat casting.imp
%begin
  %real realvar
  realvar = 1.25@7
  ! Determine internal representation of floating point:
  write(INTEGER(ADDR(realvar)), 0) ; newline
%endofprogram

That program could plausibly be implemented in C by:

gtoal@linux:~/github/uparse-main/lang/imp80$ cat casting.c
#include <stdio.h>
int main(int argc, char **argv) {
  float Realvar;
  Realvar = 1.25e7;
  fprintf(stderr, "%d\n", *(int *)&Realvar);
  return (0);
}

And yet, it turns out that C allows that code to output practically any random
crap, because reading a float through an int pointer like that is Undefined
Behaviour! That simplified test program above doesn't reproduce the 'undefined
behaviour', but I have a larger real-life example that does.

Apparently there is a rule about 'strict aliasing', and the 'correct' way to
indirect through a cast pointer (as opposed to casting the actual variable,
which converts the value and so alters its representation) is to use memcpy:

gtoal@linux:~/github/uparse-main/lang/imp80$ cat casting-safe.c
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
  float Realvar;
  int Intvar;
  Realvar = 1.25e7;
  memcpy(&Intvar, &Realvar, sizeof(Realvar));
  fprintf(stderr, "%d\n", Intvar);
  return (0);
}

As you may guess, this makes translation of some Imp constructs into C
somewhat uglier, if we have to defend against this even though it appears to
work with the current compiler.

-------------------------------------------------------------------------------

perms:

// NOTE!!!
// As well as the *multiple* tracing/backtrace mechanisms I've been experimenting with
// for imp backtraces, there's also https://justine.lol/ftrace/ which I haven't even looked at yet,
// but anything by Justine Tunney has to be worth looking at.  Would be nice if Imp to C could
// produce platform independent binaries, for example using her APE project.

-------------------------------------------------------------------------------

For the longest time I thought only GCC supported nested procedures... today I
found out that tcc does too!  tcc is definitely a candidate for bundling with
imp to c.

-------------------------------------------------------------------------------
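
Going back to the order of evaluation of parameters: Peter Stephens' advice
(assign each parameter to a variable in the order you want before calling)
could also be what the translator emits when an option asks for guaranteed
left-to-right evaluation. A minimal sketch, assuming a made-up three-parameter
routine; arg(), sum3(), call_left_to_right() and the trace array are
hypothetical names for illustration only, not anything the translator
currently generates:

```c
/* Hypothetical helpers: every call to arg() records the order it ran in. */
static int trace[8];
static int trace_len = 0;

static int arg(int id)
{
    if (trace_len < 8)
        trace[trace_len++] = id;   /* remember evaluation order */
    return id * 10;
}

static int sum3(int a, int b, int c)
{
    return a + b + c;
}

/* What might be emitted for the Imp call  sum3(arg(1), arg(2), arg(3))
   when left-to-right evaluation has to be guaranteed: each actual
   parameter is evaluated into its own temporary, in source order,
   before the call itself is made. */
static int call_left_to_right(void)
{
    int t1 = arg(1);
    int t2 = arg(2);
    int t3 = arg(3);
    return sum3(t1, t2, t3);
}
```

After a call to call_left_to_right(), trace holds 1, 2, 3 regardless of the
order the C compiler would have picked for sum3(arg(1), arg(2), arg(3)). The
generated code is uglier, which fits with only doing this under an option.

-------------------------------------------------------------------------------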
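
For option 12 (adjust array base vs extra zeroth item for (1:n) arrays), the
two layouts might look like this in generated C. This is only a sketch under
one reading of that option: N, a_padded, b_store, the B() macro and the two
sum routines are hypothetical names. The adjusted version subtracts the lower
bound at each access rather than keeping a pointer to b_store - 1, because
forming a pointer before the start of an array is itself undefined behaviour
in ISO C.

```c
enum { N = 5 };   /* the Imp declaration would be  %integerarray A(1:N)  */

/* Variant 1: one extra element; index 0 is simply never used. */
static int a_padded[N + 1];

/* Variant 2: exact-size storage; the lower bound is subtracted on access. */
static int b_store[N];
#define B(i) b_store[(i) - 1]

/* Fill A(1:N) with squares using the padded layout and return their sum. */
static int sum_padded(void)
{
    int total = 0;
    for (int i = 1; i <= N; i++) {
        a_padded[i] = i * i;
        total += a_padded[i];
    }
    return total;
}

/* Same loop, written against the base-adjusted layout. */
static int sum_adjusted(void)
{
    int total = 0;
    for (int i = 1; i <= N; i++) {
        B(i) = i * i;
        total += B(i);
    }
    return total;
}
```

Both routines index from 1 to N exactly as the Imp source would. The padded
form costs one unused element per array; the adjusted form costs a
subtraction per access unless the C compiler folds it away.

-------------------------------------------------------------------------------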