If both left & right are variables (not constants), the push & pop for one of them can be elided, and that parameter placed inline in the cmp.
However you cannot do it for *both* parameters: cmp cannot take two memory operands.

           push dword LEFT
           push dword RIGHT
                                        ;; CMPEQ
           pop eax
           pop ecx
           cmp ecx,eax

=>
           push dword LEFT
           pop ecx
           cmp ecx,RIGHT

=>
           push dword RIGHT
                                        ;; CMPEQ
           pop eax
           cmp LEFT,eax
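
A rough sketch of the first rewrite above, in Python rather than the optimiser's own code. The (opcode, opd1, opd2) tuples and the name elide_cmp_pushes are made up for illustration; only RIGHT is placed inline, LEFT still goes through ecx.

```python
def elide_cmp_pushes(code):
    """Rewrite  push L / push R / pop eax / pop ecx / cmp ecx,eax
    into        push L / pop ecx / cmp ecx,R.
    Instructions are hypothetical (opcode, opd1, opd2) tuples."""
    out = []
    i = 0
    while i < len(code):
        w = code[i:i + 5]
        if (len(w) == 5
                and w[0][0] == "push" and w[1][0] == "push"
                and w[2] == ("pop", "eax", None)
                and w[3] == ("pop", "ecx", None)
                and w[4] == ("cmp", "ecx", "eax")):
            left, right = w[0][1], w[1][1]
            # keep LEFT going through ecx, put RIGHT straight into the cmp
            out += [("push", left, None),
                    ("pop", "ecx", None),
                    ("cmp", "ecx", right)]
            i += 5
        else:
            out.append(code[i])
            i += 1
    return out
```

The push/pop pair that remains for LEFT is what a later pass would turn into a plain mov.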



if the first parameter is a more complex expression, there won't be a PUSH for it (it'll be something like an ADD), so there's only one rewrite:

           op
           push dword RIGHT
                                        ;; CMPEQ
           pop eax
           pop ecx
           cmp ecx,eax

=>
           op
                                        ;; CMPEQ
           pop ecx
           cmp ecx,RIGHT


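need to check if CMP [var],const  is allowed.  (it should be: x86 has a CMP r/m32, imm32 form, but NASM needs an explicit size, e.g. cmp dword [var], 0)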

=======

To catch chains like this:

            mov    ecx, dword [code]               ;; PUSH code
            cmp    ecx, dword 'J'                  ;; PUSH # 'J'  ;; CMPEQ
            je     xj                              ;; BF F_1148  ;; B xj
                                                   ;; F_1148 :
            mov    ecx, dword [code]               ;; PUSH code
            cmp    ecx, dword 'K'                  ;; PUSH # 'K'  ;; CMPEQ
            je     xk                              ;; BF F_1149  ;; B xk
                                                   ;; F_1149 :

we need to do the register-remembering and drop of subsequent loads *before* converting the above to cmp dword [code], dword 'K'

=======

need to be able to peephole arbitrarily long sequences, to handle:

            push   dword [ci]                      ;; PUSH ci  ;; PUSH & c     ; note param is rel to ebp in procedures
            mov    eax, dword [text]               ;; PUSH text
            add    eax, dword 1                    ;; PUSH # 1  ;; ADD  ;; POPI 4
            mov    ecx, c
            pop    edx
                         (can load edx out of order rather than pushing)

as well as

            mov    ecx, dword [code]               ;; PUSH code 
            cmp    ecx, dword 'M'                  ;; PUSH # 'M'  ;; CMPEQ 
            je     xm                              ;; BF F_1139  ;; B xm  ;; F_1139 : 
            mov    ecx, dword [code]               ;; PUSH code 
            cmp    ecx, dword 'W'                  ;; PUSH # 'W'  ;; CMPEQ 
            je     xw                              ;; BF F_1140  ;; B xw  ;; F_1140 : 
            mov    ecx, dword [code]               ;; PUSH code 
            cmp    ecx, dword 'L'                  ;; PUSH # 'L'  ;; CMPEQ 
            je     xl                              ;; BF F_1141  ;; B xl  ;; F_1141 : 
            mov    ecx, dword [code]               ;; PUSH code 
            cmp    ecx, dword 'R'                  ;; PUSH # 'R'  ;; CMPEQ 
            je     xr                              ;; BF F_1142  ;; B xr  ;; F_1142 : 

... just loop forward as long as the opcodes are ones that we expect.  remove redundant movs when found

remember to check for psects and labels.  (until psect hack to remove them all at the start is done)
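
A minimal sketch of the "loop forward and drop redundant movs" idea, assuming instructions are held as (opcode, opd1, opd2) tuples and labels/psects show up as their own pseudo-opcodes. The names drop_redundant_loads, PLAIN_LOAD and TRANSPARENT are hypothetical, and the set of "expected" opcodes is only a guess at what the real pass would allow.

```python
import re

PLAIN_LOAD = re.compile(r"dword \[\w+\]")          # e.g. "dword [code]"
TRANSPARENT = {"cmp", "je", "jne", "jl", "jge"}    # assumed not to write registers or memory

def drop_redundant_loads(code):
    known = {}                      # register -> memory operand it currently mirrors
    out = []
    for op, a, b in code:
        if op == "mov" and b is not None and PLAIN_LOAD.fullmatch(b):
            if known.get(a) == b:
                continue            # register already holds this value: drop the load
            known[a] = b
        elif op not in TRANSPARENT:
            known.clear()           # label, psect, store, anything unexpected: forget all
        out.append((op, a, b))
    return out
```

Forgetting everything at a label is deliberately pessimistic: another jump may arrive there with different register contents.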

==============


Something has to be possible here...

            mov    ecx, dword [num]                ;; PUSH num 
            cmp    ecx, dword 0                    ;; PUSH # 0  ;; CMPEQ 
            je     xzero

maybe some form of "test"?  (test needs two operands, so "test dword [num]" on its own won't assemble.)   Or  "cmp dword [num], dword 0"  (and other non-0 constants)

however this cannot be done *after* register remembering, because the next instructions might still rely on the value left in ecx:

           cmp    ecx, dword 'C'
           je     xc

IDEA!  one way to mark an opcode as 'don't mess with me any more' is to change the opcode from
lower case to upper case.  will still assemble the same, but strcmp will fail.  (hack :-) )
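
The upper-case trick can be illustrated like this. It is a sketch only: freeze and matches are invented names, and Python's == stands in for the case-sensitive strcmp the notes refer to.

```python
def freeze(ins):
    """Mark an instruction as 'don't touch' by upper-casing its opcode.
    NASM treats mnemonics case-insensitively, so the output assembles the same."""
    op, a, b = ins
    return (op.upper(), a, b)

def matches(ins, opcode):
    """Exact, case-sensitive opcode test (the analogue of strcmp(...) == 0)."""
    return ins[0] == opcode
```

A pattern that looks for "cmp" then simply never fires on a frozen ("CMP", ...) entry, with no extra flag needed in the instruction record.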

==============


            mov    dword [i], eax                  ;; POP i 
            mov    ecx, dword [i]                  ;; PUSH i 
   =>
            mov    dword [i], eax                  ;; POP i 
            mov    ecx, eax

                 if i.opd1 == i+1.opd2    - trivial case of register remembering
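
A sketch of that trivial case, assuming (opcode, opd1, opd2) tuples; forward_store is a made-up name. When the reload targets the same register that was just stored, the reload is dropped entirely.

```python
def forward_store(code):
    out = []
    for ins in code:
        prev = out[-1] if out else None
        if (prev is not None and prev[0] == "mov" and ins[0] == "mov"
                and prev[1] == ins[2]                  # i.opd1 == (i+1).opd2
                and prev[1].startswith("dword [")):    # and it is a memory operand
            src, dst = prev[2], ins[1]
            if src != dst:
                out.append(("mov", dst, src))          # reuse the stored value directly
            # src == dst: the reload is a no-op, so emit nothing
        else:
            out.append(ins)
    return out
```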

==============
still need to handle adding constants directly to memory, plus any special case for inc/dec?

            mov    ecx, dword [fp]                 ;; PUSH fp 
            add    ecx, dword 1                    ;; PUSH # 1  ;; ADD 
            mov    dword [fp], ecx                 ;; POP fp 

==============
 access processing: if we remove a jmp to an F_ label, look ahead and remove the label too.  wipeout if needed.

==============

            mov    dword [matched_k], eax          ;; POP matched_k 
            push   dword [matched_k]               ;; PUSH matched_k 
=>
            mov    dword [matched_k], eax          ;; POP matched_k 
            push   eax                             ;; PUSH matched_k 
=============

did I miss this one?  maybe only do it the other way round?

            mov    dword [i], eax                  ;; POP i 
            mov    eax, dword [i]                  ;; PUSH i 
=>
            mov    dword [i], eax
=============
            mov    ecx, dword [sym]                ;; PUSH sym 
            cmp    ecx, dword ' '                  ;; PUSH # ' '  
                                                   ;; CMPEQ 
            je     B_1013                          ;; BT B_1013  
                                                   ;; PUSH sym 
            cmp    ecx, dword ' '                  ;; PUSH # ' '  
                                                   ;; CMPLT 
            jge    F_1015                          ;; BF F_1015 
            mov    eax, dword 0                    ;; PUSH # 0  
                                                   ;; RET 1        ; pass result back in register eax
            mov    esp, ebp                                        ; takedown stack frame
            pop    ebp                                             ; same as "leave"
            ret                                                    ; return

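can avoid the second comparison: je leaves the flags untouched and ecx is unchanged, so the flags from the first cmp are still valid. (very small gain)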
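
One possible shape for this, as a sketch: drop a cmp that is textually identical to the previous cmp when only conditional jumps sit between them. drop_repeated_cmp and the JCC set are assumptions, and anything that is not a cmp or a conditional jump (including a label) resets the memory.

```python
JCC = {"je", "jne", "jl", "jge", "jg", "jle"}   # conditional jumps do not modify flags

def drop_repeated_cmp(code):
    out = []
    last_cmp = None
    for ins in code:
        op = ins[0]
        if op == "cmp":
            if ins == last_cmp:
                continue          # flags from the identical earlier cmp are still valid
            last_cmp = ins
        elif op not in JCC:
            last_cmp = None       # may change flags or operands, or be a join point
        out.append(ins)
    return out
```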

============

%define ...
sub esp, 4
%define ...
sub esp, 4

can cascade just with a 3-item lookahead.

ignore first define.

sub esp, 4
%define ...
sub esp, 4

=>

%define ...
sub esp, 8

then

sub esp, 8
%define ...
sub esp, 4

=>

%define ...
sub esp, 12

trick is not to look for '4' but any integer constant.  Then add it and put it back for re-processing.

note %define is in .text psect
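
The cascade can be sketched with a 3-item window that is re-tested after each merge. This assumes (opcode, opd1, opd2) tuples with %define as its own pseudo-opcode; fold_sub_esp is a hypothetical name.

```python
def fold_sub_esp(code):
    def is_sub_esp(ins):
        # any integer constant, not just 4
        return ins[0] == "sub" and ins[1] == "esp" and ins[2].isdigit()

    work = list(code)
    i = 0
    while i + 2 < len(work):
        first, mid, last = work[i:i + 3]
        if is_sub_esp(first) and mid[0] == "%define" and is_sub_esp(last):
            total = int(first[2]) + int(last[2])
            # the %define moves up, the merged sub takes the second slot
            work[i:i + 3] = [mid, ("sub", "esp", str(total))]
        # stepping by one lands on the merged sub, so it is re-processed
        i += 1
    return work
```

Because the merged sub is put back into the list, three or more allocations collapse into a single sub esp without any longer lookahead.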
================

            mov    ecx, dword [fp] ;; PUSH fp
            add    ecx, dword 1                    ;; PUSH # 1   (or any constant)
                                                   ;; ADD
            mov    dword [fp], ecx                 ;; POP fp

=>
            add    dword [fp], dword 1             
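
A sketch of this rewrite on (opcode, opd1, opd2) tuples; add_to_memory is an invented name. It assumes the register is not needed afterwards, which the sketch does not check: after the rewrite ecx no longer holds the new value.

```python
def add_to_memory(code):
    out = []
    i = 0
    while i < len(code):
        w = code[i:i + 3]
        if (len(w) == 3
                and w[0][0] == "mov" and w[0][2].startswith("dword [")          # load
                and w[1][0] == "add" and w[1][1] == w[0][1]                     # add to same reg
                and w[1][2].startswith("dword ") and "[" not in w[1][2]         # a constant
                and w[2] == ("mov", w[0][2], w[0][1])):                         # store back
            out.append(("add", w[0][2], w[1][2]))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out
```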

================

knock-on effect of avoiding the store-to-store increment when the memory address is reused soon after:

            mov    dword [pp], ecx                 ;; POP pp
            mov    ecx, a                          ;; PUSH & a
            mov    eax, dword [pp]                 ;; PUSH pp

=>

            mov    dword [pp], ecx                 ;; POP pp
            mov    eax, ecx                        ;; PUSH & a
            mov    ecx, a                          ;; PUSH pp
===============
            mov    ecx, dword [cmax]               ;; PUSH cmax 
            cmp    ecx, dword 0                    ;; PUSH # 0  
                                                   ;; CMPNE 
            je     F_1072                          ;; BF F_1072 


as long as this isn't part of a skip chain, can be optimised perhaps by

            cmp    dword [cmax], dword 0
            je     F_1072

or possibly some variant of the "test" opcode?  
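
A sketch of the cmp-against-memory form with a crude "not part of a skip chain" guard. cmp_memory_direct, reads and JCC are hypothetical names, and the guard only looks at the single instruction after the jump; a real pass would need to know the register is dead on both paths.

```python
JCC = {"je", "jne", "jl", "jge", "jg", "jle"}

def reads(ins, reg):
    """True if reg appears in any operand of ins (a crude textual test)."""
    return any(reg in (opd or "") for opd in ins[1:])

def cmp_memory_direct(code):
    out = []
    i = 0
    while i < len(code):
        w = code[i:i + 4]
        if (len(w) >= 3
                and w[0][0] == "mov" and w[0][2].startswith("dword [")
                and w[1][0] == "cmp" and w[1][1] == w[0][1]
                and w[1][2].startswith("dword ") and "[" not in w[1][2]
                and w[2][0] in JCC
                and not (len(w) == 4 and reads(w[3], w[0][1]))):
            # compare the variable in place; the register load disappears
            out += [("cmp", w[0][2], w[1][2]), w[2]]
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out
```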

===============

when optimising this construct:

            mov    edx, dword [sym]                ;; PUSH sym
            mov    ecx, stored                     ;; PUSH & stored
            mov    eax, dword [i]                  ;; PUSH i
                                                   ;; POPI 4
            mov    [ecx+eax*4], edx
 
remember you cannot do:

            mov    edx, dword [ecx+eax*4]
            mov    ecx, a                          ;; PUSH & a
            mov    eax, dword [pp]                 ;; PUSH pp
                                                   ;; POPI 4
            mov    [ecx+eax*4], edx                ;; PUSH pp
 
also want to handle
            mov    edx, dword [num]                ;; PUSH num
                                                   ;; PUSH & c
            mov    eax, dword [cmax]               ;; PUSH cmax
            add    eax, dword 2                    ;; PUSH # 2
                                                   ;; ADD
                                                   ;; POPI 4
            mov    ecx, c
            mov    [ecx+eax*4], edx
 
want to optimise constants in eax...


===============

next 2 to be done (lest I forget):

            mov    edx, ecx                        ;; PUSH & a
                                                   ;; PUSH fp
                                                   ;; POPI 4
            mov    [a+eax*4], edx

and

            mov    ecx, a                          ;; PUSH pp
                                                   ;; PUSHI 4
            mov    edx, dword [ecx+eax*4]

================

can now consolidate into a constant offset:

            mov    eax, dword [text]               ;; PUSH text
            add    eax, dword 1                    ;; PUSH # 1
                                                   ;; ADD
                                                   ;; PUSHI 4
            mov    eax, dword [c+eax*4]
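
A sketch of folding the add into the address, limited to the load case shown here where the indexed load overwrites the index register (so the un-incremented index cannot leak out). fold_index_constant is a made-up name; the *4 scale and the operand spelling are taken from the snippet above.

```python
import re

def fold_index_constant(code):
    out = []
    i = 0
    while i < len(code):
        w = code[i:i + 2]
        if (len(w) == 2 and w[0][0] == "add" and w[1][0] == "mov"
                and w[1][1] == w[0][1]):               # load overwrites the index register
            reg = w[0][1]
            const = re.fullmatch(r"dword (\d+)", w[0][2])
            addr = re.fullmatch(r"dword \[(\w+)\+" + re.escape(reg) + r"\*4\]", w[1][2])
            if const and addr:
                folded = "dword [%s+%s*4+%s*4]" % (addr.group(1), reg, const.group(1))
                out.append(("mov", reg, folded))
                i += 2
                continue
        out.append(code[i])
        i += 1
    return out
```

The store form (mov [c+eax*4+2*4], edx) would need an extra check that the index register is dead afterwards, which this sketch leaves out.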
 
================
*May* be able to remove the first mov below, if ecx isn't reused somewhere else.
If it were, it would have to be protected by a "Mov" hack.

xlb:                                               ;; xlb :
            mov    ecx, dword [num]                ;; PUSH num
            add    ecx, dword 1                    ;; PUSH # 1
                                                   ;; ADD
            mov    edx, ecx                        ;; PUSH & c
            mov    eax, dword [text]               ;; PUSH text
                                                   ;; PUSH # 2
                                                   ;; ADD
                                                   ;; POPI 4
            mov    [c+eax*4+2*4], edx
            jmp    get                             ;; B get
================

TO DO:

            push   edx
            pop    dword [xrem]                    ;; POP xrem

trivial, but not handled.  see one of the regression tests.


Also:

            push   dword [A1]                      ;; PUSH A1
            mov    eax, dword [A1]                 ;; PUSH A1

... swap order and push eax...
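
Both TO DO items fit a 2-item window. A sketch, with tidy_pushes, is_var and REGS as invented names; it assumes memory operands are plain variables and skips anything addressed through esp, since a push changes esp between the two instructions.

```python
REGS = {"eax", "ebx", "ecx", "edx"}

def is_var(opd):
    """Memory operand such as "dword [A1]", not addressed through esp."""
    return opd is not None and opd.startswith("dword [") and "esp" not in opd

def tidy_pushes(code):
    out = []
    i = 0
    while i < len(code):
        w = code[i:i + 2]
        if (len(w) == 2 and w[0][0] == "push" and w[1][0] == "pop"
                and w[0][1] in REGS and is_var(w[1][1])):
            out.append(("mov", w[1][1], w[0][1]))      # push reg / pop [v]  ->  mov [v], reg
            i += 2
        elif (len(w) == 2 and w[0][0] == "push" and w[1][0] == "mov"
                and is_var(w[0][1]) and w[1][2] == w[0][1] and w[1][1] in REGS):
            out += [w[1], ("push", w[1][1], None)]     # load first, then push the register
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out
```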

