[ASM] Compare 16-bit registers?

Moderator: MaxCoderz Staff

qarnos
Maxcoderz Staff
Posts: 227
Joined: Thu 01 Dec, 2005 9:04 am
Location: Melbourne, Australia

Post by qarnos »

Timendus wrote: So, the subtraction sbc hl,bc sets all the flags right and add hl,bc restores de without screwing up the flags?
add hl, xx does affect the carry flag - just not the zero flag. The carry is fine, however, because if the sbc caused a carry, then the add that undoes it will also cause one (and if the sbc didn't, the add won't either). E.g.:

1 - 3 = -2 with carry.
-2 + 3 = 1 with carry.
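
If you want to convince yourself of that, here is a rough model of the two instructions. It's Python rather than Z80, the helper names (sbc16, add16, compare_and_restore) are made up for illustration, and it only tracks the 16-bit result and the carry, assuming an or a cleared the carry beforehand:

```python
MASK = 0xFFFF

def sbc16(hl, bc, carry):
    """Model `sbc hl,bc`: returns (result, carry_out); carry out means borrow."""
    diff = hl - bc - carry
    return diff & MASK, int(diff < 0)

def add16(hl, bc):
    """Model `add hl,bc`: returns (result, carry_out)."""
    total = hl + bc
    return total & MASK, int(total > MASK)

def compare_and_restore(hl, bc):
    """Model `or a` / `sbc hl,bc` / `add hl,bc`.

    Returns (hl_after, carry_after_sbc, carry_after_add).
    """
    diff, c_sbc = sbc16(hl, bc, 0)      # `or a` cleared the carry first
    restored, c_add = add16(diff, bc)   # undo the subtraction
    return restored, c_sbc, c_add

print(compare_and_restore(1, 3))   # HL < BC -> (1, 1, 1)
print(compare_and_restore(3, 1))   # HL > BC -> (3, 0, 0)
```

In both directions the carry left by the add matches the one left by the sbc, and HL comes back unchanged.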
Timendus wrote: That sounds pretty good. In fact, I can probably just throw one value away after the comparison, and just do

Code: Select all

ex de,hl
or a
sbc hl,bc
ex de,hl
Is that or a really necessary, by the way? Doesn't sbc take care of the carry flag, or does it only set and not reset it?
As well as affecting the carry, sbc will also subtract an extra 1 if the carry flag is set when the instruction is executed (that's where it gets its name from - subtract with carry). Example:

Code: Select all

ld hl, 6
ld de, 5
scf        ; set carry flag
sbc hl, de ; hl will contain 0 (6 - 5 - 1)

ld hl, 6
ld de, 5
or a       ; zero carry
sbc hl, de ; hl will contain 1 (6 - 5 - 0)
Of course, you can get rid of the or if you know the carry flag will be reset by this point anyway. It's just there to be sure.
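
The same arithmetic as a tiny Python sketch (not Z80; sbc_hl is a hypothetical helper that models only the 16-bit result, not the flags). The last line is the case that bites in a comparison: with a stale carry, two equal values no longer give zero.

```python
def sbc_hl(hl, de, carry_in):
    """Model the 16-bit result of `sbc hl,de` for an incoming carry of 0 or 1."""
    return (hl - de - carry_in) & 0xFFFF

print(sbc_hl(6, 5, 1))        # after scf  -> 0
print(sbc_hl(6, 5, 0))        # after or a -> 1
print(hex(sbc_hl(5, 5, 1)))   # equal values, stale carry -> 0xffff
```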
"I don't know why a refrigerator is now involved, but put that aside for now". - Jim e on unitedti.org

avatar courtesy of driesguldolf.
Timendus
Calc King
Posts: 1729
Joined: Sun 23 Jan, 2005 12:37 am
Location: Netherlands

Post by Timendus »

Okay, thanks! Should have figured that out myself ;)
http://clap.timendus.com/ - The Calculator Link Alternative Protocol
http://api.timendus.com/ - Make your life easier, leave the coding to the API
http://vera.timendus.com/ - The calc lover's OS

Post by qarnos »

Sorry for the necropost, but I've stumbled on a cool trick for doing comparisons byte-by-byte. It doesn't work exactly like CP, but is very useful when you are short on registers.

I came up with this for comparing against IX (which has no SBC instruction), but it can be modified easily enough.

Code: Select all

            ld      a, ixl
            cpl
            add     a, c
            ld      a, ixh
            cpl
            adc     a, b
            
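            ; m flag set if BC <= IX, provided BC - IX - 1 fits in a signed
            ; 16-bit value; for unsigned values test carry (NC if BC <= IX)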
Since I couldn't SBC, I decided to ADD instead. The CPL of a number is NEG without the increment - so here I am effectively doing (-IX - 1 + BC) - hence the result is negative if BC <= IX. CPL doesn't affect the carry flag, so the ADC is fine.
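
For anyone who wants to poke at the flags, here is a byte-by-byte model of the sequence. It's a Python sketch, not Z80, and cpl_compare is a made-up name; it only tracks A, the carry and the sign bit. One thing it shows: the sign flag matches BC <= IX as long as BC - IX - 1 doesn't overflow a signed 16-bit value, while for arbitrary unsigned values the carry is the safer flag to test.

```python
def cpl_compare(bc, ix):
    """Model ld a,ixl / cpl / add a,c / ld a,ixh / cpl / adc a,b.

    Returns (a, carry, sign) as left by the final adc.
    """
    b, c = bc >> 8, bc & 0xFF
    ixh, ixl = ix >> 8, ix & 0xFF

    low = (ixl ^ 0xFF) + c            # cpl, then add a,c
    carry = low >> 8                  # carry out of the low byte
    high = (ixh ^ 0xFF) + b + carry   # cpl leaves carry alone, then adc a,b
    a = high & 0xFF
    return a, high >> 8, a >> 7

for bc, ix in [(5, 9), (9, 9), (10, 9), (0x0000, 0x8000)]:
    a, carry, sign = cpl_compare(bc, ix)
    print(f"BC={bc:#06x} IX={ix:#06x} carry={carry} sign={sign}")
```

The first three pairs behave as described (sign set for BC <= IX, clear for BC > IX). The last pair has BC <= IX as unsigned numbers but leaves the sign clear, because the difference no longer fits in 16 signed bits; the carry is still clear there, as it is for every BC <= IX.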
"I don't know why a refrigerator is now involved, but put that aside for now". - Jim e on unitedti.org

Dwedit
Maxcoderz Staff
Posts: 579
Joined: Wed 15 Dec, 2004 6:06 am
Location: Chicago!

Post by Dwedit »

Is using IXH and IXL worth the huge overhead? I haven't timed it, but is pushing and popping other registers faster?
You know your hexadecimal output routine is broken when it displays the character 'G'.

Post by qarnos »

Dwedit wrote: Is using IXH and IXL worth the huge overhead? I haven't timed it, but is pushing and popping other registers faster?
Surprisingly, no. In my situation, I need to preserve all registers except A (otherwise I wouldn't be using IX to begin with!).

First, my code:

Code: Select all

        ld      a, ixl      ; [8]
        cpl                 ; [4]
        add     a, c        ; [4]
        ld      a, ixh      ; [8]
        cpl                 ; [4]
        adc     a, b        ; [4]


That's 32 T-States. Now, the obvious way:

Code: Select all

        push    ix          ; [15]
        ex      (sp), hl    ; [19]
        sbc     hl, bc      ; [15]
        add     hl, bc      ; [11]
        ex      (sp), hl    ; [19]
        pop     ix          ; [14]
A whopping 93 T-States. Even if I remove the need to preserve HL:

Code: Select all

        
        push    ix          ; [15]
        pop     hl          ; [10]
        sbc     hl, bc      ; [15]


40 T-States - still 8 T-States slower.

I'll try trashing DE instead:

Code: Select all

        ld      e, ixl      ; [8]
        ld      d, ixh      ; [8]
        ex      de, hl      ; [4]
        sbc     hl, bc      ; [15]
        ex      de, hl      ; [4]
That's 39 T-states - still not quite there.
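
For reference, the totals can be re-added mechanically. This is just a Python tally of the per-instruction counts annotated in the listings above; the labels are mine, not official names:

```python
# T-state counts copied from the annotated listings in this post.
variants = {
    "cpl/add/adc via ixl, ixh": [8, 4, 4, 8, 4, 4],
    "push ix / ex (sp),hl":     [15, 19, 15, 11, 19, 14],
    "push ix / pop hl":         [15, 10, 15],
    "ld e,ixl / ld d,ixh":      [8, 8, 4, 15, 4],
}

totals = {name: sum(costs) for name, costs in variants.items()}
for name, total in totals.items():
    print(f"{name:26} {total:3} T-States")
```

Note that none of the sbc variants in the tally include an or a. If the carry isn't known to be clear on entry, that costs them another 4 T-States, while the cpl version starts with a plain add and doesn't care about the incoming carry.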

Obviously, using IX for mathematics isn't desirable, but sometimes you have no choice.

I'll also add that this isn't limited to using IX. I have used it now in a bubble sort routine:

Code: Select all

            ;-------------------------------------------------------------------
            ; Compare key (DE) to key (HL)
            ; Must preserve BC, DE & HL
            ;-------------------------------------------------------------------
_swapLoop:  ld      a, (de)         ; [7] do -(DE) - 1 + (HL)
            inc     de              ; [6]
            cpl                     ; [4]
            add     a, (hl)         ; [7]
            inc     hl              ; [6]
            ld      a, (de)         ; [7]
            cpl                     ; [4]
            adc     a, (hl)         ; [7]
            dec     de              ; [6]
            dec     hl              ; [6]
            jp      m, _loopTail    ; [10]
So it has its uses when registers are scarce.
driesguldolf
Extreme Poster
Posts: 395
Joined: Thu 17 May, 2007 4:49 pm
Location: $4080

Post by driesguldolf »

:o

I'm so going to use that.

I have a similar problem with my binary search. Though my 'problem' is that I want to use IX to access a list of structs. Using HL has the disadvantage that one change in the struct could break the entire code.
tr1p1ea
Maxcoderz Staff
Posts: 4141
Joined: Thu 16 Dec, 2004 10:06 pm
Location: I cant seem to get out of this cryogenic chamber!

Post by tr1p1ea »

People used to preach harshly about staying away from IX, but I like to use it. Although the instructions on their own are slow, it can save you some serious T-States if used creatively, as demonstrated.
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."

Post by driesguldolf »

I like index regs because they allow flexibility in accessing a list of structures.
Using HL and inc/dec on the fly can cause major problems. Any change in the structure itself can and will completely break your code (while with ix you just change the constants you defined to access its members).

Also, accessing the struct randomly (first the last byte, then the first) or conditionally will screw things up when using HL, while this is exactly the reason why index regs exist in the first place. I just like to be consistent even if it isn't needed. (At least that's what the teachers are constantly telling us: always abstract things and stay consistent :P)

Post by qarnos »

I only just noticed this, but the code:

Code: Select all

            ld      a, c
            cpl
            add     a, l
            ld      a, b
            cpl
            adc     a, h
            ; m flag set if HL <= BC (the sum is HL - BC - 1)
is actually 6 clocks faster than:

Code: Select all

            or      a
            sbc     hl, bc
            add     hl, bc
24 T-States versus 30. So there's another one to keep in mind if you want to do a <= test. Additionally, with the first case, you can do the actual <= test with one JP instruction (JP m) whereas the second requires two (JP C/M, JP Z).

Post by qarnos »

Sorry for the double post, but I've just realised we've been looking at this the wrong way. Doing it byte by byte is fine, but who says we have to do the low byte first?

Code: Select all

            ld      a, b        ; [4]
            cp      h           ; [4]
            jr      nz, $ + $04 ; [12/7]
            ld      a, c        ; [4]
            cp      l           ; [4]
This checks the high byte first. If H > B then HL must be > BC and vice-versa. If the high bytes are equal, then the low bytes determine the answer.

This code takes either 20 T-States if H & B are not equal or 23 T-States otherwise. I reckon that must be about the fastest possible way to do it and it only destroys A.
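
To see what the flags mean after either exit, here is a small Python model of the routine. Again it's a sketch, not Z80; compare_high_first is a made-up name, and the T-State figures are the ones annotated in the listing above. Z and carry come from whichever cp ran last:

```python
def compare_high_first(bc, hl):
    """Model ld a,b / cp h / jr nz,skip / ld a,c / cp l.

    Returns (zero, carry, t_states) as left by whichever cp ran last.
    """
    b, c = bc >> 8, bc & 0xFF
    h, l = hl >> 8, hl & 0xFF

    if b != h:
        # cp h already decided it; jr nz is taken (12 T-States)
        return 0, int(b < h), 4 + 4 + 12
    # high bytes equal: jr nz falls through (7) and the low bytes decide
    return int(c == l), int(c < l), 4 + 4 + 7 + 4 + 4

print(compare_high_first(0x1234, 0x1234))  # -> (1, 0, 23)
print(compare_high_first(0x1234, 0x1300))  # -> (0, 1, 20)
print(compare_high_first(0x12FF, 0x1200))  # -> (0, 0, 23)
```

So the result reads like a 16-bit cp of BC against HL: Z if they are equal, carry if BC < HL (unsigned).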
"I don't know why a refrigerator is now involved, but put that aside for now". - Jim e on unitedti.org

calc84maniac
Regular Member
Posts: 112
Joined: Wed 18 Oct, 2006 7:34 pm
Location: The ex-planet Pluto

Post by calc84maniac »

Nice, that one even returns Z if they're equal unlike some of the alternatives. :)
~calc84maniac has spoken.

Projects:
F-Zero 83+
Project M (Super Mario for 83+)
King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Post by King Harold »

I thought that was the "normal" way?

Post by calc84maniac »

I meant "unlike some of the routines people have supplied".