I'm trying to optimize this sprite routine (it works perfectly as-is).
Does anyone have any more optimizations for it? I've looked for a while...
Excuse my comments, I'm a new assembly programmer.
; ----------------------------------------------------------------------
; Draws an 8x8 sprite.
; ----------------------------------------------------------------------
DrawSprite:
; ---------------------------------------------------------------------
; Need to multiply by something not a power of two.
; ---------------------------------------------------------------------
ld h, 0
ld d, h
ld e, l
; ---------------------------------------------------------------------
; Calculate the vertical offset.
;
; Each line of pixels has 12 bytes, so multiply the vertical by 12.
; ---------------------------------------------------------------------
add hl, hl
add hl, de
add hl, hl
add hl, hl
; ---------------------------------------------------------------------
; Calculate the byte to start shifting the sprite at, horizontally.
;
; Need to divide by 8 since there's 8 bits to a byte.
; ---------------------------------------------------------------------
sra c
sra c
sra c
add hl, bc
; ---------------------------------------------------------------------
; Get the address of the buffer, find the new offset.
; ---------------------------------------------------------------------
ld de, $9340
add hl, de
; ---------------------------------------------------------------------
; Prepare to draw the sprite.
; ---------------------------------------------------------------------
ld e, 11
ld b, 8
; ---------------------------------------------------------------------
; Vertical shift loop.
; ---------------------------------------------------------------------
vert_shift:
ld a, (ix)
ld (hl), a
; ---------------------------------------------------------------------
; Save the current row being worked on.
; Get the number of horizontal shifts to do.
; ---------------------------------------------------------------------
ld d, b
ld b, c
; ------------------------------------------------------------------
; Horizontal shift loop.
; ------------------------------------------------------------------
horiz_shift:
ld a, (hl)
srl a
ld (hl), a
inc hl
ld a, (hl)
rr a
ld (hl), a
djnz horiz_shift
; --------------------------------------------------------------------
; Restore the loop counter for vertical loop.
; Increment offsets and continue.
; --------------------------------------------------------------------
ld b, d
ld d, 0
inc ix
add hl, de
djnz vert_shift
ret
Trying to optimize this sprite routine...
- Jim e
I'm fairly certain that doesn't work. First, register C, which should contain X, is divided by 8 to get the X offset in memory, but you use it later as though it contains the bit shift count. That won't work.
Second, the way you have it, you are shifting the graph buffer itself. That assumes the graph buffer has nothing useful there.
There are a number of optimizations that could be made, but it would end up just being a rewrite.
I edited the routine so that it XORs the sprite to the buffer. I'm just posting the code again in case anyone wanted it or has any suggestions or optimizations.
Sorry for the bad formatting, it tabs up perfectly in LateNite.
Disregard the byte count, I just realized how horribly wrong I was.
Also, I'm trying to write a routine that writes the buffer to the screen (I know some exist already), and I'm not sure how long the delay would need to be on a Silver Edition. The Toshiba T6A04 documentation says about 10us...
How would I go about converting that to clock cycles on a 6/15MHz processor?
Code: Select all
; ======================================================================
; The address to the screen buffer should be modified here.
; ======================================================================
buffer .equ $9340
; ======================================================================
; Simple routine that draws an 8x8 sprite to the screen.
;
; 52 Bytes, Variable Number of Clock Cycles
; ======================================================================
DrawSpr8x8:
; ----------------------------------------------------------------------
; Need to multiply by something that is not a power of two...
; ----------------------------------------------------------------------
8x8_Get_Screen_Offset:
ld h, 0 ; 7CC, 2B
ld d, h ; 4CC, 1B
ld e, l ; 4CC, 1B
; ----------------------------------------------------------------------
; Every horizontal line of pixels on the screen is twelve bytes.
; ----------------------------------------------------------------------
add hl, de ; 11CC, 1B
add hl, de ; 11CC, 1B
add hl, hl ; 11CC, 1B
add hl, hl ; 11CC, 1B
; ----------------------------------------------------------------------
; Need to divide by eight to obtain horizontal aspect of offset.
; ----------------------------------------------------------------------
ld e, a ; 4CC, 1B
srl e ; 8CC, 2B
srl e ; 8CC, 2B
srl e ; 8CC, 2B
add hl, de ; 11CC, 1B
; ----------------------------------------------------------------------
; Get the offset of the screen buffer and calculate the offset.
; ----------------------------------------------------------------------
ld bc, buffer ; 10CC, 3B
add hl, bc ; 11CC, 1B
; ----------------------------------------------------------------------
; Initialize values before the loops begin.
; ----------------------------------------------------------------------
ld c, a ; 4CC, 1B
ld b, 8 ; 7CC, 2B
; ----------------------------------------------------------------------
; This is the vertical loop; it moves down a line to draw the next row.
; ----------------------------------------------------------------------
8x8_Vert_Shift:
ld e, d ; 4CC, 1B
ld d, (ix) ; 19CC, 3B
ld a, c ; 4CC, 1B
; -------------------------------------------------------------------
; Horizontally shift sprite data using bit shifts.
; -------------------------------------------------------------------
8x8_Horiz_Shift_1:
srl d ; 8CC, 2B
rr e ; 8CC, 2B
dec a ; 4CC, 1B
jp nz, 8x8_Horiz_Shift_1 ; 10CC, 3B
; -------------------------------------------------------------------
; XOR the sprite data with data already on screen, write to buffer.
; -------------------------------------------------------------------
8x8_Horiz_Shift_2:
ld a, (hl) ; 7CC, 1B
xor d ; 4CC, 1B
ld (hl), a ; 7CC, 1B
inc hl ; 6CC, 1B
ld a, (hl) ; 7CC, 1B
xor e ; 4CC, 1B
ld (hl), a ; 7CC, 1B
; ----------------------------------------------------------------------
; Increment pointers necessary amounts, jump if not finished.
; ----------------------------------------------------------------------
8x8_Increment_Pointers:
inc ix ; 10CC, 2B
ld de, 11 ; 10CC, 3B
add hl, de ; 11CC, 1B
djnz 8x8_Vert_Shift ; 13/8CC, 2B
ret ; 10CC, 1B
Last edited by Tyler on Fri 15 Jun, 2007 1:38 am, edited 1 time in total.
- calc84maniac
- Jim e
Safecopy
Tyler wrote: By "safe copy", I suppose you mean bcalls?
Those are much too slow for my purposes.
I've read approximations on the delay time, such as 10us = 60cc @ 6mhz, 240cc @ 15mhz, but I think these are off (hence the term 'approximation')...
Earlier, most people would say 10us would be enough, but now you need about 11~12us (this is true for all models, including the 83). So roughly 70cc at 6MHz, 180cc at 15MHz.
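For reference, the conversion is just the delay in microseconds multiplied by the clock speed in MHz. The figures below are a rough sketch that assumes the CPU really runs at exactly 6 or 15 MHz; the two `.equ` names are made up for illustration and are not part of any existing routine.

```z80
; cycles = delay (us) * clock (MHz)
;   10 us *  6 MHz =  60 cc
;   10 us * 15 MHz = 150 cc
;   12 us *  6 MHz =  72 cc
;   12 us * 15 MHz = 180 cc
lcd_delay_6mhz  .equ 12 * 6     ; hypothetical name, 72 cc
lcd_delay_15mhz .equ 12 * 15    ; hypothetical name, 180 cc
```

By the same arithmetic, the 240cc @ 15MHz figure quoted above would correspond to 16us rather than 10us.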
Really, though, you should use safe copy when possible. It typically doesn't perform any worse than fastcopy (potentially better, even) and has the benefit of always working. The downside, though, is that it varies in clock cycles.
Your sprite routine still doesn't work. A contains X, and you need that for the shift, but you need to mask the upper bits out. For example, if X=87, the bit shifting would rotate 87 times. What you need is AND 7.
Next is the case where no shifting occurs, such as X=0. In that situation the bit shifter would shift 256 times, because DEC A wraps from 0 to 255 before the loop tests for zero. In other words, you need to test for when X is aligned to a byte column.
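To make the X=87 case concrete, here is a small sketch of how one X value splits into a byte column and a shift count. It is not a drop-in fix: the register choices only mirror the posted routine, and `no_shift` is a hypothetical label.

```z80
    ld a, 87        ; X = 87 = %01010111
    ld e, a
    srl e
    srl e
    srl e           ; E = 87 / 8 = 10, the byte column within the row
    and 7           ; A = 87 AND 7 = 7, the number of single-bit shifts
    jr z, no_shift  ; AND sets Z when A = 0, so an aligned X skips the loop
    ; ... the shift loop would run A times here ...
no_shift:
```

With X=87 the sprite starts in byte column 10 and is shifted right 7 bits; with X=80 the AND leaves 0 and the shift loop is skipped entirely.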
First of all, thank you for all the help. I just got out of my high-school Java course, and compared to that and TI-BASIC, this is quite a bit more difficult (but that was expected).
Secondly, I think the sprite routine works both correctly and efficiently now. It appeared to be working before because I hadn't tested an x-coordinate greater than 8, so there was no need to modulo by 8 and figure out how many shifts were needed.
Thirdly, thanks for that link to the Wiki, it is an invaluable source of information. I'll try to get my buffer routine working now; all I could get last night was a screen with garbage data all over.
Edit: I'm just going to use the copybuffer routine from WikiTI because I can't create anything more efficient than that by myself, let alone anything that fully works.
Code: Select all
; ======================================================================
; Simple routine that draws an 8x8 sprite to the screen.
; ======================================================================
DrawSpr88:
; ----------------------------------------------------------------------
; Need to multiply by something that is not a power of two...
; ----------------------------------------------------------------------
8x8_Get_Screen_Offset:
ld h, 0
ld d, h
ld e, l
; ----------------------------------------------------------------------
; Every horizontal line of pixels on the screen is twelve bytes.
; ----------------------------------------------------------------------
add hl, de
add hl, de
add hl, hl
add hl, hl
; ----------------------------------------------------------------------
; Need to divide by eight to obtain horizontal aspect of offset.
; ----------------------------------------------------------------------
ld e, a
srl e
srl e
srl e
add hl, de
; ----------------------------------------------------------------------
; Get the offset of the screen buffer and calculate the offset.
; ----------------------------------------------------------------------
ld bc, $9340
add hl, bc
; ----------------------------------------------------------------------
; Initialize values before the loops begin.
; ----------------------------------------------------------------------
and 7
ld c, a
ld b, 8
; ----------------------------------------------------------------------
; This is the vertical loop; it moves down a line to draw the next row.
; ----------------------------------------------------------------------
8x8_Vert_Shift:
ld e, d
ld d, (ix)
ld a, c
; ---------------------------------------------------------------------
; Check if the sprite needs to be shifted, if not, continue on.
; ---------------------------------------------------------------------
or a \ jp z, 8x8_Horiz_Shift_2
; -------------------------------------------------------------------
; Horizontally shift sprite data using bit shifts.
; -------------------------------------------------------------------
8x8_Horiz_Shift_1:
srl d
rr e
dec a \ jp nz, 8x8_Horiz_Shift_1
; -------------------------------------------------------------------
; XOR the sprite data with data already on screen, write to buffer.
; -------------------------------------------------------------------
8x8_Horiz_Shift_2:
ld a, (hl)
xor d
ld (hl), a
inc hl
ld a, (hl)
xor e
ld (hl), a
; ----------------------------------------------------------------------
; Increment pointers necessary amounts, jump if not finished.
; ----------------------------------------------------------------------
8x8_Increment_Pointers:
inc ix
ld de, 11
add hl, de
djnz 8x8_Vert_Shift
ret