[TI ASM] port $11, weirdness on wikiTI

King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Post by King Harold »

the way I did it, it becomes quite crazy
Image

the dark-gray part is much lighter in reality than in the screenshot.

edit: but it's a bit dark, so this time with another mask that has less dark and more light in it:
Image

edit again: that was crap, 128 cc's (clock cycles) between each write, omg. Now it's 85, which looks a bit better:
Image
tr1p1ea
Maxcoderz Staff
Posts: 4141
Joined: Thu 16 Dec, 2004 10:06 pm
Location: I cant seem to get out of this cryogenic chamber!

Post by tr1p1ea »

You are only using 1 mask (and rotating that for each byte?) -- that's usually what causes the rainy effect. You could try using a combination of masks each loop.
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."

Post by King Harold »

a combination of masks..?
more than %11011011 and %00100100 and their rotated versions?
Right now I'm rotating those masks after every byte and one additional time after every row; otherwise the rotation results in the same masks on every row, which doesn't make any gray, just masked-out lines.

What other masks should I use then? and where?

Post by tr1p1ea »

Well, that diagonal rainy effect is caused by only using 1 mask. To reduce flicker you would, say, use %11011011 for one out to the LCD, then %01101101 for the next out, then %10110110, and repeat that. Then on the next frame you would switch these masks around ... that usually takes care of the uniform rainy effect.
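A minimal Python sketch of that mask cycling. The three masks are the ones named above; `mask_for` is a made-up helper, and modelling "switch these masks around" as starting one entry later on each new frame is an assumption, since the post does not say exactly how they are switched.

```python
# The three phases of the "two bits on, one bit off" pattern from the post.
MASKS = [0b11011011, 0b01101101, 0b10110110]

def mask_for(write_index, frame):
    """Mask for the Nth byte written to the LCD in a given frame.

    Assumption: the masks are "switched around" by starting one entry
    further into the table on every new frame.
    """
    return MASKS[(write_index + frame) % 3]

# Each bit position is masked out (0) in exactly one of the three masks,
# so over three writes or three frames every pixel gets the same share.
cleared = [~m & 0xFF for m in MASKS]
assert cleared[0] | cleared[1] | cleared[2] == 0xFF
assert cleared[0] & cleared[1] == 0
assert cleared[1] & cleared[2] == 0
assert cleared[0] & cleared[2] == 0

for frame in range(3):
    row = [format(mask_for(i, frame), "08b") for i in range(4)]
    print(frame, row)
```

Reading the printed rows top to bottom shows that a given byte position sees a different mask on each of three consecutive frames, which is what breaks up the fixed diagonal pattern.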
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."

Post by King Harold »

I can't do it yet :(
Maybe tomorrow, I feel a bit braindead at the moment..
Jim e
Calc King
Posts: 2457
Joined: Sun 26 Dec, 2004 5:27 am
Location: SXIOPO = Infinite lives for both players

Post by Jim e »

Well, I thought I'd find the guide I wrote in my sentbox, but apparently it doesn't save that many messages. So here's a quick rundown of interlacing.


Let's say this is the image you want.

Image


It would be composed of 2 layers commonly referred to as a dark layer and a light layer. The dark layer is generally stored before the light layer. So that image would look like this.

Dark:
Image

Light:
Image

For 4-level grayscale, the dark layer is displayed twice as long as the light layer. So if the light layer is flashed once, then the dark layer is flashed twice.

Image
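
As a rough check of why that timing gives four levels, here is a small Python sketch. It assumes the dark layer is on screen for two out of every three refreshes and the light layer for the remaining one, as described above; `perceived_level` is a made-up helper name.

```python
from fractions import Fraction

DARK_SHARE = Fraction(2, 3)   # dark layer is shown twice as long
LIGHT_SHARE = Fraction(1, 3)  # light layer is shown once

def perceived_level(dark_bit, light_bit):
    """Fraction of the time a pixel is on, given its bit in each layer."""
    return dark_bit * DARK_SHARE + light_bit * LIGHT_SHARE

for dark_bit in (0, 1):
    for light_bit in (0, 1):
        print(dark_bit, light_bit, perceived_level(dark_bit, light_bit))
```

The four bit combinations come out as 0, 1/3, 2/3 and 1: white, light gray, dark gray and black.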


This method is generally faster. The biggest reason is that you don't have to perform fastcopy again when the dark layer was already shown on the previous update, so you can skip 1 LCD update. That saves a tremendous amount of time. This method of updating the entire screen with one layer is also simpler to code and generally lighter.

However, if the timing is inaccurate, the screen will look flickery and be quite unpleasant. If the timing is accurate but not tuned properly, you'll end up seeing a line moving across the screen, which is horribly noticeable.


Rigview came along with accurate, controllable timing and a new method of displaying the screen. This was done by interlacing bytes from both layers.
Image
Unlike the last screen, you can actually tell, even at this low frame rate, what's supposed to be dark and what's supposed to be light. Even if the timing is slightly off, the noise that produces is spread out more evenly, so the effect is less painful. This, however, gets more bloated, because you have to carry pointers to both layers and have a method to decide which layer gets used for which byte. That's done either by unrolling code or by giving up some more registers.

I think it was Duck who decided to take it to the bit level, not sure on that. But there was a good deal of benefit from it. No matter how bad the timing was, dithering the bits together allowed for a very even image. The screen, no matter how noisy, was not unpleasant.

Image

Of course, with every advantage there is a disadvantage: the code is MUCH more complicated. Duck's code actually used shadow registers, so you can imagine the nightmare of having to work with that code. The big issue with it is the annoyance of having to work with masks. So let's count the registers.

A 16-bit pointer for the dark layer. How about HL?
An 8-bit mask for the dark layer. Ummm, C?
An 8-bit temporary storage place for the masked dark layer. B!
A 16-bit pointer for the light layer. DE then.
An 8-bit mask for the light layer. ....uh
A 16-bit value to add the offset to the next byte. Oh, SP!!!?
An 8-bit loop counter. Crap.

So in other words, this code would be register starved. You want each write to the LCD port to occur within ~70 T-states of the last write. Self-modifying code works to an extent but is still not very fast.

Then some bright boy (I forget his name, had letters in it I swear) came up with a method that relied on reversible operations. What this did was get rid of the need for temporary storage and the need to hold both masks. (Kinda plays back into your xor swapping thing.)

So this would look like:

((Dark_layer ^ Light_layer) & Dark_mask) ^ Light_layer = result

As opposed to:
(Dark_layer & Dark_mask) | (Light_layer & Light_mask) = result

The two give the same result when Light_mask is the inverse of Dark_mask. Just comparing the code shows a noticeable improvement.

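Code:

;Old broken code: mask each layer separately, then combine
	ld a,(ix)	;byte from one layer
	and d		;keep the bits selected by its mask
	ld c,a		;park the result in a temporary register
	ld a,(hl)	;byte from the other layer
	and e		;keep the bits selected by the other mask
	or c		;merge both halves

;New hotness: one mask, no temporary register
	ld a,(de)	;byte from one layer
	xor (hl)	;A = bits where the two layers differ
	and c		;keep that difference only where the mask is 1
	xor (hl)	;mask bit 1 -> bit from (de), mask bit 0 -> bit from (hl)
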
This can really save you from needing slower registers or shadow registers. Typically you can get WAY below the required LCD delay, so it's extremely helpful.
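
To convince yourself that the XOR form really selects the same bits, here is a small Python check. It assumes the light mask is the bitwise inverse of the dark mask; the function names are made up for the example, and the sample masks are the ones used elsewhere in this thread.

```python
MASKS = [0b11011011, 0b01101101, 0b10110110]

def combine_or(dark, light, dark_mask):
    """Two-mask form; the light mask is assumed to be the inverse of the dark mask."""
    light_mask = ~dark_mask & 0xFF
    return (dark & dark_mask) | (light & light_mask)

def combine_xor(dark, light, dark_mask):
    """Single-mask form built from reversible XOR operations."""
    return ((dark ^ light) & dark_mask) ^ light

def forms_agree():
    """Compare both forms for every byte pair under each sample mask."""
    return all(
        combine_or(d, l, m) == combine_xor(d, l, m)
        for m in MASKS
        for d in range(256)
        for l in range(256)
    )

print(forms_agree())
```

Where a mask bit is 1, `(dark ^ light) ^ light` leaves the dark bit; where it is 0, the AND clears the difference and the final XOR leaves the light bit.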


So this is what I ended up using for RGP. It runs at fastcopy speed, so I decided that's enough optimization. It requires that the layers be stored next to each other, but that isn't too much of an issue.

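Code:

;-------------------------------------------------
;4 level Grey interlace routine
;by James Montelongo
;Note: SP is borrowed as a constant (12) below, so interrupts
;must not fire while this routine runs.
lcd:					;52744
	in a,($20)		;port $20 = CPU speed on SE/84+ models: save it
	push af
	ld a,0
	out ($20),a		;0 selects 6 MHz
	ld (stacksave),sp
	ld a,$80
	out ($10),a		;LCD command: row 0
	ld a,(gsmasknum)	;step the mask index 0,1,2,0,...
	inc a
	cp 3
	jr c,skipmaskswap
	xor a
skipmaskswap:
	ld (gsmasknum),a
	ld e,a
	ld d,0
	ld hl,gsmasks
	add hl,de
	ld d,(hl)		;D = mask for the first write of each pair
	inc hl		;accidentally deleted.
	ld a,(hl)
	cpl
	ld e,a			;E = inverted next mask, for the second write
	ld hl,gsActivebuf1-12
	ld sp,12		;SP = bytes per screen row
	ld a,$20
	ld c,a			;C = LCD column command, $20..$2b
colloop:

	out ($10),a		;select the column
	ld b,32			;32 iterations x 2 rows = 64 rows
rowloop:
	add hl,sp		;next row in the first buffer
	ld a,(hl)
	inc h			;+768 bytes: same position in the second buffer
	inc h
	inc h
	xor (hl)
	and d
	xor (hl)		;first buffer where D=1, second where D=0
	out ($11),a
	add hl,sp		;next row, still in the second buffer
	nop		;I actually need to delay.
	ld a,(hl)
	dec h			;-768 bytes: back to the first buffer
	dec h
	dec h
	xor (hl)
	and e
	xor (hl)		;second buffer where E=1, first where E=0
	out ($11),a
	djnz rowloop
	inc c			;next column
	dec h			;rewind 64 rows (768 bytes)...
	dec h
	dec h
	inc hl			;...and move one byte to the right
	ld a,c
	cp $2c			;done after column $2b
	jr nz,colloop
	ld sp,(stacksave)
	pop af
	out ($20),a		;restore the CPU speed
	ret


gsmasks:
 .db %11011011
 .db %10110110
 .db %01101101
 .db %11011011		;padding: copy of the first entry, read when the index is 2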


For the masks, the dark layer should mask out 1/3 of its bits and the light layer should mask out 2/3 of its bits.
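
A quick Python look at how close the 8-bit masks in the table above get to those fractions. It assumes the light layer uses the inverse of the dark mask; `bits_kept` is a made-up helper name.

```python
# The four entries of the gsmasks table from the routine above.
GSMASKS = [0b11011011, 0b10110110, 0b01101101, 0b11011011]

def bits_kept(mask):
    """Number of bits an 8-bit mask lets through."""
    return bin(mask & 0xFF).count("1")

for dark_mask in GSMASKS:
    light_mask = ~dark_mask & 0xFF  # assumption: light mask = inverted dark mask
    print(format(dark_mask, "08b"), bits_kept(dark_mask), bits_kept(light_mask))
```

Since 8 is not a multiple of 3, each individual mask keeps 5 or 6 dark bits and 3 or 2 light bits, so the 1/3 and 2/3 split is only approximate within one byte.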
Last edited by Jim e on Mon 14 May, 2007 5:10 am, edited 1 time in total.

Post by King Harold »

ok I get it now, thanks :)

Not to criticize you, but I would change

Code:

ld d,(hl)
ld a,(hl)

into

Code:

ld a,(hl)
ld d,a

or similar to save 3 clocks
not that it matters, I'm just saying it so you all know I'm awake :P
same goes for the ld a,0
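
The arithmetic behind those savings, as a small Python sketch. The costs are the standard Z80 timings and sizes for these instructions; the `COST` table and `total` helper are just for illustration, and `xor a` is assumed to be the intended replacement for `ld a,0`.

```python
# (T-states, bytes) per instruction, standard Z80 figures.
COST = {
    "ld d,(hl)": (7, 1),
    "ld a,(hl)": (7, 1),
    "ld d,a": (4, 1),
    "ld a,0": (7, 2),
    "xor a": (4, 1),
}

def total(sequence):
    """Sum the clocks and bytes of a list of instructions."""
    clocks = sum(COST[ins][0] for ins in sequence)
    size = sum(COST[ins][1] for ins in sequence)
    return clocks, size

print(total(["ld d,(hl)", "ld a,(hl)"]))  # two memory reads
print(total(["ld a,(hl)", "ld d,a"]))     # one memory read plus a register copy
print(total(["ld a,0"]), total(["xor a"]))
```

Both changes save 3 clocks, and `xor a` also saves a byte. Note that `xor a` changes the flags while `ld a,0` does not, which does not matter at that point in the routine.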

edit: it actually looks sort of OK now:
Image
the dark area looks a bit odd in the screenshot, it's not that bad on hardware

Post by tr1p1ea »

It was Tijl Coosemans (Kalimero) who first came up with bit-level interlacing; Duck continued his work. This was pretty good, and a typical routine (without unrolling) could get down to, say, 70cc's per write.

Then it was Johan Forslöf (doyanx) who came up with ((A ^ B) & C) ^ B = (A & C) | (B & ~C). Although his implementation was all over the place and needed a non-standard buffer layout, this paved the way for an ordinary routine to be optimised to run faster than fastcopy, which both Jim e and I ended up doing. My routine is a little different, but the core is essentially the same -- Jim, it appears you did some things to avoid using SMC (only 1 mask per update?)

Oh, and is that only 63cc's between writes ... is that safe enough?
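
For reference, this is how the 63 comes out when the inner loop of the posted routine is counted, sketched in Python. The T-state values are the standard Z80 timings; each gap is measured from the end of one `out ($11),a` to the end of the next, and the list and helper names are made up for the example.

```python
# Standard Z80 T-state counts for the instructions in the inner loop.
T = {
    "out (n),a": 11, "add hl,sp": 11, "nop": 4, "ld a,(hl)": 7,
    "inc h": 4, "dec h": 4, "xor (hl)": 7, "and r": 4, "djnz taken": 13,
}

# Instructions executed after the first write, up to and including the second.
FIRST_TO_SECOND = ["add hl,sp", "nop", "ld a,(hl)", "dec h", "dec h", "dec h",
                   "xor (hl)", "and r", "xor (hl)", "out (n),a"]

# Instructions executed after the second write, around the loop to the first.
SECOND_TO_FIRST = ["djnz taken", "add hl,sp", "ld a,(hl)", "inc h", "inc h", "inc h",
                   "xor (hl)", "and r", "xor (hl)", "out (n),a"]

def gap(sequence):
    """Total T-states for a list of instructions."""
    return sum(T[ins] for ins in sequence)

print(gap(FIRST_TO_SECOND), gap(SECOND_TO_FIRST))
```

The shorter gap is 63 T-states and already includes the `nop`; without it that gap would be 59. The gap that goes around the `djnz` is 72.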
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."

Post by Jim e »

King Harold wrote: or similar to save 3 clocks
Sweet, I can save half a microsecond. At 70fps, running for oh say 8 hours, I'll save about 1 second.

Meh...I'd probably waste that second of my life anyway.
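
Those numbers check out, by the way. A quick Python version of the sum, assuming a 6 MHz clock and that the 3 clocks are saved once per frame (the mask setup runs once per LCD update):

```python
CLOCK_HZ = 6_000_000  # assumption: CPU running at 6 MHz
CLOCKS_SAVED = 3      # per frame
FPS = 70
HOURS = 8

seconds_per_frame = CLOCKS_SAVED / CLOCK_HZ
frames = FPS * HOURS * 3600
total_seconds = seconds_per_frame * frames

print(seconds_per_frame * 1e6, "microseconds per frame")
print(frames, "frames")
print(round(total_seconds, 3), "seconds saved")
```

That is half a microsecond per frame over about two million frames, so roughly one second in total.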

Post by King Harold »

not that it matters, I'm just saying it so you all know I'm awake
let's not start a discussion about that, ok?
it doesn't hurt to save those 3 clocks and a byte anyway..

ok, let's continue with the grayscale:
is it normal that it gets weird like that in screenshots? can anything be done about it?

Post by tr1p1ea »

The uniform noise is due to the fact that the routine only uses 1 mask per LCD update. To reduce this you can use all 3 masks per LCD update -- this would require expanding the routine. I think Jim has done it that way to avoid the need for SMC.
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."

Post by Jim e »

I use 2 masks to break the uniform look. 3 masks would be best but 2 is enough. You could also rotate the mask circularly after each write.
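
A minimal Python sketch of what rotating the mask circularly does to it. `rlc8` is a made-up helper that mimics an 8-bit rotate left where bit 7 wraps around into bit 0, the way the Z80 `rlca` instruction treats A; the starting mask is the first entry of the table above.

```python
def rlc8(mask):
    """Rotate an 8-bit value left by one; bit 7 wraps around into bit 0."""
    return ((mask << 1) | (mask >> 7)) & 0xFF

mask = 0b11011011
for _ in range(4):
    print(format(mask, "08b"), bin(mask).count("1"))
    mask = rlc8(mask)
```

The rotation never changes how many bits are set (6 for this mask), and eight rotations bring the mask back to where it started.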

I also use the 3 inc/dec to save DE from pointer use. If I didn't, I would have to add DE with SP, which would waste 15 clocks as opposed to 12.

Post by tr1p1ea »

So ... that is not the routine you use in RGP then? I only see 1 mask being used (used twice).
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."

Post by Jim e »

whoops I accidentally deleted an inc hl

There were some ifdefs there that weren't relevant.

Post by tr1p1ea »

Ahh ok, I thought that was the case, since your mask table was padded.
"My world is Black & White. But if I blink fast enough, I see it in Grayscale."