Posted: Mon 19 Sep, 2005 9:10 pm
by CoBB
If the LUTs are big, they may not fit in the cache, especially if they are traversed in a cache-unfriendly way. Cache misses can be very expensive, so you often cannot tell in advance which method is fastest.

Posted: Mon 19 Sep, 2005 9:46 pm
by coelurus
The only way to find out is to write a test program that does something meaningful. Generally, today's PCs will evaluate trig functions very quickly and with better accuracy than any LUT can give you. Not quite like that on calcs, of course...
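To put a rough number on the accuracy side, here is a minimal sketch in Python that measures the worst-case error of a nearest-entry sine LUT against `math.sin`. The table sizes, the sample count and the function names (`build_sin_lut`, `lut_sin`, `max_lut_error`) are all made up for illustration, and the lookup does no interpolation, so treat it as one possible LUT design rather than the way anyone here actually built theirs.

```python
import math


def build_sin_lut(size):
    """Table of sin over one full period, sampled at `size` evenly spaced angles."""
    return [math.sin(2.0 * math.pi * i / size) for i in range(size)]


def lut_sin(lut, x):
    """Nearest-entry lookup with wrap-around; no interpolation (an assumption)."""
    size = len(lut)
    index = int(round(x / (2.0 * math.pi) * size)) % size
    return lut[index]


def max_lut_error(size, samples=10000):
    """Largest absolute difference between the LUT and math.sin over one period."""
    lut = build_sin_lut(size)
    worst = 0.0
    for k in range(samples):
        x = 2.0 * math.pi * k / samples
        worst = max(worst, abs(lut_sin(lut, x) - math.sin(x)))
    return worst


if __name__ == "__main__":
    # Hypothetical table sizes; the error shrinks as the table grows.
    for size in (256, 1024, 4096):
        print(size, max_lut_error(size))
```

Since the slope of sin never exceeds 1, a nearest-entry table with `size` entries cannot be off by more than about pi / size, so each doubling of the table buys roughly one extra bit of accuracy.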

Posted: Tue 20 Sep, 2005 2:41 am
by threefingeredguy
How do PCs do it?

Posted: Tue 20 Sep, 2005 5:07 am
by CoBB
It's a single assembly instruction (fsin on x86). Formerly (pre-486 era) this functionality was available in separate coprocessors, but the FPU has been integrated into the CPU since the 486DX.

Posted: Tue 20 Sep, 2005 9:54 am
by benryves
If the CPU can calculate trig itself, then you should have no real need for a LUT (unless the instruction was very inefficient).
Of course, using maths instead of a LUT was what gave us the ATi DOOM3 shader tweak. ;) So LUTs are not always the best.

Posted: Thu 22 Sep, 2005 4:56 am
by crzyrbl
I'm happy as long as what I'm learning in AP calc doesn't become obsolete.

Posted: Thu 22 Sep, 2005 9:58 pm
by Kerey
benryves wrote:If the CPU can calculate trig itself, then you should have no real need for a LUT (unless the instruction was very inefficient).
Of course, using maths instead of a LUT was what gave us the ATi DOOM3 shader tweak. ;) So LUTs are not always the best.
Ok, I guess one assembly instruction might be a tad faster than a LUT, heh. I was thinking they used a Taylor expansion or something.
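For reference, the Taylor idea looks roughly like this in Python. This is only a sketch of the textbook series sin(x) = x - x^3/3! + x^5/5! - ..., with a range reduction step added so it converges quickly; it is not a claim about what any CPU or math library does internally. The name `taylor_sin` and the default of 10 terms are made up for illustration.

```python
import math


def taylor_sin(x, terms=10):
    """Approximate sin(x) with the first `terms` terms of its Taylor series."""
    # Range reduction: fold x into [-pi, pi] so a few terms are enough.
    x = math.remainder(x, 2.0 * math.pi)
    term = x    # current series term, starting with x^1 / 1!
    total = x
    for n in range(1, terms):
        # Each term is the previous one times -x^2 / ((2n) * (2n + 1)).
        term *= -x * x / ((2 * n) * (2 * n + 1))
        total += term
    return total


if __name__ == "__main__":
    for angle in (0.5, 2.0, 10.0):
        print(angle, taylor_sin(angle), math.sin(angle))
```

With 10 terms the first omitted term is x^21/21!, which is below 1e-9 anywhere in [-pi, pi], so the cost is a handful of multiplies and adds per call.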

Posted: Fri 23 Sep, 2005 8:51 am
by coelurus
They probably are, somewhere deep down below... Which is a little worrying, so to settle my nerves, I made a benchmark that calculates a whole heap of sines and compares that against taking values from a high-resolution LUT, compiled with all the optimizations I could find for my AMD64 3000+. The results were a little surprising: the LUT was nearly 100x faster :) I've never looked into any asm, so I didn't use any weird vector tricks with SSE to get a whole heap of values in a row.

This doesn't imply LUTs are better overall, remember that :)
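For anyone who wants to try something similar, here is a minimal sketch of that kind of benchmark in Python using `timeit`. It is not the program described above (that one was compiled for an AMD64), and in Python the interpreter overhead dominates, so the ratio it reports says little about the hardware. The table size, the loop counts and the names `sum_direct`, `sum_lut` and `benchmark` are assumptions for illustration.

```python
import math
import timeit

TABLE_SIZE = 4096  # hypothetical LUT resolution
SIN_LUT = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def sum_direct(n):
    """Sum n sines over one period by calling math.sin each time."""
    total = 0.0
    for i in range(n):
        total += math.sin(2.0 * math.pi * i / n)
    return total


def sum_lut(n):
    """Sum n sines over one period by indexing the precomputed table."""
    total = 0.0
    for i in range(n):
        total += SIN_LUT[(i * TABLE_SIZE // n) % TABLE_SIZE]
    return total


def benchmark(n=100000, repeats=5):
    """Return the best-of-`repeats` time in seconds for (direct, lut)."""
    direct = min(timeit.repeat(lambda: sum_direct(n), number=1, repeat=repeats))
    lut = min(timeit.repeat(lambda: sum_lut(n), number=1, repeat=repeats))
    return direct, lut


if __name__ == "__main__":
    direct_time, lut_time = benchmark()
    print("direct:", direct_time, "lut:", lut_time)
```

Both loops add their results into a total on purpose. In a compiled benchmark it is worth checking that the results are actually used, because an optimizer is allowed to throw away work whose output is never read.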