Help: Add with 31 saturation

(posted & emailed)

Quote:

> Hello!

> I have a small problem. A friend of mine is currently using 15-bit

> color video modes and he phoned me the other day asking how to

> add two colors together.

> I gave him a small but slow solution (it used some conditional jumps).

> What I need to do is:

> a:=a+b

> if a>31 then a:=31

> but since they are combined into two words like

> rrrrrgggggbbbbb0 + aaaaabbbbbccccc0

> and each of the five bits should be saturated by 31 there is a problem.

> How can I do this fast?

There's probably no really fast way of doing this, but MMX can help:

Unpack the 5:5:5 pixels into 8:8:8 format, do add with saturation, and

pack them back together.
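
To make the unpack/add/pack idea concrete, here is a plain C sketch of the same three steps for a single pixel. It is not MMX code: the helper sat_add_u8 only stands in for what an unsigned saturating byte add (such as PADDUSB) does to each byte lane. It assumes the 0rrrrrgggggbbbbb layout, and it moves each 5-bit field to the top of its byte so that 8-bit saturation at 255 corresponds to 5-bit saturation at 31. Both function names are made up for illustration.

```c
#include <stdint.h>

/* Stand-in for one byte lane of an unsigned saturating add. */
static uint8_t sat_add_u8(uint8_t x, uint8_t y)
{
    unsigned s = (unsigned)x + y;
    return (uint8_t)(s > 255u ? 255u : s);
}

/* Hypothetical helper: add two 5:5:5 pixels (assumed layout
   0rrrrrgggggbbbbb) by going through 8-bit lanes. */
uint16_t add555_via_bytes(uint16_t p, uint16_t q)
{
    uint16_t out = 0;
    int shift;

    for (shift = 0; shift <= 10; shift += 5) {
        /* Unpack: place the 5-bit field in the top 5 bits of a byte. */
        uint8_t x = (uint8_t)(((p >> shift) & 31u) << 3);
        uint8_t y = (uint8_t)(((q >> shift) & 31u) << 3);
        /* Saturating add at 255, then pack: keep the top 5 bits again. */
        uint8_t s = sat_add_u8(x, y);
        out |= (uint16_t)((uint16_t)(s >> 3) << shift);
    }
    return out;
}
```

With real MMX the three fields of several pixels would sit in separate bytes of one 64-bit register, so a single saturating add would cover all of them at once.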

Assuming a plain Pentium/PPro CPU (no MMX), you'll have to do something else:

The best idea is probably to handle two pairs of pixels simultaneously,

using 32-bit variables:

typedef unsigned long ulong;

ulong addSaturate(ulong a, ulong b) // both a & b contain two 15-bit pixels!
{
    ulong sum, carry, mask, low_bits;

    sum = (a & 0x7bdf7bdf) + (b & 0x7bdf7bdf); // Mask away bottom bit of GB
    carry = sum & 0x84208420; // Did we get any carries from the add?
    sum &= 0x7bdf7bdf; // Keep 5 bits from R, top 4 bits from GB
    mask = carry - (carry >> 5); // Either 11111 or 00000 for each 5-bit value
    low_bits = (a ^ b) & 0x04200420; // a+b == a^b for 1-bit variables!

    return (sum | low_bits | mask);
}
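
As posted, the routine adds the lowest bit of the two upper fields with a plain XOR, so the carry out of that bit is lost: in the middle field, 1 + 1 comes out as 0 and 31 + 1 as 30. If exact results matter, one possible variant (a sketch, not from the original post) adds the low four bits of every field first and treats each field's top bit separately. It assumes the 0rrrrrgggggbbbbb layout that the masks above imply, with two pixels per 32-bit word. The names add555x2_exact, add555_ref and check_add555x2_exact are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Exact saturating add of two pairs of 5:5:5 pixels (a sketch). */
uint32_t add555x2_exact(uint32_t a, uint32_t b)
{
    const uint32_t LOW4 = 0x3DEF3DEFu; /* low 4 bits of every field */
    const uint32_t TOP  = 0x42104210u; /* top bit of every field */

    /* Low 4 bits: at most 15 + 15 = 30, so each sum stays in its field. */
    uint32_t low  = (a & LOW4) + (b & LOW4);
    /* Fold in the top bits without carry: the 5-bit sum modulo 32. */
    uint32_t sum  = low ^ ((a ^ b) & TOP);
    /* Carry out of a field = majority of the three top-bit inputs. */
    uint32_t over = ((a & b) | ((a | b) & low)) & TOP;
    /* Spread each overflow flag into 11111 across its own field. */
    uint32_t mask = (over << 1) - (over >> 4);

    return sum | mask;
}

/* Straightforward per-field reference, used only for checking. */
uint32_t add555_ref(uint32_t a, uint32_t b)
{
    static const int shifts[6] = { 0, 5, 10, 16, 21, 26 };
    uint32_t out = 0;
    int i;

    for (i = 0; i < 6; i++) {
        uint32_t s = ((a >> shifts[i]) & 31u) + ((b >> shifts[i]) & 31u);
        if (s > 31u)
            s = 31u;
        out |= s << shifts[i];
    }
    return out;
}

/* Compare both routines for every pair of 5-bit values. */
void check_add555x2_exact(void)
{
    const uint32_t ALL  = 0x04210421u; /* bit 0 of all six fields */
    const uint32_t EVEN = 0x00200401u; /* bit 0 of fields at 0, 10, 21 */
    const uint32_t ODD  = 0x04010020u; /* bit 0 of fields at 5, 16, 26 */
    uint32_t x, y;

    for (x = 0; x < 32; x++) {
        for (y = 0; y < 32; y++) {
            /* Same value in every field. */
            uint32_t a = x * ALL, b = y * ALL;
            assert(add555x2_exact(a, b) == add555_ref(a, b));
            /* Different neighbours, to catch leaks between fields. */
            a = x * EVEN + (31u - x) * ODD;
            assert(add555x2_exact(a, b) == add555_ref(a, b));
        }
    }
}
```

This costs a few more logical operations than the posted version, so whether the extra accuracy is worth it depends on how visible an occasional error in the low bit would be.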

Assuming this code works (I wrote it just now :-) ), it should take less

than 10 cycles to combine two pairs of packed pixels, for an effective

cost of about 4-5 cycles/pixel.

Terje

--

Using self-discipline, see http://www.eiffel.com/discipline

"almost all programming can be viewed as an exercise in caching"