Algorithm for "bit copy" function

I need to write a function with roughly the following behavior:

    void bitcpy( void *target, int tbit1, void *source, int sbit1, int nbits )

If you think of the void pointers as unsigned char pointers, the
behavior is as follows: bitcpy copies nbits bits from source to
target, so that the sbit1-th bit of source lands on the tbit1-th
bit of target, and each following bit lands on the next bit in turn.

Bit indexes are meant as offsets, in the most natural manner; i.e.,
if source and target are unsigned char pointers, then bit 8 of
source[0] is the same as bit 0 of source[1], etc.
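
To pin down the intended semantics, here is a naive bit-at-a-time sketch. It is meant only as a reference for the behavior described above, not as the elegant solution being asked for. The helpers get_bit and set_bit are hypothetical names, and the sketch assumes that bit 0 is the least significant bit of a byte (the post does not say which end it is), that offsets and nbits are non-negative, and that the source and target regions do not overlap.

```c
#include <limits.h>
#include <stddef.h>

/* Read bit number `bit`, counting from bit 0 of p[0].
   Assumption: bit 0 is the least significant bit of each byte. */
static int get_bit(const unsigned char *p, size_t bit)
{
    return (p[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1u;
}

/* Write `value` (0 or 1) to bit number `bit`, leaving other bits untouched. */
static void set_bit(unsigned char *p, size_t bit, int value)
{
    unsigned char mask = (unsigned char)(1u << (bit % CHAR_BIT));

    if (value)
        p[bit / CHAR_BIT] |= mask;
    else
        p[bit / CHAR_BIT] &= (unsigned char)~mask;
}

/* Naive reference version: copy one bit at a time.
   Assumes non-negative arguments and non-overlapping regions. */
void bitcpy(void *target, int tbit1, void *source, int sbit1, int nbits)
{
    unsigned char *t = target;
    const unsigned char *s = source;
    int i;

    for (i = 0; i < nbits; i++)
        set_bit(t, (size_t)tbit1 + i, get_bit(s, (size_t)sbit1 + i));
}
```

Under these assumptions, copying 5 bits from bit 4 of the bytes { 0xF0, 0x01 } to bit 2 of a zeroed target crosses the byte boundary in source and sets bits 2 through 6 of target[0].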

I did a little pencil-and-paper work, playing with the >>, << and &=
operators, trying to move and mask the bit string from source into
target. But the logic needed to handle arbitrary offsets and lengths
got complicated, at least for the straightforward procedure I was
trying to implement.
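
For what it is worth, one way the shift-and-mask idea might be organized is to fill one target byte per step: work out how many bits still fit in the current target byte, fetch that many bits from source (possibly straddling two source bytes), and merge them in under a mask. This is only a sketch under stated assumptions, not a known textbook algorithm. The names bitcpy_bytes and fetch_bits are hypothetical, and the code assumes 8-bit bytes, bit 0 as the least significant bit, non-negative arguments, and non-overlapping regions.

```c
/* Hypothetical helper: fetch `count` (1..8) bits starting at bit offset
   `bit` of s, returned in the low bits of the result.
   Assumes 8-bit bytes with bit 0 as the least significant bit. */
static unsigned fetch_bits(const unsigned char *s, int bit, int count)
{
    int byte = bit / 8;
    int shift = bit % 8;
    unsigned v = (unsigned)s[byte] >> shift;

    /* Only touch the next byte when the span really crosses into it. */
    if (shift + count > 8)
        v |= (unsigned)s[byte + 1] << (8 - shift);

    return v & ((1u << count) - 1u);
}

/* Sketch: copy nbits bits, filling at most one target byte per iteration. */
void bitcpy_bytes(void *target, int tbit1, void *source, int sbit1, int nbits)
{
    unsigned char *t = target;
    const unsigned char *s = source;

    while (nbits > 0) {
        int tshift = tbit1 % 8;     /* position inside the target byte */
        int count = 8 - tshift;     /* bits left in this target byte   */
        unsigned mask, v;

        if (count > nbits)
            count = nbits;

        mask = ((1u << count) - 1u) << tshift;
        v = fetch_bits(s, sbit1, count) << tshift;

        /* Keep the target bits outside the mask, replace those inside. */
        t[tbit1 / 8] = (unsigned char)((t[tbit1 / 8] & ~mask) | (v & mask));

        tbit1 += count;
        sbit1 += count;
        nbits -= count;
    }
}
```

The loop runs roughly once per target byte rather than once per bit. A faster variant could copy whole bytes directly whenever source and target have the same bit offset within a byte, at the cost of more special cases.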

While doing this, I seemed to recall seeing a suitable algorithm
somewhere or other (I immediately pulled out Knuth, but couldn't
find it there). Can anyone suggest a reference for this problem?
Or, equally good, an algorithm that does the job in a (hopefully)
simple and elegant manner?

TIA,

John