Samples of Huffman encode/decode? 
Author Message
 Samples of Huffman encode/decode?

Hi all,

Does anyone have any samples of Huffman encode/decode
scheme, writen in VB, that they would care to share.

I have a half way decent grasp of the concept, but
thought that I'd ask for some samples that my be able
to clear up, or solidify my thoughts on this.

Thanks in advance.

Brent
P.S:  Either my news server is acting up, or the group
postings have really slowed down.  Has anyone else noticed
a real reduction in the message volume?  If it is my ISP,
I'll be able to let him know.  Thanks again folks.



Fri, 28 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?
Huffman coding is a character encoding technique for compressing data,
however it is not used much in current systems. Basically, the idea involves
replacing characters by bit strings in such a way that different characters
are represented by strings of different length with the most commonly
occurring characters being represented by the shortest strings. As an
example, suppose the data to be represented involves only the characters
A,B,C, and D, and suppose the relative frequencies as given.

           Character       Frequency       Code
               A             40%             1
               D             25%             01
               C             20%             001
               B             15%             000

Note that no character code of bit length 'n', is identical to the first 'n'
bits of any other character encoding.

**** Posted from RemarQ - http://www.remarq.com - Discussions Start Here (tm) ****



Fri, 28 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?
Sounds like a school project.

Try this: http://ciir.cs.umass.edu/cmpsci646/Notes/Compression/sld014.htm

**** Posted from RemarQ - http://www.remarq.com - Discussions Start Here (tm) ****



Fri, 28 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?

Quote:

> Huffman coding is a character encoding technique for compressing data,
> however it is not used much in current systems. Basically, the idea involves
> replacing characters by bit strings in such a way that different characters
> are represented by strings of different length with the most commonly
> occurring characters being represented by the shortest strings. As an
> example, suppose the data to be represented involves only the characters
> A,B,C, and D, and suppose the relative frequencies as given.

>            Character       Frequency       Code
>                A             40%             1
>                D             25%             01
>                C             20%             001
>                B             15%             000

> Note that no character code of bit length 'n', is identical to the first 'n'
> bits of any other character encoding.

> **** Posted from RemarQ - http://www.remarq.com - Discussions Start Here (tm) ****

Thanks for the replies.  This is not a school project,
though it would be a good one for any teachers that
happen to be reading this.

I have read over 20 explanations of the Huffman encoding,
and I'm seeking some sample code that I may learn from.

I want to put this(my program) together the way it should
be coded, based on the specification, rather than my
interpretation of the specification.

My main hang up on proceeding is the manner in which the
leaves, and branches should be stored in the file format.
Do they start at the begining of the file?  Or would
such a dictionary be found at the end of the file?  What
symbol(s)/characters would delimit or signal the separation
of the two sections?

Thanks again, Keith, and anyone else that may be able to
provide further assistance.

Have fun coding.
Brent



Fri, 28 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?


Quote:
>Hi all,

>Does anyone have any samples of Huffman encode/decode
>scheme, writen in VB, that they would care to share.

 I've implemented Huffman encoding and decoding in VB(5) - (in fact also
a modified L-Z compression too).

 Its fairly messy, because VB doesn't have support for an array of bits
- which is what I'd really like. ( Why did they decide that boolean type
should take more than 1 bit ? )

 Contact me if you're interested - but you won't learn half as much from
someone else's code ;-)

 Howard Tomlinson
--

           Homepage:  http://www.astraware.com/
New Game for Win95!  http://www.astraware.com/bzzz.html



Fri, 28 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?

Quote:



> >Hi all,

> >Does anyone have any samples of Huffman encode/decode
> >scheme, writen in VB, that they would care to share.

>  I've implemented Huffman encoding and decoding in VB(5) - (in fact also
> a modified L-Z compression too).

>  Its fairly messy, because VB doesn't have support for an array of bits
> - which is what I'd really like. ( Why did they decide that boolean type
> should take more than 1 bit ? )

>  Contact me if you're interested - but you won't learn half as much from
> someone else's code ;-)

>  Howard Tomlinson
> --

>            Homepage:  http://www.astraware.com/
> New Game for Win95!  http://www.astraware.com/bzzz.html

Thanks.  I will contact you through private email.

I agree with you on the comment:

Quote:
>  Contact me if you're interested - but you won't learn half as much from
> someone else's code ;-)

I'm training someone, and have been training them for about 2 months.
It is hard to express this fact sufficiently enough.  A person will always
learn more by figuring it out on their own.  Not only do you learn what
does work, but what doesn't work, and why.

The main reason for requesting such code is to help me get a grasp on
the file format.  There is a lot of info on the web which describes
the theory, yet very little(none that I've found) that deals with
the format.

Thanks again.
Brent



Sat, 29 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?


Quote:

>The main reason for requesting such code is to help me get a grasp on
>the file format.  There is a lot of info on the web which describes
>the theory, yet very little(none that I've found) that deals with
>the format.

 There isn't _A_ format - Huffman encoding (as you have seen) is just
the theory, or way of compressing information. Every compression format
that uses it (be it zip lzh or whatever) will decide on its own way to
store the data.

 I would guess that by convergent evolution, most formats will come out
about the same - but here is the method I used:

 (I can't remember which flavour of Huffman encoding this is - but
you'll get the idea)

1) Skim through the file once to get the frequencies of each character.

2) Normalise these to give each character a relative frequency between 0
and 255. (Those which actually appear _must_ be given a value of at
least one)

3) Sort these pairs into order based on relative frequency.

4) Go through the list, counting until you reach a zero frequency
character.

*********
Now there is enough information to form the header:
1 byte:  'n' The value from (4) saying how many pairs there will be
n * 2 bytes: Pairs of character and its relative frequency.
*********

5) Construct the tree using the sorted information from (3)

6) Run through the input file, outputting the huffman code for each
character.

 Decoding is quite easy - read the header, reconstruct the tree, and run
through the data.

 I must admit that the hardest part of all of this for me was learning
how to construct the tree out of the data pairs - but that's another
story :-)

 Hope this helps a little, but do feel free to contact!

 Howard.

--

           Homepage:  http://www.astraware.com/
New Game for Win95!  http://www.astraware.com/bzzz.html



Sat, 29 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?

Quote:


> >Huffman coding is a character encoding technique for compressing data,
> >however it is not used much in current systems.

> Except, of course, in almost every FAX machine on the planet.

> --
> #include <standard.disclaimer>
>  _
> Kevin D Quitt  USA 91351-4454           96.37% of all statistics are made up
> Per the FCA, this email address may not be added to any commercial mail list

You are absolutely correct.  You win the prize for being the only one to
bring up this important point.

I've been coding since 78 and was recently accused of submitting this
question as help with homework.  I wanted to laugh, but I could see
the guy's point.  A lot of students do ask that their homework be done,
or helped along, by newsgroup members.

Do you work in the fax industry?  If so, do you contract as a consultant,
or take on programming work?  The final code will be contained within an
embedded controller, but I like to prove things in VB before moving the
code into firmware.

Thanks in advance for any replies.

Brent



Sat, 29 Sep 2001 03:00:00 GMT  
 Samples of Huffman encode/decode?

Quote:



> >The main reason for requesting such code is to help me get a grasp on
> >the file format.  There is a lot of info on the web which describes
> >the theory, yet very little(none that I've found) that deals with
> >the format.

>  There isn't _A_ format - Huffman encoding (as you have seen) is just
> the theory, or way of compressing information. Every compression format
> that uses it (be it zip lzh or whatever) will decide on its own way to
> store the data.

>  I would guess that by convergent evolution, most formats will come out
> about the same - but here is the method I used:

>  (I can't remember which flavour of Huffman encoding this is - but
> you'll get the idea)

> 1) Skim through the file once to get the frequencies of each character.

> 2) Normalise these to give each character a relative frequency between 0
> and 255. (Those which actually appear _must_ be given a value of at
> least one)

> 3) Sort these pairs into order based on relative frequency.

> 4) Go through the list, counting until you reach a zero frequency
> character.

> *********
> Now there is enough information to form the header:
> 1 byte:  'n' The value from (4) saying how many pairs there will be
> n * 2 bytes: Pairs of character and its relative frequency.
> *********

> 5) Construct the tree using the sorted information from (3)

> 6) Run through the input file, outputting the huffman code for each
> character.

>  Decoding is quite easy - read the header, reconstruct the tree, and run
> through the data.

>  I must admit that the hardest part of all of this for me was learning
> how to construct the tree out of the data pairs - but that's another
> story :-)

>  Hope this helps a little, but do feel free to contact!

>  Howard.

> --

>            Homepage:  http://www.astraware.com/
> New Game for Win95!  http://www.astraware.com/bzzz.html

Thank you very much for taking your time to document your approach.
This is greatly appreciated.  

I checked out your site.  That is some neat stuff that you have there.

Please let me know if I may ever be able to return the favor.
Thanks again.

Brent



Sat, 29 Sep 2001 03:00:00 GMT  
 
 [ 9 post ] 

 Relevant Pages 

1. Encode/Decode source sample (esempio di codice)

2. Sample Code for Base64 Encoding/Decoding

3. Decode Encoded-word in MIME field mail using VB.net

4. Encoding - Decoding Variables

5. SOURCE CODE FOR ARC/ZIP DECODING/ENCODING

6. encoding and decoding

7. encoding and decoding password

8. Encoding decoding what I think is ASCII/ECDB

9. Base64 encoding/decoding

10. Encoding/Decoding

11. Encoding/Decoding Mime, UUENCODE, Printed-Quoteable...

12. Encoding & Decoding

 

 
Powered by phpBB® Forum Software