Decoding Quoted Printable Text 

Hi,

I'm importing vCards, which use quoted-printable encoding to store certain
information. How can I decode quoted-printable text into "normal" text? I know
there is no functionality for this built into the .NET Framework.
Writing it myself does not seem very appealing to me.

Regards
Karsten
--
http://www.*-*-*.com/



Mon, 30 May 2005 23:53:54 GMT  
 Decoding Quoted Printable Text
Karsten,

I believe you are referring to comma-delimited text files that include some
or all strings within quotes, where the quoted strings might also contain
commas.

As you have no doubt already figured out, using String.Split() is not going
to cut it here.  It's not smart enough.

Your choices are to use a text file driver for ODBC or OLE-DB (for example,
the ODBC text desktop driver) and access the data using ADO.NET; or as you
said, roll your own.

I have rolled my own and it is a somewhat dull project.  Unfortunately I
don't own the code so I can't share the classes.

The nice thing about rolling your own is that it is much lighter-weight and
more flexible than ADO.NET, which you sometimes need when dealing with oddball
or semi-free-form formats.  I have the ability to take any field separator
(comma, tab, pipe, etc.) and pull the values for each record into an ArrayList,
optionally boxed to the type of data that each field is supposed to contain.
From there it's easy to write it back out in any other format, or feed it to
a database.

--Bob





Tue, 31 May 2005 05:33:18 GMT  
 Decoding Quoted Printable Text
Hi Bob,

unfortunately, I'm not having difficulties with comma-delimited strings. To
show you what I'm trying to decode, here is an excerpt from a vCard:

NOTE;ENCODING=QUOTED-PRINTABLE:Rainer A. 0201 12-26417 Fax 158=0D=0ARainer
Umluex.=0D=0AR System=
s GmbH=0D=0AKey Account Management=0D=0AAlfredstra=DFe 28      =
 =0D=0AD - 45130 Essen=0D=0ATel. +49(0)20-26417       Stefan H=0D=0AF=
ax  +49(0)20-1222874=0D=0Amobil 017123123301

Getting text like this back into its "normal" form, with all line breaks and
carriage returns, is what I'm trying to achieve.
You'd think that this would be a standard function found in some web-related
class.

Regards
Karsten

--
http://www.umluex.de






Tue, 31 May 2005 19:32:31 GMT  
 Decoding Quoted Printable Text
I don't know of any existing library for this, but it should be a fairly
simple fixup.  In fact, if carriage return / linefeed pairs (=0D=0A) are all
you ever see in the data, you'd just do something like this, assuming
fileContent is a string containing the whole file:

// String.Replace already replaces every occurrence, so no loop is needed.
fileContent = fileContent.Replace("=0D=0A", Environment.NewLine);

... perhaps adding a few additional replaces for tabs (=09) or other control
characters that might exist.  If there is a greater variety than that, write
a while loop that looks for "=" and decodes the following two hex digits
into a single character (casting the value straight to char assumes
ISO-8859-1 data, such as the =DF for "ß" in your excerpt).  The latter would
be a good excuse to turn fileContent into a StringBuilder, which would be more
efficient on balance:

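// fileContent is a StringBuilder instance containing the entire file content.

int i = 0;

while (i < fileContent.Length - 2) {

    // Only treat "=" as an escape when two hex digits follow it.
    if (fileContent[i] == '='
        && Uri.IsHexDigit(fileContent[i + 1])
        && Uri.IsHexDigit(fileContent[i + 2])) {
        // char + char would yield an int in C#, so take a substring instead.
        string hexNumber = fileContent.ToString(i + 1, 2);
        string newString = ((char)Convert.ToInt32(hexNumber, 16)).ToString();
        // StringBuilder has no Replace(index, count, string) overload.
        fileContent.Remove(i, 3).Insert(i, newString);
    }

    i++;
}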

The above code is a little simplistic, just off the cuff and untested.  Watch
out for equal signs that do not denote a hex escape: in quoted-printable, an
"=" at the very end of a line (as in your excerpt) is a soft line break, and
it should be removed together with the line break that follows it.
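
For what it's worth, here is a rough sketch of a fuller decoder that handles
both hex escapes and soft line breaks.  The class name QuotedPrintableSketch
and the Decode signature are made up for illustration; nothing like this ships
with the framework as far as this thread knows.  It also assumes the caller
knows the charset of the data (ISO-8859-1 looks plausible for the excerpt
above, but that is a guess).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class, named for illustration only.
public static class QuotedPrintableSketch
{
    // Decodes quoted-printable text into a string.
    // "charset" is an assumption the caller has to supply.
    public static string Decode(string input, Encoding charset)
    {
        List<byte> bytes = new List<byte>(input.Length);
        int i = 0;

        while (i < input.Length)
        {
            char c = input[i];

            if (c == '=' && i + 2 < input.Length
                && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
            {
                // "=XX": two hex digits encode one raw byte.
                bytes.Add(Convert.ToByte(input.Substring(i + 1, 2), 16));
                i += 3;
            }
            else if (c == '=' && i + 1 < input.Length
                     && (input[i + 1] == '\r' || input[i + 1] == '\n'))
            {
                // Soft line break: drop the "=" and the CRLF, CR or LF after it.
                i += 2;
                if (input[i - 1] == '\r' && i < input.Length && input[i] == '\n')
                {
                    i++;
                }
            }
            else
            {
                // Anything else is copied through as a literal character.
                bytes.AddRange(charset.GetBytes(c.ToString()));
                i++;
            }
        }

        // Interpret the collected bytes in the assumed charset.
        return charset.GetString(bytes.ToArray());
    }
}
```

Called as QuotedPrintableSketch.Decode(text, Encoding.GetEncoding("ISO-8859-1")),
this should turn "Alfredstra=DFe 28=0D=0A" into "Alfredstraße 28" followed by a
CR/LF pair.  Collecting bytes first and decoding once at the end is what lets
the same loop work for multi-byte charsets such as UTF-8.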

Or if you're into regex, no doubt there are fifty bytes of gobbledygook that
will do the deed in one line.
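
A regex version could look roughly like the sketch below.  It takes two
replaces rather than one line, the class name QuotedPrintableRegexSketch is
invented for the example, and casting each byte value straight to char again
assumes ISO-8859-1 data.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for illustration only.
public static class QuotedPrintableRegexSketch
{
    public static string Decode(string input)
    {
        // Step 1: remove soft line breaks ("=" directly before a line break).
        string joined = Regex.Replace(input, "=(\r\n|\r|\n)", "");

        // Step 2: replace each "=XX" hex escape with the character it encodes.
        // Assumes ISO-8859-1, where the byte value equals the code point.
        return Regex.Replace(joined, "=([0-9A-Fa-f]{2})",
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}
```

Because the second replace is a single pass, an encoded equal sign (=3D) is
not decoded a second time.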

--Bob





Sat, 04 Jun 2005 23:22:05 GMT  
 