how to find extended ascii characters? 
 how to find extended ascii characters?

I need to be able to find an ascii string and the extended ascii
characters that precede and follow it. For example,  ?DES  (In case
the email hoses this, that is an "A" with a ~ on top followed by
the ascii string DES. I think the extended ascii number is 227.)
Following this is another extended ascii character and then a
string of text that I want to put into a new file.

Here is the full line I'm trying to find and extract.

?DES a3NUT,WELD,1/2-13THD,4 PJTN

What I want to extract is "NUT,WELD,1/2-13THD,4 PJTN"

Before my CAD vendor changed their file format I was able to do the
following.

foreach prt (*.prt*)
nawk   '/ DES/ {c_disc = NR+3};\
(NR == c_disc) && ($3 !~ "NULL") {description = substr($0,6)};\
END {print (FILENAME,"\n"description,"\n""#")} '
--
+-----------------------------------------------------------------+
| David A. Haigh                                                  |
| Alcatel USA                                                     |
| 1400 McDowell Blvd, Petaluma, CA 94954                          |
| phone:  (707)7927-007   <--- Yes double "O" seven!              |
| fax:    (707)792-7059                                           |

+-----------------------------------------------------------------+
 The opinions represented here are not necessarily those of Alcatel



Tue, 12 Aug 2003 03:49:46 GMT  
 how to find extended ascii characters?
Sorry for the multiple posts. My Netscape wasn't configured correctly
and it was telling me that it couldn't send my message. (However, it
did send them.)
--
+-----------------------------------------------------------------+
| David A. Haigh                                                  |
| Alcatel USA                                                     |
| 1400 McDowell Blvd, Petaluma, CA 94954                          |
| phone:  (707)7927-007   <--- Yes double "O" seven!              |
| fax:    (707)792-7059                                           |

+-----------------------------------------------------------------+
 The opinions represented here are not necessarily those of Alcatel


Tue, 12 Aug 2003 05:28:10 GMT  
 how to find extended ascii characters?

...
Quote:
>Here is the full line I'm trying to find and extract.

>?DES a3NUT,WELD,1/2-13THD,4 PJTN

>What I want to extract is "NUT,WELD,1/2-13THD,4 PJTN"

...

How about passing the files through tr before they get to nawk? If ? is
ASCII 227, it's octal \343, so

tr "\343" " " < file | nawk '...'

This becomes problematic if there are many different extended ASCII
characters that need to be handled. In that case, it might almost be worth
it to write a small C program to convert any character with ASCII value >
127 into a space char, but it may make more sense just to use perl. If your
nawk has an ord function, you could use a two-step pattern like

match($0, /.DES/) && ord(substr($0, RSTART)) > 127
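
For the "many different extended characters" case, a small C program may not be needed: tr can take a byte range. The following is only a sketch, written for a POSIX sh rather than csh. The function name blank_high_bytes is made up for illustration, and it assumes a tr that accepts octal ranges and the POSIX "[c*]" repeat form in the second set.

```shell
# Hypothetical helper: replace every byte above 127 with a space.
blank_high_bytes() {
    # \200-\377 is octal for 128-255. "[ *]" repeats the space so the
    # second set is as long as the first. LC_ALL=C makes tr work on
    # single bytes instead of multibyte characters.
    LC_ALL=C tr '\200-\377' '[ *]'
}
```

It would be used as a filter in front of nawk, for example blank_high_bytes < file | nawk '...'.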



Tue, 12 Aug 2003 09:09:26 GMT  
 how to find extended ascii characters?

Quote:
> Sorry for the multiple post. My netscape wasn't configured correctly and
> it was telling me that it couldn't send my message. (however it did send
> them.)

If you're using NS4 then I suggest you switch to 4.7[56] as it seems to be
the least buggy. If you are already running it, then just curse and swear.

-Ed

--
                                                     | u98ejr

             Share, and enjoy.                       | eng.ox
                                                     | .ac.uk



Tue, 12 Aug 2003 09:31:54 GMT  
 how to find extended ascii characters?

Quote:

> I need to be able to find an ascii string and the extended ascii
> characters that precede and follow it. For example,  ?DES  (In case
> the email hoses this, that is an "A" with a ~ on top followed by
> the ascii string DES. I think the extended ascii number is 227.)
> Following this is another extended ascii character and then a
> string of text that I want to put into a new file.

> Here is the full line I'm trying to find and extract.

> ?DES a3NUT,WELD,1/2-13THD,4 PJTN

> What I want to extract is "NUT,WELD,1/2-13THD,4 PJTN"

> Before my CAD vendor changed their file format I was able to do the
> following.

> foreach prt (*.prt*)
> nawk   '/ DES/ {c_disc = NR+3};\
> (NR == c_disc) && ($3 !~ "NULL") {description = substr($0,6)};\
> END {print (FILENAME,"\n"description,"\n""#")} '

Maybe I'm missing something here, but what's the problem with using

nawk   '/?DES/ {c_disc = NR+3};\

on your platform?



Tue, 12 Aug 2003 13:39:59 GMT  
 how to find extended ascii characters?
going for broke, but you could use this
to drop out all the extended ascii characters:
tr -d '\200-\377' < file > clean-file
(untested; octal 200-377 is decimal 128-255, and some System V
versions of tr want the range in brackets, '[\200-\377]')
jen
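
One way to check the octal values before deleting anything is to ask od which high bytes a file really contains. This is a sketch for a POSIX sh, not something from the thread: the function name list_high_bytes is invented here, and it assumes an od that supports the -An, -v and -to1 options.

```shell
# Hypothetical helper: list each distinct byte value above 127 read
# from stdin, as three-digit octal, one per line.
list_high_bytes() {
    LC_ALL=C od -An -v -to1 |   # dump every byte as 3-digit octal
    tr -s ' ' '\n' |            # put one value on each line
    grep '^[23]' |              # 200-377 all begin with 2 or 3
    sort -u                     # keep each value once
}
```

Running list_high_bytes < file shows which codes need handling before a tr range is chosen.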
--
Quote:



> ...
> >Here is the full line I'm trying to find and extract.

> >?DES a3NUT,WELD,1/2-13THD,4 PJTN

> >What I want to extract is "NUT,WELD,1/2-13THD,4 PJTN"
> ...

> How about passing the files through tr before they get to nawk? If ? is
> ASCII 227, it's octal \343, so

> tr "\343" " " < file | nawk '...'

> This becomes problematic if there are many different extended ASCII
> characters that need to be handled. In that case, it might almost be worth
> it to write a small C program to convert any character with ASCII value >
> 127 into a space char, but it may make more sense just to use perl. If your
> nawk has an ord function, you could use a two-step pattern like

> match($0, /.DES/) && ord(substr($0, RSTART)) > 127



Tue, 12 Aug 2003 15:35:00 GMT  
 how to find extended ascii characters?
Thanks for all the suggestions. Harlan's suggestions gave me what
I needed. The syntax for my command is below if anyone is interested.

foreach prt (i2*.prt*)
    tr "\343" "\012" < $prt | tr "\342" "\012" | tr "\0" "\012" | \
    sed 's/SAME AS DES/junk/' | \
    nawk '/DES/ {c_disc = NR+2}; (NR == c_disc) {descrpt = substr($0,2)}; END {print (FILENAME,"\n"descrpt,"\n""#")}'
end

Note: there is a problem with "FILENAME" in the print section. But
I'll start a new thread on that.
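
For anyone on a Bourne-style shell, the same idea might look roughly like the sketch below. It is not the exact command above: the three tr calls are folded into one, and because awk is reading a pipe it has no real file name to report, so the name is handed in with -v. The function name extract_desc and the variable fname are made up for illustration, and the layout assumed (marker byte 343, "DES", a NUL, marker byte 342, one extra character, then the text) is only inferred from the command above.

```shell
# Hypothetical helper: print the file name, the description, and "#".
extract_desc() {
    # Turn the two marker bytes and NUL into newlines in one pass.
    LC_ALL=C tr '\343\342\000' '\n\n\n' < "$1" |
    LC_ALL=C sed 's/SAME AS DES/junk/' |
    LC_ALL=C awk -v fname="$1" '
        /DES/        { c_disc = NR + 2 }           # text is 2 lines on
        NR == c_disc { descrpt = substr($0, 2) }   # drop the first char
        END          { print fname; print descrpt; print "#" }
    '
}
```

A loop such as for prt in i2*.prt*; do extract_desc "$prt"; done would then cover every part file. Unlike the print list in the nawk version, this prints the three items on separate lines with no padding spaces.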
--
+-----------------------------------------------------------------+
| David A. Haigh                                                  |
| Alcatel USA                                                     |
| 1400 McDowell Blvd, Petaluma, CA 94954                          |
| phone:  (707)7927-007   <--- Yes double "O" seven!              |
| fax:    (707)792-7059                                           |

+-----------------------------------------------------------------+
 The opinions represented here are not necessarily those of Alcatel



Sat, 16 Aug 2003 05:28:19 GMT  
 how to find extended ascii characters?
: I need to be able to find an ascii string and the extended ascii
: characters that precede and follow it. For example,  ?DES  (In case
: the email hoses this, that is an "A" with a ~ on top followed by
: the ascii string DES. I think the extended ascii number is 227.)

   GNU awk will let you find strings by their hex or octal codes:

   \xE3 = decimal 227 = \343 (octal)

You can also use character classes:

    [\xDF-\xEE]

if you'd like.
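
As a rough sketch of how that could be applied to the line in question, the function below matches the marker bytes directly in awk and prints what follows. It uses octal escapes rather than \x, since those are not limited to GNU awk. The function name extract_with_awk is invented for this example, and two things are assumptions taken from the rest of the thread: the second marker is byte 342, and one extra character (the "3") sits between it and the wanted text.

```shell
# Hypothetical helper: print the text after the DES marker pair.
extract_with_awk() {
    LC_ALL=C awk '
        # Byte 343, "DES", anything that is not byte 342, then byte 342.
        match($0, /\343DES[^\342]*\342/) {
            # Skip the match plus the single character after it.
            print substr($0, RSTART + RLENGTH + 1)
        }
    ' "$1"
}
```

LC_ALL=C is there so awk counts bytes rather than multibyte characters. If the real files have a NUL between "DES" and the second marker, some awk versions may cut the line short, so the tr approach could still be the safer one.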

--
Eric Pement



Wed, 27 Aug 2003 08:10:55 GMT  
 