Extracting an IP Address from a Linein() 
 Extracting an IP Address from a Linein()

Those are IPv4 addresses. IPv6 is completely different; it exists in the
world today and is coming soon (well, it has been "soon" for a couple of
years anyway).

Quote:

> For reasons I can explain if necessary I am processing Some of the headers of
> articles posted to Usenet, no I am not a spammer. I need to extract any IP
> address that occurs anywhere in a single line. Here are two from Patrick's
> earlier post.

> NNTP-Posting-Host: 209.29.175.109
> X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002
> 16:32:31 EDT)

<snip>


Wed, 30 Mar 2005 21:52:08 GMT  
 Extracting an IP Address from a Linein()
If you can assume that the IP address will be blank-delimited (and I think
you can), then try something like the following, where linevar contains the
linein() you just read:

ipsave = ''                        /* stays empty if nothing is found */
do i = 1 to words(linevar)
  if countstr('.',word(linevar,i)) = 3 then
    do
      testdigits = changestr('.',word(linevar,i),'') /* all are single quotes */
      if datatype(testdigits) = 'NUM' then
        do
          ipsave = word(linevar,i)
          leave
        end
    end
end

/* Now do something with ipsave  */


Quote:
> For reasons I can explain if necessary I am processing some of the headers
> of articles posted to Usenet; no, I am not a spammer. I need to extract any
> IP address that occurs anywhere in a single line. Here are two from
> Patrick's earlier post.

> NNTP-Posting-Host: 209.29.175.109
> X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002
> 16:32:31 EDT)

> These are typical. The IP address can be anywhere in the line, and it must
> be of the form N.N.N.N where any N can be an integer between 1 and 255. I
> have written one, far too big and ugly to post here, but it is very slow
> and I have 1.5GB of data to process.

> I am using standard REXX (actually Regina 3) and would welcome any ideas
> on how to write a function to test if 1-3 characters is an integer between
> 1 and 255, or any ideas on the best way to tackle the problem. My ways are
> suboptimal :(

> The string "news.ca.inter.net 1034368351 209.29.175.109" out of which I
> want 209.29.175.109 illustrates the problem very well. Any advance on 185
> lines of code?

> {R}



Wed, 30 Mar 2005 22:13:06 GMT  
 Extracting an IP Address from a Linein()


Quote:
> For reasons I can explain if necessary I am processing Some of the
> headers of articles posted to Usenet, no I am not a spammer. I
> need to extract any IP address that occurs anywhere in a single
> line. Here are two from Patrick's earlier post.

FWIW here is one quick attempt:

ln = 'X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002 16:32:31 EDT)'
ln = 'NNTP-Posting-Host: 209.29.175.109'

p. = ''
ipadd = ''
PARSE VAR ln p.1 p.2 p.3 p.4 p.5 p.6 p.7 p.8 p.9 p.10 p.11 p.12 p.13 p.14 p.15
DO i = 1 TO 15
  IF COUNTSTR('.', p.i) = 3 THEN,
    IF DATATYPE(LEFT(p.i, 1), 'NUM') THEN
      DO
        ipadd = p.i
        LEAVE i
      END
    ELSE ITERATE i
  ELSE NOP
END



Wed, 30 Mar 2005 23:19:23 GMT  
 Extracting an IP Address from a Linein()

There are quite a few test cases (I hope I identified enough of them) that you
probably want to check for.  I assumed there might be more than one IP address
per "line", and you could add additional checks to speed things up.

_______________________________________________________________________________
verifyIP: procedure; parse arg xxx
  do j=1 to words(xxx)
  z=word(xxx,j)
  say 'processing word:' z                        /*just for testing.*/
  if verify(z,'1234567890.')\==0 then iterate
  parse var z a '.' b '.' c '.' d
  if a=='' | b=='' | c=='' | d=='' then iterate
  if a>255 | b>255 | c>255 | d>255 then iterate
  if a==0  | b==0  | c==0  | d==0  then iterate        /*is this true?*/
  if length(a)>3 | length(b)>3 | length(c)>3 | length(d)>3 then iterate
  call ip_process z
  end

return

ip_process: say '----------------------- IP address found' arg(1)
return
_______________________________________________________________________________
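On the remark about adding checks to speed things up, here is a minimal sketch of a cheap pre-filter that could run on each line before the word-by-word scan. It assumes Regina (for COUNTSTR); the name worthScanning is hypothetical and not from the thread, and whether it actually saves time on 1.5GB of headers is untested.

```rexx
/* Sketch: cheap pre-filter before the word-by-word scan (Regina). */
/* worthScanning is a hypothetical name, not from the thread.       */
say worthScanning('Subject: no address here')
say worthScanning('NNTP-Posting-Host: 209.29.175.109')
exit 0

worthScanning: procedure
  parse arg text
  if countstr('.', text) < 3 then return 0              /* too few dots */
  if verify(text, '0123456789', 'M') = 0 then return 0  /* no digit     */
  return 1
```

A line that fails either test cannot contain N.N.N.N, so the caller could ITERATE straight to the next LINEIN().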

Gerard S.



Thu, 31 Mar 2005 00:02:15 GMT  
 Extracting an IP Address from a Linein()


Since I posted this, I thought about adding support for IP addresses embedded in text like:

blah blah blah 123.1.2.3(really cool site!!) yadda yadda yadda ---and---
yak yak yak amazing.cool.site (123.1.2.3) yak yak yak
blab blab blab, and check out 123.1.2.3, and you might also ...

All it would take is one more clause of REXX at the end of the routine's
first line, which turns the listed punctuation into blanks. You may also
want to add more delimiters to the TRANSLATE call:

verifyIP: procedure; parse arg xxx; xxx=translate(xxx,,'(),{};[]')

--- and I see I had a typo in the previous posting. tch-tch. ___Gerard S.
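A minimal sketch of the TRANSLATE idea applied to sample lines like the three above, assuming Regina. The helper name candidates and the extra delimiters <>" are my own additions, not from the post. The test here only checks the shape (digits and dots, exactly three dots), so range and empty-part checks would still be needed afterwards.

```rexx
/* Sketch: blank out punctuation, then scan word by word (Regina). */
/* candidates is a hypothetical helper name, not from the thread.   */
say candidates('blah blah 123.1.2.3(really cool site!!) yadda')
say candidates('yak yak amazing.cool.site (123.1.2.3) yak yak')
say candidates('blab, and check out 123.1.2.3, and you might also')
exit 0

candidates: procedure
  parse arg text
  text = translate(text, , '(),{};[]<>"')   /* delimiters become blanks */
  found = ''
  do j = 1 to words(text)
    z = word(text, j)
    if verify(z, '0123456789.') \= 0 then iterate  /* digits and dots only */
    if countstr('.', z) \= 3 then iterate          /* shape N.N.N.N        */
    found = found z
  end
  return strip(found)
```

Omitting the second argument of TRANSLATE makes the output table empty, so every listed character is replaced by the default pad, a blank.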



Thu, 31 Mar 2005 00:10:23 GMT  
 Extracting an IP Address from a Linein()


Quote:
> I suppose I should sit and RTFM so that I know what functions
> are in the language I have never used, I think it would be
> worthwhile.

It can be one of the best ways to find out what features a language has
available.  And simply doing it once can plant seeds in your mind for
when you tackle a problem (like this one :).

David Martin
DynaComp Solutions
http://DynaComp-Solutions.com



Thu, 31 Mar 2005 01:23:02 GMT  
 Extracting an IP Address from a Linein()

Quote:

>For reasons I can explain if necessary I am processing Some of the headers of
>articles posted to Usenet, no I am not a spammer. I need to extract any IP
>address that occurs anywhere in a single line. Here are two from Patrick's
>earlier post.

>NNTP-Posting-Host: 209.29.175.109
>X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002
>16:32:31 EDT)

...

My 2cts:

/* scan ip addresses from file */
Parse Arg fn
 Do While lines(fn) > 0
  IpNumber = IPaddress(LineIn(fn));
  if IPNumber = '' Then Iterate; /* ignore empty records */
  /* do something here with ipaddress */
 End; /* do while */
Return 0; /* program exit, back to caller */

IPaddress: Procedure; /* extract ip number */
Parse Arg Data
 if Data = '' Then Return ''; /* no scan here */
 Parse Var Data Item Data /* extract 1st item here */
 Parse Var Item N.1'.'N.2'.'N.3'.'N.4  
 Do i = 1 to 4;
  If DataType(N.i,'N') = '0', /* number here ? */
  Then Return IPaddress(data); /* no, next attempt recursively */
 End;
Return Item; /* return a valid ipaddress here */

-wolfram



Thu, 31 Mar 2005 02:41:24 GMT  
 Extracting an IP Address from a Linein()

Don't forget to check for a range.  All of the below are invalid and
pass muster by the above routine:

1.2.3.0
1.2.3.444
1.+2.3.4
1.-2.3.4
1.2e0.3.4
1.2.3E+1.4
1.2.000000000003.4
1.2.3.4.6

Your  DATATYPE  check could be changed to:
          if \datatype(n.i,'W') then return IPaddress(data)
but that won't catch all the errors.  It would be better to use a
simple VERIFY  (as indicated by my earlier post). _________Gerard S.
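A minimal sketch of a stricter per-word test that rejects each of the forms listed above, assuming Regina (for COUNTSTR). The name isDottedQuad is hypothetical. The 1 to 255 range follows the original poster's stated rule; whether a zero part should really be rejected is an open question.

```rexx
/* Sketch: strict test of one blank-delimited word (Regina REXX). */
/* isDottedQuad is a hypothetical name, not from the thread.      */
samples = '1.2.3.0 1.2.3.444 1.+2.3.4 1.-2.3.4 1.2e0.3.4',
          '1.2.3E+1.4 1.2.000000000003.4 1.2.3.4.6 209.29.175.109'
do k = 1 to words(samples)
  w = word(samples, k)
  say left(w, 20) isDottedQuad(w)
end
exit 0

isDottedQuad: procedure
  parse arg w
  if verify(w, '0123456789.') \= 0 then return 0   /* signs, E, letters  */
  if countstr('.', w) \= 3 then return 0           /* exactly four parts */
  parse var w n.1 '.' n.2 '.' n.3 '.' n.4
  do i = 1 to 4
    if n.i == '' then return 0                     /* empty part: 1..2.3 */
    if length(n.i) > 3 then return 0               /* 000000000003       */
    if n.i < 1 | n.i > 255 then return 0           /* asker's 1-255 rule */
  end
  return 1
```

Because VERIFY has already limited the word to digits and dots, the numeric comparisons never see signs or exponents.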



Thu, 31 Mar 2005 03:05:55 GMT  
 Extracting an IP Address from a Linein()


%                                                   I need to extract any IP
% address that occurs anywhere in a single line. Here are two from Patrick's
% earlier post.
%
% NNTP-Posting-Host: 209.29.175.109
% X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002
% 16:32:31 EDT)

This will work provided the IP addresses are space-delimited:

  line = 'X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002 16:32:31 EDT)'

  do while line \= ''
    parse var line . a '.' b '.' c '.' d line
    if datatype(a) || datatype(b) || datatype(c) || datatype(d) ==,
         'NUMNUMNUMNUM' & 0 <= a & 255 >= a & 0 <= b & 255 >= b & 0 <= c,
         & 255 >= c & 0 <= d & 255 >= d then
       say a||'.'||b||'.'||c||'.'||d
    end
--

Patrick TJ McPhee
East York  Canada



Thu, 31 Mar 2005 01:04:37 GMT  
 Extracting an IP Address from a Linein()
hi

Quote:

----- snip

> Your  DATATYPE  check could be changed to:
>           if \datatype(n.i,'W') then return IPaddress(data)
> but that won't catch all the errors.

Setting "NUMERIC DIGITS 3" for that part of the program would limit the
'approved' numbers to 1, 2 or 3 digits, but it wouldn't discard 0 or
numbers > 255.

Quote:
> It would be better to use a simple VERIFY  (as indicated by my earlier post).

I also like VERIFY a lot, but in this situation it wouldn't discard 0 or
numbers > 255 either.

--
good luck

peter



Thu, 31 Mar 2005 12:02:31 GMT  
 Extracting an IP Address from a Linein()


Quote:

>| For reasons I can explain if necessary I am processing Some of the headers of
>| articles posted to Usenet, no I am not a spammer. I need to extract any IP
>| address that occurs anywhere in a single line. Here are two from Patrick's
>| earlier post.
>|
>| NNTP-Posting-Host: 209.29.175.109
>| X-Trace: news.ca.inter.net 1034368351 209.29.175.109 (Fri, 11 Oct 2002
>| 16:32:31 EDT)
>|
>| These are typical, the IP address can be anywhere in the line, they must be
>| of the form N.N.N.N where any N can be an integer between 1 and 255. I have
>| written one, far to big and ugly to post here, but it is very slow, I have
>| 1.5GB of data to process.

FYI: some Pacific Rim sites (over a dozen seen here so far) are misconfigured.
The IP address shows in the header as www-xxx-yyy-zzz instead of www.xxx.yyy.zzz

Mike-

-------------------------------------------------------------------
              My web page is at http://www.catherders.com

          Because network administration is like herding cats.
-------------------------------------------------------------------

Due to the amazing (and disgusting) amount of unsolicited email  (spam) we
have been receiving, site-wide email filters have been  installed at
catherders.com. Exact details of how this has been  configured will not be
disclosed (I don't want the darned spammers  to figure workarounds), but I
will say that mail from .tw and .kr  domains is bounced without exception.
If this makes PacRim folks  unhappy, get those open relays fixed, folks!

-------------------------------------------------------------------




Thu, 31 Mar 2005 16:06:56 GMT  
 Extracting an IP Address from a Linein()
And of course if REXX had regex support...................

Regards

Dave Saville

NB switch saville for nospam in address



Thu, 31 Mar 2005 20:03:24 GMT  
 Extracting an IP Address from a Linein()


Quote:
>For reasons I can explain if necessary I am processing Some of the
>headers of articles posted to Usenet, no I am not a spammer. I need
>to extract any IP address that occurs anywhere in a single line. Here
>are two from Patrick's earlier post.

You're probably asking the wrong question. I assume that what you
really want to do is to extract IP addresses from Received headers.
There are a couple of ugly problems:

 1. Forgery

 2. Lack of standardization

--
     Shmuel (Seymour J.) Metz, SysProg and JOAT
     Atid/2, Team OS/2, Team PL/I

Any unsolicited commercial junk E-mail will be subject to legal
action.  I reserve the right to publicly post or ridicule any
abusive E-mail.

I mangled my E-mail address to foil automated spammers; reply to
domain Patriot dot net user shmuel+news to contact me.  Do not



Thu, 31 Mar 2005 10:43:36 GMT  
 Extracting an IP Address from a Linein()


Quote:


>>For reasons I can explain if necessary I am processing Some of the
>>headers of articles posted to Usenet, no I am not a spammer. I
>>need to extract any IP address that occurs anywhere in a single
>>line. Here are two from Patrick's earlier post.

> You're probably asking the wrong question. I assume that what you
> really want to do is to extract IP addresses from Received
> headers. There are a couple of ugly problems:

>  1. Forgery

>  2. Lack of standardization

I just saw an IP string in a header written as nnn-nnn-nnn-nnn,
so some of the previous methods are already proven broken.
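One possible way to cope with the dashed form, sketched under the assumption that the address still arrives as a single blank-delimited word (Regina, for COUNTSTR). The helper name normalise is hypothetical. It only rewrites the separators; the result would still need the usual range checks.

```rexx
/* Sketch: accept nnn-nnn-nnn-nnn as well as nnn.nnn.nnn.nnn (Regina). */
/* normalise is a hypothetical helper name, not from the thread.        */
say normalise('209-29-175-109')      /* dashed form is rewritten        */
say normalise('209.29.175.109')      /* dotted form is left alone       */
say normalise('2002-10-11')          /* not four parts: left alone      */
exit 0

normalise: procedure
  parse arg z
  if verify(z, '0123456789-') = 0 & countstr('-', z) = 3 then
    return translate(z, '.', '-')    /* every '-' becomes '.'           */
  return z
```

Testing for exactly three dashes first keeps dates and other dashed tokens from being mistaken for addresses.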


Fri, 01 Apr 2005 00:38:03 GMT  
 Extracting an IP Address from a Linein()

Quote:
>verifyIP: procedure: parse arg xxx
>  do j=1 to words(xxx)
>  z=word(xxx,j)
>  say 'processing word:' z                        /*just for testing.*/
>  if verify(z,'1234567890.')\==0 then iterate
>  parse var z a '.' b '.' c '.' d
>  if a=='' | b=='' | c=='' | d=='' then iterate
>  if a>255 | b>255 | c>255 | d>255 then iterate
>  if a==0  | b==0  | c==0  | d==0  then iterate        /*is this true?*/

No, this isn't true.  For instance, the address of www.ox.ac.uk is
163.1.0.45.  It's unusual to find an IP address ending with 0, but even
that has been known to happen.  (I think NTLworld includes them in its
pool of dialup addresses.  Occasionally I have been allocated one and
firewall bugs have prevented me from connecting to the office...)

On the other hand, IP addresses *beginning* with 0 are reserved for
special purposes [RFC3330].

Quote:
>  if length(a)>3 | length(b)>3 | length(c)>3 | length(d)>3 then iterate

I'm not sure there is much point testing this when you know the numbers
are all below 256.  Sure, this disqualifies 4.3.2.0100, but it doesn't
disqualify 4.3.2.010 (and who is to say that's not a valid IP address
anyway?).
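The range rule described above could be sketched roughly as follows, assuming the four parts have already been checked to be digits only. The name inRange is hypothetical, and rejecting a leading zero octet outright is my reading of "reserved for special purposes", not something the post prescribes.

```rexx
/* Sketch: range test allowing zero octets except the first (Regina). */
/* inRange is a hypothetical name; the parts are assumed to be digits. */
say inRange('163.1.0.45')     /* zero in the middle is accepted        */
say inRange('0.1.2.3')        /* leading zero octet is rejected here   */
say inRange('4.3.2.256')      /* above 255 is rejected                 */
exit 0

inRange: procedure
  parse arg a '.' b '.' c '.' d
  if a < 1 | a > 255 then return 0       /* first octet: 1..255        */
  if b > 255 | c > 255 | d > 255 then return 0
  return 1
```

No length test is included, so 4.3.2.010 is accepted, in line with the point made about it above.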
--

------ http://users.comlab.ox.ac.uk/ian.collier/imc.shtml

New to this group?  Answers to frequently-asked questions can be had from
http://rexx.hursley.ibm.com/rexx/ .



Fri, 01 Apr 2005 20:13:47 GMT  
 