How to download whole page, not just part 
 How to download whole page, not just part

This is going to show up my ignorance something chronic... but here
goes!

I am writing an app to download webpages, extract information etc but
the page I download comes in two parts. Click View Source from the menu
bar and you get a brief (2k) page, right click on the body of the screen
itself... you get 55k!

I want the 55k but my app only returns the 2k. For some reason it was
behaving itself this afternoon; I re-ran it just now (after a reboot etc.)
and it only gives me the 2k.

A key difference might be that I was on the LAN at work this pm, but
tonight it's dial-up... I can't see how that would impact things. Either
way I want to make it bullet-proof for dial-up AND LAN connections.

Code is below, based on a weather application found in the Developersdex
code samples pages (thanks guys). You will see the function is expecting
an input of sURL, but I have hard-coded a value in the function itself,
so you can pass it any old junk.

To be fair - I am a little out of my depth and have simply squeezed the
code below till it worked. And now it ain't reliable. I would love to
know 2 things...

1) What is wrong with the code (obvious)
2) What concepts do I need to learn to avoid me sponging off you guys
and gals again

Any help paid in full (in beer generally...!)

Cheers
Nigel

=========
Option Explicit

'These two constants must be defined somewhere; BUFFER_LEN here is an
'assumed chunk size, and IF_NO_CACHE_WRITE is INTERNET_FLAG_NO_CACHE_WRITE.
Private Const BUFFER_LEN As Long = 32768
Private Const IF_NO_CACHE_WRITE As Long = &H4000000

Public Declare Function InternetOpen Lib "wininet.dll" Alias _
    "InternetOpenA" (ByVal sAgent As String, ByVal lAccessType As Long, _
    ByVal sProxyName As String, ByVal sProxyBypass As String, _
    ByVal lFlags As Long) As Long
Public Declare Function InternetOpenUrl Lib "wininet.dll" Alias _
    "InternetOpenUrlA" (ByVal hInternetSession As Long, ByVal sURL As String, _
    ByVal sHeaders As String, ByVal lHeadersLength As Long, _
    ByVal lFlags As Long, ByVal lContext As Long) As Long
Public Declare Function InternetReadFile Lib "wininet.dll" _
    (ByVal hFile As Long, ByVal sBuffer As String, _
    ByVal lNumBytesToRead As Long, lNumberOfBytesRead As Long) As Integer
Public Declare Function InternetCloseHandle Lib "wininet.dll" _
    (ByVal hInet As Long) As Integer

Public Function GetUrlSource(ByVal sURL As String) As String

Dim sBuffer As String * BUFFER_LEN, iResult As Integer, sData As String
Dim hInternet As Long, hSession As Long, lReturn As Long

sURL = "http://www.realestate.co.nz/default.asp?file=/search/results.asp" & _
    "&CtryCode=NZ&LocnCode=ACKLND&PrceTo=500000&PropTypeCode=HU" & _
    "&DateFrom=15%2F07%2F2002+2%3A01%3A01+AM"

On Error GoTo errHandler

  'get the handle of the current internet connection and the url
  hSession = InternetOpen("vb wininet", 1, vbNullString, vbNullString, 0)
  If hSession Then hInternet = InternetOpenUrl(hSession, sURL, _
      vbNullString, 0, IF_NO_CACHE_WRITE, 0)

  'if we have the handle, read the web page
  If hInternet Then
    'Keep reading until InternetReadFile reports zero bytes read.
    'Append only the lReturn characters actually read, never the whole
    'fixed-length buffer (which carries trailing padding).
    Do
      iResult = InternetReadFile(hInternet, sBuffer, BUFFER_LEN, lReturn)
      sData = sData & Left$(sBuffer, lReturn)
    Loop While lReturn <> 0
  End If
  Debug.Print sData

  'close both handles
  iResult = InternetCloseHandle(hInternet)
  iResult = InternetCloseHandle(hSession)
  GetUrlSource = sData

  Exit Function

errHandler:
    MsgBox Err.Number & vbCrLf & Err.Description, vbCritical, "Error"

End Function
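Regardless of language, the key to getting the whole page is to keep
calling the read function until it reports zero bytes, appending only the
bytes actually read each time. A minimal Python sketch of that pattern,
simulated with an in-memory stream (the helper name is my own):

```python
import io

def read_all(stream, chunk_size=4096):
    """Accumulate chunks until a read returns no data, as with InternetReadFile."""
    data = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # zero bytes read means end of the response
            break
        data += chunk
    return data

# Simulate a ~10 KB page arriving in chunks smaller than the whole document.
page = io.BytesIO(b"<html>" + b"x" * 10000 + b"</html>")
print(len(read_all(page, chunk_size=1024)))
# prints 10013
```

The loop terminates on the zero-byte read rather than on any assumption
about total size, which is what makes it safe on both slow dial-up and
fast LAN connections.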

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!



Sat, 01 Jan 2005 16:43:34 GMT  
 How to download whole page, not just part
Sounds like you are getting a frameset in the 2k base URL. The frameset
then tells your browser to load another page into each of the frame areas.
Your application will need to detect the frameset, parse out the additional
URLs, and re-request each inside URL... If you already know the inside URL
you can bypass the frameset completely. To test this, right-click on the
body (where you got the large source) and click Properties... if the page
is a frame, this will be the URL of the inside page.
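To illustrate that frameset-detection step, here is a quick Python sketch
(illustrative only; the regex and the sample HTML are my own, and a real
HTML parser would be more robust than a regex):

```python
import re

def extract_frame_urls(html):
    """Return the src of every <frame> or <iframe> tag in the page."""
    # Case-insensitive match for <frame ... src="..."> (quotes optional).
    pattern = re.compile(r'<i?frame\b[^>]*\bsrc\s*=\s*["\']?([^"\'\s>]+)',
                         re.IGNORECASE)
    return pattern.findall(html)

frameset = '''<html><frameset cols="20%,80%">
  <frame name="menu" src="menu.asp">
  <frame name="body" src="/search/results.asp?CtryCode=NZ">
</frameset></html>'''

print(extract_frame_urls(frameset))
# prints ['menu.asp', '/search/results.asp?CtryCode=NZ']
```

Each URL found this way can then be fetched with the same download routine,
after resolving relative URLs against the frameset page's own address.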

Hope this helps

--

______________________________________________
Michael B. Murdock
Starphire Technologies, LLC
"Affordable Web Content Management Solutions"
http://www.starphire.com


Quote:

> I am writing an app to download webpages, extract information etc but
> the page I download comes in two parts. Click View Source from the menu
> bar and you get a brief (2k) page, right click on the body of the screen
> itself... you get 55k!

> I want the 55k but my app only returns the 2k.

> [rest of original post and code snipped]


Sun, 02 Jan 2005 02:18:29 GMT  
 How to download whole page, not just part

What you describe makes sense in terms of the page architecture (frames
etc.), but why would it work at one point (download the frame too) and
then not work later on?

No code change, only a reboot... I was thinking it might be something to
do with the speed of delivery: the page being cached, so it worked time
and time again to start with, but a reboot (or some other event) clearing
the cache, resulting in the dial-up connection effectively "timing out"
while the frame decided to download?

Of course I may just be ranting on with no good reason...

I'm still digging on this one but as before, any help appreciated!

Regards
Nigel




Sun, 02 Jan 2005 04:53:55 GMT  
 How to download whole page, not just part
CANCEL EVERYTHING!

I found it. It seems some development work has been going on at the web
server end, and the syntax of the URL changed.

No wonder it wasn't working... DOH!

Thanks to all who put in some effort on my behalf.
Cheers
NK




Sun, 02 Jan 2005 07:29:28 GMT  
 