Please help me to concatenate data files 
Author Message
 Please help me to concatenate data files

Hello

May I please ask for your help.  I am a sociology grad student.  For
my research I must create one Main Input file using 90 data files.  I
must then repeat the process 29 times for the other Input variables.
I tried doing this manually [the Copy/Paste method].  I found that
this takes about 6 hours per variable.   I tried writing a Visual
Basic Macro [please see below], but my programming skills are clearly
inadequate.  May I please ask you to help me to write such a Macro?

I have 90 [csv] files named in order [file1, file2, ..., file90].
Each file contains two columns [the date and the observation].  The
observation is a positive number, Unless it is "N/A" [Not Available].
The macro should:

a)      Take every file and delete the rows [both the date and "N/A"
symbol] with N/A in the "observation column".  [the remaining rows
should move up.]   Thus none of the resulting 90 data files [call them
file#a] should have rows containing "N/A" or empty rows.  The Macro
below was my attempt to do this for One file with one data column in
it.

b)      Take the first X [i.e. 500] rows from file1a, place them into
the Main Input File,  repeat this for files file2a, file3a, ...
,file90a.
Thus the resulting Main Input File would have 90*X = 90*500 rows, and
two columns [date and observation].

Thank You very much, I sincerely appreciate your help and/or
suggestions.                    

Regards

Stan Miles

Sub MacroDataPreprocessing()

Dim Input_Array(1 To 10) As Double
Dim Output_Array(1 To 10) As Double

Dim Counter1, Number

Open "C:\myfiles\data1.txt" For Input As #1

For Counter = 1 To 10
        Input #1, Number    'reads consecutive numbers from the input
file
        Input_Array(Counter) = Number    'each number is stored in the
array
Next Counter

For Counter = 1 To 10

        If Input_Array(Counter) > 0 Then
            Output_Array(Counter) = Input_Array(Counter)

        End If

Next Counter

Close

Open "C:\myfiles\Output1.txt" For Output As #1

For Counter = 1 To 10
    Write #1, Output_Array(Counter)
Next Counter

Close

End Sub

'When I try to execute this, and there are letters in the input file
' I get an error message "type mismatch".  I suppose when an array
' is being formed, letters are not counted as numbers...
' when  I replace letters by negative numbers, I get 0's in the Output
'file,  and not a "skip = move up the rows below"...



Thu, 30 Jan 2003 03:00:00 GMT  
 Please help me to concatenate data files
Since you legimately don't appear to be a computing student asking for their
programming homework to be done, the following code is probably what you
need.  It probably won't offer the most optimal performace, but hey, you
only have to run it once.  Further, I find it unlikely that your 90 files
combined are many megabytes large.  Anyway, since I don't have some sample
data files to play with, I can't test it, but it should work based on your
description.

BTW, stop crossposting so darn much!  It isn't necessary, besides the fact
that your problem really has no place whatsoever in powerbasic and Database
newsgroups.

Sub MacroDataPreprocessing()
Dim i As Long, j As Long
Dim TempString As String
  'Go through all fileX.csv files and write clean versions to fileXa.csv
  For i = 1 To 90
  Open "C:\myfiles\file" & CStr(i) & ".csv" For Input As #1
  Open "C:\myfiles\file" & CStr(i) & "a.csv" For Output As #2
    Do While Not EOF(1) 'Loop until the end of file marker is reached
      Line Input #1, TempString  'Read a line into the TempString buffer
      If InStr(1, TempString, "N/A", vbTextCompare) = 0 _
      And Trim(TempString) <> "" Then  'Check if this is a {*filter*}one
        'This is not a {*filter*}one, let's write it to our other file
        Print #2, TempString
      End If
    Loop
  Close #1
  Close #2
  Next i

  'Part (b); Write first 500 lines to our Main Input File
  Open "C:\myfiles\MainInputFile.csv" For Output As #1
    For i = 1 To 90
    Open "C:\myfiles\file" & CStr(i) & "a.csv" For Input As #2
      For j = 1 To 500
        Line Input #1, TempString  'Read a line into the TempString buffer
        Print #2, TempString       'Dump it into our Main Input File
      Next j
    Close #2
  Close #1

End Sub

--
Howard Henry Schlunder
 Winamp is currently playing:
 no doubt--you can do it


Quote:
> Hello

> May I please ask for your help.  I am a sociology grad student.  For
> my research I must create one Main Input file using 90 data files.  I
> must then repeat the process 29 times for the other Input variables.
> I tried doing this manually [the Copy/Paste method].  I found that
> this takes about 6 hours per variable.   I tried writing a Visual
> Basic Macro [please see below], but my programming skills are clearly
> inadequate.  May I please ask you to help me to write such a Macro?

> I have 90 [csv] files named in order [file1, file2, ..., file90].
> Each file contains two columns [the date and the observation].  The
> observation is a positive number, Unless it is "N/A" [Not Available].
> The macro should:

> a) Take every file and delete the rows [both the date and "N/A"
> symbol] with N/A in the "observation column".  [the remaining rows
> should move up.]   Thus none of the resulting 90 data files [call them
> file#a] should have rows containing "N/A" or empty rows.  The Macro
> below was my attempt to do this for One file with one data column in
> it.

> b) Take the first X [i.e. 500] rows from file1a, place them into
> the Main Input File,  repeat this for files file2a, file3a, ...
> ,file90a.
> Thus the resulting Main Input File would have 90*X = 90*500 rows, and
> two columns [date and observation].

> Thank You very much, I sincerely appreciate your help and/or
> suggestions.

> Regards

> Stan Miles

> Sub MacroDataPreprocessing()

> Dim Input_Array(1 To 10) As Double
> Dim Output_Array(1 To 10) As Double

> Dim Counter1, Number

> Open "C:\myfiles\data1.txt" For Input As #1

> For Counter = 1 To 10
>         Input #1, Number    'reads consecutive numbers from the input
> file
>         Input_Array(Counter) = Number    'each number is stored in the
> array
> Next Counter

> For Counter = 1 To 10

>         If Input_Array(Counter) > 0 Then
>             Output_Array(Counter) = Input_Array(Counter)

>         End If

> Next Counter

> Close

> Open "C:\myfiles\Output1.txt" For Output As #1

> For Counter = 1 To 10
>     Write #1, Output_Array(Counter)
> Next Counter

> Close

> End Sub

> 'When I try to execute this, and there are letters in the input file
> ' I get an error message "type mismatch".  I suppose when an array
> ' is being formed, letters are not counted as numbers...
> ' when  I replace letters by negative numbers, I get 0's in the Output
> 'file,  and not a "skip = move up the rows below"...



Thu, 30 Jan 2003 03:00:00 GMT  
 Please help me to concatenate data files

Quote:
>   'Part (b); Write first 500 lines to our Main Input File
>   Open "C:\myfiles\MainInputFile.csv" For Output As #1
>     For i = 1 To 90
>     Open "C:\myfiles\file" & CStr(i) & "a.csv" For Input As #2
>       For j = 1 To 500
>         Line Input #1, TempString  'Read a line into the TempString buffer
>         Print #2, TempString       'Dump it into our Main Input File

Oops, slight mismatch.  This should be:
        Line Input #2, TempString  'Read a line into the TempString buffer
        Print #1, TempString       'Dump it into our Main Input File
Quote:
>       Next j
>     Close #2
>   Close #1



Thu, 30 Jan 2003 03:00:00 GMT  
 Please help me to concatenate data files


Quote:
> >   'Part (b); Write first 500 lines to our Main Input File
> >   Open "C:\myfiles\MainInputFile.csv" For Output As #1
> >     For i = 1 To 90
> >     Open "C:\myfiles\file" & CStr(i) & "a.csv" For Input As #2
> >       For j = 1 To 500
> >         Line Input #1, TempString  'Read a line into the TempString
buffer
> >         Print #2, TempString       'Dump it into our Main Input File

> Oops, slight mismatch.  This should be:
>         Line Input #2, TempString  'Read a line into the TempString buffer
>         Print #1, TempString       'Dump it into our Main Input File

> >       Next j
> >     Close #2

Also needs to have a "Next i" here. Otherwise nice work Howard. :)

Quote:
> >   Close #1

Stan,

Assuming the files "fileXa.csv" are junk and needs deleted later anyways,
you could combine the two parts to eliminate double work. Here is a minor
rewrite combining the two parts. I also included a check to bail in case any
of the input files falls short of the 500(whatever) line requirement due to
the filter process.

Sub MacroDataPreprocessingVer2()
Dim i As Long, j As Long
Dim TempString As String
  'Go through all fileX.csv files to remove "N/A" and any blank lines
  'and write first 500 clean lines of each file to "MainInputFile.csv"
  Open "C:\myfiles\MainInputFile.csv" For Output As #1
    For i = 1 To 90
    j=0
    Open "C:\myfiles\file" & CStr(i) & ".csv" For Input As #2
      Do While Not Eof(2) Or j = 500 'Set 500 to whatever here
        Line Input #2, TempString  'Read a line into the TempString buffer
        If InStr(1, TempString, "N/A", vbTextCompare) = 0 _
        And Trim(TempString) <> "" Then  'Check if this is a {*filter*}one
          'This is not a {*filter*}one, let's write it to our Main file
          Print #1, TempString       'Dump it into our MainInputFile
          j=j+1
        End If
      Loop
    Close #2
    Next i
  Close #1
End Sub

HTH,
Todd Vargo (body of msg must contain my name to reply)



Thu, 30 Jan 2003 03:00:00 GMT  
 Please help me to concatenate data files

Quote:
> > Oops, slight mismatch.  This should be:
> >         Line Input #2, TempString  'Read a line into the TempString
buffer
> >         Print #1, TempString       'Dump it into our Main Input File

> > >       Next j
> > >     Close #2

> Also needs to have a "Next i" here. Otherwise nice work Howard. :)

Indeed you are right.  That was a second error that slipped through due to
the utter lack of testing or proof reading.  It was brought to my attention
via email, but I neglected to post it since the work has already been
completed successfully (or at least as I was told).

Quote:
> Assuming the files "fileXa.csv" are junk and needs deleted later anyways,
> you could combine the two parts to eliminate double work. Here is a minor
> rewrite combining the two parts. I also included a check to bail in case
any
> of the input files falls short of the 500(whatever) line requirement due
to
> the filter process.

While the idea is good, that isn't what you coded.  What actually will
happen is every single valid line will get pushed into the MainInputFile,
regardless of if 500 lines have already been pushed into it.

Quote:
> Sub MacroDataPreprocessingVer2()
> Dim i As Long, j As Long
> Dim TempString As String
>   'Go through all fileX.csv files to remove "N/A" and any blank lines
>   'and write first 500 clean lines of each file to "MainInputFile.csv"
>   Open "C:\myfiles\MainInputFile.csv" For Output As #1
>     For i = 1 To 90
>     j=0
>     Open "C:\myfiles\file" & CStr(i) & ".csv" For Input As #2
>       Do While Not Eof(2) Or j = 500 'Set 500 to whatever here

j will only equal 500 on the 500th iteration.  For all the times where
(j=500) evaluates to False, the loop will likely continue because Not EOF(2)
will still evaluate to True.  What you actually meant to write, is most
likely:

    Do While Not EOF(2) And j <> 500  'Set 500 to whatever here

or if we used Demorgan's logical rule and factored a negation out of it:

    Do Until EOF(2) Or j = 500 'Set 500 to whatever here

--
Howard Henry Schlunder
 Winamp is currently playing:
 Merril Bainbridge --  Under the Water



Fri, 31 Jan 2003 03:00:00 GMT  
 Please help me to concatenate data files


Quote:
> >     Open "C:\myfiles\file" & CStr(i) & ".csv" For Input As #2
> >       Do While Not Eof(2) Or j = 500 'Set 500 to whatever here

> j will only equal 500 on the 500th iteration.  For all the times where
> (j=500) evaluates to False, the loop will likely continue because Not
EOF(2)
> will still evaluate to True.  What you actually meant to write, is most
> likely:

>     Do While Not EOF(2) And j <> 500  'Set 500 to whatever here

> or if we used Demorgan's logical rule and factored a negation out of it:

>     Do Until EOF(2) Or j = 500 'Set 500 to whatever here

Right you are, this last line is what I meant but due to lack of testing...

--
Todd Vargo (body of msg must contain my name to reply)



Fri, 31 Jan 2003 03:00:00 GMT  
 
 [ 6 post ] 

 Relevant Pages 

1. Binary Data File - PLEASE HELP

2. help reading negative values in data file - test code and test data

3. please, please, please, please, help

4. will someone please, please, please, please HELP me?!!

5. Data File Info Please

6. Please Help - NetTools Help File needed

7. Read VFP data files into Clipper data files

8. I have 100 data files, I want to join them together as one data file

9. Unix Data files vs DOS data files

10. non-gridded ASCII data file to netCDF data file

11. Fundamental Question - Help Filtering Data - Please

12. C4 - data module, please HELP!!

 

 
Powered by phpBB® Forum Software