Fastest way to read a certain line in a text file 

I have a text file with around 31,000 lines of text.

Is there a way to open the file and jump to (for example...) line 20,000 and
get that single line's content and then close the file?

Or is there a way to serialize the data in an arraylist or a custom
structure and read only one record at a time?

Each line in the file consists of a keyword and a description separated by a
"$".
ex:  "keyword$description"

Any help would be greatly appreciated.



Sat, 15 Jan 2005 01:04:35 GMT  

Quote:
> I have a text file with around 31,000 lines of text.

> Is there a way to open the file and jump to (for example...) line 20,000 and
> get that single line's content and then close the file?

> Or is there a way to serialize the data in an arraylist or a custom
> structure and read only one record at a time?

> Each line in the file consists of a keyword and a description separated by a
> "$".
> ex:  "keyword$description"

> Any help would be greatly appreciated.

Unless the lines are guaranteed to be fixed-length, the best way I know of to
do this is to read the file from the start and iterate through the lines until
you reach the one you want. This is why a lot of flat data format files have
fixed field sizes. If you have control over the format and will routinely be
using datasets this large, I'd highly recommend changing it to use fixed field
widths for your name-value pairs. Then accessing a particular record becomes a
trivial offset calculation. For instance, use 32 characters for the name and
64 for the value. Then, to get to record n, you just open the file and jump to
offset (n-1) * (32 + 64), assuming one byte per character and no line
terminator between records. That gives you larger files to work with, but
that's the tradeoff.
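
The offset calculation could be sketched roughly like this. It is written in Python only to keep it short; in C# the same jump would presumably be a FileStream.Seek followed by a read of one record. The function names, the space padding, and the ASCII encoding are all assumptions made for the illustration, not something from the original file format.

```python
NAME_WIDTH = 32    # characters reserved for the name (assumed, from the example above)
VALUE_WIDTH = 64   # characters reserved for the value (assumed, from the example above)
RECORD_SIZE = NAME_WIDTH + VALUE_WIDTH


def write_records(path, pairs):
    """Write (name, value) pairs as space-padded, fixed-width ASCII records."""
    with open(path, "wb") as f:
        for name, value in pairs:
            record = name.ljust(NAME_WIDTH) + value.ljust(VALUE_WIDTH)
            data = record.encode("ascii")
            if len(data) != RECORD_SIZE:
                # ljust never truncates, so an over-long field shows up here
                raise ValueError("field too long for its fixed width: %r" % name)
            f.write(data)


def read_record(path, n):
    """Return the (name, value) pair stored in record n, counting from 1."""
    if n < 1:
        raise IndexError("record numbers start at 1")
    with open(path, "rb") as f:
        f.seek((n - 1) * RECORD_SIZE)   # jump straight to the record
        data = f.read(RECORD_SIZE)
    if len(data) != RECORD_SIZE:
        raise IndexError("record %d is past the end of the file" % n)
    text = data.decode("ascii")
    return text[:NAME_WIDTH].rstrip(), text[NAME_WIDTH:].rstrip()
```

With a file written this way, read_record(path, 20000) touches only the 96 bytes of that one record, however many records come before it.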

Then again, I'm an old C/C++ guy; C# may implement some clever thing under
the hood for you that will do this; however, performance is likely to be
poor unless one of the two methods listed above is used.

- Steve



Sat, 15 Jan 2005 01:29:52 GMT  
First, I'd second what Steve wrote - change the method of storage if you
can.

If you can't and if the line number you're going to need to get is
completely arbitrary and evenly distributed amongst possible values (i.e.
just as likely to be the 1st line requested as the 31000th line) then my
guess is that you will have the best success using a StreamReader's ReadLine
method, discarding (i.e. not even assigning locally) all lines before the
one you want.
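
As a rough illustration of the read-and-discard idea, here is a Python sketch; iterating over the file object plays the role that repeated StreamReader.ReadLine calls would play in C#. The names read_line_number and split_record and the UTF-8 default are assumptions for the example only.

```python
def read_line_number(path, n, encoding="utf-8"):
    """Return line n (counting from 1) without its line ending,
    or None if the file has fewer than n lines."""
    if n < 1:
        raise ValueError("line numbers start at 1")
    with open(path, "r", encoding=encoding) as f:
        for current, line in enumerate(f, start=1):
            # Earlier lines are read and dropped; nothing is kept in memory.
            if current == n:
                return line.rstrip("\n")
    return None


def split_record(line):
    """Split 'keyword$description' at the first '$'."""
    keyword, _, description = line.partition("$")
    return keyword, description
```

Reading stops as soon as the wanted line is found, so the cost grows with the line number rather than with the size of the whole file.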

Or you could try binary file access with the FileStream class, where you would
search for the byte values that represent a newline (for a CR/LF pair that's
2 bytes in ASCII or UTF-8 and 4 bytes in UTF-16).  It might be faster to use
the asynchronous BeginRead method to load a new buffer of data while a second
thread searches the current byte buffer, counting the newlines until it
reaches the right one.
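
A single-threaded version of the byte-scanning idea might look something like the Python sketch below. It deliberately leaves out the asynchronous reads and the second thread, and it assumes an ASCII or UTF-8 file, where the byte 0x0A only ever means a line feed (that does not hold for UTF-16). The function name and the buffer size are made up for the example.

```python
def read_line_binary(path, n, chunk_size=64 * 1024):
    """Return line n (counting from 1) as bytes without its CR/LF,
    or None if the file has fewer than n lines."""
    if n < 1:
        raise ValueError("line numbers start at 1")
    remaining = n - 1   # newlines still to skip before the wanted line
    parts = []          # pieces of the wanted line; it may span several buffers
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break                       # end of file
            pos = 0
            while remaining > 0:            # count newlines in this buffer
                idx = chunk.find(b"\n", pos)
                if idx == -1:
                    break
                pos = idx + 1
                remaining -= 1
            if remaining > 0:
                continue                    # wanted line starts in a later buffer
            end = chunk.find(b"\n", pos)
            if end != -1:
                parts.append(chunk[pos:end])
                return b"".join(parts).rstrip(b"\r")
            parts.append(chunk[pos:])       # line continues in the next buffer
    line = b"".join(parts)
    # A final line with no trailing newline still counts if it has content.
    return line.rstrip(b"\r") if line else None
```

The result is raw bytes, so the caller still has to decode it before splitting on "$".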

I'm guessing the second way *could* be faster, but would probably take 4-8
times more of your time to properly build, test and optimize it (using
threading for speed reasons always does that to an app, IMO).

Richard



Sat, 15 Jan 2005 03:11:29 GMT  
 