Can I know a file is binary or text in ruby? 

I am new to Ruby and to programming in general.
For a simple Ruby script, I need to find out whether a given file
is binary or text.
How can I get that information in Ruby?
I found the File class and the File::Stat class,
but neither has a method that reports this (something like isbinary?).
Any advice would be appreciated.
Thanks.


Wed, 05 Jan 2005 00:00:40 GMT  

Quote:
> I am new to Ruby and to programming in general.
> For a simple Ruby script, I need to find out whether a given file
> is binary or text.
> How can I get that information in Ruby?
> I found the File class and the File::Stat class,
> but neither has a method that reports this (something like isbinary?).
> Any advice would be appreciated.
> Thanks.

Ignoring for a moment what it would mean for a Unicode file to be
"binary", you could just do this:

#!/usr/bin/ruby
# From the Perl documentation:
#
# The "-T" and "-B" switches work as follows.  The first block or so of
# the file is examined for odd characters such as strange control codes
# or characters with the high bit set.  If too many strange characters
# (>30%) are found, it's a "-B" file, otherwise it's a "-T" file.  Also,
# any file containing null in the first block is considered a binary
# file.  If "-T" or "-B" is used on a filehandle, the current stdio
# buffer is examined rather than the first block.  Both "-T" and "-B"
# return true on a null file, or a file at EOF when testing a
# filehandle.  Because you have to read a file to do the "-T" test, on
# most occasions you want to use a "-f" against the file first, as in
# "next unless -f $file && -T $file".

# I don't know how to get to the stdio buffer...

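class File
  # Heuristic modeled on Perl's -B test: true if the first block looks binary.
  def self.isBinary(name)
    myStat = stat(name)
    return false unless myStat.file?
    # "rb" avoids newline translation on platforms that do it in text mode.
    open(name, "rb") { |file|
      # read(n) returns nil at EOF, so fall back to "" for an empty file;
      # blksize can be nil on platforms that do not report it.
      blk = file.read(myStat.blksize || 4096) || ""
      # to_f forces float division; integer division would truncate to 0.
      return blk.size == 0 ||
          blk.count("^ -~", "^\r\n").to_f / blk.size > 0.3 ||
          blk.count("\x00") > 0
    }
  end
end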

Dir.new('.').each { |entry|
  if File.stat(entry).file?
    puts "#{entry} #{ File.isBinary(entry) ? 'binary' : 'text' }"
  else
    puts "#{entry} directory"
  end

}
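
To try the 30% rule without touching the filesystem, the same test can be sketched as a function over a string of bytes. The name `looks_binary?`, the `threshold` parameter, and the sample strings are made up for illustration. Unlike the method above, this sketch also treats tab as an ordinary text character, which is an assumption and not something the Perl documentation spells out.

```ruby
# Hypothetical helper: applies the Perl-style heuristic to a string of bytes.
def looks_binary?(bytes, threshold = 0.3)
  bytes = bytes.b                        # work on raw bytes, whatever the encoding tag
  return true if bytes.empty?            # Perl's -B is true on an empty file
  return true if bytes.include?("\x00")  # any NUL byte means binary
  # Count bytes that are neither printable ASCII nor tab, CR, or LF.
  odd = bytes.count("^ -~", "^\t\r\n")
  odd.to_f / bytes.bytesize > threshold
end

puts looks_binary?("plain old text\n")        # false
puts looks_binary?("\xff\xfe\x01\x02abc")     # true: 4 of 7 bytes are odd
puts looks_binary?("caf\xc3\xa9 au lait\n")   # false: 2 of 14 bytes are odd
```

The last line shows why the threshold matters: a little UTF-8 in mostly ASCII text stays under 30%, while text written mostly in a non-Latin script would not.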

--
Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE


Wed, 05 Jan 2005 01:19:01 GMT  
At Sat, 20 Jul 2002 01:20:56 +0900,

Quote:

> I am new to Ruby and to programming in general.
> For a simple Ruby script, I need to find out whether a given file
> is binary or text.
> How can I get that information in Ruby?

It is impossible to determine with certainty whether a file is binary,
because "binary file" is not a well-defined concept.  The Unix `file'
command makes the decision heuristically.  It works well enough in
practice, but it is not infallible.
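
If shelling out is acceptable, a script can simply ask `file' for its guess. The sketch below assumes a `file' implementation that supports the `-b' and `--mime-encoding' options (current versions of the common implementation do) and that the command is on the PATH. The wrapper name `guessed_encoding` is hypothetical.

```ruby
require "open3"

# Hypothetical wrapper: asks file(1) which encoding it guesses for path.
# Returns a string such as "us-ascii", "utf-8" or "binary", or nil if the
# `file` command is not installed or exits with an error.
def guessed_encoding(path)
  out, status = Open3.capture2("file", "-b", "--mime-encoding", path)
  status.success? ? out.strip : nil
rescue Errno::ENOENT
  nil
end

# usage: ruby this_script.rb filename ...
ARGV.each do |fn|
  puts "#{fn}: #{guessed_encoding(fn) || 'unknown'}"
end
```

Passing the arguments to Open3.capture2 as separate strings avoids the shell, so odd characters in file names need no quoting.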

The script below tests whether a file contains any byte other than
printable ASCII and whitespace.  You could extend it to accept other
encodings, such as Latin-1.  However, some encodings are stateful,
ISO-2022 for example, and this approach does not work for them.

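  #! ruby
  # Matches any byte that is neither printable ASCII nor whitespace.
  NON_ASCII_PRINTABLE = /[^\x20-\x7e\s]/

  # Reads io in chunks of size bytes; returns false as soon as a chunk
  # matches forbidden, true if the whole stream is clean.
  def nonbinary?(io, forbidden, size = 1024)
    while buf = io.read(size)
      return false if forbidden =~ buf
    end
    true
  end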

  # usage: ruby this_script.rb filename ...
  ARGV.each do |fn|
    begin
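      # File.open never treats a name starting with "|" as a command,
      # and "rb" turns off newline translation.
      File.open(fn, "rb") do |f|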
        if nonbinary?(f, NON_ASCII_PRINTABLE)
          puts "#{fn}: ascii printable"
        else
          puts "#{fn}: binary"
        end
      end
    rescue
      puts "#$0: #$!"
    end
  end
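
As a rough sketch of one way to go beyond ASCII, a Ruby with encoding support (1.9 or later, which is an assumption about your setup) can check whether the bytes form valid UTF-8. The helper name `utf8_text?` is invented here. It makes no attempt to recognize other encodings such as Latin-1, so it is only another heuristic.

```ruby
# Hypothetical helper: true if the bytes contain no NUL and are valid UTF-8.
def utf8_text?(bytes)
  return false if bytes.b.include?("\x00")
  # dup so the caller's string keeps its original encoding tag
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

puts utf8_text?("na\xc3\xafve text\n")   # true: \xc3\xaf is a valid UTF-8 sequence
puts utf8_text?("\xff\xfe\x00binary")    # false: contains a NUL byte
puts utf8_text?("latin-1 caf\xe9\n")     # false: a lone \xe9 is not valid UTF-8
```

For a file, something like `utf8_text?(File.binread(path))` checks the whole content at once. Checking fixed-size chunks instead would need care, because a chunk boundary can fall in the middle of a multi-byte character.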



Wed, 05 Jan 2005 02:05:49 GMT  
 