Perl multiple match RE in Ruby? 
 Perl multiple match RE in Ruby?


Have this in Perl, would prefer it in Ruby.


Need to take String and get Array of all successive matches against string.

Regex itself is fine, but hope there's an idiomatic way to run this without
building a loop using MatchData#[1] and MatchData#post_match. Which I can
do, but it seems clumsy.

Thanks...
  -michael

<signature>
  <name>Michael C. Libby</name>

  <web-site> http://www.*-*-*.com/ </web-site>
  <public-key> http://www.*-*-*.com/ </public-key>
</signature>



Sat, 09 Apr 2005 10:54:09 GMT  
 Perl multiple match RE in Ruby?
Hi,

Quote:

> Have this in Perl, would prefer it in Ruby.


> Need to take String and get Array of all successive matches against string.

How about String#scan ?

% ruby -e 'p "a b c".scan(/\w/)'
["a", "b", "c"]

--
eban



Sat, 09 Apr 2005 11:52:51 GMT  
 Perl multiple match RE in Ruby?


Quote:
> How about String#scan ?

That did the trick! This is why I love Ruby, so easy to get stuff done. The
Ruby version is SO much more readable than the Perl too.

So here's my adaptation of Perl's Sort::Versions. Anything I'm missing?
Other than the documentation, that is.

class Sort
  class Versions
    def Versions.versioncmp(version_a, version_b)
      vre = /[-.]|\d+|[^-.\d]+/
      ax = version_a.scan(vre)
      bx = version_b.scan(vre)

      while (ax.length>0 && bx.length>0) do
        a = ax.shift
        b = bx.shift

        if( a == b )                 then next
        elsif (a == '-' && b == '-') then next
        elsif (a == '-')             then return -1
        elsif (b == '-')             then return 1
        elsif (a == '.' && b == '.') then next
        elsif (a == '.' )            then return -1
        elsif (b == '.' )            then return 1
        elsif (a =~ /^\d+$/ && b =~ /^\d+$/) then
          if( a =~ /^0/ or b =~ /^0/ ) then
            return a.to_s.upcase <=> b.to_s.upcase
          end
          return a.to_i <=> b.to_i
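        else
          # Compare case-insensitively; keep scanning if the tokens differ only in case.
          cmp = a.upcase <=> b.upcase
          return cmp unless cmp == 0
        end
      end
      # All shared tokens matched, so the version with tokens left over sorts last.
      return ax.length <=> bx.length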
    end

    def Versions.sort_versions(list)
      return list.sort{|a,b| Sort::Versions.versioncmp(a,b)}
    end
  end
end

puts Sort::Versions::sort_versions( %w{ 1.1.6 2.3 1.1a 3.0 1.5 1 2.4 1.1-4
2.3.1 1.2 2.3.0 1.1-3 2.4b 2.4 2.40.2 2.3a.1 3.1 0002 1.1-5 1.1.a 1.06} )
--
<signature>
  <name>Michael C. Libby</name>

  <web-site>http://www.ichimunki.com/</web-site>
  <public-key>http://www.ichimunki.com/public_key.txt</public-key>
</signature>



Sat, 09 Apr 2005 12:11:16 GMT  
 Perl multiple match RE in Ruby?

| Have this in Perl, would prefer it in Ruby.
|

|
| Need to take String and get Array of all successive matches against string.

You're in luck:

$_[0].scan( /([-.]|\d+|[^-.\d]+)/ )

You can also pass String#scan a block to execute on each match.  See
http://www.rubycentral.com/book/ref_c_string.html#String.scan for
details.
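
A small sketch of the three ways String#scan can be used, assuming a made-up version string ("2.40.2-rc1" is only an illustration). One detail that is easy to trip over: when the pattern contains a capture group, scan returns an array of arrays (one inner array per match), so the ungrouped pattern is usually what you want for a flat token list.

```ruby
version = "2.40.2-rc1"   # hypothetical input, not from the thread

# No groups: scan returns the matched substrings themselves.
tokens = version.scan(/[-.]|\d+|[^-.\d]+/)
p tokens    # ["2", ".", "40", ".", "2", "-", "rc", "1"]

# One capture group: each match becomes an array of its groups.
grouped = version.scan(/([-.]|\d+|[^-.\d]+)/)
p grouped.first    # ["2"]

# With a block, each match is yielded instead of being collected.
numbers = []
version.scan(/\d+/) { |digits| numbers << digits.to_i }
p numbers   # [2, 40, 2, 1]
```

Calling grouped.flatten gives back the flat list if the grouped form is needed for some other reason.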

Dan

--
http://www.dfan.org



Sat, 09 Apr 2005 12:06:26 GMT  
 Perl multiple match RE in Ruby?

Quote:


> > How about String#scan ?

> That did the trick! This is why I love Ruby, so easy to get stuff done. The
> Ruby version is SO much more readable than the Perl too.

While I agree, to be fair to Perl, String#scan *in this instance* is
equivalent to String#split, which is Perl's split.

Of course, String#scan *does* do things that Perl's split can't do, but
what you're doing isn't one of them. :)

--
The warly race may riches chase,
    An' riches still may fly them, O;
An' tho' at last they catch them fast,
    Their hearts can ne'er enjoy them, O.



Sat, 09 Apr 2005 17:23:30 GMT  
 Perl multiple match RE in Ruby?

Quote:

> So here's my adaptation of Perl's Sort::Versions. Anything I'm missing?
> Other than the documentation, that is.

Kewl! Three things:

Why define your own sort, when you can just pass the comparison method to
normal sort-method? (I may be missing something here, though.)

Second, modules might fit the bill better than classes.

Third, I'd like to suggest some API changes - to make it less perl, and
more ruby-like:

Quote:
> class Sort
>   class Versions
>     def Versions.versioncmp(version_a, version_b)
>     end
>     def Versions.sort_versions(list)
>     end
>   end
> end

I'd make it:

module Version

  def self.cmp( a, b )
    # Leading :: is needed: a bare Sort would resolve to the nested Version::Sort.
    ::Sort::Versions.versioncmp( a, b )
  end

  def self.sort( list )
    list.sort { |a,b| Version.cmp(a,b) }
  end

  def self.sort!( list )
    list.sort! { |a,b| Version.cmp(a,b) }
  end

  module Cmp
    def version_cmp( b )
      Version.cmp( self, b )
    end
  end

  module Sort
    def version_sort
      Version.sort( self )
    end
    def version_sort!
      Version.sort!( self )
    end
  end

end

Version.cmp( "0.77", "1.3.5" )

class String; include Version::Cmp; end   # include, so String instances get version_cmp
"0.1.3".version_cmp( "0.2.32" )

Version.sort(["0.22", "0.1", "1.4"])
Version.sort!(["0.22", "0.1", "1.4"])

class Array; include Version::Sort; end   # include, so Array instances get version_sort
["0.22", "0.1", "1.4"].version_sort
["0.22", "0.1", "1.4"].version_sort!

This naturally does break the CPAN-equivalence, which may be a nice thing
to have, so maybe make a separate hierarchy: CPAN::Sort::Versions that
refers to the same implementation.
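
Mixing Version::Cmp into String and Version::Sort into Array depends on the difference between include and extend, which is easy to get backwards. A minimal sketch of that difference, using a hypothetical Shout module and Greeting class that are not part of the proposal above:

```ruby
module Shout            # hypothetical mixin, for illustration only
  def shout
    to_s.upcase + "!"
  end
end

class Greeting < String; end   # hypothetical String-like class

# extend adds the methods to the receiver itself, here the class object.
Greeting.extend(Shout)
p Greeting.shout                                   # "GREETING!"
instances_have_it = Greeting.new("hi").respond_to?(:shout)
p instances_have_it                                # false

# include adds the methods to every instance of the class.
Greeting.send(:include, Shout)
p Greeting.new("hi").shout                         # "HI!"
```

So a module meant to give every String a version_cmp method has to be included into String; extend is the right tool only for adding methods to one particular object.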

  -- Nikodemus
------------------------------------------------------------
 I refuse to have a battle of wits with an unarmed person.



Sat, 09 Apr 2005 18:02:13 GMT  
 Perl multiple match RE in Ruby?


Quote:


> > > How about String#scan ?

> > That did the trick! This is why I love Ruby, so easy to get stuff
> > done. The Ruby version is SO much more readable than the Perl too.

> While I agree, to be fair to Perl, String#scan *in this instance* is
> equivalent to String#split, which is Perl's split.

> Of course, String#scan *does* do things that Perl's split can't do, but
> what you're doing isn't one of them. :)

I was all set to refute this. Then I read 'perldoc -f split' and discovered
that in Perl's split() if /EXPR/ contains capturing parentheses it will
return the match as well as splitting on it. It works this way in Ruby,
too. Thanks for pointing this out.
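
A quick sketch of that behaviour in Ruby, with made-up version strings. Note that splitting on the separators alone keeps a token like "4b" in one piece, whereas the scan pattern used earlier in the thread also breaks it into digit and non-digit runs, so the two approaches only agree for some inputs.

```ruby
# Without capturing parentheses the separators are discarded.
plain = "1.1-4".split(/[-.]/)
p plain       # ["1", "1", "4"]

# With capturing parentheses the separators are returned as well.
captured = "1.1-4".split(/([-.])/)
p captured    # ["1", ".", "1", "-", "4"]

# split leaves "4b" whole; scan splits it into "4" and "b".
by_split = "2.4b".split(/([-.])/)
by_scan  = "2.4b".scan(/[-.]|\d+|[^-.\d]+/)
p by_split    # ["2", ".", "4b"]
p by_scan     # ["2", ".", "4", "b"]
```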

 -michael

<signature>
  <name>Michael C. Libby</name>

  <web-site>http://www.ichimunki.com/</web-site>
  <public-key>http://www.ichimunki.com/public_key.txt</public-key>
</signature>



Sat, 09 Apr 2005 19:04:40 GMT  
 Perl multiple match RE in Ruby?


Quote:
> Why define your own sort, when you can just pass the comparison method
> to normal sort-method? (I may be missing something here, though.)

I guess because it makes a convenient wrapper?

Quote:
> Second, modules might fit the bill better than classes.

[snip]

They certainly would the way I had it. Your suggestions give a lot of food
for thought.

My natural tendency is to want the version_cmp method in String and
version_sort in Array... would it make more sense to simply do:

class String
  def version_cmp(b)
    #compare(self,b)
  end
end

class Array
  def version_sort
    self.sort{|a,b| a.to_s.version_cmp(b.to_s)}
  end
end

That would save the step of extending those classes in code.
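
For illustration, a runnable sketch of that idea. The VersionCompare module here is a simplified stand-in written for this example (an assumption, not the versioncmp posted earlier): it splits on "." and "-" and compares items numerically when both are all digits.

```ruby
# Stand-in comparator, hypothetical and deliberately simpler than versioncmp.
module VersionCompare
  def self.compare(a, b)
    ax = a.split(/[-.]/)
    bx = b.split(/[-.]/)
    ax.zip(bx).each do |x, y|
      return 1 if y.nil?    # a has items left after b ran out
      both_numeric = x =~ /\A\d+\z/ && y =~ /\A\d+\z/
      cmp = both_numeric ? x.to_i <=> y.to_i : x <=> y
      return cmp unless cmp == 0
    end
    ax.length <=> bx.length  # equal so far: the shorter version sorts first
  end
end

class String
  def version_cmp(other)
    VersionCompare.compare(self, other.to_s)
  end
end

class Array
  def version_sort
    sort { |a, b| a.to_s.version_cmp(b.to_s) }
  end
end

p "0.1.3".version_cmp("0.2.32")          # -1
p ["0.22", "0.1", "1.4"].version_sort    # ["0.1", "0.22", "1.4"]
```

The trade-off is the usual one for reopening core classes: every String and Array in the program gains the methods, whether or not the caller asked for them.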

 -michael

<signature>
  <name>Michael C. Libby</name>

  <web-site>http://www.ichimunki.com/</web-site>
  <public-key>http://www.ichimunki.com/public_key.txt</public-key>
</signature>



Sat, 09 Apr 2005 19:49:21 GMT  
 Perl multiple match RE in Ruby?

Quote:

> My natural tendency is to want the version_cmp method in String and
> version_sort in Array... would it make more sense to simply do:

<snip/>

Quote:
> That would save the step of extending those classes in code.

How about adding that to the Version module inside a BuiltinExt module, so
that to extend the classes you would just:

require 'version'
include Version::BuiltinExt

But those who do not like to see String and Array extended, or would like
to extend some other String- or Array-like class, could still use:

Version.cmp(a,b)

 or

class MyClass; include Version::Cmp; end

This (extension API design) is one area that would really deserve a "Best
Practices" doc.

  -- Nikodemus



Sat, 09 Apr 2005 20:03:14 GMT  
 Perl multiple match RE in Ruby?

Quote:

> This naturally does break the CPAN-equivalence, which may be a
> nice thing to have, so maybe make a separate hierarchy:
> CPAN::Sort::Versions that refers to the same implementation.

This is something I haven't been overly concerned with in my
reimplementations of Text::Format and MIME::Types. Ultimately, I
think that the Ruby implementation should have a different API
because Ruby allows for some things that Perl doesn't,
expression-wise.

-austin



Sat, 09 Apr 2005 23:36:38 GMT  
 Perl multiple match RE in Ruby?

Quote:

> reimplementations of Text::Format and MIME::Types. Ultimately, I
> think that the Ruby implementation should have a different API

I agree. Ruby API should be idiomatic Ruby, and oftentimes this means that
the entire name / class / module hierarchy should be different.

  -- Nikodemus



Sat, 09 Apr 2005 23:42:01 GMT  
 Perl multiple match RE in Ruby?

Quote:

> > That would save the step of extending those classes in code.

> How about adding that to the Version module inside a BuiltinExt module, so
> that to extend the classes you would just:

I'll throw mine in too; I wrote it for rpkg.  This makes Version a
Comparable object.  (Have a look at the tests for usage examples.)

class Version
  include Comparable

  def Version.[](*args)
    new(*args)
  end

  def initialize(s, separators = '.-')
    separators = separators.split('')

    items_regex_src = separators.collect {|sep| Regexp.escape(sep)}.join("|")
    seps_regex_src  = separators.collect {|sep| Regexp.escape(sep)}.join

    items_regex = Regexp.compile(items_regex_src)
    seps_regex = /[^#{seps_regex_src}]/



  end

  def to_a

  end

  def to_s
    s = ''


    end

    return s
  end

  def inspect

  end

  def [](n)

  end

  def <=>(other)
    raise unless other.is_a? Version

    comp = 0

      if n and other[i]
        comp = n <=> other[i]
      elsif n.nil?
        comp = -1
      elsif other[i].nil?
        comp = +1
      else
        raise "This should never happen!"
      end

      if comp == 0
        next
      else
        break
      end
    end

    return comp    
  end
end

if $0 == __FILE__
  require 'test/unit'

  class Version
    def separators

    end
  end

  class TestVersion < Test::Unit::TestCase
    def test_version_to_array
      v = Version['0.1.0']
      assert_equal [0, 1, 0], v.to_a
    end

    def test_can_access_revision_number
      v = Version['0.1.0']
      assert_equal 0, v[0]
      assert_equal 1, v[1]
      assert_equal 0, v[2]
    end

    def test_non_dot_separators
      v = Version['0.1.0-20021099']

      assert_equal 0, v[2]
      assert_equal 20021099, v[3]
    end

    def test_can_mix_numbers_and_strings
      v = Version['0.1.4-unstable']

      assert_equal 0, v[0]
      assert_equal 1, v[1]
      assert_equal 4, v[2]
      assert_equal 'unstable', v[3]
    end

    def test_can_compare_equal_versions
      v1 = Version['0.1.0']
      v2 = Version['0.1.0']

      assert_equal 0, v1 <=> v2
      assert v1 == v2
      assert_equal v1, v2
    end

    def test_can_compare_different_versions
      v1 = Version['0.1.0']
      v2 = Version['0.1.2']

      assert_equal(-1, v1 <=> v2)
      assert_equal 1, v2 <=> v1
      assert v1 < v2
      assert v2 > v1
    end

    def test_can_compare_different_versions_with_different_number_of_items
      v1 = Version['0.1.0']
      v2 = Version['0.1.0.2']

      assert v2 > v1

      v1 = Version['0.1.0']
      v2 = Version['0.1.0-20021010']

      assert v2 > v1

      v1 = Version['0.1.0']
      v2 = Version['0.1.0-unstable']

      assert v2 > v1
    end

    def test_can_reconstruct_version_string
      s = '0.1.0-unstable-20021010'
      v = Version[s]

      assert_equal s, v.to_s
    end

    def test_finds_separators
      s = '0.1.0-unstable-20021010'
      v = Version[s]

      assert_equal ['.', '.', '-', '-'], v.separators
    end

    def test_unique_version_item_accepted
      v = Version['0123_done']

      assert_equal ["0123_done"], v.to_a      
    end

    def test_allow_alphabetic_only_versions
      v = Version['cvs']

      assert_equal ["cvs"], v.to_a
    end
  end
end
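
Several method bodies in the class listing above are missing, so the following is a minimal sketch of a Comparable Version class that satisfies the posted tests. It is a reconstruction under assumptions, not the code written for rpkg: the instance variable names, the choice to compare mixed Integer/String items as strings, and the way to_s rebuilds the original text are all guesses made for this example.

```ruby
# Hypothetical reconstruction; names and details are assumptions.
class Version
  include Comparable

  attr_reader :items, :separators

  def self.[](string, separators = '.-')
    new(string, separators)
  end

  def initialize(string, separators = '.-')
    sep = /[#{Regexp.escape(separators)}]/
    @raw = string.split(sep)          # item text as written, kept for to_s
    @separators = string.scan(sep)    # separators in order of appearance
    # Purely numeric items become Integers; anything else stays a String.
    @items = @raw.map { |item| item =~ /\A\d+\z/ ? item.to_i : item }
  end

  def to_a
    @items.dup
  end

  def [](n)
    @items[n]
  end

  def to_s
    # Interleave items and separators; the final nil pad is dropped.
    @raw.zip(@separators).flatten.compact.join
  end

  def inspect
    "#<Version #{self}>"
  end

  def <=>(other)
    return nil unless other.is_a?(Version)
    [items.length, other.items.length].max.times do |i|
      a, b = items[i], other.items[i]
      return -1 if a.nil?   # self ran out of items first
      return 1 if b.nil?    # other ran out of items first
      # Assumption: mixed Integer/String items are compared as strings.
      comp = a.class == b.class ? a <=> b : a.to_s <=> b.to_s
      return comp unless comp == 0
    end
    0
  end
end

p Version['0.1.0'] < Version['0.1.2']             # true
p Version['0.1.0-unstable-20021010'].separators   # [".", ".", "-", "-"]
p [Version['1.10'], Version['1.2']].min           # #<Version 1.2>
```

Including Comparable means that defining <=> is enough to get <, >, == and between?, and that arrays of Version objects work with sort, min and max without a block.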



Sun, 10 Apr 2005 17:13:45 GMT  
 