Using grep in Ruby
Most users of POSIX-based systems have heard of, or regularly use, the grep command. For those who don’t know, grep is a way of searching input files for lines that match one or more patterns. It’s a very useful and powerful tool, especially when you pair it with some advanced regular expressions.
One of the things I didn’t know until yesterday, and thought would be interesting to share, is that you can also use grep on any Ruby class that includes the Enumerable module.
The output of ri Enumerable#grep is as follows:

= Enumerable#grep
(from ruby core)
--------------------------------------------------------------------
  enum.grep(pattern)                  -> array
  enum.grep(pattern) { |obj| block }  -> array
--------------------------------------------------------------------
Returns an array of every element in enum for which Pattern ===
element. If the optional block is supplied, each matching element is
passed to it, and the block's result is stored in the output array.

  (1..100).grep 38..44   #=> [38, 39, 40, 41, 42, 43, 44]
  c = IO.constants
  c.grep(/SEEK/)         #=> [:SEEK_SET, :SEEK_CUR, :SEEK_END]
  res = c.grep(/SEEK/) { |v| IO.const_get(v) }
  res                    #=> [0, 1, 2]

This means that we can do some seriously powerful and specific searches for elements within any Enumerable using regular expressions or, as the range example shows, any other object that implements ===.
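To make the `Pattern === element` rule from the documentation concrete, here is a minimal sketch with made-up sample data. It compares grep with an equivalent select and shows the block form and a range pattern; the variable names are purely illustrative.

```ruby
# Sample data is made up for illustration.
words = %w[apple banana cherry avocado]

# grep keeps every element for which `pattern === element` is true...
by_grep = words.grep(/\Aa/)

# ...so it returns the same elements as this select.
by_select = words.select { |word| /\Aa/ === word }

# With a block, each match is transformed before it is collected.
upcased = words.grep(/\Aa/) { |word| word.upcase }

# Range#=== tests membership, so a range works as a pattern too.
in_range = [1, 5, 10, 15].grep(3..10)

puts by_grep.inspect, upcased.inspect, in_range.inspect
```

Because everything hinges on ===, the pattern decides what "matching" means, not grep itself.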
One of the things that isn’t mentioned explicitly in the ri documentation is that you can also grep on module and class types:
irb(main):001:0> array_of_diff_types = [{a: 1}, [1, 2, 3], 25, 'bob']
=> [{:a=>1}, [1, 2, 3], 25, "bob"]
irb(main):002:0> array_of_diff_types.grep(Hash)
=> [{:a=>1}]
irb(main):003:0> array_of_diff_types.grep(Array)
=> [[1, 2, 3]]
irb(main):006:0> array_of_diff_types.grep(Integer)
=> [25]
irb(main):008:0> array_of_diff_types.grep(String)
=> ["bob"]
irb(main):009:0> array_of_diff_types.grep(Symbol)
=> []
Whenever you grep for a specific class, it returns an array of the elements within the Enumerable that are instances of that class, or an empty array if there are no matches.
It also works across the class hierarchy. Every element here descends from Object, so they all match:
irb(main):011:0> array_of_diff_types.grep(Object)
=> [{:a=>1}, [1, 2, 3], 25, "bob"]
Grepping against modules is also supported and works how you would imagine, matching any element whose class includes the given module:
irb(main):012:0> array_of_diff_types.grep(Enumerable)
=> [{:a=>1}, [1, 2, 3]]
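The same behaviour applies to modules you define yourself, since Module#=== checks whether an object's class includes the module. The sketch below uses hypothetical names (Drivable, Car, Bike, Rock) that are not from the original post, and also shows the block form applied to the matches.

```ruby
# Drivable, Car, Bike and Rock are hypothetical names for this sketch.
module Drivable; end

class Car
  include Drivable
end

class Bike
  include Drivable
end

class Rock; end

things = [Car.new, Rock.new, Bike.new, 42]

# Module#=== is true when the element's class includes the module.
drivable = things.grep(Drivable)

# The block form maps each match, here to its class name.
names = things.grep(Drivable) { |thing| thing.class.name }

puts names.inspect
```

This can be a tidy way to pull out "everything that behaves like X" from a mixed collection without spelling out is_a? checks.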
If you would like to be more specific and find a certain element in an Enumerable, grep allows you to do it:
irb(main):017:0> array_of_diff_types.grep({a: 1})
=> [{:a=>1}]
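This works because Hash does not define its own ===, so it falls back to the one inherited from Object, which behaves like ==. A rough sketch with made-up data follows; note that Integer#=== compares numerically, so a Float with the same value matches as well.

```ruby
# Sample data is made up for illustration.
mixed = [{ a: 1 }, { a: 2 }, 'bob', :bob, 25, 25.0]

# Hash inherits Object#===, which is effectively ==.
same_hash = mixed.grep({ a: 1 })

# String#=== is equality too; the symbol :bob is not equal to 'bob'.
same_string = mixed.grep('bob')

# Integer#=== compares numerically, so 25.0 matches 25.
same_number = mixed.grep(25)

puts same_string.inspect, same_number.inspect
```

For plain equality like this, select with == or Array#include? may read more clearly; grep earns its keep when the pattern is a regex, range, class, or module.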
Interestingly, grep is only a smidge faster than select in the benchmark below, which I ran using ruby 2.4.1p111.
require 'rubygems'
require 'benchmark'

test_array = %w(hamburger cheeseburger goatburger veggieburger lettuce tomato ketchup superburger)

Benchmark.bm do |x|
  x.report { 1000000.times { test_array.grep(/burger/) } }
  x.report { 1000000.times { test_array.select { |item| /burger/ === item } } }
end

      user     system      total        real
  3.080000   0.010000   3.090000 (  3.122951)
  3.520000   0.030000   3.550000 (  3.612067)

Keep in mind that, in the real world, the speed difference most likely won’t matter. Just use the one that is easier to read for you and your team.
Grepping, whether on a POSIX system’s command line or in Ruby, is a great and powerful tool. This power is supercharged with the use of regular expressions and, in Ruby, grepping against the class or module types of an enumerable’s elements can prove to be a great way to make potentially complex searches easier to read.
Thanks to Pericles Theodorou for the enlightenment.
Edited on 30th of June to update the benchmark to use a more accurate comparison, thanks to @sgrif.