Ruby/String/scan

From Wiki.crossplatform.ru

Accept dashes and apostrophes as parts of words.

class String
  def word_count
    frequencies = Hash.new(0)
    downcase.scan(/[-'\w]+/) { |word| frequencies[word] += 1 }
    return frequencies
  end
end
%{"this is a test."}.word_count
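To see what the character class changes in practice, here is a minimal sketch that runs the same kind of count on a sentence containing a hyphenated word and a contraction. The helper name `count_words_with_punctuation` and the sample sentence are made up for illustration; it is a plain method rather than a `String` extension so it does not clash with the `word_count` definitions on this page.

```ruby
# Hypothetical helper (not from the original page): counts words,
# treating "-" and "'" as word characters alongside \w.
def count_words_with_punctuation(text)
  frequencies = Hash.new(0)
  text.downcase.scan(/[-'\w]+/) { |word| frequencies[word] += 1 }
  frequencies
end

count_words_with_punctuation("Don't stop; the well-known don't stop.")
# => {"don't"=>2, "stop"=>2, "the"=>1, "well-known"=>1}
```

With plain `/\w+/` the same sentence would split "don't" into "don" and "t", and "well-known" into two separate words.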



Anything that's not whitespace is a word.

class String
  def word_count
    frequencies = Hash.new(0)
    downcase.scan(/\S+/) { |word| frequencies[word] += 1 }
    return frequencies
  end
end
%{"this is a test."}.word_count
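A side effect of treating every run of non-whitespace as a word is that punctuation stays attached. The sketch below shows this on the same test string; the helper name `count_non_whitespace_runs` is hypothetical and only used here.

```ruby
# Hypothetical helper: counts every run of non-whitespace characters (\S+).
def count_non_whitespace_runs(text)
  frequencies = Hash.new(0)
  text.downcase.scan(/\S+/) { |word| frequencies[word] += 1 }
  frequencies
end

# %{...} keeps the double quotes as part of the string.
count_non_whitespace_runs(%{"this is a test."})
# => {"\"this"=>1, "is"=>1, "a"=>1, "test.\""=>1}
```

The opening quote ends up glued to "this", and the period and closing quote to "test".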



A pretty good heuristic for matching English words.

class String
  def word_count
    frequencies = Hash.new(0)
    downcase.scan(/(\w+([-'.]\w+)*)/) { |word, ignore| frequencies[word] += 1 }
    return frequencies
  end
end
%{"this is a test."}.word_count
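The two block parameters matter here because the pattern contains capture groups: when a regular expression has groups, `scan` yields one array of group values per match instead of the matched string. The following sketch illustrates this, plus a variant with a non-capturing group. Both method names and the sample text are assumptions made for this example.

```ruby
# Hypothetical helper: the grouped pattern, returned without a block.
# Each element is [outer group, last value of the inner group].
def english_words_grouped(text)
  text.scan(/(\w+([-'.]\w+)*)/)
end

# Hypothetical helper: same pattern with a non-capturing group (?:...),
# so scan returns plain strings.
def english_words(text)
  text.scan(/\w+(?:[-'.]\w+)*/)
end

text = "the u.s. mother-in-law"

english_words_grouped(text)
# => [["the", nil], ["u.s", ".s"], ["mother-in-law", "-law"]]

english_words(text)
# => ["the", "u.s", "mother-in-law"]
```

The trailing period of "u.s." is dropped because the pattern only accepts a period when a word character follows it.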



Count words for a string with quotation marks

class String
  def word_count
    frequencies = Hash.new(0)
    downcase.scan(/\w+/) { |word| frequencies[word] += 1 }
    return frequencies
  end
end
 
%{"I have no shame," I said.}.word_count
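Because `/\w+/` never matches quotation marks or commas, the punctuation in this string is simply skipped. As a rough alternative sketch, the same count can be written with `Enumerable#tally`, which assumes Ruby 2.7 or newer; the helper name `tally_words` is hypothetical.

```ruby
# Hypothetical helper: same counting idea, using scan without a block
# and Enumerable#tally (assumes Ruby 2.7+).
def tally_words(text)
  text.downcase.scan(/\w+/).tally
end

tally_words(%{"I have no shame," I said.})
# => {"i"=>2, "have"=>1, "no"=>1, "shame"=>1, "said"=>1}
```

Unlike `Hash.new(0)`, the hash returned by `tally` has no default value, so looking up a missing word returns `nil` rather than 0.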



Extract numbers from a string

"The car costs $1000 and the cat costs $10".scan(/\d+/) do |x|
  puts x
end



Just like /\w+/, but doesn't consider underscore part of a word.

class String
  def word_count
    frequencies = Hash.new(0)
    downcase.scan(/[0-9A-Za-z]+/) { |word| frequencies[word] += 1 }
    return frequencies
  end
end
%{"this is a test."}.word_count
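The difference only shows up on input that actually contains an underscore. A small comparison sketch, with hypothetical method names and a made-up sample string:

```ruby
# Hypothetical helpers for a side-by-side comparison.
def words_with_underscore(text)
  text.scan(/\w+/)            # \w covers letters, digits and "_"
end

def words_without_underscore(text)
  text.scan(/[0-9A-Za-z]+/)   # letters and digits only
end

sample = "snake_case and x2"

words_with_underscore(sample)
# => ["snake_case", "and", "x2"]

words_without_underscore(sample)
# => ["snake", "case", "and", "x2"]
```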



Scan a here document

#!/usr/bin/env ruby
sonnet = <<129
this is a test
this is another test
129
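# ^ and $ match at the start and end of each line of the here document.
result = sonnet.scan(/^this/)
result.concat(sonnet.scan(/test$/)) # concat keeps the array flat; << would nest it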
puts result
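In Ruby, `^` and `$` anchor to the start and end of every line inside a multi-line string, which is what makes them useful on a here document. The sketch below assumes the same two-line text; the helper name `line_edge_matches` and the `SAMPLE` delimiter are chosen for this example only.

```ruby
# Hypothetical helper: matches at line starts, followed by matches at line ends.
def line_edge_matches(text)
  text.scan(/^this/) + text.scan(/test$/)
end

sample = <<SAMPLE
this is a test
this is another test
SAMPLE

line_edge_matches(sample)
# => ["this", "this", "test", "test"]
```

Each pattern matches once per line, so a two-line text yields two matches for each anchor.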



Scan as split

text = "this is a test."
puts "Scan method: #{text.scan(/\w+/).length}"
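Both `scan` and `split` produce four items for this sentence, but they do not return the same strings: `scan` collects what the pattern matches, while `split` cuts at the separators and keeps everything in between. A short sketch with hypothetical helper names:

```ruby
# Hypothetical helpers contrasting the two approaches.
def words_by_scan(text)
  text.scan(/\w+/)   # keeps only runs of word characters
end

def words_by_split(text)
  text.split         # splits on whitespace, punctuation stays attached
end

text = "this is a test."

words_by_scan(text)
# => ["this", "is", "a", "test"]

words_by_split(text)
# => ["this", "is", "a", "test."]
```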



Scan for \w+

#!/usr/bin/env ruby
hamlet = "The slings and arrows of outrageous fortune"
hamlet.scan(/\w+/) # => [ "The", "slings", "and", "arrows", "of", "outrageous", "fortune" ]



Scan() string with hex value

french = "\xc3\xa7a va"
french.scan(/./) { |c| puts c }



Scan through all the vowels in a string: [aeiou] means "match any of a, e, i, o, or u."

"This is a test".scan(/[aeiou]/) { |x| puts x }



scan(/./u) string with hex value

french = "\xc3\xa7a va"
french.scan(/./u) { |c| puts c }
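The two bytes `\xc3\xa7` are the UTF-8 encoding of "ç", so whether `/./` yields five characters or six bytes depends on the encoding the string is treated as. In Ruby 1.8 the `/u` flag selected UTF-8 matching; in Ruby 1.9 and later strings carry their own encoding. The sketch below makes the encoding explicit so the result does not depend on the environment; the helper name `characters_and_bytes` is hypothetical.

```ruby
# Hypothetical helper: scans the same data once as UTF-8 text and once
# as raw bytes, and returns both match counts.
def characters_and_bytes(text)
  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  by_character = utf8.scan(/./).length    # "ç" counts as one character
  by_byte = utf8.b.scan(/./).length       # String#b gives a binary copy
  [by_character, by_byte]
end

characters_and_bytes("\xc3\xa7a va")
# => [5, 6]
```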



Specify ranges of characters inside the square brackets

# This scan matches all lowercase letters between a and m.
"This is a test".scan(/[a-m]/) { |x| puts x }

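Called without a block, `scan` returns the matches as an array, which makes it easy to compare an explicit character list with a range. A minimal sketch on the same sentence; the helper name `letters_in_class` is made up for this example.

```ruby
# Hypothetical helper: returns every single character matched by the
# given character-class pattern.
def letters_in_class(text, pattern)
  text.scan(pattern)
end

text = "This is a test"

letters_in_class(text, /[aeiou]/)
# => ["i", "i", "a", "e"]

letters_in_class(text, /[a-m]/)
# => ["h", "i", "i", "a", "e"]
```

The capital "T" is not matched by `[a-m]` or `[aeiou]`, since both classes list lowercase letters only.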


Splitting Sentences into Words

class String
  def words
    scan(/\w[\w'\-]*/)
  end
end

"This is a test of words' capabilities".words

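The pattern requires a word character first and then allows apostrophes and hyphens, so punctuation can appear inside or at the end of a word but never at its start. A sketch of that behaviour, written as a plain method with the hypothetical name `split_into_words` and made-up sample sentences:

```ruby
# Hypothetical helper: same pattern as the String#words idea, as a plain method.
def split_into_words(text)
  text.scan(/\w[\w'\-]*/)
end

split_into_words("This is a test of words' capabilities")
# => ["This", "is", "a", "test", "of", "words'", "capabilities"]

split_into_words("my mother-in-law's hat")
# => ["my", "mother-in-law's", "hat"]
```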


Use \d to match any digit; the + that follows \d makes it match as many digits in a row as possible.

"The car costs $1000 and the cat costs $10".scan(/\d+/) do |x|
  puts x
end
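The effect of the `+` quantifier is easiest to see by running both patterns on the same sentence. The helper names below are hypothetical and only serve the comparison.

```ruby
# Hypothetical helpers: one match per digit versus one match per run of digits.
def single_digits(text)
  text.scan(/\d/)
end

def digit_runs(text)
  text.scan(/\d+/)
end

prices = "The car costs $1000 and the cat costs $10"

single_digits(prices)
# => ["1", "0", "0", "0", "1", "0"]

digit_runs(prices)
# => ["1000", "10"]

digit_runs(prices).map(&:to_i)
# => [1000, 10]
```

`scan` always returns strings, so the final `map(&:to_i)` is needed when the numbers are to be used in arithmetic.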