Ruby plus equals (+=) versus append/concatenation shovel (<<)
Alec Jacobson
September 08, 2010
I was stunned to see how slow a recent Ruby program of mine was. All it was doing was concatenating a bunch of string literals in a big loop. Originally I was using plus equals:
str = ""
1000.times do |i|
  str += "foo bar"
end
On a whim I tried switching to building an array and then joining it:
str = ""
str_array = []
1000.times do |i|
  str_array << "foo bar"
end
str = str_array.join
Already this was way faster. I wrote up a little benchmarking program to see just how badly "+=" performs compared to "<<". It compares three approaches: string "+=", the array setup I have above, and using "<<" directly on the string:
str = ""
1000.times do |i|
  str << "foo bar"
end
Here's my little test program.
#!/usr/bin/ruby
power = 20

# String += : builds a new string object on every iteration
power.times do |p|
  n = 2**p
  str = ""
  start_time = Time.now
  n.times do |i|
    str += "x"
  end
  duration = Time.now - start_time
  #puts "#{n} string appends took: #{duration}s"
  puts "#{n} #{duration}"
end

# String << : appends to the existing string in place
power.times do |p|
  n = 2**p
  str3 = ""
  start_time = Time.now
  n.times do |i|
    str3 << "x"
  end
  duration = Time.now - start_time
  puts "#{n} #{duration}"
end

# Array << followed by a single join
power.times do |p|
  n = 2**p
  str2 = ""
  start_time = Time.now
  str_array = []
  n.times do |i|
    str_array << "x"
  end
  str2 = str_array.join
  duration = Time.now - start_time
  puts "#{n} #{duration}"
end
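The same comparison can also be sketched with Ruby's standard Benchmark module, which takes care of the timing bookkeeping. This is only a rough sketch: the three helper method names are made up for illustration, and the size n = 2**14 is an arbitrary choice rather than one of the sizes plotted in the results.

```ruby
require "benchmark"

# Hypothetical helpers: each builds a string of n "x" characters
# using one of the three strategies compared in this post.
def build_with_plus_equals(n)
  str = String.new("")      # String.new always returns a mutable string
  n.times { str += "x" }    # rebinds str to a brand new string each time
  str
end

def build_with_shovel(n)
  str = String.new("")
  n.times { str << "x" }    # appends to the same string in place
  str
end

def build_with_join(n)
  parts = []
  n.times { parts << "x" }
  parts.join                # one concatenation at the end
end

n = 2**14  # arbitrary size chosen for this sketch
%i[build_with_plus_equals build_with_shovel build_with_join].each do |name|
  seconds = Benchmark.realtime { send(name, n) }
  puts "#{name} #{n} #{seconds}"
end
```

All three helpers return the same string, so any difference in the printed times comes from how the string is built.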
And here are the results:
String += is asymptotically worse than <<. Reading through the Ruby documentation on strings, it's clear this is because:
str1 += str2
is syntactic sugar for something like
str1 = str1 + str2
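where the "+" creates a new string object and copies both operands into it, and the "=" then points str1 at that new object. Doing this on every iteration means copying the whole string so far each time, hence the big computational cost.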
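One way to see this for yourself is to compare object_id before and after each operation. This is a minimal sketch; the two method names are made up for illustration.

```ruby
# Hypothetical helper: does the variable still refer to the same
# object after a += ?
def same_object_after_plus_equals?
  s = String.new("foo")
  before = s.object_id
  s += "bar"               # same as s = s + "bar"
  s.object_id == before
end

# Hypothetical helper: does the variable still refer to the same
# object after a << ?
def same_object_after_shovel?
  s = String.new("foo")
  before = s.object_id
  s << "bar"               # mutates the existing string
  s.object_id == before
end

puts same_object_after_plus_equals?  # false
puts same_object_after_shovel?       # true
```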
But why?! I can't think of any reason why "+=" shouldn't be syntactic sugar for "<<". Can you?
Update:
I get it!
Here are two short snippets that illustrate the difference:
a = "x"
b = a
b += "y"
a
Which results in "x"
a = "x"
b = a
b << "y"
a
Which results in "xy"
It's subjective whether x+=y should mean "append y to x" or always be syntactic sugar for "x = x + y". My vote is for the latter, which means I must be content that in Ruby these operators do different things and thus have different speeds.
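If you want the speed of "<<" in a loop but don't want other variables that point at the same string to change, one possible approach is to copy the string once with dup and then append to the copy. This is just a sketch of that idea; repeat_append is a made-up helper name, not something from Ruby itself.

```ruby
# Hypothetical helper: appends piece to a copy of base, count times,
# leaving the original base string untouched.
def repeat_append(base, piece, count)
  result = base.dup                 # one copy up front
  count.times { result << piece }   # in-place appends on the copy only
  result
end

a = "x"
b = repeat_append(a, "y", 3)
puts a  # x
puts b  # xyyy
```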