How to make Ruby code a little bit faster
- 17 February 2017
Hi. For the last couple of years people have been saying that Ruby is slow, and some have switched to 'faster' languages. I agree that Ruby is a relatively slow language, but its creators have done a huge amount of work to make it faster. Recent versions of Ruby run much faster than older ones.
Besides that, Ruby is a really flexible language focused on programmer happiness. Controlling memory usage is not something many developers associate with happiness, which is why Ruby developers often don't think enough about code optimization.
Let's look at why our code might run slowly. I would like to show you this example:
require 'benchmark'
rows = 100000
cols = 10
data = Array.new(rows) { Array.new(cols) { 'a' * 1000 } }
time = Benchmark.realtime do
  csv = data.map { |row| row.join(",") }.join("\n")
end
puts time.round(2)
We created an array of 100,000 elements. Each element is an array of 10 strings, and each string is 1000 'a' characters. The structure of the data is not important here. All we need to know is that we are dealing with a huge amount of data (and therefore memory) and that we want to transform it into CSV.
Let's check how long it takes to generate that csv in Ruby 2.3:
# => 2.27, 2.2
Ruby 2.3 processed the data in about 2.2 seconds. Let's run the same code on an older version this time, Ruby 1.9.3:
# => 6.9, 6.75
A huge difference. In this example Ruby 2.3 is roughly three times faster than Ruby 1.9.3.
Now let's try to understand why this code is so slow. Garbage collection (GC) in Ruby is known to be expensive.
We can easily demonstrate that by disabling the GC:
require 'benchmark'
rows = 100000
cols = 10
data = Array.new(rows) { Array.new(cols) { 'a' * 1000 } }
GC.disable # turn the garbage collector off
time = Benchmark.realtime do
  csv = data.map { |row| row.join(",") }.join("\n")
end
puts time.round(2)
# => Ruby 2.3: 1.33, 1.41
# => Ruby 1.9.3: 1.28, 1.36
With the GC disabled there is practically no difference between the two Ruby versions, and the code runs faster in both.
The creators of Ruby are constantly improving the GC, and as we can see they are doing a great job: the GC in Ruby 2.3 is much faster than the one in Ruby 1.9.3.
We are not the creators of Ruby, but we can make the GC's life easier by using less memory. If we use less memory, the GC has less work to do and our code runs faster.
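Disabling the GC is one way to see its cost. Another is to ask Ruby how often it collected and how long that took. The sketch below uses `GC.count` and `GC::Profiler` from Ruby's core; the data set is deliberately smaller than the one in the article (an assumption made so the example runs quickly), so the absolute numbers will differ from the ones above.

```ruby
require 'benchmark'

# Smaller data set than in the article, chosen only to keep the sketch quick.
rows = 5_000
cols = 10
data = Array.new(rows) { Array.new(cols) { 'a' * 1000 } }

GC::Profiler.enable       # start recording GC runs
runs_before = GC.count    # number of GC runs so far in this process

time = Benchmark.realtime do
  data.map { |row| row.join(",") }.join("\n")
end

gc_runs = GC.count - runs_before
gc_time = GC::Profiler.total_time # seconds spent in GC while the profiler was enabled
GC::Profiler.disable

puts "total: #{time.round(3)} s, GC runs: #{gc_runs}, GC time: #{gc_time.round(3)} s"
```

If the GC time is a large share of the total, reducing allocations is likely to pay off.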
Is it possible to check how much memory we use?
This article does not cover all profiling techniques, but I want to mention one really simple way to get the amount of RAM that your process uses.
require 'benchmark'
def display_memory_usage
  puts "%d Mb" % (`ps -o rss= -p #{Process.pid}`.to_i/1024)
end
rows = 100000
cols = 10
data = Array.new(rows) { Array.new(cols) { 'a' * 1000 } }
display_memory_usage # => 1060 Mb
GC.disable
time = Benchmark.realtime do
  csv = data.map { |row| row.join(",") }.join("\n")
end
display_memory_usage # => 2997 Mb
As we can see, the data structure takes a little more than 1 GB of memory. But after the transformation to CSV our process consumes ~3 GB! That is too much, because 2 GB should be enough: 1 GB for the initial structure and 1 GB for the CSV.
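The "1 GB plus 1 GB" estimate can be checked with a bit of arithmetic. The sketch below only counts the bytes of string content; the variable names are mine, and the gap to the measured ~1060 Mb is presumably per-object and allocator overhead plus the interpreter itself, which is an assumption rather than something measured here.

```ruby
rows = 100_000
cols = 10
string_bytes = 1000

payload = rows * cols * string_bytes         # bytes of string content in data
separators = rows * (cols - 1) + (rows - 1)  # commas plus newlines in the CSV
mib = 1024.0 * 1024

puts "string data: %.0f MiB" % (payload / mib)                # => string data: 954 MiB
puts "csv:         %.0f MiB" % ((payload + separators) / mib) # => csv:         955 MiB
```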
The problem is caused by map, which keeps its temporary result (an array with one joined string per row) in memory. Let's rewrite the code:
require 'benchmark'
def display_memory_usage
  puts "%d Mb" % (`ps -o rss= -p #{Process.pid}`.to_i/1024)
end
rows = 100000
cols = 10
data = Array.new(rows) { Array.new(cols) { 'a' * 1000 } }
display_memory_usage # => 1061 Mb
GC.disable
time = Benchmark.realtime do
  csv = ''
  rows.times do |i|
    cols.times do |j|
      csv << data[i][j]
      csv << "," if j != cols - 1
    end
    csv << "\n" if i != rows - 1
  end
end
display_memory_usage # => 2072 Mb
The code looks worse, but let's check the result! We use 1 GB less memory, so the GC has less work to do. The benefit of this approach shows in the execution time:
# => 1.35, 1.18, 1.13
The execution time is almost the same as the one we had with the GC disabled.
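If the index arithmetic bothers you, the same idea (append to one string instead of building an intermediate array) can be written with `each_with_index`. This is only a sketch of an alternative spelling, not a measured improvement over the loop above; it uses a small data set and checks that the output matches the `map`/`join` version.

```ruby
# Small data set, used only to show that both versions produce the same CSV.
data = Array.new(100) { Array.new(10) { 'a' * 1000 } }

csv = ''.dup # unfrozen empty string to append to
data.each_with_index do |row, i|
  csv << "\n" unless i.zero?   # newline between rows, not after the last one
  row.each_with_index do |cell, j|
    csv << "," unless j.zero?  # comma between cells, not after the last one
    csv << cell
  end
end

expected = data.map { |row| row.join(",") }.join("\n")
puts csv == expected # => true
```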
If part of your application can be rewritten this way, use this approach to track its memory usage.
You can extract a slow piece of the application, analyze how much memory it consumes, and rewrite it to be more efficient.
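One way to do that is to wrap the extracted piece in a small helper that reports both the elapsed time and the growth in resident memory. `rss_mb` and `measure` are hypothetical names introduced for this sketch; the memory reading relies on the same `ps` call as above, so it only works on Unix-like systems, and the growth figure is approximate because the GC may run while the block executes.

```ruby
require 'benchmark'

# Resident set size of the current process in Mb, as reported by ps.
def rss_mb
  `ps -o rss= -p #{Process.pid}`.to_i / 1024
rescue Errno::ENOENT
  0 # ps is not available on this system
end

# Hypothetical helper: runs the block, prints elapsed time and RSS growth,
# and returns the block's result.
def measure(label)
  before = rss_mb
  result = nil
  time = Benchmark.realtime { result = yield }
  puts "%s: %.2f s, %+d Mb" % [label, time, rss_mb - before]
  result
end

# Small data set, used only to demonstrate the helper.
data = Array.new(1_000) { Array.new(10) { 'a' * 1000 } }
csv = measure('map + join') { data.map { |row| row.join(",") }.join("\n") }
```

Running the helper around the old and the new version of a piece of code gives a quick before/after comparison.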
I was inspired to write this post by the book Ruby Performance Optimization by Alex Dymo. It describes many interesting techniques that help you find the bottlenecks in your app and improve its performance.