Ruby: pass by value or pass by reference?
- 01 September 2017
- Ruby
When developers switch to a new language, one of the first questions they try to answer is whether it passes arguments by value or by reference. Ruby has an interesting answer to that question, so let's find out how it works.
First of all, let's make sure we understand what the two terms mean.
Pass arguments by value
This example is pseudo code that should show the main idea:
function changeValue(int val) {
val = val + 1;
return val;
}
int foo = 14;
changeValue(foo); // returns 15
foo; // still 14
The variable foo has an initial value of 14. We pass foo to the changeValue function, which is supposed to increment it by one. As the results show, changeValue returns 15, but foo remains unchanged.
Every variable has its own address in memory. Let's simplify the picture of memory and visualize it as a row of numbered squares, where the value of foo lives in square #2.
Languages that pass arguments by value copy the value of the passed variable to a new address. When we called changeValue(foo), the value of foo was copied to a new address, say slot #7.
So when the function runs val = val + 1, it changes the value in slot #7 of memory. The value that lives in slot #2 (foo) remains the same. Keep in mind that I simplified the way memory works just to show the concept :)
Pass arguments by reference
When a language passes arguments by reference, it passes the memory address of the variable (a pointer to its memory location) to the function. Here is an example in Go:
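package main

import "fmt"

func main() {
	foo := 10
	changeValue(&foo) // pass the address of foo
	fmt.Println(foo)  // 20
}

// changeValue writes 20 to the address it receives.
func changeValue(val *int) {
	*val = 20
}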
As we can see, we initialized foo with the value 10, then passed the address of that value (&foo) to the changeValue function. The function changed the value that lives at that address to 20.
Ruby
x = 10
def change_value(val)
val = 20
end
puts x # 10
change_value(x)
puts x # 10
The value of x is still 10, even after the call to change_value. It seems like Ruby passes arguments by value, right? Well, consider this example:
x = '10'
def change_value(val)
val << '20'
end
puts x # 10
change_value(x)
puts x # 1020
The value of x was changed by the change_value method!
OK, to figure out what's going on, we need to understand how assignment works and how Ruby passes objects to methods.
Assignment and object_id
In Ruby, each object has a unique object id. To get that id, we can use the object_id method:
nil.object_id # => 8
"foo".object_id # => 70174082708580
Here is an example from the description of the object_id method:
Object.new.object_id == Object.new.object_id # => false
(21 * 2).object_id == (21 * 2).object_id # => true
"hello".object_id == "hello".object_id # => false
"hi".freeze.object_id == "hi".freeze.object_id # => true
Let's try to play a little bit more with this concept to understand how it works:
10.object_id # 21
a = 10 # object_id: 21
a = 20 # object_id: 41
b = a # object_id of b: 41
OK, as we can see from the example, the same integer value always has the same object_id.
We can picture it like this: variables are just labels that hold a reference to the actual object.
Here is what happens when we assign a new value to the variable a:
a = 20 # object_id: 41
b = a # object_id for b: 41
a = 10 # object_id: 21
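The "labels" idea can also be checked without reading object ids by hand. The sketch below uses Object#equal?, which compares identity rather than value; the helper name same_object? is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper: reports whether two references point at the same object.
def same_object?(left, right)
  left.equal?(right) # identity comparison, not value comparison
end

a = "ruby".dup
b = a      # b is a second label for the same String
c = a.dup  # a different String with equal contents

puts same_object?(a, b) # true
puts same_object?(a, c) # false
puts a == c             # true

a = "rails" # rebinds the label a only; b still refers to the first String
puts b      # ruby
```

Assignment never copies the object here. It only makes a name point somewhere else.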
That explains why change_value doesn't change the initial value:
x = 10
puts x.object_id # => 21
def change_value(val)
puts val.object_id # => 21. Still referencing the same value
val = 20
puts val.object_id # => 41. Referencing another value, not related to `x`
end
change_value(x)
There is one important note in the documentation for object_id:
Immediate values are not passed by reference but are passed by value: nil, true, false, Fixnums, Symbols, and some Floats.
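A small sketch of what that note looks like in practice, assuming a hypothetical helper same_object_each_time? that evaluates a block twice and compares the two results by identity:

```ruby
# Hypothetical helper: true when evaluating the block twice yields the same object.
def same_object_each_time?
  yield.equal?(yield)
end

puts same_object_each_time? { 42 }         # true
puts same_object_each_time? { :name }      # true
puts same_object_each_time? { nil }        # true
puts same_object_each_time? { Object.new } # false

# Immediate values are also frozen, so there is nothing to mutate in place.
puts 42.frozen?    # true
puts :name.frozen? # true
```

Because such values cannot be mutated, a method has no way to change them for the caller. It can only point its own local variable at a different value.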
OK, let's repeat the object_id experiment with a non-immediate value, such as a hash:
h = {}
puts h.object_id # => 70190696114000
def add_pair(hash)
puts hash.object_id # => 70190696114000
hash[:foo] = 'bar'
puts hash.object_id # => 70190696114000
end
add_pair(h)
puts h.inspect # {:foo=>"bar"}
Interestingly, even if you assign it to a new variable inside a method, the new variable still refers to the same object:
arr = [1,2]
def add_element(arr)
puts arr.object_id # 70101999166940
my_array = arr
puts my_array.object_id # 70101999166940
my_array << 3
end
add_element(arr)
puts arr.inspect # [1, 2, 3]
To me, this behavior feels like "pass by reference".
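The analogy has a limit, though, and a side-by-side sketch makes it visible. The method names rebind and mutate are hypothetical: the first reassigns its parameter, the second changes the object the parameter refers to.

```ruby
# Reassigning the parameter points the local name at a new Array.
def rebind(list)
  list = [9, 9, 9]
  list
end

# Mutating the parameter changes the Array that the caller also sees.
def mutate(list)
  list << 3
end

numbers = [1, 2]

rebind(numbers)
puts numbers.inspect # [1, 2]

mutate(numbers)
puts numbers.inspect # [1, 2, 3]
```

The caller's variable is never rebound by the method. Only changes made to the shared object itself are visible outside.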
Mutating and Non-Mutating methods
One more thing worth mentioning: when we call a method, we should know whether it mutates the original object or returns a copy of the object with the changed state.
Let's see an example:
def compact_array(arr)
arr.compact
end
arr = [1,nil,2]
compact_array(arr)
puts arr.inspect # [1, nil, 2]
In this case, compact_array uses Array#compact, which doesn't mutate the original array. The documentation says:
Returns a copy of self with all nil elements removed.
So it returns a copy of self, and the original array remains the same.
If we switch to the compact! method, it will change the original array:
def compact_array!(arr)
arr.compact!
end
arr = [1,nil,2]
compact_array!(arr)
puts arr.inspect # [1, 2]
It's good to know that in Ruby, a method that mutates self often has ! at the end of its name when a non-mutating counterpart exists: upcase!, capitalize!, compact!, etc. This is a convention rather than a rule, since mutating methods such as << and concat have no !.
The documentation usually states whether a method mutates self or returns a copy.
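The same contrast can be sketched with strings. The helpers shout and shout! are hypothetical; each returns the method's result together with the argument as it looks after the call.

```ruby
# Hypothetical helper: non-mutating version, returns [result, argument_after_call].
def shout(text)
  [text.upcase, text]
end

# Hypothetical helper: mutating version, returns [result, argument_after_call].
def shout!(text)
  [text.upcase!, text]
end

word = "ruby".dup # dup guarantees a mutable String

p shout(word)  # ["RUBY", "ruby"]
p shout!(word) # ["RUBY", "RUBY"]
p shout!(word) # [nil, "RUBY"]
```

The last line shows a detail that is easy to miss: upcase! returns nil when it has nothing to change, so the return value of a ! method is not always the object itself.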
Another example:
def change_str(str)
puts str.object_id # => 70232513267740
str = str + 'bar'
puts str.object_id # => 70232513267320
end
s = 'foo'
change_str(s)
puts s # => foo
change_str didn't append 'bar' to the caller's 'foo'. Why? Because the documentation for String#+ says:
Returns a new String containing other_str concatenated to str.
That's why object_id has a different value after the concatenation: str = str + 'bar' points the local variable at a new string.
If we check the documentation for << (and the closely related concat), we will see:
Concatenates the given object to str
Let's try:
def change_str(str)
puts str.object_id # => 70151978493860
str.concat('bar')
puts str.object_id # => 70151978493860
end
s = 'foo'
change_str(s)
puts s # => foobar
It worked this time because concat mutates self.
If you want to avoid mutations, you can pass a copy of the object to the method using dup, or freeze the object so that any attempt to mutate it raises an error.
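Both options can be sketched in a few lines. The method append_bar is hypothetical and stands in for any method that mutates its argument.

```ruby
# Hypothetical method that mutates its argument.
def append_bar(str)
  str << "bar"
end

original = "foo".dup

append_bar(original.dup) # the method mutates the copy, not the original
puts original            # foo

original.freeze
begin
  append_bar(original)   # mutating a frozen object raises an error
rescue StandardError => error
  puts error.class       # FrozenError on Ruby 2.5 and later
end
puts original            # foo
```

Keep in mind that dup makes a shallow copy: the elements of a duplicated array or hash are still shared with the original.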
I know that this topic has been discussed many times, and from what I've seen there is no clear answer to this question. This was my attempt to explain how it works, so if you have ideas for improvements, let me know.
Also, I can recommend reading these 3 (1, 2, 3) articles. I've got a lot of inspiration from those.
Thanks for reading.