Unintuitive Ruby feature… or is it?
The fact that Ruby variables hold references to objects was one of the first things I read about when learning the differences between PHP and Ruby. I had never actually run into problems with it, though, so when it caught me by surprise recently I thought I’d share my thoughts.
Take this code sample for example:
» a1 = %w{ 1 2 3 }
» a2 = a1
» a1 = %w{ 4 5 6 }
» puts a1.inspect
["4", "5", "6"]
» puts a2.inspect
["1", "2", "3"]
As you can see, a1 and a2 appear to be separate, which looks like what you’d expect if a2 = a1 had made a copy of a1 and assigned it to a2. But then take this code for example:
» a1 = %w{ 1 2 3 }
» a2 = a1
» a1.delete('1')
» puts a1.inspect
["2", "3"]
» puts a2.inspect
["2", "3"]
As you can see, a2 was affected by the change to a1. Why? Well, it’s those references I mentioned at the beginning of the post. Assignment (=) never copies anything: it just makes the variable on the left reference whatever object the right-hand side evaluates to. In the first example, %w{ 4 5 6 } created a brand new array and a1 was pointed at it, leaving a2 on the old one. When you call a method on a1 (like delete), though, it doesn’t delete from the variable but from the object the variable references, which a2 is also referencing at that point.
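To make the difference between rebinding a variable and mutating an object a bit more concrete, here is a small sketch. The names same_before and same_after are mine, purely for illustration. It uses equal?, which compares object identity rather than contents, to check whether the two variables point at the same array:

```ruby
a1 = %w[1 2 3]
a2 = a1                       # a2 now references the same Array as a1

same_before = a1.equal?(a2)   # equal? is true only for the very same object
a1.delete('1')                # mutates the shared Array in place

a1 = %w[4 5 6]                # rebinds a1 to a brand new Array; a2 is untouched
same_after = a1.equal?(a2)

puts same_before              # true
puts a2.inspect               # ["2", "3"]
puts same_after               # false
```

Comparing a1.object_id with a2.object_id at each step should tell the same story.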
Now the fix for this is quite simple. Instead of just
a2 = a1
you replace it with
a2 = a1.dup
which works the same way as before, except that it points a2 at a new object (a duplicate of a1) rather than at the existing one. Note that dup makes a shallow copy: the new array is separate, but the elements inside it are still shared. This is pretty common stuff (anyone who has used Ruby for a while has come across it and knows how it works). But is it really intuitive?
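Here is a quick sketch of what dup does and does not protect you from. The nested and copy variables are made up for this example. The first half repeats the delete case with dup in place, and the second half shows that objects inside the array are still shared:

```ruby
a1 = %w[1 2 3]
a2 = a1.dup             # a2 references a new Array holding the same elements

a1.delete('1')          # only the Array behind a1 changes
puts a1.inspect         # ["2", "3"]
puts a2.inspect         # ["1", "2", "3"]

# dup is shallow: nested objects are still shared between the two copies
nested = [%w[1 2], %w[3 4]]
copy   = nested.dup
nested.first << '9'     # mutates the inner Array that both outer Arrays hold
puts copy.inspect       # [["1", "2", "9"], ["3", "4"]]
```

If you need the nested objects copied as well, you have to duplicate them yourself, since dup alone will not do it.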
Wouldn’t it make more sense to create duplicates by default and add a new a2 = a1.ref type of syntax for the cases where you want a reference? After all, if someone wanted to use a reference, they might as well just use the original variable (I cannot think of any logical reason to have two variables pointing to the same thing).
Or perhaps, if references were to stay the default, the implementation could change so that any write actions happen on a copy of the object, while read actions happen on the original?
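Ruby doesn’t do this automatically, but you can get something loosely similar by hand with freeze. This is only a rough sketch of the idea, not real copy-on-write: a frozen object raises on in-place writes, so the only way to change it is to dup it first. The names write_blocked and a3 are mine.

```ruby
a1 = %w[1 2 3].freeze   # in-place writes on this Array will now raise
a2 = a1                 # same object, but it can no longer change under either name

write_blocked = false
begin
  a1.delete('1')        # attempts an in-place write on the frozen Array
rescue FrozenError      # raised when a frozen object is modified (Ruby 2.5+)
  write_blocked = true
end

a3 = a1.dup             # dup returns an unfrozen copy
a3.delete('1')          # the write lands on the copy only

puts write_blocked      # true
puts a2.inspect         # ["1", "2", "3"]
puts a3.inspect         # ["2", "3"]
```

Reads through a1 or a2 still go to the original, and every write has to go through an explicit copy, which is roughly the behaviour described above, just not automatic.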
Any thoughts?