IronRuby - portrait of a dynamic language

This article was originally published in VSJ, which is now part of Developer Fusion.
IronRuby is an open source version of the Ruby programming language tailored to the .NET environment, and it joins IronPython (see VSJ June 2009) as yet another dynamic language for .NET. In many ways it’s still early days in the project, but given the amount of publicity that Ruby is attracting it is worth looking at the language and the implementation now to see why it is so revolutionary. In this article I will explain why Ruby is different from static languages such as C++ and how dynamic languages differ in general from what we have learned to accept as the standard way of working.

The latest version of IronRuby at the time of writing is the CTP for .NET 4.0 Beta 1, which is important simply because .NET 4.0 has explicit support for dynamic languages. To get started you will first need to install a version of the .NET 4.0 Beta or Visual Studio 2010. I would recommend installing Visual Studio 2010 just to see how it works, but as IronRuby still isn’t integrated into Visual Studio there is no real need to use it. If you want an IDE to use with IronRuby there are a few available but the obvious choice is Ruby in Steel.

You can find both the .NET 4.0 beta and Visual Studio 2010 by searching the Microsoft website. Following the installation of .NET 4.0 you only have to download a small zip file containing both IronPython support files and IronRuby. You can also find earlier versions and documentation at the same site. As with many open source projects, you won’t find much in the way of help getting started. If you’re an IronRuby veteran then you will take exception to my comments because it’s all so obvious. This isn’t a good way to get new and enthusiastic users – it is vitally important for the future of any software project that you spell out the obvious and provide absolutely trivial examples. Yes, you’ve guessed it, the only supplied example is an extremely complex one that demonstrates how IronRuby can interwork with C#. This is an important selling point in the IronRuby campaign but it is going to leave the complete beginner out in the cold and completely unable to appreciate the finer points.

To make use of the zip you have to unzip the files and no guidance is given on where to unzip them – however c:\ruby is a common choice even for IronRuby. Once you have the files unzipped you need to open a console and change directory to the unzip location \bin to access the interpreter. Obviously a better idea is to set the \bin directory as part of the command line path but let’s stay simple for the moment.

To make sure everything is working use NotePad and create a plain text file containing:

puts 'Hello IronRuby World'
Save the file under the name Hello.rb in the bin directory and run the program using:
ir Hello.rb
You should see the hello message appear in the console, as puts (i.e. put string) writes to the standard output stream. As long as this works you have IronRuby installed and working. The ir.exe program is the IronRuby interpreter. There is also iirb which provides an interactive environment for the interpreter.

What is Ruby?

The big problem with coming to terms with any new programming language is trying to find out its essential character. It is all too common for the expert in the language to point out the minor syntactic decorations that make something very specific into something very easy. Such “gadgets” are always handy but they generally don’t give you the big picture of the idea or philosophy that has brought yet another programming language into the world.

Ruby can be described as a cross between Smalltalk and Perl, which if you know either language is a strange prospect. Smalltalk is a pioneering object oriented language that popularised many of the object oriented approaches we take for granted today, but usually in a watered down form compared to the original. Perl is a scripting language with enough bolted-on features to make it possible for an expert to write very condensed scripts which even another expert can find difficult to decode. Mixing these two languages together clearly has the potential to create something that is a mess, but the creator of Ruby, Yukihiro Matsumoto, decided that the Principle of Least Surprise was to be the guiding light. That is, the language should as far as possible work as you would expect it to. Of course this is very subjective – what is surprising to a C# programmer will be expected by a Perl programmer.

In practice all you can really say is that Ruby tries to be transparent about what it means – whether it actually succeeds in this aim is something that will be argued about well into the language’s maturity. As a neutral observer my own opinion is that Ruby is more like JavaScript with object orientation “done right”… or perhaps “done better”.

Objects

Ruby is object oriented and the first lesson is that in Ruby everything is an object, even those things that would be primitives or otherwise made exceptions for reasons of efficiency. For example, even a numeric literal has methods:
1.size
…returns the number of bytes needed to store the value and:
'Hello Ruby'.length
…returns the length of the string.
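As a small illustrative sketch of the same idea (the values used are arbitrary), the following shows that literals, nil and even classes themselves respond to methods. The class reported for an integer depends on the Ruby version, so it is printed rather than assumed:

```ruby
# Every value is an object, so literals respond to methods directly.
int_class   = 42.class             # the name differs between Ruby versions
nil_is_obj  = nil.is_a?(Object)    # even nil is an object
str_upper   = 'Hello Ruby'.upcase  # a string literal has methods too
class_class = String.class         # a class is itself an instance of Class

puts int_class
puts nil_is_obj
puts str_upper
puts class_class
```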

You can, of course, define your own objects and Ruby implements a fairly class-based approach to object creation – but, as is the tendency in dynamic languages, a Ruby class is just as much an object as any other object. That is, a class is just an object, i.e. an instance of Class, with some special methods that will make further instances and methods that work with instance data. For example, to create a point class you simply write:

class Point
end
This creates a constant called Point – any identifier beginning with a capital letter is treated as a constant – and assigns to it a new instance of Class. This has no useful methods except for new, which creates a new instance of the new class, i.e. a new Point. You can already write:
p=Point.new
…but all this really gives you is a new instance of the class object. The next step is to customise the class object to add some instance methods and variables. The role of a constructor is played by the initialize method:
class Point
	def initialize(x,y)
		@x,@y=x,y
	end
end
p=Point.new(1,2)
When you use new it automatically calls initialize and passes on any parameters you specify. Instance variables are denoted by the @ prefix and new copies are automatically created by the new method for each instance as it is created. Also notice the use of parallel assignment, which is a Ruby feature.

You might think that now you can refer to the instance variables, but no. Ruby objects can only have methods and not properties. The solution to this lack of properties is of course accessor methods. Ruby methods don’t have to be called using brackets so it’s entirely possible, and even desirable, to confuse a method with a variable. All you need for a “get” accessor is to define suitable methods called x and y as in:

def x
	@x
end
def y
	@y
end
To understand how these methods work you also need to know that as well as being an object nearly everything in Ruby is an expression and expressions always return a value. This is a functional programming aspect of Ruby and there are lots more – so much so that it’s fair to consider Ruby object oriented with functional programming bolted on.

In most cases the value of something considered as an expression is the value of the last statement it contains. Hence just using @x and @y as the last statements is equivalent to using return @x and return @y – which is also valid Ruby. With these definitions you can now write:

puts p.x
puts p.y
…and fully appreciate the fact that while p.x and p.y look like properties they are in fact method invocations. Once you have this idea the “set” accessor is written in a very similar way but using the assignment symbol in the method name!
def x=(v)
	@x=v
end
def y=(v)
	@y=v
end
As you don’t need to use brackets in method calls you can now apparently assign to a property in the usual way:
p.x=3
p.y=4
In the introduction I said that Ruby was designed not to be surprising – personally I find this deliberate confusion between methods and variables surprising and pleasing. You can define methods that redefine operators in much the same way, including the [] array or hash access operator.
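As a minimal sketch of operator redefinition, the Point class can be given a + method and a [] method. The choice of index 0 for x and 1 for y is an assumption made purely for illustration:

```ruby
class Point
  def initialize(x, y)
    @x, @y = x, y
  end

  # "get" accessors
  def x
    @x
  end

  def y
    @y
  end

  # + is just a method whose name happens to be "+"
  def +(other)
    Point.new(@x + other.x, @y + other.y)
  end

  # [] lets a Point be indexed like an array: 0 gives x, 1 gives y
  def [](index)
    case index
    when 0 then @x
    when 1 then @y
    else raise IndexError, "index #{index} out of range for Point"
    end
  end
end

a = Point.new(1, 2)
b = Point.new(3, 4)
sum = a + b
puts sum.x    # 4
puts sum[1]   # 6
```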

Class variables – static

The @ is used to define instance variables but the new class is also an object and it can have variables and methods of its own. These play the role of what would be called, in other languages, static variables or methods. Any instance variables you create within the class, i.e. outside a method, “belong” to the class instance. For example:
class Point
	@maxpoints=20
end
…defines a class instance variable which isn’t accessible from instance methods and isn’t created anew when an instance is created, i.e. the variable belongs to the Point object. To access it you have to write class accessor methods. A class method is prefixed by “self” and this always resolves to the instance which in this case is the class Point. (You can use Point in place of self which makes it even more clear that you are adding a method to a specific instance.)

For example, a get accessor is:

def self.maxpoints
	@maxpoints
end
…and you can now write:
puts Point.maxpoints
…and an instance method can reach the variable in exactly the same way, i.e. by calling Point.maxpoints. You can also create a complete static class by defining nothing but class variables and methods. Once again this makes it even clearer that the class object is just a standard instance object which just happens to have a new method to create additional instances.

Notice that all Ruby objects are dynamic and can have methods added at anytime. For example:

def Point.maxpoints=(v)
	@maxpoints=v
end
…adds a new method to the Point object even though it isn’t inside a class…end block. In fact all a class…end block does is to set self to the class being defined.

If you know JavaScript this should sound familiar and very similar to prototype inheritance. JavaScript doesn’t support classes, only objects, and in a sense Ruby only provides objects with a neat method of creating instances from a prototype object, e.g. Point is a prototype object used to create instances via its new method. This is a very common and natural approach to object orientation in any dynamic, i.e. interpreted, language.

The disadvantage of class instance variables is that they can’t be accessed directly from instance methods. To get around this problem Ruby adds class variables. These are defined using @@ and they are static in the sense that they belong to the class and behave just like class instance variables, but they can be accessed from an instance method. This is confusing and it leads to problems when we come to consider inheritance.
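The difference between the two kinds of variable can be sketched as follows. The names count and limit are hypothetical and introduced only for this example:

```ruby
class Point
  @maxpoints = 20      # class instance variable: belongs to the Point object itself
  @@count = 0          # class variable: visible from class and instance methods

  def self.maxpoints
    @maxpoints
  end

  def self.count
    @@count
  end

  def initialize(x, y)
    @x, @y = x, y
    @@count += 1       # an instance method can reach @@count directly
  end

  # Inside an instance method, @maxpoints would refer to an (unset) instance
  # variable of this particular point, so go through the class accessor.
  def limit
    Point.maxpoints
  end
end

Point.new(1, 2)
pt = Point.new(3, 4)
puts Point.count   # 2
puts pt.limit      # 20
```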

Inheritance and type

Now we come to the really interesting question: how does Ruby implement inheritance given it is fully object based and doesn’t have true classes? The answer is that Ruby basically implements a copy based inheritance. When you define a new class which inherits from Point say:
class Point3D < Point
end
…then it is as if you had written the entire definition of Point within the definition of Point3D. What this means is that Point3D inherits all of the instance methods and hence all of the instance variables (which are created dynamically when the instance methods run). Class variables are also copy inherited but these remain associated with the super class, i.e. they are shared between the super class and the new child class – which is not what you might expect.

Class instance variables behave in a more reasonable way in that copy inheritance has no effect on them at all, i.e. they still belong to the original class object, and they are effectively not inherited. You can override any method, including private methods, simply by redefining it. As Ruby is a dynamic language late binding is the only option, and which method is actually used depends on its latest definition. There is also a super statement that you can use to call the method belonging to the super class.
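These rules can be seen together in a short sketch. The created counter and the to_s methods are assumptions added for illustration and are not part of the earlier Point examples:

```ruby
class Point
  @@created = 0          # class variable, shared with subclasses
  @maxpoints = 20        # class instance variable, belongs to Point only

  def self.created
    @@created
  end

  def self.maxpoints
    @maxpoints
  end

  def initialize(x, y)
    @x, @y = x, y
    @@created += 1
  end

  def to_s
    "(#{@x}, #{@y})"
  end
end

class Point3D < Point
  def initialize(x, y, z)
    super(x, y)          # run Point's initialize first
    @z = z
  end

  # Override to_s, reusing the super class version via super
  def to_s
    "#{super} z=#{@z}"
  end
end

Point.new(1, 2)
q = Point3D.new(1, 2, 3)
puts q.to_s                      # (1, 2) z=3
puts Point.created               # 2
puts Point3D.created             # 2, the same @@created is shared
puts Point3D.maxpoints.inspect   # nil, @maxpoints was set on Point only
```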

You may be wondering what type has to do with any of this? The simple answer is that type hardly ever crops up in a Ruby program. Everything is an object and objects simply differ according to what methods they have. On this level there is very little concept of type at all. Basically all that matters in constructing a program is whether the object has the method you are trying to use (recall properties are implemented as methods). It is up to you to test for the existence of methods with the correct call signature explicitly and there are Ruby statements that let you do this. However in most cases you know what the object can do simply because you wrote it, modified it or looked it up in a manual. This is what dynamic programming is all about and you either think it’s great or the biggest threat to code reliability since the goto.
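One way to test for a method before using it is respond_to?, and the method's arity gives its parameter count. The Duck and Robot classes and the try_speak helper below are hypothetical, used only to sketch the idea:

```ruby
class Duck
  def speak
    'Quack'
  end
end

class Robot
  def beep
    'Beep'
  end
end

# Hypothetical helper: call speak only if the object provides it.
def try_speak(obj)
  if obj.respond_to?(:speak)
    obj.speak
  else
    "#{obj.class} cannot speak"
  end
end

puts try_speak(Duck.new)               # Quack
puts try_speak(Robot.new)              # Robot cannot speak
puts Duck.new.method(:speak).arity     # 0, speak takes no parameters
```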

Also notice that the lack of type means that functions are not distinguished by their signatures, i.e. parameter types. This immediately makes the whole idea of overloading, i.e. providing multiple definitions of the same function according to signature, redundant. There is no overloading but you don’t need it. Any function can be written to deal with any signature by simply testing that each parameter is appropriate for the job. Similarly there is no need to invent the concept of generics as every method is generic simply because there is no strong typing to worry about!
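A single method standing in for several overloads might look like the following sketch. The area method and the shapes it accepts are invented for the example:

```ruby
# Hypothetical area method: accepts one number (side of a square),
# two numbers (rectangle) or an array [width, height].
def area(a, b = nil)
  if a.is_a?(Array)
    a[0] * a[1]
  elsif b.nil?
    a * a
  else
    a * b
  end
end

puts area(3)        # 9
puts area(3, 4)     # 12
puts area([2, 5])   # 10
```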

What all this adds up to is a slightly messy implementation of dynamic objects which makes it difficult to correctly create chains of inheritance that are robust and amenable to re-use.

Ruby as a functional language

Although Ruby is often presented at first as an object oriented language it has a lot of features that make it a good and practical functional language. Without some idea of how these features fit together it can seem like a collection of odds and ends. Let’s start with the most puzzling – the block.

Any method can be passed a block of code that it can optionally execute at any point it cares to. This might seem like a strange idea but it is very useful and is a direct consequence of Ruby being a dynamic language. For a dynamic language the distinction between code and data is much less clear than in a static language. Because code is interpreted it can be modified as the program runs, treated one moment as data and the next as code.

To see a block in action try the following simple program:

class MyClass
	def myMethod
		puts 'Start myMethod'
		puts 'End myMethod'
	end
end
The block is added to the end of the method call after all of the standard parameters if there are any:
myObject=MyClass.new
myObject.myMethod {puts 'Myblock'}
If you run the program you will see the Start and End messages but no sign of the block. To make use of a block the method has to call it using the yield command. This is just like a method call and when the block is complete control passes back to the statement following the yield. If you change the method definition to:
def myMethod
	puts 'Start myMethod'
	yield
	puts 'End myMethod'
end
…and re-run the program you will see Start myMethod, Myblock and End myMethod.

Internal and external iterators

You can run the block of code as many times as you like by repeating the yield statement and this indeed is the most common usage pattern as blocks and yield are mostly used to implement iterators. For example the following method is a trivial “repeat twice” iterator:
def myMethod
	yield
	yield
end
If you run the program you will see that the block is called twice. In a more extensive example you would want to pass values to the block and this needs a block parameter:
myObject.myMethod {|x| puts x}
You specify parameters using vertical bars but apart from this everything works like a standard parameterised method. To pass values to parameters you simply include them following the yield, again just like a method call:
def myMethod
	yield 1
	yield 2
end
Now the program displays 1 followed by 2.
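The same mechanism scales up to a real iterator. As a sketch, the hypothetical Countdown class below yields each value in turn to whatever block the caller supplies:

```ruby
class Countdown
  def initialize(start)
    @start = start
  end

  # Internal iterator: yields start, start-1, ... 1 to the caller's block
  def each
    n = @start
    while n > 0
      yield n
      n -= 1
    end
  end
end

seen = []
Countdown.new(3).each { |x| seen << x }
puts seen.inspect   # [3, 2, 1]
```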

Iterators are used by most Ruby objects to allow the user to scan through any collections they might hold. For example, the array object has an “each” iterator which will run a block on each of its elements:

[1,2,3,4].each {|x| puts x}
The fact that everything is an object makes iterators almost the standard way of implementing loops in Ruby even though it does have a full complement of control structures such as for, while, until etc. However, notice that iterators, or more exactly internal iterators of the sort described above, have a serious shortcoming – you can only easily iterate through one object’s collection at a time. Now consider how you would compare two collections for equality or add two arrays element by element? Not easy.

To do this sort of task you need an external iterator which allows the client code, e.g. the role played by the block, to control the iteration. External iteration is what you are most familiar with because it is typified by the standard for loop. Basically an external iterator supplies you with a “next” method which you can call to get the next element. This makes it possible to step through multiple collections by starting separate iterations on each object and moving through each iteration using next.

Currently external iterators are only available in Ruby 1.9 and so not yet in IronRuby, but how they work is interesting. If you call an iterator without a block then it returns an enumerator object which can be used to step through the collection using its next method. So for example to step through the elements of an array you would use:

it1=[1,2,3,4].each
loop do
	puts it1.next
end
Currently in IronRuby you simply have to use a for loop to perform parallel iteration on multiple collections.
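A for loop over a shared index is enough to handle the two tasks mentioned earlier, adding two arrays element by element and comparing them. This is only a sketch and the array contents are arbitrary:

```ruby
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Element-by-element sum, using one index to walk both arrays together
sums = []
for i in 0...a.length
  sums << a[i] + b[i]
end
puts sums.inspect    # [11, 22, 33, 44]

# Element-by-element comparison
same = (a.length == b.length)
for i in 0...a.length
  same = false if a[i] != b[i]
end
puts same            # false
```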

As well as iterators blocks are used to provide many of the familiar functional approaches to coding. For example:

puts [1,2,3,4].map {|x| x*x}
The map method takes the block and applies it to each of the elements of the collection. In this case the result is 1,4,9,16 as each integer is squared. Notice that this is subtly different from:
puts [1,2,3,4].each {|x| x*x}
…which displays 1,2,3,4 rather than the result of the block, because each returns the original array.
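Other block-based methods follow the same pattern. As a brief sketch, select filters a collection and inject folds it into a single value, shown here next to map and each for comparison:

```ruby
numbers = [1, 2, 3, 4]

squares = numbers.map    { |x| x * x }            # transform each element
evens   = numbers.select { |x| x % 2 == 0 }       # keep elements where the block is true
total   = numbers.inject(0) { |sum, x| sum + x }  # fold the elements into one value
same    = numbers.each   { |x| x * x }            # each ignores the block's result

puts squares.inspect   # [1, 4, 9, 16]
puts evens.inspect     # [2, 4]
puts total             # 10
puts same.inspect      # [1, 2, 3, 4]
```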

Passing blocks

The use of blocks with methods is a syntactic cover that hides the fact that blocks can be passed like data. There are both Proc and Lambda objects which can be used to wrap blocks as objects. The difference between the two types of “wrapping” is subtle but basically comes down to the behaviour of the return statement. In a block, or a Proc object that wraps it, a return acts as if it were written directly in the method where the block was created. That is, return doesn’t just terminate the block: it terminates that method, and with it any method that was in the middle of running the block via yield or call. In the case of a Lambda the return simply terminates the Lambda’s own code and returns control to whatever called it.

To see this in action try:

class MyClass
	def myMethod p
		p.call 1
		p.call 2
	end
end
Now the method accepts a parameter p which it assumes to be a Proc object wrapping a block of code. To execute this code you simply use the Proc object’s call method. To create the Proc object all we have to do is:
myObject=MyClass.new
p1=Proc.new {|x| puts x}
myObject.myMethod p1
…where the new method accepts the block of code to be wrapped by the Proc object. There are shorter ways of writing this (using the & operator for example) but this form reveals what is actually going on.
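The & form mentioned above can be sketched like this. The parameter name blk and the results array are arbitrary choices for the example:

```ruby
class MyClass
  # &blk converts the block attached to the call into a Proc object
  def myMethod(&blk)
    blk.call 1
    blk.call 2
  end
end

myObject = MyClass.new
results = []
myObject.myMethod { |x| results << x }     # ordinary block, received as a Proc

p1 = Proc.new { |x| results << x * 10 }
myObject.myMethod(&p1)                     # & also turns a Proc back into a block
puts results.inspect   # [1, 2, 10, 20]
```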

To see the surprising behaviour when a return is used within a block/Proc we need another method to call the method that calls the Proc:

class MyClass
	def myMethod1 p
		p.call 1
		puts 'End myMethod1'
	end
	def myMethod2
		p1=Proc.new {|x| puts x}
		myMethod1 p1
		puts 'End myMethod2'
	end
end
Notice that the only real difference here is that myMethod2 calls myMethod1 in the same way as it was called in the main program. If you add:
myObject=MyClass.new
myObject.myMethod2
…and run the program you will see both methods ending with suitable messages.

Now add a return to the Proc:

p1=Proc.new {|x| puts x;return}
Now when you run the program you will see 1 printed but neither of the methods gets to print its ending message. The return in the block acts as a return from myMethod2, where the Proc was created, so control leaves the block, abandons myMethod1 and goes straight back to the main program. A single return seems to unwind three procedure calls! Note that this works in the same way even if you use blocks and yield. However if you change the Proc object to a Lambda object then the return just terminates the block code and the two methods get to print their final messages as you would expect:
p1=lambda {|x| puts x;return}
Notice that lambda is a method of the Kernel module that creates a Lambda object (strictly a Proc object that behaves as a lambda).
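The two behaviours can be compared side by side in a stripped-down sketch. The method names run_proc and run_lambda are hypothetical:

```ruby
def run_proc
  p1 = Proc.new { return 'returned from the Proc' }
  p1.call
  'reached the end of run_proc'      # never executed
end

def run_lambda
  l1 = lambda { return 'returned from the lambda' }
  l1.call
  'reached the end of run_lambda'    # executed: return only left the lambda
end

puts run_proc     # returned from the Proc
puts run_lambda   # reached the end of run_lambda
```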

Of course given that Ruby incorporates elements of functional programming, it has closures – that is, when you create a Proc or a Lambda it incorporates the current bindings. What this means is that everything that is in scope when the object is created remains in scope for the entire lifetime of the object, even if it has gone out of scope in the normal execution of the program.

For example, if you define a method that returns a Lambda object then any local method variables that the object uses are available whenever you call the wrapped code. To see this in action first define a suitable method:

class MyClass
	def myMethod
		n=7
		return lambda {puts n}
	end
end
Notice that the method returns a lambda object that makes use of n which is only in scope, i.e. exists, while the method is active. Even so you can write:
myObject=MyClass.new
p=myObject.myMethod
p.call
…and you will see 7 displayed indicating that the variable is still available to the lambda object.
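A counter is a common way to show that the captured variable is live rather than a copy. The make_counter method below is a hypothetical example, not something taken from the IronRuby distribution:

```ruby
def make_counter
  count = 0               # local variable of make_counter
  lambda { count += 1 }   # the lambda keeps count alive after the method returns
end

c1 = make_counter
c2 = make_counter
c1.call
c1.call
puts c1.call   # 3
puts c2.call   # 1, each call to make_counter creates a fresh count
```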

Once again unless you are familiar with the idea of closure this is surprising behaviour. Ruby actually takes this a stage further and provides a binding object which records all of the relevant bindings to the object it is created in. You can capture a set of bindings, i.e. the current state, and execute code in the context of the stored bindings. For example:

class MyClass
	def myMethod
		@n=7
		@b=binding
		@n=8
		return @b
	end
end
Notice that the binding object is created when @n is 7. If you now try:
myObject=MyClass.new
b1=myObject.myMethod
eval("puts @n")
eval("puts @n",b1)
…the first eval reveals that @n is nil because it runs at the top level of the program, where self is the main object rather than myObject, i.e. in the normal way of things that @n doesn’t exist. The second eval displays @n as not only in existence but with a value of 8, i.e. the last value that was assigned to the variable. Notice that bindings aren’t snapshots of the values in a variable, they really are the set of variables used by an object. To see this more clearly try:
class MyClass
	def myMethod x
		@b=binding
		@n=x
		return @b
	end
end
myObject=MyClass.new
b1=myObject.myMethod 8
b2=myObject.myMethod 9
eval("puts @n")
eval("puts @n",b1)
eval("puts @n",b2)
The result is nil, 9, 9 as both binding objects refer to the last state of myObject.
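The same behaviour holds for local variables. In this sketch, which uses a hypothetical make_binding method, the binding exposes the latest value of n and even allows it to be changed after the method has returned:

```ruby
def make_binding
  n = 7
  b = binding      # captures the local variables, not a snapshot of their values
  n = 8
  b
end

b1 = make_binding
puts eval('n', b1)      # 8, the latest value, not the 7 it held when binding was called
eval('n = 9', b1)       # code run in the binding can change the variable
puts eval('n', b1)      # 9
```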

Reflection and metaprogramming

You might have noticed the use of the eval method in the last few examples. Ruby supports code examination, modification and creation at run time. The documentation refers to this as reflection and metaprogramming and indeed if you keep it under control it can be this well organised but the fact of the matter is that this is a slippery slope to self-modifying code. This is simply a result of Ruby being an interpreted language and as such weak in distinguishing between code and data.

The problem is that while it seems to have protective mechanisms in reality it provides you with more than enough rope to tie a knot.
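To give a flavour of what is possible, the following sketch creates accessor methods at run time, calls a method chosen by name and turns a string into a new method. The sum method is invented for the example:

```ruby
class Point
  def initialize(x, y)
    @x, @y = x, y
  end

  # Create the x and y accessors at run time instead of writing them out
  [:x, :y].each do |name|
    define_method(name) { instance_variable_get("@#{name}") }
  end
end

pt = Point.new(1, 2)
puts pt.x                  # 1
puts pt.send(:y)           # 2, the method is chosen by name at run time
puts pt.instance_variables.inspect

# Code held as data: a string is turned into a new method
Point.class_eval("def sum; @x + @y; end")
puts pt.sum                # 3
```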

What more?

We could continue in this way for some time but the two themes of Ruby are:
  • everything is an object
and
  • code can be data.
Notable things that you should look up are the strange backward syntax typified by:
x=y if defined? x
…and its implementation of threading and multitasking. IronRuby also adds to the mix interworking with .NET. You can simply load assemblies using “require” and then use the objects as if they were Ruby objects but with slight naming changes. You can also host Ruby script within a .NET program and provide it as a scripting language for the end user.
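The backward syntax puts the condition after the statement it controls. A short sketch with arbitrary values shows the if, unless and while forms:

```ruby
x = 5
y = 10

x = y if defined? x              # runs, because x already exists
puts x                           # 10

puts 'x is large' unless x < 10  # unless is the negated form of if
x -= 3 while x > 0               # repeat the statement while the condition holds
puts x                           # -2
```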

Overall the verdict on Ruby has to be that it’s a bit of a mess – a quite likable workable mess but it still lacks any coherent design philosophy and it certainly doesn’t live up to the principle of least surprise by any reasonable interpretation.


Dr. Mike James’ programming career has spanned many languages. Editor of VSJ and the author of Foundations of Programming (ISBN 1871962048), he has always been interested in the latest developments and the synergy between different languages.
