Quick numerical benchmark of Ruby 1.9

November 28, 2007

I was looking for something to run to try to put the current Ruby 1.9 through its paces. The first code I found was this

total = 0.0

1.0.step(2000.0, 0.0001) do |x|
  result = (5.4*x**5 - 3.211*x**4 + 100.3*x**2 - 100 +
    20*Math.sin(x) - Math.log(x)) * 20*Math.exp(-x/100.3)
  total += result / 0.0001
end

puts total

It’s heavy on floating point, since that’s what I mostly use; ‘normal’ use is more likely integer math. It’s a nonsensical calculation that I should rework as a real integration problem…
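As a rough idea of what a “real integration problem” version might look like, here is a minimal midpoint-rule sketch. The method name `midpoint_integral` and the test integrand (sin on 0..π, whose exact integral is 2) are assumptions picked for illustration, not the reworked benchmark itself. The main difference from the loop in this post is that each sample is multiplied by the step width rather than divided by it.

```ruby
# Midpoint-rule integration: a sketch, not the benchmark from the post.
# `midpoint_integral` is a hypothetical helper name.
def midpoint_integral(a, b, n)
  h = (b - a) / n.to_f          # width of each slice
  total = 0.0
  n.times do |i|
    x = a + (i + 0.5) * h       # midpoint of slice i
    total += yield(x) * h       # area of the slice: height * width
  end
  total
end

# Integrate sin(x) from 0 to pi; the exact answer is 2.
area = midpoint_integral(0.0, Math::PI, 10_000) { |x| Math.sin(x) }
puts area
```

Using an integer loop counter and computing x from it avoids accumulating rounding error in x over many small steps.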

Timings (Intel iMac)

imac:$ time ruby pure_ruby.rb

real    1m43.283s
user    1m38.615s
sys     0m1.213s
imac:$ time /usr/local/ruby1.9/bin/ruby pure_ruby.rb

real    1m1.352s
user    0m58.550s
sys     0m0.749s

That is about 1.7x faster (103.3 s versus 61.4 s of wall-clock time). Not bad for no additional work. These numbers are very preliminary, and shouldn’t be mistaken for real benchmarks. I have a series of tests I did with pair correlation function calculations on a simulated liquid that I want to rerun to see how 1.9 holds up.
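For timing from inside the script rather than with the shell’s `time`, Ruby’s standard `benchmark` library can wrap the loop. This is only a sketch: the helper name `timed_total` is hypothetical, and it runs a much shorter range (10,000 steps, chosen so it finishes quickly) with an integer loop counter, so its numbers are not comparable to the timings in this post.

```ruby
require 'benchmark'

# Hypothetical helper: runs the loop from the post for `steps` steps
# starting at x = 1.0 and returns [total, elapsed_seconds].
def timed_total(steps, step = 0.0001)
  total = 0.0
  elapsed = Benchmark.realtime do
    steps.times do |i|
      x = 1.0 + i * step
      result = (5.4*x**5 - 3.211*x**4 + 100.3*x**2 - 100 +
        20*Math.sin(x) - Math.log(x)) * 20*Math.exp(-x/100.3)
      total += result / step
    end
  end
  [total, elapsed]
end

total, elapsed = timed_total(10_000)
puts "total   = #{total}"
puts "elapsed = #{elapsed} s"
```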


Performance in Ruby 1.9

November 27, 2007

I have seen some rumblings about Ruby 1.9 being almost ready (here’s hoping that it turns out better than this year’s OS upgrades) and faster than before. I haven’t looked at 1.9 since I ran some of my early benchmarks.

How To Start Playing With Ruby 1.9 Right Now!

More Real World Performance Data

Changes in Ruby 1.9

Shortcomings in Scientific Computing

July 16, 2007

I read an interesting article (via The Third Bit) a couple of weeks ago that addresses some of the shortcomings of ‘Scientific Computing’. The major shortcomings mentioned are

  • Simple text editors vs IDE
  • Lack of version control
  • Not enough testing

The major item (to me) is version control. If you aren’t using some type of version control, start now. Subversion is an easy enough system to get working (although the repository setup is a little harder than it needs to be). If you think of this as just a backup system, you miss the real power of version control. In the project I am working on, I can instantly revert to the code as it was on any date. I can tag or label a certain version for a paper or other event. I can trace a certain section of code and figure out how it has changed over the versions. This came in handy recently as I was trying to figure out a certain constant in our code. I was able to look back and find out that the code I inherited was set up that way. It would have taken much longer to sift through tarballs looking for this value.

I’m not convinced that text editor vs IDE is that significant an issue, but it depends on the language and environment that you are using. Some languages, like Java and C variants, have a lot of boilerplate text for functions and class definitions. For these languages, having something that can autogenerate this text is a real timesaver. For languages that are more compact, this is less of an issue. The same applies to some frameworks, like Microsoft’s .NET (which I used in some web development contract work I did a few years ago). The class and function names in .NET are generally long to very long, but the Visual Studio IDE is really good about popping up a list of things that could be called at whatever point you are (since the MS languages are statically typed). There is some value to this, but I’m not convinced that the efficiency differences are enough to totally change how you work. My advice would be to use the editor you are most comfortable with, but try other things from time to time and keep an open mind.

One thing that an IDE does provide that is helpful is a good debugging facility. If you have to spend lots of time inside a debugger figuring things out, then you will benefit from doing more testing, but there are times that having a debugger is much more helpful than adding a whole bunch of print statements.

Finally, the article talks about testing. Testing and verification are vitally important to scientific computing. There have been cases where published results have turned out to be incorrect due to bugs in the program. A particular case came out this year where incorrect results were published in a very high profile journal, which led to some groups not getting grant funding or published results, because their results didn’t agree. The problem with correcting this is that it isn’t a simple change. Software testing is a real commitment and a pretty fundamental shift in how you develop your code. I believe that the gains in productivity and in knowing what your code is doing offset the additional time to do the testing, but even in commercial software development, where testing is much more common, it is still not done by everyone. Even if you don’t adopt a full methodology, such as TDD, XP, or something like that, you really owe it to yourself to do as much verification as you possibly can. Test simple cases by hand. Verify things using simple models. Make sure your results make sense. I think it would be valuable to pick up a good test-driven development book, or look around for a unit testing program for your particular language and give it a try.
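As a small illustration of “test simple cases by hand”, the sketch below checks a routine against values that can be worked out on paper. The `kinetic_energy` method and the `assert_close` helper are hypothetical names made up for this example; a unit testing library for your language gives you the same idea with better reporting.

```ruby
# Hypothetical routine under test: kinetic energy, (1/2) m v^2.
def kinetic_energy(mass, velocity)
  0.5 * mass * velocity**2
end

# Minimal hand-rolled check; raises if the value is off by more than tol.
def assert_close(expected, actual, tol = 1e-12)
  unless (expected - actual).abs <= tol
    raise "expected #{expected}, got #{actual}"
  end
  true
end

# Cases simple enough to verify by hand.
assert_close(0.0, kinetic_energy(2.0, 0.0))    # nothing moving, no energy
assert_close(9.0, kinetic_energy(2.0, 3.0))    # 0.5 * 2 * 9
assert_close(9.0, kinetic_energy(2.0, -3.0))   # sign of velocity must not matter
puts "all checks passed"
```

Comparing within a tolerance instead of using == matters for floating point results, which rarely match a hand calculation bit for bit.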

The real bottom line is that there is a lot of development going on outside of scientific computing. Keeping an open mind and trying new things is the best assurance we have of producing better quality code. One resource is the Software Carpentry page at scipy.org. This page is Python-related, but most of the concepts should transfer to other languages fairly well.

Text overlay using RMagick

July 13, 2007
Text overlay example

For what I’m doing (simple text overlay on an image) this is the setup I’m using. It is based on an article I saw about autogenerating images for Rails.

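require 'RMagick'   # newer RMagick releases use require 'rmagick'

# Read the image ("input" stands in for your file name) and halve its size
image = Magick::Image.read("input").first.minify

drawable = Magick::Draw.new
drawable.pointsize = 18.0
drawable.gravity = Magick::SouthEastGravity
drawable.font_weight = Magick::BoldWeight

# Size of the rendered text; not needed for the overlay itself
tm = drawable.get_type_metrics(image, "numericalruby.com")

drawable.fill = 'red'

# Draw the text 20 pixels in from the bottom-right corner;
# the block sets the fill to black, so the text comes out black, not red
drawable.annotate(image,0,0,20,20,"numericalruby.com") {self.fill='black'}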


Some good info is also available on the RMagick site.

Ruby used for computation book

June 23, 2007

I ran across a neat book a little while ago that was designed as an introduction to numerical calculation in the context of gravitational science, starting with a two-body system and going through many-body systems. It has a good introduction to things like different order integrators, energy conservation, and stability. The authors decided to use Ruby for the book as a simpler introduction to programming than traditional languages. I think the intention is to use this in a classroom setting, so it will be interesting to see how that works out.

From my personal experience with this type of class (numerical programming, not gravitation) I really think it will help, because so much overhead of the class is dedicated to how to get the compiler to work and all of the wrangling with things that aren’t really important. There is a point where you need to introduce some C code for performance reasons (which is also discussed in one of the chapters), but the nice thing about Ruby is that you can do so much of the prototyping and “learning” in Ruby, then replace a very small amount of code and get tremendous performance gains. It would be very easy for the C part to be provided to a class once students have proven that they can implement the algorithm in Ruby.

Reading a file with Ruby

June 22, 2007

I intended to write this just after the file writing post, but got sidetracked with other ideas. The basic setup is like writing to a file.

f = File.new(filename)

There are two easy methods, and then one with more control.

The easy way

If all you want is the text of a small file, then simply use

text = f.read

which dumps the entire file into a string, with the \n characters embedded inside. This seems to be most useful for files that you are going to do something to, and then write back out. This is what I used for my template system. I didn’t want to have to worry about tags that happened to be split across lines.

text = f.readlines

is interesting in that it returns an array, where the elements of the array are the lines in the file. This is easiest to use when each line of the file is an individual ‘element’ (whatever that means for your situation.)

Also, you can read a file multiple times by calling #rewind, which resets you at the beginning of the file. The downside to both of these methods is that they read the entire file into memory. If the file is too large, you could exhaust the available memory, which would certainly cause bad things.
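Here is a minimal sketch that puts #read, #rewind, and #readlines together. The helper name `read_both_ways` and the throwaway temp file are just for illustration, so the example runs on its own.

```ruby
require 'tempfile'

# Hypothetical helper: returns [whole_text, lines] for the file at path,
# reading it twice through the same File object.
def read_both_ways(path)
  f = File.new(path)
  text = f.read        # one string, newlines included
  f.rewind             # back to the start; otherwise readlines returns []
  lines = f.readlines  # array with one element per line
  f.close
  [text, lines]
end

# Throwaway file so the example is self-contained.
tmp = Tempfile.new('example')
tmp.write("first\nsecond\nthird\n")
tmp.close

text, lines = read_both_ways(tmp.path)
p text   # "first\nsecond\nthird\n"
p lines  # ["first\n", "second\n", "third\n"]
```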

The hard way

The method with the most control is #gets. This returns the next line of the file, and also updates the #lineno attribute to hold the current line number. If you need (say) the fourth line of the file, you can do so with

3.times {f.gets}
text = f.gets

I have used this with some of the larger data files that our simulation programs produce. Since some of these data files can be very large, I wasn’t sure I wanted to read the whole thing into memory at once. The way I got around this was to read the file, and build a table of contents array with a new entry when I found a new timestep (which had distinctive text). Then I could rewind the file and read in only the timestep I was interested in. To get another timestep you simply repeat the process.
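A minimal sketch of that table-of-contents idea follows. The `TIMESTEP` marker text, the file layout, and the method names `build_toc` and `read_timestep` are all made up for illustration; the real data files look different. It records the line number where each timestep starts, then rewinds and skips ahead to the requested one.

```ruby
require 'tempfile'

# Pass 1: record the line number at which each timestep header appears.
# Assumes (hypothetically) that headers start with the text "TIMESTEP".
def build_toc(f)
  toc = []
  f.rewind
  while (line = f.gets)
    toc << f.lineno if line.start_with?('TIMESTEP')
  end
  toc
end

# Pass 2: rewind, skip through the header of timestep `index`, and collect
# its data lines up to the next header (or the end of the file).
def read_timestep(f, toc, index)
  f.rewind
  toc[index].times { f.gets }      # consumes everything through the header
  data = []
  while (line = f.gets) && !line.start_with?('TIMESTEP')
    data << line.chomp
  end
  data
end

# Tiny stand-in for a simulation output file.
tmp = Tempfile.new('sim')
tmp.write("TIMESTEP 0\n1.0\n2.0\nTIMESTEP 1\n3.0\n4.0\n")
tmp.close

File.open(tmp.path) do |f|
  toc = build_toc(f)
  p toc                        # [1, 4]
  p read_timestep(f, toc, 1)   # ["3.0", "4.0"]
end
```

Only one timestep’s worth of lines is held in memory at a time, which is the point of the exercise. For very large files, storing byte offsets from #pos and jumping with #seek would avoid re-reading the skipped lines, at the cost of a little more bookkeeping.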

What about the \n?

There is a built-in method to deal with the extra \n’s you will get with readlines or gets. The #chomp method removes a single trailing line ending (and is smart enough to deal with newline, carriage return, or both with one call). Unless you need the newlines for some reason, all of your calls will likely be

text = f.gets.chomp


text = f.readlines.map {|line| line.chomp}

and that’s all there is to it. This is fairly simple compared to some other languages I’ve used, and is one of the reasons I really like using Ruby.
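A quick check of how #chomp treats different line endings. Each call strips at most one trailing line ending; passing an empty string as the argument is the variant that strips every trailing newline.

```ruby
# Each call to chomp removes at most one trailing line ending.
p "data\n".chomp        # "data"
p "data\r\n".chomp      # "data"
p "data\r".chomp        # "data"
p "data\n\n".chomp      # "data\n"  (only the last newline is removed)
p "data\n\n".chomp('')  # "data"    (empty-string argument strips all trailing newlines)
```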

Easy Installation of Ruby Classes

June 21, 2007

Just a quick tidbit, but I’ve got a couple of classes that I consistently require, and it gets a bit redundant to include particular directories. It’s not something that I’m ready to make a gem out of, but I looked into the standard Ruby include path. I believe the paths are fairly standard, but you can always check this using irb

irb(main):001:0> $:

will display the include path.

If you copy .rb files to any of these directories (probably requiring sudo) you can require them from anywhere. I know there are other ways to do this, but this is a very quick way to do it.
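One of those other ways, sketched here under assumptions: `$:` is the short name for `$LOAD_PATH`, and you can add a directory of your own to it at runtime instead of copying files into the system directories. The helper `add_to_load_path`, the file `my_helper.rb`, and the `MyHelper` module are hypothetical names for this example, and a temporary directory stands in for wherever you keep your classes.

```ruby
require 'tmpdir'

# Hypothetical helper: put dir at the front of the load path (once).
def add_to_load_path(dir)
  $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)
end

Dir.mktmpdir do |dir|
  # A throwaway library file standing in for one of your own classes.
  File.write(File.join(dir, 'my_helper.rb'),
             "module MyHelper; VERSION = '0.1'; end\n")

  add_to_load_path(dir)
  require 'my_helper'          # found via the directory just added
  puts MyHelper::VERSION
  $LOAD_PATH.delete(dir)       # tidy up, since the directory is about to vanish
end
```

This only affects the running script, so it needs no sudo, but every script that wants those classes has to add the directory itself.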