Reading a file with Ruby

June 22, 2007

I intended to write this just after the file writing post, but got sidetracked with other ideas. The basic setup is like writing to a file.

f = File.new(filename)

There are two easy methods, and then one with more control.

The easy way

If all you want is the text of a small file, then simply use

text = f.read

which dumps the entire file into a string, with the \n characters embedded inside. This seems to be most useful for files that you are going to do something to, and then write back out. This is what I used for my template system. I didn’t want to have to worry about tags that happened to be split across lines.

text = f.readlines

is interesting in that it returns an array, where the elements of the array are the lines in the file. This is easiest to use when each line of the file is an individual ‘element’ (whatever that means for your situation.)

Also, you can read a file multiple times by calling #rewind, which puts you back at the beginning of the file. The downside to both of these methods is that they read the entire file into memory. If the file is too large, you could exhaust the available memory, and the program would slow to a crawl or crash.
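Here is a small sketch of both easy methods on the same file handle, with #rewind in between. The helper name read_both_ways and the sample file are made up for illustration; they are not from any library.

```ruby
require 'tempfile'

# Hypothetical helper: returns the file's contents both ways.
def read_both_ways(filename)
  f = File.new(filename)
  whole_text = f.read       # one string, with the "\n" characters embedded
  f.rewind                  # without this, readlines would return [] (already at the end)
  all_lines = f.readlines   # array, one element per line, "\n" still attached
  f.close
  [whole_text, all_lines]
end

# Made-up sample file so the sketch runs on its own.
sample = Tempfile.new('reading_demo')
sample.write("first\nsecond\nthird\n")
sample.close

whole_text, all_lines = read_both_ways(sample.path)
puts whole_text
puts all_lines.length   # 3
```

Both results hold the whole file in memory at once, which is exactly the downside mentioned above.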

The hard way

The method with the most control is #gets. This returns the next line of the file, and also updates the #lineno attribute to hold the current line number. If you need (say) the fourth line of the file, you can do so with

3.times {f.gets}
text = f.gets

I have used this with some of the larger data files that our simulation programs produce. Since some of these data files can be very large, I wasn’t sure I wanted to read the whole thing into memory at once. The way I got around this was to read the file, and build a table of contents array with a new entry when I found a new timestep (which had distinctive text). Then I could rewind the file and read in only the timestep I was interested in. To get another timestep you simply repeat the process.
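A rough sketch of that table of contents idea follows. The "TIMESTEP n" header format, the constant, and both method names are assumptions made up for this example; real simulation output will have its own distinctive text to match on.

```ruby
require 'tempfile'

# Hypothetical marker: assumes every timestep begins with a line like "TIMESTEP 3".
TIMESTEP_HEADER = /\ATIMESTEP\s+\d+/

# First pass: record the line number of every timestep header.
def build_table_of_contents(f)
  f.rewind
  toc = []
  while (line = f.gets)
    toc << f.lineno if line =~ TIMESTEP_HEADER
  end
  toc
end

# Second pass: rewind, skip ahead to the requested timestep
# (index is 0-based into toc), and collect its lines.
def read_timestep(f, toc, index)
  f.rewind
  (toc[index] - 1).times { f.gets }   # skip everything before the header
  f.gets                              # consume the header line itself
  rows = []
  while (line = f.gets)
    break if line =~ TIMESTEP_HEADER  # stop at the next timestep
    rows << line.chomp
  end
  rows
end

# Tiny made-up data file so the sketch runs on its own.
data = Tempfile.new('timesteps')
data.write("TIMESTEP 0\n0.0 1.0\n0.5 2.0\nTIMESTEP 1\n0.0 3.0\n")
data.close

f = File.new(data.path)
toc = build_table_of_contents(f)
p toc                        # [1, 4]
p read_timestep(f, toc, 1)   # ["0.0 3.0"]
f.close
```

Only one timestep's worth of lines is ever held in memory, at the cost of re-reading the file from the top for each request.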

What about the \n?

There is a built-in method to deal with the extra \n’s you will get with readlines or gets. The #chomp method removes one trailing line ending from a string (and is smart enough to deal with newline, carriage return, or both with one call.) Unless you need the newlines for some reason, all of your calls will likely be

text = f.gets.chomp

or

text = f.readlines.map {|line| line.chomp}

and that’s all there is to it. This is fairly simple compared to some other languages I’ve used, and is one of the reasons I really like using Ruby.
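A few quick checks of what #chomp actually does may be useful. The two helper methods are hypothetical names for this sketch, and the chomp: true keyword assumes Ruby 2.4 or newer.

```ruby
# chomp strips a single trailing line ending, whichever kind it is.
p "data\n".chomp      # "data"
p "data\r\n".chomp    # "data"
p "data\r".chomp      # "data"
p "data\n\n".chomp    # "data\n" (only one newline is removed)

# Hypothetical helper: gets returns nil at the end of the file,
# so guard before calling chomp on the result.
def next_line_chomped(f)
  line = f.gets
  line && line.chomp
end

# Hypothetical helper: assuming Ruby 2.4 or newer, readlines can
# do the chomping itself.
def chomped_lines(f)
  f.readlines(chomp: true)
end
```

The nil guard matters mostly in loops that call gets until the file runs out.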


File output

June 11, 2007

A fairly common thing I’ve run into is having an array of some kind, and wanting to dump that information into a file for visualization or processing. Dumping to a file is fairly easy to do in Ruby.

out = File.new('output.txt','w')
my_array.each do |item|
  out.puts item
end
out.close

This results in a file with each item of the array on its own line. The ‘w’ in the call to File.new is (as you might have guessed) for ‘write access’, and it truncates the file if it already exists. Also, do not forget to close the file when you are done. I usually end up with nested arrays (x and y values, usually), which require a slight variation:

out = File.new('output.txt','w')
my_array.each do |item|
  out.puts "#{item[0]}     #{item[1]}"
end
out.close
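If remembering to close the file is a worry, one alternative is the block form of File.open, which closes the file when the block ends. This is a swapped-in variation rather than what the code above does, and the helper name write_pairs is made up for this sketch.

```ruby
require 'tempfile'

# Hypothetical helper: writes each [x, y] pair on its own line.
def write_pairs(filename, pairs)
  File.open(filename, 'w') do |out|
    pairs.each do |x, y|
      out.puts "#{x}     #{y}"
    end
  end   # the file is closed here, even if the block raised an error
end

# Made-up data and a temporary file so the sketch runs on its own.
target = Tempfile.new('pairs_demo')
target.close
write_pairs(target.path, [[0.0, 1.5], [0.5, 2.5]])
puts File.read(target.path)
```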