Reading a file with Ruby

I intended to write this just after the file writing post, but got sidetracked with other ideas. The basic setup is like writing to a file.

f = File.new(filename)

There are two easy methods, and then one with more control.

The easy way

If all you want is the text of a small file, then simply use

text = f.read

which dumps the entire file into a string, with the \n characters embedded inside. This seems to be most useful for files that you are going to do something to, and then write back out. This is what I used for my template system. I didn’t want to have to worry about tags that happened to be split across lines.

text = f.readlines

is interesting in that it returns an array, where the elements of the array are the lines in the file. This is easiest to use when each line of the file is an individual ‘element’ (whatever that means for your situation.)

Also, you can read a file multiple times by calling #rewind, which resets you to the beginning of the file. The downside to both of these methods is that they read the entire file into memory. If the file is too large, you could exhaust the available memory, and the program will slow to a crawl or crash.
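Here is a small sketch of the two easy methods together with #rewind. The temporary file and its three lines are made up for the example so that it runs on its own; with a real file you would just pass your own filename to File.new.

```ruby
require "tempfile"

# Hypothetical sample file so the sketch is self-contained.
sample = Tempfile.new("sample")
sample.write("first line\nsecond line\nthird line\n")
sample.close

f = File.new(sample.path)
text = f.read        # one String with the "\n" characters embedded
f.rewind             # back to the start, otherwise readlines would return []
lines = f.readlines  # Array of Strings, one per line, each still ending in "\n"
f.close

puts text.length     # number of characters in the whole file
puts lines.length    # number of lines
```

Both calls hold the whole file in memory at once, which is fine here and is exactly the downside mentioned above for large files.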

The hard way

The method with the most control is #gets. This returns the next line of the file, and also updates the #lineno attribute to hold the current line number. If you need (say) the fourth line of the file, you can do so with

3.times {f.gets}
text = f.gets
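To see what #gets and #lineno are doing, here is the same idea run against a throwaway five-line file (the file contents are invented for the example). It also shows that #gets returns nil once the end of the file is reached, which is worth knowing before chaining other calls onto it.

```ruby
require "tempfile"

# Hypothetical five-line file for illustration.
sample = Tempfile.new("lines")
sample.write("one\ntwo\nthree\nfour\nfive\n")
sample.close

f = File.new(sample.path)
3.times { f.gets }        # skip the first three lines
fourth = f.gets           # the fourth line, newline included
line_number = f.lineno    # how many lines gets has read so far

fifth = f.gets            # the last line
at_end = f.gets           # nothing left, so this is nil
f.close

puts "line #{line_number}: #{fourth}"
```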

I have used this with some of the larger data files that our simulation programs produce. Since some of these data files can be very large, I wasn’t sure I wanted to read the whole thing into memory at once. The way I got around this was to read the file, and build a table of contents array with a new entry when I found a new timestep (which had distinctive text). Then I could rewind the file and read in only the timestep I was interested in. To get another timestep you simply repeat the process.
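A rough sketch of that table of contents idea follows. Everything specific in it is an assumption: the "TIMESTEP n" marker text, the sample data, and the helper names build_index and read_timestep are all made up. It also takes a small shortcut compared with the description above: the array stores byte offsets from #pos, so it can jump straight to a timestep with #seek instead of rewinding and skipping lines.

```ruby
require "tempfile"

# Hypothetical data format: each timestep starts with a "TIMESTEP n" line.
data = Tempfile.new("timesteps")
data.write("TIMESTEP 0\n1.0 2.0\n3.0 4.0\n" \
           "TIMESTEP 1\n5.0 6.0\n" \
           "TIMESTEP 2\n7.0 8.0\n9.0 10.0\n")
data.close

# First pass: note the byte offset of every marker line.
def build_index(f)
  index = []
  f.rewind
  loop do
    offset = f.pos                 # position before reading the next line
    line = f.gets
    break if line.nil?             # end of file
    index << offset if line.start_with?("TIMESTEP")
  end
  index
end

# Later: jump to one timestep and read until the next marker or end of file.
def read_timestep(f, index, n)
  f.seek(index.fetch(n))
  f.gets                           # skip the marker line itself
  rows = []
  while (line = f.gets) && !line.start_with?("TIMESTEP")
    rows << line.chomp
  end
  rows
end

f = File.new(data.path)
toc = build_index(f)
step1 = read_timestep(f, toc, 1)
step2 = read_timestep(f, toc, 2)
f.close

puts step1.inspect
puts step2.inspect
```

Only one line is held in memory during the indexing pass, and only one timestep afterwards, which is the point of going through #gets rather than #read.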

What about the \n?

There is a built-in method to deal with the extra \n’s you will get with readlines or gets. The #chomp method removes one trailing line ending from a string (and is smart enough to deal with newline, carriage return, or both with one call.) Unless you need the newlines for some reason, all of your calls will likely be

text = f.gets.chomp

or

text = f.readlines.map {|line| line.chomp}

and that’s all there is to it. This is fairly simple compared to some other languages I’ve used, and is one of the reasons I really like using Ruby.
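A few quick checks on #chomp, using made-up strings and a throwaway file. Two things are worth seeing in action: #chomp with no argument strips a single trailing line ending, and #gets returns nil at the end of the file, so a loop needs to test for that before calling #chomp. The chomp: true keyword in the last line assumes Ruby 2.4 or later.

```ruby
require "tempfile"

# chomp handles "\n", "\r\n" and "\r", but only removes one line ending.
unix    = "text\n".chomp
windows = "text\r\n".chomp
double  = "text\n\n".chomp     # one "\n" is left behind

# Hypothetical two-line file for illustration.
sample = Tempfile.new("chomp")
sample.write("alpha\nbeta\n")
sample.close

# gets returns nil at end of file, so test the line before chomping it.
f = File.new(sample.path)
lines = []
while (line = f.gets)
  lines << line.chomp
end
f.close

# Ruby 2.4+ can do the chomp while reading.
same = File.readlines(sample.path, chomp: true)

puts lines.inspect
```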


3 Responses to Reading a file with Ruby

  1. Richard Wilcox says:

    This is why I love Ruby! Thanks for the article – I knew there had to be a simple way to read a file into a string. In fact, I took your simple method one step further. You can read a file into a string with 1 line using the read class method of the IO class:

    text = IO.read(filename)

  2. joe says:

    why does every “how to read files” in ruby cover only the dead-obvious basic strings? show some examples of pulling non-text data out, i.e. parsing integers, floats, etc. where there are multiple items per line…its rare that somebody is only going to read text!

  3. sebi says:

    Don’t forget the INPUT_RECORD_SEPARATOR variable
    If you have multiline entries, you can easily read them one by one:

    $/ = "$$$$"
    f.gets

    will return everything including newline until the next “$$$$”
