If you’re a programmer, you’ve probably worked through one or more books teaching you the syntax of a new language. I’ve had this experience with half a dozen languages, like C, Javascript, and Perl. These books are typically introduce loops midway through the syntax discussion, after datatypes and control flow, but before I/O and advanced features.
Loops are almost always presented according to this formula.
- Inane intro text: “what if you want to do an operation more than once”?
- Introduce
whileloop, with difference betweendo whileandwhile do.
- Introduce
forloop, thewhileloop’s crazy cousin.
- (Bonus) Introduce
foreachloop if language is sufficiently high-level. And that’s it – you know how to loop through code; time to move on.
Not so fast. If you’re lucky enough to use a language that draws from functional programming, you shouldn’t loop like this.
The point
From now on, I’m going to use Ruby for examples, but this article isn’t about Ruby. It is about transitioning from primitive loops to iterating through collections, and from generic collection functions (like each) to more specific functions (like map).
From loops to array traversal
For the last several months, I’ve been working on Tumblon, a medium-sized Rails application. I’ve worked on 15-20 Ruby applications over the last three years, probably totaling 50,000 lines of Ruby code.
I’ve only used a primitive loop once.
That primitive loop was a loop {} loop, forever polling a task list looking for jobs. In other words, a loop with no exit condition beyond ^C or a server crash. As far as I know, Ruby doesn’t have a for loop at all, which would explain why I haven’t used it. It has a foreach loop (for item in arr), but that’s syntactic sugar for arr.each {}.
So the first reason why I’ve only used a simple loop in one case: the each concept usually a better option. Its Ruby implementation will be familiar to anyone who’s seen Ruby code before:
1 2 3 |
["horse", "pig", "cow"].each do |animal| puts "Old MacDonald has a #{animal}" end |
(Yes, I have a small child.)
This is far cleaner than its for or while loop alternatives. And it is a better abstract representation of what we’re doing: we aren’t looping with an exit condition, we are iterating through an array. But what if you want to do something a fixed number of times? Even that can be understood as traversing a list, like [1,2,3,4,5,6,7,8,9,10].each {}. Of course, Ruby provides a cleaner version: 10.times {}.
So if your loop is working through a list of some sort, each is a better abstraction of the problem. And in my experience building Ruby applications, every loop but one has been traversing a list. Parsing XML? Traversing a collection. Summing numbers? Traversing a collection. Reading in a textfile? Listening to STDIN? Working with rows in a database? Traversing a collection. That’s what each loops do well.
Beyond arr.each
But each isn’t the final word. It is a step up from a primitive for or while loop when working with a collection of values, but many each loops should be replaced with other array methods, like map, inject, and select.
When is each useful? Simple: when you want to create side-effects, like saving to the database, printing a result, or sending a web service call. In these cases, you’re not concerned with the return value; you want to change state on the screen, the disk, the database, or something else. Take a look at this code.
1 2 3 |
User.find(:all).each do |user| Notification.deliver_email_newsletter(user) end |
You don’t need a return value from this – you need emails to be delivered.
But don’t use each if you want to extract some new value from an array. That’s not what it’s for. Instead, take a look at three other powerful functions: map, inject, or select. To see why, let’s take a look at select. Here is code that takes in an array, and creates a new array from elements that match a certain condition, using each.
1 2 3 4 5 |
active_users = [] users.each do |user| active_users << user if user.active? end active_users |
Man, the first and last lines are ugly. Why do you have to initialize and return active_users? Answer: because this is a misuse of each. You are much better off using select (or its equivalent, find_all):
1 2 3 |
users.select do |user| user.active? end |
Using select is shorter, easier to understand, and less bug-prone. And more importantly, it clearly encapsulates one common use of each (and looping in general).
Two other key functions – map and inject (or reduce) – complement select and follow a similar pattern. And not surprisingly, they form the foundation of the mapreduce approach to distributed processing. I’ve written more about map and reduce in another article, and here is shorthand for knowing which of these functions to use:
| Desired Return Value | Function |
|---|---|
| New array with same number of values | map |
| New array composed of part of the old array | select |
| Single value (though this value can be an array) | inject |
| none | each |
The point, redux
Use each for changing state. Otherwise, avoid side-effects and use “functional” array methods that return a value. Simple. Your code will be cleaner and less bug prone.
And remember the dead giveaway:
- Initialize an empty value, or array, or whatever (
new_arr = []) arr.each, changing the initialized value- Return the value (
return new_arr)
Whenever you see this pattern, you know you’ve got an each loop that needs swapping out.
(Edit: I’ve posted a follow-up article with more about map and reduce.)

Hmm, very useful your explanation about looping/iteration. I’m going to use it to show Ruby features to convince my friends about ruby !
Ruby has a barebones for loop, and it’s faster than enumerating, so you might encounter it in the case of things that have to be heavily optimized. Outside of that, no, there is no reason to use it, generally.
Hell, I use range enumeration* for looping between just a series of numbers, even.
*(1..5).each
Great post, very well explained. I can think of a handful of mis-used each loops that I will refactor to selects or map asap.
Looking forward to your posts regarding map and inject functions.
Thanks, really, for this simple but great article. I have learned something today !
RE Tim Connor’s comment: From the semantics, I wonder if Ruby’s for is the same as Ruby’s foreach.
I don’t have a source, but I believe that if you benchmark, you’ll see that foreach is not the same as .each. One is faster than the other. Semantically, your statement is correct. They are the same.
This is a great post, Jon.
I’m going to print out your little chart and mount it above my bed.
Ruby actually has a better endless loop instead of this: while true{}. It is simply: loop {}
Remi – you’re right, and that’s actually what I used. I’ll edit the article.
Tim – is Jim correct (that the only Ruby
forloop isfor i in arr), or is there also a straightforloop in Ruby that I’m not familiar with?Tim, Jim, etc. – you’re absolutely right that about optimization. Just like denormalizing a database, there may be times when you should move from a more abstract function to a faster function.
Excellent points. If I had my way , for loops would be banished. We should force up the level of abstraction.
Maybe you should look at functional programing language techniques, out of curiosity, there is lot’s more than iter and map. I think you can also consider recursion algorithm… I’d like to see how those abstraction are internaly represented in Ruby, maybe with for loops ?
Xavier – sounds like you’ve got a blog post waiting to happen. :)
Recursion could be used in place of a bare loop (
loop {}), and of course, there is a lot more to functional programming than two functions. As for howmapetc. are represented internally, that’s a bit beyond my expertise; could be a C loop, but it could also be iterative recursion or something else.Man, the first and last lines are ugly. Why do you have to initialize and return active_users? Answer: because this is a misuse of each.
Dang. I wrote code like that just yesterday. I’ve got some cleaning up to do.
Thanks for the article. I’ve tried to remember to use enumerable methods properly, but went through my code and found a couple of loops I could fix. However, it also prompted me to try fix what I had thought had a bad code smell, but hadn’t been motivated to find a better way of doing:
items = {} collection.each {|item| items[item] => item.method } itemsI had a couple of these scattered around in my code, and after a quick search, found build_hash which cleaned up my code a lot!
Just thought I’d mention it since it’s along the same lines as what you spoke about, and my finding it was definitely a result of reading your post.
Hi Luke – I’d never seen
build_hashbefore, but it looks appropriate. If you weren’t familiar with that function, you could also just useinject.That’s true for just about everything – you can implement
map,select, etc. in terms ofinject– but whenever possible, you’re better off using a more abstract method (likebuild_hashinstead ofinject).(I also fixed the link in your comment.)
What would you use if you’d like to check if there’s any element within array that matches given condition? In such case probably most efficient is ‘for’ loop so you can break after you encounter first element that matches (??)
Mariusz,
In Ruby, you would want to use Enumerable#find or Enumerable#any? to do that.
findwill return the first object that returns true for a block, whileany?simply returns true if any of the elements returns true for a block.Good stuff, but don’t completely write off the for.
Consider (1) the case where the index itself is instrumental to the operation mutating the collection elements; and (2) the case where the operation mutating the collection element depends on the preceding or succeeding element.
I realize that both of these cases can be handled without a ‘for’, but there are times when a for can be the clearer abstraction.
@Jon: Your example for building a hash using #inject has one problem: you’re creating and then throwing away an anonymous hash for each iteration through the inner loop, which could consume a lot of extra memory and CPU (for garbage collection) if your initial collection is large.
I actually tend to use #inject to construct hashes quite often, but without the “magic” method call parameter capture of a new anonymous hash:
my_hash = @array_val.inject({}) {|h, val| h[val] = val.some_method; h }