Why Hate the for Loop?
There’s one example that comes up sooner or later every time someone starts talking about closures. This time it’s from Bruce Tate on developerWorks:
Listing 1. The simplest possible closure

3.times {puts "Inside the times method."}

Results:

Inside the times method.
Inside the times method.
Inside the times method.

times is a method on the 3 object. It executes the code in the closure three times. {puts "Inside the times method."} is the closure. It’s an anonymous function that’s passed into the times method and prints a static sentence. This code is tighter and simpler than the alternative with a for loop, shown in Listing 2:

Listing 2. Looping without closures

for i in 1..3
  puts "Inside the times method."
end
Personally I find the latter example simpler, clearer, and easier to understand. For one thing, when I see the word times suffixed to a number like 3, I expect multiplication, not action. But even if times were changed to a more reasonable method name such as do or act, I’d still prefer the for loop. (Perhaps what you really want here is “do 3 times”. That might really be clearer.)
I don’t know what it is some people have against for loops that they’re so eager to get rid of them. This isn’t the first or even the second time CS theorists have revolted against for loops (or their equivalent). One advantage cited for RATFOR over traditional Fortran-77 was the ability to use index-less while loops instead of DO loops (Fortran’s equivalent of for). The Java Collections API brought us a confusing mass of iterators with weird modification behavior to avoid having to do something as simple and obvious as indexing our walks through a list. Then in Java 5 this wasn’t good enough: some people were still ignoring iterators and stubbornly persisting in indexing their list traversals, so we got a whole new index-less for (String arg : args) syntax.
What is it about a simple indexed loop that offends people so much that they invent massive, complex syntax just to avoid it? Personally I find the indexed for loop syntax to be familiar and comforting. That’s why I deliberately designed XOM to support indexed access to the various components of the document tree. No iterators. No fancy loops. Just plain, old

for (int i = 0; i < element.getChildCount(); i++) {
    Node child = element.getChild(i);
    //...
}
The times syntax does avoid the explicit declaration and creation of a loop variable, though I’m sure that’s still happening behind the scenes and there’s no performance difference. Still, it is nice that you don’t have the variable getting in your way if you don’t want it. Certainly in C, and even in Java, programmers are always enbugging their code with wrongly scoped loop indices or fencepost errors. This makes index-less loops a nice feature for a language that has them from the get-go, like Ruby. However, I don’t think this is a big enough advantage to justify changing Java.
A much more serious concern is that index-less loops don’t have to be serial. That is, there’s nothing in a statement like 3.times {doSomething()} that promises any particular order of execution. In fact, just maybe we can do all three actions at the same time. This enables parallel processing, which is going to be very important as multicore processors and multi-CPU systems become even more common. For example, consider the code to sum an array:
double sum = 0;
for (int i = 0; i < array.length; i++) {
sum += array[i];
}
The programmer probably doesn't care (and certainly shouldn't care) that we add array[0] to the sum before array[5]. If the array is large, we could actually execute this on eight separate CPUs, each summing an eighth of the array, and then add the subtotals at the end. A smart compiler could figure this out, but it's easier to do that if there's nothing in the code that refers to the loop index.
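To make that concrete, here is a sketch of the eight-way chunked summation written out by hand in plain Java threads. The class name and the chunk bookkeeping are invented for illustration; this shows what the transformation would do, not what any compiler actually emits:

```java
// Sketch: sum an array in parallel by splitting it into chunks,
// summing each chunk on its own thread, and adding the subtotals.
public class ParallelSum {

    public static double sum(final double[] array, int chunks)
            throws InterruptedException {
        final double[] subtotals = new double[chunks];
        Thread[] workers = new Thread[chunks];
        int chunkSize = (array.length + chunks - 1) / chunks;
        for (int c = 0; c < chunks; c++) {
            final int start = Math.min(c * chunkSize, array.length);
            final int end = Math.min(start + chunkSize, array.length);
            final int slot = c;
            workers[c] = new Thread(new Runnable() {
                public void run() {
                    double s = 0;
                    for (int i = start; i < end; i++) s += array[i];
                    subtotals[slot] = s; // each thread writes only its own slot
                }
            });
            workers[c].start();
        }
        double total = 0;
        for (int c = 0; c < chunks; c++) {
            workers[c].join();          // subtotals[c] is visible after join()
            total += subtotals[c];
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        double[] a = new double[1000];
        for (int i = 0; i < a.length; i++) a[i] = i;
        System.out.println(sum(a, 8)); // prints 499500.0, same as the serial loop
    }
}
```

Note that nothing in the per-chunk work depends on a global loop index, which is exactly what makes the split legal.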
The for syntax implies serialization where you may not need it. The closure syntax doesn't necessarily guarantee the order of execution of the various statements. However, sometimes you actually do need a particular order of execution, or you need to refer to the loop index from within or outside of the code. For example, consider this simple loop that concatenates an array of strings named args:
String s = "";
for (int i = 0; i < args.length; i++) {
s += args[i];
}
String concatenation is not commutative, and it's really important that we add the strings in the proper order. A really smart compiler might still break this up into multiple threads, but it would have to be a lot more careful that the intended order was preserved.
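In fact it can: string concatenation is associative even though it is not commutative, so an ordered parallel reduction of this loop is legal as long as the chunks are rejoined in encounter order. A sketch using java.util.stream, which arrived in Java well after this post was written:

```java
import java.util.Arrays;

public class OrderedConcat {
    public static void main(String[] args) {
        String[] words = { "a", "b", "c", "d" };
        // reduce() with an associative operation and an identity element
        // preserves encounter order even on a parallel stream, so the
        // result is always "abcd", never some permutation of it.
        String s = Arrays.stream(words).parallel().reduce("", String::concat);
        System.out.println(s); // prints abcd
    }
}
```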
Thus the closure syntax and the for syntax really aren't equivalent, and closures can't replace for loops. They might supplement them, but this is only relevant if they really can be run on multiple processors simultaneously. In functional systems this works because there are no side effects; thread safety is almost free. However, this isn't true in Java. In Java you have to think very carefully about thread safety, and typical closure code doesn't do that. Unless we have true functional programming, I'm not sure I see the point.
The current proposals for closures in Java all seem to retain sequential execution of code. For instance, the BGGA proposal makes a big point of allowing break and continue inside closures, but what does that mean if the different iterations of the loop are in fact running on different processors at the same time? If the code is going to be sequential anyway, I prefer the style that makes that more obvious. The traditional indexed loop does that. A closure doesn't.
February 7th, 2007 at 12:01 pm
Speaking as a computational scientist and former Fortran programmer (now using Java for non-mathematical applications) there is a very good reason for using indexed loops when implementing mathematical algorithms. First, they closely mirror mathematical syntax for sums and products and other operations using indexed symbols so the code resembles what is being coded. Second, it makes it more direct for a compiler (and easier for a human) to determine whether or not the various iterations of the loop are independent and can be carried out in parallel. (And remember that in many computational algorithms, a single loop will contain a large number of statements so the analysis can be considerably more complex than in the examples you’ve given.)
The compilers on Cray and other supercomputers have done this type of analysis for more than 25 years. Even on old single processor so-called vector architectures, several iterations of a loop were carried out independently; this was why these machines were so fast for their time.
Of course, if you overload + to mean String concatenation, which is not commutative, the compiler has to understand this. A similar problem might arise if you are adding very small numbers to very large ones where you need to know what you are doing from a numerical standpoint in order to maintain precision.
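The precision point is easy to demonstrate, because floating-point addition is not associative: regrouping a sum can change the answer. A minimal Java example:

```java
public class FloatOrder {
    public static void main(String[] args) {
        double big = 1e16;                       // ulp(1e16) is 2.0
        double smallFirst = (1.0 + 1.0) + big;   // the accumulated 2.0 survives
        double bigFirst   = (big + 1.0) + 1.0;   // each lone 1.0 is rounded away
        System.out.println(smallFirst == bigFirst); // prints false
    }
}
```

This is why a compiler parallelizing a floating-point summation changes the result slightly unless it is told that reassociation is acceptable.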
February 7th, 2007 at 12:47 pm
Just to set the record straight on Ratfor:
Ratfor predates Fortran-77; it was layered over Fortran-66, for which DO was the only structured statement available; IF statements allowed you to have exactly one statement in their scope, with no option for ELSE at all. Consequently, people contorted their loops to make sure they always had the form of DO, even when the index was totally irrelevant to the loop.
Furthermore, DO in both -66 and -77 was buggy: DO loops were tested at the bottom, so they always executed at least once, no matter what the loop termination value was.
I quite agree with you that the 3.times { … } is a crappy example; in fact, most of the examples in that particular paper are crappy.
February 7th, 2007 at 4:07 pm
Objective-C has picked up the “for foo in bar” syntax, and I think it’s a very succinct and transparent approach. I only know what for(;;) means because I’ve spent so long programming C; I don’t think it’s at all intrinsically obvious. Even BASIC’s FOR I=1 TO N : REM do some stuff : NEXT I is more self-explanatory than the C for(;;) construct.
February 7th, 2007 at 4:28 pm
By “crappy”, I think John means “trivial” or “silly”.
Adding a language feature that’s primarily for trivial or simple cases, now THAT’S crappy.
February 7th, 2007 at 4:34 pm
What are these “closures” you speak of??!!
February 7th, 2007 at 5:49 pm
Why hate the for loop? Off-by-one errors, for one. How many times does this execute?

for i in 1...3
  puts "Inside the method"
end

Did you say 3? Even for a second?

Also, there’s this variable ‘i’ hanging around — is it used? If the block was long, you might assume it is, because if you didn’t need it you’d write 3.times {}.

Programmer nirvana is when you get to say exactly what you mean. If you want to execute some code three times, there’s no more direct way to say it than 3.times {}. If you want to run some code several times, assigning successive values of i for all values from 1 to 3 inclusive, there’s no more direct way to say it than for i in 1..3.
February 8th, 2007 at 1:07 am
If you don’t want to hate for loops, don’t hate them. But this is a little like the old goofy Perl arguments of five or six years ago.
There’s too much punctuation! It’s too haaaaaard. Whatever. I’ve been programming with Ruby for two years now and I can safely say that the n.times construction doesn’t throw me even for a nanosecond. Nope. I never ever would look at 3.times {foo} and think, “Hey! What is he multiplying? What!? He’s not multiplying? He’s iterating? What!!?? I don’t get it … wait. Oh. Man, that threw me.” Why doesn’t it throw me? Because I’m used to Ruby. If instead of “times”, the method name was “humpifies”, and I’d seen the docs and knew that to humpify something was to cause it to execute n times, then that would be it, I guarantee you. I’d “get it” from that point forward, and the only thing that would annoy me would be that “humpifies” is four letters longer than “times”.
As for compiler considerations: meh. Most of the programming I do on the job doesn’t involve worries about concurrency. If I was going to be concerned with concurrency and performance, let’s face it, I probably wouldn’t use Ruby. I’d use Erlang, or Cilk, or if I didn’t have a choice, well, then I’d use C++ or Java, make the best of it, and masturbate more often to relieve the stress. Other than that, I enjoy having choices and the options to express, more or less, what I mean. All else being equal, it’s just a matter of getting used to whatever a programming language has to throw at you.
February 8th, 2007 at 4:32 am
There is a loop style that is as terse as the closure example but not turned inside out:
repeat (3) {
    System.out.println("Three times");
}

Without the unnecessary, distracting and namespace-polluting loop variable and the meaningless bounds of for (i = 0; i < 3; i++).
February 8th, 2007 at 1:22 pm
It is funny you made a cut-and-paste bug in your example:

String s = "";
for (int i = 0; i < args.length; i++) {
    s += args[i];
}
February 8th, 2007 at 2:30 pm
“Adding a language feature that’s primarily for trivial or simple cases, now THAT’S crappy.”
The real feature is code blocks, a.k.a. closures. Methods like times, upto, etc. are extremely simple and add some syntactic sugar that makes code easier to read, so it would be stupid not to implement them…
Here are complete implementations of Integer’s times and upto methods in Ruby:
class Integer
def upto(other)
for value in self..other
yield value
end
end
def times
1.upto(self) { yield }
end
end
They are so simple I can’t call them “features” but they do begin to show the power of having closures.
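For comparison, the closest 2007-era Java can get is to fake the block with an anonymous Runnable; the times helper and its class here are invented for illustration:

```java
// A Java sketch of Ruby's Integer#times: run a block n times.
public class Times {

    static void times(int n, Runnable block) {
        for (int i = 0; i < n; i++) {
            block.run(); // behind the scenes it is still an indexed loop
        }
    }

    public static void main(String[] args) {
        times(3, new Runnable() {
            public void run() {
                System.out.println("Inside the times method.");
            }
        });
    }
}
```

The anonymous-class boilerplate is precisely what the various closure proposals for Java aim to shrink.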
February 8th, 2007 at 3:22 pm
“…there’s nothing in a statement like 3.times {doSomething()} that promises any particular order of execution”
And that is *exactly* why you want to use it. You shouldn’t care about the order of execution; you only want {doSomething()} to be executed 3 times. Let the “times” method decide what the best way of looping is (for the specific implementation of “list” that you might be using). It will free you from having to specify how to loop (and make your code cleaner, and prevent a bug or two).
Now if you do want a guaranteed execution order, you could for example define a new method “sequential_times” that would do just that: execute 3 sequential times. Then your code would say “3.sequential_times {doSomething()}”, in other words, it says exactly what it does. Had you written out an indexed loop for that goal, you’d have to examine the loop code to find out what it does.
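The commenter’s distinction can be made concrete with java.util.stream (again, a later addition to Java than this post); sequentialTimes and parallelTimes are hypothetical names:

```java
import java.util.stream.IntStream;

public class TimesVariants {

    // Guaranteed order: 0, 1, ..., n-1, one run after another.
    static void sequentialTimes(int n, Runnable block) {
        IntStream.range(0, n).forEach(i -> block.run());
    }

    // No order promised: the n runs may be spread across threads.
    static void parallelTimes(int n, Runnable block) {
        IntStream.range(0, n).parallel().forEach(i -> block.run());
    }

    public static void main(String[] args) {
        sequentialTimes(3, () -> System.out.println("in order"));
        parallelTimes(3, () -> System.out.println("in whatever order"));
    }
}
```

The caller states which contract it needs, and the loop machinery stays out of the calling code.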
February 8th, 2007 at 6:46 pm
I think you make an excellent point that there is somewhat of an overreaction to the need for closures, and also that the BGGA proposal is highly serial in nature; the real advantage of something like a closure would be in a multiprocessing system. I think the CICE (http://docs.google.com/Doc.aspx?id=k73_1ggr36h) and C3S (http://www.artima.com/forums/flat.jsp?forum=106&thread=182412) proposals are both superior, since they are based on inner classes, which naturally allow parallelism, and can build upon the already existing infrastructure of Runnable and Callable.
Note: I proposed C3S and hence obviously think it is superior to the other proposals; otherwise I wouldn’t have suggested it 🙂
February 9th, 2007 at 3:53 am
I’ve become more and more fond of iterators, even though I usually don’t use Java 1.5. The main reason is that the code becomes less fragile, since it’s quicker to change an iterator implementation than all the for loops. When you’re changing the representation of associations, one can usually support existing iterators without much work, but indexed access is more difficult to support.

Indexed access tends to reveal the underlying representation. E.g. if all objects related to an “owner” are stored together, there will typically be indexed access to the whole set. If each has its own list, there are specific index-based accessors for each. By using iterators, such things may be hidden and you are more free to change the representation later. This may of course not be a problem in a specific case.

You mention XOM: I’ve often wanted iterators for stepping through children of specific kinds (all, only elements, only elements within a namespace, only elements with specific local names, …). Having one iterator class for each case would let me use the same loop style, instead of introducing an if with the relevant test.
You mention remove: This is an example where iterators are useful. I usually use the following code to remove all elements of a certain kind:
for (int i = 0; i < list.size(); ) {
    if (shouldRemove(list.get(i))) {  // shouldRemove stands in for whatever test identifies the elements
        list.remove(i);
    } else {
        i++;
    }
}

The trick is to step the index only in the cases where the remove is not performed.

However, isn’t the iterator variant clearer?

for (Iterator it = list.iterator(); it.hasNext();) {
    Object o = it.next();
    if (shouldRemove(o)) {  // same stand-in test
        it.remove();
    }
}

Of course, this would be even nicer with closures:

list.removeIf({Object o => shouldRemove(o)});
February 9th, 2007 at 5:31 am
My god, if you think closures are just for looping constructs you have seriously missed the boat.
February 9th, 2007 at 6:34 am
You mention remove: This is an example where iterators are useful. I usually use the following code to remove all elements of a certain kind:

for (int i = 0; i < list.size(); ) {
    if (shouldRemove(list.get(i))) {  // shouldRemove stands in for whatever test identifies the elements
        list.remove(i);
    } else {
        i++;
    }
}
Correction.
The filter function is useful – and yes without the blubby (is that a word Paul?) looping constructs.
filter (<10) [1..20]
[1,2,3,4,5,6,7,8,9]
Prefer a higher-order function over loops – think about it some more.
February 9th, 2007 at 8:55 am
> However, sometimes you actually do need a particular order of execution
And in these cases you’ll use a construct that guarantees order of execution
> String concatenation is not commutative, and it’s really important that we add the strings in the proper order.
Wrong: that’s not what you want. What you want is to have the final string in the correct order; the order of concatenation does not matter. The computer could join the last two strings of the list first for all you care; as long as the final string is the one you wanted, it _does not matter_, and saying otherwise is a fallacy. Not to mention, of course, the fact that this example is a strawman, since most languages handle string concatenation via native constructs, which means that you don’t need to use `for` loops anyway.
Now, since you went through so much effort to build that strawman, we’re going to refute it, shall we? And we’re going to refute it using the tool that fits your fallacy the most: a functional, lazily evaluated language (which means that it _really_ doesn’t guarantee order of evaluation): Haskell.
> let s = foldl1 (++) array
Well, it wasn’t that complex was it?
It shall of course be noted that most people would use the function built in the standard prelude and write `let s = concat array`.
And in most cases you don’t care about the order of execution; the only thing you care about is the order of the result. (Note: for the cases when you do care about order of execution, the languages that may break it all offer constructs that allow you to fix it however you want.) Case in point: the `map` construct.

`map` is really simple: it applies a function to each value of a collection, and returns a collection of the results _where the order of the results is the same as the order of the initial values_.

Does this mean that it can’t be parallelized? Of course not. It just means that you have to resync the final collection of all the values; you can see how to do that via Joe Armstrong’s implementation of a simple `pmap` (parallel map): http://www.erlang.org/ml-archive/erlang-questions/200606/msg00187.html
The order of execution is unspecified, the mapping is completely parallelized, yet the result collection is ordered which is exactly what we wanted.
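The same submit-in-order, collect-in-order trick is easy to sketch in Java with an executor; the pmap name mirrors Armstrong’s, and everything else here is invented for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class Pmap {

    // Apply f to every element on a thread pool. The results come back
    // in input order because the Futures are collected in input order,
    // even though the calls themselves may execute in any order.
    static <A, B> List<B> pmap(List<A> in, final Function<A, B> f)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<B>> futures = new ArrayList<Future<B>>();
            for (final A a : in) {
                futures.add(pool.submit(() -> f.apply(a)));
            }
            List<B> out = new ArrayList<B>();
            for (Future<B> future : futures) {
                out.add(future.get()); // blocks until that element is ready
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pmap(Arrays.asList(1, 2, 3, 4), n -> n * n));
        // prints [1, 4, 9, 16]
    }
}
```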
> Thus the closure syntax and the for syntax really aren’t equivalent and closures can’t replace for loops.
Actually they can; they’re just an example of inner iteration, versus Java’s or Python’s “outer iteration” (via iterator objects) or, worse, Java’s or C’s “manual iteration” (via regular for loops). Smalltalk showed more than 20 years ago that inner iteration worked; Ruby merely took from Smalltalk’s experience.

And you also forgot another piece of information, or didn’t realize it: closures, or anonymous functions in general, can do much more than just iteration.
Want transactions? In Java, you have to set up and acquire your transaction manually, then remember to commit or roll it back at the end. How could you do that in Ruby? Well, something along the lines of:
db_object.transaction do |db|
# do stuff
end
You’re done. The transaction method takes a block and will handle all the low-level gritty stuff, such as starting the transaction and committing or rolling it back when the block has been executed.
Reading a file?
File.open "myfile.txt" do |f|
  # do stuff with your file
end
Once you reach the end of the block the file will be neatly closed, _always_, if an exception is thrown from the block the file will still be cleanly closed. No need to muck around with nested levels of trys and excepts and finallys just to be sure that you can’t leave a file open.
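This is the “execute around” pattern. In the Java of this era, getting the same guarantee means writing the try/finally once, inside a helper that accepts the block as an interface; the withFile name and the ReaderBlock interface are invented for illustration:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

public class ExecuteAround {

    interface ReaderBlock {
        void call(BufferedReader reader) throws IOException;
    }

    // Open the file, hand it to the block, and always close it afterward,
    // even if the block throws; the caller never writes the try/finally.
    static void withFile(String path, ReaderBlock block) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(path));
        try {
            block.call(reader);
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("demo", ".txt");
        PrintWriter w = new PrintWriter(tmp);
        w.println("hello");
        w.close();
        withFile(tmp.getPath(), new ReaderBlock() {
            public void call(BufferedReader reader) throws IOException {
                System.out.println(reader.readLine()); // prints hello
            }
        });
    }
}
```

The anonymous-class ceremony is exactly what a block argument would replace.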
People usually show the iterative advantage of blocks via inner iteration, but that’s only because it’s the lowest level of simplicity, the one that shocks the most, and the one that _everyone_ will use first.
But blocks/anonymous functions/first class functions are not limited to iteration, far from that.
> In functional systems, this works because there are no side-effects. Thread safety is almost free. However, this isn’t true in Java. In Java you have to think very carefully about thread safety, and typical closure code doesn’t do that.
Guess what? Smalltalk and Ruby are not functional either. And most functional languages aren’t “pure”, meaning they’re not side-effect free (though they nearly all use the single-assignment principle), hence thread safety wouldn’t be “almost free” if they did indeed use threads (advanced concurrent functional languages don’t).
February 9th, 2007 at 1:20 pm
This argument is like saying Java is pointless because it’s easier to write “Hello World” on paper than go to the trouble of entering the canonical first Java program, compiling, and running it.
Loop constructs are simply a basic example to demonstrate closures. Closures are much deeper and more powerful than that.
February 9th, 2007 at 1:44 pm
Dude! It’s about preventing bugs at compile time!
How do you guarantee at compile time that, inside your “indexed for” loop, you did not accidentally walk out of your data structure (array, DOM tree, whatever…)? Such an error will never occur with the new for syntax!

But this is completely beside the point of adding closures to a programming language…
February 9th, 2007 at 6:07 pm
Joseph: Are you implying that there is something wrong with that argument?
February 9th, 2007 at 10:14 pm
Why, yes.
February 10th, 2007 at 5:24 pm
[…] Elliotte Rusty Harold starts from a proposed syntax for closures in Java and ends up wondering why the for loop so often becomes the target of so much criticism and, consequently, of so many alternatives. As a matter of fact, I’m not very attached to it either… :-/ […]
February 14th, 2007 at 11:03 am
[…] Elliotte Rusty Harold posted some closure related musings as “Why hate the for loop?” […]
April 5th, 2007 at 11:38 am
[…] But the explanation that hit home most for me was by Adam Turoff at his site Notes On Haskell, in which he responds to a closures-skeptic who asks why peeps be hatin’ on the good ol’ “for” loop. […]
June 13th, 2007 at 3:30 pm
Closures are also a tool for building concise methods, by yielding control to a block instead of forcing the programmer to directly access the data from the object. For instance (haha, pun) the join and map methods of class Array in Ruby:

["args", "one", "two", "three"].join(", ")

or

args.reverse.join(", ")

[5, 4, 2, 1].map { |n| n**2 }