On the history of the Multigrid method creation.
by R.P. Fedorenko (2001).

(This is a translation from Russian of a note written by Radii Petrovich Fedorenko in 2001 and posted on his home page in 2004.)

Recently, I have heard about the talk "Multigrid: something old, something new" made by P. Wesseling. The contents of the talk were kindly communicated to me by M. Botchev, and I realized that the history of the method creation provokes a certain interest but, unfortunately, is not very well known (especially in the West). In particular, it is believed that the method has a purely theoretical origin, which was not related to any real-life computations, and that I did not know about its real efficiency. However, this is not completely true and I believe it is worthwhile to recall the actual story of the origin of the method. Especially as this story is related to the history of the establishment of computational mathematics in the era of the first computers, the history that still has to be written.

At the very end of the fifties a group of young researchers at the Applied Math Division of the Steklov Institute, headed by I.M. Gelfand (the Division later became the Keldysh institute for Applied Mathematics), started to gradually reduce works on secret defense computations. These defense projects were being transfered to the other computer centers. By that time we did already have some experience in numerical solutions of P.D.E. equations and I.M. Gelfand realized what possibilities for the other applied sciences were opened by the computers and our new knowledge. The scientific community did not have any idea on these new tools of scientific research, majority of the scientists then were simply not aware of the existence of computers in the USSR. I.M. Gelfand had begun setting up contacts with scientists from various research institutes, looking for those who were mature enough to apply numerical methods in solving equations they were interested in.

This is how the "weather forecast" problem had appeared which we started working at in collaboration with the Institute for Atmospheric Physics of the Academy of Sciences. Of course, this was a very simple mathematical model for a one or two day wind forecast. The model led to 2D hydrodynamics equations of an incompressible medium to be solved on a part of a sphere with the Coriolis force taken into account (i.e. a "one-layer" atmospheric model). It should not be forgotten that the state-of-the-art computer M-20 that had appeared 2 or 3 years before had only 4096 addresses of operational memory and performance of 20 thousands floating operations per second (flops).

It was in the numerical integration of these equations that, at each time step, a Poisson equation in a rectangular domain had to be solved. We had absolutely no experience of solving these equations and what experience one could talk about - we had just switched from the "Strela" computer (1024 memory addresses, 2000 flops) to the M-20. In addition, most of the computer time was strictly distributed among various defense computations and [to test new algorithms] we could only count on occasional pauses in the planned runs.

In my program, a 48x40 grid was used so that the unknown grid function and the right hand side vector occupied almost all operational memory. I started working with the only iterative method familiar to me then, the relaxation method [here, apparently, simple iteration or Richardson method is meant-M.B.], and soon became convinced in its poor efficiency. If at that time I was aware of the ADM (Alternated Direction Method) with optimally chosen parameters (by that time it was already known), or, at least of the overrelaxation (SOR) method, then I would probably not have looked for something else. But "fortunately" I was sufficiently ignorant in this topic. I was trying to understand what caused the slow convergence by looking at the residual evolution in the course of iterations and easily discovered the well known fact: first, the nonsmooth residual decreased fast and became smooth. After this the decrease became desperately slow. How the idea to formulate the correction equation as a problem on a coarse grid with the residual at the right hand side crossed my mind is difficult to recall now. Apparently, a certain hint was given by the Newton method which, for linear equations, leads to the same problem. This problem can be eased by switching to a coarse grid and the smoothness of the right hand side (the residual) justifies such an approach.

(Translated by M.A. Botchev)

A translator note: Section 9 of this often overlooked Fedorenko's paper summarizes his early multigrid developments:
R.P. Fedorenko, Iterative Methods for Elliptic Difference Equations, Russian Mathematical Surveys, 1973, Vol. 28, pp. 129-195, http://dx.doi.org/10.1070/RM1973v028n02ABEH001542.