Imagine that you were tasked with solving the following equation:

x^{2} + 2x - 8 = 0

Easy peasy. You'd just look up the quadratic formula that you learned in
school, apply it and come up with the two solutions (or *roots*) of
the equation: x = 2 and x = -4. Now imagine that you were tasked with solving
the equation:

x^{8} + 3x^{4} - 4 = 0

With a lot of head-scratching and tedious pen-and-paper calculation, it is
possible to solve cubic and even quartic equations analytically. However,
equations of fifth order and higher have no general algebraic solution (a result
known as the Abel-Ruffini theorem). By using root-finding methods
you *can* solve these equations numerically, eventually coming up with all the roots
you need to whatever degree of precision you require.

The Newton-Raphson method (Newton's method for short; after all, who the heck
is Raphson? Answer: the English mathematician Joseph Raphson, who derived the method
independently of Newton) is just one of many root-finding methods, but it is often
the method of choice due to its simplicity and speed. Here it is in its generalized
form for real values *x*:

x_{k+1} = x_{k} - R · f(x_{k}) / fʹ(x_{k})

Ignore *R* for the moment by pretending that it's 1. In case the notation
*fʹ* isn't familiar to you, it means the derivative of *f*, where
*f* is any old mathematical function. If you don't know what "derivative"
means, check out the five minute guide to calculus.
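The update rule is short enough to sketch in a few lines of code. Here's a minimal Python version (the function name, tolerance and iteration cap are my own choices, nothing standard), applied to the quadratic from the top of the page:

```python
def newton(f, f_prime, x0, R=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson: x_{k+1} = x_k - R * f(x_k) / f'(x_k)."""
    x = x0
    for _ in range(max_iter):
        step = R * f(x) / f_prime(x)
        x -= step
        if abs(step) < tol:  # successive guesses agree closely enough
            return x
    raise RuntimeError("did not converge")

# f(x) = x^2 + 2x - 8, f'(x) = 2x + 2; different starting guesses
# land on different roots:
print(newton(lambda x: x*x + 2*x - 8, lambda x: 2*x + 2, 1.0))   # ≈ 2.0
print(newton(lambda x: x*x + 2*x - 8, lambda x: 2*x + 2, -5.0))  # ≈ -4.0
```

Which root you end up with depends on where you start, a point that will matter a great deal once we move to the complex plane.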

Imagine that you were tasked with solving the following equation:

x^{3} - 1 = 0

(Imagine also that you are too stupid to be able to see instantly that at least one solution is 1). The iterative function for Newton's method will look like this:

x_{k+1} = x_{k} - (x_{k}^{3} - 1) / (3x_{k}^{2})

Now imagine that you are stupid enough to figure that 2 might be a sensible
guess at the solution. Here, including the initial dumb guess, is what pops
out: 2, 1.4167, 1.1105, 1.0106, 1.0001. So after just four iterations we end up
with a solution correct to three decimal places, which should be sufficient for
most applications. The rate of convergence is, on average, quadratic, which means
that the number of accurate digits roughly doubles with every step (0, 1, 1, 2, 4
in our simple example). In practice, since
we don't know the answer ahead of time, we stop calculating when the difference
between two successive steps falls below a certain threshold and take the value of
*x* at that point to be the value we're after.
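The sequence is easy to reproduce with the bare loop in Python (a sketch, not production code):

```python
x = 2.0                  # the initial dumb guess
guesses = [round(x, 4)]
for _ in range(4):
    x = x - (x**3 - 1) / (3 * x**2)   # Newton step for f(x) = x^3 - 1
    guesses.append(round(x, 4))
print(guesses)  # [2.0, 1.4167, 1.1105, 1.0106, 1.0001]
```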

No matter how sharply curved the graph of a function may be, if you get in
close enough it looks like a straight line (unless, of course, said curve is
fractal). This is what
the derivative of a function gives you: a straight line, called the *tangent*,
whose slope you can measure. If we follow this line up or down to the x-axis,
so that y = 0, then the value of x at that point is taken to be a better guess
and fed into the equation for the next iteration.

Remember *R*, the thing that I told you to pretend was 1? It is called
the *relaxation parameter*. It's generally written as *A*, but I call
it *R* because I'm a programmer and "R" for "relaxation parameter" is a lot
more meaningful than "A" for nothing at all. It is used to tune the rate of
convergence. Imagine that you had a problem function that overshot the root you
were seeking, then undershot it, then overshot it and... With *R* < 1
you can reduce the step size of each iteration and solve the overshooting. This
is called *under-relaxation*. Or
imagine that you had a function that converged on a solution at a linear rate rather
than a quadratic one. This will happen when two roots are extremely close together or
when one of the roots is a solution more than once (in mathematical terms, its
multiplicity is > 1). In this case *R* > 1 will increase the step size
and improve the rate of convergence. This is called *over-relaxation*.
Randomly tinkering with *R* is far more likely to make the method less
effective than more effective, so when it is required, its value is carefully tuned
for the specific problem domain. For our purposes, since we are more interested
in generating fractals than in solving equations, it's just another parameter
to play with (and a quite effective one at that).
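To see over-relaxation earn its keep, take f(x) = (x - 1)^{2}, whose root at 1 has multiplicity 2. A little Python experiment (my own toy example, not from any library):

```python
def newton_step(x, R):
    # One Newton-Raphson step for f(x) = (x - 1)^2, f'(x) = 2(x - 1).
    # The root at x = 1 has multiplicity 2, so plain Newton is only linear.
    return x - R * (x - 1)**2 / (2 * (x - 1))

x = 2.0
for _ in range(5):
    x = newton_step(x, R=1.0)
print(x)  # 1.03125: the error merely halves each step (linear convergence)

y = newton_step(2.0, R=2.0)
print(y)  # 1.0: with R equal to the multiplicity, one step lands exactly
```

For this particular function a step with R = 2 cancels the error completely, because the Newton step for a double root is always exactly half the distance to it.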

The 19th-century British mathematician Arthur Cayley investigated the application
of Newton's method to values in the complex plane. Specifically, he asked the
question: given any starting value, is it possible to predict which root the
value will converge to? Analyzing the family of quadratic polynomials, he found that
it was. Note that we can transform any quadratic function into the form z^{2} + c,
so the simple case can be generalized to more complex cases. Here's the result
for the function:

f(z) = z^{2} - 1

N(z) = z - (z^{2} - 1)/(2z) = (z^{2} + 1)/(2z)

The two bright spots are the roots of *f*, lying, as expected, at 1 and
-1, the square roots of 1. The basins of attraction for 1 and -1, written
as *A*(+1) and *A*(-1), are:

A(+1) = {z : Re(z) > 0}
A(-1) = {z : Re(z) < 0}

In English, the basin of attraction for +1 is all values of *z* that
have a real component >0, while that for -1 is all values whose real component
is <0. The common boundary of the basins, denoted by ∂*A*, is given by

∂A(+1) = ∂A(-1) = J

where *J* represents the imaginary axis. Furthermore,

N(J) = J = N^{-1}(J)

meaning that the set of values *J* is invariant under *N*, i.e.
numbers that lie on the imaginary axis move up and down it under iteration
but never away from it, so *J* stays the same.
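You can watch both the basins and the invariance of *J* play out with Python's built-in complex numbers (a throwaway sketch; the function name is mine):

```python
def iterate_N(z, steps=50):
    # Iterate N(z) = z - (z^2 - 1)/(2z), simplified to (z^2 + 1)/(2z)
    for _ in range(steps):
        z = (z * z + 1) / (2 * z)
    return z

print(iterate_N(0.5 + 3j))   # Re > 0: converges to +1
print(iterate_N(-0.1 + 2j))  # Re < 0: converges to -1
print(iterate_N(2j).real)    # 0.0: a point on J never leaves the imaginary axis
```

A starting value on the imaginary axis bounces around forever without settling, which is exactly why the boundary between the basins is where the interesting behaviour lives.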

Cayley left the analysis of cubic functions as an open problem. As we'll see, that turned out to be rather tricky.
