<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
  <title>Fred Akalin</title>
  <subtitle>Notes on math, tech, and everything in between</subtitle>
  <link type="text/html" rel="alternate" href="http://www.akalin.cx/"/>
  <link type="application/atom+xml" rel="self" href="http://www.akalin.cx/atom.xml"/>
  <updated>2011-12-29T20:44:36-08:00</updated>
  <id>http://www.akalin.cx/</id>
  <author>
  <name>Fred Akalin</name>
  <uri>http://www.akalin.cx/</uri>
</author>
<rights>© Fred Akalin
2005–2011.
All rights reserved.</rights>


  
  <entry>
    <id>http://www.akalin.cx/pair-counterexamples-vector-calculus</id>
    <link type="text/html" rel="alternate" href="http://www.akalin.cx/pair-counterexamples-vector-calculus"/>
    <title>A Pair of Counterexamples in Vector Calculus</title>
    <updated>2011-11-27T00:00:00-08:00</updated>
    <author>
  <name>Fred Akalin</name>
  <uri>http://www.akalin.cx/</uri>
</author>
<rights>© Fred Akalin
2005–2011.
All rights reserved.</rights>

    <content type="html">&lt;!-- Define LaTeX macros. --&gt;
&lt;script&gt;
MathJax.Hub.Register.StartupHook('TeX Jax Ready', function () {
  var TEX = MathJax.InputJax.TeX;
  TEX.Macro('bvec','\\mathbf{#1}', 1);
  TEX.Macro('RR', '\\mathbb{R}');
  TEX.Macro('pd', '\\frac{\\partial{#1}}{\\partial{#2}}', 2);
  TEX.Macro('sgn', '\\operatorname{sgn}');
});
&lt;/script&gt;

&lt;p&gt;While recently reviewing some topics in vector calculus, I became
curious as to why violating seemingly innocuous conditions for some
theorems leads to surprisingly wild results.  In fact, I was struck by
how these theorems resemble computer programs, not in some
&lt;a href=&quot;http://en.wikipedia.org/wiki/Curry-Howard_Correspondence&quot;&gt;abstract
way&lt;/a&gt;, but in how the lack of &amp;ldquo;input validation&amp;rdquo; leads
to
&lt;a href=&quot;http://en.wikipedia.org/wiki/Undefined_behavior&quot;&gt;non-obvious
behavior&lt;/a&gt; in the face of erroneous input.&lt;/p&gt;

&lt;aside&gt;There are
actually &lt;a href=&quot;http://amzn.com/048668735X&quot;&gt;whole&lt;/a&gt;
&lt;a href=&quot;http://amzn.com/0486428753&quot;&gt;books&lt;/a&gt; dedicated to
counterexamples.  They make good bathroom reading material.&lt;/aside&gt;

&lt;p&gt;I found that understanding why these counterexamples lead to wild
results deepened my understanding of the theorems involved and their
proofs.  Besides, pathological examples are more interesting than
well-behaved ones!&lt;/p&gt;

&lt;p&gt;First, let's look at a &amp;ldquo;counterexample&amp;rdquo;
to &lt;a href=&quot;http://en.wikipedia.org/wiki/Green%27s_theorem&quot;&gt;Green's
theorem&lt;/a&gt;:&lt;/p&gt;

&lt;p class=&quot;example&quot;&gt;1. Two functions \(L, M \colon \RR^2 \to \RR\) and
  a positively-oriented, piecewise smooth, simple closed curve \(C\)
  in \(\RR^2\) enclosing the region \(D\) such that

\[
\oint_C L \,dx + M \,dy \ne
\iint_D \left( \pd{M}{x} - \pd{L}{y} \right) \,dx \,dy \text{.}
\]&lt;/p&gt;

&lt;aside class=&quot;left&quot;&gt;The vector field \((L, M)\) also serves as the
canonical &amp;ldquo;counterexample&amp;rdquo; to
the &lt;a href=&quot;http://en.wikipedia.org/wiki/Gradient_theorem&quot;&gt;gradient
theorem&lt;/a&gt;.&lt;/aside&gt;

&lt;p&gt;Let

\begin{align*}
L &amp;= -\frac{y}{x^2+y^2} \text{, } M = \frac{x}{x^2+y^2} \text{,}
\end{align*}

and \(C\) be a curve going counter-clockwise around the rectangle \(D = [-1,
1]^2\).  Then the integral of \(L \,dx + M \, dy\) around \(C\) is \(2
\pi\) since it encloses the origin.  But

\[
\pd{M}{x} = \pd{L}{y} = \frac{y^2-x^2}{(x^2+y^2)^2}
\]

so the difference of the two vanishes everywhere but the origin, where
neither function is defined.  Therefore, the (improper) integral over
\(D\) also vanishes, proving the inequality. &amp;#8718;&lt;/p&gt;
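
&lt;p&gt;As a sanity check on the value \(2 \pi\), here is a minimal
numerical sketch in JavaScript.  The helper names
&lt;code&gt;segmentIntegral&lt;/code&gt; and &lt;code&gt;squareIntegral&lt;/code&gt; are made
up for illustration, and the midpoint rule is just one convenient
choice of quadrature.  The curve is traversed counter-clockwise, which
is the positive orientation required by the statement of the
example.&lt;/p&gt;

```javascript
// The two component functions from the example.
function L(x, y) { return -y / (x * x + y * y); }
function M(x, y) { return x / (x * x + y * y); }

// Midpoint-rule approximation of the integral of L dx + M dy along the
// straight segment from (x0, y0) to (x1, y1).
function segmentIntegral(x0, y0, x1, y1, steps) {
  var dx = (x1 - x0) / steps;
  var dy = (y1 - y0) / steps;
  var sum = 0;
  for (var i = 0; i !== steps; ++i) {
    var x = x0 + (i + 0.5) * dx;
    var y = y0 + (i + 0.5) * dy;
    sum += L(x, y) * dx + M(x, y) * dy;
  }
  return sum;
}

// Counter-clockwise around the boundary of [-1, 1]^2: bottom, right,
// top, then left edge.
function squareIntegral(steps) {
  return segmentIntegral(-1, -1, 1, -1, steps) +
         segmentIntegral(1, -1, 1, 1, steps) +
         segmentIntegral(1, 1, -1, 1, steps) +
         segmentIntegral(-1, 1, -1, -1, steps);
}
```

&lt;p&gt;Each edge contributes about \(\pi/2\), so
&lt;code&gt;squareIntegral(1000)&lt;/code&gt; should come out close to
\(2 \pi\).&lt;/p&gt;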

&lt;p&gt;Of course, the easy explanation is that the discontinuity of \(L\)
and \(M\) at the origin violates a condition of Green's theorem.  But
that doesn't really tell us anything, so let's break down the theorem
and see where exactly it fails.&lt;/p&gt;

&lt;p&gt;Green's theorem is usually proved first for rectangles \([a, b]
\times [c, d]\), which suffices for our purpose.  If \(C\) is a curve
that goes counter-clockwise around such a rectangle \(D\), then we can
easily show that

\[
\oint_C L \,dx = - \iint_D \pd{L}{y} \,dx \,dy
\]

and

\[
\oint_C M \,dy = \iint_D \pd{M}{x} \,dx \,dy \text{,}
\]

with the sum of these two formulas proving the theorem.&lt;/p&gt;

&lt;p&gt;So the first sign of trouble is that the theorem freely
interchanges addition and integration.  Since the partial derivatives
of our functions diverge at the origin, if \(D\) contains the origin
then the integrals of those partial derivatives over \(D\) may not
even be defined, even if the integral of their difference is.&lt;/p&gt;

&lt;p&gt;But the problem arises even before that.  The statements above are
proved by showing

\[
\oint_C L \,dx = - \int_a^b \left( \int_c^d \pd{L}{y} \,dy \right) \,dx
\]

and

\[
\oint_C M \,dy = \int_c^d \left( \int_a^b \pd{M}{x} \,dx \right) \,dy
\text{,}
\]

both of which hold for our example.  But notice that in one case we
integrate with respect to \(y\) first, and in the other case we
integrate with respect to \(x\) first.  Therefore, we have to
interchange the order of integration or convert to a double integral
in order to get them to a form where we can add them.  And there's the
rub: if \(D\) contains the origin, switching the order of integration
for either integral above switches the sign of the result!&lt;/p&gt;

&lt;p&gt;This fully explains the discrepancy; since the result of both
integrals above (with the iteration order preserved) is \(\pi\),
adding them together as-is gives the expected result of \(2 \pi\).
But if we switch the iteration order of one of the iterated integrals
as done in the proof of Green's theorem, then we switch the result of
that integral to \(-\pi\), which cancels with the result of the other
unchanged integral to produce \(0\).&lt;/p&gt;

&lt;p&gt;So now let's examine this strange behavior of the sign of an
integration's result depending on the iteration order.  This leads us
to our next &amp;ldquo;counterexample,&amp;rdquo; this time
for &lt;a href=&quot;http://en.wikipedia.org/wiki/Fubini%27s_theorem&quot;&gt;Fubini's
theorem&lt;/a&gt;:&lt;/p&gt;

&lt;p class=&quot;example&quot;&gt;2. A function \(f \colon \RR^2 \to \RR\) whose
  iterated integrals over a rectangle \(D = [a, b] \times [c, d]
  \subset \RR^2\) differ.&lt;/p&gt;

&lt;p&gt;Let

\[
f(x, y) = \frac{y^2-x^2}{(x^2+y^2)^2}
\quad \text{ and } \quad
D = [-1, 1]^2\text{.}
\]

The two iterated integrals of \(f\) over \(D\) are usually written as

\[
\int_{-1}^1 \left( \int_{-1}^1 f(x, y) \,dy \right) \,dx
\qquad \text{ and } \qquad
\int_{-1}^1 \left( \int_{-1}^1 f(x, y) \,dx \right) \,dy
\]

but let's define them more carefully to make it easier to justify our
calculations.&lt;/p&gt;

&lt;p&gt;Let

\begin{align*}
u_k &amp;= y \mapsto f(k, y) \\
v_l &amp;= x \mapsto f(x, l) \text{.}
\end{align*}

In other words, given the real constants \(k\) and \(l\), construct
the (possibly partial) real functions \(u_k(y)\) and \(v_l(x)\) by
partially-evaluating \(f\) at \(x = k\) and \(y = l\),
respectively.&lt;/p&gt;

&lt;aside class=&quot;left&quot;&gt;
\(U(x)\) and \(V(y)\) are also (partial) real functions.
&lt;/aside&gt;

&lt;p&gt;Then, if we also let

\[
U(x) = \int_{-1}^1 u_x(y) \,dy
\qquad \text{ and } \qquad
V(y) = \int_{-1}^1 v_y(x) \,dx \text{,}
\]

we can write the iterated integrals as

\[
\int_{-1}^1 U(x) \,dx
\qquad \text{ and } \qquad
\int_{-1}^1 V(y) \,dy \text{.}
\]
&lt;/p&gt;

&lt;aside&gt;We're justified in applying standard integration techniques
here since \(u_k(y)\) for \(k \ne 0\) is defined and bounded for all
\(y\).&lt;/aside&gt;

&lt;p&gt;Computing \(U(x)\) for \(x \neq 0\), we get

\begin{align*}
  U(x) &amp;= \int_{-1}^1 \pd{}{y} \left( -\frac{y}{x^2+y^2} \right) \,dy \\
       &amp;= \left. -\frac{y}{x^2+y^2} \right|_{y=-1}^{y=1}              \\
       &amp;= -\frac{2}{x^2+1} \text{.}
\end{align*}
&lt;/p&gt;

&lt;p&gt;Attempting to evaluate \(U(0)\), we see that

\begin{align*}
  U(0) &amp;= \int_{-1}^1 \frac{y^2-0^2}{(0^2+y^2)^2} \,dy \\
       &amp;= \int_{-1}^1 \frac{dy}{y^2}
\end{align*}

which diverges.  So

\[
  U(x) = -\frac{2}{x^2+1} \text{ for } x \ne 0 \text{.}
\]
&lt;/p&gt;

&lt;aside&gt;Note that \(U(x)\) and \(V(y)\) differ only in variable name
and sign.&lt;/aside&gt;

&lt;p&gt;
By a similar computation, we find that

\[
  V(y) = \frac{2}{y^2+1} \text{ for } y \ne 0 \text{.}
\]
&lt;/p&gt;

&lt;p&gt;Since \(U(x)\) isn't defined at \(0\), we have to treat it as an
improper integral, although doing so poses no real difficulty:

\begin{align*}
  \int_{-1}^1 U(x)\,dx
    &amp;= \lim_{a \nearrow 0} \left( \int_{-1}^a -\frac{2}{x^2+1} \,dx \right) +
       \lim_{a \searrow 0} \left( \int_{a}^1 -\frac{2}{x^2+1} \,dx \right) \\
    &amp;= \lim_{a \nearrow 0}
         \Bigl( \left. -2 \arctan(x) \right|_{-1}^{a} \Bigr) +
       \lim_{a \searrow 0}
         \Bigl( \left. -2 \arctan(x) \right|_{a}^{1} \Bigr) \\
    &amp;= \left. -2 \arctan(x) \right|_{-1}^{0} +
       \left. -2 \arctan(x) \right|_{0}^{1} \\
    &amp;= \left. -2 \arctan(x) \right|_{-1}^{1} \\
    &amp;= -\pi \text{.}
\end{align*}
&lt;/p&gt;

&lt;p&gt;Similarly,

\[
  \int_{-1}^1 V(y)\,dy = \pi \text{,}
\]

so the iterated integrals of \(f(x, y)\) over \([-1, 1]^2\) differ; in
fact, as we claimed above, switching the iteration order switches the
sign of the result. &amp;#8718;&lt;/p&gt;
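
&lt;p&gt;The two outer integrals can also be checked numerically.  The
following is a minimal sketch in JavaScript, assuming the closed forms
of \(U(x)\) and \(V(y)\) derived above; the helper name
&lt;code&gt;midpoint&lt;/code&gt; is made up for illustration.  It only checks
the outer integration step, not the inner one.&lt;/p&gt;

```javascript
// Closed forms of the inner integrals, as derived above for nonzero
// arguments.
function U(x) { return -2 / (x * x + 1); }
function V(y) { return 2 / (y * y + 1); }

// Midpoint-rule approximation of the integral of g over [a, b].  With an
// even number of steps on [-1, 1], no sample point lands exactly on 0,
// the one point where the inner integrals diverge.
function midpoint(g, a, b, steps) {
  var h = (b - a) / steps;
  var sum = 0;
  for (var i = 0; i !== steps; ++i) {
    sum += g(a + (i + 0.5) * h);
  }
  return sum * h;
}

var dyThenDx = midpoint(U, -1, 1, 1000);  // integrate over y first
var dxThenDy = midpoint(V, -1, 1, 1000);  // integrate over x first
```

&lt;p&gt;Here &lt;code&gt;dyThenDx&lt;/code&gt; should be close to \(-\pi\) and
&lt;code&gt;dxThenDy&lt;/code&gt; close to \(\pi\).&lt;/p&gt;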

&lt;p&gt;We can repeat the above calculations for an arbitrary rectangle to
see that the iterated integrals of \(f(x, y)\) differ if \(D\)
contains the origin either as an interior point or a corner.  But
there's an easier way to prove that statement and also gain some
insight as to why \(f(x, y)\) has this strange property.&lt;/p&gt;

&lt;p&gt;Note that the key facts in the above calculations were that \(U(x)
\lt 0\) for any \(x \ne 0\) and \(V(y) \gt 0\) for any \(y \ne 0\).
Therefore, integrating \(U(x)\) over any interval on the \(x\)-axis
would produce a negative result and integrating \(V(y)\) over any
interval on the \(y\)-axis would produce a positive result, leading to
the difference in iterated integrals.  This holds more generally; for
any \(m, n \gt 0\):

\[
\int_{-n}^n f(x, y) \,dy &lt; 0
\qquad \text{ and } \qquad
\int_{-m}^m f(x, y) \,dx &gt; 0 \text{.}
\]

Therefore,

\[
\int_{-m}^m \left( \int_{-n}^n f(x, y) \,dy \right) \,dx &lt; 0
\qquad \text{ and } \qquad
\int_{-n}^n \left( \int_{-m}^m f(x, y) \,dx \right) \,dy &gt; 0
\]

so the iterated integrals of \(f(x, y)\) differ over the rectangles
\([-m, m] \times [-n, n]\).  Since any rectangle \(D\) containing the
origin as an interior point must contain some smaller rectangle \(E =
[-m, m] \times [-n, n]\), the iterated integrals of \(f(x, y)\) over
\(E\) differ and therefore must also differ over \(D\).&lt;/p&gt;

&lt;p&gt;Furthermore, since \(f(x, y)\) is even in both \(x\) and \(y\), you
can carry out a similar argument to the above with intervals of the
form \([0, m]\) or \([-m, 0]\) to show that the iterated integrals of
\(f(x, y)\) must also differ over any rectangle with the origin as a
corner.
&lt;/p&gt;

&lt;p&gt;So the essential property of \(f(x, y)\) is that slicing it along
the \(x\)-axis gives a function which has positive area under the
curve on any interval symmetric around \(0\) or with \(0\) as an
endpoint, and that slicing it similarly along the \(y\)-axis gives a
function which has negative area.  Therefore, on a rectangle symmetric
around the origin or with the origin as a corner, one can choose the
sign of the iterated integral by choosing which axis to slice
first.&lt;/p&gt;

&lt;p&gt;The next thing to investigate is how exactly the iterated integrals
of \(f(x, y)\) over the rectangle \(D\) are expressed such that they
differ only when \(D\) contains the origin, especially considering
that \(f(x, y)\) is expressed in quite a simple form.  To do that,
let's consider the simple case of a rectangle \(D = [\delta, 1] \times
[\epsilon, 1]\) where we can vary \(\delta\) and \(\epsilon\) at
will.&lt;/p&gt;

&lt;p&gt;Let

\begin{align*}
I_{yx}(\delta, \epsilon) &amp;=
  \int_{\delta}^1 \left( \int_{\epsilon}^1 f(x, y) \,dy \right) \,dx \\
I_{xy}(\delta, \epsilon) &amp;=
  \int_{\epsilon}^1 \left( \int_{\delta}^1 f(x, y) \,dx \right) \,dy
\text{.}
\end{align*}

Then, for \(\epsilon \neq 0\):

\begin{align*}
I_{yx}(\delta, \epsilon) &amp;=
  \int_{\delta}^1 \left( \int_{\epsilon}^1
    \frac{y^2-x^2}{(x^2+y^2)^2} \,dy \right) \,dx \\
  &amp;= \int_{\delta}^1 \left(
       \left. -\frac{y}{x^2+y^2} \right|_{y=\epsilon}^{y=1} \right) \,dx \\
  &amp;= \int_{\delta}^1 \Biggl(
       -\frac{1}{1+x^2} -
       \left( -\frac{\epsilon}{\epsilon^2+x^2} \right) \Biggr) \,dx \\
  &amp;= \int_{\delta}^1 \frac{dx/\epsilon}{1+(x/\epsilon)^2} -
     \int_{\delta}^1 \frac{dx}{1+x^2} \\
  &amp;= \arctan\left(\frac{1}{\epsilon}\right) -
     \arctan\left(\frac{\delta}{\epsilon}\right) -
     \frac{\pi}{4} + \arctan(\delta) \text{,}
\end{align*}

and for \(\epsilon = 0\):

\[
I_{yx}(\delta, 0) = -\frac{\pi}{4} + \arctan(\delta) \text{.}
\]

Similarly, for \(\delta \neq 0\):

\begin{align*}
I_{xy}(\delta, \epsilon) &amp;=
  \int_{\epsilon}^1 \left( \int_{\delta}^1
    \frac{y^2-x^2}{(x^2+y^2)^2} \,dx \right) \,dy \\
  &amp;= \int_{\epsilon}^1 \left(
       \left. \frac{x}{x^2+y^2} \right|_{x=\delta}^{x=1} \right) \,dy \\
  &amp;= \int_{\epsilon}^1 \left(
       \frac{1}{1+y^2} - \frac{\delta}{\delta^2+y^2} \right) \,dy \\
  &amp;= \int_{\epsilon}^1 \frac{dy}{1+y^2} -
     \int_{\epsilon}^1 \frac{dy/\delta}{1+(y/\delta)^2} \\
  &amp;= \frac{\pi}{4} - \arctan(\epsilon) -
     \arctan\left(\frac{1}{\delta}\right) +
     \arctan\left(\frac{\epsilon}{\delta}\right) \text{,}
\end{align*}

and for \(\delta = 0\):

\[
I_{xy}(0, \epsilon) = \frac{\pi}{4} - \arctan(\epsilon) \text{.}
\]


Then let \(\Delta = I_{xy} - I_{yx}\) be the difference between the
two iterated integrals.  We can use the identity

\[
\arctan(x) + \arctan\left(\frac{1}{x}\right) = \frac{\pi}{2} \sgn{x}
\]

to simplify \(\Delta(\delta, \epsilon)\) if neither \(\delta\) nor
\(\epsilon\) is zero:

\begin{align*}
\Delta(\delta, \epsilon)
  &amp;= \bigl( \pi/4 - \arctan(\epsilon) - \arctan(1/\delta)
     + \arctan(\epsilon/\delta) \bigr) \\
  &amp;  \quad \mathbin{-}
     \bigl( \arctan(1/\epsilon) - \arctan(\delta/\epsilon)
     - \pi/4 + \arctan(\delta) \bigr) \\
  &amp;= \pi/2 - \bigl( \arctan(\epsilon) + \arctan(1/\epsilon) \bigr) \\
  &amp;  \quad \mathbin{-} \bigl( \arctan(\delta) + \arctan(1/\delta) \bigr) \\
  &amp;  \quad \mathbin{+}
       \bigl( \arctan(\delta/\epsilon) + \arctan(\epsilon/\delta) \bigr) \\
  &amp;= \frac{\pi}{2} \bigl( 1 - \sgn(\epsilon) - \sgn(\delta)
     + \sgn(\delta/\epsilon) \bigr) \text{.}
\end{align*}
&lt;/p&gt;

&lt;p&gt;
Using the properties of \(\sgn(x)\), we can simplify this to the final
expression:

\[
  \Delta(\delta, \epsilon) =
    \frac{\pi}{2}
      \bigl( 1 - \sgn(\delta) \bigr) \bigl( 1 - \sgn(\epsilon) \bigr)
\]

which we can prove still holds if either \(\delta\) or \(\epsilon\) is
zero (or both).&lt;/p&gt;
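
&lt;p&gt;To spell out that last step (a sketch, assuming \(\delta\) and
\(\epsilon\) are both nonzero): since \(\sgn(\delta/\epsilon) =
\sgn(\delta) \sgn(\epsilon)\), the bracketed expression factors as

\begin{align*}
1 - \sgn(\epsilon) - \sgn(\delta) + \sgn(\delta/\epsilon)
  &amp;= 1 - \sgn(\epsilon) - \sgn(\delta) + \sgn(\delta) \sgn(\epsilon) \\
  &amp;= \bigl( 1 - \sgn(\delta) \bigr) \bigl( 1 - \sgn(\epsilon) \bigr)
     \text{.}
\end{align*}
&lt;/p&gt;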

&lt;p&gt;So with the simplified expression for \(\Delta(\delta, \epsilon)\),
it becomes apparent how \(\sgn(x)\) is used to control the value of
\(\Delta(\delta, \epsilon)\); as long as either \(\delta\) or
\(\epsilon\) is positive, \(1 - \sgn(x)\) zeroes out the entire
expression.&lt;/p&gt;
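
&lt;p&gt;The simplified expression can be compared against the arctangent
formulas for \(I_{xy}\) and \(I_{yx}\) directly.  Below is a minimal
sketch in JavaScript, valid only when \(\delta\) and \(\epsilon\) are
both nonzero; the function names are made up for illustration.&lt;/p&gt;

```javascript
// The arctangent formulas derived above, for nonzero delta and eps.
function Ixy(delta, eps) {
  return Math.PI / 4 - Math.atan(eps) -
         Math.atan(1 / delta) + Math.atan(eps / delta);
}

function Iyx(delta, eps) {
  return Math.atan(1 / eps) - Math.atan(delta / eps) -
         Math.PI / 4 + Math.atan(delta);
}

// The simplified expression for the difference of the iterated integrals.
function deltaClosedForm(delta, eps) {
  return (Math.PI / 2) * (1 - Math.sign(delta)) * (1 - Math.sign(eps));
}

// Difference computed directly from the two arctangent formulas.
function deltaDirect(delta, eps) {
  return Ixy(delta, eps) - Iyx(delta, eps);
}
```

&lt;p&gt;For instance, both &lt;code&gt;deltaDirect(-0.5, -0.25)&lt;/code&gt; and
&lt;code&gt;deltaClosedForm(-0.5, -0.25)&lt;/code&gt; should be close to
\(2 \pi\), while flipping the sign of either argument should give a
value close to \(0\).&lt;/p&gt;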
</content>
  </entry>
  
  <entry>
    <id>http://www.akalin.cx/evlis-tail-recursion</id>
    <link type="text/html" rel="alternate" href="http://www.akalin.cx/evlis-tail-recursion"/>
    <title>Understanding Evlis Tail Recursion</title>
    <updated>2011-10-28T00:00:00-07:00</updated>
    <author>
  <name>Fred Akalin</name>
  <uri>http://www.akalin.cx/</uri>
</author>
<rights>© Fred Akalin
2005–2011.
All rights reserved.</rights>

    <content type="html">&lt;p&gt;While reading
about &lt;a href=&quot;http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-6.html#%_sec_3.5&quot;&gt;proper
tail recursion&lt;/a&gt; in Scheme, I encountered a similar but obscure
optimization called &lt;em&gt;evlis tail recursion&lt;/em&gt;.
In &lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.8567&amp;rep=rep1&amp;type=pdf&quot;&gt;the
paper where it was first described&lt;/a&gt;, the author claims it
can &quot;dramatically improve the space performance of many programs,&quot; which
sounded promising.&lt;/p&gt;

&lt;p&gt;However, the few places where it's mentioned don't do much more than
state its definition and claim its usefulness.  Hopefully I can
provide a more detailed analysis here.&lt;/p&gt;

&lt;p&gt;Consider the straightforward factorial implementation in
Scheme:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-lisp&quot;&gt;(define (fact n) (if (&amp;lt;= n 1) 1 (* n (fact (- n 1)))))&lt;/code&gt;&lt;/pre&gt;

&lt;aside&gt;Assume a left-to-right evaluation order for now.&lt;/aside&gt;

&lt;p&gt;It is not tail-recursive, since the recursive call is nested in
another procedure call.  However, it's &lt;em&gt;almost&lt;/em&gt; tail-recursive;
the call to &lt;code&gt;*&lt;/code&gt; is a tail call, and the recursive call is
its last subexpression, so it will be the last subexpression to be
evaluated.&lt;/p&gt;

&lt;aside&gt;The function that takes a list of expressions, evaluates them,
and returns the results as a list is traditionally
called &lt;code&gt;evlis&lt;/code&gt;, hence the name of the optimization.&lt;/aside&gt;

&lt;p&gt;Recall what happens when a procedure call (represented as a list of
subexpressions) is evaluated: each subexpression is evaluated, and the
first result (the procedure) is passed the other results as
arguments.&lt;/p&gt;

&lt;aside class=&quot;left&quot;&gt;This assumes that the calling environment isn't
stored somewhere else.&lt;/aside&gt;

&lt;p&gt;Evlis tail recursion can be described as follows: when performing a
procedure call and during the evaluation of the last subexpression,
the calling environment is discarded as soon as it is not
required. The distinction between evlis tail recursion and
proper tail recursion is subtle.  Proper tail recursion requires only
that the calling environment be discarded before the actual procedure
call; evlis tail recursion discards the calling environment even
sooner, if possible.&lt;/p&gt;

&lt;p&gt;An example will help to clarify things.  Given &lt;code&gt;fact&lt;/code&gt; as
defined above, say you evaluate &lt;code&gt;(fact 10)&lt;/code&gt; and you're in
the procedure call with &lt;code&gt;n = 5&lt;/code&gt;.  The call stack of a
properly tail-recursive interpreter would look like this:&lt;/p&gt;

&lt;pre&gt;
evalExpr
--------
env = { n: 10 } -&amp;gt; &amp;lt;top-level environment&amp;gt;
expr = '(* n (fact (- n 1)))'
proc = &amp;lt;native function: *&amp;gt;
args = [10, &amp;lt;pending evalExpr('(fact (- n 1))', env)&amp;gt;]

evalExpr
--------
env = { n: 9 } -&amp;gt; &amp;lt;top-level environment&amp;gt;
expr = '(* n (fact (- n 1)))'
proc = &amp;lt;native function: *&amp;gt;
args = [9, &amp;lt;pending evalExpr('(fact (- n 1))', env)&amp;gt;]

...

evalExpr
--------
env = { n: 6 } -&amp;gt; &amp;lt;top-level environment&amp;gt;
expr = '(* n (fact (- n 1)))'
proc = &amp;lt;native function: *&amp;gt;
args = [6, &amp;lt;pending evalExpr('(fact (- n 1))', env)&amp;gt;]

evalExpr
--------
env = { n: 5 } -&amp;gt; &amp;lt;top-level environment&amp;gt;
expr = '(if ...)'
&lt;/pre&gt;

&lt;p&gt;whereas the call stack of an evlis tail-recursive interpreter would
look like this:&lt;/p&gt;

&lt;pre&gt;
evalExpr
--------
env = { n: 5 } -&amp;gt; &amp;lt;top-level environment&amp;gt;
pendingProcedureCalls = [
  [&amp;lt;native function: *&amp;gt;, 10],
  [&amp;lt;native function: *&amp;gt;, 9],
  ...
  [&amp;lt;native function: *&amp;gt;, 6]
]
expr = '(if ...)'
&lt;/pre&gt;

&lt;p&gt;In this implementation, the last subexpression of a procedure call
is evaluated exactly like a tail expression, but the procedure call
and non-last subexpressions are pushed onto a stack.  Whenever an
expression is reduced to a simple one and the stack is non-empty, a
pending procedure call with its other args is popped off, and it is
called with the reduced expression as the final argument.&lt;/p&gt;

&lt;p&gt;Note that this didn't change the asymptotic behavior of the
procedure; it still takes $\Theta(n)$ memory to evaluate.  However,
only the bare minimum of information is saved: the list of pending
functions and their arguments.  Other auxiliary variables, and
crucially the nested calling environments, aren't preserved, leading
to a significant constant-factor reduction in memory.&lt;/p&gt;

&lt;p&gt;This raises the question: Are there cases where evlis tail
recursion leads to better asymptotic behavior?  In fact, yes; consider
the following (contrived) implementation of factorial:&lt;/p&gt;

&lt;aside&gt;This was adapted from an example
in &lt;a href=&quot;ftp://ftp.ccs.neu.edu/pub/people/will/tail.pdf&quot;&gt;Proper
Tail Recursion and Space Efficiency&lt;/a&gt;.&lt;/aside&gt;

&lt;pre&gt;&lt;code class=&quot;language-lisp&quot;&gt;(define (fact2 n)
  (define v (make-vector n))
  (if (&amp;lt;= n 1) 1 (* n (fact2 (- n 1)))))&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Before the main body of the function, a vector of size $n$ is
defined.  This means that the environments in the call stack of a
properly tail-recursive interpreter would look like this:&lt;/p&gt;

&lt;aside&gt;Assume that the interpreter isn't smart enough to deduce that $v$
can be optimized out since it's never used.&lt;/aside&gt;

&lt;pre&gt;
env = { n: 10, v: &amp;lt;vector of size 10&amp;gt; } -&amp;gt; &amp;lt;top-level environment&amp;gt;
env = { n: 9, v: &amp;lt;vector of size 9&amp;gt; } -&amp;gt; &amp;lt;top-level environment&amp;gt;
env = { n: 8, v: &amp;lt;vector of size 8&amp;gt; } -&amp;gt; &amp;lt;top-level environment&amp;gt;
env = { n: 7, v: &amp;lt;vector of size 7&amp;gt; } -&amp;gt; &amp;lt;top-level environment&amp;gt;
...
&lt;/pre&gt;

&lt;p&gt;whereas an evlis tail-recursive interpreter would keep around
only the current environment.  Therefore, the properly tail-recursive
interpreter would require $\Theta(n^2)$ memory to
evaluate &lt;code&gt;(fact2 n)&lt;/code&gt; while the evlis tail-recursive
interpreter would require only $\Theta(n)$.&lt;/p&gt;
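
&lt;p&gt;To make the count explicit (a back-of-the-envelope sketch,
assuming a vector of size $k$ occupies space proportional to $k$ and
each pending procedure call occupies constant space): the properly
tail-recursive interpreter keeps the vectors of sizes $n, n-1, \dotsc,
1$ alive at the same time, for a total of

\[
\sum_{k=1}^n k = \frac{n(n+1)}{2} = \Theta(n^2) \text{,}
\]

whereas the evlis tail-recursive interpreter holds only the current
vector, of size at most $n$, plus at most $n$ pending procedure calls,
for a total of $\Theta(n)$.&lt;/p&gt;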

&lt;p&gt;Studying examples like the one above enabled me to finally
understand how evlis tail recursion works and what sort of savings it
gives.  However, I have yet to find a practical example where evlis
tail recursion gives the same sort of asymptotic gains as described
above, and I'd be interested to receive some.  But perhaps the &quot;large
gains&quot; mentioned in the various papers describing it are only
constant-factor reductions in memory.&lt;/p&gt;

&lt;p&gt;In any case, another important difference in Scheme between proper
tail recursion and evlis tail recursion is that the former is
a &lt;em&gt;language feature&lt;/em&gt; and the latter is
an &lt;em&gt;optimization&lt;/em&gt;.  That means that it is acceptable and even
encouraged to write Scheme programs that take advantage of proper tail
recursion, but it would be unwise to rely on evlis tail recursion for
the asymptotic performance of your function.  Instead, one should
treat it just as a nice constant-factor memory saving.&lt;/p&gt;

&lt;p&gt;Note that it is easy to make evlis tail recursion &quot;smarter.&quot;  Since
Scheme doesn't specify the order of argument evaluation, an
interpreter could evaluate arguments to maximize the gains from evlis
tail recursion.  As an easy example, if we had switched the arguments
to &lt;code&gt;*&lt;/code&gt; in &lt;code&gt;fact&lt;/code&gt; above, making it
non-evlis-tail-recursive, a smart compiler could still treat it as
such.  A possible rule of thumb would be to pick a non-trivial
function call to evaluate last.&lt;/p&gt;
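
&lt;p&gt;That rule of thumb could be sketched as follows.  This is a
hypothetical helper in JavaScript, not part of any interpreter
described here; it assumes a procedure call is represented as a nested
array &lt;code&gt;[proc, arg1, ..., argN]&lt;/code&gt;, and it treats every
nested array as a non-trivial call, which ignores quoted lists and
other special forms.&lt;/p&gt;

```javascript
// Hypothetical helper: given a procedure call expr = [proc, arg1, ..., argN]
// represented as a nested array, return the index of the subexpression
// that should be evaluated last.
function pickLastSubExprIndex(expr) {
  // Prefer the rightmost compound argument (a nested array), since only
  // a compound expression benefits from being evaluated as a tail call.
  for (var i = expr.length - 1; i > 0; --i) {
    if (Array.isArray(expr[i])) {
      return i;
    }
  }
  // No compound argument: fall back to the textually last subexpression.
  return expr.length - 1;
}
```

&lt;p&gt;For &lt;code&gt;(* (fact (- n 1)) n)&lt;/code&gt; this picks the recursive
call at index 1 rather than &lt;code&gt;n&lt;/code&gt;.  An interpreter using it
would still have to pass the evaluated arguments to the procedure in
their original positions.&lt;/p&gt;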

&lt;p&gt;To complete the picture, I will outline below the evaluation
function for a simple evlis tail-recursive Scheme interpreter in
JavaScript.  All of the sources I've found describe it in terms of
compilers, so I think it'll be useful to have a reference
implementation for an interpreter.&lt;/p&gt;

&lt;p&gt;Let's say we already have a properly tail-recursive
interpreter:&lt;/p&gt;

&lt;aside&gt;Adapted from Peter Norvig's
excellent &lt;a href=&quot;http://norvig.com/lispy.html&quot;&gt;&lt;code&gt;lis.py&lt;/code&gt;&lt;/a&gt;.&lt;/aside&gt;

&lt;pre&gt;&lt;code class=&quot;lang-javascript&quot;&gt;function evalExpr(expr, env) {
  // Fake tail calls with a while loop and continue.
  while (true) {
    // Symbols, constants, quoted expressions, and lambdas.
    if (isSimpleExpr(expr)) {
      // The only exit point.
      return evalSimpleExpr(expr, env);
    }
    // (if test conseq alt)
    if (isSpecialForm(expr, 'if')) {
      expr = evalExpr(expr[1], env) ? expr[2] : expr[3];
      continue;
    }
    // (set! var expr)
    if (isSpecialForm(expr, 'set!')) {
      env.set(expr[1], evalExpr(expr[2], env));
      expr = null;
      continue;
    }
    // (define var expr?)
    if (isSpecialForm(expr, 'define')) {
      env.define(expr[1], evalExpr(expr[2], env));
      expr = null;
      continue;
    }
    // (begin expr*)
    if (isSpecialForm(expr, 'begin')) {
      if (expr.length == 1) {
        expr = null;
        continue;
      }
      // Evaluate all but the last subexpression.
      for (var i = 1; i &amp;lt; expr.length - 1; ++i) {
        evalExpr(expr[i], env);
      }
      expr = expr[expr.length - 1];
      continue;
    }
    // (proc expr*)
    var proc = evalExpr(expr.shift(), env);
    var args = expr.map(function(subExpr) { return evalExpr(subExpr, env); });
    // proc.run() returns its body in result.expr and the environment
    // in which to evaluate it (with all its arguments bound) in
    // result.env.
    var result = proc.run(args);
    expr = result.expr;
    // The only time when env is changed.
    env = result.env;
    continue;
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then implementing evlis tail recursion requires only a few
changes:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;lang-javascript&quot;&gt;function evalExpr(expr, env) {
  // This is a stack of procedures and their non-final arguments that
  // are waiting for their final argument to be evaluated.
  var pendingProcedureCalls = [];
  while (true) {
    if (isSimpleExpr(expr)) {
      expr = evalSimpleExpr(expr, env);
      // Discard calling environment.
      env = null;
      if (pendingProcedureCalls.length == 0) {
        // No pending procedure calls, so we're done (the only exit
        // point).
        return expr;
      }
      var args = pendingProcedureCalls.pop();
      var proc = args.shift();
      args.push(expr);
      var result = proc.run(args);
      expr = result.expr;
      // Change to new environment (the only time when env is
      // changed).
      env = result.env;
      continue;
    }
    ...
    // Everything else remains the same.
    ...
    // (proc expr*)
    var nonFinalSubExprs =
      expr.slice(0, -1).map(
        function(subExpr) { return evalExpr(subExpr, env); });
    pendingProcedureCalls.push(nonFinalSubExprs);
    // Evaluate the last subexpression as a tail call.
    expr = expr[expr.length - 1];
    continue;
  }
}&lt;/code&gt;&lt;/pre&gt;
</content>
  </entry>
  
  <entry>
    <id>http://www.akalin.cx/elementary-gaussian-proof</id>
    <link type="text/html" rel="alternate" href="http://www.akalin.cx/elementary-gaussian-proof"/>
    <title>An Elementary Way to Calculate the Gaussian Integral</title>
    <updated>2011-01-06T00:00:00-08:00</updated>
    <author>
  <name>Fred Akalin</name>
  <uri>http://www.akalin.cx/</uri>
</author>
<rights>© Fred Akalin
2005–2011.
All rights reserved.</rights>

    <content type="html">&lt;p&gt;
While reading &lt;a href=&quot;http://gowers.wordpress.com&quot;&gt;Timothy Gowers's blog&lt;/a&gt; I stumbled on
&lt;a href=&quot;http://gowers.wordpress.com/2007/10/04/when-are-two-proofs-essentially-the-same/#comment-239&quot;&gt;Scott Carnahan's comment&lt;/a&gt;
describing an elegant calculation of the Gaussian integral
\[
\int_{-\infty}^{\infty} e^{-x^2} \, dx = \sqrt{\pi}\text{.}
\]
I was so struck by its elementary character that I imagined what it
would be like written up, say, as an extra credit exercise in a
single-variable calculus class:
&lt;/p&gt;

&lt;p class=&quot;exercise&quot;&gt;
  &lt;span class=&quot;exercise&quot;&gt;Exercise 1.&lt;/span&gt;
  (&lt;span class=&quot;exercise-name&quot;&gt;The Gaussian integral&lt;/span&gt;.)  Let
  \[
  F(t) = \int_0^t e^{-x^2} \, dx
  \text{, }\qquad
  G(t) = \int_0^1 \frac{e^{-t^2 (1+x^2)}}{1+x^2} \, dx
  \text{,}
  \]
  and $H(t) = F(t)^2 + G(t)$.

  &lt;ol class=&quot;exercise-list&quot;&gt;
  &lt;li&gt;Calculate $H(0)$.&lt;/li&gt;

  &lt;li&gt;Calculate and simplify $H'(t)$.  What does this
    imply about $H(t)$?&lt;/li&gt;

  &lt;li&gt;Use part&amp;nbsp;b to calculate $F(\infty) =
    \displaystyle\lim_{t \to \infty} F(t)$.&lt;/li&gt;

  &lt;li&gt;Use part&amp;nbsp;c to calculate
    \[
    \int_{-\infty}^{\infty} e^{-x^2} \, dx\text{.}
    \]&lt;/li&gt;
  &lt;/ol&gt;
&lt;/p&gt;

&lt;aside&gt;Similar to proving $\sum\limits_{i=0}^n i^3 =
  \frac{n^2(n+1)^2}{4}$ by induction.&lt;/aside&gt;

&lt;p&gt;
Although this is simpler than
&lt;a href=&quot;http://en.wikipedia.org/wiki/Gaussian_integral#Careful_proof&quot;&gt;the
  usual calculation of the Gaussian integral&lt;/a&gt;, for which careful
reasoning is needed to justify the use of polar coordinates, it seems
more like a
&lt;a href=&quot;http://en.wikipedia.org/wiki/Certificate_(complexity)&quot;&gt;certificate&lt;/a&gt;
than an actual
proof; you can convince yourself that the calculation is valid, but
you gain no insight into the reasoning that led up to it.
&lt;/p&gt;
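
&lt;p&gt;
One quick way to convince yourself is numerically.  The following is a
minimal sketch in JavaScript, using a midpoint rule with an arbitrarily
chosen 2000 steps; the helper name &lt;code&gt;midpoint&lt;/code&gt; is made up
for illustration.  It evaluates $H(t)$ from Exercise&amp;nbsp;1 so that
its values at different $t$ can be compared.
&lt;/p&gt;

```javascript
// Midpoint-rule approximation of the integral of g over [a, b].
function midpoint(g, a, b, steps) {
  var h = (b - a) / steps;
  var sum = 0;
  for (var i = 0; i !== steps; ++i) {
    sum += g(a + (i + 0.5) * h);
  }
  return sum * h;
}

// F(t) = integral of exp(-x^2) over [0, t].
function F(t) {
  return midpoint(function (x) { return Math.exp(-x * x); }, 0, t, 2000);
}

// G(t) = integral of exp(-t^2 (1 + x^2)) / (1 + x^2) over [0, 1].
function G(t) {
  return midpoint(function (x) {
    var w = 1 + x * x;
    return Math.exp(-t * t * w) / w;
  }, 0, 1, 2000);
}

// H(t) = F(t)^2 + G(t).
function H(t) {
  var f = F(t);
  return f * f + G(t);
}
```

&lt;p&gt;
Evaluating &lt;code&gt;H&lt;/code&gt; at, say, $t = 0$, $0.5$, $1$, and $3$
should give values that all agree closely with $\pi/4$.
&lt;/p&gt;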

&lt;p&gt;
Fortunately, &lt;a href=&quot;http://gowers.wordpress.com/2007/10/04/when-are-two-proofs-essentially-the-same/#comment-243&quot;&gt;David Speyer's
  comment&lt;/a&gt; solves the mystery; $G(t)$ falls out of doing the
integration in Cartesian coordinates over a triangular region.  Just
for kicks, here's what I imagine an exercise based on this method would
look like (this time for a multi-variable calculus class):
&lt;/p&gt;

&lt;p class=&quot;exercise&quot;&gt;
  &lt;span class=&quot;exercise&quot;&gt;Exercise 2.&lt;/span&gt;
  (&lt;span class=&quot;exercise-name&quot;&gt;The Gaussian integral in Cartesian coordinates.&lt;/span&gt;) Let
  \[
  A(t) = \iint\limits_{\triangle_t} e^{-(x^2+y^2)} \, dx \, dy
  \]
  where $\triangle_t$ is the triangle with vertices $(0, 0)$, $(t,
  0)$, and $(t, t)$.
  &lt;!-- TODO(akalin): Draw a diagram for \triangle_t. --&gt;

  &lt;ol class=&quot;exercise-list&quot;&gt;
  &lt;li&gt;Use the substitution $y = sx$ to reduce $A(t)$ to a
    one-dimensional integral.&lt;/li&gt;

  &lt;li&gt;Use part&amp;nbsp;a to calculate $A(\infty) =
    \lim_{t \to \infty} A(t)$.&lt;/li&gt;

  &lt;li&gt;Use part&amp;nbsp;b to calculate
    \[
    \int_{-\infty}^{\infty} e^{-x^2} \, dx\text{.}
    \]&lt;/li&gt;

  &lt;li&gt;Let
    \[
    F(t) = \int_0^t e^{-x^2} \, dx
    \qquad\text{ and }\qquad
    G(t) = \int_0^1 \frac{e^{-t^2 (1+x^2)}}{1+x^2} \, dx
    \text{.}
    \]
    Use part&amp;nbsp;a to relate $F(t)$ to $G(t)$.&lt;/li&gt;

  &lt;li&gt;Use part&amp;nbsp;d to derive a proof of part&amp;nbsp;c
    using only single-variable calculus.&lt;/li&gt;
  &lt;/ol&gt;
&lt;/p&gt;
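
&lt;p&gt;
One way to carry out parts&amp;nbsp;a and&amp;nbsp;d (a sketch, not
necessarily the intended solution): describe $\triangle_t$ as $0 \le y
\le x \le t$ and substitute $y = sx$, $dy = x \, ds$, so that
\begin{align*}
A(t) &amp;= \int_0^t \int_0^x e^{-(x^2+y^2)} \, dy \, dx \\
     &amp;= \int_0^1 \int_0^t x e^{-x^2 (1+s^2)} \, dx \, ds \\
     &amp;= \int_0^1 \frac{1 - e^{-t^2 (1+s^2)}}{2 (1+s^2)} \, ds \\
     &amp;= \frac{\pi}{8} - \frac{G(t)}{2} \text{.}
\end{align*}
Since the square $[0, t]^2$ splits into two mirror-image copies of
$\triangle_t$ on which the integrand takes the same values, $F(t)^2 = 2
A(t)$, and therefore $F(t)^2 + G(t) = \pi/4$, which is exactly the
function $H(t)$ of Exercise&amp;nbsp;1.
&lt;/p&gt;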

</content>
  </entry>
  
  <entry>
    <id>http://www.akalin.cx/longest-palindrome-linear-time</id>
    <link type="text/html" rel="alternate" href="http://www.akalin.cx/longest-palindrome-linear-time"/>
    <title>Finding the Longest Palindromic Substring in Linear Time</title>
    <updated>2007-11-28T00:00:00-08:00</updated>
    <author>
  <name>Fred Akalin</name>
  <uri>http://www.akalin.cx/</uri>
</author>
<rights>© Fred Akalin
2005–2011.
All rights reserved.</rights>

    <content type="html">&lt;style type=&quot;text/css&quot; media=&quot;all&quot;&gt;
/*&lt;![CDATA[*/
span.palind {
  color: red;
}
/*]]&gt;*/
&lt;/style&gt;

&lt;p&gt;Another &lt;a
href=&quot;http://programming.reddit.com/info/2dykz/comments/c2e7r0&quot;&gt;interesting
problem&lt;/a&gt; I stumbled across on reddit is finding the longest
substring of a given string that is a palindrome.  I found &lt;a
href=&quot;http://johanjeuring.blogspot.com/2007/08/finding-palindromes.html&quot;&gt;the
explanation on Johan Jeuring's blog&lt;/a&gt; somewhat confusing and I had
to spend some time poring over the Haskell code (eventually rewriting
it in Python) and walking through examples before it &quot;clicked.&quot;  I
haven't found any other explanations of the same approach so hopefully
my explanation below will help the next person who is curious about
this problem.&lt;/p&gt;

&lt;p&gt;Of course, the most naive solution would be to exhaustively examine
all $n \choose 2$ substrings of the given $n$-length string, test
whether each one is a palindrome, and keep track of the longest one seen so
far.  This has complexity $O(n^3)$, but we can easily do better by
realizing that a palindrome is centered on either a letter (for
odd-length palindromes) or a space between letters (for even-length
palindromes).  Therefore we can examine all $2n + 1$ possible centers
and find the longest palindrome for that center, keeping track of the
overall longest palindrome.  This has complexity $O(n^2)$.&lt;/p&gt;
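&lt;p&gt;For reference, the exhaustive approach can be sketched in a few
lines of Python 3.  This is only an illustration of the cubic
baseline; the name &lt;code&gt;longest_palindrome_bruteforce&lt;/code&gt; is
made up for this example.&lt;/p&gt;

```python
def longest_palindrome_bruteforce(seq):
    """Return the longest palindromic substring of seq in O(n^3) time."""
    best = seq[:0]  # empty slice of the same type as seq
    for start in range(len(seq)):
        for end in range(start + 1, len(seq) + 1):
            candidate = seq[start:end]
            # Reversing and comparing costs O(n) per candidate substring.
            if len(candidate) > len(best) and candidate == candidate[::-1]:
                best = candidate
    return best


print(longest_palindrome_bruteforce("abababa"))  # abababa
print(longest_palindrome_bruteforce("banana"))   # anana
```

&lt;p&gt;Ties are resolved in favor of the leftmost substring, which is an
arbitrary choice of this sketch.&lt;/p&gt;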

&lt;p&gt;It is not immediately clear that we can do better but if we're told
that a $\Theta(n)$ algorithm exists we can infer that the algorithm
is most likely structured as an iteration through all possible
centers.  As an off-the-cuff first attempt, we can adapt the above
algorithm by keeping track of the current center and expanding until
we find the longest palindrome around that center, at which point we
then consider the last letter (or space) of that palindrome as the new
center.  The algorithm (which isn't correct) looks like this
informally:&lt;/p&gt;

&lt;ol type=&quot;1&quot;&gt;
  &lt;li&gt;Set the current center to the first letter.&lt;/li&gt;
  &lt;li&gt;Loop while the current center is valid:
    &lt;ol type=&quot;a&quot;&gt;
      &lt;li&gt;Expand to the left and right simultaneously until we find
	the largest palindrome around this center.&lt;/li&gt;
      &lt;li&gt;If the current palindrome is bigger than the stored maximum
	one, store the current one as the maximum one.&lt;/li&gt;
      &lt;li&gt;Set the space following the current palindrome as the
	current center unless the two letters immediately surrounding
	it are different, in which case set the last letter of the
	current palindrome as the current center.&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;Return the stored maximum palindrome.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This seems to work but it doesn't handle all cases: consider the
string &quot;abababa&quot;.  The first non-trivial palindrome we see is &quot;&lt;span
class=&quot;palind&quot;&gt;a&lt;/span&gt;|bababa&quot;, followed by &quot;&lt;span
class=&quot;palind&quot;&gt;aba&lt;/span&gt;|baba&quot;.  Considering the current space as the
center doesn't get us anywhere but considering the preceding letter
(the second 'a') as the center, we can expand to get &quot;&lt;span
class=&quot;palind&quot;&gt;ababa&lt;/span&gt;|ba&quot;.  From this state, considering the
current space again doesn't get us anywhere but considering the preceding
letter as the center, we can expand to get &quot;ab&lt;span
class=&quot;palind&quot;&gt;ababa&lt;/span&gt;|&quot;.  However, this is incorrect as the
longest palindrome is actually the entire string!  We can remedy this
case by changing the algorithm to try and set the new center to be one
before the end of the last palindrome, but it is clear that having a
fixed &quot;lookbehind&quot; doesn't solve the general case and anything more
than that will probably bump us back up to quadratic time.&lt;/p&gt;

&lt;p&gt;The key question is this: given the state from the example above,
&quot;&lt;span class=&quot;palind&quot;&gt;ababa&lt;/span&gt;|ba&quot;, what makes the second 'b' so
special that it should be the new center?  To use another example, in
&quot;&lt;span class=&quot;palind&quot;&gt;abcbabcba&lt;/span&gt;|bcba&quot;, what makes the second
'c' so special that it should be the new center?  Hopefully, the
answer to this question will lead to the answer to the more important
question: once we stop expanding the palindrome around the current
center, how do we pick the next center?  To answer the first question,
first notice that the current palindromes in the above examples
themselves contain smaller non-trivial palindromes: &quot;ababa&quot; contains
&quot;aba&quot; and &quot;abcbabcba&quot; contains &quot;abcba&quot; which also contains &quot;bcb&quot;.
Then, notice that if we expand around the &quot;special&quot; letters, we get a
palindrome which shares a right edge with the current palindrome; that
is, &lt;em&gt;the longest palindromes around the special letters are proper
suffixes of the current palindrome&lt;/em&gt;.  With a little thought, we
can then answer the second question: &lt;em&gt;to pick the next center, take
the center of the longest palindromic proper suffix of the current
palindrome&lt;/em&gt;.  Our algorithm then looks like this:&lt;/p&gt;

&lt;ol type=&quot;1&quot;&gt;
  &lt;li&gt;Set the current center to the first letter.&lt;/li&gt;
  &lt;li&gt;Loop while the current center is valid:
    &lt;ol type=&quot;a&quot;&gt;
      &lt;li&gt;Expand to the left and right simultaneously until we find
	the largest palindrome around this center.&lt;/li&gt;
      &lt;li&gt;If the current palindrome is bigger than the stored maximum
	one, store the current one as the maximum one.&lt;/li&gt;
      &lt;li&gt;Find the maximal palindromic proper suffix of the current
	palindrome.&lt;/li&gt;
      &lt;li&gt;Set the center of the suffix from step c as the current center
	and start expanding from the suffix, since it is already palindromic.&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;Return the stored maximum palindrome.&lt;/li&gt;
&lt;/ol&gt;
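&lt;p&gt;Step 2c, done the obvious way, might look like the following
Python 3 sketch.  The function name
&lt;code&gt;longest_palindromic_proper_suffix&lt;/code&gt; is hypothetical, and
the scan is deliberately naive: it tries every proper suffix from
longest to shortest.&lt;/p&gt;

```python
def longest_palindromic_proper_suffix(pal):
    """
    Return (start, suffix), where suffix = pal[start:] is the longest
    proper suffix of pal that is itself a palindrome (possibly empty).
    """
    for start in range(1, len(pal) + 1):
        suffix = pal[start:]
        # The empty suffix at start == len(pal) always matches, so the
        # loop returns for any non-empty pal.
        if suffix == suffix[::-1]:
            return start, suffix
    return 0, pal[:0]  # only reached when pal is empty


# For an odd-length suffix, its center letter is at
# start + len(suffix) // 2.
print(longest_palindromic_proper_suffix("ababa"))      # (2, 'aba')
print(longest_palindromic_proper_suffix("abcbabcba"))  # (4, 'abcba')
```

&lt;p&gt;The centers of the two suffixes found are index 3 (the second 'b')
and index 6 (the second 'c'), matching the &quot;special&quot; letters
above.  Each call examines up to every suffix of the current palindrome,
so its cost is quadratic in the worst case.&lt;/p&gt;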

&lt;p&gt;However, unless step 2c can be done efficiently, it will cause the
algorithm to be superlinear.  Doing step 2c efficiently seems
impossible since we have to examine the entire current palindrome to
find the longest palindromic suffix unless we somehow keep track of
extra state as we progress through the input string.  Notice that the
longest palindromic suffix would by definition also be a palindrome of
the input string so it might suffice to keep track of every palindrome
that we see as we move through the string and hopefully, by the time
we finish expanding around a given center, we would know where all the
palindromes with centers lying to the left of the current one are.
However, if the longest palindromic suffix has a center to the right
of the current center, we would not know about it.  But we also have
at our disposal the very useful fact that &lt;em&gt;a palindromic proper
suffix of a palindrome has a corresponding dual palindromic proper
prefix&lt;/em&gt;.  For example, in one of our examples above, &quot;abcbabcba&quot;,
notice that &quot;abcba&quot; appears twice: once as a prefix and once as a
suffix.  Therefore, while we wouldn't know about all the palindromic
suffixes of our current palindrome, for each one we would know about
either it or its dual.&lt;/p&gt;

&lt;p&gt;Another crucial realization is the fact that we don't have to keep
track of all the palindromes we've seen.  To use the example
&quot;abcbabcba&quot; again, we don't really care about &quot;bcb&quot; that much, since
it's already contained in the palindrome &quot;abcba&quot;.  In fact, we only
really care about keeping track of the longest palindromes for a given
center or equivalently, the length of the longest palindrome for a
given center.  But this is simply a more general version of our
original problem, which is to find the longest palindrome around
&lt;em&gt;any&lt;/em&gt; center!  Thus, if we can keep track of this state
efficiently, maybe by taking advantage of the properties of
palindromes, we don't have to keep track of the maximal palindrome and
can instead figure it out at the very end.&lt;/p&gt;

&lt;p&gt;Unfortunately, we seem to be back where we started; the second
naive algorithm that we have is simply to loop through all possible
centers and for each one find the longest palindrome around that
center.  But our discussion has led us to a different incremental
formulation: given a current center, the longest palindrome around
that center, and a list of the lengths of the longest palindromes
around the centers to the left of the current center, can we figure
out the new center to consider and extend the list of longest
palindrome lengths up to that center efficiently?  For example, if we
have the state:&lt;/p&gt;

&lt;p&gt;&amp;lt;&quot;ab&lt;span class=&quot;palind&quot;&gt;a&lt;/span&gt;ba|??&quot;, [0, 1, 0, 3, 0, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?]&amp;gt;&lt;/p&gt;

&lt;p&gt;where the highlighted letter is the current center, the vertical line
is our current position, the question marks represent unread
characters or unknown quantities, and the array represents the list
of longest palindrome lengths by center, can we get to the state:&lt;/p&gt;

&lt;p&gt;&amp;lt;&quot;aba&lt;span class=&quot;palind&quot;&gt;b&lt;/span&gt;a|??&quot;, [0, 1, 0, 3, 0, 5, 0, ?, ?, ?, ?, ?, ?, ?, ?]&amp;gt;&lt;/p&gt;

&lt;p&gt;and then to:&lt;/p&gt;

&lt;p&gt;&amp;lt;&quot;aba&lt;span class=&quot;palind&quot;&gt;b&lt;/span&gt;aba|&quot;, [0, 1, 0, 3, 0, 5, 0, 7, 0, 5, 0, 3, 0, 1, 0]&amp;gt;&lt;/p&gt;

&lt;p&gt;efficiently?  The crucial thing to notice is that the longest
palindrome lengths array (we'll call it simply the lengths array) in
the final state is palindromic since the original string is
palindromic.  In fact, the lengths array obeys a more general
property: &lt;em&gt;the longest palindrome &lt;var&gt;d&lt;/var&gt; places to the right
of the current center (the &lt;var&gt;d&lt;/var&gt;-right palindrome) is at least
as long as the longest palindrome &lt;var&gt;d&lt;/var&gt; places to the left of the current
center (the &lt;var&gt;d&lt;/var&gt;-left palindrome) if the &lt;var&gt;d&lt;/var&gt;-left
palindrome is completely contained in the longest palindrome around
the current center (the center palindrome), and it is of equal length
if the &lt;var&gt;d&lt;/var&gt;-left palindrome is not a prefix of the center
palindrome or if the center palindrome is a suffix of the entire
string&lt;/em&gt;.  This then implies that we can more or less fill in the
values to the right of the current center from the values to the left
of the current center.  For example, from [0, 1, 0, 3, 0, 5, ?, ?, ?,
?, ?, ?, ?, ?, ?] we can get to [0, 1, 0, 3, 0, 5, 0, &amp;ge;3?, 0,
&amp;ge;1?, 0, ?, ?, ?, ?].  This also implies that the first unknown
entry (in this case, &amp;ge;3?) should be the new center because it
means that the center palindrome is not a suffix of the input string
(i.e., we're not done) and that the &lt;var&gt;d&lt;/var&gt;-left palindrome is a
prefix of the center palindrome.&lt;/p&gt;
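&lt;p&gt;The mirroring rule can be sketched in Python 3 as follows.  This is
an illustration under stated assumptions rather than the final
algorithm: &lt;code&gt;mirror_fill&lt;/code&gt; is a made-up name, it only looks
at the lengths array (so it cannot tell when the center palindrome is a
suffix of the whole string), and it assumes the entry for the current
center is already final.&lt;/p&gt;

```python
def mirror_fill(lengths, c):
    """
    lengths holds the known entries 0..c of the lengths array, and c is
    the current center.  Returns (filled, next_center, lower_bound):
    filled lists the entries right of c that are known exactly, and
    next_center is the first index whose entry is only known to be at
    least lower_bound.
    """
    center_len = lengths[c]
    filled = []
    for d in range(1, center_len + 1):
        # Largest length a palindrome centered d places from c can have
        # while staying inside the center palindrome.
        room = center_len - d
        left = lengths[c - d]
        if left == room:
            # The d-left palindrome is a prefix of the center
            # palindrome, so its mirror image touches the right edge
            # and may extend past it.
            return filled, c + d, room
        # Otherwise the mirrored value is exact, clipped to the room
        # available inside the center palindrome.
        filled.append(min(left, room))
    # Everything up to the right edge is known; the next candidate is
    # the letter just past it (if the string has one).
    return filled, c + center_len + 1, 1


print(mirror_fill([0, 1, 0, 3, 0, 5], 5))  # ([0], 7, 3)
```

&lt;p&gt;The printed result says that the entry at index 6 is exactly 0 and
that index 7 becomes the new center with a palindrome of length at least
3, in agreement with the example above.&lt;/p&gt;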

&lt;p&gt;From these observations we can construct our final algorithm which
returns the lengths array, and from which it is easy to find the
longest palindromic substring:&lt;/p&gt;

&lt;ol type=&quot;1&quot;&gt;
  &lt;li&gt;Allocate the lengths array with one (initially unknown) entry per
  possible center.&lt;/li&gt;
  &lt;li&gt;Set the current center to the first center.&lt;/li&gt;
  &lt;li&gt;Loop while the current center is valid:
    &lt;ol type=&quot;a&quot;&gt;
      &lt;li&gt;Expand to the left and right simultaneously until we find
	the largest palindrome around this center.&lt;/li&gt;
      &lt;li&gt;Fill in the appropriate entry in the longest palindrome
	lengths array.&lt;/li&gt;
      &lt;li&gt;Iterate through the longest palindrome lengths array
	backwards and fill in the corresponding values to the right of
	the entry for the current center until an unknown value (as
	described above) is encountered.&lt;/li&gt;
      &lt;li&gt;Set the new center to the index of this unknown value.&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;Return the lengths array.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Note that at each step of the algorithm we're either incrementing
our current position in the input string or filling in an entry in the
lengths array.  Since the lengths array has size linear in the size of
the input string, the algorithm has worst-case linear running time.
Since given the lengths array we can find and return the longest
palindromic substring in linear time, a linear-time algorithm to find
the longest palindromic substring is the composition of these two
operations.&lt;/p&gt;

&lt;p&gt;Here is Python code that implements the above algorithm (although
it is closer to Johan Jeuring's Haskell implementation than to the
above description):&lt;/p&gt;

&lt;aside&gt;* An exercise for the reader: in this place in the code you
might think that you can replace the == with &gt;= to improve
performance.  This does not change the correctness of the algorithm
but it does hurt performance, contrary to expectations.  Why?&lt;/aside&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def fastLongestPalindromes(seq):
    &quot;&quot;&quot;
    Behaves identically to naiveLongestPalindromes (see below), but
    runs in linear time.
    &quot;&quot;&quot;
    seqLen = len(seq)
    l = []
    i = 0
    palLen = 0
    # Loop invariant: seq[(i - palLen):i] is a palindrome.
    # Loop invariant: len(l) &gt;= 2 * i - palLen. The code path that
    # increments palLen skips the l-filling inner-loop.
    # Loop invariant: len(l) &lt; 2 * i + 1. Any code path that
    # increments i past seqLen - 1 exits the loop early and so skips
    # the l-filling inner loop.
    while i &lt; seqLen:
        # First, see if we can extend the current palindrome.  Note
        # that the center of the palindrome remains fixed.
        if i &gt; palLen and seq[i - palLen - 1] == seq[i]:
            palLen += 2
            i += 1
            continue

        # The current palindrome is as large as it gets, so we append
        # it.
        l.append(palLen)

        # Now to make further progress, we look for a smaller
        # palindrome sharing the right edge with the current
        # palindrome.  If we find one, we can try to expand it and see
        # where that takes us.  At the same time, we can fill the
        # values for l that we neglected during the loop above. We
        # make use of our knowledge of the length of the previous
        # palindrome (palLen) and the fact that the values of l for
        # positions on the right half of the palindrome are closely
        # related to the values of the corresponding positions on the
        # left half of the palindrome.

        # Traverse backwards starting from the second-to-last index up
        # to the edge of the last palindrome.
        s = len(l) - 2
        e = s - palLen
        for j in range(s, e, -1):
            # d is the value l[j] must have in order for the
            # palindrome centered there to share the left edge with
            # the last palindrome.  (Drawing it out is helpful to
            # understanding why the - 1 is there.)
            d = j - e - 1

            # We check to see if the palindrome at l[j] shares a left
            # edge with the last palindrome.  If so, the corresponding
            # palindrome on the right half must share the right edge
            # with the last palindrome, and so we have a new value for
            # palLen.
            if l[j] == d: # *
                palLen = d
                # We actually want to go to the beginning of the outer
                # loop, but Python doesn't have loop labels.  Instead,
                # we use an else block corresponding to the inner
                # loop, which gets executed only when the for loop
                # exits normally (i.e., not via break).
                break

            # Otherwise, we just copy the value over to the right
            # side.  We have to bound l[j] because palindromes on the
            # left side could extend past the left edge of the last
            # palindrome, whereas their counterparts won't extend past
            # the right edge.
            l.append(min(d, l[j]))
        else:
            # This code is executed in two cases: when the for loop
            # isn't taken at all (palLen == 0) or the inner loop was
            # unable to find a palindrome sharing the left edge with
            # the last palindrome.  In either case, we're free to
            # consider the palindrome centered at seq[i].
            palLen = 1
            i += 1

    # We know from the loop invariant that len(l) &lt; 2 * seqLen + 1, so
    # we must fill in the remaining values of l.

    # Obviously, the last palindrome we're looking at can't grow any
    # more.
    l.append(palLen)

    # Traverse backwards starting from the second-to-last index up
    # until we get l to size 2 * seqLen + 1. We can deduce from the
    # loop invariants we have enough elements.
    lLen = len(l)
    s = lLen - 2
    e = s - (2 * seqLen + 1 - lLen)
    for i in range(s, e, -1):
        # The d here uses the same formula as the d in the inner loop
        # above.  (Computes distance to left edge of the last
        # palindrome.)
        d = i - e - 1
        # We bound l[i] with min for the same reason as in the inner
        # loop above.
        l.append(min(d, l[i]))

    return l&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And here is a naive quadratic version for comparison:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def naiveLongestPalindromes(seq):
    &quot;&quot;&quot;
    Given a sequence seq, returns a list l such that l[2 * i + 1]
    holds the length of the longest palindrome centered at seq[i]
    (which must be odd), l[2 * i] holds the length of the longest
    palindrome centered between seq[i - 1] and seq[i] (which must be
    even), and l[2 * len(seq)] holds the length of the longest
    palindrome centered past the last element of seq (which must be 0,
    as is l[0]).

    The actual palindrome for l[i] is seq[s:(s + l[i])] where s is i
    // 2 - l[i] // 2. (// is integer division.)

    Example:
    naiveLongestPalindromes('ababa') -&gt; [0, 1, 0, 3, 0, 5, 0, 3, 0, 1, 0]
    
    Runs in quadratic time.
    &quot;&quot;&quot;
    seqLen = len(seq)
    lLen = 2 * seqLen + 1
    l = []

    for i in range(lLen):
        # If i is even (i.e., we're on a space), this will produce e
        # == s.  Otherwise, we're on an element and e == s + 1, as a
        # single letter is trivially a palindrome.
        s = i // 2
        e = s + i % 2

        # Loop invariant: seq[s:e] is a palindrome.
        while s &gt; 0 and e &lt; seqLen and seq[s - 1] == seq[e]:
            s -= 1
            e += 1

        l.append(e - s)

    return l&lt;/code&gt;&lt;/pre&gt;
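&lt;p&gt;Given the list returned by either function, recovering the longest
palindromic substring is a single linear scan using the index formula
from the docstring above.  A minimal Python 3 sketch, with
&lt;code&gt;longest_palindrome_from_lengths&lt;/code&gt; being a name invented
for this example:&lt;/p&gt;

```python
def longest_palindrome_from_lengths(seq, lengths):
    """
    lengths must have 2 * len(seq) + 1 entries, laid out as described
    in the docstring of naiveLongestPalindromes.
    """
    # Index of the (leftmost) largest entry in the lengths array.
    center = max(range(len(lengths)), key=lengths.__getitem__)
    pal_len = lengths[center]
    start = center // 2 - pal_len // 2
    return seq[start:start + pal_len]


lengths = [0, 1, 0, 3, 0, 5, 0, 7, 0, 5, 0, 3, 0, 1, 0]
print(longest_palindrome_from_lengths("abababa", lengths))  # abababa
```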

&lt;p&gt;Note that this is not the only efficient solution to this problem;
building a suffix tree is linear in the length of the input string and
you can use one to solve this problem but as Johan also mentions,
that is a much less direct and efficient solution compared to this
one.&lt;/p&gt;

</content>
  </entry>
  
  <entry>
    <id>http://www.akalin.cx/number-theory-haskell-foray</id>
    <link type="text/html" rel="alternate" href="http://www.akalin.cx/number-theory-haskell-foray"/>
    <title>A Foray into Number Theory with Haskell</title>
    <updated>2007-07-06T00:00:00-07:00</updated>
    <author>
  <name>Fred Akalin</name>
  <uri>http://www.akalin.cx/</uri>
</author>
<rights>© Fred Akalin
2005–2011.
All rights reserved.</rights>

    <content type="html">&lt;style type=&quot;text/css&quot; media=&quot;all&quot;&gt;
/*&lt;![CDATA[*/
pre.console {
  background-color: #eee;
  overflow-x: auto;
}
/*]]&gt;*/
&lt;/style&gt;

&lt;p&gt;I encountered
&lt;a href=&quot;http://programming.reddit.com/info/216p9/comments&quot;&gt;an
interesting problem&lt;/a&gt; on reddit a few days ago which can be
paraphrased as follows:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Find a perfect square $s$ such that $1597s + 1$ is also
a perfect square.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;After I had read the discussion about implementing a brute-force
algorithm to solve the problem and spent a futile half-hour or so
trying my hand at finding a better way, someone noticed that the problem
was an instance
of &lt;a href=&quot;http://en.wikipedia.org/wiki/Pell%27s_equation&quot;&gt;Pell's
equation&lt;/a&gt; which is known to have an elegant and fast solution;
indeed, he posted
a &lt;a href=&quot;http://programming.reddit.com/info/216p9/comments/c21dpn&quot;&gt;one-liner
in Mathematica&lt;/a&gt; solving the given problem. However, I wanted to try
coding up the solution myself as the Mathematica solution, while
succinct, isn't very enlightening since the heavy lifting is already
done by a built-in function and an arbitrary constant was used for this
particular instance of Pell's equation.&lt;/p&gt;

&lt;aside&gt;As a rule we'll avoid considering trivial cases and re-stating
obvious assumptions (like $d$ having to be a positive
integer).&lt;/aside&gt;

&lt;p&gt;Pell's equation is simply the
&lt;a href=&quot;http://en.wikipedia.org/wiki/Diophantine_equation&quot;&gt;Diophantine
equation&lt;/a&gt; $x^2 - dy^2 = 1$ for a given $d$; being Diophantine means
that all variables involved take on only integer values. (In our
original problem, $d$ is 1597 and we are asked for $y^2$.) The
solution involves finding the &lt;em&gt;continued fraction expansion&lt;/em&gt; of
$\sqrt{d}$, finding the first &lt;em&gt;convergent&lt;/em&gt; of the expansion
that satisfies Pell's equation, and then generating all other
solutions from that
&lt;em&gt;fundamental solution&lt;/em&gt;. We rule out the trivial solution $x =
1$, $y = 0$ which also implies that if $d$ is a perfect square then
there is no solution.&lt;/p&gt;

&lt;p&gt;A continued fraction is an expression of the form: \[ x = a_0 +
\cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 +
\cfrac{1}{\ddots\,}}}}\] where all $a_i$ are integers and all but the
first one are positive.  The standard math notation for continued
fractions is quite unwieldy so from now on we'll use $\left \langle
a_0; a_1, a_2, \ldots \right \rangle$ instead of the above.&lt;/p&gt;

&lt;p&gt;The theory of continued fractions is a rich and beautiful one but
 for now we'll just state a few facts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The continued fraction expansion of a number is (mostly) unique.&lt;/li&gt;
&lt;li&gt;The continued fraction expansion of a rational number is
  finite.&lt;/li&gt;
&lt;li&gt;The continued fraction expansion of an irrational number is
infinite.&lt;/li&gt;
&lt;li&gt;A &lt;a href=&quot;http://en.wikipedia.org/wiki/Quadratic_surd&quot;&gt;quadratic
surd&lt;/a&gt; is a number of the form $\frac{a + \sqrt{b}}{c}$ where $a$,
$b$, and $c$ are integers.  The continued fraction expansion of a
quadratic surd is eventually periodic; that is, it repeats forever after
a certain number of terms.  This applies in particular to the square
root of an integer, where everything after the first term
repeats.&lt;/li&gt;
&lt;li&gt;Truncating an infinite continued fraction to get a finite
continued fraction gives (in some sense) an optimal rational
approximation to the irrational number represented by the infinite
continued fraction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given a quadratic surd it is fairly easy to manipulate it into the
form $a + \frac{1}{q}$ where $q$ is another quadratic surd. This fact
can be used to come up with an algorithm to find the continued
fraction expansion of a square
root. Wikipedia &lt;a href=&quot;http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Continued_fraction_expansion&quot;&gt;explains
it pretty well&lt;/a&gt; so I won't go over it, but here is my Haskell
implementation:&lt;/p&gt;

&lt;pre&gt;
&lt;code class=&quot;language-haskell&quot;&gt;sqrt_continued_fraction n = [ a_i | (_, _, a_i) &lt;- mdas ]
    where
      mdas = iterate get_next_triplet (m_0, d_0, a_0)

      m_0 = 0
      d_0 = 1
      a_0 = truncate $ sqrt $ fromIntegral n

      get_next_triplet (m_i, d_i, a_i) = (m_j, d_j, a_j)
          where
            m_j = d_i * a_i - m_i
            d_j = (n - m_j * m_j) `div` d_i
            a_j = (a_0 + m_j) `div` d_j&lt;/code&gt;
&lt;/pre&gt;

&lt;p&gt;and here are some examples:&lt;/p&gt;

&lt;pre class=&quot;console&quot;&gt;
Prelude Main&gt; take 20 $ sqrt_continued_fraction 2
[1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2]

Prelude Main&gt; take 20 $ sqrt_continued_fraction 103
[10,6,1,2,1,1,9,1,1,2,1,6,20,6,1,2,1,1,9,1]

Prelude Main&gt; take 20 $ sqrt_continued_fraction 36
[6,*** Exception: divide by zero
&lt;/pre&gt;

&lt;p&gt;(Note that we're assuming that we won't be called with a perfect
square. Also, do you notice anything interesting about the periodic
portion of the continued fractions, particularly of $\sqrt{103}$?)&lt;/p&gt;

&lt;p&gt;For those who are unfamiliar with Haskell, here's a quick list of key facts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The first line takes a list of triplets and forms a list of all
  third elements, which is what we're interested in. (The other two
  elements of the triplet are auxiliary variables used by the
  algorithm.)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iterate&lt;/code&gt; is a function which takes in another
  function &lt;code&gt;f&lt;/code&gt;, an initial value &lt;code&gt;x&lt;/code&gt;, and
  returns the infinite list &lt;code&gt;[ x, f(x), f(f(x)), f(f(f(x))),
  ... ]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Note that Haskell
  uses &lt;a href=&quot;http://en.wikipedia.org/wiki/Lazy_evaluation&quot;&gt;lazy
  evaluation&lt;/a&gt; and so this function does not take an infinite amount
  of time to run; all its elements are evaluated (and memoized) only
  when needed.&lt;/li&gt;
&lt;li&gt;The rest of the function is a straightforward representation of
  the meat of the algorithm described in the above Wikipedia entry.&lt;/li&gt;
&lt;/ul&gt;
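&lt;p&gt;For readers more at home in Python, the same iteration can be
sketched as a generator.  This is a rough Python 3 port, not a
line-for-line translation: it uses &lt;code&gt;math.isqrt&lt;/code&gt; (Python
3.8 or later) for an exact integer square root where the Haskell version
goes through floating point, and it rejects perfect squares up front
instead of dividing by zero.&lt;/p&gt;

```python
from itertools import islice
from math import isqrt


def sqrt_continued_fraction(n):
    """Yield the continued fraction terms of sqrt(n) for non-square n."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    # (m, d, a) plays the role of the triplet in the Haskell version.
    m, d, a = 0, 1, a0
    while True:
        yield a
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d


print(list(islice(sqrt_continued_fraction(103), 13)))
# [10, 6, 1, 2, 1, 1, 9, 1, 1, 2, 1, 6, 20]
```

&lt;p&gt;The generator stands in for Haskell's lazy infinite list:
&lt;code&gt;islice&lt;/code&gt; takes as many terms as needed, much like
&lt;code&gt;take&lt;/code&gt;.&lt;/p&gt;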

&lt;p&gt;It may not be clear what $\sqrt{d}$ and its continued fraction
expansion have to do with solving Pell's equation. However, notice that
if $x$ and $y$ solve Pell's equation then manipulating Pell's equation
to get $\sqrt{d}$ on one side reveals that $\frac{x}{y}$ is a good
approximation of $\sqrt{d}$. In fact, it is so good that you can prove
that $\frac{x}{y}$ &lt;em&gt;must&lt;/em&gt; come from truncating the continued
fraction expansion of $\sqrt{d}$.&lt;/p&gt;

&lt;p&gt;This leads us to the following: if you have an infinite continued
fraction $\left \langle a_0; a_1, a_2, \ldots \right \rangle$ you can
truncate it into a finite continued fraction $\left \langle a_0; a_1,
a_2, \ldots, a_i \right \rangle$ and simplify it into the rational
number $\frac{p_i}{q_i}$.  The sequence $\frac{p_0}{q_0},
\frac{p_1}{q_1}, \frac{p_2}{q_2}, \ldots$ forms the
&lt;a href=&quot;http://en.wikipedia.org/wiki/Convergent_%28continued_fraction%29&quot;&gt;&lt;em&gt;convergents&lt;/em&gt;&lt;/a&gt;
of $\left \langle a_0; a_1, a_2, \ldots \right \rangle$ and converges to
its represented irrational number.&lt;/p&gt;

&lt;p&gt;It turns out you can calculate $p_{i+1}$ and $q_{i+1}$
efficiently from $p_i$, $q_i$, $p_{i-1}$, $q_{i-1}$, and $a_{i+1}$
using
the &lt;a href=&quot;http://en.wikipedia.org/wiki/Fundamental_recurrence_formulas&quot;&gt;&lt;em&gt;fundamental
recurrence formulas&lt;/em&gt;&lt;/a&gt; (which can be proved by induction). Here
is my Haskell implementation:&lt;/p&gt;

&lt;pre&gt;
&lt;code class=&quot;language-haskell&quot;&gt;get_convergents (a_0 : a_1 : as) = pqs
    where
      pqs = (p_0, q_0) : (p_1, q_1) :
            zipWith3 get_next_convergent pqs (tail pqs) as

      p_0 = a_0
      q_0 = 1

      p_1 = a_1 * a_0 + 1
      q_1 = a_1

      get_next_convergent (p_i, q_i) (p_j, q_j) a_k = (p_k, q_k)
          where
            p_k = a_k * p_j + p_i
            q_k = a_k * q_j + q_i&lt;/code&gt;
&lt;/pre&gt;

&lt;p&gt;and some more examples:&lt;/p&gt;

&lt;pre class=&quot;console&quot;&gt;
Prelude Main&gt; take 8 $ get_convergents $ sqrt_continued_fraction 2
[(1,1),(3,2),(7,5),(17,12),(41,29),(99,70),(239,169),(577,408)]

Prelude Main&gt; take 8 $ get_convergents $ sqrt_continued_fraction 103
[(10,1),(61,6),(71,7),(203,20),(274,27),(477,47),(4567,450),(5044,497)]

Prelude Main&gt; take 8 $ get_convergents $ sqrt_continued_fraction 1597
[(39,1),(40,1),(1039,26),(1079,27),(2118,53),(3197,80),(27694,693),(113973,2852)]

Prelude Main&gt; let divFrac (x, y) = (fromInteger x) / (fromInteger y)

Prelude Main&gt; take 8 $ map divFrac $ get_convergents $ sqrt_continued_fraction 2
[1.0,1.5,1.4,1.4166666666666667,1.4137931034482758,1.4142857142857144,1.4142011834319526,1.4142156862745099]

Prelude Main&gt; take 8 $ map divFrac $ get_convergents $ sqrt_continued_fraction 103
[10.0,10.166666666666666,10.142857142857142,10.15,10.148148148148149,10.148936170212766,10.148888888888889,10.148893360160965]

Prelude Main&gt; take 8 $ map divFrac $ get_convergents $ sqrt_continued_fraction 1597
[39.0,40.0,39.96153846153846,39.96296296296296,39.9622641509434,39.9625,39.96248196248196,39.9624824684432]
&lt;/pre&gt;

&lt;p&gt;Here are a few more quick facts to help those unfamiliar with
  Haskell:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The expression &lt;code&gt;a : as&lt;/code&gt; forms a new list from the
  element &lt;code&gt;a&lt;/code&gt; and the existing list &lt;code&gt;as&lt;/code&gt;
  (equivalent to &lt;code&gt;cons&lt;/code&gt; in Lisp).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;zipWith3&lt;/code&gt; is a function that takes in a
  function &lt;code&gt;f&lt;/code&gt;, three lists &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;,
  and &lt;code&gt;c&lt;/code&gt; of the same (possibly infinite)
  length &lt;code&gt;n&lt;/code&gt;, and forms the new list
  &lt;code&gt;[ f(a[0], b[0], c[0]), f(a[1], b[1], c[1]), ..., f(a[n-1], b[n-1],
  c[n-1]) ]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Note that the result of &lt;code&gt;zipWith3&lt;/code&gt; is part of the
  variable &lt;code&gt;pqs&lt;/code&gt; which itself appears (twice!) in the
  arguments to &lt;code&gt;zipWith3&lt;/code&gt;. This is a Haskell idiom and
  reflects the fact that the recurrence formulas define a convergent
  in terms of its two previous convergents. A simpler example (using
  the Fibonacci sequence) can be found in the
  &lt;a href=&quot;http://en.wikipedia.org/wiki/Lazy_evaluation&quot;&gt;Wikipedia
  entry for lazy evaluation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Haskell has built-in data types for integers of arbitrary size,
  which are necessary as the numerators and denominators of the
  convergents get large quickly. In fact, Haskell also has built-in
  data types for rational numbers (represented as fractions) but they
  don't help us much here.&lt;/li&gt;
&lt;/ul&gt;
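&lt;p&gt;The recurrence itself is language-independent.  As a rough Python 3
sketch (the name &lt;code&gt;convergents&lt;/code&gt; is chosen for this
example), it can be written as a generator over any iterable of terms.
Unlike the Haskell version, which seeds the list with the first two
convergents, this sketch assumes the conventional seed values
$p_{-2} = 0$, $q_{-2} = 1$, $p_{-1} = 1$, $q_{-1} = 0$, so that a single
formula covers every term.&lt;/p&gt;

```python
def convergents(terms):
    """Yield (p_i, q_i) for the continued fraction terms a_0, a_1, ..."""
    p_prev, q_prev = 0, 1  # conventional seed values p_{-2}, q_{-2}
    p, q = 1, 0            # conventional seed values p_{-1}, q_{-1}
    for a in terms:
        # p_i = a_i * p_{i-1} + p_{i-2}, and likewise for q_i.
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        yield p, q


print(list(convergents([1, 2, 2, 2])))
# [(1, 1), (3, 2), (7, 5), (17, 12)]
```

&lt;p&gt;Python integers are arbitrary-precision as well, so the sketch does
not overflow as the numerators and denominators grow.&lt;/p&gt;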

&lt;p&gt;Since we are guaranteed that some convergent eventually satisfies
  Pell's equation, we can write a simple function to generate all
  convergents, test each one to see if it satisfies Pell's equation,
  and return the first one we see. Here is the Haskell implementation:&lt;/p&gt;

&lt;pre&gt;
&lt;code class=&quot;language-haskell&quot;&gt;get_pell_fundamental_solution n = head $ solutions
    where
      solutions = [ (p, q) | (p, q) &lt;- convergents, p * p - n * q * q == 1 ]

      convergents = get_convergents $ sqrt_continued_fraction n&lt;/code&gt;
&lt;/pre&gt;

&lt;p&gt;Note the use of
  Haskell's &lt;a href=&quot;http://en.wikipedia.org/wiki/List_comprehension&quot;&gt;list
  comprehension&lt;/a&gt; syntax, similar to Python's, which expresses what I
  just described in a manner reminiscent of set notation.&lt;/p&gt;
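&lt;p&gt;The same search can also be sketched in Python 3.  This is a
self-contained illustration, not a translation of the Haskell: instead
of composing lazy lists it fuses the square-root expansion and the
convergent recurrence into one loop, and the name
&lt;code&gt;pell_fundamental_solution&lt;/code&gt; and its local variables are
assumptions of this sketch.  It needs &lt;code&gt;math.isqrt&lt;/code&gt;, so
Python 3.8 or later.&lt;/p&gt;

```python
from math import isqrt


def pell_fundamental_solution(d):
    """Return the smallest (p, q) with q > 0 and p*p - d*q*q == 1."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # State of the continued fraction expansion of sqrt(d).
    m, den, a = 0, 1, a0
    # Two most recent convergents; (1, 0) is the conventional seed.
    p_prev, q_prev = 1, 0
    p, q = a0, 1
    while p * p - d * q * q != 1:
        # Next continued fraction term.
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        # Next convergent from the previous two.
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
    return p, q


print(pell_fundamental_solution(2))    # (3, 2)
print(pell_fundamental_solution(103))  # (227528, 22419)
```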

&lt;p&gt;Here is the full Haskell program, designed so its output may be
  conveniently piped
  to &lt;a href=&quot;http://en.wikipedia.org/wiki/Bc_programming_language&quot;&gt;bc&lt;/a&gt;
  for verification:&lt;/p&gt;

&lt;pre&gt;
&lt;code class=&quot;language-haskell&quot;&gt;module Main where

-- getArgs is exported by System.Environment in current GHC.
import System.Environment (getArgs)

sqrt_continued_fraction :: (Integral a) =&gt; a -&gt; [a]
{- ... the sqrt_continued_fraction function explained above ... -}

get_convergents :: (Integral a) =&gt; [a] -&gt; [(a, a)]
{- ... the get_convergents function explained above ... -}

get_pell_fundamental_solution :: (Integral a) =&gt; a -&gt; (a, a)
{- ... the get_pell_fundamental_solution function explained above ... -}

main :: IO ()
main = do
  args &lt;- getArgs
  let d      = (read $ head $ args :: Integer)
      (p, q) = get_pell_fundamental_solution d in
    putStr $ &quot;d = &quot; ++ (show d) ++ &quot;\n&quot; ++
             &quot;p = &quot; ++ (show p) ++ &quot;\n&quot; ++
             &quot;q = &quot; ++ (show q) ++ &quot;\n&quot; ++
             &quot;p^2 - d * q^2 == 1\n&quot;&lt;/code&gt;
&lt;/pre&gt;

&lt;p&gt;And here it is in action:&lt;/p&gt;

&lt;pre class=&quot;console&quot;&gt;
$ ./solve_pell 1597
d = 1597
p = 519711527755463096224266385375638449943026746249
q = 13004986088790772250309504643908671520836229100
p^2 - d * q^2 == 1
&lt;/pre&gt;

&lt;p&gt;The solution to the original problem is therefore:&lt;br/&gt;
&lt;strong&gt;5054112910466227478111803017176109047976100000000.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now that we've found a method to get &lt;em&gt;a&lt;/em&gt; solution, the
question remains as to whether it's the only one. In fact it is not,
but it is the minimal one, and all other solutions (of which there are
an infinite number) can be generated from this fundamental one with a
simple recurrence relation as described on
the &lt;a href=&quot;http://en.wikipedia.org/wiki/Pell%27s_equation#Solution_technique&quot;&gt;Wikipedia
article&lt;/a&gt;. My program above can be easily extended to generate all
solutions instead of just the fundamental one (I'll leave it to the
reader as an exercise).&lt;/p&gt;
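
&lt;p&gt;For readers who would rather see one possible shape of that
extension, here is a minimal sketch. It assumes the standard
recurrence obtained by multiplying $p + q \sqrt{n}$ by the fundamental
$p_1 + q_1 \sqrt{n}$; the name &lt;code&gt;pell_solutions&lt;/code&gt; is
hypothetical, and the fundamental solution is passed in by hand rather
than computed:&lt;/p&gt;

&lt;pre&gt;
&lt;code class=&quot;language-haskell&quot;&gt;module Main where

-- Hypothetical helper: given n and the fundamental solution (p1, q1)
-- of p^2 - n * q^2 = 1, lazily list the solutions in increasing order.
-- Each step multiplies (p + q * sqrt n) by (p1 + q1 * sqrt n).
pell_solutions :: Integer -&gt; (Integer, Integer) -&gt; [(Integer, Integer)]
pell_solutions n (p1, q1) = iterate step (p1, q1)
  where
    step (p, q) = (p1 * p + n * q1 * q, p1 * q + q1 * p)

main :: IO ()
main = print (take 3 (pell_solutions 2 (3, 2)))
-- prints [(3,2),(17,12),(99,70)]&lt;/code&gt;
&lt;/pre&gt;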

&lt;p&gt;One remaining question is the efficiency of this algorithm. For
simplicity, let's neglect the cost of the arbitrary-precision
arithmetic involved and assume that the incremental cost of generating
each term of the continued fraction expansion and the convergents is
constant. Then the main cost is just how many convergents we have to
generate before we find one that satisfies Pell's equation. In fact,
it turns out that this depends on the length of the period of the
continued fraction expansion of $\sqrt{d}$, which has a rough upper
bound of $O(\sqrt{d} \ln d)$. Therefore, the cost of solving Pell's
equation (in terms of how many convergents to generate) for a given
$n$-bit number $d \approx 2^n$ is $O(n 2^{n/2})$. This is pretty expensive already,
although it's still much better than brute-force search (which is on
the order of exponentiating the above expression). Can we do better?
Well, sort of; it turns out the length of the answer is of the same
order as the expression above, so any algorithm that explicitly
outputs a solution necessarily takes that long. However, if you can
somehow factor $d$ into $s d'$, where $s$ is a perfect square and $d'$
is &lt;a href=&quot;http://en.wikipedia.org/wiki/Squarefree&quot;&gt;squarefree&lt;/a&gt;
(i.e., not divisible by any perfect square other than 1), then you can
solve Pell's equation for the smaller number $d'$ and output the
solution for $d$ as an expression involving the smaller fundamental
solution raised to a certain power. Note that in general this involves
factoring $d$, another hard problem, but one for which there exists a
great deal of prior work. An interested reader can peruse the papers
by &lt;a href=&quot;http://www.ams.org/notices/200202/fea-lenstra.pdf&quot;&gt;Lenstra&lt;/a&gt;
and &lt;a href=&quot;http://www.math.nyu.edu/~crorres/Archimedes/Cattle/cattle_vardi.pdf&quot;&gt;Vardi&lt;/a&gt;
for more details.&lt;/p&gt;
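
&lt;p&gt;To make the decomposition $d = s d'$ concrete, here is a naive
sketch that splits off the square part by trial division. The
name &lt;code&gt;squarefree_split&lt;/code&gt; is hypothetical, and trial
division is only practical for small $d$; it is not one of the serious
factoring methods alluded to above:&lt;/p&gt;

&lt;pre&gt;
&lt;code class=&quot;language-haskell&quot;&gt;module Main where

-- Hypothetical helper: write d as s * d', where s is a perfect square
-- and d' is squarefree, by repeatedly dividing out k^2 for k = 2, 3, ...
-- The result is the pair (s, d').
squarefree_split :: Integer -&gt; (Integer, Integer)
squarefree_split d = go d 1 2
  where
    go m s k
      | k * k &gt; m            = (s, m)  -- no square can divide m any more
      | m `mod` (k * k) == 0 = go (m `div` (k * k)) (s * k * k) k
      | otherwise            = go m s (k + 1)

main :: IO ()
main = print (map squarefree_split [12, 72, 1597])
-- prints [(4,3),(36,2),(1,1597)]&lt;/code&gt;
&lt;/pre&gt;

&lt;p&gt;Writing $s = m^2$, any solution of $p^2 - d q^2 = 1$ is also a
solution of $p^2 - d' (m q)^2 = 1$, which is why the solutions
for $d$ can be found among those for $d'$.&lt;/p&gt;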

&lt;p&gt;As a final note, one of the things I really like about number
theory is that investigating such a simple problem can lead you down
surprising avenues of mathematics and computational theory. In fact,
I've had to omit a lot of things I had planned to say to avoid growing
this entry to be longer than it already is. Hopefully, this entry
helps someone else learn more about this interesting corner of number
theory.&lt;/p&gt;
</content>
  </entry>
  
 
</feed>

