Formation of Orthogonal Polynomials (Special Functions) using tools of Linear Algebra
Special Functions refer to a set of mathematical functions that arise frequently in various physical contexts, particularly in solving differential equations that describe physical phenomena.
But do we really need to solve the differential equations to get those functions/polynomials? Or is there something more fundamental, related to the function space itself?
As we know, these functions appear throughout physics, and to analyze most problems one needs to know them. This is hard, especially for students. So, is there any workaround?
Well, the answer is yes! But how?
In this blog post we are going to answer these questions.
Introduction to Gram-Schmidt Orthogonalization
In most physics problems we work in an inner product space, and in most cases we choose a basis whose vectors are mutually orthogonal. Further, we also choose the norm of each basis vector to be unity.
So, if we have basis vectors \(|e_i \rangle \), then
\[ \langle e_i | e_j \rangle = \delta_{ij}, \]where
\[ \delta_{ij} = \begin{cases} 1 &\quad\text{if } i=j\\ 0 &\quad\text{if } i\neq j \\ \end{cases} \]Here the \(|e_i \rangle \) are called an orthonormal basis.
Now, let's say we have some arbitrary linearly independent vectors \(| u_i \rangle\) which are not necessarily orthogonal to each other. It is required to obtain a set of orthogonal vectors \(| v_i \rangle\) starting from the original set of vectors.
Method to generate \(| v_i \rangle\): we proceed as follows:
Take \(|v_1 \rangle = |u_1 \rangle\), i.e., choose any one of the original vectors as the first new vector.
Let \(|v_2 \rangle = |u_2 \rangle + a_{21} |v_1 \rangle\) with the unknown constant \(a_{21}\).
This \(a_{21}\) is found by imposing the condition \(\langle v_1 | v_2 \rangle=0\). This gives us
\[ a_{21} = -\frac{\langle v_1 | u_2 \rangle}{\langle v_1 | v_1 \rangle}. \]
Thus, we have two orthogonal vectors \(|v_1 \rangle\) and \(|v_2 \rangle\).
Again, we define \(|v_3 \rangle = |u_3 \rangle + a_{31}|v_1 \rangle + a_{32}|v_2 \rangle\) where \(a_{31}\) and \(a_{32}\) are constants.
Again, we use \(\langle v_1 | v_3 \rangle = 0\) and \(\langle v_2 | v_3 \rangle = 0\). This implies
\[ a_{31} = -\frac{\langle v_1 | u_3 \rangle}{\langle v_1 | v_1 \rangle}, \qquad a_{32} = -\frac{\langle v_2 | u_3 \rangle}{\langle v_2 | v_2 \rangle}. \]
Proceeding with the same method, in the \(i^{\text{th}}\) step we will have
\[ |v_i \rangle = |u_i \rangle + \sum_{j=1}^{i-1} a_{ij} |v_j \rangle, \qquad a_{ij} = -\frac{\langle v_j | u_i \rangle}{\langle v_j | v_j \rangle}, \]which follows from the condition that all the previously constructed vectors are orthogonal to the new one.
Let's see an example.
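For instance, here is a minimal Julia sketch with two arbitrarily chosen vectors (illustrative values of my own, not taken from any particular textbook example):

using LinearAlgebra

# two linearly independent but non-orthogonal vectors (illustrative choice)
u1 = [1.0, 1.0]
u2 = [1.0, 0.0]

v1 = u1                              # step 1: keep the first vector as it is
a21 = -dot(v1, u2) / dot(v1, v1)     # step 2: fix a21 by demanding <v1|v2> = 0
v2 = u2 + a21 * v1

println("v1 = ", v1)                 # [1.0, 1.0]
println("v2 = ", v2)                 # [0.5, -0.5]
println("<v1|v2> = ", dot(v1, v2))   # 0.0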
So, we have created two orthogonal vectors from two non-orthogonal vectors. We can also normalize the vectors if we wish, but as mentioned before, it really doesn't matter here.
Orthogonal Polynomials
Before discussing orthogonal polynomials we have to define the inner product once again (what, again?!... yes, but this time for infinite-dimensional vector spaces).
As we know, an inner product on a vector space is a map \(I:V \times V \to \mathbb{C}\) with certain properties. I will not discuss those properties here (I will assume you know them; if not, have a look at Arfken's book or Balakrishnan's book). So, let's define the inner product:
Suppose we are considering functions in the interval \(x_1 \leq x \leq x_2\). An inner product of two such functions \(f(x)\) and \(g(x)\) can be defined as,
\[ \langle f|g \rangle = \int_{x_1}^{x_2}dx\, w(x)f^*(x)g(x), \]where \(w(x)\) is called the weight function and \(f^*(x)\) denotes the complex conjugate of \(f(x)\).
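For example, with \(w(x) = 1\) on the interval \(-1 \leq x \leq 1\), the functions \(1\) and \(x\) are orthogonal,
\[ \langle 1 | x \rangle = \int_{-1}^{1}dx\, x = 0, \]while \(1\) and \(x^2\) are not, since \(\int_{-1}^{1}dx\, x^2 = 2/3 \neq 0\).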
With this we are now ready to see how to generate special functions.
As we know, polynomials form a function space. If we restrict our attention to functions of one real variable, we can ask what a convenient basis for this function space could be. Maybe you would say it is
\[ 1,x,x^2,x^3,\cdots, x^n,\cdots \]However, this is not necessarily an orthogonal basis, since orthogonality depends upon \(w(x)\). Then how do we find one? Well, just use Gram-Schmidt orthogonalization. Let's see it in detail.
Procedure to generate orthogonal polynomials
Given any weight function \(w(x)\) and interval \(x_1\leq x \leq x_2\), we start with,
\(f_0(x) = 1\). (We don't really care about the normalization at any intermediate stage of the iteration.)
For the next function, choose \(f_1(x) = A_1(a_{10}+x)\), where, just like before, we have two unknown constants that we have to find. Also note that \(A_1\neq 0\).
Now we impose
\[ \langle f_0 | f_1 \rangle = A_1\int_{x_1}^{x_2}dx\, w(x)\,(a_{10}+x) = 0. \]This gives us
\[ a_{10}I_0 + I_1 = 0. \]Here we have defined the notation
\[ I_n = \int_{x_1}^{x_2}dx\, w(x)\, x^n. \]Apart from \(A_1\), we find \(a_{10}\) from the relation \(a_{10}I_0 + I_1 = 0\) above.
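For instance, with \(w(x) = 1\) on \(-1 \leq x \leq 1\) (the Legendre case considered below), these moments are simply
\[ I_n = \int_{-1}^{1}dx\, x^n = \begin{cases} \dfrac{2}{n+1} &\quad\text{if } n \text{ is even}\\ 0 &\quad\text{if } n \text{ is odd} \end{cases} \]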
For the next one, \(f_2(x) = A_2(a_{20}+a_{21}x+x^2)\) with \(A_2\neq 0\).
Then we have two conditions,
\[ \int_{x_1}^{x_2}dx\, w(x)(a_{20} + a_{21}x+x^2)=0 \]and
\[ \int_{x_1}^{x_2}dx\, w(x)(a_{20} + a_{21}x+x^2)(a_{10}+x)=0. \]These two give us
\[ a_{20}I_0 + a_{21}I_1 + I_2 = 0 \]and
\[ a_{20}a_{10}I_0 + (a_{20}+a_{21}a_{10})I_1 + (a_{21}+a_{10})I_2 + I_3 = 0. \]Use these two to find \(a_{20}\) and \(a_{21}\). Repeat this process depending on how many terms you need.
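In other words, the two conditions form a \(2\times 2\) linear system for \(a_{20}\) and \(a_{21}\),
\[ \begin{pmatrix} I_0 & I_1 \\ a_{10}I_0 + I_1 & a_{10}I_1 + I_2 \end{pmatrix} \begin{pmatrix} a_{20} \\ a_{21} \end{pmatrix} = -\begin{pmatrix} I_2 \\ a_{10}I_2 + I_3 \end{pmatrix}, \]and since \(a_{10}I_0 + I_1 = 0\) from the previous step, the system is actually triangular.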
Let's see the application of this theory.
Generating Special Functions
Let's look at a few of the well-known ones.
Legendre Polynomials
We know the Legendre polynomials are defined on the interval \(-1\leq x \leq 1\). Also, take \(w(x) = 1\).
Start with \(P_0(x) = 1\). Then, using the relation \(a_{10}I_0 + I_1 = 0\) with \(I_0 = \int_{-1}^{1}dx = 2\) and \(I_1 = \int_{-1}^{1}x\,dx = 0\), we get \(a_{10} = 0\).
Then, \(P_1(x) = A_1x\). We set the normalization \(A_1 = 1\), which gives us \(P_1(x) = x\).
For the next one we get \(a_{21}=0\) and \(a_{20} = -1/3\). Again taking \(A_2 = 1\) gives \(P_2(x) = -1/3 + x^2\).
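Explicitly, with \(I_0 = 2\), \(I_1 = I_3 = 0\), and \(I_2 = 2/3\), the two conditions reduce to
\[ 2a_{20} + \tfrac{2}{3} = 0 \;\Rightarrow\; a_{20} = -\tfrac{1}{3}, \qquad \tfrac{2}{3}a_{21} = 0 \;\Rightarrow\; a_{21} = 0. \]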
We can go on as long as we want. These are exactly the Legendre polynomials (up to normalization).
It should be noted that we have not used the conventional normalization. We can certainly do that: for the Legendre polynomials, just rescale so that the sum of the coefficients is \(1\) (equivalently, \(P_n(1) = 1\)).
As an example, the coefficients of \(P_2(x)\) are \(1\) and \(-1/3\). Multiply by a factor \(\alpha\) and demand that the sum of all the coefficients is \(1\): \(\alpha + (-\frac{\alpha}{3}) = 1\), which gives \(\alpha = 3/2\). This yields \(P_2(x) = \frac{1}{2}(3x^2-1)\).
Let's write some code in Julia for this.
# Inner product of two functions f and g on [a, b] with weight w,
# computed symbolically (real functions assumed, so no conjugation).
function inner_product(f, g, a, b, w)
    return integrate(f * g * w, (x, a, b))
end

# Gram-Schmidt orthogonalization of a list of symbolic functions
# on [a, b] with weight w (no normalization at intermediate steps).
function gram_schmidt(funcs, a, b; w=1)
    orthogonal_polynomials = []
    for i in 1:length(funcs)
        f_i = funcs[i]
        for j in 1:(i-1)
            # subtract the projection of f_i onto the j-th orthogonal polynomial
            proj = inner_product(orthogonal_polynomials[j], f_i, a, b, w) /
                   inner_product(orthogonal_polynomials[j], orthogonal_polynomials[j], a, b, w)
            f_i -= simplify(proj * orthogonal_polynomials[j])
        end
        push!(orthogonal_polynomials, simplify(f_i))
    end
    return orthogonal_polynomials
end

# Rescale a polynomial so that the sum of its coefficients is 1
function normalized(p)
    k = sympy.poly(p, x)      # treat p as a polynomial in x
    pp = k.coeffs()           # its coefficients
    return p / sum(pp)
end
These are the functions we are going to use. Let's use them to recreate the results above.
using SymPy
x = Sym("x")
monomials = [x^i for i in 0:5]
legendre_polynomials = gram_schmidt(monomials, -1, 1; w=x^0)   # weight w(x) = 1 (x^0 keeps it symbolic)
leg_f = normalized.(legendre_polynomials)                      # rescale so the coefficients sum to 1
println("Without normalization:")
for (i, poly) in enumerate(legendre_polynomials)
println("P_$(i-1)(x) = $poly")
end
println("With Normalization:")
for (i, poly) in enumerate(leg_f)
println("P'_$(i-1)(x) = $poly")
end
which gives,
Without normalization:
P_0(x) = 1
P_1(x) = x
P_2(x) = x^2 - 1/3
P_3(x) = x*(x^2 - 3/5)
P_4(x) = x^4 - 6*x^2/7 + 3/35
P_5(x) = x*(63*x^4 - 70*x^2 + 15)/63
With Normalization:
P'_0(x) = 1
P'_1(x) = x
P'_2(x) = 3*x^2/2 - 1/2
P'_3(x) = 5*x*(x^2 - 3/5)/2
P'_4(x) = 35*x^4/8 - 15*x^2/4 + 3/8
P'_5(x) = x*(63*x^4 - 70*x^2 + 15)/8
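As a quick sanity check (a short sketch of my own, reusing the leg_f and x defined above), we can verify that distinct entries of the list are indeed orthogonal on \([-1,1]\) with \(w(x)=1\):

# every pair of distinct polynomials should integrate to zero over [-1, 1]
for i in 1:length(leg_f), j in (i+1):length(leg_f)
    println("<P'_$(i-1)|P'_$(j-1)> = ", integrate(leg_f[i] * leg_f[j], (x, -1, 1)))
end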
Let's see one more example:
Hermite Polynomials
We know the Hermite polynomials are defined on the range \(-\infty < x <\infty \). Also, take \(w(x) = \exp(-x^2)\).
Start with \(H_0(x) = 1\). Then, using the relation \(a_{10}I_0 + I_1 = 0\) with \(I_0 = \int_{-\infty}^{\infty}e^{-x^2}dx = \sqrt{\pi}\) and \(I_1 = 0\), we get \(a_{10} = 0\).
Then, \(H_1(x) = A_1x\). We set the normalization \(A_1 = 1\), which gives us \(H_1(x) = x\).
For the next one we get \(a_{21}=0\) and \(a_{20} = -\frac{1}{2}\). Again taking \(A_2 = 1\) gives \(H_2(x) = x^2 - \frac{1}{2}\).
Again, we can normalize it in a similar way as before (coefficients summing to \(1\)), which gives \(2x^2-1\) (proportional to the conventional \(H_2(x) = 4x^2-2\)).
We can go on as long as we want. These are exactly the Hermite polynomials, up to overall constants: the sum-of-coefficients convention used here differs from the standard one, in which the leading coefficient of \(H_n(x)\) is \(2^n\).
Our code gives us,
monomials = [x^i for i in 0:5]
hermite_polynomials = gram_schmidt(monomials, -oo, oo; w=exp(-x^2))   # oo is SymPy's symbolic infinity
her_f = normalized.(hermite_polynomials)
println("Without normalization:")
for (i, poly) in enumerate(hermite_polynomials)
println("H_$(i-1)(x) = $poly")
end
println("With Normalization:")
for (i, poly) in enumerate(her_f)
println("H'_$(i-1)(x) = $poly")
end
which gives,
Without normalization:
H_0(x) = 1
H_1(x) = x
H_2(x) = x^2 - 1/2
H_3(x) = x*(x^2 - 3/2)
H_4(x) = x^4 - 3*x^2 + 3/4
H_5(x) = x*(x^4 - 5*x^2 + 15/4)
With Normalization:
H'_0(x) = 1
H'_1(x) = x
H'_2(x) = 2*x^2 - 1
H'_3(x) = -2*x*(x^2 - 3/2)
H'_4(x) = -4*x^4/5 + 12*x^2/5 - 3/5
H'_5(x) = -4*x*(x^4 - 5*x^2 + 15/4)
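The same kind of check works here, now with the weight \(e^{-x^2}\) inside the integral (again reusing her_f and x from the session above); for instance:

# H'_2 and H'_4 should be orthogonal with respect to w(x) = exp(-x^2)
println(integrate(her_f[3] * her_f[5] * exp(-x^2), (x, -oo, oo)))   # expect 0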
Differential Equations for the polynomials
So, these special functions are really born out of the structure of the inner product and the Hilbert space itself. But this implies that the differential equations corresponding to the special functions should be derivable from \(w(x)\) and the interval alone. Let's see how to do that!
Method
Let's say we have an eigenvalue equation of the form
\[ a(x)\frac{d^2y}{dx^2}+b(x)\frac{dy}{dx} + cy = 0, \]where \(c\) plays the role of the eigenvalue. Demanding that the differential operator be self-adjoint with respect to the weight \(w(x)\) tells us
\[ \frac{d}{dx}[w(x)a(x)]=w(x)b(x). \]Then the solutions pertaining to different allowed values of \(c\) are orthogonal with respect to \(w(x)\).
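For reference, with this condition the equation takes the standard Sturm-Liouville (self-adjoint) form
\[ \frac{1}{w(x)}\frac{d}{dx}\!\left[w(x)\,a(x)\frac{dy}{dx}\right] + c\,y = 0, \]and it is self-adjointness with respect to \(w(x)\) that guarantees the orthogonality of eigenfunctions belonging to different values of \(c\).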
So, given a set of orthogonal polynomials, we can find the corresponding differential equation as follows:
A constant function is always an eigenfunction, with the eigenvalue \(c_0=0\).
An overall factor in the coefficients \(a\), \(b\) and \(c\) is arbitrary, since the eigenvalue equation is homogeneous. We can fix this arbitrariness by choosing \(c_1\) to be anything we like.
Once \(c_1\) is fixed, \(a\) and \(b\) are fixed without any freedom. For \(n=1\), the eigenvalue equation reduces to (since \(f_1''=0\))
\[ b(x)\frac{df_1}{dx} + c_1 f_1 = 0. \]Using the form \(f_1(x)=A_1(a_{10}+x)\), we get
\[ b(x) = -c_1(a_{10}+x), \]which determines \(b(x)\).
Now, the relation \(\frac{d}{dx}[w(x)a(x)] = w(x)b(x)\) can be used to find \(a(x)\). Any integration constant can also be obtained, by requiring
\[ w(x_1)\,a(x_1) = w(x_2)\,a(x_2) = 0, \]which is the boundary condition needed for orthogonality of the eigenfunctions.
Finally, we can find \(c_n\) for arbitrary \(n\) by plugging the corresponding orthogonal polynomial into the equation.
Example
Let's see an example for this.
For \(w(x)=1\) and \(x_1=-1\) & \(x_2=1\), we have,
\[ \frac{d}{dx}[1\times a(x)] = 1\times b(x) \implies \frac{da}{dx}=b \]Previously we showed that \(a_{10}=0\), so \(b\) is given by \(b(x) = -c_1(a_{10}+x) = -c_1 x\).
This implies \(a = -c_1 x^2/2 + k\) where \(k\) is some unknown constant of integration. Choosing \(c_1 = 2\), we have
\[ b = -2x \quad\text{and}\quad a = -x^2 + k. \]Finally, using the boundary condition \(w(x_1)a(x_1) = w(x_2)a(x_2) = 0\), we get
\[ 1\times (k-x_1^2) = 1\times (k-x_2^2)=0 \implies k=1 \]This gives,
\[ a = 1-x^2 \]So, we have the equation:
\[ (1-x^2)\frac{d^2 P_n}{dx^2} - 2x\frac{dP_n}{dx}+c_n P_n = 0 \]Now, we can plug in the \(P_n\) to get \(c_n\) for \(n>1\). But there is a nicer way: use
\[ P_n(x) = \sum_{r=0}^{n}A_{nr}x^r \]Plugging this into the equation and setting the coefficient of \(x^n\) to zero gives us
\[ c_n = n(n+1). \]Isn't it nice?
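As a final check, here is a small SymPy.jl sketch of my own that follows the steps above for the Legendre case (with the choice \(c_1 = 2\)) and reconstructs \(a(x)\) and \(b(x)\) directly from \(w(x) = 1\) on \([-1, 1]\):

using SymPy

x, k = Sym("x"), Sym("k")

w  = x^0                          # weight w(x) = 1
f1 = x                            # the first-degree polynomial found earlier (a10 = 0)

# b(x) from the n = 1 equation  b*f1' + c1*f1 = 0, choosing c1 = 2
b = -2 * f1 / diff(f1, x)

# a(x) from d/dx[w*a] = w*b, i.e. w*a = ∫ w*b dx + k
a = (integrate(w * b, x) + k) / w

# fix the integration constant from the boundary condition w*a = 0 at x = 1
# (the condition at x = -1 is then satisfied automatically)
ksol = solve(subs(w * a, x => 1), k)[1]
a = subs(a, k => ksol)

println("a(x) = ", a, ",   b(x) = ", b)   # expect a(x) = 1 - x^2, b(x) = -2*x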
I hope you learned something new and enjoyed this article.
If you have any queries, let me know in the comments or contact me using the information given on the About Me page.