Legendre's Transformation: An Intuitive Approach

The Legendre transformation is an involutive transformation on real-valued functions that are convex in a real variable. Specifically, if a real-valued multivariable function is convex in one of its independent real variables, then the Legendre transform with respect to this variable can be applied to the function.

It uses the fact that points and lines are related by a kind of duality, which lets us relate different ideas in physics to each other. Although we usually learn it as just a transformation in Classical Mechanics and Thermodynamics, it is really simple and intuitive.

In this blog, we will understand exactly that.

🪶 Poem A curve of thought, so smooth, so sly,
Hides truth in tangents passing by.
Each slope a whisper, each line a clue,
Trade x for p — the world feels new.
In mirrors of math, two forms agree,
Lagrange and Hamilton — duality’s key.
Where physics and geometry softly rhyme,
Lives the Legendre’s timeless design.

---K.A.Rousan

Introduction

Suppose we have some function \(y=f(x)\). We can draw it as a curve on the x-y plane, so all the information of the function is stored in that graph. But, as can be seen below, if we just draw all the tangent lines of the graph, we get the curve itself, i.e., all the information of the curve is stored in its tangent lines. To be more precise, it is stored in the y-intercepts of its tangent lines (though not for all types of functions).
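The claim that tangent lines alone rebuild the curve can be checked numerically. The following is only a minimal sketch, assuming the convex example \(f(x)=x^2\); the helper names `tangent_line` and `envelope` are made up for this illustration. For a convex function every tangent line lies on or below the curve, so the largest value taken by any tangent line at a point should reproduce the function there.

```python
# Sketch: recover f(x) = x**2 using nothing but its tangent lines.
def f(x):
    return x * x

def df(x):
    return 2.0 * x

def tangent_line(x0):
    """Return (slope, y_intercept) of the tangent to f at x0."""
    slope = df(x0)
    y_intercept = f(x0) - slope * x0
    return slope, y_intercept

# Tangent lines taken at 601 evenly spaced points x0 in [-3, 3].
tangents = [tangent_line(-3.0 + 0.01 * k) for k in range(601)]

def envelope(x):
    """Upper envelope: the largest value any stored tangent line takes at x."""
    return max(m * x + c for m, c in tangents)

for x in (-2.0, -0.5, 0.0, 1.0, 2.5):
    print(x, f(x), envelope(x))
```

The two printed columns should agree up to the spacing of the sampled tangent points, since only the slopes and intercepts were stored.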

Transforming one function into another without information loss

To go further, let's start with the function:

\[ f(x) = x^2 \]

What kind of information is present here? The answer is that for any value \(x\) on the real line we have a value \(f(x)=x^2\). Our goal is to transform \(f\) from depending on \(x\) to a new function \(g\) that depends on \(p\). But the catch is: we don't want to lose any information in the process.

This is done through the derivative. If we take the derivative of \(f(x)\), we get,

\[ p(x) = \frac{df}{dx} = 2x \]

So, \(p\) gives the slope of the tangent line at every \(x\). In this specific example, every value of \(x\) gives a specific value of \(p\) and vice versa, i.e., there is a 1-to-1 correspondence between \(x\) and \(p\). This allows us to consider another function \(x(p)\), i.e.,

\[ x(p) = \frac{p}{2} \]

Hence, we have two ways of understanding:

  1. Either \(p\) as a function of \(x\).

  2. Or \(x\) as a function of \(p\).

We can then plug \(x\) as a function of \(p\) directly into \(f\), making it a function of \(p\). This gives us:

\[ f(x(p)) = \frac{p^2}{4} = g(p) \]

It seems we have succeeded in transforming \(f\) from a function that depends on \(x\) to a new function that depends on \(p\) without losing any information. But sadly it's not that simple.

Finding the lost information

To understand this, now consider,

\[ f(x) = (x-d)^2 \]

In this case \(df/dx = 2(x-d) = p\), so \(x(p) = p/2 + d\). If we plug it back, we get,

\[ f(x(p)) = \frac{p^2}{4} \]

again! So, whatever value we choose for \(d\), we get the same end function. The information about \(d\) is lost in the procedure we are currently following. The reason is very simple: we have only been considering the slopes of the tangent lines, and each of these functions has exactly the same set of slopes, just shifted along \(x\).
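A quick numerical check of this loss of information, as a sketch only (the function name `naive_transform` and the sample values of \(d\) and \(p\) are arbitrary choices for illustration): substituting \(x(p)\) back into \(f\) gives the same numbers for every shift \(d\).

```python
def naive_transform(d, p):
    """Substitute x(p) into f(x) = (x - d)**2, where p = f'(x) = 2 * (x - d)."""
    x = p / 2.0 + d        # invert p = 2 * (x - d)
    return (x - d) ** 2    # f evaluated at x(p)

slopes = (-2.0, 0.0, 1.0, 4.0)
for d in (0.0, 1.0, -2.5):
    # Each row should match p**2 / 4 regardless of d.
    print(d, [naive_transform(d, p) for p in slopes])
```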

Fortunately, there is a closely related quantity that captures the uniqueness of each of these functions: the \(y\)-intercepts of the tangent lines, which do differ depending on the original function. Taking them into account solves this problem.

Let's say \((x,y)=(x,f)\) is a point on the parabola (\(y=f(x)=x^2\)). Then, as mentioned, the slope is \(df/dx = p\), and we write the \(y\)-intercept of the tangent line as \((0,-g)\). This \(g\) is given by,

\[ p = \frac{f + g}{x - 0} \implies g = px - f \]

This truly captures all the information of the function.

📝 Note Here, as the intercept is never positive for this parabola, I have already taken the intercept to be \((0,-g)\). So, above, \(g\) is the distance from the origin to the intercept.
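To see that the intercept really does tell the shifted parabolas apart, here is a small sketch for \(f(x) = (x-d)^2\). The helper name `tangent_data` and the sample values are assumptions made only for this illustration. It computes the tangent's \(y\)-intercept directly and compares it with \(g = px - f\).

```python
def tangent_data(d, p):
    """For f(x) = (x - d)**2, return (y_intercept, g) of the tangent with slope p."""
    x = p / 2.0 + d             # point where f'(x) = 2 * (x - d) equals p
    fx = (x - d) ** 2
    y_intercept = fx - p * x    # tangent y = fx + p * (t - x), evaluated at t = 0
    g = p * x - fx              # g = p*x - f
    return y_intercept, g

for d in (0.0, 1.0, -2.0):
    # The intercept equals -g, and it changes with d even though the slopes do not.
    print(d, [tangent_data(d, p) for p in (-2.0, 1.0, 4.0)])
```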

Legendre Transformation

Introduction

Finally, in general, if we have a function \(f(x)\), then its Legendre transformation is,

\[ g(p) = p\cdot x(p) - f(x(p)) \]

i.e., a transformation that converts one function into another whose variable is the slope of the original one, such that no information is lost.

For our two functions \(f_1(x) = x^2\) and \(f_2(x) = (x-d)^2\), we get,

\[ g_1(p) = \frac{p^2}{4} \text{ \ \ and\ \ } g_2(p) = \frac{p^2}{4} + pd \]

We have now included the information of the \(y\)-intercept too, so each of the original functions of \(x\) maps to a unique function of \(p\). Also, because no information has been lost, we can take the Legendre transform of each \(g\) and get back the original \(f\).
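The recipe \(g(p) = p\,x(p) - f(x(p))\) can also be carried out numerically. The sketch below is one possible way to do it, not the only one: it finds \(x(p)\) by bisection on \(f'(x) = p\), which assumes \(f'\) is increasing on the search interval. The function name `legendre`, the interval \([-100, 100]\), and the shift `D = 1.5` are all choices made for this illustration. The results are printed next to the closed forms \(p^2/4\) and \(p^2/4 + pd\).

```python
def legendre(f, df, p, lo=-100.0, hi=100.0):
    """g(p) = p*x(p) - f(x(p)), with x(p) found by bisection on df(x) = p.

    Assumes df is increasing on [lo, hi] and df(lo) <= p <= df(hi).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if df(mid) < p:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return p * x - f(x)

D = 1.5  # hypothetical shift d, chosen only for the demo

def f1(x):
    return x * x

def df1(x):
    return 2.0 * x

def f2(x):
    return (x - D) ** 2

def df2(x):
    return 2.0 * (x - D)

for p in (-3.0, 0.0, 1.0, 4.0):
    g1 = legendre(f1, df1, p)
    g2 = legendre(f2, df2, p)
    print(p, g1, p * p / 4.0, g2, p * p / 4.0 + p * D)
```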

🤔 Problem Try showing that if we start from \(g_1\) and \(g_2\), we can recover the original functions \(f_1\) and \(f_2\).

We now truly have two ways of expressing the same information: one that depends on \(x\) and the other on \(p\), providing a beautiful duality between points and lines. This mapping is 1-to-1, i.e., unique.

What functions respect Legendre transformation?

Now the question is: does it work for all functions? Sadly, the answer is no!

To motivate the answer, let's start with the function \(f(x) = x^3\). We do exactly the same things we have done before,

\[ f(x) = x^3 \implies f'(x) = 3x^2 = p(x) \implies x = \pm \Big(\frac{p}{3}\Big)^{1/2} \]

Immediately we run into a problem. This \(x(p)\) is not a function, because the map from \(x\) to \(p\) is not bijective. Any positive value we put in for \(p\) gives us two different values of \(x\). This means there are many pairs of points that have the same slope. We can see it in the interactive plot below. With the exception of the origin (not Gojo Satoru), we are always able to find two points that have exactly the same value of \(p\). In terms of \(f(x)\), these points produce exactly the same slope for their tangent lines.
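The ambiguity is easy to make concrete. In this sketch (the helper names `points_with_slope` and `candidate_g` are made up for illustration), a single slope \(p = 12\) is produced by two different points, and \(px - f\) takes two different values there, so no single-valued \(g(p)\) comes out.

```python
import math

def slope(x):
    return 3.0 * x * x          # derivative of x**3

def points_with_slope(p):
    """All x with 3 * x**2 == p (assumes p >= 0)."""
    r = math.sqrt(p / 3.0)
    return [0.0] if r == 0.0 else [-r, r]

def candidate_g(x):
    """p*x - f(x) evaluated at the point x, with p = f'(x)."""
    p = slope(x)
    return p * x - x ** 3

for x in points_with_slope(12.0):
    # Same slope at both points, but two different values of p*x - f.
    print(x, slope(x), candidate_g(x))
```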

This gives us the key thing we need in order to find the Legendre transform of a given function. We need all the slopes of the tangent lines to be unique, and this happens if the function's derivative is always increasing or always decreasing, i.e., strictly monotonic. From \(f\)'s point of view, for \(f\) to have a Legendre transformation, \(f(x)\) must be convex (or concave, with all the signs flipped), which means that if we connect any two points on the curve, the line segment always lies on the same side of the curve. To check if a function is convex, we can just find its second derivative, and if,

\[ f''(x)\geq 0 \text{ \ \ \ \ \ for all } x \text{ \ in the domain} \]

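then \(f(x)\) is convex. If the inequality is strict, \(f''(x) > 0\), the derivative is strictly increasing and every slope occurs only once. With all of this, we now know what the Legendre transform is and which functions we can apply it to.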
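When the second derivative is awkward to compute by hand, a rough numerical version of this test is possible. The sketch below uses a central finite difference on a grid. The names `second_derivative` and `looks_convex`, the step `h`, and the tolerance are assumptions for illustration, and passing the check on a finite grid only suggests convexity, it does not prove it.

```python
def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def looks_convex(f, lo, hi, n=400, tol=1e-6):
    """True if the estimated f'' is >= -tol at n + 1 grid points in [lo, hi]."""
    step = (hi - lo) / n
    return all(second_derivative(f, lo + k * step) >= -tol for k in range(n + 1))

print(looks_convex(lambda x: x * x, -5.0, 5.0))    # parabola on [-5, 5]
print(looks_convex(lambda x: x ** 3, -5.0, 5.0))   # cubic on [-5, 5]
print(looks_convex(lambda x: x ** 3, 0.1, 5.0))    # cubic restricted to x > 0
```

Restricting \(x^3\) to positive \(x\) makes the check pass, which matches the picture above: on that half-line every slope occurs only once.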

But does it only work on single-variable functions? Well, no! We can apply the idea to multivariable functions too, and there can also be some variables which don't participate in the transformation. Let's see this:

Legendre Transform of multi-variable functions

Suppose we have a function \(f(x_1,x_2,\cdots, x_n, u_1,u_2,\cdots,u_m)\), and let's say we want to create a function for studying the same system but with \(p_1,p_2,p_3,\cdots , p_n\), where \(p_i = \partial f/\partial x_i\), keeping the \(u_i\)'s as they are. Then, the Legendre transform is,

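\[ g(p_1,p_2,\cdots,p_n,u_1,u_2,\cdots,u_m) = \sum_{i=1}^{n} p_i x_i - f(x_1,x_2,\cdots, x_n,u_1,u_2,\cdots,u_m) \]

where each \(x_i\) is rewritten in terms of the \(p\)'s and \(u\)'s by inverting the relations \(p_i = \partial f/\partial x_i\). We can easily show that \(g\) is a function of the \(p\)'s and \(u\)'s only.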

📝 Note Visit this page for the proof.
Here, for the multivariable case, we check convexity using the Hessian matrix with respect to the \(x_i\)'s: if it is positive semi-definite, the function is convex.
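As a concrete multivariable sketch, take the hypothetical function \(f(x_1,x_2,u) = x_1^2 + x_1x_2 + x_2^2 + u\,x_1\), with two active variables and one passive variable \(u\). This example is not from the derivation above; it is chosen because its Hessian in \((x_1,x_2)\) is positive definite and the relations \(p_i = \partial f/\partial x_i\) are linear, so they can be inverted by hand. The code also checks the standard reciprocal relation \(\partial g/\partial p_1 = x_1\) by finite differences.

```python
def f(x1, x2, u):
    return x1 * x1 + x1 * x2 + x2 * x2 + u * x1

def slopes(x1, x2, u):
    """p_i = df/dx_i for the active variables x1 and x2."""
    return 2.0 * x1 + x2 + u, x1 + 2.0 * x2

def x_of_p(p1, p2, u):
    """Invert the linear system p1 = 2*x1 + x2 + u, p2 = x1 + 2*x2."""
    a = p1 - u
    return (2.0 * a - p2) / 3.0, (2.0 * p2 - a) / 3.0

def g(p1, p2, u):
    """Legendre transform in (x1, x2); u is carried along unchanged."""
    x1, x2 = x_of_p(p1, p2, u)
    return p1 * x1 + p2 * x2 - f(x1, x2, u)

x1, x2, u = 1.0, -2.0, 0.5
p1, p2 = slopes(x1, x2, u)
h = 1e-5
dg_dp1 = (g(p1 + h, p2, u) - g(p1 - h, p2, u)) / (2.0 * h)
print(p1, p2, g(p1, p2, u), dg_dp1, x1)
```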

Application of Legendre Transform

There are many applications of this transformation, but the most iconic ones are in Classical Mechanics and Thermodynamics. As the thermodynamics one is much more popular, we will see that one here.

As we know, the internal energy (\(U\)) of a system is a function of state, which means that a system undergoes the same change in \(U\) when we move it from one equilibrium state to another, irrespective of which route we take through parameter space. This makes \(U\) a very useful quantity. But we know \(U = U(S,V)\), i.e., it is a function of entropy and volume (keeping the particle number constant).

But to be honest, \(S\) is very hard to control. Controlling \(T\) and \(p\) is much easier. So what do we do?

To truly understand it, let's start with \(U\). From the 1st law of thermodynamics,

\[ dU = \Bigg(\frac{\partial U}{\partial S}\Bigg)_V dS+ \Bigg(\frac{\partial U}{\partial V}\Bigg)_S dV = T\ dS - p\ dV \]

This gives us,

\[ T = \Bigg(\frac{\partial U}{\partial S}\Bigg)_V \text{\ \ \ and \ \ \ } p = -\Bigg(\frac{\partial U}{\partial V}\Bigg)_S \]

So, for a constant-volume process of our system, \(dU = T\ dS\).
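These two derivative relations can be tried out on a toy model. The sketch below assumes a monatomic ideal gas, for which \(U(S,V) = C\,V^{-2/3}\exp\big(2S/(3Nk)\big)\). This model, the units \(Nk = 1\), the constant \(C = 1\), and the helper name `partial` are assumptions brought in for illustration, not part of the derivation above. Taking \(T = (\partial U/\partial S)_V\) and \(p = -(\partial U/\partial V)_S\) by finite differences should reproduce the familiar \(pV = NkT\) and \(U = \tfrac{3}{2}NkT\).

```python
import math

NK = 1.0  # N * k_B, set to 1 (assumed units)
C = 1.0   # arbitrary positive constant (assumption)

def U(S, V):
    """Internal energy of the toy monatomic ideal gas."""
    return C * V ** (-2.0 / 3.0) * math.exp(2.0 * S / (3.0 * NK))

def partial(fun, args, i, h=1e-6):
    """Central finite difference of fun with respect to args[i]."""
    up = list(args)
    up[i] += h
    dn = list(args)
    dn[i] -= h
    return (fun(*up) - fun(*dn)) / (2.0 * h)

S, V = 1.2, 3.0
T = partial(U, (S, V), 0)     # T = (dU/dS) at fixed V
p = -partial(U, (S, V), 1)    # p = -(dU/dV) at fixed S
print(T, p, p * V, NK * T, U(S, V), 1.5 * NK * T)
```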

But now we want to study a system where the temperature is constant. How are we going to study such systems? How are we going to study the energy change?

The answer: just find a function which is also an energy but has \(T\) as one of its parameters. As we know, \(S\) and \(T\) are related, and as we just showed, they are related by a derivative (like \(p\) and \(x\) in the previous examples). This should shout the name of our legendary Legendre Transformation.

So, we will define,

\[ F = U - TS \]

(We can also use \(TS-U\), but then the physical interpretations will not be straightforward, so we use the convention \(U-TS\)). This then tells us,

\[ dF = T\ dS - p\ dV - T\ dS - S\ dT = -S\ dT - p\ dV \]

This implies that the natural variables for \(F\) are \(V\) and \(T\), i.e., \(F=F(V,T)\). Then for an isothermal process, we have,

\[ dF = -p\ dV \implies \Delta F = -\int_{V_1}^{V_2}p\ dV \]

Hence, a positive change in \(F\) represents reversible work done on the system by the surroundings, while a negative change in \(F\) means reversible work done on the surroundings by the system.
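Here is a small numerical sketch of this statement. It assumes a monatomic ideal gas in units where \(Nk = 1\), so that \(U = \tfrac{3}{2}NkT\), \(p = NkT/V\) and \(S(T,V) = \tfrac{3}{2}Nk\ln\big(\tfrac{3}{2}NkT\,V^{2/3}/C\big)\) with an arbitrary constant \(C\). The model, the constants, and the function names are assumptions for illustration only. The change in \(F = U - TS\) between two volumes at fixed \(T\) is compared with \(-\int p\,dV\) and with the closed form \(-NkT\ln(V_2/V_1)\).

```python
import math

NK = 1.0  # N * k_B, set to 1 (assumed units)
C = 1.0   # arbitrary positive constant (assumption)

def entropy(T, V):
    """S(T, V) for the toy monatomic ideal gas."""
    return 1.5 * NK * math.log(1.5 * NK * T * V ** (2.0 / 3.0) / C)

def helmholtz(T, V):
    """F = U - T*S with U = 1.5 * NK * T for the toy gas."""
    return 1.5 * NK * T - T * entropy(T, V)

def pressure(T, V):
    return NK * T / V

def minus_work_integral(T, V1, V2, steps=20000):
    """Midpoint-rule estimate of -integral from V1 to V2 of p dV at fixed T."""
    dV = (V2 - V1) / steps
    return -sum(pressure(T, V1 + (k + 0.5) * dV) for k in range(steps)) * dV

T, V1, V2 = 2.0, 1.0, 3.0
delta_F = helmholtz(T, V2) - helmholtz(T, V1)
print(delta_F, minus_work_integral(T, V1, V2), -NK * T * math.log(V2 / V1))
```

For this expansion the three numbers should agree and be negative (the gas does work on the surroundings); swapping `V1` and `V2` gives a compression with a positive change in \(F\).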

We can also see that \(dF = dU - T\ dS\) (for constant \(T\)). From the first law, \(dU = dQ + dW\) with \(dQ \leq T\ dS\), so \(dW\geq dU - T\ dS\) (equality holds for a reversible process). So, we have,

\[ dW\geq dF \]

What this means is that adding work to the system increases the system's \(F\) (called the Helmholtz energy). In a reversible process, \(dW = dF\) and the work added to the system goes directly into increasing the Helmholtz energy. If we extract a certain amount of work from the system (\(dW<0\)), then this will be associated with at least as big a drop in the system's \(F\).

This shows how powerful the Legendre Transformation is. Using it, we just found a quantity \(F\) which is also an energy (check the dimensions), and not just any energy: it represents a very powerful one, the free energy. It tells us how much energy is available for us to extract from the system as work at constant temperature.

In this way, we can find many more useful things. Maybe in another blog I will discuss them.

🤔 Problem Suppose we want to study our system under an isobaric process (\(p=\) constant). Try creating a new type of energy for that from \(U(S,V)\).

Also, try finding what its physical interpretation will be.

If you guys are interested, read:

  1. A Student's Guide to Entropy by Lemons

  2. An Introductory Course of Statistical Mechanics by P.B. Pal

  3. Making Sense of the Legendre Transform by R. K. P. Zia, E. F. Redish, and S. R. McKay, American Journal of Physics 77 (7): 614 (2009)

I have mainly followed the 3rd one along with the 2nd one.


I hope you learned something new and enjoyed this article.

If you have any queries, do let me know in the comments or contact me using the information given on the About Me page.

CC BY-SA 4.0 Kazi Abu Rousan. Last modified: October 31, 2025. Website built with Franklin.jl and the Julia programming language.