Introduction
In this post, I introduce gauge equivalence, and also investigate a few different types of reduction under symmetry (to build out a taxonomy).
If you haven’t followed along, in the last few posts we introduced the Lagrangian in the context of geometric controls. We then proved Noether’s theorem with time and applied it to similar systems.
Epistemic status: This post is still a bit rough: these are my informal notes navigating this subject. I’m more interested (ultimately) in computation so I’m not necessarily aiming for maximal rigour, and in fact I probably need to introduce more geometric machinery (bundles, connections, differential forms, symplectic geometry) to make the exposition cleaner and more rigorous. See the read more section for more rigorous sources.
Equivalence
Before reducing anything, let’s introduce a new notion of “equivalence” (and recall one we saw before).
Gauge Equivalence
We once again consider a manifold \(Q\) and a system with start and end configurations \(q_0, q_N \in Q\).
The Lagrangian is \(L(t, q, \dot q)\), and the action is:
\[ S[q] = \int_{t_0}^{t_N} L(t, q, \dot q) \ dt \]
Let’s consider the adjusted action
\[ S[q] = \int_{t_0}^{t_N} L(t, q, \dot q) \ dt + C \]
where \(C \in \mathbb{R}\) is some constant. Clearly, \(C\) does not affect the minimizing path for \(S[q]\).
Next, consider some function \(F: \mathbb{R} \times Q \to \mathbb{R}\). Let \(C = F(t_N, q(t_N)) - F(t_0, q(t_0))\), so the action becomes:
\[ S[q] = \int_{t_0}^{t_N} L(t, q, \dot q) \ dt + F(t_N, q(t_N)) - F(t_0, q(t_0)) \]
but this is equal to
\[ S[q] = \int_{t_0}^{t_N} L(t, q, \dot q) + \frac{d}{dt}[F(t, q)] \ dt \]
Therefore, given some Lagrangian \(L(t, q, \dot q)\), we can add an arbitrary \(\frac{d}{dt}[F(t, q)]\) without changing the underlying mechanics. That is, two Lagrangians \(L\) and \(L' = L + \frac{dF}{dt}\) produce the same Euler-Lagrange equations.
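To spell out why the Euler-Lagrange equations agree, here is the computation in a single coordinate \(q\) (a sketch; I'm assuming \(F\) is smooth, and the general case works the same way component by component). Expanding the total derivative:
\[ \frac{dF}{dt} = \partial_t F + \partial_q F \, \dot q \]
Applying the Euler-Lagrange operator to this extra term gives
\[ \frac{d}{dt}\frac{\partial}{\partial \dot q}\left[\frac{dF}{dt}\right] - \frac{\partial}{\partial q}\left[\frac{dF}{dt}\right] = \left(\partial_t\partial_q F + \partial_q^2 F \, \dot q\right) - \left(\partial_q\partial_t F + \partial_q^2 F \, \dot q\right) = 0 \]
so the added term contributes nothing to the equations of motion.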
By analogy with our previous posts, if for some group \(G\) and some \(g \in G\), we have
\[ L(t, \Phi_g(q), T\Phi_g(\dot{q})) = L(t, q,\dot{q}) + \frac{d}{dt}F_g(t, q) \]
we say that \(\Phi_g\) is a quasi-symmetry of \(L\), and that \(L\) and the transformed Lagrangian \(L'(t, q, \dot q) := L(t, \Phi_g(q), T\Phi_g(\dot q))\) are “gauge-equivalent”1.
If we combine this with our view of equivariance, we get:
\[ L(t, \Phi_g(q), T\Phi_g(\dot{q})) = \chi(g)\cdot L(t, q,\dot{q}) + \frac{d}{dt}F_g(t,q) \]
We can consider two Lagrangians \(L\) and \(L'\) to be “equivalent” if \(L' = \chi(g) \cdot L + \frac{dF}{dt}\) for some \(g \in G\) and some function \(F\).
In discrete coordinates, this becomes
\[ L'_d(t_k, q_k, q_{k+1}) = L_d(t_k, q_k, q_{k+1}) + F_{k+1}(q_{k+1}) - F_{k}(q_k) \]
The \(F_k\) telescope away, leaving the actual dynamics the same.
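A quick numerical check of the telescoping, as a minimal sketch. The discrete Lagrangian (a midpoint rule for a unit-mass oscillator), the gauge function `F`, and the two sample paths are all arbitrary picks of mine for illustration, not anything from the earlier posts.

```python
import math

def L_d(t_k, t_next, q_k, q_next):
    """Hypothetical discrete Lagrangian: midpoint rule, unit-mass harmonic oscillator."""
    h = t_next - t_k
    v = (q_next - q_k) / h
    q_mid = 0.5 * (q_k + q_next)
    return h * (0.5 * v * v - 0.5 * q_mid * q_mid)

def F(t, q):
    """Arbitrary smooth gauge function F(t, q), chosen only for illustration."""
    return t * math.sin(q) + q ** 3

def L_d_gauged(t_k, t_next, q_k, q_next):
    # L'_d = L_d + F_{k+1}(q_{k+1}) - F_k(q_k), with F_k(q) := F(t_k, q)
    return L_d(t_k, t_next, q_k, q_next) + F(t_next, q_next) - F(t_k, q_k)

def action(lagrangian, ts, qs):
    """Discrete action: sum of the discrete Lagrangian over consecutive pairs."""
    return sum(lagrangian(ts[k], ts[k + 1], qs[k], qs[k + 1])
               for k in range(len(qs) - 1))

ts = [0.1 * k for k in range(6)]
path_a = [0.0, 0.3, 0.1, -0.2, 0.4, 1.0]
path_b = [0.0, -0.5, 0.7, 0.2, 0.9, 1.0]  # same endpoints, different interior

# The two actions should differ only by the boundary term, whatever the path.
boundary = F(ts[-1], 1.0) - F(ts[0], 0.0)
diff_a = action(L_d_gauged, ts, path_a) - action(L_d, ts, path_a)
diff_b = action(L_d_gauged, ts, path_b) - action(L_d, ts, path_b)
```

Both `diff_a` and `diff_b` should match `boundary` up to floating-point error. Since the difference doesn't depend on the interior points, it can't change which path is stationary.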
The situation should be the same whether or not the Lagrangian depends explicitly on \(t\).
The Noether charge is slightly modified in this case. We have an extra term:
\[ K(t, q): = \frac{d}{d\epsilon}[F_{\epsilon}(t, q)]\bigg|_{\epsilon=0} \]
And the Noether charge is
\[ J = p\cdot\omega_Q(q) - K \]
The proof (sketched in a later section) differs from the original in that the variation of the action under the symmetry is no longer \(0\), but instead equals the variation of the total-derivative (boundary) term.
Equivalence under Equivariance
Here we have
\[ L(\Phi_g(q), T\Phi_g(\dot q)) = \chi(g)L(q, \dot q) \]
If we have some \(\chi(g) \in \mathbb{R}_{>0}\), really we want to work in quotient space
\[ L \sim cL \]
for \(c \in \mathbb{R}_{>0}\).
We’ve already covered this in the dynamical similarity post so I won’t belabor it. Essentially we end up with
\[ J := p \cdot \omega_Q(q) - k\int_{t_0}^t L(q(t'), \dot q(t'))dt' \]
where \(k := \frac{d}{d\epsilon}\log\chi(g(\epsilon))\bigg|_{\epsilon=0}\). The \(\log\) shows up due to maps between \(\mathbb{R}\) and \(\mathbb{R}_{>0}\).
Can we look at this with respect to “general representations”, i.e. more complex characters? It seems not: we would need generalized Lagrangians (i.e. not just scalar-valued), which is out of scope for this post.
Equivariant “Reduction”
Equivariance won’t help us lower dimension the way quotienting by \(G\) does, since it only tells us when different-looking Lagrangians describe the same trajectories up to scaling. But it did allow us to produce new coordinates that index entire families of solutions. This isn’t “reduction” per se, but reparametrization.
What should this look like? (Presented without proof, we saw the simplified version in the last Kepler proof).
There’s some representation \(\rho: G \to GL(V)\) and character (homomorphism) \(a: G \to \mathbb{R}_{>0}\), acting by \[ q' = \rho(g)q \] \[ t' = a(g)t \]
Such that
\[ L(a(g)t, \rho(g)q, \frac{\rho(g)}{a(g)}\dot q) = \chi(g)L(t, q, \dot q) \]
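For a concrete instance (my own quick check, using the Kepler Lagrangian \(L = \frac{m}{2}||\dot q||^2 + \frac{k}{||q||}\)), take \(G = \mathbb{R}_{>0}\) with \(\rho(\lambda) = \lambda\) and \(a(\lambda) = \lambda^{3/2}\). Velocities then scale by \(\lambda / \lambda^{3/2} = \lambda^{-1/2}\), and
\[ L(\lambda^{3/2}t, \lambda q, \lambda^{-1/2}\dot q) = \frac{m}{2}\lambda^{-1}||\dot q||^2 + \frac{k}{\lambda||q||} = \lambda^{-1}L(t, q, \dot q) \]
so in this case \(\chi(\lambda) = \lambda^{-1}\).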
Even more generally, with \(\Phi_g\) and \(\tau_g\) diffeomorphisms (not necessarily linear), hand-waving
\[ L\!\left( \tau_g(t,q),\; \Phi_g(t,q),\; \frac{\partial_q\Phi_g(t,q)\,\dot q + \partial_t \Phi_g(t,q)}{\partial_t \tau_g(t,q) + \partial_q \tau_g(t,q)\,\dot q} \right) = \chi(g)L(t, q, \dot q) \]
This would be cool if we wanted to “transport” our solutions around between frames.
Summary
Putting it together:
\[ L(t, \Phi_g(q), T\Phi_g(\dot{q})) = \chi(g)L(t, q,\dot{q}) + \frac{d}{dt}F_g(t, q) \]
And we can define some equivalence relations in terms of gauge equivalence and equivariance.
As an aside:
It is interesting to ask under what conditions these relations are compatible with composition of the group action. That is,
\[ L(\Phi_{gh}(q), T\Phi_{gh}(\dot{q})) = L(\Phi_g(\Phi_h(q)), T\Phi_{g}(T\Phi_h(\dot q))) \]
so
\[ \chi(gh)L(t, q,\dot{q}) + \frac{d}{dt}F_{gh}(t, q) = \chi(g)\chi(h)L(t, q, \dot q) + \chi(g)\frac{d}{dt}F_h(t, q)+ \frac{d}{dt} F_g(t, \Phi_h(q)) \]
We know \(\chi(gh) = \chi(g)\chi(h)\).
So we would need
\[ \frac{d}{dt}F_{gh}(t, q) = \chi(g)\frac{d}{dt}F_h(t, q) + \frac{d}{dt} F_g(t, \Phi_h(q)) \]
This may be interesting if we ever want to classify Lagrangians.
Reduction
Now that we have some equivalence relations on \(L\), it makes sense to work in “reduced” space of Lagrangians, modulo symmetry. We’ll look at the equivalence relations above, plus others. Let’s go through each type of reduction one-at-a-time.
1. Gauge “Reduction”
As established, if \(\exists F: \mathbb{R} \times Q \to \mathbb{R}\) such that \(L'(t, q, \dot q) = L(t, q, \dot q) + \frac{dF}{dt}\), we write \(L' \sim L\).
What’s the point of adding this extra \(\frac{dF}{dt}\) term? Why care about it?
For some systems, we may not have “symmetries”, but by adding an extra term we can enforce a quasi-symmetry on the system.
Example
Consider the following Lagrangian on \(Q = \mathbb{R}^2\), where \(A : \mathbb{R}^2 \to \mathbb{R}^2\) is some vector field:
\[ L(t, q, \dot q) =\frac{1}{2}m(\dot q \cdot \dot q) + A(q) \cdot \dot q \]
As written, the system is not necessarily invariant under rotation by \(\theta\).
Let
\[ R_{\theta} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \]
And consider new coordinates \(q' = R_{\theta}q\), \(\dot q' = R_{\theta}\dot q\)
We know the first term is invariant to rotation:
\[ \frac{1}{2}m||\dot q'||^2 = \frac{1}{2}m||R_{\theta}\dot q||^2 = \frac{1}{2}m||\dot q||^2 \]
The second term transforms as:
\[ A(q') \cdot \dot q' = A(R_{\theta}q) \cdot R_{\theta}\dot q \]
If there exists a function \(F_\theta : Q \to \mathbb{R}\) such that
\[ A(R_{\theta}q) \cdot R_{\theta}\dot q = A(q) \cdot \dot q + \frac{d}{dt}[F_{\theta}(q)] \]
then it would be quasi-invariant. When is this true?
Rearranging, we would have to have \[ (R_{\theta}^{\top}A(R_{\theta}q) - A(q)) \cdot \dot q = \frac{d}{dt} [F_{\theta}(q)] \]
Since it’s true that \[ \frac{d}{dt} [F_{\theta}(q)] = \nabla F_{\theta} \cdot \dot q \]
So we conclude it is quasi-invariant iff \[ R_{\theta}^{\top}A(R_{\theta}q) - A(q) = \nabla F_{\theta} \]
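A vector field on \(\mathbb{R}^2\) is a gradient only if its mixed partials agree (its curl vanishes). The curl of \(R_{\theta}^{\top}A(R_{\theta}q)\) at \(q\) equals the curl of \(A\) at \(R_{\theta}q\), so (skipping one or two steps) we end up needing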
\[ (\partial_xA_y - \partial_yA_x)(q) = (\partial_xA_y - \partial_yA_x)(R_{\theta}q) \]
Call \(B := (\partial_xA_y - \partial_yA_x)(q)\). So this is true if \(B\) is invariant under rotation.
We will need \(B\) in the next section.
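To make the criterion concrete, here is a minimal finite-difference sketch. The vector potential `A` below is a hypothetical example of mine: a rotation-equivariant part plus the gradient of \(x^3\), so \(A\) itself is not rotation-equivariant, but by hand \(B = 2 + 4(x^2 + y^2)\) is rotation-invariant. The step size and sample point are arbitrary.

```python
import math

def A(x, y):
    """Hypothetical potential: rotation-equivariant part plus grad(x**3)."""
    r2 = x * x + y * y
    return (-y * (1.0 + r2) + 3.0 * x * x, x * (1.0 + r2))

def rotate(theta, x, y):
    """Apply R_theta to the vector (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def curl(field, x, y, h=1e-5):
    """Central-difference estimate of d_x field_y - d_y field_x."""
    dfy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dfx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dfy_dx - dfx_dy

def defect(theta):
    """The field R^T A(R q) - A(q), which must be a gradient for quasi-invariance."""
    def field(x, y):
        ax, ay = A(*rotate(theta, x, y))
        bx, by = rotate(-theta, ax, ay)  # R^T is rotation by -theta
        a0x, a0y = A(x, y)
        return (bx - a0x, by - a0y)
    return field

theta, x, y = 0.8, 0.3, -0.7
B_here = curl(A, x, y)                     # B at q
B_rotated = curl(A, *rotate(theta, x, y))  # B at R_theta q
defect_curl = curl(defect(theta), x, y)    # should vanish if B is invariant
```

Here `B_here` and `B_rotated` should agree, and `defect_curl` should be zero up to discretization error, which is the mixed-partials condition for \(F_\theta\) to exist.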
Noether Charge
What is the Noether charge?2 Let’s compute it. The transformation is
\[ (t, q) \mapsto (t, R_{\epsilon}q) \]
We need \(\omega_Q(q)\), which is
\[ \frac{d}{d\epsilon}[R_{\epsilon}q]|_{\epsilon=0} = \begin{bmatrix} -\sin(\epsilon) & -\cos(\epsilon) \\ \cos(\epsilon) & -\sin(\epsilon) \end{bmatrix}_{\epsilon=0} q= \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}q = (-y, x) \]
Then we need \(K = \frac{d}{d\epsilon}[F_{\epsilon}]\bigg|_{\epsilon=0}\).
We can rearrange the terms of the Lagrangian with gauge term to get an expression for \(K\). Differentiating the quasi-invariance condition at \(\epsilon=0\) gives
\[ \frac{d}{d\epsilon}L(t,q_\epsilon,\dot q_\epsilon)\Big|_{\epsilon=0} = \frac{d}{dt}K(q) \]
From the usual Noether computation, along solutions the left side equals the time derivative of the usual Noether charge:
\[ \frac{d}{dt}[p\cdot \omega_Q(q)] = \frac{d}{dt}K(q) \]
Thus the adapted charge for the gauge:
\[ \frac{d}{dt}[p\cdot \omega_Q(q) - K] = 0 \implies J := p\cdot \omega_Q(q) - K \]
In the example, the conserved quantity is
\[ J = p \cdot (-y, x) - K \]
We just need to compute this particular \(K\) (this will be relatively difficult since we don’t have a functional expression for \(A\), just some abstract criteria).
First, expanding \(J\) in coordinates and plugging in the components of \(p = \frac{\partial L}{\partial \dot q}\)
\[ J = m(x\dot y - y \dot x) + xA_y - yA_x - K \]
Call \[ M := xA_y - yA_x - K \]
Now, return to our definition of \(K\):
\(K = \frac{d}{d\epsilon}[F_{\epsilon}]\bigg|_{\epsilon=0}\)
Since
\[ A(R_{\theta}q) \cdot R_{\theta}\dot q = A(q) \cdot \dot q + \frac{d}{dt}[F_{\theta}(q)] \]
We get
\[ \nabla K(q) = \frac{d}{d\epsilon}[R_{\epsilon}^{\top}A(R_{\epsilon}q)]\bigg|_{\epsilon=0} \]
We can expand the \(R_{\epsilon}\) using their Taylor approximations
\[ R_{\epsilon} = I + \epsilon \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} + O(\epsilon^2) \]
\[ R_{\epsilon}^{\top} = I - \epsilon \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} + O(\epsilon^2) \]
If we split on the coordinates and rearrange we get
\[ \partial_x K = -y \partial_x A_x+ x\partial_y A_x + A_y \] \[ \partial_y K = -y\partial_x A_y + x \partial_y A_y - A_x \]
Also, the mixed partials of \(K\) must agree
\[ \partial_y \partial_x K = \partial_x \partial_y K \]
If you do a bunch of algebra, and substitute the two equations we have for the partials of \(K\), you can get
\[ y\partial_x B - x\partial_y B = 0 \]
Notice that, for \(M\), we have
\[ \partial_x M = xB \] \[ \partial_y M = yB \]
So we get \[ \nabla M = B(q)\cdot(x, y) \]
And if \(B\) is constant, then we integrate the partials and put them together to get \(M = \frac{1}{2}B(x^2 + y^2)\).
But also
\[ M = xA_y - yA_x - K \]
So the Noether charge reduces ultimately to
\[ J = mx\dot y - my\dot x + \frac{1}{2}B(x^2 + y^2) \]
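Assuming \(B\) is constant, this is the mechanical angular momentum \(m(x\dot y - y\dot x)\) (the mass is included in the \(p\) term) plus a contribution \(\frac{1}{2}B(x^2 + y^2)\) coming from the \(A \cdot \dot q\) term.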
If \(B\) is some other rotation-invariant function, we can integrate \[ \nabla M = B(q)\cdot(x, y) \]
to find the Noether charge.
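As a numerical sanity check of the constant-\(B\) charge, here is a minimal sketch. For constant \(B\), the Euler-Lagrange equations of the example Lagrangian work out (by my own derivation) to \(m\ddot x = B\dot y\), \(m\ddot y = -B\dot x\). I integrate these with a hand-rolled RK4; the parameter values and initial state are arbitrary.

```python
def simulate(m=1.0, B=2.0, dt=1e-3, steps=5000):
    """RK4 integration of m*x'' = B*y', m*y'' = -B*x'; returns J at every step."""
    def deriv(s):
        x, y, vx, vy = s
        return (vx, vy, B * vy / m, -B * vx / m)

    def charge(s):
        # J = m (x y' - y x') + (1/2) B (x^2 + y^2)
        x, y, vx, vy = s
        return m * (x * vy - y * vx) + 0.5 * B * (x * x + y * y)

    def step(s):
        k1 = deriv(s)
        k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = deriv(tuple(a + dt * b for a, b in zip(s, k3)))
        return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                     for a, p, q, r, w in zip(s, k1, k2, k3, k4))

    state = (1.0, 0.0, 0.3, -0.5)  # arbitrary (x, y, vx, vy)
    charges = [charge(state)]
    for _ in range(steps):
        state = step(state)
        charges.append(charge(state))
    return charges

charges = simulate()
drift = max(abs(j - charges[0]) for j in charges)
```

`drift` should be tiny (limited by integrator error), whereas the mechanical angular momentum \(m(x\dot y - y\dot x)\) on its own is not constant along these trajectories unless the orbit is centered at the origin.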
Thoughts on Example
In summary:
- given the \(A\) term
- compute \(B := (\partial_xA_y - \partial_yA_x)(q)\), and check that it is rotation-invariant
- find \(M\) by integrating \(\nabla M = B(q)\cdot(x, y)\)
- this gives \(K = xA_y - yA_x - M\), or just use \(M\) to compute \(J\) directly
If there’s an easier way to do this, I don’t know what it is. This is a bit ugly because \(K\) is gauge-dependent. What we really want is a symbolic way to automatically get \(K\) given the gauge. In the code we can compute \(K\) numerically (or using autodiff).
You’d have to
- decide if a quasi-symmetry exists
- construct \(F_{\epsilon}\)
- differentiate it
- Return \(K\)
ChatGPT 5.2 says step 1 isn’t solvable. So we’d have to supply the symmetry ahead of time (like we already do with regular symmetries), then integrate the gradient of \(K\) (which we can get because we can compute \(\frac{d}{d\epsilon}L(\Phi(q), T\Phi(\dot q))\big|_{\epsilon=0}\)).
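Here is a minimal sketch of that last idea for the rotation example, with finite differences standing in for autodiff. The test potential \(A = (0, B_0 x)\) is my own choice (constant \(B = B_0\)); plugging it into the formulas for \(\partial_x K\) and \(\partial_y K\) gives \(K = \frac{1}{2}B_0(x^2 - y^2)\) by hand, which the code should reproduce. I fix the integration constant by \(K(0) = 0\).

```python
import math

B0 = 2.0

def A(x, y):
    """Hypothetical test potential (Landau-type gauge): A = (0, B0 * x)."""
    return (0.0, B0 * x)

def rotated_potential(eps, x, y):
    """R_eps^T A(R_eps q)."""
    c, s = math.cos(eps), math.sin(eps)
    ax, ay = A(c * x - s * y, s * x + c * y)
    return (c * ax + s * ay, -s * ax + c * ay)

def grad_K(x, y, h=1e-5):
    """Central difference in eps of R_eps^T A(R_eps q) at eps = 0."""
    px, py = rotated_potential(h, x, y)
    mx, my = rotated_potential(-h, x, y)
    return ((px - mx) / (2 * h), (py - my) / (2 * h))

def K(x, y, n=200):
    """Line integral of grad K along s -> s*q (midpoint rule), with K(0) = 0."""
    total = 0.0
    for i in range(n):
        s = (i + 0.5) / n
        gx, gy = grad_K(s * x, s * y)
        total += (gx * x + gy * y) / n
    return total

x, y = 1.0, 2.0
k_value = K(x, y)
ax, ay = A(x, y)
M_value = x * ay - y * ax - k_value  # M = x A_y - y A_x - K
```

For this choice, `k_value` should be close to \(\frac{1}{2}B_0(x^2 - y^2)\) and `M_value` close to \(\frac{1}{2}B_0(x^2 + y^2)\), matching the constant-\(B\) result. The straight-line path only works because \(\nabla K\) is a genuine gradient here, so the integral is path-independent.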
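But none of this really matters for the dynamics: we don’t have to compute \(K\) to simulate the system, because \(K\) comes from a boundary term and the gauge telescopes away in the actual dynamics. We only need \(K\) if we want to evaluate the conserved charge \(J\).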
Also note that this isn’t a true reduction, as it doesn’t really reduce dimension.
2. Configuration Space Reduction
Here we will reduce \(Q\) into a simpler space, by the action of \(G\).
We have a map \(\Phi : G \times Q \to Q\).
Let \(\pi : Q \to \bar Q\), where \(\bar Q := Q/G\). The projection map \(\pi\) sends each element \(q \in Q\) to its corresponding orbit in \(\bar Q\).
How does the associated tangent bundle change under quotient by \(G\)?
Before, for a Lie group \(G\), we had the pair
\[ (g, \omega) \in G \times \mathfrak{g} \cong TG \]
If \(G\) acts on itself by left multiplication (call the acting element \(h \in G\)), we have
\[ h \cdot (g, \dot g) = (hg, h \dot g) \]
If we trivialize this
\[ h\cdot(g, \omega) = h\cdot(g, g^{-1}\dot g) = (hg, (hg)^{-1}(h\dot g)) = (hg, \omega) \]
So \(\omega\) is unaffected by the action, and it survives the quotient.
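A quick numerical check of this cancellation, as a minimal sketch. I use invertible \(2 \times 2\) matrices as a stand-in for \(G\), and the specific matrices `g`, `g_dot`, and `h` are arbitrary.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

g = [[2.0, 1.0], [0.5, 3.0]]       # group element
g_dot = [[0.1, -0.4], [0.7, 0.2]]  # velocity at g
h = [[1.0, 2.0], [-1.0, 1.0]]      # acting element

omega = matmul(inv2(g), g_dot)                              # g^{-1} g_dot
omega_after = matmul(inv2(matmul(h, g)), matmul(h, g_dot))  # (hg)^{-1} (h g_dot)
max_gap = max(abs(omega[i][j] - omega_after[i][j])
              for i in range(2) for j in range(2))
```

`max_gap` should be zero up to rounding, since the \(h^{-1}h\) in the middle cancels.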
Subcase 1: Euler-Poincare Reduction
Let’s consider \(Q = G\). Then \(\bar Q = G/G\), which is a single point. This is the same as our original reduced Lagrangian, which was \(\ell(\text{id}_G, \omega)\).
If you recall, we completely got rid of any dependence on the actual manifold and worked completely in the Lie algebra. So we’ve already solved this case.
Subcase 2: Lagrange-Poincare Reduction
What if \(Q\) is just some arbitrary manifold? What does it even mean to take \(Q/G\), in general? We need to define an equivalence relation.
Consider the orbit of \(q\) with respect to \(G\):
\[ \text{Orb}_G(q) := \{g \cdot q \ | \ g \in G \} \]
We say \(q_1 \sim q_2\) if there exists some \(g \in G\) such that \(g \cdot q_1 = q_2\) (they are in the same orbit).
So we are talking about \(Q/G := \{\text{Orb}_G(q) | \ q \in Q\}\), the set of orbits.
The problem is the induced equivalence relation of \(TQ\). We need the velocities to transform:
\[ (q, v) \mapsto (g \cdot q, T_q(g)v) \]
We can define another equivalence relation in this way. Two elements \((q_1, v_1)\) and \((q_2, v_2)\) are equivalent if there exists a \(g \in G\) such that \((q_2, v_2) = (g \cdot q_1, T_{q_1}(g)v_1)\). This constructs \(TQ/G\).
We also know there’s a map \(\rho\)
\[ \rho: TQ/G \to Q/G = \bar Q \]
that just takes the equivalence classes on \(TQ\) (which are among pairs \((q, v)\)) to their corresponding equivalence classes in \(Q\) (which are among \(q\)). Basically, it forgets the velocity.
If we have some \(\bar q\) we can take the fiber \(\rho^{-1}(\bar q)\). This points back to the entire orbit \(\bar q\) and its associated velocities. The question becomes: how do we resolve the ambiguity of which \(q\) to use as representative?
Pick an arbitrary \(q_0 \in \bar q\) as representative; all other \(q\) in the orbit equal \(g \cdot q_0\) for some \(g \in G\). The equivalence classes over velocities of the vertical part can be represented as
\[ (q_0, \ \omega_Q(q_0)) \]
for some \(\omega \in \mathfrak{g}\)3.
Once a representative \(q_0 \in \bar q\) is fixed, the velocity component along the group orbit is determined by an element \(\omega \in \mathfrak{g}\). What remains is the component of the velocity transverse to the orbit. So we can decompose
\[ \dot q = \dot q_s + \omega_Q(q) \]
Where \(\omega_Q(q)\) is along the orbit and \(\dot q_s\) is the transverse part, which projects onto \(\dot {\bar q} \in T_{\bar q}(Q/G)\).
(This split isn’t canonical, it depends on a choice of connection on \(Q \to Q/G\).)
3. Phase Space Reduction
Subcase 1: Marsden-Weinstein Reduction
Note: I believe the original paper is Marsden & Weinstein (1974), listed in the read more section. I haven’t introduced symplectic geometry so I am omitting that language and keeping things informal.
Let’s say we have a system with Noether charges \(J_1, J_2, ..., J_n\). We can reduce this system by picking corresponding values for each charge \(J_1 = \mu_1, J_2 = \mu_2\), etc., then setting \(J_i(q, p) = \mu_i\). Let’s look in more detail.
We have the space of pairs \((q,p)\) (the phase space aka cotangent bundle):
\[ T^*Q := \{(q,p): q\in Q,\; p\in T_q^*Q\} \]
Suppose a Lie group \(G\) acts on configurations:
\[ \Phi: G \times Q \to Q \] \[ (g,q) \mapsto \Phi_g(q) \]
with tangent map \[ T\Phi_g: T_qQ \to T_{\Phi_g(q)}Q \]
We have:
\[ g \cdot (q,p) := (\Phi_g(q),\ p') \]
We want the transformed momentum \(p'\) to preserve the pairing with velocities:
\[ p'(T\Phi_g(v)) = p(v) \]
Conveniently, for \(\omega \in \mathfrak g\) with induced vector field \(\omega_Q\) on \(Q\) (so \(\omega_Q(q)\in T_qQ\)):
\[ J(q, p)(\omega) = p(\omega_Q(q)) \]
We’ve constructed the “momentum map” \(J: T^*Q \to \mathfrak{g}^*\). This is essentially the definition of a Noether charge \(p \cdot \omega_Q(q)\) but written in functional form (\(p(v) = p \cdot v\)).
We can assign a value to the corresponding Noether charge for that symmetry4:
\[ J(q,p) = \mu \]
Consider the \(\mu\)-preserving symmetries (the elements \(g \in G\) whose coadjoint action leaves \(\mu\) unchanged):
\[ G_{\mu} := \{g \in G | \text{Ad}_g^*\mu = \mu \} \]
So we can think of the reduced state space as
\[ (T^*Q)_{\mu} = J^{-1}(\mu)/G_{\mu} \]
So we’ve restricted the phase space to a specific value of a conserved quantity and then quotiented out the symmetry corresponding to that quantity. This generalizes to multiple conserved quantities when they arise as components of a momentum map (for a product symmetry group) or via staged reduction (multiple commuting symmetries).
Example - Kepler’s Third Law - Marsden-Weinstein Reduction
Let’s return to Kepler’s third law in 2-dimensions.
\[ L(q, \dot q) = \frac{m}{2}||\dot q||^2 - V(q) \]
Let \(V(q) = -\frac{k}{||q||}\)
Convert to polar coordinates, \(q = (r \cos \theta, r \sin \theta)\):
\[ L((r, \theta), (\dot r, \dot \theta)) = \frac{m}{2}(\dot r^2 + r^2 \dot \theta^2) + \frac{k}{r} \]
Now let’s consider
\[ (r, \theta) \mapsto (r, \theta + \epsilon) \]
(Symmetry under \(SO_2\))
The conserved quantity is
\[ J = mr^2\dot\theta \]
Let’s fix this
\[ \mu = mr^2\dot\theta \]
This determines \(\dot \theta\)
\[ \dot \theta = \frac{\mu}{mr^2} \]
(this is angular momentum, the same as the free rotor)
So
\[ L((r, \theta), (\dot r, \dot \theta)) = \frac{m}{2}(\dot r^2 + r^2 (\frac{\mu}{mr^2})^2) + \frac{k}{r} \]
So we’ve reduced the Lagrangian to one dimension (\(r\)).
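However, there’s a problem. The substitution is valid along solutions, but the result is not the right Lagrangian for the reduced system: its Euler-Lagrange equation in \(r\) gives the centrifugal term \(\frac{\mu^2}{2mr^2}\) the wrong sign. We’ve introduced a constraint (we haven’t looked at constraints yet). We’ll need a way to correct that (in the next section).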
The last step is to handle the phase space.
Start with: \[ L(r,\theta,\dot r,\dot\theta)=\frac{m}{2}\left(\dot r^2+r^2\dot\theta^2\right)+\frac{k}{r}. \]
Compute the canonical momenta:
\[ p_r:=\frac{\partial L}{\partial \dot r}=m\dot r,\qquad p_\theta:=\frac{\partial L}{\partial \dot\theta}=mr^2\dot\theta. \]
Invert: \[ \dot r=\frac{p_r}{m},\qquad \dot\theta=\frac{p_\theta}{mr^2}. \]
Then the Hamiltonian is the Legendre transform
\[ H = p_r\dot r + p_\theta\dot\theta - L = \frac{p_r^2}{2m}+\frac{p_\theta^2}{2mr^2}-\frac{k}{r}. \]
So restricting to \(p_\theta=\mu\) gives \[ H_\mu(r,p_r)=\frac{p_r^2}{2m}+\frac{\mu^2}{2mr^2}-\frac{k}{r}. \]
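This is also where Kepler’s third law drops out, at least for circular orbits (my own quick check, not a general proof). A circular orbit is an equilibrium of the reduced system, so
\[ \frac{\partial H_\mu}{\partial r} = -\frac{\mu^2}{mr^3} + \frac{k}{r^2} = 0 \implies r_c = \frac{\mu^2}{mk} \]
With \(\dot\theta = \frac{\mu}{mr_c^2}\), the period is \(T = \frac{2\pi m r_c^2}{\mu}\), and using \(\mu^2 = mkr_c\):
\[ T^2 = \frac{4\pi^2 m^2 r_c^4}{\mu^2} = \frac{4\pi^2 m}{k} r_c^3 \]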
Subcase 2: Routh Reduction
Note: Typically Routh reduction is for cyclic coordinates specifically. I’m looking at it a bit more generally.
We know, from the last section, that we have \(p \cdot \omega_Q(q) = \mu\). We are looking to modify our variational problem to consider this constraint.
By Lagrange multipliers, we can augment the action with the constraint:
\[ S[q] = \int_{t_0}^{t_N} (L(t, q, \dot q) + \lambda(t)(J(t, q, \dot q) - \mu))dt \]
Since \(\mu\) is constant along the orbits of the symmetry, we just need to ensure the solutions move along that submanifold to ensure the new equation is variational. From the earlier section on Lagrange-Poincare, we can decompose \(\dot q\) as \(v + u\omega_Q(q)\), where \(u(t)\) is along the “symmetry direction” (with conserved quantity \(\mu\)) and \(v(t)\) is “transverse” to the symmetry direction.
Define:
\[ \mathcal{L}(t, q, v, u) := L(t, q, v + u\cdot\omega_Q(q)) \]
This is the same as the original Lagrangian, just reparametrized.
Thus,
\[ \frac{\partial \mathcal{L}}{\partial u} = \frac{\partial L}{\partial \dot q} \frac{\partial \dot q}{\partial u} = p \cdot \omega_Q(q) = J \]
And we enforce \(J = \mu\).
Plugging back in to the action formula
\[ S[q, v, u, \lambda] = \int_{t_0}^{t_N} (\mathcal{L}(t, q, v, u) + \lambda(t)(\frac{\partial \mathcal{L}}{\partial u} - \mu))dt \]
Here the only subtlety is that \(u\) is the symmetry-direction velocity (e.g. \(u=\dot\theta\)), so the variation that produces the symmetry equation is really a variation of the symmetry coordinate \(\theta\).
That means \(\delta u=\delta\dot\theta=\frac{d}{dt}\delta\theta\), so after integrating by parts the stationarity condition is
\[ \frac{d}{dt}\Big(\partial_u\mathcal L + \lambda\,\partial^2_{uu}\mathcal L\Big) = 0 \]
Now that we have this, let’s try to modify \(\mathcal{L}\) to cancel the \(u\)-chain rule term, without affecting \(q\) or \(v\).
Consider
\[ \mathcal{F}(t, q, v, u) = \mathcal{L}(t, q, v, u) + g(u) \]
So (chain rule in shorthand) \[ \delta \mathcal{F} = \partial_q \mathcal{L}\delta q + \partial_v \mathcal{L}\delta v + \partial_u\mathcal{L}\delta u + g'(u)\delta u \]
So
\[ \partial_u \mathcal{L} + g'(u) \]
is the \(u\) coefficient. Since the \(u\)-coefficient is \(0\) along the desired solutions, we set \(\mu + g'(u) = 0\), and thus \(g'(u) = -\mu\), and \(g(u) = -\mu u + C\).
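Thus, the corrected Lagrangian is
\[ \mathcal{R}_{\mu}(t, q, v) = \mathcal{L}(t, q, v, u_{\mu}(t, q, v)) - \mu u_{\mu}(t, q, v) \]
where \(u_{\mu}(t, q, v)\) is the value of \(u\) that solves \(\frac{\partial \mathcal{L}}{\partial u} = \mu\). This is the form of the Routhian.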
Example - Kepler’s Third Law - Routh Reduction
Take our solution from the end of the last example:
\[ L((r, \theta), (\dot r, \dot \theta)) = \frac{m}{2}(\dot r^2 + r^2 (\frac{\mu}{mr^2})^2) + \frac{k}{r} \]
Subtract \(\mu u\). In this case the motion is split into \(r\) and \(\theta\) components, so \(u = \dot\theta = \frac{\mu}{mr^2}\):
\[ L((r, \theta), (\dot r, \dot \theta)) = \frac{m}{2}(\dot r^2 + r^2 (\frac{\mu}{mr^2})^2) + \frac{k}{r} - \mu \dot \theta \] \[ L((r, \theta), (\dot r, \dot \theta)) = \frac{m}{2}\dot r^2 + \frac{m}{2} r^2 \frac{\mu^2}{m^2r^4} + \frac{k}{r} - \frac{\mu^2}{mr^2} \] \[ L((r, \theta), (\dot r, \dot \theta)) = \frac{m}{2}\dot r^2 + \frac{k}{r} - \frac{\mu^2}{2mr^2} \]
So the new \(L\) is the variational formula that gives the dynamics that obey the constraint: the centrifugal term now enters with a minus sign, like a potential.
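A small check that the subtraction matters, as a minimal sketch with arbitrary parameter values. I compare the radial force \(\partial L / \partial r\) from the final expression and from the uncorrected substitution against the radial Euler-Lagrange equation of the full polar Lagrangian, \(m\ddot r = mr\dot\theta^2 - \frac{k}{r^2}\).

```python
m, k, mu = 1.5, 2.0, 0.9  # arbitrary illustrative values

def routhian(r, r_dot):
    """Final expression: (m/2) r_dot^2 + k/r - mu^2 / (2 m r^2)."""
    return 0.5 * m * r_dot ** 2 + k / r - mu ** 2 / (2 * m * r ** 2)

def naive(r, r_dot):
    """Constraint substituted into L without subtracting mu * theta_dot."""
    return 0.5 * m * r_dot ** 2 + k / r + mu ** 2 / (2 * m * r ** 2)

def radial_force(lagrangian, r, h=1e-6):
    """dL/dr by central difference; equals m * r_ddot since dL/d(r_dot) = m * r_dot."""
    return (lagrangian(r + h, 0.0) - lagrangian(r - h, 0.0)) / (2 * h)

r = 1.3
theta_dot = mu / (m * r ** 2)                      # from the fixed charge
true_force = m * r * theta_dot ** 2 - k / r ** 2   # full radial equation
routh_force = radial_force(routhian, r)
naive_force = radial_force(naive, r)
```

`routh_force` should match `true_force`, while `naive_force` is off by twice the centrifugal term.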
Code
No code this time. There’s probably already enough here to implement Routh reduction, but I’m going to leave that for later, once I’ve thought more carefully about how it composes with the other notions of equivalence and reduction above.
Conclusion
We now have the start of a picture of how the Lagrangian works and reduces under symmetry. So far, we are still recapitulating well-known results, but I have a much better grasp on the subject than before. In subsequent posts, we will look at this picture from an algebraic viewpoint and look at extensions. I also plan to look at applications of these principles to controls, games, and agents.
Read More
Marsden & Ratiu, Introduction to Mechanics and Symmetry
Marsden & Weinstein (1974), “Reduction of symplectic manifolds with symmetry.”
Marsden, Ratiu & Scheurle (2000), “Reduction theory and the Lagrange–Routh equations.”
Cendra, Marsden & Ratiu (2001), “Lagrangian Reduction by Stages”
Strongly suspect ChatGPT has memorized this blog post by Michael Kraus.
Footnotes
It seems the full notion from physics of “gauge symmetry” or “gauge theory” implies a fair amount of structure I have not yet introduced, so I avoid it. Really this is “variational equivalence or Lagrangian equivalence modulo exact 1-forms on path space.”↩︎
I used ChatGPT to assist with some of the algebra here, though I checked it thoroughly and I think it works. Even with ChatGPT and significant effort, I think the proof is inelegant. It’s not important to the overall throughline I’m trying to build so skip it if it seems confusing. I think in most cases you’d have some formula for \(K\) and you’d just compute the derivative.↩︎
The \(G\)-action \(\Phi\) needs to technically be “free and proper” for all of this to work out nicely such that \(Q/G\) is a smooth manifold. Freeness: if we form the matrix whose columns are the velocity directions generated by each symmetry at the current state, that matrix has full column rank (no non-trivial nullspace). Properness: basically we require “large symmetry actions” to produce “different enough” parameters; we cannot send the symmetry parameter to infinity while the Jacobian and the transformed state both stay small. From the reduction point of view: 1. there is only one symmetry motion corresponding to a given “along-orbit” velocity, and 2. states that are “the same up to symmetry” stay close when you evolve or project them. ↩︎
I suppress the individual \(J_i\) and \(p_i\) from here on, but this can be done for each Noether charge.↩︎
