Motivation

What’s the point of geometric integrators? Why bother with the formalism around Lagrangians and Hamiltonians to manage basic control problems or simulations?

In the last post, I worked on the intuition behind geometric controls. This post continues that effort.

Setup

Group Actions

We have a configuration manifold \(Q\) with tangent bundle \(TQ\).

Let \(G\) be a Lie group. Introduce the smooth map \(\Phi\) such that

\[ \Phi: G \times Q \to Q \] \[ (g, q) \mapsto \Phi_g(q) \]

Choose a smooth curve \(g(\epsilon) \in G\) with \(g(0)=\text{id}_G\). For each \(q\in Q\), the map

\[ \epsilon \mapsto g(\epsilon)\cdot q \]

is a smooth curve in \(Q\). We can think of \(g(\epsilon)\cdot q\) as a curve through \(Q\), induced by the action of each \(g(\epsilon)\) on \(q\).

The derivative of \(g(\epsilon)\cdot q\) at \(\epsilon=0\) defines a tangent vector \[ \omega_Q(q) := \frac{d}{d\epsilon}\bigl[g(\epsilon)\cdot q\bigr]\bigg|_{\epsilon=0} \in T_qQ \]

and hence the map \(q \mapsto \omega_Q(q)\) defines a smooth vector field on \(Q\).

We call this map \(\omega_Q(q)\) the “infinitesimal generator” of the curve \(g(\epsilon)\).

\(G\)-invariance

Construct the map

\[ T\Phi_g : T_qQ \to T_{\Phi_g(q)}Q \]

Which takes an element of the tangent bundle at \(q \in Q\) and gives the element of the tangent bundle at \(\Phi_g(q)\).

If \(\forall g \in G\), we have

\[ L(\Phi_g(q), T\Phi_g(\dot q)) = L(q, \dot q) \]

then we say that the Lagrangian is \(G\)-invariant¹.

Noether’s Theorem

Suppose the Lagrangian is \(G\)-invariant:

\(\forall g \in G, q\in Q\), \[ L(\Phi_g(q), T\Phi_g(\dot q)) = L(q, \dot q) \]

Define the \(\epsilon\)–shifted trajectory by \[ q_\epsilon(t) := g(\epsilon)\cdot q(t) \]

Note that multiplying by \(g({\epsilon})\) is a smooth operator on \(q\).

It’s also true that, by construction, \[ \left.\frac{d}{d\epsilon}[q_\epsilon(t)]\right|_{\epsilon=0} = \omega_Q(q(t)) \]

First we will need a quick identity.

Start by differentiating \(q_\epsilon(t)\) with respect to \(t\): \[ \dot q_\epsilon(t) = \frac{d}{dt}\left[g(\epsilon)\cdot q(t)\right] \]

Then differentiate again, \(\dot q_\epsilon(t)\) with respect to \(\epsilon\) at \(\epsilon=0\)

\[ \left.\frac{d}{d\epsilon}[\dot q_\epsilon(t)]\right|_{\epsilon=0} = \frac{d}{d{\epsilon}}\biggr[\frac{d}{dt}\left[g(\epsilon)\cdot q(t)\right] \biggr]\biggr|_{\epsilon=0} \]

Since both \(\epsilon\) and \(t\) appear only through smooth compositions, the order of differentiation can be interchanged²: \[ \left.\frac{d}{d\epsilon}[\dot q_\epsilon(t)]\right|_{\epsilon=0} = \frac{d}{dt}\left[ \frac{d}{d\epsilon}[q_\epsilon(t)]|_{\epsilon=0} \right] \]

Subbing in the definition of \(\omega_Q(q)\), this becomes \[ \left.\frac{d}{d\epsilon}[\dot q_\epsilon(t)]\right|_{\epsilon=0} = \frac{d}{dt}\,[\omega_Q(q(t))] \tag{1}\]

Which is the identity we need.

We are now ready to derive the main result. Differentiate the Lagrangian \(L(q_{\epsilon}, \dot q_{\epsilon})\) at \(\epsilon=0\) and apply the chain rule (with implicit evaluation).

\[ \left.\frac{d}{d\epsilon}[ L\bigl(q_\epsilon(t),\, \dot q_\epsilon(t))]\right|_{\epsilon=0} = \frac{\partial L}{\partial q}\cdot \frac{dq_{\epsilon}}{d{\epsilon}}\biggr|_{\epsilon=0} + \frac{\partial L}{\partial \dot q} \cdot \frac{d\dot q_{\epsilon}}{d\epsilon}\biggr|_{\epsilon=0} \]

The first term on the RHS we sub in \(\omega_Q(q)\), the second term we sub in using Equation 1: \[ \left.\frac{d}{d\epsilon}[ L\bigl(q_\epsilon(t),\, \dot q_\epsilon(t))]\right|_{\epsilon=0} = \frac{\partial L}{\partial q}\cdot \omega_Q(q) + \frac{\partial L}{\partial \dot q} \cdot \frac{d}{dt}[\omega_Q(q)] \]

Since the Lagrangian is \(G\)–invariant, and \(q_{\epsilon}\) is the application of the smooth operator \(g(\epsilon)\) to \(q\), then \(\forall \epsilon\) we can rewrite \(L(q_{\epsilon}, \dot q_{\epsilon})\) as \(L(q, \dot q)\).

Therefore:

\[ \frac{d}{d\epsilon} L\bigl[q_\epsilon(t),\, \dot q_\epsilon(t)\bigr]\bigg|_{\epsilon=0} = 0 \]

the derivative on the LHS is zero.

We obtain \[ \frac{\partial L}{\partial q}\cdot \omega_Q(q) + \frac{\partial L}{\partial \dot q} \cdot \frac{d}{dt}[\omega_Q(q)] = 0 \]

Substitute in \(p := \frac{\partial L}{\partial \dot{q}}\) (previously defined in the last post)³.

\[ \frac{\partial L}{\partial q}\cdot \omega_Q(q) \;+\; p \cdot \frac{d}{dt}\bigl[\omega_Q(q)\bigr] = 0 \]

Now, assume the Euler–Lagrange equations hold. Then

\[ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}} \right) = \frac{\partial L}{\partial q} \]

Since \(p := \frac{\partial L}{\partial \dot{q}}\) we have

\[ \frac{dp}{dt} = \frac{\partial L}{\partial q} \]

\[ \frac{dp}{dt}\cdot\omega_Q(q) \;+\; p\cdot\frac{d}{dt}[\omega_Q(q)] = 0 \]

Therefore (using the product rule in reverse)

\[ \frac{d}{dt}\left[p \cdot \omega_Q(q)\right] = 0 \]

So \(p \cdot \omega_Q(q)\) is a conserved quantity over time!

We define the “Noether charge”

\[ \boxed{J := p \cdot \omega_Q(q)} \]

Noether’s theorem says: given some \(G\) invariant Lagrangian, and some smooth operator \(g\) (a symmetry) on \(q\), \(J\) is conserved along every solution of the Euler–Lagrange equations.

Example

Let’s compute the Noether charge for a free rotor⁴ under the action \(g : q \to q + \epsilon\).

\(q\) is just \(\theta\) in this case, so the Lagrangian is

\[ L(q,\dot q) = \frac{1}{2}m\ell^2\dot q ^2 \]

We have

\[ p = \frac{\partial L}{\partial \dot q} = m \ell^2 \dot q \]

We just need \[ w_Q(q(t)) = \frac{d}{d\epsilon} [q_\epsilon(t)]\biggr|_{\epsilon = 0} = \frac{d}{d\epsilon} [q + \epsilon]\biggr|_{\epsilon = 0} = 1 \]

So we have \[ J = m \ell^2 \dot q = m\ell^2\dot \theta \]

which is the angular momentum.

Noether’s with Lie Groups

The proof for Lie groups is essentially the same (I will omit it).

One note: if \(Q\) is a Lie group, symmetries are described directly by group multiplication.

That is, fix an element \(\omega\) of the Lie algebra \(\mathfrak{g}\) (the generator), and define a perturbed configuration by acting on \(q\) with a small group element:

\[ q_\epsilon := \Phi\!\bigl(\exp(\epsilon\omega),\, q\bigr) \]

Usually this is group multiplication

\[ q_\epsilon := \Phi\!\bigl(\exp(\epsilon\omega),\, q\bigr) = \log(\exp(\epsilon \omega)\exp(q)) \]

Recall that in our code the \(\log\) map converts from group representation back to vector.

Discrete Noether

Let’s get the discrete version of Noether’s Theorem.

\(G\)-invariance is:

\[ L_d\bigl(\Phi_g(q_k),\, \Phi_g(q_{k+1})\bigr) = L_d(q_k, q_{k+1}) \]

The infinitesimal generator is \[ \left.\frac{d q_{\epsilon}^k}{d\epsilon}\right|_{\epsilon=0} = \omega_Q(q_k) \]

The discrete momentum: \[ p_k := D_2 L_d(q_{k-1}, q_k), \]

And the Noether charge is: \[ J_k := p_k \cdot \omega_Q(q_k) \]

How do we compute the infinitesimal generator \(\omega_Q(q_k)\)?

Before we had

\[ \omega_Q(q) := \frac{d}{d\epsilon}\bigl[g(\epsilon)\cdot q\bigr]\bigg|_{\epsilon=0} \in T_qQ \]

So \(q_k\) is given by \(g(\epsilon) \cdot q_k\) and the derivative with respect to \(\epsilon\) is approximated by

\[ \omega_Q(q_k) = \frac{g(\epsilon)\cdot q_k - q_k}{\epsilon} \]

We can compute \(g(\epsilon)\) in Lie Algebra coordinates

\[ \omega_Q(q_k) = \frac{\log (\exp(\epsilon \omega)\exp(q_k)) - q_k}{\epsilon} \]

Note that in \(\mathbb{R}\) this reduces to \[ \omega_Q(q_k) = \frac{\log (\exp(\epsilon \omega)\exp(q_k)) - q_k}{\epsilon} = \frac{q_k + \epsilon \omega - q_k}{\epsilon} = \omega \]

Code

Let’s start to code. Here’s the FreeRotor:

class FreeRotor(VariationalSystem):
    def control_plane(self):
        return {
            "theta": Rn(1)
        }
    
    def params(self):
        return ["mass", "length"]

    def lagrangian(self, ctrl, dctrl):
        th  = ctrl.theta
        thd = dctrl.theta

        m = self.params.mass
        l = self.params.length

        T = 0.5 * m * l*l * thd*thd
        return T

I swapped theta to Rn(1) due to continued problems with SOn. Will come back to it later⁵.

Let’s add a new method to VariationalIntegrator

class VariationalIntegrator: 
    ...
    def register_noether_charge(self, name:str, symmetry: Symmetry):
        def _new_charge(qk, qk1):
            p = self.D2_Ld(qk, qk1)

            # qk_eps = symmetry.log(symmetry.exp(self.tol * params) * symmetry.exp(qk))
            # omega_qk = (qk_eps - qk)/self.tol

            qk1_eps = symmetry.log(symmetry.exp(self.tol * symmetry.generator) * symmetry.exp(qk1))
            omega_qk1 = (qk1_eps - qk1)/self.tol

            return (p*omega_qk1).sum()
        self.noether_charges[name] = _new_charge 

    def step(self, q_prev: torch.Tensor, q_curr: torch.Tensor):

        q_next = q_curr + (q_curr - q_prev)
        q_next = q_next.clone().detach().requires_grad_(True)
        const_term = self.D2_Ld(q_prev, q_curr).detach()

        success = False

        for _ in range(self.max_iters):
            F = const_term + self.D1_Ld(q_curr, q_next)

            if F.norm().item() < self.tol:
                success = True
                break

            def F_of(x: torch.Tensor) -> torch.Tensor:
                return const_term + self.D1_Ld(q_curr, x)

            J = torch.autograd.functional.jacobian(F_of, q_next)
            delta = torch.linalg.solve(J, F)

            with torch.no_grad():
                q_next -= delta
            q_next.requires_grad_(True)

            if delta.norm().item() < self.tol:
                success = True
                break
        q_prev_det = q_prev.detach()
        q_curr_det = q_curr.detach()
        q_next_det = q_next.detach()

        noether_charge_outputs = {
            name: fn(q_curr_det, q_next_det)
            for name, fn in self.noether_charges.items()
        }

        if self.on_step is not None:
            self.on_step({
                "q_prev": q_prev_det,
                "q_curr": q_curr_det,
                "q_next": q_next_det,
                "noether_charges": noether_charge_outputs,
                "success": success
            })

        return q_next_det, success

This takes a Symmetry, computes the Noether charge. step is also modified. How does a user provide the Symmetry?

Let’s do

@dataclass
class Symmetry:

    def __init__(self, group: LieGroup, generator: torch.Tensor):
        self.group = group
        self.generator = generator

    def exp(self, v: torch.Tensor):
        return self.group.exp(v)

    def log(self, g):
        return self.group.log(g)

    def shift(self, q: torch.Tensor, eps: float) -> torch.Tensor:
        g_q   = self.group.exp(q)
        g_eps = self.group.exp(eps * self.generator.to(q.device))
        g_new = g_eps * g_q
        return self.group.log(g_new)

We take a Lie group and define a “shift” operation that must give a symmetry.

We must also modify our callback function junk:

class StepRecorder:
    def __init__(self):
        self.records = []

    def on_step(self, record):
        self.records.append(record)

Let’s try with the free rotor:

if __name__ == "__main__":

    freeRotor = FreeRotor({
        "mass": 1.0,
        "length": 1.0
    })
    
    h = 0.001
    recorder = StepRecorder()
    integrator = VariationalIntegrator(freeRotor, step_size=h, on_step=recorder.on_step)

    theta_group, _ = freeRotor.model.layout["theta"]
    omega = torch.tensor([1.0], dtype=torch.float64)
    rotor_symmetry = Symmetry(group=theta_group, generator=omega)
    integrator.register_noether_charge("angular_momentum", rotor_symmetry)

    theta0 = 0.8
    theta_dot0 = 1
    ctrl0 = AttrObject({"theta": torch.tensor([theta0])})
    q0 = freeRotor.model.pack(ctrl0)

    ctrl1 = AttrObject({"theta": torch.tensor([theta0 + h * theta_dot0])})
    q1 = freeRotor.model.pack(ctrl1)

    steps = 10000
    qs = [q0.clone(), q1.clone()]
    q_prev, q_curr = q0, q1

    for _ in tqdm.tqdm(range(steps - 2)):
        q_next, ok = integrator.step(q_prev, q_curr)
        qs.append(q_next.clone())
        q_prev, q_curr = q_curr, q_next

    qs = torch.stack(qs, dim=0)
    print([record["noether_charges"] for record in recorder.records])

And we see the angular momentum is conserve at 1.0000:

[{'angular_momentum': tensor(1.0000)}, {'angular_momentum': tensor(1.0000)},...]

Conclusion

We’ve successfully implemented a Noether’s theorem implementation in our geometric controls library. Next time, I will presumably add in time dependence and look at jet controls, but it is possible we will instead look more deeply at invariants.

Footnotes

We have essentially reduced the Lagrangian to a single variable \(q\), and then applied \(\Phi_g\) to \(q\) by “pushing forward” \(\Phi_g\) through the tangent mapping.↩︎
Clairaut’s theorem.↩︎
I now realize this wasn’t explicit in the last post, but our fiber derivative \(p = d_vL_q\), restricted to a particular path (instead of arbitrary \(v\)) reduces to \(p = \frac{\partial L}{\partial \dot q}\). Potentially there is some technical subtlety here I am unclear on. If I look in Marsden & Ratiu they seem to define \(p\) as \(\frac{\partial L}{\partial \dot q}\).↩︎
I’d do the pendulum but thanks to gravity it’s not really symmetric. The free rotor removes gravity. The versions of the Lagrangian/Hamiltonian formulation and Noether’s theorem I’ve derived already don’t depend on absolute time (so I can’t derive energy). If we add time back in, time-invariance leads to energy conservation (the Hamiltonian is constant across time). Currently we are doing something more similar to Pontryagin control.↩︎
ChatGPT seems to think this is due to some subtle coordinate representation issue. I am somehow implicitly assuming the group is Abelian which is why it breaks for SOn. There might be some Newton solver issues as well (the code is not fully adaptes for Lie groups). Ignoring for now.↩︎