Building a Minimal Computational Invariant Theory Library

Symmetry and Structure
Exposition
Published

March 17, 2026

Theo van Doesburg. *Arithmetic Composition*. (1930)

Introduction

Now that we’ve covered the basics of invariant theory, we can look at how to compute invariants with code. In this post, I’ll dig into computational invariant theory, with an emphasis on the algorithmic aspects. I’m vaguely following Derksen and Kemper’s book.

This is once again a long post, mixing theory, algorithms, and code. I used Claude to help with the code and the writing, but I also steered and reviewed heavily. Full code will be made available shortly.

Background

We are studying the action of a group \(G\) on a vector space \(V\). We want to understand the structure of the ring of invariants \(\mathbb{C}[V]^G\), which consists of all polynomial functions on \(V\) that are invariant under the action of \(G\).

That is:

\[ \mathbb{C}[V]^G = \{ f \in \mathbb{C}[V] : f(g \cdot v) = f(v) \text{ for all } g \in G, v \in V \} \]

Last time, we established a rudimentary process for computing invariants based on properties of the group action. The process was (roughly):

  1. Is \(G\) finite? If so, we can compute the invariants by averaging over the group using the Reynolds operator.
  2. Is the group infinite, but compact? If so, we can compute the invariants by integrating over the group with respect to the Haar measure (also Reynolds operator). The Haar measure depends on the topology of \(G\):
    1. If \(G\) is a Lie group, we can sometimes construct the Haar measure explicitly via the Maurer-Cartan form.
    2. There are other cases, but we won’t be concerned with them here.
  3. Is the group noncompact but reductive? If so, a Reynolds operator exists, but the actual computation uses representation theory to identify the equivariant projections directly.
  4. Is the group non-reductive? Idiosyncratic methods exist for specific cases.

In this post I’ll focus on the first three cases, which are the most common and well-studied. In particular, we’ll look at how to compute invariants for finite groups, compact Lie groups, and reductive groups.

Data Structures

What are we actually computing here? As inputs to the functions, we’ll need a way to encode a group \(G\) and its action on a structured set \(X\) (usually a vector space \(V\)). As outputs, we’ll want to compute a generating set for the ring of invariants \(\mathbb{C}[V]^G\). That is, a set of invariants such that any invariant can be expressed as a polynomial in the generators.

How should we represent these as data structures in code?

Review of Existing Libraries

The data structures will ultimately dictate the algorithms we can use, so we want to choose them carefully. How do existing libraries for computational invariant theory, such as Magma, SageMath, and Singular, represent groups, group actions, and invariants? I took a quick look at the documentation and source code for these libraries (mostly not that helpful) to get a sense of how they approach these problems.

Magma

Magma is a general computer algebra system maintained by the University of Sydney that includes functionality for computational invariant theory. The library uses a custom language, with the performance-critical algorithms implemented in C at the kernel level. The actual library is closed-source (which makes it hard to investigate the data structures).

Based on the documentation, Magma apparently uses a typing system that corresponds to algebraic categories, and the type system enforces particular mathematical structures. Unfortunately I couldn’t investigate too deeply.

SageMath

SageMath is an open-source computer algebra system with various libraries for different areas of mathematics. The chief language is Python, with performance-critical algorithms implemented in Cython or C. The architecture is “federated” in some sense, with interfaces to ~100 different external packages for different areas of mathematics, all made interoperable via a central type coercion system. It seems like SageMath is a nightmare to distribute due to the large number of dependencies, but the upside is that it has a huge variety of functionality.

The entire library is GPL, so we can look inside. For invariant theory, SageMath has a package called “invariant_theory”, which mostly focuses on the action of \(SL(n, \mathbb{C})\) on homogeneous polynomials (as in the classical invariant theory post).

Under the hood, there is a “parent/element” framework (which is used to deal with the federated nature of the library). The parent object encodes the algebraic structure, and the element objects encode specific instances of that structure. So for a polynomial, the parent would be the space it lives in (\(\mathbb{Q}[x]\)) and the element would be a specific polynomial (\(x^2 + 1\)). The parents are organized in a hierarchy of algebraic categories and then there’s a bunch of infra for managing all the types.

For invariant theory in particular, the wrapped library is Singular, so we should just look at that.

Singular

Singular looks like it’s designed for polynomial computations, especially commutative and non-commutative algebra, algebraic geometry, and singularity theory.

It also has an open-source C++ library for invariant theory, along with a fair amount of documentation.

Since Singular is designed for polynomial computations, a lot of its data structures look to be defined for those purposes. For example, a monomial \(x^2y^3z\) is a vector of exponents \((2, 3, 1)\), and a polynomial is a list of monomials and coefficients. Rings are a sort of global context, and ideals are represented as arrays of polynomial generators (it seems like they are actually arrays of pointers to polynomials).

Singular does have some functionality for group actions and invariants (including algorithms for Gröbner bases and Hilbert series) but it’s focused on specific cases (like finite groups acting on polynomial rings) rather than a general framework for group actions.

In short, Singular looks like a cool/good library, but since we are mostly interested in group actions and invariants, it departs pretty heavily from the abstractions I think we would ideally want.

Data Structure Implementations

How ought we design our data structures? We want to be able to represent different types of groups (finite, classical, reductive) and their actions on different types of structured sets (vector spaces, affine varieties, etc). We also want to be able to represent the invariants themselves, which are usually polynomials or rational functions, as well as the relations among them, presentations, etc. Furthermore, I want to follow my minimalist, compositional style.

Polynomials

Let’s start with the invariants themselves, which are usually polynomials (or sometimes rational functions). We need a way to represent multivariate polynomials with rational coefficients. We need arithmetic operations on these polynomials (addition, multiplication, scaling), and the ability to evaluate them at specific points.

A monomial \(x_0^{a_0} x_1^{a_1} \cdots x_{n-1}^{a_{n-1}}\) can be identified with its exponent tuple \((a_0, a_1, \ldots, a_{n-1}) \in \mathbb{N}^n\). So a polynomial is just a finite linear combination of monomials with rational coefficients. Internally, we can use a dict mapping tuples to Fractions, with associated arithmetic operations:

from fractions import Fraction

class Poly:

    Type = dict[tuple[int, ...], Fraction]

    @staticmethod
    def add(f, g): ...
    @staticmethod
    def mul(f, g): ...
    @staticmethod
    def scale(c, f): ...
    @staticmethod
    def evaluate(f, point) -> Fraction: ...

    # -- Constructors --
    @staticmethod
    def mono(alpha, c=1): ...       
    @staticmethod
    def var(i, n_vars): ...          
    @staticmethod
    def const(value, n_vars): ...    

    # -- Orderings (defined before use as defaults below) --
    @staticmethod
    def grlex(alpha):
        """Graded lexicographic: total degree first, then lex."""
        return (sum(alpha), alpha)

    @staticmethod
    def elimination_order(k):
        """Orders the variables so that x_0,...,x_{k-1} are ordered before the remaining variables. For Gröbner elimination."""
        def order(alpha):
            return (alpha[:k], sum(alpha[k:]), alpha[k:])
        return order

    # -- Leading term operations --
    @staticmethod
    def leading_monomial(f, order=grlex): ...
    @staticmethod
    def leading_coefficient(f, order=grlex) -> Fraction: ...

    # -- Monomial operations (for Gröbner bases) --
    @staticmethod
    def mono_divides(a, b) -> bool: ...
    @staticmethod
    def mono_lcm(a, b): ...
    @staticmethod
    def mono_mul(a, b): ...
    @staticmethod
    def mono_div(a, b): ...

For example, implementing the polynomial \(3x_0^2 x_1 - x_1^3 + 7\) in three variables:

f = {(2, 1, 0): Fraction(3), (0, 3, 0): Fraction(-1), (0, 0, 0): Fraction(7)}

Why use Fraction instead of floats? For now, we will use exact arithmetic where possible (there is at least one exception, since I don’t want to implement a full computer algebra system). With Fraction, the Reynolds operator, orbit sums, and Gröbner reductions stay exact.

The method implementations aren’t shown above, but they are pretty straightforward. The notable ones are the leading term operations, which are defined with respect to a monomial ordering (we will need this for Gröbner bases).
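For example, add merges two dicts and drops zero coefficients, while mul adds exponent vectors pairwise. A minimal standalone sketch of the two (module-level functions here, rather than the Poly staticmethods, and without the edge cases the library versions would handle):

```python
from fractions import Fraction

def add(f, g):
    """Add two dict polynomials, dropping zero coefficients."""
    out = dict(f)
    for alpha, c in g.items():
        out[alpha] = out.get(alpha, Fraction(0)) + c
    return {a: c for a, c in out.items() if c != 0}

def mul(f, g):
    """Multiply two dict polynomials: exponent tuples add, coefficients multiply."""
    out = {}
    for a1, c1 in f.items():
        for a2, c2 in g.items():
            a = tuple(x + y for x, y in zip(a1, a2))
            out[a] = out.get(a, Fraction(0)) + c1 * c2
    return {a: c for a, c in out.items() if c != 0}

x0 = {(1, 0): Fraction(1)}
x1 = {(0, 1): Fraction(1)}
f = mul(add(x0, x1), add(x0, x1))   # (x0 + x1)^2
print(f)                            # x0^2 + 2*x0*x1 + x1^2
```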

Group Actions

We can represent the group \(G\) as a set of generators and relations, or as a matrix group acting on \(V\) (or some \(X\)). The choice of representation will depend on the specific group and the context of the problem. For example, if \(G\) is a finite group, we can represent it as a list of its elements or as a permutation group. If \(G\) is a Lie group, we can represent it using its Lie algebra and the exponential map. So we’ll need some abstraction to ensure that we can work with different types of groups in a unified way.

Let’s assume we have some group object that can act on an object \(X\). We need something along the lines of:

class Group:
    def identity(self):
        raise NotImplementedError

    def multiply(self, g, h):
        raise NotImplementedError

    def inverse(self, g):
        raise NotImplementedError


class GroupAction:
    def act(self, g, x):
        """Apply group element g to object x."""
        raise NotImplementedError

As written, this is too abstract to be useful, since different group types compute invariants in different ways.

We will look at several different types of groups (tori, finite groups, classical groups, and reductive groups), and the algorithms for computing invariants differ in each case. A torus solves an integer linear system, a finite group averages over its elements using the Reynolds operator, and a classical group hard-codes generators from the First Fundamental Theorems. There is no single act method we can implement that covers all of these.

However, the downstream API should be the same regardless of group type. In all cases, we test invariance, compute generators, compute the Hilbert series, test orbits, etc. So we should organize the abstractions around the actions, with closures that each group type can fill in as needed:

from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Action:
    """Bundle of closures encoding a (group, space) invariant theory problem."""
    is_invariant:        Callable[[Poly], bool]
    invariants_of_degree: Callable[[int], list[Poly]]
    hilbert_coeffs:      Callable[[int], list[int]] | None = None
    orbit_test:          Callable | None = None
    apply_element:       Callable | None = None
    elements:            Callable | None = None
    n_vars:              int = 0

We will have each group class provide an .action(space) method that attaches each of its primitives and returns an Action. The derived API is then group-agnostic, and operates on Action objects.

What are the different closures we need to fill in for different group types? We need to be able to check if a polynomial is an invariant, find invariants of a given degree, compute the Hilbert series, test if two points are in the same orbit, apply a group element to an object, and list the group elements (if finite). Not all of these will be implemented for every group type, but if we can implement them we should.

def invariant_theory(group, space) -> Action:
    """Assemble an Action from a group and a space descriptor."""
    return group.action(space)

The idea behind this design is to encapsulate all the group-specific logic inside the group classes, and then have a uniform API for working with invariants that is independent of the group type. The Action object serves as a bridge between the group and the algorithms for computing invariants, allowing us to write algorithms that are agnostic to the specific group structure. So adding a new group type requires only implementing a class with .action(space) -> Action. Everything else (generators, Hilbert series, orbit tests, separators) should work “automatically”.
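To make the shape of the design concrete, here is the smallest possible instance: a trivial group whose action declares every polynomial invariant (a sketch; TrivialGroup is hypothetical, and this Action carries only two of the closures):

```python
from dataclasses import dataclass
from typing import Callable
from itertools import combinations_with_replacement

@dataclass(frozen=True)
class Action:
    is_invariant: Callable
    invariants_of_degree: Callable
    n_vars: int = 0

class TrivialGroup:
    """Trivial group: every polynomial is invariant, and the degree-d
    invariants are spanned by all monomials of degree d."""
    def __init__(self, n_vars):
        self.n_vars = n_vars

    def action(self, space=None):
        def invariants_of_degree(d):
            monos = []
            for combo in combinations_with_replacement(range(self.n_vars), d):
                alpha = [0] * self.n_vars
                for i in combo:
                    alpha[i] += 1
                monos.append({tuple(alpha): 1})
            return monos

        return Action(
            is_invariant=lambda f: True,
            invariants_of_degree=invariants_of_degree,
            n_vars=self.n_vars,
        )

action = TrivialGroup(2).action()
print(action.is_invariant({(1, 0): 1}))       # True
print(len(action.invariants_of_degree(2)))    # 3: x0^2, x0*x1, x1^2
```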

Torus Actions

A torus \(T = (\mathbb{C}^*)^r\) acts on \(\mathbb{C}^m\) via an integer weight matrix \(W\) (\(r \times m\)). A monomial \(x^\alpha\) is invariant if and only if \(W\alpha = 0\). We will go through the actual theory for a torus action below. This case is simple enough that we don’t need Reynolds operators or representation theory, we can just use integer linear algebra.

class Torus:
    def __init__(self, W: np.ndarray):
        self.W = np.asarray(W, dtype=int)
        self.rank = self.W.shape[0]
        self.n_vars = self.W.shape[1]

    def is_invariant_monomial(self, alpha: tuple[int, ...]) -> bool:
        return np.all(self.W @ np.array(alpha, dtype=int) == 0)

    def hilbert_basis(self, max_degree: int = 20) -> list[tuple[int, ...]]:
        ...
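A naive way hilbert_basis could work (a sketch, not the library implementation): enumerate invariant exponent vectors degree by degree, and keep those that don't split as a sum of two smaller invariant exponents:

```python
import itertools
import numpy as np

def hilbert_basis_naive(W, max_degree=10):
    """Minimal generating monomials of the monoid {alpha in N^n : W @ alpha = 0}.
    Naive degree-by-degree enumeration; fine for small examples only."""
    W = np.asarray(W, dtype=int)
    n = W.shape[1]
    invariant = set()   # all nonzero invariant exponent vectors found so far
    basis = []
    for d in range(1, max_degree + 1):
        for combo in itertools.combinations_with_replacement(range(n), d):
            alpha = [0] * n
            for i in combo:
                alpha[i] += 1
            alpha = tuple(alpha)
            if not np.all(W @ np.array(alpha) == 0):
                continue
            # alpha is reducible if it splits as beta + gamma, both invariant
            reducible = any(
                all(a >= b for a, b in zip(alpha, beta))
                and tuple(a - b for a, b in zip(alpha, beta)) in invariant
                for beta in invariant
            )
            if not reducible:
                basis.append(alpha)
            invariant.add(alpha)
    return basis

# C* acting on C^3 with weight matrix (1, 1, -1): the invariant ring is
# generated by x0*x2 and x1*x2
print(hilbert_basis_naive([[1, 1, -1]], max_degree=4))   # [(1, 0, 1), (0, 1, 1)]
```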

Finite Groups

For finite groups, we need an explicit list of \(n \times n\) matrices (one for each group element). The core operations are the Reynolds operator (average over the group), orbit sums (a particularly clean basis construction in the monomial/permutation cases), and the Molien series (Hilbert series via eigenvalues).

class FiniteGroup:
    def __init__(self, matrices: list[np.ndarray]):
        self.matrices = matrices
        self.order = len(matrices)
        self.n_vars = matrices[0].shape[0]

    def reynolds(self, f: Poly.Type) -> Poly.Type:
        """Average f over the group (exact, thanks to Fraction coefficients)."""
        total: Poly.Type = {}
        for g in self.matrices:
            total = Poly.add(total, self.apply_to_poly(g, f))
        return Poly.scale(Fraction(1, self.order), total)

    def orbit_sum(self, f: Poly.Type) -> Poly.Type:
        """Sum f over its distinct images under the group."""
        seen = set()
        total: Poly.Type = {}
        for g in self.matrices:
            gf = self.apply_to_poly(g, f)
            key = frozenset(gf.items())
            if key not in seen:
                seen.add(key)
                total = Poly.add(total, gf)
        return total

    def molien_coeffs(self, max_d: int) -> list[int]:
        ...
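The apply_to_poly helper (substitute g into f) isn't shown; for diagonal matrices it just rescales each monomial's coefficient, which is enough for a self-contained sketch of the Reynolds operator (the helper names here are hypothetical, not the library API):

```python
from fractions import Fraction

def apply_diagonal(diag, f):
    """Apply a diagonal matrix (given by its diagonal) to a dict polynomial:
    x_i -> diag[i] * x_i rescales each monomial's coefficient."""
    out = {}
    for alpha, c in f.items():
        factor = Fraction(1)
        for d, a in zip(diag, alpha):
            factor *= Fraction(d) ** a
        out[alpha] = out.get(alpha, Fraction(0)) + c * factor
    return {a: c for a, c in out.items() if c != 0}

def reynolds_diagonal(group_diags, f):
    """Average f over a finite group of diagonal matrices."""
    total = {}
    for diag in group_diags:
        for a, c in apply_diagonal(diag, f).items():
            total[a] = total.get(a, Fraction(0)) + c
    n = len(group_diags)
    return {a: c / n for a, c in total.items() if c != 0}

# Z/2 acting on C^1 by x -> -x: averaging kills the odd-degree terms
Z2 = [(1,), (-1,)]
f = {(1,): Fraction(3), (2,): Fraction(5)}      # 3x + 5x^2
print(reynolds_diagonal(Z2, f))                 # {(2,): Fraction(5, 1)}
```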

For common finite groups, we can provide constructors in a group library:

def symmetric(n: int) -> FiniteGroup:
    """S_n acting on C^n by permutation matrices."""
    ...

def cyclic(n: int, dim: int = 2) -> FiniteGroup:
    """Z/nZ acting on C^dim by rotation."""
    ...

def dihedral(n: int) -> FiniteGroup:
    """D_n acting on C^2 by rotations and reflections."""
    ...
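For example, symmetric(n) just needs the n! permutation matrices, which itertools supplies (a sketch of the matrix construction only; the FiniteGroup wrapper is omitted):

```python
import itertools
import numpy as np

def permutation_matrices(n):
    """All n x n permutation matrices, i.e. S_n acting on C^n."""
    mats = []
    for perm in itertools.permutations(range(n)):
        M = np.zeros((n, n), dtype=int)
        for i, j in enumerate(perm):
            M[j, i] = 1   # send basis vector e_i to e_{perm(i)}
        mats.append(M)
    return mats

mats = permutation_matrices(3)
print(len(mats))   # 6 = |S_3|
```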

Classical Groups

Classical groups (\(\mathrm{O}(n)\), \(\mathrm{SL}(n)\), \(\mathrm{Sp}(2n)\)) are infinite and continuous, so in principle the Reynolds-operator story is more complicated.

We could construct the Maurer-Cartan form, compute the Molien-Weyl integral for the Hilbert series, and then project the result onto trivial representations to extract the invariants. Alternatively, the First Fundamental Theorems of Invariant Theory (FFTs) give us explicit generators for the invariant ring, so we could just hard-code those and build the invariants as products.

def orthogonal_action(n: int, k: int) -> Action:
    """O(n) acting diagonally on k copies of C^n.
    Generators: inner products <v_i, v_j>."""
    ...

def sl_action(n: int, k: int) -> Action:
    """SL(n) acting diagonally on k copies of C^n.
    Generators: n x n bracket determinants."""
    ...

def symplectic_action(n: int, k: int) -> Action:
    """Sp(2n) acting diagonally on k copies of C^{2n}.
    Generators: symplectic pairings omega(v_i, v_j)."""
    ...

For \(\mathrm{O}(n)\), the generators are inner products. For \(\mathrm{SL}(n)\), the generators are determinantal brackets. For \(\mathrm{Sp}(2n)\), the generators are symplectic pairings.

Each of these returns an Action (the same interface as finite groups and tori), so downstream code for Hilbert series, presentations, and orbit separation should work unchanged. In the easy cases, we can get the Hilbert series by counting monomials in the generators rather than evaluating the Molien-Weyl integral.
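For instance, the O(n) generators can be written down directly in the dict representation (a sketch; inner_product_generators is a hypothetical helper, with coordinate a of vector i stored at flat index i*n + a):

```python
from fractions import Fraction

def inner_product_generators(n, k):
    """First Fundamental Theorem generators for O(n) on k copies of C^n:
    the inner products <v_i, v_j> = sum_a x_{i,a} * x_{j,a},
    as dict polynomials in n*k variables."""
    gens = []
    for i in range(k):
        for j in range(i, k):
            f = {}
            for a in range(n):
                alpha = [0] * (n * k)
                alpha[i * n + a] += 1
                alpha[j * n + a] += 1
                key = tuple(alpha)
                f[key] = f.get(key, Fraction(0)) + 1
            gens.append(f)
    return gens

gens = inner_product_generators(2, 2)
print(len(gens))   # 3 generators: <v1,v1>, <v1,v2>, <v2,v2>
```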

Reductive Groups

For more general reductive groups (of particular interest is \(\mathrm{GL}(n)\) acting by conjugation), we’d need the full representation-theoretic machinery. This means decomposing the polynomial ring into irreducible representations and extracting the trivial summands.

There’s no general algorithm to handle all possible cases. For some cases (once again, \(\mathrm{GL}(n)\)) invariants are generated by traces of products.
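As a quick numerical sanity check of the trace claim (random matrices, not library code): traces of powers are unchanged under conjugation.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
P = rng.standard_normal((3, 3))     # generic, hence invertible
B = np.linalg.inv(P) @ A @ P        # a conjugate of A

# tr(A^k) are invariants of the conjugation action of GL(n)
for k in range(1, 4):
    ta = np.trace(np.linalg.matrix_power(A, k))
    tb = np.trace(np.linalg.matrix_power(B, k))
    assert abs(ta - tb) < 1e-6
print("traces of powers agree under conjugation")
```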

This is out of scope for this post, but in principle we could implement the algorithms in the same framework as the other group types, with the Action object providing the necessary closures for testing invariance, computing generators, and so on.

Spaces

We need to encode how the group action affects the vector space the group acts on. For polynomials:

@dataclass(frozen=True)
class Space:
    n_vars: int
    apply_matrix: Callable  
    add: Callable
    scale: Callable
    zero: Callable

def polynomial_ring(n_vars: int) -> Space:
    """Standard polynomial ring C[x_0, ..., x_{n-1}]"""
    ...

Since the space is defined separately, the algorithms can be group-agnostic. The Reynolds operator and orbit sums in FiniteGroup use space.add, space.scale, and space.apply_matrix rather than calling polynomial arithmetic directly. So we can extend this library to work on new object types by implementing new space descriptors with the appropriate operations without adjusting the group classes or derived API.

Computational Tasks

Now that we have the data structures in place, we can ask what kinds of computations actually arise in invariant theory. From the last section, we have data structures for groups, group actions, and invariants. What are the key computational tasks we want to perform with these objects? That is, what should the API of our computational invariant theory library look like?

Could be something like this:

# Decision problems
def is_invariant(action: Action, f: Poly) -> bool: ...
def in_null_cone(action: Action, v: np.ndarray, max_degree: int = 6) -> bool: ...

# Construction problems
def compute_generators(action: Action, max_degree: int) -> list[Poly]: ...
def compute_hilbert_series(action: Action, max_degree: int) -> list[int]: ...

# Presentation problems (Gröbner-based)
def compute_relations(generators: list[Poly], n_vars: int) -> list[Poly]: ...
def normal_form(f: Poly, basis: list[Poly]) -> Poly: ...
def in_ideal(f: Poly, basis: list[Poly]) -> bool: ...
def eliminate(generators: list[Poly], k: int, n_vars: int) -> list[Poly]: ...

# Orbit problems
def same_orbit(action: Action, v: np.ndarray, u: np.ndarray) -> bool: ...
def find_separator(action: Action, v: np.ndarray, u: np.ndarray, max_degree: int = 6) -> Poly | None: ...

Let’s look through these in more detail.

Decision Problems

In these types of problems, we are checking an input for some property, and the output is a boolean.

Testing Whether a Polynomial Lies in \(k[V]^G\)

Probably the most fundamental decision problem is to determine whether a given polynomial is actually invariant under the action. This is the most direct membership test for the invariant ring.

This suggests an operation of the form

def is_invariant(action: Action, f: Poly) -> bool: ...

Testing Whether an Object Is Invariant Under the Action

More generally, we may want to test if some explicitly represented object is invariant under the action.

def is_invariant(action: Action, x) -> bool: ...  # same interface, different object types via Space

Testing Whether a Point Lies in the Null Cone

We have yet to introduce the concept of the null cone, but the idea is that, given the quotient map \(\pi: V \to V/\!\!/G\), we want to test whether a point \(v \in V\) maps to the origin in the quotient. This is equivalent to testing whether all positive-degree invariants vanish at \(v\). This has important geometric implications for the quotient space.

def in_null_cone(action: Action, v: np.ndarray, max_degree: int = 6) -> bool: ...
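For \(S_n\), where the power sums generate the invariant ring, the null-cone test is concrete: all positive-degree invariants vanish iff all power sums do (which forces the point to be the origin). A toy sketch, not the general in_null_cone:

```python
def in_null_cone_sn(v, max_degree=6):
    """Toy null-cone test for S_n acting on C^n: check whether all power
    sums p_1, ..., p_{max_degree} vanish at v."""
    return all(sum(x ** k for x in v) == 0 for k in range(1, max_degree + 1))

print(in_null_cone_sn([0, 0, 0]))    # True
print(in_null_cone_sn([1, -1, 0]))   # False: p_2 = 2
```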

Construction Problems

These operations attempt to compute explicit invariants or structured generating data for the invariant ring.

Computing Generators of \(k[V]^G\)

We might want to compute a generating set for the invariant ring. This gives a finite description of all polynomial invariants and is often the starting point for further structural work.

In API terms, this means we want an operation

def compute_generators(action: Action, max_degree: int) -> list[Poly]: ...

Computing Primary and Secondary Invariants

If \(G\) is a finite group acting on \(V\), then the invariant ring \(k[V]^G\) is a finitely generated module over a polynomial subring generated by a homogeneous system of parameters (HSOP). The generators of the polynomial subring are called primary invariants, and the generators of the module are called secondary invariants. Computing these can give us a more structured understanding of the invariant ring.

def compute_primary_secondary(action: Action, max_degree: int) -> tuple[list[Poly], list[Poly]]: ...

Computing Separating Invariants

A separating set of invariants is a subset of the invariant ring that can distinguish between different orbits of the group action. That is, if we have two points \(v, u \in V\), then invariants in the separating set can distinguish them whenever they have different images in the quotient. For finite groups, this is the same as distinguishing different orbits. Think of this as a “weaker” version of a generating set that is only concerned with separating orbits rather than generating the entire ring. Computing a separating set can be easier than computing a full generating set, and it is often sufficient for many applications.

As an API operation:

def compute_separating_invariants(action: Action, max_degree: int) -> list[Poly]: ...

These are apparently even more important when the invariant ring has bad properties, such as being non-finitely generated, which can happen for non-reductive groups. In that case, we may not be able to compute a full generating set, but we can still compute a separating set. Not sure if/when this will show up.
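For intuition, here is orbit separation by evaluation for \(S_n\), where the power sums are already a separating set (a toy sketch, not the compute_separating_invariants API):

```python
def power_sums(v, max_k):
    """Evaluate the power-sum invariants p_1, ..., p_{max_k} at a point."""
    return [sum(x ** k for x in v) for k in range(1, max_k + 1)]

# (1,2,3) and (3,1,2) are the same multiset, hence the same S_3 orbit
print(power_sums([1, 2, 3], 3) == power_sums([3, 1, 2], 3))   # True
# (1,2,3) vs (1,2,4): different orbits, so some power sum separates them
print(power_sums([1, 2, 3], 3) == power_sums([1, 2, 4], 3))   # False
```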

Structural Computations

These operations try to understand the algebraic structure of the invariant ring once invariants have been found.

Computing the Hilbert Series

The Hilbert series is a generating function that counts the invariants by dimension. It is defined as:

\[ H(t) = \sum_{d=0}^{\infty} \dim_k(k[V]^G_d) t^d \]

where \(k[V]^G_d\) is the space of homogeneous invariants of degree \(d\). The Hilbert series encodes important information about the invariant ring and the structure of the invariants. For example, in the cases we care about here, the Hilbert series is rational once the invariant ring is finitely generated.

For finite groups, the Hilbert series can be computed using the Molien formula:

\[ H(t) = \frac{1}{|G|} \sum_{g \in G} \frac{1}{\det(I - t g)} \]

We will examine this in more detail in subsequent sections.
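As a sanity check on the formula, here is a self-contained sketch (not the library's molien_coeffs): it computes det(I − t g) exactly via Newton's identities on the power sums tr(g^k), inverts it as a power series, and averages over S_3 acting by permutation matrices.

```python
from fractions import Fraction
import itertools
import numpy as np

def det_I_minus_tg(g, max_d):
    """Coefficients of det(I - t g), via Newton's identities: if e_k are the
    elementary symmetric functions of the eigenvalues, the coefficient of t^k
    is (-1)^k e_k."""
    n = g.shape[0]
    p = [None] + [int(np.trace(np.linalg.matrix_power(g, k))) for k in range(1, n + 1)]
    e = [Fraction(1)]
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
        e.append(s / k)
    q = [(-1) ** k * e[k] for k in range(n + 1)]
    return q + [Fraction(0)] * (max_d + 1 - len(q))

def series_inverse(q, max_d):
    """Power-series inverse of q(t) (q[0] != 0), up to degree max_d."""
    b = [1 / q[0]]
    for d in range(1, max_d + 1):
        b.append(-sum(q[k] * b[d - k] for k in range(1, d + 1)) / q[0])
    return b

def molien_coeffs(matrices, max_d):
    """Molien series coefficients: (1/|G|) sum_g 1/det(I - t g)."""
    total = [Fraction(0)] * (max_d + 1)
    for g in matrices:
        inv = series_inverse(det_I_minus_tg(g, max_d), max_d)
        total = [a + b for a, b in zip(total, inv)]
    return [c / len(matrices) for c in total]

# S_3 by permutation matrices
mats = []
for perm in itertools.permutations(range(3)):
    M = np.zeros((3, 3), dtype=int)
    for i, j in enumerate(perm):
        M[j, i] = 1
    mats.append(M)

print([int(c) for c in molien_coeffs(mats, 6)])   # [1, 1, 2, 3, 4, 5, 7]
```

The output matches the Hilbert series for \(S_3\) used in the example workflow below, as it should.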

As an API operation:

def compute_hilbert_series(action: Action, max_degree: int) -> list[int]: ...

Computing Structural Properties of \(k[V]^G\)

There are several structural properties of the invariant ring that we may want to compute. These properties tell us how complicated the ring is, how many relations we should expect, and whether the ring admits especially efficient descriptions.

To define the most common ones, we first isolate the role of a polynomial subring inside the invariant ring.

Definition 1 Let \(R = k[V]^G\) be a graded invariant ring. A collection of homogeneous elements \[ \theta_1, \ldots, \theta_d \in R \]

is called a homogeneous system of parameters (HSOP) if \(R\) is finitely generated as a module over the polynomial subring \[ k[\theta_1, \ldots, \theta_d] \]

Intuitively, an HSOP is a choice of basic algebraically independent parameters over which the whole invariant ring is finite.

Definition 2 We say that \(R = k[V]^G\) is Cohen-Macaulay if, for some HSOP

\[ \theta_1, \ldots, \theta_d \in R \]

the ring \(R\) is a free module over the polynomial subring \(k[\theta_1, \ldots, \theta_d]\). Equivalently, there exist homogeneous elements

\[ \eta_1, \ldots, \eta_m \in R \]

such that

\[ R \cong \bigoplus_{j=1}^m \eta_j \, k[\theta_1, \ldots, \theta_d] \]

as a module over \(k[\theta_1, \ldots, \theta_d]\).

This is important computationally because it means the invariant ring has a simple description in terms of primary and secondary invariants.

Definition 3 Assume \(R = k[V]^G\) is Cohen-Macaulay, and let \[ A = R/(\theta_1, \ldots, \theta_d) \]

be the quotient by an HSOP. Then \(A\) is a finite-dimensional graded algebra:

\[ A = \bigoplus_{i \ge 0} A_i \]

Let \(s\) be the largest degree for which \(A_s \neq 0\).

We say that \(R\) is Gorenstein if \(A_s\) is one-dimensional, and for every \(i\), the multiplication pairing

\[ A_i \times A_{s-i} \to A_s \]

is nondegenerate. By nondegenerate, we mean:

  1. For every nonzero \(a \in A_i\), there exists some \(b \in A_{s-i}\) such that \(ab \neq 0\), and
  2. For every nonzero \(b \in A_{s-i}\), there exists some \(a \in A_i\) such that \(ab \neq 0\).

Intuitively, this means that after we quotient out the polynomial part coming from the HSOP, the remaining finite-dimensional algebra has a strong symmetry between complementary degrees.

Definition 4 Suppose we present the invariant ring as \[ R \cong k[y_1, \ldots, y_m]/I \]

where the \(y_i\) correspond to chosen generators of \(R\). Let \(d\) be the number of parameters in an HSOP for \(R\). We say that \(R\) is a complete intersection if the ideal of relations \(I\) can be generated by \[ m - d \]

elements.

That is, once generators have been chosen, the ring is determined by as few relations as possible. This is one of the best possible situations computationally, since the presentation is controlled by a minimal number of equations.

As API operations, we might have:

def is_cohen_macaulay(invariant_ring) -> bool: ...
def is_gorenstein(invariant_ring) -> bool: ...
def is_complete_intersection(invariant_ring) -> bool: ...

Presentation Computations

These operations try to find explicit presentations of the invariant ring in terms of generators and relations.

Computing Relations Among Generators

The simplest form of this problem is to compute the ideal of relations among a given set of generators. That is, if we have a set of generators \(\{f_1, \ldots, f_m\}\) for \(k[V]^G\), we want to find the ideal \(I\) such that:

\[ I = \{ r \in k[f_1, \ldots, f_m] : r(f_1, \ldots, f_m) = 0 \} \]

This is important because it gives us a complete presentation of the invariant ring as a quotient of a polynomial ring by the ideal of relations. We can use Gröbner bases to compute this ideal, which allows us to perform various algebraic operations on the invariant ring.

In terms of the API:

def compute_relations(generators: list[Poly], n_vars: int) -> list[Poly]: ...
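A concrete instance (a numerical sanity check, not the Gröbner computation): for \(\mathbb{Z}/2\) acting on \(\mathbb{C}^2\) by negation, the invariant ring is generated by \(u = x^2\), \(v = y^2\), \(w = xy\), subject to the single relation \(w^2 = uv\):

```python
import itertools

# Z/2 acts on C^2 by (x, y) -> (-x, -y); the invariants are generated by
# u = x^2, v = y^2, w = x*y, with ideal of relations I = (w^2 - u*v).
for x, y in itertools.product([-2, -1, 0, 1, 3], repeat=2):
    u, v, w = x * x, y * y, x * y
    assert w * w - u * v == 0
print("relation w^2 = u*v holds on all sample points")
```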

Computing Syzygies Among Relations

Once we have a set of relations \(r_1, \ldots, r_k\) among the generators, we may have syzygies among the relations themselves. We can represent these as tuples of polynomials \((s_1, \ldots, s_k)\) such that:

\[ s_1 r_1 + s_2 r_2 + \cdots + s_k r_k = 0 \]

The trivial syzygies come from commutativity: \(r_i \cdot r_j - r_j \cdot r_i = 0\), i.e. take \(s_i = -r_j\), \(s_j = r_i\), and all other \(s_\ell = 0\). The non-trivial syzygies reveal structure in the ideal of relations.

In the last post, we saw that the computations are finite (by Hilbert’s syzygy theorem), and the resolution encodes homological invariants like depth and projective dimension. Computationally, we can use Gröbner bases to compute the syzygies among the relations, which gives us a deeper understanding of the structure of the invariant ring and can help us compute further invariants.

def compute_syzygies(relations: list[Poly]) -> list[tuple[Poly, ...]]: ...

Solving Ideal-Membership and Normal-Form Problems

If we have a presentation of the invariant ring in terms of generators and relations, we can use this to solve ideal-membership problems. For example, given a polynomial \(f\) and an ideal \(I\) generated by some relations, we can test whether \(f\) belongs to \(I\) by computing the normal form of \(f\) with respect to a Gröbner basis for \(I\). If the normal form is zero, then \(f \in I\); otherwise, \(f \notin I\).

def normal_form(f: Poly, basis: list[Poly]) -> Poly: ...
def in_ideal(f: Poly, basis: list[Poly]) -> bool: ...
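In one variable, a Gröbner basis is a single polynomial and the normal form is just the remainder, which makes the membership test easy to see end to end (a toy univariate sketch in the dict representation; normal_form_1var is not the library function):

```python
from fractions import Fraction

def normal_form_1var(f, g):
    """Remainder of f modulo g, for univariate dict polynomials {(deg,): coeff}.
    Repeatedly cancels the leading term of f against the leading term of g."""
    f = dict(f)
    dg = max(a[0] for a in g)
    lg = g[(dg,)]
    while f:
        df = max(a[0] for a in f)
        if df < dg:
            break
        c = f[(df,)] / lg
        for a, cg in g.items():
            key = (a[0] + df - dg,)
            f[key] = f.get(key, Fraction(0)) - c * cg
            if f[key] == 0:
                del f[key]
    return f

g = {(2,): Fraction(1), (0,): Fraction(-1)}   # x^2 - 1
print(normal_form_1var({(3,): Fraction(1)}, g))   # x^3 mod (x^2 - 1) = x
print(normal_form_1var(g, g))                     # {} -> g lies in the ideal (g)
```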

Quotient and Orbit Computations

These operations try to compute the geometry of the quotient space \(V/\!\!/G\) and the orbits of a group action.

Determining Whether Two Points Have the Same Image in the Quotient

Let’s say we have two points \(v, u \in V\), and we want to determine whether they map to the same point in the quotient \(V/\!\!/G\). For finite groups, this is equivalent to asking whether there exists a group element \(g \in G\) such that \(g \cdot v = u\).

We can run a computation:

def same_orbit(action: Action, v: np.ndarray, u: np.ndarray) -> bool: ...

Separating Points, Orbits, or Orbit Closures

Similarly, we may want to determine whether two points lie in different orbits, or whether their orbit closures are different. This is related to the concept of separating invariants, which can distinguish between different orbits.

def find_separator(action: Action, v: np.ndarray, u: np.ndarray, max_degree: int = 6) -> Poly | None: ...

Example Workflow

Now that we know the types of computations we want to perform, we can outline the workflow we might use to actually compute invariants for a specific group action. This will help us understand how the different computational tasks fit together in practice.

Probably the sequence looks something like this:

  1. Encode the group action and the structured object in our data structure.
  2. Compute Hilbert/Molien data for the action to understand the structure of the invariant ring.
  3. Search for candidate generators for the invariant ring, using the Hilbert series to guide our search.
  4. Compute relations by Gröbner elimination to find a presentation of the invariant ring.

Once we have the presentation, we can use it to gain understanding of the object:

  1. Test ideal membership and compute normal forms. Given a polynomial \(f\), determine whether it lies in the ideal of relations (equivalently: can it be written in terms of the generators?). The normal form gives a canonical representative.
  2. Compute structural properties of the invariant ring, such as whether it is Cohen-Macaulay, Gorenstein, or a complete intersection.
  3. Compute syzygies and higher-order relations among the generators.
  4. Compute separating invariants and solve orbit-separation problems.
  5. Compute primary and secondary invariants, and understand the module structure of the invariant ring over a polynomial subring.

More advanced computations (which we may or may not do here) might include:

  1. Compute the Krull dimension of the invariant ring. This is the number of algebraically independent invariants (the transcendence degree), equal to \(\dim(V) - \dim(\text{generic orbit})\). It tells you the dimension of the quotient variety \(V/\!\!/G\).
  2. Compute degree bounds. For finite groups in characteristic zero, all generators appear by degree \(|G|\) (Noether’s bound). The Hilbert series predicts how many generators to expect in each degree before computing them, giving a stopping criterion.

The generators and relations also determine the geometry of the quotient \(V/\!\!/G\) (its defining equations, dimension, and singularities), but that is algebraic geometry proper, and we won’t pursue it here.

An example program putting this all together might look like this:

import numpy as np
from invariants.groups.constructors import symmetric
from invariants.spaces import polynomial_ring
from invariants.action import (
    invariant_theory, compute_generators,
    compute_hilbert_series, in_null_cone, same_orbit,
)
from invariants.groebner import compute_relations

# 1. Encode the group action
G = symmetric(3)
ring = polynomial_ring(3)
action = invariant_theory(G, ring)

# 2. Hilbert series: how many invariants in each degree?
hs = compute_hilbert_series(action, 6)
# [1, 1, 2, 3, 4, 5, 7]

# 3. Find generators
generators = compute_generators(action, max_degree=3)
# [x0+x1+x2, x0^2+x1^2+x2^2, x0^3+x1^3+x2^3]

# 4. Relations among generators (Gröbner elimination)
relations = compute_relations(generators, n_vars=3)
# [] — S_3 invariant ring is freely generated

# 5. Orbit and quotient geometry
v = np.array([1.0, 2.0, 3.0])
u = np.array([3.0, 1.0, 2.0])
same_orbit(action, v, u)                    # True — same multiset
in_null_cone(action, np.array([0, 0, 0]))   # True — all invariants vanish at origin

For this particular example, the outputs look something like this:

Hilbert series: [1, 1, 2, 3, 4, 5, 7]
Generators (3): [x0+x1+x2, x0^2+x1^2+x2^2, x0^3+x1^3+x2^3]
Relations: 0 (freely generated)
Same orbit (1,2,3) ~ (3,1,2): True
In null cone (0,0,0): True

Hilbert Series

In the code above, step 2 was “compute the Hilbert series.” Before we implement anything, we should define this properly, since it will guide every computation that follows.

For any graded algebra \(R = \bigoplus_{d=0}^{\infty} R_d\) (where each \(R_d\) is a finite-dimensional vector space), the Hilbert series is the generating function:

\[ H_R(t) = \sum_{d=0}^{\infty} \dim_k(R_d) \, t^d \]

Applied to the invariant ring \(k[V]^G\), the coefficient of \(t^d\) counts the number of linearly independent invariant polynomials of degree \(d\). This is the single most useful piece of information you can have before searching for generators: it tells you how many to expect in each degree, and when you can stop looking.

Each group type gives a different formula for the Hilbert series. For finite groups, Molien’s theorem expresses it as a sum over group elements. For tori, it reduces to counting lattice points in a cone. For compact Lie groups, it becomes an integral over the maximal torus (the Molien-Weyl formula). We will derive and implement each of these.

The Hilbert series also encodes structural information. If \(k[V]^G\) is Cohen-Macaulay (which it always is for finite groups in characteristic zero, by the Hochster-Roberts theorem), then the series factors as:

\[ H(t) = \frac{h_0 + h_1 t + \cdots + h_s t^s}{(1-t^{d_1})(1-t^{d_2})\cdots(1-t^{d_n})} \]

where \(d_1, \ldots, d_n\) are the degrees of the primary invariants and the numerator counts secondary invariants. We will return to this decomposition in the section on primary and secondary invariants.
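To make the factored form concrete: for \(S_3\) acting on \(\mathbb{C}^3\), the elementary symmetric polynomials of degrees 1, 2, 3 freely generate the invariant ring, so \(H(t) = 1/((1-t)(1-t^2)(1-t^3))\). A standalone sketch (the function name is mine, not the library's) expands such a product and recovers the coefficients from the example workflow:

```python
def series_coeffs(degrees: list[int], max_d: int) -> list[int]:
    """Taylor coefficients of prod_i 1/(1 - t^{d_i}) up to degree max_d."""
    coeffs = [1] + [0] * max_d
    for d in degrees:
        # multiply the running series by 1/(1 - t^d) = 1 + t^d + t^{2d} + ...
        for i in range(d, max_d + 1):
            coeffs[i] += coeffs[i - d]
    return coeffs

# S_3 on C^3: primary invariants of degrees 1, 2, 3, trivial numerator
series_coeffs([1, 2, 3], 6)    # [1, 1, 2, 3, 4, 5, 7]
```

This matches the Hilbert series `[1, 1, 2, 3, 4, 5, 7]` computed in the example program above.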

Tori

Now that we’ve laid out the data structures and the Hilbert series as our main bookkeeping tool, we can start implementing. We begin with the simplest case: torus actions on a vector space.

Introduction to Tori

Let’s define the torus and its action on a vector space.

Given a group \(G\), we say that \(G\) is a torus if it is isomorphic to \((\mathbb{C}^*)^r\) (for some integer \(r > 0\)). We will write \(T\) for such a group.

An element of \(T\) is therefore a tuple \[ t = (t_1, \ldots, t_r) \] \[ t_i \in \mathbb{C}^* \]

and multiplication is componentwise.

Let \(T\) act linearly on a finite-dimensional complex vector space \(V\). Then we can choose a basis \[ e_1, \ldots, e_m \]

of \(V\) such that each basis vector is just rescaled by the action. That is, for each basis vector \(e_i\) and each \(t \in T\), there is some scalar \(\lambda_i(t) \in \mathbb{C}^*\) such that \[ t \cdot e_i = \lambda_i(t) e_i \]

Proof: See this StackExchange post, which shows that the action is diagonalizable. \(\square\)

The group law forces these scalar functions to behave multiplicatively. So \[ (ts)\cdot e_i = t \cdot (s \cdot e_i) \]

implies

\[ \lambda_i(ts)e_i = t \cdot (\lambda_i(s)e_i) = \lambda_i(s)\lambda_i(t)e_i \]

so

\[ \lambda_i(ts) = \lambda_i(t)\lambda_i(s) \]

Thus each \(\lambda_i : T \to \mathbb{C}^*\) is a group homomorphism.

Definition 5 A group homomorphism \[ \lambda : T \to \mathbb{C}^* \]

is called a character of the torus.

Integer Linear Algebra of Invariants

So each basis vector \(e_i\) comes with a character \(\lambda_i\), and under the right basis, the action is of the form \[ t \cdot e_i = \lambda_i(t)e_i \]

Because \[ T = (\mathbb{C}^*)^r \]

every character is given by a monomial in the torus coordinates:

\[ \lambda_i(t_1,\ldots,t_r) = t_1^{w_{i1}}\cdots t_r^{w_{ir}} \]

for some integers

\[ (w_{i1},\ldots,w_{ir}) \in \mathbb{Z}^r \]

These integers are called the weights of the action. So after choosing this basis, the torus action can be written as \[ t \cdot e_i = t_1^{w_{i1}}\cdots t_r^{w_{ir}} e_i \]

If we write a vector \[ v = \sum_{i=1}^m v_i e_i \]

then

\[ t \cdot v = \sum_{i=1}^m t_1^{w_{i1}}\cdots t_r^{w_{ir}} v_i e_i \]

So a torus action is encoded by a list of integer weight vectors

\[ w_i = (w_{i1},\ldots,w_{ir}) \in \mathbb{Z}^r \]

or equivalently by an integer matrix of weights.

Now let

\[ x_1, \ldots, x_m \]

denote the coordinates corresponding to the basis

\[ e_1, \ldots, e_m \]

Since each basis vector is scaled by a weight, the coordinates transform by

\[ t \cdot x_i = t^{w_i} x_i = t_1^{w_{i1}} \cdots t_r^{w_{ir}} x_i \]

where

\[ w_i = (w_{i1}, \ldots, w_{ir}) \in \mathbb{Z}^r \]

Now take a monomial \[ x^\alpha = x_1^{\alpha_1}\cdots x_m^{\alpha_m} \] \[ \alpha = (\alpha_1, \ldots, \alpha_m) \in \mathbb{N}^m \]

Then the torus acts on it by

\[ t \cdot x^\alpha = (t \cdot x_1)^{\alpha_1}\cdots (t \cdot x_m)^{\alpha_m} \]

Substituting in the weight formula gives

\[ t \cdot x^\alpha = (t^{w_1}x_1)^{\alpha_1}\cdots (t^{w_m}x_m)^{\alpha_m} = t^{\alpha_1 w_1 + \cdots + \alpha_m w_m} x^\alpha \]

So the monomial \(x^\alpha\) is again scaled by a single weight (the “total weight” of the monomial): \[ \alpha_1 w_1 + \cdots + \alpha_m w_m \in \mathbb{Z}^r \]

If we assemble the weight vectors into a matrix \[ W = [w_1 \ \cdots \ w_m] \in \operatorname{Mat}_{r \times m}(\mathbb{Z}), \]

then the total weight can be written as \[ W\alpha \]

Thus

\[ t \cdot x^\alpha = t^{W\alpha} x^\alpha \]

It follows that the monomial \(x^\alpha\) is invariant if and only if its total weight is zero: \[ W\alpha = 0 \]

So to find invariant monomials, we just have to solve the integer linear system

\[ W\alpha = 0 \]

subject to the constraint that the exponents are nonnegative integers:

\[ \alpha \in \mathbb{N}^m \]

So the exponent vectors of invariant monomials form the set

\[ \ker(W) \cap \mathbb{N}^m \]

This is a semigroup under addition, and the invariant ring is generated by the corresponding monomials.
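Checking the criterion is a single integer matrix product. A minimal numpy illustration, using the weights \([1, 1, -2]\) from the worked example below:

```python
import numpy as np

# C* acting on C^3 with weight matrix W (one row: the torus has rank 1)
W = np.array([[1, 1, -2]])

alpha = np.array([2, 0, 1])    # exponent vector of x0^2 * x2
beta = np.array([1, 0, 0])     # exponent vector of x0

assert (W @ alpha == 0).all()  # total weight 0: x0^2 * x2 is invariant
assert (W @ beta != 0).any()   # total weight 1: x0 is not invariant
```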

Implementation

What do our data structures and computational tasks look like in the case of a torus action? Let’s take an (abbreviated) look.

Invariance Test

As we saw, for a torus action, we can represent the action by an integer weight matrix \(W\). The invariant monomials correspond to integer solutions of the linear system \(W\alpha = 0\) with \(\alpha \in \mathbb{N}^m\). A polynomial is invariant if and only if every monomial in its support passes this test.

class Torus:
    """Torus T = (C*)^r acting on C^m via weight matrix W."""

    def __init__(self, W: np.ndarray):
        self.W = np.asarray(W, dtype=int)
        self.rank = self.W.shape[0]
        self.n_vars = self.W.shape[1]

    def monomial_weight(self, alpha: tuple[int, ...]) -> np.ndarray:
        """Total weight W @ alpha of monomial x^alpha."""
        return self.W @ np.array(alpha, dtype=int)

    def is_invariant_monomial(self, alpha: tuple[int, ...]) -> bool:
        return np.all(self.monomial_weight(alpha) == 0)

    def is_invariant(self, f: Poly) -> bool:
        """A polynomial is torus-invariant iff every monomial has weight zero."""
        return all(self.is_invariant_monomial(alpha) for alpha in f)

The penultimate method checks whether a monomial is invariant by checking if its weight is zero. The last method checks whether a polynomial is invariant by checking if every monomial in its support is invariant.

Enumerating Invariants

Given the invariance test, we can enumerate all invariant monomials of a given degree by filtering over all monomials. The Hilbert series (defined in the previous section) counts how many there are in each degree:

    def invariants_of_degree(self, d: int) -> list[Poly]:
        """Basis of invariant monomials of degree d."""
        return [
            poly.mono(alpha)
            for alpha in poly.monomials_of_degree(self.n_vars, d)
            if self.is_invariant_monomial(alpha)
        ]

    def hilbert_coeffs(self, max_d: int) -> list[int]:
        """Number of invariant monomials in each degree."""
        return [len(self.invariants_of_degree(d)) for d in range(max_d + 1)]

The minimal generators of the semigroup of invariant monomials are the invariant monomials that are not products of simpler ones. Together they form the Hilbert basis of the semigroup:

    def hilbert_basis(self, max_degree: int = 20) -> list[tuple[int, ...]]:
        """Minimal generators of the semigroup ker(W) ∩ N^m.
        """
        all_inv = []
        for d in range(1, max_degree + 1):
            for alpha in poly.monomials_of_degree(self.n_vars, d):
                if self.is_invariant_monomial(alpha):
                    all_inv.append(alpha)

        inv_set = set(all_inv)
        generators = []
        for alpha in all_inv:
            a = np.array(alpha)
            is_reducible = any(
                np.all(a - np.array(beta) >= 0)
                and np.any(a - np.array(beta) > 0)
                and tuple(a - np.array(beta)) in inv_set
                for beta in generators
            )
            if not is_reducible:
                generators.append(alpha)
        return generators

Example

Consider \(\mathbb{C}^*\) acting on \(\mathbb{C}^3\) with weights \([1, 1, -2]\).

So we have

\[ t \cdot (x_0, x_1, x_2) = (t x_0, t x_1, t^{-2} x_2) \]

We are looking for a polynomial \(f(x_0, x_1, x_2)\) such that

\[ f(t x_0, t x_1, t^{-2} x_2) = f(x_0, x_1, x_2) \]

Given a monomial \(x_0^a x_1^b x_2^c\), we thus have:

\[ x_0^a x_1^b x_2^c = t^{a + b - 2c} x_0^a x_1^b x_2^c \]

Which is invariant when \(a + b - 2c = 0\), or \(a + b = 2c\).

Therefore, if we set \(a + b + c = d\), then we have \(2c + c = d\), or \(d = 3c\). Thus, since degrees are integers, we conclude that the invariant monomials exist only in degrees divisible by 3.

Degree 0

\(c = 0\), \(a + b = 0\)

One solution: \((0,0,0)\). So the only invariant monomial is the constant \(1\).

Degrees 1, 2

\(c = d/3\) is not an integer. No invariants.

Degree 3

\(c = 1\), \(a + b = 2\).

There are three solutions: \((2,0,1), (1,1,1), (0,2,1)\). The corresponding invariant monomials are \(x_0^2 x_2\), \(x_0 x_1 x_2\), and \(x_1^2 x_2\).

Degree 6

\(c = 2\), \(a + b = 4\). Now there are five solutions: \((4,0,2), (3,1,2), (2,2,2), (1,3,2), (0,4,2)\).

General

Degree \(3k\) has \(2k+1\) invariant monomials (choose how to split \(a + b = 2k\) between the two variables). So the Hilbert series coefficients are \(1, 0, 0, 3, 0, 0, 5, 0, 0, 7, \ldots\)

Let’s verify:

T = Torus(np.array([[1, 1, -2]]))

# Hilbert series: count invariant monomials by degree
T.hilbert_coeffs(6)    # [1, 0, 0, 3, 0, 0, 5]

# Hilbert basis: minimal generators of the invariant semigroup
T.hilbert_basis()      # [(2, 0, 1), (1, 1, 1), (0, 2, 1)]
# These correspond to x0^2*x2, x0*x1*x2, x1^2*x2

Therefore, the three generators correspond to the monomials \(x_0^2 x_2\), \(x_0 x_1 x_2\), and \(x_1^2 x_2\). Every invariant monomial is a product of these three.

For this example, it’s easy to see why these are the generators. The degree 6 invariants are all products of the degree 3 invariants, and the degree 3 invariants are not products of simpler invariants. So the degree 3 invariants are the minimal generators. We got the Hilbert basis just by brute force enumeration in this case, but in general we might need a more sophisticated algorithm to find the minimal generators of the semigroup.

Relations

Now that we have the three generators, we can ask about the relations among them.

In the example above, there are three generators \(g_0 = x_0^2 x_2\), \(g_1 = x_0 x_1 x_2\), \(g_2 = x_1^2 x_2\), which satisfy the relation \(g_0 g_2 = g_1^2\).

It makes sense that there is one relation, as the space \(V\) has dimension 3 and the torus has dimension 1, so the quotient \(V/\!\!/T\) has dimension 3 - 1 = 2. With 3 generators being mapped to a 2-dimensional space, we expect 3 - 2 = 1 relation among them.
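Before deriving the relation symbolically, we can spot-check it numerically (a throwaway sketch, not library code): \(g_0 g_2 - g_1^2\) vanishes identically, so in particular it vanishes at random points.

```python
import numpy as np

# Numerical spot-check of the relation g0 * g2 = g1^2 at random points.
rng = np.random.default_rng(0)
for _ in range(100):
    x0, x1, x2 = rng.standard_normal(3)
    g0, g1, g2 = x0**2 * x2, x0 * x1 * x2, x1**2 * x2
    assert abs(g0 * g2 - g1**2) < 1e-9
```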

How can we find these relations? We need Gröbner elimination.

The general recipe: introduce new variables \(y_0, y_1, y_2\) (one per generator), form the ideal \((y_0 - g_0, y_1 - g_1, y_2 - g_2)\) in the extended ring \(k[x_0, x_1, x_2, y_0, y_1, y_2]\), and eliminate \(x_0, x_1, x_2\). The surviving polynomials in \(k[y_0, y_1, y_2]\) are the relations among the generators. In this case, we find the relation \(y_0 y_2 - y_1^2 = 0\).

from invariants.groebner import compute_relations
from invariants.poly import format_poly

generators = [poly.mono(a) for a in T.hilbert_basis()]
relations = compute_relations(generators, n_vars=3)
# One relation: y0*y2 - y1^2

Let’s understand Gröbner elimination more deeply in the next section.

Gröbner Bases and Presentations

Let’s say we have a polynomial ring over a field \(k\), and we have an ideal \(I\) generated by some polynomials \(f_1, \ldots, f_r\). A Gröbner basis for \(I\) is a particular kind of generating set that allows us to perform algorithmic operations on the ideal.

Definition 6 A Gröbner basis for an ideal \(I\) in a polynomial ring \(k[x_1, \ldots, x_n]\) is a generating set \(\{g_1, \ldots, g_m\}\) of \(I\) such that, for every \(f \in I\), the leading term \(LT(f)\) is divisible by the leading term of some \(g_i\). That is:

\[ (LT(I)) = (LT(g_1), \ldots, LT(g_m)) \]

Given a Gröbner basis \(\{g_1, \ldots, g_m\}\) for an ideal \(I\), any polynomial \(f\) can be expressed as: \[ f = \sum_{i=1}^{m} q_i g_i + r \]

where the \(q_i\) are polynomials and \(r\) is a polynomial that cannot be reduced further by the \(g_i\) (i.e., no term of \(r\) is divisible by any leading term of the \(g_i\)). The remainder \(r\) is uniquely determined by \(f\), though the quotients \(q_i\) in general are not.

Proof Sketch: This follows from the division algorithm for polynomials. We repeatedly divide \(f\) by the \(g_i\) until we can no longer reduce it, which gives us the desired expression.

Buchberger’s Algorithm

Define the S-polynomial of two polynomials \(f\) and \(g\) as:

\[ S(f, g) = \frac{\mathrm{lcm}(LT(f), LT(g))}{LT(f)} \cdot f - \frac{\mathrm{lcm}(LT(f), LT(g))}{LT(g)} \cdot g \]

The algorithm proceeds as follows:

  1. Start with a set of polynomials \(F = \{f_1, \ldots, f_r\}\).
  2. For each pair of polynomials \(f_i, f_j \in F\), compute their S-polynomial \(S(f_i, f_j)\).
  3. Reduce the S-polynomial modulo the current set of polynomials.
  4. If the reduction is non-zero, add it to the set \(F\).
  5. Repeat steps 2-4 until no new polynomials are added.

We can think of this as similar to Gaussian elimination. In Gaussian elimination, we perform row operations on linear equations: \[ \begin{align*} a_1x + b_1y + c_1z &= d_1 \\ a_2x + b_2y + c_2z &= d_2 \\ a_3x + b_3y + c_3z &= d_3 \end{align*} \] Subtracting \(\tfrac{a_2}{a_1}\) times the first row from the second, and \(\tfrac{a_3}{a_1}\) times the first row from the third, eliminates \(x\) from the last two rows: \[ \begin{align*} a_1x + b_1y + c_1z &= d_1 \\ \bigl(b_2 - \tfrac{a_2}{a_1}b_1\bigr)y + \bigl(c_2 - \tfrac{a_2}{a_1}c_1\bigr)z &= d_2 - \tfrac{a_2}{a_1}d_1 \\ \bigl(b_3 - \tfrac{a_3}{a_1}b_1\bigr)y + \bigl(c_3 - \tfrac{a_3}{a_1}c_1\bigr)z &= d_3 - \tfrac{a_3}{a_1}d_1 \end{align*} \]

and so on, eliminating each variable in turn. In Buchberger’s algorithm, we perform similar operations on polynomials to eliminate leading terms and find a Gröbner basis.

As an example, suppose we have the ideal \(I\) generated by the polynomials \(f_1 = x^2 + y - 1\) and \(f_2 = xy + 1\). Impose an ordering on the monomials (\(x > y\)). \[ \begin{align*} f_1 &= x^2 + y - 1 \\ f_2 &= xy + 1 \end{align*} \]

The leading terms are \(LT(f_1) = x^2\) and \(LT(f_2) = xy\). The least common multiple of the leading terms is \(\mathrm{lcm}(x^2, xy) = x^2y\). Since \(\mathrm{lcm}/LT(f_1) = y\) and \(\mathrm{lcm}/LT(f_2) = x\), multiply the first polynomial by \(y\) and the second by \(x\) to get:

\[ \begin{align*} y f_1 &= x^2y + y^2 - y \\ x f_2 &= x^2y + x \end{align*} \]

Now subtract the second from the first (this is where the S-polynomial comes from): \[ S(f_1, f_2) = (x^2y + y^2 - y) - (x^2y + x) = y^2 - y - x \]

If we fix a maximum degree in addition to the monomial ordering, we can even represent the polynomials as a matrix of coefficients (one column per monomial, with columns sorted by the monomial order) and perform Gaussian elimination on the rows to find the Gröbner basis. Each monomial plays the role that a variable plays in the linear setting: we perform row operations to eliminate leading monomials until we have a basis that lets us rewrite any polynomial in the ideal in a unique way.

The primary gotcha is that in Buchberger’s algorithm, we can end up with polynomials of a higher degree than the original generators, which can lead to combinatorial explosion. This is one of the main computational challenges in using Gröbner bases for invariant theory.

Here’s what this looks like in our Python code:

def s_poly(f: Poly, g: Poly, order=poly.grlex) -> Poly:
    """S-polynomial of f and g"""
    lm_f = poly.leading_monomial(f, order)
    lm_g = poly.leading_monomial(g, order)
    lc_f = poly.leading_coefficient(f, order)
    lc_g = poly.leading_coefficient(g, order)
    gamma = poly.mono_lcm(lm_f, lm_g)
    t_f = poly.mono(poly.mono_div(gamma, lm_f), Fraction(1) / lc_f)
    t_g = poly.mono(poly.mono_div(gamma, lm_g), Fraction(1) / lc_g)
    return poly.sub(poly.mul(t_f, f), poly.mul(t_g, g))

def buchberger(F: list[Poly], order=poly.grlex) -> list[Poly]:
    """Compute a reduced Gröbner basis for the ideal generated by F."""
    G = [g for g in F if g]
    if not G:
        return []
    pairs = [(i, j) for i in range(len(G)) for j in range(i + 1, len(G))]
    while pairs:
        i, j = pairs.pop(0)
        sp = s_poly(G[i], G[j], order)
        r = reduce(sp, G, order)
        if r:
            k = len(G)
            pairs.extend((m, k) for m in range(k))
            G.append(r)
    return _reduce_basis(G, order)
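To sanity-check an implementation like this, one can compare against sympy's `groebner` (an independent implementation; this is a cross-check, not part of the library). For the worked example above:

```python
from sympy import groebner, symbols

# Cross-check against sympy's independent Groebner implementation.
x, y = symbols('x y')
G = groebner([x**2 + y - 1, x*y + 1], x, y, order='grlex')

# The S-polynomial computed by hand above, y^2 - y - x, lies in the ideal...
assert G.contains(y**2 - y - x)
# ...and so do the original generators, of course.
assert G.contains(x**2 + y - 1) and G.contains(x*y + 1)
```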

Key Operations

Using Gröbner bases, we can implement several key reusable operations that are essential for computational invariant theory.

Computation of Normal Forms

Definition 7 Let \(S \subset k[x_1, \ldots, x_n]\) be a set of polynomials and let \(I = \langle S \rangle\) be the ideal generated by \(S\). The normal form of a polynomial \(f\) with respect to \(I\) is the unique polynomial \(r\) such that no term of \(r\) is divisible by the leading term of any polynomial in a Gröbner basis for \(I\), and such that \(f - r \in I\).

In other words, the normal form of \(f\) is the “remainder” when \(f\) is reduced by the Gröbner basis of \(I\). It is a canonical representative of the equivalence class of \(f\) in the quotient ring \(k[x_1, \ldots, x_n]/I\).

We can compute the normal form with an algorithm:

  1. Start with a polynomial \(f\) and a Gröbner basis \(\{g_1, \ldots, g_m\}\) for an ideal \(I\).
  2. Initialize \(r = f\).
  3. While there exists a \(g_i\) such that the leading term of \(g_i\) divides a term in \(r\):
    1. Let \(t\) be the term in \(r\) that is divisible by \(LT(g_i)\).
    2. Replace \(r\) with \(r - \frac{t}{LT(g_i)} g_i\), cancelling the term \(t\).
  4. Return \(r\) as the normal form of \(f\) with respect to \(I\).

Basically, this is just the division algorithm for polynomials. The normal form is important because it allows us to test whether a polynomial belongs to the ideal (if the normal form is zero) and to find unique representatives of equivalence classes in the quotient ring.

In our Python code, we can implement this as follows:

def reduce(f: Poly, G: list[Poly], order=poly.grlex) -> Poly:
    """Reduce f modulo G by repeated leading-term cancellation.
    Returns the remainder (normal form if G is a Groebner basis).
    """
    r: Poly = {}
    p = dict(f)
    while p:
        reduced = False
        for g in G:
            if not g:
                continue
            lm_g = poly.leading_monomial(g, order)
            lc_g = poly.leading_coefficient(g, order)
            for alpha in sorted(p.keys(), key=order, reverse=True):
                if poly.mono_divides(lm_g, alpha):
                    quot_mono = poly.mono_div(alpha, lm_g)
                    quot_coeff = p[alpha] / lc_g
                    shift = poly.mul(poly.mono(quot_mono, quot_coeff), g)
                    p = poly.sub(p, shift)
                    reduced = True
                    break
            if reduced:
                break
        if not reduced:
            if p:
                lm_p = poly.leading_monomial(p, order)
                lc_p = p[lm_p]
                r = poly.add(r, poly.mono(lm_p, lc_p))
                del p[lm_p]
    return r

Ideal Membership Testing

Based on our last section, we can also test for ideal membership. Given a polynomial \(f\) and an ideal \(I\) generated by a set of polynomials, we can test whether \(f\) belongs to \(I\) by computing the normal form of \(f\) with respect to a Gröbner basis for \(I\). If the normal form is zero, then \(f \in I\); otherwise, \(f \notin I\).

In code:

def normal_form(f: Poly, basis: list[Poly], order=poly.grlex) -> Poly:
    """Normal form of f with respect to a Gröbner basis."""
    return reduce(f, basis, order)

def in_ideal(f: Poly, basis: list[Poly], order=poly.grlex) -> bool:
    """Test whether f belongs to the ideal generated by basis."""
    return not normal_form(f, basis, order)

Ideal Intersection and Quotients

We can also use elimination to compute the intersection \(I \cap J\) of two ideals. Introduce a new variable \(t\) and form the ideal \(tI + (1-t)J\) (i.e., \((tf_1, \ldots, tf_r, (1-t)g_1, \ldots, (1-t)g_s)\)). Then eliminate \(t\), so the surviving polynomials generate \(I \cap J\). This works because a polynomial lies in \(I \cap J\) if and only if it can be written as both a combination of the \(f_i\) and a combination of the \(g_j\).

Note that taking the union of generators gives the sum \(I + J\), not the intersection. The intersection requires the elimination trick above.

def ideal_intersection(F: list[Poly], G: list[Poly], n_vars: int) -> list[Poly]:
    """Intersection of ideals I = (F) and J = (G).

    Introduces variable t, forms (t*f_i, (1-t)*g_j), eliminates t.
    """
    t_var = poly.var(0, n_vars + 1)
    one_minus_t = poly.sub(poly.const(1, n_vars + 1), t_var)

    def embed(f: Poly) -> Poly:
        return {(0,) + alpha: c for alpha, c in f.items()}

    gens = [poly.mul(t_var, embed(f)) for f in F]
    gens += [poly.mul(one_minus_t, embed(g)) for g in G]

    elim = eliminate(gens, k=1, n_vars=n_vars + 1)
    return [{alpha[1:]: c for alpha, c in g.items()} for g in elim]

As an example, take \(I = (x^2, y)\) and \(J = (x, y^2)\) in \(k[x, y]\). A monomial lies in \(I\) if it is divisible by \(x^2\) or by \(y\), and in \(J\) if it is divisible by \(x\) or by \(y^2\) (a polynomial lies in a monomial ideal iff each of its terms does). The monomials in both generate the intersection: \(I \cap J = (x^2, xy, y^2)\).

I = [poly.mono((2, 0)), poly.var(1, 2)]   # (x^2, y)
J = [poly.var(0, 2), poly.mono((0, 2))]   # (x, y^2)

result = ideal_intersection(I, J, n_vars=2)
# [x^2, xy, y^2]

Elimination of Variables

Suppose we want to eliminate a variable \(x_n\) from an ideal \(I\) in \(k[x_1, \ldots, x_n]\). We can do this by computing a Gröbner basis for \(I\) with respect to an elimination ordering in which \(x_n\) is larger than all the other variables. The polynomials in the resulting Gröbner basis that do not involve \(x_n\) then generate the elimination ideal \(I \cap k[x_1, \ldots, x_{n-1}]\).

def eliminate(generators: list[Poly], k: int, n_vars: int) -> list[Poly]:
    """Eliminate the first k variables.

    Computes a Gröbner basis with an elimination ordering that pushes
    x_0, ..., x_{k-1} to the top, then returns only the polynomials
    that live in k[x_k, ..., x_{n-1}].
    """
    order = poly.elimination_order(k)
    gb = buchberger(generators, order)
    return [g for g in gb if _involves_only(g, k, n_vars)]

Conceptually, elimination computes the intersection of \(I\) with the subring of polynomials that do not involve the eliminated variables. It is also the engine behind the ideal-intersection routine above, which worked by eliminating the auxiliary variable \(t\).
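As an independent cross-check with sympy (again, not our `poly` module): eliminating \(x\) from the ideal \((y - x^2,\; z - x^3)\) should leave the defining equation of the twisted cubic, \(y^3 - z^2\).

```python
from sympy import groebner, symbols

# Independent cross-check with sympy: eliminate x from (y - x^2, z - x^3).
# Lex order with x largest is an elimination order for x; the x-free part
# of the basis generates the elimination ideal I ∩ k[y, z].
x, y, z = symbols('x y z')
G = groebner([y - x**2, z - x**3], x, y, z, order='lex')

elim = [g for g in G.exprs if x not in g.free_symbols]
assert elim                          # elimination produced something
assert G.contains(y**3 - z**2)       # the twisted cubic's defining equation
```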

Computation of Relations between Polynomial Generators

Given generators \(g_1, \ldots, g_s\) in \(k[x_0, \ldots, x_{n-1}]\), we want to find all the polynomial relations among them.

Introduce new variables \(y_0, \ldots, y_{s-1}\), form the ideal \((y_i - g_i)\) in \(k[x, y]\), and eliminate \(x_0, \ldots, x_{n-1}\). The surviving polynomials in \(k[y]\) are exactly the relations.

def compute_relations(generators: list[Poly], n_vars: int,
                      order=poly.grlex) -> list[Poly]:
    """Compute relations among polynomial generators.

    Given generators g_1, ..., g_s in k[x_0, ..., x_{n-1}], introduce
    new variables y_0, ..., y_{s-1} and compute the kernel of the map
    k[y_0, ..., y_{s-1}] -> k[x_0, ..., x_{n-1}] sending y_i -> g_i.
    """
    s = len(generators)
    total_vars = n_vars + s

    # embed each g_i into k[x_0,...,x_{n-1}, y_0,...,y_{s-1}]
    elim_gens = []
    for i, g in enumerate(generators):
        y_alpha = (0,) * n_vars + tuple(1 if j == i else 0 for j in range(s))
        yi = poly.mono(y_alpha)
        g_emb: Poly = {}
        for alpha, c in g.items():
            new_alpha = alpha + (0,) * s
            g_emb[new_alpha] = c
        elim_gens.append(poly.sub(yi, g_emb))

    # eliminate x_0, ..., x_{n-1}
    elim_order = poly.elimination_order(n_vars)
    gb = buchberger(elim_gens, elim_order)

    # keep only polynomials in k[y_0, ..., y_{s-1}]
    return [g for g in gb if _involves_only(g, n_vars, total_vars)]

Computation of Syzygies

Given generators \(f_1, \ldots, f_s\) of an ideal, a syzygy is a tuple \((h_1, \ldots, h_s)\) of polynomials such that \(h_1 f_1 + \cdots + h_s f_s = 0\). The syzygies encode the dependencies among the generators.

The S-polynomial construction we looked at above already produces some syzygies. If \(S(f_i, f_j)\) reduces to zero modulo the generators, the reduction path gives a relation \(h_1 f_1 + \cdots + h_s f_s = 0\). The routine below should be read as a partial syzygy computation rather than a complete one.

def compute_syzygies(generators: list[Poly], n_vars: int,
                     order=poly.grlex) -> list[Poly]:
    """Compute syzygies among generators of an ideal.

    For each pair (i,j), if the S-polynomial reduces to zero,
    record the corresponding syzygy in k[x, e_0, ..., e_{s-1}].
    """
    s = len(generators)
    syzygies = []
    for i in range(s):
        for j in range(i + 1, s):
            fi, fj = generators[i], generators[j]
            if not fi or not fj:
                continue
            sp = s_poly(fi, fj, order)
            if not reduce(sp, generators, order):
                # Build syzygy: (lcm/LT(fi))/lc_i * e_i - (lcm/LT(fj))/lc_j * e_j
                lm_i = poly.leading_monomial(fi, order)
                lm_j = poly.leading_monomial(fj, order)
                gamma = poly.mono_lcm(lm_i, lm_j)
                qi = poly.mono_div(gamma, lm_i)
                qj = poly.mono_div(gamma, lm_j)
                lc_i = poly.leading_coefficient(fi, order)
                lc_j = poly.leading_coefficient(fj, order)
                # encode in k[x_0,...,x_{n-1}, e_0,...,e_{s-1}]
                ei = tuple(1 if k == i else 0 for k in range(s))
                ej = tuple(1 if k == j else 0 for k in range(s))
                syz = poly.sub(
                    poly.mono(qi + ei, Fraction(1) / lc_i),
                    poly.mono(qj + ej, Fraction(1) / lc_j),
                )
                syzygies.append(syz)
    return syzygies

# Example: syzygies of (x, y) in k[x,y]
# Returns y*e_0 - x*e_1, i.e., y*x - x*y = 0

Computation of Hilbert Series

Given a Gröbner basis for an ideal \(I\), the Hilbert function of the quotient ring \(k[x_1, \ldots, x_n]/I\) counts the monomials in each degree that are not divisible by any leading monomial of the basis. This works because \(k[x]/\mathrm{LT}(I)\) has the same Hilbert function as \(k[x]/I\) (Macaulay's theorem), and the monomials not in \(\mathrm{LT}(I)\) form a vector space basis of \(k[x]/I\).

def hilbert_function(basis: list[Poly], n_vars: int, max_d: int,
                     order=poly.grlex) -> list[int]:
    """Hilbert function of k[x]/I from degree 0 to max_d.

    Counts monomials of each degree not divisible by any leading
    monomial of the Gröbner basis.
    """
    leading_monomials = [poly.leading_monomial(g, order) for g in basis if g]
    result = []
    for d in range(max_d + 1):
        count = 0
        for alpha in poly.monomials_of_degree(n_vars, d):
            if not any(poly.mono_divides(lm, alpha) for lm in leading_monomials):
                count += 1
        result.append(count)
    return result

# Example: GB of (x^2 + y - 1, xy + 1) has LT ideal (x^2, xy, y^2)
# Hilbert function: [1, 2, 0, 0, ...] — quotient is 3-dimensional

Nullstellensatz-Style Problems

The Nullstellensatz, which we saw in the last post, says that a system of polynomial equations \(f_1 = \cdots = f_r = 0\) has no common solution if and only if \(1\) lies in the ideal \((f_1, \ldots, f_r)\). We can test this by computing a Gröbner basis. If the basis contains a nonzero constant, the ideal is all of \(k[x]\) and the system is inconsistent.

def has_common_root(basis: list[Poly], order=poly.grlex) -> bool:
    """Test whether V(I) is nonempty.

    By the Nullstellensatz, V(I) = {} iff 1 ∈ I iff GB = {1}.
    """
    gb = buchberger(basis, order)
    for g in gb:
        if all(e == 0 for e in poly.leading_monomial(g, order)):
            return False  # 1 ∈ I, no common root
    return True

# Example: has_common_root([x, 1-x]) returns False (no solution)

Finite Groups

Now that we have some more general infrastructure from the last two sections, let’s work on another computationally friendly class of examples: finite groups.

Reynolds Operator

In the last post, we saw that for certain kinds of groups (like finite groups) \(G\) acting on vector spaces \(V\), we could define the Reynolds operator, which is an idempotent operator that takes any polynomial and averages over the group action to produce an invariant polynomial:

\[ R(f) = \frac{1}{|G|} \sum_{g \in G} g \cdot f \]

If we have a polynomial \(f\) that is not invariant, applying the Reynolds operator will give us an invariant polynomial.

Recall that we represented a finite group as a list of matrices that act on the variables. To apply the Reynolds operator, we can take any polynomial and apply each group element to it by substituting the linear forms corresponding to the group action. Then we average these transformed polynomials to get an invariant.

Start with a set of polynomials that generate the polynomial ring \(k[V]\) and apply the Reynolds operator to each of these polynomials to project them onto the invariant ring. This will give us a set of invariant polynomials, which may generate the invariant ring or at least give us a starting point for finding generators.

Let’s implement the Reynolds operator in code, which involves summing over the group elements and applying the group action to the polynomial. While computationally expensive for large groups, it’s a straightforward way to find invariants.

The key subroutine is apply_to_poly. Given a matrix \(g\) and a polynomial \(f\), we compute \((g \cdot f)(x) = f(g^{-1}x)\) by substituting linear forms for each variable. Reynolds then just averages over all group elements.

def apply_to_poly(self, g: np.ndarray, f: Poly) -> Poly:
    """(g · f)(x) = f(g^{-1} x) via variable substitution."""
    g_inv = np.linalg.inv(g)
    n = self.n_vars
    sub_polys = []
    for i in range(n):
        row: Poly = {}
        for j in range(n):
            c = Fraction(g_inv[i, j]).limit_denominator(10**9)
            if c != 0:
                row[tuple(1 if k == j else 0 for k in range(n))] = c
        sub_polys.append(row)

    result: Poly = {}
    for alpha, coeff in f.items():
        term = poly.const(coeff, n)
        for i, e in enumerate(alpha):
            for _ in range(e):
                term = poly.mul(term, sub_polys[i])
        result = poly.add(result, term)
    return result

def reynolds(self, f: Poly) -> Poly:
    """Reynolds operator: R(f) = (1/|G|) sum_{g in G} g · f."""
    total: Poly = {}
    for g in self.matrices:
        total = poly.add(total, self.apply_to_poly(g, f))
    return poly.scale(Fraction(1, self.order), total)

For example, take \(S_3\) acting on \(\mathbb{C}^3\) by permuting coordinates. The monomial \(x_0^2\) is not invariant (permutations move it to \(x_1^2\) or \(x_2^2\)). Applying Reynolds averages over all 6 permutations:

G = symmetric(3)
f = poly.mono((2, 0, 0))          
Rf = G.reynolds(f)
# Rf = 1/3*x0^2 + 1/3*x1^2 + 1/3*x2^2

assert G.is_invariant(Rf)          # True
assert G.reynolds(Rf) == Rf        # idempotent: R(R(f)) = R(f)

The output \(\frac{1}{3}(x_0^2 + x_1^2 + x_2^2)\) is the power sum \(p_2/3\), which is obviously symmetric. Note that Reynolds is idempotent, so applying it to an already-invariant polynomial returns the same polynomial.

Orbit Sums

Reynolds averaging is a special case of a more general construction called orbit sums. Given a polynomial \(f\), we can consider the orbit of \(f\) under the group action:

\[ \text{Orb}(f) = \{g \cdot f \mid g \in G\} \]

The orbit sum is the sum of all elements in the orbit:

\[ S(f) = \sum_{g \in G} g \cdot f \]

This is similar to the Reynolds operator, but without the normalization factor of \(1/|G|\).

Since the orbit sum is just the sum of elements in the orbit, and orbits are invariant under the group action (since the summands are just permuted), the orbit sum is also invariant under the group action.

That is, for any \(h \in G\):

\[ h \cdot S(f) = h \cdot \left( \sum_{g \in G} g \cdot f \right) = \sum_{g \in G} h \cdot (g \cdot f) = \sum_{g' \in G} g' \cdot f = S(f) \]

For degree \(d\) polynomials, this works cleanly when the action sends monomials to monomials (for example permutation actions): then the invariance condition forces the coefficients to be constant on monomial orbits, so every homogeneous invariant is a linear combination of orbit sums. For a general linear action, orbit sums of monomials are better viewed as a convenient source of candidate invariants than as a general basis theorem.

So in the monomial/permutation cases, the orbit sums of monomials form a basis for the space of homogeneous invariants of degree \(d\). This gives us a way to find a basis for the invariant ring degree by degree in those cases.

In code, we can implement this as follows:

def orbit_sum(self, f: Poly) -> Poly:
    """Sum over distinct images of f under G."""
    seen = set()
    total: Poly = {}
    for g in self.matrices:
        gf = self.apply_to_poly(g, f)
        key = frozenset(gf.items())
        if key not in seen:
            seen.add(key)
            total = poly.add(total, gf)
    return total

def invariants_of_degree(self, d: int) -> list[Poly]:
    """Degree-d invariants via orbit sums of monomials."""
    seen_orbits: set[frozenset] = set()
    basis = []
    for alpha in poly.monomials_of_degree(self.n_vars, d):
        f = poly.mono(alpha)
        orbit_key = frozenset(
            frozenset(self.apply_to_poly(g, f).items())
            for g in self.matrices
        )
        if orbit_key not in seen_orbits:
            seen_orbits.add(orbit_key)
            os = self.orbit_sum(f)
            if os:
                basis.append(os)
    return basis

The orbit_sum method tracks which images have already been seen (via frozenset of the polynomial’s items) to avoid double-counting when the stabilizer of \(f\) is nontrivial. The invariants_of_degree method groups monomials by orbit, computes one orbit sum per orbit, and collects them as a basis.

Continuing with \(S_3\) on \(\mathbb{C}^3\), the degree-2 monomials are \(x_0^2, x_0 x_1, x_0 x_2, x_1^2, x_1 x_2, x_2^2\). Under \(S_3\), these fall into two orbits: \(\{x_0^2, x_1^2, x_2^2\}\) and \(\{x_0 x_1, x_0 x_2, x_1 x_2\}\). The orbit sums give two linearly independent degree-2 invariants:

G = symmetric(3)

# Orbit sum of x0^2: hits x0^2, x1^2, x2^2
G.orbit_sum(poly.mono((2, 0, 0)))
# x0^2 + x1^2 + x2^2

# Orbit sum of x0*x1: hits x0*x1, x0*x2, x1*x2
G.orbit_sum(poly.mono((1, 1, 0)))
# x0*x1 + x0*x2 + x1*x2

# All degree-2 invariants at once
G.invariants_of_degree(2)
# [x0^2 + x1^2 + x2^2, x0*x1 + x0*x2 + x1*x2]

These are the power sum \(p_2 = x_0^2 + x_1^2 + x_2^2\) and the elementary symmetric polynomial \(e_2 = x_0 x_1 + x_0 x_2 + x_1 x_2\).

Molien Series

Since the Reynolds operator projects onto the invariant ring, we know that

\[ \text{dim}_k(k[V]^G_d) = \text{dim}_k(R(k[V]_d)) = \text{tr}(R|_{k[V]_d}) \]

(since the trace of a projection operator is the dimension of its image1).

Thus we can compute the trace as follows:

\[ \text{tr}(R|_{k[V]_d}) = \frac{1}{|G|} \sum_{g \in G} \text{tr}(g|_{k[V]_d}) \]

Since \(k[V]_d\) is the space of homogeneous degree \(d\) polynomials, we can identify it with the \(d\)-th symmetric power of the dual space \(V^*\):

\[ k[V]_d \cong \operatorname{Sym}^d(V^*) \]

The trace of \(g\) on \(\operatorname{Sym}^d(V^*)\) can be computed using the eigenvalues of \(g\) on \(V^*\). If the eigenvalues of \(g\) on \(V^*\) are \(\lambda_1, \ldots, \lambda_n\), then the trace of \(g\) on \(\operatorname{Sym}^d(V^*)\) is given by the complete homogeneous symmetric polynomial of degree \(d\) in the eigenvalues:

\[ \text{tr}(g|_{k[V]_d}) = h_d(\lambda_1, \ldots, \lambda_n) \]

The generating function for the complete homogeneous symmetric polynomials is given by2:

\[ \sum_{d=0}^{\infty} h_d(\lambda_1, \ldots, \lambda_n) t^d = \prod_{i=1}^n \frac{1}{1 - \lambda_i t} = \frac{1}{\det(I - t g)} \]

This motivates the Molien formula for the Hilbert series of the invariant ring3:

\[ H(t) = \frac{1}{|G|} \sum_{g \in G} \frac{1}{\det(I - t g)} \]

To use the Molien formula computationally, we expand \(1/\det(I - tg)\) as a power series in \(t\) for each group element \(g\). Since \(\det(I - tg) = \prod_i (1 - \lambda_i t)\) where \(\lambda_i\) are the eigenvalues of \(g\), the series expansion is a convolution of geometric series. We truncate at degree max_d, sum over all group elements, divide by \(|G|\), and round to the nearest integer (the result is exact for rational eigenvalues; rounding handles floating-point eigenvalues from rotation matrices).

def molien_coeffs(self, max_d: int) -> list[int]:
    """Molien series: H(t) = (1/|G|) sum_{g in G} 1/det(I - tg)."""
    coeffs = np.zeros(max_d + 1, dtype=complex)
    for g in self.matrices:
        eigenvalues = np.linalg.eigvals(g)
        # Expand 1/prod(1 - lambda_i * t) as power series
        series = np.zeros(max_d + 1, dtype=complex)
        series[0] = 1.0
        for lam in eigenvalues:
            powers = np.array([lam**d for d in range(max_d + 1)])
            new_series = np.zeros(max_d + 1, dtype=complex)
            for d in range(max_d + 1):
                new_series[d] = sum(series[k] * powers[d - k]
                                    for k in range(d + 1))
            series = new_series
        coeffs += series
    return [round((coeffs[d] / self.order).real)
            for d in range(max_d + 1)]

For \(S_3\) on \(\mathbb{C}^3\), the group has 6 elements.

The identity has eigenvalues \((1,1,1)\), contributing \(1/(1-t)^3\).

The transpositions have eigenvalues \((1,1,-1)\), contributing \(1/((1-t)^2(1+t))\).

The 3-cycles have eigenvalues \((1, \omega, \omega^2)\) where \(\omega = e^{2\pi i/3}\), contributing \(1/((1-t)(1-\omega t)(1-\omega^2 t)) = 1/(1-t^3)\).

Averaging over all 6 elements:

\[ H(t) = \frac{1}{6}\left(\frac{1}{(1-t)^3} + \frac{3}{(1-t)^2(1+t)} + \frac{2}{(1-t^3)}\right) \]

which simplifies to:

\[ H(t) = \frac{1}{(1-t)(1-t^2)(1-t^3)} \]

So the invariant ring is a polynomial ring in three generators of degrees 1, 2, 3 (the power sums \(p_1, p_2, p_3\)). Verify numerically:

G = symmetric(3)
G.molien_coeffs(10)
# [1, 1, 2, 3, 4, 5, 7, 8, 10, 12, 14]

If we expanded \(1/((1-t)(1-t^2)(1-t^3))\) as a power series, we would get the same coefficients, confirming that the Molien series correctly predicts the dimensions of the invariant ring in each degree.
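As a standalone sanity check (pure Python, independent of the library), we can expand \(1/((1-t)(1-t^2)(1-t^3))\) directly by multiplying in the geometric series \(1/(1-t^k)\) one at a time:

```python
def series_product(degrees, max_d):
    """Coefficients of prod_k 1/(1 - t^k) for k in degrees, up to t^max_d."""
    coeffs = [1] + [0] * max_d            # start from the constant series 1
    for k in degrees:
        # Multiplying by 1/(1 - t^k) satisfies g = f + t^k * g,
        # i.e. new[d] = coeffs[d] + new[d - k].
        new = [0] * (max_d + 1)
        for d in range(max_d + 1):
            new[d] = coeffs[d] + (new[d - k] if d >= k else 0)
        coeffs = new
    return coeffs

series_product([1, 2, 3], 10)
# [1, 1, 2, 3, 4, 5, 7, 8, 10, 12, 14]
```

The coefficients agree with `molien_coeffs(10)` above.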

Noether Bound

The Molien series gives us the dimensions of the graded pieces of the invariant ring, which tells us how many invariants we should expect in each degree. The orbit sums give us explicit invariants to fill those dimensions. But how do we know the search terminates?

Theorem 1 Let \(G\) be a finite group acting linearly on a finite-dimensional vector space \(V\) over a field of characteristic zero. Then the invariant ring \(k[V]^G\) is generated by invariants of degree at most \(|G|\).

Proof: See Fleischmann, “The Noether Bound in Invariant Theory of Finite Groups,” Advances in Mathematics 156 (2000), which also proves the sharper bound \(< |G|\) for non-cyclic groups. \(\square\)

In practice, this means we can search for generators by calling invariants_of_degree(d) for \(d = 1, \ldots, |G|\) and be guaranteed to find them all.

For \(S_3\) with \(|G| = 6\), we only need to check up to degree 6 (and in fact all three generators appear by degree 3).
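The bound can also be attained. A minimal standalone illustration (no library code): \(\mathbb{Z}/4\) acts on \(\mathbb{C}^1\) by \(x \mapsto ix\), so averaging \(x^d\) over the group scales it by \(\frac{1}{4}\sum_k i^{kd}\), which is \(1\) when \(4 \mid d\) and \(0\) otherwise. The first generator \(x^4\) thus appears exactly at degree \(|G| = 4\):

```python
def reynolds_coeff(d):
    """Coefficient of x^d after averaging over Z/4 acting by x -> i*x."""
    powers = {0: 1, 1: 1j, 2: -1, 3: -1j}   # exact powers of i
    return sum(powers[(k * d) % 4] for k in range(4)) / 4

[d for d in range(1, 9) if reynolds_coeff(d) == 1]
# [4, 8]
```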

Practical Issues

There are some practical problems with the computational approach. The main computational bottleneck for finite groups is apply_to_poly, which substitutes linear forms for each variable. For a polynomial with \(T\) terms in \(n\) variables, each group element costs \(O(T \cdot n \cdot \deg(f))\) polynomial multiplications. Summing over all \(|G|\) elements, the total cost of Reynolds or orbit sums is \(O(|G| \cdot T \cdot n \cdot \deg(f))\). For \(S_n\) with \(|G| = n!\), this becomes infeasible quickly.

Orbit sums are a bit cheaper than Reynolds, since they avoid the \(1/|G|\) normalization (keeping coefficients as integers) and deduplicate images. The effective cost is proportional to the orbit size rather than \(|G|\).

The Molien series is cheap by comparison, as it only requires eigenvalue computations (\(O(|G| \cdot n^3)\)) with no polynomial arithmetic.

So probably we compute the Hilbert series first, then use it as a guide to say how many generators to expect in each degree, then do the relatively expensive orbit sum computation.

I’m not going to try to optimize all these algorithms in this post, since I’m still learning how this works myself. But in practice, for large groups, we could likely optimize the apply_to_poly method, perhaps by caching intermediate results or using better data structures for polynomials.

Primary and Secondary Invariants

With the Hilbert series, we can start to understand the structure of the invariant ring. In particular, we want to understand how the invariant ring can be decomposed, typically into a polynomial part and a “remainder” part.

This leads to the concepts of primary and secondary invariants. The idea is that we can find a smaller set of invariants (the primary invariants) that generate a polynomial subring, and then the rest of the invariant ring can be expressed as a module over this polynomial subring, with the secondary invariants serving as module generators. By writing in terms of the primary invariants, we can get a more compact description of the invariant ring, and the secondary invariants capture the “extra” structure that isn’t captured by the primary invariants alone.

Homogeneous Systems of Parameters (HSOP)

Before we can define primary and secondary invariants, we need to introduce the notion of a homogeneous system of parameters (HSOP), which help understand the structure of graded rings.

Definition 8 A homogeneous system of parameters (HSOP) for a graded invariant ring \(k[V]^G\) is a collection of homogeneous elements \(\theta_1, \ldots, \theta_d \in k[V]^G\) such that \(k[V]^G\) is finitely generated as a module over the polynomial subring \(k[\theta_1, \ldots, \theta_d]\).

For finite groups, we can always find an HSOP consisting of algebraically independent homogeneous invariants. In general, finding an HSOP can be more difficult. In the settings we care about here, the issue is not existence so much as actually finding one.

The point of an HSOP is that it gives us a subring of the invariant ring that is “simple” to understand, and we can then study the structure of the full invariant ring as a module over this polynomial subring.

Note also that an HSOP is not necessarily unique.

To find an HSOP, we can use a greedy process to collect invariants. In practice, we look for invariants that appear algebraically independent and then do additional checks:

from invariants.action import find_hsop, find_secondaries, primary_secondary

# S_3 on C^3
G = symmetric(3)
action = invariant_theory(G, polynomial_ring(3))
hsop = find_hsop(action, 4)
# [x0+x1+x2, x0^2+x1^2+x2^2, x0^3+x1^3+x2^3]

As an example, we can look at \(S_3\). The invariant ring is polynomial on three generators, and we can take the three power sums \(p_1 = x_0 + x_1 + x_2\), \(p_2 = x_0^2 + x_1^2 + x_2^2\), \(p_3 = x_0^3 + x_1^3 + x_2^3\). So they form an HSOP.

Cohen–Macaulay

Given we have an HSOP \(\theta_1, \ldots, \theta_d\), and we know that \(k[V]^G\) is finitely generated as a module over \(k[\theta_1, \ldots, \theta_d]\), we can ask about the structure of this module. In the best cases, the invariant ring is a free module over the polynomial subring generated by the HSOP, which means it can be written uniquely as a direct sum of copies of the polynomial subring:

\[ k[V]^G = \bigoplus_{j=1}^{m} \eta_j \cdot k[\theta_1, \ldots, \theta_d] \]

so every element can be written

\[ f = \sum_{j=1}^{m} \eta_j f_j(\theta_1, \ldots, \theta_d) \]

This is the Cohen-Macaulay property. It’s not necessarily true for all invariant rings. We could instead have nontrivial relations among the \(\eta_j\) such as:

\[ p(\theta_1, \ldots, \theta_d) \eta_1 + q(\theta_1, \ldots, \theta_d) \eta_2 = 0 \]

Theorem 2 Let \(G\) be a reductive group acting on a finite-dimensional vector space \(V\) over a field of characteristic zero. Then the invariant ring \(k[V]^G\) is Cohen-Macaulay.

Proof. See Hochster and Roberts.4 \(\square\)

Computing Primary and Secondary Invariants

When the Cohen-Macaulay property holds, we call the HSOP elements \(\theta_1, \ldots, \theta_d\) primary invariants, and the module generators \(\eta_1, \ldots, \eta_m\) secondary invariants. The primary invariants generate a polynomial subring, and the secondary invariants fill in the remainder.

Computationally, this compresses the problem. Instead of searching for all generators of \(k[V]^G\), we can first find an HSOP and then loop through the degrees to find secondary invariants using the Hilbert series (which tells us how many per degree).

For \(S_3\) on \(\mathbb{C}^3\), the invariant ring is polynomial, so the only secondary invariant is \(1\):

primaries, secondaries = primary_secondary(action, 4)
# primaries: [x0+x1+x2, x0^2+x1^2+x2^2, x0^3+x1^3+x2^3]
# secondaries: [1]
# k[V]^G = 1 · k[p1, p2, p3]  — free module of rank 1

A more interesting example. Consider \(\mathbb{Z}/2\) acting on \(\mathbb{C}^2\) by \((x, y) \mapsto (-x, -y)\). The invariant ring is \(k[x^2, xy, y^2]\) with relation \(x^2 y^2 = (xy)^2\), so it is not a polynomial ring:

from invariants.groups.finite import FiniteGroup
G_z2 = FiniteGroup([np.eye(2), -np.eye(2)])
action_z2 = invariant_theory(G_z2, polynomial_ring(2))

primaries_z2, secondaries_z2 = primary_secondary(action_z2, 4)
# primaries: [x0^2, x1^2]
# secondaries: [1, x0*x1]
# k[V]^G = 1 · k[x^2, y^2] (direct_sum) xy · k[x^2, y^2]  — free module of rank 2

Here we have two secondary invariants, \(1\) and \(xy\). So we know every invariant can be written uniquely as \(1 \cdot p(x^2, y^2) + xy \cdot q(x^2, y^2)\). The generator \(xy\) isn’t expressible as a polynomial in the primary invariants \(x^2, y^2\), but the full ring is still a free module over \(k[x^2, y^2]\), as guaranteed by Hochster-Roberts.

We can go further and consider the degree-4 invariants. Under \((x,y) \mapsto (-x,-y)\), a monomial \(x^a y^b\) is invariant iff \(a + b\) is even (i.e. the entire monomial is of even degree), so the degree-4 invariants are all the even-degree monomials in \(x\) and \(y\): \(x^4, x^3y, x^2y^2, xy^3, y^4\). Each decomposes:

  • \(x^4 = 1 \cdot (x^2)^2\)
  • \(x^3 y = xy \cdot x^2\)
  • \(x^2 y^2 = 1 \cdot x^2 y^2\)
  • \(x y^3 = xy \cdot y^2\)
  • \(y^4 = 1 \cdot (y^2)^2\)

So every invariant can be written as either \(1 \cdot k[x^2, y^2]\) or \(xy \cdot k[x^2, y^2]\).
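We can cross-check this decomposition degree by degree with a standalone count (no library code): the dimension of the degree-\(d\) invariants computed directly (even monomials in \(x, y\)) must match the count coming from the free-module decomposition \(1 \cdot k[x^2, y^2] \oplus xy \cdot k[x^2, y^2]\):

```python
def direct_dim(d):
    # invariant monomials x^a y^b have a + b = d, which requires d even
    return d + 1 if d % 2 == 0 else 0

def free_module_dim(d):
    # degree-d contribution of eta * k[x^2, y^2] for a secondary of degree e
    def summand(e):
        m = d - e                         # remaining degree in k[x^2, y^2]
        return m // 2 + 1 if m >= 0 and m % 2 == 0 else 0
    return summand(0) + summand(2)        # secondaries: 1 (deg 0), xy (deg 2)

all(direct_dim(d) == free_module_dim(d) for d in range(12))
# True
```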

Classical Groups and First Fundamental Theorems

For finite groups, we computed invariants from scratch via the Reynolds operator, orbit sums, and the Molien series.

For classical groups like \(\mathrm{O}(n)\), \(\mathrm{SL}(n)\), and \(\mathrm{Sp}(2n)\), the First Fundamental Theorems give us the generators. Since these are classically known, we can just hard-code them.

Molien-Weyl Formula

Before moving on: how would we compute Hilbert series for groups like these?

We would need to generalize the Molien formula to continuous groups. Luckily, it does generalize: for a compact group \(G\) acting on \(V\), the Hilbert series is

\[ H(t) = \int_G \frac{1}{\det(I - tg)} \, d\mu(g) \]

where \(\mu\) is the Haar measure. This is over an infinite group, but the Weyl integration formula reduces it to an integral over the “maximal torus” \(T\):

\[ H(t) = \frac{1}{|W|} \int_T \frac{|\Delta(g)|^2}{\det(I - tg)} \, dg \]

where \(W\) is the (finite) “Weyl group” and \(\Delta\) is the “Weyl denominator” (whatever those are). The torus integral becomes a Laurent polynomial residue computation.

The point is, computing the Hilbert series for a compact Lie group reduces to a torus computation plus a finite group average.
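As a minimal standalone sketch (not the library’s machinery), take the simplest case: the torus \(\mathbb{C}^*\) acting on \(\mathbb{C}^2\) with weights \((1, -1)\). The Haar integral over the maximal compact subgroup \(U(1)\) is just an average of \(1/\det(I - tg)\) over the unit circle, and it should reproduce \(H(t) = 1/(1 - t^2)\), matching the single invariant \(x_0 x_1\) generating one invariant in each even degree:

```python
import cmath

def hilbert_numeric(t, n=2048):
    """Average 1/det(I - t*g(z)) over the unit circle, g(z) = diag(z, 1/z)."""
    total = 0.0
    for k in range(n):
        z = cmath.exp(2j * cmath.pi * k / n)
        total += 1 / ((1 - t * z) * (1 - t / z))
    return (total / n).real

t = 0.3
abs(hilbert_numeric(t) - 1 / (1 - t * t))   # ~0: matches 1/(1 - t^2)
```

The trapezoidal average converges geometrically here because the integrand is analytic on an annulus around the unit circle.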

In practice, since we know the generators, we can just compute the Hilbert series by counting monomials in those generators.

The Orthogonal Group \(\mathrm{O}(n)\)

Consider \(\mathrm{O}(n)\) acting diagonally on \(k\) copies of \(\mathbb{C}^n\). We represent this with \(kn\) variables, organized as \(k\) vectors \(v_1, \ldots, v_k\) of length \(n\). The First Fundamental Theorem says the invariant ring is generated by all inner products \(\langle v_i, v_j \rangle = \sum_{a=0}^{n-1} v_i^a v_j^a\).

There are \(\binom{k+1}{2}\) generators (the inner products are symmetric: \(\langle v_i, v_j \rangle = \langle v_j, v_i \rangle\)), all of degree 2 in the original variables. Degree-\(d\) invariants come from degree-\(d/2\) polynomials in these generators (so only the even degrees have invariants).

from invariants.classical import orthogonal_action

# O(3) acting on 2 copies of C^3 (6 variables)
action = orthogonal_action(n=3, k=2)

# Generators: <v1,v1>, <v1,v2>, <v2,v2>  — 3 inner products
gens = action.invariants_of_degree(2)
# [x0^2+x1^2+x2^2, x0*x3+x1*x4+x2*x5, x3^2+x4^2+x5^2]

As an example, label the 6 variables as \(v_1 = (x_0, x_1, x_2)\) and \(v_2 = (x_3, x_4, x_5)\). The three generators are:

\[ \begin{aligned} \langle v_1, v_1 \rangle &= x_0^2 + x_1^2 + x_2^2 \\ \langle v_1, v_2 \rangle &= x_0 x_3 + x_1 x_4 + x_2 x_5 \\ \langle v_2, v_2 \rangle &= x_3^2 + x_4^2 + x_5^2 \end{aligned} \]

Every \(\mathrm{O}(3)\)-invariant polynomial in the 6 variables is a polynomial in these three inner products.

If we extended to degree-4 invariants, we would have the 6 products of pairs:

\[ \begin{aligned} &\langle v_1, v_1 \rangle^2, \quad \langle v_1, v_1 \rangle \langle v_1, v_2 \rangle, \quad \langle v_1, v_1 \rangle \langle v_2, v_2 \rangle, \\ &\langle v_1, v_2 \rangle^2, \quad \langle v_1, v_2 \rangle \langle v_2, v_2 \rangle, \quad \langle v_2, v_2 \rangle^2 \end{aligned} \]

Intuitively, this makes sense, as \(O(3)\) preserves angles and lengths. The only way to get an invariant is to combine the inner products such that the orientation of the vectors is immaterial.
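To make that concrete, here is a standalone numeric check (pure Python, independent of the library) that the inner products are unchanged when the same rotation in \(\mathrm{O}(3)\) is applied to both vectors:

```python
import math

def rot_z(theta):
    """Rotation about the z-axis -- an element of O(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(g, v):
    return [sum(g[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v1, v2 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
g = rot_z(0.7)
gv1, gv2 = matvec(g, v1), matvec(g, v2)

# All three generators take the same values before and after rotating
assert abs(dot(gv1, gv1) - dot(v1, v1)) < 1e-9
assert abs(dot(gv1, gv2) - dot(v1, v2)) < 1e-9
assert abs(dot(gv2, gv2) - dot(v2, v2)) < 1e-9
```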

\(\mathrm{SL}(n)\) and Bracket Invariants

Now consider \(\mathrm{SL}(n)\) acting on \(k\) copies of \(\mathbb{C}^n\). The generators are all \(n \times n\) minors, which are determinants formed by choosing \(n\) of the \(k\) vectors. For \(\mathrm{SL}(2)\), these are the brackets \([ij] = x_i y_j - y_i x_j\) (we showed this in the previous post).

There are \(\binom{k}{n}\) generators, each of degree \(n\) in the original variables. When \(k < n\), there are no invariants at all (since you can’t form an \(n \times n\) determinant from fewer than \(n\) vectors).

from invariants.classical import sl_action

# SL(2) acting on 3 copies of C^2 (6 variables)
action = sl_action(n=2, k=3)

# Generators: [12], [13], [23]  — 3 brackets
gens = action.invariants_of_degree(2)
# [x0*x3 - x1*x2, x0*x5 - x1*x4, x2*x5 - x3*x4]

As an example, we have three vectors in \(\mathbb{C}^2\): \(v_1 = (x_0, x_1)\), \(v_2 = (x_2, x_3)\), \(v_3 = (x_4, x_5)\). Each bracket is a \(2 \times 2\) determinant:

\[ \begin{aligned} [12] &= \det\begin{pmatrix} x_0 & x_2 \\ x_1 & x_3 \end{pmatrix} = x_0 x_3 - x_1 x_2 \\[4pt] [13] &= \det\begin{pmatrix} x_0 & x_4 \\ x_1 & x_5 \end{pmatrix} = x_0 x_5 - x_1 x_4 \\[4pt] [23] &= \det\begin{pmatrix} x_2 & x_4 \\ x_3 & x_5 \end{pmatrix} = x_2 x_5 - x_3 x_4 \end{aligned} \]

These are “signed areas” of the parallelograms spanned by pairs of vectors.

Intuitively, \(\mathrm{SL}(2)\) preserves areas (since it has determinant 1), so the signed areas are invariant.

With 3 vectors, there are \(\binom{3}{2} = 3\) brackets and no relations among them, meaning that the invariant ring is freely generated.

The Second Fundamental Theorem (also in the previous post) gives us the relations between them, called the Plücker relations. For \(\mathrm{SL}(2)\) on 4 vectors, the single relation is \([12][34] - [13][24] + [14][23] = 0\). We can now verify this in code:

from invariants.groebner import compute_relations

# SL(2) on 4 copies of C^2: 6 brackets, 1 Plücker relation
action4 = sl_action(n=2, k=4)
gens4 = action4.invariants_of_degree(2)
relations = compute_relations(gens4, n_vars=8)
# One relation: y0*y5 - y1*y4 + y2*y3  (the Plücker relation)

With 4 vectors we would get \(\binom{4}{2} = 6\) brackets: \([12]\), \([13]\), \([14]\), \([23]\), \([24]\), \([34]\). However, these are not algebraically independent.

Expanding the Plücker relation by hand:

\[ [12][34] - [13][24] + [14][23] = (x_0 x_3 - x_1 x_2)(x_4 x_7 - x_5 x_6) - (x_0 x_5 - x_1 x_4)(x_2 x_7 - x_3 x_6) + (x_0 x_7 - x_1 x_6)(x_2 x_5 - x_3 x_4) \]

Everything cancels out. The invariant ring is \(k\bigl[[12],[13],[14],[23],[24],[34]\bigr] / ([12][34] - [13][24] + [14][23])\). The Gröbner computation in compute_relations rediscovers exactly this.
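As a quick standalone check of that cancellation (not using the library), we can evaluate the Plücker relation on random integer vectors:

```python
import random

def bracket(u, v):
    """2x2 determinant [ij] of two vectors in C^2."""
    return u[0] * v[1] - u[1] * v[0]

random.seed(0)
vs = [[random.randint(-9, 9), random.randint(-9, 9)] for _ in range(4)]
b = {(i, j): bracket(vs[i], vs[j])
     for i in range(4) for j in range(i + 1, 4)}

# [12][34] - [13][24] + [14][23] = 0 for any four vectors
b[(0, 1)] * b[(2, 3)] - b[(0, 2)] * b[(1, 3)] + b[(0, 3)] * b[(1, 2)]
# 0
```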

Symplectic and Other Classical Groups

The same pattern works for \(\mathrm{Sp}(2n)\) acting on \(k\) copies of \(\mathbb{C}^{2n}\), where the generators are the symplectic pairings \(\omega(v_i, v_j) = \sum_{a=0}^{n-1} (v_i^a v_j^{n+a} - v_i^{n+a} v_j^a)\). This is essentially identical to the orthogonal case, but with an antisymmetric bilinear form rather than a symmetric one.
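For the smallest case \(n = 1\) (where \(\mathrm{Sp}(2) = \mathrm{SL}(2)\)), here is a standalone check that any determinant-1 matrix preserves the pairing:

```python
def omega(u, v):
    """Symplectic pairing on C^2 (n = 1): w(u, v) = u_0 v_1 - u_1 v_0."""
    return u[0] * v[1] - u[1] * v[0]

def apply(g, v):
    return [g[0][0] * v[0] + g[0][1] * v[1],
            g[1][0] * v[0] + g[1][1] * v[1]]

g = [[2, 3], [1, 2]]            # det = 2*2 - 3*1 = 1, so g is in Sp(2)
u, v = [1, 4], [2, 7]

omega(apply(g, u), apply(g, v)) == omega(u, v)
# True
```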

Further Reductive Groups

There are additional reductive groups beyond the ones we looked at in the last section. We explored some of the theory in the previous post.

Reductive groups often require idiosyncratic analysis via representation theory and First Fundamental Theorems. However, many important reductive groups have known FFTs that could be hard-coded in the same way as the classical groups above.

The most salient example is \(\mathrm{GL}(n)\), which acts on tuples of matrices by simultaneous conjugation. Its generators are traces of products of the matrices: \(\operatorname{tr}(A)\), \(\operatorname{tr}(A^2)\), \(\operatorname{tr}(AB)\), \(\operatorname{tr}(ABA)\), and so on. The implementation pattern would be the same as for the other groups.
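A standalone check (exact rational arithmetic, not library code) that such traces are unchanged when a pair of \(2 \times 2\) matrices is conjugated by the same invertible matrix:

```python
from fractions import Fraction as F

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(X):
    return X[0][0] + X[1][1]

def inv2(P):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[ P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det,  P[0][0] / det]]

A = [[F(1), F(2)], [F(3), F(4)]]
B = [[F(0), F(1)], [F(1), F(1)]]
P = [[F(2), F(1)], [F(1), F(1)]]       # invertible change of basis
Pi = inv2(P)

Ac = mul(mul(P, A), Pi)                # simultaneous conjugation
Bc = mul(mul(P, B), Pi)

assert tr(Ac) == tr(A)                 # tr(A) is invariant
assert tr(mul(Ac, Ac)) == tr(mul(A, A))
assert tr(mul(Ac, Bc)) == tr(mul(A, B))
```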

These are out of scope for now. In the future I may come back and add some of these.

Orbits, Quotients, and Separation

Thus far, the primary question under consideration has been to understand the generators of the invariant ring \(k[V]^G\). However, another key (related) question is to understand the orbits of the group action on \(V\). The invariant ring encodes the geometry of the quotient space \(V/\!\!/G\). The quotient map \(\pi: V \to V/\!\!/G\) sends each point \(v \in V\) to the values of the invariants at that point, effectively classifying points according to their orbit closures.

\[ \pi(v) = (f_1(v), \ldots, f_s(v)) \]

Two points have the same image under \(\pi\) if and only if their orbit closures intersect. For reductive groups (a class that includes all finite groups and tori), closed orbits are separated: if \(\pi(v) = \pi(w)\) and both orbits are closed, then the orbits are equal.

So invariants can be used to classify which points are equivalent under some symmetry.

The Quotient Map

Suppose we have the generators of the invariant ring \(k[V]^G\). Then the quotient map \(\pi: V \to V/\!\!/G\) is given by evaluating these generators at each point. That is, if \(f_1, \ldots, f_s\) are generators of \(k[V]^G\), then we can use them to define a map:

from invariants.orbits import quotient_map, same_image
from invariants import poly

# S_3 generators: power sums p1, p2, p3
p1 = poly.add(poly.add(poly.var(0, 3), poly.var(1, 3)), poly.var(2, 3))
p2 = poly.add(poly.add(poly.mono((2,0,0)), poly.mono((0,2,0))), poly.mono((0,0,2)))
p3 = poly.add(poly.add(poly.mono((3,0,0)), poly.mono((0,3,0))), poly.mono((0,0,3)))
gens = [p1, p2, p3]

# Same orbit => same image
quotient_map(gens, (1, 2, 3))  # [6, 14, 36]
quotient_map(gens, (3, 1, 2))  # [6, 14, 36]
same_image(gens, (1, 2, 3), (3, 1, 2))  # True

# Different orbits => different image
same_image(gens, (1, 2, 3), (1, 1, 4))  # False

In this example, we are looking at the action of \(S_3\) on \(\mathbb{C}^3\) by permuting coordinates. The generators of the invariant ring are the power sums \(p_1, p_2, p_3\), which we established earlier as \(p_1 = x_0 + x_1 + x_2\), \(p_2 = x_0^2 + x_1^2 + x_2^2\), and \(p_3 = x_0^3 + x_1^3 + x_2^3\).

The quotient map evaluates these invariants at a point. For points in the same orbit (like \((1, 2, 3)\) and \((3, 1, 2)\)), we get the same image under the quotient map. For points in different orbits (like \((1, 2, 3)\) and \((1, 1, 4)\)), we get different images.

How do we interpret these images?

For \((1, 2, 3)\), we have:

  • \(p_1(1, 2, 3) = 1 + 2 + 3 = 6\)
  • \(p_2(1, 2, 3) = 1^2 + 2^2 + 3^2 = 14\)
  • \(p_3(1, 2, 3) = 1^3 + 2^3 + 3^3 = 36\)

However, for \((1, 1, 4)\), we have:

  • \(p_1(1, 1, 4) = 1 + 1 + 4 = 6\)
  • \(p_2(1, 1, 4) = 1^2 + 1^2 + 4^2 = 18\)
  • \(p_3(1, 1, 4) = 1^3 + 1^3 + 4^3 = 66\)

So these points do not lie in the same orbit, as they have different images under the quotient map. The invariants \(p_1, p_2, p_3\) are able to distinguish these orbits.

And, as we can tell by inspection, there is no way to permute the coordinates of \((1, 2, 3)\) to get \((1, 1, 4)\), confirming that they are indeed in different orbits.

Orbit Separation

In the finite-group case, when two points are in different orbits, we can find a specific invariant that witnesses this:

from invariants.orbits import find_separator

sep = find_separator(gens, (1, 2, 3), (1, 1, 4))
poly.show(sep)
# 'x0^2 + x1^2 + x2^2'
# p2(1,2,3) = 14, p2(1,1,4) = 18

For reductive groups, if two orbits are closed and distinct, some polynomial invariant distinguishes them. This is a useful constructive theorem, as it allows us to find explicit invariants that separate orbits and thereby classify objects up to symmetry.

Null Cones

The null cone \(\mathcal{N}(V)\) is the fiber \(\pi^{-1}(\pi(0))\), which is the set of all points that invariants cannot distinguish from the origin.

\[ \mathcal{N}(V) = \{ v \in V : f(v) = 0 \text{ for all homogeneous } f \in k[V]^G \text{ with } \deg f > 0 \} \]

This is important as all the points in the null cone have orbit closures that contain the origin, so they all collapse to the same point in the quotient.

Testing null cone membership is just evaluating the generators:

from invariants.orbits import in_null_cone

# C* on C^2, weights [1, -1]. Invariant: x0*x1
gen = poly.mono((1, 1))

in_null_cone([gen], (0, 0))   # True — origin
in_null_cone([gen], (1, 0))   # True — coordinate axis
in_null_cone([gen], (0, 5))   # True — coordinate axis
in_null_cone([gen], (1, 1))   # False — generic point
in_null_cone([gen], (2, 3))   # False — generic point

For this example, the null cone is \(\{x_0 x_1 = 0\}\), the union of the two coordinate axes. Geometrically, these are the orbits of \(\mathbb{C}^*\) that degenerate: \(t \cdot (v_0, 0) = (tv_0, 0)\), which approaches the origin as \(t \to 0\). The generic orbits with \(x_0 x_1 \neq 0\) are closed and do not contain the origin in their closure, so they are not in the null cone.

Separating Invariants

If we have a full generating set for the invariant ring, then we can use it to separate orbits. However, it may be difficult to produce the full set. In practice, we might only need some of the generators to separate orbits. The hope is that the separating set is much smaller than the full generating set.

We can find a minimal separating subset by greedy set cover (there are probably faster algorithms, especially if we have some structure or are willing to do some random sampling):

from invariants.orbits import is_separating, minimal_separating_subset

# S_3 generators
gens = [p1, p2, p3]  # three power sums

# Test pairs, test points in different S_3 orbits
pairs = [
    ((1, 2, 3), (1, 1, 4)),  # same sum, different orbits
    ((1, 2, 3), (2, 2, 2)),  # same sum, different orbits
    ((1, 0, 0), (0, 0, 2)),  # different sum, easy
]

is_separating(gens, pairs)  # True, so full set works

minimal = minimal_separating_subset(gens, pairs)
# [x0^2 + x1^2 + x2^2] - p2 alone separates all three pairs!

For \(S_3\) acting on \(\mathbb{C}^3\), the power sum \(p_2 = x_0^2 + x_1^2 + x_2^2\) separates these test pairs.

Note that \(p_2\) doesn’t separate all orbits: e.g. \((3, 4, 0)\) and \((5, 0, 0)\) have the same \(p_2\) value (namely 25) but lie in different orbits.
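A one-line standalone check that \((3, 4, 0)\) and \((5, 0, 0)\) share the same \(p_2\) value despite lying in different \(S_3\) orbits:

```python
def p2(v):
    return sum(x * x for x in v)

assert p2((3, 4, 0)) == p2((5, 0, 0)) == 25
assert sorted((3, 4, 0)) != sorted((5, 0, 0))   # different multisets, so different S_3 orbits
```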

Scaling

Some quick notes on the scaling properties of the algorithms we’ve discussed (which have some practical limitations), and on other techniques used in practice that I’ve omitted.

Degree Explosion

The fundamental bottleneck in many cases is degree.

For a finite group \(G\) acting on \(\mathbb{C}^n\), Noether’s bound guarantees generators appear by degree \(|G|\), but the number of monomials in degree \(d\) is \(\binom{n+d-1}{d}\), which grows polynomially in \(d\) for fixed \(n\).

Even for \(S_5\) (order 120) acting on \(\mathbb{C}^5\), degree 120 has \(\binom{124}{120} \approx 10^7\) monomials. It’s not practical to use an algorithm like Buchberger’s algorithm on ideals with generators of this degree.
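That count is easy to confirm with the standard library:

```python
from math import comb

# degree-120 monomials in 5 variables: C(5 + 120 - 1, 120)
comb(124, 120)
# 9381251
```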

Rational Invariants

A rational invariant is a ratio \(f/h\) of polynomials that is invariant under the group action:

\[ \frac{f(g \cdot v)}{h(g \cdot v)} = \frac{f(v)}{h(v)} \text{ for all } g \in G, v \in V \]

Sometimes, when the polynomial invariant ring requires generators of very high degree or has complicated relations, rational invariants require fewer generators with lower degree and simpler relations. The tradeoff is that there could be poles, so we have to work on limited domains. This apparently comes up in non-reductive groups, where the polynomial invariant ring may not be finitely generated but the rational invariant field can be.
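A minimal example (a sympy sketch, independent of the library): when \(\mathbb{C}^*\) scales both coordinates by \(t\), the only polynomial invariants are constants, since every nonconstant homogeneous polynomial picks up a factor \(t^d\). But the ratio \(x_0/x_1\) is a rational invariant away from \(\{x_1 = 0\}\).

```python
import sympy as sp

t, x0, x1 = sp.symbols("t x0 x1")

# C* acts by t.(x0, x1) = (t*x0, t*x1); the factors of t cancel in the ratio.
f = x0 / x1
transformed = f.subs({x0: t * x0, x1: t * x1}, simultaneous=True)
print(sp.simplify(transformed - f))  # 0: the ratio is invariant
```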

I didn’t investigate this too heavily.

Other Techniques

To help scale the library or extend it to new situations, a number of different techniques can be combined.

We saw some of these earlier. For example:

  • the Molien formula trick (where we compute the Hilbert series and get a roadmap for how many invariants to expect in each degree)
  • using orbit sums instead of the Reynolds operator to avoid the \(1/|G|\) normalization and keep coefficients as integers
  • using separating invariants instead of full generating sets to reduce the number of invariants we need to find
  • Noether’s bound to limit the search to a finite degree
  • for tori, reducing to a combinatorial problem of finding integer solutions to linear equations (the weight decomposition), which is more efficient than Gröbner basis computations
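The torus bullet is easy to illustrate standalone. Assuming a toy \(\mathbb{C}^*\) action on \(\mathbb{C}^3\) with weights \((1, 1, -2)\), a monomial \(x_0^a x_1^b x_2^c\) is invariant iff \(a + b - 2c = 0\), and brute-force enumeration finds the low-degree solutions (a hypothetical helper, not the library’s torus code):

```python
from itertools import product

weights = (1, 1, -2)  # toy C* action on C^3

def invariant_exponents(weights, max_deg):
    """Exponent vectors of total degree <= max_deg whose weighted sum
    is zero, i.e. torus-invariant monomials."""
    n = len(weights)
    return [e for e in product(range(max_deg + 1), repeat=n)
            if sum(e) <= max_deg
            and sum(ei * wi for ei, wi in zip(e, weights)) == 0]

print(invariant_exponents(weights, 3))
# [(0, 0, 0), (0, 2, 1), (1, 1, 1), (2, 0, 1)]
# i.e. 1, x1^2*x2, x0*x1*x2, x0^2*x2 -- the three generators, plus 1
```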

Some techniques we haven’t implemented but are commonly used in practice include:

  • using representation theory to decompose the polynomial ring into irreducible representations and extract invariants from the trivial summands, which can be more efficient than brute-force Gröbner basis computations.
  • for compact Lie groups, using the Molien-Weyl formula to reduce the problem to an integral over the maximal torus, which can be computed using our existing Torus machinery
  • SAGBI bases, a variant of Gröbner bases designed for subalgebras rather than ideals. Where Buchberger’s algorithm answers “is this polynomial in the ideal?”, SAGBI answers “is this polynomial expressible in terms of known generators?”
  • Derksen’s algorithm, a general-purpose method that works for any group given by matrix generators, without needing to know anything about the group’s structure. It reduces invariant computation to a single (expensive) Gröbner basis calculation.
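To make the SAGBI idea concrete, here is a toy subduction in sympy (brute-force and standalone, not how a real SAGBI implementation works): to test membership in the subalgebra generated by \(e_1 = x_0 + x_1\) and \(e_2 = x_0 x_1\), repeatedly cancel the leading term of \(f\) against a product of generator leading terms.

```python
import sympy as sp
from itertools import product

x0, x1 = sp.symbols("x0 x1")

def lt(f):
    """Leading (monomial, coefficient) of f in lex order."""
    monom, coeff = sp.Poly(f, x0, x1).terms(order="lex")[0]
    return monom, coeff

def subduct(f, gens, max_pow=4):
    """Toy subduction: cancel lt(f) against products of generator
    leading terms. Remainder 0 means f lies in the subalgebra."""
    f = sp.expand(f)
    while f != 0:
        m, c = lt(f)
        for exps in product(range(max_pow + 1), repeat=len(gens)):
            g = sp.expand(sp.prod([h**e for h, e in zip(gens, exps)]))
            mg, cg = lt(g)
            if mg == m:
                f = sp.expand(f - (c / cg) * g)
                break
        else:
            return f  # leading term unreachable: not in the subalgebra
    return sp.Integer(0)

e1, e2 = x0 + x1, x0 * x1
print(subduct(x0**2 + x1**2, [e1, e2]))  # 0: equals e1**2 - 2*e2
print(subduct(x0, [e1, e2]))             # -x1: x0 is not symmetric
```

The key property making this work is that the generators’ leading terms \(x_0\) and \(x_0 x_1\) form a SAGBI basis for the symmetric polynomials in two variables under lex order.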

Some of these techniques are in Derksen and Kemper’s book, and some are in the research literature. This post is too long, but I may come back to them.

Conclusion

In this post and the last post we have covered the (classical) theory of invariants and some of the computational tasks that arise in invariant theory. We have also implemented some code to perform these computations in specific cases.

AI Disclosure

Claude (Anthropic) assisted with code and the initial draft, working from my notes and the companion code repository. ChatGPT (OpenAI) provided a mathematical review pass. I edited the final text and verified the mathematical content.

Footnotes

  1. According to this MathOverflow answer by Terry Tao, the projection property \(R^2 = R\) implies that the eigenvalues of \(R\) are either 0 or 1. The trace, which is the sum of the eigenvalues, counts how many 1’s there are, which is exactly the dimension of the image of \(R\). This is enough to uniquely determine the trace, since the trace is linear and the projection property constrains the eigenvalues to be 0 or 1. So the trace of \(R\) on \(k[V]_d\) is exactly the dimension of the invariant subspace, which is what we want to compute. Apparently this comes up in “noncommutative probability.” Cool.↩︎

  2. This generating function is a partition function in the sense of statistical mechanics. Recall that the partition function \(Z = \sum_{\text{states}} e^{-\beta E}\) counts states weighted by energy. Here, \(t\) plays the role of the Boltzmann weight \(e^{-\beta}\), the “energy” of a monomial is its degree, and each factor \(\frac{1}{1-\lambda_i t} = 1 + \lambda_i t + \lambda_i^2 t^2 + \cdots\) counts the states contributed by one variable (one for each power). Multiplying \(n\) factors counts all monomials in \(n\) variables by degree. After averaging over the group, \(\dim(k[V]^G_d)\) is the number of independent symmetry-invariant observables at degree \(d\): the Molien formula counts only the “physical” quantities that are unchanged by the symmetry.↩︎

  3. The Molien formula is fundamentally a residue computation. Extracting the \(d\)-th coefficient of \(H(t)\) is \(\frac{1}{2\pi i}\oint \frac{H(t)}{t^{d+1}} dt\). In practice, this means we can compute Molien coefficients in closed form via partial fractions: decompose \(H(t) = \sum \frac{c}{(1-\alpha t)^k}\), and each term contributes \(c\binom{d+k-1}{k-1}\alpha^d\) to the \(d\)-th coefficient. For compact Lie groups, the Molien formula generalizes to the Molien-Weyl integral over the maximal torus, which is a multivariate residue computation in the torus coordinates.↩︎

  4. M. Hochster and J. L. Roberts, “Rings of invariants of reductive groups acting on regular rings are Cohen-Macaulay,” Advances in Mathematics 13 (1974), 115–175.↩︎