Is a random polynomial with restricted coefficients irreducible?

This post reports on some progress Gady Kozma and I have done recently on the problem of the irreducibility of random polynomials.
The main object of study is a random polynomial; i.e.,

f(X) = X^n + \zeta_{n-1} X^{n-1} + \cdots + \zeta_1 X+ \zeta_0,

where \zeta_0 ,\ldots \zeta_{n-1} are random variables. In general we assume that the \zeta_i are iid. Perhaps the most interesting example is when \zeta_i=\pm 1 with probabilities 1/2. Here is how the roots look like:


The roots of polynomials with plus-minus 1 coefficients as illustrated by Sam Derbyshire

There are four types of classical questions on these polynomials:

  1. How many real roots f has? Here there is a huge body of literature starting with the classical works of Bloch-Pòlya, Offord-Littlewood, Kac, and many many more. (On average there are about \frac{2}{\pi} \log n real roots, for \zeta_i with mean zero. See the introduction in here for a nice historical review.)
  2. How the complex roots distribute? We refer to the work of Odlyzko-Poonen and to here for the above and many more beautiful illustrations.
  3. The universality question: How the answers to the above questions depend on the actual distribution of the \zeta_i.

If the \zeta_i\in \mathbb{Z}, then a question of irreducibility over \mathbb{Q} arises.

4. Is f irreducible whp?

Here we should stress that the assumption \textrm{Prob}(\zeta_0=0)=0 is obviously necessary. One of the most exciting problems in this field is:

Conjecture (Odlyzko-Poonen): Let \zeta_i be iid random variables taking the values 0,1 with equal probabilities expect \zeta_0=1. Then

\textrm{Prob}(f \textrm{ irreducible}) = 1 +o(1).

This conjecture is open. In fact, I got familiarized with these type of questions by some guy on the street who asked about irreducibly whp in the \zeta_i = \pm 1 model.

The best lower bound is due to Konyagin:

Theorem 1 (Konyagin): In the setting of the conjecture,

\textrm{Prob}(f \textrm{ irreducible}) \gg 1/\log n.

Here we use the notation f(n)\gg g(n) if there exists an absolute constant C>0such that f(n)\geq C g(n).

Even if one takes the easier problem when the coefficients are  allowed  to take values in a bigger fixed set, no result was known. Recently, Gady Kozma and I uploaded to the arXiv a paper, in which we prove a Odlyzko-Poonen type conjecture, for \zeta_i taking the values 1,\ldots, 210.

Theorem 2 (LBS-Kozma): Let H>0 be an integer divisible by at least 4 primes; the smallest being H=210. Assume that the \zeta_i-s are iid random variables taking values uniformly in \{1,\ldots, H\}. Then

\textrm{Prob}(f \textrm{ irreducible}) = 1+o(1)

(as n\to \infty).

In fact, we just need the coefficients to distribute uniformly modulo H and, of course, that \textrm{Prob}(\zeta_0=0)=0.

In fact, we show that f is very irreducible. For this we recall that for a polynomial with integer coefficients one attaches a Galois group: If \alpha_1, \ldots, \alpha_n are the complex roots of f (with multiplicity), and if N=\mathbb{Q}(\alpha_1,\ldots, \alpha_n) is the minimal subfield of \mathbb{C} containing the \alpha_i-s, then the Galois group is defined as the group of field automorphisms of N:

G_f = \textrm{Gal}(N/\mathbb{Q}) := \textrm{Aut}_{\mathbb{Q}}(N).

An element \sigma\in G_f is determined by its action on the \alpha_i since every element in N is a polynomial expression in the \alpha_i. So we have an embedding

G_f \to S_n

which is well defined up to reordering of the roots; i.e., up to conjugation in S_n. So from now on we consider G_f as a subgroup of S_n.

Galois theory connects arithmetic properties of f with group theoretical properties of G_f, and the most basic one is that

G_f is transitive  if and only if f is irreducible.

The property of G_f being doubly transitive (i.e., acting transitively on the set of pairs (i,j) with i\neq j) translates to the property that f is irreducible over \mathbb{Q} and f(X)/(X-\alpha_1) is irreducible over \mathbb{Q}(\alpha_1). One may continue like this for 3-transitive, and so on. However, by the classification of finite simple groups, this ends in 6 steps, since a 6-transitive group is either A_n or S_n in which case the group is either  n-2 or n-transitive, respectively. (We remarks that with four exceptions M_{11}, M_{12},M_{23},M_{24} we in fact can be done in 4 steps). We prove

Theorem 3 (LBS-Kozma): Let H>0 be an integer divisible by at least 4 primes; the smallest being H=210. Assume that the \zeta_i-s are iid random variables taking values uniformly in \{1,\ldots, H\}. Then

\textrm{Prob}(G_f =A_n \textrm{ or } S_n) = 1+o(1)

(as n\to \infty).

Large box model

We start by an easier model which we call the large box model. In this model, we fix the degree of f, and we choose the \zeta_i to vary uniformly in \{1,\ldots, H\}, where now H\to \infty. The study of this model goes back to Van der Waerden, who proved in the 1930s that

\textrm{Prob}(G_f = S_n) =O(H^{-1/6}).

The heuristic idea (which we give for irreducibility, for simplicity) is that

\{f \textrm{ reducible}) \} \subseteq \bigcap_{p} \{f \textrm{ reducible }\mod p\},

and the right-hand-side events are roughly independent when H is much bigger than p, and so we have

\begin{aligned}\textrm{Prob}(f \textrm{ reducible})&\lessapprox \prod_{p}  \textrm{Prob}(f \textrm{ reducible }\mod p)\\&\approx \prod_p(1-1/n),\end{aligned}

where in the last equality we have used the prime polynomial theorem. So, if this estimation is ok for a set primes which grows with n, we get that most polynomials are irreducible.

Van der Waerden’s bound was improved to O(H^{-1/2+\epsilon}) in the 1960s by Gallagher who applied the large sieve inequality and to O(H^{\sqrt{2}-2+\epsilon}) by Dietmann in 2012. In 2015, Rivin gave an elementary argument which gives the nearly optimal bound of O(H^{-1+\epsilon}). (The lower bound \textrm{Prob}(G_f\neq S_n) \geq  \textrm{Prob}(f \textrm{ reducible}) \gg H^{-1} is very easy.)

Let me present the argument in the irreducibility setting : A very naive approach is to say that for 1\leq k\leq n/2

\begin{aligned} &\#\{f : \exists g,h \textrm{ st } f=gh, \deg g=k\} \\&\quad\leq \#\{f : g,h \textrm{ st } f\equiv gh\mod H, \deg g=k\}\end{aligned}.

Since there are H^{k} monic g-s modulo H and H^{n-k} monic h-s modulo H we get the bound

\textrm{Prob}(f=gh, \deg g=k) \leq \frac{1}{H^n}\cdot H^{k}\cdot H^{n-k}=1.

We recover the trivial bound, which is not very helpful 😦 However, if we just condition on the first coefficients a,b of f,g and noting that f(0)=g(0)h(0), so b\mid a, we would already get the power saving we aim for. Indeed,  putting \tau(a) = \sum_{b\mid a} 1

\begin{aligned}&\textrm{Prob}(f=gh, \deg g=k) \\&\quad=\sum_{1\leq a\leq H}\sum_{b\mid a}\textrm{Prob}(f\equiv gh, \deg g=k, f(0)=a, g(0)=b) \\ &\quad \leq \frac{1}{H^{n}} H^{k-1}H^{n-k-1}\sum_{1\leq a\leq H}\tau(a)\\&\quad \ll \frac{\log H}{H}.\end{aligned}

Here we used the classical result in number theory that the average value of \tau in the interval [1,H] is \log H.

Noting that f is reducible if and only if f=gh, \deg g=k for some 1\leq k\leq n-1, we conclude that

\textrm{Prob}(f \textrm{ reducible }) \ll n\frac{\log H}{H},

with absolute implied constant.

One may see that this bound does not help in the restricted coefficient model discussed above, as it tends to \infty with n.

However, it gives a \pm 1 coefficient result in the easier case when the number of variables is 2:

Theorem (LBS-Kozma): Let f (X) = \sum_{i,j\leq n} \pm 1X^iY^j be a random bivariate polynomial with iid coefficients taking the values \pm 1 uniformly. Then

\textrm{Prob} (f \textrm{ irreducible}) = 1+o(1)

as n\to \infty.

The crux of the proof is specializing X=2, which essentially reduces to the large box model. This was generalized by Chern to more general fixed sets of coefficients, see here.

Restricted coefficient model

Ideas in the proof of Theorem 1

Going back to the restricted coefficients model, we now explain the main tools/steps in the proof of Theorem 2. We always assume that f is a random polynomial with iid coefficients taking the values 1,\ldots, 210, uniformly (unless explicitly stated otherwise).

It will be convenient to introduce a piece of terminology: An event on a random polynomial f holds a.s. (almost surely) if the probability for it to hold is 1+o(1), as n\to \infty.

Our first ingredient is that

 A.s., the degrees of f are \geq \omega(n), for some \lim_{n\to \infty}\omega(n)=\infty.

In fact, Konyagin, while proving Theorem 1, proved the above with \omega(n)\gg n/\log n  (he considered the model of 0,1, but his argument carries over to this model too). We will not need such a strong bound on \omega(n). To get a flavor of the tools in the proof we show that a random \pm 1-coefficient polynomial has no linear factors, a.s.:

A linear factor corresponds to a rational root, which by the rational root theorem must be \pm1. The probability that \pm 1 is a root is the same as the probability that a length n random walk on the line will come back to the starting point, which is \approx 1/\sqrt{n} =o(1).  So a.s. f has no linear factor.

Using the first ingredient we may reduce the proof of Theorem 1 to proving

\textrm{Prob}(f = gh, k\leq \deg g\leq 2k)\ll \frac{1}{(\log k)^2}.

Indeed, by the first case and neglecting events that a.s. do not occur,

\begin{aligned}&\textrm{Prob}(f\textrm{ reducible})\\ &\quad \leq \sum_{\log_2\omega(n)\leq d\leq \log n/2} \textrm{Prob}(f = gh, 2^d\leq \deg g\leq 2^{d+1})\\ &\quad\ll \sum_{d=\log_2\omega(n)}^{\infty} d^2 \to 0.\end{aligned}

The next thing we need relates to the anatomy of the polynomials. Consider the space

\Omega = \{(l_1,l_2,\ldots, l_n) : l_i\geq 0\textrm{ and } \sum il_i=n\}.

of partitions of n. We have two probabilities measures on \Omega coming from M_{n,q} the set of degree-n monic polynomials with coefficients in the finite field \mathbb{F}_q and from S_n, and are given respectively by

\begin{aligned} P_{S_n} (l_1,l_2,\ldots, l_n)  & = \frac{\#\{ \sigma\in S_n \textrm{ with } l_i \textrm{ orbits of length }i\}}{n!}\\ P_{M_{n,q}} (l_1,l_2,\ldots, l_n)  & =  \frac{\#\{ \phi \in M_{n,q} \textrm{ with }l_i \textrm{ prime factors of degree }i\}}{q^n}\end{aligned}

Since we have an exact formula for the number of primes of each degree (the prime polynomial theorem) both these probabilities have formulas, see Arratia, Barbour and Tavaré for an extensive study on the relations of the two probability functions.   In particular, it is not very difficult to show that \|P_{S_n} - P_{M_{n,q}}\|=O(q^{-1}) with absolute constant that is independent on n (and funny enough this gives a new proof of the derangement problem).

However in our case q=2,3,5,7. For such small q there are events, coming from the small i-s, for which the respective probabilities are not close (e.g.  compare the probabilities of the event l_1\neq 0). The next ingredient is to show that if an event R is invariant to changing l_i for i \ll (\log k)^2, then

|P_{S_n}(R) - P_{M_{n,q}}(R)| \ll (\log k)^{-2}   (*)

Dimitris Koukoulopoulos suggested a way to directly work with polynomials instead of going through permutations. This will be discussed elsewhere.

Since our f has coefficients that are uniform modulo 210, when we reduce modulo each of the primes 2,3,5,7 we get 4-independent uniform polynomial (this is a consequence of the Chinese remainder theorem). This consideration combined with the fact that the reductions of f modulo 2,3,5,7 are independent from each other and each of them is uniform, reduce the proof to showing

for every 1\leq k\leq n/4, the probability that there is k\leq l<2k for which four iid uniform random permutations has invariant set of size l is \ll (\log k)^{-2}.

This was shown (essentially) by Pemantle-Peres-Rivin with the bound \ll k^{-c}. This does not hold for three random permutations as was shown by Eberhard, Ford and Green. This completes the steps in the proof.

Galois group

Ideas in Proof of Theorem 3

Now we know that a.s. a random polynomial is irreducible, and in particular it has a transitive Galois group.  To get that the Galois group contains the alternating group, we follow Łuczak and Pyber who proved that a.s. a random permutation is contained in no transitive subgroup other than A_n or S_n. The difference of what is needed for our theorem, is that we have a random permutation only up to a small perturbation:

Let \sigma\in S_n be a random uniform permutation and let \alpha>0 be sufficiently small (in fact \alpha<1-\frac{\log\log2}{\log2} suffices). Then a.s. there exists no transitive group G\neq A_n,S_n that contains a permutation that can be get from changing \sigma by cycles of length \leq n^{\alpha}.

The need for perturbation comes from two sources. The first is that we can from (*). The second one comes from lifting the Frobenius.

To be more precise, write f_p for f modulo p. If f_p has m_i prime factors of degree i and it is squarefree, then the Frobenius acts on the roots of f_p has a permutation with m_i cycles of length i, and this action may be lifted to an element of G_f.

However, if f_p is not squarefree, this doesn’t work as simple. There exists a unique factorization f_p = gh with g squarefull,  h squarefree, and \gcd(g,h)=1. Then, the roots of g and the roots of h each are invariant set under the action of the Frobenius. Now we can lift the Frobenius to an element \sigma\in G_f, so that \sigma acts the same on the set of liftings of the roots of h. This means that, up to the perturbation of the size of \deg g, the element \sigma is uniform. Now, we know that a.s. \deg g\leq n^{\alpha}, and so we get a random element up to perturbation as needed.




Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: