## On random walks with weights

December 30, 2018

It is well known to experts that the probability that a complex number $\alpha$ will be a root of a random polynomial $\pm 1 \pm X \pm X^2 + \cdots \pm X^n$ is $O(1/\sqrt{n})$. Here is a simulation of the values of $\sum_{k=0}^n \pm i^k$ for $n=1,\ldots, 100000$:

There are many ways to prove such a statement, one is using the following theorem of Erdős:

Theorem: Let $(a_1,a_2, \ldots )$ be a sequence of nonzero complex numbers and $c$ a complex number. Then

$\mathbb{P}(\sum_{i=1}^n \pm a_i = c) =O(1/\sqrt{n}).$

Here the implied constant is absolute.

The simplest case of this theorem is when all the $a_i$-s are equal and $c=0$—a simple random walk on $\mathbb{Z}$. Then the probability to hit $0$ is either $0$ if $n$ is odd, or

$\mathbb{P}(\sum_{i=1}^{n} \pm 1=0) = 2^{-n}\binom{n}{n/2}\ll \frac{1}{\sqrt{n}}$,

if $n$ is even. Here $f(n)\ll g(n)$ means that there exists an absolute constant $C$ such that $|f(n)|\leq Cg(n)$. The last inequality is elementary, see here, and using Stirling’s formula (or Wallis formula) can be made into the asymptotic formula $\mathbb{P}(\sum_{i=1}^{n} \pm 1=0) \sim \sqrt{\frac{2}{\pi n }}$.

The goal of this post is to describe an easy “circle-method-argument” that reduces the proof of the theorem to the case of a simple random walk on $\mathbb{Z}$. I’ve learnt this argument from James Maynard.

Proof.

By conditioning on the last sign, one sees that it suffices to prove the statement for even $n$. Indeed, had we knew the assertion for even $n$-s, then for odd $n$ we have

\displaystyle \begin{aligned}&\mathbb{P}(\sum_{i=1}^n \pm a_i = c)\\ &\quad = \frac{1}{2}\mathbb{P}(\sum_{i=1}^{n-1} \pm a_i = c-a_n)+\frac12\mathbb{P}(\sum_{i=1}^{n-1} \pm a_i = c+a_n)\\ &\quad \ll \frac{1}{\sqrt{n-1}}\leq\frac{2}{\sqrt{n}}.\end{aligned}

We start by considering the case where all the $a_i$-s are rational. Since $\sum_{i=1}^n \pm a_i=c$ if and only if $\sum_{i=1}^n \pm D a_i=Dc$ with $D$ a common denominator for the $a_i$-s, we may assume w.l.o.g. that the $a_i$ are integers. We may also assume that $c$ is an integer, otherwise there is no solution and so the probability is $0$. We put $S(n,(a_1,a_2,\ldots),c)$ to be the number of choice of signs so that $\sum_{i=1}^n\pm a_i=c$ so that $\mathbb{P}(\sum_{i=1}^n \pm a_i=c)=2^{-n}S(n,(a_1,a_2,\ldots),c)$.  To compute $S$ we use the trivial identity for integers $a$

$\displaystyle \int_{0}^1 e(a\theta)d\theta = \begin{cases}1,& a=0\\ 0,&a\neq 0,\end{cases}$

where $e(\theta) = e^{2\pi i\theta}$. Thus, by expanding the product, we have

\begin{aligned}&S(n,(a_1,a_2,\ldots),c) = \int_0^1 \prod_{i=1}^n (e(a_i\theta)+e(-a_i\theta))e(-c\theta)d\theta\\ &\quad \leq \int_0^1 \prod_{i=1}^n |e(a_i\theta)+e(-a_i\theta)|d\theta.\end{aligned}

We now apply the AM-GM inequality to get that

$S(n,(a_1,a_2,\ldots),c)\leq \frac{1}{n}\sum_{i=1}^n \int_0^1 |e(a_i\theta)+e(-a_i\theta)|^nd\theta.$

Since $e(a_i\theta)+e(-a_i\theta)$ is real and $n$ is even, we may remove the absolute value in the integral, and then by the same argument as before

$\int_0^1 (e(a_i\theta)+e(-a_i\theta))^nd\theta = S(n,(a_i,a_i,\ldots),0).$

So we are back to the simple random walk, that is,

$S(n,(a_i,a_i,\ldots),0)=S(n,(1,1,\ldots),0) \ll 2^n/\sqrt{n}$.

Averaging over $i$ we conclude that

$\mathbb{P} (\sum_{i=1}^n \pm a_i=c) \ll \frac{1}{\sqrt{n}},$

as needed. This finishes the proof for rational $a_i$-s.

Next we consider the general setting, namely, when the $a_i$-s are nonzero complex numbers. Let $V$ be the $\mathbb{Q}$-vector space that is spanned by the $a_i$-s and proceed by induction on $k=\dim_{\mathbb{Q}} V$. When $k=1$, then we are back in the setting of rational $a_i$-s, so we may assume that $k>1$. Take $W\subseteq V$ of dimension $k-1$. So the $a_i$-s divide into two nonempty sets $A_1$ and $A_2$ such that $A_1\subseteq W$  and $A_2\cap W=\emptyset$.

Fix some $a\in A_2$ so that $V=W\oplus \mathbb{Q}a$. Any $a_i\in A_2$ breaks as $a_i = w_i+a_i' a$ with $a_i'\neq0$ rational and $w_i\in W$. We may assume that $c\in V$ (otherwise the probability is zero) and hence can be decompose as $c=w_c+c'a$ with $c'$ rational and $w_c\in W$. Then $\sum_{i=1}^n \pm a_i = c$ if and only if $\sum_{i: a_i\in A_2} \pm a_i'=c'$ and $\sum_{i: a_i\in A_1} \pm a_i = d$ with $d=c-\sum_{i:a_i\in A_2}\pm w_i$. Denote the former event by $E_2$ and for any choice of signs for the elements in $A_2$ denote the latter event by $E_1$.

Putting $p_i= \begin{cases} \frac12, & \#A_i=1 \\ \frac{1}{\sqrt{\#A_i}},&\mbox{o/w}\end{cases}$, we get by the induction assumption that $\mathbb{P}(E_2)\ll p_2$ and conditioning on the choice of signs for the elements in $A_2$, the probability of  $E_1$ is  $\ll p_2$. Thus by conditioning on $E_2$ we get that

$\displaystyle \mathbb{P} (\sum_{i=1}^n \pm a_i=c) \leq p_1p_2.$

This conclude the proof since clearly $p_1p_2\leq \frac{1}{\sqrt{n}}$ (separate to the cases that $\#A_i=1$ for some $i$ and that both $\#A_i>1$). QED

## Billingsley’s proof of Erdős-Kac Theorem

July 25, 2018

During a talk of Andrew Granville in the Probability in Number Theory conference in the CRM, I’ve learned Billingsley’s proof of the Erdős-Kac theorem (see here). This proof is pretty cool, and it uses really a bit number theory and two bits of probability. This post is an attempt  to survey it.

For $n\geq 1$, let

$\omega(n)=\sum_{p\mid n} 1$

be the number of distinct prime divisors of $n$. We always use $p$ to denote a prime number. For example $\omega(6)=\omega(12) = 2$. The Erdős-Kac theorem  is a Central Limit Theorem (CLT) for $\omega$. It says that as $x\to \infty$, the average is

$\displaystyle \frac{1}{x} \sum_{1\leq n\leq x} \omega(n) = \log\log x (1+o(1))$

the variance is

$\displaystyle \frac{1}{x}\sum_{1\leq n\leq x}( \omega(n) -\log\log x)^{2} = \log\log x(1+o(1))$

and the CLT itself which says that for any $\lambda>0$

\displaystyle \begin{aligned} &\frac{1}{x} \# \left\{ 1\leq n\leq x: \frac{\omega(n)-\log\log x}{\sqrt{\log\log x}} \geq \lambda \right\} \\ &\quad = (1+o(1))\frac{1}{\sqrt{2\pi}} \int_{\lambda}^{\infty} e^{-t^2/2}dt.\end{aligned}

The proof goes by computing the moments of the normalization of $\omega$ and showing it coincide with a random model, which is Gaussian.

The story being the name of the theorem is quite  nice:  Erdős sat in a talk of Kac who presented this theorem as a something plausible, and Erdős gave a proof at the end of the talk.

The proof below is not the original proof, but rather a proof based on moments. Here is a list of the tools we will use, coming from probability and from number theory.
The first is about the moments of a random variable $X$ with mean $\mu=0$ and standard deviation $\sigma =1$ are

Fact 1: $\mathbb{E}(X^m) = \begin{cases} \frac{(2k)!}{2^k\cdot k!},& m=2k\\ 0, & m = 2k+1 \end{cases}$.

The second is the classical CLT saying that a normalized sum $S$ of many independent random variables which are reasonably distributed is distributed normal with mean $0$ and standard deviation $1$.

Fact 2: $\frac{S-\mu}{\sigma} \sim N(0,1)$.

We will use it for sums of Bernoulli random variables.
The next fact comes from analytic number theory, and it follows quite directly from the prime number theorem. It gives an asymptotic formula for the sum of prime reciprocal:

Fact 3: $\displaystyle \sum_{p\leq y} \frac{1}{p} = \log\log y +O(1)$.

## The random model

Let us define a sequence of independent Bernoulli random variables indexed by prime numbers with the following law of distribution:

$\mathbb{P}(X_p = 1) = \frac 1p, \qquad\mathbb{P}(X_p=0) = 1-\frac{1}{p}$

So $X_p$ models the indicator function $1_p$ of integers divisible by $p$. In this respect, the sum

$W_y = \sum_{p\leq y} X_p$

models the function $\omega_y = \sum_{p\leq y}1_p$ that gives the  number  of prime divisors of $n$ which are not bigger than $y$. When $y$ is not too small  with respect to $x$, the function $\omega_y$ would be relatively close to $\omega$, as will be explained below. We can compute the mean of $W_y$ using Fact 3:

$\displaystyle\mu_y:= \mathbb{E}(W_y) = \sum_{p\leq y} \frac{1}{p}= \log\log y + O(1)$.

Since the $X_p$ are independent, the variance may be computed similarly:

$\displaystyle \sigma_y^2 := {\rm Var}(W_y) = \sum_{p\leq y} \Big(\frac{1}{p}-\frac{1}{p^2}\Big)=\log\log y+O(1)$.

(Recall that $\sum \frac{1}{p^2} \ll \sum \frac{1}{n^2} =O(1)$.) So by Fact 2 we have

$\displaystyle \frac{W_y-\mu_y}{\sigma_y}\to N(0,1)$.

## The proof

Put $y=x^{o(1)}$. First, we compare the moments of $W_y$ and of $\omega_y$:

\displaystyle \begin{aligned} &\star :=\frac{1}{x}\sum_{n\leq x} (\omega_y(n)-\mu_y)^k - \mathbb{E}(W_y-\mu_y)^k \\ &\quad =\sum_{j=1}^k \binom{k}{j} (-\mu_y)^{k-j} \Big(\frac{1}{x} \sum_{n\leq x} \omega_y(n)^j-\mathbb{E}W_y^j\Big). \end{aligned}

Using the notation $[\bullet,\ldots,\bullet]$ for least common multiple, the inner sum can be expanded and bounded in the following way:

\displaystyle\begin{aligned}& \sum_{p_1\cdots p_j\leq y} \Big(\frac{1}{x} \sum_{n\leq x} 1_{p_1}(n)\cdots 1_{p_j}(n) - \mathbb{E}(X_{p_1}\cdots X_{p_j})\Big)\\ &\quad =\sum_{\substack{p_1\cdots p_j \leq y\\ L = [p_1,\ldots, p_j]}} \Big(\frac{1}{x}\Big\lfloor \frac{x}{L}\Big\rfloor - \frac{1}{L}\Big) \\ &\quad\ll \sum_{p_1\cdots p_j\leq y} \frac{1}{x} = \frac{\pi(y)^j}{x} \ll \frac{y^{j}}{x}\ll x^{-1+o(1)}. \end{aligned}

This step explains the choice $y=x^{o(1)}$, because then also $y^j = x^{o(1)}$. Plugging this in $\star$, we conclude that $\star$ is small compared to the variance $\sigma_y^2$, and so we have

$\frac{1}{x} \sum_{n\leq x} \Big( \frac{\omega_y(n) -\mu_y}{\sigma_y}\Big)^k \sim \mathbb{E}\Big(\frac{W_y-\mu_y}{\sigma_y}\Big)^k \sim \mathbb{E}(X^k)$,

with $X\sim N(0,1)$ and $x\to \infty$.

Next we show that $\omega$ and $\omega_y$ are close. To this end take $y = x^{\frac{1}{\log\log\log x}}$. But since $n\leq x$ can have at most $\log\log\log x$ prime divisors bigger than $y$ we get that

$\omega(n) - \omega_y(n) \leq \log\log\log x$.

This implies that the moments of $\frac{\omega(n) - \mu_y}{\mu_y}$ are as of a Gaussian. So to conclude the proof we just need to note that

$\mu_y = \log\log y + O(1) = \log \log x^{1\log\log\log x} = \log \log x - \log \log \log\log\log x +O(1)$

so it is OK to replace $\log \log x$ by $\mu_y$.

Q.E.D

## Is a random polynomial with restricted coefficients irreducible?

May 11, 2018

This post reports on some progress Gady Kozma and I have done recently on the problem of the irreducibility of random polynomials.
The main object of study is a random polynomial; i.e.,

$f(X) = X^n + \zeta_{n-1} X^{n-1} + \cdots + \zeta_1 X+ \zeta_0$,

where $\zeta_0 ,\ldots \zeta_{n-1}$ are random variables. In general we assume that the $\zeta_i$ are iid. Perhaps the most interesting example is when $\zeta_i=\pm 1$ with probabilities $1/2$. Here is how the roots look like:

The roots of polynomials with plus-minus 1 coefficients as illustrated by Sam Derbyshire

There are four types of classical questions on these polynomials:

1. How many real roots $f$ has? Here there is a huge body of literature starting with the classical works of Bloch-Pòlya, Offord-Littlewood, Kac, and many many more. (On average there are about $\frac{2}{\pi} \log n$ real roots, for $\zeta_i$ with mean zero. See the introduction in here for a nice historical review.)
2. How the complex roots distribute? We refer to the work of Odlyzko-Poonen and to here for the above and many more beautiful illustrations.
3. The universality question: How the answers to the above questions depend on the actual distribution of the $\zeta_i$.

If the $\zeta_i\in \mathbb{Z}$, then a question of irreducibility over $\mathbb{Q}$ arises.

4. Is $f$ irreducible whp?

Here we should stress that the assumption $\textrm{Prob}(\zeta_0=0)=0$ is obviously necessary. One of the most exciting problems in this field is:

Conjecture (Odlyzko-Poonen): Let $\zeta_i$ be iid random variables taking the values $0,1$ with equal probabilities expect $\zeta_0=1$. Then

$\textrm{Prob}(f \textrm{ irreducible}) = 1 +o(1)$.

This conjecture is open. In fact, I got familiarized with these type of questions by some guy on the street who asked about irreducibly whp in the $\zeta_i = \pm 1$ model.

The best lower bound is due to Konyagin:

Theorem 1 (Konyagin): In the setting of the conjecture,

$\textrm{Prob}(f \textrm{ irreducible}) \gg 1/\log n$.

Here we use the notation $f(n)\gg g(n)$ if there exists an absolute constant $C>0$such that $f(n)\geq C g(n)$.

Even if one takes the easier problem when the coefficients are  allowed  to take values in a bigger fixed set, no result was known. Recently, Gady Kozma and I uploaded to the arXiv a paper, in which we prove a Odlyzko-Poonen type conjecture, for $\zeta_i$ taking the values $1,\ldots, 210$.

Theorem 2 (LBS-Kozma): Let $H>0$ be an integer divisible by at least $4$ primes; the smallest being $H=210$. Assume that the $\zeta_i$-s are iid random variables taking values uniformly in $\{1,\ldots, H\}$. Then

$\textrm{Prob}(f \textrm{ irreducible}) = 1+o(1)$

(as $n\to \infty$).

In fact, we just need the coefficients to distribute uniformly modulo $H$ and, of course, that $\textrm{Prob}(\zeta_0=0)=0$.

In fact, we show that $f$ is very irreducible. For this we recall that for a polynomial with integer coefficients one attaches a Galois group: If $\alpha_1, \ldots, \alpha_n$ are the complex roots of $f$ (with multiplicity), and if $N=\mathbb{Q}(\alpha_1,\ldots, \alpha_n)$ is the minimal subfield of $\mathbb{C}$ containing the $\alpha_i$-s, then the Galois group is defined as the group of field automorphisms of $N$:

$G_f = \textrm{Gal}(N/\mathbb{Q}) := \textrm{Aut}_{\mathbb{Q}}(N)$.

An element $\sigma\in G_f$ is determined by its action on the $\alpha_i$ since every element in $N$ is a polynomial expression in the $\alpha_i$. So we have an embedding

$G_f \to S_n$

which is well defined up to reordering of the roots; i.e., up to conjugation in $S_n$. So from now on we consider $G_f$ as a subgroup of $S_n$.

Galois theory connects arithmetic properties of $f$ with group theoretical properties of $G_f$, and the most basic one is that

$G_f$ is transitive  if and only if $f$ is irreducible.

The property of $G_f$ being doubly transitive (i.e., acting transitively on the set of pairs $(i,j)$ with $i\neq j$) translates to the property that $f$ is irreducible over $\mathbb{Q}$ and $f(X)/(X-\alpha_1)$ is irreducible over $\mathbb{Q}(\alpha_1)$. One may continue like this for $3$-transitive, and so on. However, by the classification of finite simple groups, this ends in $6$ steps, since a $6$-transitive group is either $A_n$ or $S_n$ in which case the group is either  $n-2$ or $n$-transitive, respectively. (We remarks that with four exceptions $M_{11}, M_{12},M_{23},M_{24}$ we in fact can be done in $4$ steps). We prove

Theorem 3 (LBS-Kozma): Let $H>0$ be an integer divisible by at least $4$ primes; the smallest being $H=210$. Assume that the $\zeta_i$-s are iid random variables taking values uniformly in $\{1,\ldots, H\}$. Then

$\textrm{Prob}(G_f =A_n \textrm{ or } S_n) = 1+o(1)$

(as $n\to \infty$).

## Large box model

We start by an easier model which we call the large box model. In this model, we fix the degree of $f$, and we choose the $\zeta_i$ to vary uniformly in $\{1,\ldots, H\}$, where now $H\to \infty$. The study of this model goes back to Van der Waerden, who proved in the 1930s that

$\textrm{Prob}(G_f = S_n) =O(H^{-1/6})$.

The heuristic idea (which we give for irreducibility, for simplicity) is that

$\{f \textrm{ reducible}) \} \subseteq \bigcap_{p} \{f \textrm{ reducible }\mod p\}$,

and the right-hand-side events are roughly independent when $H$ is much bigger than $p$, and so we have

\begin{aligned}\textrm{Prob}(f \textrm{ reducible})&\lessapprox \prod_{p} \textrm{Prob}(f \textrm{ reducible }\mod p)\\&\approx \prod_p(1-1/n),\end{aligned}

where in the last equality we have used the prime polynomial theorem. So, if this estimation is ok for a set primes which grows with $n$, we get that most polynomials are irreducible.

Van der Waerden’s bound was improved to $O(H^{-1/2+\epsilon})$ in the 1960s by Gallagher who applied the large sieve inequality and to $O(H^{\sqrt{2}-2+\epsilon})$ by Dietmann in 2012. In 2015, Rivin gave an elementary argument which gives the nearly optimal bound of $O(H^{-1+\epsilon})$. (The lower bound $\textrm{Prob}(G_f\neq S_n) \geq \textrm{Prob}(f \textrm{ reducible}) \gg H^{-1}$ is very easy.)

Let me present the argument in the irreducibility setting : A very naive approach is to say that for $1\leq k\leq n/2$

\begin{aligned} &\#\{f : \exists g,h \textrm{ st } f=gh, \deg g=k\} \\&\quad\leq \#\{f : g,h \textrm{ st } f\equiv gh\mod H, \deg g=k\}\end{aligned}.

Since there are $H^{k}$ monic $g$-s modulo $H$ and $H^{n-k}$ monic $h$-s modulo $H$ we get the bound

$\textrm{Prob}(f=gh, \deg g=k) \leq \frac{1}{H^n}\cdot H^{k}\cdot H^{n-k}=1$.

We recover the trivial bound, which is not very helpful 😦 However, if we just condition on the first coefficients $a,b$ of $f,g$ and noting that $f(0)=g(0)h(0)$, so $b\mid a$, we would already get the power saving we aim for. Indeed,  putting $\tau(a) = \sum_{b\mid a} 1$

\begin{aligned}&\textrm{Prob}(f=gh, \deg g=k) \\&\quad=\sum_{1\leq a\leq H}\sum_{b\mid a}\textrm{Prob}(f\equiv gh, \deg g=k, f(0)=a, g(0)=b) \\ &\quad \leq \frac{1}{H^{n}} H^{k-1}H^{n-k-1}\sum_{1\leq a\leq H}\tau(a)\\&\quad \ll \frac{\log H}{H}.\end{aligned}

Here we used the classical result in number theory that the average value of $\tau$ in the interval $[1,H]$ is $\log H$.

Noting that $f$ is reducible if and only if $f=gh$, $\deg g=k$ for some $1\leq k\leq n-1$, we conclude that

$\textrm{Prob}(f \textrm{ reducible }) \ll n\frac{\log H}{H},$

with absolute implied constant.

One may see that this bound does not help in the restricted coefficient model discussed above, as it tends to $\infty$ with $n$.

However, it gives a $\pm 1$ coefficient result in the easier case when the number of variables is $2$:

Theorem (LBS-Kozma): Let $f (X) = \sum_{i,j\leq n} \pm 1X^iY^j$ be a random bivariate polynomial with iid coefficients taking the values $\pm 1$ uniformly. Then

$\textrm{Prob} (f \textrm{ irreducible}) = 1+o(1)$

as $n\to \infty$.

The crux of the proof is specializing $X=2$, which essentially reduces to the large box model. This was generalized by Chern to more general fixed sets of coefficients, see here.

## Restricted coefficient model

#### Ideas in the proof of Theorem 1

Going back to the restricted coefficients model, we now explain the main tools/steps in the proof of Theorem 2. We always assume that $f$ is a random polynomial with iid coefficients taking the values $1,\ldots, 210$, uniformly (unless explicitly stated otherwise).

It will be convenient to introduce a piece of terminology: An event on a random polynomial $f$ holds a.s. (almost surely) if the probability for it to hold is $1+o(1)$, as $n\to \infty$.

Our first ingredient is that

A.s., the degrees of $f$ are $\geq \omega(n)$, for some $\lim_{n\to \infty}\omega(n)=\infty$.

In fact, Konyagin, while proving Theorem 1, proved the above with $\omega(n)\gg n/\log n$  (he considered the model of $0,1$, but his argument carries over to this model too). We will not need such a strong bound on $\omega(n)$. To get a flavor of the tools in the proof we show that a random $\pm 1$-coefficient polynomial has no linear factors, a.s.:

A linear factor corresponds to a rational root, which by the rational root theorem must be $\pm1$. The probability that $\pm 1$ is a root is the same as the probability that a length $n$ random walk on the line will come back to the starting point, which is $\approx 1/\sqrt{n} =o(1)$.  So a.s. $f$ has no linear factor.

Using the first ingredient we may reduce the proof of Theorem 1 to proving

$\textrm{Prob}(f = gh, k\leq \deg g\leq 2k)\ll \frac{1}{(\log k)^2}$.

Indeed, by the first case and neglecting events that a.s. do not occur,

\begin{aligned}&\textrm{Prob}(f\textrm{ reducible})\\ &\quad \leq \sum_{\log_2\omega(n)\leq d\leq \log n/2} \textrm{Prob}(f = gh, 2^d\leq \deg g\leq 2^{d+1})\\ &\quad\ll \sum_{d=\log_2\omega(n)}^{\infty} d^2 \to 0.\end{aligned}

The next thing we need relates to the anatomy of the polynomials. Consider the space

$\Omega = \{(l_1,l_2,\ldots, l_n) : l_i\geq 0\textrm{ and } \sum il_i=n\}$.

of partitions of $n$. We have two probabilities measures on $\Omega$ coming from $M_{n,q}$ the set of degree-$n$ monic polynomials with coefficients in the finite field $\mathbb{F}_q$ and from $S_n$, and are given respectively by

\begin{aligned} P_{S_n} (l_1,l_2,\ldots, l_n) & = \frac{\#\{ \sigma\in S_n \textrm{ with } l_i \textrm{ orbits of length }i\}}{n!}\\ P_{M_{n,q}} (l_1,l_2,\ldots, l_n) & = \frac{\#\{ \phi \in M_{n,q} \textrm{ with }l_i \textrm{ prime factors of degree }i\}}{q^n}\end{aligned}

Since we have an exact formula for the number of primes of each degree (the prime polynomial theorem) both these probabilities have formulas, see Arratia, Barbour and Tavaré for an extensive study on the relations of the two probability functions.   In particular, it is not very difficult to show that $\|P_{S_n} - P_{M_{n,q}}\|=O(q^{-1})$ with absolute constant that is independent on $n$ (and funny enough this gives a new proof of the derangement problem).

However in our case $q=2,3,5,7$. For such small $q$ there are events, coming from the small $i$-s, for which the respective probabilities are not close (e.g.  compare the probabilities of the event $l_1\neq 0$). The next ingredient is to show that if an event $R$ is invariant to changing $l_i$ for $i \ll (\log k)^2$, then

$|P_{S_n}(R) - P_{M_{n,q}}(R)| \ll (\log k)^{-2}$   (*)

Dimitris Koukoulopoulos suggested a way to directly work with polynomials instead of going through permutations. This will be discussed elsewhere.

Since our $f$ has coefficients that are uniform modulo $210$, when we reduce modulo each of the primes $2,3,5,7$ we get $4$-independent uniform polynomial (this is a consequence of the Chinese remainder theorem). This consideration combined with the fact that the reductions of $f$ modulo $2,3,5,7$ are independent from each other and each of them is uniform, reduce the proof to showing

for every $1\leq k\leq n/4$, the probability that there is $k\leq l<2k$ for which four iid uniform random permutations has invariant set of size $l$ is $\ll (\log k)^{-2}$.

This was shown (essentially) by Pemantle-Peres-Rivin with the bound $\ll k^{-c}$. This does not hold for three random permutations as was shown by Eberhard, Ford and Green. This completes the steps in the proof.

## Galois group

#### Ideas in Proof of Theorem 3

Now we know that a.s. a random polynomial is irreducible, and in particular it has a transitive Galois group.  To get that the Galois group contains the alternating group, we follow Łuczak and Pyber who proved that a.s. a random permutation is contained in no transitive subgroup other than $A_n$ or $S_n$. The difference of what is needed for our theorem, is that we have a random permutation only up to a small perturbation:

Let $\sigma\in S_n$ be a random uniform permutation and let $\alpha>0$ be sufficiently small (in fact $\alpha<1-\frac{\log\log2}{\log2}$ suffices). Then a.s. there exists no transitive group $G\neq A_n,S_n$ that contains a permutation that can be get from changing $\sigma$ by cycles of length $\leq n^{\alpha}$.

The need for perturbation comes from two sources. The first is that we can from (*). The second one comes from lifting the Frobenius.

To be more precise, write $f_p$ for $f$ modulo $p$. If $f_p$ has $m_i$ prime factors of degree $i$ and it is squarefree, then the Frobenius acts on the roots of $f_p$ has a permutation with $m_i$ cycles of length $i$, and this action may be lifted to an element of $G_f$.

However, if $f_p$ is not squarefree, this doesn’t work as simple. There exists a unique factorization $f_p = gh$ with $g$ squarefull,  $h$ squarefree, and $\gcd(g,h)=1$. Then, the roots of $g$ and the roots of $h$ each are invariant set under the action of the Frobenius. Now we can lift the Frobenius to an element $\sigma\in G_f$, so that $\sigma$ acts the same on the set of liftings of the roots of $h$. This means that, up to the perturbation of the size of $\deg g$, the element $\sigma$ is uniform. Now, we know that a.s. $\deg g\leq n^{\alpha}$, and so we get a random element up to perturbation as needed.

## Statistics of Arithmetic Functions over Large Finite Fields

May 8, 2018

The aim of this post is to survey some results that has been obtained recently on statistic of arithmetic functions over large finite fields using Galois theory. In a future post, I will discuss the proofs.

#### Statistic of Arithmetic Functions

Let $\psi$ be  an arithmetic function, by which we just mean that $\psi$ is a function

$\psi\colon \mathbb{Z}_{>0} \to \mathbb{C}$.

It is a main theme in analytic number theory to study asymptotic statistic properties of arithmetic functions. The mean value of $\psi$ is defined as the asymptotics of

$\displaystyle\langle \psi(n)\rangle_{ n\leq x } := \frac{1}{x} \sum_{n\leq x} \psi(n).$

For example, the Prime Number Theorem can be formulated as

$\displaystyle\langle \Lambda(n) \rangle_{n\leq x} \sim 1,$

where $\Lambda$ is the von Mangoldt arithmetic function:

$\Lambda(n) = \begin{cases} \log p, & n=p^a, \ p \ \textrm{prime}\\ 0, &\textrm{otherwise.}\end{cases}$

One may consider mean values on short intervals

$\displaystyle\langle \psi(n) \rangle_{|n-x|\leq x^{\epsilon}} : = \frac{1}{2x^{\epsilon}} \sum_{|n-x|\leq x^{\epsilon}} \psi(n)$

(where $1>\epsilon>0$ is fixed) or on arithmetic progressions

$\displaystyle\langle \psi(n) \rangle_{\substack{n\leq x\\n\equiv a(q)}}:= \frac{1}{qx} \sum_{\substack{n\leq x\\n\equiv a(q)} } \psi(n)$

where we allow $q$ to be in the regime $q\leq x^{1-\delta}$, for a fixed $1>\delta >0$. Two central conjectures in the distribution of prime numbers are

$\displaystyle\langle \Lambda(n) \rangle_{|n-x|\leq x^{\epsilon}} \sim 1$

and for $(a,q)=1$:

$\displaystyle\langle \psi(n) \rangle_{\substack{n\leq x\\n\equiv a(q)}}\sim \frac{q}{\phi(q)}.$

The Riemann hypothesis implies the first conjecture only for $\epsilon > 1/2$ and the General Riemann Hypothesis implies the second conjecture only for $\delta >1/2$. We refer the interested reader to the excellent survey paper by Granville.

Other type of statistics is of correlation of arithmetic functions $\psi_1,\ldots, \psi_k$ which, at a tuple $h=(h_1,\ldots, h_k) \in \mathbb{Z}>0$ is defined as the asymptotics of

$\displaystyle\left \langle \prod_{i=1}^k \psi_i(n+h_i) \right\rangle_{n\leq x}, \quad x\to \infty$.

If the $\psi_i$-s are all equal, we use the term auto-correlation, if $k=2$ we say pair-correlation, etc. Following the previous examples, the auto-correlations of $\Lambda$ as given in the Hardy-Littlewood prime tuple conjecture, implies Poisson distribution of the number of primes in intervals of typical length (this is known as Gallagher’s lemma). See this post for some more on the Hardy-Littlewood prime tuple conjecture.

Another theme, which is in a sense more general, is about primes values of arithmetic functions. That is, given a polynomial $f(X)$ with integral coefficients, we consider the asymptotics of

$\langle \psi(f(n)) \rangle_{n\leq x}$.

In particular, when $\psi=\Lambda$, the content of the Bateman-Horn conjecture (which is a quantitative version of the Bouniakowsky conjecture and the more general Schinzel Hypothesis H).

In particular, taking $f(X) = \prod(X+h_i)$, we get the Hardy-Littlewood prime tuple conjecture. Taking $f(X) = X^2+1$, we get a quantitative Landau’s problem of finding infinitely many primes of the form $p=n^2+1$.

We exemplified the importance of this statistics with $\psi=\Lambda$, but taking other arithmetic functions such as the M\”obius function, divisor function, etc. leads to a variety of central problems in analytic number theory which have applications in other fields (e.g. the auto-correlations of $b$ and its connections to the dynamics of billiards). I do not want to elaborate on those applications here; the bottom line is that statistics of arithmetic functions is central in number theory.

#### Function Fields

Here we go from the realm of the integers to the world of polynomials over finite fields. The usual dictionary is that the ring $\mathbb{F}_q[t]$ of polynomials over the finite field $\mathbb{F}_q$ of $q$ elements serves as an analogue of $\mathbb{Z}$. The positive integers are modeled by the set of monic polynomials $M_q\subseteq \mathbb{F}_q[t]$. So, an arithmetic function $\psi$ in this setting is a function

$\psi\colon M_q \to \mathbb{C}.$

The analogue of the interval $[1,x]$ (or rather of $[x,2x]$) is the set of degree $n$ monics:

$M_{n,q} := \{ f\in M_q : \deg f=n\}.$

We define the mean value of $\psi$ as the asymptotic value of

$\langle\psi(f)\rangle_{f\in M_{n,q}} := \frac{1}{q^n}\sum_{f\in M_{n,q}}\psi(f)$

as $q^n\to \infty$. It is important to note now that the asymptotics may depend on the way $q^n\to \infty$. We list here some different limits, going from closest to the integers to the farthest:

1. Large-degree limit: $q$ fixed and $\lim_{n\to \infty}$,
2. Large-degree-large-field limit: $\lim_{q\to \infty} \lim_{n\to \infty}$,
3. Large-field-large-degree limit: $\lim_{n\to \infty} \lim_{q\to \infty}$,
4. Large-finite-field limit: $n$ fixed and $\lim_{q\to \infty}$.

We proceed with the above tradition of exemplify using the von Mangoldt function: A prime polynomial is a monic irreducible polynomial and the von Mangoldt function is defined as

$\Lambda_q(f) = \begin{cases}\deg P ,& f=P^k,\ P\ \textrm{prime polynomial}\\ 0,& \textrm{otherwise.}\end{cases}$

The Prime Polynomial Theorem first proved by Gauss says that

$\langle\Lambda(f)\rangle_{f\in M_{n,q}} =1$.

The first thing to note, that unlike the number field case, there is NO error term here. In particular, this theorem is valid in the most general limit $q^n \to \infty$. The proof of this theorem goes in the same lines as the proof of the Prime Number Theorem, just that the corresponding zeta function is a rational function with no zeros, hence the analytic part is trivial in the function field case. The fact that there are no zeros gives the exact formula above.

In general, the General Riemann Hypothesis (GRH) for function fields is a theorem, hence we expect that any result for the integers conditional on GRH should be provable in the function field setting. However this is not the case, a beautiful example in the setting of the Bateman-Horn conjecture is discussed here.

#### Short intervals

The studies on the mean values of arithmetic functions in short intervals started with the Von Mongoldt (and the prime characteristic function) by Bank&Bary-Soroker&Rosenzweig and by Keating&Rudnick. The latter paper studies the distribution of of the number of primes (counted with von-Mongoldt) in short intervals, and in particular computes the variance in the limit $q\to \infty$. The work is based on an equidstribution result of Katz. The former result study the mean values of the number of primes in all short intervals, in the same limit $q\to \infty$. Here the main point the main point is a computation of a Galois group, which is based on earlier results of S.D. Cohen.

Here a short interval around a polynomial $f\in \mathbb{F}_q[T]$(with parameter $\deg f>m>0$) is the set

$\{ g\in \mathbb{F}_q[T] : \deg(f-g)\leq m\}$

which corresponds to a short interval $\{n:|n- x|\leq x^{\epsilon}\}$ around $x$ (with parameter $1>\epsilon>0$), where $\epsilon$ corresponds to $\frac{m}{\deg f}$.

The results for primes were generalized to other arithmetic functions, to correlations, and even to arithmetic function evaluated at polynomials, a la Bateman-Horn conjecture. Some other people who contributed a lot in this direction: Julio Andrade, Dan Carmon, Alexei Entin, Arno Fehm, Paul Pollack, Brad Rodgers, and many more.

I want to focus here on several families of arithmetic functions that we can compute the mean value in short intervals in the large degree limit. I will start with simpler ones.

#### Factorization type arithmetic functions

We start with an abstract notion of factorization type of a polynomial. For us, a function $\lambda \colon \mathbb{Z}_{>0}\times \mathbb{Z}_{>0}\to \mathbb{Z}_{\geq 0}$ with finite support is called a factorization type. The set of all factorization types will be denoted by

$\Lambda_0 = \{ \lambda \textrm{ factorization type}\}$.

The degree of $\latex \lambda\in \Lambda_0$ is defined to be

$\deg \lambda = \sum_{d,e>0} de \lambda(d,e)$.

To make the reader more comfortable with the notion of factorization type, we immediately say that a nonzero polynomial $f\in \mathbb{F}_q[T]$ with prime factorization $f= \prod_{i} P^{e_i}$ induces the factorization type defined by

$\lambda_f(d,e) = \# \{ i : \deg P_i = d, \ e_i=e\}$.

In particular,

$\deg f = \deg \lambda_f$.

The space of functions on factorization functions is denoted

$\Lambda_0^* = \{ \psi \colon \Lambda_0\to \mathbb{C}\}$

and we refer to $\psi \in \Lambda_0^*$ as a factorization type arithmetic function. We immediately note that $\psi\in \Lambda_0^*$ defines an arithmetic function on $\mathbb{F}_q[T]$ by setting

$\psi_q(f) := \psi(\lambda_f)$.

Many of the good old arithmetic functions are of this type. For example, the function

$\displaystyle \lambda \mapsto \begin{cases}D, &\textrm{if } \sum_{d,e>0}\lambda(d,e) = \sum_{e>0} \lambda(D,e)=1\\ 0,& \textrm{otherwise}\end{cases}$

induces the Von Mangoldt function, the function

$\displaystyle \lambda\mapsto \begin{cases}(-1)^{\sum_{d>0}\lambda(d,1)},& \sum_{d>0,e>1}\lambda(d,e)=0\\0, & \textrm{otherwise}\end{cases}$

induces the Möbius function, etc. In the same manner one may induce many of hers favorite arithmetic functions…

We may evaluate factorization type arithmetic function also on the permutation group $S_n$. More precisely, a permutation $\sigma\in S_n$ on $n$ letters, has a factorization into a product of disjoint cycles $\sigma = \sigma_1\cdots \sigma_r$. In this factorization we also include the trivial cycles, so that $\sum_{i=1}^r o(\sigma_i)=n$, where $o(\sigma_i)$ denotes the order of $\sigma_i$. So we may define a factorization type for $\sigma$:

$\lambda_\sigma (d,e) :=\begin{cases} \#\{i : o(\sigma_i)=d\} , &\textrm{e=1}\\ 0, &\textrm{otherwise.} \end{cases}$

This let us evaluate $\psi\in \Lambda_0^*$ on elements of $S_n$, and in particular, we have the combinatorial quantity:

$\left\langle \psi \right\rangle_{ S_n}:= \left\langle \psi(\lambda_\sigma) \right\rangle_{\sigma\in S_n} = \frac{1}{n!}\sum_{\sigma \in S_n} \psi(\lambda_\sigma)$.

A theorem in the spirit of Bank&Bary-Soroker&Rosenzweig and of Andrade&Bary-Soroker&Rudnick connects this quantity to the mean value in short intervals.

Theorem. For every $B>0$ there exists a constant $C_B>0$ with the following property. Let $\psi \in \Lambda_0^*$ be a factorization type arithmetic function, let $q$ be a prime power, let $B>n>m>1$, let  $f\in \mathbb{F}_q[T]$ be monic of degree $n$. Then

$\left|\displaystyle \left\langle \psi_q(g) \right\rangle_{\deg (f-g)\leq m} -\left\langle \psi\right\rangle_{ S_n}\right|\leq C_Bq^{-1/2}$.

#### Signed factorization type

Recall that the arithmetic function $b(n)$ is the indication function of sums of two squares. More precisely, $b(n)=1$ if and only if $n=a^2+b^2$. There were several works on the analogue in the function fields.

A naive approach is to take a literal analogue, that if to consider polynomials of the form $f=A^2+B^2$, where $A,B\in \mathbb{F}_q[T]$. This makes sense when $q\equiv 3\mod 4$. When $q\equiv 1\mod 4$, we have the decomposition $A^2+B^2 = (A+\alpha B )(A-\alpha B)$, where $\alpha \in \mathbb{F}_q$ satisfies $\alpha^2=-1$. So in this case, $b$ is the constant function $1$. So this is not interesting. Moreover, it seems like the correct analogue to take the function field  $b$ to be an indicator of norms from a quadratic geometric extension of $\mathbb{F}_q(T)$ of genus zero (as analogues of $\mathbb{Z}[i]$), and for simplicity (following the choice of Bary-Soroker&Smilanski&Wolf) we define

$b_q(f) = \begin{cases} 1,& f=A^2+TB^2\\0,& \mbox{otherwise.} \end{cases}$

The theory of such functions is very similar to the theory of $b(n)$. In particular, the analogue of Fermat theorem — saying that $b(n)=1$ if and only if every $p\equiv 3\mod 4$ divides $n$ an even number of times — is

Theorem. $b(f) = 1$ if and only if for every prime polynomial $P$ with $P(0)\neq \square \in \mathbb{F}_q$ divides $f$ even number of times.

This implies that $b$ is not a factorization type, as it depends also on which arithmetic progression the primes belong to. So two questions naturally comes in mind

Question 1. What is the generalization of the notion of arithmetic functions?

Question 2. What is the group the governs these arithmetic functions?

Bary-Soroker&Fehm introduce the following notion of signed factorization type to deal with the first question. A signed factorization type is a function

$\lambda\colon \mathbb{Z}_{>0}\times \mathbb{Z}_{>0}\times \{\pm 1,0\}\to \mathbb{Z}_{\geq 0}$

with finite support. We put

$\Lambda_1= \{\textrm{signed factorization types}\}$.

We immediately define two functions on $\Lambda_1$:

$\deg \lambda = \sum_{d,e>0,\ \epsilon=\pm1,0}de \lambda(d,e,\epsilon)$

$\chi \lambda =\prod_{d,e>0,\ \epsilon =\pm1,0} \epsilon^{\lambda(d,e)e}$.

We immediately note that if $q$ is odd $f\in \mathbb{F}_q[T]$ with primes factorization $f=\prod_{i} P_i^{e_i}$ defines a signed factorization function by

$\lambda_{1,f}(d,e,\epsilon) = \# \{ i : \deg P_i = d, \ e_i=e,\ P(0)^{\frac{q-1}{2}}=\epsilon\}$.

Now a signed factorization arithmetic function is a function on signed factorization types, and the space of all signed factorization arithmetic functions is denoted by

$\Lambda_1^* = \{ \psi \colon \Lambda_1 \to \mathbb{C}\}$.

As for factorization arithmetic function, a signed factorization arithmetic function $\psi$ induces and arithmetic function of $\mathbb{F}_q[T]$ by setting

$\psi_q(f) := \psi(\lambda_{1,f})$.

Now we get to the group theory to answer Question 2. Consider the hyperoctahedral group $H_n$ (aka signed permutation group aka Bn aka Cn) which is the symmetry group of the following graph with $2n$ vertices and $n$ edges:

We may represent $H_n$ as the permutational wreath product $H_n=C_2 \wr S_n$, whose elements can be represented by $(\zeta, \sigma)$ with $\zeta \colon [n]\to \{\pm 1\}$ and $\sigma \in S_n$. If $\sigma = \sigma_1 \cdots \sigma_r$ is the factorization of $\sigma$ into disjoint cycles, and if we write $\sigma_i = (j_1 \ \ldots \ j_d)$, then we attach to $(\zeta,\sigma)$ the signed factorization type

$\lambda_{(\zeta,\sigma)}(d,e,\epsilon) = \begin{cases}\#\{i : o(\sigma_i)=d,\ j_1\cdots j_d = \epsilon\}, & \textrm{if } e=1\\ 0,&\textrm{otherwise}\end{cases}.$

This means that we can evaluate a signed factorization type arithmetic function $\psi\in \Lambda_1$ on the hyperoctahedral group. We denote the mean value by

$\left<\psi \right>_{H_n} := \left<\psi(\lambda_{(\zeta,\sigma)})\right>_{(\zeta,\sigma)\in H_n} = \frac{1}{2^nn!}\sum_{(\zeta,\sigma)\in H_n} \psi(\lambda_{(\zeta,\sigma)})$.

From Bank&Bary-Soroker&Fehm one may deduce the following

Theorem. For every $B>0$ there exists $C_B>0$ with the following property. Let $\psi\in \Lambda_1$, $q$ an odd prime power, $n>m>1$, $f\in \mathbb{F}_q[T]$ monic f degree $n$. Then

$\displaystyle \left | \left<\psi_q(g)\right>_{\deg(f-g)\leq m} -\left<\psi\right>_{H_n}\right| \leq C_Bq^{-1/2}$.

## Some easy Frobenius computation

May 3, 2018

The objective of this post is to compute the Frobenius element in the simplest  example of an Artin-Schreier extension.

For a prime power $q=p^{\nu}$ we write  $\mathbb{F}_q$ for the finite field with $q$ elements and $\mathbb{F}_q(t)$ for the field of rational functions over $\mathbb{F}_q$.

Consider the Artin-Schreier polynomial $f(X) = X^p - X - t$. If $s$ is a root of $f$, then for every $i\in \mathbb{F}_p$ we have $i^p=i$ and so

\begin{aligned} f(s +i)& = (s+i)^p-(s+i)-t \\ &= s^p +i^p-s-i-t=f(\alpha)=0\end{aligned}.

This implies that $f(X) = \prod_{i\in \mathbb{F}_p} (X-(s+i))$ and thus

$E= \mathbb{F}_q(t,s)=\mathbb{F}_q(s)$

is a Galois extension of $\mathbb{F}_q(T)$ of degree $[E:\mathbb{F}_q(t)]=p$ (since, obviously, $f$ is irreducible over $\mathbb{F}_q(t)$). Thus the Galois group $G=\mathrm{Gal}(E/\mathbb{F}_q(t))$ is isomorphic to (the additive group of) $\mathbb{F}_p$ with the explicit isomorphism

$g\in G \mapsto s-g(s)\in \mathbb{F}_p.$

Next we recall the definition of the Frobenius at a prime polynomial.
Let $\mathbb{F}_q[T]$ be the ring of polynomials with coefficients in $\mathbb{F}_q$ and let $O_E = \mathbb{F}_q[s]$ be the integral closure of $\mathbb{F}_q[T]$ in $E$ which is a polynomial ring in $s$.
Let $P\in \mathbb{F}_q[T]$ be a prime polynomial, which by a convention means an irreducible monic polynomial. The polynomial $P$ may factor in $O_E$, let $\frak{P}_i\in \mathbb{F}_q[s]$ be the prime factors, $i=1,\ldots, \mu$ . We define the Frobenius $\sigma_P$ at $P$ to be the set of all $g\in G$ for which there exists $1\leq i\leq \mu$ such that

$g(u) \equiv u^{q^{\deg P}} \mod{ \frak{P}_i}.$ (*)

The general theory tells us that since $G$ is abelian, we have two options: either $P$ is unramified, and then $\sigma_P$ is element in $G$ (we identify the set with one element with the actual element) or $P$ is ramified and then $\sigma_P = \mathbb{F}_p$. However we compute directly that

$\sigma_P = \textrm{Tr}_{q,p}(\textrm{Tr}(P)) ,$

where if $P = t^n + a t^{n-1} + \cdots$, then $\textrm{Tr}(P) = a$ and $\textrm{Tr}_{q,p}\colon \mathbb{F}_q\to \mathbb{F}_p$ is the trace map given by

$\textrm{Tr}_{q,p}( \alpha) = \alpha + \alpha^{p} + \cdots + \alpha^{q/p}, \qquad \alpha\in \mathbb{F}_q$.

Let $g\in \sigma_P$ which corresponds to $i\in \mathbb{F}_p$ given by  $g(s) = s-i$. We have to show that $i=\textrm{Tr}_{q,p}(\textrm{Tr}(P))$. Let $\mathfrak{P}$ be the corresponding prime factor of $P$ in $\mathbb{F}_q[s]$ satisfying (*). Then $\alpha := t\mod \frak{P}\in \mathbb{F}_{q^{\deg P}}$ is a root of $P$ and $\beta:= s\mod \mathfrak{P}$ is a root of $X^p-X-\alpha$. So, the equality $i=s-g(s) \equiv s-s^{q^{\deg P}}\mod \frak{P}$ translates to

$i=\beta- \beta^{q^{\deg (P)}}$.

Recall that $q=p^{\nu}$ and  use the `telescopic series trick’ to get that

\begin{aligned}i &= \sum_{c=0}^{\deg(P)\nu-1} (\beta^{p^c}-\beta^{p^{c+1}} ) \\ &=\sum_{c=0}^{\deg(P)\nu-1} (\beta-\beta^{p})^{p^c} \\ & = -\sum_{c=0}^{\deg(P)\nu-1}\alpha^{p^c}.\end{aligned}

As  $\textrm{Tr}(P)= -\sum_{a=0}^{\deg (P)-1}\alpha^{p^{\nu a}}$, we conclude from the above that

\begin{aligned}\textrm{Tr}_{q,p}(\textrm{Tr}(P))& =-\sum_{b=0}^{\nu-1}\sum_{a=0}^{\deg (P)-1} \alpha^{p^bq^a}\\ & =- \sum_{c=0}^{\nu \deg (P)-1} \alpha^{p^{c}}=i\end{aligned},

as needed.

## Irreducible speciailizations give group preserving specializations

June 12, 2013

In this post I discuss the classical connection between irreducible and group preserving specializations in Hilbert’s irreducibility theorem.

A field $K$ is Hilbertian if every $f\in K[t,X]$ that is separable and of positive degree in $X$ and irreducible in $K[t,X]$ admits infinitely many irreducible specializations, that means, there exist infinitely many $a\in K$ such that $f(a,X)$ is irreducible and separable as a polynomial in $K[X]$.

A group preserving specialization is a stronger type of specialization. For $f(t,X)$ separable in $X$, let $F$ be its splitting field over $K(t)$, and let $G= Gal(f,K(t)) = Gal(F/K(t))$ be the Galois group of $f$ over $K(t)$. For any $a\in K$ with $f(a,X)$ separable, let $F_a$ be the splitting field of $f(a,X)$ and $G_a = Gal(f_a,K)=Gal(F_a/K)$ be the Galois group. One always has that $G_a$ is a subquotient of $G$ (that is $G_a$ is isomorphic to the decomposition group modulo the inertia group of a prime of $F$ that lies over the prime $(t-a)$ of $K(t)$).

A specialization $t\mapsto a$ is called group preserving specialization if $G\cong G_a$, or equivalently if $|G|=|G_a|$.

If $t\mapsto a$ is an irreducible specialization for $f$ it is NOT necessarily group preserving. For example, the Galois group of $X^4+X^3+X^2+ X+t$ is $S_4$, the specialization $t\mapsto 1$ gives $X^4+X^3+X^2+X+1$ – which is irreducible but its Galois group is $G_1 = (\mathbb{Z}/5\mathbb{Z})^\times \cong \mathbb{Z}/4\mathbb{Z}$ (since its the minimal polynomial of the root of unity $\zeta= e^{2 \pi i/5}$).

Nevertheless, if $K$ is Hilbertian we can always find group preserving specializations:

Theorem (Hilbert): If $K$ is Hilbertian, then  for every separable-in-$X$ polynomial $f(t,X)$ there is a group preserving specialization.

A proof with all details can be found, e.g., in Galois theory of Serre or in Field Arithmetic of Fried-Jarden. Here’s a

Sketch of Proof:
Let $F$ be the splitting field of $f(t,X)$ over $K(t)$ and let $=Gal(F/K(t))$. Since $F/K(t)$ is separable, by the primitive element theorem, there exists $\xi \in F$ such that $F=K(t)(\xi)$. Let $g(t,X)$ be the minimal polynomial of $\xi$ over $K(t)$. Now apply the Hilbertianity property for $g$: There exist infinitely many $a\in K$ such that $g(a,X)$ is irreducible. Choose such an $a\in K$ with the extra properties that (1) $\deg_X f(t,X) = \deg f(a,X)$ and (2) $\deg_X g(t,X) = \deg g(a,X)$. Now since $F$ is generated by the roots of $f(t,X)$ on the one side and by $\xi$ on the other, one can show that the splitting field $F_a$ of $f(a,X)$ equals the field generated by a root of $g(a,X)$. So

$\begin{array}{ll} |G| &= [F:K(t)]=\deg_X g(t,X)\\ & = \deg g(a,X) =[F_a:K] =|G_a|.\end{array}$

So the groups are of the same size, as needed.

QED.

## Hardy-Littlewood tuple conjecture over large finite fields

November 27, 2012

In this post I will describe the solution to the Hardy-Littlewood tuple conjecture for polynomials over large finite fields.
Let’s first recall the

## Classical Setting

The prime number theorem says that

$\displaystyle \pi(x) := \sum_{0.

(In fact, the inverse logarithmic function, $\textnormal{Li}(x) = \int_{2}^x \frac{1}{\log t}dt$, better approximates $\pi(x)$, but we will not get into these subtleties  in this post.)

Let $k>0$. We will say that a $k$-tuple $(n_1, \ldots, n_k)$ of integers is a prime tuple if $n_i$ is a prime for every $i$.

For a $k$-tuple of integers, say $a=(a_1,\ldots, a_k)$, let $\pi_a(x)$ be the number of $0 such that $(n+a_1, \ldots, n+a_k)$ is a prime tuple.

The Hardy-Littlewood tuple conjecture asserts that

$\displaystyle \pi_a(x) \sim \mathfrak{S}_a\frac{x}{\log^k x}, \quad x\to\infty$,

where, if $\nu(p)$ denotes the number of mod $p$ residue classes  in $\{a_1, \ldots, a_k\}$, then the singular series, $\mathfrak{S}_a$, is given by

$\displaystyle \mathfrak{S}_a = \prod_{p} \left(1-\frac{\nu(p)}{p}\right)$.

First we note that if $\nu(p)=p$, for some $p$, then $p$ divides one of the entries of $(n+a_1, \ldots, n+a_k)$, for every $n$, so $\pi_a(x) = O(1)$. This is consistent with the Hardy-Littlewood conjecture, since in this case, $\mathfrak{S}_a=0$.

If $\nu(p) for all primes, then one can show that the product converges, so $\mathfrak{S}_a>0$. Therefore, in this case, the events that $n+a_1$ is prime, $\ldots$, $n+a_k$ is prime, are, in a sense, independent up to the factor of the singular series.

A notable case of The Hardy-Littlewood tuple conjecture is when $a=(0,2)$, in which $\pi_a(x)$ counts twin primes.

The only case of Hardy-Littlewood tuple conjecture that is settled is $k=1$, which is the prime number theorem. All other cases are hopelessly open.

## Polynomials over Finite Fields

In the ring $\mathbb{F}_q[t]$ of polynomials over a finite field with $q$ elements, the monic irreducible polynomials play the role of (positive) prime numbers. To get the analog of the prime number theorem, we let $\displaystyle \Pi(q,n)$ be the number of irreducible monic polynomials $f\in \mathbb{F}_q[t]$ of degree $n$. Then the analogous prime number theorem tells us that

$\displaystyle \Pi(q,n)=\frac{q^n}{n}+O(\frac{q^{n/2}}{n}) \sim \frac{q^n}{n}, \quad q^n \to \infty$.

Here $q^n$ plays the role of $x$ in the classical setting, since there are $q^n$ monic polynomials of degree $n$, and $n=\log_q (q^n)$ plays the role of $\log x$.

We note that here, in contrast to the classical case, there are several ways $q^n$ can tend to $\infty$, notably $q\to \infty$ and $n\to \infty$. Since we have a good error term in the prime number theorem in polynomial rings, the theorem holds in all limits.

However, it happens that some questions in number theory in polynomial rings can be solved only in certain limits. In what follows I’m going to explain a solution of the Hardy-Littlewood tuple conjecture in polynomial rings in the limit $q\to \infty$, $q$ odd.

For positive integers $n,k$, a prime power $q$, and a $k$-tuple $a=(a_1, \ldots, a_k)$ of polynomials in $\mathbb{F}_q[t]$ each of degree $, we let $\Pi_a(q,n)$ be the number of monic polynomials $f \in \mathbb{F}_q[t]$ of degree $n$ such that $(f+a_1, \ldots, f+a_k)$ is a $k$-tuple of irreducible polynomials.

For an analogous Hardy-Littlewood tuple conjecture we shall have to introduce a singular series $\mathfrak{S}_a$, which is defined analogously to the classical case. That is

$\displaystyle \mathfrak{S}_a =\prod_{f}\left(1-\frac{\nu(f)}{q^{\deg f}}\right)$,

where $f$ runs over all monic irreducibles and $\nu(f)$ is the number of residues of $\{a_1, \ldots, a_k\}$ mod $f$. We note that $\nu(p)\leq k$, so if $q\to \infty$, then each factor will tend to $1$, so $\mathfrak{S}_a\to 1$ (I wasn’t very rigorous here…).

Therefore, in the limit $q\to \infty$, the Hardy-Littlewood tuple conjecture should say that $\Pi_a(q,n) \sim \frac{q^n}{n^k}$, as $q\to \infty$.

At least if $q$ is odd this is true:

Theorem (Bary-Soroker 2012): Let $n,k$ be positive integers, let $q$ be an odd prime power, and let $(a_1,\ldots, a_k)$ be a $k$-tuple of distinct polynomials of degree $. Then

$\displaystyle \Pi_a(q,n) = \frac{q^n}{n^k} + O_{n,k}(q^{n-1/2})$

Here the $O_{n,k}$ notation means that there exists a constant $C = C(n,k)$, depending only on $n$ and $k$ – and NOT on $q$ or on $a$ – such that $|\Pi_a(q,n)-\frac{q^n}{n^k}|\leq C q^{n-1/2}$.

Before going to the points in the proof several remarks are in place.

1. The interesting case $k=2$ is due to Bender and Pollack, see here.
2. Another interesting special case is when the polynomials in $a$ are constants. In this case there are stronger results by Pollack and by myself.

Let’s explain the main steps of the proof

## Basic Strategy

Let $F$ be the generic polynomial of degree $n$ over $\mathbb{F}_q$. That is to say, $F_U(t)=t^n + U_1 t^{n-1} + \cdots +U_n$, where $U = (U_1, \ldots, U_n)$ is an $n$-tuple of variables. Then, we can think of $\Pi_a(q,n)$ as the number of specializations $U\mapsto u\in \mathbb{F}_q^n$ such that $(F_u+a_1, \ldots, F_u+a_k)$ is a $k$-tuple of irreducibles. (Here $F_u = t^n + u_1 t^{n-1} + \cdots + u_n$.)

This point of view allows us to break the problem into two parts. The first part is an analog of Hilbert’s irreducibility theorem: The study of irreducible specializations for polynomials. We will see that unlike the classical setting, over finite fields the existence, and the number of irreducible specializations is governed by the Galois group. We will elaborate more below.

For now we shall say that the above consideration leads to the study of the Galois groups of the polynomials $F_U + a_1, \ldots, F_U+a_k$. Since the coefficients of $F_U$, hence of $F_U + a_i$, are variables we get that $F_U+a_i$ is irreducible (in the ring $\mathbb{F}_q[U,t]$, and even ${\rm Gal}(F_U+a_i, \mathbb{F}_q(U)) = S_n$). This, however, does not suffice for our needs. To apply the above mentioned Hilbert’s irreducibility theorem, we have to know the Galois group of $\hat{F}_U = \prod_{i=1}^k (F_U + a_i)$.

## Calculating a Galois Group

Let $\mathbb{F}$ be an algebraic closure of $\mathbb{F}_q$.  Then the restriction of automorphisms map gives an embedding

$\displaystyle {\rm Gal}(\hat{F}_U,\mathbb{F}(U))\to {\rm Gal}(\hat{F}_U,\mathbb{F}_q(U))$.

Since $\hat{F}_U = \prod_{i=1}^k (F_U + a_i)$ we have

$\displaystyle {\rm Gal}(\hat{F}_U,\mathbb{F}_q(U)) \leq \prod_{i=1}^k {\rm Gal}(F_U+a_i,\mathbb{F}_q(U))\cong S_n^k$.

Thus if we show that $G = {\rm Gal}(\hat{F}_U,\mathbb{F}(U))\cong S_n^k$, we will get that also  ${\rm Gal}(\hat{F}_U,\mathbb{F}_q(U))\cong S_n^k$.

For each $k$ we have an epimorphism $\varphi_i\colon G \to {\rm Gal}(F_U+a_i, \mathbb{F}(U))\cong S_n$. Composing each $\varphi_i$ with the signature map $S_n\to S_2$, we get $\nu_i\colon G\to S_2$.

Lemma. The map $\prod_{i=1}^k \varphi_i \colon G\to S_n^k$ is surjective if and only if the map $\prod_{i=1}^k \nu_i \colon G\to S_2^k$ is surjective.

By this lemma it suffices to study the maps $\nu_i$.

Since $q$ is odd (here is the only place we are using the oddness of $q$), by Kummer theory, each $\nu_i$ corresponds to a square class of an element in the field $\mathbb{F}(U)$. By its definition, $\nu_i$ corresponds to the class of the discriminant of $F_U+a_i$. Note that $\prod_{i} \nu_i$ is surjective if and only if

• the discriminant of $F_U+a_i$ is  non-square (this is obvious since the corresponding Galois group is $S_n$) for each $i$.
• The product of any subset of discriminants is non-square.

The latter condition can be achieved by applying a very nice argument of Carmon and Rudnick, taken from here. The details will be given in a later post.

## Irreducible Specializations

Let me state a type Hilbert’s irreducibility theorem for finite fields in the setting we have here. I should say that this is not the most general theorem.

Theorem. Let $U=(U_1, \ldots, U_n)$ be a set of variable, with $n\geq 1$, let $F_1,\cdots, F_k$ be polynomials in $\mathbb{F}_q[U,t]$ of respective degrees $d_i$ and $t$-degrees $n_i$. Assume that the Galois group of the product $F=F_1\cdots F_k$ over $\mathbb{F}(U)$ is isomorphic to $\prod_{i=1}^k S_{n_i}$. Then the number of simultaneous irreducible specializations, i.e. of $u\in \mathbb{F}_q^n$ for which $F_i(u,t)$ is irreducible for $i=1, \ldots, k$, is

$\displaystyle \frac{q^n}{n^k} + O_{D}(q^{n-1/2})$,

where $D = \sum_{i=1}^k d_i$.

A few remarks:

1. If the Galois groups of $F$  over $\mathbb{F}(U)$ and over $\mathbb{F}_q(U)$ are isomorphic; denote by $G\leq \prod_{i=1}^k S_{n_i}$ , i.e., if the splitting field of $F$ over $\mathbb{F}_q(U)$ is regular, then the same result as in the above theorem holds with a factor $c=c(G)=C(G)/|G|$ before $\frac{q^n}{n}$. Explicitly, $C(G)$ is the number on elements $g=(g_1, \ldots, g_k)$ in $G\leq \prod_{i=1}^k S_{n_i}$ such that each $g_i$ is an $n_i$-cycle.
2. The theorem also has a version when the splitting field is not regular.
3. The theorem straightforwardly follows from an explicit Chebotarev’s theorem for function fields. The case $n=1$ was proved by Geyer and Jarden, and perhaps a future post will discuss it.

## Conclusion

As explained before $\Pi_a(q,n)$ is the number of specializations $U\to u\in \mathbb{F}_q^n$ such that $F_{u}+a_i$ is irreducible for every $i=1, \ldots, k$. The calculation of the Galois group described above tells us that we can apply the theorem on irreducible specializations for $F_i = F_U+a_i$. Since the degree of $F_U+a_i \leq n$, we get that the sum $D$ of the degrees is bounded by $nk$, so

$\displaystyle \Pi_a(q,n) = \frac{q^n}{n^k}+O_{n,k}(q^{n-1/2})$,

as needed.