## Mathematics

# Binary Quadratic Forms and Primes of the Form x² + ny²

Here is an interesting observation about numbers: Some prime numbers can be written as sums of two squares(*x*² + *y*²), like:

5 = 1² + 2²

13 = 2² + 3²

17 = 1² + 4²

29 = 2² + 5²

You’ll notice that such primes are of the form 4*k* + 1, where *k* is some whole number. The french mathematician, Fermat, knew about this and elevated it to a theorem:

Every prime number which surpasses by one a multiple of four is composed of two squares.

Similarly, some prime numbers can be written in the form *x*² + 2*y*², like:

3 = 1 + 2(1²)

11 = 3 + 2(2²)

17 = 9 + 2(2²)

19 = 1 + 2(3²) = 11 + 2(2²)

These are prime numbers of the form 8*k* + 1 and 8*k* + 3.

Also, some prime numbers can be written in the form *x*² + 3*y*², like:

7 = 4 + 3(1²)

13 = 1 + 3(2²)

19 = 7 + 3(2²)

31 = 4 + 3(3²) = 19 + 3(2²)

These are prime numbers of the form 3*k* + 1.

Fermat knew about these too and stated them in letters to his colleagues - without proofs, though he claimed to have *solid proofs*. May be the margins were too small to contain them.

Naturally, one is led to ask, in general, what prime numbers can be written in the form x² + **n**y² for a given integer **n**? Sounds like a problem in recreational mathematics with no practical value. Nonetheless, it caught the interest of many eminent mathematicians for centuries. The theory of *binary quadratic forms* was developed to help answer this question.

*x*² + *y*², *x*² + **2***y*², *x*² + **3***y*² are all examples of binary quadratic forms - with missing terms. A binary quadratic form in its full glory is an algebraic expression of the form:

ax² +bxy+cy²

It is binary because there are just two variables(*x* and *y*) involved and quadratic because the degrees of the variables add up to two in each term.

The coefficients ** a**,

**,**

*b***c**are usually integers.

A binary quadratic form is a natural mathematical object, it is bound to have interesting properties and solve some problem in the universe(talk about practical applications).

In this post, we will use the abbreviation **bqf** for binary quadratic form and denote a specific bqf by its coefficients as **(a, b, c)**. Examples:

3*x*² - 5*xy* + 4*y*² = (3, -5, 4), 7*x*² + 23*y*² = (7, 0, 23).

A bqf is said to be primitive if its coefficients **a, b, c **are relatively prime, that is, their greatest common factor is **1**. 3*x*² - 5*xy* + 4*y*² is primitive but

2*x*² + 6*xy - *4*y*² is not.

## Equivalent Forms and the Discriminant

By some algebraic manipulations(*completing the square*), a bqf can be rewritten as follows:

4a(**a***x*² + **b***xy* + **c***y*²) = (2a*x* +b*y*)² - (b² - 4ac)*y*².

The coefficient of *y*² on the right, that is **b² - 4ac**, is called the *discriminant *of the bqf* ***a***x*² + **b***xy* + **c***y*². Many properties of a bqf are tied to the value of its discriminant.

If the discriminant(b² - 4ac) is negative and either **a** or **c** is positive then the bqf will always be positive for whatever values of x and y. In such case, the bqf is said to be *positive* *definite. *Similarly, if the discriminant is negative and either **a** or **c** is negative, the bqf will only produce negative values, it is* negative definite.*

If the discriminant is positive, the bqf can take both positive and negative values depending on the values of x and y. In this case, the bqf is said to be *indefinite*.

You may want to note that the expression **b² - 4ac**, for whatever values of **a**,** b**,** c**, will always yield** **either a multiple of 4 or a multiple of 4 plus 1, since the first term(**b²**) is a perfect square(and therefore is either a multiple of 4 or a multiple of 4 plus 1) and the second term(**4ac**) is clearly a multiple of 4. So, the discriminant of any bqf is restricted to numbers of the form **4 k** or

**4**, where

*k*+ 1**is an integer. Numbers like 2, 3, 6, 7, 10, 43, etc. can never be the discriminant of a bqf.**

*k*By simple symmetry we can tell the forms **x² + 7y² **and** y² + 7x² **are essentially the same, they represent the same set of numbers - they are *equivalent*. Swapping **x** and **y** does not change the structure of the form.

Though not as obvious as the previous case, the forms **3x² + 5y² **and

3**x² + 6xy + 8y² **are also equivalent, ’cause **3x² + 6xy + 8y² **= **3(x + y)² + 5y² **=

**3X² + 5Y²**, letting

**X = x + y, Y = y**.

If you calculated the discriminants, you would find that they are equal for the forms in each case. In the first case, the discriminant of both forms is **-28**. In the second case, **-60**. The discriminant is an *invariant* property of equivalent forms.

We applied the transforms `(`

and **x**, **y**) 🠆 (**y**, **x**)`(`

to two bqf’s to get new bqf’s which turned out to be equivalent to the original bqf’s. In general, we can apply the following linear transform to get a new bqf: **x**, **y**) 🠆 (**x** + **y**, **y**)`(`

. Where A, B, C, D are integers.**x**, **y**) 🠆 (A**x** + B**y**, C**x** + B**y**)

Let’s transform **f**(x, y) = ax² + bxy + cy² to get a new bqf **g**(x, y).

f(x, y) 🠆g(x, y):

ax² + bxy + cy² 🠆 a(Ax +By)² + b(Ax +By)(Cx +Dy) + c(Cx +Dy)².Expand and simplify the expression on the right of the arrow to get:g(x, y)=(aA²+ bAC + cC²)x² +(2aAB + bAD + 2cCD + bBC)xy +(aB²+ bBD + CD²)y².Let the discriminants off(x, y) andg(x, y) be D(f) and D(g) respectively, then D(f) =b² - 4ac.Likewise,D(g) can be calculated from the expression forg(x, y) above, but it's a bit complicated.If you are able to work out the expression for D(g), you will find there is a relationship between D(f) and D(g) as follows:D(g) = (AD - BC)²D(f).Forf(x, y) andg(x, y) to be equivalent, their discriminants have to be equal, i.e., D(g) = D(f). That means we have to set

(AD - BC)²= 1. This implies (AD - BC) = 1 or (AD - BC) = -1.

So, two bqf’s **f**(x, y) and **g**(x, y) are equivalent if(and only if) **g**(x, y) can be obtained from **f**(x, y) by a linear transform:`(`

such that (AD — BC) = 1 or (AD — BC) = -1.**x**, **y**) 🠆 (A**x** + B**y**, C**x** + B**y**)

When (AD — BC) = 1, **f**(x, y) and **g**(x, y) are said to be *properly equivalent*, otherwise, *improperly equivalent*.

## Reduced Forms

Having the same discriminant is a neccessary but not sufficient condition for two forms to be equivalent, for example **x² + 5y² **and 2**x² + 2xy + 3y² **have the same discriminant but they are not equivalent because **x² + 5y² **represents **1** when x = 1, y = 0, but 2**x² + 2xy + 3y² **can never be equal to **1** for whatever values of x and y.

It is not always easy or even advisable to check equivalence by brute inspection. To tell if two forms(with the same discriminant) are equivalent or not, we either find a way to determine the transform `(`

that transforms one into the other and check if |(AD — BC)| = 1 or we check if their **x**, **y**) 🠆 (A**x** + B**y**, C**x** + B**y**)*reduced forms *are the same.

*Reduction* is the operation that transforms a bqf into a basic, primal form that is equivalent to the original bqf. Every positive definite bqf can be transformed into a unique reduced form to which it is properly equivalent. We will not discuss the reduction algorithm now, may be in a future post, we just want to point out that bqf’s with the same discriminant are not neccessarily equivalent.

A (positive definite) form **(a, b, c)** is said to be *reduced* if |b| ≤ a ≤ c and b ≥ 0 if either |b| = a or a = c, where |b| stands for the absolute value of b, that is, b without a minus sign. You can check with the definition that the following forms are reduced: (**1, 0, 1**), (**3, 2, 3**), (**3, -2, 4**).

There is an infinitude of bqf’s with any given discriminant D, and we can “construct” as many as we wish by choosing appropriate values for the coefficients a, b, c.

When D = -12, for example, we can have the following bqf’s:

(14, 10, 2), (1, 12, 39), (4, 22, 31), (199, 28, 1), (4, 14, 13), (14, 10, 2).

Their corresponding reduced forms are:

(2, 2, 2), (1, 0, 3), (1, 0, 3), (1, 0, 3), (1, 0, 3), (2, 2, 2).

Just the two reduced forms (2, 2, 2) and (1, 0, 3) shows up, and these are the only reduced forms for D = -12, even if we generated a billion bqf’s - they are all going to fall into either of the two *equivalence classes*. We say the *class number* of -12, **h**(-12) is 2. It can be proved that the class number for any D is finite.

With D = -4, there’s just one equivalence class. All bqf’s with discriminant -4 can be reduced to (**1, 0, 1**). The class number of -4, h(-4), is 1.

## Representing Numbers

Through a series of propositions we will see how the theory of binary quadratic forms we’ve developed so far helps with the question of representing numbers, especially prime numbers. First, we will revise some relevant basic concepts.

**Congruence**. In modular arithmetic, two quantities, say **x** and **y**, are said to be congruent *modulo* a third one, say **n**, if they leave the **same** remainder when divided by **n**, or in other words, if you get an integer when you divide their difference by **n**. We use the congruence symbol(**≡**) to denote this:**x** ≡ **y** (mod **n**) or **y** ≡ **x** (mod **n**). Example: 27 ≡ 17 (mod 5).

**Quadratic Residue**. Residue is just a fancy word for “remainder”. An integer *q* is called a *quadratic residue* modulo *n* if it is congruent to a perfect square modulo *n*; i.e., if there exists an integer *x* such that ** x²** ≡

*q*(mod

**).**

*n***Legendre Symbol**. The Legendre symbol **( a/p) **indicates whether

**is a quadratic residue modulo**

*a***or not. It is equal to**

*p***1**if

**is a quadratic residue modulo**

*a***, and**

*p***-1**if not. If

**divides**

*p***, the symbol is set to**

*a***0**.

We say an integer **m** is represented by a form *f(x, y)* if there exist some integers ** x₀** and

**such that**

*y₀**f(*

*x*₀*,*

*y*₀*)*=

**m**. Moreover, if

**and**

*x*₀**have no common factor, then we say that**

*y*₀**m**is

*properly*represented by

*f(x, y)*.

Example: x

**²**+ 7y

**²**properly represents 53 at x = 5, y = 2.

**Proposition 1**. A form *f(x, y)* properly represents an integer **m** if and only if *f(x, y)* is properly equivalent to the form **m**x**²** + Gxy + Hy**²**, where G, H are arbitrary integers.

Obviously, **m**x**²** + Gxy + Hy**²** represents **m** with x = 1, y = 0, but the proposition makes two strong assertions that we are going to state and prove separately.

**Assertion (i)**. If a form *f(x, y)* properly represents **m**, then it must be equivalent to **m**x**²** + Gxy + Hy**²**.

**Proof**: If *f(x, y)* =* *ax² + bxy + cy² properly represents **m**, then there exist some integers, say, A and C, with no common factor, such that *f*(A, C) =** m. **And since A and C have no common factor(their greatest common divisor is 1), by the Bezout identity, we can find two integers, say, B and D, such that AD — BC = 1.

Apply the linear transform (**x**, **y**) 🠆 (A**x** + B**y**, C**x** + B**y**) to *f(x, y)* as detailed in the grey box above, we have:

f(A**x** + B**y**, C**x** + B**y**) = (aA**²** + bAC + cC**²**)**x² + **(2aAB + bAD + 2cCD + bBC)**xy + **(aB**²** + bBD + CD**²**)**y² **= **mx² + Gxy + Hy²**, since *f*(A, C) =** m **=** **aA**²** + bAC + cC**², **and we can set G = 2aAB + bAD + 2cCD + bBC, H = aB

**²**+ bBD + CD

**².**Therefore,

*f(x, y)*is equivalent to

**m**x

**²**+ Gxy + Hy

**²**.

**Assertion (ii)**. If a form *f(x, y) *is equivalent to **m**x**²** + Gxy + Hy**² **then it *properly* represents **m**.

**Proof**: Since *f(x, y) *is equivalent to **m**x**²** + Gxy + Hy**², **there is a linear transformation (**x**, **y**) 🠆 (A**x** + B**y**, C**x** + B**y**) such that *f(*A** x** + B

*y**,*C

**+ B**

*x*

*y**) =*

**m**x

**²**+ Gxy + Hy

**²**. With

**x = 1, y = 0, we have**

*f(*A

*,*C

*) =*

**m**. So

*f(x, y)*represents

**m**at the point (A, C)

*.*For the “properly” part, remember A, B, C, D are supposed to be integers that satisfy (AD — BC) = 1 or (AD — BC) = -1. Since AD — BC = 1, by the Bezout identity again, A and C have no common factor(their gcd = 1). Proof completed.

So, Proposition 1 is proved but it does not seem to help much with the representation problem. Actually, it is a lemma, we need it just to prove Proposition 2, which is the main theorem.

**Proposition 2**. Let **m** be an integer having no common factor with another integer **d**. Then **m** is properly represented by a primitive form of discriminant **d** if and only if **d** is a quadratic residue modulo 4**m**.

With proposition 2, to see if a bqf represents a number **m**, it seems we just have to check that it’s discriminant **d**, is a quadratic residue modulo 4**m**. But let’s prove it first before we go on.

**Proof**: Suppose that **m** is properly represented by a bqf, *f(x, y)*, with discriminant **d**, then by Proposition 1, *f(x, y)* must be equivalent to the bqf **m**x**²** + Gxy + Hy**²**, with discriminant, **d** = G**² - **4**m**H, if we rewrite **d** as **-**4**m**H + G**², **we can clearly see that **d** divided by 4**m** leaves a remainder of G**²**, a perfect square, therefore **d** is a quadratic residue modulo 4**m**.

On the other hand, let **d** be a quadratic residue modulo 4**m**, that means there exists an integer, say G, such that G**² **≡ **d** **(**mod **4m) **which implies there is an** **integer, say H, such that (G²- **d**)/4m = H, or, **d** = G² - 4mH. We see that **d** = G² - 4mH is the discriminant of the bqf **m**x**²** + Gxy + Hy**² **which properly represents **m**. QED.

Now, let’s apply this proposition to the question of which primes are represented by the form x² + **n**y². The discriminant is -4n. According to the proposition, a primitive form of discriminant **-4n** will properly represent a prime **p**, where **p** is not a factor of **n**, if (and only if) **-4n** is a quadratic residue modulo 4**p**, that is if (and only if) the Legendre symbol (**-4n**/**4p**) = 1, or what is the same thing (**-n**/**p**) = 1, we just need to find a way to determine such primes **p** that satisfy the condition. But there is a snag - there could be other bqf’s apart from x² + **n**y² that have the discriminant **-4n**, we’ve already seen that it is possible for non-equivalent bqf’s to have the same discriminant. When n = 5, for example, we have x² + 5y² with discriminant -20. 2x² + 2xy + 3y² also has a discriminant of -20, but the two forms are not equivalent, so which of them does Proposition 2 apply to?

When n = 1, we have x² + y² with discriminant **-4**, and it happens to be the *only* bqf(in reduced form) with discriminant -4, so we can use Proposition 2 on it with certainty; primes that can be written in the form x² + y² are the primes that satisy the condition (**-1**/**p**) = 1. With the help of the law of quadratic reciprocity, we find such primes are primes that satisfy the congruence **p** ≡ 1 (mod 4), e.g. 5, 13, 17, 29.

With n= 2, we have x² + **2**y² having discriminant **-8**. It is the only bqf too with that discriminant, so we look for primes **p**’s that satisfy (**-2**/**p**) = 1 using the law of quadratic reciprocity and get **p** ≡ 1 (mod 8), **p** ≡ 3 (mod 8), e.g 11, 17, 19, 41.

When n = 3, we have x² + **3**y² with discriminant **-12**. 2x² + 2xy + 2y² also has discriminant -12 and it does not belong to the same equivalence class with

x² + **3**y², there seems to be a snag here, not quite. Note that 2x² + 2xy + 2y² is not a primitive form, so we can ignore it since Proposition 2 is about primitive forms. Proceeding as before, we find primes of the form x² + **3**y² to be **p** = 3, **p** ≡ 1 (mod 3).

So, we can use Proposition 2 (together with the law of quadratic reciprocity) to determine the primes represented by x² + **n**y² only when **n** makes the class number of the discriminant equal to 1, that is, when **h**(**-4n**) = 1. For other values of **n** we need to develop the theory further. Stay tuned for future posts.

In closing, I will like to say the theory of binary quadratic forms is a very beautiful theory with connections to other branches of number theory, and for those interested in “practical value”, it has applications in cryptography.

## Challenges

- Can you write the general expression for:

i) a ternary(3 variables) quadratic form.

ii) a binary cubic(degree 3) form? - Can you tell if the equation 2
*x*² - 6*xy*+ 5*y*² = 43 has a solution in integers or not. Give reason for your answer. - Can you write an algorithm that takes as an input a number(should be a multiple of 4 or a multiple of 4 plus 1) and outputs random bqf’s with the input as determinant.

## Further Study

Cox, David A. *Primes of the Form x*² + n*y*²:* Fermat, Class Field Theory, and Complex Multiplication. *John Wiley and Sons, Inc., Hoboken, NJ, Second edition, 2013.