Introduction to Beamforming: Part 3

How to derive and implement the Capon Beamformer

Isaac Berrios
10 min read · Aug 26, 2024

In this post, we will derive and implement the Capon Beamformer, an Array Processing technique that allows us to estimate the Angle of Arrival of incoming signals and even reconstruct them with low distortion. This technique is commonly found in Radar, Seismology, Sonar, and any application that uses a network of sensors. This post is part of a series that introduces Beamformers.

Background

Beamforming is an Array Processing technique used to spatially filter signals arriving from different directions. We saw this in the Bartlett Beamforming tutorial, where we were able to recover the desired signal based on its Direction of Arrival, even though the signal was heavily distorted by interference in the Time and Frequency domains.

Adaptive Beamforming uses statistics of the current signal environment to compute weights, and these weights enable us to recover the signal. The signal model is the same as the one in the Bartlett tutorial, but here’s a quick recap: we assume a Uniform Linear Array (ULA) with N elements that receives T time samples.
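Written out, here is a sketch of that model (using the same notation as the Bartlett post and consistent with the steering_vector helper in the code below, where the element spacing d is in wavelengths):

\mathbf{x}(t) = \sum_{k=1}^{D} \mathbf{v}(\theta_k)\, f_k(t) + \mathbf{n}(t), \qquad \mathbf{v}(\theta) = \left[\, 1,\; e^{-j2\pi d \sin\theta},\; \dots,\; e^{-j2\pi d (N-1)\sin\theta} \,\right]^{T}

Stacking the T snapshots column-wise gives the received matrix X = VF + N, where V holds the D steering vectors as columns and N is the noise matrix.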

Capon Beamformer

In this section we will derive the Capon Beamformer, which is commonly referred to as the Minimum Variance Distortionless Response (MVDR) Beamformer. This estimator seeks to minimize the variance of the processed signal. We will explore how it does this by framing the problem from the ideal beamformer, establishing constraints, and formulating an Optimization problem that yields this adaptive technique.

Distortionless Constraint

Let’s recall the general Beamformer w: we process the received signal X with a matrix multiplication.
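In symbols (a sketch in the notation above, where H denotes the conjugate transpose):

\hat{y}(t) = \mathbf{w}^{H}\mathbf{x}(t), \qquad \hat{\mathbf{y}} = \mathbf{w}^{H}\mathbf{X}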

We can impose a constraint on w such that the resulting Array Output ŷ is exactly equal to the original signal f.

Since the recovered signal is exactly equal to the original (i.e. no distortion), this is called the Distortionless Response. Now, since the signal component of X arrives along the true steering vector v(θₛ), this condition amounts to requiring that the inner product between w and v(θₛ) equals 1. We can express this constraint with the following expression.
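For a single source arriving from θₛ, a sketch of that constraint is:

\hat{y}(t) = \mathbf{w}^{H}\mathbf{v}(\theta_s)\, f(t) = f(t) \quad\Longleftrightarrow\quad \mathbf{w}^{H}\mathbf{v}(\theta_s) = 1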

This is our Distortionless Constraint for our Beamformer w. It essentially does two things. First, it forces our Beamformer to be steered in the direction of the true signal by forcing the energy of the Beamformer response to be high in this direction. Second, it forces the energy of the response to be low in all other directions, which is sometimes called steering a null in all other directions. The drawback is that our Beamformer is intended to work for a single signal, since we are only considering a single signal in our model. Now let’s turn our attention to the output of this Beamformer so that we can construct the objective function.

Distortionless Response of a Noisy Signal

Let’s incorporate the Distortionless Response into the General Beamformer and inspect its response to a noisy signal X = Xₛ + N, where N is zero-mean Additive White Gaussian Noise (AWGN).
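Applying the constraint, a sketch of the output is:

\hat{y}(t) = \mathbf{w}^{H}\left(\mathbf{x}_s(t) + \mathbf{n}(t)\right) = \underbrace{\mathbf{w}^{H}\mathbf{v}(\theta_s)}_{=\,1} f(t) + \mathbf{w}^{H}\mathbf{n}(t) = f(t) + Y_n(t)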

We have the original signal f plus the processed noise term Yₙ, and this extra noise term will cause errors in our estimates. We need to find a way to minimize this noise in order to retain an accurate estimate. The power of the noise is characterized by its variance, so a proper way to state our goal would be to minimize the variance of the noise. But there’s a problem: we don’t get our signal and noise separately, we get them together in a package deal. So we really need to minimize the variance of the total response ŷ (i.e. the term we have access to) so that our estimate is accurate. Now let’s piece this together: our response is assumed to be distortionless without the presence of noise, so we will minimize the variance of the distortionless response.

The Objective Function

Since we are minimizing the variance of the Distortionless response, let’s see what this variance term actually looks like. Below, we simplify the variance of the Distortionless Response ŷ.
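Here is a sketch of that simplification:

\mathrm{var}(\hat{y}) = E\left[\lvert \hat{y}(t) \rvert^{2}\right] = E\left[\mathbf{w}^{H}\mathbf{x}(t)\,\mathbf{x}^{H}(t)\,\mathbf{w}\right] = \mathbf{w}^{H}\, E\left[\mathbf{x}(t)\,\mathbf{x}^{H}(t)\right] \mathbf{w} = \mathbf{w}^{H}\mathbf{R}_{xx}\,\mathbf{w}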

In the simplification, we have used the fact that f and Yₙ are zero mean, and after some shifting around we get the same expression as the Bartlett Beamformer, where Rₓₓ is the (N×N) Hermitian sample correlation matrix and w is the Beamformer weighting that will produce the recovered signal. The difference is that this expression is now our objective function, which we will minimize subject to the Distortionless Constraint.

The Math part

Now we will derive the Capon Beamformer. This part is math heavy, so you can skip to The Result if you would like; all the important intuition is contained in the previous sections. Here’s the Optimization Problem:
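In the notation above, that is:

\min_{\mathbf{w}} \; \mathbf{w}^{H}\mathbf{R}_{xx}\,\mathbf{w} \quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{v}(\theta_s) = 1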

We use Lagrange multipliers to handle the Distortionless Constraint, but we need to remember that the Lagrangian is a real expression even though the Lagrange multipliers themselves are complex (due to our constraint being complex). To handle this, we exploit the property Re[2z] = z + z*, where z is a complex number. Here’s the expanded objective function with the complex Lagrange Multipliers.
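Here is a sketch of that Lagrangian; the 1/2 scaling of the multiplier is an assumption on the convention, chosen so that the −λ/2 term below appears naturally:

J(\mathbf{w}) = \mathbf{w}^{H}\mathbf{R}_{xx}\,\mathbf{w} + \frac{\lambda}{2}\left(\mathbf{w}^{H}\mathbf{v}(\theta_s) - 1\right) + \frac{\lambda^{*}}{2}\left(\mathbf{v}^{H}(\theta_s)\,\mathbf{w} - 1\right)

The two constraint terms are complex conjugates of each other, so their sum is real (this is the Re[2z] = z + z* property at work) and the Lagrangian is a real-valued function.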

This formulation follows Statistical Signal Processing Vol 1 (ch 15.6), but I’ve skipped some steps and jumped straight to the Lagrange multipliers; see that book for the proper extension of Lagrange Multipliers to complex numbers.

Fun Fact: the derivative with respect to a Complex Variable used here is known as the Wirtinger Derivative.

Now let’s solve this thing: we take the partial derivative with respect to w and set the expression equal to zero.
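Here is a sketch of that step, written with the Wirtinger derivative taken with respect to w* (an equivalent convention that keeps the algebra clean):

\frac{\partial J}{\partial \mathbf{w}^{*}} = \mathbf{R}_{xx}\,\mathbf{w} + \frac{\lambda}{2}\,\mathbf{v}(\theta_s) = \mathbf{0} \quad\Longrightarrow\quad \mathbf{w} = -\frac{\lambda}{2}\,\mathbf{R}_{xx}^{-1}\mathbf{v}(\theta_s)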

The term -λ/2 is a complex scalar value; now we use the constraint to solve for it:
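Substituting w back into the constraint (a sketch):

\mathbf{v}^{H}(\theta_s)\,\mathbf{w} = -\frac{\lambda}{2}\,\mathbf{v}^{H}(\theta_s)\,\mathbf{R}_{xx}^{-1}\mathbf{v}(\theta_s) = 1 \quad\Longrightarrow\quad -\frac{\lambda}{2} = \frac{1}{\mathbf{v}^{H}(\theta_s)\,\mathbf{R}_{xx}^{-1}\mathbf{v}(\theta_s)}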

The Result

Now plug in the Lagrange Multiplier and we get our final expression for the MVDR (Capon) Beamformer:
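In the notation above:

\mathbf{w}_{capon} = \frac{\mathbf{R}_{xx}^{-1}\,\mathbf{v}(\theta_s)}{\mathbf{v}^{H}(\theta_s)\,\mathbf{R}_{xx}^{-1}\,\mathbf{v}(\theta_s)}

This matches the wc computation in the code below.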

The (N×1) weight vector w estimates the signal at a given direction. We can estimate the Power by plugging the Beamformer into the Power Expression (exploiting the Variance-Power relationship as we did with the Bartlett Beamformer).
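Plugging w into wᴴRₓₓw, the denominator collapses and the Power estimate at a steering angle θ becomes:

P_{capon}(\theta) = \mathbf{w}^{H}\mathbf{R}_{xx}\,\mathbf{w} = \frac{1}{\mathbf{v}^{H}(\theta)\,\mathbf{R}_{xx}^{-1}\,\mathbf{v}(\theta)}

This is the pc term computed in the loop below.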

Now that we have the final expressions for the Beamformer, let’s see some examples so we can learn how this behaves.

Python Example

This will be very similar to the Bartlett Beamformer code; in fact, the code for both the Bartlett and Capon Beamformers is contained in the same notebook, and most of this code is repeated from part 2. Let’s get started with some helper functions; the full notebook is on GitHub.

import numpy as np

# generates a steering vector with default spacing of 0.5 wavelength
steering_vector = lambda theta, N, d=0.5 : np.exp(-2j * np.pi * d * np.arange(N) * np.sin(theta))

# convert between dB and Watts
db2watt = lambda db : 10**(db/10)
watt2db = lambda watt : 10*np.log10(watt)

# ref: https://github.com/morriswmz/doatools.py/blob/master/doatools/utils/math.py
def randcn(shape):
    """Samples from complex circularly-symmetric normal distribution.

    Args:
        shape (tuple): Shape of the output.

    Returns:
        ~numpy.ndarray: A complex :class:`~numpy.ndarray` containing the
        samples.
    """
    x = 1j * np.random.randn(*shape)
    x += np.random.randn(*shape)
    x *= np.sqrt(0.5)
    return x

Here are the model parameters that we will input; once again, we determine the noise power level by subtracting the SNR from the mean source power in dB.

# input parameters
SNR = 15.0 # assume all sources have the same snr (max: 100, min: 0)
T = 10000 # number of samples
M = 10 # number of array elements
D = 3 # number of source signals
THETA_DEG = [-25, 25, 30] # Angle of Arrival
SOURCE_POWER = np.array([20, 17, 23]) # power in dB

# derived parameters
SOURCE_POWER_WATT = db2watt(SOURCE_POWER)
NOISE_POWER_DB = SOURCE_POWER.mean() - SNR
NOISE_POWER_WATT = db2watt(NOISE_POWER_DB)
SOURCE_THETAS = [theta*np.pi/180 for theta in THETA_DEG] # convert to radians

Now for the signal model.

# construct steering matrix (one column per source)
V = np.vstack([steering_vector(theta, M) for theta in SOURCE_THETAS]).T

# get random source signals
C = np.eye(D)*SOURCE_POWER_WATT # source covariance
F = C @ randcn((D, T))

# Complex Circular AWGN
N = NOISE_POWER_WATT * randcn((M, T))

# construct array response
X = V @ F + N

We have constructed three signals, two of which have closely spaced Angles of Arrival, to demonstrate the abilities of the Capon Beamformer. The signals are plotted below: the left shows a single source signal and the right shows the array response X from all source signals and noise. We will use Capon Beamforming not only to estimate the direction of arrival, but also to recover the Source Signals.

Figure 1. Left: Single Source Signal. Right: Array Response from all three source signals. Source: Author.

Note: Our model specifies zero-mean random vectors as source signals, but we could use sinusoids instead. In part 4, we will see how to generate and resolve sinusoidal signals.

Just like the Bartlett method, we will iterate through a vector of different angle values and compute the Power response and recovered signal at the current angle. The angle with the highest power response is the most likely signal direction, and we can also obtain the recovered signal if needed. But before we do this, let’s compute the inverse of the signal sample correlation matrix Rₓₓ and store it (we add a small diagonal loading term before inverting for numerical stability).

Rxx = (X @ np.conj(X).T) / X.shape[1]  # sample correlation matrix
Rxx_inv = np.linalg.pinv(Rxx + 1e-6*np.eye(Rxx.shape[0]))  # invert with small diagonal loading
Rxx.shape, Rxx_inv.shape

Now let’s iterate through the angles and see what we get:

# collection of angles to process
thetas = np.arange(-90, 90 + 0.1, 0.1)

outputs = {"bartlett" : [], "capon" : []}
responses = {"bartlett" : [], "capon" : []}

for _theta in thetas:
    _theta *= np.pi/180
    v = steering_vector(_theta, M) # Steering Vector

    # Bartlett Beamformer
    pb = np.conj(v)[:, None].T @ Rxx @ v
    wb = v / M # use normalized steering vector

    # Capon Beamformer
    pc = 1/(np.conj(v)[:, None].T @ Rxx_inv @ v)
    wc = (Rxx_inv @ v)[:, None] * pc

    # optional: estimate the actual signal
    yb = (wb.conj().T @ X)
    yc = (wc.conj().T @ X)

    # append to lists
    outputs["bartlett"].append(yb.squeeze())
    outputs["capon"].append(yc.squeeze())

    # convert to power (10log10(x^0.5) = 5log10(x))
    responses["bartlett"].append(5*np.log10(np.abs(pb)))
    responses["capon"].append(5*np.log10(np.abs(pc)))

# normalize
responses["bartlett"] -= np.max(responses["bartlett"])
responses["capon"] -= np.max(responses["capon"])

# obtain angle that gave us the max value
angle_idx_b = np.argmax(responses["bartlett"])
aoa_b = thetas[angle_idx_b]
s_hat_b = outputs["bartlett"][angle_idx_b]

angle_idx_c = np.argmax(responses["capon"])
aoa_c = thetas[angle_idx_c]
s_hat_c = outputs["capon"][angle_idx_c]

Estimated Angle of Arrival

Here’s the comparison between Bartlett and Capon for Angle of Arrival Estimation:

Figure 2. Spatial Spectrum Outputs from the Bartlett and Capon Beamformer. Source: Author.

Notice how the Capon Beamformer has a much lower noise level and no visible side lobes. It is also able to resolve the two closely spaced signals arriving at 25° and 30°, while the Bartlett Beamformer is unable to do so. The Signal to Noise Ratio (SNR) is set to 15 dB; if it drops too low, the two closely spaced signals will no longer be resolved by the Capon Beamformer. Try changing the input parameters to see how the outputs of the Beamformers differ; I recommend changing one at a time.

Recovered Signals

Here is a plot of one of the recovered signals; check the others in the notebook. All signals should be recognizable if the SNR is high enough.

Figure 3. Recovered signals. Left: Bartlett, Right: Capon. Source: Author.

In general, a Beamformer is a Spatial filter that allows us to distinguish and recover signals that arrive from different directions. Let’s check the MSE performance and see the residuals (deltas between the true and recovered signals).

# source_num selects which source signal to compare against and plot_samples limits
# how many samples are compared; both are defined earlier in the notebook
bartlett_res = F[source_num, :plot_samples].real - outputs["bartlett"][angle_idx_b].real[:plot_samples]
bartlett_mse = np.mean((bartlett_res)**2)

capon_res = F[source_num, :plot_samples].real - outputs["capon"][angle_idx_c].real[:plot_samples]
capon_mse = np.mean((capon_res)**2)

bartlett_mse, capon_mse
# 856.8309654845407, 13.906399082018092
Figure 4. Residuals between the original and recovered signals. Source: Author.

Parting Thoughts

We have introduced the Capon Beamformer, a powerful Array Processing algorithm that allows us to accurately estimate the angle of arrival and reconstruct signals with minimal distortion. Compared to the Bartlett Beamformer, it provides higher angular resolution that enables us to distinguish closely spaced signals, at the cost of more computation due to the matrix inversion. In part 4, we will provide a more hands-on tutorial using sinusoidal signals.
