Why can’t we use a t-stat Confidence Interval in the German Tank problem?

Sai Krishna Dammalapati
2 min read · Feb 26, 2024


The German Tank problem is a classic wartime application of statistics. The Allied forces captured a few German tanks, each carrying a serial number, and wanted to estimate the total number of tanks Germany had produced from that sample. Say the captured serial numbers are 19, 40, 42 and 60. Assuming the tanks are numbered 1, 2, ..., N, you have to estimate N, the population maximum (the last serial number).

I was wondering why I can’t use t-stats to build a Confidence Interval for the population maximum. Its upper limit would be the maximum number of tanks Germany could have produced. That approach gave me an estimate of 78 tanks.

But there is a flaw in that approach. Can you spot it?

Confidence Interval estimation with t-stats relies on the Central Limit Theorem (CLT). But is the CLT applicable here? Is the sampling distribution of the maximum a normal distribution?

No!

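The CLT applies to the sampling distribution of means and sums, but not of maximums. A sample maximum can never exceed N, so its sampling distribution is bounded above and skewed rather than bell-shaped. So confidence interval estimation using t-stats is not the correct approach.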
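To make this concrete, here is a small simulation sketch. It assumes a true N of 74 (picked only for illustration), draws samples of 4 serial numbers without replacement from 1..N, and looks at the shape of the resulting sample maximums. The seed and the number of trials are arbitrary choices of mine.

```python
import random
import statistics

N = 74            # assumed true number of tanks (illustrative only)
k = 4             # sample size, same as in the example
TRIALS = 20_000   # number of simulated samples (arbitrary)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw k distinct serial numbers from 1..N and record the largest one.
maxima = [max(rng.sample(range(1, N + 1), k)) for _ in range(TRIALS)]

mean_max = statistics.fmean(maxima)
sd_max = statistics.pstdev(maxima)

# Sample skewness: third central moment divided by sd cubed.
skewness = sum((x - mean_max) ** 3 for x in maxima) / (TRIALS * sd_max ** 3)

print(f"mean of sample maximums: {mean_max:.1f}")
print(f"largest maximum seen:    {max(maxima)} (N = {N})")
print(f"skewness:                {skewness:.2f}")
```

A normal distribution has skewness 0 and no upper bound. Here the skewness should come out clearly negative and no maximum goes above N, because the values pile up against that ceiling. The average maximum also sits well below N (around 60 for these settings), so the raw sample maximum underestimates N.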

The Allied forces solved this problem using a point estimate. Basically, they used the formula N = m + (m - k)/k (where k is the sample size and m is the sample maximum). Intuitively, it adds the average gap between the observed serial numbers to the sample maximum. This is the Minimum Variance Unbiased Estimator (MVUE) of N.

With the sample 19, 40, 42 and 60 (m = 60, k = 4), N would be 60 + (60 - 4)/4 = 74.
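The same arithmetic as a minimal Python sketch. The function name estimate_total_tanks is my own, not something from the original analysis, and it assumes serial numbers start at 1 with no repeats in the sample.

```python
def estimate_total_tanks(serials):
    """Point estimate N = m + (m - k)/k from a sample of serial numbers."""
    k = len(serials)   # sample size
    m = max(serials)   # sample maximum
    return m + (m - k) / k


sample = [19, 40, 42, 60]
estimate = estimate_total_tanks(sample)
print(estimate)  # 74.0
```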

It is said that the Allied forces made precise estimates of German tank production using this estimator, and that it helped them win the war.

If you’re interested in the proof of this estimate, check out this link: https://www.youtube.com/watch?v=quV-MCB8Ozs

