Taylor series: for dummies, by a dummy

Nisheet Verma
8 min read · Aug 23, 2021

--

What?

The Taylor series is a polynomial with infinitely many terms that approximates a non-polynomial analytic function.

Before we get into more details let’s try to understand why approximation is important.

Why is approximation important?

Let’s talk about two shapes with very different curve characteristics: a circle and a square. Here we will try to approximate a circle with squares by covering the whole circle with them. Note that we cover the circle in such a way that every square lies entirely within the circle; the squares can be of any size. After adding a few squares, it becomes obvious that it is impossible to cover the whole area of the circle with squares, no matter how many we add. With each added square we get closer to covering the whole circle, but we never quite get there. As a result, our approximation of the circle gets better and better but is never exact. At some point, though, the approximation becomes good enough to replace the real circle for the desired application.

In case you are wondering why we would want to replace something with its approximation (in this case, a circle with lots of squares), let me try to convince you with an example.
Imagine you were born in an era when the concept of pi had not been discovered yet, but people knew how to calculate the area of a square. One way to calculate the area of a circle would be to take lots of squares (whose areas you already know) and fit them inside the circle. Once you have counted all the squares that fit, you can add up their areas to get the approximate area of the circle. Of course, the accuracy of this approximation depends on how much of the circle is covered.
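The square-counting idea above is simple enough to sketch in a few lines. This is my own minimal illustration, not code from the text: tile a circle with small grid squares that fit entirely inside it and add up their areas.

```python
# A minimal sketch of approximating a circle's area by tiling it with
# small squares that fit entirely inside the circle.
import math  # used only to compare against the true area


def circle_area_by_squares(r=1.0, side=0.05):
    """Sum the areas of side*side grid squares lying fully inside the circle."""
    total = 0.0
    n = int(2 * r / side)
    for i in range(n):
        for j in range(n):
            x0, y0 = -r + i * side, -r + j * side  # one corner of the square
            x1, y1 = x0 + side, y0 + side          # the opposite corner
            # the corner farthest from the centre decides containment
            if max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1) <= r * r:
                total += side * side
    return total


print(circle_area_by_squares())           # an underestimate of pi * r**2
print(circle_area_by_squares(side=0.01))  # smaller squares: a better estimate
```

Since every square lies strictly inside the circle, the sum always underestimates the true area, and shrinking the squares tightens the estimate — exactly the "better and better but never exact" behaviour described above.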

So, approximating things makes our life easier at times, be it in the case above or in the case of the Taylor series.

The Taylor series is yet another approximation: a polynomial with infinitely many terms approximating a non-polynomial analytic function such as sin(x), e^x, or (x-1)^-1.

Before we move ahead, let’s try to draw some parallels here.

e^x function graph

Let’s take an example of a non-polynomial analytic function, e^x.
Like circles and squares, an exponential function and a polynomial with finitely many terms also have very different curve characteristics. One way to see this is to note that an exponential function will eventually outgrow any polynomial with finitely many terms.
Another way to look at it is that the rate of change of an exponential function is itself an exponential function; as a result, the chain of derivatives of an exponential function never ends. Unlike the exponential, the derivative chain of a polynomial with finitely many terms stops after a certain point.
For example, e^x can be differentiated endlessly and never vanishes, while the derivatives of x² become zero after the second one. If we think about it, this is quite an interesting characteristic of an exponential function: every rate of change has its own rate of change. I can now intuitively feel that the approximation of an exponential function would be similar, i.e. it will have infinitely many terms, and we will get closer to the original value with each added term but never quite reach it.
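The terminating derivative chain of a finite polynomial can be made concrete with a tiny pure-Python sketch (my own illustration): represent a polynomial as a list of coefficients and differentiate it until nothing is left.

```python
# A sketch of the "derivative chain" idea: a finite polynomial's chain of
# derivatives terminates; e^x's never does, since e^x is its own derivative.

def poly_derivative(coeffs):
    """coeffs[i] is the coefficient of x**i; return the derivative's coeffs."""
    return [i * c for i, c in enumerate(coeffs)][1:]


chain = [[0, 0, 1]]   # x**2, written as a coefficient list
while chain[-1]:      # keep differentiating until nothing is left
    chain.append(poly_derivative(chain[-1]))

print(chain)  # [[0, 0, 1], [0, 2], [2], []] -- the chain stops
```

No such loop would terminate for e^x: differentiating it just hands you e^x again, forever.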

Taylor series of f(x) = e^x

First, we will pick a point in the graph of e^x above to start approximating from. Let’s start at x=0, where f(0) = 1. So any polynomial g(x) for which g(0)=1 is a good approximation at x=0. Among the many possible polynomials, let’s begin with these two: k(x)=1 and g(x)=x+1. We are happy with both if we only care about x=0, but if we want a better overall approximation, g(x) seems better.
We can see in the graph below that g(x) is comparatively better at points other than x=0; g(2) is closer to f(2) than k(2) is.
The reason g(x) is better is that, along with being 1 at x=0, it also has the same slope at x=0 as e^x does. g(x) spoons the curve of f(x)=e^x better than k(x) does.

k(x) vs g(x)

This is like duck typing aka “If it walks like a duck and it quacks like a duck, then it must be a duck”. Our polynomial g(x) is slowly becoming a duck aka f(x).

So now we will come up with a polynomial that has the same value at x=0, the same rate of change at x=0, and the same rate of change of the rate of change (the second derivative). We get the function g(x)=1+x+x²/2!.

g(x)=1+x+x²/2!

Please notice in the picture how, along with the polynomial being exact at x=0, it is also better at other points around x=0. So in the immediate area around x=0, the approximation is getting much better, aka the duck is more recognizable.
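The improvement from each matched derivative is easy to check numerically. A quick sketch of my own, comparing k(x), the first-order g(x), and the second-order polynomial against e^x:

```python
# How each matched derivative improves the approximation of f(x) = e^x.
import math

f = math.exp
k = lambda x: 1.0                   # matches only f(0)
g1 = lambda x: 1.0 + x              # also matches the slope f'(0)
g2 = lambda x: 1.0 + x + x**2 / 2   # also matches the curvature f''(0)

for x in (0.5, 1.0, 2.0):
    errs = [abs(p(x) - f(x)) for p in (k, g1, g2)]
    print(x, errs)  # the error shrinks as more derivatives are matched
```

At every point tested, matching one more derivative at x=0 cuts the error — the duck gets more recognizable.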

We continue adding more terms. You can see how, with each term added, our approximation gets better. With each added term our polynomial spoons e^x better and better. After a few terms, our polynomial might become accurate enough to replace e^x; this is like the squares replacing the original circle from our example.
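The "keep adding terms" process is just a growing partial sum of 1 + x + x²/2! + x³/3! + … Here is a minimal implementation of my own to watch the convergence:

```python
# Partial sums of the Taylor series of e^x at 0, with a growing term count.
import math


def taylor_exp(x, n_terms):
    """Partial sum 1 + x + x**2/2! + ... with n_terms terms."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))


for n in range(1, 8):
    print(n, taylor_exp(2.0, n), math.exp(2.0) - taylor_exp(2.0, n))
# the error at x = 2 shrinks with every added term but never hits zero exactly
```

Just like the squares in the circle, each added term closes most of the remaining gap without ever closing it completely.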

This makes sense, right? The approximation makes our life easier with minimal error; it’s just smart to trade a small error for ease of computation. Please note again that, just like when filling the circle with squares, here too we always get closer to the original function but never quite reach it.

Another intuition about the Taylor series

Take the same example, f(x)=e^x. Imagine we are at f(0), i.e. at (0, 1) on the graph of e^x, and we have to get to the coordinate (2, f(2)). The only knowledge we have at the moment is that we are at (0, 1) and that the rate of change at (0, 1) is 1. With only this knowledge, we assume that 1 is the rate of change throughout the function. So we follow this rate of change, travel in a straight line, and reach (2, 3). This is the best we can do with the available knowledge.
Now imagine that at (0, 1) we also knew the rate of change of the rate of change, along with the information above. In this case, we would follow the path of the function g(x)=1+x+x²/2!. Since the chain of derivatives never stops for an exponential function, we can always take a better path to get closer to f(2), but we will never quite reach it.

I hope we now have a good enough feel for what the Taylor series is and what’s happening.

Taylor series for e^x with just three terms.

Tried something different, might be stupid

Again let’s consider the function f(x) = e^x.

New function h(x)

In the Taylor series we match the value and the subsequent derivatives at one point; in our example above, that point was x=0. While thinking about this, I had the thought: what if I match the value at x=0, the first derivative at x=1, the second derivative at x=2, and so on? I thought this might be interesting and fun to try, and the result seemed quite interesting as well. I am not sure if this is a thing or whether it has a name. If any of you readers know anything about it, please let me know in the comments below.

Notice how, instead of the values being matched at x=0, they are matched at different points.

f(0)=h(0); f’(1)=h’(1); f’’(2)=h’’(2);

Notice how g(x) (green) is more accurate around x=0
Notice how h(x) (yellow) is closer to our f(x) when zoomed out

As seen in the graph, our original Taylor series function g(x) is a better approximation around f(0). It gently spoons the curve at f(0).

If we zoom out and observe, our new function h(x) seems like a better approximation in the bigger picture (when x>0), even though it is comparatively terrible immediately around x=0. The original Taylor series g(x) and this h(x) seem to behave differently: one is more accurate around x=0, while the other is overall more accurate in one direction (notice how h(x) is again terrible when x<0).

I could deduce the following properties of the new function obtained by this method, at least in this case of e^x.

  1. It is more accurate in the direction the matching points move; in our case the points moved from 0 to 2.
  2. When the same number of terms is included, it is more accurate than the original Taylor series in that direction when looking at the bigger picture.
  3. It can be terrible around the first point of consideration, but it gets better; in our example h(x) strays from f(x) just after x=0. The strength of the original Taylor series, on the other hand, is around its point of consideration.
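The staggered matching can be worked out by hand, assuming (as the conditions suggest) that h is a quadratic h(x) = a + bx + cx². Then h(0)=f(0) gives a=1, h''(2)=f''(2) gives 2c=e², and h'(1)=f'(1) gives b=e-2c. A sketch of my own (the article shows only the graphs) comparing it with the ordinary second-order Taylor polynomial g(x):

```python
# Solve h(x) = a + b*x + c*x**2 from the staggered matching conditions:
#   h(0)   = f(0)   ->  a = 1
#   h''(2) = f''(2) ->  2c = e**2
#   h'(1)  = f'(1)  ->  b + 2c = e
import math

e = math.e
f = math.exp

c = e**2 / 2
b = e - 2 * c
a = 1.0
h = lambda x: a + b * x + c * x**2

g = lambda x: 1 + x + x**2 / 2  # ordinary 2nd-order Taylor polynomial at 0

for x in (-1.0, 0.5, 2.0):
    print(x, f(x), g(x), h(x))
# h is closer to f at x = 2 but far worse at x = -1 (and near 0),
# matching the observations above
```

The numbers bear out the deductions: h beats g at x=2 and beyond, while g stays far more accurate in the immediate neighbourhood of x=0 and for x<0.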

I am not sure whether these behaviours would be the same for other non-polynomial analytic functions, like the trigonometric functions, or what would happen if I spaced the points of consideration differently. I don’t know if there could be a real-world application of this method; maybe it could be useful where the benefits it offers outweigh its disadvantages. I will do more research and experiment with this, which I will include in the second part.
In the next part I will also try this method on smooth non-analytic functions.

I am a software engineer with an enthusiasm for mathematics. I am not a professional mathematician nor am I currently getting formally trained in maths. The only goal of writing this blog is to make the life of people who are trying to learn new things a little easier or hopefully fun and for me to understand things better. I will be more than happy to receive any feedback/criticism/corrections.
