My God, It’s Full of Stars (1/7) — The Astrophysics of Star Clusters Using Python and C

Data Science Filmmaker
9 min readNov 15, 2023

--

Taking a short break from movie-making to document a project I created a few months ago. I did this mostly for fun and for learning purposes, though it may end up being of use to my former astronomy research group.

Source: https://science.nasa.gov/mission/hubble/science/universe-uncovered/hubble-star-clusters/

The bulk of my PhD dissertation in Astronomy involved using white dwarfs to determine the ages of star clusters. Star clusters are groups of stars that were all born out of the same cloud of gas and dust at approximately the same time. They are therefore the same age, have the same chemical composition, and are located the same distance from Earth. The only thing that changes from one star to another is the star’s mass, which determines the star’s color and brightness.

White dwarfs are what is left over after most stars use up all of their nuclear fuel. They are embers, no longer generating any heat of their own and slowly cooling down as the millennia pass. If you can determine how cool they are now, you can determine how long they have been cooling, and thus how old they are. If you have a bunch of white dwarfs in a star cluster, you can figure out the age of the cluster.

My dissertation used a Bayesian Markov-Chain Monte Carlo method to simultaneously fit the age, chemical composition, distance, masses of each star in a cluster (as well as a handful of other physical parameters that aren’t as relevant here). The details of this process will be covered in a later post, but what’s most salient here is that this is a very processor-intensive method. As such, and given that I was working on a single-core computer made in 2005, I did most of my coding in C. The inputs and outputs were text-based and handled with bash scripts through a terminal. Once all the data had been generated, it was post-processed, analyzed, and plotted in R.

It was always my ambition to create a more intuitive graphical user interface that automated a lot of the command-line nonsense and allowed the user to monitor the progress of the code in real time and even make adjustments to parameters on the fly. I graduated long before that became a realistic possibility, but my group continued to utilize and improve the code in the decade since I left the field for good.

When it came time to teach myself Python and start building a data-science portfolio, this long-lost ambition seemed like a perfect project. So I spent a couple weeks on it with the intention of providing it to my former advisor if he thought it would be useful.

In this post, I will give you a rough primer on some relevant concepts in astrophysics. Future posts in this series will go through interesting bits of coding that I came across along the way, with a focus in particular on getting my new Python code to talk to my old C code.

But first, the astrophysics:

One of the most common diagnostic tools used by astronomers is the “color-magnitude diagram” (CMD). It is a plot of the brightness of a star vs its color. The CMD for a cluster of stars looks something like:

Source: https://apod.nasa.gov/apod/ap010223.html

Note that the brighter a star is, the lower (numerically) its magnitude, which is why the Y-axis is inverted. Bright (lower number) is at the top. Faint (higher number) is at the bottom. Color goes from bluer to redder, left to right.

For a simulated cluster, this diagram is relatively simple:

Since the age, chemical composition, and distance from earth are the same for every star in a cluster, the primary factor that determines where a star will land on the CMD is the star’s mass. More massive stars are brighter and bluer (upper left). Less massive stars are redder and dimmer (lower right). They form a curvy line called the “main sequence”.

In addition to the main sequence, we also see the white dwarf sequence. White dwarfs are the remnants left over after most stars use up all of their nuclear fuel. They are hot (blue) but small (faint), so they are located in the lower left. As they cool, they get redder and fainter. Thus, they form a second sequence below and to the left of the main sequence.

The traditional way to determine the age of a star cluster is via what’s known as the “main sequence turn-off”. When a star eventually burns up all of the nuclear fuel in its core, it balloons to a much bigger size, becoming a giant. This makes it brighter (more surface area), but also cooler, so it moves up and to the right on the CMD. As the cluster gets older, the stars peel off one by one from the main sequence from top to bottom. If you can determine the mass of the biggest star that’s still on the main sequence, you can determine the age of the cluster, since the cluster must be exactly as old as the lifetime of that star.

Main sequence turnoffs for different cluster ages. Source: http://spiff.rit.edu/classes/ladder/lectures/ordinary_stars/ordinary.html

For example, a star with twice the mass of the sun is expected to live for about 1.7 billion years. If you look at a cluster of stars and see that all of the stars bigger than two solar masses have moved off the main sequence, you know that the cluster must be 1.7 billion years old.

Source: https://en.wikipedia.org/wiki/Main_sequence_turnoff

For instance, we can tell from the above figure that the cluster “NGC 188” is older than the cluster “M 67”. With more detailed astrophysics guiding us, we can determine their ages precisely (“precise” for astronomers meaning within a few million years).

Our research group took a different approach. For most stars, the giant phase does not last particularly long (again, in astronomical terms). A few million years and it will finish burning all of the fuel it can possibly wring out of its core. At that point, it throws off its outer layers in a spectacular display, creating a nebula, and leaving behind a tiny bit of itself. For really massive stars, the bit that’s left might be a black hole or a neutron star. But for most stars, what’s left behind is a white dwarf. A little stellar ember.

If you know how hot a white dwarf started out, and you know how hot it is now, with a bit of physics you can figure out how long it’s been cooling. Imagine you came across the remains of a campfire. If the coals are still glowing red, the fire must have gone out very recently. If they are black but still hot to the touch, they went out a little while ago. If they are barely warm, they must have gone out a few hours ago. Same thing with white dwarfs. (Fun fact: our galaxy is not old enough for the oldest white dwarfs to have cooled down to the universe’s “room temperature”).

For a long time, the way to determine the age of a cluster from its white dwarfs was to find the coolest white dwarfs in the cluster and figure out how old they are. Add the time that they’ve been cooling to the time it took them to become white dwarfs, and you know how old the cluster is.

The problem is that the coolest WDs are, by definition, faint. Probably the faintest things in the cluster. The fainter something is, the harder it is to get accurate observations of its brightness. Its apparent brightness is also affected by its distance from us. So if a cluster is particularly old and/or particularly far away, we may not even be able to see the faintest WDs in the cluster, let alone get accurate measurements of their colors and brightnesses. Thus, we can only get white dwarf ages for clusters that are young and/or relatively nearby. Most clusters are not either of those things. So for a long time, the main-sequence turnoff was the only means we had of measuring the ages of most clusters.

This was the motivation for our research group to find a way to get the ages of clusters using not the faintest white dwarfs, but the brightest ones. If we could successfully do that, we could determine the ages of so many more clusters, which in turn would tell us a great deal about how and where stars formed in our galaxy.

Problem is, the brightest white dwarfs are all much younger than the cluster itself. Some of them just recently even became white dwarfs. So while this can give you a good lower bound for the age of the cluster (it can’t be any younger than however long it took the stars that made these white dwarfs to live out their lifetimes), getting an accurate upper bound is much tougher.

Thankfully, there’s a bit of physics that works to our advantage. The details are beyond the scope of this post, but the gist is that we can actually determine a lot from the shape of the white dwarf sequence.

The crosses are white dwarfs in the Hyades cluster. The lines are WD sequences at different ages.

The white dwarf sequence in older clusters is higher and flatter than it is in younger clusters. The difference is subtle, and when you consider the errors in the measurements of the color and brightness of a typical cluster’s white dwarfs, you might even despair that it is possible to tease out such differences in practice. Traditional frequentist statistical techniques are indeed not up to the task (or at least, they weren’t back in 2005).

To solve the problem, our group utilized a technique in Bayesian statistics called Markov-Chain Monte Carlo, or MCMC. I’ll go into more detail about how MCMC works in future posts. The gist is that by applying stellar models to the main sequence stars and cooling models to the white dwarfs, we are able to not only determine the age, chemical composition, distance, and masses of the stars, we can recover full distributions for each of these parameters as well.

The vast majority of my 4 years of dissertation research was spent learning and developing this method, applying it to our particular problem, and writing the code to make it happen using only the single-core computer that was granted to me by the department. Hence the decision to code everything in C. (My group has since updated a lot of the code to C++, but that was after I left and they hired a professional programmer, which I am not). C is fast, small, and memory efficient, especially compared to an interpreted language like Python. Aside from assembly language, there’s not much competition in terms of speed. And speed was our primary goal.

But then I decided to become a data scientist. I don’t want to say speed doesn’t matter for data science, but kinda sorta speed doesn’t matter for data science. Development time is much more precious that run time. Code can always be productionalized later in a lower-level language.

And so, as one way to get myself up to speed in Python, I decided to realize my long-standing dream to build a graphical user interface for my dissertation code. A portion of my code involved simulating clusters from models based on inputs from the user (we used these simulated clusters to test and validate our MCMC method). So my first step was to build a fully interactive Python GUI for that simulation code. The interface ended up looking like this:

This one is not interactive because it’s just a jpg.

This interface allows the user to choose various cluster parameters (age, chemical composition, distance, the amount of noise in the data, the number of non-cluster stars contaminating the sample, the choice of stellar models to use, etc. It then outputs the results — with and without noise — to files.

In the next few installments, I will get more technical about how I got C and Python to talk to one another. I will then talk about how I built the interface itself in Python using matplotlib.

Full code can be found at https://github.com/stevendegennaro/mcmc

--

--