New Year, New Beginnings

Julien Emile-Geay
Published in CyberPaleo
Feb 1, 2023 (4 min read)

It’s been a busy 2022 for LinkedEarth! So busy, in fact, that we have been delinquent on quite a few announcements:

  • We’ve kept developing Pyleoclim, releasing versions 0.9.0 and 0.10.0, and made a set of nifty tutorials about how to use it for all manner of rigorous, reproducible, and (we hope) inspiring paleoscientific analyses. For instance, as of 0.10.0, it is now ridiculously easy to produce a warming stripes version of any timeseries.
  • In February, Nicholas McKay published his first version of Time-Uncertain Data Analysis in R, showing many examples of how our code and data ecosystem play together to answer paleoscientific questions in a new and more user-friendly way.
  • In March, we hosted our third paleohackathon, welcoming 34 participants from 5 continents to learn the ropes of paleoscience in Python. This edition was focused on Python novices, and we added a set of self-paced online tutorials to get people started with the scientific Python ecosystem.
  • More substantially, we have also published a paper in Paleoceanography & Paleoclimatology on how to use Pyleoclim to do better, easier and more fun paleoscience, including time-uncertain spectral analysis, paleo-aware correlation analysis, and data-model comparisons in the frequency domain. A full blog post on the topic is available here.
  • In August, I traveled to Bergen, Norway to attend ICP14 and learn about the needs of fellow paleoceanographers so we can better serve them. There I demoed our new research hub to get one step closer to cloud-based paleoscience. The hub itself is the topic of a blog post by Deborah Khider, and our latest recruit Jordan Landers shares a few wider reflections about doing science in the cloud here. A projection glitch in the otherwise impeccable Grieghallen prevented full audience participation during the LinkedEarth demo, but ICP14 was nonetheless a great opportunity to keep current with cutting-edge paleoceanography, and to share what we have been building for the community. And many of us got to witness the elusive Green Flash during the conference dinner, a testament to the impeccable organization of this conference.
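For readers who have not met warming stripes before, here is a minimal sketch of the idea behind them. It uses only the Python standard library and is not Pyleoclim's implementation: the function name `stripe_colors`, the toy temperatures, and the simple blue-to-red colour ramp are all assumptions made for illustration. Each value is turned into an anomaly relative to the mean over a reference period, and each anomaly gets one colour.

```python
from statistics import mean


def stripe_colors(years, values, ref_period):
    """Return (anomalies, hex colours), one per sample.

    Hypothetical helper: anomalies are taken relative to the mean over
    ref_period (inclusive); blue means below that mean, red above.
    """
    start, end = ref_period
    ref = [v for y, v in zip(years, values) if start <= y <= end]
    if not ref:
        raise ValueError("no samples fall inside the reference period")
    baseline = mean(ref)
    anomalies = [v - baseline for v in values]
    # Scale by the largest departure so colours span the full ramp.
    scale = max(abs(a) for a in anomalies) or 1.0  # guard against a flat series
    colors = []
    for a in anomalies:
        x = a / scale                      # normalised anomaly in [-1, 1]
        fade = round(255 * (1 - abs(x)))   # 255 = white, 0 = fully saturated
        if x >= 0:
            colors.append(f"#ff{fade:02x}{fade:02x}")  # white to red
        else:
            colors.append(f"#{fade:02x}{fade:02x}ff")  # white to blue
    return anomalies, colors


# Toy data, invented for this example only.
years = [1900, 1950, 2000, 2020]
temps = [13.7, 13.9, 14.3, 14.9]
anoms, cols = stripe_colors(years, temps, ref_period=(1900, 1950))
for year, colour in zip(years, cols):
    print(year, colour)
```

Drawing one vertical bar per colour, in time order, gives the familiar stripes. Pyleoclim handles the plotting and colour scaling for you; see its documentation for the actual call.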

Now for 2023. First, happy new year! We have a lot in the works, including:

  • A full integration with pandas 2.0.x, which is going to enable lots of exciting new features, including more intuitive import/export with other formats (JSON, CSV/Excel, Xarray/netCDF), intuitive resampling of a series onto a new time axis, and timeseries alignment. This is made possible by a generalized representation of datetime objects in the popular Python library pandas, which was carried out by our Quansight partners and supported by our PaleoCube grant. If you decide to invest in learning Pyleoclim, one side benefit is that you’ll become an expert in the key data science library in Python, never a bad arrow to have in your quiver. For us, a side benefit of working with true professionals in this space is that we are learning a lot about how to properly develop open-source software.
  • New, more powerful and user-friendly ways to interact with Linked Paleo Data with Python: pyLipd. It’s still under active development so I’ll let Deborah Khider share her progress in a later post, but the coolest thing I’ve seen so far is the ability to carry out complex queries and access swaths of data stored in the LipdVerse (described in a future post by Nicholas McKay, perhaps as soon as 2023!), without ever downloading a single file!
  • The ability to semi-automate the remote import of data and metadata for datasets hosted by the World Data Service for Paleoclimatology (aka “NOAA Paleo”) and PANGAEA. Again, this is getting us closer to doing science in the Cloud.
  • A collection of Jupyter Notebooks that show how to do science in the emerging Python ecosystem for paleo that we are championing. Right now the collection is threadbare, but expect a lot more notebooks, and blog posts to break them down, in 2023.
  • Last but not least, we will be hosting another PaleoHackathon event. This one will introduce paleoscientists with a minimum of Python knowledge to the wonders of cloud-based research. In this “Bring Your Own Project” event, participants will bring research questions and preliminary data and methods, and collaborate with each other and the facilitators, to plant the seeds for exciting, interdisciplinary, rigorous, reproducible and generous paleoscience in the Cloud. Tell your friends and mentees, if you’ve got any!
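To make the resampling and alignment item above a little more concrete, here is a rough sketch of what putting two unevenly sampled records on a common time axis means. It is plain Python and shows neither the pandas nor the Pyleoclim interface: the helper `interp`, the toy records, and the choice of linear interpolation are assumptions made for illustration.

```python
from bisect import bisect_left


def interp(time, value, new_time):
    """Linearly interpolate (time, value) onto new_time.

    Hypothetical helper: `time` must be strictly increasing, and every
    point of `new_time` must lie inside the sampled interval.
    """
    out = []
    for t in new_time:
        if t < time[0] or t > time[-1]:
            raise ValueError(f"{t} lies outside the sampled interval")
        i = bisect_left(time, t)          # first index with time[i] >= t
        if time[i] == t:
            out.append(value[i])          # exact hit, no interpolation needed
        else:
            t0, t1 = time[i - 1], time[i]
            w = (t - t0) / (t1 - t0)      # fractional position between neighbours
            out.append(value[i - 1] + w * (value[i] - value[i - 1]))
    return out


# Two toy records with different sampling, invented for this example.
a_time, a_val = [0, 10, 20, 30], [1.0, 2.0, 4.0, 3.0]
b_time, b_val = [5, 15, 25], [10.0, 30.0, 20.0]

# A common axis restricted to the interval where both records overlap.
common = [5, 10, 15, 20, 25]
a_on_common = interp(a_time, a_val, common)
b_on_common = interp(b_time, b_val, common)
print(a_on_common)
print(b_on_common)
```

Once both records share the same axis, they can be compared sample by sample, for example in a correlation analysis. Linear interpolation is only one of several possible choices, and the right one depends on the data.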

That is all for now, but wouldn’t you say that is quite enough? Whatever you do, have fun in 2023, and do it openly.

Julien Emile-Geay
CyberPaleo

Professor of Climate Science at the University of Southern California.