Investigating Ocean Temperatures with Mathematica

Dan Bridges
Published in Parallel Thinking
5 min read · Nov 30, 2018

Mathematica is a wide-ranging program designed for a huge variety of technical computing tasks. While much of its functionality can be replicated with the open-source scientific Python stack, Mathematica’s tight integration and consistency across domains provide an elegant platform for investigative data analysis. To demonstrate its flexibility, let’s take a look at some ocean temperature data provided online by NOAA.

Importing Data

NOAA publishes historical ocean temperature data for the past year here: https://www.nodc.noaa.gov/dsdt/cwtg/all_meanT.html. This is a webpage that includes a number of tables for different coastal regions throughout the United States. Let’s import it and parse the table data:

oceanTemperatureURL = 
"https://www.nodc.noaa.gov/dsdt/cwtg/all_meanT.html";
data = Import[oceanTemperatureURL, "FullData"];

The Import function grabs all list and table elements from the webpage, so we need to find the tables we are interested in. Each of our tables contains a header row whose first cell has the label “Location”, so we’ll use that label to locate them. Since Position[data, "Location"] returns the position of the cell itself, we need to go up a few levels to grab the actual table; that’s what the Drop function does.

regionTables = 
Extract[data, Drop[#, -3] & /@ Position[data, "Location"]];
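To make the Position/Drop/Extract step concrete, here is a minimal sketch on a made-up nested list. The toy structure and its names are hypothetical stand-ins, not the real NOAA import. In this toy, dropping two indices is enough to climb from the cell to its table; the Drop[#, -3] above presumably reflects one more level of nesting in the real page.

```mathematica
(* Hypothetical stand-in for the imported page: two tables buried at different positions *)
toyPage = {
   {"intro text"},
   {{{"Location", "JAN"}, {"Montauk NY", 40}}},
   {"filler", {{"Location", "JAN"}, {"Seattle WA", 47}}}
   };

(* Each position is the full index path down to the "Location" cell *)
toyPositions = Position[toyPage, "Location"]
(* {{2, 1, 1, 1}, {3, 2, 1, 1}} *)

(* Dropping the last two indices climbs from cell -> row -> table *)
toyTables = Extract[toyPage, Drop[#, -2] & /@ toyPositions]
(* {{{"Location", "JAN"}, {"Montauk NY", 40}},
    {{"Location", "JAN"}, {"Seattle WA", 47}}} *)
```

How many indices to drop depends on how deeply Import nests each table, so it is worth inspecting one of the returned positions before settling on a number.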

regionTables is a list of tables, with each table still including its header. We only need one header for our final Dataset:

header = regionTables[[1, 1, 1]];

We then merge all of the tables, first dropping the header row from each:

dataset = Dataset[Flatten[Drop[#, 1] & /@ regionTables, 1]]

Finally, we can rename the columns in our Dataset with the header row we extracted before:

dataset = dataset[All, AssociationThread[header -> Range[Length[header]]]]
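The renaming works because AssociationThread pairs each column name with its column index, and the Dataset query then builds each row from those name-to-index rules. A small sketch of just the AssociationThread part, using a hypothetical three-column header:

```mathematica
(* Hypothetical header; the real one comes from the NOAA tables *)
toyHeader = {"Location", "JAN", "FEB"};

(* Pair each name with its column index *)
toyMapping = AssociationThread[toyHeader -> Range[Length[toyHeader]]]
(* <|"Location" -> 1, "JAN" -> 2, "FEB" -> 3|> *)
```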

We now have a Dataset object:

Processing and Cleaning the Data

Our next step is to process the data. Currently the Location column is just a text string of the city and state. We’ll use the built-in CityData to convert it to latitude-longitude coordinates. CityData expects input as a list of the form {"city", "state"}, so we need to split our location strings to match. Unfortunately, the locations do not have consistent comma placement: both “Montauk, NY” and “Kings Point NY” appear, so we cannot simply split on the comma. Instead, we remove all commas from the string, then use a regular expression with a lookahead to find the space before the state and split there. I’ve combined these operations into a function, ParseCity:

ParseCity[cityString_] :=
StringSplit[
StringReplace[cityString, "," -> ""],
RegularExpression[" (?=\\w{2}\\z)"]
]
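As a quick check, here is ParseCity applied to the two location strings mentioned above. The definition is repeated only so the snippet runs on its own.

```mathematica
(* Same definition as above, repeated so this snippet is self-contained *)
ParseCity[cityString_] :=
  StringSplit[
   StringReplace[cityString, "," -> ""],
   RegularExpression[" (?=\\w{2}\\z)"]
   ]

ParseCity["Montauk, NY"]     (* {"Montauk", "NY"} *)
ParseCity["Kings Point NY"]  (* {"Kings Point", "NY"} *)
```

The lookahead only matches a space that is followed by exactly two word characters at the end of the string, so the space inside “Kings Point” is left alone.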

Now we use ParseCity and CityData to add a new column to our dataset:

dataset = 
dataset[All, <|#,
"Coordinates" ->
CityData[ParseCity[#Location], "Coordinates"]|> &];

Our final operation is to remove missing data. Some of the months have missing recordings for certain stations, and a few of the cities could not be found using CityData.

dataset = DeleteMissing[dataset, 1, 1];

We’ll separate this dataset into East and West Coast stations by filtering on longitude, the second element of each coordinate pair:

westCoastData = dataset[Select[-126 < #Coordinates[[2]] < -117 &]];
eastCoastData = dataset[Select[-81.5 < #Coordinates[[2]] < -66 &]];
usData = Join[eastCoastData, westCoastData];
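The same longitude filter can be tried on a plain list of associations. The stations and coordinates below are made up for illustration; only the longitude bands are taken from the code above.

```mathematica
(* Hypothetical stations with {latitude, longitude} pairs *)
toyStations = {
   <|"Location" -> "Station A", "Coordinates" -> {40.7, -74.0}|>,
   <|"Location" -> "Station B", "Coordinates" -> {47.6, -122.3}|>,
   <|"Location" -> "Station C", "Coordinates" -> {29.9, -90.1}|>
   };

(* Keep only the rows whose longitude falls inside each band *)
toyEast = Select[toyStations, -81.5 < #Coordinates[[2]] < -66 &];
toyWest = Select[toyStations, -126 < #Coordinates[[2]] < -117 &];

{toyEast[[All, "Location"]], toyWest[[All, "Location"]]}
(* {{"Station A"}, {"Station B"}} *)
```

Station C, at longitude -90.1, falls in neither band, so any station between the two bands is left out of both coastal datasets.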

Visualizations

Now that we have our data nicely imported, formatted, and cleaned, we can create some visualizations!

First, let’s plot water temperature as a function of latitude for both the East and West Coasts. We’ll wrap our ListPlot in a Manipulate to easily change time periods:

monthKeys = 
Normal@Select[Keys@First@westCoastData,
StringLength[First[StringSplit[#]]] === 3 &];
Manipulate[
ListPlot[
{
westCoastData[All, {#Coordinates[[1]], #[month]} &],
eastCoastData[All, {#Coordinates[[1]], #[month]} &]
},
Frame -> True,
FrameLabel -> {"Latitude", "Temperature (F)"},
PlotLegends -> {"West Coast", "East Coast"},
PlotRange -> {Full, {0, 100}},
PlotLabel ->
"Average Water Temperature " ~~ Capitalize[ToLowerCase[month]]],
{month, monthKeys}
]

monthKeys grabs the column names whose first word is three letters long, so it matches columns like “JAN” and “AUG 1-15”. We pass that list to Manipulate, which gives us a select box for changing the time period of our data.
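To see what the three-letter filter keeps, here is the same predicate applied to a hypothetical list of column names. The names are illustrative and are not the exact NOAA headers.

```mathematica
(* Hypothetical column names *)
toyColumns = {"Location", "JAN", "FEB", "AUG 1-15", "AUG 16-31", "Coordinates"};

(* Keep names whose first whitespace-separated word has three characters *)
toyMonthKeys = Select[toyColumns, StringLength[First[StringSplit[#]]] === 3 &]
(* {"JAN", "FEB", "AUG 1-15", "AUG 16-31"} *)
```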

We immediately notice two things: (i) East Coast water temperatures are generally warmer than their West Coast equivalents, and (ii) there appear to be two patterns driving East Coast temperatures. You can see that at roughly 39 degrees latitude the slope of the East Coast stations becomes much steeper. Sure enough, 39 degrees latitude is roughly where the Gulf Stream veers away from land and heads eastward across the Atlantic.

Let’s adjust the plot to show the coldest water temperatures, in winter. Now the West Coast is warmer and the East Coast is colder, indicating a generally more temperate climate on the West Coast. We can show this explicitly by plotting a histogram of the yearly temperature swings for each station:

SmoothHistogram[
{
Normal@westCoastData[All, #["AUG 1-15"] - #["FEB"] &],
Normal@eastCoastData[All, #["AUG 1-15"] - #["FEB"] &]
},
5,
Filling -> Bottom,
PlotLegends -> {"West Coast", "East Coast"},
PlotLabel -> "Change in Temperature Winter to Summer",
Frame -> True,
FrameLabel -> {"Temperature Change (˚F)", "PDF"}
]
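The quantity being histogrammed is a per-station subtraction of the February reading from the early-August reading. A minimal sketch with made-up readings shows the calculation on its own:

```mathematica
(* Hypothetical readings in degrees Fahrenheit *)
toyReadings = {
   <|"Location" -> "Station A", "FEB" -> 38, "AUG 1-15" -> 72|>,
   <|"Location" -> "Station B", "FEB" -> 50, "AUG 1-15" -> 58|>
   };

(* Summer minus winter temperature for each station *)
toySwings = #["AUG 1-15"] - #["FEB"] & /@ toyReadings
(* {34, 8} *)
```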

Finally, we can plot the original temperature data directly on a map for explicit visualization:

Manipulate[   
data = usData[All, {GeoPosition[#Coordinates], #[month]} &];
Row[{
GeoBubbleChart[
data,
ColorFunction -> "Rainbow",
ImageSize -> Large,
PerformanceGoal -> "Speed",
GeoGridLines -> Automatic,
PlotLabel -> month
],
BarLegend[{"Rainbow", MinMax[data[All, 2]]}, LegendLabel -> "˚F"]
}],
{month, monthKeys}
]

We once again wrap our graphics calls in a Manipulate so we can interactively change months. We use GeoBubbleChart to plot the data, and also append a BarLegend to the graphics.

In conclusion, I’ve demonstrated how to:

  • Import data from a website into Mathematica.
  • Generate a variety of plots to interactively visualize this data, such as histograms, scatter plots, and maps.

This is a fairly simple example, but it shows Mathematica’s ability to quickly analyze data using a variety of methods and visualizations.

Dan Bridges
Parallel Thinking

Software developer at Beezwax Datatools and former researcher in Physics & Neuroscience.