Maritime Professionals: Top 5 Python Libraries
Most of my professional life was spent at sea. I list the top five Python libraries I find most useful when communicating concepts within the maritime industry.
Introduction
I love expressing maritime concepts programmatically. It brings together subject matter that I feel I have command over and that may not have been fully unpacked by the broader tech community. Every time I sit with my development environment I feel like an explorer seeing things for the first time.
For those who are new to coding and involved in the maritime industry, I list my top five third-party Python packages. I scanned the requirements of my past projects, and the list below shows what appeared most often:
#5: PyMuPDF
PyMuPDF is an easy-to-use document handling tool. If you have a bunch of portable documents, and you want to quickly search them in order to extract information, PyMuPDF is a good choice.
When I was on tankers, we had stacks of papers as part of our pre- and post-transfer meetings, scanned copies of which would eventually make their way to the home office. Some of the documents contained valuable commercial information. PyMuPDF is your tool to digitize and organize this information for further analysis.
What I like about PyMuPDF is how faithfully it extracts text from maritime technical documents, with spelling and grammar intact.
To start, install PyMuPDF (the package is imported under the name fitz):
pip install PyMuPDF
As an example, I can take ASBATANKVOY (found using a cursory Google search) as a portable document (PDF) and save it to my local directory.
import fitz  # This is PyMuPDF

def extract_text_from_pdf(pdf_path):
    # Open the PDF and collect the text of every page into one string
    document = fitz.open(pdf_path)
    text = ""
    for page in document:
        text += page.get_text()
    document.close()
    return text

# Replace 'asbatankvoy.pdf' with the path to your PDF file
pdf_path = 'asbatankvoy.pdf'
pdf_text = extract_text_from_pdf(pdf_path)
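The function returns the document's text as a single string, held here in pdf_text. From there you can write it to a text file or pass it on for further analysis. Note that get_text() reads the PDF's embedded text layer; a scanned page that is only an image has no text to extract until it has been run through OCR.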
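Once the text is in a string, further analysis can be as simple as a keyword search. The following is a minimal sketch using only the standard library. The function name find_lines, the sample text, and the keywords are hypothetical stand-ins for illustration; the sample is not actual charter party wording.

```python
import re

def find_lines(text, keywords):
    """Return (line_number, line) pairs that mention any keyword, ignoring case."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]

# Stand-in for the pdf_text string; not actual charter party wording
sample_text = """Laytime shall commence upon notice of readiness.
The vessel shall proceed to the loading port.
Demurrage is payable per running hour."""

for number, line in find_lines(sample_text, ["laytime", "demurrage"]):
    print(number, line)
```

In practice you would pass pdf_text in place of sample_text and choose keywords that matter for your own documents.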
#4: Folium
My primary duty as a watch officer is knowing where I am and where I will be at any particular moment. For this you need 1. a position and 2. a chart to put the position on. Most maritime information systems have this capability.
To start, install folium, pandas, and pyarrow (pandas needs pyarrow to read and write Feather files):
pip install folium pandas pyarrow
As an example, I will take the latitude and longitude of a point in the Houston Ship Channel passing Galveston: latitude 29° 21' N and longitude 094° 45.4' W. The decimal coordinates can be obtained by right-clicking on the map in Google Maps.
import pandas as pd
import folium

# Save coordinates to Feather
data = {'latitude': [29.351049238059087], 'longitude': [-94.75464998805768]}
df = pd.DataFrame(data)
df.to_feather('coordinates.feather')

# Load coordinates from Feather
loaded_df = pd.read_feather('coordinates.feather')

# Create a map centered around the coordinates
# (named ship_map to avoid shadowing Python's built-in map function)
ship_map = folium.Map(location=[loaded_df['latitude'][0], loaded_df['longitude'][0]], zoom_start=12)

# Add a marker for the location
folium.CircleMarker(location=[loaded_df['latitude'][0], loaded_df['longitude'][0]], radius=1, color='black').add_to(ship_map)

# Save the map as an HTML file
ship_map.save('map.html')
Opening map.html from your local directory in a browser shows the chart with a small black marker at the position.
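Folium expects decimal degrees, while a position taken from a chart or GPS is usually in degrees and decimal minutes, like the 29° 21' N, 094° 45.4' W quoted above. A minimal sketch of the conversion follows; the function name dm_to_decimal is my own, not part of Folium or pandas.

```python
def dm_to_decimal(degrees: int, minutes: float, hemisphere: str) -> float:
    """Convert degrees and decimal minutes to signed decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    decimal = degrees + minutes / 60
    # South latitudes and west longitudes are negative by convention
    return -decimal if hemisphere in ("S", "W") else decimal

latitude = dm_to_decimal(29, 21.0, "N")
longitude = dm_to_decimal(94, 45.4, "W")
print(round(latitude, 4), round(longitude, 4))
```

The result is close to, but not exactly, the Google Maps figures used in the Folium example, because the chart-style position is rounded to a tenth of a minute.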
#3: Requests
Whenever I mention application programming interfaces (APIs) to non-tech people, they immediately tune out. An API is not complicated: it’s simply a way to get information automatically from a trusted source. It’s sort of like looking at a GPS and finding a position.
Requests is your tool for calling APIs.
Using ShippingIntel’s API as an example, we typically call an API like so:
curl -X POST https://www.shippingintel.com/api/calculate_distance -d "port1=Houston" -d "port2=Rotterdam"
Which can be converted to Python using requests:
import requests

# Define the URL
url = 'https://www.shippingintel.com/api/calculate_distance'

# Set up the data payload
data = {
    'port1': 'Houston',
    'port2': 'Rotterdam'
}

# Make the POST request
response = requests.post(url, data=data)

# Print the response text (or JSON)
print(response.text)  # If response is JSON, use response.json()
The response to the above request gives a distance of around 5,000 nautical miles.
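The curl command and the Python version send the same thing: each -d flag, like each key in the data dictionary, becomes one field of a form-encoded request body. A minimal sketch that builds that body with the standard library, without sending anything over the network:

```python
from urllib.parse import urlencode
from urllib.request import Request

url = "https://www.shippingintel.com/api/calculate_distance"
data = {"port1": "Houston", "port2": "Rotterdam"}

# Form-encode the payload, as curl -d and the data= argument of requests.post do
body = urlencode(data)
print(body)  # port1=Houston&port2=Rotterdam

# Build the request object only; nothing is sent here
request = Request(url, data=body.encode("utf-8"), method="POST")
print(request.get_method(), request.full_url)
```

Seeing the encoded body makes it easier to compare what your script sends against a curl example from an API's documentation.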
#2: Searoute
Thank you, Julien Gaffuri and Gent Halili, for making Eurostat’s MARNET available to developers. You guys are awesome.
Searoute will find the oceangoing route between any two points on the earth.
pip install searoute
To find the distance from the English Channel to northern China, using Halili’s example in his documentation (note that each point is given as [longitude, latitude]):
import searoute as sr
# Define origin and destination points:
origin = [0.3515625, 50.064191736659104]
destination = [117.42187500000001, 39.36827914916014]
# Calculate the route in nautical miles:
route = sr.searoute(origin, destination, units="nm")
# Print the route length in nautical miles:
print("{:.1f} {}".format(route.properties['length'], route.properties['units']))
This results in 14,576 nautical miles. Given a speed, Searoute can also return the transit time in hours, as well as the waypoints along the route.
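The transit time is simple arithmetic once the distance is known: hours equal nautical miles divided by knots. A minimal sketch, assuming a service speed of 12 knots (my assumption for illustration, not a figure from Searoute):

```python
def transit_time(distance_nm: float, speed_knots: float) -> tuple[float, float]:
    """Return (hours, days) needed to cover distance_nm at speed_knots."""
    if speed_knots <= 0:
        raise ValueError("speed must be positive")
    hours = distance_nm / speed_knots
    return hours, hours / 24

hours, days = transit_time(14576, 12)  # 12 knots is an assumed speed
print(f"{hours:.0f} hours, or about {days:.1f} days")
# prints: 1215 hours, or about 50.6 days
```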
You can use Searoute in combination with Folium!
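One thing to watch when combining the two: GeoJSON-style routes list each point as [longitude, latitude], while Folium expects (latitude, longitude). A minimal sketch of the swap follows. The sample_route dictionary is a hand-written stand-in, not real Searoute output, and I am assuming the route's points can be read from its geometry coordinates.

```python
def geojson_to_latlon(coordinates):
    """Convert GeoJSON [lon, lat] pairs into (lat, lon) tuples."""
    return [(lat, lon) for lon, lat in coordinates]

# Hand-written stand-in for a route geometry (not real Searoute output)
sample_route = {
    "geometry": {
        "type": "LineString",
        "coordinates": [[0.35, 50.06], [-5.0, 48.5], [-9.8, 43.2]],
    }
}

points = geojson_to_latlon(sample_route["geometry"]["coordinates"])
print(points)

# With folium installed, the points could then be drawn on an existing map:
# folium.PolyLine(points).add_to(your_map)
```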
#1: Pandas
What’s the mariner’s gateway drug to data science? It’s Excel of course. Pandas is Excel for Python.
When I first investigated Pandas, I hated it. It seemed like another language embedded into Python. Who has time for that? But as I got further along I saw that Pandas is helpful when working with large maritime datasets, and indispensable with large AIS datasets. If you are a serious maritime data analyst, you will need to get to know Pandas.
To install:
pip install pandas
As an example, I want to sort vessels by age. I want to know which ones are less than 15 years old and which ones are older than 15 years.
import pandas as pd
# Create a DataFrame
data = {'Vessel': ['AlphaBravo', 'CharlieDelta', 'EchoIndia'], 'Age': [5, 20, 20], 'Owner': ['DK Shipping', 'NL Shipping', 'US Shipping']}
df = pd.DataFrame(data)
# Display the DataFrame
print("Original DataFrame:")
print(df)
# Filter the DataFrame to find ships over 15 years old
filtered_df = df[df['Age'] > 15]
# Display the filtered DataFrame
print("\nFiltered DataFrame (Age > 15):")
print(filtered_df)
Which results in:
Original DataFrame:
         Vessel  Age        Owner
0    AlphaBravo    5  DK Shipping
1  CharlieDelta   20  NL Shipping
2     EchoIndia   20  US Shipping

Filtered DataFrame (Age > 15):
         Vessel  Age        Owner
1  CharlieDelta   20  NL Shipping
2     EchoIndia   20  US Shipping
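The filter above keeps only the older vessels. To get both groups at once, the same comparison can be stored as a boolean mask and then inverted. A minimal sketch using the same sample fleet; the variable names are my own:

```python
import pandas as pd

df = pd.DataFrame({
    'Vessel': ['AlphaBravo', 'CharlieDelta', 'EchoIndia'],
    'Age': [5, 20, 20],
    'Owner': ['DK Shipping', 'NL Shipping', 'US Shipping'],
})

# One boolean mask splits the fleet into both groups
is_older = df['Age'] > 15
older = df[is_older]
younger = df[~is_older]  # note: a vessel aged exactly 15 lands in this group

print(younger['Vessel'].tolist())
print(older['Vessel'].tolist())
```

Decide up front which group a vessel of exactly 15 years belongs to, and adjust the comparison (> or >=) accordingly.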
When mentioning Pandas, Feather should be mentioned as well. You can find a maritime-related article on Feather here.
Honorable Mention: Flask
It can be lonely coding maritime stuff and not being able to show it to your colleagues or friends.
Flask allows you to make a fully functioning web application using the above tools. A bit too much to get into in this article, but once you have middling competency in Python I would encourage you to move on to Flask.
Nope: Shapely, Geopy
Why not mention geographic information tools?
Because much of what we do in the maritime domain is basic math. The overhead of using dedicated packages like Shapely, rather than numpy or math, is not worth it, in my opinion.
For example, when locating a point within a polygon, a ray casting algorithm took 0.00003 seconds over a dozen iterations. Shapely took 0.12 seconds over a dozen iterations, making it about 4,000 times slower than the ray casting algorithm. Third-party GIS tools are computationally expensive.
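For readers who have not seen it, ray casting counts how many polygon edges a ray drawn from the point crosses: an odd count means the point is inside. The following is a minimal sketch in plain Python, not the implementation behind the timings above. It treats longitude and latitude as flat x and y, so it assumes a small area that does not cross the antimeridian, and the test area is a hypothetical rectangle.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: toggle 'inside' for each edge crossed by a ray heading east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the point's latitude?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangular area around the Galveston position used earlier
area = [(-95.0, 29.0), (-94.5, 29.0), (-94.5, 29.5), (-95.0, 29.5)]
print(point_in_polygon(-94.7546, 29.3510, area))  # True
print(point_in_polygon(-93.0, 29.3510, area))     # False
```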
Conclusion
My top five Python libraries are PyMuPDF, Folium, Requests, Searoute, and Pandas. Flask is excellent when you want to represent your findings in a browser, or allow input from a user to work with your models.
Geographic information system (GIS) tools, in my opinion, aren’t that useful. Just code out your GIS solutions using Python's built-in math module or numpy.
If you are new to Python, fear not. It’s easy to learn and, in my opinion, enjoyable. Here is an article on how to start.
References
Eurostat. (n.d.). European Commission. Retrieved June 26, 2024, from https://ec.europa.eu/eurostat
Gaffuri, J. (2021). Eurostat/Searoute: Compute shortest maritime routes between ports. GitHub. Retrieved June 26, 2024, from https://github.com/eurostat/searoute
Grinberg, M., et al. (n.d.). Flask. Retrieved from https://flask.palletsprojects.com/en/latest/
Halili, G. (2022). Searoute. Retrieved June 26, 2024, from https://pypi.org/project/searoute/
Kuhl, J., & McKay, R. (n.d.). PyMuPDF. Retrieved from https://pymupdf.readthedocs.io/en/latest/
McGibbon, R., et al. (n.d.). Folium: Python Data. Leaflet.js Maps. Retrieved from https://python-visualization.github.io/folium/
McKinney, W., et al. (n.d.). pandas: powerful Python data analysis toolkit. Retrieved from https://pandas.pydata.org/pandas-docs/stable/index.html
Reitz, K., et al. (n.d.). Requests: HTTP for Humans. Retrieved from https://docs.python-requests.org/en/latest/
Shipping Intel. (2024). Shipping Intelligence. Retrieved from https://www.shippingintel.com/