Returning to FOSS4G, Five Years in the Making
The 2017 edition of the premier geospatial conference comes to Boston
With two decades in the geospatial data business, I used to frequent the international FOSS4G conferences. After attending five years straight, I missed the next five events, starting in 2012. This year’s conference was in my backyard, Boston, so it was awesome to return and catch up with old friends and colleagues.
The Boston community stepped it up and did a great job with the conference, from the main organizing group led by Michael Terner at AppGeo; to state and local governments; to universities like Tufts, Harvard, and MIT; and of course local companies like IBM and Paragon Corp. I spend a lot of time at conferences, and I can honestly say that FOSS4G remains the gold standard for staying on the cutting edge of technology while maintaining a strong, authentic community.
Geo data leads the way
Geospatial technology has always been at the forefront of IT because spatial data is a bigger challenge to store, index, and visualize than mainstream data types. Therefore, the community is often ahead of the curve in adopting tech that has staying power. It’s also quick to ignore shiny trends with no substance.
I saw three areas of traction this year:
- Big data processing (always a “meat and potatoes” topic)
- Geospatial data science, particularly with Jupyter Notebooks but also with R
- Serverless deployment of geospatial functions
The interest in geospatial data science was great to see, since that’s the area I’m currently most involved with. I was also happy to see a couple of offline mapping talks, as my team has a lot of interest in that area. Another technology of note was Google Earth Enterprise, which Google recently open sourced.
Lack of buzz
I noticed little buzz around graph databases or the semantic web. (I have a standing bet with a friend that semantic web technology will never take off, and I keep winning it.) I just think it’s one of those technologies that, like XML before it, programmers dislike, even though it’s a darling of information architects.
The lack of graph buzz was more surprising to me. Network analysis has been a mainstay of GIS for decades, mainly to model hydrology dynamics and transportation networks, so I expected more people to be looking at new open source graph databases like JanusGraph to modernize their implementations. I predict their time will come, but 2017 was not the year for graph DBs in the geospatial community.
With that said, here’s a quick, personally biased roundup of talks that caught my eye at FOSS4G 2017.
The geospatial Apache Spark™ community has been active for years, long before Spark became the hot big data technology du jour. This year, we saw continued maturity in support for spatial data types and fast spatial queries. Notably, the parallelized performance race is moving to a new playing field: GPUs. MapD showed some amazing results with their open source in-memory SQL database running exclusively on GPUs.
- Accelerating geospatial analytics using Apache Spark
- Integrating Apache Spark and R for Big Data Analytics on solving geographic problems
- GeoMesa and geospatial Spark SQL: using cloud computing to make sense out of trillions of features
Just like the mainstream data science community, geospatial data scientists are big into Jupyter Notebooks. (Plug: the first link here is my talk.) It’s great to see the geo community using the same tools as the rest of the market, instead of inventing its own.
- Mapping Data in Jupyter Notebooks with PixieDust
- Geopyter: GeoMesa and PySpark in Jupyter notebooks
- GeoNotebook: an extension to the Jupyter Notebook for exploratory geospatial analysis
- Analyzing large raster data in a Jupyter notebook with GeoPySpark on AWS
The sheer number of serverless talks is interesting because the CfP process is open, with the community voting on which presentations get accepted. Clearly there’s a lot of interest in deploying geospatial functions in a serverless style.
- Introduction to Serverless for Geo
- Geospatial Lambda for scaleable, serverless geo-processing
- Serverless architectures & automated pipelines for GIS applications
- Serverless architectures for geo
- We’re gonna need a bigger boat! Serverless Geo to avoid disaster
- Scalable Geospatial Microservices with Kubernetes and PostGIS
- Serverless! Serving GeoData in Open Standards One Request at a Time
The last one is intriguing in that it combines an old idea (Koop, the geospatial query API translation middleware) with a new deployment technology (Lambda functions). I really like where Esri is going with that initiative.
Google Earth is open source
Not much detail to add here or to the following section because it’s the same story: The geo community is an early adopter of the latest technical solutions.
- Google Earth Enterprise as an Open Source Project
- Everything old is new again: What open source Google Earth Enterprise means for FOSS4G and Cesium
Offline first mapping
- Offline Maps Sync using SQLite
- Personal Radiation Exposure Management by using an Offline Map for Fukushima Residents
While not explicitly an offline presentation, it was interesting to hear Vladimir from Mapbox talk about Mapbox GL, a GPU-optimized, client-side mapping library that I use a lot, not only in web projects but also in Jupyter Notebook geospatial visualizations.
See you in Dar es Salaam
I’m sure I’ve left out many awesome talks on databases, satellite image processing, and implementation stories, but my interest is in cloud technologies, so the omission is just my bias. The takeaway is that in the area of cloud-related data “stuff,” the FOSS4G community is once again a leader that deserves more attention than it gets.
Keep up the great work, and I hope to keep my new attendance streak going when I see everyone next year in Tanzania!