Step Functions HTTP Endpoint Node Considered Harmful

Lee Harding
2 min read · Apr 11, 2024

Fix your shit, AWS. For years there was a factoid (a falsehood broadly accepted as fact) that one could not call an external HTTP API from a state machine. It’s been false since AWS introduced the SDK integration nodes, which allowed using apigateway::execute-api. I pointed this out to many, many folks, including some who should have known better than to repeat the factoid.

Then, AWS came out with the “HTTP Endpoint” node and the world rejoiced! And I meh’d. I thought it would be nice to have direct support for it, which would make it easier for folks light on experience to make more use of Step Functions (which I very much encourage). But then this happened:

> Customer: Your API sucks. I can’t use it from Step Function. Fix it.

> Me: Huh? Can you show me what you mean?

And they showed me that they called my API just as the documentation said, but they weren’t getting the requested data. I looked at their state machine code, which was spot-on as far as I could tell. I recreated the call with curl and it worked fine, and they did the same and it worked for them, too. It failed only when the API was called from Step Functions. Some quick spelunking in my internal logs indicated that my API was getting the request but… oh, damn. AWS…

The request didn’t include the Range header the customer added. I’d seen this before with other services, and a feeling of dread rose. I checked the documentation and, 100%, “yep”: AWS removes a bunch of headers from outgoing requests, and there is no workaround.

Among the headers they remove are If-* and Range. WTF? My customer had noticed that paging through long collection resources didn’t work, and the reason was that my APIs (and many others) depend on the Range header for paging. It’s basic HTTP protocol stuff.
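To make the paging failure concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `items=a-b` range unit, the page size, and the `get_collection` and `strip_headers` helpers are hypothetical stand-ins, not my actual API and not AWS's code. It only shows what a server that pages via Range sees once an intermediary drops the header.

```python
import re

ITEMS = [f"item-{i}" for i in range(25)]  # hypothetical collection resource
PAGE_SIZE = 10                            # hypothetical default page size


def strip_headers(headers):
    """Simulate an intermediary that drops Range and If-* request headers."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "range" and not name.lower().startswith("if-")
    }


def get_collection(headers):
    """Hypothetical server that pages with 'Range: items=<first>-<last>'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    match = re.fullmatch(r"items=(\d+)-(\d+)", lowered.get("range", ""))
    if match is None:
        # No usable Range header: the server can only return its default page.
        return 200, ITEMS[:PAGE_SIZE]
    first, last = int(match.group(1)), int(match.group(2))
    return 206, ITEMS[first:last + 1]


request_headers = {"Range": "items=10-19"}

# Sent directly (curl-style): the second page comes back.
direct_status, direct_page = get_collection(request_headers)

# Sent through the stripping intermediary: the first page again, every time.
stripped_status, stripped_page = get_collection(strip_headers(request_headers))
```

In this sketch the direct call returns 206 with items 10 through 19, while the stripped call returns 200 with items 0 through 9. The caller gets a perfectly valid response to a question it never asked, which is why nothing looks broken from the state machine's side.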

The really, really harmful and damn-near nefarious problem AWS has created has to do with the If-* headers. Stripping those headers breaks conditional requests for many, many very important APIs (like GitHub). That exposes customers to the debilitating “lost update” problem. Worse, it does so silently and nearly undetectably. Without If-Match (or If-None-Match on a create), a PUT or DELETE that should fail with 412 Precondition Failed doesn’t. That will lead to serious data problems for the customer. Guaranteed.
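The lost update problem is easy to show with a toy model. The sketch below is Python and entirely hypothetical: the `Resource` class, its version-based ETags, and `strip_conditionals` are illustration-only assumptions, not any real API. It models a server that honors If-Match on PUT, then replays the same stale write with the header removed.

```python
class Resource:
    """Hypothetical server-side resource with a version-based ETag."""

    def __init__(self, body):
        self.body = body
        self.version = 1

    @property
    def etag(self):
        return f'"v{self.version}"'

    def put(self, body, headers):
        """Apply a PUT, returning 412 if an If-Match precondition fails."""
        lowered = {name.lower(): value for name, value in headers.items()}
        if_match = lowered.get("if-match")
        if if_match is not None and if_match != self.etag:
            return 412  # Precondition Failed: the caller's copy is stale
        self.body = body
        self.version += 1
        return 200


def strip_conditionals(headers):
    """Simulate an intermediary that drops every If-* request header."""
    return {k: v for k, v in headers.items() if not k.lower().startswith("if-")}


def demo(strip):
    """Alice and Bob both read v1, then write in turn. Returns (statuses, final body)."""
    doc = Resource("original")
    seen_etag = doc.etag  # both clients fetched the same version
    send = strip_conditionals if strip else (lambda headers: headers)
    alice = doc.put("alice's edit", send({"If-Match": seen_etag}))
    bob = doc.put("bob's edit", send({"If-Match": seen_etag}))
    return (alice, bob), doc.body


protected = demo(strip=False)    # ((200, 412), "alice's edit")
unprotected = demo(strip=True)   # ((200, 200), "bob's edit")
```

With the header intact, Bob's stale write is rejected and he knows to re-fetch. With the header stripped, both writes return 200 and Alice's edit is gone, with no error anywhere to tell anyone it happened.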

My recommendation: if you’re using a business-critical API that supports conditional updates, stop using the HTTP Endpoint node immediately. It’s a trainwreck waiting to happen.
